News & Insights from Ursa's Healthcare Data Experts

Part 3: Ursa Compass: Empowerment Over Automation

Written by Steve Hackbarth | August 11, 2025

In Part 1 of our series, we introduced the eager intern model: a strategy for harnessing the power of AI while firmly anchoring it in a loop of human verification. In Part 2, we described how we’ve formalized hard-won institutional knowledge into a living checklist, letting us encode what senior engineers know into prompts that AI can execute and defend.  

This final installment is about turning those philosophies into software. This is Ursa Compass: the product we’re building to operationalize our principles. 

Principles Put into Practice 

Throughout this series, we've explored a consistent theme: AI should empower human expertise, not replace it. This isn't just a philosophical stance — it's a practical imperative in healthcare. When you're dealing with data that drives clinical decisions and patient outcomes, trust isn't optional. And trust, ultimately, requires human judgment. 

There are a lot of tools right now that promise AI-powered automation. Compass takes a more cautious path. We don’t want a copilot that flies the plane. We want a smart assistant who moves fast but knows they’re not in charge. 

The design is built around a feedback loop. Compass is prompt-driven: it works through a set of questions – our living checklist – and for each, it generates SQL, runs it, interprets the results, and proposes a conclusion. The engineer can challenge, accept, or refine the answer. 

The Three-Way Conversation 

Ursa Compass is designed around a three-way conversation. The first participant is the software, which itself is split between procedural code – software 1.0 – and a YAML playbook written and maintained by our engineers in plain English – software 3.0. The second participant is the LLM, playing the role of the eager intern. The third participant is the engineer, who accepts or rejects assertions, and helps guide the LLM’s work. 
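To make the playbook idea concrete, here is a sketch of what one might look like. This is a hypothetical fragment; the field names are illustrative, not the actual Ursa Compass schema.

```yaml
# Hypothetical playbook fragment -- illustrative only.
# Each check poses a question in plain English; dependencies
# gate follow-up checks on accepted assertions.
checks:
  - id: row-counts
    question: >
      How many rows are in each source table, and are any
      tables unexpectedly empty?
  - id: date-ranges
    question: >
      What are the minimum and maximum service dates in the
      claims table, and do they fall within the expected
      contract period?
    depends_on:
      - row-counts
```

Because the playbook is plain English rather than code, engineers can maintain it directly as institutional knowledge accumulates.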

The procedural code uses playbooks to structure the questions the LLM should answer. It also makes two tools available to the LLM: QUERY_DATABASE and ASK_USER. The model uses its best judgment about when and how to invoke them, crafting its own SQL for calls to QUERY_DATABASE; the procedural code verifies the safety of that SQL and executes it against the database.
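The tool-dispatch step can be sketched as follows. All names here are hypothetical, and the safety check shown (allowing only read-only SELECT statements) is one plausible policy, not necessarily the one Ursa Compass implements.

```typescript
// Sketch of a tool-dispatch layer -- names and safety policy are
// illustrative assumptions, not the actual Ursa Compass internals.

type ToolCall =
  | { tool: "QUERY_DATABASE"; sql: string }
  | { tool: "ASK_USER"; question: string };

// One plausible safety policy: only read-only SELECT statements
// are allowed through to the database.
function isSafeSql(sql: string): boolean {
  const normalized = sql.trim().toLowerCase();
  const forbidden = ["insert", "update", "delete", "drop", "alter", "create", "truncate"];
  return (
    normalized.startsWith("select") &&
    !forbidden.some((kw) => new RegExp(`\\b${kw}\\b`).test(normalized))
  );
}

function runQuery(sql: string): string {
  // Stub: in a real system this would execute against the warehouse.
  return `rows for: ${sql}`;
}

function promptEngineer(question: string): string {
  // Stub: in a real system this would surface a question in the UI.
  return `answer to: ${question}`;
}

function dispatch(call: ToolCall): string {
  switch (call.tool) {
    case "QUERY_DATABASE":
      if (!isSafeSql(call.sql)) {
        return "Rejected: only read-only SELECT statements are allowed.";
      }
      return runQuery(call.sql);
    case "ASK_USER":
      return promptEngineer(call.question);
  }
}
```

The key design point is that the LLM proposes SQL but never touches the database directly; the procedural code sits in between as a gatekeeper.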

Importantly, the text of the SQL and the results as they’re returned from the database are stored by the procedural code in a hallucination-free “evidence” section, which is made available to the engineer. In this three-way conversation, it’s vital to keep track of who authored each piece of text, because the working assumption of Ursa Compass is that nothing from the LLM can be accepted at face value. Letting the engineer see the data behind the analysis is a key part of gaining acceptance.
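One minimal way to picture the evidence section is an append-only log where every entry is tagged with its author, so that verbatim material from the procedural code can be separated from LLM-generated text. The names below are assumptions for illustration, not the actual data model.

```typescript
// Sketch of an authorship-tagged evidence log -- hypothetical names.
// The point: SQL text and raw results are recorded verbatim by the
// procedural code, never paraphrased by the LLM.

type Author = "procedural-code" | "llm" | "engineer";

interface EvidenceEntry {
  author: Author;
  kind: "sql" | "result" | "assertion" | "comment";
  text: string;
}

class EvidenceLog {
  private entries: EvidenceEntry[] = [];

  record(author: Author, kind: EvidenceEntry["kind"], text: string): void {
    this.entries.push({ author, kind, text });
  }

  // Only entries written verbatim by the procedural code count as
  // hallucination-free evidence for the engineer to review.
  verbatimEvidence(): EvidenceEntry[] {
    return this.entries.filter((e) => e.author === "procedural-code");
  }
}
```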

Because each check represents a mini research project that might take several minutes to perform, the checks are designed to be multi-threaded. This way, the LLM can work on many questions at once while the engineer reviews the assertions as they arrive.

As such, each check in the playbook is a thread in a multi-threaded system, and ends with the LLM making an assertion. The engineer can either accept the assertion or send it back for more iteration. Once an assertion is accepted, it unlocks every check the YAML lists as a dependent, and those follow-up research projects commence.
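The unlock step described above can be sketched as a small dependency-graph function: accepting an assertion marks its check as done and returns the dependent checks whose prerequisites are now all satisfied. Names are hypothetical.

```typescript
// Sketch of dependency unlocking -- hypothetical names. A check
// becomes runnable once every check it depends on has been accepted.

interface Check {
  id: string;
  dependsOn: string[];
}

function unlockedBy(accepted: string, done: Set<string>, checks: Check[]): string[] {
  const nowDone = new Set(done).add(accepted);
  return checks
    .filter(
      (c) =>
        !nowDone.has(c.id) &&
        c.dependsOn.includes(accepted) &&
        c.dependsOn.every((d) => nowDone.has(d))
    )
    .map((c) => c.id);
}
```

A check with multiple prerequisites stays locked until the last of them is accepted, which keeps follow-up research from starting on an unverified foundation.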

Try it Yourself 

If you’re a data engineer working in the healthcare space, you might want to give Ursa Compass a test drive, either individually or within your organization. We’ve set up three different ways to do so.

First, if you `git clone` the project at https://github.com/ursahealth/ursa-compass, you can run it locally on your own laptop. This local version doesn’t deal with user authentication, sharing of sessions or playbooks, or any of the enterprise overhead, beyond persisting data to your local filesystem. The LLM is powered by AWS Bedrock, so you’ll need an AWS account as well as the relevant BAA paperwork with Amazon in place. 

Second, you can integrate Ursa Compass into your own project via `npm install ursa-compass ursa-compass-ui`. In this setup, you supply the enterprise overhead and data persistence yourself. You can treat the local version as a reference implementation for how you’d integrate these packages.

Lastly, if you're looking to use Ursa Compass in the context of a battle-tested enterprise SaaS platform that covers the full breadth of healthcare data needs, contact info@ursahealth.com and ask about Ursa Studio, which has Ursa Compass embedded as one of its features. 

The Road Ahead 

We’ve started using Ursa Compass in our own work, and it’s already changing how we approach the earliest stages of data integration. The eager intern gives us speed; the living checklist gives us rigor. Without the rigor, the speed would be irresponsible. Without the speed, the rigor would be infeasible. Each advance reinforces the other.

It’s early. We’re still learning. But the pattern feels right. 

In a moment where AI is often pitched as a threat to expertise, Compass suggests a different direction: one where expertise is amplified, not displaced. One where the work gets better — not just faster. 

This is a tool we built for ourselves, but it’s also an expression of how we think data work should feel: thoughtful, inspectable, and grounded in trust. We’re excited to see where it leads. 


About the Author 

Steve Hackbarth, Chief Technology Officer 

Before joining Ursa Health, Steve was head of development at xTuple, the world's #1 open-source ERP software company, managing a diverse development team that spanned four continents. Before that, he founded Speak Logistics, a tech startup born of his experience in the transportation industry, which introduced the user-friendly and modern sensibilities of the consumer Internet into the enterprise space. His professional passions include JavaScript, open-source software, continuous integration, and scalable code.

Steve holds a bachelor's degree in computer science from Harvard University and an MBA from William and Mary. He is a frequent speaker on subjects such as JavaScript, modular architecture, git, open source, and asynchronous programming.