Category: product management

Recent Posts

Why the PRD Needs to Change

A Product Requirements Document (PRD) has traditionally served as a coordination artifact between humans: product managers, engineers, designers, and business stakeholders. Its purpose has been to define what a product should do, why it should exist, and how success will be measured. As teams increasingly incorporate agentic AI systems (systems capable of planning, executing, and iterating autonomously), the PRD’s audience changes. It is no longer read only by people; it must now be interpreted and executed by machines.

Agentic AI differs significantly from prompt-based generative systems. Rather than producing a single output in response to a query, agentic systems split goals into tasks, select tools, execute actions, observe results, and revise their plans. This level of self-driven behavior increases the risk posed by unclear requirements. Where a human being would have stopped, tried to understand, or asked for clarification, an AI agent might just go full steam ahead. Sometimes incorrectly.

This creates both an opportunity and a necessity. The PRD can evolve into a formalized contract between the user and the AI, defining not only what to build, but how the AI is allowed to reason, decide, and act. Such a contract enables reproducibility across models and vendors, reduces operational risk, and aligns AI behavior with business intent so that the end product can be produced a second time using the same input.

From Human Alignment Tool to Machine-Readable Contract

Traditional PRDs tolerate ambiguity because humans are good at reading between the lines. Phrases like “fast,” “scalable,” or “user-friendly” are often left intentionally vague, relying on shared context and iterative discussion. However, natural language ambiguity is a known failure mode for automated systems.

For an agentic AI, ambiguity becomes executable behavior. If a requirement states “optimize performance,” the agent must decide which metric, under what constraints, and at what cost. Without explicit boundaries, the AI may optimize for the wrong outcome, such as reducing latency by removing logging or security checks. A secondary validation might be crucial for business but disastrous for performance.

Reframing the PRD as a contract introduces several conceptual shifts:

  1. Explicit obligations: What the AI must do.
  2. Explicit constraints: What the AI must not do.
  3. Verification criteria: How success and failure are evaluated.
  4. Termination conditions: When the AI must stop, escalate, or ask for clarification.

This mirrors how software contracts and interface specifications are used to enable independent implementations while preserving consistent behavior.
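
The four shifts above can be sketched as a machine-readable structure. This is a minimal, hypothetical sketch; the field names are illustrative, not a standard schema:

```python
# Hypothetical PRD-as-contract structure. Field names are illustrative only;
# no standard schema is implied.
contract = {
    "obligations": ["Expose POST /todos and GET /todos"],          # what the AI must do
    "constraints": ["Do not modify billing systems"],              # what it must not do
    "verification": ["All tests pass under pytest"],               # how success is judged
    "termination": ["Stop and report if schema migration fails"],  # when to halt
}

def is_complete(contract):
    """A contract is only usable if all four parts are present and non-empty."""
    required = ("obligations", "constraints", "verification", "termination")
    return all(contract.get(key) for key in required)
```

A contract missing any of the four parts can then be rejected before any agent is allowed to act on it.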

Design Principles for AI-Consumable PRDs

1. Determinism Over Narrative

Human-oriented PRDs often rely on narrative explanations and stories with emotional appeals. For AI consumption, narrative should be minimized in favor of deterministic statements.

Human-style requirement

“The system should load quickly for most users.”

AI-contract requirement

“The system must render the initial UI of the start page in ≤2.0 seconds for ≥95% of requests under a simulated load of 1,000 concurrent users.”

Clear thresholds reduce the AI’s need to infer intent and make behavior more reproducible across models.
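
Such a threshold can be checked mechanically rather than negotiated. A minimal sketch, assuming request latencies are collected in seconds:

```python
def meets_latency_target(latencies_s, threshold_s=2.0, required_share=0.95):
    """True if at least the required share of requests finished within the threshold."""
    within = sum(1 for t in latencies_s if t <= threshold_s)
    return within / len(latencies_s) >= required_share

# 19 of 20 samples under 2.0 s is exactly 95%, so this passes
fast_run = [1.0] * 19 + [3.0]
```

The same function works for any model's output, which is exactly what makes the requirement reproducible.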

2. Explicit Goal Hierarchies

Agentic AI systems typically plan by decomposing goals into sub-tasks. A PRD designed for such systems should expose this hierarchy directly.

Instead of a flat list of features, requirements should be structured as:

  • Primary objective (business outcome)
  • Secondary objectives (supporting outcomes)
  • Non-objectives (explicit exclusions)

This reduces unintended optimization, incorrect focus or hallucination. Research on objective misalignment shows that agents will exploit underspecified goals if constraints are absent.

“Agentic misalignment makes it possible for models to act similarly to an insider threat, behaving like a previously-trusted coworker or employee who suddenly begins to operate at odds with a company’s objectives.” (anthropic.com)


3. Constraints as First-Class Requirements

Traditional PRDs often treat constraints as footnotes. For AI agents, constraints must be first-class, machine-readable rules with as little room as possible for interpretation.

Examples:

  • Technology constraints (“Must use MySQL 8.4 LTS for data storage”)
  • Security constraints (“No plaintext secrets or personal data in code or logs”)
  • Organizational constraints (“Do not modify billing systems”)

This aligns with best practices in AI safety, which emphasize bounding action spaces for autonomous systems.
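
Machine-readable constraints can be checked automatically against an agent's output. The rules below are a hedged sketch with hypothetical patterns, not a real security scanner:

```python
import re

# Hypothetical machine-readable constraints derived from the PRD contract.
# Patterns are illustrative; a real check would be far more thorough.
CONSTRAINT_PATTERNS = {
    "no_raw_sql": re.compile(
        r"\b(SELECT|INSERT|UPDATE|DELETE)\b.*\bFROM\b", re.IGNORECASE),
    "no_plaintext_secret": re.compile(
        r"(password|secret|api_key)\s*=\s*['\"]", re.IGNORECASE),
}

def violated_constraints(source: str):
    """Return the names of constraints the given source text appears to violate."""
    return [name for name, pattern in CONSTRAINT_PATTERNS.items()
            if pattern.search(source)]
```

Running such checks after every agent step turns the constraints from footnotes into enforceable rules.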


4. Verification and Acceptance Criteria

Humans can negotiate acceptance during reviews. AI agents need predefined acceptance tests.

Every requirement should include:

  • A measurable condition
  • A verification method
  • A pass/fail threshold

This mimics the role of automated tests in continuous integration, which enable repeatable evaluation without human judgment.
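
Each criterion can be encoded with its condition, verification method, and pass/fail threshold, then evaluated without human judgment. A sketch with a hypothetical structure:

```python
def evaluate(criterion, measured):
    """Pass/fail evaluation of a single measurable acceptance criterion."""
    return criterion["check"](measured)

# Hypothetical criterion; the 90% figure matches the coverage example later on.
coverage_criterion = {
    "condition": "unit-test coverage of business logic",
    "method": "coverage report produced by the test run",
    "check": lambda measured: measured >= 0.90,  # pass threshold: >= 90%
}
```

Because the check is a function of a measurement, the same criterion can gate any agent's output.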


Example: PRD as a Contract for an Agentic Coding AI

Below is a simplified excerpt from a PRD explicitly designed for an agentic AI tasked with generating a software feature.


Product Requirement Contract (Excerpt)

Objective
Build a REST API endpoint that allows authenticated users to create and retrieve TODO items.

Primary Goal
Enable users to persist TODO items with title, description, and deadline.

Non-Goals

  • No user interface implementation
  • No notification or reminder features
  • No third-party integrations

Functional Requirements

  1. The API must expose POST /todos and GET /todos.
  2. Each TODO must include: id, title, description, deadline, created_datetime.
  3. Requests must be authenticated using JWT.

Constraints

  • Language: Python 3.11
  • Framework: Flask
  • Database: MySQL
  • ORM: SQLAlchemy
  • No direct SQL queries permitted.

Security Requirements

  • Input validation must reject malformed JSON.
  • Deadlines must be ISO-8601 formatted.
  • Authentication failures must return HTTP 401.

Acceptance Criteria

  • Unit tests must cover ≥90% of business logic.
  • All tests must pass using pytest.
  • Linting must pass with flake8 default rules.

Termination Conditions

  • If database schema migration fails, stop execution and report error.
  • If test coverage <90%, do not proceed to final output.

This structure minimizes interpretation and enables the same PRD to be reused across different AI models or providers, improving portability and reproducibility. There is no room for storytelling; the focus is on verifiable facts.
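
The termination conditions in the excerpt can be enforced as a simple gate the agent must pass before producing final output. The function below is an illustrative sketch, not part of any real framework:

```python
def termination_gate(migration_ok: bool, coverage: float) -> str:
    """Decide whether the agent may proceed, per the contract's termination conditions."""
    if not migration_ok:
        # Per the contract: stop execution and report, do not improvise a fix.
        return "stop: database schema migration failed"
    if coverage < 0.90:
        return "stop: test coverage below 90%"
    return "proceed"
```

An agent that consults such a gate stops and escalates instead of guessing, which is exactly the behavior the contract demands.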


Model- and Provider-Independence

One motivation for treating the PRD as a contract is to avoid lock-in to a single AI model or vendor. Today’s agentic systems vary significantly in planning depth, tool usage, and error-recovery strategies; with a contract-style PRD, the same specification can be re-run on multiple models and the outcomes compared.

By externalizing intent into a structured PRD:

  • The PRD defines behavior, not the model.
  • Different agents can be evaluated against the same acceptance criteria.
  • Organizations can swap models without rewriting product intent.

This mirrors how open standards enable multiple implementations while preserving interoperability.
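
Evaluating different agents against the same acceptance criteria can be as simple as scoring each output with the same checks. A hedged sketch with made-up model names and toy checks:

```python
# Hypothetical harness: score each provider's output against the same
# acceptance checks. Model names and outputs are invented for illustration.
def score(output, checks):
    """Fraction of acceptance checks the output passes."""
    return sum(1 for check in checks if check(output)) / len(checks)

checks = [
    lambda out: "POST /todos" in out,   # required endpoint present
    lambda out: "jwt" in out.lower(),   # JWT authentication mentioned
]

outputs = {
    "model_a": "POST /todos with JWT auth",
    "model_b": "GET /items only",
}
results = {name: score(out, checks) for name, out in outputs.items()}
```

The model becomes interchangeable; the contract and its checks are what stay fixed.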

Implications for Product and Business Stakeholders

Treating the PRD as a contract shifts effort: more time is spent defining constraints and success criteria upfront, but less time is lost correcting misaligned outputs. For business stakeholders, the PRD becomes auditable evidence of intent, a record of what the AI was instructed to do.

This is especially relevant for governance and risk management. Regulators increasingly emphasize traceability and accountability in AI systems, both in what data is fed in and in which model it has been run on. A PRD-as-contract provides a clear artifact linking business intent to AI action.


Brief Notes on Risk, Ethics, and Governance

This is by no means a full review, but a few items that came up while writing:

  1. Auditability: A structured PRD enables post-hoc analysis of whether failures stemmed from bad instructions or bad execution. It gives a clearer path to why a decision was taken.
  2. Liability: Explicit constraints reduce ambiguity about responsibility when AI systems cause harm, and extra or new constraints can be added along the way.
  3. Human override: PRDs should specify escalation conditions under which human approval or a manual review is required before the model may continue.

These align with widely cited AI governance principles emphasizing human oversight and bounded autonomy.

Conclusion

As agentic AI systems move from experimental tools to production actors, the PRD must evolve as one of the main tools for product development. No longer just a communication aid between humans, it becomes a contract that defines, constrains, and verifies autonomous behavior. By prioritizing determinism, explicit constraints, and measurable acceptance criteria, organizations can create PRDs that are portable across AI models, safer to deploy, and better aligned with business goals.

In this framing, the PRD is not diminished by AI; it becomes more important than ever.

“Without data, you’re just another person with an opinion.”
– W. Edwards Deming

In the fast-paced world of product development, everyone, from engineers to executives, has an opinion. But as any experienced product manager (PM) knows, opinions can be loud, persuasive, and dangerously wrong. That’s why great PMs embrace one fundamental truth: data beats opinions.

The Battle Between Gut and Evidence

Product management is a role full of ambiguity. You’re constantly making decisions under uncertainty: Which features to prioritize? Which market to target? What design will convert better?

It’s tempting to rely on instincts, stakeholder preferences, or the HiPPO (Highest Paid Person’s Opinion). But that’s a risky path. Opinions are often shaped by biases, incomplete context, or outdated information.

For example, a CEO may insist that adding a chatbot will increase user engagement because a competitor has one. But unless data supports that belief, like a clear user need, conversion metrics, or usability feedback, it’s just an opinion.

In contrast, data brings objectivity. It provides a shared truth that teams can rally around. It doesn’t mean feelings and vision are irrelevant (they’re essential), but they should be validated through evidence.

Real-World Example: Airbnb

Consider Airbnb in its early days. Founders Brian Chesky and Joe Gebbia believed professional photography of listings would boost bookings. Investors were skeptical, it seemed expensive and hard to scale. But instead of arguing, they ran an experiment: they hired a few photographers in New York to take professional photos of homes. The result? Listings with high-quality photos saw 2x–3x more bookings.

Armed with data, Airbnb rolled out the program. What started as a hunch became one of their most successful early growth strategies, because it was tested, measured, and backed by real user behavior.

Types of Data That Drive Decisions

Effective product decisions are powered by both quantitative and qualitative data. Here’s how they play distinct but complementary roles:

1. Quantitative Data

Numbers that scale, used to validate patterns.

  • Analytics: Google Analytics, Mixpanel, Amplitude
  • A/B Testing: Comparing feature variants (e.g., new button design vs. old)
  • User metrics: Retention, churn, NPS, conversion rate

Example: Dropbox used A/B testing extensively to optimize its onboarding. By tweaking messaging and signup flows based on user drop-off data, it significantly increased activation rates (Source).
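
Whether an A/B difference is real or just noise can be checked with a standard two-proportion z-test. A minimal sketch with illustrative numbers (not Dropbox's actual data):

```python
from math import sqrt

def ab_z_score(conv_a, n_a, conv_b, n_b):
    """Two-proportion z-test: how many standard errors apart are the conversion rates?"""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p = (conv_a + conv_b) / (n_a + n_b)           # pooled rate under the null hypothesis
    se = sqrt(p * (1 - p) * (1 / n_a + 1 / n_b))  # standard error of the difference
    return (p_b - p_a) / se

# |z| > 1.96 corresponds to roughly 95% confidence that the difference is real
```

For example, 100/1000 conversions on variant A versus 150/1000 on variant B yields a z-score above 1.96, so the improvement is unlikely to be chance.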

2. Qualitative Data

User stories, motivations, and pain points. Often explains the “why” behind the numbers.

  • User interviews
  • Support tickets
  • Usability tests
  • Surveys

Example: Intercom used qualitative feedback to uncover that users weren’t confused by the interface itself but by unclear onboarding expectations. This insight wouldn’t come from metrics alone.

The Dangers of Being Opinion-Driven

  1. Feature bloat: Without data validation, teams build features based on assumptions. This leads to complex products that don’t solve real problems.
  2. Wasted resources: If you spend months building something nobody uses, that’s not just lost time, it’s opportunity cost. You could’ve been solving something your users actually needed.
  3. Team misalignment: Opinions create silos. Data creates alignment. When teams debate based on data, the conversation becomes collaborative instead of confrontational.

Building a Data-Driven Culture

Being data-driven is a mindset, not just a toolset. Here’s how product managers can cultivate it:

Ask Questions First

Instead of jumping to solutions, PMs should ask:

  • What problem are we solving?
  • How do we know it’s a problem?
  • What does success look like?

Set Measurable Goals

Use OKRs (Objectives and Key Results) or KPIs. A feature without a success metric is a red flag.

Validate Early and Often

Use MVPs, prototypes, fake door tests, and user interviews. Dropbox famously launched with a demo video instead of a full product, to validate demand (Source).

Democratize Data Access

Empower teams with dashboards and self-serve tools. Don’t let data become the domain of analysts only.

Balance Data with Judgment

Data isn’t perfect. It can be incomplete, misinterpreted, or biased. Great PMs combine data with intuition, then validate again. As Jeff Bezos puts it: “We are stubborn on vision. We are flexible on details.”

What If Data Conflicts With Opinion?

This happens often. A powerful stakeholder may push a feature that data doesn’t support. Here’s how to handle it:

  1. Acknowledge their perspective.
  2. Show the data objectively, use visuals.
  3. Suggest a test or experiment to evaluate the idea.
  4. Frame the risk: “If we spend 3 weeks here, we’re not working on X.”

When conversations are rooted in data, they become less personal and more productive.


In Summary

Data beats opinions, not because opinions are worthless, but because decisions built on evidence drive better outcomes. In product management, this means prioritizing features users need, building experiences they love, and creating a shared language for teams to move forward.

It’s not about removing all intuition; it’s about validating hunches through experimentation. That’s how you build better products, faster, and with far less friction.

As Peter Drucker said, “What gets measured, gets managed.” And in product management, what gets measured gets built right.

I came across this gem of a movie clip (from the movie The Pentagon Wars) that in a nutshell explains product management regardless of market or industry. The challenges and problems are always the same, and this clip pinpoints that flawlessly.

Naturally, this is exaggerated to prove a point, but I still believe the essence of the clip holds true. The movie itself is also said to depict the actual development of the vehicle, though I cannot vouch for how accurate that is.

Let’s break this down and see what parts we can learn from.

The stakeholders

The stakeholders of the Bradley Fighting Vehicle had many different views on what the vehicle was to be used for. The order was for a personnel carrier, but instead of placing trust in the ones developing the vehicle, the stakeholders began injecting ideas into development to show rank.

Even though a stakeholder outranks a specialist, this does not mean that the stakeholder has more information or knowledge about the product being developed.

My five cents from reality: this is very common. Often it concerns smaller things and questions on a design that should be touched neither by the stakeholder nor the product manager. The details can be so tiny that the best value comes from letting the implementation team solve them while implementing.

Stakeholders and product managers must define the vision and the large strokes of what is to be built. Details are to be dealt with by specialists.

The product manager

The product manager of the movie clip, Colonel Smith, did not stand strong against the stakeholders and instead let the product bloat and lose usefulness. Although he had the knowledge to stop what was going on, he accepted the changes to avoid conflict and built a worse product.

Conflicts are the constant elephant in the room if you work with product development. Instead of trying to avoid them, the product manager must learn how to embrace and ride the elephants.

Even when Colonel Smith raised the broken budget, the lack of presentation let the comment pass by all stakeholders. Had Colonel Smith been a project manager, he would have screamed from the top of his lungs that the budget broke.

One of the three pillars has fallen, stop what we’re doing and regain control!

But again, to avoid the conflict, Colonel Smith accepted the change and continued to build a product that was one step further away from the goal.

The outcome

What was planned as a personnel carrier, quickly in and quickly out with troops, became something completely different. Instead of a personnel carrier with large space for troops, it became a tank with a high profile, little ammunition, and space for only a few troops. On top of this, it was hard to operate.

Stakeholders controlled the development, not the vision. The product manager controlled nothing, forced specialists to cut corners, and ignored their feedback.

The product created was a consensus product with no clear scope and no purpose.

Could this be avoided?

Transparency and consequences could have helped manage and stop the scope creep. New requirements should always come with an updated consequence list, and consequences are best measured along the three pillars of Cost, Scope, and Time.

If every addition had instead been measured by what value it brings to the product and what consequence it has for the result, it would have been easier to discuss and resolve the conflict. The stakeholders would also have had the tools to understand what the changes meant and to act on them.