May 27, 2026

Where we're looking to invest: Rebuilding the Dev Stack for the AI Era

Team Cowboy

GitHub was built for a world where humans write code. That world is ending fast.

Google announced last month that 75% of all new code at the company is now AI-generated. And per the 2025 Stack Overflow Developer Survey, 84% of developers are now using or planning to use AI tools.

The engineering role is changing massively, yet the tools engineers use to manage code (git, PRs, code review, documentation) have not changed. And cracks are starting to show. Mitchell Hashimoto, founder of HashiCorp and GitHub user #1299, announced he's moving his project off GitHub after 18 years of daily use. His frustration is about reliability, not AI, but it's a sign that the foundation of software development is ready to be rebuilt. AI authorship is the forcing function.

We see a big window for new engineering tools built for the age of AI. And we're actively looking to back teams reimagining tools for this new era.

What's actually broken

Git was designed around a simple mental model: a human made a decision, wrote some code, and left a commit message explaining why. The whole code review culture that grew up around it (PR comments, blame annotations, changelog discipline) assumes there's a person you can, if needed, tap on the shoulder and ask "why did you do it this way?"

Fewer humans in the loop. The git model falls apart when the author is Claude Code, Codex, Cursor, or a custom agent running in CI. The code still lands in a PR, passes linting, and might even pass code review, but a bunch of things that used to be implicit or discoverable are now gone.

The prompt disappears. The richest artifact of intent for AI-generated code is the prompt that produced it: what the engineer asked for, what constraints they specified, what context they gave the model. Today, that record evaporates the moment the session ends. Six months later, when someone needs to understand why a particular architectural decision was made, there's nothing to go back to. Commit messages from agents tell you nothing you didn't already know ("feat: implement user authentication module"), leaving you with a diff, a vague ticket link, and, if you're lucky, an AI-slop README. And it's not just prompts that disappear: the accumulated understanding built within a session vanishes too. This compounds on large repos, where agents are forced to work in slices, understanding enough to complete the task at hand but never seeing how their changes interact with the rest of the system. Every new session, whether it's the same engineer, a different one, or another agent, starts cold.

It's hard to tell what's human and what's agent-generated. Most codebases are a mix right now: some files are touched heavily by agents, some by humans, some by both in alternating passes. git blame shows you the human who ran git commit, not whether the code inside it was AI-generated. Some tools try to surface this: Claude Code adds a Co-Authored-By: Claude trailer to commit messages by default, but it doesn't survive squash merges cleanly and gives you commit-level signal at best, not line-level. In practice, once a PR merges, the signal is gone. That matters for debugging, for security review, and for anyone trying to understand a codebase six months from now. Georgia Tech's Vibe Security Radar tracked 56 CVEs attributable to AI-generated code in Q1 2026 alone, more than in all of 2025, and most of those started as code that passed review.
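
To make the "commit-level signal at best" point concrete, here is a minimal sketch of what reading that trailer looks like. The function name and the AGENT_NAMES list are our own illustrative assumptions, not part of git or any agent tool, and the check only sees whatever text survives in the final commit message.

```python
import re

# Matches a trailer line of the form "Co-Authored-By: Name <email>" (case-insensitive).
TRAILER_RE = re.compile(
    r"^co-authored-by:[ \t]*(?P<name>[^<\n]+?)[ \t]*(?:<[^>\n]*>)?[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)

# Hypothetical list of agent names to look for; adjust to the tools a team uses.
AGENT_NAMES = ("claude", "codex", "cursor")


def agent_coauthors(commit_message: str) -> list[str]:
    """Return co-author names in a commit message that look like coding agents."""
    names = [m.group("name").strip() for m in TRAILER_RE.finditer(commit_message)]
    return [n for n in names if any(agent in n.lower() for agent in AGENT_NAMES)]


if __name__ == "__main__":
    message = (
        "feat: implement user authentication module\n\n"
        "Co-Authored-By: Claude <agent@example.com>\n"
    )
    print(agent_coauthors(message))
```

Even when this works, it answers "did an agent touch this commit?" and nothing more. If the trailer is dropped from a squashed commit message, the function returns an empty list, and it never says which lines the agent wrote.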

Developers don't yet trust the output. The Stack Overflow 2025 survey surfaced something telling: 46% of developers actively distrust the accuracy of AI tools, while only 33% trust them. Trust dropped 11 percentage points year over year despite adoption climbing. 45% said debugging AI-generated code takes longer than writing it themselves. The code is shipping faster, but the confidence in it is not keeping up.

The community layer is breaking too. GitHub wasn't just a place to store code. Issues, PRs, and commit history were how open source projects communicated and onboarded contributors. The social layer had human authors who could be questioned and would respond. AI is flooding GitHub with noise: issues filed by agents that don't understand the codebase, PRs with no human context, commit histories that are technically accurate yet semantically empty. Maintainers are reporting that triage is becoming unmanageable. The signal-to-noise ratio is deteriorating and we don't have tooling to fix it.

The CLAUDE.md problem

Teams have started working around the context problem on their own. Poke around any repo where engineers are heavy Claude Code or Codex users and you'll find files like CLAUDE.md, AGENTS.md, and AI_CONTEXT.md: basically READMEs written for agents. They hold rules such as: don't touch the legacy auth module, always use the internal HTTP client, here's the migration pattern, here are the five things you need to know before touching this service.

These files work. When they're accurate, they meaningfully improve agent output. The problem is there's no tooling to keep them in sync with the actual codebase. So they drift. Decisions get made that aren't reflected in the context files. Services get deprecated but the docs still reference them. New patterns get established that agents don't know about. Nobody updates the file because it's not any one person's job.
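
One narrow slice of that drift is mechanically detectable: a context file that still points at paths which no longer exist. The sketch below is illustrative only. It assumes the file writes repo paths in backticks (our assumption, not a convention of CLAUDE.md or AGENTS.md), and stale_references is a hypothetical helper name.

```python
import re
from pathlib import PurePosixPath

# Assumption: repo paths appear in backticks and contain at least one "/",
# e.g. `lib/http/client.py` or `legacy/auth/`.
PATH_RE = re.compile(r"`([\w./-]+/[\w./-]+)`")


def stale_references(context_text: str, tracked_paths: set[str]) -> list[str]:
    """Return backticked paths that match no tracked file or parent directory."""
    known = set(tracked_paths)
    for path in tracked_paths:
        # Directories are implied by the files beneath them.
        known.update(str(parent) for parent in PurePosixPath(path).parents)

    stale: list[str] = []
    for ref in PATH_RE.findall(context_text):
        normalized = ref.rstrip("/")
        if normalized not in known and normalized not in stale:
            stale.append(normalized)
    return stale


if __name__ == "__main__":
    context = "Always use `lib/http/client.py`. Never touch `legacy/auth/`."
    tracked = {"lib/http/client.py", "services/billing/api.py"}
    print(stale_references(context, tracked))
```

In practice the tracked set could come from the output of git ls-files. A check like this catches dead paths only. It says nothing about rules that are still syntactically valid but no longer true, which is the harder part of the problem.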

At scale, the context layer that could make agents productive becomes a liability: confidently wrong instructions are baked into every session. And agents operating off stale context will make an already shaky situation worse.

Rebuilding the dev stack: Opportunities

The most ambitious version to address these issues could be a new GitHub: a platform built from the ground up, assuming a significant portion of code is agent-generated, where provenance, session context, and attribution are first-class primitives. There's a big opportunity for someone to start clean.

The four areas we're most interested in:

Prompt and session capture as a platform primitive. The generation context for AI-written code (the prompt, the model, and what was tried and rejected) needs to live someplace durable and linked to the resulting commits. This is the foundational layer everything builds on: you can't do meaningful attribution, review, or audit without it. We're looking for teams building this as org-wide infrastructure, not as a feature inside a single agent tool.
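
As a rough illustration of the shape such a record might take, here is a minimal sketch. Every name in it (SessionRecord, its fields, record_id) is hypothetical, and a real system would need to handle storage, access control, and redaction of sensitive prompt content.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class SessionRecord:
    """Hypothetical record linking generation context to the commit it produced."""

    commit_sha: str
    model: str
    prompt: str
    rejected_attempts: tuple[str, ...] = ()

    def to_json(self) -> str:
        # Sorted keys keep the serialization stable across runs.
        return json.dumps(asdict(self), sort_keys=True)

    def record_id(self) -> str:
        # Content-derived ID: identical records always map to the same key.
        return hashlib.sha256(self.to_json().encode("utf-8")).hexdigest()


if __name__ == "__main__":
    record = SessionRecord(
        commit_sha="<commit sha>",
        model="<model name>",
        prompt="Add rate limiting to the login endpoint; do not touch legacy auth.",
        rejected_attempts=("in-memory counter per process",),
    )
    print(record.record_id())
    print(record.to_json())
```

Keying records by commit is the simplest possible link. It inherits the same weakness as commit trailers when history is squashed or rewritten, which is part of why we think this belongs in the platform and not in a sidecar file.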

Code review rebuilt for agent-generated diffs. The existing PR review flow wasn't designed for the volume or the nature of AI-generated code. A platform that knows which hunks were agent-generated, what prompt context produced them, and surfaces that natively in the review experience would let engineering teams apply the right scrutiny without slowing everything down. Developers' lack of trust in AI output, even while they use it daily, is as much a review tooling problem as a model quality problem.
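
To show what hunk-level awareness could mean in practice, here is a small sketch that computes the share of agent-written lines in a diff hunk. It assumes line-level provenance already exists as a mapping from file path to agent-written line numbers, which is exactly the data today's tooling does not record. Hunk and agent_share are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hunk:
    """A diff hunk described by its position in the new version of a file."""

    path: str
    start: int   # first new-file line covered by the hunk (1-indexed)
    length: int  # number of new-file lines in the hunk


def agent_share(hunk: Hunk, agent_lines: dict[str, set[int]]) -> float:
    """Return the fraction of the hunk's lines marked as agent-generated."""
    if hunk.length <= 0:
        return 0.0
    marked = agent_lines.get(hunk.path, set())
    covered = range(hunk.start, hunk.start + hunk.length)
    return sum(1 for line in covered if line in marked) / hunk.length


if __name__ == "__main__":
    provenance = {"auth.py": {10, 11, 12, 50}}
    print(agent_share(Hunk("auth.py", start=10, length=4), provenance))
```

A review tool could use a number like this to decide where to surface prompt context or ask for a second reviewer. The threshold and the policy around it are team decisions, not something the sketch tries to answer.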

Context management at the repo level. The CLAUDE.md pattern works but doesn't scale. The right solution treats agent context as a first-class platform artifact: versioned, synchronized with the codebase, available at the start of every session without manual maintenance. For a CTO running a large repo, this is the difference between agents that compound value over time and agents that plateau because they're always starting cold.

Community and contribution infrastructure. As AI floods open source with noise, maintainers need tooling to manage contribution quality at scale, distinguish signal from AI-generated churn, and preserve the human communication layer for collaborative software development.

These are just four areas we're looking at. The space is early enough that we'd love to meet founders who see things from a different angle.

If you're building in the area, we'd love to meet you! Reach us at rohan@cowboy.vc
