Embedding “Senior Thinking Processes” into AI Agents: How Open-Source “agent-skills” Unlocks Autonomous Development

The evolution of AI coding assistants has dramatically transformed the way we develop software. However, many engineers still struggle with unpredictable, “uncontrolled” AI behavior: code quality varies wildly depending on how prompts are structured, the AI starts writing code unprompted and breaks existing implementations, or it marks a task as complete without writing any tests.

The root cause of this issue is not a lack of performance in AI models. Rather, it is because the “development process framework (workflow)” has not been shared with the AI. Just as hikers without a map easily get lost, an AI without a clear process can quickly lose its way in a complex codebase.

A new project has emerged that takes direct aim at this challenge. Released by Addy Osmani, an Engineering Director at Google, it is called “agent-skills.”

This project is an open-source framework that installs the best practices of top-tier senior engineers (design, planning, implementation, testing, and review) directly into AI agents. With it, your AI assistant evolves from a simple “code generation tool” into a dependable senior partner that thinks autonomously and delivers robust code.

In this article, we will thoroughly explain the basic concepts of agent-skills, the mechanisms that enable autonomous development, and concrete steps to integrate it into major tools like Cursor and Claude Code.

Tech Watch Perspective:
Until now, AI utilization (such as Cursor's Rules) mostly relied on static instruction templates saying, "Write code like this." However, the true innovation of "agent-skills" lies in redefining the entire development process as a "dynamic pipeline with state transitions."
Through a series of commands from /spec to /ship, the AI accurately recognizes which phase (requirements definition, task breakdown, implementation, verification) it is currently in. This virtually eliminates the greatest risk of AI development: the tendency to "write everything at once and self-destruct." It is truly a new "software engineering infrastructure" for the AI agent era.
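To make the idea of a "dynamic pipeline with state transitions" concrete, here is a minimal sketch in Python. It is not how agent-skills itself is implemented (the skills are Markdown instructions interpreted by the model). The class names Phase, Pipeline, and PhaseError, and the strictly linear ordering, are assumptions made purely for illustration.

```python
from enum import Enum


class Phase(Enum):
    """Phases named after the slash commands described in this article."""
    SPEC = "/spec"
    PLAN = "/plan"
    BUILD = "/build"
    TEST = "/test"
    REVIEW = "/review"
    SIMPLIFY = "/code-simplify"
    SHIP = "/ship"


# Assumption for this sketch: phases run strictly in definition order.
ORDER = list(Phase)


class PhaseError(RuntimeError):
    """Raised when a command is issued out of order."""


class Pipeline:
    def __init__(self) -> None:
        self._next = 0  # index into ORDER of the phase allowed to run next

    @property
    def current(self) -> "Phase | None":
        """The phase that may run next, or None once everything has shipped."""
        return ORDER[self._next] if self._next < len(ORDER) else None

    def can_run(self, command: str) -> bool:
        return self.current is not None and self.current.value == command

    def run(self, command: str) -> Phase:
        """Accept the command only if it matches the current phase."""
        if not self.can_run(command):
            expected = self.current.value if self.current else "nothing (already shipped)"
            raise PhaseError(f"{command} rejected; expected {expected}")
        phase = ORDER[self._next]
        self._next += 1
        return phase
```

In this sketch, calling run("/build") on a fresh Pipeline raises PhaseError because /spec and /plan have not run yet, which mirrors the "no code before a spec" behavior described above.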

🛠 The 7 Slash Commands Provided by “agent-skills”

The core of agent-skills lies in its “7 commands,” which are intuitively mapped to the software development lifecycle. By making the AI aware of these clear phases, it becomes possible to guarantee professional-grade output while minimizing human intervention.

| Command | Action to Take | Core Principle of Development |
| --- | --- | --- |
| /spec | Strictly define what to build | Solidify specifications before writing code |
| /plan | Design implementation steps | Break down tasks into minimal, atomic steps |
| /build | Implement incrementally | Thoroughly build out one slice (feature) at a time |
| /test | Verify behavior and functionality | Test code is the proof that the software works |
| /review | Self-review before merging | Continuously improve code health |
| /code-simplify | Refactor the code | Prioritize “clear code anyone can understand” over clever code |
| /ship | Release to production | Achieve high-frequency, safe delivery |

The Power of the Ultimate Autonomous Mode: /build auto

Among these commands, /build auto is exceptionally powerful in real-world development.

Once the specification (/spec) and plan (/plan) are in place, this command moves the AI agent into the task execution phase. The agent asks the user once to confirm the plan, then runs a continuous autonomous loop: execute a task, run the tests, debug, and commit.

This is, so to speak, a “Level 3 autonomous driving” environment for development. The AI writes its own tests, verifies behavior, and autonomously debugs if an error occurs. Humans only need to provide feedback when the AI pauses to ask for critical decisions. This dramatically reduces the cognitive load and waiting times previously associated with “prompting and waiting for every single step” in AI development.
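The loop described above can be sketched roughly as follows. This is an illustrative model only, not the project's actual code: Task, build_auto, the callbacks, and the max_debug_attempts cap are all hypothetical names and parameters introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Task:
    name: str
    implement: Callable[[], None]  # hypothetical: applies the code change
    run_tests: Callable[[], bool]  # hypothetical: returns True when tests pass
    debug: Callable[[], None]      # hypothetical: attempts a fix after a failure


def build_auto(
    tasks: Iterable[Task],
    confirm: Callable[[], bool],
    commit: Callable[[str], None],
    max_debug_attempts: int = 3,
) -> list[str]:
    """Run tasks in order and return the names of the tasks that were committed."""
    # The user is asked exactly once, before any task starts.
    if not confirm():
        return []

    committed: list[str] = []
    for task in tasks:
        task.implement()
        attempts = 0
        while not task.run_tests():
            if attempts == max_debug_attempts:
                # Stuck: stop here and hand control back to the human.
                return committed
            task.debug()
            attempts += 1
        commit(task.name)
        committed.append(task.name)
    return committed
```

The cap on debug attempts is a design choice of this sketch. It models the point where the agent pauses and asks a human for a decision instead of retrying indefinitely.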


🔄 Quick Setup for Major Tools (Cursor, Claude Code, etc.)

agent-skills boasts excellent portability. You can instantly integrate it into various existing AI assistants and IDEs simply by placing markdown files.

1. Claude Code

Claude Code runs in the CLI, and you can install agent-skills directly through its official plugin ecosystem.

/plugin marketplace add addyosmani/agent-skills
/plugin install agent-skills@addy-agent-skills

Note: If GitHub SSH key authentication errors occur in your environment, you can reliably complete the installation using the following HTTPS commands instead:

/plugin marketplace add https://github.com/addyosmani/agent-skills.git
/plugin install agent-skills@addy-agent-skills

2. Cursor

To apply the skills across a project, create a .cursor/rules/ directory in the project root and copy any SKILL.md (or all skill files) from the repository’s skills/ directory into it. The model behind Cursor will then interpret these “skills” and respond to the commands.
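If you prefer to script the copy step, a small helper along these lines could work. It is a sketch under assumptions: the function name install_skills is made up here, and the layout skills/<skill-name>/SKILL.md is assumed rather than confirmed, so check the repository before relying on it.

```python
from pathlib import Path
import shutil


def install_skills(repo_dir: Path, project_dir: Path) -> list[Path]:
    """Copy every SKILL.md under repo_dir/skills into project_dir/.cursor/rules."""
    rules_dir = project_dir / ".cursor" / "rules"
    rules_dir.mkdir(parents=True, exist_ok=True)

    copied: list[Path] = []
    for skill_file in sorted((repo_dir / "skills").rglob("SKILL.md")):
        # Assumed layout: skills/<skill-name>/SKILL.md. Each copy is named
        # after its parent folder so the files do not overwrite one another.
        target = rules_dir / f"{skill_file.parent.name}.md"
        shutil.copyfile(skill_file, target)
        copied.append(target)
    return copied
```

For example, install_skills(Path("agent-skills"), Path(".")) would copy the skills from a local clone named agent-skills into the current project.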

3. GitHub Copilot / Other Tools Supporting System Prompts

Add the agent definition files stored in the repository’s agents/ directory to your .github/copilot-instructions.md, or copy and paste them into the custom system prompt field provided by your tool. With just this step, the agent-skills code of conduct is embedded into the model’s base context.
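The append step can also be automated. The following is a minimal sketch that assumes the agent definitions are Markdown files sitting directly in agents/. The function name and the HTML comment marker (used so that re-running the script does not duplicate content) are inventions of this example, not part of agent-skills.

```python
from pathlib import Path


def append_agent_definitions(repo_dir: Path, project_dir: Path) -> list[str]:
    """Append each agents/*.md file to .github/copilot-instructions.md once."""
    target = project_dir / ".github" / "copilot-instructions.md"
    target.parent.mkdir(parents=True, exist_ok=True)
    content = target.read_text(encoding="utf-8") if target.exists() else ""

    added: list[str] = []
    for agent_file in sorted((repo_dir / "agents").glob("*.md")):
        # Hypothetical marker: lets a second run detect what is already present.
        marker = f"<!-- agent-skills: {agent_file.name} -->"
        if marker in content:
            continue
        body = agent_file.read_text(encoding="utf-8").strip()
        content += f"\n{marker}\n{body}\n"
        added.append(agent_file.name)

    target.write_text(content, encoding="utf-8")
    return added
```

Existing instructions in the file are preserved, because new definitions are only appended after them.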


⚖️ Differences from Competitors and Standard System Prompts

How does agent-skills compare to static system prompts (such as Cursor’s .cursorrules) that many developers already use?

| Comparison Item | Standard System Prompts (General Rules) | agent-skills (This Project) |
| --- | --- | --- |
| Behavioral Guidance | Abstract commands like “write clean code” | Procedural steps like test-driven development and atomic division |
| Process Enforcement | None (starts rewriting code immediately) | Enforced phase management via /spec, etc. |
| Behavior on Error | Loops on the same error, wasting API tokens | Identifies the issue, writes tests autonomously, and debugs |
| Adoption Cost | Low (just paste the prompt) | Extremely low (includes ready-to-use config files for each editor) |

While traditional Rules define the “style of the generated code (elegance and writing conventions),” agent-skills defines the “process leading up to the code output (behavior and decision-making).” This difference directly affects both the robustness of the final deliverables and how many API tokens are wasted along the way.


⚠️ Caveats and Pitfalls of Adoption

No matter how powerful a tool is, there is no silver bullet. When actually implementing it, you need to keep the following two points in mind:

  1. Surge in API Token Consumption (Cost Management)
     Autonomous execution modes like /build auto call APIs in rapid succession because no human is in the loop. As the AI loops through test creation, execution, and debugging on its own, it can consume a massive number of tokens in a short period. To prevent unexpectedly high bills, it is wise to set budget limits that match the scale of your development, or to restrict AI tool usage in advance.

  2. A Robust Local Testing Environment is a Prerequisite
     agent-skills defines an implementation as complete when its tests pass. Therefore, if the project lacks an automated testing environment (Jest, Vitest, Pytest, etc.) or has an architecture where tests are extremely difficult to write, the potential of this system will be cut in half. As a first step of adoption, we highly recommend setting up an environment where basic unit tests can be executed.
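For the first caveat, one simple safeguard is a hard token budget that an autonomous loop checks before each model call. The sketch below is a generic illustration. TokenBudget and BudgetExceeded are hypothetical names, not features of agent-skills or of any particular provider, and real token counts would come from your API's usage data.

```python
class BudgetExceeded(RuntimeError):
    """Raised when a charge would push usage past the configured limit."""


class TokenBudget:
    def __init__(self, limit: int) -> None:
        self.limit = limit  # maximum tokens allowed for this session
        self.used = 0       # tokens consumed so far

    def charge(self, tokens: int) -> int:
        """Record usage and return the remaining budget, or stop the loop."""
        if self.used + tokens > self.limit:
            raise BudgetExceeded(
                f"charge of {tokens} would exceed the limit of {self.limit} "
                f"({self.used} already used)"
            )
        self.used += tokens
        return self.limit - self.used
```

An autonomous loop would call charge() with the token count of each request and treat BudgetExceeded as a signal to pause and ask a human, instead of continuing to spend.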


🙋‍♂️ Frequently Asked Questions (FAQ)

Q1. Is there any benefit to adopting this in solo development or small-scale projects?
A1. Yes, there is a massive benefit. When developing alone, it is easy to skip the design and testing phases. By introducing agent-skills, the AI acts as a “dedicated senior tech lead,” enforcing compliance with professional development processes. As a result, you can maintain a clean codebase with minimal technical debt.

Q2. Does it work if I issue instructions in Japanese?
A2. Yes, it works without any issues. Although the skill definitions and internal commands (like /spec) are written in English, modern LLMs like Claude 3.5 Sonnet and GPT-4o can interpret the English constraints and processes and apply them flexibly when interacting with users in Japanese. Developers can simply issue instructions in their usual Japanese and leave the process management to the AI.

Q3. Can it be used in emerging tools like Windsurf or Gemini CLI?
A3. Yes, it is fully compatible. The agent-skills repository contains detailed documentation on setup procedures for major agent tools, including rule configurations for Windsurf and registration methods for Gemini CLI’s gemini skills feature.


🎯 Conclusion: Take Your Development to the Next Stage

The early phase of “getting AI to write code” is already coming to an end. The next era demands a higher-level approach: “installing first-class development processes into AI to let it autonomously create high-quality software.”

Addy Osmani’s “agent-skills” is a powerful framework that embodies this future and will serve as a trigger to elevate software engineering productivity to the next dimension. We encourage you to introduce this “senior thinking process” to your development workflow and experience a new level of autonomous development.


This article is also available in Japanese.