Ericsson's AI-native software programme: principles, deployment, and open questions


In May 2026, four Ericsson engineers (Britto, Palmgren, Saini, and Ohlin) released a paper titled The AI-Native Large-Scale Agile Software Development Manifesto. The text is short and structured as a deliberate echo of the 2001 Agile Manifesto, with five value pairs, six grounding principles, and a vision statement (Britto et al., 2026). The substance of what the company is doing under the manifesto is larger than the preprint itself. Across software engineering at Ericsson, more than 1,500 engineers are now actively using a coordinated stack of AI personas, agents, and skills. Software-specification work has reportedly moved from weeks to hours, and testing and troubleshooting cycles from hours-to-weeks down to minutes-to-days (Britto, 2026). The programme is several years old, has its origins in Ericsson's earlier work on AI-native telecommunications architecture (Iovene et al., 2023), and is one of the largest publicly described AI-native software-engineering deployments to date.

For an engineering leader, the practical question is which parts of the programme already rest on solid evidence, which parts extend past the evidence, and what the case tells us about industrial AI-native software engineering more broadly. The rest of this issue takes the programme as a single object of study and places it against the academic literature on large-scale agile and AI-assisted software engineering.


Where the programme came from and what it looks like

The term "AI-native" entered Ericsson's vocabulary in 2023 in the context of telecommunications networks (Iovene et al., 2023). It was defined as a system whose architecture embeds trustworthy AI capabilities through design, deployment, operation, and maintenance, rather than treating AI as an add-on. A maturity matrix accompanied the definition, with six levels (0 through 5) across five dimensions: architecture, collaboration, data ingestion and processing, model lifecycle and security, and operations. The rows were explicitly independent, allowing different dimensions of a system to mature at different paces, and the matrix was presented as a measurement instrument rather than as a uniform target.
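One way to picture a matrix with independent rows is as a per-dimension record rather than a single score. The sketch below is illustrative only: the class and method names are hypothetical, and only the five dimension names and the 0 to 5 range come from the description above.

```python
from dataclasses import dataclass, field

# Dimension names and the 0-5 range follow the text above.
# Everything else (class name, methods) is a hypothetical illustration.
DIMENSIONS = (
    "architecture",
    "collaboration",
    "data ingestion and processing",
    "model lifecycle and security",
    "operations",
)
MIN_LEVEL, MAX_LEVEL = 0, 5


@dataclass
class MaturityProfile:
    """Per-dimension maturity levels for one system."""

    levels: dict[str, int] = field(
        default_factory=lambda: {d: MIN_LEVEL for d in DIMENSIONS}
    )

    def set_level(self, dimension: str, level: int) -> None:
        # Each dimension is updated on its own; no row constrains another.
        if dimension not in self.levels:
            raise KeyError(f"unknown dimension: {dimension}")
        if not MIN_LEVEL <= level <= MAX_LEVEL:
            raise ValueError(f"level must be between {MIN_LEVEL} and {MAX_LEVEL}")
        self.levels[dimension] = level

    def spread(self) -> int:
        """Gap between the most and least mature dimension."""
        return max(self.levels.values()) - min(self.levels.values())
```

Keeping the levels in a mapping, rather than collapsing them into one number, mirrors the point that the matrix is a measurement instrument: a system can sit at level 3 on architecture while remaining at level 0 on operations.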

The 2026 manifesto carries the same conceptual scaffolding into the engineering organisation itself, with software development reorganised around AI agents in the same way that AI-native networks reorganise traffic management. The six manifesto principles can be read as a software-engineering instantiation of the original maturity dimensions: parallel-by-default processes carry the architectural commitment, orchestrated agent workforces carry the collaboration commitment, a unified semantic layer composed of code, architecture, security, and dev-intelligence knowledge graphs carries the data-infrastructure commitment, and verification-first assurance together with reusable blueprints carry the model-lifecycle-and-security and operations commitments. A second maturity model specific to software development (ML 0 through ML 3) operates alongside the manifesto in the company's documentation. ML 0 covers conversational LLM assistance; ML 1 adds role-based AI personas with MCP-mediated tool access; ML 2 adds the knowledge layer that grounds agent reasoning; ML 3 reorganises teams into cross-functional human-AI squads with the knowledge graph functioning as shared memory. Ericsson positions its current programme between ML 2 and ML 3 (Britto, 2026).

The operational record is detailed. The agent workforce is wired through MCP connectors into the tools Ericsson engineers already use: GitLab, JIRA, Confluence, Backstage, Jenkins, Gerrit, SonarQube, Kubernetes and Helm, and the company-internal Open Grok and Kiro CLI. The agent personas correspond to existing engineering roles, with named personas for system architecture, security, verification, UX, DevOps, coding, and bug-report assistance. The reported deployment outcomes are large but self-reported: 1,500+ engineers active, specification time down to hours from weeks, testing cycles down to minutes-to-days from hours-to-weeks. No methodology accompanies these numbers in the public material, so an academic reader should treat them as company-internal indicators rather than as research findings, although they are more specific than the manifesto preprint's prose.

Three pillars are identified as required to reach AI-native operation: change management covering governance and legal framework, technology covering commercial and internally developed tools, and cultural shift covering competence, mindset, and innovation (Britto, 2026). The third pillar is where the manifesto preprint is silent and where the academic literature on large-scale agile transformation has produced the most consistent evidence. The explicit inclusion of cultural shift as a co-equal pillar with technology and governance closes a gap that a manifesto-only reading would leave open.

Against the literature on large-scale agile

The challenges the Ericsson programme is built to address are well-documented in the systematic reviews of large-scale agile. Dikert, Paasivaara, and Lassenius (2016) reviewed 52 publications covering 42 industrial cases and reported 35 distinct challenges across 9 categories, including coordination overheads, loss of autonomy through scaling frameworks, communication barriers in distributed environments, and inconsistent understanding of requirements. Uludağ et al. (2022) extended that picture with a systematic mapping study of 136 publications, adding integration complexity, quality assurance at scale, maintaining an agile mindset during transformation, and tooling fragmentation. The Ericsson programme targets the first cluster through its parallel-by-default process and semantic-layer commitments and the second through verification-first assurance and reusable blueprints. The work is well-grounded in the agile literature, even though the manifesto preprint itself does not engage with the systematic review evidence at length.

The traditional developer writes code. The AI-native developer defines intent. The shift is from execution to direction, from being the hands on the keyboard to being the mind that decides what should be built, provides the context for how it should work, and verifies that the result is correct (Britto et al., 2026).

One strand of the large-scale agile literature is harder to align with the position the manifesto stakes out. Dingsøyr, Moe, Fægri, and Seim (2018) identified role redefinition as a primary site of friction in very large agile programmes, with the implication that any reorganisation depending on engineers performing fundamentally different cognitive work has to invest as much in the human side of the transformation as in the technology side. The cultural-shift pillar acknowledges this, but neither the manifesto nor the public deployment record describes how role conflict is being managed at the 1,500-engineer scale. Career ladders, promotion criteria, and feedback structures designed for production-dominant engineering work are unlikely to translate cleanly to an intent-and-oversight regime, and the deployment outcome data published so far provide no direct evidence on this question.

Behavior-Driven Development, the basis for the verification-first principle, raises a related concern. Irshad, Britto, and Petersen (2021) studied BDD at organisational scale and found that the dominant failure mode was scenario proliferation, contradiction, and overlap as scenarios multiplied across teams. The Ericsson programme delegates scenario management to agents capable of reasoning about scenario hierarchies, which is a plausible mechanism whose precondition is that agent verification quality match or exceed the human review it displaces. No demonstration of that equivalence currently exists in the public record, and the reported reduction in testing and troubleshooting time is consistent with several different underlying realities, only some of which involve genuinely improved defect detection.
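To make the failure mode concrete, here is a minimal sketch of how overlap and contradiction between Given-When-Then scenarios could be flagged mechanically. It is not a description of Ericsson's agents: the `Scenario` structure and `find_conflicts` function are hypothetical, and the matching rule (identical preconditions and trigger) is a deliberately naive assumption that ignores paraphrase and partial overlap.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """A simplified Given-When-Then scenario (hypothetical structure)."""

    team: str
    given: frozenset[str]  # preconditions
    when: str              # triggering action
    then: str              # expected outcome


def find_conflicts(scenarios):
    """Group scenarios by (given, when) and compare their outcomes.

    Same trigger, same outcome      -> overlap (redundant scenario).
    Same trigger, different outcome -> candidate contradiction.
    """
    by_trigger = defaultdict(list)
    for scenario in scenarios:
        by_trigger[(scenario.given, scenario.when)].append(scenario)

    overlaps, contradictions = [], []
    for group in by_trigger.values():
        for i, first in enumerate(group):
            for second in group[i + 1:]:
                if first.then == second.then:
                    overlaps.append((first, second))
                else:
                    contradictions.append((first, second))
    return overlaps, contradictions
```

Exact matching of this kind only catches the easy cases. Scenarios written by different teams rarely share wording, which is why the question of agent verification quality matters: the hard part is recognising that two differently phrased scenarios describe the same behaviour.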

Against the AI4SE literature

The strongest empirical cross-check on the Ericsson programme comes from research on AI adoption in software engineering teams. Russo (2024) surveyed 100 engineers qualitatively and validated the resulting model with 183 quantitative responses through PLS-SEM, identifying compatibility with existing workflows as the dominant driver of generative AI adoption, with perceived usefulness and social pressure exerting weaker effects than the standard technology-acceptance literature would predict. The Ericsson deployment is unusually well-aligned with this finding. The agent workforce is wired into tools the engineers already use, the personas correspond to roles already present in the organisation, and the knowledge graphs are populated from existing artefacts. The manifesto's prose frames the change as a paradigm shift; the deployment under it appears to be a high-compatibility roll-out by design, which is one of the more plausible reasons the reported usage figures are as high as they are.

On cognitive evidence, the case has the most remaining work to do. Two decades of automation-bias research have documented that human reviewers tend to over-trust automated recommendations under cognitive load, with the effect intensifying when verification tools deliver outputs in aggregated or high-confidence formats. AI-assisted programming exhibits the same pattern at a smaller scale, with developers accepting LLM completions whose subtle defects they would have caught while writing the code themselves (Guo, 2023). Moving engineers from production to oversight redistributes cognitive load rather than reducing it, and the public Ericsson record does not yet contain defect-rate or escape-rate data of the kind that would let an external evaluator test whether the redistribution is producing better outcomes overall. The headline productivity numbers describe throughput, not defect detection.
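The distinction between throughput and defect detection can be stated as simple arithmetic. The sketch below assumes the common definition of escape rate as the share of all known defects that were found only after release; the function name and signature are illustrative and are not drawn from the Ericsson material.

```python
def escape_rate(caught_before_release: int, escaped_to_production: int) -> float:
    """Share of all known defects that were found only in production.

    Assumed definition: escaped / (caught + escaped).
    """
    if caught_before_release < 0 or escaped_to_production < 0:
        raise ValueError("defect counts must be non-negative")
    total = caught_before_release + escaped_to_production
    if total == 0:
        raise ValueError("no defects recorded; escape rate is undefined")
    return escaped_to_production / total
```

A shorter testing cycle says nothing about this ratio on its own. Cycle time can fall while the escape rate stays flat, improves, or worsens, which is why the two measures need to be reported together.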

A final point of comparison concerns the structure of the compatibility construct itself. Russo's (2024) model treats compatibility as multi-level, with individual fit as a distinct component alongside technical and organisational fit. The Ericsson programme has invested heavily in technical and organisational compatibility through the integration of the agent workforce into existing tools and roles, and identifies cultural shift as a separate pillar requiring its own investment. Whether the individual-fit dimension has received equivalent attention, and whether engineers across the 1,500-person population are converging on the intent-and-oversight role at compatible rates, are questions the published evidence does not yet answer.

A 30-minute readiness audit for AI-native large-scale agile

The Ericsson programme is operationally testable as a list of organisational properties. The ten-item audit below can be completed in thirty minutes by a team or programme lead and produces a rough estimate of where the current operating model sits relative to the four-stage software-development maturity model the company uses.

1. Identify the three most recent decisions reached in coordination meetings. Determine whether the same decisions could have been made if the relevant context had been queryable in a shared semantic layer instead.

2. Count the active documents over thirty days old that engineers still treat as the source of truth. Each one is a candidate for replacement by version-controlled, machine-readable specifications.

3. Locate the last three defects that escaped to production. For each, determine whether a Given-When-Then scenario, written before implementation, would have caught the failure.

4. List the AI tools currently in use across the team. Mark which operate as isolated assistants (ML 0), as role-based personas (ML 1), or as context-aware agents grounded in a knowledge layer (ML 2).

5. Audit the current verification gates. Note which require human approval as policy and which require human approval only because no automated check yet exists.

6. Examine the role descriptions used in the last three hires. Identify where they emphasise code production and where they emphasise specification authorship and verification.

7. Inspect the architecture documentation. Determine whether it exists as static slides, as a queryable knowledge graph, or as something in between.

8. Track one recent feature from requirement to deployed code. Count the handoffs that occurred and mark which of those handoffs are now automatable through agent personas.

9. Identify a team in the organisation that has invented a successful AI-augmented practice, and determine whether the practice is captured as a reusable blueprint or as tacit knowledge.

10. Ask three engineers to describe their primary cognitive task during a typical day, and compare their answers to the role the manifesto would have them play.

The audit functions as a diagnostic rather than a scorecard. Items the team cannot answer indicate where the current operating model differs from the Ericsson programme's, and roughly where the cost of moving each dimension up the maturity model would fall.
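Item 4 of the audit is the easiest to mechanise. The sketch below tallies a tool inventory against ML 0 through ML 2, under the assumption that the levels are cumulative, so a tool counts as ML 2 only if it is also a role-based persona. The `AITool` fields and the function names are hypothetical, and ML 3 is left out because it describes team organisation rather than a property of an individual tool.

```python
from collections import Counter
from dataclasses import dataclass
from enum import IntEnum


class MaturityLevel(IntEnum):
    ML0 = 0  # isolated conversational assistant
    ML1 = 1  # role-based persona with tool access
    ML2 = 2  # persona grounded in a knowledge layer


@dataclass(frozen=True)
class AITool:
    """One entry in the team's tool inventory (hypothetical fields)."""

    name: str
    role_persona: bool
    knowledge_grounded: bool


def classify(tool: AITool) -> MaturityLevel:
    # Assumption: levels are cumulative, so knowledge grounding without
    # a role-based persona does not lift a tool above ML 0.
    if tool.role_persona and tool.knowledge_grounded:
        return MaturityLevel.ML2
    if tool.role_persona:
        return MaturityLevel.ML1
    return MaturityLevel.ML0


def tally(tools) -> Counter:
    """Count how many tools sit at each level."""
    return Counter(classify(tool) for tool in tools)
```

The tally is a prompt for discussion rather than a score: a team whose tools all sit at ML 0 is looking at a different investment than one whose personas exist but lack a knowledge layer.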

Next moves

For the builder

Spend the next sprint on instrumenting verification rather than production. Treat each completed task as an opportunity to refine the specifications and tests that would let an agent perform the same task next time, and adopt a personal habit of writing the Given-When-Then scenario before the implementation, even when no AI agent is in the loop. The discipline transfers regardless of which platform the organisation eventually adopts. Begin building a private knowledge graph of the systems you maintain, including dependencies, contracts, and invariants. That structure compounds in value across multiple platform changes and remains useful long after any specific tooling decision is revisited.

For the manager

Audit your team's cognitive-load profile across the week, and if the team already spends more than half its time on verification, ensure the tooling and feedback rhythms match that profile rather than a production-dominant one. Re-examine the role descriptions and promotion criteria the team is using, since a career ladder calibrated to lines-of-code throughput will quietly disadvantage engineers who succeed in an intent-and-oversight regime. Establish a recurring forum where engineers surface cases of misplaced trust in AI output and treat those cases as system signals rather than as individual failures. The "Human in Control" governance the manifesto proposes can only function with psychological safety strong enough that engineers can admit when their verification was insufficient.

For the roadmap owner

Treat the semantic-layer investment as a strategic precondition rather than as a downstream tooling choice. The Ericsson programme moved up the maturity model by building knowledge graphs across code, architecture, security, and dev-intelligence, and the rest of the model assumes that foundation. Commission an inventory of which knowledge currently lives in static documents and which lives in queryable, version-controlled structures, and map the gap. The infrastructure carries an upfront cost, and postponing it commits the organisation to incurring it later under less favourable conditions. Plan an explicit timeline for moving the engineering function up the maturity model, with measurable milestones tied to the ratio of intent-authoring time to code-authoring time, and pair the technology investment with the change-management and cultural-shift pillars Ericsson identifies as equally necessary.
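A milestone tied to the ratio of intent-authoring time to code-authoring time needs an agreed definition before it can be tracked. The sketch below shows one possible form. How hours are assigned to each bucket, the `TimeLog` structure, and the target values are all assumptions for illustration, not part of the Ericsson maturity model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimeLog:
    """Hours logged by a team over one period (hypothetical buckets)."""

    intent_hours: float  # specifications, scenarios, review of agent output
    code_hours: float    # hand-written implementation


def intent_to_code_ratio(log: TimeLog) -> float:
    """Ratio of intent-authoring time to code-authoring time."""
    if log.intent_hours < 0 or log.code_hours < 0:
        raise ValueError("hours must be non-negative")
    if log.code_hours == 0:
        # No hand-written code at all: unbounded if any intent work was done.
        return float("inf") if log.intent_hours > 0 else 0.0
    return log.intent_hours / log.code_hours


def meets_milestone(log: TimeLog, target_ratio: float) -> bool:
    """True when the period's ratio reaches the agreed target."""
    return intent_to_code_ratio(log) >= target_ratio
```

For example, a team logging 12 intent hours against 24 coding hours has a ratio of 0.5. It would meet a milestone set at 0.5 but not one set at 1.0.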

Closing thought

The Ericsson programme is worth taking seriously because of its scale and its specificity. A coordinated industrial deployment across more than 1,500 engineers, organised around a defined maturity model, instrumented with a concrete tooling stack, and developed over several years, gives the AI4SE research community a richer empirical object than any single laboratory study has produced. The corresponding methodological transparency has not yet been provided, the cognitive and individual-fit dimensions remain open empirical questions, and the headline productivity numbers describe throughput rather than defect detection. For engineering leaders, the working question is which of the maturity stages the organisation can credibly reach, on which dimensions, by when, and what each move would cost in terms of the change-management, technology, and cultural-shift investments the Ericsson programme treats as equally necessary.

Daniel Russo, Ph.D., is a Professor of Software Engineering whose research examines the intersection of human cognition and artificial intelligence. Through "Software Insights," he translates empirical research into actionable guidance for software practitioners and organizations.

If this issue surfaces a problem your organisation has been trying to name, I work with engineering leaders to diagnose exactly that kind of challenge, using the same methods behind the research you just read. No frameworks. No opinion without evidence.

danielrusso.org/advisory


References

Britto, R. (2026, March 5). AI-native large-scale agile software development [Conference presentation]. Ericsson.

Britto, R., Palmgren, F., Saini, N., & Ohlin, M. (2026). The AI-native large-scale agile software development manifesto. arXiv. https://doi.org/10.48550/arXiv.2605.07717

Dikert, K., Paasivaara, M., & Lassenius, C. (2016). Challenges and success factors for large-scale agile transformations: A systematic literature review. Journal of Systems and Software, 119, 87–108. https://doi.org/10.1016/j.jss.2016.06.013

Dingsøyr, T., Moe, N. B., Fægri, T. E., & Seim, E. A. (2018). Exploring software development at the very large scale: A revelatory case study and research agenda for agile method adaptation. Empirical Software Engineering, 23(1), 490–520. https://doi.org/10.1007/s10664-017-9524-2

Iovene, M., Jonsson, L., Roeland, D., D'Angelo, M., Hall, G., Erol-Kantarci, M., & Manocha, J. (2023, February). Defining AI native: A key enabler for advanced intelligent telecom networks (White Paper No. BCSS-23:000056 Uen). Ericsson.

Irshad, M., Britto, R., & Petersen, K. (2021). Adapting behavior driven development (BDD) for large-scale software systems. Journal of Systems and Software, 177, Article 110944. https://doi.org/10.1016/j.jss.2021.110944

Liang, J. T., Yang, C., & Myers, B. A. (2024). A large-scale survey on the usability of AI programming assistants: Successes and challenges. In Proceedings of the IEEE/ACM 46th International Conference on Software Engineering (ICSE '24) (Article 52, pp. 1–13). ACM. https://doi.org/10.1145/3597503.3608128

Russo, D. (2024). Navigating the complexity of generative AI adoption in software engineering. ACM Transactions on Software Engineering and Methodology, 33(5), Article 135, 1–50. https://doi.org/10.1145/3652154

Uludağ, Ö., Philipp, P., Putta, A., Paasivaara, M., Lassenius, C., & Matthes, F. (2022). Revealing the state of the art of large-scale agile development research: A systematic mapping study. Journal of Systems and Software, 194, Article 111473. https://doi.org/10.1016/j.jss.2022.111473

Date 02/06/2026
