June 26, 2025

Building security that lasts: Microsoft’s journey towards durability at scale

In this blog you will hear directly from Microsoft’s Deputy Chief Information Security Officer (CISO) for Azure and operating systems, Mark Russinovich, about how Microsoft operationalized security durability at scale. This blog is part of an ongoing series where our Deputy CISOs share their thoughts on what is most important in their respective domains. In this series you will get practical advice and forward-looking commentary on where the industry is going, as well as tactics you should start (and stop) deploying, and more.

In late 2023, Microsoft launched its most ambitious security transformation to date, the Microsoft Secure Future Initiative (SFI). With the equivalent of 34,000 engineers working across 14 product divisions and supporting more than 20,000 cloud services on 1.2 million Azure subscriptions, the scope is massive. These services operate on 21 million compute nodes, are protected by 46.7 million certificates, and are developed across 134,000 code repositories.

At Microsoft’s scale, the real challenge isn’t just shipping security fixes; it’s ensuring they’re automatically enforced by the platform, with no extra lift from engineers. This work aligns directly to our Secure by Default principle. Durable security is about building systems that apply fixes proactively and uphold standards over time, so that engineering teams can focus on innovation rather than rework. This is the next frontier in security resilience.

Why “staying secure” is harder than getting there 

When SFI began, Microsoft made rapid progress: teams addressed vulnerabilities, met key performance indicators (KPIs), and turned dashboards green. Over time, sustaining these gains proved challenging, as some fixes required reinforcement and recurring patterns like misconfigurations and legacy issues began to re-emerge in new projects—highlighting the need for durable, long-term security practices. 

The pattern was clear: security improvements weren’t durable

While key milestones were successfully achieved, there were instances where we did not have clearly defined ownership or built-in features to automatically sustain security baselines. Enforcement mechanisms varied, leading to inconsistencies in how security standards were upheld. As resources shifted post-delivery, this created a risk of baseline drift over time.

Moving forward, we realized that our teams need to establish explicit ownership, standardize enforcement design, and embed automation at the platform level. These steps are essential to ensure long-term resilience, reduce operational burden, and prevent regression.

Engineering for endurance: The making of Microsoft’s durability strategy 

To transform security from a reactive effort into an enduring capability, Microsoft launched a company-wide initiative to operationalize security durability at scale. The result was the creation of the Security Durability Model, anchored in the principle to “Start Green, Get Green, Stay Green, and Validate Green.” This framework is not a slogan—it is a foundational shift in how Microsoft engineers build, enforce, and sustain secure systems across the enterprise. 

At the core of this effort are Durability Architects: dedicated architects embedded within each division who act as stewards of persistent security. These individuals champion a “fix-once, fix-forever” mindset by enforcing ownership and driving accountability across teams. One example that catalyzed this effort involved cross-tenant access risks through Passthrough Authentication. In this case, users without a presence in a target tenant could authenticate through passthrough mechanisms, unintentionally breaching tenant boundaries. The mitigation initially lacked durability, and the issue resurfaced until ownership and enforcement were systemically addressed.

Each phase of this lifecycle framework has a concrete meaning. New features are developed in a secure-by-default posture using hardened templates, ensuring they “Start Green.” Legacy systems or existing features are brought into compliance through targeted remediation efforts; this is “Get Green.” To “Stay Green,” ongoing monitoring and guardrails prevent regression. Finally, to “Validate Green,” security is verified through automated reviews and executive reporting, ensuring enduring resilience.

Automating for scale and embedding security into engineering culture 

Recognizing that manual security checks cannot scale across an enterprise of this size, Microsoft has heavily invested in automation to prevent regressions. Tools such as Azure Policy automatically enforce best practices like encryption-at-rest or multifactor authentication across cloud resources. Continuous scanners detect expired certificates or known vulnerable packages. Self-healing scripts autocorrect deviations, closing the loop between detection and remediation. 
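The detect-and-remediate loop described above can be illustrated with a minimal sketch. This is not Azure Policy or any Microsoft tooling; the `Resource` record, its fields, and the two baseline rules are assumptions chosen purely for illustration. It shows a scanner finding deviations, a self-healing step correcting what is safe to fix automatically, and everything else being escalated to an owner.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Resource:
    """Hypothetical cloud resource record; field names are illustrative."""
    name: str
    encrypted_at_rest: bool
    cert_expires: date


def find_deviations(resource: Resource, today: date) -> list[str]:
    """Detection step: list every baseline rule the resource violates."""
    deviations = []
    if not resource.encrypted_at_rest:
        deviations.append("encryption-at-rest disabled")
    if resource.cert_expires <= today:
        deviations.append("certificate expired")
    return deviations


def remediate(resource: Resource, today: date) -> list[str]:
    """Self-healing step: autocorrect what is safe, return what is not."""
    unresolved = []
    for deviation in find_deviations(resource, today):
        if deviation == "encryption-at-rest disabled":
            resource.encrypted_at_rest = True  # safe to fix automatically
        else:
            unresolved.append(deviation)  # needs an owner, e.g. cert rotation
    return unresolved


today = date(2025, 6, 26)
fleet = [
    Resource("storage-a", encrypted_at_rest=False, cert_expires=date(2026, 1, 1)),
    Resource("api-b", encrypted_at_rest=True, cert_expires=date(2025, 1, 1)),
]
# Map each resource name to the deviations that still need human attention.
escalations = {r.name: remediate(r, today) for r in fleet}
```

In this sketch the encryption setting is corrected in place, while the expired certificate is reported rather than silently replaced, since rotation usually needs coordination with an owner.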

To embed durability into the operational fabric, review cadences and executive oversight play a critical role. Security KPIs are reviewed at weekly or biweekly engineering operations meetings, with Microsoft’s top leadership, including the Chief Executive Officer (CEO), Executive Vice Presidents (EVPs), and engineering leaders receiving regular updates. Notably, executive compensation is now directly tied to security performance metrics—an accountability mechanism that has driven measurable improvements in areas such as secret hygiene across code repositories. 

Rather than building fragmented solutions, Microsoft focuses on shared, scalable security capabilities. For example, to maintain a clean build environment, all new build queues will now default to a virtualized setup. Customers will not have the option to revert to the classic Artifact Processor (AP) on their own. Once a build is executed in the virtualized CloudBuild environment, any previously allocated resources in the classic CloudBuild will be either decommissioned or reassigned. 

Finally, durability is now a built-in requirement at development gates. Security fixes must not only remediate current issues but also be designed to endure. Teams must assign owners, undergo gated reviews for durability, and build enforcement mechanisms. This philosophy has shifted the mindset from one-time patching to long-term resilience.
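A development gate of this kind could be sketched as a simple check on each fix. The `SecurityFix` record and its fields are hypothetical, not a Microsoft schema; the sketch only shows the three requirements named above being verified before a fix is allowed through.

```python
from dataclasses import dataclass


@dataclass
class SecurityFix:
    """Hypothetical metadata attached to a security fix."""
    title: str
    owner: str = ""
    durability_review_passed: bool = False
    enforcement_mechanism: str = ""


def gate_failures(fix: SecurityFix) -> list[str]:
    """Return the reasons a fix fails the durability gate (empty if it passes)."""
    failures = []
    if not fix.owner.strip():
        failures.append("no owner assigned")
    if not fix.durability_review_passed:
        failures.append("durability review not passed")
    if not fix.enforcement_mechanism.strip():
        failures.append("no enforcement mechanism")
    return failures


one_time_patch = SecurityFix(title="Rotate leaked secret")
durable_fix = SecurityFix(
    title="Block secrets in repositories",
    owner="identity-team",
    durability_review_passed=True,
    enforcement_mechanism="pre-commit secret scanner",
)
```

Here `gate_failures(one_time_patch)` lists all three missing requirements, while `gate_failures(durable_fix)` returns an empty list.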

The path to durable security: A maturity framework 

Durable security isn’t just about fixing vulnerabilities—it’s about ensuring security holds over time. As Microsoft learned during the early days of its Secure Future Initiative, lasting protection requires organizations to mature operationally, culturally, and technically. The following framework outlines how to evolve toward security durability at scale: 

1. Stages of security durability maturity: Security durability evolves through distinct operational phases that reflect an organization’s ability to sustain and scale secure outcomes, not just achieve them temporarily. 

  • Reactive: Durable outcomes are rare. Fixes are implemented manually and inconsistently. Drift and regressions are common due to a lack of enforcement or oversight. 
  • Defined: Security fixes are codified in basic processes. Teams may implement fixes, but durability is still dependent on individual vigilance rather than systemic support. 
  • Managed: Security controls are embedded in standardized workflows. Durable design patterns are introduced. Baseline drift is measured, and early automation begins to prevent regression. 
  • Optimized: Durability becomes part of engineering culture. Secure-by-default templates, guardrails, and metrics reduce variance. Real-time enforcement prevents security drift. 
  • Autonomous and predictive: Systems proactively enforce durability. AI-assisted controls detect and self-remediate regressions. Durable security becomes self-sustaining and adaptive to change. 

2. Dimensions of security durability: To embed durability across the enterprise, organizations must mature along five integrated dimensions: 

  • Resilience to change: Security controls must remain stable even as infrastructure, tools, and organizational structures evolve. This requires decoupling controls from fragile, manual systems. 
  • Scalability: Durable security must scale effortlessly across expanding environments, including new regions, services, and team structures—without introducing regressions. 
  • Automation and AI readiness: Durability depends on machine-powered enforcement. Manual reviews alone cannot guarantee persistence. AI and automation provide speed, consistency, and fail-safes. 
  • Governance integration: Durability must be wired into governance platforms to provide traceability, accountability, and risk closure across the control lifecycle. 
  • Sustainability: Durable security solutions must be lightweight and operationally viable. If controls are too burdensome, teams will circumvent them, undermining long-term resilience. 

3. Key milestones in security durability evolution: Microsoft’s implementation of durable security revealed critical transformation points that signal organizational maturity: 

  • Establish durable security baselines (identity hygiene, patching, config hardening).
  • Enforce controls through automated policy and self-healing. 
  • Build durability-aware platforms like Govern Risk Intelligent Platform (GRIP) to track regressions and closure loops. 
  • Embed durability reviews into engineering checkpoints and risk ownership cycles.
  • Drive a durability mindset across teams—from development to operations. 
  • Create feedback loops to evaluate what holds and what regresses over time. 
  • Deploy AI-powered agents to detect drift and initiate remediation. 

Each milestone builds a stronger foundation for durability and aligns incentives with sustained security excellence. 

4. Measuring security durability: Tracking the stickiness of security work requires a shift from traditional risk metrics to durability-focused indicators. Microsoft uses the following to monitor progress: 

  • Percentage of controls enforced automatically versus manually 
  • Baseline drift rate (how often known-good states erode) 
  • Mean time to regress (how quickly fixes unravel)
  • Volume of self-healing actions triggered and resolved 
  • Percentage of fixes that meet “never regress” criteria 
  • Durability metadata coverage in systems like GRIP (ownership, status, and closure) 
  • Percentage of engineering teams integrated into durability reporting cadences 
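Three of these indicators reduce to simple arithmetic. The sketch below computes them over a hypothetical list of controls; the record layout and the exact definitions (for example, counting a control as drifted if it regressed at least once) are assumptions, since the post names the metrics without specifying formulas.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Control:
    """Hypothetical record for one security control or fix."""
    name: str
    automated: bool                      # enforced by the platform, not by hand
    days_until_regressed: Optional[int]  # None means the fix has held so far


def percent_automated(controls: list[Control]) -> float:
    """Percentage of controls enforced automatically versus manually."""
    return 100.0 * sum(c.automated for c in controls) / len(controls)


def baseline_drift_rate(controls: list[Control]) -> float:
    """Percentage of controls whose known-good state eroded at least once."""
    regressed = [c for c in controls if c.days_until_regressed is not None]
    return 100.0 * len(regressed) / len(controls)


def mean_time_to_regress(controls: list[Control]) -> Optional[float]:
    """Average number of days a fix held before unraveling; None if none did."""
    durations = [
        c.days_until_regressed
        for c in controls
        if c.days_until_regressed is not None
    ]
    if not durations:
        return None
    return sum(durations) / len(durations)


controls = [
    Control("mfa-enforcement", automated=True, days_until_regressed=None),
    Control("secret-scanning", automated=True, days_until_regressed=None),
    Control("tls-config", automated=False, days_until_regressed=30),
    Control("legacy-protocol", automated=False, days_until_regressed=90),
]
```

With this sample data, half of the controls are automated, half have drifted, and the mean time to regress is 60 days. In the sample, the manually enforced controls are the ones that regressed, which is the pattern these indicators are meant to surface.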

Results: From short-term wins to sustained gains 

By February 2025, the durability push resulted in: 

  • Gains such as 100% multifactor authentication (MFA) enforcement and legacy protocol removal remaining stable for months.
  • Teams using real-time dashboards to catch any KPI dips and address them before they spiral.

Where previous improvements faded, new ones held firm—validating the durability model. 

Lessons for any enterprise 

Microsoft’s journey offers valuable takeaways for organizations of all sizes. 

Durability requires programmatic support 

Security doesn’t persist by accident. It needs: 

  • Roles for durability and accountability.
  • Durable design patterns. 
  • Empowering technologies (automation and policy enforcement). 
  • Regular leadership and architect reviews. 
  • Standardized workflows. 

Teams across security, development, and operations must be aligned and coordinated—using the same metrics, tools, and gates. 

Culture and leadership matter 

Security must be everyone’s job—and leadership must reinforce that relentlessly. At Microsoft, security became part of performance reviews, executive dashboards, and everyday conversation. 

As EVP Charlie Bell put it: “Security is not just a feature, it’s the foundation.” 

That mindset—combined with consistent leadership pressure—is what transforms short-lived security into long-term resilience. 

Security that endures 

The Secure Future Initiative proves that durable security is achievable—even at hyperscale.  

Microsoft is showing that lasting security can be achieved by investing in: 

  • People (clear ownership and champions). 
  • Processes (repeatable metrics and reviews). 
  • Platforms (shared tooling and automation). 

The playbook isn’t just for tech giants. Any organization, whether you’re securing 20 cloud services or 20,000, can adopt the principles of security durability.

Because in today’s cyberthreat landscape, fixing isn’t enough.  

Learn more with Microsoft Security

To see an example of the Microsoft Durability Strategy in action, read the case study in the appendix below. Learn more about the Microsoft Secure Future Initiative and our Secure by Default principle.

To learn more about Microsoft Security solutions, visit our website. Bookmark the Security blog to keep up with our expert coverage on security matters. Also, follow us on LinkedIn (Microsoft Security) and X (@MSFTSecurity) for the latest news and updates on cybersecurity.


Appendix: 

Security Durability Case Study 

Eliminating pinned certificates: A durable fix for secret hygiene in MSA apps 

SFI Reference: [SFI-ID4.1.3] 
Initiative Owner: Microsoft Account (MSA) Engineering Team 

Overview 

As part of the Secure Future Initiative (SFI), the Microsoft Account (MSA) team addressed a critical weakness identified through Software Security Incident Response Plans (SSIRPs): the unsafe use of pinned certificates. By eliminating this legacy pattern and embedding preventive guardrails, the MSA team set a new bar for durable secrets management and secure partner onboarding.

The challenge: Pinned certificates and hidden fragility 

Pinned certificates were once seen as a strong trust enforcement mechanism, ensuring that only specific certificates could be used to establish connections. However, they became a security and operational liability:

  • Difficult to rotate: If a pinned certificate expired or was compromised, coordinating a fast and seamless replacement across services was challenging. 
  • Onboarding risk: New services had no safe, scalable path to onboard without replicating this fragile pattern. 
  • Lack of durability: Without controls, the risk of regression and repeated misuse remained high. 

The durable fix: Secure by default and enforced by design 

The MSA team implemented a durability-first solution grounded in engineering enforcement and operational pragmatism: 

  • Code-level blocking: All code paths accepting pinned certificates were hardened to prevent adoption.
  • Temporary allow lists: Existing apps using pinned certificates were allow-listed to prevent immediate outages.
  • Default deny posture: New apps are automatically blocked from using pinned certificates, enforcing secure defaults.

This “fix-once, fix-forever” approach ensures the issue doesn’t resurface—even as new partners onboard or systems evolve. 
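The three strategies above can be combined into one small sketch: a default-deny check with a temporary allow list that shrinks as apps migrate. The function names, the exception, and the app identifiers are hypothetical and are not taken from MSA code; the sketch only illustrates the shape of the control.

```python
class PinnedCertificateBlocked(Exception):
    """Raised when an app that is not allow-listed requests a pinned certificate."""


def check_app_config(
    app_id: str, uses_pinned_certificate: bool, allow_list: set[str]
) -> None:
    """Default deny: pinned certificates are rejected unless the app is allow-listed."""
    if uses_pinned_certificate and app_id not in allow_list:
        raise PinnedCertificateBlocked(
            f"{app_id} may not use pinned certificates"
        )


def complete_migration(app_id: str, allow_list: set[str]) -> None:
    """Remove an app from the temporary allow list once it has transitioned."""
    allow_list.discard(app_id)


# Hypothetical allow list of existing apps, kept only to avoid outages.
allow_list = {"legacy-app-1", "legacy-app-2"}

check_app_config("legacy-app-1", True, allow_list)   # allowed for now
check_app_config("new-app", False, allow_list)       # no pinning, always fine

complete_migration("legacy-app-1", allow_list)
# From here on, "legacy-app-1" is treated like any new app and would be blocked.
```

Because the allow list only ever shrinks, the pattern cannot spread to new apps, and each completed migration permanently closes one exception.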

Sustained impact and lifecycle integration 

To maintain progress and ensure no regression, the MSA team aligned remediation with each partner’s SFI KPI milestones. Services were removed from the allow list only after completing their transition, closing the loop with full compliance and operational readiness.

This work reinforced several Security Durability pillars: 

  • Preventive guardrails 
  • Owner-enforced controls 
  • Security built into the engineering lifecycle 

Lessons and model for the future 

This case is a model for how Microsoft is shifting from reactive security work to systemic, enforceable, and scalable durability models. Rather than patching the same issue repeatedly, the MSA team eliminated the root cause, protected the ecosystem, and created a repeatable blueprint for other risky cryptographic practices. 

Key takeaways 

  • Eliminating pinned certificates reduced fragility and boosted long-term resilience. 
  • Durable controls were enforced via code, not just process. 
  • Gradual deprecation through partner alignment ensured no disruption. 
  • This sets a precedent for eliminating insecure patterns across Microsoft platforms. 

The post Building security that lasts: Microsoft’s journey towards durability at scale appeared first on Microsoft Security Blog.


Source: Microsoft Security
