Detection Strategy Part 1: Building a Detection Foundation
The Philosophical Core of Modern Security
In an era defined by continuous breaches and escalating complexity, a common complaint plagues security leaders: "We have hundreds of detections, but we don't have a detection system." Having a detection capability is table stakes; having a detection strategy is how you achieve operational efficiency and measurable return on investment (ROI).
For security teams, the focus must shift from simply counting rules to optimizing the entire detection ecosystem. This first post in our series explores the philosophical foundation necessary for building a true detection system, culminating in the critical step of business alignment.
1. Thinking in Systems: From Rube Goldberg to Assembly Line
Too often, detection engineering teams inherit a portfolio of rules created reactively over time: a script for a specific breach, an alert for a compliance checkbox, or a blocklist for a campaign from two years ago. This collection of disjointed logic is not a system; it's a Rube Goldberg machine that relies on fragile, complex steps to reach a result slowly and inefficiently, and that often fails outright.
A true detection system, by contrast, functions like a finely tuned assembly line. It is repeatable, measurable, and optimized for speed and quality. This shift in mindset immediately provides executive value by focusing on these key efficiency metrics:
Volume (Total Alert Count): High alert volume creates direct costs in log storage, but the true technology cost is in the compute processing required for correlation, enrichment, and analysis. The human cost is more severe: high volume means analysts are fatigued, cannot triage everything, and inevitably miss genuine, mission-critical threats.
True Positive Rate (TPR) and False Negative Rate (FNR): A high TPR demonstrates efficacy in detecting threats, but it must be balanced against low false positive volume. Crucially, a high FNR indicates a failure of the security mission. Alert data alone cannot reveal what was never detected, so while optimizing for high TPR and low volume is necessary, achieving a low FNR requires active, external validation. This is accomplished through proactive measures like efficacy testing (purple teaming) and dedicated feedback loops from Incident Response lessons learned, both essential to preventing undetected compromise.
Mean Time to Respond (MTTR): Just detecting bad activity isn't enough; the purpose of detection is to enable rapid action that limits damage. MTTR measures how quickly the organization moves from detection to containment. A cohesive system, designed to work together, directly lowers MTTR.
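As a rough illustration, the three metrics above could be computed from triaged alert data along the following lines. This is a minimal sketch, not a reference implementation: the `Alert` record and its field names are hypothetical, "TPR" is read here in the operational sense of the share of alerts confirmed as true positives (an assumption about the intended meaning), and the count of missed incidents is assumed to come from efficacy testing or Incident Response findings rather than from the alert queue itself.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Alert:
    """Hypothetical triaged alert record; field names are illustrative."""
    true_positive: bool
    detected_at: datetime
    contained_at: Optional[datetime] = None  # set only once an incident is contained


def detection_metrics(alerts: List[Alert], missed_incidents: int) -> dict:
    """Summarize volume, true-positive share, FNR, and MTTR.

    missed_incidents is the number of real or simulated attacks that produced
    no alert, as reported by efficacy testing or incident response.
    """
    volume = len(alerts)
    true_positives = [a for a in alerts if a.true_positive]
    tp = len(true_positives)

    # Share of alerts that were real threats (assumed reading of "TPR").
    tp_share = tp / volume if volume else 0.0

    # FNR needs ground truth from outside the alert queue.
    total_real = tp + missed_incidents
    fnr = missed_incidents / total_real if total_real else 0.0

    # MTTR: mean time from detection to containment, over contained incidents.
    durations = [a.contained_at - a.detected_at
                 for a in true_positives if a.contained_at is not None]
    mttr = sum(durations, timedelta()) / len(durations) if durations else None

    return {"volume": volume, "tp_share": tp_share, "fnr": fnr, "mttr": mttr}
```

The FNR term is the only one that cannot be filled in from alert data, which is why it takes `missed_incidents` as a separate input.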
This systemic thinking is the core of Detection in Depth. Just as Defense in Depth requires overlapping security controls, Detection in Depth requires overlapping visibility and context across the environment, ensuring that no single sensor failure blinds the security team. Crucially, a deep detection strategy that covers the entire kill chain and keys on fundamental attacker techniques (TTPs), rather than only atomic indicators (IOCs) or single data points, makes the detection system significantly more resilient to constant threat churn.
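One way to make Detection in Depth checkable is to map each detection to a kill chain stage and a sensor, then look for stages with no coverage or with coverage from only one sensor. The sketch below assumes a simplified, illustrative list of stages (a real program might use MITRE ATT&CK tactics instead), and the detection tuples are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# Illustrative stage names only; substitute the framework your team uses.
KILL_CHAIN = ["initial_access", "execution", "persistence",
              "lateral_movement", "exfiltration"]


def coverage_gaps(detections: Iterable[Tuple[str, str, str]]) -> Dict[str, List[str]]:
    """Each detection is a (name, stage, sensor) tuple.

    Returns stages with no detection at all, and stages whose detections
    all depend on a single sensor (one failure away from being blind).
    """
    sensors_by_stage: Dict[str, Set[str]] = defaultdict(set)
    for _name, stage, sensor in detections:
        sensors_by_stage[stage].add(sensor)

    uncovered = [s for s in KILL_CHAIN if not sensors_by_stage.get(s)]
    single_sensor = [s for s in KILL_CHAIN
                     if len(sensors_by_stage.get(s, ())) == 1]
    return {"uncovered": uncovered, "single_sensor": single_sensor}
```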
2. Thinking Holistically: Integrating People and Process
Detection is not a technology problem; it is a system problem where technology, process, and people interact. To achieve operational efficiency, the strategy must holistically integrate the people who interact with and validate the system.
The Analyst: A More Structured Component of the System
Security Analysts are not merely consumers of the detection system's output; they are an integrated system component that must provide essential feedback. Giving analysts high-fidelity alerts, enrichment, and the tooling to act on them is not a nice-to-have; it is a must-have. When the system empowers the analyst with high-quality, actionable signals, the analyst provides the vital intelligence that drives continuous tuning and evolution of the rules.
Threat Hunting: A Less Structured Component of the System
Threat Hunting is an investment in systemic risk reduction, but it isn't a disparate function. While dedicated hypothesis-driven hunts that target specific adversaries, specific tactics, or specific high-risk areas are valuable and necessary, much of the hunting effort should be spent on "Hunting Grounds." These focused areas (logically defined by systems, access, roles, or high-risk activities like cyber hygiene failures) should leverage leads that aren't high-fidelity enough to warrant an alert. By providing threads for the Threat Hunting team to pull, the detection engine converts low-fidelity data into a valuable workload.
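The split between alerts for analysts and leads for hunters can be pictured as a simple routing step. In this sketch the `Signal` record, the fidelity score, the threshold value, and the hunting ground names are all assumptions chosen for illustration; the point is only that low-fidelity signals are grouped by hunting ground instead of being dropped or sent to the alert queue.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

ALERT_THRESHOLD = 0.8  # assumed cut-off; tune to the team's triage capacity


@dataclass
class Signal:
    rule: str
    fidelity: float      # estimated share of past hits that were true positives
    hunting_ground: str  # e.g. "domain_admins" or "finance_systems"


def route(signals: List[Signal]) -> Tuple[List[Signal], Dict[str, List[Signal]]]:
    """Send high-fidelity signals to the alert queue; group the rest as hunt leads."""
    alerts: List[Signal] = []
    leads: Dict[str, List[Signal]] = {}
    for s in signals:
        if s.fidelity >= ALERT_THRESHOLD:
            alerts.append(s)
        else:
            leads.setdefault(s.hunting_ground, []).append(s)
    return alerts, leads
```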
Efficacy Testing: Auditing the Pipeline
We must treat detection logic with the same rigor as product code. Efficacy testing ensures the system works and continues to work. This should involve at least two pieces:
Routine Testing: Using Breach and Attack Simulation (BAS)-like functionality to routinely test the system at various stages of the detection pipeline. Some tests should stop before reaching the analyst (closed by automation), while others, run less frequently, should go all the way through the pipeline to audit the human response.
Routine Adversarial Simulation: This goes beyond simple vulnerability scans or pentesting on perimeter assets. It involves true red teaming and adversarial simulation designed to find the gaps in your TTP-focused coverage and provide direct, actionable feedback to the Detection Engineering team.
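Treating detection logic like product code can start with something as small as a rule expressed as a function plus a table of expected verdicts. The following is a minimal sketch under stated assumptions: the event fields (`image`, `command_line`), the toy rule, and the helper `run_efficacy_tests` are hypothetical and stand in for whatever rule language and test harness a team actually uses.

```python
from typing import Callable, Dict, List, Tuple

Event = Dict[str, str]


def encoded_powershell(event: Event) -> bool:
    """Toy rule: flag PowerShell launched with the full -EncodedCommand flag.

    Deliberately naive: it ignores abbreviated forms of the flag.
    """
    image = event.get("image", "").lower()
    cmd = event.get("command_line", "").lower()
    return image.endswith("powershell.exe") and "-encodedcommand" in cmd


def run_efficacy_tests(rule: Callable[[Event], bool],
                       cases: List[Tuple[Event, bool]]) -> List[Event]:
    """Return the events where the rule's verdict differs from the expected one."""
    return [event for event, expected in cases if rule(event) != expected]
```

Feeding this harness a case that uses an abbreviated flag such as `-enc`, with an expected verdict of `True`, would surface the gap in the toy rule as a failing test, which is the kind of actionable feedback the testing loop is meant to produce.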
3. Understanding the Business: Aligning Detections with Risk
The single most common failing of a detection program is misalignment: spending resources protecting low-value assets simply because they are easy to monitor. A sound detection strategy must finish its foundational layer by validating and prioritizing business risk.
The Path to Prioritization: From Business Model to Critical Asset
Before diving into asset-level risk, security teams must build a practical, high-level understanding of the business mechanics. This foundational step defines the overall risk landscape and ensures resources are concentrated where the value is generated.
Start by answering foundational questions like:
How does the business make money? (E-commerce, licensing intellectual property (IP), B2B transactions, operational technology (OT) actions, etc.)
Who are the critical people or roles? (Not just executives, but also roles in sales, accounts receivable, and specific product development teams.)
How is the organization structured? (Are there separate IT and OT environments, subsidiaries, or frequent Mergers and Acquisitions (M&A) that rapidly change the perimeter?)
What are the critical interdependencies? (Identify essential B2B connections or cloud services critical to day-to-day operations.)
Once this landscape is mapped, you can identify your Tier 1 and Tier 2 critical systems. Ultimately, every member of your detection team should know the answers to these questions and understand at least the basics of the business, as it dictates the true priority of their work.
Validating Criticality and Risk Acceptance
For those Tier 1 systems, security leaders must move from abstract understanding to concrete validation. The security team needs to engage the system's business owner, senior department head, or the relevant operational VP with this non-technical question: "If system X were down for 72 hours, what is the exact mechanism by which we lose money, and who is authorized to stop it?"
The answer shifts the conversation from abstract "criticality" to measurable business impact. It immediately identifies:
Financial Impact: Which systems directly govern revenue, supply chain, or client delivery.
Decision Ownership: Who owns the acceptance of risk (e.g., the person authorized to approve a costly incident response action or declare a system shutdown).
Security efforts must focus detections on the loss mechanisms identified here. If a system is not tied to a measurable loss within a defined timeline, it should not receive the highest level of detection effort.
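The prioritization rule above can be sketched as a small tiering function. Everything here is an assumption made for illustration: the `System` fields, the use of the 72-hour window from the sample question as the default timeline, and the existence of a third tier for systems with no measurable loss. The inputs would come from the business owner's answers, not from security tooling.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class System:
    name: str
    loss_per_hour: float               # business owner's estimate, any currency
    hours_until_loss: Optional[float]  # None if no measurable loss was identified


def detection_tier(system: System, window_hours: float = 72.0) -> int:
    """Tier 1: measurable loss inside the window. Tier 2: loss, but later.

    Tier 3 (an assumed catch-all): no measurable loss identified.
    """
    if system.hours_until_loss is None or system.loss_per_hour <= 0:
        return 3
    if system.hours_until_loss <= window_hours:
        return 1
    return 2


def prioritize(systems: List[System]) -> List[System]:
    """Order systems by tier, then by largest hourly loss first."""
    return sorted(systems, key=lambda s: (detection_tier(s), -s.loss_per_hour))
```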
Compliance as a Minimum Floor
While the business should drive priorities, regulatory compliance is often the non-negotiable floor for detection. Frameworks like HIPAA, PCI DSS, or GDPR often impose specific logging and detection requirements. It's crucial to recognize that a requirement for logging (which drives up data volume and cost) does not always equate to a requirement for alerting. A strategic approach meets these mandates without overloading the team. Security teams often mistakenly choose to over-detect to satisfy compliance, generating unnecessary, low-fidelity alerts. The goal should be to meet the intent of the requirement, either through alternative technical controls or by creating a low-volume audit log for compliance purposes while reserving high-fidelity alerts for actual threats. Compliance should be integrated and automated, but it must not consume resources intended for protecting the specific, high-value assets unique to the organization's business model.
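The distinction between "must be logged" and "must raise an alert" can be expressed as a routing decision per event type. This is a sketch only: the event type names and the two sets are hypothetical, and which events a given framework actually requires you to retain has to come from the framework's own text and your auditors.

```python
from typing import List, Set

# Hypothetical event types; real ones depend on the log sources in use.
COMPLIANCE_RETAIN: Set[str] = {"admin_login", "config_change",
                               "cardholder_data_access"}
ALERT_WORTHY: Set[str] = {"cardholder_data_bulk_export"}


def destinations(event_type: str) -> List[str]:
    """Decide where an event goes: audit log, alert queue, both, or neither."""
    out: List[str] = []
    if event_type in COMPLIANCE_RETAIN or event_type in ALERT_WORTHY:
        out.append("audit_log")    # retained for audit, no analyst time spent
    if event_type in ALERT_WORTHY:
        out.append("alert_queue")  # reserved for high-fidelity threat signals
    return out
```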
This foundation—systemic, holistic, and business-aligned—is the bedrock.
