When no one is responsible in the event of an incident

Patrick Büchler

CEO

Expert knowledge for your project

Patrick Büchler is CEO and owner of soxes AG. With a master’s degree in computer science from ETH Zurich, he shapes the strategic and technical direction of the company. His strengths include software architecture, agile leadership, and customer focus, always with the goal of effectively implementing innovation.

Dependencies and SLAs determine how your business operates

The systems are up and running. The contracts have been signed, and there is a point of contact for every service. And yet, do you recognize this situation? A system goes down. Internally, the problem is said to lie with the external system. Externally, the cause is said to lie with you. And in the middle, production is on hold. The real problem lies in the interface between the two, and no one feels truly responsible for it, because everyone knows their own area but no one has the big picture.

On this page, you’ll learn:

why operational disruptions escalate even though all partners are responding
what types of dependencies really matter in operations
what an SLA does and what it doesn’t
how to make dependencies visible and clarify responsibilities
how your operations should feel

The problem you only notice when it's too late

Everything runs smoothly in day-to-day operations. Until the moment a process suddenly comes to a halt and no one can immediately pinpoint the cause.

Typical situations in the workplace:

A production system stops delivering data and manufacturing comes to a standstill
An ERP interface fails, and orders get stuck
Employees can no longer log in, and work comes to a halt
A third-party system isn’t responding, and your system can’t continue operating
A change in the infrastructure triggers errors in multiple applications

Then the familiar chain of phone calls begins. Internal checks are conducted. External partners are contacted. Manufacturer support is called in. Everyone is working, but there’s no clear overview of how everything fits together. That’s exactly what makes disruptions so costly.

Business Risks and Implications

When responsibilities and lines of authority are unclear, a disruption can quickly turn into a “company-wide problem.” The risk arises not only from the outage itself, but also from the delay in determining who is responsible for resolving the issue.

Type of risk	Typical effects within the company
Financial	Production stoppages, lost revenue, contractual penalties, express shipping costs
Operational	Long resolution times, frantic escalations, recurring errors
Strategic	Dependence on manufacturers, service providers, or individuals
Organizational	Responsibility gets passed around, and decisions take too long
Reputational	Lack of reliability toward customers, partners, and internal stakeholders

Why does this problem affect so many companies?

Interfaces were added because new requirements arose
Vendor solutions were integrated because they provided immediate benefits
Hosting and infrastructure were modernized, often in phases
Various service providers have contributed to the development over the years
Knowledge has been gradually dispersed or lost
The documentation was not updated regularly

How does this problem arise?

This problem does not arise from a single issue. It is the result of many individual decisions that, over the years, were never evaluated within the broader context of the business.

Isolated decisions made without considering the overall impact

Typical examples: connecting an API without checking dependencies, integrating a third-party system without clear accountability, outsourcing hosting without clarifying SLA implications, or expanding a system without updating documentation and operational procedures. The result is a system landscape that functions in day-to-day operations but is difficult to manage in an emergency, because responsibilities, dependencies, and processes have not evolved alongside it.

Responsibility arises randomly rather than in a structured manner

As long as things are running, no one feels compelled to clearly resolve these questions:

Who bears overall responsibility for the operation of the chain?
Who escalates to which partner, and when?
Who makes decisions when multiple parties are involved?
Who has a comprehensive understanding of the dependencies from trigger to effect?

Knowledge and documentation are not keeping pace

Complexity grows with every expansion. Documentation is not consistently updated. Knowledge remains with individual people. New employees understand only parts of it. External partners know only their own segment. When a failure occurs, the overall understanding that a quick resolution depends on is missing.

What do dependencies and responsibility mean in IT operations?

There are two levels within operations that are rarely considered together. Dependencies are all the connections that ensure your system functions properly. These include technical connections such as interfaces and infrastructure, as well as organizational and contractual connections such as vendor support, partners, responsibilities, and escalation procedures.

Responsibility in operations means that it is clear who is responsible for which part of the chain in the event of a failure, who is authorized to make decisions, and who manages the restoration of operations. Only when both levels are considered together can true operational reliability be achieved.

What types of dependencies exist in a business?

Technical dependencies

Interfaces, APIs, databases, cloud infrastructure, networks, identity management, third-party systems, vendor software, and data flows. A small change or failure in one area can affect multiple processes at once.

Organizational dependencies

Internal teams, external partners, vendor support, responsibilities, on-call schedules, and escalation procedures, or the lack of them. Even when a technical issue is clear, it can still take a long time for the right person or department to respond.

Contractual dependencies

SLAs and support contracts with partners, manufacturers, and integrators. Often, there are multiple contracts, but no one has a complete picture of which parts of the chain are actually covered in the event of an emergency.

Personnel dependencies

Knowledge holders, long-time developers, individual administrators, and informal point people. When their knowledge is not available, troubleshooting becomes slower, riskier, and more expensive.

When Dependencies Meet Contracts

Many companies know that their systems are interdependent, and many have SLAs with partners and vendors. But only in an emergency does it become clear whether these SLAs cover the entire chain of dependencies or just individual parts. That is what decides whether an incident is resolved smoothly or escalates.

What Does SLA Really Mean in IT Operations?

SLA stands for Service Level Agreement. It is a contractual agreement between you and a service provider that defines which operational performance is guaranteed.

Typical provisions include availability, response times, recovery times, support hours, escalation levels, and communication channels. An SLA thus describes how quickly someone must respond and what level of service is guaranteed. However, it does not automatically specify who is responsible along the entire system chain if the root cause lies across multiple systems.
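Two of these provisions are easy to confuse: response time (the provider acknowledges the incident and begins handling it) and recovery time (the service is restored and usable again). The following is a minimal sketch of how both targets could be checked against one incident. The class names `SlaTerms` and `Incident`, the field names, and the example figures (30 minutes to respond, 4 hours to recover) are assumptions chosen for illustration, not terms from an actual contract.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class SlaTerms:
    """Hypothetical SLA terms; the values used below are illustrative only."""
    response_time: timedelta  # provider must acknowledge and start handling
    recovery_time: timedelta  # service must be restored and usable again


@dataclass(frozen=True)
class Incident:
    reported_at: datetime
    acknowledged_at: datetime
    restored_at: datetime


def check_sla(terms: SlaTerms, incident: Incident) -> dict[str, bool]:
    """Return whether each SLA target was met for a single incident."""
    responded_in = incident.acknowledged_at - incident.reported_at
    recovered_in = incident.restored_at - incident.reported_at
    return {
        "response_met": responded_in <= terms.response_time,
        "recovery_met": recovered_in <= terms.recovery_time,
    }


# Assumed example: acknowledged after 20 minutes, restored after 6 hours.
terms = SlaTerms(response_time=timedelta(minutes=30), recovery_time=timedelta(hours=4))
incident = Incident(
    reported_at=datetime(2024, 1, 8, 9, 0),
    acknowledged_at=datetime(2024, 1, 8, 9, 20),
    restored_at=datetime(2024, 1, 8, 15, 0),
)
result = check_sla(terms, incident)
print(result)
```

In this example the response target is met while the recovery target is missed: the provider reacted in time, yet the service was unavailable for six hours. An SLA that only defines a response time would count this incident as fully compliant.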

Why SLAs Often Don’t Help in an Emergency

Many companies have SLAs, yet outages still take a long time to resolve. The reason is that responsibilities along the chain are not clearly defined.

A typical scenario:

The hosting SLA applies, but the root cause lies in an interface
The manufacturer’s support responds, but the infrastructure is the bottleneck
Multiple partners are involved, and each focuses only on their own part
No one manages the entire chain end-to-end
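The scenario above can be sketched as a simple coverage check: list the components of one process chain, list what each contract covers, and see what is left over. This is a sketch under stated assumptions. The component names, the contract names, and the helper functions `coverage_gaps` and `covering_contracts` are hypothetical and stand in for whatever your own inventory contains.

```python
# Hypothetical process chain; the component names are illustrative only.
chain = ["erp", "erp-production interface", "production system", "hosting"]

# Assumed example data: which contract covers which components.
contracts = {
    "hosting SLA": {"hosting"},
    "ERP vendor support": {"erp"},
    "production system vendor support": {"production system"},
}


def coverage_gaps(chain: list[str], contracts: dict[str, set[str]]) -> list[str]:
    """Return the chain components that no contract covers, in chain order."""
    covered = set().union(*contracts.values())
    return [component for component in chain if component not in covered]


def covering_contracts(component: str, contracts: dict[str, set[str]]) -> list[str]:
    """Return the names of all contracts that cover one component."""
    return sorted(name for name, scope in contracts.items() if component in scope)


gaps = coverage_gaps(chain, contracts)
print(gaps)
```

With this example data, the only uncovered component is the interface between the ERP and the production system. Every individual contract is in place, but the link where the root cause is assumed to sit belongs to no one.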

Approach: Identify dependencies and clarify responsibilities

This isn’t about more documentation. It’s about operational transparency and clear control in an emergency.

Technical Solutions
A comprehensive dependency map shows which systems, interfaces, and components are linked to a process. Critical points such as “single points of failure” become visible. Upstream and downstream connections become clear, allowing you to immediately assess the impact of a disruption or change. Each dependency is rated based on its criticality and how well it is safeguarded.
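A dependency map of this kind can be represented as plain data. The sketch below is one possible model, not a prescribed method: each entry lists requirement groups, and a group counts as safeguarded when it offers more than one alternative. The map contents and the names `DEPENDENCY_MAP`, `is_available`, `transitive_dependencies`, and `single_points_of_failure` are assumptions made for illustration.

```python
# Hypothetical dependency map. Each node lists requirement groups; a group is
# satisfied if at least one of its alternatives is available (redundancy).
DEPENDENCY_MAP: dict[str, list[list[str]]] = {
    "order processing": [["erp"], ["login"]],
    "erp": [["database"], ["erp interface"]],
    "login": [["identity provider a", "identity provider b"]],
    "database": [],
    "erp interface": [],
    "identity provider a": [],
    "identity provider b": [],
}


def is_available(node: str, failed: set[str], dep_map: dict[str, list[list[str]]]) -> bool:
    """A node is available if it has not failed and every requirement group
    has at least one available alternative. Assumes the map has no cycles."""
    if node in failed:
        return False
    return all(
        any(is_available(alt, failed, dep_map) for alt in group)
        for group in dep_map.get(node, [])
    )


def transitive_dependencies(node: str, dep_map: dict[str, list[list[str]]]) -> set[str]:
    """Collect everything the node depends on, directly or indirectly."""
    found: set[str] = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for group in dep_map.get(current, []):
            for alt in group:
                if alt not in found:
                    found.add(alt)
                    stack.append(alt)
    return found


def single_points_of_failure(process: str, dep_map: dict[str, list[list[str]]]) -> list[str]:
    """Return the components whose failure alone makes the process unavailable."""
    return sorted(
        component
        for component in transitive_dependencies(process, dep_map)
        if not is_available(process, {component}, dep_map)
    )


spofs = single_points_of_failure("order processing", DEPENDENCY_MAP)
print(spofs)
```

For this example map, the database, the ERP, the ERP interface, and the login service are reported as single points of failure, while the two identity providers are not, because each can stand in for the other. The same `is_available` function can be used to assess the impact of a planned change by passing the affected components as `failed`.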

Organizational Solutions
Responsibilities are defined along the chain, both internally and externally. Escalation paths are clear and well-rehearsed. SLAs are adjusted to cover real dependencies rather than isolated contractual provisions. Operations become manageable, even when personnel change or partners are replaced.
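A defined escalation path can be written down in a form that leaves no room for discussion during an incident. The following is a minimal sketch, assuming one escalation path per chain segment and one role with overall responsibility. The role names, the time thresholds, and the names `EscalationLevel`, `ESCALATION_PATHS`, `CHAIN_OWNER`, and `contacts_to_notify` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EscalationLevel:
    after_minutes: int  # escalate once the incident has been open this long
    contact: str        # role to notify (hypothetical role names)


# Assumed example: one role holds overall responsibility for the chain.
CHAIN_OWNER = "incident manager"

# Assumed example: one escalation path per chain segment.
ESCALATION_PATHS: dict[str, list[EscalationLevel]] = {
    "erp interface": [
        EscalationLevel(0, "integration partner on-call"),
        EscalationLevel(30, "integration partner team lead"),
        EscalationLevel(120, CHAIN_OWNER),
    ],
    "hosting": [
        EscalationLevel(0, "hosting provider support"),
        EscalationLevel(60, CHAIN_OWNER),
    ],
}


def contacts_to_notify(segment: str, minutes_open: int) -> list[str]:
    """Return every contact whose escalation threshold has been reached.

    Segments without a defined path go straight to the chain owner, so an
    incident never waits on a discussion about who is responsible.
    """
    path = ESCALATION_PATHS.get(segment)
    if path is None:
        return [CHAIN_OWNER]
    return [level.contact for level in path if minutes_open >= level.after_minutes]


print(contacts_to_notify("erp interface", 45))
print(contacts_to_notify("unknown middleware", 5))
```

After 45 minutes, an incident in the ERP interface has reached the on-call contact and the team lead. An incident in a segment nobody has claimed goes directly to the incident manager. The fallback is the important design choice here: a gap in the registry results in a clear owner rather than a round of phone calls.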

Target state: How should your operations feel?

In an emergency, there is no longer a chain of phone calls to determine responsibilities. Instead of uncertainty, there is clarity:

You know which dependency affects which process
You know immediately who is responsible for which issue
Escalation proceeds in a structured and rapid manner
SLAs work because they fit the system chain
Operations remain stable, even when key personnel are absent

Operational reliability is achieved when the system landscape, responsibilities, and SLAs are considered together.

Frequently asked questions

What should an SLA include?

An SLA should at least specify availability, response time, recovery time, support hours, escalation levels, and communication channels. Crucially, it must apply to critical services and processes, not just “support.” An SLA without a clear recovery time is often insufficient in an emergency.

What is the difference between response time and recovery time?

Response time means: The provider acknowledges the incident and begins handling it within a specified timeframe. Recovery time means: The service is restored and usable again within a specified timeframe.

Why doesn’t our SLA help in the event of a failure anyway?

Because an SLA often covers only part of the issue, such as hosting or application support. However, the root cause frequently lies between systems, for example in an interface or a data flow. In such cases, all partners respond within their own areas, but no one manages the entire chain. Without an overview of dependencies and clear overall responsibility, resolution becomes a coordination challenge.

Who is responsible in the event of an incident when multiple providers are involved?

A clear role is needed to take charge of incident management and decision-making, regardless of where the technical cause lies. In practice, this means: defined responsibilities, clear escalation procedures, and a clear chain of command so that there is no need to first discuss who is responsible.

What is a single point of failure, and how do I identify it?

A single point of failure is a component or dependency whose failure halts a critical process. You can identify it using a dependency map: Which systems, interfaces, and partners are connected to a process, where is there no redundancy, and which point, if it fails, brings everything to a standstill?

How can we support your business?

Unclear dependencies and unresolved responsibilities often go unnoticed in day-to-day operations.

We bring transparency to your IT infrastructure, clarify responsibilities throughout the entire chain, and take ownership to ensure your operations run reliably even under pressure.

Direct number

+41 55 253 00 53

Book your initial consultation now!
