Blink's solution to common use cases
Automating incident investigation and remediation
Blink’s mission is to help you, as software developers and SREs, leave work at work. But what happens when there are problems in your cloud-based production system? In such cases, whether you are at home or in the office, you and your company need to investigate and remediate the issue as quickly as possible.
Here are some of the challenges you face:
Understanding the alert
Often the receiver of an alert does not have the knowledge needed to even begin investigating, so precious time is wasted on interpreting the issue.
Gathering troubleshooting data
To analyze an alert, you need to gather logs, metrics, and other diagnostic data from multiple sources. Do you know what you need and do you have the necessary tools and permissions to gather the data?
Determining ownership
A common time-waster is figuring out which team is responsible for solving a problem. Answering this question may involve multiple rounds of discussion between the Dev and DevOps teams, as they gather and analyze diagnostic data. These delays cause longer outages and a growing MTTR (mean time to recovery).
Remediating the issue
Once the source of an issue is identified and you know how to fix it, you may still be delayed for other reasons. For example, you may need to request permission to perform a required step, or you may lack the tools or access levels needed.
Blink's solution
A Blink Automation, integrated with the relevant cloud systems, can handle the challenges listed above automatically.
An alert triggers a Blink Automation to run a response to the incident automatically. The Automation, depending on its configuration, can fetch the diagnostic data, enrich the alert, and remediate the issue.
- Automations can be configured with varying degrees of automation versus user intervention. Once initiated, an Automation can run non-stop from beginning to end, or you can define steps along the way that wait for user input or approval.
- Throughout its run, the Automation displays annotated details of the decision flow and the current step. This provides the user with essential explanations.
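The flow described above (fetch diagnostic data, enrich the alert, then remediate, optionally pausing for approval) can be sketched in a few lines of Python. This is only an illustration of the pattern, not Blink's API: the `Alert` class, the `run_incident_workflow` function, and the callables passed to it are hypothetical names invented for this example, and the sample values are made up.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Alert:
    """Hypothetical alert payload; the field names are illustrative only."""
    service: str
    message: str
    context: Dict[str, str] = field(default_factory=dict)


def run_incident_workflow(
    alert: Alert,
    fetch_diagnostics: Callable[[str], Dict[str, str]],
    remediate: Callable[[str], str],
    approve: Callable[[str], bool],
) -> List[str]:
    """Run the three stages in order and return an annotated log of the steps."""
    log: List[str] = []

    # Stage 1: gather diagnostic data for the affected service.
    diagnostics = fetch_diagnostics(alert.service)
    log.append(f"fetched {len(diagnostics)} diagnostic items for {alert.service}")

    # Stage 2: enrich the alert so the responder sees the context alongside it.
    alert.context.update(diagnostics)
    log.append("enriched alert with diagnostics")

    # Stage 3: wait for approval before changing anything in production.
    if approve(f"Restart {alert.service}?"):
        log.append("remediation: " + remediate(alert.service))
    else:
        log.append("remediation skipped: approval denied")
    return log


if __name__ == "__main__":
    # Made-up sample data, used only to exercise the sketch.
    alert = Alert(service="checkout", message="HTTP 5xx rate above threshold")
    steps = run_incident_workflow(
        alert,
        fetch_diagnostics=lambda svc: {"pod_restarts": "3"},
        remediate=lambda svc: f"restarted {svc}",
        approve=lambda prompt: True,  # fully automated run: always approve
    )
    for line in steps:
        print(line)
```

Swapping the `approve` callable for one that asks a person corresponds to defining an approval step along the way, and the returned log plays the role of the annotated step-by-step details shown during a run.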
Examples of solutions include:
- Custom Slack notifications
- Alerts enriched with threat intel
- Incident response and mitigation
- Admin operations
- On-call runbooks
- Approval flows
- Running security checklists
- Issue tracking and alerts
Automating repetitive tasks
Cloud operations require the skilled management of qualified DevOps personnel, whose time is expensive and in high demand.
A common frustration is that repetitive, multi-step day-to-day operations, such as onboarding a new employee, or spinning up and configuring a Kubernetes cluster, consume considerable DevOps time. Another pain point caused by DevOps overload is the delay of essential requests from Dev and others in the organization. Expensive Dev resources should never be bottlenecked in their work due to DevOps backlog.
Blink's solution
A Blink Automation can perform a time-consuming DevOps task with one click on the Run button. Create and automate a workflow once, and save valuable time on every later run. With Blink, tasks that occur daily are effortless to repeat, and no resources are wasted.
Examples of solutions include:
- Custom Slack notifications
- SQL-like query interface
- Admin operations
- On-call runbooks
- Approval flows
- Running scheduled jobs
- Long-running tasks
- Running security checklists
- Consuming and exposing a Terraform project as a service through an Automation. Follow the detailed tutorial here.
Creating and publishing Automations to a Self-service portal
Make your organization more efficient by reducing bottlenecks and response time between and within departments. DevOps and SREs (Site Reliability Engineers) get many requests and need a way to deal with them efficiently. Blink enables you to create Automations of workflows and securely share them with other teams and workspaces.
The Automations are published as self-service apps and grouped in portals. Access to portals is managed by the organization’s directory. Users who are granted access can run the Automations themselves and move on to the next important job, without waiting for someone to perform the task for them. Little time is wasted on finding the right person and then waiting for them to do their part. Self-service cuts out the middleman, enabling a more efficient work process.
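As a rough illustration of directory-managed access, the sketch below models a portal as a named group of automations that only members of certain directory groups may run. It is a simplified assumption about how such a check could work, not a description of Blink's implementation: the `Portal` class, its fields, and the group names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set


@dataclass
class Portal:
    """Hypothetical self-service portal: automations grouped behind an access check."""
    name: str
    allowed_groups: Set[str]
    automations: Dict[str, Callable[[], str]] = field(default_factory=dict)

    def run(self, user_groups: Set[str], automation: str) -> str:
        # Access is granted when the user belongs to at least one allowed group.
        if not (user_groups & self.allowed_groups):
            raise PermissionError(f"no access to portal '{self.name}'")
        return self.automations[automation]()


if __name__ == "__main__":
    portal = Portal(
        name="dev-environments",
        allowed_groups={"developers"},  # made-up directory group name
        automations={"pause-env": lambda: "environment paused"},
    )
    # A developer runs the automation directly, without filing a request.
    print(portal.run({"developers"}, "pause-env"))
```

The point of the pattern is that the access decision is made once, at the portal level, so a user with access does not need anyone else to run the task.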
Examples of solutions include:
- Automatically configuring new cloud environments
- Quickly scaling your cloud resources up or down
- Updating permissions for a specific user on a specific resource
- Raising or pausing a specific developer environment