These 12 prompts cover the work an IT and infrastructure team actually does: incident postmortems, runbooks, scripting with safety checks, log and root-cause analysis, ticket triage, change comms, security policy, vendor evaluations, onboarding and offboarding, infrastructure docs, troubleshooting trees, and knowledge-base articles. Each one is annotated, uses [BRACKET] placeholders for your details, and is ready to paste into Claude. Scrub your secrets, keep your judgment on the output, and never run a generated command you have not read.
You are an SRE writing a blameless incident postmortem. I will give you the rough timeline and facts. Incident facts (paste): what broke, impact, who was paged, key timestamps, what we tried, what fixed it. [PASTE INCIDENT NOTES, REDACT ANY SECRETS OR CUSTOMER PII] Write the postmortem: 1. A one-paragraph summary a non-technical exec can read: what happened, who was affected, how long, resolved or not. 2. Impact: systems, users, duration, and severity, stated plainly. 3. A clean timeline in UTC: detection, escalation, mitigation, resolution, each with the action taken. 4. Root cause and contributing factors, separating the trigger from the underlying conditions. Where you are inferring, label it "hypothesis" and say what to check to confirm. 5. What went well and what slowed us down, with no blame on individuals. 6. Action items as a table: fix, owner placeholder, priority, and whether it prevents recurrence or just reduces impact. Keep it factual and specific. Do not invent timestamps or causes I did not provide; ask if something critical is missing.
Turn my rough notes into a clear operational runbook that a competent on-call engineer who has never done this task can follow. Task: [WHAT THE RUNBOOK IS FOR, e.g. failover the primary database] Environment: [OS / PLATFORM / TOOL VERSIONS] Rough steps (paste, in any order): [PASTE YOUR NOTES, COMMANDS, AND GOTCHAS] Produce the runbook: 1. Purpose and when to use it (and when not to). 2. Prerequisites: access, tools, and approvals needed before starting. 3. Numbered steps with the exact command or action for each, plus the expected result so the engineer can confirm it worked. 4. A clear "stop and escalate" rule for each risky step. 5. Verification steps to confirm the system is healthy afterward. 6. A rollback section: how to undo each major step if it goes wrong. Flag any step that is destructive or irreversible in bold so it cannot be missed. Where my notes are ambiguous, mark it as a question to resolve rather than guessing.
Write a [POWERSHELL / BASH] script for the task below. Safety matters more than cleverness. Goal: [WHAT THE SCRIPT SHOULD DO] Environment: [OS AND VERSION, SHELL VERSION, ANY RELEVANT TOOL VERSIONS] Inputs and constraints: [PATHS, HOSTS AS PLACEHOLDERS, WHAT IT MUST NOT TOUCH] Requirements for the script: 1. Start with a comment block: what it does, assumptions, and what could go wrong. 2. Validate inputs and fail clearly with a helpful message if something is missing. 3. Include a dry-run or -WhatIf mode that shows what would change without changing it, and make dry-run the default where possible. 4. Add a confirmation prompt before any destructive action (delete, overwrite, restart). 5. Use safe defaults: stop on error, no silent failures, log what it does. 6. After the code, list every destructive or risky line in plain English and tell me exactly what to test before running it for real. Target only the environment I named; do not use flags or cmdlets that exist on a different OS or version. If a safer approach exists, recommend it.
You are a senior systems engineer analyzing logs to find the likely root cause. I will paste log output. Context: [WHAT SYSTEM, WHAT SYMPTOM, WHEN IT STARTED, WHAT CHANGED RECENTLY] Logs (paste, redact secrets and internal hostnames you do not want shared): [PASTE LOG EXCERPT] Do this: 1. Summarize what the logs show in plain English: the sequence of events and where it goes wrong. 2. Identify the first error or anomaly that matters, separating it from downstream noise that is just a symptom. 3. Give me a ranked list of root-cause hypotheses, most to least likely, each with the evidence in the log that supports it. 4. For each hypothesis, the one quick check or command that would confirm or rule it out. 5. Flag anything in the logs that looks like a security event, not just a fault. Reason only from the evidence provided. If the logs are insufficient to be confident, say so and tell me what additional log source or time window would help.
Act as a help-desk lead triaging inbound IT tickets. I will paste a batch of tickets. Our categories: [LIST, e.g. access, hardware, network, software, security, how-to] Our priority scale: [DEFINE P1 to P4 OR YOUR SCHEME] Tickets (paste: id, requester role, subject, body): [PASTE TICKETS, REDACT PERSONAL DATA NOT NEEDED FOR ROUTING] For each ticket, output a row with: 1. Suggested category and a suggested priority, with a one-line reason. 2. The team or queue it should route to. 3. Whether it looks like a duplicate or part of a wider pattern across the batch. 4. A flag if it may be a security or compliance issue that needs escalation, not normal triage. 5. For simple, well-defined ones, a suggested first response the agent can edit and send. End with a short summary: any spike or pattern across the batch (for example several tickets pointing at the same outage) that suggests a single underlying problem. Do not guess at priority when the ticket is vague; mark it "needs clarification" and say what to ask.
Build a troubleshooting decision tree for a recurring problem so a tier-1 tech can work it before escalating. Problem: [DESCRIBE THE SYMPTOM, e.g. user cannot connect to VPN] Environment: [PLATFORM, CLIENT, ANY RELEVANT VERSIONS] What I already know about common causes: [PASTE WHAT YOU HAVE] Produce the decision tree: 1. Start with the fastest, most common checks first (the ones that resolve the most tickets). 2. Lay it out as branching yes/no questions, each leading to either a fix or the next check. 3. For each check, the exact thing to look at or command to run and how to read the result. 4. Clear escalation points: when tier-1 should stop and hand to tier-2 or networking, and what to capture before they do. 5. A short list of the data to collect up front so escalation is not a back-and-forth. Keep the early branches simple enough for a new technician. Mark any step that touches a production setting or could disrupt the user, and tell them when to get consent first.
Draft the communications for a planned change and maintenance window. Change: [WHAT IS CHANGING AND WHY, IN PLAIN TERMS] Window: [DATE, START AND END TIME, TIME ZONE] Impact: [WHO AND WHAT IS AFFECTED, EXPECTED DOWNTIME, ANY ACTION USERS MUST TAKE] Audience: [ALL STAFF / A DEPARTMENT / EXTERNAL CUSTOMERS] Rollback plan: [WHAT HAPPENS IF IT GOES WRONG] Write three short messages: 1. Advance notice (sent days ahead): what, when, why, the impact, and any action needed, calm and specific. 2. Start-of-window note: we are beginning, expected duration, where to watch for updates. 3. Completion note: done, current status, what to do if something still looks off, and who to contact. Also give me a one-line internal note for the change record and a brief entry for the change-approval board. Keep user-facing copy jargon-light and reassuring without overpromising. Do not use em dashes or en dashes anywhere.
Draft a [SECURITY POLICY / ACCEPTABLE USE POLICY] for a [COMPANY SIZE AND INDUSTRY] organization. I will refine it; I need a strong, clear first draft. Scope and intent: [WHAT THIS POLICY MUST COVER, e.g. acceptable use of company devices and accounts] Constraints I care about: [ANY REGULATION, EXISTING TOOLS, OR REALITIES, e.g. BYOD allowed, SSO in place] Write the policy: 1. Purpose and scope: who and what it applies to, in one short section. 2. Clear rules in plain language, grouped logically (accounts and passwords, devices, data handling, acceptable use, reporting incidents). 3. For each rule, keep it specific and enforceable; avoid vague phrasing nobody can act on. 4. A short section on consequences and the exception process. 5. A list of decisions or specifics a human owner still needs to fill in (thresholds, named tools, contacts), so I do not mistake a draft for a finished policy. Make it readable for non-technical staff, not just IT. This is a starting draft for review by security and legal, not legal advice, and a person must own the final version.
Help me build a vendor evaluation matrix to compare [N] tools for [USE CASE, e.g. endpoint management]. Candidates: [LIST THE TOOLS] What matters to us, roughly weighted: [e.g. security and compliance, integration with our stack, total cost, support, scalability, ease of admin] Constraints: [BUDGET RANGE, MUST-HAVE INTEGRATIONS, TEAM SIZE, ANY DEALBREAKERS] What I know about each (paste any notes, pricing, or specs you have): [PASTE WHAT YOU HAVE] Produce: 1. A scoring matrix: criteria as rows, tools as columns, with weights shown. Score only on the evidence I provided and mark cells "unknown" where I have no data rather than guessing. 2. A short narrative on each tool's clear strengths and real weaknesses for our use case. 3. The two or three criteria that should actually decide this, given our constraints. 4. The specific questions I should ask each vendor to close the biggest information gaps before deciding. 5. A provisional recommendation, stated as provisional, with what would change it. Be honest about uncertainty and do not let a tool win on a criterion where I gave you no data.
Build an IT [ONBOARDING / OFFBOARDING] checklist for a [ROLE, e.g. new sales hire] at a [COMPANY SIZE AND INDUSTRY] company. Our stack (paste): [IDENTITY PROVIDER, EMAIL, CORE APPS, DEVICE TYPE, MDM, VPN, ANY ROLE-SPECIFIC TOOLS] Anything special about this role: [ELEVATED ACCESS, SHARED MAILBOXES, HARDWARE NEEDS] Produce the checklist: 1. Grouped by phase: before day one, day one, and first week (for onboarding) or immediate, within 24 hours, and within a week (for offboarding). 2. Each item with the system, the action, and the owner placeholder (IT, manager, security). 3. For offboarding, put account and access revocation first and call out anything time-sensitive (SSO disable, MFA tokens, privileged access, third-party SaaS not behind SSO, hardware return, data handoff). 4. A least-privilege note: access this role should get and access it should not. 5. A final verification step to confirm nothing was missed, with the things most commonly forgotten. Be thorough on access, especially for offboarding; an orphaned account is a security risk. Mark any step that needs manager or security sign-off.
Turn my rough notes into clear infrastructure documentation a new team member or auditor could follow. System: [WHAT THIS DOCUMENTS, e.g. our production web stack] Rough notes (paste: servers, services, networks, dependencies, how data flows, redact anything sensitive you do not want shared): [PASTE NOTES] Produce the documentation: 1. An overview: what this system does and how the major pieces fit together, in a few sentences. 2. Components: each server, service, or resource with its role, and the environment or tier it sits in. 3. Data flow and dependencies: what talks to what, in what direction, and what breaks if a given piece goes down. 4. Access and ownership: who owns each piece and how it is accessed (described, not actual credentials). 5. Known risks and single points of failure visible in my notes. 6. A "to confirm" list of anything ambiguous in my notes that a human should verify before this is trusted as canonical. Use placeholders for hostnames, IPs, and secrets; never echo a credential even if I accidentally include one. Tell me where my notes are thin so the gaps are visible.
Turn this resolved ticket into a reusable knowledge-base article so the next person can self-serve. Resolved ticket (paste: the problem as reported, what was actually wrong, the steps that fixed it): [PASTE TICKET, REDACT THE REQUESTER'S PERSONAL DATA] Audience: [END USERS / IT STAFF] Write the article: 1. A clear, searchable title phrased the way someone would actually look it up. 2. Symptoms: how to know this is the article you need, in the user's words. 3. Cause: a short, plain explanation of what is going on, only if it helps the reader. 4. Resolution: numbered steps written for the audience, with the exact action for each and what success looks like. 5. If the fix needs IT or admin rights, say so up front and give the user the self-serve part plus when to open a ticket. 6. Related issues or follow-up to prevent it recurring. Keep it concise and free of internal jargon for an end-user audience. Do not include any real names, internal hostnames, or credentials from the source ticket.
Drop these into a Claude Project loaded with your team's context: your environment details, your runbook standards, your ticket categories, your security policies, and examples of past write-ups you were happy with. With those inputs, the prompts above produce outputs far better than running them in a blank Claude chat, because the model already knows your stack, your tone, and your bar.
Keep secrets out of that Project too. Load redacted templates and sanitized examples, not live credentials. See an AI Audit to find the IT workflows worth automating first, or have Treetop build the IT Project for your team.