Chapter 4 – Exercise Plans and After-Action Reports

Well, this feels a bit like saying, “Goodbye!” to an old friend.  It’s hard to believe we are seven weeks into this BCP series.  Hopefully, when you decide to tackle this for your organization, this series will help you develop an effective business continuity plan.

Part 7 covers exercising (working out) your plans and documenting the good, the bad, and the ugly…also known as the After-Action Report (AAR).  Anyone who has followed my posts for a while should expect Clint Eastwood films to show up from time to time, especially the Westerns. Okay, where were we? Finding the weak points in our plan is a success if we can incorporate our learning into making the plan better.  Since we are now nearly 82% complete according to the Business Continuity Plan Generator, let’s wrap things up in this final session.


4.1  Business Continuity Plan Exercise Methodology

Borrowing straight from the app, we find four methods that can be used to validate our plan:

  • Tabletop Exercise – key personnel discussing simulated scenarios in an informal setting.
  • Functional Exercise – simulates the reality of operations in a functional area by presenting complex and realistic problems.
  • Full Scale Exercise – real operations in multiple functional areas present complex and realistic problems that require critical thinking, rapid problem solving, and effective responses by trained personnel.
  • Drill – coordinated, supervised activity usually used to test a single specific operation or function.

Starting out, the Tabletop method will be the easiest to implement.  The goal should be to increase the cadence and rigor of your tests over time.  You will want to mix up scenarios and test different department functions and roles under a variety of conditions.  Be sure to schedule this with your IT and other supporting vendors or partners, if necessary, to ensure full participation.  Our vendor for BCDR backups and cloud virtualization wants advance notice of the drill although, if pressed, they will accommodate “spinning up” the virtual servers in the cloud environment on short notice.  Maybe that’s a better measure of their true capabilities (hint, hint)?

4.2  Exercise Objectives

This section documents the desired objectives of your test.  These goals should be SMART: Specific, Measurable, Achievable, Relevant, and Time-bound.

Further, per the Plan Generator guidance, you want your objectives, at a minimum, to accomplish the following:

  • Determine the state of readiness of your BCP by creating a learning environment for all participants to learn about the plan.
  • Validate the BCP resource lists — people and inventories are sufficient to effect recovery of business operations and/or IT services as appropriate. Document changes and updates (including omissions) to the BCP.
  • Verify the information in the BCP is current and accurately reflects the organization’s requirements.

There is a table to enter these or other objectives to document what we expect to get out of our AARs.  Additional guidance suggests running separate tests: one for the IT staff that assesses technical capabilities, and another for the end users, who will not benefit from being in the middle of a technical drill.

Section 4.2 also outlines a timeline of tasks occurring as early as 90 days prior to the test and covers post-exercise steps.  Something like a Tabletop review will be much less formal.

4.3  Developing the Exercise Scenario

Here is where we develop actual testing scenarios.  A fun way to do this might be to write up many different scenarios ahead of time and then pick one at random for the test.  You will want these to be somewhat within the realm of possibility rather than always going for the Black Swan event like “An asteroid hit the city and there is no human life left within 100 miles of the crater.”  While exciting to discuss, the AAR is likely to be brief with few actionable takeaways.
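The pick-one-at-random idea can be sketched in a few lines of Python.  Treat this as a minimal illustration only: the scenario descriptions and the pick_scenario helper are made up for the example and are not part of the Plan Generator.

```python
import random

# Hypothetical scenarios written up ahead of time; replace with your own.
SCENARIOS = [
    "Ransomware encrypts the main file server",
    "Office is inaccessible for a week after a burst pipe",
    "Primary Internet provider is down for 48 hours",
    "A key vendor's cloud service suffers a regional outage",
]


def pick_scenario(scenarios, rng=None):
    """Return one scenario chosen at random from the prepared list."""
    if not scenarios:
        raise ValueError("Write up at least one scenario first")
    if rng is None:
        rng = random.Random()
    return rng.choice(scenarios)


if __name__ == "__main__":
    print("Today's exercise scenario:", pick_scenario(SCENARIOS))
```

Keeping the list in a shared file lets the team add plausible scenarios over time while nobody knows in advance which one will be drawn.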

4.4  Exercise Evaluation

The written evaluation of an exercise is most commonly referred to as the After-Action Report (AAR).  This section provides a template for how the report should be written.  The key to a successful test is to have clear (read: SMART) objectives, use a rigorous testing scenario, and document every minute detail that could help show what went to plan and what needs additional work.  The goal should be to learn something actionable; otherwise, the inputs to the test (i.e., scope, depth, and rigor) likely need adjusting.  If, despite raising the bar each time, you are not finding failure points, it just might be a sign your business continuity capabilities are robust and hold up under pressure.  But I would be skeptical of this notion, at least.

Decide who should be copied on the AARs and distribute the report accordingly.  If there are action steps coming from the AAR, be sure you define “who, does what, and by when”.  And that needs to be a person accountable to the task, even if this involves delegating to others.  The adage remains true, “If everyone is responsible, no one is.”
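One way to keep “who, does what, and by when” honest is to track each action step as a small record with exactly one named owner.  The sketch below is just one possible shape for that; the ActionItem class, the overdue helper, and the sample names and dates are all invented for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ActionItem:
    """One follow-up from an AAR: who does what, by when."""
    owner: str          # a single accountable person, not a team or "everyone"
    task: str
    due: date
    done: bool = False


def overdue(items, today):
    """Return open action items whose due date has already passed."""
    return [i for i in items if not i.done and i.due < today]


# Illustrative entries only; names and dates are invented.
items = [
    ActionItem("Pat", "Update the team call list", date(2024, 3, 1)),
    ActionItem("Sam", "Confirm vendor spin-up time for cloud servers", date(2024, 4, 15)),
    ActionItem("Pat", "Restock items on the supplies list", date(2024, 2, 1), done=True),
]

for item in overdue(items, today=date(2024, 3, 15)):
    print(f"OVERDUE: {item.owner} - {item.task} (due {item.due.isoformat()})")
```

The owner can still delegate the work, but the record always points back to one person who answers for it.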

4.5  Exercise Reports

Last, but not least, section 4.5 provides a table where we can record the Test Number, Date, Exercise Type, and Plan Area Exercised.  A copy of each AAR associated with the documented test should be added to the electronic and physical copies of the plan.
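If you would rather keep this log somewhere other than the app, the same four columns fit in a very small data structure.  This is a rough sketch, not how the Plan Generator stores anything; the ExerciseRecord class, the format_log helper, and the sample rows are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ExerciseRecord:
    test_number: int
    held_on: date
    exercise_type: str   # Tabletop, Functional, Full Scale, or Drill
    plan_area: str


def format_log(records):
    """Render the log as a plain-text table, oldest test first."""
    header = f"{'Test #':<8}{'Date':<12}{'Exercise Type':<16}Plan Area Exercised"
    rows = [
        f"{r.test_number:<8}{r.held_on.isoformat():<12}{r.exercise_type:<16}{r.plan_area}"
        for r in sorted(records, key=lambda r: r.held_on)
    ]
    return "\n".join([header, *rows])


# Sample rows, invented for illustration.
log = [
    ExerciseRecord(2, date(2024, 9, 12), "Drill", "Cloud server spin-up"),
    ExerciseRecord(1, date(2024, 3, 5), "Tabletop", "Team call list"),
]

print(format_log(log))
```

Each row should have a matching AAR filed alongside it, so the test number is a handy key for naming those report files.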

Wrapping Up and Changing Gears

If you have followed along for the past seven articles, you have a good idea of what it takes to develop your own business continuity plan.  If you have done the work through each step of the way, all the better and now pat yourself (and your team) on the back.

So what? Now what?  First, do not underestimate the strategic importance of having this plan in place.  It’s hard work at first but will get easier over time.  And when the worst happens, the plan will pay huge dividends, possibly being the one thing that saves your company.  Okay, let’s all take a deep breath before I say – But wait…there’s more!  Next week we will shift gears from business continuity to disaster recovery.  To build your Disaster Recovery Plan, we will leverage the second half of the tool and TCS will walk you through all of the steps just like we have done here.

Until next time, I think we should spike the football, have a victory dance, or engage in any other celebration of choice for getting to this point. Kudos from the team here at TCS! We would love to hear your success stories or help you along this journey, so don’t hesitate to give us a ring if we can help. TTFN!

Chapter 3 – Plan Administration and Maintenance

Folks, we are in the home stretch now.  Our BCP app fun meter shows we are roughly 2/3rds of the way to spiking the football on our plan.  Let’s take a minute to reflect on the journey so far.  In Chapter 1 we defined the scope, policy, initial assumptions, and objectives to set the rails for our plan.  From there, we performed a risk assessment and business impact analysis.  After that we were able to clarify our business continuity strategy and start to organize our plan based on the roles in our organizational chart and physical facilities.

Chapter 2 had us identify and document our teams, outline essential tasks and actions during a crisis, and compile lists of key contacts and mission-critical equipment, software, and supplies.  As a result of the work thus far, we have a plan and we know what we know and what we need.  This should inform how we stage or maintain ready access to the minimum items and information required to sustain business operations.  Now we will shift gears into the administration and maintenance of the plan.  This part is what will make the difference between an old dusty binder on the shelf versus a living and active process that is strategic to the health and sustainability of the organization when the worst happens.

In the first section of Chapter 3 we define the high-level guidance to govern the actions of the Business Continuity Team Leads (non-Service management in our case).  We adopted the sample text in our case with slight modifications.  Specifically for TCS, we will not need an alternate recovery site as our work from home process is sufficient to maintain operations through the recovery period.  A key recommendation is, “The most successful planning teams are limited in size, have a formal membership, regularly scheduled meetings, and members are designated in writing.”  Since we use EOS as our management framework, we can incorporate the ongoing maintenance of our plan into our quarterly planning sessions which will support turning identified “Issues” into quarterly goals (“Rocks”) or shorter term action items (“To Dos”).  This provides the linkage we need to bake this into our ongoing process and meetings to give the plan its proper attention and focus.

3.1  Functional Teams Responsibilities

In this section we define pre- and post-disaster responsibilities.  Pre-disaster items include areas like awareness and training, evacuation drills, and developing alternate site capabilities.  We do not want an actual disaster to be the first time we have thought about these things.  General George Patton said, “You fight like you train.”  Another sentiment, expressed by Harold Craxton, a one-time professor at the Royal Academy of Music, stresses, “Amateurs [musicians] practice until they can get it right; professionals practice until they can’t get it wrong.”  Pick your inspiration, but the point remains – we need to pay more than lip service to the preparation and practice of our plan in order to expect it to pay dividends when we put it into action.

3.2  Business Continuity Plan Administration

In this section we define who is responsible for developing training materials and how often training and drills will be conducted.  Your mileage may vary, but we opted for annual training and biannual drills.  The reality for TCS is working remotely is so engrained into our normal process, and our systems are mostly cloud hosted, that we routinely operate in a similar manner as we would in a business continuity situation.  This allows the main thing, Service delivery, to be somewhat assumed and frees management to focus on communication, coordination, and recovery which significantly enhances the capabilities of our small management team by not spending vital energy fixing operations to support Service.  The other benefit is our clients will be less impacted by an event affecting TCS and we don’t want to minimize the importance of that.

3.3  BCP Awareness & Training

Here we will outline the annual events and supporting documentation for our ongoing awareness and training.  To not reinvent the wheel, the guidance provided in the app is solid: “Employee newsletters are a great tool to keep awareness high in between annual events. They are also the perfect venue to remind employees about seasonal hazards like severe winter storms, flooding, hurricanes, tornadoes, etc. Helping to keep your employees personally prepared and resilient will help the company be more resilient as well. The Federal Emergency Management Agency (FEMA) has an excellent Web site: http://www.ready.gov that provides free resources for both personal and business preparedness. In addition, FEMA is the executive agent for the Department of Homeland Security’s National Readiness Month in September of each year. This is a great time to work with local emergency response agencies to give special presentations that focus on personal and business readiness.”

Having a folder of content ready to go for employee onboarding, quarterly employee emails, and annual training will ensure you can easily maintain the ongoing effort without much hassle.  These resources can be found online, as mentioned above, so download some posters, graphics for emails, and PDF one-page handouts and you should be set.  Don’t spend time developing anything you can find free on the web.

3.4  Exercising (Testing) the BCP

This section is straightforward and simply documents the date, type of exercise, purpose, and participants of each BCP test.  This can be a “Table Top” test where you verbally talk through a scenario and discuss how your plan would apply, noting any deficiencies.  On the other end of the spectrum you can do a full live BC drill where you will operate in the same manner as if a disaster actually occurred.  These routine tests will help pressure test your plans and find areas where improvement is needed.  I have conducted a number of BCP tests for clients and have written After-Action Reports (AARs) to document the good, the bad, and the ugly.  This is good to do on an annual basis, and it is a requirement for some of our regulated clients.  Feedback from testing will help inform necessary improvements to your plan and capabilities to better support the organization in a real disaster.

3.5  Business Continuity Plan Maintenance

Very simple – document the revision history of your plan along with a brief summary of changes to the plan.  Nothing more to do or add here.

3.6  Business Continuity Plan Approvals

Much like section 3.5, this is a straightforward, but essential step – someone in senior management needs to sign-off on each revision of the plan.

At this point we have a Business Continuity Plan, we have documented the supporting details to execute the plan, and we have incorporated the ongoing administration and maintenance of our plan into our strategic business management process.  We have a way to train, test, and update the plan.  Next week we will take a deeper dive into exercising our plans and producing after-action reports.  TTFN!

Chapter 2 – Critical Business Information

Hopefully we have all recovered from the somewhat heavy lifting of doing the risk assessment and business impact analysis.  In this chapter we will document several different lists that may be needed during a business continuity scenario.  Most of this information you will already have in various systems like a Customer Relationship Management (CRM) or accounting/billing system.  And it will be good to have all of this captured in one document just in case.  For our purposes, TCS will define the following teams: Management, Service, Human Resources, and Finance.  You could start with a similar structure, depending on your organization, and adjust later if needed.

2.1  Team Call List

In this section we will document the home and work contact information for each employee under his or her respective team.  Easy peasy.

2.2  Team Task List

Now we will define, for each team, a set of tasks to be performed throughout the duration of the business continuity event.  This is a who/what/when list for each department in our case.  Think through your business “People, Process, and Technology” structure again and identify the essential tasks to be performed, deprioritizing the non-essential tasks if necessary.  And, yes, I intend to get maximum use of our PPT graphic since this concept keeps coming up in our discussion.

2.3  Team Action Plan

The Action Plan will feel somewhat redundant to the Task List you just created.  The way the tool (and sample text) treats this section is defining the Continuity-specific responsibilities and tasks on a per-team and per-site basis where the previous list dealt with the continuation of routine work functions.  It is likely some personnel will have more tasks defined in one list than the other depending on the specific role and delegated tasks.  Some (managers in the case of TCS) will coordinate more heavily on managing the continuity and recovery communication and coordination where our Service Team will remain mostly client facing.  This is a little nuanced and the most important thing is the essential tasks are defined in your plan in one place or the other.

2.4  Team Customer List

Now we will create a list of our key customer contacts.  Since the app provides a separate list for each team, you may want to separate a billing contact versus other contacts.  For TCS this includes a list of our technical points of contact and billing contacts.  If you have defined a Management team like we have, then you could also list key management contacts in that section separate from your primary contacts.

2.5  Team Critical Equipment List

Continuing with our critical lists…we want to document essential equipment.  Think through what items you might need to keep your department going and enter the item, quantity, vendor source, item number, per item cost, and total cost for each row.  As before, the list will be broken down by department.  It would be a good idea, if you have a redundant site or an agreement to use another facility, to keep enough inventory stored to provide quick access to essential equipment.  Otherwise, try to identify a local vendor where you would acquire the equipment with short notice.
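The columns in this list boil down to simple arithmetic: total cost is quantity times per-item cost, summed per department.  Here is a small sketch of that calculation; the items, vendor name, item numbers, and prices are placeholders I made up, not figures from the app.

```python
from decimal import Decimal

# Placeholder rows; swap in your own items, vendors, and prices.
equipment = [
    {"item": "Laptop", "quantity": 3, "vendor": "Example Vendor",
     "item_number": "LT-100", "unit_cost": Decimal("899.00")},
    {"item": "Headset", "quantity": 3, "vendor": "Example Vendor",
     "item_number": "HS-20", "unit_cost": Decimal("45.50")},
]


def total_cost(row):
    """Total cost for one row: quantity times per-item cost."""
    return row["quantity"] * row["unit_cost"]


def grand_total(rows):
    """Sum of the row totals for a whole department list."""
    return sum((total_cost(r) for r in rows), Decimal("0"))


for row in equipment:
    print(f"{row['item']:<10}{row['quantity']:>3} x {row['unit_cost']:>8} = {total_cost(row):>9}")
print(f"Department total: {grand_total(equipment)}")
```

Decimal is used instead of float so the dollar amounts add up exactly, which matters when the list feeds a budget request.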

2.6  Team Software List

Now we will create a list of software required to run your business.  If you are leveraging hosted or cloud software, then it may be possible to operate with minimal critical requirements for downloadable software.  In fact, a key business strategy would be to move in that general direction.  For example, if you are using Microsoft 365 with hosted email and cloud document storage, perhaps you could get away with using a Chromebook (or any device with a browser) and get by until your normal setup could be fully restored.  Another effective solution, especially for line-of-business applications that are not yet cloud-ready, is to use a Terminal Server for remote access to these applications. TCS offers a cloud backup solution where the Terminal Server can be run virtualized in the cloud until the on-premise servers and data are rebuilt. We are aiming for the lowest-cost minimum functionality needed to run the business and the trend is moving in this direction, so it’s worth checking with your IT company and software vendors to assess these capabilities before a disaster.  In fact, I wrote my first article for TCS on this subject a little over a year ago: Strategy: Turning Your Business Inside Out. While not ideal, my computing requirements and our business systems would (by design) allow me to operate on a Chromebook with no additional installed software.  My business phone number could be forwarded to my cell, and voila!

2.7  Team Supplies List

Now is time for me to admit having a dark sense of humor…it’s true!  When I read the provided example Supplies List, I couldn’t help but wonder what kind of bad day DHS was aiming for when they put together this list…

This reads like items needed after a zombie apocalypse.  I would add shovel and rope and consider my list complete, but I digress 😊

But you get the drill by now…put together a list of supplies you may need and hope your rainy day is less dramatic than what DHS thinks you will need.  And if you think I’m making this up, Google “CDC” and “Zombie” and see what comes up.

2.8  Team Telecommunications List

The reality for TCS is we are small enough where most of these lists can be done in one place.  We just don’t need 1000 items in 20 different department lists, but your mileage may vary. Simpler is better in my opinion.

For Telecommunications, we want the vendor, account number, support number, and the name and email of your point of contact – most likely an account representative.

My goal is to adopt Starlink satellite Internet, at least as a secondary provider, as soon as service is available in our area.  This would make a great backup connection that doesn’t rely on ground-based cabling/fiber infrastructure.  As Elon Musk says, “Plug in socket, point at sky.  These instructions work in either order.”  How about that for getting your Internet access back up and running posthaste?!

2.9  Team Vendor List

Critical vendors.  Document them.

2.10 Team Vital Records List

This list includes backups…you are backing up to the cloud, right?!  Critical data, intellectual property, contracts, diagrams.  The best plan here is to have everything in digital form then ensure you have local and cloud copies/backups of this data.  And when we say backup, we mean geographically redundant storage…not keeping a data tape in a separate room from your server. 

<soapbox>If your technology vendor cannot explain in simple terms how they know your backups are executing and the data integrity is validated daily, then it may be time to shop around for someone who gets this.</soapbox>

Time to pat yourself and your team on the back!  By my count we are 56.8% complete with our plan.  In Chapter 3 we will walk through Plan Administration and Maintenance.  This next section is the catalyst that enables companywide adoption through training and awareness.  It also will develop the process for ongoing plan maintenance, thus not becoming an outdated binder on the shelf that is no longer relevant to your business. 

TTFN!