Implementing a Business Continuity and Disaster Recovery solution using ShoreTel

Posted by Lou Person on Dec 18, 2011 in Cloud Journey

ShoreTel is a single-image solution. This means all sites, users, switches, phones and licenses are managed through a central interface (called Director) and distributed to users at locations across an organization's footprint. Some organizations are single site, with all users at one location and some remote users who access the system while travelling or working from home. Other organizations have multiple locations connected through a Wide Area Network (using such technologies as Internet VPN, MPLS, Metro Ethernet, etc.).

Whether single site or multiple location, it is easy to add a Business Continuity/Disaster Recovery site to ShoreTel. The following is a high-level overview of the components required, with some suggestions and alternatives. Please contact me directly to discuss further. brightstack offers a hosted ShoreTel service where we can host your Business Continuity and Disaster Recovery (BC/DR) site as part of a Managed Services offering. Some companies realize that the BC/DR site effectively becomes a Hosted ShoreTel solution, and they decide to move production to the BC/DR site so that their Headquarters Site, now in the datacenter, has “built-in disaster recovery and business continuity”. They then carry voice traffic over the private network to the locations (sites with higher user counts often deploy PRIs at the local site).

Add the site in Director
Select the location of the BC/DR site. If you already have a BC/DR datacenter in place, and it is networked with your primary location, this step is pretty easy. You would simply need to acquire equipment and add the location as a site within Director. You will need a site license for this site.

Establish the Network
You will need a private network between the BC/DR site and your remote locations. If you contract with a carrier to provide network services for your entire organization, it may make sense to use one of their datacenters as your BC/DR site and hang their datacenter off your network as another site. Also make sure you have some access to the public Internet at the BC/DR site. This will be helpful for backup Internet VPNs, as well as third party SIP trunking (more on that below). You may also want to deploy a VPN concentrator for VPN phones at remote locations, or an SSL VPN for VPN clients for things such as Softphones. Some customers use Remote Desktop through Microsoft RDC, Citrix or VMware View to access Communicator remotely, and brightstack includes this in our offering.

Deploy equipment in the BC/DR site
In the BC/DR site, you will need a backup Director in an active/passive configuration (known as a DVS) and a voice switch. We recommend deploying a small switch just for voice control (such as an SG30) and a T1K for PRI trunking. The DVS will replicate with Director and contain a copy of your configuration. If you already have a BC/DR site up for your data operations and are running virtual hosts on physical servers using VMware, you can use your VMware infrastructure to host the DVS in the BC/DR site.

Provision the circuits
If your BC/DR site is within a carrier datacenter, you can easily get a PRI cross connect with no local loop charges, only port and usage charges. Additionally, there are third party Internet based SIP providers who can also deliver dial tone to your ShoreTel solution using SIP. You will need to create trunk groups and trunks. These trunks will have their own set of numbers associated with the trunk group. When you provision circuits, make sure you have enough capacity to support External Assignment. That is, each call handled by an externally assigned user needs 1 trunk to connect with the user and another trunk for the incoming or outgoing call itself.
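
The External Assignment rule above can be turned into a rough sizing calculation. The following Python sketch is only an illustration: the function names are made up for this post, the assumption that every other call uses one trunk is mine, and the 23 channels per PRI assumes a North American T1 PRI. Check the numbers against your own carrier contract and call volumes.

```python
# Rough trunk sizing for a BC/DR site (illustrative sketch, not a ShoreTel tool).

# Assumption: a North American T1 PRI with 23 bearer channels.
CHANNELS_PER_PRI = 23


def trunks_needed(external_assignment_calls: int, other_calls: int = 0) -> int:
    """Trunks in use at peak.

    Each call handled by an externally assigned user takes two trunks:
    one to reach the user and one for the incoming or outgoing call.
    Assumption: every other external call takes one trunk.
    """
    if external_assignment_calls < 0 or other_calls < 0:
        raise ValueError("call counts must be non-negative")
    return 2 * external_assignment_calls + other_calls


def pris_needed(trunks: int, channels_per_pri: int = CHANNELS_PER_PRI) -> int:
    """Number of PRIs required to carry the given trunk count (rounded up)."""
    if trunks < 0 or channels_per_pri <= 0:
        raise ValueError("invalid trunk or channel count")
    return -(-trunks // channels_per_pri)  # ceiling division


if __name__ == "__main__":
    peak = trunks_needed(external_assignment_calls=20)
    print(peak, "trunks,", pris_needed(peak), "PRIs")
```

For example, 20 simultaneous calls on External Assignment need 40 trunks, which is more than a single 23-channel PRI can carry.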

Program DNIS
DNIS stands for Dialed Number Identification Service. In other words, the number that was originally dialed. Since the DIDs will be running on a different PRI than the one they are assigned to in Director, a DNIS map should be set up ahead of a disaster. This way, when the numbers are sent to the other PRI (for example through the carrier's Direct Trunk Overflow service, or DTO, described further down), you can map the number that was dialed to the extension. This would be done if you are going to fail over on a temporary basis. (If it is permanent, then you would move the DID from the trunk group assigned to the original PRI to the trunk group assigned to the new PRI.) Here is an example:

Scenario:

Empire Capital has a block of DIDs routing inbound to their PRI in New York.  

Empire Capital also has a remote site in Chicago where they have a local PRI as well.

Empire Capital would like automatic failover so that all inbound calls in NY will fail over to the Chicago PRI and retain DNIS.

In ShoreTel, configure the New York block of DIDs under the Trunk Group labeled NY PRI. Configure the same DIDs in the Trunk Group labeled Chicago PRI, using the DNIS mapping table, so there are entries in both trunk groups for the same numbers.

If the two sites are serviced by the same carrier, the carrier should have an offering that allows trunk level failover while retaining DNIS. If the carrier detects that the PRI in New York is down, automatic failover will take place and those New York DIDs will fail over to the Chicago PRI. Calls will route to ShoreTel, where the DNIS table entry in Chicago will route calls over the WAN back to users' extensions in the New York site.
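
To make the scenario above concrete, here is a small Python sketch of the idea behind the DNIS map. It is a model of the logic only, not how Director stores or evaluates its tables, and the DIDs and extensions are invented for illustration. The point is that the same dialed number resolves to the same extension whichever trunk group the call arrives on.

```python
from typing import Optional

# Hypothetical DIDs and extensions, invented for this example.
NY_DID_TO_EXTENSION = {
    "2125550101": "1101",
    "2125550102": "1102",
}

# The same DNIS entries are configured under both trunk groups,
# so a call keeps its mapping after the carrier fails it over.
DNIS_MAPS = {
    "NY PRI": dict(NY_DID_TO_EXTENSION),
    "Chicago PRI": dict(NY_DID_TO_EXTENSION),
}


def route_inbound(trunk_group: str, dialed_number: str) -> Optional[str]:
    """Return the extension for a call arriving on trunk_group, or None if unmapped."""
    return DNIS_MAPS.get(trunk_group, {}).get(dialed_number)


if __name__ == "__main__":
    # Normal operation: the call arrives on the NY PRI.
    print(route_inbound("NY PRI", "2125550101"))
    # After failover: the same number arrives on the Chicago PRI.
    print(route_inbound("Chicago PRI", "2125550101"))
```

If an entry is missing from the Chicago map, the lookup returns None, which corresponds to a failed-over call with nowhere specific to go. That is why the map should be set up before a disaster.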

Establish forwarding of your numbers
You should plan well ahead of time how numbers will forward to the BC/DR site so calls will come into the ShoreTel system there. Many carriers offer a service called Direct Trunk Overflow. Using DTO, you can automatically forward calls on a failed circuit (such as a location that is lost in the event of a disaster) to another circuit or lead number in a hunt group. You should also plan for usage charges if you are forwarding calls from one carrier to another. If your BC/DR site is on the same network as the carrier who provides services to the failed location, they should be able to point call paths from the failed circuit to the circuit at the BC/DR location. Once a disaster is declared and calls are forwarded to the BC/DR site, they can be answered by an operator directing calls (who can be located anywhere and access the ShoreTel system through one of the mobility options), by an auto attendant or even by DID.

Provide users access
User access is very easy, and users shouldn’t miss a beat. The following mechanisms can ensure end user access to the BC/DR site. brightstack includes all of these options in our offering:
1.    Simply log in to Office Anywhere and assign the extension to the last known external assignment;
2.    Softphone access using a VPN client;
3.    VPN phone using a VPN concentrator;
4.    Mobility device (iPhone, Android, Blackberry) using WiFi through the Mobility Router;
5.    Mobile Call Manager (which is really a GUI to Office Anywhere).

Using any of these solutions, users will be able to access calls. I suggest testing and preparing ahead of time, especially for the operator. As a use case the following scenario may play out in a disaster:

1.    Disaster is declared;
2.    Users move to their assigned recovery location (which may be their home, other office, etc.);
3.    User accesses remote desktop;
4.    Communicator is launched;
5.    Extension is assigned to External Assignment or user logs in to VPN phone;
6.    Communicator now controls extension.

Prepare your Agents to log in and log out using voicemail
When a disaster occurs, your users may not have access to their PCs or Communicator, but may be on external assignment. You will still want your agents to be able to log in and log out. Assuming the agent has agent communicator rights, they can do this through voicemail: log in to voicemail, press 7 for personal options, then 9 to change agent status, then 1, 2 or 3 to log in or log out. If your agent has a mailbox on a DVS at a different location than Director, and the DVS and Director are unable to communicate due to the disaster, Director may not receive the message from the DVS to change the agent's status. The agent will receive a confirmation from the remote DVS, but their status may not change in the system.
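
The key presses above are easy to forget under pressure, so it can help to put them on a cheat sheet. The short Python sketch below just assembles the sequence described in this section. The function name is made up, and because this post does not spell out which of 1, 2 or 3 selects which status, the final digit is left as a parameter to confirm on your own system.

```python
from typing import List


def agent_status_keys(status_choice: int) -> List[str]:
    """Keys to press after logging in to voicemail to change agent status.

    7 = personal options, 9 = change agent status, then 1, 2 or 3.
    The meaning of each final digit is not defined here; verify it
    against your own system before handing this to agents.
    """
    if status_choice not in (1, 2, 3):
        raise ValueError("status_choice must be 1, 2 or 3")
    return ["7", "9", str(status_choice)]


if __name__ == "__main__":
    for choice in (1, 2, 3):
        print("Log in to voicemail, then press:", " ".join(agent_status_keys(choice)))
```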

Train your staff on star and pound codes
These codes allow users to put calls on hold, transfer, conference, etc. without using Communicator. This is very helpful if your user is on external assignment without access to Communicator and needs to transfer a call internally, conference in another internal or external party, or interflow to a different service group.

Test everything, run fire drills
Make sure your users have a set of instructions and are trained on how to use the BC/DR site. You should also test your BC/DR plan on a routine basis.

Use Cases
Empire Capital has a single-image ShoreTel solution installed in their NY office. They also have a Business Continuity and Disaster Recovery site in NJ. The ShoreTel system is designed to be resilient, highly available and redundant, and to provide N+1 redundancy across key components. Additionally, Empire Capital has deployed an MPLS network between locations and deployed voice services from their carrier at both locations.

New York
Empire Capital’s main location is NY.  The director server is in NY, along with 1 SG120 and 1 PRI switches. 

The following are points of failure, with an assessment of Empire’s risk at this location per point of failure and a recommendation to mitigate the risk.

1.    NY connects to the datacenter via MPLS. Should the MPLS fail, an Internet based VPN provides redundancy for the MPLS network.
2.    NY has 2 PRIs with 2 PRI switches in place. This will provide N+1 redundancy should one of the PRIs fail, although users will operate in a degraded state during the outage, since 50% of the trunks are down.
3.    If one of the PRIs in NY fails, the inbound calls on the DIDs on the failed PRI should automatically, or with manual intervention, route to the channels on the operational PRI. This needs to be programmed and implemented by the carrier.
4.    If both PRIs fail, the inbound calls on the DIDs on the NY PRIs should automatically, or with manual intervention, route to channels on a PRI in the NJ datacenter. Calls can then be routed back to NY over the MPLS network, or be handled by an operator in external assignment out of the NJ datacenter if the NY MPLS connection goes down.
5.    Inbound calls on the failed PRI can also point to the lead number of the backup POTS hunt group in NY.
6.    If both PRIs fail in NY, outbound calls will continue to route over the NJ datacenter’s PRIs.
7.    If the voice switch in NY fails, phones will register with the N+1 switch in the datacenter.
8.    There are 2 emergency phones powered by Telco should a power outage occur.
9.    NY uses the Director server for voicemail. If the Director server fails, the system will still operate, but PC call manager control, auto attendant and voicemail will not function. brightstack recommended deploying a DVS in NY and moving the NY users' mailboxes onto it. If the DVS fails, the voicemails will be backed up by the Director server.
10.    In the event the NY office is not accessible, users should have instructions regarding accessing the system through Office Anywhere out of the NJ Datacenter.
11.    NY has POTS lines connected to the system for backup 911. These can also be used should both PRIs fail, as described above. As a tertiary trunking mechanism, Internet based SIP trunks can be implemented.
12.    Operators in Office Anywhere, External Assignment or through Mobility can service all the locations should a failure occur.
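
The inbound PRI failure handling in points 3 and 4 of the list above can be summarized as a small decision function. This Python sketch is only a way to reason about the plan on paper: the names are invented, the real routing is programmed by the carrier and in ShoreTel, and the POTS hunt group alternative from point 5 is deliberately not modeled.

```python
from enum import Enum


class InboundRoute(Enum):
    NY_NORMAL = "both NY PRIs in service"
    NY_SURVIVING_PRI = "carrier routes DIDs to the operational NY PRI"
    NJ_THEN_MPLS = "carrier routes DIDs to the NJ PRI; calls return to NY over MPLS"
    NJ_OPERATOR = "carrier routes DIDs to the NJ PRI; operator on external assignment answers"


def inbound_route(ny_pris_up: int, ny_mpls_up: bool) -> InboundRoute:
    """Pick the inbound path for NY DIDs from the state of the NY PRIs and MPLS link."""
    if ny_pris_up not in (0, 1, 2):
        raise ValueError("NY has two PRIs; expected 0, 1 or 2 in service")
    if ny_pris_up == 2:
        return InboundRoute.NY_NORMAL
    if ny_pris_up == 1:
        # Degraded: half of the NY trunks are down.
        return InboundRoute.NY_SURVIVING_PRI
    if ny_mpls_up:
        return InboundRoute.NJ_THEN_MPLS
    return InboundRoute.NJ_OPERATOR


if __name__ == "__main__":
    for pris, mpls in [(2, True), (1, True), (0, True), (0, False)]:
        print(pris, mpls, "->", inbound_route(pris, mpls).value)
```

Walking through each combination like this during a fire drill is a quick way to confirm that every failure state has a documented answer.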