Modular Mining, a subsidiary of Komatsu that develops mine equipment management systems, noted recently that in any computerized work environment, hardware issues are a reality. More often than not, the problems are minor and easily resolved by a server reboot or service restart, and result in little, if any, operational disruption. Unfortunately, more serious problems such as a server failure or prolonged power outage are also a possibility. In these situations, the potential for lost productivity, missed production targets and decreased equipment utilization is high.
Modular said it recently helped a Latin American mine protect against prolonged downtime from computer hardware issues by implementing improvements aimed at ensuring availability and operational continuity. The mine, a long-time customer and user of Modular’s DISPATCH fleet management system (FMS) along with its ProVision and MineCare products, sought help in mitigating operational risk should a severe upset event occur.
In a collaborative effort, deployment and support personnel from Modular, a cross-functional group from the mine, and a local information technology (IT) contractor designed a comprehensive Disaster Recovery (DR) plan to meet the needs of the operation. The team assessed the major factors influencing the DR solution, including the Modular Mining technologies in use, SQL Server database structure, application and virtual machine (VM) structure and interoperability among the applications and hardware. After achieving a thorough understanding of the factors involved, the team formulated a plan that uses Microsoft SQL Server Always On Failover Cluster Instances (Always On FCI) and VMware vCenter for reliable, high-availability capability.
To execute the DR plan, the mine acquired the hardware needed to outfit two identically equipped data centers — one on-site and one remote. The On-site Data Center contains the mine’s DISPATCH System control room. Also housed in this location are the DISPATCH, ProVision, and MineCare System VMware servers (on which the applications and application virtual machines reside), database servers, and fibre channel data storage devices and network switches.
To minimize the operational impact of a server failure or other upset event, the database and application servers are configured in clustered, redundant Primary and Target pairs. Should a Primary database server failure occur, the Always On FCI functionality initiates the failover response, causing the Target server to assume the Primary role. The process requires no human intervention, is nearly instantaneous and is transparent to system users and equipment operators.
In the event of a Primary VMware server failure, VMware vCenter initiates failover to the corresponding Target server in the VMware cluster. In addition, a designated dispatcher, or other mine representative, will contact Modular Mining’s local support team to notify them of the situation. A support team member will then remotely start the application system service(s) and VMs on the Target application server to restore the DISPATCH, ProVision and MineCare System functionality.
The Remote Data Center, located about a third of a mile from the mine, replicates the configuration in the On-site Data Center. Because the servers and storage units in the Remote Data Center maintain data continuity with the On-site facility via the fibre channel network, the data centers are always in sync, enabling rapid recovery response.
Should a catastrophic upset event render the On-site Data Center’s hardware resources unavailable, SQL Always On FCI and VMware vCenter will initiate failover to the Primary servers in the Remote Data Center. Again, a designated dispatcher (or other mine representative) will contact Modular Mining’s local support team to notify them of the situation. A support team member will then remotely start the application system service(s) and VMs on the Primary application servers in the Remote Data Center to restore the DISPATCH, ProVision and MineCare System functionality.
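The failover sequence described above can be summarized in a small model. The following Python sketch is illustrative only and is not Modular's implementation: the `Server` class, the `select_active` and `recovery_steps` functions, and the preference order (On-site Primary, On-site Target, Remote Primary, Remote Target) are assumptions inferred from the article. It makes no calls to SQL Server or VMware; it only shows which server would take over and which recovery steps would follow for each tier.

```python
from dataclasses import dataclass

# Assumed preference order, inferred from the article: stay on-site if
# possible, and fall back to the Remote Data Center only when both on-site
# servers of a pair are unavailable.
PREFERENCE = [
    ("On-site", "Primary"),
    ("On-site", "Target"),
    ("Remote", "Primary"),
    ("Remote", "Target"),
]


@dataclass(frozen=True)
class Server:
    site: str      # "On-site" or "Remote"
    role: str      # "Primary" or "Target"
    healthy: bool = True


def select_active(servers):
    """Return the first healthy server in preference order, or None."""
    for site, role in PREFERENCE:
        for server in servers:
            if server.site == site and server.role == role and server.healthy:
                return server
    return None


def recovery_steps(tier, active):
    """List the recovery steps for a tier ("database" or "application")."""
    if tier not in ("database", "application"):
        raise ValueError(f"unknown tier: {tier}")
    if active is None:
        return ["no healthy server available; escalate"]
    if (active.site, active.role) == PREFERENCE[0]:
        return []  # normal operation, nothing to recover
    target = f"{active.site} {active.role}"
    if tier == "database":
        # Per the article, database failover needs no human intervention.
        return [f"Always On FCI fails over to {target} (automatic)"]
    # Per the article, the application tier adds two manual steps.
    return [
        f"vCenter fails over to {target}",
        "dispatcher notifies Modular Mining local support team",
        "support remotely starts application services and VMs",
    ]


if __name__ == "__main__":
    # Hypothetical scenario: the whole On-site Data Center is lost.
    servers = [
        Server("On-site", "Primary", healthy=False),
        Server("On-site", "Target", healthy=False),
        Server("Remote", "Primary"),
        Server("Remote", "Target"),
    ]
    active = select_active(servers)
    for step in recovery_steps("application", active):
        print(step)
```

In this model the database tier recovers in a single automatic step, while the application tier always includes the notification and remote-restart steps, which is where most of the recovery time would be expected to go.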
Online in 60 Minutes
Hardware failure, be it a single server or a more catastrophic incident, is a real possibility at any mine. The time needed for recovery and resumption of normal operation determines the extent of production degradation and profit loss. According to Modular, the mine’s commitment to establishing fully equipped, redundant data centers now allows it to recover from a catastrophic failure in 60 minutes or less, keeping downtime of crucial technologies to a minimum.
Without a comprehensive DR plan in place, the time required to obtain, install, configure, test and deploy the replacement hardware at this site or most others would cause a significant delay in the return to normal status. The DR plan developed and implemented collaboratively by Modular Mining, mine personnel and the local IT contractor enables the mine to minimize any interruption in haulage optimization and asset health monitoring capabilities provided by Modular Mining. In addition, according to the company, its success in this implementation will serve as a model for Modular’s future DR efforts at other mine operations.
The Value of Being Prepared
DISPATCH System optimization is a key enabler in the mine’s efforts to achieve the goals set forth in its mine plan. While the mine can operate without the fleet management system, doing so would adversely affect productivity, efficiency and overall performance.
To demonstrate the value that DISPATCH FMS optimization delivers to the mine’s operation, Modular’s Performance Assurance team conducted a simulation comparing production over 30 days, with and without the DISPATCH system in operation. The results revealed that without the advantages of the FMS, the mine would suffer a 4% decrease in production, representing a substantial loss in tons moved.
Because the DR plan enables full operational recovery in less than an hour, the mine’s production losses would be kept to a minimum. The DR plan, according to Modular, also helps ensure the accuracy of data utilized for performance evaluation, trend analysis and reporting following a recovery situation.
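To make the relationship between recovery time and lost production concrete, the short Python sketch below estimates tons lost while the FMS is unavailable. It is a back-of-the-envelope illustration, not a result from Modular's simulation: the function name `production_loss_tons`, the nominal haulage rate and the outage durations are hypothetical, and it assumes the 4% figure from the 30-day simulation applies uniformly to every hour the FMS is down.

```python
def production_loss_tons(nominal_tons_per_hour, outage_hours, fms_loss_fraction=0.04):
    """Estimate tons lost while operating without the FMS.

    fms_loss_fraction defaults to the 4% decrease reported in the article.
    Applying it uniformly per hour is an assumption made for illustration.
    """
    if nominal_tons_per_hour < 0 or outage_hours < 0:
        raise ValueError("rate and duration must be non-negative")
    return nominal_tons_per_hour * outage_hours * fms_loss_fraction


if __name__ == "__main__":
    rate = 10_000  # hypothetical nominal haulage rate, tons per hour
    for hours in (1, 72):  # 1 hour with the DR plan; 72 hours is a hypothetical case without it
        loss = production_loss_tons(rate, hours)
        print(f"{hours:>3} h outage: about {loss:,.0f} tons lost")
```

With these hypothetical inputs, a one-hour recovery costs roughly 400 tons, while a three-day hardware replacement would cost roughly 28,800 tons, so under this assumption the estimated loss scales directly with recovery time.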