13
Feb 17

TheWHIR – Why Does It Seem Like Airline Computers Are Crashing More?

Another week, another major airline is crippled by some kind of software glitch.

If you feel as if you’re hearing about these incidents more often, you are—but not necessarily because they’re happening more frequently.

Delta Air Lines Inc. suffered an IT outage that led to widespread delays and 280 flight cancellations on Jan. 29 and 30, a problem the carrier said was caused by an electrical malfunction. A week earlier, United Continental Holdings Inc. issued a 2 1/2-hour ground stop for all its domestic flights following troubles with a communication system pilots use to receive data.

These two shutdowns were the latest in what’s been a series of computer crack-ups over the past few years, including major system blackouts that hobbled Southwest Airlines Co. as well as Delta for several days last summer—affecting tens of thousands of passengers.

More of the WHIR post from Bloomberg


03
Feb 17

Data Center Knowledge – This Server’s Uptime Puts Your SLA to Shame

An unusual and noteworthy retirement from the IT industry is scheduled to take place in April, Computerworld reports, when a fault-tolerant server from Stratus Technologies running continuously for 24 years in Dearborn, Michigan, is replaced in a system upgrade.

The server was set up in 1993 by Phil Hogan, an IT application architect for a steel product company now known as Great Lakes Works EGL.

Hogan’s server won a contest held by Stratus to identify its longest-running server in 2010, when Great Lakes Works was called Double Eagle Steel Coating Co. (DESCO). While various redundant hardware components have been replaced over the years, Hogan estimates close to 80 percent of the original system remains.

More of the Data Center Knowledge article from Chris Burt


13
Jan 17

ComputerWeekly – Disaster recovery testing: technology systems to test DR

In this concluding part of a two-part series, Computer Weekly looks at ways of testing disaster recovery (DR). In the first article, we discussed the need for disaster recovery and for developing a strategy to test the backup process.

We discussed four main items that need to be evaluated to ensure successful testing. These were:

Time – Evaluating the time since a test was last performed and measuring the time to complete recovery, from a RTO (recovery time objective) perspective.
Change – Testing after major changes occur in the infrastructure, such as application upgrades or infrastructure (hypervisor changes).
Impact – What is the impact of running a test? Can a test be run without impacting the production environment?
People – How do we consider the human factor from the perspective of taking human error out of the recovery process?

In a virtual environment, the options for recovery can be divided into four main sections.

More of the ComputerWeekly article from Chris Evans


12
Jan 17

ComputerWeekly – Disaster recovery testing: A vital part of the DR plan

IT has become critical to the operation of almost every company that offers goods and services to businesses and consumers.

We all depend on email to communicate, collaboration software (such as Microsoft Word and Excel) for our documents and data, plus a range of applications that manage internal operations and customer-facing platforms such as websites and mobile apps.

Disaster recovery – which describes the continuing of operations when a major IT problem hits – is a key business IT processes that has to be implemented in every organisation.

First of all, let’s put in perspective the impact of not doing effective disaster recovery.

Estimates on the cost of application and IT outages vary widely, with some figures quoting around $9000/minute.

More of the ComputerWeekly article from Chris Evans


05
Jan 17

Continuity Central – Survey finds that US companies are struggling with remote and branch office IT disaster recovery

Riverbed Technology has published the results of a survey that looks at the challenges that organizations are facing when managing IT at remote and branch offices. The survey asked IT professionals about the various challenges they face in provisioning and managing remote and branch offices (ROBOs) and found supporting ‘the IT edge’ was expensive, resource-intensive and full of potential data security risks.

ROBO IT continues to be provisioned and managed largely as it has been for the past 20 years, with distributed IT spread out across potentially hundreds of remote and branch locations. However, this approach can bring data risk and operational penalties to companies at an extremely high cost, and in today’s increasingly distributed enterprise with a primary focus on data and security, past approaches may not be ideal for business success. Given the various challenges associated with managing remote sites, organizations have their hands full in supporting the edge.

More of the Continuity Central post


14
Dec 16

Continuity Central – The rise of SIP-based cyber attacks

Cyber attacks using the VoIP protocol Session Initiation Protocol (SIP) have been growing in 2016, accounting for over 51 percent of the Voice over Internet Protocol (VoIP) security event activity analysed in the last 12 months, according to a new report from IBM’s Security Intelligence group.

“SIP is one of the most commonly used application layer protocols in VoIP technology, so it’s not surprising that it’s the most targeted. In fact, we found that there has been an upward trend in attacks targeting the SIP protocol, with the most notable uptick occurring in the second half of 2016,” states the IBM Security Intelligence group.

The second most targeted protocol, Cisco’s proprietary Skinny Client Control Protocol (SCCP), accounted for just over 48 percent of detected security events during the same time period.

More of the Continuity Central post


09
Dec 16

Continuity Central – C-Level and IT pros disagree on organizations’ ability to recover from a disaster: Evolve IP survey

When it comes to assessing an organization’s ability to recover from a disaster, a significant disconnect exists between C-Level executives and IT professionals. While nearly 7 in 10 CEOs, CFOs or COOs feel their organization is very prepared to recover from a disaster, less than half of IT pros (44.5 percent) are as confident , a technology survey conducted by Evolve IP reports. The survey of more than 500 executives and IT professionals uncovered factors, including compliance requirements and use of hosted solutions, that contribute to an organization’s disaster recovery confidence overall.

Disaster recovery compliance was a clear driver of confidence in the ability to recover IT and related assets in the event of an incident. In fact, 67 percent of respondents in banking, 58 percent of respondents in the government sector and 55 percent of respondents at technology companies feel very prepared: of these disaster recovery compliance was noted as a requirement by 97 percent, 73.5 percent and 71 percent respectively.

More of the Continuity Central post


31
Oct 16

Data Center Knowledge – “Right-Sizing” The Data Center: A Fool’s Errand?

Overprovisioned. Undersubscribed. Those are some of the most common adjectives people apply when speaking about IT architecture or data centers. Both can cause data center operational issues that can result in outages or milder reliability issues for mechanical and electrical infrastructure. The simple solution to this problem is to “right-size your data center.”

Unfortunately, that is easier to say than to actually do. For many, the quest to right-size turns into an exercise akin to a dog chasing its tail. So, we constantly ask ourselves the question: Is right-sizing a fool’s errand? From my perspective, the process of right-sizing is invaluable; the process provides the critical data necessary to build (and sustain) a successful data center strategy.

When it comes to right-sizing, the crux of the issue always comes down to what IT assets are being supported and what applications are required to operate the organization.

More of the Data Center Knowledge article from Tim Kittila


28
Oct 16

Continuity Central – Maintenance of a business continuity management system: a managerial approach

Practical approach to achieving the difficult task of keeping your business continuity plans up to date.

When a business continuity management system (BCMS) has been established and implemented, a serious managerial challenge evolves: the BCMS has to be maintained and put into a continuous improvement process. In this article, Alberto Alexander. Ph.D, MBCI, looks at the activities that need to be performed to maintain and improve a BCMS.

INTRODUCTION

Any organization that establishes and implements a BCMS needs to follow the BCMS processes and deliverables, which are depicted in figure one. The BCMS processes, also known as the BCMS process life cycle model, (Alexander, 2009), consist of six phases.

The stages of the BCMS process life cycle model are the following:

Stage one: business impact analysis
The business impact analysis (BIA), which is conducted during the first stage, analyzes the financial and operational impact of disruptive events on the business areas and processes of an organization. The financial impact refers to monetary losses such as lost sales, lost funding, and contractual penalties. Operational impact represents non–monetary losses related to business operations, and can include loss of competitive edge, damage to investor confidence, poor customer service, low staff morale, and damage to business reputation.

More of the Continuity Central article