Eli Weinstock-Herman

IT: Beyond the 'Right Now' Problem

Original post posted on Thursday, March 18, 2010 at LessThanDot.com

In an IT department there is a tendency to classify operations as being reactive or proactive and, often, pressure to have more of the latter and less of the former. Pressure, that is, until a PC breaks down, a network connection drops, a data record goes missing, or any of a dozen other issues which will ultimately receive more attention than disaster recovery, employee development, business analysis, strategic planning and the rest of a long list of proactive tasks. Immediate, defined problems are far easier to focus on than tenuous concepts of proactive prevention.

Level of Pain

One of the strongest drivers of IT prioritization is business pain. This pain can be immediate and obvious, such as hardware failures and software bugs, or it can be future and less evident, such as announced, critical IT plans for an upcoming merger. When the business feels pain, the value in projects is more immediately obvious, and this allows many IT departments to slip into a reactive role. It is easier to take on obvious tasks, applying band-aids and surgery, than to sell the business on the benefits of long-term planning and expenditures.

However, we must keep in mind that businesses are not people and, though we use the metaphor of pain, there is a critical difference between how humans and businesses deal with pain. As children we learn broad lessons from simple events. Touching a hot pan introduces a wariness of things that are on the stove, things that are hot, and possibly things that are making crackling noises. As we mature, these past events are remembered by our unconscious, making us wary of the next hot pan we come across without the need to actively examine every detail in our environment.

Businesses, on the other hand, will continue repeating the same error and require active attention from personnel to ensure they don't continue picking up the pan every few minutes. It's likely we have all seen this occur: an IT department that fixes the same network outage every month, or that undergoes cultural transformation and then backslides a few months later. Businesses are far less able to operate on cruise control than people are, and though we liken them to living organisms, we have to remember these critical differences.

Maturing our Problem Solving

In attempting to mature our organization from a reactive to a proactive stance, we must address both the tendency for a company to undervalue proactive tasks and the tendency to repeat errors on the small and large scale. A department in a reactive stance is likely at its limit with day-to-day problems, as it has only ever defined itself by these tasks and its hiring will in turn be defined by that capacity. While this may seem like a bad place to start, with our employee capacity already maxed out, what we actually have is a great deal of potential, if only we could resolve some of those repetitive problems.

Transparency

While many reactive departments will have metrics that they occasionally share, most will have little or no actual communication of their status with the outside business (except in arguments for headcount or to justify the length of time it takes to react to new issues). It may feel like you're airing your dirty laundry, but communicating issues to the outside business will ultimately help you expose the importance of future-looking activities, show meaningful progress to the business without having to announce it, and provide your staff with pride in overcoming the challenges of the current environment.

But if it were easy, it wouldn't be worth doing.

The first step is the hardest: providing visibility into a process that is simply running from one problem to the next. The two types of information you want to communicate to the wider business are service impact events and information on what the department is working on. Metrics posted on the front of the intranet site or on a visual board can be good mechanisms to build awareness of the tasks and task load the department is currently working on, while email is generally a good enough vehicle to deliver news of service outages and impacts.

Side Note: While ticket closure or reception rates are interesting metrics, try researching other common metrics and pick ones that drive the behaviors you want in the department. For instance, you could report the ratio of time spent on reactive vs proactive tasks, or direct measures of proactive tasks, such as the number of systems with a disaster recovery plan or the maximum and average length of time between tests of disaster recovery plans.
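As a rough illustration of the metrics in this side note, here is a minimal Python sketch. The record layout (SystemRecord), the function names, and the sample numbers are all hypothetical, made up for the example; a real department would pull these figures from whatever time tracking and inventory it already has.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class SystemRecord:
    """Hypothetical inventory entry for one system."""
    name: str
    has_dr_plan: bool
    dr_test_dates: List[date] = field(default_factory=list)


def reactive_ratio(reactive_hours: float, proactive_hours: float) -> float:
    """Share of total tracked time spent on reactive work (0.0 to 1.0)."""
    total = reactive_hours + proactive_hours
    return reactive_hours / total if total else 0.0


def dr_plan_count(systems: List[SystemRecord]) -> int:
    """Number of systems that have a disaster recovery plan."""
    return sum(1 for s in systems if s.has_dr_plan)


def dr_test_gaps(systems: List[SystemRecord]) -> List[int]:
    """Days between consecutive DR plan tests, across all systems."""
    gaps = []
    for s in systems:
        dates = sorted(s.dr_test_dates)
        gaps.extend((later - earlier).days
                    for earlier, later in zip(dates, dates[1:]))
    return gaps


# Made-up sample data, for illustration only.
systems = [
    SystemRecord("file server", True,
                 [date(2010, 1, 1), date(2010, 4, 1), date(2010, 7, 1)]),
    SystemRecord("email", True, [date(2010, 2, 1)]),
    SystemRecord("phone system", False),
]

gaps = dr_test_gaps(systems)
print("Reactive share of time:", reactive_ratio(30, 10))
print("Systems with a DR plan:", dr_plan_count(systems), "of", len(systems))
if gaps:
    print("Max days between DR tests:", max(gaps))
    print("Average days between DR tests:", sum(gaps) / len(gaps))
```

A system that has been tested only once contributes no gap, which is worth calling out separately when you publish the numbers.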

Having the courage to post an honest evaluation of the state of the IT Department will buy you credibility, while compiling those numbers and sending notifications on outages will force you to evaluate what the business needs from the department and what areas to begin focusing on. Having numbers in public display will also create a faster feedback loop with the business, as members of the business will see the improvements across the whole company and not just as a function of their own individual requests.

Visibility for Analysis

The second type of visibility into tasks is just as critical. As problems occur and are corrected, they need to be recorded and cataloged. The data from this process is not only a perfect source for the business visibility above, it also provides a starting place for identifying common or repetitive problems. Just as touching a hot pan the first or second time will save us from touching it a third and fourth time, we need to examine our recurring issues and put a preventive measure in place.

Side Note: When you start a tracking process, keep it as small as possible and review what you track regularly. Currently there is no tracking, so your initial process has to be as light as possible to get buy in from the members of the IT group. Start with a few key pieces of information and let the team take responsibility for experimenting with others in later feedback sessions. This will promote ownership of the process among the team and should lead to a higher success rate.

Common information to track is a general description of a problem, the person who worked on the problem, the person who reported the problem, the date the issue was received, the date the issue was closed, and some high-level information about the system involved or type of problem (hardware, software, training, whatever). You won't get this right at first and, if the process does its job, some of your early factors will stop being as significant once you start attacking the most prevalent types of problems.

After only a few weeks of gathering data it should be possible to start identifying some areas to focus on. This doesn't require an expensive tool or analysis package; often a few columns of data in a spreadsheet with pivoting will net you some results. Look for the most common factors in recurrence, source of problems, equipment or software generating problems, etc. Make a list of the top 3; these are the areas to focus on first.
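If a spreadsheet pivot isn't handy, the same top 3 count can be sketched in a few lines of Python. This is only an illustration: the ticket fields ("system", "type") and the sample rows are hypothetical, standing in for whatever columns your own tracking process records.

```python
from collections import Counter

# Made-up ticket log; each row is one tracked issue.
tickets = [
    {"system": "printer", "type": "hardware"},
    {"system": "vpn", "type": "software"},
    {"system": "printer", "type": "hardware"},
    {"system": "email", "type": "training"},
    {"system": "printer", "type": "software"},
    {"system": "vpn", "type": "software"},
    {"system": "projector", "type": "hardware"},
    {"system": "printer", "type": "hardware"},
    {"system": "email", "type": "software"},
    {"system": "vpn", "type": "training"},
]


def top_factors(rows, column, n=3):
    """Count how often each value of `column` appears; return the n most common."""
    return Counter(row[column] for row in rows).most_common(n)


for system, count in top_factors(tickets, "system"):
    print(f"{system}: {count} tickets")
```

Running the same function against other columns (problem type, reporter, and so on) gives a quick view of which factor is worth a closer look.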

Root Cause

The business is becoming familiar with the number of issues that are flowing through the department and you have some clues on where to look to bring your issue rate down; now all we need to do is bring the two together. There are a number of methods and tools for locating the root cause of a problem, many not specific to IT. The method I prefer is the Fishbone/5-Why technique, but there are many others that are just a Google search away.

Each of the items on your top 3 list is responsible for generating an appreciable amount of pain. In many cases it won't even be on purpose (surely you didn't think Bonnie down in purchasing had nothing better to do than generate IT tickets?). By using root cause analysis on your top 3 list, you should be able to start identifying practices or problems that are responsible for the errors you're actually receiving tickets for. Together with your team and potentially even some members of the affected departments, brainstorm some ideas that could resolve these root causes and try them out. At first the business may be wary about your team spending an afternoon brainstorming on issues they aren't even seeing, but if you can start to report not just how many tickets you're closing, but that the rate of tickets being submitted is slowly going down, then you will have a case for continuing.
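One simple way to show the submission rate going down is to count tickets received per week from the "date received" field. The sketch below assumes that field is available as a Python date; the function name and the sample dates are hypothetical.

```python
from collections import Counter
from datetime import date


def weekly_submissions(received_dates):
    """Count tickets per ISO (year, week), returned in chronological order."""
    counts = Counter(tuple(d.isocalendar()[:2]) for d in received_dates)
    return sorted(counts.items())


# Made-up "date received" values, for illustration only.
received = [
    date(2010, 3, 1), date(2010, 3, 2), date(2010, 3, 4),
    date(2010, 3, 8), date(2010, 3, 11),
    date(2010, 3, 16),
]

for (year, week), count in weekly_submissions(received):
    print(f"{year} week {week}: {count} tickets submitted")
```

A few weeks of these counts, posted next to the closure numbers, lets the business see the trend for itself.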

To return to the pain metaphor, the business will slowly learn that many of the pains they have been suffering are actually symptoms of longer-term illnesses. By addressing these illnesses, the team will be able to show improvements in the areas they worked on, continue to build credibility for working on non-obvious issues, and provide a foundation for taking on even less obvious issues, such as long-term strategic planning.

One more step.

The Wisdom of Age

Now that the business trusts our judgment to spend time correcting ills that are readily apparent, we need to take what we have learned and start projecting forward. By this point the team is solving immediate and future problems with equal ease, members of the team are starting to show proclivities towards one or the other set of problems, and the business has been much more willing to try applying sometimes outlandish solutions to less than immediate problems. Time to scale up.

As our businesses continue to grow and mature, they are going to be looking at new markets, new acquisitions, new tools, new methods... there is a whole lot of "new" coming for the business. IT can be a business driver and help provide tools that grow the bottom and top lines, but it won't be able to help growth or accommodate change smoothly if all we are doing is solving immediate problems. Just as we collected information on individual pains that were occurring and used those to solve underlying problems, we need to start collecting information about the long-term plans from departments and executives. That information, in combination with our newly whetted imaginations and skills from root cause sessions, is going to help us start building a framework to solve "pains" that have not yet occurred.

Internally it is time to start planning. Each server and software package should have a next step, PC retention plans should be in place for end user equipment, that projector in the conference room should have plans for warranty extension or replacement. Software should be evaluated to determine how well it aligns with future plans to grow or acquire new businesses, the potential for the vendor staying in business, and the direction of competing packages. Competitors should be evaluated to see what technologies or services they are beginning to offer, emerging technologies evaluated to see if they can offer additional services to the customer and increase entanglement, and on, and on.

There's a lot of catching up to do to get on top of your environment, but the end goal is worth it. Lower stress, higher pride, smoother operations... Just as a fit body is stricken less often by illness or infirmities, a fit IT group functions more smoothly and can handle a much wider range of calamity.

More Information

For more information on long term architectural planning and some additional information on taking your team to the next level, I highly recommend the following two books:

  • IT Architecture Toolkit - A pragmatic approach to analyzing, planning, and building the IT architecture
  • World Class IT - Focused on the executive and management levels, discusses principles and methods that can grow an IT organization to 'World Class' status

This is just a brief thought on another way of looking at IT maturity in business environments. Recently I've been tied up with refreshing my site and trying to recapture my personal brand. While I haven't had as many opportunities to post as I would like, I have had the time to finish a couple new books and learn a few new things as I rolled out the site. More to come soon.

Comments are available on the original post at lessthandot.com