Category Archives: Hosting

Making a Sound Technology Decision

by Scott Kantner, May 11th, 2012 in Apps, CTO, Hosting, Software

Sometimes clarity appears when you least expect it. Just this past week it revealed itself again as I was being asked how I thought particular business application systems should be hosted. In each case the choices were:

  1. DIY – The traditional Do-It-Yourself option of using your own hardware and support staff. Everything is in-house and 100% under your control at all times. You are impervious to fires in tunnels and squirrels electrocuting themselves. You bear 100% of the cost of the hardware and the technical staff needed to support the whole affair.

Gathering Clouds

by Scott Kantner, August 14th, 2009 in Cloud Computing, Hosting

Meet the fearsome spectre of cloud computing. He’s staring at you for a reason. Despite the fact that cloud computing is currently the most overused, misunderstood, and over-hyped phrase in our industry, trying to ignore it any longer is probably not a good idea. Though still in a very early phase, the clouds are forming, and we all need to start paying closer attention to which way the winds are blowing them and what the future implications are going to be for our infrastructure.


After spending several days this week listening to the major players in the industry outline their current offerings and future plans, it was quite obvious that what I was talking about back in April was not caused by lack of sleep or hallucinogenic substances:

A short history lesson tells us all we need to know about cloud computing. In the 1800s, power generation was the responsibility of those who needed it. Be it steam, water, or electricity, if I had a factory with electrical machinery and lights, I had to generate my own power, and if you needed power, so did you. And both of us had the hassles of building, operating, and maintaining a power generation infrastructure which, by the way, was not our core business. Power was necessary to the operation, but it was not the product or service we delivered for profit.

Eventually Edison and Westinghouse figured out how to transmit electricity, and entrepreneurs realized if they could build a Really Big Generator and implement a delivery method, they could sell power to industrial users. The case from the entrepreneurs to business was clear: “Let us worry about the hassles of generating power so you can focus on your core business, and oh by the way, it’s going to cost a lot less than doing it yourself.”

Fast forward to the present…has the light just come on (pun intended)? Cloud computing is nothing more than the name-du-jour for the centralization of computing resources so that they can be delivered as a utility service. Nothing more, nothing less.

Cloud computing is here now, perhaps in nascent form, but it’s here nonetheless.

The cloud is not new technology we’re going to go out and buy, per se. It’s not in a box, and it doesn’t have a part number. It’s more of a paradigm shift back to the way things used to be in the Golden Age of the Mainframe (which never died, by the way…according to people who enjoy crunching the numbers, IBM’s System z sales last year were in the $3.5B range). You can read the popular Wikipedia page on cloud computing, but I would suggest you think of the cloud more as a way that technology is delivered, much like electricity or water. You only consume the goods delivered rather than first having to generate or pump them yourself.

From a cloud provider’s standpoint, this means a virtualized and highly automated infrastructure, fast and easy provisioning, self-service, enormous scalability, a usage-based billing model, and a very granular portioning of resources. And that means programmers. Hosting companies that plan to operate cloud offerings are going to need a talented programming staff of substantial size along with a very sharp support staff to build and operate all of the plumbing to deliver computing resources to your doorstep.  Proof of this lies in the list of the brave few heavyweights that currently occupy the lion’s share of the cloudscape: Google, Amazon and Rackspace. Building these utility services is not for the faint-at-heart or low-on-cash, and the average hosting company is simply not going to be able to really play in the clouds until the Big Boys blaze the trail.

Currently, we can find cloud services offered to us in three flavors, or if you like, three different levels of abstraction:

Infrastructure as a service (IaaS). This is the closest thing to hosting as we knew it before today – computing and network hardware offered as a utility service, but essentially without limits or long-term commitments. Rackspace’s current cloud offerings are of this type. Here, the hardware layer is abstracted away, leaving you to worry only about the operating system and applications.

Platform as a service (PaaS). PaaS takes the hardware layer and adds the operating system and a development environment (software APIs), which you then use to develop your applications. The development environment ostensibly hides all the details of where your code executes and how your data is stored and protected. So at this level, in addition to the hardware layer, the OS and the development environment are now effectively abstracted away. Google’s AppEngine is a prime example of PaaS. You may have already surmised a possible evil in PaaS. If you develop an application on a particular vendor’s proprietary cloud platform (e.g. Google), they’ve got you locked in to their service, and there is no small amount of chatter going on about this in the industry. The open community is crying for an industry-standard set of APIs, while the big players are fighting to establish dominance with their proprietary systems. Obviously, it’s best to stay on the sidelines here until the dust settles.

Software as a service (SaaS). With SaaS, everything is abstracted – you are simply presented with a user interface for an application. Salesforce.com is the premier poster child for this model.
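The three levels of abstraction described above can be summarized in a small sketch. The layer names and the mapping are taken from the descriptions in this post; the identifiers (LAYERS, PROVIDER_MANAGED, customer_responsibility) are made up for illustration and are not any vendor's API.

```python
# Layers named in the post, from bottom to top.
LAYERS = ["hardware", "operating system", "development environment", "application"]

# Which layers the provider abstracts away under each model,
# following the descriptions above.
PROVIDER_MANAGED = {
    "IaaS": {"hardware"},
    "PaaS": {"hardware", "operating system", "development environment"},
    "SaaS": set(LAYERS),
}


def customer_responsibility(model: str) -> list[str]:
    """Return the layers the customer still has to worry about."""
    return [layer for layer in LAYERS if layer not in PROVIDER_MANAGED[model]]


for model in ("IaaS", "PaaS", "SaaS"):
    remaining = customer_responsibility(model) or ["nothing but the user interface"]
    print(f"{model}: you manage {', '.join(remaining)}")
```

Read top to bottom, each model hands one more slice of the stack to the provider, which is also why the lock-in concern grows as you move from IaaS toward PaaS.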

Cloud computing is becoming a deeper and more pertinent topic every day, and we’d do well to begin keeping a sharp eye on it. Cloud is not necessarily an either/or strategy. For small business, and to a large degree medium business, the server room will eventually disappear into the cloud, and that will be a blessing for many. For the enterprise space, however, cloud services  will be just another tool in the arsenal to solve business problems along with, dare I say it, the mainframe.


If you work in IT for a small or medium-size business, there is a career signpost up ahead in the clouds.  Make sure you don’t ignore it.

//spk

Personnel DR

by Scott Kantner, July 10th, 2009 in Disaster Recovery, Hosting

Are you prepared for the departure of a key technical resource in your operation? Someone who holds the infamous “keys to the kingdom?” Typically there is at least one person on a company’s IT staff who achieves deity status in regards to physical and logical access. Sometimes key skills also reside in just that one person. If such a person leaves, either voluntarily or involuntarily, how would your critical operations fare?


Now would be a good time to take a fresh look at both your internal documentation and your skills matrix. Things to consider:

  1. Are all sysadmin userids and passwords documented somewhere, somehow?
  2. Are all critical architectures documented in excruciating detail?  (SAN, virtualization, LAN/WAN, disk replication, backup/restore systems). You want to see how things are connected and how they are intended to interact. You want to see things like IP addresses, subnet addressing schemes, WWN numbers, hard and soft zoning information and the like. You’ll know you have all of the information you need when you can hand it to a new engineer and he doesn’t have any questions. Seem impossible? Strive for it, and the result will be good enough.
  3. Where does the above documentation live if you do already have it?  Hopefully it’s not on your staff’s laptops.  If you think you already have it in a shared on-line space, are you sure you have all of it?  And is it being backed up?
  4. Do you have runbooks for all of your servers?  Are they current?  Where are they? Are they backed up?
  5. How many people have practical working knowledge in each area of your critical infrastructure?  Do you have more than one VMware tech?  More than one SAN person?  How about Active Directory or Exchange? Ideally you’ll want three in each area. Contract for it if you need to.
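The skills-matrix check in the last item can be sketched in a few lines. Everything here is hypothetical: the people, the matrix contents, and the helper name coverage_gaps are invented for illustration; only the target of three people per area comes from the list above.

```python
# Hypothetical skills matrix: area -> people with practical working knowledge.
skills_matrix = {
    "VMware": ["alice", "bob"],
    "SAN": ["carol"],
    "Active Directory": ["alice", "dave", "erin"],
    "Exchange": ["dave", "erin", "frank"],
}

TARGET_PER_AREA = 3  # the list above suggests ideally three people per area


def coverage_gaps(matrix, target=TARGET_PER_AREA):
    """Return {area: shortfall} for every area with fewer than `target` people."""
    return {
        area: target - len(set(people))
        for area, people in matrix.items()
        if len(set(people)) < target
    }


gaps = coverage_gaps(skills_matrix)
for area, shortfall in gaps.items():
    print(f"{area}: {shortfall} more needed (train or contract)")
```

Running a check like this whenever the staff list changes is one way to keep the matrix from going stale.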

 
I could go on, but I think you’re getting my point. This process is somewhat like writing a will. It’s a real drag to write up, and everybody knows that they need to take care of it, yet it often gets ignored until it’s too late. And just like a will, all of this documentation needs to be updated on a regular basis or it may end up being worthless at crisis time.

Alternatively, you could move the responsibility for a large portion of this to a professional hosting facility.   Why not limit your exposure to just your applications and let us worry about how all the  plumbing is hooked up?

//spk

Up or Out?

by Scott Kantner, June 26th, 2009 in Hosting, Servers

I recently read a post that took a new twist on the long-term debate over whether it’s better to scale up (buy bigger servers) or scale out (buy more servers). Traditionally this battle has been fought mostly on technical considerations. “Which is better for processing the real-time inventory of my growing Dippin’ Dots empire vs. the fast serving of web pages on my trendy social site?”


The conversation is often reduced to raw number crunching power vs. the benefits of highly parallel processing or high availability.  But in this era of sacrifice, we might want to take a look at the oft-overlooked cost factors lurking behind the curtain.  Following the framework of the aforementioned post, let’s consider the costs using IBM hardware.

First, we fire up IBM’s server configuration tool and build a big dreadnought-class server with an x3950 M2:

  • 4 CPU sockets (using 6-core processors)
  • 32 memory sockets
  • 4 drive bays
  • 2 power supplies
  • 4U

Total Price:  $68,429  MSRP.  In Pennsylvania, we throw on another 6% for the Governor, so the number rounds to $72,500.   Sure enough, scaling up has the hefty price tag one would expect.

If instead we were to scale out, what kind of horsepower could we get for the same money?  Taking a trip to the other end of the product line, we find the modest x3250 M2:

  • 1 CPU socket (using a 4 core processor)
  • 4 memory sockets
  • 2 drive bays
  • 1 power supply
  • 1U

Total Price: $2,431 MSRP.  Allowing again for the Governor, we come in at $2,575, which means that for the same $72,500 we could buy 28 of these unassuming smaller servers.
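The budget arithmetic so far can be checked with a short sketch. The prices and the 6% tax are the figures quoted above; the variable names are illustrative only, and note that the post rounds both taxed prices to convenient numbers.

```python
# Figures quoted in the post; names are illustrative.
SALES_TAX = 0.06            # the 6% "for the Governor"
BIG_SERVER_MSRP = 68_429    # x3950 M2 as configured
SMALL_SERVER_MSRP = 2_431   # x3250 M2 as configured

big_with_tax = BIG_SERVER_MSRP * (1 + SALES_TAX)
small_with_tax = SMALL_SERVER_MSRP * (1 + SALES_TAX)

# The post rounds these to $72,500 and $2,575 respectively.
budget = 72_500
small_price = 2_575
servers_for_budget = budget // small_price  # whole servers only

print(f"Big server with tax:   ${big_with_tax:,.2f}")
print(f"Small server with tax: ${small_with_tax:,.2f}")
print(f"Small servers for ${budget:,}: {servers_for_budget}")
```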

So, if we decide to go shopping at IBM with $72,500 as our budget, what can we get for our money?

                 Scaling Up      Scaling Out
CPU cores        24              112
RAM              256GB           224GB
Disk             1.2TB           28TB

It would seem that scaling out puts more resources in our data center for the same money.  Score 1 for scaling out.

Now let’s take a look at things from the software angle:

                         Scaling Up      Scaling Out
Windows 2K8 Server*      $2,515          $20,524
SQL-Server               $7,400          $16,800

Our quick mental math says that scaling out costs nearly 4 times as much in software.  Score 1 for scaling up.
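For anyone who would rather not trust mental math, here is a minimal sketch of the same comparison. The prices are the ones listed above; the dictionary layout and names are just for illustration.

```python
# Software prices from the comparison above; names are illustrative.
software_costs = {
    "scale_up": {"Windows 2K8 Server": 2_515, "SQL-Server": 7_400},
    "scale_out": {"Windows 2K8 Server": 20_524, "SQL-Server": 16_800},
}

# Sum each option, then compare the two totals.
totals = {option: sum(items.values()) for option, items in software_costs.items()}
ratio = totals["scale_out"] / totals["scale_up"]

for option, total in totals.items():
    print(f"{option}: ${total:,}")
print(f"Scaling out costs {ratio:.2f}x as much in software")
```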

And now for the tie breaker, let’s examine operational power costs, assuming the boxes run on average at 50% of peak and without factoring in cooling:

                     Scaling Up      Scaling Out
Peak Watts           1,440W          9,828W
Power Cost/Year      $441            $3,013

Scaling out costs nearly seven times as much in power.  Final score: Scaling up appears to win in a narrow 2-1 victory.
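The power figures can be reproduced with a short sketch. The peak wattages and the 50% average load come from the post; the electricity rate does not appear anywhere in the post, so the $0.07 per kWh used here is an assumption, chosen because it reproduces the quoted annual costs.

```python
HOURS_PER_YEAR = 24 * 365
UTILIZATION = 0.50     # boxes run at 50% of peak on average (from the post)
RATE_PER_KWH = 0.07    # ASSUMPTION: not stated in the post; back-solved from its figures


def annual_power_cost(peak_watts: float) -> float:
    """Yearly electricity cost in dollars, ignoring cooling."""
    average_kw = peak_watts * UTILIZATION / 1000
    return average_kw * HOURS_PER_YEAR * RATE_PER_KWH


scale_up_cost = annual_power_cost(1_440)
scale_out_cost = annual_power_cost(9_828)

print(f"Scaling up:  ${scale_up_cost:,.2f} per year")
print(f"Scaling out: ${scale_out_cost:,.2f} per year")
print(f"Ratio: {scale_out_cost / scale_up_cost:.1f}x")
```

Swap in your own utility rate to see how the gap changes; the ratio between the two options stays the same because it depends only on the peak wattages.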

Having seen the costs, which approach seems to make more sense? If you object to this question, you’re quite right to do so. From a strictly financial point of view, scaling up seems to be the way to go, unless you decide to level the playing field by zeroing the software costs with open source (e.g. Linux and PostgreSQL). Scaling out becomes more financially appealing when open source is in play, which is what we often find in places like Google.

Of course, the decision can’t be made solely from a financial point of view, but prior to this exercise, have you ever even considered these hidden-in-plain-sight costs? Ultimately the decision still comes down to your particular business needs, which must be weighed against the technical requirements involved. Watching your team whiteboard the various options can sometimes be more tedious than reading Klingon poetry, but you need to let the team work through both the technical and cost considerations to arrive at the best solution.

There are two take-aways from this example.  The first is that when the technical requirements don’t point hard in either direction, you may be able to appeal to cost to help arbitrate the decision.  The second is that you really don’t need to make these types of decisions anymore.  The infrastructure utility trend is already in motion and is gaining momentum.   Before investing significant capital of any scale, consider deploying new applications in a professional hosting data center. Outsource these ongoing scaling decisions to others while you focus on the bigger picture of providing the right applications for your business.

//spk

* Please don’t flame me with comments like “How’d you get those prices, we only pay $20 for Win2K8 server?”   I haven’t spent the four years of education required to be fully conversant in the Microsoft  Licensing program, which is more complex and complicated than ancient Hebrew Law.  These prices were based on recent customer quotes and internal pricing from our distributors.

Jilted Again

by Scott Kantner, May 22nd, 2009 in Data Center, Hosting, Network Infrastructure, Systems Management

On my way out of the office earlier this week, I met our master Jedi of monitoring standing in my office door. “You might want to sit down,” he said. In over 10 years of working together in the hellfire and brimstone of systems management, he’d never said that before. “What could possibly be that bad?” I wondered. “I just went to the Cittio support site,” he said calmly as he handed me his Blackberry. “Here’s what I got:”

[Image: cittio-done]

For those of you unfamiliar with the world of network management systems, the name Cittio means nothing. For those of you unfamiliar with the history of systems management tools at DSS, you’re also likely thinking “Dude, get over it. It’s just another company folding.” Or, as a former MVS systems programmer colleague used to say to me, “Get over it…and like it.”

Four Times Bitten, Forever Shy?

IBM Netview. We’ve been managing customer systems with NMS tools since 1995. Being an IBM business partner, we decided to start with IBM Netview, a close but homely cousin of HP Openview. While Netview was not without its charm, it was a cruel taskmaster. We spent more time offering animal sacrifices to the tool to keep it running than we spent actually using it. Besides taking 45 minutes to begin polling after a restart, the monitoring daemon would just go off into the weeds and stop polling. We never could really trust it, and reporting left much to be desired. As we continued to struggle with Netview, IBM bought Tivoli and the product was moved over to the Tivoli side of the house for assimilation into the Tivoli Enterprise Framework. Since IBM surely wouldn’t have bought a company with bad products, and since business partners now had easy access to the Tivoli products, we naively decided to take a look at Tivoli Enterprise.

Tivoli Enterprise Distributed Monitoring (DM). After spending considerable time and money getting indoctrinated in the Tivoli Enterprise Framework and DM, we quickly realized the product was even more of a monster than Netview.  More animal sacrifices and offerings of time and energy were required for less functionality and horrible reliability.  We did one customer implementation and stopped.   We had seen and suffered enough.  While contemplating whether to shave our heads and put on sackcloth and ashes, we heard of a new NMS savior coming for the small-medium business space.

Tivoli IT Director. Enter codename “Bossman.” By divine intervention, our company was selected by Tivoli to become part of a small circle of customers and partners involved in a skunk works project to develop an NMS targeted at small shops. An all-in-one tool that could poll for availability, collect performance data, monitor thresholds, collect HW/SW inventory, and even do software distribution. A veritable Ginsu knife set for systems management (without the 50-year guarantee). But wait, there’s more… Tivoli released the product on time, and as insiders we were way ahead of the game. We began implementing it at customer sites with good results and the sun was finally beginning to shine again. No more animal sacrifices. We had finally begun to rebuild our remote monitoring business out of the ashes of the Netview days.

IT Director did have one flaw in its armor – it couldn’t support more than a couple of hundred nodes. But the boys in Texas were on top of that, and project “California” was underway to take the number of nodes up to 5,000. Just days before we were to receive the beta code, Tivoli pulled the plug on the product. Our sources behind the curtain told us why: it was felt that California, at its dramatically lower price point, would compete against Tivoli Enterprise Distributed Monitoring, and the Mercedes Benz crowd at Tivoli were having none of that. The product was pulled from the portfolio and given to the IBM PC division in Boca Raton, where it was thoroughly lobotomized and re-released as IBM Netfinity Director. So began the Dark Times.

Time out. I realize this is a blog post, not the Chronicles of Narnia, so I’ll hasten to the point. Director was completely unusable after IBM Boca got done with it, and we had to move on. At this point, having been left at the altar by Tivoli, we decided to develop our own system, DSS Systems Manager, and over the next two years we did exactly that and had very satisfying results. Customers loved DSM, and so did we, but we had one problem – DSS was not, and still is not, a software development shop. At the time we felt we couldn’t continue to develop the product and properly focus on our core business. As we moved into the data center hosting business, we realized we needed additional functionality that we felt we could no longer afford to develop ourselves. So we sought yet another commercial answer. Back to the story….


Cittio Watchtower. Watchtower essentially represented where we wanted to take DSM had we decided to continue development. We negotiated a deal, installed it in under 30 days and were up and running. Like good old Tang, we just added water and the rest was history. We cultivated a close relationship with the CEO of Cittio and had regular contact with the VP of Development and other high-level folks who controlled the product’s destiny. We did joint marketing events with them, including speaking on their behalf on webinars, and served as a reference account when they had large deals on the table.

The Betrayal

Only a week before the company dissolved (like Tang perhaps), the CEO personally asked me to serve as a reference to a couple of companies that Cittio was considering for OEM relationships. Context is everything, and little did I know that OEM had been secretly redefined to mean “Our Exit Money.” A little over a week after I had happily given a glowing Watchtower review to a company named Nimsoft, my chief monitoring engineer was handing me his Blackberry with news of Cittio’s demise. We contacted Nimsoft on The Day After, and the basic message we got was “Good luck fellas, you’re pretty much on your own. The product will be no more. We can’t promise support of any kind.” Simply fabulous. To be fair, the whole situation is still in flux, and my sense during the phone call was that they hadn’t fully considered the fallout from their actions. They may very well come back with a migration plan or limited temporary support, etc., but for now we Watchtower users are out in the cold. Our new bride has packed her bags and left us with the credit card bills.

Getting Over It But Not Liking It

Thankfully, faith in God allows me to maintain my composure in situations like this, but a wise friend once taught me that buried feelings are buried alive, and when they come back, they come back as either anger or depression. So in the interests of good mental health, I’m compelled  to express my feelings about this debacle and get back to business.  Play this back to back 4-5 times for proper effect:

Nobody understands being jilted quite like Sam Kinison.  I feel much better now.

What Does This Mean to You?

So what’s the take-away from this situation that we can apply in our shops?   DSS has been on both sides of the build vs. buy decision, and there are clear advantages and risks to both positions.   My opinion, while still standing here in the smoking crater, is pretty much what it’s always been:  if you have the talent and can afford the time, building your own critical monitoring systems  is still your best destiny.  You have control of all of the variables and are forever immune to vendor adultery.  There is plenty of good open source material out there to take care of  the heavy lifting and serve as a good starting point.

If you don’t have the time or talent, then buying is obviously the only option. Cittio was a VC-funded company and therefore subject to the whims and wiles of the angels and VCs. If I were to buy again, rule #1 would be to limit the vendor short list to firms beyond at least the magical fourth round of funding. Translation: No fresh start-ups. Rule #2 would be to pick a product that is already firmly entrenched in a lot of Really Big Companies with big legal departments. There is safety in numbers and large legal teams. This may yet turn out to be the case with the Cittio breakup – they had some Really Big Customers, so we’ll wait and see if any major players file for damages in divorce court.

Unless IT is your core business, your best strategy is simple avoidance. Running your own infrastructure is full of headaches and horror stories that do nothing but hurt your bottom line. Let someone else highly skilled in being jilted deal with all the risks, headaches, and heartaches.

//spk

Postscript: Just as I was getting ready to publish this entry, I received a call from a former senior exec at Cittio. Though no longer on the payroll, he apologized at length for the situation, described what went down, and was genuinely troubled at the way in which former customers are now being treated. In the end analysis, the VC guys pulled the plug on a healthy company. While my contact really didn’t know why it happened, perhaps they were selling healthy assets to compensate for unhealthy ones. Who knows. In any event, it’s time to move on.

Hog Wild

by Scott Kantner, April 30th, 2009 in Data Center, Disaster Recovery, Hosting, Support, Systems Management

Try as you might, your IT department won’t be allowed to ignore the current drama surrounding the swine flu outbreak south of the border. While the number of confirmed swine flu deaths is one (yes, one) as of this writing, the 24/7 news cycle is in full Doomsday mode. Your customers may soon be asking what your plans are because they are just in the process of making their own plans. Unlike “normal” data center disasters like fire or flood, a pandemic scenario is just not on most people’s planning radar.

So what are we in IT to do? Chances are you’ve already taken care of it. If you have remote access technology in place for your employees, and you’ve already planned for a building disaster, you’ve probably done as much as you can do unless you can find staff who are impervious to the flu.

[Image: Commander Data]

The rest is really a matter of business continuity, not disaster recovery.

A relevant article appeared on processor.com a few years ago that stated as much:

A major part of an IT admin’s job during a pandemic will involve remote IT administration. Unlike disaster planning for acts of God, such as floods, fire, or earthquakes, staffers during a pandemic will not immediately seek to relocate.

“One interesting difference between [a pandemic] and another disaster is how everybody cannot just go and work at a different data center. You don’t want to take everybody and put them all in one place,” notes James Governor, an analyst for Redmonk, an analyst firm built on open source. “You do need a distributed and potentially home-working strategy because this is not the same as your [average disaster].”

Enabling staffers to access and perform networking tasks remotely is crucial in the event of a pandemic. “Any establishment worth its salt has good access tools to use the network from wherever they are on the planet. That is just good practice in any case,” Governor says. “And certainly, it is good practice if one is concerned about any potential issues where you might not be able to access the network in a way that you normally would.”

And as Bob DeCoufle pointed out on Tuesday, there is only a remote possibility of needing to invoke your disaster plan, assuming you had a recovery facility “outside of the epidemic region.” How one would anticipate where that would be is another matter, but in any case, few of us have the resources to relocate around a pandemic.

Unless we’re hosting hospital applications or other life support systems, asking our employees to do more than work remotely is probably unrealistic. In a genuine crisis, they will likely be home with their families, and Uncle Sam will probably be calling the shots regardless of our plans.

If by chance you are also required to cover the continuity aspect of your company, Forrester Research offers the following planning tips for a pandemic:

Preparing for a pandemic involves collaboration between all the departments in an enterprise, Forrester Research says. If an outbreak of a contagious virus or disease keeps more than half of all employees from showing up for work, some of the things an organization must do include:

  • Maintaining inventory and supplier relationships
  • Providing systematic communications about the outbreak for employees
  • Making vaccines and medical support for employees available (if possible)
  • Offering means of transportation to and from work in case public transit systems fail
  • Providing tools and resources to enable employees to work from home

The phrase “this too shall pass” brings me peace of mind. The swine flu will pass. In the meantime here at DSS, we’ll be making sure our remote access systems are up to snuff and reviewing our staffing plans for the data center. An emergency IT staffing plan should reflect the kind of business you’re in. If your IT systems support the lives of others, you obviously have a greater ethical responsibility than those who are running online shopping sites. For the crisis du jour, you will want to have an appropriate plan for on-site data center support.

And if you put your gear in a facility like this, you’ll have even less to worry about the next time the flu bug oinks in our direction.
