Public-facing sites are a good way to test the variety of available services
By Joab Jackson
Senior Technology Editor for Government Computer News
May 15, 2009
Toward the end of March, the Obama administration was getting ready to host a virtual town hall meeting on the economy with plans to receive questions from people around the country through its Web site. But how would it manage the questions — and pick the best ones — asked by potentially hundreds of thousands of people?
Instead of ginning up a question box internally for its Web site, the White House new media team used Moderator, a service created by Google to broker its internal meetings. The company built Moderator from two Google cloud services, Google App Engine and Google Web Toolkit.
Using an outside service offered a number of advantages, not the least of which was that it eliminated the need to write code. But perhaps the most important advantage was that the White House did not have to worry about how much hardware it would need to handle the traffic. After all, allocating too many servers would be wasteful, but too few would make the site sluggish and frustrate visitors.
And the site got plenty of traffic. The White House set up a 48-hour period during which people could submit questions. In that time, more than 92,000 visitors hit the site. They submitted 92,000 questions, and, with a self-moderating system in which visitors rated one another’s questions, cast more than 3.6 million votes. All told, the site got as many as 700 hits per second.
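A rough back-of-the-envelope calculation shows why sizing hardware for such an event is hard. The sketch below uses only the figures quoted above. Hits and votes are not the same measure, so the ratio is only an indication of how far the peak sat above the typical load; the function and variable names are illustrative.

```python
# Figures quoted in the article for the 48-hour question period.
WINDOW_HOURS = 48
TOTAL_VOTES = 3_600_000          # "more than 3.6 million votes"
PEAK_HITS_PER_SECOND = 700       # "as many as 700 hits per second"


def average_per_second(events: int, hours: float) -> float:
    """Spread a total event count evenly over a window given in hours."""
    return events / (hours * 3600)


average_votes = average_per_second(TOTAL_VOTES, WINDOW_HOURS)
peak_to_average = PEAK_HITS_PER_SECOND / average_votes

print(f"Average votes per second: {average_votes:.1f}")
print(f"Peak hits per second relative to that average: {peak_to_average:.1f}x")
```

Hardware bought to survive the peak would sit mostly idle the rest of the time, which is the case for renting capacity that grows and shrinks with demand.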
Like others in the information technology industry, government IT managers have been intrigued by the idea of cloud computing, in which an organization can rent IT services on a utility-like basis. And although the Obama administration supports cloud computing, it could be a while before the government sector, with its labyrinthine procurement rules and stringent security requirements, will be in a position to try cloud computing services offered by Amazon, Google, Microsoft and others.
But as the White House has shown, one way to kick the tires of this new computing method would be through public-facing Web applications that handle nonsensitive data.
“At this point, no sensitive [government] data can be put into the cloud because certifications aren’t in place,” said Wayne Beekman, co-owner of consulting firm Information Concepts. “But that doesn’t mean that you can’t deploy transaction-based applications” with public data via the Web.
What’s out there?
Of course, agencies have long used outside parties to host their Web sites. If you’re expecting huge waves of users to suddenly appear on your agency’s home page, you can call Akamai or another rehosting service. But the Web has become more than just static content. People expect interactions, transactions and other activities.
This has long been the domain of Web applications. At their most complex, Web applications have three different components: a user interface, a back-end data store and business logic, or the computational procedures.
Typically, when an organization needs a new application, the project team building the application must requisition the necessary equipment from the IT department — Web servers, application servers and database servers. That approach can take time and money before the application serves the first user. Worse yet, the developers never know how many servers they will need.
Beekman offered a few numbers. Say you’re setting up a Web application for 5,000 to 10,000 users. Conservatively speaking, you’d need three Web servers, two application servers and two database servers. Buying the equipment and running it for three years would cost an organization about $500,000.
By running an application in the cloud, your hardware expenses would be zero — whether it would be less expensive during the three years would depend on your usage. “It’s all about building an application and deploying it quickly without investment in hardware or licensing,” he said.
More important, it allows you to scale up or down as needed. Beekman said agencies can find it difficult to pinpoint how many resources they will need for an in-house implementation.
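Beekman's estimate can be restated as a monthly and hourly figure, which makes it easier to set beside a metered cloud bill. The sketch below is only arithmetic on the numbers he gave. It assumes the $500,000 is spread evenly over the three years and across all seven servers, which a real budget would not do, and the function name is hypothetical.

```python
# Beekman's in-house estimate, as quoted in the article.
IN_HOUSE_TOTAL_DOLLARS = 500_000
YEARS = 3
SERVER_COUNT = 3 + 2 + 2  # Web, application and database servers


def amortize(total_dollars: float, years: int, servers: int) -> dict:
    """Spread a fixed cost evenly over time and servers (an assumption)."""
    months = years * 12
    hours = years * 365 * 24  # ignores leap days
    return {
        "per_month": total_dollars / months,
        "per_hour": total_dollars / hours,
        "per_server_hour": total_dollars / hours / servers,
    }


in_house = amortize(IN_HOUSE_TOTAL_DOLLARS, YEARS, SERVER_COUNT)
for label, dollars in in_house.items():
    print(f"{label}: ${dollars:,.2f}")
```

The in-house figure is owed whether or not anyone uses the application, while a metered service charges only for the hours actually consumed.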
Three types of clouds
Beekman said all cloud offerings can be divided into three categories: cloud-based applications, cloud-based development and cloud-based infrastructure.
For cloud-based applications, service providers offer sets of software programs that customers can use. For example, Google offers its Google Apps suite of office-productivity software. Salesforce.com offers a set of customer relationship management packages online. This category is also known as software as a service.
With cloud-based development, a provider sets up an environment in which a customer can develop applications. The provider keeps track of the hardware and supporting software, such as operating systems, application servers and databases. The customer only has to worry about designing the program with one of the languages that the provider has available. Google’s App Engine is an example of a cloud-based development service.
And with cloud-based infrastructure, a provider supplies the hardware, storage and bandwidth, and a customer can provide all the software. This also goes by the name of utility computing. Amazon offers such a service with its Elastic Compute Cloud (EC2) service.
All cloud offerings share some additional characteristics. In April, the National Institute of Standards and Technology offered a draft definition of cloud computing. NIST recommended that when an agency works with a party outside its IT operations, cloud computing should be billed as a metered service, with the customer scaling up or scaling down as demand ebbs and flows.
At the recent Collaborate ’09 conference, held by the Independent Oracle Users Group, Tony Jedlinski, president of consulting firm Konoso, demonstrated how easy it is to set up an application in Amazon EC2. Setting up a fully working virtual server on Amazon cost Jedlinski a total of 47 cents.
With Amazon, users create entire operating environments within a virtual container. Amazon offers a number of pre-staged environments, called Amazon Machine Images (AMIs), which consist of entire operating systems, either Linux or Microsoft Windows, with supporting software, such as Oracle Application Express. Users can download the AMIs and configure them to their specifications or build their own using their own operating system and software. Once finished, users upload them to Amazon’s site. The performance and billing of these virtual machines can be monitored through a dashboard.
Amazon prices are based on metered use: Customers are billed, by the hour, only when their AMIs are running. They can rent the equivalent of single-core CPU machines at 10 cents an hour for Linux and 12 cents an hour for Windows. Prices scale from that point for larger machines. Additionally, data transfer is 12 cents per gigabyte in and 17 cents per gigabyte out for the first 10 terabytes. Storage runs 10 cents per gigabyte, per month.
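To see how those metered rates add up, consider a minimal sketch of a monthly estimate for a single Linux machine. The rates are the ones quoted above, with both transfer rates assumed to be per gigabyte and usage assumed to stay within the first pricing tier. The class, the function and the usage figures are illustrative and are not part of any Amazon tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MeteredRates:
    """Dollar rates for a metered service; field names are illustrative."""
    instance_hour: float      # per hour a virtual machine is running
    transfer_in_gb: float     # per gigabyte received
    transfer_out_gb: float    # per gigabyte sent (first tier only)
    storage_gb_month: float   # per gigabyte stored, per month


# Rates quoted in the article for a single-core Linux machine.
LINUX_SINGLE_CORE = MeteredRates(0.10, 0.12, 0.17, 0.10)


def monthly_bill(rates: MeteredRates, instance_hours: float,
                 gb_in: float, gb_out: float, gb_stored: float) -> float:
    """Add up each metered component for one month of usage."""
    return (rates.instance_hour * instance_hours
            + rates.transfer_in_gb * gb_in
            + rates.transfer_out_gb * gb_out
            + rates.storage_gb_month * gb_stored)


# Hypothetical usage: one machine running all month (720 hours),
# 50 GB received, 200 GB sent and 100 GB stored.
bill = monthly_bill(LINUX_SINGLE_CORE, 720, 50, 200, 100)
print(f"Estimated monthly bill: ${bill:.2f}")
```

Because the machine is billed only while it runs, shutting it down outside business hours would cut the largest component of that estimate.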
Another provider, Terremark, takes a similar approach. Users develop their operating environments within a VMware virtual environment and upload them to run in the company’s managed hosting service, called Enterprise Cloud. But instead of hourly billing, Terremark offers a monthly rate. On the General Services Administration schedule, Terremark’s Enterprise Cloud comes in a number of pricing tiers. One configuration offers the equivalent processing power of a single dual-core processor-based server (5 GHz), along with 10G of RAM and 100G of storage, for about $2,000 a month. Bandwidth to and from the Internet is offered at the rate of about $47.50 per megabyte of dedicated bandwidth.
With Terremark, users can pick their plan based on processing power estimates. If their usage goes over those limits, customers pay a premium, but, like cell phone users, they can switch to a larger plan. No discounts are offered for using less than the full capacity, though.
GSA is using Terremark’s Enterprise Cloud service to run the USA.gov search engine. Martha Dorris, acting associate administrator at GSA’s Office of Citizen Services and Communication, estimated that going with Terremark could save as much as 90 percent in annualized infrastructure, IT and software support costs compared to what the agency is paying under its contract with Raytheon.
One advantage of going with cloud-based development rather than cloud-based infrastructure is that development leaves the operating system issues with the provider. In a cloud-based infrastructure, the user still must maintain the operating systems, deploying patches and testing when new versions appear.
That is the appeal of Google App Engine (GAE). GAE is a Google-hosted platform that can run applications written in Python and Java. The company is considering offering other languages, such as PHP and Ruby. With the downloadable software development kit and a copy of the Python runtime, users develop an application on a local machine and then upload it to Google. Google will run the application and monitor bandwidth, CPU and storage use. Like other services, Google provides a dashboard that lets users keep track of how often the application runs and how much they owe.
“You upload the package which defines your application,” said Google product manager Mike Repass. “It has the database schema built in, your business logic [in Java or Python] and, on top of that, we support Web frameworks like Django for presentation rendering layer on top.” GAE uses the Jetty server to run the Java programs as servlets rather than run them under a full instance of a Java Enterprise Edition application server.
With GAE running in full production mode, the bill accrues at 10 cents per CPU core hour, 10 cents per gigabyte of bandwidth inbound and 12 cents per gigabyte outbound, and about 15 cents per gigabyte of data stored. Computational cycles, bandwidth and storage space are added on the fly in an automated fashion. The program manager doesn’t need to designate those resources. Google adds them automatically and keeps track of how much customers use, then bills them accordingly. Customers can set filters that establish limits for how much they want to pay each month.
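The billing model described above can be sketched as a simple calculation with a spending limit. The rates are the ones quoted in the article. The usage numbers, the function names and the budget check are hypothetical illustrations and do not reflect Google's actual billing interface.

```python
# GAE production rates quoted in the article, in dollars.
CPU_CORE_HOUR = 0.10
BANDWIDTH_IN_GB = 0.10
BANDWIDTH_OUT_GB = 0.12
STORAGE_GB = 0.15  # "about 15 cents per gigabyte of data stored"


def gae_charge(cpu_hours: float, gb_in: float,
               gb_out: float, gb_stored: float) -> float:
    """Total the metered components for one billing period."""
    return (CPU_CORE_HOUR * cpu_hours
            + BANDWIDTH_IN_GB * gb_in
            + BANDWIDTH_OUT_GB * gb_out
            + STORAGE_GB * gb_stored)


def within_budget(charge: float, monthly_limit: float) -> bool:
    """Hypothetical check against a customer-set spending limit."""
    return charge <= monthly_limit


# Hypothetical month: 300 CPU core hours, 40 GB in, 120 GB out, 20 GB stored.
charge = gae_charge(300, 40, 120, 20)
print(f"Charge: ${charge:.2f}")
print("Within a $50 limit:", within_budget(charge, 50.00))
```

In this made-up month the charge would exceed a $50 limit, the kind of situation the customer-set limits are meant to catch.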
And again, the customers do not have to worry about which operating system is being used.
For Windows environments, cloud-based development will also be an advantage with Microsoft’s cloud offering, called Azure. The service is now in the technology preview stage, and the company is welcoming testers. An Active Server Pages-based Web application can be designed in Visual Studio, using any of the languages supported by the company’s .NET framework, such as C#, PHP, Perl, Ruby and Visual Basic. Customers will also be able to use the features of other Microsoft back-office applications, such as BizTalk server for workflow, or access services for security.
Once finished, the applications can then be uploaded to the Azure platform, which Microsoft will run. Data will be kept in Microsoft’s cloud-based database storage, called the SQL Data Services repository.
The company has not revealed how pricing will work. But presumably, it will be an a la carte model.
“You don’t have to use all these services,” said Susie Adams, Microsoft Federal chief technology officer. “You could plug and play these things in any service-oriented architecture. They could live in our data center, or you can have pieces [in-house] talking with pieces in our data center.”
Data safe to roam?
Running Web applications in the cloud has some advantages, but it also has some limitations.
For one thing, where does data in the cloud reside? Many agencies have regulations that prohibit storing their data out of the country. Also, knowing where the data is stashed is important in continuity-of-operations planning. Google does not divulge where it keeps its customer data, and the company makes no promises that the data will reside in the United States or be geographically dispersed. On the other hand, Terremark discloses that its customers’ data is either in its Culpeper, Va., facility, or in a second location in Florida. Microsoft also plans to divulge information on data location for its enterprise customers, Adams said.
Another concern is that government agencies must follow security procedures set in place by the Federal Information Security Management Act, which defines how agencies should control their information resources. The problem with FISMA is that it was drafted before cloud computing became commercially widespread.
NIST is studying how agencies could certify and accredit cloud-computing providers in a way that meets FISMA requirements. GSA’s Patrick Stingley, who has taken on the role of federal cloud chief technology officer, is also studying the issue.
Cloud computing providers point to various certifications that they say would cover FISMA. In particular, ISO 27001, a standard for information security management systems, has a set of controls that can be mapped out to satisfy all FISMA requirements. Microsoft and Terremark officials have said their data centers are certified under ISO 27001.
“We’re 27001-certified,” Adams said. “So now the question becomes: Will we have to get FISMA-certified? We don’t know. We’re still trying to validate what the government thinks about that. This is a challenge for all the vendors. Everyone is in the same boat.”