A highly scalable open source distributed database for handling large amounts of data is a key component in cloud computing.
The Apache Software Foundation (ASF) yesterday announced Apache Cassandra v0.7, the highly scalable, second-generation open source distributed database. The ASF says that Apache Cassandra is successfully deployed at organizations with active data sets and large server clusters, including Cisco, Cloudkick, Digg, Facebook, Rackspace, and Twitter. The largest Cassandra cluster to date contains over 400 machines.
The organization lists the following among the new features in Apache Cassandra v0.7:
* Secondary Indexes, an expressive, efficient way to query data through node-local storage on the client side
* Large Row Support, up to two billion columns per row
* Online Schema Changes, automated from the client API, which allow adding and modifying object definitions without requiring a cluster restart
It mentions that Apache Cassandra is available under the Apache Software License v2.0, and is overseen by a Project Management Committee (PMC), which guides its day-to-day operations, including community development and product releases.
"Apache Cassandra is a key component in cloud computing and other applications that deal with massive amounts of data and high query volumes," said Jonathan Ellis, Vice President of Apache Cassandra. "It is particularly successful in powering large web sites with sharp growth rates."
"Running any large website is a constant race between scaling your user base and scaling your infrastructure to support it," said David King, Lead Developer at Reddit. "Our traffic more than tripled this year, and the transparent scalability afforded to us by Apache Cassandra is in large part what allowed us to do it on our limited resources. Cassandra v0.7 represents the real-life operations lessons learned from installations like ours and provides further features like column expiration that allow us to scale even more of our infrastructure."