Twitter runs multiple large Hadoop clusters that are among the biggest in the world
Twitter runs multiple large Hadoop clusters that are among the biggest in the world. Hadoop is at the core of Twitter's data platform and provides vast storage for analytics on Twitter.
"Hadoop at Twitter: scalability and interoperability
Our Hadoop filesystems host over 300PB of data on tens of thousands of servers. We scale HDFS by federating multiple namespaces. This approach allows us to sustain a high HDFS object count (inodes and blocks) without resorting to a single large Java heap size that would suffer from long GC pauses and the inability to use compressed oops.
Twitter runs multiple large Hadoop clusters that are among the biggest in the world. Hadoop is at the core of our data platform and provides vast storage for analytics of user actions on Twitter. In this post, we will highlight our contributions to ViewFs, the client-side Hadoop filesystem view, and its versatile usage here."
https://blog.twitter.com/engineering/en_us/a/2015/hadoop-filesystem-at-twitter.html