Hi David,

I wrote that blog post, and I know that Lars George has much more experience than I do with tuning HBase, especially in different environments, so weight our opinions accordingly. As he says, compression will "usually" help. The unusual case is lower spec'd hardware (which is what I ran those tests on), where it might hurt scans, though it obviously still helps with disk and network use. So take my post with a grain of salt, and as Kevin says, try it out on your data and your cluster and see what works best for you.
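
To make the trade-off concrete, here is a back-of-the-envelope sketch in Python. It is not a benchmark and not how HBase works internally: the function name, the simple "read time plus decompression time" model, and every number below are assumptions chosen for illustration, not measurements from my tests or from the book.

```python
def scan_seconds(raw_gb, disk_mb_s, ratio=1.0, decompress_mb_s=None):
    """Rough estimate of the time needed to scan raw_gb of table data.

    ratio           -- compressed size / raw size (1.0 means no compression)
    decompress_mb_s -- decompression speed in MB/s of raw output;
                       None means the data is stored uncompressed
    Assumes reading and decompressing do not overlap, which is pessimistic.
    """
    raw_mb = raw_gb * 1024
    read_time = raw_mb * ratio / disk_mb_s  # fewer bytes come off disk when compressed
    cpu_time = 0.0 if decompress_mb_s is None else raw_mb / decompress_mb_s
    return read_time + cpu_time


# Hypothetical cluster: 10 GB of raw data, disks delivering 100 MB/s.
uncompressed = scan_seconds(10, 100)
# Fast CPU (assumed 400 MB/s decompression), data shrinks to 40% of raw size.
fast_cpu = scan_seconds(10, 100, ratio=0.4, decompress_mb_s=400)
# Slow CPU (assumed 100 MB/s decompression), same compression ratio.
slow_cpu = scan_seconds(10, 100, ratio=0.4, decompress_mb_s=100)

print(f"uncompressed: {uncompressed:.1f}s")
print(f"compressed, fast CPU: {fast_cpu:.1f}s")
print(f"compressed, slow CPU: {slow_cpu:.1f}s")
```

With these made-up numbers the compressed scan wins when decompression is fast and loses when it is slow, which is the same pattern as the "usually helps, but maybe not on low-end hardware" point above. Plugging in figures measured on your own cluster is the only way to know which side you are on.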

Cheers,
Oliver

On 2012-11-03, at 3:57 PM, David Koch wrote:

> Hello,
>
> Are scans faster when compression is activated? The HBase book by Lars
> George seems to suggest so (p424, Section on "Compression" in chapter
> "Performance Tuning").
>
> "... compression usually will yield overall better performance, because the
> overhead of the CPU performing the compression and decompression is less
> than what is required to read more data from disk."
>
> I searched around for a bit and found this:
> http://gbif.blogspot.fr/2012/02/performance-evaluation-of-hbase.html. The
> author conducted a series of scan performance tests on tables of up to
> 200 million rows and found that compression actually slowed down read
> performance slightly - albeit at lower CPU load.
>
> Thank you,
>
> /David

--
Oliver Meyn
Software Developer
Global Biodiversity Information Facility (GBIF)
+45 35 32 15 12
http://www.gbif.org