Hi, I was reading HFile.java today and found one part of the class comment a bit confusing:
 * <li>Minimum block size. We recommend a setting of minimum block size between
 * __8KB to 1MB__ for general usage. Larger block size is preferred if files are
 * primarily for sequential access. However, it would lead to inefficient random
 * access (because there are more data to decompress). Smaller blocks are good
 * for random access, but require more memory to hold the block index, and may
 * be slower to create (because we must flush the compressor stream at the
 * conclusion of each data block, which leads to an FS I/O flush). Further, due
 * to the internal caching in Compression codec, the smallest possible block
 * size would be around __20KB-30KB__.

The comment recommends a minimum block size between 8KB and 1MB, but then says the __smallest__ possible block size is around 20KB-30KB, so the lower end of the recommended range (8KB up to about 20KB) appears to be unreachable. Should we fix this description? It sounds like 20KB is the more accurate lower bound.

Thanks,
Xu
