Hi,
I was reading HFile.java today and found a passage in the class comment
that is a bit confusing to me:


 * <li>Minimum block size. We recommend a setting of minimum block size between
 * __8KB to 1MB__ for general usage. Larger block size is preferred if files are
 * primarily for sequential access. However, it would lead to inefficient random
 * access (because there are more data to decompress). Smaller blocks are good
 * for random access, but require more memory to hold the block index, and may
 * be slower to create (because we must flush the compressor stream at the
 * conclusion of each data block, which leads to an FS I/O flush). Further, due
 * to the internal caching in Compression codec, the smallest possible block
 * size would be around __20KB-30KB__.

It says the block size should be between 8KB and 1MB, but then it also
says the __smallest__ possible block size would be around 20KB-30KB.
Should we fix this description? It sounds like 20KB is the more accurate
lower bound.
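
For what it's worth, here is a rough back-of-the-envelope sketch (plain Java, not HBase code) of the trade-off the comment describes, where smaller blocks need more memory for the block index. The class name, the 1GB file size, and the 64 bytes per index entry are all assumptions I picked for illustration; the real per-entry cost depends on key size.

```java
// Hypothetical helper, not part of HBase: estimates how many block index
// entries a file needs at a given block size.
class BlockIndexEstimate {
    // Number of data blocks needed to hold fileSizeBytes, rounding up.
    static long blockCount(long fileSizeBytes, long blockSizeBytes) {
        return (fileSizeBytes + blockSizeBytes - 1) / blockSizeBytes;
    }

    // Rough index size: one entry per block at an assumed per-entry cost.
    static long indexBytes(long fileSizeBytes, long blockSizeBytes, long bytesPerEntry) {
        return blockCount(fileSizeBytes, blockSizeBytes) * bytesPerEntry;
    }

    public static void main(String[] args) {
        long fileSize = 1L << 30;   // 1GB file (assumption)
        long bytesPerEntry = 64;    // assumed average index entry size
        long[] blockSizes = {8L << 10, 20L << 10, 64L << 10, 1L << 20};
        for (long blockSize : blockSizes) {
            System.out.printf("block=%dKB blocks=%d index=%dKB%n",
                    blockSize >> 10,
                    blockCount(fileSize, blockSize),
                    indexBytes(fileSize, blockSize, bytesPerEntry) >> 10);
        }
    }
}
```

Under these assumptions, going from 1MB blocks down to 8KB blocks multiplies the number of index entries by 128, so the documented lower bound does matter.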
Thanks,

Xu
