Hi

        I tried to use short-circuit reads to improve the MapReduce (MR) scan 
performance of my HBase cluster.

        I have the following settings in hdfs-site.xml:

        dfs.client.read.shortcircuit set to true
        dfs.block.local-path-access.user set to the user that runs the MR jobs
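
        Written out as hdfs-site.xml properties, the two settings above 
would look roughly like the fragment below. The user name "mruser" is 
only a placeholder for the account that runs the MR jobs, not the real 
value from my cluster.

```xml
<!-- Sketch of the hdfs-site.xml entries described above (Hadoop 1.x names). -->
<configuration>
  <property>
    <!-- Lets the DFS client read local block files directly. -->
    <name>dfs.client.read.shortcircuit</name>
    <value>true</value>
  </property>
  <property>
    <!-- User allowed to ask the DataNode for local block paths.
         "mruser" is a hypothetical placeholder. -->
    <name>dfs.block.local-path-access.user</name>
    <value>mruser</value>
  </property>
</configuration>
```

        As far as I understand, the first property is read by the DFS 
client and the second by the DataNode, so each has to be present in the 
configuration that the respective process loads.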

        The cluster has 1+4 nodes, and each data node has 16 CPUs and 4 HDDs. 
All HBase tables have been major-compacted, so all data is local.
        I had hoped that short-circuit reads would improve performance.

        However, the test results show that with short-circuit reads enabled, 
performance actually dropped by 10-15%. For example, scanning a 50G table 
takes around 100s instead of 90s.

        My Hadoop version is 1.1.1. Any ideas on this? Thanks!

Best Regards,
Raymond Liu

