Hello,

For the bulk loading process, the HBase documentation mentions that in the second stage "the appropriate Region Server adopts the HFile, moving it into its storage directory and making the data available to clients." In my experience, however, the files also remain in the original location from which they are "adopted". So I guess the data is actually copied into the HBase directory rather than moved, right? If so, this means that, compared to online importing, bulk loading essentially needs twice the disk space on HDFS, right?

Another problem is data locality immediately after bulk loading through MapReduce. I understand that locality is obtained over time through compactions and splits. However, you don't get this problem when importing online, right?

Thanks in advance, Sever