wenhao created HBASE-29382:
------------------------------
Summary: The always.copy.files parameter does not take effect in
some bulkload scenarios
Key: HBASE-29382
URL: https://issues.apache.org/jira/browse/HBASE-29382
Project: HBase
Issue Type: Bug
Components: regionserver
Affects Versions: 2.5.11, 2.0.0
Reporter: wenhao
Attachments: 1.jpg, 2.jpg, 3.jpg, 4.jpg
When bulk loading from one table into another whose region boundaries differ,
the source table's HFiles have to be split to match the region boundaries of
the target table. In this case, even if -Dalways.copy.files is specified, the
source table's HFiles are still cleaned up, and the region directory on HDFS
contains no recognizable HFiles, so the data becomes unavailable.
As shown in the attached images, after the bulkload the original HFile is
deleted, while the split HFiles (under .tmp) are retained (with
-Dalways.copy.files set). However, the files under .tmp are not recognized
as store files. For example, after an unassign/assign of the region, its
data volume drops to 0.
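For anyone trying to reproduce this, the following is a minimal sketch of how the bulkload command line might be assembled. It only builds and prints the command; it does not run HBase. The HFile directory and table name are hypothetical placeholders, and passing the property as -Dalways.copy.files=true through the completebulkload entry point is an assumption about the invocation, since the report only mentions -Dalways.copy.files.

```python
import shlex


def build_bulkload_command(hfile_dir, table, always_copy=True):
    """Return the argv list for a bulkload run (the command is not executed)."""
    cmd = ["hbase", "completebulkload"]
    if always_copy:
        # Assumption: the property is passed as a boolean -D option,
        # placed before the positional arguments.
        cmd.append("-Dalways.copy.files=true")
    cmd += [hfile_dir, table]
    return cmd


if __name__ == "__main__":
    # Hypothetical source HFile directory and target table name.
    argv = build_bulkload_command("/tmp/source_hfiles", "target_table")
    print(shlex.join(argv))
```

To hit the scenario described above, the target table would need region boundaries that differ from those of the table the HFiles were written for, so that the HFiles have to be split during the load.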
--
This message was sent by Atlassian Jira
(v8.20.10#820010)