[ https://issues.apache.org/jira/browse/HBASE-26225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
xi chaomin resolved HBASE-26225.
--------------------------------
    Resolution: Won't Fix

> let hbase.mapreduce.bulkload.assign.sequenceNumbers take effect in
> SecureBulkLoadManager.secureBulkLoadHFiles
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-26225
>                 URL: https://issues.apache.org/jira/browse/HBASE-26225
>             Project: HBase
>          Issue Type: Improvement
>          Components: Performance
>            Reporter: xi chaomin
>            Priority: Minor
>         Attachments: SecureBulkLoadManager.patch
>
>
> HBASE-10958 calls flush before BulkLoad to obtain the latest sequenceID and
> prevent data loss during replay.
> '_hbase.mapreduce.bulkload.assign.sequenceNumbers_' controls whether to flush
> before BulkLoad, but *SecureBulkLoadManager* always passes true for that
> argument. If we bulk load frequently, we flush a lot of small files. Can we
> make 'hbase.mapreduce.bulkload.assign.sequenceNumbers' take effect in
> SecureBulkLoadManager? With the setting disabled, -1 is passed as the
> sequenceId, and we won't lose data.
> SecureBulkLoadManager.java, secureBulkLoadHFiles:
> {code:java}
> return region.bulkLoadHFiles(familyPaths, true,
>     new SecureBulkLoadListener(fs, bulkToken, conf), request.getCopyFile(),
>     clusterIds, request.getReplicate());
> {code}
> HRegion.java:
> {code:java}
> public Map<byte[], List<Path>> bulkLoadHFiles(Collection<Pair<byte[], String>> familyPaths,
>     boolean assignSeqId, BulkLoadListener bulkLoadListener, boolean copyFile,
>     List<String> clusterIds, boolean replicate)
> {code}
>

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
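
For readers of the archive, the change requested in the quoted issue can be sketched roughly as below. This is only an illustration, not the attached SecureBulkLoadManager.patch: `java.util.Properties` stands in for the HBase configuration object, the class and the helper `shouldAssignSeqId` are hypothetical, and the default of `true` is an assumption. The idea is that the value read from the setting would replace the hard-coded `true` passed as `assignSeqId` to `region.bulkLoadHFiles(...)`.

```java
import java.util.Properties;

public class BulkLoadSeqIdSketch {
    // Configuration key named in the issue.
    static final String ASSIGN_SEQ_IDS =
        "hbase.mapreduce.bulkload.assign.sequenceNumbers";

    // Hypothetical helper: decides whether the region should flush and
    // assign a sequence id before bulk load. Defaulting to true is an
    // assumption made for this sketch.
    static boolean shouldAssignSeqId(Properties conf) {
        return Boolean.parseBoolean(conf.getProperty(ASSIGN_SEQ_IDS, "true"));
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        // Setting absent: flush before bulk load (today's hard-coded behaviour).
        System.out.println("assignSeqId = " + shouldAssignSeqId(conf));

        // Setting disabled: the caller would pass false instead of true.
        conf.setProperty(ASSIGN_SEQ_IDS, "false");
        System.out.println("assignSeqId = " + shouldAssignSeqId(conf));
    }
}
```

Note that the issue was resolved as Won't Fix, so this sketch describes the proposal only, not behaviour that exists in HBase.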