Himanshu Gwalani created HBASE-27904:
----------------------------------------

             Summary: A random data generator tool leveraging bulk load.
                 Key: HBASE-27904
                 URL: https://issues.apache.org/jira/browse/HBASE-27904
             Project: HBase
          Issue Type: New Feature
          Components: util
            Reporter: Himanshu Gwalani


As of now, HBase has no data generator tool that leverages bulk load. Because 
bulk load skips the client write path, it is well suited to tooling built on 
HBase that needs a huge amount of data for load/performance testing.

The tool will generate data as a two-step process:
1. Generate HFiles with random data (using a custom Mapper and 
[HFileOutputFormat2|https://github.com/apache/hbase/blob/master/hbase-mapreduce/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat2.java])
2. Bulk load those HFiles to the respective regions of the table using 
[LoadIncrementalHFiles|https://hbase.apache.org/2.2/devapidocs/org/apache/hadoop/hbase/tool/LoadIncrementalHFiles.html]
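
As a rough illustration of the random-data part of step 1, here is a minimal 
plain-Java sketch of the kind of rows a custom Mapper might produce. It is not 
the proposed tool: the class name RandomRowGenerator, the fixed-width hex row 
key, and the valueSize parameter are all assumptions made for this example, and 
the HBase wiring (wrapping each row for HFileOutputFormat2, then bulk loading 
the resulting HFiles) is deliberately left out.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of HBase: generates random row keys and values.
public class RandomRowGenerator {

    /** One generated row: a row key plus random value bytes. */
    public static final class Row {
        public final byte[] key;
        public final byte[] value;

        Row(byte[] key, byte[] value) {
            this.key = key;
            this.value = value;
        }
    }

    private final Random random;
    private final int valueSize;

    public RandomRowGenerator(long seed, int valueSize) {
        this.random = new Random(seed); // seeded so that runs are reproducible
        this.valueSize = valueSize;
    }

    /** Returns one row with a 16-character hex key and valueSize random bytes. */
    public Row next() {
        // A random long rendered as fixed-width hex spreads keys over the key space.
        String key = String.format("%016x", random.nextLong());
        byte[] value = new byte[valueSize];
        random.nextBytes(value);
        return new Row(key.getBytes(StandardCharsets.UTF_8), value);
    }

    /** Generates count rows, in generation order (not sorted). */
    public List<Row> generate(int count) {
        List<Row> rows = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            rows.add(next());
        }
        return rows;
    }

    public static void main(String[] args) {
        RandomRowGenerator generator = new RandomRowGenerator(42L, 32);
        for (Row row : generator.generate(3)) {
            String key = new String(row.key, StandardCharsets.UTF_8);
            System.out.println(key + " -> " + row.value.length + " bytes");
        }
    }
}
```

The rows are left unsorted here on the assumption that the MapReduce job 
handles ordering: HFiles must be written in row-key order, and 
HFileOutputFormat2.configureIncrementalLoad sets up partitioning and sorting to 
match the target table's regions.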



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
