[ https://issues.apache.org/jira/browse/HBASE-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14943595#comment-14943595 ]
Ted Yu commented on HBASE-14520:
--------------------------------

{code}
	at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:428)
	at org.apache.hadoop.mapred.TestMultiFileSplit.testReadWrite(TestMultiFileSplit.java:41)
{code}
A mapreduce unit test got picked up by the test script.

bq. +1 core tests. The patch passed unit tests in .

> Optimize the number of calls for tags creation in bulk load
> -----------------------------------------------------------
>
>                 Key: HBASE-14520
>                 URL: https://issues.apache.org/jira/browse/HBASE-14520
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 2.0.0
>            Reporter: Bhupendra Kumar Jain
>            Assignee: Bhupendra Kumar Jain
>             Fix For: 2.0.0
>
>         Attachments: HBASE-14520.patch
>
>
> At present, the TTL and visibility expression are given once per TSV line, i.e. the values and
> the tags remain the same for all the columns present in that line. As per the
> code, a list of tags is created for each cell. Instead of creating new tags
> for each cell, the tags created once for the line can be reused by the other cells.
> Assume 1 million rows and 1000 columns. Currently, tag creation happens
> 1M * 1000 times. If the tags are reused, tag creation drops to 1M
> times (i.e. once per TSV line).
> This applies to both the TsvImporterMapper and TextSortReducer logic.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
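
The optimization described in the issue (build the tag list once per TSV line and share it across that line's cells, instead of rebuilding it per cell) can be illustrated with a minimal sketch. This is not the code from HBASE-14520.patch and does not use the HBase API: `SimpleTag`, `SimpleCell`, `createTags`, `perCell`, `perLine` and the `tagListBuilds` counter are all hypothetical names introduced here only to count how often the tag list is built.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class TagReuseSketch {
    /** Hypothetical stand-in for a tag; NOT the real HBase Tag class. */
    static final class SimpleTag {
        final String kind;
        final String value;
        SimpleTag(String kind, String value) { this.kind = kind; this.value = value; }
    }

    /** Hypothetical stand-in for a cell: one column value plus its tags. */
    static final class SimpleCell {
        final String value;
        final List<SimpleTag> tags;
        SimpleCell(String value, List<SimpleTag> tags) { this.value = value; this.tags = tags; }
    }

    /** Counts how many times a tag list has been built (illustration only). */
    static long tagListBuilds = 0;

    /** Builds the tag list for one TTL / visibility pair and bumps the counter. */
    static List<SimpleTag> createTags(long ttl, String visibility) {
        tagListBuilds++;
        List<SimpleTag> tags = new ArrayList<>();
        tags.add(new SimpleTag("ttl", Long.toString(ttl)));
        tags.add(new SimpleTag("visibility", visibility));
        // Read-only view, so sharing the list between cells is safe.
        return Collections.unmodifiableList(tags);
    }

    /** Behaviour described as current: a new tag list for every cell. */
    static List<SimpleCell> perCell(String[] columns, long ttl, String visibility) {
        List<SimpleCell> cells = new ArrayList<>();
        for (String column : columns) {
            cells.add(new SimpleCell(column, createTags(ttl, visibility)));
        }
        return cells;
    }

    /** Proposed behaviour: one tag list per line, reused by all its cells. */
    static List<SimpleCell> perLine(String[] columns, long ttl, String visibility) {
        List<SimpleTag> tags = createTags(ttl, visibility);
        List<SimpleCell> cells = new ArrayList<>();
        for (String column : columns) {
            cells.add(new SimpleCell(column, tags));
        }
        return cells;
    }

    public static void main(String[] args) {
        String[] columns = "a\tb\tc\td".split("\t"); // one TSV line, 4 columns
        int lines = 3;

        tagListBuilds = 0;
        for (int i = 0; i < lines; i++) {
            perCell(columns, 1000L, "secret");
        }
        System.out.println("per cell: " + tagListBuilds); // 3 lines * 4 columns = 12

        tagListBuilds = 0;
        for (int i = 0; i < lines; i++) {
            perLine(columns, 1000L, "secret");
        }
        System.out.println("per line: " + tagListBuilds); // 3
    }
}
```

With 1M lines of 1000 columns each, the same counting gives 1M * 1000 builds for the per-cell variant and 1M for the per-line variant, matching the figures in the description. Sharing is only safe here because the list is unmodifiable; whether the real mapper and reducer can share the list the same way depends on how the cells use it.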