[ 
https://issues.apache.org/jira/browse/HBASE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238095#comment-13238095
 ] 

Laxman commented on HBASE-5564:
-------------------------------

@Anoop, thanks for the clarification.

@Stack, thanks for the review. I will update the patch.

bq. need curlies
bq. NO_TIMESTAMP_KEYCOLUMN_INDEX 

I will update the patch for the above two comments.

bq. Can you confirm that current behavior – setting ts to 
System.currentTimeMillis – is default? It seems to be ... we set 
System.currentTimeMillis as time to use setting up the job.

Before the patch, we set ts to System.currentTimeMillis in 
TsvImporterMapper.doSetup. This setup method is called for each mapper, 
i.e., for each input split. That means each map task uses a new timestamp.

After the patch, we set ts to the value read with conf.getLong, which is 
the same in all map tasks.
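
To illustrate the difference, here is a rough sketch in plain Java. It is 
not the patch itself: Conf is a stand-in for the job configuration, and the 
key name "example.import.timestamp" and both setup methods are hypothetical 
names made up for this illustration. The clock value is passed in explicitly 
so the behaviour is reproducible.

```java
import java.util.HashMap;
import java.util.Map;

public class TimestampSketch {
    // Hypothetical configuration key, used only for this illustration.
    public static final String TIMESTAMP_KEY = "example.import.timestamp";

    /** Minimal stand-in for a job configuration (not the Hadoop class). */
    public static class Conf {
        private final Map<String, String> values = new HashMap<>();

        public void setLong(String key, long value) {
            values.put(key, Long.toString(value));
        }

        public long getLong(String key, long defaultValue) {
            String v = values.get(key);
            return v == null ? defaultValue : Long.parseLong(v);
        }
    }

    /**
     * Before the patch: every map task takes its own clock reading in setup.
     * The reading is a parameter here; the real mapper would call
     * System.currentTimeMillis().
     */
    public static long setupBeforePatch(long clockAtTaskStart) {
        return clockAtTaskStart;
    }

    /**
     * After the patch: every map task reads the value that was stored in
     * the configuration once, when the job was set up.
     */
    public static long setupAfterPatch(Conf conf, long clockAtTaskStart) {
        return conf.getLong(TIMESTAMP_KEY, clockAtTaskStart);
    }

    public static void main(String[] args) {
        Conf conf = new Conf();
        conf.setLong(TIMESTAMP_KEY, 1000L); // stored once at job setup

        // Two map tasks that start at different times.
        long[] taskStartTimes = {1005L, 1042L};
        for (long start : taskStartTimes) {
            System.out.println("task started at " + start
                + ": before patch ts=" + setupBeforePatch(start)
                + ", after patch ts=" + setupAfterPatch(conf, start));
        }
    }
}
```

In this sketch the two tasks get different timestamps from setupBeforePatch 
and the same timestamp from setupAfterPatch.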

I hope I understood your question correctly.

                
> Bulkload is discarding duplicate records
> ----------------------------------------
>
>                 Key: HBASE-5564
>                 URL: https://issues.apache.org/jira/browse/HBASE-5564
>             Project: HBase
>          Issue Type: Bug
>          Components: mapreduce
>    Affects Versions: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>         Environment: HBase 0.92
>            Reporter: Laxman
>            Assignee: Laxman
>              Labels: bulkloader
>             Fix For: 0.96.0
>
>         Attachments: 5564.lint, HBASE-5564_trunk.1.patch, 
> HBASE-5564_trunk.1.patch, HBASE-5564_trunk.patch
>
>
> Duplicate records are getting discarded when duplicate records exist in the 
> same input file, and more specifically if they exist in the same split.
> Duplicate records are considered if the records are from different 
> splits.
> Version under test: HBase 0.92


