[ 
https://issues.apache.org/jira/browse/HBASE-3871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13032600#comment-13032600
 ] 

Adam Phelps commented on HBASE-3871:
------------------------------------

As an FYI: in our use case we do see a decent number of splits, but most of 
the time there aren't many splits during the loading of any individual job. 
Looking at the logs, we're seeing a handful in any given hour.  The HBase 
configuration is the same on each cluster; it's just that the cluster the 
HFiles are later loaded onto has fewer nodes than the production one.  So I 
imagine that the regions would ultimately end up splitting along the same 
boundaries, but might not do so at the same point in time.

> Speedup LoadIncrementalHFiles by parallelizing HFile splitting
> --------------------------------------------------------------
>
>                 Key: HBASE-3871
>                 URL: https://issues.apache.org/jira/browse/HBASE-3871
>             Project: HBase
>          Issue Type: Improvement
>          Components: mapreduce
>    Affects Versions: 0.90.2
>            Reporter: Ted Yu
>            Assignee: Ted Yu
>         Attachments: 3871.patch
>
>
> From Adam w.r.t. HFile splitting:
> There's actually a good number of messages of that type (HFile no longer fits 
> inside a single region).  Unfortunately I didn't take a timestamp on just when 
> I was running with the patched jars vs. the regular ones; however, from the 
> logs I can say that this is occurring fairly regularly on this system.  The 
> cluster I tested this on is our backup cluster.  The mapreduce jobs on our 
> production cluster output HFiles, which are copied to the backup and then 
> loaded into HBase on both.  Since the regions may be somewhat different on 
> the backup cluster, I would expect it to have to split somewhat regularly.
>
> This JIRA complements HBASE-3721 by parallelizing HFile splitting, which is 
> done in the main thread.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
