[ 
https://issues.apache.org/jira/browse/HBASE-6295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13699393#comment-13699393
 ] 

Nicolas Liochon commented on HBASE-6295:
----------------------------------------

In HBASE-8810, I proposed to have:
    a file called hbase-settings-sample.xml that would not be included when we 
read the conf (while we read hbase-default and hbase-site today). It would be 
for documentation only.
    a unit test to load this file and compare with the code default, to ensure 
our doc is in line with the code.

The script would work as well. I think the test should use the defaults. Then, 
inside the tests, we can change them, for example when we know it's gonna fail 
and we don't want to try 20 times.

                
> Possible performance improvement in client batch operations: presplit and 
> send in background
> --------------------------------------------------------------------------------------------
>
>                 Key: HBASE-6295
>                 URL: https://issues.apache.org/jira/browse/HBASE-6295
>             Project: HBase
>          Issue Type: Improvement
>          Components: Client, Performance
>    Affects Versions: 0.95.2
>            Reporter: Nicolas Liochon
>            Assignee: Nicolas Liochon
>              Labels: noob
>             Fix For: 0.98.0, 0.95.2
>
>         Attachments: 6295.addendum.patch, 6295.v11.patch, 6295.v12.patch, 
> 6295.v14.patch, 6295.v15.patch, 6295.v1.patch, 6295.v2.patch, 6295.v3.patch, 
> 6295.v4.patch, 6295.v5.patch, 6295.v6.patch, 6295.v8.patch, 6295.v9.patch, 
> hbase-ycsb-workloads Build time trend.png
>
>
> today batch algo is:
> {noformat}
> for Operation o: List<Op>{
>   add o to todolist
>   if todolist > maxsize or o last in list
>     split todolist per location
>     send split lists to region servers
>     clear todolist
>     wait
> }
> {noformat}
> We could:
> - create immediately the final object instead of an intermediate array
> - split per location immediately
> - instead of sending when the list as a whole is full, send it when there is 
> enough data for a single location
> It would be:
> {noformat}
> for Operation o: List<Op>{
>   get location
>   add o to todo location.todolist
>   if (location.todolist > maxLocationSize)
>     send location.todolist to region server 
>     clear location.todolist
>     // don't wait, continue the loop
> }
> send remaining
> wait
> {noformat}
> It's not trivial to write if you add error management: retried list must be 
> shared with the operations added in the todolist. But it's doable.
> It's interesting mainly for 'big' writes

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to