[ 
https://issues.apache.org/jira/browse/ACCUMULO-112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner updated ACCUMULO-112:
----------------------------------

    Description: 
Currently the in-memory map is not partitioned by locality group.  This could 
negatively impact scan and minor compaction performance.  Would like to run 
some experiments to understand the performance implications.  Partitioning by 
locality group could, in turn, negatively impact insert performance: the cost 
could go from O(log(R)+log(C)) to O(L * (log(R)+log(C))) in the worst case, 
where L is the number of locality groups, R is the number of rows, and C is the 
number of columns.  The worst case is where each mutation has a change for 
each locality group.
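
To make the estimate above concrete, here is a rough sketch that turns the two bounds into comparison counts.  It assumes balanced trees, so a lookup among n keys costs about ceil(log2(n)) comparisons.  The function names are hypothetical and are not part of Accumulo.

```cpp
#include <cassert>
#include <cstdint>

// Approximate comparisons for one lookup in a balanced tree holding n keys:
// ceil(log2(n)).  This is an assumption about the tree, not a measurement.
inline std::uint64_t treeDepth(std::uint64_t n) {
    std::uint64_t depth = 0;
    while (depth < 63 && (std::uint64_t{1} << depth) < n) {
        ++depth;
    }
    return depth;
}

// Unpartitioned map: one row lookup plus one column lookup per mutation,
// i.e. O(log(R) + log(C)).
inline std::uint64_t unpartitionedCost(std::uint64_t rows, std::uint64_t cols) {
    return treeDepth(rows) + treeDepth(cols);
}

// Partitioned map, worst case: the mutation has a change for every locality
// group, and each group pays a full row and column lookup,
// i.e. O(L * (log(R) + log(C))).
inline std::uint64_t partitionedWorstCost(std::uint64_t groups,
                                          std::uint64_t rows,
                                          std::uint64_t cols) {
    return groups * unpartitionedCost(rows, cols);
}
```

With about a million rows (2^20), 16 columns and 8 locality groups, this gives roughly 24 comparisons per mutation unpartitioned versus 192 in the partitioned worst case.  These are upper bounds only; the experiments would need to show how often the worst case actually occurs.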

Currently the in-memory map is a map of maps, like the following.

{noformat}
  map<row, map<col, val>>
{noformat}

Could conceptually change this to one of the following.  The first is best for 
scans that access only some locality groups, and for minor compactions.  The 
second is good for inserts where the mutation covers all locality groups, 
because the row is only looked up once.

{noformat}
  map<localityGroup, map<row, map<col, val>>>
{noformat}


{noformat}
  map<row, map<localityGroup, map<col, val>>>
{noformat}

The Accumulo native map is implemented using C++, the STL, and JNI, with thread 
locking done in Java.
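
As a starting point for the experiments, the two candidate layouts can be sketched with plain STL maps.  This is only an illustration under assumptions: string keys and values, a hypothetical Update struct, and hypothetical function names.  It is not the actual native map code.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

using Columns = std::map<std::string, std::string>;  // col -> val
using Rows = std::map<std::string, Columns>;         // row -> (col -> val)

// Layout 1: map<localityGroup, map<row, map<col, val>>>
using GroupFirst = std::map<std::string, Rows>;
// Layout 2: map<row, map<localityGroup, map<col, val>>>
using RowFirst = std::map<std::string, std::map<std::string, Columns>>;

// One column change within a mutation (hypothetical type for this sketch).
struct Update {
    std::string group;  // locality group the column belongs to
    std::string col;
    std::string val;
};

// Layout 1 insert.  Returns the number of row lookups performed: one per run
// of consecutive updates that share a locality group.
std::size_t insertGroupFirst(GroupFirst& m, const std::string& row,
                             const std::vector<Update>& updates) {
    std::size_t rowLookups = 0;
    const std::string* lastGroup = nullptr;
    Columns* cols = nullptr;
    for (const auto& u : updates) {
        if (lastGroup == nullptr || *lastGroup != u.group) {
            cols = &m[u.group][row];  // row lookup inside this group
            lastGroup = &u.group;
            ++rowLookups;
        }
        (*cols)[u.col] = u.val;
    }
    return rowLookups;
}

// Layout 2 insert.  The row is looked up once, then each update goes to its
// group under that row.
std::size_t insertRowFirst(RowFirst& m, const std::string& row,
                           const std::vector<Update>& updates) {
    auto& groups = m[row];  // the single row lookup
    for (const auto& u : updates) {
        groups[u.group][u.col] = u.val;
    }
    return 1;
}

// Layout 1 scan of a single locality group: other groups are never touched.
std::size_t countEntriesInGroup(const GroupFirst& m, const std::string& group) {
    auto it = m.find(group);
    if (it == m.end()) {
        return 0;
    }
    std::size_t n = 0;
    for (const auto& rowEntry : it->second) {
        n += rowEntry.second.size();
    }
    return n;
}
```

For a mutation that changes one column in each of three locality groups, insertGroupFirst does three row lookups while insertRowFirst does one.  A scan of a single group in the first layout only walks that group's rows, which is the trade-off the experiments would measure.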


  was:
Currently the in-memory map is not partitioned by locality group.  This could 
negatively impact scan and minor compaction performance.  Would like to run 
some experiments to understand the performance implications.  Partitioning by 
locality group could, in turn, negatively impact insert performance: the cost 
could go from O(log(R)+log(C)) to O(L * (log(R)+log(C))) in the worst case, 
where L is the number of locality groups, R is the number of rows, and C is the 
number of columns.  The worst case is where each mutation has a change for 
each locality group.

Currently the in-memory map is a map of maps, like the following.

{noformat}
  map<row, map<col, val>>
{noformat}

Could conceptually change this to one of the following.  The first is best for 
scans that access only some locality groups, and for minor compactions.  The 
second is good for inserts where the mutation covers all locality groups, 
because the row is only looked up once.

{noformat}
  map<localityGroup, map<row, map<col, val>>>
{noformat}


{noformat}
  map<row, map<localityGroup, map<col, val>>>
{noformat}



    
> Investigate partitioning in memory map by locality group
> --------------------------------------------------------
>
>                 Key: ACCUMULO-112
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-112
>             Project: Accumulo
>          Issue Type: Task
>          Components: tserver
>            Reporter: Keith Turner
>            Assignee: Keith Turner
>              Labels: gsoc2013, mentor
>
> Currently the in-memory map is not partitioned by locality group.  This could 
> negatively impact scan and minor compaction performance.  Would like to run 
> some experiments to understand the performance implications.  Partitioning by 
> locality group could, in turn, negatively impact insert performance: the cost 
> could go from O(log(R)+log(C)) to O(L * (log(R)+log(C))) in the worst case, 
> where L is the number of locality groups, R is the number of rows, and C is 
> the number of columns.  The worst case is where each mutation has a change 
> for each locality group.
> Currently the in-memory map is a map of maps, like the following.
> {noformat}
>   map<row, map<col, val>>
> {noformat}
> Could conceptually change this to one of the following.  The first is best 
> for scans that access only some locality groups, and for minor compactions.  
> The second is good for inserts where the mutation covers all locality groups, 
> because the row is only looked up once.
> {noformat}
>   map<localityGroup, map<row, map<col, val>>>
> {noformat}
> {noformat}
>   map<row, map<localityGroup, map<col, val>>>
> {noformat}
> The Accumulo native map is implemented using C++, the STL, and JNI, with 
> thread locking done in Java.

