[ 
https://issues.apache.org/jira/browse/TRAFODION-1271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14943284#comment-14943284
 ] 

Qifan Chen commented on TRAFODION-1271:
---------------------------------------

This problem has been resolved. 

> LP Bug: 1464306 - Compiler:ESP colocation with Hbase Regions
> ------------------------------------------------------------
>
>                 Key: TRAFODION-1271
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-1271
>             Project: Apache Trafodion
>          Issue Type: Bug
>          Components: sql-cmp
>            Reporter: Ravisha Neelakanthappa
>            Assignee: Ravisha Neelakanthappa
>            Priority: Critical
>
> There is scope for performance improvement if ESPs are colocated with the 
> HBase regions they access, leveraging the data locality of HBase 
> RegionServers and Hadoop DataNodes.
> Currently ESPs are assigned to any random node, as shown in the code below:
>        // Get the node map for this ESP fragment.
>        NodeMap *nodeMap =
>           (NodeMap *)fragmentDir_->getPartitioningFunction(i)->getNodeMap();
>        for (CollIndex j=0; j<nodeMap->getNumEntries(); j++) {
>          nodeMap->setNodeNumber(j, ANY_NODE);
>          nodeMap->setClusterNumber(j, 0);
>        }
> Because of this assignment, the communication between ESPs and 
> RegionServers can cross node boundaries, causing slow performance.
> Here is the algorithm used for ESP colocation:
> 1. During startup create a Hashdictionary of NodeNames(Key):NodeNumber(value)
> 2. During NATable creation make a JNI call to get Node(Host) Names of Table's 
> regions
> 3. Get the NodeNumber of each NodeName using the Hashdictionary
> 4. Populate NodeMap with NodeNumber from step 3 above
> 5. During HbaseScan synthesis, new NodeMap gets created for each context 
> being optimized. 
>    Copy NodeNumbers from NodeMap stored in table's partFunc. 
>    5a. If there is 1:1 mapping, do a direct copy
>    5b. If there is M:N (where M < N), use the most popular NodeNumber of 
> each partition grouping
> 6. In the generator, assign ANY_NODE only if ESP colocation logic is OFF
>     
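
Steps 1, 3 and 5 above can be sketched roughly as follows. This is a minimal
illustration and not the Trafodion implementation: kAnyNode, lookupNodeNumber
and colocateEsps are hypothetical names, the standard containers stand in for
the Hashdictionary and NodeMap, and the contiguous grouping of regions and the
tie-breaking rule (first node to reach the highest count wins) are assumptions
that the description does not specify.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the ANY_NODE constant in the quoted snippet.
constexpr int kAnyNode = -1;

// Steps 1 and 3: resolve a host name to its node number through a
// name -> number dictionary. Unknown hosts fall back to kAnyNode.
int lookupNodeNumber(const std::unordered_map<std::string, int>& nodeNumbers,
                     const std::string& hostName) {
  auto it = nodeNumbers.find(hostName);
  return it == nodeNumbers.end() ? kAnyNode : it->second;
}

// Step 5: derive one node number per ESP from the per-region node numbers.
// regionNodes holds N entries (one per region); numEsps is M, with M <= N.
std::vector<int> colocateEsps(const std::vector<int>& regionNodes,
                              std::size_t numEsps) {
  const std::size_t n = regionNodes.size();
  if (numEsps == 0 || numEsps > n) {
    // Outside the cases the description covers: no colocation preference.
    return std::vector<int>(numEsps, kAnyNode);
  }
  if (numEsps == n) {
    return regionNodes;  // 5a: 1:1 mapping, direct copy
  }
  std::vector<int> espNodes;
  espNodes.reserve(numEsps);
  for (std::size_t g = 0; g < numEsps; ++g) {
    // 5b: assumed contiguous grouping; group g covers regions [begin, end).
    const std::size_t begin = g * n / numEsps;
    const std::size_t end = (g + 1) * n / numEsps;
    std::map<int, int> counts;
    int best = kAnyNode;
    int bestCount = 0;
    for (std::size_t i = begin; i < end; ++i) {
      const int node = regionNodes[i];
      if (node == kAnyNode) continue;  // region location unknown
      const int c = ++counts[node];
      if (c > bestCount) {  // first node to reach the top count wins ties
        best = node;
        bestCount = c;
      }
    }
    espNodes.push_back(best);
  }
  return espNodes;
}
```

For example, with regions on nodes {0, 0, 1, 2, 2, 2} and two ESPs, this
sketch forms the groups {0, 0, 1} and {2, 2, 2} and places the ESPs on nodes
0 and 2. Step 6 would then fall back to ANY_NODE only when colocation is off.
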
> Data locality:
> When data is written in HDFS, one copy is written locally, another is written 
> to another node in a different rack (if possible) and a third copy is written 
> to another node in the same rack. For all practical purposes the two extra 
> copies are written to random nodes in the cluster.
> In typical HBase setups a RegionServer is co-located with an HDFS DataNode on 
> the same physical machine. Thus every write is written locally and then to 
> the two nodes as mentioned above. As long as the regions are not moved 
> between RegionServers there is good data locality: a RegionServer can serve 
> most reads just from the local disk (and cache), provided short-circuit 
> reads are enabled.
> When regions are re-assigned, data locality is lost and the RegionServers in 
> question need to request the data over the network from remote DataNodes 
> until the data is rewritten locally (at major compaction time).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
