aimahou created YARN-10298:
------------------------------

             Summary: TimeLine entity information only stored in one region 
when use apache HBase as backend storage
                 Key: YARN-10298
                 URL: https://issues.apache.org/jira/browse/YARN-10298
             Project: Hadoop YARN
          Issue Type: Improvement
          Components: ATSv2, timelineservice
    Affects Versions: 3.1.1
            Reporter: aimahou


h2. Issue

Timeline entity information is stored in only one region when Apache HBase is
used as the backend storage.
h2. Probable cause

We found in the source code that, when the HBase timeline writer stores
timeline entity info, the rowKey is composed of clusterId, userId, flowName,
flowRunId and appId. HBase sorts rows lexicographically by rowKey, so rows that
share these leading components end up next to each other. As a result, timeline
entities are probably stored in only one region or a few adjacent regions.
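
The ordering effect can be reproduced outside HBase. The following is a minimal sketch, not the actual ATSv2 encoding: the `!` separator, the class name and the sample IDs are assumptions made for illustration. It sorts keys by unsigned byte order, as HBase does for row keys, and checks that all keys of one flow run form a single unbroken run.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RowKeyOrderingSketch {

  // Hypothetical, simplified layout: components joined with '!'.
  // The real ApplicationRowKey encoding differs; only the ordering matters here.
  static byte[] rowKey(String clusterId, String userId, String flowName,
      long flowRunId, String appId) {
    String joined = clusterId + "!" + userId + "!" + flowName + "!"
        + flowRunId + "!" + appId;
    return joined.getBytes(StandardCharsets.UTF_8);
  }

  // Sample keys: five apps of one flow run plus two unrelated flows.
  static List<byte[]> demoKeys() {
    List<byte[]> keys = new ArrayList<>();
    keys.add(rowKey("c2", "carol", "report", 3L, "application_1_0200"));
    for (int i = 5; i >= 1; i--) {
      keys.add(rowKey("c1", "alice", "etl", 1L, "application_1_000" + i));
    }
    keys.add(rowKey("c0", "bob", "adhoc", 7L, "application_1_0100"));
    return keys;
  }

  // Sorts keys by unsigned, byte-by-byte (lexicographic) order.
  static List<String> sortedKeys(List<byte[]> keys) {
    List<byte[]> copy = new ArrayList<>(keys);
    copy.sort((a, b) -> Arrays.compareUnsigned(a, b));
    List<String> result = new ArrayList<>();
    for (byte[] key : copy) {
      result.add(new String(key, StandardCharsets.UTF_8));
    }
    return result;
  }

  // True if every key starting with prefix sits in one unbroken run.
  static boolean isContiguous(List<String> sorted, String prefix) {
    int first = -1;
    int last = -1;
    int count = 0;
    for (int i = 0; i < sorted.size(); i++) {
      if (sorted.get(i).startsWith(prefix)) {
        if (first < 0) {
          first = i;
        }
        last = i;
        count++;
      }
    }
    return count > 0 && last - first + 1 == count;
  }

  public static void main(String[] args) {
    List<String> sorted = sortedKeys(demoKeys());
    sorted.forEach(System.out::println);
    System.out.println("flow run contiguous: "
        + isContiguous(sorted, "c1!alice!etl!1!"));
  }
}
```

Because a region holds a contiguous range of row keys, such a run lands in one region until that region splits.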
h2. Related code snippet

HBaseTimelineWriterImpl.java

public TimelineWriteResponse write(TimelineCollectorContext context,
    TimelineEntities data, UserGroupInformation callerUgi)
    throws IOException {

  ...

  boolean isApplication = ApplicationEntity.isApplicationEntity(te);
  byte[] rowKey;
  if (isApplication) {
    ApplicationRowKey applicationRowKey =
        new ApplicationRowKey(clusterId, userId, flowName, flowRunId,
            appId);
    rowKey = applicationRowKey.getRowKey();
    store(rowKey, te, flowVersion, Tables.APPLICATION_TABLE);
  } else {
    EntityRowKey entityRowKey =
        new EntityRowKey(clusterId, userId, flowName, flowRunId, appId,
            te.getType(), te.getIdPrefix(), te.getId());
    rowKey = entityRowKey.getRowKey();
    store(rowKey, te, flowVersion, Tables.ENTITY_TABLE);
  }

  if (!isApplication && SubApplicationEntity.isSubApplicationEntity(te)) {
    SubApplicationRowKey subApplicationRowKey =
        new SubApplicationRowKey(subApplicationUser, clusterId,
            te.getType(), te.getIdPrefix(), te.getId(), userId);
    rowKey = subApplicationRowKey.getRowKey();
    store(rowKey, te, flowVersion, Tables.SUBAPPLICATION_TABLE);
  }

  ...

}
h2. Suggestion

We could use a hash of the original rowKey as the rowKey when storing and
reading timeline entity data.
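
A minimal sketch of this idea follows. It is an illustration, not a proposed patch: the class name, the choice of MD5, the `!`-separated sample key and the `regionIndex` helper (which assumes a table pre-split evenly on the first key byte) are all assumptions. It shows that hashing turns keys sharing a long common prefix into keys spread across the key space.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashSet;
import java.util.Set;

class HashedRowKeySketch {

  // Replaces the original row key with its MD5 digest (16 bytes).
  // The same input always yields the same digest, so point reads still work.
  static byte[] hashedRowKey(byte[] originalRowKey) {
    try {
      return MessageDigest.getInstance("MD5").digest(originalRowKey);
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("MD5 not available", e);
    }
  }

  // Hypothetical helper: maps a key to one of numRegions regions, assuming
  // the table is pre-split evenly on the first byte of the key.
  static int regionIndex(byte[] rowKey, int numRegions) {
    return (rowKey[0] & 0xFF) * numRegions / 256;
  }

  // Counts how many distinct regions the hashed keys of one flow run reach.
  static int regionsHit(int numApps, int numRegions) {
    Set<Integer> regions = new HashSet<>();
    for (int i = 0; i < numApps; i++) {
      // Illustrative key layout only; not the real ATSv2 encoding.
      String original = "c1!alice!etl!1!application_1_" + i;
      byte[] hashed = hashedRowKey(original.getBytes(StandardCharsets.UTF_8));
      regions.add(regionIndex(hashed, numRegions));
    }
    return regions.size();
  }

  public static void main(String[] args) {
    System.out.println("regions hit by 100 apps of one flow run: "
        + regionsHit(100, 8) + " of 8");
  }
}
```

One trade-off to weigh: once the whole key is hashed, range or prefix scans over the original key order no longer work, so a reader can fetch a row only if it can rebuild the full original key. Hashing only a leading part of the key, or prepending a short hash-derived prefix to the original key, are common alternatives.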



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
