[ 
https://issues.apache.org/jira/browse/YARN-5070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15325613#comment-15325613
 ] 

Vrushali C edited comment on YARN-5070 at 6/11/16 1:43 AM:
-----------------------------------------------------------

bq. l.120: From the javadoc, it appears that ScannerContext keeps track of the 
progress towards the limits. If the progress should be monitored across 
multiple invocations of nextRaw(List<Cell>), I'm not sure if this will do 
that.

Yes, I believe the progress to be tracked is within the context of a single 
invocation of the "next" call, not across invocations. The ScannerContext 
class does have a keepProgress setting in case we want to track progress 
across RPCs, but this patch does not use it.
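To make the distinction concrete, here is a minimal toy sketch of per-invocation progress versus kept progress. It is not HBase code: ToyScannerContext, ToyScanner, and their fields are hypothetical stand-ins that I made up to mirror the idea of a batch limit plus a keepProgress flag, and the real ScannerContext may differ in its details.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Toy model only: these classes are hypothetical stand-ins, not HBase classes.
class BatchLimitSketch {

  /** Stand-in for a scanner context: one batch limit plus progress toward it. */
  static final class ToyScannerContext {
    final int batchLimit;        // max cells per tracked window; <= 0 means no limit
    final boolean keepProgress;  // if true, progress survives across next() calls
    int batchProgress = 0;       // cells counted toward the limit so far

    ToyScannerContext(int batchLimit, boolean keepProgress) {
      this.batchLimit = batchLimit;
      this.keepProgress = keepProgress;
    }

    boolean limitReached() {
      return batchLimit > 0 && batchProgress >= batchLimit;
    }
  }

  /** Stand-in for a scanner that hands out cells from an in-memory list. */
  static final class ToyScanner {
    private final Iterator<String> cells;

    ToyScanner(List<String> cells) {
      this.cells = cells.iterator();
    }

    /** Appends cells to out until the limit is hit; returns true if cells remain. */
    boolean next(List<String> out, ToyScannerContext ctx) {
      if (!ctx.keepProgress) {
        ctx.batchProgress = 0;  // progress is scoped to this single invocation
      }
      while (cells.hasNext() && !ctx.limitReached()) {
        out.add(cells.next());
        ctx.batchProgress++;
      }
      return cells.hasNext();
    }
  }

  public static void main(String[] args) {
    List<String> data = Arrays.asList("c1", "c2", "c3", "c4", "c5");

    // Progress reset on every call: each call may return up to 2 cells.
    ToyScanner perCallScanner = new ToyScanner(data);
    ToyScannerContext perCall = new ToyScannerContext(2, false);
    List<String> perCallOut = new ArrayList<>();
    perCallScanner.next(perCallOut, perCall);
    perCallScanner.next(perCallOut, perCall);
    System.out.println(perCallOut);  // prints [c1, c2, c3, c4]

    // Progress kept across calls: the second call finds the limit already reached.
    ToyScanner keptScanner = new ToyScanner(data);
    ToyScannerContext kept = new ToyScannerContext(2, true);
    List<String> keptOut = new ArrayList<>();
    keptScanner.next(keptOut, kept);
    keptScanner.next(keptOut, kept);
    System.out.println(keptOut);     // prints [c1, c2]
  }
}
```

In this toy model, the patch's behavior corresponds to the first case: the limit applies within each "next" invocation and nothing is carried over to the following one.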
 
The documentation in the ScannerContext class says
https://git-wip-us.apache.org/repos/asf?p=hbase.git;a=blob;f=hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ScannerContext.java;h=29bffd26753795f33b90f31e9b77a5d1387e5cd7;hb=refs/heads/branch-1.1

{code}
 * ScannerContext instances encapsulate limit tracking AND progress towards those limits during
 * invocations of {@link InternalScanner#next(java.util.List)} and
 * {@link RegionScanner#next(java.util.List)}.
{code}

For the flow run coprocessor, the nextRaw/next functions both call the 
nextInternal function, which is the one that actually does the iteration. 
Hence the batch limit is set up here.

bq. Are we even supposed to create instances of ScannerContext? Am I off? I'm 
basically not sure what is the correct way of using the ScannerContext.

An example of how ScannerContext is used in the HBase region server code:
https://hbase.apache.org/xref/org/apache/hadoop/hbase/regionserver/StoreFlusher.html

bq. Are we certain that batchLimit is the correct one to use in ScannerContext? 

batchLimit is the one that tracks the batch size during the scan’s next call, 
hence we are using it. There are other settings, like max results per column 
family or max result size, which I believe have corresponding limit settings 
in ScannerContext. We are not keeping track of those in FlowScanner. 

That said, all of this is what I have gathered from reading the code myself. 
I would appreciate feedback from an HBase person. 




> upgrade HBase version for first merge
> -------------------------------------
>
>                 Key: YARN-5070
>                 URL: https://issues.apache.org/jira/browse/YARN-5070
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Sangjin Lee
>            Assignee: Vrushali C
>            Priority: Critical
>              Labels: yarn-2928-1st-milestone
>         Attachments: YARN-5070-YARN-2928.01.patch, 
> YARN-5070-YARN-2928.02.patch, YARN-5070-YARN-2928.03.patch, 
> YARN-5070-YARN-2928.04.patch
>
>
> Currently we set the HBase version for the timeline service storage to 1.0.1. 
> This is a fairly old version, and there are reasons to upgrade to a newer 
> version. We should upgrade it.


