[jira] [Comment Edited] (PHOENIX-2970) SpoolingResultIterator using memory too conservatively, which leads to using temp file unnecessarily

chenglei (JIRA) Thu, 30 Jun 2016 01:37:00 -0700

    [ 
https://issues.apache.org/jira/browse/PHOENIX-2970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15356733#comment-15356733
 ]


chenglei edited comment on PHOENIX-2970 at 6/30/16 8:35 AM:
------------------------------------------------------------

Yes, [~giacomotaylor], in our business system we have already set 
phoenix.query.maxGlobalMemoryPercentage to 100 to avoid using the temp file 
unnecessarily. 

In this issue, I just want to emphasize that SpoolingResultIterator always 
assumes the wrapped ResultIterator will fetch 
"phoenix.query.spoolThresholdBytes" bytes, but most of the time it won't. An 
extreme example is "select count(1) from table", which returns just a single 
count, so SpoolingResultIterator's assumption is unreasonable, or at least too 
conservative and coarse.

I suggest removing the MemoryManager from SpoolingResultIterator, just as 
MappedByteBufferSortedQueue does.
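
The count(1) case can be illustrated with a small stand-alone sketch. It 
assumes the spooling stream switches to a temp file as soon as the bytes 
written would exceed its threshold, which is the behavior expected of a 
threshold-based stream such as DeferredFileOutputStream. SpoolSketch and 
spoolsCountToDisk are hypothetical names used only for illustration; they are 
not Phoenix or commons-io classes.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

/** Hypothetical stand-in for a threshold-based spooling stream (assumption, not the real class). */
final class SpoolSketch extends OutputStream {
    private final int thresholdBytes;
    private final ByteArrayOutputStream memory = new ByteArrayOutputStream();
    private OutputStream current = memory;
    private File spoolFile;   // stays null while all data is still held in memory
    private long written;

    SpoolSketch(int thresholdBytes) {
        this.thresholdBytes = thresholdBytes;
    }

    @Override
    public void write(int b) throws IOException {
        // Switch to a temp file once the next byte would exceed the threshold.
        if (spoolFile == null && written + 1 > thresholdBytes) {
            switchToFile();
        }
        current.write(b);
        written++;
    }

    private void switchToFile() throws IOException {
        spoolFile = File.createTempFile("ResultSpooler", ".bin");
        spoolFile.deleteOnExit();
        FileOutputStream out = new FileOutputStream(spoolFile);
        memory.writeTo(out);   // carry over whatever was buffered so far
        current = out;
    }

    @Override
    public void close() throws IOException {
        current.close();
    }

    boolean isInMemory() {
        return spoolFile == null;
    }

    /** Writes one 8-byte count value and reports whether it ended up on disk. */
    static boolean spoolsCountToDisk(int thresholdBytes) {
        try (SpoolSketch out = new SpoolSketch(thresholdBytes)) {
            out.write(ByteBuffer.allocate(Long.BYTES).putLong(42L).array());
            return !out.isInMemory();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("threshold 0 -> disk: " + spoolsCountToDisk(0));
        System.out.println("threshold 20 MB -> disk: " + spoolsCountToDisk(20 * 1024 * 1024));
    }
}
```

In this sketch a threshold of 0 sends the 8-byte count to a temp file, while a 
20 MB threshold keeps it in memory.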


> SpoolingResultIterator using memory too conservatively, which leads to using 
> temp file unnecessarily
> -----------------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-2970
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2970
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.7.0
>            Reporter: chenglei
>
> Even though SpoolingResultIterator will be deprecated, HBase versions older 
> than 0.98.17 will continue to use it, and I think the 
> DeferredByteBufferSegmentQueue class, which is similar to 
> SpoolingResultIterator, may have the same problem in future versions.
> In its constructor, SpoolingResultIterator tries to allocate at most 
> "phoenix.query.spoolThresholdBytes" bytes from the MemoryManager, and uses the 
> allocated MemoryChunk's size as the DeferredFileOutputStream's threshold, as in 
> the following code:
>        {code:borderStyle=solid}
>         final MemoryChunk chunk = mm.allocate(0, thresholdBytes);
>         long waitTime = System.currentTimeMillis() - startTime;
>         GLOBAL_MEMORY_WAIT_TIME.update(waitTime);
>         memoryMetrics.getMemoryWaitTimeMetric().change(waitTime);
>         DeferredFileOutputStream spoolTo = null;
>         try {
>             // Can't be bigger than int, since it's the max of the above allocation
>             int size = (int)chunk.getSize();
>             spoolTo = new DeferredFileOutputStream(size, "ResultSpooler", ".bin",
>                     new File(spoolDirectory)) {
>                 @Override
>                 protected void thresholdReached() throws IOException {
>                     try {
>                         super.thresholdReached();
>                     } finally {
>                         chunk.close();
>                     }
>                 }
>             };
>        {code} 
>        
> SpoolingResultIterator assumes that the wrapped ResultIterator will always 
> fetch "phoenix.query.spoolThresholdBytes" bytes, but most of the time it won't. 
> For example, if we execute "select count(1) from table" on a big table with 
> many regions, the ScanPlan runs many SpoolingResultIterators in parallel to 
> fetch the result, and each SpoolingResultIterator tries to allocate at most 
> "phoenix.query.spoolThresholdBytes" bytes from the MemoryManager. If there is 
> not much memory available, many SpoolingResultIterators will be allocated 0 
> bytes by the MemoryManager, and the corresponding DeferredFileOutputStream's 
> threshold will be 0, so the DeferredFileOutputStream will unnecessarily use a 
> temp file to hold the results, even if the result is just a single count 
> value. This behavior slows the query down.
>     
>  Can we remove the MemoryManager from SpoolingResultIterator, just as 
> MappedByteBufferSortedQueue does?
>     
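
The allocation pattern in the issue description can be modelled roughly as 
follows. This is only a sketch under stated assumptions: MemoryBudgetSketch is 
a hypothetical stand-in for a shared memory budget, not Phoenix's 
MemoryManager; it grants whatever is left up to the requested maximum, never 
blocks, and assumes every chunk is held while the parallel iterators run. The 
50 MB budget, 20 MB threshold, and 10 iterators are made-up numbers.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of a shared memory budget (assumption, not Phoenix's MemoryManager). */
final class MemoryBudgetSketch {
    private long available;

    MemoryBudgetSketch(long totalBytes) {
        this.available = totalBytes;
    }

    /** Grants as much as is left, up to maxBytes; same call shape as allocate(0, thresholdBytes). */
    long allocate(long minBytes, long maxBytes) {
        if (available < minBytes) {
            throw new IllegalStateException("budget exhausted");
        }
        long granted = Math.min(available, maxBytes);
        available -= granted;
        return granted;
    }

    /** Returns the spool threshold each parallel iterator would end up with. */
    static List<Long> thresholdsFor(int iterators, long budgetBytes, long spoolThresholdBytes) {
        MemoryBudgetSketch mm = new MemoryBudgetSketch(budgetBytes);
        List<Long> thresholds = new ArrayList<>();
        for (int i = 0; i < iterators; i++) {
            // Each iterator asks for the full threshold, even if it will only hold a single count.
            thresholds.add(mm.allocate(0, spoolThresholdBytes));
        }
        return thresholds;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        List<Long> thresholds = thresholdsFor(10, 50 * mb, 20 * mb);
        long zero = thresholds.stream().filter(x -> x == 0).count();
        System.out.println(thresholds);
        System.out.println(zero + " of " + thresholds.size() + " iterators got a 0-byte threshold");
    }
}
```

With these numbers the first two iterators get 20 MB each, the third gets the 
remaining 10 MB, and the other seven get a threshold of 0, so each of those 
seven would spool to a temp file even for a single count value.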



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
