[ 
https://issues.apache.org/jira/browse/DRILL-5420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16066939#comment-16066939
 ] 

ASF GitHub Bot commented on DRILL-5420:
---------------------------------------

GitHub user kkhatua opened a pull request:

    https://github.com/apache/drill/pull/862

    DRILL-5420: ParquetAsyncPgReader goes into infinite loop during cleanup

    PageQueue is cleaned up using poll() instead of take(), which constantly 
gets interrupted and causes CPU churn.
    During a columnReader shutdown, a flag is set so as to block any new page 
reading tasks from being submitted, before the queues are finally cleared and 
memory occupied by the pages released.
    More details are in this JIRA comment : 
https://issues.apache.org/jira/browse/DRILL-5420?focusedCommentId=16066933&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16066933

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kkhatua/drill DRILL-5420

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/862.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #862
    
----
commit 2f233d4b1318e29211856877937ef9988c34ffaf
Author: Kunal Khatua <kkha...@maprtech.com>
Date:   2017-06-28T06:35:34Z

    DRILL-5420: ParquetAsyncPgReader goes into infinite loop during cleanup
    
    PageQueue is cleaned up using poll() instead of take(), which constantly 
gets interrupted and causes CPU churn.
    During a columnReader shutdown, a flag is set so as to block any new page 
reading tasks from being submitted.

----


> all cores at 100% of all servers
> --------------------------------
>
>                 Key: DRILL-5420
>                 URL: https://issues.apache.org/jira/browse/DRILL-5420
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.10.0
>         Environment: linux, cluster with 5 servers over hdfs/parquet
>            Reporter: Hugo Bellomusto
>            Assignee: Kunal Khatua
>         Attachments: 2709a36d-804a-261a-64e5-afa271e782f8.json
>
>
> We have a drill cluster with five servers over hdfs/parquet.
> Each machine have 8 cores. All cores get at 100% of use.
> Each thread is looping in the while in line 314 in AsyncPageReader.java 
> inside clear() method.
> https://github.com/apache/drill/blob/1.10.0/exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/AsyncPageReader.java#L314
> jstack -l 19255|grep -A 50 $(printf "%x" 29250)
> "271d6262-ff19-ad24-af36-777bfe6c6375:frag:1:4" daemon prio=10 
> tid=0x00007f5b2adec800 nid=0x7242 runnable [0x00007f5aa33e8000]
>    java.lang.Thread.State: RUNNABLE
>       at java.lang.Throwable.fillInStackTrace(Native Method)
>       at java.lang.Throwable.fillInStackTrace(Throwable.java:783)
>       - locked <0x00000007374bfcb0> (a java.lang.InterruptedException)
>       at java.lang.Throwable.<init>(Throwable.java:250)
>       at java.lang.Exception.<init>(Exception.java:54)
>       at java.lang.InterruptedException.<init>(InterruptedException.java:57)
>       at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1219)
>       at 
> java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:340)
>       at 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:439)
>       at 
> org.apache.drill.exec.store.parquet.columnreaders.AsyncPageReader.clear(AsyncPageReader.java:317)
>       at 
> org.apache.drill.exec.store.parquet.columnreaders.ColumnReader.clear(ColumnReader.java:140)
>       at 
> org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.close(ParquetRecordReader.java:632)
>       at 
> org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:183)
>       at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119)
>       at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109)
>       at 
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
>       at 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:135)
>       at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162)
>       at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119)
>       at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109)
>       at 
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
>       at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162)
>       at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119)
>       at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109)
>       at 
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
>       at 
> org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext(RemovingRecordBatch.java:93)
>       at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162)
>       at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119)
>       at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109)
>       at 
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
>       at 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:135)
>       at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162)
>       at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119)
>       at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109)
>       at 
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
>       at 
> org.apache.drill.exec.physical.impl.limit.LimitRecordBatch.innerNext(LimitRecordBatch.java:115)
>       at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162)
>       at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119)
>       at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109)
>       at 
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
>       at 
> org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext(RemovingRecordBatch.java:93)
>       at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162)
>       at 
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:104)
>       at 
> org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext(SingleSenderCreator.java:92)
>       at 
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:94)
>       at 
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:232)
>       at 
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:226)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:415)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Reply via email to