Yes it did, the problem is gone. Thanks! I will share the details I have on a JIRA ticket now.
On Tue, Apr 11, 2017 at 9:22 PM, Kunal Khatua <[email protected]> wrote:

> Did this help resolve the memory leak, Francois?
>
> Could you share the stack trace and other relevant logs on a JIRA?
>
> Thanks
> Kunal
>
> ________________________________
> From: Kunal Khatua <[email protected]>
> Sent: Wednesday, April 5, 2017 2:03:19 PM
> To: [email protected]
> Subject: Re: Memory was Leaked error when using "limit" in 1.10
>
> Hi Francois,
>
> Could you try those queries with the AsyncPageReader turned off?
>
> alter <session|system> set `store.parquet.reader.pagereader.async`=false;
>
> This feature is enabled for Drill 1.9+; however, Drill 1.10 carried some
> performance-related improvements to it.
>
> If the problem goes away, could you file a JIRA and share the sample
> query and data to allow us a repro?
>
> Thanks
> Kunal
>
> ________________________________
> From: François Méthot <[email protected]>
> Sent: Wednesday, April 5, 2017 1:39:38 PM
> To: [email protected]
> Subject: Memory was Leaked error when using "limit" in 1.10
>
> Hi,
>
> I am still investigating this problem, but I will describe the symptoms
> in case there is a known issue with Drill 1.10.
>
> We migrated our production system from Drill 1.9 to 1.10 just five days
> ago (220-node cluster).
>
> Our logs show that some 900+ queries ran without problem in the first
> four days (similar queries, none of which used the `limit` clause).
>
> Yesterday we started running simple ad hoc "select * ... limit 10"
> queries (as we often do; it was our first use of `limit` with 1.10) and
> we got the "Memory was leaked" exception below.
>
> Also, once we get the error, most subsequent user queries fail with a
> ChannelClosedException, and we need to restart Drill to bring it back
> to normal.
>
> A day later I ran a similar "select * limit 10" query, the same thing
> happened, and we had to restart Drill again.
>
> The exception referred to a file (1_0_0.parquet). I moved that file to
> a smaller test cluster (12 nodes) and got the error on the first
> attempt, but I am no longer able to reproduce the issue with that file.
> Between the 12-node and 220-node clusters, a different Column name and
> Row Group Start were listed in the error. The parquet file was
> generated by Drill 1.10.
>
> I tried the same file with a local drill-embedded 1.9 and 1.10 and had
> no issue.
>
> Here is the error (manually typed); if you think of anything obvious,
> let us know.
>
> AsyncPageReader - User Error Occurred: Exception occurred while reading
> from disk (can not read class o.a.parquet.format.PageHeader:
> java.io.IOException: input stream is closed.)
>
> File: ..../1_0_0.parquet
> Column: StringColXYZ
> Row Group Start: 115215476
>
> [Error Id: ....]
> at o.a.d.common.exceptions.UserException (UserException.java:544)
> at o.a.d.exec.store.parquet.columnreaders.AsyncPageReader.handleAndThrowException(AsyncPageReader.java:199)
> at o.a.d.exec.store.parquet.columnreaders.AsyncPageReader.access(AsyncPageReader.java:81)
> at o.a.d.exec.store.parquet.columnreaders.AsyncPageReader.AsyncPageReaderTask.call(AsyncPageReader.java:483)
> at o.a.d.exec.store.parquet.columnreaders.AsyncPageReader.AsyncPageReaderTask.call(AsyncPageReader.java:392)
> ...
> Caused by: java.io.IOException: can not read class
> org.apache.parquet.format.PageHeader: java.io.IOException: Input stream
> is closed.
> at o.a.parquet.format.Util.read(Util.java:216)
> at o.a.parquet.format.Util.readPageHeader(Util.java:65)
> at o.a.drill.exec.store.parquet.columnreaders.AsyncPageReader(AsyncPageReaderTask:430)
> Caused by: parquet.org.apache.thrift.transport.TTransportException:
> Input stream is closed
> at ...read(TIOStreamTransport.java:129)
> at ....TTransport.readAll(TTransport.java:84)
> at ....TCompactProtocol.readByte(TCompactProtocol.java:474)
> at ....TCompactProtocol.readFieldBegin(TCompactProtocol.java:481)
> at ....InterningProtocol.readFieldBegin(InterningProtocol.java:158)
> at ....o.a.parquet.format.PageHeader.read(PageHeader.java:828)
> at ....o.a.parquet.format.Util.read(Util.java:213)
>
> Fragment 0:0
> [Error Id: ...]
> o.a.drill.common.exceptions.UserException: SYSTEM ERROR:
> IllegalStateException: Memory was leaked by query. Memory leaked:
> (524288)
> Allocator(op:0:0:4:ParquetRowGroupScan) 1000000/524288/39919616/10000000000
> at o.a.d.common.exceptions.UserException (UserException.java:544)
> at o.a.d.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:293)
> at o.a.d.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:160)
> at o.a.d.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:262)
> ...
> Caused by: IllegalStateException: Memory was leaked by query. Memory
> leaked: (524288)
> at o.a.d.exec.memory.BaseAllocator.close(BaseAllocator.java:502)
> at o.a.d.exec.ops.OperatorContextImpl(OperatorContextImpl.java:149)
> at o.a.d.exec.ops.FragmentContext.suppressingClose(FragmentContext.java:422)
> at o.a.d.exec.ops.FragmentContext.close(FragmentContext.java:411)
> at o.a.d.exec.work.fragment.FragmentExecutor.closeOutResources(FragmentExecutor.java:318)
> at o.a.d.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:155)
>
> Francois
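
For anyone landing on this thread with the same symptom, here is the
workaround Kunal suggests above, written out concretely. ALTER SESSION
applies only to the current connection, while ALTER SYSTEM changes the
default cluster-wide; the final query against the sys.options system
table is a sketch for verifying the setting took effect (exact column
set may vary slightly by Drill version):

    -- Turn the async Parquet page reader off for this session only
    ALTER SESSION SET `store.parquet.reader.pagereader.async` = false;

    -- Or turn it off cluster-wide until explicitly reset
    ALTER SYSTEM SET `store.parquet.reader.pagereader.async` = false;

    -- Verify the effective value
    SELECT *
    FROM sys.options
    WHERE name = 'store.parquet.reader.pagereader.async';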
