[ https://issues.apache.org/jira/browse/DRILL-6039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16514293#comment-16514293 ]
Krystal commented on DRILL-6039:
--------------------------------

Running "drillbit.sh graceful_stop" from the command line against parquet files still fails: it does not wait for fragments to finish. Interestingly, the problem does not occur when shutting down the drillbit from the WebUI. The drillbit.log does not show any memory leaks. Here is the stack trace:

{code:java}
Error: SYSTEM ERROR: IOException: Filesystem closed

Fragment 1:10

(org.apache.drill.common.exceptions.DrillRuntimeException) Error in parquet record reader.
Message: Failure in setting up reader
Parquet Metadata: ParquetMetaData{FileMetaData{schema: message root {
  optional int64 l_orderkey;
  optional int64 l_partkey;
  optional int64 l_suppkey;
  optional int32 l_linenumber;
  optional double l_quantity;
  optional double l_extendedprice;
  optional double l_discount;
  optional double l_tax;
  optional binary l_returnflag (UTF8);
  optional binary l_linestatus (UTF8);
  optional int32 l_shipdate (DATE);
  optional int32 l_commitdate (DATE);
  optional int32 l_receiptdate (DATE);
  optional binary l_shipinstruct (UTF8);
  optional binary l_shipmode (UTF8);
  optional binary l_comment (UTF8);
}, metadata: {drill.version=1.7.0-SNAPSHOT}}, blocks: [BlockMetaData{9785551, 1338587809 [
  ColumnMetaData{SNAPPY [l_orderkey] INT64 [PLAIN, BIT_PACKED, RLE], 4},
  ColumnMetaData{SNAPPY [l_partkey] INT64 [PLAIN, BIT_PACKED, RLE], 15273019},
  ColumnMetaData{SNAPPY [l_suppkey] INT64 [PLAIN, BIT_PACKED, RLE], 73277460},
  ColumnMetaData{SNAPPY [l_linenumber] INT32 [PLAIN, BIT_PACKED, RLE], 124321400},
  ColumnMetaData{SNAPPY [l_quantity] DOUBLE [PLAIN, BIT_PACKED, RLE], 132087986},
  ColumnMetaData{SNAPPY [l_extendedprice] DOUBLE [PLAIN, BIT_PACKED, RLE], 151838465},
  ColumnMetaData{SNAPPY [l_discount] DOUBLE [PLAIN, BIT_PACKED, RLE], 208270450},
  ColumnMetaData{SNAPPY [l_tax] DOUBLE [PLAIN, BIT_PACKED, RLE], 227351535},
  ColumnMetaData{SNAPPY [l_returnflag] BINARY [PLAIN, BIT_PACKED, RLE], 245574230},
  ColumnMetaData{SNAPPY [l_linestatus] BINARY [PLAIN, BIT_PACKED, RLE], 254814472},
  ColumnMetaData{SNAPPY [l_shipdate] INT32 [PLAIN, BIT_PACKED, RLE], 260500185},
  ColumnMetaData{SNAPPY [l_commitdate] INT32 [PLAIN, BIT_PACKED, RLE], 290097700},
  ColumnMetaData{SNAPPY [l_receiptdate] INT32 [PLAIN, BIT_PACKED, RLE], 319358270},
  ColumnMetaData{SNAPPY [l_shipinstruct] BINARY [PLAIN, BIT_PACKED, RLE], 348982057},
  ColumnMetaData{SNAPPY [l_shipmode] BINARY [PLAIN, BIT_PACKED, RLE], 370125048},
  ColumnMetaData{SNAPPY [l_comment] BINARY [PLAIN, BIT_PACKED, RLE], 392116052}]}]}
org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.handleException():316
org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.setup():300
org.apache.drill.exec.physical.impl.ScanBatch.getNextReaderIfHas():335
org.apache.drill.exec.physical.impl.ScanBatch.internalNext():222
org.apache.drill.exec.physical.impl.ScanBatch.next():274
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
org.apache.drill.exec.record.AbstractRecordBatch.next():164
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext():80
org.apache.drill.exec.record.AbstractRecordBatch.next():164
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():134
org.apache.drill.exec.record.AbstractRecordBatch.next():164
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.test.generated.StreamingAggregatorGen42.doWork():187
org.apache.drill.exec.physical.impl.aggregate.StreamingAggBatch.innerNext():194
org.apache.drill.exec.record.AbstractRecordBatch.next():164
org.apache.drill.exec.physical.impl.BaseRootExec.next():105
org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext():93
org.apache.drill.exec.physical.impl.BaseRootExec.next():95
org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():233
org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():226
java.security.AccessController.doPrivileged():-2
javax.security.auth.Subject.doAs():422
org.apache.hadoop.security.UserGroupInformation.doAs():1633
org.apache.drill.exec.work.fragment.FragmentExecutor.run():226
org.apache.drill.common.SelfCleaningRunnable.run():38
java.util.concurrent.ThreadPoolExecutor.runWorker():1149
java.util.concurrent.ThreadPoolExecutor$Worker.run():624
java.lang.Thread.run():748

Caused By (org.apache.drill.common.exceptions.ExecutionSetupException) Error opening or reading metadata for parquet file at location: 1_210_2.parquet
org.apache.drill.exec.store.parquet.columnreaders.PageReader.<init>():176
org.apache.drill.exec.store.parquet.columnreaders.AsyncPageReader.<init>():96
org.apache.drill.exec.store.parquet.columnreaders.ColumnReader.<init>():97
org.apache.drill.exec.store.parquet.columnreaders.NullableColumnReader.<init>():43
org.apache.drill.exec.store.parquet.columnreaders.NullableFixedByteAlignedReaders$NullableFixedByteAlignedReader.<init>():59
org.apache.drill.exec.store.parquet.columnreaders.ColumnReaderFactory.getNullableColumnReader():270
org.apache.drill.exec.store.parquet.columnreaders.ColumnReaderFactory.createFixedColumnReader():202
org.apache.drill.exec.store.parquet.columnreaders.ParquetColumnMetadata.makeFixedWidthReader():140
org.apache.drill.exec.store.parquet.columnreaders.ReadState.buildReader():123
org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.setup():298
org.apache.drill.exec.physical.impl.ScanBatch.getNextReaderIfHas():335
org.apache.drill.exec.physical.impl.ScanBatch.internalNext():222
org.apache.drill.exec.physical.impl.ScanBatch.next():274
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
org.apache.drill.exec.record.AbstractRecordBatch.next():164
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext():80
org.apache.drill.exec.record.AbstractRecordBatch.next():164
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():134
org.apache.drill.exec.record.AbstractRecordBatch.next():164
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.test.generated.StreamingAggregatorGen42.doWork():187
org.apache.drill.exec.physical.impl.aggregate.StreamingAggBatch.innerNext():194
org.apache.drill.exec.record.AbstractRecordBatch.next():164
org.apache.drill.exec.physical.impl.BaseRootExec.next():105
org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext():93
org.apache.drill.exec.physical.impl.BaseRootExec.next():95
org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():233
org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():226
java.security.AccessController.doPrivileged():-2
javax.security.auth.Subject.doAs():422
org.apache.hadoop.security.UserGroupInformation.doAs():1633
org.apache.drill.exec.work.fragment.FragmentExecutor.run():226
org.apache.drill.common.SelfCleaningRunnable.run():38
java.util.concurrent.ThreadPoolExecutor.runWorker():1149
java.util.concurrent.ThreadPoolExecutor$Worker.run():624
java.lang.Thread.run():748

Caused By (java.io.IOException) Filesystem closed
com.mapr.fs.MapRFileSystem.checkOpen():1638
com.mapr.fs.MapRFileSystem.lookupClient():613
com.mapr.fs.MapRFileSystem.lookupClient():696
com.mapr.fs.MapRFileSystem.open():1000
org.apache.hadoop.fs.FileSystem.open():807
org.apache.drill.exec.store.dfs.DrillFileSystem.open():148
org.apache.drill.exec.store.parquet.columnreaders.PageReader.<init>():139
org.apache.drill.exec.store.parquet.columnreaders.AsyncPageReader.<init>():96
org.apache.drill.exec.store.parquet.columnreaders.ColumnReader.<init>():97
org.apache.drill.exec.store.parquet.columnreaders.NullableColumnReader.<init>():43
org.apache.drill.exec.store.parquet.columnreaders.NullableFixedByteAlignedReaders$NullableFixedByteAlignedReader.<init>():59
org.apache.drill.exec.store.parquet.columnreaders.ColumnReaderFactory.getNullableColumnReader():270
org.apache.drill.exec.store.parquet.columnreaders.ColumnReaderFactory.createFixedColumnReader():202
org.apache.drill.exec.store.parquet.columnreaders.ParquetColumnMetadata.makeFixedWidthReader():140
org.apache.drill.exec.store.parquet.columnreaders.ReadState.buildReader():123
org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.setup():298
org.apache.drill.exec.physical.impl.ScanBatch.getNextReaderIfHas():335
org.apache.drill.exec.physical.impl.ScanBatch.internalNext():222
org.apache.drill.exec.physical.impl.ScanBatch.next():274
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
org.apache.drill.exec.record.AbstractRecordBatch.next():164
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext():80
org.apache.drill.exec.record.AbstractRecordBatch.next():164
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():134
org.apache.drill.exec.record.AbstractRecordBatch.next():164
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.test.generated.StreamingAggregatorGen42.doWork():187
org.apache.drill.exec.physical.impl.aggregate.StreamingAggBatch.innerNext():194
org.apache.drill.exec.record.AbstractRecordBatch.next():164
org.apache.drill.exec.physical.impl.BaseRootExec.next():105
org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext():93
org.apache.drill.exec.physical.impl.BaseRootExec.next():95
org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():233
org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():226
java.security.AccessController.doPrivileged():-2
javax.security.auth.Subject.doAs():422
org.apache.hadoop.security.UserGroupInformation.doAs():1633
org.apache.drill.exec.work.fragment.FragmentExecutor.run():226
org.apache.drill.common.SelfCleaningRunnable.run():38
java.util.concurrent.ThreadPoolExecutor.runWorker():1149
java.util.concurrent.ThreadPoolExecutor$Worker.run():624
java.lang.Thread.run():748
(state=,code=0)
{code}

> drillbit.sh graceful_stop does not wait for fragments to complete before
> stopping the drillbit
> ----------------------------------------------------------------------------------------------
>
>                 Key: DRILL-6039
>                 URL: https://issues.apache.org/jira/browse/DRILL-6039
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Flow
>    Affects Versions: 1.3.0
>            Reporter: Krystal
>            Assignee: Venkata Jyothsna Donapati
>            Priority: Major
>             Fix For: 1.14.0
>
>
> git.commit.id.abbrev=eb0c403
>
> I have a 3-node cluster with drillbits running on each node. I kicked off a
> long-running query. In the middle of the query, I ran "./drillbit.sh
> graceful_stop" on one of the non-foreman nodes. The node stopped within a
> few seconds and the query failed with error:
>
> Error: SYSTEM ERROR: IOException: Filesystem closed
> Fragment 4:15

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
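For readers following along: the innermost "Caused By" above (MapRFileSystem.checkOpen throwing "Filesystem closed" under DrillFileSystem.open) indicates that shutdown closed a shared filesystem client while a scan fragment was still setting up its reader. The following is a minimal illustrative sketch of that race in plain Python, not Drill code; the class and method names are invented for illustration:

```python
import threading

class SharedFileSystem:
    """Toy stand-in for a process-wide filesystem client (hypothetical analogue
    of the shared handle behind DrillFileSystem)."""
    def __init__(self):
        self._open = True
        self._lock = threading.Lock()

    def close(self):
        # Shutdown path: invalidates the handle for all users.
        with self._lock:
            self._open = False

    def open_file(self, path):
        with self._lock:
            if not self._open:
                # Mirrors a closed-client check raising "Filesystem closed".
                raise IOError("Filesystem closed")
            return "reader for " + path

fs = SharedFileSystem()
errors = []

def fragment_setup():
    # A scan fragment setting up its parquet reader after shutdown began.
    try:
        fs.open_file("1_210_2.parquet")
    except IOError as e:
        errors.append(str(e))

# The handle is closed before the fragment finishes setup -- the ordering a
# graceful stop is supposed to prevent by draining fragments first.
fs.close()
worker = threading.Thread(target=fragment_setup)
worker.start()
worker.join()
print(errors)  # -> ['Filesystem closed']
```

Under this model, a correct graceful stop would join all fragment threads before calling fs.close(), which matches the observed difference: the WebUI path drains fragments first, while the command-line path apparently does not.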