[ 
https://issues.apache.org/jira/browse/DRILL-6039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16514293#comment-16514293
 ] 

Krystal commented on DRILL-6039:
--------------------------------

Running "drillbit.sh graceful_stop" from the command line while a query is 
reading parquet files still fails: the drillbit does not wait for fragments to 
finish.  Interestingly, the problem does not occur when the drillbit is shut 
down from the WebUI.  The drillbit.log does not show any memory leaks.  Here is 
the stack trace:
{code:java}
Error: SYSTEM ERROR: IOException: Filesystem closed

Fragment 1:10

(org.apache.drill.common.exceptions.DrillRuntimeException) Error in parquet record reader.
Message: Failure in setting up reader
Parquet Metadata: ParquetMetaData{FileMetaData{schema: message root {
  optional int64 l_orderkey;
  optional int64 l_partkey;
  optional int64 l_suppkey;
  optional int32 l_linenumber;
  optional double l_quantity;
  optional double l_extendedprice;
  optional double l_discount;
  optional double l_tax;
  optional binary l_returnflag (UTF8);
  optional binary l_linestatus (UTF8);
  optional int32 l_shipdate (DATE);
  optional int32 l_commitdate (DATE);
  optional int32 l_receiptdate (DATE);
  optional binary l_shipinstruct (UTF8);
  optional binary l_shipmode (UTF8);
  optional binary l_comment (UTF8);
}
, metadata: {drill.version=1.7.0-SNAPSHOT}}, blocks: [BlockMetaData{9785551, 
1338587809 [ColumnMetaData{SNAPPY [l_orderkey] INT64  [PLAIN, BIT_PACKED, RLE], 
4}, ColumnMetaData{SNAPPY [l_partkey] INT64  [PLAIN, BIT_PACKED, RLE], 
15273019}, ColumnMetaData{SNAPPY [l_suppkey] INT64  [PLAIN, BIT_PACKED, RLE], 
73277460}, ColumnMetaData{SNAPPY [l_linenumber] INT32  [PLAIN, BIT_PACKED, 
RLE], 124321400}, ColumnMetaData{SNAPPY [l_quantity] DOUBLE  [PLAIN, 
BIT_PACKED, RLE], 132087986}, ColumnMetaData{SNAPPY [l_extendedprice] DOUBLE  
[PLAIN, BIT_PACKED, RLE], 151838465}, ColumnMetaData{SNAPPY [l_discount] DOUBLE 
 [PLAIN, BIT_PACKED, RLE], 208270450}, ColumnMetaData{SNAPPY [l_tax] DOUBLE  
[PLAIN, BIT_PACKED, RLE], 227351535}, ColumnMetaData{SNAPPY [l_returnflag] 
BINARY  [PLAIN, BIT_PACKED, RLE], 245574230}, ColumnMetaData{SNAPPY 
[l_linestatus] BINARY  [PLAIN, BIT_PACKED, RLE], 254814472}, 
ColumnMetaData{SNAPPY [l_shipdate] INT32  [PLAIN, BIT_PACKED, RLE], 260500185}, 
ColumnMetaData{SNAPPY [l_commitdate] INT32  [PLAIN, BIT_PACKED, RLE], 
290097700}, ColumnMetaData{SNAPPY [l_receiptdate] INT32  [PLAIN, BIT_PACKED, 
RLE], 319358270}, ColumnMetaData{SNAPPY [l_shipinstruct] BINARY  [PLAIN, 
BIT_PACKED, RLE], 348982057}, ColumnMetaData{SNAPPY [l_shipmode] BINARY  
[PLAIN, BIT_PACKED, RLE], 370125048}, ColumnMetaData{SNAPPY [l_comment] BINARY  
[PLAIN, BIT_PACKED, RLE], 392116052}]}]}
    org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.handleException():316
    org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.setup():300
    org.apache.drill.exec.physical.impl.ScanBatch.getNextReaderIfHas():335
    org.apache.drill.exec.physical.impl.ScanBatch.internalNext():222
    org.apache.drill.exec.physical.impl.ScanBatch.next():274
    org.apache.drill.exec.record.AbstractRecordBatch.next():119
    org.apache.drill.exec.record.AbstractRecordBatch.next():109
    org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
    org.apache.drill.exec.record.AbstractRecordBatch.next():164
    org.apache.drill.exec.record.AbstractRecordBatch.next():119
    org.apache.drill.exec.record.AbstractRecordBatch.next():109
    org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
    org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext():80
    org.apache.drill.exec.record.AbstractRecordBatch.next():164
    org.apache.drill.exec.record.AbstractRecordBatch.next():119
    org.apache.drill.exec.record.AbstractRecordBatch.next():109
    org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
    org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():134
    org.apache.drill.exec.record.AbstractRecordBatch.next():164
    org.apache.drill.exec.record.AbstractRecordBatch.next():119
    org.apache.drill.exec.test.generated.StreamingAggregatorGen42.doWork():187
    org.apache.drill.exec.physical.impl.aggregate.StreamingAggBatch.innerNext():194
    org.apache.drill.exec.record.AbstractRecordBatch.next():164
    org.apache.drill.exec.physical.impl.BaseRootExec.next():105
    org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext():93
    org.apache.drill.exec.physical.impl.BaseRootExec.next():95
    org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():233
    org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():226
    java.security.AccessController.doPrivileged():-2
    javax.security.auth.Subject.doAs():422
    org.apache.hadoop.security.UserGroupInformation.doAs():1633
    org.apache.drill.exec.work.fragment.FragmentExecutor.run():226
    org.apache.drill.common.SelfCleaningRunnable.run():38
    java.util.concurrent.ThreadPoolExecutor.runWorker():1149
    java.util.concurrent.ThreadPoolExecutor$Worker.run():624
    java.lang.Thread.run():748
  Caused By (org.apache.drill.common.exceptions.ExecutionSetupException) Error opening or reading metadata for parquet file at location: 1_210_2.parquet
    org.apache.drill.exec.store.parquet.columnreaders.PageReader.<init>():176
    org.apache.drill.exec.store.parquet.columnreaders.AsyncPageReader.<init>():96
    org.apache.drill.exec.store.parquet.columnreaders.ColumnReader.<init>():97
    org.apache.drill.exec.store.parquet.columnreaders.NullableColumnReader.<init>():43
    org.apache.drill.exec.store.parquet.columnreaders.NullableFixedByteAlignedReaders$NullableFixedByteAlignedReader.<init>():59
    org.apache.drill.exec.store.parquet.columnreaders.ColumnReaderFactory.getNullableColumnReader():270
    org.apache.drill.exec.store.parquet.columnreaders.ColumnReaderFactory.createFixedColumnReader():202
    org.apache.drill.exec.store.parquet.columnreaders.ParquetColumnMetadata.makeFixedWidthReader():140
    org.apache.drill.exec.store.parquet.columnreaders.ReadState.buildReader():123
    org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.setup():298
    org.apache.drill.exec.physical.impl.ScanBatch.getNextReaderIfHas():335
    org.apache.drill.exec.physical.impl.ScanBatch.internalNext():222
    org.apache.drill.exec.physical.impl.ScanBatch.next():274
    org.apache.drill.exec.record.AbstractRecordBatch.next():119
    org.apache.drill.exec.record.AbstractRecordBatch.next():109
    org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
    org.apache.drill.exec.record.AbstractRecordBatch.next():164
    org.apache.drill.exec.record.AbstractRecordBatch.next():119
    org.apache.drill.exec.record.AbstractRecordBatch.next():109
    org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
    org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext():80
    org.apache.drill.exec.record.AbstractRecordBatch.next():164
    org.apache.drill.exec.record.AbstractRecordBatch.next():119
    org.apache.drill.exec.record.AbstractRecordBatch.next():109
    org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
    org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():134
    org.apache.drill.exec.record.AbstractRecordBatch.next():164
    org.apache.drill.exec.record.AbstractRecordBatch.next():119
    org.apache.drill.exec.test.generated.StreamingAggregatorGen42.doWork():187
    org.apache.drill.exec.physical.impl.aggregate.StreamingAggBatch.innerNext():194
    org.apache.drill.exec.record.AbstractRecordBatch.next():164
    org.apache.drill.exec.physical.impl.BaseRootExec.next():105
    org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext():93
    org.apache.drill.exec.physical.impl.BaseRootExec.next():95
    org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():233
    org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():226
    java.security.AccessController.doPrivileged():-2
    javax.security.auth.Subject.doAs():422
    org.apache.hadoop.security.UserGroupInformation.doAs():1633
    org.apache.drill.exec.work.fragment.FragmentExecutor.run():226
    org.apache.drill.common.SelfCleaningRunnable.run():38
    java.util.concurrent.ThreadPoolExecutor.runWorker():1149
    java.util.concurrent.ThreadPoolExecutor$Worker.run():624
    java.lang.Thread.run():748
  Caused By (java.io.IOException) Filesystem closed
    com.mapr.fs.MapRFileSystem.checkOpen():1638
    com.mapr.fs.MapRFileSystem.lookupClient():613
    com.mapr.fs.MapRFileSystem.lookupClient():696
    com.mapr.fs.MapRFileSystem.open():1000
    org.apache.hadoop.fs.FileSystem.open():807
    org.apache.drill.exec.store.dfs.DrillFileSystem.open():148
    org.apache.drill.exec.store.parquet.columnreaders.PageReader.<init>():139
    org.apache.drill.exec.store.parquet.columnreaders.AsyncPageReader.<init>():96
    org.apache.drill.exec.store.parquet.columnreaders.ColumnReader.<init>():97
    org.apache.drill.exec.store.parquet.columnreaders.NullableColumnReader.<init>():43
    org.apache.drill.exec.store.parquet.columnreaders.NullableFixedByteAlignedReaders$NullableFixedByteAlignedReader.<init>():59
    org.apache.drill.exec.store.parquet.columnreaders.ColumnReaderFactory.getNullableColumnReader():270
    org.apache.drill.exec.store.parquet.columnreaders.ColumnReaderFactory.createFixedColumnReader():202
    org.apache.drill.exec.store.parquet.columnreaders.ParquetColumnMetadata.makeFixedWidthReader():140
    org.apache.drill.exec.store.parquet.columnreaders.ReadState.buildReader():123
    org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.setup():298
    org.apache.drill.exec.physical.impl.ScanBatch.getNextReaderIfHas():335
    org.apache.drill.exec.physical.impl.ScanBatch.internalNext():222
    org.apache.drill.exec.physical.impl.ScanBatch.next():274
    org.apache.drill.exec.record.AbstractRecordBatch.next():119
    org.apache.drill.exec.record.AbstractRecordBatch.next():109
    org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
    org.apache.drill.exec.record.AbstractRecordBatch.next():164
    org.apache.drill.exec.record.AbstractRecordBatch.next():119
    org.apache.drill.exec.record.AbstractRecordBatch.next():109
    org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
    org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext():80
    org.apache.drill.exec.record.AbstractRecordBatch.next():164
    org.apache.drill.exec.record.AbstractRecordBatch.next():119
    org.apache.drill.exec.record.AbstractRecordBatch.next():109
    org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
    org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():134
    org.apache.drill.exec.record.AbstractRecordBatch.next():164
    org.apache.drill.exec.record.AbstractRecordBatch.next():119
    org.apache.drill.exec.test.generated.StreamingAggregatorGen42.doWork():187
    org.apache.drill.exec.physical.impl.aggregate.StreamingAggBatch.innerNext():194
    org.apache.drill.exec.record.AbstractRecordBatch.next():164
    org.apache.drill.exec.physical.impl.BaseRootExec.next():105
    org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext():93
    org.apache.drill.exec.physical.impl.BaseRootExec.next():95
    org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():233
    org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():226
    java.security.AccessController.doPrivileged():-2
    javax.security.auth.Subject.doAs():422
    org.apache.hadoop.security.UserGroupInformation.doAs():1633
    org.apache.drill.exec.work.fragment.FragmentExecutor.run():226
    org.apache.drill.common.SelfCleaningRunnable.run():38
    java.util.concurrent.ThreadPoolExecutor.runWorker():1149
    java.util.concurrent.ThreadPoolExecutor$Worker.run():624
    java.lang.Thread.run():748 (state=,code=0){code}
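
The trace above shows a fragment trying to open a parquet file after the 
filesystem handle had already been closed.  One plausible reading is that 
shutdown closed a shared resource before running fragments had finished.  The 
sketch below is not Drill code and is not a confirmed diagnosis of this issue; 
it only illustrates that ordering problem with the standard Java executor API.  
The names GracefulStopSketch, SharedFs, and run are hypothetical.

```java
import java.io.IOException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class GracefulStopSketch {

    /** Hypothetical stand-in for a shared filesystem handle (not a Drill or Hadoop class). */
    static final class SharedFs {
        private final AtomicBoolean open = new AtomicBoolean(true);

        void read() throws IOException {
            if (!open.get()) {
                throw new IOException("Filesystem closed");
            }
        }

        void close() {
            open.set(false);
        }
    }

    /**
     * Submits `fragments` tasks that each use the shared handle, then shuts down.
     * Returns how many tasks failed with an IOException.
     */
    static int run(int fragments, boolean waitForFragments) throws InterruptedException {
        SharedFs fs = new SharedFs();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        AtomicInteger failures = new AtomicInteger();

        for (int i = 0; i < fragments; i++) {
            pool.submit(() -> {
                try {
                    Thread.sleep(50); // simulate work done before the next file open
                    fs.read();
                } catch (IOException e) {
                    failures.incrementAndGet();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        pool.shutdown(); // reject new tasks; already-submitted tasks keep running
        if (waitForFragments) {
            // Graceful order: drain running work first, then release the resource.
            pool.awaitTermination(10, TimeUnit.SECONDS);
            fs.close();
        } else {
            // Problematic order: the resource is closed while tasks still need it.
            fs.close();
            pool.awaitTermination(10, TimeUnit.SECONDS);
        }
        return failures.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("failures when waiting for fragments: " + run(8, true));
        System.out.println("failures when closing first: " + run(8, false));
    }
}
```

In this sketch the only difference between the two runs is whether the handle 
is closed before or after awaitTermination returns.  Whether the command-line 
path and the WebUI path in Drill differ in a comparable way is not something 
this comment establishes.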

> drillbit.sh graceful_stop does not wait for fragments to complete before 
> stopping the drillbit
> ----------------------------------------------------------------------------------------------
>
>                 Key: DRILL-6039
>                 URL: https://issues.apache.org/jira/browse/DRILL-6039
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Flow
>    Affects Versions: 1.3.0
>            Reporter: Krystal
>            Assignee: Venkata Jyothsna Donapati
>            Priority: Major
>             Fix For: 1.14.0
>
>
> git.commit.id.abbrev=eb0c403
> I have a 3-node cluster with a drillbit running on each node.  I kicked off a 
> long-running query.  In the middle of the query, I ran "./drillbit.sh 
> graceful_stop" on one of the non-foreman nodes.  The node was stopped within a 
> few seconds and the query failed with the error:
> Error: SYSTEM ERROR: IOException: Filesystem closed
> Fragment 4:15



