Nothing is immediately coming to mind.  Out of curiosity, does it still
have this problem if you copy the local file back onto HDFS and then query
it?

What version of HDFS are you using?  Is the file something you can share
privately or publicly, or is it too large?
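For what it's worth, the "don't know what type: 13" in your trace comes out
of Thrift's compact protocol: valid field-type ids only run 0-12, so seeing
13 usually means the reader is decoding bytes that aren't actually a
PageHeader (a misaligned or truncated read), rather than the column values
themselves being bad.  A rough Python sketch of the check that's blowing up
(the type id lives in the low nibble of the field-header byte; this is just
an illustration of the rule, not the real Thrift implementation):

```python
# Thrift compact-protocol field types: only ids 0..12 are defined.
# Anything higher means the stream is not positioned at a real struct.
COMPACT_TYPES = {
    0: "STOP", 1: "BOOLEAN_TRUE", 2: "BOOLEAN_FALSE", 3: "BYTE",
    4: "I16", 5: "I32", 6: "I64", 7: "DOUBLE",
    8: "BINARY", 9: "LIST", 10: "SET", 11: "MAP", 12: "STRUCT",
}

def check_type_nibble(header_byte):
    """The low nibble of a compact-protocol field header is the type id."""
    type_id = header_byte & 0x0F
    if type_id not in COMPACT_TYPES:
        # Mirrors the TProtocolException message in the stack trace.
        raise ValueError("don't know what type: %d" % type_id)
    return COMPACT_TYPES[type_id]
```

So a byte like 0x0D fails exactly the way your trace does, which points at
where the read starts rather than what the data contains.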

thanks,
Jacques

On Tue, Jan 6, 2015 at 9:29 PM, Adam Gilmore <dragoncu...@gmail.com> wrote:

> Anyone got any ideas on this one?  I can consistently reproduce the issue
> with HDFS - the minute I get the data off HDFS (to a local drive), it all
> works fine.
>
> Doesn't seem to be a problem with Parquet - more like the HDFS storage
> engine.
>
> On Tue, Jan 6, 2015 at 9:50 AM, Adam Gilmore <dragoncu...@gmail.com> wrote:
>
> > The data is okay, because the exact same Parquet directory is working fine
> > on the local drive, it's just not working when using HDFS.  I tried casting
> > as you said, but that ended up with the exact same problem.
> >
> > On Tue, Jan 6, 2015 at 9:49 AM, MapR <sth...@maprtech.com> wrote:
> >
> >> Please try casting the column data type. Also please verify that all the
> >> column data satisfies your data type.
> >>
> >> Sudhakar Thota
> >> Sent from my iPhone
> >>
> >> > On Jan 5, 2015, at 5:56 AM, Adam Gilmore <dragoncu...@gmail.com> wrote:
> >> >
> >> > The actual stack trace is:
> >> >
> >> > 2015-01-05 13:48:27,356 [2b5569d5-3771-748d-1390-3a8930d02002:frag:1:12] ERROR o.a.drill.exec.ops.FragmentContext - Fragment Context received failure.
> >> > org.apache.drill.common.exceptions.DrillRuntimeException: java.io.IOException: can not read class parquet.format.PageHeader: don't know what type: 13
> >> >         at org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.next(ParquetRecordReader.java:427) ~[drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
> >> >         at org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:158) ~[drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
> >> >         at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118) [drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
> >> >         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:99) [drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
> >> >         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:89) [drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
> >> >         at org.apache.drill.exec.physical.impl.aggregate.StreamingAggBatch.buildSchema(StreamingAggBatch.java:83) [drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
> >> >         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:130) [drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
> >> >         at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118) [drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
> >> >         at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:67) [drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
> >> >         at org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext(SingleSenderCreator.java:97) [drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
> >> >         at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:57) [drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
> >> >         at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:114) [drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
> >> >         at org.apache.drill.exec.work.WorkManager$RunnableWrapper.run(WorkManager.java:254) [drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
> >> >         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_71]
> >> >         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_71]
> >> >         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
> >> > Caused by: java.io.IOException: can not read class parquet.format.PageHeader: don't know what type: 13
> >> >         at parquet.format.Util.read(Util.java:50) ~[parquet-format-2.1.1-drill-r1.jar:na]
> >> >         at parquet.format.Util.readPageHeader(Util.java:26) ~[parquet-format-2.1.1-drill-r1.jar:na]
> >> >         at org.apache.drill.exec.store.parquet.ColumnDataReader.readPageHeader(ColumnDataReader.java:47) ~[drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
> >> >         at org.apache.drill.exec.store.parquet.columnreaders.PageReader.next(PageReader.java:169) ~[drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
> >> >         at org.apache.drill.exec.store.parquet.columnreaders.NullableColumnReader.processPages(NullableColumnReader.java:76) ~[drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
> >> >         at org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.readAllFixedFields(ParquetRecordReader.java:366) ~[drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
> >> >         at org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.next(ParquetRecordReader.java:409) ~[drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
> >> >         ... 15 common frames omitted
> >> > Caused by: parquet.org.apache.thrift.protocol.TProtocolException: don't know what type: 13
> >> >         at parquet.org.apache.thrift.protocol.TCompactProtocol.getTType(TCompactProtocol.java:806) ~[parquet-format-2.1.1-drill-r1.jar:na]
> >> >         at parquet.org.apache.thrift.protocol.TCompactProtocol.readListBegin(TCompactProtocol.java:536) ~[parquet-format-2.1.1-drill-r1.jar:na]
> >> >         at parquet.org.apache.thrift.protocol.TCompactProtocol.readSetBegin(TCompactProtocol.java:547) ~[parquet-format-2.1.1-drill-r1.jar:na]
> >> >         at parquet.org.apache.thrift.protocol.TProtocolUtil.skip(TProtocolUtil.java:128) ~[parquet-format-2.1.1-drill-r1.jar:na]
> >> >         at parquet.org.apache.thrift.protocol.TProtocolUtil.skip(TProtocolUtil.java:60) ~[parquet-format-2.1.1-drill-r1.jar:na]
> >> >         at parquet.format.PageHeader.read(PageHeader.java:897) ~[parquet-format-2.1.1-drill-r1.jar:na]
> >> >         at parquet.format.Util.read(Util.java:47) ~[parquet-format-2.1.1-drill-r1.jar:na]
> >> >         ... 21 common frames omitted
> >> >
> >> >
> >> >> On Mon, Jan 5, 2015 at 6:26 PM, Adam Gilmore <dragoncu...@gmail.com> wrote:
> >> >>
> >> >> Hi all,
> >> >>
> >> >> I'm trying to do a really simple query on a parquet directory on HDFS.
> >> >>
> >> >> This works fine:
> >> >>
> >> >> select count(*) from hdfs.warehouse.saleparquet
> >> >>
> >> >> However, this fails:
> >> >>
> >> >> 0: jdbc:drill:local> select sum(sellprice) from hdfs.warehouse.saleparquet;
> >> >> Query failed: Query failed: Failure while running fragment., You tried to
> >> >> do a batch data read operation when you were in a state of STOP.  You can
> >> >> only do this type of operation when you are in a state of OK or
> >> >> OK_NEW_SCHEMA. [ 92fc8807-220b-466c-bbac-1f524d4251cb on
> >> >> ip-10-8-1-154.ap-southeast-2.compute.internal:31010 ]
> >> >> [ 92fc8807-220b-466c-bbac-1f524d4251cb on
> >> >> ip-10-8-1-154.ap-southeast-2.compute.internal:31010 ]
> >> >>
> >> >> Error: exception while executing query: Failure while executing query.
> >> >> (state=,code=0)
> >> >>
> >> >> Seems like a very simple query.
> >> >>
> >> >> Funnily enough, if I copy it off HDFS to the local system and run the
> >> >> same query against the local file, it works fine.  Just purely something
> >> >> to do with HDFS.
> >> >>
> >> >> Any ideas?  I'm running 0.7.
> >> >>
> >>
> >
> >
>
