Hi Jacques,
I debugged and found the issue. I have a JIRA issue going at:
https://issues.apache.org/jira/browse/DRILL-1948
If someone could point me in the right direction of that Parquet fork, I
can probably submit a patch for this and resolve it.
On Wed, Jan 7, 2015 at 3:56 PM, Jacques Nadea
P.S. For a more extreme example (1M rows) that returns:
Query failed: Query failed: Failure while running fragment., You tried to
do a batch data read operation when you were in a state of STOP. You can
only do this type of operation when you are in a state of OK or
OK_NEW_SCHEMA. [ 91b9e166-d185
I can definitely put it up somewhere - it's only 72kb (the Parquet file).
I'm using Hadoop 2.4.0 running on Amazon EMR. If I copy it off HDFS and put
it back onto HDFS, it still has the same problem, unfortunately.
https://www.dropbox.com/s/nzbg8986mt5t8md/saletest2.tgz?dl=0
I notice in the source that there
Nothing is immediately coming to mind. Out of curiosity, does it still
have this problem if you copy the local file back onto HDFS and then query
it?
What version of HDFS are you using? Is the file something you can share
privately or publicly, or is it too large?
thanks,
Jacques
On Tue, Jan 6, 2015 a
Anyone got any ideas on this one? I can consistently reproduce the issue
with HDFS - the minute I get the data off HDFS (to a local drive), it all
works fine.
Doesn't seem to be a problem with Parquet - more like the HDFS storage
engine.
On Tue, Jan 6, 2015 at 9:50 AM, Adam Gilmore wrote:
> Th
The data is okay, because the exact same Parquet directory works fine
on the local drive; it just fails when using HDFS. I tried casting
as you said, but that ended up with the exact same problem.
Regards,
*Adam Gilmore*
Director of Technology
a...@pharmadata.net.au
+61 421 99
On Tue, Jan 6, 2015 at 9:49 AM, MapR wrote:
> Please try casting the column data type.
Please try casting the column data type. Also please verify that all the
column data satisfies your data type.
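For example, a cast in Drill on the query from earlier in the thread would look something like this (assuming sellprice is meant to be a double precision value - adjust the type to match your actual schema):

```sql
select sum(cast(sellprice as double)) from hdfs.warehouse.saleparquet;
```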
Sudhakar Thota
Sent from my iPhone
> On Jan 5, 2015, at 5:56 AM, Adam Gilmore wrote:
>
> The actual stack trace is:
>
> 2015-01-05 13:48:27,356 [2b5569d5-3771-748d-1390-3a8930d020
The actual stack trace is:
2015-01-05 13:48:27,356 [2b5569d5-3771-748d-1390-3a8930d02002:frag:1:12]
ERROR o.a.drill.exec.ops.FragmentContext - Fragment Context received
failure.
org.apache.drill.common.exceptions.DrillRuntimeException:
java.io.IOException: can not read class parquet.format.PageHea
Hi all,
I'm trying to do a really simple query on a parquet directory on HDFS.
This works fine:
select count(*) from hdfs.warehouse.saleparquet
However, this fails:
0: jdbc:drill:local> select sum(sellprice) from hdfs.warehouse.saleparquet;
Query failed: Query failed: Failure while running fra