Hello guys,

I am asking here because I think I've hit a wall with this problem. I
consistently get the same error when running a query against a directory
of Parquet files.

The directory contains six 158MB parquet files.

RESOURCE ERROR: Waited for 15000ms, but tasks for 'Fetch parquet
metadata' are not complete. Total runnable size 6, parallelism 6.


Both queries fail:

*select count(*) from dfs.`/tmp/37454954-3c0a-47c5-9793-1c333d87fbbb/`*

*select * from dfs.`/tmp/37454954-3c0a-47c5-9793-1c333d87fbbb/`
limit 1*

However, if I run a query against any one of the six Parquet files inside
the directory, it works fine, e.g.:
*select * from
dfs.`/tmp/37454954-3c0a-47c5-9793-1c333d87fbbb/185d3076-v_docker_node0001-140526122190592.parquet`*

Running *`refresh table metadata`* gives me the exact same error.

I also tried setting *planner.hashjoin* to false.

From looking at the Drill source, it seems that the timeout for waiting on
the metadata fetch is not configurable.
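
As a rough sketch of how the 15000ms in the error above could come out of
6 tasks at parallelism 6: assume the wait is a fixed per-task budget scaled
by the number of tasks divided by the parallelism. That formula is an
assumption for illustration, not copied from the Drill source, and the
class, constant, and method names below are hypothetical.

```java
class MetadataTimeoutSketch {
    // Assumed hard-coded per-task budget; chosen to match the 15000ms in the error.
    static final long TIMEOUT_PER_TASK_MS = 15000L;

    // Assumed formula: total wait = per-task budget * tasks / parallelism, rounded up.
    static long totalTimeoutMs(int taskCount, int parallelism) {
        return (long) Math.ceil((double) (TIMEOUT_PER_TASK_MS * taskCount) / parallelism);
    }

    public static void main(String[] args) {
        // 6 metadata-fetch tasks, parallelism 6, as reported in the error message.
        System.out.println(totalTimeoutMs(6, 6));
    }
}
```

Under this assumption, with as many threads as files, the whole directory
gets only a single per-task budget, so one slow footer read would be enough
to trip the timeout.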

Has anyone faced a similar situation?

I am running this locally on a machine with 16GB of RAM, with HDFS on a
single node.

I also found an open ticket with the same error message:
https://issues.apache.org/jira/browse/DRILL-5903

Thank you in advance.
