Re: Failed to fetch parquet metadata after 15000ms

Parth Chandra Wed, 09 May 2018 18:05:12 -0700

The most common cause I know of for this error is insufficient CPU. Both
Drill and the file system use CPU, and sometimes the file system,
especially if it is distributed, takes too long to respond. With your
configuration and data set size, reading the file metadata should take
almost no time (I'll assume the metadata in the files is reasonable and not
many MB itself). Is your system by any chance overloaded?
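
A rough way to answer the overload question is to compare the load average
with the number of cores. The following is only a minimal sketch, assuming a
Unix-like host (os.getloadavg is not available on Windows); the helper name
is_overloaded and the threshold of 1.0 per core are illustrative
rule-of-thumb choices, not anything Drill itself uses.

```python
import os


def is_overloaded(load_avgs, cores, threshold=1.0):
    """Return True if the 1-minute load average per core exceeds threshold.

    load_avgs is the (1 min, 5 min, 15 min) tuple from os.getloadavg().
    The threshold is an assumed rule of thumb, not a Drill setting.
    """
    one_minute = load_avgs[0]
    return one_minute / cores > threshold


if __name__ == "__main__":
    cores = os.cpu_count() or 1   # cpu_count() may return None
    load_avgs = os.getloadavg()   # Unix only
    print("cores:", cores, "load averages:", load_avgs)
    print("possibly overloaded:", is_overloaded(load_avgs, cores))
```

Running it while the failing query is in flight gives a more useful reading
than running it on an idle machine.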


Also, call me paranoid, but seeing /tmp in the path makes me suspicious.
Can we assume the files are completely written when the metadata read
occurs? They probably are, since you can query the files individually,
but I'm just checking to make sure.
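
Both the "completely written" question and the earlier assumption about
metadata size can be checked from the file itself: an unencrypted Parquet
file starts and ends with the 4-byte magic PAR1, and the 4 bytes before the
trailing magic hold the footer metadata length as a little-endian integer.
The sketch below is a hedged illustration, not Drill's own reader; the
function name parquet_footer_length is hypothetical, and it reads a local
path, so files stored on HDFS would first need to be copied to local disk.

```python
import os
import struct

MAGIC = b"PAR1"  # magic bytes of an unencrypted Parquet file


def parquet_footer_length(path):
    """Return the footer metadata length in bytes, or None if the file
    does not look like a completely written Parquet file."""
    size = os.path.getsize(path)
    # Smallest possible layout: header magic + footer length + trailer magic.
    if size < 12:
        return None
    with open(path, "rb") as f:
        if f.read(4) != MAGIC:
            return None
        f.seek(-8, os.SEEK_END)
        tail = f.read(8)
    if tail[4:] != MAGIC:
        return None  # trailer missing: file is probably still being written
    (footer_len,) = struct.unpack("<I", tail[:4])
    if footer_len > size - 12:
        return None  # declared footer cannot fit in the file
    return footer_len
```

A result of None for any of the six files would suggest an incomplete write,
while a footer length of many MB would point at oversized metadata instead.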

Finally, there is a similar JIRA,
https://issues.apache.org/jira/browse/DRILL-5908, that looks related.




On Wed, May 9, 2018 at 4:15 PM, Carlos Derich <carlosder...@gmail.com>
wrote:

> Hello guys,
>
> I'm asking here because I think I've hit a wall with this problem. I am
> consistently getting the same error when running a query on a
> directory-based parquet file.
>
> The directory contains six 158MB parquet files.
>
> RESOURCE ERROR: Waited for 15000ms, but tasks for 'Fetch parquet
> metadata' are not complete. Total runnable size 6, parallelism 6.
>
>
> Both queries fail:
>
> *select count(*) from dfs.`/tmp/37454954-3c0a-47c5-9793-1c333d87fbbb/`*
>
> *select * from dfs.`/tmp/37454954-3c0a-47c5-9793-1c333d87fbbb/`
> limit 1*
>
> But if I run any other query against any of the 6 parquet files inside
> the directory, it works fine, e.g.:
> *select * from
> dfs.`/tmp/37454954-3c0a-47c5-9793-1c333d87fbbb/185d3076-v_docker_node0001-140526122190592.parquet`*
>
> Running *`refresh table metadata`* gives me the exact same error.
>
> I also tried setting *planner.hashjoin* to false.
>
> Checking the Drill source, it seems that the metadata wait timeout is not
> configurable.
>
> Have any of you faced a similar situation?
>
> I am running this locally on a machine with 16GB of RAM, with HDFS on a
> single node.
>
> I also found an open ticket with the same error message:
> https://issues.apache.org/jira/browse/DRILL-5903
>
> Thank you in advance.
>
