That doesn't look too big. Are the queries failing during the planning or the
execution phase?

Also, you mentioned that you are running this on a machine with 16GB RAM. How 
much memory have you given to Drill? A typical minimum config is about 8GB for 
Drill alone, and with another 2-3GB for the OS, not a whole lot is left for 
other services (incl. HDFS).

If you need to run barebones, here are a couple of things worth doing.
1. Check the memory allocated to Drill. I'm not sure how much HDFS, etc. have 
been allocated, but you should be able to get away with as low as 4-5GB (1-2GB 
heap and the rest for direct memory).
2. Reduce 'planner.width.max_per_node' to as low as 1. This will force Drill 
to run, in effect, at a parallelization of 1 (i.e. 1 thread per major 
fragment).
3. There is also the option "planner.memory.max_query_memory_per_node", which 
defaults to 2GB; that should be sufficient even in your environment when 
coupled with #2 (see the example settings below).
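
For reference, here is a minimal sketch of how those two options could be set
from sqlline or the web UI. The option names are the standard Drill
session/system options, but treat the exact values as assumptions to tune for
your machine; the heap and direct memory from #1 are normally set via
DRILL_HEAP and DRILL_MAX_DIRECT_MEMORY in conf/drill-env.sh rather than from
SQL.

    -- force each major fragment down to a single thread per node (#2)
    ALTER SYSTEM SET `planner.width.max_per_node` = 1;
    -- per-query memory budget in bytes; 2147483648 = 2GB, the default (#3)
    ALTER SYSTEM SET `planner.memory.max_query_memory_per_node` = 2147483648;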

The current Apache Drill master also reports real-time heap and direct memory 
usage. If you are running that, you should be able to see whether you are 
quickly running out of memory (which, I suspect, is the source of your 
troubles).
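
If it is easier, the same numbers can also be pulled from SQL via the
sys.memory system table (assuming your Drill version exposes it; the exact
columns may differ slightly across versions):

    SELECT * FROM sys.memory;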

Hope that helps. 
On 5/10/2018 10:05:29 AM, Carlos Derich <carlosder...@gmail.com> wrote:
Is it relatively big?

parquet-tools schema output:

message schema {
optional int64 id;
optional int64 cbd_id;
optional binary company_name (UTF8);
optional binary category (UTF8);
optional binary subcategory (UTF8);
optional binary description (UTF8);
optional binary full_address_source (UTF8);
optional binary street_address (UTF8);
optional binary neighborhood (UTF8);
optional binary city (UTF8);
optional binary administrative_area_level_3 (UTF8);
optional binary administrative_area_level_2 (UTF8);
optional binary administrative_area_level_1 (UTF8);
optional binary postal_code (UTF8);
optional binary country (UTF8);
optional binary formatted_address (UTF8);
optional binary geometry;
optional binary telephone (UTF8);
optional binary website (UTF8);
optional int32 retrieved_at;
optional binary source_url (UTF8);
}

Thanks for the help, will keep you posted; this will help me better understand
Drill hardware requirements.

On Thu, May 10, 2018 at 12:59 PM, Parth Chandra wrote:

> That might be it. How big is the schema of your data? Do you have lots of
> fields? If parquet-tools cannot read the metadata, there is little chance
> anybody else will be able to do so either.
>
>
> On Thu, May 10, 2018 at 9:57 AM, Carlos Derich
> wrote:
>
> > Hey Parth, thanks for the response !
> >
> > I tried fetching the metadata using parquet-tools in Hadoop mode instead, and
> > I get OOM errors: Heap and GC limit exceeded.
> >
> > It seems that my problem is actually resource related; still, it's a bit odd
> > that the parquet metadata read is so memory-hungry.
> >
> > It seems that even after a restart (clean state/no queries running) only
> > ~4GB mem is free from a 16GB machine.
> >
> > I am going to run the tests on a bigger machine, and will tweak the JVM
> > options and will let you know.
> >
> > Regards.
> > Carlos.
> >
> > On Wed, May 9, 2018 at 9:04 PM, Parth Chandra wrote:
> >
> > > The most common reason I know of for this error is if you do not have
> > > enough CPU. Both Drill and the distributed file system will be using CPU,
> > > and sometimes the file system, especially if it is distributed, will take
> > > too long. With your configuration and data set size, reading the file
> > > metadata should take no time at all (I'll assume the metadata in the files
> > > is reasonable and not many MB itself). Is your system by any chance
> > > overloaded?
> > >
> > > Also, call me paranoid, but seeing /tmp in the path makes me suspicious.
> > > Can we assume the files are written completely when the metadata read is
> > > occurring? They probably are, since you can query the files individually,
> > > but I'm just checking to make sure.
> > >
> > > Finally, there is a similar JIRA
> > > https://issues.apache.org/jira/browse/DRILL-5908, that looks related.
> > >
> > >
> > >
> > >
> > > On Wed, May 9, 2018 at 4:15 PM, Carlos Derich
> > > wrote:
> > >
> > > > Hello guys,
> > > >
> > > > Asking this question here because I think I've hit a wall with this
> > > > problem: I am consistently getting the same error when running a query on
> > > > a directory-based parquet file.
> > > >
> > > > The directory contains six 158MB parquet files.
> > > >
> > > > RESOURCE ERROR: Waited for 15000ms, but tasks for 'Fetch parquet
> > > > metadata' are not complete. Total runnable size 6, parallelism 6.
> > > >
> > > >
> > > > Both queries fail:
> > > >
> > > > select count(*) from dfs.`/tmp/37454954-3c0a-47c5-9793-1c333d87fbbb/`
> > > >
> > > > select * from dfs.`/tmp/37454954-3c0a-47c5-9793-1c333d87fbbb/` limit 1
> > > >
> > > > BUT if I try running any other query against any of the six parquet files
> > > > inside the directory, it works fine, e.g.:
> > > >
> > > > select * from dfs.`/tmp/37454954-3c0a-47c5-9793-1c333d87fbbb/185d3076-v_docker_node0001-140526122190592.parquet`
> > > >
> > > > Running `refresh table metadata` gives me the exact same error.
> > > >
> > > > Also tried to set planner.hashjoin to false.
> > > >
> > > > Checking the Drill source, it seems that the metadata wait timeout is not
> > > > configurable.
> > > >
> > > > Have any of you faced a similar situation ?
> > > >
> > > > Running this locally on my 16GB RAM machine, with HDFS on a single node.
> > > >
> > > > I also found an open ticket with the same error message:
> > > > https://issues.apache.org/jira/browse/DRILL-5903
> > > >
> > > > Thank you in advance.
> > > >
> > >
> >
>
