Re: Failed to fetch parquet metadata after 15000ms

2018-10-10 Thread Abhishek Girish
Hey Karthik,

This is a bug, and there are a few JIRAs tracking it; one of them is
DRILL-5788. It's likely caused by a hard-coded default for the timeout,
which is sometimes not sufficient. Can you please update the JIRA with
your findings? That would help in resolving the issue.

Regards,
Abhishek

On Mon, Oct 8, 2018 at 10:40 AM karthik.R  wrote:

> Hi,
> I am frequently getting the below exception when running a query in Drill 1.14.
> Could you please tell me which option to set to increase this timeout?
>
> Waited for 15000ms , but tasks for 'Fetch parquet metadata' are not
> complete. Total runnable size 4 , parallelism 4
>
>
> My parquet files are in an S3 location, with 5 parquet file partitions.
>
> Please help
>


Re: Failed to fetch parquet metadata after 15000ms

2018-10-08 Thread Khurram Faraaz
Hi Karthik,

You can try setting the session/system option exec.queue.timeout_millis to
a higher value; the default is 300000 (5 minutes).
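
For example, from sqlline or the Web UI query page (60000 here is just an
illustrative value):

ALTER SESSION SET `exec.queue.timeout_millis` = 60000;

You can check the current setting with:

SELECT * FROM sys.options WHERE name = 'exec.queue.timeout_millis';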

Thanks,
Khurram

On Mon, Oct 8, 2018 at 10:40 AM karthik.R  wrote:

> Hi,
> I am frequently getting the below exception when running a query in Drill 1.14.
> Could you please tell me which option to set to increase this timeout?
>
> Waited for 15000ms , but tasks for 'Fetch parquet metadata' are not
> complete. Total runnable size 4 , parallelism 4
>
>
> My parquet files are in an S3 location, with 5 parquet file partitions.
>
> Please help
>


Re: Failed to fetch parquet metadata after 15000ms

2018-05-11 Thread Carlos Derich
Hey Kunal, thanks for the help!

I've found the problem: my parquet files were being generated by a
third-party warehouse DB with an export-to-parquet command.

It seems the parquet files it generates are somehow corrupted, making the
parquet reader eat all the memory and then die.

I'm opening a ticket with them.

I tested creating the parquet files from Drill itself instead of from the
warehouse, and it worked fine and was super fast.
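
In case it's useful to anyone who hits this later, the Drill-side rewrite was
roughly this kind of CTAS (the workspace, table name, and source path below
are placeholders, not my exact ones):

ALTER SESSION SET `store.format` = 'parquet';  -- parquet is the default output format anyway
CREATE TABLE dfs.tmp.`places_clean` AS
SELECT * FROM dfs.`/tmp/source_data/`;  -- placeholder: wherever the original data lives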

Thanks for all the help




Re: Failed to fetch parquet metadata after 15000ms

2018-05-10 Thread Kunal Khatua
That doesn't look too big. Are the queries failing during the planning or
execution phase?

Also, you mentioned that you are running this on a machine with 16GB RAM. How
much memory have you given to Drill? A typical minimum config is about 8GB for
Drill alone, and with 2-3GB for the OS, not a whole lot is left for other
services (incl. HDFS).

If you need to run barebones, here are a couple of things worth doing.
1. Check the memory allocated to Drill. I'm not sure how much HDFS, etc. have
been allocated, but you should be able to get away with as low as 4-5GB
(1-2GB heap and the rest for direct memory).
2. Reduce 'planner.width.max_per_node' to as low as 1. This will force
Drill to run, in effect, at a parallelization of 1 (i.e. 1 thread per major
fragment).
3. There is also the option "planner.memory.max_query_memory_per_node", which
defaults to 2GB and should be sufficient even in your environment when
coupled with #2.
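
A rough SQL sketch for #2 and #3 (example values only; #1 is set outside SQL,
via DRILL_HEAP / DRILL_MAX_DIRECT_MEMORY in conf/drill-env.sh):

ALTER SYSTEM SET `planner.width.max_per_node` = 1;
-- value is in bytes; 2147483648 (2GB) is the default
ALTER SYSTEM SET `planner.memory.max_query_memory_per_node` = 2147483648;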

The current Apache Drill master also reports real-time information about
heap and direct memory usage. If you are using that, you should be able to
see whether you are quickly running out of memory (which, I suspect, is the
source of your troubles).

Hope that helps. 

Re: Failed to fetch parquet metadata after 15000ms

2018-05-10 Thread Carlos Derich
Is it relatively big?

parquet-tools schema output:

message schema {
  optional int64 id;
  optional int64 cbd_id;
  optional binary company_name (UTF8);
  optional binary category (UTF8);
  optional binary subcategory (UTF8);
  optional binary description (UTF8);
  optional binary full_address_source (UTF8);
  optional binary street_address (UTF8);
  optional binary neighborhood (UTF8);
  optional binary city (UTF8);
  optional binary administrative_area_level_3 (UTF8);
  optional binary administrative_area_level_2 (UTF8);
  optional binary administrative_area_level_1 (UTF8);
  optional binary postal_code (UTF8);
  optional binary country (UTF8);
  optional binary formatted_address (UTF8);
  optional binary geometry;
  optional binary telephone (UTF8);
  optional binary website (UTF8);
  optional int32 retrieved_at;
  optional binary source_url (UTF8);
}

Thanks for the help, I will keep you posted; this will help me better
understand Drill hardware requirements.



Re: Failed to fetch parquet metadata after 15000ms

2018-05-10 Thread Parth Chandra
That might be it. How big is the schema of your data? Do you have lots of
fields? If parquet-tools cannot read the metadata, there is little chance
anybody else will be able to do so either.




Re: Failed to fetch parquet metadata after 15000ms

2018-05-10 Thread Carlos Derich
Hey Parth, thanks for the response!

I tried fetching the metadata using parquet-tools in Hadoop mode instead, and
I get OOM errors (heap space / GC overhead limit exceeded).

It seems my problem is actually resource-related; still, it's a bit weird
that the parquet metadata read is so memory-hungry.

It also seems that even after a restart (clean state, no queries running)
only ~4GB of memory is free on this 16GB machine.

I am going to run the tests on a bigger machine, tweak the JVM options, and
let you know.

Regards.
Carlos.



Re: Failed to fetch parquet metadata after 15000ms

2018-05-09 Thread Parth Chandra
The most common reason I know of for this error is not having enough CPU.
Both Drill and the distributed file system will be using CPU, and sometimes
the file system, especially if it is distributed, will take too long. With
your configuration and data set size, reading the file metadata should take
no time at all (I'll assume the metadata in the files is reasonable and not
many MB itself). Is your system by any chance overloaded?

Also, call me paranoid, but seeing /tmp in the path makes me suspicious.
Can we assume the files are written completely by the time the metadata read
occurs? They probably are, since you can query the files individually,
but I'm just checking to make sure.

Finally, there is a similar JIRA, https://issues.apache.org/jira/browse/DRILL-5908,
that looks related.




On Wed, May 9, 2018 at 4:15 PM, Carlos Derich wrote:

> Hello guys,
>
> Asking this question here because I think I've hit a wall with this
> problem. I am consistently getting the same error when running a query on
> a directory of parquet files.
>
> The directory contains six 158MB parquet files.
>
> RESOURCE ERROR: Waited for 15000ms, but tasks for 'Fetch parquet
> metadata' are not complete. Total runnable size 6, parallelism 6.
>
>
> Both queries fail:
>
> select count(*) from dfs.`/tmp/37454954-3c0a-47c5-9793-1c333d87fbbb/`
>
> select * from dfs.`/tmp/37454954-3c0a-47c5-9793-1c333d87fbbb/` limit 1
>
> But if I try running a query against any one of the 6 parquet files inside
> the directory, it works fine, e.g.:
>
> select * from
> dfs.`/tmp/37454954-3c0a-47c5-9793-1c333d87fbbb/185d3076-v_docker_node0001-
> 140526122190592.parquet`
>
> Running `refresh table metadata` gives me the exact same error.
>
> I also tried setting planner.hashjoin to false.
>
> Checking the Drill source, it seems that the metadata wait timeout is not
> configurable.
>
> Have any of you faced a similar situation?
>
> Running this locally on my 16GB RAM machine, with HDFS on a single node.
>
> I also found an open ticket with the same error message:
> https://issues.apache.org/jira/browse/DRILL-5903
>
> Thank you in advance.
>