Re: Monitoring long / stuck CTAS

2015-05-29 Thread Ted Dunning
Apologies for the plug, but using MapR FS would help you a lot here. The trick is that you can run an NFS server on every node and mount that server as localhost. The benefits are: 1) the entire cluster appears as a conventional POSIX-style file system in addition to being available via HDFS
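A minimal sketch of the localhost NFS mount Ted describes, assuming a MapR cluster with the NFS gateway running on each node (the /mapr mount point and options are illustrative, not a recommendation):

    sudo mkdir -p /mapr
    # mount the local NFS gateway so the cluster looks like an ordinary directory tree
    sudo mount -o hard,nolock localhost:/mapr /mapr
    ls /mapr    # cluster files now reachable with POSIX tools as well as through HDFS APIs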

Re: Monitoring long / stuck CTAS

2015-05-29 Thread Carol McDonald
What Ted just talked about is also explained in this On Demand Training https://www.mapr.com/services/mapr-academy/mapr-distribution-essentials-training-course-on-demand (which is free) On Fri, May 29, 2015 at 5:29 PM, Ted Dunning ted.dunn...@gmail.com wrote: There are two methods to

Re: Monitoring long / stuck CTAS

2015-05-28 Thread Mehant Baid
I think the problem might be related to a single laggard; it looks like we are waiting for one minor fragment to complete. Based on the output you provided, it looks like fragment 1_1 hasn't completed. You might want to find out where that fragment was scheduled and what is going on on that node.
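A rough way to chase down a laggard fragment, assuming the standard Drill web UI on port 8047 and default log locations (the fragment id pattern and path below are illustrative):

    # The query profile page (http://<foreman-host>:8047/profiles, then click the query)
    # lists each minor fragment along with the drillbit host it was assigned to.
    # On that host, look for the fragment's last reported activity in the drillbit log;
    # the exact log prefix can vary between versions:
    grep "1:1" $DRILL_HOME/log/drillbit.log | tail -n 20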

Re: Monitoring long / stuck CTAS

2015-05-28 Thread Matt
That is a good point. The difference between the number of source rows and those that made it into the parquet files is about the same count as the other fragments. Indeed, the query profile does show fragment 1_1 as CANCELED while the others all have State FINISHED. Additionally the other
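A quick way to quantify what the canceled fragment left behind is to compare row counts between the source file and the new parquet table; the path and table name here are placeholders:

    SELECT COUNT(*) FROM dfs.`/staging/source_file.csv`;   -- rows in the source
    SELECT COUNT(*) FROM dfs.tmp.`new_parquet_table`;      -- rows that reached parquet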

Re: Monitoring long / stuck CTAS

2015-05-28 Thread Matt
Bumping memory to: DRILL_MAX_DIRECT_MEMORY=16G DRILL_HEAP=8G. The 44GB file imported successfully in 25 minutes - acceptable on this hardware. I don't know if the default memory settings were to blame or not. On 28 May 2015, at 14:22, Andries Engelbrecht wrote: That is the Drill direct
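For reference, these variables live in the Drill conf directory, not in the Hadoop or ZooKeeper configs; a sketch of the relevant lines, using the values Matt reports (apply on every node, then restart the drillbits):

    # $DRILL_HOME/conf/drill-env.sh
    DRILL_MAX_DIRECT_MEMORY="16G"   # per-node direct (off-heap) memory
    DRILL_HEAP="8G"                 # per-node JVM heap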

Re: Monitoring long / stuck CTAS

2015-05-28 Thread Jason Altekruse
He mentioned in his original post that he saw CPU and IO on all of the nodes for a while when the query was active, but then it suddenly dropped down to low CPU usage and stopped producing files. It seems like we are failing to detect an error and cancel the query. It is possible that the failure

Re: Monitoring long / stuck CTAS

2015-05-28 Thread Matt
Did you check the log files for any errors? No messages related to this query containing errors or warnings, nor anything mentioning memory or heap. Querying now to determine what is missing in the parquet destination. drillbit.out on the master shows no error messages, and what looks like

Re: Monitoring long / stuck CTAS

2015-05-28 Thread Andries Engelbrecht
That is the Drill direct memory per node. DRILL_HEAP is for the heap size per node. More info here http://drill.apache.org/docs/configuring-drill-memory/ —Andries On May 28, 2015, at 11:09 AM, Matt bsg...@gmail.com wrote: Referencing http://drill.apache.org/docs/configuring-drill-memory/

Re: Monitoring long / stuck CTAS

2015-05-28 Thread Matt
To make sure I am adjusting the correct config, these are heap parameters within the Drill config path, not for Hadoop or Zookeeper? On May 28, 2015, at 12:08 PM, Jason Altekruse altekruseja...@gmail.com wrote: There should be no upper limit on the size of the tables you can create

Re: Monitoring long / stuck CTAS

2015-05-28 Thread Matt
I did not note any memory errors or warnings in a quick scan of the logs, but to double-check, is there a specific log I would find such warnings in? On May 28, 2015, at 12:01 PM, Andries Engelbrecht aengelbre...@maprtech.com wrote: I have used a single CTAS to create tables using parquet
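If it helps, a quick sweep of the logs on each node for memory-related messages; this assumes the default log directory under the Drill install (packaged installs may put logs elsewhere, e.g. /var/log/drill):

    grep -iE "outofmemory|heap|direct memory" $DRILL_HOME/log/drillbit.log
    tail -n 100 $DRILL_HOME/log/drillbit.out   # stdout/stderr of the drillbit JVM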

Re: Monitoring long / stuck CTAS

2015-05-28 Thread Jason Altekruse
That is correct. I guess it could be possible that HDFS might run out of heap, but I'm guessing that is unlikely to be the cause of the failure you are seeing. We should not be taxing ZooKeeper enough to be causing any issues there. On Thu, May 28, 2015 at 9:17 AM, Matt bsg...@gmail.com wrote: To

Re: Monitoring long / stuck CTAS

2015-05-28 Thread Matt
How large is the data set you are working with, and your cluster/nodes? Just testing with that single 44GB source file currently, and my test cluster is made up of 4 nodes, each with 8 CPU cores, 32GB RAM, and a 6TB Ext4 volume (RAID-10). Drill defaults were left as they come in v1.0. I will be adjusting

Re: Monitoring long / stuck CTAS

2015-05-28 Thread Jason Altekruse
There should be no upper limit on the size of the tables you can create with Drill. Be advised that Drill currently operates entirely optimistically with regard to available resources. If a network connection between two drillbits fails during a query, we will not currently re-schedule the work
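For context, the operation under discussion is a plain CTAS into parquet; a minimal sketch with placeholder paths and table names:

    ALTER SESSION SET `store.format` = 'parquet';
    CREATE TABLE dfs.tmp.`big_table` AS
    SELECT * FROM dfs.`/staging/source_file.csv`;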