DRILL_MAX_DIRECT_MEMORY is the Drill direct memory limit per node. DRILL_HEAP is the heap size per node.
More info here: http://drill.apache.org/docs/configuring-drill-memory/

--Andries

On May 28, 2015, at 11:09 AM, Matt <bsg...@gmail.com> wrote:

> Referencing http://drill.apache.org/docs/configuring-drill-memory/
>
> Is DRILL_MAX_DIRECT_MEMORY the limit for each node, or the cluster?
>
> The root page on a drillbit at port 8047 lists four nodes, with the 16G
> Maximum Direct Memory equal to DRILL_MAX_DIRECT_MEMORY, thus I am
> uncertain whether that is a node or cluster limit.
>
>
> On 28 May 2015, at 12:23, Jason Altekruse wrote:
>
>> That is correct. I guess it could be possible that HDFS might run out of
>> heap, but I'm guessing that is unlikely to be the cause of the failure
>> you are seeing. We should not be taxing ZooKeeper enough to be causing
>> any issues there.
>>
>> On Thu, May 28, 2015 at 9:17 AM, Matt <bsg...@gmail.com> wrote:
>>
>>> To make sure I am adjusting the correct config, these are heap parameters
>>> within the Drill configure path, not for Hadoop or Zookeeper?
>>>
>>>
>>>> On May 28, 2015, at 12:08 PM, Jason Altekruse <altekruseja...@gmail.com> wrote:
>>>>
>>>> There should be no upper limit on the size of the tables you can create
>>>> with Drill. Be advised that Drill does currently operate entirely
>>>> optimistically with regard to available resources. If a network
>>>> connection between two drillbits fails during a query, we will not
>>>> currently re-schedule the work to make use of remaining nodes and
>>>> network connections that are still live. While we have had a good
>>>> amount of success using Drill for data conversion, be aware that these
>>>> conditions could cause long running queries to fail.
>>>>
>>>> That being said, it isn't the only possible cause for such a failure.
>>>> In the case of a network failure we would expect to see a message
>>>> returned to you that part of the query was unsuccessful and that it had
>>>> been cancelled. Andries has a good suggestion with regard to checking
>>>> the heap memory. This should also be detected and reported back to you
>>>> at the CLI, but we may be failing to propagate the error back to the
>>>> head node for the query. I believe writing Parquet may still be the
>>>> most heap-intensive operation in Drill, despite our efforts to refactor
>>>> the write path to use direct memory instead of on-heap for large
>>>> buffers needed in the process of creating Parquet files.
>>>>
>>>>> On Thu, May 28, 2015 at 8:43 AM, Matt <bsg...@gmail.com> wrote:
>>>>>
>>>>> Is 300MM records too much to do in a single CTAS statement?
>>>>>
>>>>> After almost 23 hours I killed the query (^c) and it returned:
>>>>>
>>>>> ~~~
>>>>> +-----------+----------------------------+
>>>>> | Fragment  | Number of records written  |
>>>>> +-----------+----------------------------+
>>>>> | 1_20      | 13568824                   |
>>>>> | 1_15      | 12411822                   |
>>>>> | 1_7       | 12470329                   |
>>>>> | 1_12      | 13693867                   |
>>>>> | 1_5       | 13292136                   |
>>>>> | 1_18      | 13874321                   |
>>>>> | 1_16      | 13303094                   |
>>>>> | 1_9       | 13639049                   |
>>>>> | 1_10      | 13698380                   |
>>>>> | 1_22      | 13501073                   |
>>>>> | 1_8       | 13533736                   |
>>>>> | 1_2       | 13549402                   |
>>>>> | 1_21      | 13665183                   |
>>>>> | 1_0       | 13544745                   |
>>>>> | 1_4       | 13532957                   |
>>>>> | 1_19      | 12767473                   |
>>>>> | 1_17      | 13670687                   |
>>>>> | 1_13      | 13469515                   |
>>>>> | 1_23      | 12517632                   |
>>>>> | 1_6       | 13634338                   |
>>>>> | 1_14      | 13611322                   |
>>>>> | 1_3       | 13061900                   |
>>>>> | 1_11      | 12760978                   |
>>>>> +-----------+----------------------------+
>>>>> 23 rows selected (82294.854 seconds)
>>>>> ~~~
>>>>>
>>>>> The sum of those record counts is 306,772,763 which is close to the
>>>>> 320,843,454 in the source file:
>>>>>
>>>>> ~~~
>>>>> 0: jdbc:drill:zk=es05:2181> select count(*) FROM root.`sample_201501.dat`;
>>>>> +------------+
>>>>> | EXPR$0     |
>>>>> +------------+
>>>>> | 320843454  |
>>>>> +------------+
>>>>> 1 row selected (384.665 seconds)
>>>>> ~~~
>>>>>
>>>>>
>>>>> It represents one month of data, 4 key columns and 38 numeric measure
>>>>> columns, which could also be partitioned daily. The test here was to
>>>>> create monthly Parquet files to see how the min/max stats on Parquet
>>>>> chunks help with range select performance.
>>>>>
>>>>> Instead of a small number of large monthly RDBMS tables, I am
>>>>> attempting to determine how many Parquet files should be used with
>>>>> Drill / HDFS.
>>>>>
>>>>>
>>>>> On 27 May 2015, at 15:17, Matt wrote:
>>>>>
>>>>>> Attempting to create a Parquet-backed table with a CTAS from a 44GB
>>>>>> tab-delimited file in HDFS. The process seemed to be running, as CPU
>>>>>> and IO activity was seen on all 4 nodes in this cluster, and .parquet
>>>>>> files were being created in the expected path.
>>>>>>
>>>>>> However, in the last two hours or so, all nodes show near zero CPU or
>>>>>> IO, and the Last Modified dates on the .parquet files have not
>>>>>> changed. The same time delay is shown in the Last Progress column in
>>>>>> the active fragment profile.
>>>>>>
>>>>>> What approach can I take to determine what is happening (or not)?
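
For anyone reading this thread later: the per-fragment counts Matt quotes can be cross-checked with a few lines of Python. This is only a sketch. The counts and the source row total are copied from the messages above, and the check for a missing fragment assumes that minor fragment ids run contiguously from 0 up to the highest id shown, which the thread does not confirm.

```python
# Records written per fragment, copied from the CTAS output quoted in the thread.
records_written = {
    "1_20": 13568824, "1_15": 12411822, "1_7": 12470329, "1_12": 13693867,
    "1_5": 13292136, "1_18": 13874321, "1_16": 13303094, "1_9": 13639049,
    "1_10": 13698380, "1_22": 13501073, "1_8": 13533736, "1_2": 13549402,
    "1_21": 13665183, "1_0": 13544745, "1_4": 13532957, "1_19": 12767473,
    "1_17": 13670687, "1_13": 13469515, "1_23": 12517632, "1_6": 13634338,
    "1_14": 13611322, "1_3": 13061900, "1_11": 12760978,
}

# Row count of the source file, from the count(*) query quoted in the thread.
SOURCE_ROWS = 320843454

total_written = sum(records_written.values())
shortfall = SOURCE_ROWS - total_written

# Assumption: minor fragment ids are contiguous from 0 to the highest id seen.
seen_ids = {int(fragment.split("_")[1]) for fragment in records_written}
missing = sorted(set(range(max(seen_ids) + 1)) - seen_ids)

print(f"written:   {total_written:,}")
print(f"shortfall: {shortfall:,}")
print(f"fragment ids not reported: {missing}")
```

Under that assumption, the output lists 23 of 24 fragments, and the shortfall is on the order of a single fragment's output. That may be a useful place to start looking in the query profile, but it is a hint rather than a diagnosis, since the fragments that did report were cancelled and may not have finished either.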