I have another test case: a query against a table, filtered on a range of dates and a customer key, that SUMs 38 columns. The returned record set includes all 42 columns in the table - not a good design for Parquet files or any RDBMS, but a modeling problem that is not yet fully in my control (the application needs some changes).

Simply selecting all the columns from the Parquet files with that filter returns data to the client in about 3 seconds, but SUMming all 38 measure columns left the query still running at the client 22 hours later.
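Schematically, the query has roughly this shape (column names, paths, and literals here are illustrative placeholders, not the real schema):

~~~
SELECT date_key,
       customer_key,
       SUM(bytes_1250) AS bytes_1250   -- repeated for each of the 38 measure columns
FROM dfs.`/data/parquet/usage`
WHERE date_key BETWEEN DATE '2015-01-01' AND DATE '2015-01-31'
  AND customer_key = 12345
GROUP BY date_key, customer_key;
~~~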
However, the query profile shows no fragments with a Max Runtime of more than 2h20m, much like the "stuck CTAS" I had before. Learning from that case, I looked at the node hosting the one fragment that did not finish. Could this be a communication failure between nodes that is not being signaled to the client?
~~~
Major Fragment: 02-xx-xx

Minor Fragment ID | Host Name | Start  | End   | Runtime | Max Records | Max Batches | Last Update | Last Progress | Peak Memory | State
02-00-xx          | es06      | 1.011s | 2h20m | 2h20m   | 0           | 1           | 02:35:43    | 02:35:43      | 2MB         | CANCELLED
02-01-xx          | es08      | 0.999s | 4m33s | 4m32s   | 0           | 1           | 01:19:52    | 01:19:52      | 2MB         | FINISHED
02-02-xx          | es07      | 1.010s | 2m16s | 2m15s   | 0           | 1           | 01:17:34    | 01:17:34      | 2MB         | FINISHED
02-03-xx          | es05      | 1.009s | 2m56s | 2m55s   | 0           | 1           | 01:18:14    | 01:18:14      | 2MB         | FINISHED
~~~
~~~
2015-05-29 05:23:07,822 [UserServer-1] INFO o.a.drill.exec.work.foreman.Foreman - Failure while trying communicate query result to initiating client. This would happen if a client is disconnected before response notice can be sent.
org.apache.drill.exec.rpc.ChannelClosedException: null
    at org.apache.drill.exec.rpc.CoordinationQueue$RpcListener.operationComplete(CoordinationQueue.java:89) [drill-java-exec-1.0.0-rebuffed.jar:1.0.0]
    at org.apache.drill.exec.rpc.CoordinationQueue$RpcListener.operationComplete(CoordinationQueue.java:67) [drill-java-exec-1.0.0-rebuffed.jar:1.0.0]
    at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680) [netty-common-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603) [netty-common-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563) [netty-common-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:424) [netty-common-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.channel.AbstractChannel$AbstractUnsafe.safeSetFailure(AbstractChannel.java:788) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.channel.AbstractChannel$AbstractUnsafe.write(AbstractChannel.java:689) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.channel.DefaultChannelPipeline$HeadContext.write(DefaultChannelPipeline.java:1114) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:705) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.channel.AbstractChannelHandlerContext.access$1900(AbstractChannelHandlerContext.java:32) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.write(AbstractChannelHandlerContext.java:980) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write(AbstractChannelHandlerContext.java:1032) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run(AbstractChannelHandlerContext.java:965) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357) [netty-common-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:254) [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111) [netty-common-4.0.27.Final.jar:4.0.27.Final]
    at java.lang.Thread.run(Thread.java:745) [na:1.7.0_79]
2015-05-29 05:23:07,822 [UserServer-1] INFO o.a.drill.exec.work.foreman.Foreman - State change requested. CANCELED --> FAILED
org.apache.drill.exec.rpc.ChannelClosedException: null
    at org.apache.drill.exec.rpc.CoordinationQueue$RpcListener.operationComplete(CoordinationQueue.java:89) [drill-java-exec-1.0.0-rebuffed.jar:1.0.0]
    at org.apache.drill.exec.rpc.CoordinationQueue$RpcListener.operationComplete(CoordinationQueue.java:67) [drill-java-exec-1.0.0-rebuffed.jar:1.0.0]
    at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680) [netty-common-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603) [netty-common-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563) [netty-common-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:424) [netty-common-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.channel.AbstractChannel$AbstractUnsafe.safeSetFailure(AbstractChannel.java:788) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.channel.AbstractChannel$AbstractUnsafe.write(AbstractChannel.java:689) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.channel.DefaultChannelPipeline$HeadContext.write(DefaultChannelPipeline.java:1114) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:705) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.channel.AbstractChannelHandlerContext.access$1900(AbstractChannelHandlerContext.java:32) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.write(AbstractChannelHandlerContext.java:980) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write(AbstractChannelHandlerContext.java:1032) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run(AbstractChannelHandlerContext.java:965) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357) [netty-common-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:254) [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111) [netty-common-4.0.27.Final.jar:4.0.27.Final]
    at java.lang.Thread.run(Thread.java:745) [na:1.7.0_79]
~~~
On 28 May 2015, at 16:43, Mehant Baid wrote:
I think the problem might be related to a single laggard; it looks like we are waiting for one minor fragment to complete. Based on the output you provided, it looks like fragment 1_1 hasn't completed. You might want to find out where that fragment was scheduled and what is going on on that node. It might also be useful to look at the profile for that minor fragment to see how much data has been processed.
Thanks
Mehant
On 5/28/15 10:57 AM, Matt wrote:
Did you check the log files for any errors?
No messages related to this query contain errors or warnings, nor anything mentioning memory or heap. Querying now to determine what is missing in the Parquet destination.
drillbit.out on the master shows no error messages, and what looks
like the last relevant line is:
~~~
May 27, 2015 6:43:50 PM INFO:
parquet.hadoop.ColumnChunkPageWriteStore: written 2,258,263B for
[bytes_1250] INT64: 3,069,414 values, 24,555,504B raw, 2,257,112B
comp, 24 pages, encodings: [RLE, PLAIN, BIT_PACKED]
May 27, 2015 6:43:51 PM INFO: parquet.haMay 28, 2015 5:13:42 PM
org.apache.calcite.sql.validate.SqlValidatorException <init>
~~~
The final lines in drillbit.log (which appears to use a different time format) that contain the profile ID:
~~~
2015-05-27 18:39:49,980
[2a9a10ec-6f96-5dc5-54fc-dc5295a77e42:frag:1:20] INFO
o.a.d.e.w.fragment.FragmentExecutor -
2a9a10ec-6f96-5dc5-54fc-dc5295a77e42:1:20: State change requested
from RUNNING --> FINISHED for
2015-05-27 18:39:49,981
[2a9a10ec-6f96-5dc5-54fc-dc5295a77e42:frag:1:20] INFO
o.a.d.e.w.f.AbstractStatusReporter - State changed for
2a9a10ec-6f96-5dc5-54fc-dc5295a77e42:1:20. New state: FINISHED
2015-05-27 18:40:05,650
[2a9a10ec-6f96-5dc5-54fc-dc5295a77e42:frag:1:12] INFO
o.a.d.e.w.fragment.FragmentExecutor -
2a9a10ec-6f96-5dc5-54fc-dc5295a77e42:1:12: State change requested
from RUNNING --> FINISHED for
2015-05-27 18:40:05,650
[2a9a10ec-6f96-5dc5-54fc-dc5295a77e42:frag:1:12] INFO
o.a.d.e.w.f.AbstractStatusReporter - State changed for
2a9a10ec-6f96-5dc5-54fc-dc5295a77e42:1:12. New state: FINISHED
2015-05-27 18:41:57,444
[2a9a10ec-6f96-5dc5-54fc-dc5295a77e42:frag:1:16] INFO
o.a.d.e.w.fragment.FragmentExecutor -
2a9a10ec-6f96-5dc5-54fc-dc5295a77e42:1:16: State change requested
from RUNNING --> FINISHED for
2015-05-27 18:41:57,444
[2a9a10ec-6f96-5dc5-54fc-dc5295a77e42:frag:1:16] INFO
o.a.d.e.w.f.AbstractStatusReporter - State changed for
2a9a10ec-6f96-5dc5-54fc-dc5295a77e42:1:16. New state: FINISHED
2015-05-27 18:43:25,005
[2a9a10ec-6f96-5dc5-54fc-dc5295a77e42:frag:1:8] INFO
o.a.d.e.w.fragment.FragmentExecutor -
2a9a10ec-6f96-5dc5-54fc-dc5295a77e42:1:8: State change requested from
RUNNING --> FINISHED for
2015-05-27 18:43:25,005
[2a9a10ec-6f96-5dc5-54fc-dc5295a77e42:frag:1:8] INFO
o.a.d.e.w.f.AbstractStatusReporter - State changed for
2a9a10ec-6f96-5dc5-54fc-dc5295a77e42:1:8. New state: FINISHED
2015-05-27 18:43:54,539
[2a9a10ec-6f96-5dc5-54fc-dc5295a77e42:frag:1:0] INFO
o.a.d.e.w.fragment.FragmentExecutor -
2a9a10ec-6f96-5dc5-54fc-dc5295a77e42:1:0: State change requested from
RUNNING --> FINISHED for
2015-05-27 18:43:54,540
[2a9a10ec-6f96-5dc5-54fc-dc5295a77e42:frag:1:0] INFO
o.a.d.e.w.f.AbstractStatusReporter - State changed for
2a9a10ec-6f96-5dc5-54fc-dc5295a77e42:1:0. New state: FINISHED
2015-05-27 18:43:59,947
[2a9a10ec-6f96-5dc5-54fc-dc5295a77e42:frag:1:4] INFO
o.a.d.e.w.fragment.FragmentExecutor -
2a9a10ec-6f96-5dc5-54fc-dc5295a77e42:1:4: State change requested from
RUNNING --> FINISHED for
2015-05-27 18:43:59,947
[2a9a10ec-6f96-5dc5-54fc-dc5295a77e42:frag:1:4] INFO
o.a.d.e.w.f.AbstractStatusReporter - State changed for
2a9a10ec-6f96-5dc5-54fc-dc5295a77e42:1:4. New state: FINISHED
~~~
On 28 May 2015, at 13:42, Andries Engelbrecht wrote:
It should execute multi-threaded; I need to check on text files.
Did you check the log files for any errors?
On May 28, 2015, at 10:36 AM, Matt <bsg...@gmail.com> wrote:
The time seems pretty long for that file size. What type of file
is it?
Tab-delimited UTF-8 text.
I left the query to run overnight to see if it would complete, but
24 hours for an import like this would indeed be too long.
Is the CTAS running single threaded?
In the first hour, with this being the only client connected to the
cluster, I observed activity on all 4 nodes.
Is multi-threaded query execution the default? I would not have deliberately changed anything to force single-threaded execution.
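If it helps, one quick way to verify the parallelism settings from sqlline (assuming the standard option names) would be:

~~~
SELECT * FROM sys.options WHERE name LIKE 'planner.width%';
~~~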
On 28 May 2015, at 13:06, Andries Engelbrecht wrote:
The time seems pretty long for that file size. What type of file
is it?
Is the CTAS running single threaded?
—Andries
On May 28, 2015, at 9:37 AM, Matt <bsg...@gmail.com> wrote:
How large is the data set you are working with, and your
cluster/nodes?
Just testing with that single 44GB source file currently, and my test cluster consists of 4 nodes, each with 8 CPU cores, 32GB RAM, and a 6TB Ext4 volume (RAID-10).

Drill defaults left as they come in v1.0. I will be adjusting memory and retrying the CTAS.

I know I can / should assign individual disks to HDFS, but as a test cluster there are apps that expect data volumes to work on. A dedicated Hadoop production cluster would have a disk layout specific to the task.
On 28 May 2015, at 12:26, Andries Engelbrecht wrote:
Just check the drillbit.log and drillbit.out files in the log
directory.
Before adjusting memory, see if that is an issue first. It was
for me, but as Jason mentioned there can be other causes as
well.
You adjust memory allocation in the drill-env.sh files and then have to restart the drillbits.
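Before editing those files, a quick way to confirm the current per-drillbit heap and direct memory limits (if your build exposes the sys.memory system table) is:

~~~
SELECT * FROM sys.memory;
~~~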
How large is the data set you are working with, and your
cluster/nodes?
—Andries
On May 28, 2015, at 9:17 AM, Matt <bsg...@gmail.com> wrote:
To make sure I am adjusting the correct config: these are the heap parameters within the Drill config path, not for Hadoop or ZooKeeper?
On May 28, 2015, at 12:08 PM, Jason Altekruse
<altekruseja...@gmail.com> wrote:
There should be no upper limit on the size of the tables you can create with Drill. Be advised that Drill currently operates entirely optimistically with regard to available resources: if a network connection between two drillbits fails during a query, we will not currently re-schedule the work to make use of the remaining nodes and network connections that are still live. While we have had a good amount of success using Drill for data conversion, be aware that these conditions could cause long-running queries to fail.

That being said, it isn't the only possible cause for such a failure. In the case of a network failure we would expect to see a message returned to you that part of the query was unsuccessful and that it had been cancelled. Andries has a good suggestion regarding checking the heap memory; this should also be detected and reported back to you at the CLI, but we may be failing to propagate the error back to the head node for the query. I believe writing Parquet may still be the most heap-intensive operation in Drill, despite our efforts to refactor the write path to use direct memory instead of on-heap for the large buffers needed in the process of creating Parquet files.
On Thu, May 28, 2015 at 8:43 AM, Matt <bsg...@gmail.com>
wrote:
Is 300MM records too much to do in a single CTAS statement?
After almost 23 hours I killed the query (^c) and it
returned:
~~~
+-----------+----------------------------+
| Fragment | Number of records written |
+-----------+----------------------------+
| 1_20 | 13568824 |
| 1_15 | 12411822 |
| 1_7 | 12470329 |
| 1_12 | 13693867 |
| 1_5 | 13292136 |
| 1_18 | 13874321 |
| 1_16 | 13303094 |
| 1_9 | 13639049 |
| 1_10 | 13698380 |
| 1_22 | 13501073 |
| 1_8 | 13533736 |
| 1_2 | 13549402 |
| 1_21 | 13665183 |
| 1_0 | 13544745 |
| 1_4 | 13532957 |
| 1_19 | 12767473 |
| 1_17 | 13670687 |
| 1_13 | 13469515 |
| 1_23 | 12517632 |
| 1_6 | 13634338 |
| 1_14 | 13611322 |
| 1_3 | 13061900 |
| 1_11 | 12760978 |
+-----------+----------------------------+
23 rows selected (82294.854 seconds)
~~~
The sum of those record counts is 306,772,763, which is close to the 320,843,454 in the source file (the difference is roughly one fragment's worth of output):
~~~
0: jdbc:drill:zk=es05:2181> select count(*) FROM
root.`sample_201501.dat`;
+------------+
| EXPR$0 |
+------------+
| 320843454 |
+------------+
1 row selected (384.665 seconds)
~~~
It represents one month of data (4 key columns and 38 numeric measure columns), which could also be partitioned daily. The test here was to create monthly Parquet files to see how the min/max stats on Parquet chunks help with range select performance.

Instead of a small number of large monthly RDBMS tables, I am attempting to determine how many Parquet files should be used with Drill / HDFS.
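For context, the CTAS being tested is roughly of this shape (the target workspace and column names are illustrative; the real statement lists all 42 columns):

~~~
CREATE TABLE dfs.tmp.`sample_201501` AS
SELECT CAST(columns[0] AS DATE)   AS date_key,      -- one of the 4 key columns
       CAST(columns[1] AS BIGINT) AS customer_key,
       CAST(columns[4] AS BIGINT) AS bytes_1250     -- one of the 38 measure columns
FROM root.`sample_201501.dat`;
~~~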
On 27 May 2015, at 15:17, Matt wrote:
Attempting to create a Parquet-backed table with a CTAS from a 44GB tab-delimited file in HDFS. The process seemed to be running, as CPU and IO were seen on all 4 nodes in this cluster, and .parquet files were being created in the expected path.

In the last two hours or so, however, all nodes show near zero CPU or IO, and the Last Modified dates on the .parquet files have not changed. The same time delay is shown in the Last Progress column in the active fragment profile.

What approach can I take to determine what is happening (or not)?