[
https://issues.apache.org/jira/browse/DRILL-1162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16078790#comment-16078790
]
Rahul Challapalli commented on DRILL-1162:
------------------------------------------
Eventually the drillbit crashed. The logs contained many of these messages:
{code}
java.util.concurrent.ExecutionException: java.net.ConnectException: Connection
refused: qa-node183.qa.lab/10.10.100.183:31012
at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:47)
~[netty-common-4.0.27.Final.jar:4.0.27.Final]
at
org.apache.drill.exec.rpc.BasicClient$ConnectionMultiListener$ConnectionHandler.operationComplete(BasicClient.java:225)
[drill-rpc-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
at
org.apache.drill.exec.rpc.BasicClient$ConnectionMultiListener$ConnectionHandler.operationComplete(BasicClient.java:212)
[drill-rpc-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
at
io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
[netty-common-4.0.27.Final.jar:4.0.27.Final]
at
io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603)
[netty-common-4.0.27.Final.jar:4.0.27.Final]
at
io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563)
[netty-common-4.0.27.Final.jar:4.0.27.Final]
at
io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:424)
[netty-common-4.0.27.Final.jar:4.0.27.Final]
at
io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.fulfillConnectPromise(AbstractNioChannel.java:268)
[netty-transport-4.0.27.Final.jar:4.0.27.Final]
at
io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:284)
[netty-transport-4.0.27.Final.jar:4.0.27.Final]
at
io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
[netty-transport-4.0.27.Final.jar:4.0.27.Final]
at
io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
[netty-transport-4.0.27.Final.jar:4.0.27.Final]
at
io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
[netty-transport-4.0.27.Final.jar:4.0.27.Final]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
[netty-transport-4.0.27.Final.jar:4.0.27.Final]
at
io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
[netty-common-4.0.27.Final.jar:4.0.27.Final]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_92]
Caused by: java.net.ConnectException: Connection refused:
qa-node183.qa.lab/10.10.100.183:31012
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
~[na:1.8.0_92]
at
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
~[na:1.8.0_92]
at
io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:224)
~[netty-transport-4.0.27.Final.jar:4.0.27.Final]
at
io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:281)
[netty-transport-4.0.27.Final.jar:4.0.27.Final]
... 6 common frames omitted
{code}
Before the drillbit crashed, the log also contained:
{code}
java.lang.InterruptedException: null
at
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:998)
~[na:1.8.0_92]
at
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
~[na:1.8.0_92]
at java.util.concurrent.Semaphore.acquire(Semaphore.java:467)
~[na:1.8.0_92]
at
org.apache.drill.exec.ops.SendingAccountor.waitForSendComplete(SendingAccountor.java:48)
~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
at
org.apache.drill.exec.ops.FragmentContext.waitForSendComplete(FragmentContext.java:486)
[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
at
org.apache.drill.exec.physical.impl.BaseRootExec.close(BaseRootExec.java:134)
[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
at
org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources(FragmentExecutor.java:313)
[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
at
org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:155)
[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
at
org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:264)
[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
at
org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
[drill-common-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
[na:1.8.0_92]
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
[na:1.8.0_92]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_92]
{code}
And drillbit.out contained the following exception:
{code}
java.lang.OutOfMemoryError: Java heap space
Catastrophic failure occurred. Exiting. Information follows: Unable to handle
out of memory condition in FragmentExecutor.
java.lang.OutOfMemoryError: Java heap space
Jun 27, 2017 10:34:58 AM WARNING: org.apache.parquet.CorruptStatistics:
Ignoring statistics because created_by could not be parsed (see PARQUET-251):
parquet-mr
org.apache.parquet.VersionParser$VersionParseException: Could not parse
created_by: parquet-mr using format: (.+) version ((.*) )?\(build ?(.*)\)
at org.apache.parquet.VersionParser.parse(VersionParser.java:112)
at
org.apache.parquet.CorruptStatistics.shouldIgnoreStatistics(CorruptStatistics.java:66)
at
org.apache.parquet.format.converter.ParquetMetadataConverter.fromParquetStatistics(ParquetMetadataConverter.java:264)
at
org.apache.parquet.format.converter.ParquetMetadataConverter.fromParquetMetadata(ParquetMetadataConverter.java:568)
at
org.apache.parquet.format.converter.ParquetMetadataConverter.readParquetMetadata(ParquetMetadataConverter.java:545)
at
org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:455)
at
org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:412)
at
org.apache.drill.exec.store.parquet.Metadata.getParquetFileMetadata_v3(Metadata.java:387)
at
org.apache.drill.exec.store.parquet.Metadata.access$100(Metadata.java:81)
at
org.apache.drill.exec.store.parquet.Metadata$MetadataGatherer.runInner(Metadata.java:326)
at
org.apache.drill.exec.store.parquet.Metadata$MetadataGatherer.runInner(Metadata.java:314)
at org.apache.drill.exec.store.TimedRunnable.run(TimedRunnable.java:56)
at org.apache.drill.exec.store.TimedRunnable.run(TimedRunnable.java:122)
at
org.apache.drill.exec.store.parquet.Metadata.getParquetFileMetadata_v3(Metadata.java:288)
at
org.apache.drill.exec.store.parquet.Metadata.createMetaFilesRecursively(Metadata.java:198)
at
org.apache.drill.exec.store.parquet.Metadata.createMetaFilesRecursively(Metadata.java:184)
at
org.apache.drill.exec.store.parquet.Metadata.createMetaFilesRecursively(Metadata.java:184)
at
org.apache.drill.exec.store.parquet.Metadata.createMeta(Metadata.java:103)
at
org.apache.drill.exec.planner.sql.handlers.RefreshMetadataHandler.getPlan(RefreshMetadataHandler.java:116)
at
org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan(DrillSqlWorker.java:131)
at
org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:79)
at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:1050)
at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:280)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{code}
> 25 way join ended up in 0 results which is not expected
> -------------------------------------------------------
>
> Key: DRILL-1162
> URL: https://issues.apache.org/jira/browse/DRILL-1162
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Flow, Query Planning & Optimization
> Reporter: Rahul Challapalli
> Assignee: Chris Westin
> Priority: Critical
> Fix For: Future
>
> Attachments: error.log, oom_error.log
>
>
> git.commit.id.abbrev=e5c2da0
> The query below returns 0 results:
> select count(*) from `lineitem1.parquet` a
> inner join `part.parquet` j on a.l_partkey = j.p_partkey
> inner join `orders.parquet` k on a.l_orderkey = k.o_orderkey
> inner join `supplier.parquet` l on a.l_suppkey = l.s_suppkey
> inner join `partsupp.parquet` m on j.p_partkey = m.ps_partkey and l.s_suppkey = m.ps_suppkey
> inner join `customer.parquet` n on k.o_custkey = n.c_custkey
> inner join `lineitem2.parquet` b on a.l_orderkey = b.l_orderkey
> inner join `lineitem2.parquet` c on a.l_partkey = c.l_partkey
> inner join `lineitem2.parquet` d on a.l_suppkey = d.l_suppkey
> inner join `lineitem2.parquet` e on a.l_extendedprice = e.l_extendedprice
> inner join `lineitem2.parquet` f on a.l_comment = f.l_comment
> inner join `lineitem2.parquet` g on a.l_shipdate = g.l_shipdate
> inner join `lineitem2.parquet` h on a.l_commitdate = h.l_commitdate
> inner join `lineitem2.parquet` i on a.l_receiptdate = i.l_receiptdate
> inner join `lineitem2.parquet` o on a.l_receiptdate = o.l_receiptdate
> inner join `lineitem2.parquet` p on a.l_receiptdate = p.l_receiptdate
> inner join `lineitem2.parquet` q on a.l_receiptdate = q.l_receiptdate
> inner join `lineitem2.parquet` r on a.l_receiptdate = r.l_receiptdate
> inner join `lineitem2.parquet` s on a.l_receiptdate = s.l_receiptdate
> inner join `lineitem2.parquet` t on a.l_receiptdate = t.l_receiptdate
> inner join `lineitem2.parquet` u on a.l_receiptdate = u.l_receiptdate
> inner join `lineitem2.parquet` v on a.l_receiptdate = v.l_receiptdate
> inner join `lineitem2.parquet` w on a.l_receiptdate = w.l_receiptdate
> inner join `lineitem2.parquet` x on a.l_receiptdate = x.l_receiptdate;
> However, when we remove the last 'inner join' and run the query, it returns
> '716372534'. Since the last inner join is similar to the ones before it, it
> should match some records and return the data appropriately.
> The logs indicated that it actually returned 0 results. The log file is attached.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)