Robert Hou created DRILL-6566: --------------------------------- Summary: Jenkins Regression: TPCDS query 66 fails with RESOURCE ERROR: One or more nodes ran out of memory while executing the query. AGGR OOM at First Phase. Key: DRILL-6566 URL: https://issues.apache.org/jira/browse/DRILL-6566 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.14.0 Reporter: Robert Hou Assignee: Pritesh Maker Fix For: 1.14.0
This is TPCDS Query 66. Query: /root/drillAutomation/framework-master/framework/resources/Advanced/tpcds/tpcds_sf1/hive-generated-parquet/hive1_native/query66.sql SELECT w_warehouse_name, w_warehouse_sq_ft, w_city, w_county, w_state, w_country, ship_carriers, year1, Sum(jan_sales) AS jan_sales, Sum(feb_sales) AS feb_sales, Sum(mar_sales) AS mar_sales, Sum(apr_sales) AS apr_sales, Sum(may_sales) AS may_sales, Sum(jun_sales) AS jun_sales, Sum(jul_sales) AS jul_sales, Sum(aug_sales) AS aug_sales, Sum(sep_sales) AS sep_sales, Sum(oct_sales) AS oct_sales, Sum(nov_sales) AS nov_sales, Sum(dec_sales) AS dec_sales, Sum(jan_sales / w_warehouse_sq_ft) AS jan_sales_per_sq_foot, Sum(feb_sales / w_warehouse_sq_ft) AS feb_sales_per_sq_foot, Sum(mar_sales / w_warehouse_sq_ft) AS mar_sales_per_sq_foot, Sum(apr_sales / w_warehouse_sq_ft) AS apr_sales_per_sq_foot, Sum(may_sales / w_warehouse_sq_ft) AS may_sales_per_sq_foot, Sum(jun_sales / w_warehouse_sq_ft) AS jun_sales_per_sq_foot, Sum(jul_sales / w_warehouse_sq_ft) AS jul_sales_per_sq_foot, Sum(aug_sales / w_warehouse_sq_ft) AS aug_sales_per_sq_foot, Sum(sep_sales / w_warehouse_sq_ft) AS sep_sales_per_sq_foot, Sum(oct_sales / w_warehouse_sq_ft) AS oct_sales_per_sq_foot, Sum(nov_sales / w_warehouse_sq_ft) AS nov_sales_per_sq_foot, Sum(dec_sales / w_warehouse_sq_ft) AS dec_sales_per_sq_foot, Sum(jan_net) AS jan_net, Sum(feb_net) AS feb_net, Sum(mar_net) AS mar_net, Sum(apr_net) AS apr_net, Sum(may_net) AS may_net, Sum(jun_net) AS jun_net, Sum(jul_net) AS jul_net, Sum(aug_net) AS aug_net, Sum(sep_net) AS sep_net, Sum(oct_net) AS oct_net, Sum(nov_net) AS nov_net, Sum(dec_net) AS dec_net FROM (SELECT w_warehouse_name, w_warehouse_sq_ft, w_city, w_county, w_state, w_country, 'ZOUROS' || ',' || 'ZHOU' AS ship_carriers, d_year AS year1, Sum(CASE WHEN d_moy = 1 THEN ws_ext_sales_price * ws_quantity ELSE 0 END) AS jan_sales, Sum(CASE WHEN d_moy = 2 THEN ws_ext_sales_price * ws_quantity ELSE 0 END) AS feb_sales, Sum(CASE WHEN d_moy = 3 THEN ws_ext_sales_price * ws_quantity ELSE 0 END) AS mar_sales, Sum(CASE WHEN d_moy = 4 THEN ws_ext_sales_price * ws_quantity ELSE 0 END) AS apr_sales, Sum(CASE WHEN d_moy = 5 THEN ws_ext_sales_price * ws_quantity ELSE 0 END) AS may_sales, Sum(CASE WHEN d_moy = 6 THEN ws_ext_sales_price * ws_quantity ELSE 0 END) AS jun_sales, Sum(CASE WHEN d_moy = 7 THEN ws_ext_sales_price * ws_quantity ELSE 0 END) AS jul_sales, Sum(CASE WHEN d_moy = 8 THEN ws_ext_sales_price * ws_quantity ELSE 0 END) AS aug_sales, Sum(CASE WHEN d_moy = 9 THEN ws_ext_sales_price * ws_quantity ELSE 0 END) AS sep_sales, Sum(CASE WHEN d_moy = 10 THEN ws_ext_sales_price * ws_quantity ELSE 0 END) AS oct_sales, Sum(CASE WHEN d_moy = 11 THEN ws_ext_sales_price * ws_quantity ELSE 0 END) AS nov_sales, Sum(CASE WHEN d_moy = 12 THEN ws_ext_sales_price * ws_quantity ELSE 0 END) AS dec_sales, Sum(CASE WHEN d_moy = 1 THEN ws_net_paid_inc_ship * ws_quantity ELSE 0 END) AS jan_net, Sum(CASE WHEN d_moy = 2 THEN ws_net_paid_inc_ship * ws_quantity ELSE 0 END) AS feb_net, Sum(CASE WHEN d_moy = 3 THEN ws_net_paid_inc_ship * ws_quantity ELSE 0 END) AS mar_net, Sum(CASE WHEN d_moy = 4 THEN ws_net_paid_inc_ship * ws_quantity ELSE 0 END) AS apr_net, Sum(CASE WHEN d_moy = 5 THEN ws_net_paid_inc_ship * ws_quantity ELSE 0 END) AS may_net, Sum(CASE WHEN d_moy = 6 THEN ws_net_paid_inc_ship * ws_quantity ELSE 0 END) AS jun_net, Sum(CASE WHEN d_moy = 7 THEN ws_net_paid_inc_ship * ws_quantity ELSE 0 END) AS jul_net, Sum(CASE WHEN d_moy = 8 THEN ws_net_paid_inc_ship * ws_quantity ELSE 0 END) AS aug_net, Sum(CASE WHEN d_moy = 9 THEN ws_net_paid_inc_ship * ws_quantity ELSE 0 END) AS sep_net, Sum(CASE WHEN d_moy = 10 THEN ws_net_paid_inc_ship * ws_quantity ELSE 0 END) AS oct_net, Sum(CASE WHEN d_moy = 11 THEN ws_net_paid_inc_ship * ws_quantity ELSE 0 END) AS nov_net, Sum(CASE WHEN d_moy = 12 THEN ws_net_paid_inc_ship * ws_quantity ELSE 0 END) AS dec_net FROM web_sales, warehouse, date_dim, time_dim, ship_mode WHERE ws_warehouse_sk = w_warehouse_sk AND ws_sold_date_sk = d_date_sk AND ws_sold_time_sk = t_time_sk AND ws_ship_mode_sk = sm_ship_mode_sk AND d_year = 1998 AND t_time BETWEEN 7249 AND 7249 + 28800 AND sm_carrier IN ( 'ZOUROS', 'ZHOU' ) GROUP BY w_warehouse_name, w_warehouse_sq_ft, w_city, w_county, w_state, w_country, d_year UNION ALL SELECT w_warehouse_name, w_warehouse_sq_ft, w_city, w_county, w_state, w_country, 'ZOUROS' || ',' || 'ZHOU' AS ship_carriers, d_year AS year1, Sum(CASE WHEN d_moy = 1 THEN cs_ext_sales_price * cs_quantity ELSE 0 END) AS jan_sales, Sum(CASE WHEN d_moy = 2 THEN cs_ext_sales_price * cs_quantity ELSE 0 END) AS feb_sales, Sum(CASE WHEN d_moy = 3 THEN cs_ext_sales_price * cs_quantity ELSE 0 END) AS mar_sales, Sum(CASE WHEN d_moy = 4 THEN cs_ext_sales_price * cs_quantity ELSE 0 END) AS apr_sales, Sum(CASE WHEN d_moy = 5 THEN cs_ext_sales_price * cs_quantity ELSE 0 END) AS may_sales, Sum(CASE WHEN d_moy = 6 THEN cs_ext_sales_price * cs_quantity ELSE 0 END) AS jun_sales, Sum(CASE WHEN d_moy = 7 THEN cs_ext_sales_price * cs_quantity ELSE 0 END) AS jul_sales, Sum(CASE WHEN d_moy = 8 THEN cs_ext_sales_price * cs_quantity ELSE 0 END) AS aug_sales, Sum(CASE WHEN d_moy = 9 THEN cs_ext_sales_price * cs_quantity ELSE 0 END) AS sep_sales, Sum(CASE WHEN d_moy = 10 THEN cs_ext_sales_price * cs_quantity ELSE 0 END) AS oct_sales, Sum(CASE WHEN d_moy = 11 THEN cs_ext_sales_price * cs_quantity ELSE 0 END) AS nov_sales, Sum(CASE WHEN d_moy = 12 THEN cs_ext_sales_price * cs_quantity ELSE 0 END) AS dec_sales, Sum(CASE WHEN d_moy = 1 THEN cs_net_paid * cs_quantity ELSE 0 END) AS jan_net, Sum(CASE WHEN d_moy = 2 THEN cs_net_paid * cs_quantity ELSE 0 END) AS feb_net, Sum(CASE WHEN d_moy = 3 THEN cs_net_paid * cs_quantity ELSE 0 END) AS mar_net, Sum(CASE WHEN d_moy = 4 THEN cs_net_paid * cs_quantity ELSE 0 END) AS apr_net, Sum(CASE WHEN d_moy = 5 THEN cs_net_paid * cs_quantity ELSE 0 END) AS may_net, Sum(CASE WHEN d_moy = 6 THEN cs_net_paid * cs_quantity ELSE 0 END) AS jun_net, Sum(CASE WHEN d_moy = 7 THEN cs_net_paid * cs_quantity ELSE 0 END) AS jul_net, Sum(CASE WHEN d_moy = 8 THEN cs_net_paid * cs_quantity ELSE 0 END) AS aug_net, Sum(CASE WHEN d_moy = 9 THEN cs_net_paid * cs_quantity ELSE 0 END) AS sep_net, Sum(CASE WHEN d_moy = 10 THEN cs_net_paid * cs_quantity ELSE 0 END) AS oct_net, Sum(CASE WHEN d_moy = 11 THEN cs_net_paid * cs_quantity ELSE 0 END) AS nov_net, Sum(CASE WHEN d_moy = 12 THEN cs_net_paid * cs_quantity ELSE 0 END) AS dec_net FROM catalog_sales, warehouse, date_dim, time_dim, ship_mode WHERE cs_warehouse_sk = w_warehouse_sk AND cs_sold_date_sk = d_date_sk AND cs_sold_time_sk = t_time_sk AND cs_ship_mode_sk = sm_ship_mode_sk AND d_year = 1998 AND t_time BETWEEN 7249 AND 7249 + 28800 AND sm_carrier IN ( 'ZOUROS', 'ZHOU' ) GROUP BY w_warehouse_name, w_warehouse_sq_ft, w_city, w_county, w_state, w_country, d_year) x GROUP BY w_warehouse_name, w_warehouse_sq_ft, w_city, w_county, w_state, w_country, ship_carriers, year1 ORDER BY w_warehouse_name LIMIT 100; Here is the error message with stack trace: java.sql.SQLException: RESOURCE ERROR: One or more nodes ran out of memory while executing the query. AGGR OOM at First Phase. Partitions: 1. Estimated batch size: 31260672. values size: 25165824. Output alloc size: 25165824. Planned batches: 1 Memory limit: 2302755 so far allocated: 262144. Fragment 6:1 [Error Id: 660fd3f9-bb90-4245-b63c-208fc37c0199 on atsqa6c82.qa.lab:31010] (org.apache.drill.exec.exception.OutOfMemoryException) AGGR OOM at First Phase. Partitions: 1. Estimated batch size: 31260672. values size: 25165824. Output alloc size: 25165824. Planned batches: 1 Memory limit: 2302755 so far allocated: 262144. org.apache.drill.exec.test.generated.HashAggregatorGen15454.spillIfNeeded():1468 org.apache.drill.exec.test.generated.HashAggregatorGen15454.doSpill():1430 org.apache.drill.exec.test.generated.HashAggregatorGen15454.checkGroupAndAggrValues():1353 org.apache.drill.exec.test.generated.HashAggregatorGen15454.doWork():587 org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.innerNext():184 org.apache.drill.exec.record.AbstractRecordBatch.next():172 org.apache.drill.exec.record.AbstractRecordBatch.next():119 org.apache.drill.exec.record.AbstractRecordBatch.next():109 org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63 org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():147 org.apache.drill.exec.record.AbstractRecordBatch.next():172 org.apache.drill.exec.physical.impl.BaseRootExec.next():103 org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext():93 org.apache.drill.exec.physical.impl.BaseRootExec.next():93 org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():294 org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():281 java.security.AccessController.doPrivileged():-2 javax.security.auth.Subject.doAs():422 org.apache.hadoop.security.UserGroupInformation.doAs():1595 org.apache.drill.exec.work.fragment.FragmentExecutor.run():281 org.apache.drill.common.SelfCleaningRunnable.run():38 java.util.concurrent.ThreadPoolExecutor.runWorker():1149 java.util.concurrent.ThreadPoolExecutor$Worker.run():624 java.lang.Thread.run():748 at org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:528) at org.apache.drill.jdbc.impl.DrillCursor.next(DrillCursor.java:632) at oadd.org.apache.calcite.avatica.AvaticaResultSet.next(AvaticaResultSet.java:207) at org.apache.drill.jdbc.impl.DrillResultSetImpl.next(DrillResultSetImpl.java:153) at org.apache.drill.test.framework.DrillTestJdbc.executeQuery(DrillTestJdbc.java:253) at org.apache.drill.test.framework.DrillTestJdbc.run(DrillTestJdbc.java:115) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: oadd.org.apache.drill.common.exceptions.UserRemoteException: RESOURCE ERROR: One or more nodes ran out of memory while executing the query. AGGR OOM at First Phase. Partitions: 1. Estimated batch size: 31260672. values size: 25165824. Output alloc size: 25165824. Planned batches: 1 Memory limit: 2302755 so far allocated: 262144. Fragment 6:1 [Error Id: 660fd3f9-bb90-4245-b63c-208fc37c0199 on atsqa6c82.qa.lab:31010] (org.apache.drill.exec.exception.OutOfMemoryException) AGGR OOM at First Phase. Partitions: 1. Estimated batch size: 31260672. values size: 25165824. Output alloc size: 25165824. Planned batches: 1 Memory limit: 2302755 so far allocated: 262144. org.apache.drill.exec.test.generated.HashAggregatorGen15454.spillIfNeeded():1468 org.apache.drill.exec.test.generated.HashAggregatorGen15454.doSpill():1430 org.apache.drill.exec.test.generated.HashAggregatorGen15454.checkGroupAndAggrValues():1353 org.apache.drill.exec.test.generated.HashAggregatorGen15454.doWork():587 org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.innerNext():184 org.apache.drill.exec.record.AbstractRecordBatch.next():172 org.apache.drill.exec.record.AbstractRecordBatch.next():119 org.apache.drill.exec.record.AbstractRecordBatch.next():109 org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63 org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():147 org.apache.drill.exec.record.AbstractRecordBatch.next():172 org.apache.drill.exec.physical.impl.BaseRootExec.next():103 org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext():93 org.apache.drill.exec.physical.impl.BaseRootExec.next():93 org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():294 org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():281 java.security.AccessController.doPrivileged():-2 javax.security.auth.Subject.doAs():422 org.apache.hadoop.security.UserGroupInformation.doAs():1595 org.apache.drill.exec.work.fragment.FragmentExecutor.run():281 org.apache.drill.common.SelfCleaningRunnable.run():38 java.util.concurrent.ThreadPoolExecutor.runWorker():1149 java.util.concurrent.ThreadPoolExecutor$Worker.run():624 java.lang.Thread.run():748 at oadd.org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:123) at oadd.org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:422) at oadd.org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:96) at oadd.org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:274) at oadd.org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:244) at oadd.io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:88) at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) at oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) at oadd.io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287) at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) at oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) at oadd.io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102) at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) at oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) at oadd.io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:312) at oadd.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:286) at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) at oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) at oadd.io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86) at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) at oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) at oadd.io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294) at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) at oadd.io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911) at oadd.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131) at oadd.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645) at oadd.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580) at oadd.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497) at oadd.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459) at oadd.io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131) ... 1 more Here is the commit id: 1.14.0-SNAPSHOT 140d09e69b65ac2cb1bed09a37fa5861d39a99b3 DRILL-6539: Record count not set for this vector container error 28.06.2018 @ 16:13:20 PDT Unknown 28.06.2018 @ 16:21:42 PDT -- This message was sent by Atlassian JIRA (v7.6.3#76005)