[jira] [Updated] (DRILL-7113) Issue with filtering null values from MapRDB-JSON
[ https://issues.apache.org/jira/browse/DRILL-7113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aman Sinha updated DRILL-7113:
------------------------------
    Labels: ready-to-commit  (was: )

> Issue with filtering null values from MapRDB-JSON
> -------------------------------------------------
>
>                 Key: DRILL-7113
>                 URL: https://issues.apache.org/jira/browse/DRILL-7113
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 1.15.0
>            Reporter: Hanumath Rao Maduri
>            Assignee: Aman Sinha
>            Priority: Major
>              Labels: ready-to-commit
>             Fix For: 1.16.0
>
>
> When Drill queries documents from a MapR-DB JSON table that contain fields with a null value, it returns a wrong result.
> The issue was reproduced locally. Steps to reproduce:
> [1] Create a MapR-DB JSON table, say '/tmp/dmdb2/'.
> [2] Insert the following sample records into the table:
> {code:java}
> insert --table /tmp/dmdb2/ --value '{"_id": "1", "label": "person", "confidence": 0.24}'
> insert --table /tmp/dmdb2/ --value '{"_id": "2", "label": "person2"}'
> insert --table /tmp/dmdb2/ --value '{"_id": "3", "label": "person3", "confidence": 0.54}'
> insert --table /tmp/dmdb2/ --value '{"_id": "4", "label": "person4", "confidence": null}'
> {code}
> For the field 'confidence': document 1 has value 0.24, document 3 has value 0.54, document 2 does not have the field, and document 4 has the field with value null.
> [3] Query the table from Drill.
> *Query 1:*
> {code:java}
> 0: jdbc:drill:> select label,confidence from dfs.tmp.dmdb2;
> +----------+-------------+
> |  label   | confidence  |
> +----------+-------------+
> | person   | 0.24        |
> | person2  | null        |
> | person3  | 0.54        |
> | person4  | null        |
> +----------+-------------+
> 4 rows selected (0.2 seconds)
> {code}
> *Query 2:*
> {code:java}
> 0: jdbc:drill:> select * from dfs.tmp.dmdb2;
> +------+-------------+----------+
> | _id  | confidence  |  label   |
> +------+-------------+----------+
> | 1    | 0.24        | person   |
> | 2    | null        | person2  |
> | 3    | 0.54        | person3  |
> | 4    | null        | person4  |
> +------+-------------+----------+
> 4 rows selected (0.174 seconds)
> {code}
> *Query 3:*
> {code:java}
> 0: jdbc:drill:> select label,confidence from dfs.tmp.dmdb2 where confidence is not null;
> +----------+-------------+
> |  label   | confidence  |
> +----------+-------------+
> | person   | 0.24        |
> | person3  | 0.54        |
> | person4  | null        |
> +----------+-------------+
> 3 rows selected (0.192 seconds)
> {code}
> *Query 4:*
> {code:java}
> 0: jdbc:drill:> select label,confidence from dfs.tmp.dmdb2 where confidence is null;
> +----------+-------------+
> |  label   | confidence  |
> +----------+-------------+
> | person2  | null        |
> +----------+-------------+
> 1 row selected (0.262 seconds)
> {code}
> As you can see, Query 3, which asks for all documents where confidence 'is not null', returns a document with a null value.
> *Other observation:*
> Querying the same data using Drill without MapR-DB returns the correct result.
> For example, create 4 different JSON files with the following data:
> {"label": "person", "confidence": 0.24}
> {"label": "person2"}
> {"label": "person3", "confidence": 0.54}
> {"label": "person4", "confidence": null}
> Query it directly using Drill:
> *Query 5:*
> {code:java}
> 0: jdbc:drill:> select label,confidence from dfs.tmp.t2;
> +----------+-------------+
> |  label   | confidence  |
> +----------+-------------+
> | person4  | null        |
> | person3  | 0.54        |
> | person2  | null        |
> | person   | 0.24        |
> +----------+-------------+
> 4 rows selected (0.203 seconds)
> {code}
> *Query 6:*
> {code:java}
> 0: jdbc:drill:> select label,confidence from dfs.tmp.t2 where confidence is null;
> +----------+-------------+
> |  label   | confidence  |
> +----------+-------------+
> | person4  | null        |
> | person2  | null        |
> +----------+-------------+
> 2 rows selected (0.352 seconds)
> {code}
> *Query 7:*
> {code:java}
> 0: jdbc:drill:> select label,confidence from dfs.tmp.t2 where confidence is not null;
> +----------+-------------+
> |  label   | confidence  |
> +----------+-------------+
> | person3  | 0.54        |
> | person   | 0.24        |
> +----------+-------------+
> 2 rows selected (0.265 seconds)
> {code}
> As seen in Queries 6 and 7, the results are correct.
> I believe the issue is in the MapR-DB layer where it fetches the results.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
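The expected behavior follows standard SQL three-valued logic: `confidence IS NOT NULL` must exclude both an explicit null and a missing field, so the correct answer to Query 3 is 2 rows, not the 3 rows returned through MapR-DB. A minimal Java sketch of that semantics (hypothetical model code, not Drill's implementation), using a Map that may lack the key or map it to null, mirroring the four documents above:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class NullFilterCheck {
    // Model each document's 'confidence' field: the key may be absent
    // (person2) or present with an explicit null (person4).
    static List<Map<String, Double>> docs() {
        List<Map<String, Double>> d = new ArrayList<>();
        d.add(new HashMap<>(Map.of("confidence", 0.24))); // person
        d.add(new HashMap<>());                           // person2: field absent
        d.add(new HashMap<>(Map.of("confidence", 0.54))); // person3
        Map<String, Double> m4 = new HashMap<>();
        m4.put("confidence", null);                       // person4: explicit null
        d.add(m4);
        return d;
    }

    // SQL semantics for "confidence IS NOT NULL": both the absent field
    // and the explicit null are filtered out.
    static long countNotNull(List<Map<String, Double>> docs) {
        return docs.stream().filter(m -> m.get("confidence") != null).count();
    }

    public static void main(String[] args) {
        System.out.println(countNotNull(docs())); // prints 2
    }
}
```

Under this model Query 3 should return only person and person3, matching the 2 rows that Query 7 correctly returns when the same data is read from plain JSON files.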
[jira] [Created] (DRILL-7121) TPCH 4 takes longer
Robert Hou created DRILL-7121:
---------------------------------

             Summary: TPCH 4 takes longer
                 Key: DRILL-7121
                 URL: https://issues.apache.org/jira/browse/DRILL-7121
             Project: Apache Drill
          Issue Type: Bug
          Components: Query Planning & Optimization
    Affects Versions: 1.16.0
            Reporter: Robert Hou
            Assignee: Gautam Parai
             Fix For: 1.16.0


Here is TPCH 4 with sf 100:
{noformat}
select
  o.o_orderpriority,
  count(*) as order_count
from
  orders o
where
  o.o_orderdate >= date '1996-10-01'
  and o.o_orderdate < date '1996-10-01' + interval '3' month
  and exists (
    select *
    from lineitem l
    where l.l_orderkey = o.o_orderkey
      and l.l_commitdate < l.l_receiptdate
  )
group by
  o.o_orderpriority
order by
  o.o_orderpriority;
{noformat}

The plan has changed when Statistics is disabled. A Hash Agg and a Broadcast Exchange have been added. These two operators expand the number of rows from the lineitem table from 137M to 9B rows. This forces the hash join to use 6 GB of memory instead of 30 MB.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
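The 137M-to-9B expansion is consistent with how a broadcast exchange works: every sender row is duplicated to every receiving minor fragment, so the downstream row count is rows times receivers. The sketch below is back-of-the-envelope only; the receiver count of 66 is a hypothetical width chosen to match the reported numbers, not a value taken from the actual plan:

```java
class BroadcastCost {
    // A broadcast exchange sends every sender row to every receiving
    // minor fragment, so the downstream row count is rows * receivers.
    static long broadcastRows(long rows, int receivers) {
        return rows * receivers;
    }

    public static void main(String[] args) {
        long lineitemRows = 137_000_000L;
        // A hypothetical 66 receiving fragments turns 137M rows into ~9B.
        System.out.println(broadcastRows(lineitemRows, 66)); // prints 9042000000
    }
}
```

This is why a broadcast plan over the larger table is so much more expensive here than the pre-statistics plan that kept lineitem at 137M rows.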
[jira] [Updated] (DRILL-7120) Query fails with ChannelClosedException when Statistics is disabled.
[ https://issues.apache.org/jira/browse/DRILL-7120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Hou updated DRILL-7120:
------------------------------
    Summary: Query fails with ChannelClosedException when Statistics is disabled.  (was: Query fails with ChannelClosedException)

> Query fails with ChannelClosedException when Statistics is disabled.
> --------------------------------------------------------------------
>
>                 Key: DRILL-7120
>                 URL: https://issues.apache.org/jira/browse/DRILL-7120
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 1.16.0
>            Reporter: Robert Hou
>            Assignee: Gautam Parai
>            Priority: Blocker
>             Fix For: 1.16.0
>
>
> TPCH query 5 fails at sf100 when Statistics is disabled. Here is the query:
> {noformat}
> select
>   n.n_name,
>   sum(l.l_extendedprice * (1 - l.l_discount)) as revenue
> from
>   customer c,
>   orders o,
>   lineitem l,
>   supplier s,
>   nation n,
>   region r
> where
>   c.c_custkey = o.o_custkey
>   and l.l_orderkey = o.o_orderkey
>   and l.l_suppkey = s.s_suppkey
>   and c.c_nationkey = s.s_nationkey
>   and s.s_nationkey = n.n_nationkey
>   and n.n_regionkey = r.r_regionkey
>   and r.r_name = 'EUROPE'
>   and o.o_orderdate >= date '1997-01-01'
>   and o.o_orderdate < date '1997-01-01' + interval '1' year
> group by
>   n.n_name
> order by
>   revenue desc;
> {noformat}
> This is the error from drillbit.log:
> {noformat}
> 2019-03-04 17:46:38,684 [23822b0a-b7bd-0b79-b905-1438f5b1d039:frag:6:64] INFO o.a.d.e.w.fragment.FragmentExecutor - 23822b0a-b7bd-0b79-b905-1438f5b1d039:6:64: State change requested RUNNING --> FINISHED
> 2019-03-04 17:46:38,684 [23822b0a-b7bd-0b79-b905-1438f5b1d039:frag:6:64] INFO o.a.d.e.w.f.FragmentStatusReporter - 23822b0a-b7bd-0b79-b905-1438f5b1d039:6:64: State to report: FINISHED
> 2019-03-04 18:17:51,454 [BitServer-13] WARN o.a.d.exec.rpc.ProtobufLengthDecoder - Failure allocating buffer on incoming stream due to memory limits. Current Allocation: 262144.
> 2019-03-04 18:17:51,454 [BitServer-13] ERROR > o.a.drill.exec.rpc.data.DataServer - Out of memory in RPC layer. > 2019-03-04 18:17:51,463 [BitServer-13] ERROR > o.a.d.exec.rpc.RpcExceptionHandler - Exception in RPC communication. > Connection: /10.10.120.104:31012 <--> /10.10.120.106:53048 (data server). > Closing connection. > io.netty.handler.codec.DecoderException: > org.apache.drill.exec.exception.OutOfMemoryException: Failure allocating > buffer. > at > io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:271) > ~[netty-codec-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) > 
[netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.ch
[jira] [Created] (DRILL-7123) TPCDS query 83 runs slower when Statistics is disabled
Robert Hou created DRILL-7123:
---------------------------------

             Summary: TPCDS query 83 runs slower when Statistics is disabled
                 Key: DRILL-7123
                 URL: https://issues.apache.org/jira/browse/DRILL-7123
             Project: Apache Drill
          Issue Type: Bug
          Components: Query Planning & Optimization
    Affects Versions: 1.16.0
            Reporter: Robert Hou
            Assignee: Gautam Parai
             Fix For: 1.16.0


Query is TPCDS 83 with sf 100:
{noformat}
WITH sr_items
     AS (SELECT i_item_id               item_id,
                Sum(sr_return_quantity) sr_item_qty
         FROM   store_returns, item, date_dim
         WHERE  sr_item_sk = i_item_sk
                AND d_date IN (SELECT d_date
                               FROM   date_dim
                               WHERE  d_week_seq IN (SELECT d_week_seq
                                                     FROM   date_dim
                                                     WHERE  d_date IN ( '1999-06-30', '1999-08-28', '1999-11-18' )))
                AND sr_returned_date_sk = d_date_sk
         GROUP  BY i_item_id),
     cr_items
     AS (SELECT i_item_id               item_id,
                Sum(cr_return_quantity) cr_item_qty
         FROM   catalog_returns, item, date_dim
         WHERE  cr_item_sk = i_item_sk
                AND d_date IN (SELECT d_date
                               FROM   date_dim
                               WHERE  d_week_seq IN (SELECT d_week_seq
                                                     FROM   date_dim
                                                     WHERE  d_date IN ( '1999-06-30', '1999-08-28', '1999-11-18' )))
                AND cr_returned_date_sk = d_date_sk
         GROUP  BY i_item_id),
     wr_items
     AS (SELECT i_item_id               item_id,
                Sum(wr_return_quantity) wr_item_qty
         FROM   web_returns, item, date_dim
         WHERE  wr_item_sk = i_item_sk
                AND d_date IN (SELECT d_date
                               FROM   date_dim
                               WHERE  d_week_seq IN (SELECT d_week_seq
                                                     FROM   date_dim
                                                     WHERE  d_date IN ( '1999-06-30', '1999-08-28', '1999-11-18' )))
                AND wr_returned_date_sk = d_date_sk
         GROUP  BY i_item_id)
SELECT sr_items.item_id,
       sr_item_qty,
       sr_item_qty / ( sr_item_qty + cr_item_qty + wr_item_qty ) / 3.0 * 100 sr_dev,
       cr_item_qty,
       cr_item_qty / ( sr_item_qty + cr_item_qty + wr_item_qty ) / 3.0 * 100 cr_dev,
       wr_item_qty,
       wr_item_qty / ( sr_item_qty + cr_item_qty + wr_item_qty ) / 3.0 * 100 wr_dev,
       ( sr_item_qty + cr_item_qty + wr_item_qty ) / 3.0 average
FROM   sr_items, cr_items, wr_items
WHERE  sr_items.item_id = cr_items.item_id
       AND sr_items.item_id = wr_items.item_id
ORDER  BY sr_items.item_id, sr_item_qty
LIMIT  100;
{noformat}

The number of threads for major fragments 1 and 2 has changed when Statistics is disabled. The number of minor fragments has been reduced from 10 and 15 fragments down to 3 fragments. The rowcount estimate for major fragment 2 has changed from 1439754.0 down to 287950.8.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
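The drop in parallelism is what the lower rowcount estimate would predict: Drill derives a major fragment's width roughly from the estimated rowcount divided by planner.slice_target (default 100,000), subject to CPU and cluster caps. A rough sketch of that relationship, assuming the default slice target (the caps are not modeled here):

```java
class FragmentWidth {
    // Rough model of rowcount-driven parallelism: a planner sizes a
    // fragment's width as ceil(rowcount / sliceTarget), before applying
    // CPU/cluster caps that are not modeled here.
    static int width(double rowcount, long sliceTarget) {
        return (int) Math.ceil(rowcount / sliceTarget);
    }

    public static void main(String[] args) {
        System.out.println(width(1439754.0, 100_000)); // prints 15 (old estimate)
        System.out.println(width(287950.8, 100_000));  // prints 3  (new estimate)
    }
}
```

With the old estimate of 1,439,754 rows the model yields 15 fragments; the new estimate of 287,950.8 yields 3, matching the reported change.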
[jira] [Updated] (DRILL-7121) TPCH 4 takes longer when Statistics is disabled.
[ https://issues.apache.org/jira/browse/DRILL-7121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Hou updated DRILL-7121:
------------------------------
    Summary: TPCH 4 takes longer when Statistics is disabled.  (was: TPCH 4 takes longer)

> TPCH 4 takes longer when Statistics is disabled.
> ------------------------------------------------
>
>                 Key: DRILL-7121
>                 URL: https://issues.apache.org/jira/browse/DRILL-7121
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 1.16.0
>            Reporter: Robert Hou
>            Assignee: Gautam Parai
>            Priority: Blocker
>             Fix For: 1.16.0
>
>
> Here is TPCH 4 with sf 100:
> {noformat}
> select
>   o.o_orderpriority,
>   count(*) as order_count
> from
>   orders o
> where
>   o.o_orderdate >= date '1996-10-01'
>   and o.o_orderdate < date '1996-10-01' + interval '3' month
>   and exists (
>     select *
>     from lineitem l
>     where l.l_orderkey = o.o_orderkey
>       and l.l_commitdate < l.l_receiptdate
>   )
> group by
>   o.o_orderpriority
> order by
>   o.o_orderpriority;
> {noformat}
> The plan has changed when Statistics is disabled. A Hash Agg and a Broadcast Exchange have been added. These two operators expand the number of rows from the lineitem table from 137M to 9B rows. This forces the hash join to use 6 GB of memory instead of 30 MB.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Updated] (DRILL-7120) Query fails with ChannelClosedException when Statistics is disabled
[ https://issues.apache.org/jira/browse/DRILL-7120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Hou updated DRILL-7120:
------------------------------
    Summary: Query fails with ChannelClosedException when Statistics is disabled  (was: Query fails with ChannelClosedException when Statistics is disabled.)

> Query fails with ChannelClosedException when Statistics is disabled
> -------------------------------------------------------------------
>
>                 Key: DRILL-7120
>                 URL: https://issues.apache.org/jira/browse/DRILL-7120
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 1.16.0
>            Reporter: Robert Hou
>            Assignee: Gautam Parai
>            Priority: Blocker
>             Fix For: 1.16.0
>
>
> TPCH query 5 fails at sf100 when Statistics is disabled. Here is the query:
> {noformat}
> select
>   n.n_name,
>   sum(l.l_extendedprice * (1 - l.l_discount)) as revenue
> from
>   customer c,
>   orders o,
>   lineitem l,
>   supplier s,
>   nation n,
>   region r
> where
>   c.c_custkey = o.o_custkey
>   and l.l_orderkey = o.o_orderkey
>   and l.l_suppkey = s.s_suppkey
>   and c.c_nationkey = s.s_nationkey
>   and s.s_nationkey = n.n_nationkey
>   and n.n_regionkey = r.r_regionkey
>   and r.r_name = 'EUROPE'
>   and o.o_orderdate >= date '1997-01-01'
>   and o.o_orderdate < date '1997-01-01' + interval '1' year
> group by
>   n.n_name
> order by
>   revenue desc;
> {noformat}
> This is the error from drillbit.log:
> {noformat}
> 2019-03-04 17:46:38,684 [23822b0a-b7bd-0b79-b905-1438f5b1d039:frag:6:64] INFO o.a.d.e.w.fragment.FragmentExecutor - 23822b0a-b7bd-0b79-b905-1438f5b1d039:6:64: State change requested RUNNING --> FINISHED
> 2019-03-04 17:46:38,684 [23822b0a-b7bd-0b79-b905-1438f5b1d039:frag:6:64] INFO o.a.d.e.w.f.FragmentStatusReporter - 23822b0a-b7bd-0b79-b905-1438f5b1d039:6:64: State to report: FINISHED
> 2019-03-04 18:17:51,454 [BitServer-13] WARN o.a.d.exec.rpc.ProtobufLengthDecoder - Failure allocating buffer on incoming stream due to memory limits. Current Allocation: 262144.
> 2019-03-04 18:17:51,454 [BitServer-13] ERROR > o.a.drill.exec.rpc.data.DataServer - Out of memory in RPC layer. > 2019-03-04 18:17:51,463 [BitServer-13] ERROR > o.a.d.exec.rpc.RpcExceptionHandler - Exception in RPC communication. > Connection: /10.10.120.104:31012 <--> /10.10.120.106:53048 (data server). > Closing connection. > io.netty.handler.codec.DecoderException: > org.apache.drill.exec.exception.OutOfMemoryException: Failure allocating > buffer. > at > io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:271) > ~[netty-codec-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) > 
[netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645) > [netty-transport-4.0.48.Final.jar:4.0.48.Final]
[jira] [Assigned] (DRILL-7122) TPCDS queries 29 25 17 are slower when Statistics is disabled.
[ https://issues.apache.org/jira/browse/DRILL-7122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Hou reassigned DRILL-7122:
---------------------------------
             Assignee: Gautam Parai
    Affects Version/s: 1.16.0
             Priority: Blocker  (was: Major)
        Fix Version/s: 1.16.0
          Description:
Here is query 29 with sf 100:
{noformat}
SELECT i_item_id,
       i_item_desc,
       s_store_id,
       s_store_name,
       Avg(ss_quantity)        AS store_sales_quantity,
       Avg(sr_return_quantity) AS store_returns_quantity,
       Avg(cs_quantity)        AS catalog_sales_quantity
FROM   store_sales,
       store_returns,
       catalog_sales,
       date_dim d1,
       date_dim d2,
       date_dim d3,
       store,
       item
WHERE  d1.d_moy = 4
       AND d1.d_year = 1998
       AND d1.d_date_sk = ss_sold_date_sk
       AND i_item_sk = ss_item_sk
       AND s_store_sk = ss_store_sk
       AND ss_customer_sk = sr_customer_sk
       AND ss_item_sk = sr_item_sk
       AND ss_ticket_number = sr_ticket_number
       AND sr_returned_date_sk = d2.d_date_sk
       AND d2.d_moy BETWEEN 4 AND 4 + 3
       AND d2.d_year = 1998
       AND sr_customer_sk = cs_bill_customer_sk
       AND sr_item_sk = cs_item_sk
       AND cs_sold_date_sk = d3.d_date_sk
       AND d3.d_year IN ( 1998, 1998 + 1, 1998 + 2 )
GROUP  BY i_item_id,
          i_item_desc,
          s_store_id,
          s_store_name
ORDER  BY i_item_id,
          i_item_desc,
          s_store_id,
          s_store_name
LIMIT  100;
{noformat}
The hash join order has changed. As a result, one of the hash joins does not seem to reduce the number of rows significantly.
          Component/s: Query Planning & Optimization

> TPCDS queries 29 25 17 are slower when Statistics is disabled.
> --------------------------------------------------------------
>
>                 Key: DRILL-7122
>                 URL: https://issues.apache.org/jira/browse/DRILL-7122
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 1.16.0
>            Reporter: Robert Hou
>            Assignee: Gautam Parai
>            Priority: Blocker
>             Fix For: 1.16.0
>
>
> Here is query 29 with sf 100:
> {noformat}
> SELECT i_item_id,
>        i_item_desc,
>        s_store_id,
>        s_store_name,
>        Avg(ss_quantity)        AS store_sales_quantity,
>        Avg(sr_return_quantity) AS store_returns_quantity,
>        Avg(cs_quantity)        AS catalog_sales_quantity
> FROM   store_sales,
>        store_returns,
>        catalog_sales,
>        date_dim d1,
>        date_dim d2,
>        date_dim d3,
>        store,
>        item
> WHERE  d1.d_moy = 4
>        AND d1.d_year = 1998
>        AND d1.d_date_sk = ss_sold_date_sk
>        AND i_item_sk = ss_item_sk
>        AND s_store_sk = ss_store_sk
>        AND ss_customer_sk = sr_customer_sk
>        AND ss_item_sk = sr_item_sk
>        AND ss_ticket_number = sr_ticket_number
>        AND sr_returned_date_sk = d2.d_date_sk
>        AND d2.d_moy BETWEEN 4 AND 4 + 3
>        AND d2.d_year = 1998
>        AND sr_customer_sk = cs_bill_customer_sk
>        AND sr_item_sk = cs_item_sk
>        AND cs_sold_date_sk = d3.d_date_sk
>        AND d3.d_year IN ( 1998, 1998 + 1, 1998 + 2 )
> GROUP  BY i_item_id,
>           i_item_desc,
>           s_store_id,
>           s_store_name
> ORDER  BY i_item_id,
>           i_item_desc,
>           s_store_id,
>           s_store_name
> LIMIT  100;
> {noformat}
> The hash join order has changed. As a result, one of the hash joins does not seem to reduce the number of rows significantly.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Created] (DRILL-7122) TPCDS queries 29 25 17 are slower when Statistics is disabled.
Robert Hou created DRILL-7122:
---------------------------------

             Summary: TPCDS queries 29 25 17 are slower when Statistics is disabled.
                 Key: DRILL-7122
                 URL: https://issues.apache.org/jira/browse/DRILL-7122
             Project: Apache Drill
          Issue Type: Bug
            Reporter: Robert Hou

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Updated] (DRILL-7120) Query fails with ChannelClosedException
[ https://issues.apache.org/jira/browse/DRILL-7120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Hou updated DRILL-7120:
------------------------------
    Description:
TPCH query 5 fails at sf100 when Statistics is disabled. Here is the query:
{noformat}
select
  n.n_name,
  sum(l.l_extendedprice * (1 - l.l_discount)) as revenue
from
  customer c,
  orders o,
  lineitem l,
  supplier s,
  nation n,
  region r
where
  c.c_custkey = o.o_custkey
  and l.l_orderkey = o.o_orderkey
  and l.l_suppkey = s.s_suppkey
  and c.c_nationkey = s.s_nationkey
  and s.s_nationkey = n.n_nationkey
  and n.n_regionkey = r.r_regionkey
  and r.r_name = 'EUROPE'
  and o.o_orderdate >= date '1997-01-01'
  and o.o_orderdate < date '1997-01-01' + interval '1' year
group by
  n.n_name
order by
  revenue desc;
{noformat}
This is the error from drillbit.log:
{noformat}
2019-03-04 17:46:38,684 [23822b0a-b7bd-0b79-b905-1438f5b1d039:frag:6:64] INFO o.a.d.e.w.fragment.FragmentExecutor - 23822b0a-b7bd-0b79-b905-1438f5b1d039:6:64: State change requested RUNNING --> FINISHED
2019-03-04 17:46:38,684 [23822b0a-b7bd-0b79-b905-1438f5b1d039:frag:6:64] INFO o.a.d.e.w.f.FragmentStatusReporter - 23822b0a-b7bd-0b79-b905-1438f5b1d039:6:64: State to report: FINISHED
2019-03-04 18:17:51,454 [BitServer-13] WARN o.a.d.exec.rpc.ProtobufLengthDecoder - Failure allocating buffer on incoming stream due to memory limits. Current Allocation: 262144.
2019-03-04 18:17:51,454 [BitServer-13] ERROR o.a.drill.exec.rpc.data.DataServer - Out of memory in RPC layer.
2019-03-04 18:17:51,463 [BitServer-13] ERROR o.a.d.exec.rpc.RpcExceptionHandler - Exception in RPC communication. Connection: /10.10.120.104:31012 <--> /10.10.120.106:53048 (data server). Closing connection.
io.netty.handler.codec.DecoderException: org.apache.drill.exec.exception.OutOfMemoryException: Failure allocating buffer.
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:271) ~[netty-codec-4.0.48.Final.jar:4.0.48.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) [netty-transport-4.0.48.Final.jar:4.0.48.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) [netty-transport-4.0.48.Final.jar:4.0.48.Final] at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) [netty-transport-4.0.48.Final.jar:4.0.48.Final] at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86) [netty-transport-4.0.48.Final.jar:4.0.48.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) [netty-transport-4.0.48.Final.jar:4.0.48.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) [netty-transport-4.0.48.Final.jar:4.0.48.Final] at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) [netty-transport-4.0.48.Final.jar:4.0.48.Final] at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294) [netty-transport-4.0.48.Final.jar:4.0.48.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) [netty-transport-4.0.48.Final.jar:4.0.48.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) [netty-transport-4.0.48.Final.jar:4.0.48.Final] at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911) [netty-transport-4.0.48.Final.jar:4.0.48.Final] at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131) [netty-transport-4.0.48.Final.jar:4.0.48.Final] at 
io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645) [netty-transport-4.0.48.Final.jar:4.0.48.Final] at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580) [netty-transport-4.0.48.Final.jar:4.0.48.Final] at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497) [netty-transport-4.0.48.Final.jar:4.0.48.Final] at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459) [netty-transport-4.0.48.Final.jar:4.0.48.Final] at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131) [netty-common-4.0.48.Final.jar:4.0.48.Final] at java.lang.Thread.run(Thread.java:745) [na:1.8.0_112] Caused by: org.apache.drill.exec.exception.OutOfMemoryException: Failure allocating buffer. at io.netty.buffer.PooledByteBufAllocatorL.allocate(PooledByteBufAllocatorL.java:67) ~[drill-memory-base-1.16.0-SNAPSHOT.jar:4.0.
[jira] [Created] (DRILL-7120) Query fails with ChannelClosedException
Robert Hou created DRILL-7120:
---------------------------------

             Summary: Query fails with ChannelClosedException
                 Key: DRILL-7120
                 URL: https://issues.apache.org/jira/browse/DRILL-7120
             Project: Apache Drill
          Issue Type: Bug
          Components: Query Planning & Optimization
    Affects Versions: 1.16.0
            Reporter: Robert Hou
            Assignee: Gautam Parai
             Fix For: 1.16.0


TPCH query 5 fails at sf100. Here is the query:
{noformat}
select
  n.n_name,
  sum(l.l_extendedprice * (1 - l.l_discount)) as revenue
from
  customer c,
  orders o,
  lineitem l,
  supplier s,
  nation n,
  region r
where
  c.c_custkey = o.o_custkey
  and l.l_orderkey = o.o_orderkey
  and l.l_suppkey = s.s_suppkey
  and c.c_nationkey = s.s_nationkey
  and s.s_nationkey = n.n_nationkey
  and n.n_regionkey = r.r_regionkey
  and r.r_name = 'EUROPE'
  and o.o_orderdate >= date '1997-01-01'
  and o.o_orderdate < date '1997-01-01' + interval '1' year
group by
  n.n_name
order by
  revenue desc;
{noformat}
This is the error from drillbit.log:
{noformat}
2019-03-04 17:46:38,684 [23822b0a-b7bd-0b79-b905-1438f5b1d039:frag:6:64] INFO o.a.d.e.w.fragment.FragmentExecutor - 23822b0a-b7bd-0b79-b905-1438f5b1d039:6:64: State change requested RUNNING --> FINISHED
2019-03-04 17:46:38,684 [23822b0a-b7bd-0b79-b905-1438f5b1d039:frag:6:64] INFO o.a.d.e.w.f.FragmentStatusReporter - 23822b0a-b7bd-0b79-b905-1438f5b1d039:6:64: State to report: FINISHED
2019-03-04 18:17:51,454 [BitServer-13] WARN o.a.d.exec.rpc.ProtobufLengthDecoder - Failure allocating buffer on incoming stream due to memory limits. Current Allocation: 262144.
2019-03-04 18:17:51,454 [BitServer-13] ERROR o.a.drill.exec.rpc.data.DataServer - Out of memory in RPC layer.
2019-03-04 18:17:51,463 [BitServer-13] ERROR o.a.d.exec.rpc.RpcExceptionHandler - Exception in RPC communication. Connection: /10.10.120.104:31012 <--> /10.10.120.106:53048 (data server). Closing connection.
io.netty.handler.codec.DecoderException: org.apache.drill.exec.exception.OutOfMemoryException: Failure allocating buffer.
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:271) ~[netty-codec-4.0.48.Final.jar:4.0.48.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) [netty-transport-4.0.48.Final.jar:4.0.48.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) [netty-transport-4.0.48.Final.jar:4.0.48.Final] at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) [netty-transport-4.0.48.Final.jar:4.0.48.Final] at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86) [netty-transport-4.0.48.Final.jar:4.0.48.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) [netty-transport-4.0.48.Final.jar:4.0.48.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) [netty-transport-4.0.48.Final.jar:4.0.48.Final] at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) [netty-transport-4.0.48.Final.jar:4.0.48.Final] at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294) [netty-transport-4.0.48.Final.jar:4.0.48.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) [netty-transport-4.0.48.Final.jar:4.0.48.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) [netty-transport-4.0.48.Final.jar:4.0.48.Final] at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911) [netty-transport-4.0.48.Final.jar:4.0.48.Final] at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131) [netty-transport-4.0.48.Final.jar:4.0.48.Final] at 
io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645) [netty-transport-4.0.48.Final.jar:4.0.48.Final] at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580) [netty-transport-4.0.48.Final.jar:4.0.48.Final] at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497) [netty-transport-4.0.48.Final.jar:4.0.48.Final] at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459) [netty-transport-4.0.48.Final.jar:4.0.48.Final] at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131) [netty-common-4.0.48.Final.jar:4.0.48.Final] at java.lang.Thread.run(Thread.java:745) [na:1.8.0_1
[jira] [Updated] (DRILL-7048) Implement JDBC Statement.setMaxRows() with System Option
[ https://issues.apache.org/jira/browse/DRILL-7048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kunal Khatua updated DRILL-7048:
--------------------------------
    Labels: doc-impacting  (was: )

> Implement JDBC Statement.setMaxRows() with System Option
> --------------------------------------------------------
>
>                 Key: DRILL-7048
>                 URL: https://issues.apache.org/jira/browse/DRILL-7048
>             Project: Apache Drill
>          Issue Type: New Feature
>          Components: Client - JDBC, Query Planning & Optimization
>    Affects Versions: 1.15.0
>            Reporter: Kunal Khatua
>            Assignee: Kunal Khatua
>            Priority: Major
>              Labels: doc-impacting
>             Fix For: 1.17.0
>
>
> With DRILL-6960, the web UI gets an auto-limit on the number of results fetched.
> Since most of the plumbing is already there, it makes sense to provide the same for the JDBC client.
> In addition, it would be nice if the server could have a pre-defined value as well (default 0, i.e. no limit) so that an _admin_ can enforce a maximum limit on the result-set size.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
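Statement.setMaxRows(int) is standard JDBC: 0 means no limit, and any positive value tells the driver to silently drop rows beyond it. A minimal sketch of that cap semantics (model code, not Drill's implementation) that the proposed system option would mirror server-side:

```java
import java.util.List;

class MaxRowsCap {
    // JDBC setMaxRows semantics: maxRows == 0 means unlimited; otherwise
    // rows beyond the limit are silently dropped from the result set.
    static <T> List<T> cap(List<T> rows, int maxRows) {
        if (maxRows <= 0 || rows.size() <= maxRows) {
            return rows;
        }
        return rows.subList(0, maxRows);
    }

    public static void main(String[] args) {
        System.out.println(cap(List.of(1, 2, 3, 4, 5), 3)); // prints [1, 2, 3]
    }
}
```

On a real connection the client-side half of this is simply `stmt.setMaxRows(10_000);` before `stmt.executeQuery(...)`; the JIRA proposes a server-side option so an admin can enforce the cap even for clients that never call it.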
[jira] [Updated] (DRILL-7048) Implement JDBC Statement.setMaxRows() with System Option
[ https://issues.apache.org/jira/browse/DRILL-7048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kunal Khatua updated DRILL-7048: Fix Version/s: (was: 1.16.0) 1.17.0 > Implement JDBC Statement.setMaxRows() with System Option > > > Key: DRILL-7048 > URL: https://issues.apache.org/jira/browse/DRILL-7048 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC, Query Planning & Optimization >Affects Versions: 1.15.0 >Reporter: Kunal Khatua >Assignee: Kunal Khatua >Priority: Major > Fix For: 1.17.0 > > > With DRILL-6960, the webUI will get an auto-limit on the number of results > fetched. > Since most of the plumbing is already there, it makes sense to provide the > same for the JDBC client. > In addition, it would be nice if the Server could have a pre-defined value as > well (default 0, i.e. no limit) so that an _admin_ would be able to enforce a > maximum limit on the result-set size as well. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6050) Provide a limit to number of rows fetched for a query in UI
[ https://issues.apache.org/jira/browse/DRILL-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kunal Khatua updated DRILL-6050: Fix Version/s: 1.17.0 > Provide a limit to number of rows fetched for a query in UI > --- > > Key: DRILL-6050 > URL: https://issues.apache.org/jira/browse/DRILL-6050 > Project: Apache Drill > Issue Type: Improvement > Components: Web Server >Reporter: Kunal Khatua >Assignee: Kunal Khatua >Priority: Minor > Labels: ready-to-commit, user-experience > Fix For: 1.16.0, 1.17.0 > > > Currently, the WebServer side needs to process the entire set of results and > stream it back to the WebClient. > Since the WebUI does paginate results, we can load a larger set for > pagination on the browser client and relieve the WebServer of the pressure of > hosting all the data. > e.g. fetching all rows from a 1-billion-record table is impractical and can > be capped at 10K. Currently, the user has to explicitly specify LIMIT in the > submitted query. > An input field can be provided in the UI for this limit. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6050) Provide a limit to number of rows fetched for a query in UI
[ https://issues.apache.org/jira/browse/DRILL-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kunal Khatua updated DRILL-6050: Labels: ready-to-commit user-experience (was: doc-impacting ready-to-commit user-experience) > Provide a limit to number of rows fetched for a query in UI > --- > > Key: DRILL-6050 > URL: https://issues.apache.org/jira/browse/DRILL-6050 > Project: Apache Drill > Issue Type: Improvement > Components: Web Server >Reporter: Kunal Khatua >Assignee: Kunal Khatua >Priority: Minor > Labels: ready-to-commit, user-experience > Fix For: 1.16.0 > > > Currently, the WebServer side needs to process the entire set of results and > stream it back to the WebClient. > Since the WebUI does paginate results, we can load a larger set for > pagination on the browser client and relieve the WebServer of the pressure of > hosting all the data. > e.g. fetching all rows from a 1-billion-record table is impractical and can > be capped at 10K. Currently, the user has to explicitly specify LIMIT in the > submitted query. > An input field can be provided in the UI for this limit. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6960) Auto Limit Wrapping should not apply to non-select query
[ https://issues.apache.org/jira/browse/DRILL-6960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kunal Khatua updated DRILL-6960: Labels: doc-impacting user-experience (was: user-experience) > Auto Limit Wrapping should not apply to non-select query > > > Key: DRILL-6960 > URL: https://issues.apache.org/jira/browse/DRILL-6960 > Project: Apache Drill > Issue Type: Bug > Components: Web Server >Affects Versions: 1.16.0 >Reporter: Kunal Khatua >Assignee: Kunal Khatua >Priority: Blocker > Labels: doc-impacting, user-experience > Fix For: 1.17.0 > > > [~IhorHuzenko] pointed out that DRILL-6050 can cause submission of queries > with incorrect syntax. > For example, when a user enters {{SHOW DATABASES}}, after limit wrapping > {{SELECT * FROM (SHOW DATABASES) LIMIT 10}} will be posted. > This results in parsing errors, like: > {{Query Failed: An Error Occurred > org.apache.drill.common.exceptions.UserRemoteException: PARSE ERROR: > Encountered "( show" at line 2, column 15. Was expecting one of: > ... }}. > The fix should involve a JavaScript check for all non-select queries and not > apply the LIMIT wrap to those queries. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
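The guard the ticket proposes — wrap only plain SELECT statements in the auto-LIMIT subquery — can be sketched roughly as follows. This is an illustration in Java rather than the browser-side JavaScript the ticket suggests, and the class and method names are hypothetical:

```java
// Sketch of the proposed guard: wrap only SELECT (or WITH) statements in the
// auto-LIMIT subquery; pass everything else (SHOW, ALTER, CREATE, ...) through.
public class AutoLimitWrapper {

    public static boolean isSelectStatement(String sql) {
        String s = sql.trim().toUpperCase();
        return s.startsWith("SELECT") || s.startsWith("WITH");
    }

    public static String maybeWrap(String sql, int limit) {
        if (!isSelectStatement(sql)) {
            return sql;  // e.g. SHOW DATABASES is submitted unchanged
        }
        return "SELECT * FROM (" + sql + ") LIMIT " + limit;
    }
}
```

A prefix check like this is deliberately conservative; a real implementation would also have to handle leading comments and statements that already carry a LIMIT.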
[jira] [Updated] (DRILL-7110) Skip writing profile when an ALTER SESSION is executed
[ https://issues.apache.org/jira/browse/DRILL-7110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-7110: Labels: doc-impacting (was: ) > Skip writing profile when an ALTER SESSION is executed > -- > > Key: DRILL-7110 > URL: https://issues.apache.org/jira/browse/DRILL-7110 > Project: Apache Drill > Issue Type: Improvement > Components: Execution - Monitoring >Affects Versions: 1.16.0 >Reporter: Kunal Khatua >Assignee: Kunal Khatua >Priority: Minor > Labels: doc-impacting > Fix For: 1.16.0 > > > Currently, any {{ALTER }} query will be logged. While this is useful, > it can potentially add up to a lot of profiles being written unnecessarily, > since those changes are also reflected on the queries that follow. > This JIRA is proposing an option to skip writing such profiles to the profile > store. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-7110) Skip writing profile when an ALTER SESSION is executed
[ https://issues.apache.org/jira/browse/DRILL-7110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kunal Khatua updated DRILL-7110: Fix Version/s: (was: 1.17.0) 1.16.0 > Skip writing profile when an ALTER SESSION is executed > -- > > Key: DRILL-7110 > URL: https://issues.apache.org/jira/browse/DRILL-7110 > Project: Apache Drill > Issue Type: Improvement > Components: Execution - Monitoring >Affects Versions: 1.16.0 >Reporter: Kunal Khatua >Assignee: Kunal Khatua >Priority: Minor > Fix For: 1.16.0 > > > Currently, any {{ALTER }} query will be logged. While this is useful, > it can potentially add up to a lot of profiles being written unnecessarily, > since those changes are also reflected on the queries that follow. > This JIRA is proposing an option to skip writing such profiles to the profile > store. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-7110) Skip writing profile when an ALTER SESSION is executed
[ https://issues.apache.org/jira/browse/DRILL-7110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kunal Khatua updated DRILL-7110: Reviewer: Arina Ielchiieva > Skip writing profile when an ALTER SESSION is executed > -- > > Key: DRILL-7110 > URL: https://issues.apache.org/jira/browse/DRILL-7110 > Project: Apache Drill > Issue Type: Improvement > Components: Execution - Monitoring >Affects Versions: 1.16.0 >Reporter: Kunal Khatua >Assignee: Kunal Khatua >Priority: Minor > Fix For: 1.16.0 > > > Currently, any {{ALTER }} query will be logged. While this is useful, > it can potentially add up to a lot of profiles being written unnecessarily, > since those changes are also reflected on the queries that follow. > This JIRA is proposing an option to skip writing such profiles to the profile > store. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-7032) Ignore corrupt rows in a PCAP file
[ https://issues.apache.org/jira/browse/DRILL-7032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16796473#comment-16796473 ] Arina Ielchiieva commented on DRILL-7032: - [~cgivre] did you open a Jira for the PCAP-NG parser? Can you please link it? > Ignore corrupt rows in a PCAP file > -- > > Key: DRILL-7032 > URL: https://issues.apache.org/jira/browse/DRILL-7032 > Project: Apache Drill > Issue Type: Improvement > Components: Functions - Drill >Affects Versions: 1.15.0 > Environment: OS: Ubuntu 18.4 > Drill version: 1.15.0 > Java(TM) SE Runtime Environment (build 1.8.0_191-b12) >Reporter: Giovanni Conte >Assignee: Charles Givre >Priority: Major > Fix For: 1.16.0 > > > It would be useful for Drill to have some ability to ignore corrupt rows in a > PCAP file instead of throwing a Java exception. > This is because there are many pcap files with corrupted lines, and this > functionality would avoid having to pre-fix the packet captures (example > attached file). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-7119) Modify selectivity calculations to use histograms
Aman Sinha created DRILL-7119: - Summary: Modify selectivity calculations to use histograms Key: DRILL-7119 URL: https://issues.apache.org/jira/browse/DRILL-7119 Project: Apache Drill Issue Type: Sub-task Components: Query Planning & Optimization Reporter: Aman Sinha Assignee: Aman Sinha Fix For: 1.16.0 (Please see parent JIRA for the design document) Once the t-digest based histogram is created, we need to read it back and modify the selectivity calculations such that they use the histogram buckets for range conditions. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
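To illustrate the idea behind this sub-task (not Drill's actual code), range selectivity from an equi-depth histogram can be estimated by counting the full buckets below the predicate constant and linearly interpolating within the bucket that contains it:

```java
// Illustrative equi-depth histogram: boundaries[0..n] delimit n buckets, each
// holding an equal share of the rows. Estimates selectivity of (col < v).
public class HistogramSelectivity {

    public static double lessThan(double[] boundaries, double v) {
        int n = boundaries.length - 1;  // number of buckets
        if (v <= boundaries[0]) {
            return 0.0;                 // constant below the data range
        }
        if (v >= boundaries[n]) {
            return 1.0;                 // constant above the data range
        }
        for (int i = 0; i < n; i++) {
            if (v < boundaries[i + 1]) {
                // i full buckets below v, plus a linear fraction of bucket i.
                double frac = (v - boundaries[i]) / (boundaries[i + 1] - boundaries[i]);
                return (i + frac) / n;
            }
        }
        return 1.0;  // unreachable given the guards above
    }
}
```

A t-digest-based histogram (as in the parent JIRA's design) yields quantile boundaries of exactly this shape, so the same bucket-counting logic applies.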
[jira] [Created] (DRILL-7118) Filter not getting pushed down on MapR-DB tables.
Hanumath Rao Maduri created DRILL-7118: -- Summary: Filter not getting pushed down on MapR-DB tables. Key: DRILL-7118 URL: https://issues.apache.org/jira/browse/DRILL-7118 Project: Apache Drill Issue Type: Bug Components: Query Planning & Optimization Affects Versions: 1.15.0 Reporter: Hanumath Rao Maduri Assignee: Hanumath Rao Maduri Fix For: 1.16.0 A simple is null filter is not being pushed down for the mapr-db tables. Here is the repro for the same. {code:java} 0: jdbc:drill:zk=local> explain plan for select * from dfs.`/tmp/js` where b is null; ANTLR Tool version 4.5 used for code generation does not match the current runtime version 4.7.1ANTLR Runtime version 4.5 used for parser compilation does not match the current runtime version 4.7.1ANTLR Tool version 4.5 used for code generation does not match the current runtime version 4.7.1ANTLR Runtime version 4.5 used for parser compilation does not match the current runtime version 4.7.1+--+--+ | text | json | +--+--+ | 00-00 Screen 00-01 Project(**=[$0]) 00-02 Project(T0¦¦**=[$0]) 00-03 SelectionVectorRemover 00-04 Filter(condition=[IS NULL($1)]) 00-05 Project(T0¦¦**=[$0], b=[$1]) 00-06 Scan(table=[[dfs, /tmp/js]], groupscan=[JsonTableGroupScan [ScanSpec=JsonScanSpec [tableName=/tmp/js, condition=null], columns=[`**`, `b`], maxwidth=1]]) {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-7117) Support creation of histograms for numeric data types (except Decimal)
Aman Sinha created DRILL-7117: - Summary: Support creation of histograms for numeric data types (except Decimal) Key: DRILL-7117 URL: https://issues.apache.org/jira/browse/DRILL-7117 Project: Apache Drill Issue Type: Sub-task Components: Query Planning & Optimization Reporter: Aman Sinha Assignee: Aman Sinha Fix For: 1.16.0 This JIRA is specific to creating histograms for numeric data types: INT, BIGINT, FLOAT4, FLOAT8 and their corresponding nullable/non-nullable versions. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6501) Revert/modify fix for DRILL-6212 after CALCITE-2223 is fixed
[ https://issues.apache.org/jira/browse/DRILL-6501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sorabh Hamirwasia updated DRILL-6501: - Fix Version/s: (was: 1.16.0) 1.17.0 > Revert/modify fix for DRILL-6212 after CALCITE-2223 is fixed > > > Key: DRILL-6501 > URL: https://issues.apache.org/jira/browse/DRILL-6501 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.14.0 >Reporter: Gautam Parai >Assignee: Gautam Parai >Priority: Major > Fix For: 1.17.0 > > Original Estimate: 48h > Remaining Estimate: 48h > > DRILL-6212 is a temporary fix to alleviate issues due to CALCITE-2223. Once, > CALCITE-2223 is fixed this change needs to be reverted back which would > require DrillProjectMergeRule to go back to extending the ProjectMergeRule. > Please take a look at how CALCITE-2223 is eventually fixed (as of now it is > still not clear which fix is the way to do). Depending on the fix we may need > to additional work to integrate these changes. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-7032) Ignore corrupt rows in a PCAP file
[ https://issues.apache.org/jira/browse/DRILL-7032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sorabh Hamirwasia updated DRILL-7032: - Reviewer: Arina Ielchiieva > Ignore corrupt rows in a PCAP file > -- > > Key: DRILL-7032 > URL: https://issues.apache.org/jira/browse/DRILL-7032 > Project: Apache Drill > Issue Type: Improvement > Components: Functions - Drill >Affects Versions: 1.15.0 > Environment: OS: Ubuntu 18.4 > Drill version: 1.15.0 > Java(TM) SE Runtime Environment (build 1.8.0_191-b12) >Reporter: Giovanni Conte >Assignee: Charles Givre >Priority: Major > Fix For: 1.16.0 > > > It would be useful for Drill to have some ability to ignore corrupt rows in a > PCAP file instead of throwing a Java exception. > This is because there are many pcap files with corrupted lines, and this > functionality would avoid having to pre-fix the packet captures (example > attached file). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6992) Support column histogram statistics
[ https://issues.apache.org/jira/browse/DRILL-6992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sorabh Hamirwasia updated DRILL-6992: - Fix Version/s: 1.16.0 > Support column histogram statistics > --- > > Key: DRILL-6992 > URL: https://issues.apache.org/jira/browse/DRILL-6992 > Project: Apache Drill > Issue Type: New Feature > Components: Query Planning & Optimization >Affects Versions: 1.15.0 >Reporter: Aman Sinha >Assignee: Aman Sinha >Priority: Major > Fix For: 1.16.0 > > > As a follow-up to > [DRILL-1328|https://issues.apache.org/jira/browse/DRILL-1328] which is adding > NDV (num distinct values) support and creating the framework for statistics, > we also need Histograms. These are needed for range predicates selectivity > estimation as well as equality predicates when there is non-uniform > distribution of data. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6956) Maintain a single entry for Drill Version in the pom file
[ https://issues.apache.org/jira/browse/DRILL-6956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sorabh Hamirwasia updated DRILL-6956: - Fix Version/s: (was: 1.16.0) 1.17.0 > Maintain a single entry for Drill Version in the pom file > - > > Key: DRILL-6956 > URL: https://issues.apache.org/jira/browse/DRILL-6956 > Project: Apache Drill > Issue Type: Improvement > Components: Tools, Build & Test >Affects Versions: 1.15.0 >Reporter: Kunal Khatua >Assignee: Kunal Khatua >Priority: Major > Fix For: 1.17.0 > > > Currently, updating the version information for a Drill release involves > updating 30+ pom files. > The right way would be to use the Multi Module Setup for Maven CI. > https://maven.apache.org/maven-ci-friendly.html#Multi_Module_Setup -- This message was sent by Atlassian JIRA (v7.6.3#76005)
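The CI-friendly setup referenced above boils down to declaring the version exactly once as a property. A minimal sketch, with illustrative coordinates, following the linked Maven documentation:

```xml
<!-- Parent pom: the version is stated once, via the ${revision} property.
     Coordinates here are illustrative, not Drill's actual pom. -->
<project>
  <modelVersion>4.0.0</modelVersion>
  <groupId>org.apache.drill</groupId>
  <artifactId>drill-root</artifactId>
  <version>${revision}</version>
  <packaging>pom</packaging>
  <properties>
    <revision>1.16.0-SNAPSHOT</revision>
  </properties>
</project>
```

Child modules reference the parent with `<version>${revision}</version>` and inherit the value, so a release bump touches a single property instead of 30+ pom files; the Maven documentation notes that the flatten-maven-plugin is needed so installed/deployed poms carry the resolved version.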
[jira] [Updated] (DRILL-6899) Fix timestamp issues in unit tests ignored with DRILL-6833
[ https://issues.apache.org/jira/browse/DRILL-6899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sorabh Hamirwasia updated DRILL-6899: - Fix Version/s: (was: 1.16.0) 1.17.0 > Fix timestamp issues in unit tests ignored with DRILL-6833 > -- > > Key: DRILL-6899 > URL: https://issues.apache.org/jira/browse/DRILL-6899 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.15.0 >Reporter: Gautam Parai >Assignee: Gautam Parai >Priority: Major > Fix For: 1.17.0 > > Original Estimate: 96h > Remaining Estimate: 96h > > {{The following tests were disabled in the PR for DRILL-6833}} > {{IndexPlanTest.testCastTimestampPlan() - Re-enable after the MapRDB format > plugin issue is fixed.}} > {{IndexPlanTest.testRowkeyJoinPushdown_13() - Re-enable the testcase after > fixing the execution issue with HashJoin used as Rowkeyjoin.}} > {{IndexPlanTest.testRowkeyJoinPushdown_12() - Remove the testcase since the > SemiJoin transformation makes the rowkeyjoinpushdown transformation invalid.}} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-7063) Create separate summary file for schema, totalRowCount, totalNullCount (includes maintenance)
[ https://issues.apache.org/jira/browse/DRILL-7063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aman Sinha updated DRILL-7063: -- Reviewer: Aman Sinha > Create separate summary file for schema, totalRowCount, totalNullCount > (includes maintenance) > - > > Key: DRILL-7063 > URL: https://issues.apache.org/jira/browse/DRILL-7063 > Project: Apache Drill > Issue Type: Sub-task > Components: Metadata >Reporter: Venkata Jyothsna Donapati >Assignee: Venkata Jyothsna Donapati >Priority: Major > Fix For: 1.16.0 > > Original Estimate: 252h > Remaining Estimate: 252h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-7114) ANALYZE command generates warnings for stats file and materialization
[ https://issues.apache.org/jira/browse/DRILL-7114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16796306#comment-16796306 ] Gautam Parai commented on DRILL-7114: - [~vitalii] yes you are right - this happens for all queries. I plan to address both issues as part of this JIRA. > ANALYZE command generates warnings for stats file and materialization > - > > Key: DRILL-7114 > URL: https://issues.apache.org/jira/browse/DRILL-7114 > Project: Apache Drill > Issue Type: Bug > Components: Query Planning & Optimization >Reporter: Aman Sinha >Assignee: Gautam Parai >Priority: Minor > Fix For: 1.16.0 > > > When I run ANALYZE, I see warnings in the log file as shown below. The > ANALYZE command should not try to read the stats file or materialize the > stats. > {noformat} > 12:04:32.939 [2370143e-c419-f33c-d879-84989712bc85:foreman] WARN > o.a.d.e.p.common.DrillStatsTable - Failed to read the stats file. > java.io.FileNotFoundException: File /tmp/orders3/.stats.drill/0_0.json does > not exist > 12:04:32.939 [2370143e-c419-f33c-d879-84989712bc85:foreman] WARN > o.a.d.e.p.common.DrillStatsTable - Failed to materialize the stats. > Continuing without stats. > java.io.FileNotFoundException: File /tmp/orders3/.stats.drill/0_0.json does > not exist > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-7089) Implement caching of BaseMetadata classes
[ https://issues.apache.org/jira/browse/DRILL-7089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sorabh Hamirwasia updated DRILL-7089: - Reviewer: Aman Sinha > Implement caching of BaseMetadata classes > - > > Key: DRILL-7089 > URL: https://issues.apache.org/jira/browse/DRILL-7089 > Project: Apache Drill > Issue Type: Sub-task >Affects Versions: 1.16.0 >Reporter: Volodymyr Vysotskyi >Assignee: Volodymyr Vysotskyi >Priority: Major > Fix For: 1.16.0 > > > In the scope of DRILL-6852 were introduced new classes for metadata usage. > These classes may be reused in other GroupScan instances to preserve heap > usage for the case when metadata is large. > The idea is to store {{BaseMetadata}} inheritors in {{DrillTable}} and pass > them to the {{GroupScan}}, so in the scope of the single query, it will be > possible to reuse them. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-5028) Opening profiles page from web ui gets very slow when a lot of history files have been stored in HDFS or Local FS.
[ https://issues.apache.org/jira/browse/DRILL-5028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kunal Khatua updated DRILL-5028: Fix Version/s: (was: 1.16.0) 1.17.0 > Opening profiles page from web ui gets very slow when a lot of history files > have been stored in HDFS or Local FS. > -- > > Key: DRILL-5028 > URL: https://issues.apache.org/jira/browse/DRILL-5028 > Project: Apache Drill > Issue Type: Improvement > Components: Functions - Drill >Affects Versions: 1.8.0 >Reporter: Account Not Used >Assignee: Kunal Khatua >Priority: Minor > Fix For: 1.17.0 > > > We have a Drill cluster with 20+ nodes and we store all history profiles in > HDFS. Without periodic cleanup of HDFS, the profiles page gets > slower as it serves more queries. > Code from LocalPersistentStore.java uses fs.list(false, basePath) to > fetch the latest 100 history profiles by default; I guess this operation > blocks the page loading (millions of small files can be stored in the basePath), > so maybe we can try some other way to reach the same goal. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6032) Use RecordBatchSizer to estimate size of columns in HashAgg
[ https://issues.apache.org/jira/browse/DRILL-6032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sorabh Hamirwasia updated DRILL-6032: - Fix Version/s: (was: 1.16.0) 1.17.0 > Use RecordBatchSizer to estimate size of columns in HashAgg > --- > > Key: DRILL-6032 > URL: https://issues.apache.org/jira/browse/DRILL-6032 > Project: Apache Drill > Issue Type: Improvement >Reporter: Timothy Farkas >Assignee: Timothy Farkas >Priority: Major > Fix For: 1.17.0 > > > We need to use the RecordBatchSize to estimate the size of columns in the > Partition batches created by HashAgg. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-7107) Unable to connect to Drill 1.15 through ZK
[ https://issues.apache.org/jira/browse/DRILL-7107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sorabh Hamirwasia updated DRILL-7107: - Reviewer: Sorabh Hamirwasia > Unable to connect to Drill 1.15 through ZK > -- > > Key: DRILL-7107 > URL: https://issues.apache.org/jira/browse/DRILL-7107 > Project: Apache Drill > Issue Type: Bug >Reporter: Karthikeyan Manivannan >Assignee: Karthikeyan Manivannan >Priority: Major > Fix For: 1.16.0 > > > After upgrading to Drill 1.15, users are seeing they are no longer able to > connect to Drill using ZK quorum. They are getting the following "Unable to > setup ZK for client" error. > [~]$ sqlline -u "jdbc:drill:zk=172.16.2.165:5181;auth=maprsasl" > Error: Failure in connecting to Drill: > org.apache.drill.exec.rpc.RpcException: Failure setting up ZK for client. > (state=,code=0) > java.sql.SQLNonTransientConnectionException: Failure in connecting to Drill: > org.apache.drill.exec.rpc.RpcException: Failure setting up ZK for client. 
> at org.apache.drill.jdbc.impl.DrillConnectionImpl.<init>(DrillConnectionImpl.java:174)
> at org.apache.drill.jdbc.impl.DrillJdbc41Factory.newDrillConnection(DrillJdbc41Factory.java:67)
> at org.apache.drill.jdbc.impl.DrillFactory.newConnection(DrillFactory.java:67)
> at org.apache.calcite.avatica.UnregisteredDriver.connect(UnregisteredDriver.java:138)
> at org.apache.drill.jdbc.Driver.connect(Driver.java:72)
> at sqlline.DatabaseConnection.connect(DatabaseConnection.java:130)
> at sqlline.DatabaseConnection.getConnection(DatabaseConnection.java:179)
> at sqlline.Commands.connect(Commands.java:1247)
> at sqlline.Commands.connect(Commands.java:1139)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at sqlline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:38)
> at sqlline.SqlLine.dispatch(SqlLine.java:722)
> at sqlline.SqlLine.initArgs(SqlLine.java:416)
> at sqlline.SqlLine.begin(SqlLine.java:514)
> at sqlline.SqlLine.start(SqlLine.java:264)
> at sqlline.SqlLine.main(SqlLine.java:195)
> Caused by: org.apache.drill.exec.rpc.RpcException: Failure setting up ZK for client.
> at org.apache.drill.exec.client.DrillClient.connect(DrillClient.java:340)
> at org.apache.drill.jdbc.impl.DrillConnectionImpl.<init>(DrillConnectionImpl.java:165)
> ... 18 more
> Caused by: java.lang.NullPointerException
> at org.apache.drill.exec.coord.zk.ZKACLProviderFactory.findACLProvider(ZKACLProviderFactory.java:68)
> at org.apache.drill.exec.coord.zk.ZKACLProviderFactory.getACLProvider(ZKACLProviderFactory.java:47)
> at org.apache.drill.exec.coord.zk.ZKClusterCoordinator.<init>(ZKClusterCoordinator.java:114)
> at org.apache.drill.exec.coord.zk.ZKClusterCoordinator.<init>(ZKClusterCoordinator.java:86)
> at org.apache.drill.exec.client.DrillClient.connect(DrillClient.java:337)
> ... 19 more
> Apache Drill 1.15.0.0
> "This isn't your grandfather's SQL."
> sqlline>
> -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6540) Upgrade to HADOOP-3.1 libraries
[ https://issues.apache.org/jira/browse/DRILL-6540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sorabh Hamirwasia updated DRILL-6540: - Fix Version/s: (was: 1.16.0) 1.17.0 > Upgrade to HADOOP-3.1 libraries > > > Key: DRILL-6540 > URL: https://issues.apache.org/jira/browse/DRILL-6540 > Project: Apache Drill > Issue Type: Improvement > Components: Tools, Build & Test >Affects Versions: 1.14.0 >Reporter: Vitalii Diravka >Assignee: Vitalii Diravka >Priority: Major > Fix For: 1.17.0 > > > Currently Drill uses version 2.7.4 of the Hadoop libraries (hadoop-common, > hadoop-hdfs, hadoop-annotations, hadoop-aws, hadoop-yarn-api, hadoop-client, > hadoop-yarn-client). > Half a year ago [Hadoop > 3.0|https://hadoop.apache.org/docs/r3.0.0/index.html] was released, and > recently there was an update - [Hadoop > 3.2.0|https://hadoop.apache.org/docs/r3.2.0/]. > To use Drill under a Hadoop 3.0 distribution we need this upgrade. Also, the > newer version includes new features which can be useful for Drill. > This upgrade is also needed to leverage the newest version of the Zookeeper > libraries and Hive 3.1. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-7098) File Metadata Metastore Plugin
[ https://issues.apache.org/jira/browse/DRILL-7098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sorabh Hamirwasia updated DRILL-7098: - Fix Version/s: (was: 1.16.0) 2.0.0 > File Metadata Metastore Plugin > -- > > Key: DRILL-7098 > URL: https://issues.apache.org/jira/browse/DRILL-7098 > Project: Apache Drill > Issue Type: Sub-task > Components: Server, Metadata >Reporter: Vitalii Diravka >Assignee: Vitalii Diravka >Priority: Major > Labels: Metastore > Fix For: 2.0.0 > > > DRILL-6852 introduces Drill Metastore API. > The second step is to create internal Drill Metastore mechanism (and File > Metastore Plugin), which will involve Metastore API and can be extended for > using by other Storage Plugins. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6960) Auto Limit Wrapping should not apply to non-select query
[ https://issues.apache.org/jira/browse/DRILL-6960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kunal Khatua updated DRILL-6960: Fix Version/s: (was: 1.16.0) 1.17.0 > Auto Limit Wrapping should not apply to non-select query > > > Key: DRILL-6960 > URL: https://issues.apache.org/jira/browse/DRILL-6960 > Project: Apache Drill > Issue Type: Bug > Components: Web Server >Affects Versions: 1.16.0 >Reporter: Kunal Khatua >Assignee: Kunal Khatua >Priority: Blocker > Labels: user-experience > Fix For: 1.17.0 > > > [~IhorHuzenko] pointed out that DRILL-6050 can cause submission of queries > with incorrect syntax. > For example, when a user enters {{SHOW DATABASES}}, after limit wrapping > {{SELECT * FROM (SHOW DATABASES) LIMIT 10}} will be posted. > This results in parsing errors, like: > {{Query Failed: An Error Occurred > org.apache.drill.common.exceptions.UserRemoteException: PARSE ERROR: > Encountered "( show" at line 2, column 15. Was expecting one of: > ... }}. > The fix should involve a JavaScript check for all non-select queries and not > apply the LIMIT wrap to those queries. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kunal Khatua updated DRILL-5270: Fix Version/s: (was: 1.16.0) 1.17.0 > Improve loading of profiles listing in the WebUI > > > Key: DRILL-5270 > URL: https://issues.apache.org/jira/browse/DRILL-5270 > Project: Apache Drill > Issue Type: Improvement > Components: Web Server >Affects Versions: 1.9.0 >Reporter: Kunal Khatua >Assignee: Kunal Khatua >Priority: Major > Fix For: 1.17.0 > > > Currently, as the number of profiles increase, we reload the same list of > profiles from the FS. > An ideal improvement would be to detect if there are any new profiles and > only reload from the disk then. Otherwise, a cached list is sufficient. > For a directory of 280K profiles, the load time is close to 6 seconds on a 32 > core server. With the caching, we can get it down to as much as a few > milliseconds. > To render the cache as invalid, we inspect the last modified time of the > directory to confirm whether a reload is needed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
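The invalidation scheme described above — reload the listing only when the directory's last-modified time changes — can be sketched like this. It is a simplified illustration, not the actual WebServer code; the suppliers are injected (hypothetical design) so the caching behavior is easy to exercise:

```java
import java.util.List;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Caches an expensive profile-directory listing and refreshes it only when the
// directory's modification time changes.
public class ProfileListCache {
    private final LongSupplier dirLastModified;   // e.g. () -> profileDir.lastModified()
    private final Supplier<List<String>> loader;  // the expensive listing from disk
    private long cachedStamp = Long.MIN_VALUE;
    private List<String> cached;

    public ProfileListCache(LongSupplier dirLastModified, Supplier<List<String>> loader) {
        this.dirLastModified = dirLastModified;
        this.loader = loader;
    }

    public synchronized List<String> list() {
        long stamp = dirLastModified.getAsLong();
        if (cached == null || stamp != cachedStamp) {
            cached = loader.get();   // only hit the file system when the mtime moved
            cachedStamp = stamp;
        }
        return cached;
    }
}
```

Checking one mtime is a single metadata call, which is why the cached path can answer in milliseconds where a full listing of 280K profiles takes seconds.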
[jira] [Updated] (DRILL-2362) Drill should manage Query Profiling archiving
[ https://issues.apache.org/jira/browse/DRILL-2362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kunal Khatua updated DRILL-2362: Fix Version/s: (was: 1.16.0) 1.17.0 > Drill should manage Query Profiling archiving > - > > Key: DRILL-2362 > URL: https://issues.apache.org/jira/browse/DRILL-2362 > Project: Apache Drill > Issue Type: New Feature > Components: Storage - Other >Affects Versions: 0.7.0 >Reporter: Chris Westin >Assignee: Kunal Khatua >Priority: Major > Fix For: 1.17.0 > > > We collect query profile information for analysis purposes, but we keep it > forever. At this time, for a few queries, it isn't a problem. But as users > start putting Drill into production, automated use via other applications > will make this grow quickly. We need to come up with a retention policy > mechanism, with suitable settings administrators can use, and implement it so > that this data can be cleaned up. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
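A retention policy of the kind proposed could reduce to an age check against an admin-configured window. A hypothetical sketch (class and method names are illustrative, not an actual Drill API):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical retention check: profiles older than the configured window are
// selected for cleanup.
public class ProfileRetention {

    public static boolean isExpired(long profileMtimeMs, long nowMs, long retentionMs) {
        return nowMs - profileMtimeMs > retentionMs;
    }

    // Given profile name -> last-modified time, return the profiles to delete.
    public static List<String> expired(Map<String, Long> mtimes, long nowMs, long retentionMs) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Long> e : mtimes.entrySet()) {
            if (isExpired(e.getValue(), nowMs, retentionMs)) {
                out.add(e.getKey());
            }
        }
        return out;
    }
}
```

An administrator-facing setting would then just be the retention window (plus, perhaps, an archive-instead-of-delete switch), with the sweep run periodically by the server.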
[jira] [Updated] (DRILL-7113) Issue with filtering null values from MapRDB-JSON
[ https://issues.apache.org/jira/browse/DRILL-7113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sorabh Hamirwasia updated DRILL-7113: - Reviewer: Hanumath Rao Maduri > Issue with filtering null values from MapRDB-JSON > - > > Key: DRILL-7113 > URL: https://issues.apache.org/jira/browse/DRILL-7113 > Project: Apache Drill > Issue Type: Bug > Components: Query Planning & Optimization >Affects Versions: 1.15.0 >Reporter: Hanumath Rao Maduri >Assignee: Aman Sinha >Priority: Major > Fix For: 1.16.0 > > > When Drill queries documents from MapRDB-JSON that contain fields with a > null value, it returns wrong results. > The issue was reproduced locally. > Repro steps:
> [1] Create a MapR-DB JSON table, say '/tmp/dmdb2/'.
> [2] Insert the following sample records into the table:
> {code:java}
> insert --table /tmp/dmdb2/ --value '{"_id": "1", "label": "person", "confidence": 0.24}'
> insert --table /tmp/dmdb2/ --value '{"_id": "2", "label": "person2"}'
> insert --table /tmp/dmdb2/ --value '{"_id": "3", "label": "person3", "confidence": 0.54}'
> insert --table /tmp/dmdb2/ --value '{"_id": "4", "label": "person4", "confidence": null}'
> {code}
> We can see that for the field 'confidence', document 1 has value 0.24, document 3 has value 0.54, document 2 does not have the field, and document 4 has the field with a null value.
> [3] Query the table from Drill.
> *Query 1:*
> {code:java}
> 0: jdbc:drill:> select label,confidence from dfs.tmp.dmdb2;
> +----------+-------------+
> |  label   | confidence  |
> +----------+-------------+
> | person   | 0.24        |
> | person2  | null        |
> | person3  | 0.54        |
> | person4  | null        |
> +----------+-------------+
> 4 rows selected (0.2 seconds)
> {code}
> *Query 2:*
> {code:java}
> 0: jdbc:drill:> select * from dfs.tmp.dmdb2;
> +------+-------------+----------+
> | _id  | confidence  |  label   |
> +------+-------------+----------+
> | 1    | 0.24        | person   |
> | 2    | null        | person2  |
> | 3    | 0.54        | person3  |
> | 4    | null        | person4  |
> +------+-------------+----------+
> 4 rows selected (0.174 seconds)
> {code}
> *Query 3:*
> {code:java}
> 0: jdbc:drill:> select label,confidence from dfs.tmp.dmdb2 where confidence is not null;
> +----------+-------------+
> |  label   | confidence  |
> +----------+-------------+
> | person   | 0.24        |
> | person3  | 0.54        |
> | person4  | null        |
> +----------+-------------+
> 3 rows selected (0.192 seconds)
> {code}
> *Query 4:*
> {code:java}
> 0: jdbc:drill:> select label,confidence from dfs.tmp.dmdb2 where confidence is null;
> +----------+-------------+
> |  label   | confidence  |
> +----------+-------------+
> | person2  | null        |
> +----------+-------------+
> 1 row selected (0.262 seconds)
> {code}
> As you can see, Query 3, which asks for all documents whose confidence value 'is not null', also returns a document with a null value.
> *Other observation:*
> Querying the same data using Drill without MapRDB returns the correct result. 
> For example, create 4 different JSON files with the following data:
> {"label": "person", "confidence": 0.24}
> {"label": "person2"}
> {"label": "person3", "confidence": 0.54}
> {"label": "person4", "confidence": null}
> Query it directly using Drill:
> *Query 5:*
> {code:java}
> 0: jdbc:drill:> select label,confidence from dfs.tmp.t2;
> +----------+-------------+
> |  label   | confidence  |
> +----------+-------------+
> | person4  | null        |
> | person3  | 0.54        |
> | person2  | null        |
> | person   | 0.24        |
> +----------+-------------+
> 4 rows selected (0.203 seconds)
> {code}
> *Query 6:*
> {code:java}
> 0: jdbc:drill:> select label,confidence from dfs.tmp.t2 where confidence is null;
> +----------+-------------+
> |  label   | confidence  |
> +----------+-------------+
> | person4  | null        |
> | person2  | null        |
> +----------+-------------+
> 2 rows selected (0.352 seconds)
> {code}
> *Query 7:*
> {code:java}
> 0: jdbc:drill:> select label,confidence from dfs.tmp.t2 where confidence is not null;
> +----------+-------------+
> |  label   | confidence  |
> +----------+-------------+
> | person3  | 0.54        |
> | person   | 0.24        |
> +----------+-------------+
> 2 rows selected (0.265 seconds)
> {code}
> As seen in Queries 6 and 7, Drill returns the correct results.
> I believe the issue is in the MapRDB layer where the results are fetched.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
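The expected behavior can be sketched outside Drill. The following Python sketch (illustrative only, not Drill code) models the four sample documents and SQL's convention that a missing field and an explicitly null field are both NULL, which is why the MapRDB-JSON IS NOT NULL query should return two rows, not three:

```python
# Sample documents from the report: document 2 lacks 'confidence',
# document 4 has it explicitly null.
docs = [
    {"_id": "1", "label": "person", "confidence": 0.24},
    {"_id": "2", "label": "person2"},
    {"_id": "3", "label": "person3", "confidence": 0.54},
    {"_id": "4", "label": "person4", "confidence": None},
]

def is_null(doc, field):
    # SQL-style semantics: a missing field and an explicit null
    # both count as NULL.
    return doc.get(field) is None

def where_is_not_null(rows, field):
    return [d for d in rows if not is_null(d, field)]

def where_is_null(rows, field):
    return [d for d in rows if is_null(d, field)]
```

Under these semantics, only documents 1 and 3 satisfy IS NOT NULL and documents 2 and 4 satisfy IS NULL, matching Queries 6 and 7 run against plain JSON files.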
[jira] [Updated] (DRILL-7113) Issue with filtering null values from MapRDB-JSON
[ https://issues.apache.org/jira/browse/DRILL-7113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aman Sinha updated DRILL-7113: -- Fix Version/s: (was: 1.17.0) > Issue with filtering null values from MapRDB-JSON > - > > Key: DRILL-7113 > URL: https://issues.apache.org/jira/browse/DRILL-7113 > Project: Apache Drill > Issue Type: Bug > Components: Query Planning & Optimization >Affects Versions: 1.15.0 >Reporter: Hanumath Rao Maduri >Assignee: Aman Sinha >Priority: Major > Fix For: 1.16.0 > > > When Drill queries documents from a MapRDB-JSON table that contain fields with > a null value, it returns wrong results. > The issue can be reproduced locally. > Repro steps: > [1] Create a MapRDB-JSON table, say '/tmp/dmdb2/'. > [2] Insert the following sample records into the table: > {code:java} > insert --table /tmp/dmdb2/ --value '{"_id": "1", "label": "person", > "confidence": 0.24}' > insert --table /tmp/dmdb2/ --value '{"_id": "2", "label": "person2"}' > insert --table /tmp/dmdb2/ --value '{"_id": "3", "label": "person3", > "confidence": 0.54}' > insert --table /tmp/dmdb2/ --value '{"_id": "4", "label": "person4", > "confidence": null}' > {code} > We can see that for the field 'confidence', document 1 has value 0.24, document 3 > has value 0.54, document 2 does not have the field, and document 4 has the > field with value null. > [3] Query the table from Drill. 
> *Query 1:*
> {code:java}
> 0: jdbc:drill:> select label,confidence from dfs.tmp.dmdb2;
> +----------+-------------+
> |  label   | confidence  |
> +----------+-------------+
> | person   | 0.24        |
> | person2  | null        |
> | person3  | 0.54        |
> | person4  | null        |
> +----------+-------------+
> 4 rows selected (0.2 seconds)
> {code}
> *Query 2:*
> {code:java}
> 0: jdbc:drill:> select * from dfs.tmp.dmdb2;
> +------+-------------+----------+
> | _id  | confidence  |  label   |
> +------+-------------+----------+
> | 1    | 0.24        | person   |
> | 2    | null        | person2  |
> | 3    | 0.54        | person3  |
> | 4    | null        | person4  |
> +------+-------------+----------+
> 4 rows selected (0.174 seconds)
> {code}
> *Query 3:*
> {code:java}
> 0: jdbc:drill:> select label,confidence from dfs.tmp.dmdb2 where confidence is not null;
> +----------+-------------+
> |  label   | confidence  |
> +----------+-------------+
> | person   | 0.24        |
> | person3  | 0.54        |
> | person4  | null        |
> +----------+-------------+
> 3 rows selected (0.192 seconds)
> {code}
> *Query 4:*
> {code:java}
> 0: jdbc:drill:> select label,confidence from dfs.tmp.dmdb2 where confidence is null;
> +----------+-------------+
> |  label   | confidence  |
> +----------+-------------+
> | person2  | null        |
> +----------+-------------+
> 1 row selected (0.262 seconds)
> {code}
> As you can see, Query 3, which asks for all documents whose confidence value 'is not null', also returns a document with a null value.
> *Other observation:*
> Querying the same data using Drill without MapRDB returns the correct result. 
> For example, create 4 different JSON files with the following data:
> {"label": "person", "confidence": 0.24}
> {"label": "person2"}
> {"label": "person3", "confidence": 0.54}
> {"label": "person4", "confidence": null}
> Query it directly using Drill:
> *Query 5:*
> {code:java}
> 0: jdbc:drill:> select label,confidence from dfs.tmp.t2;
> +----------+-------------+
> |  label   | confidence  |
> +----------+-------------+
> | person4  | null        |
> | person3  | 0.54        |
> | person2  | null        |
> | person   | 0.24        |
> +----------+-------------+
> 4 rows selected (0.203 seconds)
> {code}
> *Query 6:*
> {code:java}
> 0: jdbc:drill:> select label,confidence from dfs.tmp.t2 where confidence is null;
> +----------+-------------+
> |  label   | confidence  |
> +----------+-------------+
> | person4  | null        |
> | person2  | null        |
> +----------+-------------+
> 2 rows selected (0.352 seconds)
> {code}
> *Query 7:*
> {code:java}
> 0: jdbc:drill:> select label,confidence from dfs.tmp.t2 where confidence is not null;
> +----------+-------------+
> |  label   | confidence  |
> +----------+-------------+
> | person3  | 0.54        |
> | person   | 0.24        |
> +----------+-------------+
> 2 rows selected (0.265 seconds)
> {code}
> As seen in Queries 6 and 7, Drill returns the correct results.
> I believe the issue is in the MapRDB layer where the results are fetched.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6970) Issue with LogRegex format plugin where drillbuf was overflowing
[ https://issues.apache.org/jira/browse/DRILL-6970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-6970: Labels: (was: ready-to-commit) > Issue with LogRegex format plugin where drillbuf was overflowing > - > > Key: DRILL-6970 > URL: https://issues.apache.org/jira/browse/DRILL-6970 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.15.0 >Reporter: jean-claude >Assignee: jean-claude >Priority: Major > Fix For: 1.16.0 > > > The log format plugin does not re-allocate the drillbuf when it fills up. You can > query small log files, but larger ones fail with this error: > 0: jdbc:drill:zk=local> select * from dfs.root.`/prog/test.log`; > Error: INTERNAL_ERROR ERROR: index: 32724, length: 108 (expected: range(0, > 32768)) > Fragment 0:0 > Please, refer to logs for more information. > > I'm running drill-embedded. The log storage plugin is configured as follows: > {code:java} > "log": { > "type": "logRegex", > "regex": "(.+)", > "extension": "log", > "maxErrors": 10, > "schema": [ > { > "fieldName": "line" > } > ] > }, > {code} > The log file is very simple > {code:java} > jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk > jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk > jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk > jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk > jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk > jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk > ...{code} > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
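The fix direction implied above (growing the buffer instead of writing past its end) can be sketched generically. This Python sketch is an assumed illustration of the reallocation logic behind the `index: 32724, length: 108 (expected: range(0, 32768))` error, not Drill's actual DrillBuf API:

```python
# Minimal growable write buffer: check remaining capacity before each
# write and reallocate (double) instead of writing past the end.
class GrowableBuffer:
    def __init__(self, capacity=32768):
        self.buf = bytearray(capacity)
        self.pos = 0

    def write(self, data: bytes):
        # Double the capacity until the pending write fits, mirroring
        # the index/length bounds check that failed in the report.
        while self.pos + len(data) > len(self.buf):
            self.buf.extend(bytearray(len(self.buf)))
        self.buf[self.pos:self.pos + len(data)] = data
        self.pos += len(data)
```

Without the capacity check, the third write below would land past the end of the initial 16-byte buffer, which is the analogue of the reported overflow.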
[jira] [Created] (DRILL-7116) Adapt statistics to use Drill Metastore API
Volodymyr Vysotskyi created DRILL-7116: -- Summary: Adapt statistics to use Drill Metastore API Key: DRILL-7116 URL: https://issues.apache.org/jira/browse/DRILL-7116 Project: Apache Drill Issue Type: Sub-task Affects Versions: 1.16.0 Reporter: Volodymyr Vysotskyi Assignee: Volodymyr Vysotskyi Fix For: 1.17.0 The current implementation of statistics relies on files for storing and reading statistics. The aim of this Jira is to adapt statistics to use the Drill Metastore API so that in the future statistics may be stored in other metastore implementations. Implementation details: - Move statistics info into {{TableMetadata}} - Provide a way for obtaining {{TableMetadata}} in the places where statistics may be used (partially implemented in the scope of DRILL-7089) - Investigate and implement (if possible) lazy materialization of {{DrillStatsTable}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
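The lazy-materialization idea in the last bullet can be sketched as follows. All names here (the `TableMetadata` class, its `statistics` property, the loader callback) are illustrative stand-ins, not the actual Drill Metastore API:

```python
# Sketch: statistics live on table metadata and are materialized
# lazily, on first access, by a pluggable loader (a stats file today,
# possibly another metastore implementation later).
class TableMetadata:
    def __init__(self, name, stats_loader):
        self.name = name
        self._stats_loader = stats_loader  # deferred source of statistics
        self._stats = None

    @property
    def statistics(self):
        if self._stats is None:            # materialize only once, on demand
            self._stats = self._stats_loader()
        return self._stats
```

The loader is never invoked for queries that do not consult statistics, which is the point of making materialization lazy.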
[jira] [Updated] (DRILL-6923) Show schemas uses default(user defined) schema first for resolving table from information_schema
[ https://issues.apache.org/jira/browse/DRILL-6923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Igor Guzenko updated DRILL-6923: Fix Version/s: (was: 1.16.0) 1.17.0 > Show schemas uses default(user defined) schema first for resolving table from > information_schema > > > Key: DRILL-6923 > URL: https://issues.apache.org/jira/browse/DRILL-6923 > Project: Apache Drill > Issue Type: Bug > Components: Storage - Hive >Affects Versions: 1.14.0 >Reporter: Igor Guzenko >Assignee: Igor Guzenko >Priority: Minor > Fix For: 1.17.0 > > > Show schemas tries to find the table `information_schema`.`schemata` in the default > (user-defined) schema, and after a failed attempt it resolves the table > successfully against the root schema. Please check the description below for details, > explained using an example with the Hive plugin. > *Abstract* > When Drill is used with Hive SQL Standard authorization enabled, execution of > queries like > {code:sql} > USE hive.db_general; > SHOW SCHEMAS LIKE 'hive.%'; {code} > results in the error DrillRuntimeException: Failed to use the Hive authorization > components: Error getting object from metastore for Object > [type=TABLE_OR_VIEW, name=db_general.information_schema]. > *Details* > Consider a showSchemas() test similar to the one defined in > TestSqlStdBasedAuthorization: > {code:java} > @Test > public void showSchemas() throws Exception { > test("USE " + hivePluginName + "." 
+ db_general); > testBuilder() > .sqlQuery("SHOW SCHEMAS LIKE 'hive.%'") > .unOrdered() > .baselineColumns("SCHEMA_NAME") > .baselineValues("hive.db_general") > .baselineValues("hive.default") > .go(); > } > {code} > Currently execution of such test will produce following stacktrace: > {code:none} > Caused by: org.apache.drill.common.exceptions.DrillRuntimeException: Failed > to use the Hive authorization components: Error getting object from metastore > for Object [type=TABLE_OR_VIEW, name=db_general.information_schema] > at > org.apache.drill.exec.store.hive.HiveAuthorizationHelper.authorize(HiveAuthorizationHelper.java:149) > at > org.apache.drill.exec.store.hive.HiveAuthorizationHelper.authorizeReadTable(HiveAuthorizationHelper.java:134) > at > org.apache.drill.exec.store.hive.DrillHiveMetaStoreClient$HiveClientWithAuthzWithCaching.getHiveReadEntry(DrillHiveMetaStoreClient.java:450) > at > org.apache.drill.exec.store.hive.schema.HiveSchemaFactory$HiveSchema.getSelectionBaseOnName(HiveSchemaFactory.java:233) > at > org.apache.drill.exec.store.hive.schema.HiveSchemaFactory$HiveSchema.getDrillTable(HiveSchemaFactory.java:214) > at > org.apache.drill.exec.store.hive.schema.HiveDatabaseSchema.getTable(HiveDatabaseSchema.java:63) > at > org.apache.calcite.jdbc.SimpleCalciteSchema.getImplicitTable(SimpleCalciteSchema.java:83) > at org.apache.calcite.jdbc.CalciteSchema.getTable(CalciteSchema.java:288) > at org.apache.calcite.sql.validate.EmptyScope.resolve_(EmptyScope.java:143) > at org.apache.calcite.sql.validate.EmptyScope.resolveTable(EmptyScope.java:99) > at > org.apache.calcite.sql.validate.DelegatingScope.resolveTable(DelegatingScope.java:203) > at > org.apache.calcite.sql.validate.IdentifierNamespace.resolveImpl(IdentifierNamespace.java:105) > at > org.apache.calcite.sql.validate.IdentifierNamespace.validateImpl(IdentifierNamespace.java:177) > at > org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:84) > at > 
org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:967) > at > org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:943) > at > org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:3032) > at > org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom(SqlConverter.java:274) > at > org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:3014) > at > org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom(SqlConverter.java:274) > at > org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect(SqlValidatorImpl.java:3284) > at > org.apache.calcite.sql.validate.SelectNamespace.validateImpl(SelectNamespace.java:60) > at > org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:84) > at > org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:967) > at > org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:943) > at org.apache.calcite.sql.SqlSelect.validate(SqlSelect.java:225) > at > org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression(SqlValidatorImpl.java:918) >
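The lookup order described in this issue can be modeled with a toy sketch. This is a hypothetical illustration of the reported behavior, not Calcite's or Drill's actual resolver: `information_schema`.`schemata` is probed against the default (user-set) Hive schema first, which triggers the Hive authorization check for a table that does not exist there, before the root schema resolves it:

```python
# Toy resolver: try the default schema first, then the root schema,
# recording every schema that gets probed along the way.
def resolve(table, default_schema, root_schema):
    probed = []
    for schema in (default_schema, root_schema):  # reported order: default first
        probed.append(schema["name"])
        if table in schema["tables"]:
            return schema["name"], probed
    return None, probed
```

In the buggy order the Hive schema is probed (and authorization consulted) even though `information_schema`.`schemata` can only ever live in the root schema.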
[jira] [Updated] (DRILL-6923) Show schemas uses default(user defined) schema first for resolving table from information_schema
[ https://issues.apache.org/jira/browse/DRILL-6923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Igor Guzenko updated DRILL-6923: Priority: Minor (was: Major) > Show schemas uses default(user defined) schema first for resolving table from > information_schema > > > Key: DRILL-6923 > URL: https://issues.apache.org/jira/browse/DRILL-6923 > Project: Apache Drill > Issue Type: Bug > Components: Storage - Hive >Affects Versions: 1.14.0 >Reporter: Igor Guzenko >Assignee: Igor Guzenko >Priority: Minor > Fix For: 1.16.0 > > > Show schemas tries to find the table `information_schema`.`schemata` in the default > (user-defined) schema, and after a failed attempt it resolves the table > successfully against the root schema. Please check the description below for details, > explained using an example with the Hive plugin. > *Abstract* > When Drill is used with Hive SQL Standard authorization enabled, execution of > queries like > {code:sql} > USE hive.db_general; > SHOW SCHEMAS LIKE 'hive.%'; {code} > results in the error DrillRuntimeException: Failed to use the Hive authorization > components: Error getting object from metastore for Object > [type=TABLE_OR_VIEW, name=db_general.information_schema]. > *Details* > Consider a showSchemas() test similar to the one defined in > TestSqlStdBasedAuthorization: > {code:java} > @Test > public void showSchemas() throws Exception { > test("USE " + hivePluginName + "." 
+ db_general); > testBuilder() > .sqlQuery("SHOW SCHEMAS LIKE 'hive.%'") > .unOrdered() > .baselineColumns("SCHEMA_NAME") > .baselineValues("hive.db_general") > .baselineValues("hive.default") > .go(); > } > {code} > Currently execution of such test will produce following stacktrace: > {code:none} > Caused by: org.apache.drill.common.exceptions.DrillRuntimeException: Failed > to use the Hive authorization components: Error getting object from metastore > for Object [type=TABLE_OR_VIEW, name=db_general.information_schema] > at > org.apache.drill.exec.store.hive.HiveAuthorizationHelper.authorize(HiveAuthorizationHelper.java:149) > at > org.apache.drill.exec.store.hive.HiveAuthorizationHelper.authorizeReadTable(HiveAuthorizationHelper.java:134) > at > org.apache.drill.exec.store.hive.DrillHiveMetaStoreClient$HiveClientWithAuthzWithCaching.getHiveReadEntry(DrillHiveMetaStoreClient.java:450) > at > org.apache.drill.exec.store.hive.schema.HiveSchemaFactory$HiveSchema.getSelectionBaseOnName(HiveSchemaFactory.java:233) > at > org.apache.drill.exec.store.hive.schema.HiveSchemaFactory$HiveSchema.getDrillTable(HiveSchemaFactory.java:214) > at > org.apache.drill.exec.store.hive.schema.HiveDatabaseSchema.getTable(HiveDatabaseSchema.java:63) > at > org.apache.calcite.jdbc.SimpleCalciteSchema.getImplicitTable(SimpleCalciteSchema.java:83) > at org.apache.calcite.jdbc.CalciteSchema.getTable(CalciteSchema.java:288) > at org.apache.calcite.sql.validate.EmptyScope.resolve_(EmptyScope.java:143) > at org.apache.calcite.sql.validate.EmptyScope.resolveTable(EmptyScope.java:99) > at > org.apache.calcite.sql.validate.DelegatingScope.resolveTable(DelegatingScope.java:203) > at > org.apache.calcite.sql.validate.IdentifierNamespace.resolveImpl(IdentifierNamespace.java:105) > at > org.apache.calcite.sql.validate.IdentifierNamespace.validateImpl(IdentifierNamespace.java:177) > at > org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:84) > at > 
org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:967) > at > org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:943) > at > org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:3032) > at > org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom(SqlConverter.java:274) > at > org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:3014) > at > org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom(SqlConverter.java:274) > at > org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect(SqlValidatorImpl.java:3284) > at > org.apache.calcite.sql.validate.SelectNamespace.validateImpl(SelectNamespace.java:60) > at > org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:84) > at > org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:967) > at > org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:943) > at org.apache.calcite.sql.SqlSelect.validate(SqlSelect.java:225) > at > org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression(SqlValidatorImpl.java:918) > at > org.apache.calcite.sql.
[jira] [Updated] (DRILL-7076) NPE is logged when querying postgres tables
[ https://issues.apache.org/jira/browse/DRILL-7076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-7076: Priority: Blocker (was: Minor) > NPE is logged when querying postgres tables > --- > > Key: DRILL-7076 > URL: https://issues.apache.org/jira/browse/DRILL-7076 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.16.0 >Reporter: Volodymyr Vysotskyi >Assignee: Gautam Parai >Priority: Blocker > Fix For: 1.16.0 > > > NPE is seen in logs when querying Postgres table: > {code:sql} > select 1 from postgres.public.tdt > {code} > Stack trace from {{sqlline.log}}: > {noformat} > 2019-03-05 13:49:19,395 [23819dc0-abf8-24f3-ea81-6ced1b6e11af:foreman] WARN > o.a.d.e.p.common.DrillStatsTable - Failed to materialize the stats. > Continuing without stats. > java.lang.NullPointerException: null > at > org.apache.drill.exec.planner.common.DrillStatsTable$StatsMaterializationVisitor.visit(DrillStatsTable.java:189) > [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] > at org.apache.calcite.rel.SingleRel.childrenAccept(SingleRel.java:72) > [calcite-core-1.18.0-drill-r0.jar:1.18.0-drill-r0] > at org.apache.calcite.rel.RelVisitor.visit(RelVisitor.java:44) > [calcite-core-1.18.0-drill-r0.jar:1.18.0-drill-r0] > at > org.apache.drill.exec.planner.common.DrillStatsTable$StatsMaterializationVisitor.visit(DrillStatsTable.java:202) > [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] > at org.apache.calcite.rel.RelVisitor.go(RelVisitor.java:61) > [calcite-core-1.18.0-drill-r0.jar:1.18.0-drill-r0] > at > org.apache.drill.exec.planner.common.DrillStatsTable$StatsMaterializationVisitor.materialize(DrillStatsTable.java:177) > [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] > at > org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToRawDrel(DefaultSqlHandler.java:235) > [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] > at > 
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:331) > [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] > at > org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan(DefaultSqlHandler.java:178) > [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] > at > org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan(DrillSqlWorker.java:204) > [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] > at > org.apache.drill.exec.planner.sql.DrillSqlWorker.convertPlan(DrillSqlWorker.java:114) > [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] > at > org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:80) > [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] > at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:584) > [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] > at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:272) > [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > [na:1.8.0_191] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > [na:1.8.0_191] > at java.lang.Thread.run(Thread.java:748) [na:1.8.0_191] > {noformat} > But query runs and returns the correct result. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
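The defensive guard this NPE suggests can be sketched generically. This Python sketch is an assumed illustration, not the actual DrillStatsTable visitor code: scans that expose no statistics (such as JDBC-backed Postgres tables) should be skipped rather than dereferenced:

```python
# Walk a list of scanned tables and materialize statistics only for
# those that actually provide a stats object; a None stats entry (the
# JDBC/Postgres case in the report) is skipped instead of raising.
def materialize_stats(tables):
    materialized = []
    for t in tables:
        stats = t.get("stats")   # may be None for JDBC-backed tables
        if stats is None:
            continue             # skip: nothing to materialize
        materialized.append((t["name"], stats))
    return materialized
```

This matches the observed behavior that the query itself still succeeds: missing statistics should degrade planning gracefully, not log an NPE.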
[jira] [Updated] (DRILL-6540) Upgrade to HADOOP-3.1 libraries
[ https://issues.apache.org/jira/browse/DRILL-6540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-6540: Reviewer: Sorabh Hamirwasia > Upgrade to HADOOP-3.1 libraries > > > Key: DRILL-6540 > URL: https://issues.apache.org/jira/browse/DRILL-6540 > Project: Apache Drill > Issue Type: Improvement > Components: Tools, Build & Test >Affects Versions: 1.14.0 >Reporter: Vitalii Diravka >Assignee: Vitalii Diravka >Priority: Major > Fix For: 1.16.0 > > > Currently Drill uses version 2.7.4 of the Hadoop libraries (hadoop-common, > hadoop-hdfs, hadoop-annotations, hadoop-aws, hadoop-yarn-api, hadoop-client, > hadoop-yarn-client). > Half a year ago [Hadoop > 3.0|https://hadoop.apache.org/docs/r3.0.0/index.html] was released, and > recently it was updated to [Hadoop > 3.2.0|https://hadoop.apache.org/docs/r3.2.0/]. > To use Drill under a Hadoop 3.0 distribution we need this upgrade. The > newer version also includes new features which can be useful for Drill. > This upgrade is also needed to leverage the newest ZooKeeper > libraries and Hive 3.1. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (DRILL-7079) Drill can't query views from the S3 storage when plain authentication is enabled
[ https://issues.apache.org/jira/browse/DRILL-7079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva reassigned DRILL-7079: --- Assignee: Bohdan Kazydub (was: Arina Ielchiieva) > Drill can't query views from the S3 storage when plain authentication is > enabled > > > Key: DRILL-7079 > URL: https://issues.apache.org/jira/browse/DRILL-7079 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.15.0 >Reporter: Denys Ordynskiy >Assignee: Bohdan Kazydub >Priority: Major > Fix For: 1.16.0 > > > Enable plain authentication in Drill. > Create the view on the S3 storage: > create view s3.tmp.`testview` as select * from cp.`employee.json` limit 20; > Try to select data from the created view: > select * from s3.tmp.`testview`; > *Actual result*: > {noformat} > 2019-02-27 17:01:09,202 [Client-1] INFO > o.a.d.j.i.DrillCursor$ResultsListener - [#4] Query failed: > org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: > IllegalArgumentException: A valid userName is expected > Please, refer to logs for more information. 
> [Error Id: 2271c3aa-6d09-4b51-a585-0e0e954b46eb on maprhost:31010] > at > org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:123) > [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] > at > org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:422) > [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] > at org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:96) > [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] > at > org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:273) > [drill-rpc-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] > at > org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:243) > [drill-rpc-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] > at > io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:88) > [netty-codec-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287) > [netty-handler-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at 
> io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102) > [netty-codec-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:312) > [netty-codec-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:286) > [netty-codec-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86
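The failure mode above ("IllegalArgumentException: A valid userName is expected") can be illustrated generically: with plain authentication enabled, the code path that opens the view file must be handed the query user's name, and an empty user name fails fast. This Python sketch is hypothetical, not Drill's actual view-resolution code:

```python
# Stand-in for opening a view file on behalf of a user: an empty or
# missing user name is rejected up front, mirroring the reported error.
def open_view_as(user, path):
    if not user:
        raise ValueError("A valid userName is expected")
    # Placeholder for a FileSystem/view open performed as `user`.
    return f"{path} (opened as {user})"
```

The bug report implies that with plain authentication enabled, the S3 view path reaches this check without a user name having been propagated.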
[jira] [Assigned] (DRILL-7079) Drill can't query views from the S3 storage when plain authentication is enabled
[ https://issues.apache.org/jira/browse/DRILL-7079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva reassigned DRILL-7079: --- Assignee: Arina Ielchiieva (was: Bohdan Kazydub) > Drill can't query views from the S3 storage when plain authentication is > enabled > > > Key: DRILL-7079 > URL: https://issues.apache.org/jira/browse/DRILL-7079 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.15.0 >Reporter: Denys Ordynskiy >Assignee: Arina Ielchiieva >Priority: Major > Fix For: 1.16.0 > > > Enable plain authentication in Drill. > Create the view on the S3 storage: > create view s3.tmp.`testview` as select * from cp.`employee.json` limit 20; > Try to select data from the created view: > select * from s3.tmp.`testview`; > *Actual result*: > {noformat} > 2019-02-27 17:01:09,202 [Client-1] INFO > o.a.d.j.i.DrillCursor$ResultsListener - [#4] Query failed: > org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: > IllegalArgumentException: A valid userName is expected > Please, refer to logs for more information. 
> [Error Id: 2271c3aa-6d09-4b51-a585-0e0e954b46eb on maprhost:31010] > at > org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:123) > [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] > at > org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:422) > [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] > at org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:96) > [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] > at > org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:273) > [drill-rpc-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] > at > org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:243) > [drill-rpc-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] > at > io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:88) > [netty-codec-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287) > [netty-handler-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at 
> io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102) > [netty-codec-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:312) > [netty-codec-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:286) > [netty-codec-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:
[jira] [Updated] (DRILL-7079) Drill can't query views from the S3 storage when plain authentication is enabled
[ https://issues.apache.org/jira/browse/DRILL-7079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-7079: Reviewer: Volodymyr Vysotskyi > Drill can't query views from the S3 storage when plain authentication is > enabled > > > Key: DRILL-7079 > URL: https://issues.apache.org/jira/browse/DRILL-7079 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.15.0 >Reporter: Denys Ordynskiy >Assignee: Bohdan Kazydub >Priority: Major > Fix For: 1.16.0 > > > Enable plain authentication in Drill. > Create the view on the S3 storage: > create view s3.tmp.`testview` as select * from cp.`employee.json` limit 20; > Try to select data from the created view: > select * from s3.tmp.`testview`; > *Actual result*: > {noformat} > 2019-02-27 17:01:09,202 [Client-1] INFO > o.a.d.j.i.DrillCursor$ResultsListener - [#4] Query failed: > org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: > IllegalArgumentException: A valid userName is expected > Please, refer to logs for more information. 
> [Error Id: 2271c3aa-6d09-4b51-a585-0e0e954b46eb on maprhost:31010] > at > org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:123) > [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] > at > org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:422) > [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] > at org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:96) > [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] > at > org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:273) > [drill-rpc-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] > at > org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:243) > [drill-rpc-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] > at > io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:88) > [netty-codec-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287) > [netty-handler-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at 
> io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102) > [netty-codec-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:312) > [netty-codec-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:286) > [netty-codec-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86) > [netty-transport-4.0.4
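For context on the "plain authentication is enabled" precondition: PAM-based plain authentication is typically switched on in drill-override.conf. The fragment below is a generic sketch based on the standard configuration layout, not taken from the reporter's environment; profile names are illustrative.

```
drill.exec: {
  security.user.auth: {
    enabled: true,
    packages += "org.apache.drill.exec.rpc.user.security",
    impl: "pam",
    pam_profiles: [ "sudo", "login" ]
  }
}
```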
[jira] [Updated] (DRILL-7115) Improve Hive schema show tables performance
[ https://issues.apache.org/jira/browse/DRILL-7115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-7115: Reviewer: Vitalii Diravka > Improve Hive schema show tables performance > --- > > Key: DRILL-7115 > URL: https://issues.apache.org/jira/browse/DRILL-7115 > Project: Apache Drill > Issue Type: Improvement > Components: Storage - Hive, Storage - Information Schema >Affects Versions: 1.15.0 >Reporter: Igor Guzenko >Assignee: Igor Guzenko >Priority: Major > Fix For: 1.16.0 > > > In Sqlline (Drill), "show tables" on a Hive schema takes nearly 15 to 20 minutes; the schema has ~8000 tables. > The same command in Beeline (Hive) returns the result in a split second (~0.2 secs). > I reproduced this in a test cluster by creating 6000 empty tables in Hive and then running "show tables" in Drill; it took more than 2 minutes (~140 secs). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-7115) Improve Hive schema show tables performance
[ https://issues.apache.org/jira/browse/DRILL-7115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-7115: Fix Version/s: 1.16.0 > Improve Hive schema show tables performance > --- > > Key: DRILL-7115 > URL: https://issues.apache.org/jira/browse/DRILL-7115 > Project: Apache Drill > Issue Type: Improvement > Components: Storage - Hive, Storage - Information Schema >Reporter: Igor Guzenko >Assignee: Igor Guzenko >Priority: Major > Fix For: 1.16.0 > > > In Sqlline (Drill), "show tables" on a Hive schema takes nearly 15 to 20 minutes; the schema has ~8000 tables. > The same command in Beeline (Hive) returns the result in a split second (~0.2 secs). > I reproduced this in a test cluster by creating 6000 empty tables in Hive and then running "show tables" in Drill; it took more than 2 minutes (~140 secs). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-7115) Improve Hive schema show tables performance
[ https://issues.apache.org/jira/browse/DRILL-7115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-7115: Priority: Major (was: Minor) > Improve Hive schema show tables performance > --- > > Key: DRILL-7115 > URL: https://issues.apache.org/jira/browse/DRILL-7115 > Project: Apache Drill > Issue Type: Improvement > Components: Storage - Hive, Storage - Information Schema >Reporter: Igor Guzenko >Assignee: Igor Guzenko >Priority: Major > > In Sqlline (Drill), "show tables" on a Hive schema takes nearly 15 to 20 minutes; the schema has ~8000 tables. > The same command in Beeline (Hive) returns the result in a split second (~0.2 secs). > I reproduced this in a test cluster by creating 6000 empty tables in Hive and then running "show tables" in Drill; it took more than 2 minutes (~140 secs). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-7115) Improve Hive schema show tables performance
[ https://issues.apache.org/jira/browse/DRILL-7115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-7115: Affects Version/s: 1.15.0 > Improve Hive schema show tables performance > --- > > Key: DRILL-7115 > URL: https://issues.apache.org/jira/browse/DRILL-7115 > Project: Apache Drill > Issue Type: Improvement > Components: Storage - Hive, Storage - Information Schema >Affects Versions: 1.15.0 >Reporter: Igor Guzenko >Assignee: Igor Guzenko >Priority: Major > Fix For: 1.16.0 > > > In Sqlline (Drill), "show tables" on a Hive schema takes nearly 15 to 20 minutes; the schema has ~8000 tables. > The same command in Beeline (Hive) returns the result in a split second (~0.2 secs). > I reproduced this in a test cluster by creating 6000 empty tables in Hive and then running "show tables" in Drill; it took more than 2 minutes (~140 secs). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-7115) Improve Hive schema show tables performance
Igor Guzenko created DRILL-7115: --- Summary: Improve Hive schema show tables performance Key: DRILL-7115 URL: https://issues.apache.org/jira/browse/DRILL-7115 Project: Apache Drill Issue Type: Improvement Components: Storage - Hive, Storage - Information Schema Reporter: Igor Guzenko Assignee: Igor Guzenko In Sqlline (Drill), "show tables" on a Hive schema takes nearly 15 to 20 minutes; the schema has ~8000 tables. The same command in Beeline (Hive) returns the result in a split second (~0.2 secs). I reproduced this in a test cluster by creating 6000 empty tables in Hive and then running "show tables" in Drill; it took more than 2 minutes (~140 secs). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
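The 6000-empty-tables repro described above can be scripted. The helper below is a hypothetical sketch (table names and columns are made up for illustration); it only generates the Hive DDL, which would then be fed to Hive, e.g. with `hive -f create_tables.hql`.

```python
# Hypothetical repro helper: generate DDL for thousands of empty Hive tables
# so "show tables" latency can be compared between Drill and Beeline.

def make_ddl(n, prefix="perf_test_t"):
    """Return CREATE TABLE statements for n empty tables."""
    return [
        f"CREATE TABLE IF NOT EXISTS {prefix}{i} (id INT, val STRING);"
        for i in range(n)
    ]

if __name__ == "__main__":
    stmts = make_ddl(6000)
    with open("create_tables.hql", "w") as f:
        f.write("\n".join(stmts))
```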
[jira] [Updated] (DRILL-7095) Expose Tuple Metadata to the physical operator
[ https://issues.apache.org/jira/browse/DRILL-7095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-7095: Labels: ready-to-commit (was: ) > Expose Tuple Metadata to the physical operator > -- > > Key: DRILL-7095 > URL: https://issues.apache.org/jira/browse/DRILL-7095 > Project: Apache Drill > Issue Type: Sub-task >Reporter: Arina Ielchiieva >Assignee: Arina Ielchiieva >Priority: Major > Labels: ready-to-commit > Fix For: 1.16.0 > > > Provide mechanism to expose Tuple Metadata to the physical operator (sub > scan). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6970) Issue with LogRegex format plugin where drillbuf was overflowing
[ https://issues.apache.org/jira/browse/DRILL-6970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-6970: Labels: ready-to-commit (was: ) > Issue with LogRegex format plugin where drillbuf was overflowing > - > > Key: DRILL-6970 > URL: https://issues.apache.org/jira/browse/DRILL-6970 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.15.0 >Reporter: jean-claude >Assignee: jean-claude >Priority: Major > Labels: ready-to-commit > Fix For: 1.16.0 > > > The log format plugin does not re-allocate the drillbuf when it fills up. You can > query small log files, but larger ones will fail with this error: > 0: jdbc:drill:zk=local> select * from dfs.root.`/prog/test.log`; > Error: INTERNAL_ERROR ERROR: index: 32724, length: 108 (expected: range(0, > 32768)) > Fragment 0:0 > Please, refer to logs for more information. > > I'm running drill-embedded. The log storage plugin is configured like so: > {code:java} > "log": { > "type": "logRegex", > "regex": "(.+)", > "extension": "log", > "maxErrors": 10, > "schema": [ > { > "fieldName": "line" > } > ] > }, > {code} > The log file is very simple: > {code:java} > jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk > jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk > jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk > jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk > jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk > jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk > ...{code} > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
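Since the error fires once the buffer index passes 32768, a repro file just needs to be comfortably larger than 32 KB. A small sketch (the file name is arbitrary) that repeats the report's sample line:

```python
# Repro helper: write a log file well past the 32 KB mark from the error
# message, using the same one-line pattern shown in the report.

LINE = "jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk\n"

def write_log(path, min_bytes=64 * 1024):
    """Repeat the sample line until the file exceeds min_bytes; return size."""
    count = min_bytes // len(LINE) + 1
    with open(path, "w") as f:
        f.write(LINE * count)
    return count * len(LINE)
```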
[jira] [Updated] (DRILL-7086) Enhance row-set scan framework to use external schema
[ https://issues.apache.org/jira/browse/DRILL-7086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-7086: Labels: ready-to-commit (was: ) > Enhance row-set scan framework to use external schema > - > > Key: DRILL-7086 > URL: https://issues.apache.org/jira/browse/DRILL-7086 > Project: Apache Drill > Issue Type: Improvement >Affects Versions: 1.15.0 >Reporter: Paul Rogers >Assignee: Paul Rogers >Priority: Major > Labels: ready-to-commit > Fix For: 1.16.0 > > > Modify the row-set scan framework to work with an external (partial) schema, > inserting "type conversion shims" to convert as needed. The reader provides > an "input schema" describing the data types the reader is prepared to handle. An > optional "output schema" describes the types of the value vectors to create. > The type conversion "shims" give the reader the "setFoo" method it wants to > use while converting the data to the type needed for the vector. For > example, the CSV reader might read only text fields, while the shim converts > a column to an INT. > This is just the framework layer; DRILL-7011 will combine this mechanism with > the plan-side features to enable use of the feature in the new row-set based > CSV reader. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
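The "conversion shim" idea above can be illustrated outside Drill. This is a language-agnostic sketch, not Drill's actual Java API: the shim exposes the setter the reader wants (text) while writing the vector's declared output type.

```python
# Illustration only -- class names are invented, not Drill's framework types.

class IntColumnWriter:
    """Stand-in for a value-vector writer whose output type is INT."""
    def __init__(self):
        self.values = []

    def set_int(self, v):
        self.values.append(v)

class TextToIntShim:
    """Gives a text-oriented reader a set_string(); converts to INT inside."""
    def __init__(self, inner):
        self.inner = inner

    def set_string(self, s):
        # Conversion from the reader's input type to the vector's output type
        self.inner.set_int(int(s))

writer = IntColumnWriter()
shim = TextToIntShim(writer)
for cell in ["10", "20", "30"]:  # raw CSV text fields
    shim.set_string(cell)
```

The reader code stays unchanged whether the column lands in a VARCHAR or an INT vector; only the shim wired between them differs.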
[jira] [Commented] (DRILL-6430) Drill Should Not Fail If It Sees Deprecated Options Stored In Zookeeper Or Locally
[ https://issues.apache.org/jira/browse/DRILL-6430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16795938#comment-16795938 ] Bohdan Kazydub commented on DRILL-6430: --- The functionality seems to be implemented in DRILL-2304. > Drill Should Not Fail If It Sees Deprecated Options Stored In Zookeeper Or > Locally > -- > > Key: DRILL-6430 > URL: https://issues.apache.org/jira/browse/DRILL-6430 > Project: Apache Drill > Issue Type: Improvement >Reporter: Timothy Farkas >Assignee: Bohdan Kazydub >Priority: Major > Fix For: 1.17.0 > > > This is required for resource management since we will likely remove many > options. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (DRILL-6430) Drill Should Not Fail If It Sees Deprecated Options Stored In Zookeeper Or Locally
[ https://issues.apache.org/jira/browse/DRILL-6430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bohdan Kazydub resolved DRILL-6430. --- Resolution: Done Fix Version/s: (was: 1.17.0) 1.16.0 > Drill Should Not Fail If It Sees Deprecated Options Stored In Zookeeper Or > Locally > -- > > Key: DRILL-6430 > URL: https://issues.apache.org/jira/browse/DRILL-6430 > Project: Apache Drill > Issue Type: Improvement >Reporter: Timothy Farkas >Assignee: Bohdan Kazydub >Priority: Major > Fix For: 1.16.0 > > > This is required for resource management since we will likely remove many > options. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
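The behavior requested above amounts to filtering persisted options against the currently defined set. A hedged sketch of that pattern (the function and the known-option subset are illustrative, not Drill's actual option-manager API):

```python
# Sketch: when reading persisted options, drop unrecognized/deprecated
# entries with a warning instead of failing startup.
import logging

# Example subset of real Drill option names; the full set is much larger.
KNOWN_OPTIONS = {"planner.slice_target", "exec.errors.verbose"}

def load_options(stored):
    """Return only recognized options; warn about the rest."""
    accepted = {}
    for name, value in stored.items():
        if name in KNOWN_OPTIONS:
            accepted[name] = value
        else:
            logging.warning("Ignoring deprecated option: %s", name)
    return accepted
```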
[jira] [Commented] (DRILL-7114) ANALYZE command generates warnings for stats file and materialization
[ https://issues.apache.org/jira/browse/DRILL-7114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16795881#comment-16795881 ] Vitalii Diravka commented on DRILL-7114: It can be reproduced not only with the ANALYZE TABLE command but also with SELECT queries (drill-embedded mode, drill master version): {code} 0: jdbc:drill:zk=local> use dfs.tmp; +---+--+ | ok | summary| +---+--+ | true | Default schema changed to [dfs.tmp] | +---+--+ 1 row selected (0.135 seconds) 0: jdbc:drill:zk=local> create table temp_t as select * from (VALUES(1)); +---++ | Fragment | Number of records written | +---++ | 0_0 | 1 | +---++ 1 row selected (0.65 seconds) 0: jdbc:drill:zk=local> select * from temp_t; +-+ | EXPR$0 | +-+ | 1 | +-+ 1 row selected (0.198 seconds) {code} > ANALYZE command generates warnings for stats file and materialization > - > > Key: DRILL-7114 > URL: https://issues.apache.org/jira/browse/DRILL-7114 > Project: Apache Drill > Issue Type: Bug > Components: Query Planning & Optimization >Reporter: Aman Sinha >Assignee: Gautam Parai >Priority: Minor > Fix For: 1.16.0 > > > When I run ANALYZE, I see warnings in the log file as shown below. The > ANALYZE command should not try to read the stats file or materialize the > stats. > {noformat} > 12:04:32.939 [2370143e-c419-f33c-d879-84989712bc85:foreman] WARN > o.a.d.e.p.common.DrillStatsTable - Failed to read the stats file. > java.io.FileNotFoundException: File /tmp/orders3/.stats.drill/0_0.json does > not exist > 12:04:32.939 [2370143e-c419-f33c-d879-84989712bc85:foreman] WARN > o.a.d.e.p.common.DrillStatsTable - Failed to materialize the stats. > Continuing without stats. > java.io.FileNotFoundException: File /tmp/orders3/.stats.drill/0_0.json does > not exist > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
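The warnings above boil down to treating a missing stats file as an error path. A hedged sketch of the quieter alternative (illustrative only, not DrillStatsTable's actual code): a missing stats file is the normal "not analyzed yet" case, so probe for it and log at debug level instead of warning with a FileNotFoundException trace.

```python
# Sketch: return None for a missing stats file rather than raising/warning.
import json
import logging
import os

def read_stats(path):
    """Return parsed stats, or None when no stats file exists yet."""
    if not os.path.exists(path):
        logging.debug("No stats file at %s; continuing without stats", path)
        return None
    with open(path) as f:
        return json.load(f)
```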