[jira] [Commented] (DRILL-7109) Statistics adds external sort, which spills to disk
[ https://issues.apache.org/jira/browse/DRILL-7109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16799449#comment-16799449 ]

Gautam Parai commented on DRILL-7109:
-------------------------------------

[~rhou], can you please create another Jira for the issue where we see filter predicates of the form $0 = $0 in TPCH 4? That is a separate issue which should be looked at outside the scope of statistics.

> Statistics adds external sort, which spills to disk
> ---------------------------------------------------
>
>                 Key: DRILL-7109
>                 URL: https://issues.apache.org/jira/browse/DRILL-7109
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 1.16.0
>            Reporter: Robert Hou
>            Assignee: Gautam Parai
>            Priority: Major
>             Fix For: 1.16.0
>
> TPCH query 4 with sf 100 runs many times slower. One issue is that an extra
> external sort has been added, and both external sorts spill to disk.
> Also, the hash join sees 100x more data.
> Here is the query:
> {noformat}
> select
>   o.o_orderpriority,
>   count(*) as order_count
> from
>   orders o
> where
>   o.o_orderdate >= date '1996-10-01'
>   and o.o_orderdate < date '1996-10-01' + interval '3' month
>   and exists (
>     select *
>     from lineitem l
>     where
>       l.l_orderkey = o.o_orderkey
>       and l.l_commitdate < l.l_receiptdate
>   )
> group by
>   o.o_orderpriority
> order by
>   o.o_orderpriority;
> {noformat}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (DRILL-7102) Apache Metrics WEBUI Unavailable
[ https://issues.apache.org/jira/browse/DRILL-7102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16799418#comment-16799418 ]

Abhishek Girish commented on DRILL-7102:
----------------------------------------

One issue could be that you are using hostname:port, and the hostname may not be resolvable. Can you try with external IP:port and let me know if that works?

> Apache Metrics WEBUI Unavailable
> --------------------------------
>
>                 Key: DRILL-7102
>                 URL: https://issues.apache.org/jira/browse/DRILL-7102
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Client - HTTP
>    Affects Versions: 1.15.0
>         Environment: kubernetes v1.13.2
>                      ubuntu:18.04
>                      Apache Drill 1.15.0
>                      64GB RAM
>                      8 vCPU cores
>            Reporter: Gene
>            Assignee: Abhishek Girish
>            Priority: Minor
>         Attachments: Screen Shot 2019-03-13 at 1.16.14 PM.png, Screen Shot 2019-03-14 at 2.44.37 PM.png
>
> Apache Drill metrics are unavailable in the web UI when exposed through a NodePort in
> Kubernetes.
> Error:
> {code:java}
> Failed to load resource: net::ERR_CONNECTION_REFUSED
> {code}
> The browser is unable to resolve the requested URL.
> Maybe we can have a feature where we can change the resource name that the
> browser is looking for.
[jira] [Commented] (DRILL-7102) Apache Metrics WEBUI Unavailable
[ https://issues.apache.org/jira/browse/DRILL-7102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16799417#comment-16799417 ]

Abhishek Girish commented on DRILL-7102:
----------------------------------------

[~tocinoatbp], I just tried with LoadBalancer and NodePort configurations and could not reproduce the issue. I was able to open the Metrics page and click through them. Can you please share more details?
[jira] [Assigned] (DRILL-7102) Apache Metrics WEBUI Unavailable
[ https://issues.apache.org/jira/browse/DRILL-7102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Abhishek Girish reassigned DRILL-7102:
--------------------------------------

    Assignee: Abhishek Girish
[jira] [Resolved] (DRILL-7132) Metadata cache does not have correct min/max values for varchar and interval data types
[ https://issues.apache.org/jira/browse/DRILL-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Hou resolved DRILL-7132.
-------------------------------
    Resolution: Not A Problem

> Metadata cache does not have correct min/max values for varchar and interval
> data types
> ----------------------------------------------------------------------------
>
>                 Key: DRILL-7132
>                 URL: https://issues.apache.org/jira/browse/DRILL-7132
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Metadata
>    Affects Versions: 1.14.0
>            Reporter: Robert Hou
>            Priority: Major
>             Fix For: 1.17.0
>         Attachments: 0_0_10.parquet
>
> The parquet metadata cache does not have correct min/max values for varchar
> and interval data types.
> I have attached a parquet file. Here is what parquet tools shows for varchar:
> [varchar_col] BINARY 14.6% of all space [PLAIN, BIT_PACKED] min: 67 max: 67 average: 67 total: 67 (raw data: 65 saving -3%)
>   values: min: 1 max: 1 average: 1 total: 1
>   uncompressed: min: 65 max: 65 average: 65 total: 65
>   column values statistics: min: ioegjNJKvnkd, max: ioegjNJKvnkd, num_nulls: 0
> Here is what the metadata cache file shows:
> "name" : [ "varchar_col" ],
> "minValue" : "aW9lZ2pOSkt2bmtk",
> "maxValue" : "aW9lZ2pOSkt2bmtk",
> "nulls" : 0
> Here is what parquet tools shows for interval:
> [interval_col] BINARY 11.3% of all space [PLAIN, BIT_PACKED] min: 52 max: 52 average: 52 total: 52 (raw data: 50 saving -4%)
>   values: min: 1 max: 1 average: 1 total: 1
>   uncompressed: min: 50 max: 50 average: 50 total: 50
>   column values statistics: min: P18582D, max: P18582D, num_nulls: 0
> Here is what the metadata cache file shows:
> "name" : [ "interval_col" ],
> "minValue" : "UDE4NTgyRA==",
> "maxValue" : "UDE4NTgyRA==",
> "nulls" : 0
[jira] [Commented] (DRILL-7132) Metadata cache does not have correct min/max values for varchar and interval data types
[ https://issues.apache.org/jira/browse/DRILL-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16799365#comment-16799365 ]

Volodymyr Vysotskyi commented on DRILL-7132:
--------------------------------------------

Here is a discussion from the PR with an explanation of why the format change was necessary: [https://github.com/apache/drill/pull/805#discussion_r117578486]
[jira] [Commented] (DRILL-7132) Metadata cache does not have correct min/max values for varchar and interval data types
[ https://issues.apache.org/jira/browse/DRILL-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16799355#comment-16799355 ]

Robert Hou commented on DRILL-7132:
-----------------------------------

While I agree that there is no requirement to store data in a human-readable format, there are advantages when it comes to supporting and debugging customer issues. But I assume you considered this and decided the pros of using a different format were more important.
[jira] [Commented] (DRILL-7132) Metadata cache does not have correct min/max values for varchar and interval data types
[ https://issues.apache.org/jira/browse/DRILL-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16799353#comment-16799353 ]

Robert Hou commented on DRILL-7132:
-----------------------------------

The online decoder works. Thanks.
--Robert
[jira] [Commented] (DRILL-7132) Metadata cache does not have correct min/max values for varchar and interval data types
[ https://issues.apache.org/jira/browse/DRILL-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16799349#comment-16799349 ]

Volodymyr Vysotskyi commented on DRILL-7132:
--------------------------------------------

These values may be decoded using Java code (the {{Base64}} class, or alternatives), or using online decoders, like this one: [https://www.base64decode.org/]. Also, old metadata cache files without encoding (V3_2 or older) should be handled correctly by Drill.
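As a sketch of the Java route mentioned above: the JDK's {{java.util.Base64}} decoder recovers the readable statistics from the encoded min/max values quoted in this issue. The values are copied from the metadata cache file in the bug report; the class name below is illustrative, not Drill code.

```java
import java.util.Base64;

public class DecodeCacheStats {
    public static void main(String[] args) {
        // base64-encoded min values from the metadata cache file in DRILL-7132
        String varcharMin  = "aW9lZ2pOSkt2bmtk";
        String intervalMin = "UDE4NTgyRA==";

        // Decode back to the readable form that parquet tools shows
        System.out.println(new String(Base64.getDecoder().decode(varcharMin)));  // ioegjNJKvnkd
        System.out.println(new String(Base64.getDecoder().decode(intervalMin))); // P18582D
    }
}
```

The decoded strings match the parquet-tools output ("ioegjNJKvnkd" and "P18582D"), confirming the cache values are the same statistics, just encoded.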
[jira] [Comment Edited] (DRILL-7130) IllegalStateException: Read batch count [0] should be greater than zero
[ https://issues.apache.org/jira/browse/DRILL-7130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16799295#comment-16799295 ]

Timothy Farkas edited comment on DRILL-7130 at 3/22/19 9:01 PM:
----------------------------------------------------------------

I made a mistake and the Jira number is not included in the commit message. If searching for this change in the commit history, look for "Fixed IllegalStateException while reading Parquet data". Also, the commit id is 0a547708d6734f893ca0d6bf673f7a6ae856375e.

was (Author: timothyfarkas):
I made a mistake and the Jira number is not included in the commit message. If searching for this change in the commit history, look for "Fixed IllegalStateException while reading Parquet data".

> IllegalStateException: Read batch count [0] should be greater than zero
> -----------------------------------------------------------------------
>
>                 Key: DRILL-7130
>                 URL: https://issues.apache.org/jira/browse/DRILL-7130
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Parquet
>    Affects Versions: 1.15.0
>            Reporter: salim achouche
>            Assignee: salim achouche
>            Priority: Major
>             Fix For: 1.16.0
>
> The following exception is being hit when reading parquet data:
> Caused by: java.lang.IllegalStateException: Read batch count [0] should be greater than zero
>   at org.apache.drill.shaded.guava.com.google.common.base.Preconditions.checkState(Preconditions.java:509) ~[drill-shaded-guava-23.0.jar:23.0]
>   at org.apache.drill.exec.store.parquet.columnreaders.VarLenNullableFixedEntryReader.getEntry(VarLenNullableFixedEntryReader.java:49) ~[drill-java-exec-1.15.0.0.jar:1.15.0.0]
>   at org.apache.drill.exec.store.parquet.columnreaders.VarLenBulkPageReader.getFixedEntry(VarLenBulkPageReader.java:167) ~[drill-java-exec-1.15.0.0.jar:1.15.0.0]
>   at org.apache.drill.exec.store.parquet.columnreaders.VarLenBulkPageReader.getEntry(VarLenBulkPageReader.java:132) ~[drill-java-exec-1.15.0.0.jar:1.15.0.0]
>   at org.apache.drill.exec.store.parquet.columnreaders.VarLenColumnBulkInput.next(VarLenColumnBulkInput.java:154) ~[drill-java-exec-1.15.0.0.jar:1.15.0.0]
>   at org.apache.drill.exec.store.parquet.columnreaders.VarLenColumnBulkInput.next(VarLenColumnBulkInput.java:38) ~[drill-java-exec-1.15.0.0.jar:1.15.0.0]
>   at org.apache.drill.exec.vector.VarCharVector$Mutator.setSafe(VarCharVector.java:624) ~[vector-1.15.0.0.jar:1.15.0.0]
>   at org.apache.drill.exec.vector.NullableVarCharVector$Mutator.setSafe(NullableVarCharVector.java:716) ~[vector-1.15.0.0.jar:1.15.0.0]
>   at org.apache.drill.exec.store.parquet.columnreaders.VarLengthColumnReaders$NullableVarCharColumn.setSafe(VarLengthColumnReaders.java:215) ~[drill-java-exec-1.15.0.0.jar:1.15.0.0]
>   at org.apache.drill.exec.store.parquet.columnreaders.VarLengthValuesColumn.readRecordsInBulk(VarLengthValuesColumn.java:98) ~[drill-java-exec-1.15.0.0.jar:1.15.0.0]
>   at org.apache.drill.exec.store.parquet.columnreaders.VarLenBinaryReader.readRecordsInBulk(VarLenBinaryReader.java:114) ~[drill-java-exec-1.15.0.0.jar:1.15.0.0]
>   at org.apache.drill.exec.store.parquet.columnreaders.VarLenBinaryReader.readFields(VarLenBinaryReader.java:92) ~[drill-java-exec-1.15.0.0.jar:1.15.0.0]
>   at org.apache.drill.exec.store.parquet.columnreaders.BatchReader$VariableWidthReader.readRecords(BatchReader.java:156) ~[drill-java-exec-1.15.0.0.jar:1.15.0.0]
>   at org.apache.drill.exec.store.parquet.columnreaders.BatchReader.readBatch(BatchReader.java:43) ~[drill-java-exec-1.15.0.0.jar:1.15.0.0]
>   at org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.next(ParquetRecordReader.java:288) ~[drill-java-exec-1.15.0.0.jar:1.15.0.0]
>   ... 29 common frames omitted
[jira] [Commented] (DRILL-7132) Metadata cache does not have correct min/max values for varchar and interval data types
[ https://issues.apache.org/jira/browse/DRILL-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16799343#comment-16799343 ]

Robert Hou commented on DRILL-7132:
-----------------------------------

[~vvysotskyi] Sounds good. How does QA verify that the values are correct? We have some metadata cache tests that are failing, and they should be re-verified with the new base64 values. And I'm about to add some new ones for an enhancement to the metadata cache feature.
[jira] [Commented] (DRILL-7132) Metadata cache does not have correct min/max values for varchar and interval data types
[ https://issues.apache.org/jira/browse/DRILL-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16799339#comment-16799339 ]

Volodymyr Vysotskyi commented on DRILL-7132:
--------------------------------------------

[~rhou], the parquet metadata cache contains min/max values for varchar, decimal, interval, and some other types encoded using base64, so they differ from the values displayed by parquet tools. There is no need to store values in the same format/encoding, etc. The main requirement is that Drill should be able to handle these values from parquet metadata cache files correctly, and it does. As a side note, DRILL-4139 made a change to use base64 encoding in the parquet metadata cache in order to handle statistics for decimal and interval types correctly.
[jira] [Created] (DRILL-7132) Metadata cache does not have correct min/max values for varchar and interval data types
Robert Hou created DRILL-7132:
---------------------------------

             Summary: Metadata cache does not have correct min/max values for varchar and interval data types
                 Key: DRILL-7132
                 URL: https://issues.apache.org/jira/browse/DRILL-7132
             Project: Apache Drill
          Issue Type: Bug
          Components: Metadata
    Affects Versions: 1.14.0
            Reporter: Robert Hou
             Fix For: 1.17.0
         Attachments: 0_0_10.parquet

The parquet metadata cache does not have correct min/max values for varchar and interval data types.

I have attached a parquet file. Here is what parquet tools shows for varchar:

[varchar_col] BINARY 14.6% of all space [PLAIN, BIT_PACKED] min: 67 max: 67 average: 67 total: 67 (raw data: 65 saving -3%)
  values: min: 1 max: 1 average: 1 total: 1
  uncompressed: min: 65 max: 65 average: 65 total: 65
  column values statistics: min: ioegjNJKvnkd, max: ioegjNJKvnkd, num_nulls: 0

Here is what the metadata cache file shows:

"name" : [ "varchar_col" ],
"minValue" : "aW9lZ2pOSkt2bmtk",
"maxValue" : "aW9lZ2pOSkt2bmtk",
"nulls" : 0

Here is what parquet tools shows for interval:

[interval_col] BINARY 11.3% of all space [PLAIN, BIT_PACKED] min: 52 max: 52 average: 52 total: 52 (raw data: 50 saving -4%)
  values: min: 1 max: 1 average: 1 total: 1
  uncompressed: min: 50 max: 50 average: 50 total: 50
  column values statistics: min: P18582D, max: P18582D, num_nulls: 0

Here is what the metadata cache file shows:

"name" : [ "interval_col" ],
"minValue" : "UDE4NTgyRA==",
"maxValue" : "UDE4NTgyRA==",
"nulls" : 0
[jira] [Commented] (DRILL-7127) Update hbase version for mapr profile
[ https://issues.apache.org/jira/browse/DRILL-7127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16799251#comment-16799251 ]

Abhishek Girish commented on DRILL-7127:
----------------------------------------

Currently seeing test failures. Moving out of 1.16.0 scope, as it's not a blocker.

> Update hbase version for mapr profile
> -------------------------------------
>
>                 Key: DRILL-7127
>                 URL: https://issues.apache.org/jira/browse/DRILL-7127
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - HBase, Tools, Build & Test
>    Affects Versions: 1.16.0
>            Reporter: Abhishek Girish
>            Assignee: Abhishek Girish
>            Priority: Major
>
> The current hbase version for the mapr profile is {{1.1.1-mapr-1602-m7-5.2.0}},
> which is over 3 years old. It needs to be updated to {{1.1.8-mapr-1808}}.
[jira] [Updated] (DRILL-7127) Update hbase version for mapr profile
[ https://issues.apache.org/jira/browse/DRILL-7127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Abhishek Girish updated DRILL-7127:
-----------------------------------
    Fix Version/s:     (was: 1.16.0)
[jira] [Commented] (DRILL-7109) Statistics adds external sort, which spills to disk
[ https://issues.apache.org/jira/browse/DRILL-7109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16799215#comment-16799215 ]

Aman Sinha commented on DRILL-7109:
-----------------------------------

[~rhou], for all such planning-related issues, please add the EXPLAIN plan with and without statistics for faster diagnosis.
[jira] [Updated] (DRILL-7130) IllegalStateException: Read batch count [0] should be greater than zero
[ https://issues.apache.org/jira/browse/DRILL-7130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

salim achouche updated DRILL-7130:
----------------------------------
    Reviewer: Timothy Farkas
[jira] [Updated] (DRILL-7127) Update hbase version for mapr profile
[ https://issues.apache.org/jira/browse/DRILL-7127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sorabh Hamirwasia updated DRILL-7127:
-------------------------------------
    Labels:     (was: ready-to-commit)
[jira] [Created] (DRILL-7131) generate_series / generator
benj created DRILL-7131: --- Summary: generate_series / generator Key: DRILL-7131 URL: https://issues.apache.org/jira/browse/DRILL-7131 Project: Apache Drill Issue Type: Wish Components: Functions - Drill Affects Versions: 1.15.0 Reporter: benj Please add a very useful piece of functionality: an equivalent of generate_series in PostgreSQL / Oracle, or generator in MySQL. [https://www.postgresql.org/docs/9.1/functions-srf.html] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
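For reference, a minimal sketch of the requested behavior as it exists in PostgreSQL (the signatures and outputs below are PostgreSQL's, not Drill's):
{code:sql}
-- PostgreSQL set-returning function: generate_series(start, stop [, step])
SELECT * FROM generate_series(1, 5);     -- rows: 1, 2, 3, 4, 5
SELECT * FROM generate_series(2, 10, 2); -- rows: 2, 4, 6, 8, 10
{code}
Such a series generator is commonly used to fill gaps in joins, produce test data, and build ranges; in PostgreSQL, generate_series also accepts timestamp arguments with an interval step.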
[jira] [Updated] (DRILL-7079) Drill can't query views from the S3 storage when plain authentication is enabled
[ https://issues.apache.org/jira/browse/DRILL-7079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Volodymyr Vysotskyi updated DRILL-7079: --- Labels: ready-to-commit (was: ) > Drill can't query views from the S3 storage when plain authentication is > enabled > > > Key: DRILL-7079 > URL: https://issues.apache.org/jira/browse/DRILL-7079 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.15.0 >Reporter: Denys Ordynskiy >Assignee: Bohdan Kazydub >Priority: Major > Labels: ready-to-commit > Fix For: 1.16.0 > > > Enable plain authentication in Drill. > Create the view on the S3 storage: > create view s3.tmp.`testview` as select * from cp.`employee.json` limit 20; > Try to select data from the created view: > select * from s3.tmp.`testview`; > *Actual result*: > {noformat} > 2019-02-27 17:01:09,202 [Client-1] INFO > o.a.d.j.i.DrillCursor$ResultsListener - [#4] Query failed: > org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: > IllegalArgumentException: A valid userName is expected > Please, refer to logs for more information. 
> [Error Id: 2271c3aa-6d09-4b51-a585-0e0e954b46eb on maprhost:31010] > at > org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:123) > [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] > at > org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:422) > [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] > at org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:96) > [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] > at > org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:273) > [drill-rpc-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] > at > org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:243) > [drill-rpc-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] > at > io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:88) > [netty-codec-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287) > [netty-handler-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at 
> io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102) > [netty-codec-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:312) > [netty-codec-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:286) > [netty-codec-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundH
[jira] [Closed] (DRILL-7041) CompileException happens if a nested coalesce function returns null
[ https://issues.apache.org/jira/browse/DRILL-7041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anton Gozhiy closed DRILL-7041. --- > CompileException happens if a nested coalesce function returns null > --- > > Key: DRILL-7041 > URL: https://issues.apache.org/jira/browse/DRILL-7041 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.16.0 >Reporter: Anton Gozhiy >Assignee: Bohdan Kazydub >Priority: Major > Fix For: 1.16.0 > > > *Query:* > {code:sql} > select coalesce(coalesce(n_name1, n_name2), n_name) from > cp.`tpch/nation.parquet` > {code} > *Expected result:* > Values from "n_name" column should be returned > *Actual result:* > An exception happens: > {code} > org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: > CompileException: Line 57, Column 27: Assignment conversion not possible from > type "org.apache.drill.exec.expr.holders.NullableVarCharHolder" to type > "org.apache.drill.exec.vector.UntypedNullHolder" Fragment 0:0 Please, refer > to logs for more information. 
[Error Id: e54d5bfd-604d-4a39-b62f-33bb964e5286 > on userf87d-pc:31010] (org.apache.drill.exec.exception.SchemaChangeException) > Failure while attempting to load generated class > org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.setupNewSchemaFromInput():573 > > org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.setupNewSchema():583 > org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():101 > org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():143 > org.apache.drill.exec.record.AbstractRecordBatch.next():186 > org.apache.drill.exec.physical.impl.BaseRootExec.next():104 > org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():83 > org.apache.drill.exec.physical.impl.BaseRootExec.next():94 > org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():297 > org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():284 > java.security.AccessController.doPrivileged():-2 > javax.security.auth.Subject.doAs():422 > org.apache.hadoop.security.UserGroupInformation.doAs():1746 > org.apache.drill.exec.work.fragment.FragmentExecutor.run():284 > org.apache.drill.common.SelfCleaningRunnable.run():38 > java.util.concurrent.ThreadPoolExecutor.runWorker():1149 > java.util.concurrent.ThreadPoolExecutor$Worker.run():624 > java.lang.Thread.run():748 Caused By > (org.apache.drill.exec.exception.ClassTransformationException) > java.util.concurrent.ExecutionException: > org.apache.drill.exec.exception.ClassTransformationException: Failure > generating transformation classes for value: package > org.apache.drill.exec.test.generated; import > org.apache.drill.exec.exception.SchemaChangeException; import > org.apache.drill.exec.expr.holders.BigIntHolder; import > org.apache.drill.exec.expr.holders.BitHolder; import > org.apache.drill.exec.expr.holders.NullableVarBinaryHolder; import > org.apache.drill.exec.expr.holders.NullableVarCharHolder; import > org.apache.drill.exec.expr.holders.VarCharHolder; import > 
org.apache.drill.exec.ops.FragmentContext; import > org.apache.drill.exec.record.RecordBatch; import > org.apache.drill.exec.vector.UntypedNullHolder; import > org.apache.drill.exec.vector.UntypedNullVector; import > org.apache.drill.exec.vector.VarCharVector; public class ProjectorGen35 { > BigIntHolder const6; BitHolder constant9; UntypedNullHolder constant13; > VarCharVector vv14; UntypedNullVector vv19; public void doEval(int inIndex, > int outIndex) throws SchemaChangeException { { UntypedNullHolder out0 = new > UntypedNullHolder(); if (constant9 .value == 1) { if (constant13 .isSet!= 0) > { out0 = constant13; } } else { VarCharHolder out17 = new VarCharHolder(); { > out17 .buffer = vv14 .getBuffer(); long startEnd = vv14 > .getAccessor().getStartEnd((inIndex)); out17 .start = ((int) startEnd); out17 > .end = ((int)(startEnd >> 32)); } // start of eval portion of > convertToNullableVARCHAR function. // NullableVarCharHolder out18 = new > NullableVarCharHolder(); { final NullableVarCharHolder output = new > NullableVarCharHolder(); VarCharHolder input = out17; > GConvertToNullableVarCharHolder_eval: { output.isSet = 1; output.start = > input.start; output.end = input.end; output.buffer = input.buffer; } out18 = > output; } // end of eval portion of convertToNullableVARCHAR function. > // if (out18 .isSet!= 0) { out0 = out18; } } if (!(out0 .isSet == 0)) { > vv19 .getMutator().set((outIndex), out0 .isSet, out0); } } } public void > doSetup(FragmentContext context, RecordBatch incoming, RecordBatch outgoing) > throws SchemaChangeException { { UntypedNullHolder out1 = new > UntypedNullHolder(); NullableVarBinaryHolder out2 = new > NullableVarB
[jira] [Commented] (DRILL-7041) CompileException happens if a nested coalesce function returns null
[ https://issues.apache.org/jira/browse/DRILL-7041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16798928#comment-16798928 ] Anton Gozhiy commented on DRILL-7041: - Verified with Drill version 1.16.0-SNAPSHOT (commit bf1bdec6069f6fdd2132608450357edea47d328c) > CompileException happens if a nested coalesce function returns null > --- > > Key: DRILL-7041 > URL: https://issues.apache.org/jira/browse/DRILL-7041 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.16.0 >Reporter: Anton Gozhiy >Assignee: Bohdan Kazydub >Priority: Major > Fix For: 1.16.0
[jira] [Commented] (DRILL-7104) Change of data type when parquet with multiple fragment
[ https://issues.apache.org/jira/browse/DRILL-7104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16798891#comment-16798891 ] benj commented on DRILL-7104: - With the query that builds a Parquet file with more than one fragment, if you add a UNION with an empty string ('') you get an error:
{code:java}
CREATE TABLE `bug` AS
((SELECT CAST(NULL AS VARCHAR) AS demo
        ,md5(CAST(rand() AS VARCHAR)) AS jam
    FROM `onebigfile` LIMIT 100)
 UNION
 (SELECT CAST(NULL AS VARCHAR) AS demo
        ,md5(CAST(rand() AS VARCHAR)) AS jam
    FROM `onebigfile` LIMIT 100)
 UNION
 (SELECT CAST('' AS VARCHAR) AS demo, 'jam' AS jam FROM (VALUES(1))));

=> Error: SYSTEM ERROR: NumberFormatException:
{code}
Please find the complete log of this error here: * [^DRILL-7104_ErrorNumberFormatException_20190322.log] > Change of data type when parquet with multiple fragment > --- > > Key: DRILL-7104 > URL: https://issues.apache.org/jira/browse/DRILL-7104 > Project: Apache Drill > Issue Type: Bug > Components: Storage - Parquet >Affects Versions: 1.15.0 >Reporter: benj >Priority: Major > Attachments: DRILL-7104_ErrorNumberFormatException_20190322.log > > > When creating a Parquet file with a column filled only with "CAST(NULL AS VARCHAR)", if the Parquet file has several fragments, the type is read as INT instead of VARCHAR. > First, create a +Parquet with only one fragment+ - all is fine (the type of "demo" is correct).
> {code:java}
> CREATE TABLE `nobug` AS
> (SELECT CAST(NULL AS VARCHAR) AS demo
>       , md5(CAST(rand() AS VARCHAR)) AS jam
>    FROM `onebigfile` LIMIT 100);
> +----------+---------------------------+
> | Fragment | Number of records written |
> +----------+---------------------------+
> | 0_0      | 1000                      |
>
> SELECT drilltypeof(demo) AS goodtype FROM `nobug` LIMIT 1;
> +----------+
> | goodtype |
> +----------+
> | VARCHAR  |
> {code}
> Second, create a +Parquet with at least 2 fragments+ - the type of "demo" changes to INT
> {code:java}
> CREATE TABLE `bug` AS
> ((SELECT CAST(NULL AS VARCHAR) AS demo
>         ,md5(CAST(rand() AS VARCHAR)) AS jam
>     FROM `onebigfile` LIMIT 100)
>  UNION
>  (SELECT CAST(NULL AS VARCHAR) AS demo
>         ,md5(CAST(rand() AS VARCHAR)) AS jam
>     FROM `onebigfile` LIMIT 100));
> +----------+---------------------------+
> | Fragment | Number of records written |
> +----------+---------------------------+
> | 1_1      | 1000276                   |
> | 1_0      | 999724                    |
>
> SELECT drilltypeof(demo) AS badtype FROM `bug` LIMIT 1;
> +---------+
> | badtype |
> +---------+
> | INT     |
> {code}
> This silent change of type is really problematic.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-7104) Change of data type when parquet with multiple fragment
[ https://issues.apache.org/jira/browse/DRILL-7104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] benj updated DRILL-7104: Attachment: DRILL-7104_ErrorNumberFormatException_20190322.log > Change of data type when parquet with multiple fragment > --- > > Key: DRILL-7104 > URL: https://issues.apache.org/jira/browse/DRILL-7104 > Project: Apache Drill > Issue Type: Bug > Components: Storage - Parquet >Affects Versions: 1.15.0 >Reporter: benj >Priority: Major > Attachments: DRILL-7104_ErrorNumberFormatException_20190322.log -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-7118) Filter not getting pushed down on MapR-DB tables.
[ https://issues.apache.org/jira/browse/DRILL-7118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Volodymyr Vysotskyi updated DRILL-7118: --- Labels: ready-to-commit (was: ) > Filter not getting pushed down on MapR-DB tables. > --- > > Key: DRILL-7118 > URL: https://issues.apache.org/jira/browse/DRILL-7118 > Project: Apache Drill > Issue Type: Bug > Components: Query Planning & Optimization >Affects Versions: 1.15.0 >Reporter: Hanumath Rao Maduri >Assignee: Hanumath Rao Maduri >Priority: Major > Labels: ready-to-commit > Fix For: 1.16.0 > > > A simple IS NULL filter is not being pushed down for MapR-DB tables. Here is a repro:
> {code:java}
> 0: jdbc:drill:zk=local> explain plan for select * from dfs.`/tmp/js` where b is null;
> ANTLR Tool version 4.5 used for code generation does not match the current runtime version 4.7.1
> ANTLR Runtime version 4.5 used for parser compilation does not match the current runtime version 4.7.1
> ANTLR Tool version 4.5 used for code generation does not match the current runtime version 4.7.1
> ANTLR Runtime version 4.5 used for parser compilation does not match the current runtime version 4.7.1
> +------+------+
> | text | json |
> +------+------+
> | 00-00 Screen
> 00-01   Project(**=[$0])
> 00-02     Project(T0¦¦**=[$0])
> 00-03       SelectionVectorRemover
> 00-04         Filter(condition=[IS NULL($1)])
> 00-05           Project(T0¦¦**=[$0], b=[$1])
> 00-06             Scan(table=[[dfs, /tmp/js]], groupscan=[JsonTableGroupScan [ScanSpec=JsonScanSpec [tableName=/tmp/js, condition=null], columns=[`**`, `b`], maxwidth=1]])
> {code}
-- This message was sent by Atlassian JIRA (v7.6.3#76005)