[GitHub] drill pull request #1166: DRILL-6016 - Fix for Error reading INT96 created b...
Github user rajrahul commented on a diff in the pull request: https://github.com/apache/drill/pull/1166#discussion_r177950795 --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriter.java --- @@ -797,6 +797,24 @@ public void testImpalaParquetBinaryAsTimeStamp_DictChange() throws Exception { } } + @Test + public void testSparkParquetBinaryAsTimeStamp_DictChange() throws Exception { +try { + mockUtcDateTimeZone(); --- End diff -- @vdiravka your thoughts on comment above? ---
Re: Issue in accessing ORC transactional table through apache drill
Hi Smruti, I suggest you move to the latest Drill and Hive versions; the issue is resolved there. Regarding your issue: Drill 1.10 uses the Hive 1.2.1 client version, so it could be compatible with Hive server/metastore version 1.2.1, but that wasn't verified. 112_ is a delta name for the bucket. Your issue happened in the OrcInputFormat#generateSplitsInfo() method. It looks like the wrong splitStrategy was selected. This can be related to custom properties that were enabled before the Hive ORC bucketed transactional table was created. So double-check all table properties, the properties in your hive-site.xml, and the properties that were set in the Hive shell. If these properties were changed, this is the root cause: https://github.com/apache/hive/blob/branch-1.2/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L1023 If you want to set a Hive-specific option in Drill, add it to the Drill Hive plugin and restart the drillbit. Please ask this kind of question on the user mailing list. Devs are also there, and other users can read along and suggest solutions and workarounds. Kind regards Vitalii On Wed, Mar 28, 2018 at 9:05 PM, Prasad Nagaraj Subramanya < prasadn...@gmail.com> wrote: > Hi Smruti, > > Hive orc transactional support is available from Drill 1.13.0 onwards. > Please update to the latest release version. > > Thanks, > Prasad
[GitHub] drill pull request #1192: DRILL-6299: Fixed a filter pushed down issue when ...
GitHub user sachouche opened a pull request: https://github.com/apache/drill/pull/1192 DRILL-6299: Fixed a filter pushed down issue when a column doesn't ha… This bug happens when the isNull predicate is applied on a column without statistics. @arina-ielchiieva can you please review this pull request? Thanks! You can merge this pull request into a Git repository by running: $ git pull https://github.com/sachouche/drill DRILL-6299 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/1192.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1192 commit bb742f61673d0b64c34bfab9de01ffb2968b472c Author: Salim Achouche Date: 2018-03-28T19:08:25Z DRILL-6299: Fixed a filter pushed down issue when a column doesn't have stats ---
[jira] [Created] (DRILL-6299) Parquet query returns unexpected results
salim achouche created DRILL-6299: - Summary: Parquet query returns unexpected results Key: DRILL-6299 URL: https://issues.apache.org/jira/browse/DRILL-6299 Project: Apache Drill Issue Type: Bug Components: Query Planning Optimization Reporter: salim achouche Query "select id from where str_empty is null and tinyint_var between -10 and 15" returns unexpected results. The same query will succeed if the filter pushdown functionality is disabled. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
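For readers following DRILL-6299: the thread only says that the wrong results appear when an isNull predicate is pushed down on a column without statistics; it does not show the actual fix. The sketch below is therefore only an illustration of the general conservative rule for statistics-based pruning, under the assumption that a row group may be skipped only when its statistics prove that no row can match. ColumnStats, IsNullPruning and canDropForIsNull are hypothetical names, not Drill or Parquet classes.

```java
import java.util.Optional;

/** Hypothetical, simplified per-row-group column statistics (not a Drill or Parquet class). */
final class ColumnStats {
  final long nullCount;

  ColumnStats(long nullCount) {
    this.nullCount = nullCount;
  }
}

/** Hypothetical helper illustrating conservative pruning for "col IS NULL". */
final class IsNullPruning {
  /**
   * Returns true only when the statistics prove that the row group
   * contains no row for which "col IS NULL" holds.
   */
  static boolean canDropForIsNull(Optional<ColumnStats> stats) {
    if (!stats.isPresent()) {
      // No statistics: nothing can be proven, so the row group must be read.
      return false;
    }
    // With statistics, the row group can be skipped only if it has no nulls.
    return stats.get().nullCount == 0;
  }
}
```

With this rule, a column without statistics never causes a row group to be skipped, which matches the observation in the report that the query succeeds when filter pushdown is disabled.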
[GitHub] drill issue #1181: DRILL-6284: Add operator metrics for batch sizing for fla...
Github user ppadma commented on the issue: https://github.com/apache/drill/pull/1181 @paul-rogers thanks for the review. Please take a look at updated changes. ---
Re: Issue in accessing ORC transactional table through apache drill
Hi Smruti, Hive orc transactional support is available from Drill 1.13.0 onwards. Please update to the latest release version. Thanks, Prasad On Wed, Mar 28, 2018 at 10:22 AM, Gautam Parai wrote: > From the error it seems like Drill barfs when trying to convert > "112_" to a number which would be expected? I am not familiar with > Hive ORC but maybe you could (temporarily) remove the offending row and try > it? > > > Gautam
[jira] [Created] (DRILL-6298) Add debug log for merge join batch sizing
Padma Penumarthy created DRILL-6298: --- Summary: Add debug log for merge join batch sizing Key: DRILL-6298 URL: https://issues.apache.org/jira/browse/DRILL-6298 Project: Apache Drill Issue Type: Bug Affects Versions: 1.13.0 Reporter: Padma Penumarthy Assignee: Padma Penumarthy Fix For: 1.14.0 Add debug log for merge join batch sizers so QA can verify the batch sizes. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-6297) Define the Schema Change support functionality
salim achouche created DRILL-6297: - Summary: Define the Schema Change support functionality Key: DRILL-6297 URL: https://issues.apache.org/jira/browse/DRILL-6297 Project: Apache Drill Issue Type: Improvement Reporter: salim achouche Assignee: salim achouche The schema change support functionality is one of the main functional aspects of Drill; unfortunately, there is no formal technical specification for this key functionality, which makes it very hard for: * The Drill users to figure out the extent of schema change support and when it is safe to use it * Development to support this functionality Goal - * The goal of this Jira is to deliver a functional specification for the schema change functionality * I'll create a strawman proposal based on previous input and hopefully start a discussion to gradually refine it -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] drill pull request #1181: DRILL-6284: Add operator metrics for batch sizing ...
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/1181#discussion_r177825111 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/flatten/FlattenRecordBatch.java --- @@ -99,6 +100,22 @@ private void clear() { } } + public enum Metric implements MetricDef { +NUM_INCOMING_BATCHES, +AVG_INPUT_BATCH_SIZE, +AVG_INPUT_ROW_WIDTH, --- End diff -- Parallel here simply means the same name: "INCOMING"/"INPUT" --> "INPUT", "OUTGOING"/"OUTPUT" --> "OUTPUT". Same name in each term. ---
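To make the naming suggestion concrete, here is a minimal sketch of a metric enum in which every constant uses the same term for the same direction. Only NUM_INCOMING_BATCHES, AVG_INPUT_BATCH_SIZE and AVG_INPUT_ROW_WIDTH appear in the quoted diff; the OUTPUT constants and the enum name FlattenMetric are assumptions for illustration. The MetricDef interface here is a local stand-in so the sketch compiles on its own, not Drill's actual interface.

```java
/** Local stand-in for a metric definition interface (assumed shape, not Drill's class). */
interface MetricDef {
  String name();

  int metricId();
}

/** Hypothetical metric enum using "INPUT"/"OUTPUT" consistently in every name. */
enum FlattenMetric implements MetricDef {
  NUM_INPUT_BATCHES,      // instead of NUM_INCOMING_BATCHES
  AVG_INPUT_BATCH_SIZE,
  AVG_INPUT_ROW_WIDTH,
  NUM_OUTPUT_BATCHES,     // assumed counterpart, not shown in the diff
  AVG_OUTPUT_BATCH_SIZE,  // assumed counterpart, not shown in the diff
  AVG_OUTPUT_ROW_WIDTH;   // assumed counterpart, not shown in the diff

  @Override
  public int metricId() {
    // The ordinal doubles as the metric id in this sketch.
    return ordinal();
  }
}
```

Keeping one term per direction means the input and output metrics line up name for name, which is what "parallel" refers to in the comment above.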
Re: Issue in accessing ORC transactional table through apache drill
From the error it seems like Drill barfs when trying to convert "112_" to a number, which would be expected? I am not familiar with Hive ORC, but maybe you could (temporarily) remove the offending row and try it? Gautam From: Smruti Ranjan Sent: Wednesday, March 28, 2018 2:19:30 AM To: dev@drill.apache.org Subject: Issue in accessing ORC transactional table through apache drill Dear Team, Issue while accessing ORC transactional hive table through apache drill and below are the stack versions. Apache drill 1.10.0 Hive 1.2.1 [HDP 2.6.0 versions] Below is the error coming while accessing data from the mentioned table through apache drill. Query Failed: An Error Occurred org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: NumberFormatException: For input string: "112_" [Error Id: ad9b4243-d48d-43c7-9755-388202d7c54d on inbbrdssvm16.india.tcs.com:31010] Please help me in resolving the issue. Thanks & Regards Smruti Ranjan Jena Tata Consultancy Services
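As a small illustration of the error text itself: the message in the report has the same form that Java's standard integer parsing produces for a string with a trailing underscore. Which call in Hive or Drill actually raised the exception is not shown in this thread, so the use of Integer.parseInt below is an assumption, and NumberParseDemo is a hypothetical name.

```java
/** Hypothetical demo class; shows the message produced by parsing a non-numeric string. */
final class NumberParseDemo {
  /** Returns the exception message if parsing fails, or null if the string parses cleanly. */
  static String parseFailureMessage(String text) {
    try {
      Integer.parseInt(text);
      return null;
    } catch (NumberFormatException e) {
      return e.getMessage();
    }
  }

  public static void main(String[] args) {
    // The trailing underscore makes the string invalid as a base-10 integer.
    System.out.println(parseFailureMessage("112_"));
  }
}
```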
[jira] [Created] (DRILL-6296) Add operator metrics for batch sizing for merge join
Padma Penumarthy created DRILL-6296: --- Summary: Add operator metrics for batch sizing for merge join Key: DRILL-6296 URL: https://issues.apache.org/jira/browse/DRILL-6296 Project: Apache Drill Issue Type: Improvement Components: Execution - Relational Operators Affects Versions: 1.13.0 Reporter: Padma Penumarthy Assignee: Padma Penumarthy Fix For: 1.14.0 Add operator metrics for batch sizing stats for merge join. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-6295) {{PartitionerDecorator}} may close {{partitioners}} while {{CustomRunnable}} are active during query cancellation
Vlad Rozov created DRILL-6295: - Summary: {{PartitionerDecorator}} may close {{partitioners}} while {{CustomRunnable}} are active during query cancellation Key: DRILL-6295 URL: https://issues.apache.org/jira/browse/DRILL-6295 Project: Apache Drill Issue Type: Bug Reporter: Vlad Rozov Assignee: Vlad Rozov Fix For: 1.14.0 During query cancellation, in case {{PartitionerDecorator.executeMethodLogic()}} is active (waiting on the {{latch}}), the wait will be interrupted and {{Future}}s cancelled, but there is no guarantee that all {{CustomRunnable}} terminate before returning from {{PartitionerDecorator.executeMethodLogic()}}. On exit, both incoming and outgoing batches are cleared, leading to clearing of underlying {{Vector}}s and {{DrillBuf}}s. This eventually causes unallocated memory access and JVM crash as {{CustomRunnable}} may execute after incoming/outgoing batches are cleared. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
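The hazard described in DRILL-6295 can be sketched with plain java.util.concurrent classes. This is not the PartitionerDecorator code and not necessarily the fix that was adopted; it is a minimal sketch, assuming that the remedy is to wait until every cancelled task has actually finished before shared resources are released. CancelThenDrain, runAndCancel and the two flags are hypothetical names; taskCount must be positive.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical sketch: cancel tasks, then drain them before releasing shared state. */
final class CancelThenDrain {
  /** Returns true if no task touched the shared resource after it was released. */
  static boolean runAndCancel(int taskCount) {
    ExecutorService pool = Executors.newFixedThreadPool(taskCount);
    AtomicBoolean released = new AtomicBoolean(false);          // stands in for cleared batches
    AtomicBoolean usedAfterRelease = new AtomicBoolean(false);  // records a use-after-release
    CountDownLatch started = new CountDownLatch(taskCount);
    CountDownLatch finished = new CountDownLatch(taskCount);
    List<Future<?>> futures = new ArrayList<>();
    for (int i = 0; i < taskCount; i++) {
      futures.add(pool.submit(() -> {
        started.countDown();
        try {
          // Simulated work that keeps using the shared resource until interrupted.
          while (!Thread.currentThread().isInterrupted()) {
            if (released.get()) {
              usedAfterRelease.set(true);
            }
            Thread.yield();
          }
        } finally {
          // Signal termination even when the task exits because of cancellation.
          finished.countDown();
        }
      }));
    }
    try {
      started.await();             // make sure every task is running before cancelling
      for (Future<?> future : futures) {
        future.cancel(true);       // cancel(true) interrupts the worker thread
      }
      finished.await();            // drain: wait until every task has really exited
      released.set(true);          // only now is it safe to release the resource
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } finally {
      pool.shutdownNow();
    }
    return !usedAfterRelease.get();
  }
}
```

The important detail is that Future.cancel(true) only requests interruption; the second latch is what guarantees that no task is still running when the resource is released.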
[GitHub] drill issue #258: DRILL-4091: Support for additional gis operations in gis c...
Github user cgivre commented on the issue: https://github.com/apache/drill/pull/258 Hi @brendanstennett, I am still able to review if you'd like. ---
[GitHub] drill pull request #1181: DRILL-6284: Add operator metrics for batch sizing ...
Github user ppadma commented on a diff in the pull request: https://github.com/apache/drill/pull/1181#discussion_r177810092 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/flatten/FlattenRecordBatch.java --- @@ -99,6 +100,22 @@ private void clear() { } } + public enum Metric implements MetricDef { +NUM_INCOMING_BATCHES, +AVG_INPUT_BATCH_SIZE, +AVG_INPUT_ROW_WIDTH, --- End diff -- @paul-rogers Paul, I did not understand what you mean by parallel here and below. Do you mean they should be adjacent columns in the web UI ? ---
[GitHub] drill issue #1144: DRILL-6202: Deprecate usage of IndexOutOfBoundsException ...
Github user vrozov commented on the issue: https://github.com/apache/drill/pull/1144 Is there a reason to delegate `get/setBytes` to `AbstractByteBuf`? If not, this PR will be a preparation step to use `PlatformDependent` directly bypassing Netty bounds checking. ---
[GitHub] drill issue #258: DRILL-4091: Support for additional gis operations in gis c...
Github user brendanstennett commented on the issue: https://github.com/apache/drill/pull/258 Hey guys, we have some cycles to have a look at this now. @ChrisSandison, who has made a few contributions to this project before, is going to take a look. ---
[GitHub] drill pull request #1161: DRILL-6230: Extend row set readers to handle hyper...
Github user ppadma commented on a diff in the pull request: https://github.com/apache/drill/pull/1161#discussion_r177564887 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/rowSet/model/ReaderIndex.java --- @@ -28,26 +28,30 @@ public abstract class ReaderIndex implements ColumnReaderIndex { - protected int rowIndex = -1; + protected int position = -1; protected final int rowCount; public ReaderIndex(int rowCount) { this.rowCount = rowCount; } - public int position() { return rowIndex; } - public void set(int index) { rowIndex = index; } + public void set(int index) { +assert position >= -1 && position <= rowCount; +position = index; + } + + @Override + public int logicalIndex() { return position; } + + @Override + public int size() { return rowCount; } + @Override public boolean next() { -if (++rowIndex < rowCount ) { +if (++position < rowCount) { return true; -} else { - rowIndex--; - return false; } +position = rowCount; --- End diff -- is there a need to set position to rowcount ? It will come here when position = rowcount ---
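The question about `position = rowCount` can be examined with a stripped-down version of the index. The class below is a hypothetical stand-in for the ReaderIndex in the diff, not the Drill class, and it keeps only the fields needed to show the behavior. On the first call past the end, the increment already leaves position equal to rowCount, so the assignment changes nothing, as the comment notes. It only has an effect if next() is called again after it has returned false, where it keeps position from growing beyond rowCount.

```java
/** Hypothetical, reduced version of a row index used to illustrate end-of-data clamping. */
final class SimpleReaderIndex {
  private int position = -1;
  private final int rowCount;

  SimpleReaderIndex(int rowCount) {
    this.rowCount = rowCount;
  }

  int logicalIndex() {
    return position;
  }

  boolean next() {
    if (++position < rowCount) {
      return true;
    }
    // Clamp so that repeated calls after the end leave position at rowCount.
    position = rowCount;
    return false;
  }
}
```

Whether callers in Drill ever call next() again after the end is not stated in the review thread, so this shows only what the assignment would guard against.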
[GitHub] drill pull request #1161: DRILL-6230: Extend row set readers to handle hyper...
Github user ppadma commented on a diff in the pull request: https://github.com/apache/drill/pull/1161#discussion_r177175974 --- Diff: exec/java-exec/src/test/java/org/apache/drill/test/rowSet/HyperRowSetImpl.java --- @@ -45,8 +50,67 @@ public RowSetReader buildReader(HyperRowSet rowSet, SelectionVector4 sv4) { TupleMetadata schema = rowSet.schema(); HyperRowIndex rowIndex = new HyperRowIndex(sv4); return new RowSetReaderImpl(schema, rowIndex, - buildContainerChildren(rowSet.container(), - new MetadataRetrieval(schema))); + buildContainerChildren(rowSet.container(), schema)); +} + } + + public static class HyperRowSetBuilderImpl implements HyperRowSetBuilder { + +private final BufferAllocator allocator; +private final List batches = new ArrayList<>(); +private int totalRowCount; + +public HyperRowSetBuilderImpl(BufferAllocator allocator) { + this.allocator = allocator; +} + +@Override +public void addBatch(SingleRowSet rowSet) { + if (rowSet.rowCount() == 0) { +return; + } + if (rowSet.indirectionType() != SelectionVectorMode.NONE) { +throw new IllegalArgumentException("Batches must not have a selection vector."); + } + batches.add(rowSet.container()); + totalRowCount += rowSet.rowCount(); +} + +@Override +public void addBatch(VectorContainer container) { + if (container.getRecordCount() == 0) { +return; + } + if (container.getSchema().getSelectionVectorMode() != SelectionVectorMode.NONE) { +throw new IllegalArgumentException("Batches must not have a selection vector."); + } + batches.add(container); + totalRowCount += container.getRecordCount(); +} + +@SuppressWarnings("resource") +@Override +public HyperRowSet build() throws SchemaChangeException { + SelectionVector4 sv4 = new SelectionVector4(allocator, totalRowCount); + ExpandableHyperContainer hyperContainer = new ExpandableHyperContainer(); + for (VectorContainer container : batches) { +hyperContainer.addBatch(container); + } + + // TODO: This has a bug. 
If the hyperset has two batches with unions, + // and the first union contains only VARCHAR, while the second contains --- End diff -- is there a JIRA for this bug ? ---
[GitHub] drill pull request #1161: DRILL-6230: Extend row set readers to handle hyper...
Github user ppadma commented on a diff in the pull request: https://github.com/apache/drill/pull/1161#discussion_r177182331 --- Diff: exec/java-exec/src/test/java/org/apache/drill/test/rowSet/test/TestHyperVectorReaders.java --- @@ -0,0 +1,365 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.drill.test.rowSet.test; + +import static org.apache.drill.test.rowSet.RowSetUtilities.mapArray; +import static org.apache.drill.test.rowSet.RowSetUtilities.mapValue; +import static org.apache.drill.test.rowSet.RowSetUtilities.strArray; +import static org.junit.Assert.assertEquals; +import static org.junit.Assert.assertFalse; +import static org.junit.Assert.assertTrue; + +import org.apache.drill.common.types.TypeProtos.MinorType; +import org.apache.drill.exec.record.metadata.TupleMetadata; +import org.apache.drill.exec.record.selection.SelectionVector4; +import org.apache.drill.test.SubOperatorTest; +import org.apache.drill.test.rowSet.HyperRowSetImpl; +import org.apache.drill.test.rowSet.RowSet.ExtendableRowSet; +import org.apache.drill.test.rowSet.RowSet.HyperRowSet; +import org.apache.drill.test.rowSet.RowSet.SingleRowSet; +import org.apache.drill.test.rowSet.RowSetBuilder; +import org.apache.drill.test.rowSet.RowSetReader; +import org.apache.drill.test.rowSet.RowSetUtilities; +import org.apache.drill.test.rowSet.RowSetWriter; +import org.apache.drill.test.rowSet.schema.SchemaBuilder; +import org.junit.Test; + +/** + * Test the reader mechanism that reads rows indexed via an SV4. + * SV4's introduce an additional level of indexing: each row may + * come from a different batch. The readers use the SV4 to find + * the root batch and vector, then must navigate downward from that + * vector for maps, repeated maps, lists, unions, repeated lists, + * nullable vectors and variable-length vectors. + * + * This test does not cover repeated vectors; those tests should be added. --- End diff -- please file a JIRA for this. ---
Issue in accessing ORC transactional table through apache drill
Dear Team, Issue while accessing ORC transactional hive table through apache drill and below are the stack versions. Apache drill 1.10.0 Hive 1.2.1 [HDP 2.6.0 versions] Below is the error coming while accessing data from the mentioned table through apache drill. Query Failed: An Error Occurred org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: NumberFormatException: For input string: "112_" [Error Id: ad9b4243-d48d-43c7-9755-388202d7c54d on inbbrdssvm16.india.tcs.com:31010] Please help me in resolving the issue. Thanks & Regards Smruti Ranjan Jena Tata Consultancy Services Mailto: smruti.ran...@tcs.com Website: http://www.tcs.com Experience certainty. IT Services Business Solutions Consulting =-=-= Notice: The information contained in this e-mail message and/or attachments to it may contain confidential or privileged information. If you are not the intended recipient, any dissemination, use, review, distribution, printing or copying of the information contained in this e-mail message and/or attachments to it are strictly prohibited. If you have received this communication in error, please notify us by reply e-mail or telephone and immediately and permanently delete the message and any attachments. Thank you
[GitHub] drill issue #1190: DRILL-5937: ExecConstants: changed comment, timeout defau...
Github user vdiravka commented on the issue: https://github.com/apache/drill/pull/1190 @pushpendra-jaiswal-90 You already have +1 from Drill committer :) Your changes will be merged to master branch soon. ---
[GitHub] drill issue #258: DRILL-4091: Support for additional gis operations in gis c...
Github user lherrmann974 commented on the issue: https://github.com/apache/drill/pull/258 Pinging again. If I may help, please let me know what I can do. ---
[GitHub] drill issue #1190: DRILL-5937: ExecConstants: changed comment, timeout defau...
Github user pushpendra-jaiswal-90 commented on the issue: https://github.com/apache/drill/pull/1190 @vrozov @vdiravka Could you please review this? ---