[jira] [Commented] (DRILL-8503) Add Configuration Option to Skip Host Validation for Splunk
[ https://issues.apache.org/jira/browse/DRILL-8503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17868406#comment-17868406 ] ASF GitHub Bot commented on DRILL-8503: --- cgivre merged PR #2927: URL: https://github.com/apache/drill/pull/2927 > Add Configuration Option to Skip Host Validation for Splunk > --- > > Key: DRILL-8503 > URL: https://issues.apache.org/jira/browse/DRILL-8503 > Project: Apache Drill > Issue Type: Improvement > Components: Storage - Splunk >Affects Versions: 1.21.2 >Reporter: Charles Givre >Assignee: Charles Givre >Priority: Major > Fix For: 1.22.0 > > > This PR adds an option to skip host validation for SSL connections to Splunk. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (DRILL-8502) Some boot options with drill.exec.options prefix are missed in configuration options
[ https://issues.apache.org/jira/browse/DRILL-8502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17868404#comment-17868404 ] ASF GitHub Bot commented on DRILL-8502: --- jnturton commented on PR #2923: URL: https://github.com/apache/drill/pull/2923#issuecomment-2248307027 Sorry I missed this at the time. Thanks for the cleanup. > Some boot options with drill.exec.options prefix are missed in configuration > options > > > Key: DRILL-8502 > URL: https://issues.apache.org/jira/browse/DRILL-8502 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.21.2 >Reporter: Maksym Rymar >Assignee: Maksym Rymar >Priority: Minor > Fix For: 1.22.0 > > > Drill has boot options with {{drill.exec.options}} prefix which are missed in > configuration options. It can be easily checked by comparing the system > tables: > {code:java} > apache drill> select name from sys.boot where name like 'drill.exec.options%' > AND name not in (select concat('drill.exec.options.', name) from > sys.internal_options union all select concat('drill.exec.options.', name) > from sys.options); > +---+ > | name | > +---+ > | drill.exec.options.drill.exec.testing.controls| > | drill.exec.options.exec.hashagg.max_batches_in_memory | > | drill.exec.options.exec.hashagg.num_rows_in_batch | > | drill.exec.options.exec.hashjoin.mem_limit| > | drill.exec.options.exec.return_result_set_for_ddl | > +---+{code} > Expected – empty result. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (DRILL-8503) Add Configuration Option to Skip Host Validation for Splunk
[ https://issues.apache.org/jira/browse/DRILL-8503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17868366#comment-17868366 ] ASF GitHub Bot commented on DRILL-8503: --- cgivre opened a new pull request, #2927: URL: https://github.com/apache/drill/pull/2927 # [DRILL-8503](https://issues.apache.org/jira/browse/DRILL-8503): Add Configuration Option to Skip Host Validation for Splunk ## Description In corporate installations, organizations sometimes use self-signed certificates, which can cause problems. This PR adds an option to bypass host validation for SSL connections in Splunk. The "correct" way to fix this would be to provide better SSL information, including the certificate, in the Splunk connection; however, Splunk's SDK does not allow for this, and there are several open issues and PRs relating to it. This PR also bumps the Splunk SDK to the latest version, 1.9.5. ## Documentation An additional configuration option, `validateHostnames`, has been added to the Splunk configuration. I updated the README.md file with this information and will be updating the documentation once this has been merged. ## Testing Ran existing unit tests and tested manually. > Add Configuration Option to Skip Host Validation for Splunk > --- > > Key: DRILL-8503 > URL: https://issues.apache.org/jira/browse/DRILL-8503 > Project: Apache Drill > Issue Type: Improvement > Components: Storage - Splunk >Affects Versions: 1.21.2 >Reporter: Charles Givre >Assignee: Charles Givre >Priority: Major > Fix For: 1.22.0 > > > This PR adds an option to skip host validation for SSL connections to Splunk. -- This message was sent by Atlassian Jira (v8.20.10#820010)
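For context, a hedged sketch of where the new flag would sit in a Splunk storage plugin configuration. Only `validateHostnames` comes from this PR; the surrounding field names and all values are illustrative and should be checked against the plugin's README.

```json
{
  "type": "splunk",
  "hostname": "splunk.example.com",
  "port": 8089,
  "username": "admin",
  "password": "changeme",
  "validateHostnames": false,
  "enabled": true
}
```

With `validateHostnames` set to `false`, the self-signed-certificate scenario described above would no longer fail hostname verification.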
[jira] [Created] (DRILL-8503) Add Configuration Option to Skip Host Validation for Splunk
Charles Givre created DRILL-8503: Summary: Add Configuration Option to Skip Host Validation for Splunk Key: DRILL-8503 URL: https://issues.apache.org/jira/browse/DRILL-8503 Project: Apache Drill Issue Type: Improvement Components: Storage - Splunk Affects Versions: 1.21.2 Reporter: Charles Givre Assignee: Charles Givre Fix For: 1.22.0 This PR adds an option to skip host validation for SSL connections to Splunk. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (DRILL-8502) Some boot options with drill.exec.options prefix are missed in configuration options
[ https://issues.apache.org/jira/browse/DRILL-8502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maksym Rymar resolved DRILL-8502. - Fix Version/s: 1.22.0 Resolution: Fixed Merged to master: https://github.com/apache/drill/pull/2923 > Some boot options with drill.exec.options prefix are missed in configuration > options > > > Key: DRILL-8502 > URL: https://issues.apache.org/jira/browse/DRILL-8502 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.21.2 >Reporter: Maksym Rymar >Assignee: Maksym Rymar >Priority: Minor > Fix For: 1.22.0 > > > Drill has boot options with {{drill.exec.options}} prefix which are missed in > configuration options. It can be easily checked by comparing the system > tables: > {code:java} > apache drill> select name from sys.boot where name like 'drill.exec.options%' > AND name not in (select concat('drill.exec.options.', name) from > sys.internal_options union all select concat('drill.exec.options.', name) > from sys.options); > +---+ > | name | > +---+ > | drill.exec.options.drill.exec.testing.controls| > | drill.exec.options.exec.hashagg.max_batches_in_memory | > | drill.exec.options.exec.hashagg.num_rows_in_batch | > | drill.exec.options.exec.hashjoin.mem_limit| > | drill.exec.options.exec.return_result_set_for_ddl | > +---+{code} > Expected – empty result. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (DRILL-8502) Some boot options with drill.exec.options prefix are missed in configuration options
[ https://issues.apache.org/jira/browse/DRILL-8502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17867138#comment-17867138 ] ASF GitHub Bot commented on DRILL-8502: --- cgivre merged PR #2923: URL: https://github.com/apache/drill/pull/2923 > Some boot options with drill.exec.options prefix are missed in configuration > options > > > Key: DRILL-8502 > URL: https://issues.apache.org/jira/browse/DRILL-8502 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.21.2 >Reporter: Maksym Rymar >Assignee: Maksym Rymar >Priority: Minor > > Drill has boot options with {{drill.exec.options}} prefix which are missed in > configuration options. It can be easily checked by comparing the system > tables: > {code:java} > apache drill> select name from sys.boot where name like 'drill.exec.options%' > AND name not in (select concat('drill.exec.options.', name) from > sys.internal_options union all select concat('drill.exec.options.', name) > from sys.options); > +---+ > | name | > +---+ > | drill.exec.options.drill.exec.testing.controls| > | drill.exec.options.exec.hashagg.max_batches_in_memory | > | drill.exec.options.exec.hashagg.num_rows_in_batch | > | drill.exec.options.exec.hashjoin.mem_limit| > | drill.exec.options.exec.return_result_set_for_ddl | > +---+{code} > Expected – empty result. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (DRILL-8502) Some boot options with drill.exec.options prefix are missed in configuration options
[ https://issues.apache.org/jira/browse/DRILL-8502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17867006#comment-17867006 ] ASF GitHub Bot commented on DRILL-8502: --- rymarm commented on PR #2923: URL: https://github.com/apache/drill/pull/2923#issuecomment-2236471316 @cgivre yes, sure. I've updated the PR with Jira ticket information. > Some boot options with drill.exec.options prefix are missed in configuration > options > > > Key: DRILL-8502 > URL: https://issues.apache.org/jira/browse/DRILL-8502 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.21.2 >Reporter: Maksym Rymar >Assignee: Maksym Rymar >Priority: Minor > > Drill has boot options with {{drill.exec.options}} prefix which are missed in > configuration options. It can be easily checked by comparing the system > tables: > {code:java} > apache drill> select name from sys.boot where name like 'drill.exec.options%' > AND name not in (select concat('drill.exec.options.', name) from > sys.internal_options union all select concat('drill.exec.options.', name) > from sys.options); > +---+ > | name | > +---+ > | drill.exec.options.drill.exec.testing.controls| > | drill.exec.options.exec.hashagg.max_batches_in_memory | > | drill.exec.options.exec.hashagg.num_rows_in_batch | > | drill.exec.options.exec.hashjoin.mem_limit| > | drill.exec.options.exec.return_result_set_for_ddl | > +---+{code} > Expected – empty result. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8502) Some boot options with drill.exec.options prefix are missed in configuration options
Maksym Rymar created DRILL-8502: --- Summary: Some boot options with drill.exec.options prefix are missed in configuration options Key: DRILL-8502 URL: https://issues.apache.org/jira/browse/DRILL-8502 Project: Apache Drill Issue Type: Bug Affects Versions: 1.21.2 Reporter: Maksym Rymar Assignee: Maksym Rymar Drill has boot options with {{drill.exec.options}} prefix which are missed in configuration options. It can be easily checked by comparing the system tables: {code:java} apache drill> select name from sys.boot where name like 'drill.exec.options%' AND name not in (select concat('drill.exec.options.', name) from sys.internal_options union all select concat('drill.exec.options.', name) from sys.options); +---+ | name | +---+ | drill.exec.options.drill.exec.testing.controls| | drill.exec.options.exec.hashagg.max_batches_in_memory | | drill.exec.options.exec.hashagg.num_rows_in_batch | | drill.exec.options.exec.hashjoin.mem_limit| | drill.exec.options.exec.return_result_set_for_ddl | +---+{code} Expected – empty result. -- This message was sent by Atlassian Jira (v8.20.10#820010)
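The verification query quoted in the report, reformatted for readability; after the fix it should return an empty result:

```sql
SELECT name
FROM sys.boot
WHERE name LIKE 'drill.exec.options%'
  AND name NOT IN (
    SELECT CONCAT('drill.exec.options.', name) FROM sys.internal_options
    UNION ALL
    SELECT CONCAT('drill.exec.options.', name) FROM sys.options
  );
```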
[jira] [Commented] (DRILL-8316) Convert Druid Storage Plugin to EVF & V2 JSON Reader
[ https://issues.apache.org/jira/browse/DRILL-8316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17865827#comment-17865827 ] ASF GitHub Bot commented on DRILL-8316: --- cgivre commented on PR #2657: URL: https://github.com/apache/drill/pull/2657#issuecomment-2227577005 @jnturton Could you do a review of this. I realized that this has been languishing and we might as well merge it if it can be. The one area which I'm a little hesitant about is the ScanBatchCreator. Basically, since I didn't write this storage plugin and it was a bit more complicated than some of the ones I've written, I'd like another set of eyes on it. > Convert Druid Storage Plugin to EVF & V2 JSON Reader > > > Key: DRILL-8316 > URL: https://issues.apache.org/jira/browse/DRILL-8316 > Project: Apache Drill > Issue Type: Improvement > Components: Storage - Druid >Affects Versions: 1.20.2 >Reporter: Charles Givre >Assignee: Charles Givre >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (DRILL-8492) Allow Parquet TIME_MICROS and TIMESTAMP_MICROS columns to be read as 64-bit integer values
[ https://issues.apache.org/jira/browse/DRILL-8492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17865788#comment-17865788 ] ASF GitHub Bot commented on DRILL-8492: --- cgivre commented on PR #2907: URL: https://github.com/apache/drill/pull/2907#issuecomment-2227415938 @jnturton Can we merge this? > Allow Parquet TIME_MICROS and TIMESTAMP_MICROS columns to be read as 64-bit > integer values > --- > > Key: DRILL-8492 > URL: https://issues.apache.org/jira/browse/DRILL-8492 > Project: Apache Drill > Issue Type: Improvement > Components: Storage - Parquet >Affects Versions: 1.21.1 >Reporter: Peter Franzen >Priority: Major > > When reading Parquet columns of type {{time_micros}} and > {{{}timestamp_micros{}}}, Drill truncates the microsecond values to > milliseconds in order to convert them to SQL timestamps. > It is currently not possible to read the original microsecond values (as > 64-bit values, not SQL timestamps) through Drill. > One solution for allowing reading the original 64-bit values is to add two > options similar to “store.parquet.reader.int96_as_timestamp" to control > whether microsecond > times and timestamps are truncated to millisecond timestamps or read as > non-truncated 64-bit values. > These options would be added to {{org.apache.drill.exec.ExecConstants}} and > {{{}org.apache.drill.exec.server.options.SystemOptionManager{}}}. 
> They would also be added to "drill-module.conf": > {{ store.parquet.reader.time_micros_as_int64: false,}} > {{ store.parquet.reader.timestamp_micros_as_int64: false,}} > These options would then be used in the same places as > {{{}store.parquet.reader.int96_as_timestamp{}}}: > * org.apache.drill.exec.store.parquet.columnreaders.ColumnReaderFactory > * > org.apache.drill.exec.store.parquet.columnreaders.ParquetToDrillTypeConverter > * org.apache.drill.exec.store.parquet2.DrillParquetGroupConverter > to create an int64 reader instead of a time/timestamp reader when the > corresponding option is set to true. > In addition to this, > {{org.apache.drill.exec.store.parquet.metadata.FileMetadataCollector }}must > be altered to _not_ truncate the min and max values for > time_micros/timestamp_micros if the corresponding option is true. This class > doesn’t have a reference to an {{{}OptionManager{}}}, so the two new options > must be extracted from the {{OptionManager}} when the {{ParquetReaderConfig}} > instance is created. > Filtering on microsecond columns would be done using 64-bit values rather > than TIME/TIMESTAMP values when the new options are true, e.g. > {{SELECT * FROM <table> WHERE <column> = 1705914906694751;}} -- This message was sent by Atlassian Jira (v8.20.10#820010)
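Assuming the proposed option names are adopted as written in the description above, enabling them for a session would look like this (a sketch of proposed, not merged, behavior):

```sql
ALTER SESSION SET `store.parquet.reader.time_micros_as_int64` = true;
ALTER SESSION SET `store.parquet.reader.timestamp_micros_as_int64` = true;
```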
[jira] [Commented] (DRILL-8501) Json Conversion UDF Not Respecting System JSON Options
[ https://issues.apache.org/jira/browse/DRILL-8501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17864245#comment-17864245 ] ASF GitHub Bot commented on DRILL-8501: --- cgivre merged PR #2921: URL: https://github.com/apache/drill/pull/2921 > Json Conversion UDF Not Respecting System JSON Options > -- > > Key: DRILL-8501 > URL: https://issues.apache.org/jira/browse/DRILL-8501 > Project: Apache Drill > Issue Type: Bug > Components: Storage - JSON >Affects Versions: 1.21.2 >Reporter: Charles Givre >Assignee: Charles Givre >Priority: Major > Fix For: 1.22.0 > > > The convert_fromJSON() UDF does not respect the system JSON options of > allTextMode and readAllNumbersAsDouble. > This PR fixes that. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (DRILL-8501) Json Conversion UDF Not Respecting System JSON Options
[ https://issues.apache.org/jira/browse/DRILL-8501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17864237#comment-17864237 ] ASF GitHub Bot commented on DRILL-8501: --- jnturton commented on PR #2921: URL: https://github.com/apache/drill/pull/2921#issuecomment-2217905543 Oh that's great, thanks for the enhancements +1. > Json Conversion UDF Not Respecting System JSON Options > -- > > Key: DRILL-8501 > URL: https://issues.apache.org/jira/browse/DRILL-8501 > Project: Apache Drill > Issue Type: Bug > Components: Storage - JSON >Affects Versions: 1.21.2 >Reporter: Charles Givre >Assignee: Charles Givre >Priority: Major > Fix For: 1.22.0 > > > The convert_fromJSON() UDF does not respect the system JSON options of > allTextMode and readAllNumbersAsDouble. > This PR fixes that. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (DRILL-8501) Json Conversion UDF Not Respecting System JSON Options
[ https://issues.apache.org/jira/browse/DRILL-8501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17863589#comment-17863589 ] ASF GitHub Bot commented on DRILL-8501: --- cgivre commented on PR #2921: URL: https://github.com/apache/drill/pull/2921#issuecomment-2212491446 @jnturton I added new versions of the UDF so that the user can specify in the function call whether they want `allTextMode` and the other option. > Json Conversion UDF Not Respecting System JSON Options > -- > > Key: DRILL-8501 > URL: https://issues.apache.org/jira/browse/DRILL-8501 > Project: Apache Drill > Issue Type: Bug > Components: Storage - JSON >Affects Versions: 1.21.2 >Reporter: Charles Givre >Assignee: Charles Givre >Priority: Major > Fix For: 1.22.0 > > > The convert_fromJSON() UDF does not respect the system JSON options of > allTextMode and readAllNumbersAsDouble. > This PR fixes that. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (DRILL-8501) Json Conversion UDF Not Respecting System JSON Options
[ https://issues.apache.org/jira/browse/DRILL-8501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17863534#comment-17863534 ] ASF GitHub Bot commented on DRILL-8501: --- cgivre commented on PR #2921: URL: https://github.com/apache/drill/pull/2921#issuecomment-2212305704 > Before I approve, did you consider making these JSON parsing settings parameters of the function itself? It feels odd to me that `store.json.*` settings could influence UDFs too. I'm not sure why they aren't storage plugin config, rather than global config, in the first place... @jnturton I thought about doing exactly what you're describing. Here's the thing. We started some work a while ago to get rid of all the non-EVF2 readers in Drill. It turns out that there are a few places which still use the old non-EVF JSON reader. Specifically, this UDF, the Druid Storage Plugin and the MongoDB storage plugin. I started work on [Drill-8316](https://github.com/apache/drill/pull/2657) which addresses the Druid plugin and [Drill-8329](https://github.com/apache/drill/pull/2567) addresses converting the UDF. Neither one of these were a high priority so they're kind of sitting at the moment. I agree with your premise that the whole idea of having global settings for file formats (including parquet) is not the best idea. > Json Conversion UDF Not Respecting System JSON Options > -- > > Key: DRILL-8501 > URL: https://issues.apache.org/jira/browse/DRILL-8501 > Project: Apache Drill > Issue Type: Bug > Components: Storage - JSON >Affects Versions: 1.21.2 >Reporter: Charles Givre >Assignee: Charles Givre >Priority: Major > Fix For: 1.22.0 > > > The convert_fromJSON() UDF does not respect the system JSON options of > allTextMode and readAllNumbersAsDouble. > This PR fixes that. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (DRILL-8501) Json Conversion UDF Not Respecting System JSON Options
[ https://issues.apache.org/jira/browse/DRILL-8501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17863481#comment-17863481 ] ASF GitHub Bot commented on DRILL-8501: --- jnturton commented on PR #2921: URL: https://github.com/apache/drill/pull/2921#issuecomment-2211746498 Before I approve, did you consider making these JSON parsing settings parameters of the function itself? It feels odd to me that `store.json.*` settings could influence UDFs too. I'm not sure why they aren't storage plugin config, rather than global config, in the first place... > Json Conversion UDF Not Respecting System JSON Options > -- > > Key: DRILL-8501 > URL: https://issues.apache.org/jira/browse/DRILL-8501 > Project: Apache Drill > Issue Type: Bug > Components: Storage - JSON >Affects Versions: 1.21.2 >Reporter: Charles Givre >Assignee: Charles Givre >Priority: Major > Fix For: 1.22.0 > > > The convert_fromJSON() UDF does not respect the system JSON options of > allTextMode and readAllNumbersAsDouble. > This PR fixes that. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (DRILL-8474) Add Daffodil Format Plugin
[ https://issues.apache.org/jira/browse/DRILL-8474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17863373#comment-17863373 ] ASF GitHub Bot commented on DRILL-8474: --- mbeckerle commented on code in PR #2909: URL: https://github.com/apache/drill/pull/2909#discussion_r1666968774 ## contrib/format-daffodil/src/test/java/org/apache/drill/exec/store/daffodil/TestDaffodilReader.java: ## @@ -0,0 +1,250 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.drill.exec.store.daffodil; + +import org.apache.drill.categories.RowSetTest; +import org.apache.drill.common.types.TypeProtos.MinorType; +import org.apache.drill.exec.physical.rowSet.RowSet; +import org.apache.drill.exec.physical.rowSet.RowSetReader; +import org.apache.drill.exec.record.metadata.SchemaBuilder; +import org.apache.drill.exec.record.metadata.TupleMetadata; +import org.apache.drill.test.ClusterFixture; +import org.apache.drill.test.ClusterTest; +import org.apache.drill.test.QueryBuilder; +import org.apache.drill.test.rowSet.RowSetComparison; +import org.junit.BeforeClass; +import org.junit.Test; +import org.junit.experimental.categories.Category; + +import java.nio.file.Paths; + +import static org.junit.Assert.assertEquals; +import static org.junit.Assert.assertFalse; + +@Category(RowSetTest.class) +public class TestDaffodilReader extends ClusterTest { + + String schemaURIRoot = "file:///opt/drill/contrib/format-daffodil/src/test/resources/"; Review Comment: What, exactly, do I change this to, if I want to retrieve files from $DRILL_CONFIG_DIR/lib ? ## contrib/format-daffodil/src/test/java/org/apache/drill/exec/store/daffodil/TestDaffodilReader.java: ## @@ -0,0 +1,250 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+ * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.drill.exec.store.daffodil; + +import org.apache.drill.categories.RowSetTest; +import org.apache.drill.common.types.TypeProtos.MinorType; +import org.apache.drill.exec.physical.rowSet.RowSet; +import org.apache.drill.exec.physical.rowSet.RowSetReader; +import org.apache.drill.exec.record.metadata.SchemaBuilder; +import org.apache.drill.exec.record.metadata.TupleMetadata; +import org.apache.drill.test.ClusterFixture; +import org.apache.drill.test.ClusterTest; +import org.apache.drill.test.QueryBuilder; +import org.apache.drill.test.rowSet.RowSetComparison; +import org.junit.BeforeClass; +import org.junit.Test; +import org.junit.experimental.categories.Category; + +import java.nio.file.Paths; + +import static org.junit.Assert.assertEquals; +import static org.junit.Assert.assertFalse; + +@Category(RowSetTest.class) +public class TestDaffodilReader extends ClusterTest { + + String schemaURIRoot = "file:///opt/drill/contrib/format-daffodil/src/test/resources/"; + + @BeforeClass + public static void setup() throws Exception { +// boilerplate call to start test rig +ClusterTest.startCluster(ClusterFixture.builder(dirTestWatcher)); + +DaffodilFormatConfig formatConfig = new DaffodilFormatConfig(null, "", "", "", false); + +cluster.defineFormat("dfs", "daffodil", formatConfig); + +// Needed to test against compressed files. +// Copies data from src/test/resources to the dfs root. +dirTestWatcher.copyResourceToRoot(Paths.get("data/
[jira] [Commented] (DRILL-8490) Sender operator fake memory leak result to sql failed and memory statistics error when ChannelClosedException
[ https://issues.apache.org/jira/browse/DRILL-8490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17863218#comment-17863218 ] ASF GitHub Bot commented on DRILL-8490: --- cgivre merged PR #2917: URL: https://github.com/apache/drill/pull/2917 > Sender operator fake memory leak result to sql failed and memory statistics > error when ChannelClosedException > -- > > Key: DRILL-8490 > URL: https://issues.apache.org/jira/browse/DRILL-8490 > Project: Apache Drill > Issue Type: Bug > Components: Server >Affects Versions: 1.21.1 >Reporter: shihuafeng >Priority: Major > Fix For: 1.22.0 > > > *1.DES* > ** when ChannelClosedException, .ReconnectingConnection#CloseHandler > release sendingAccountor reference counter before netty release buffer, so > operator was closed before memory is released by netty . > > *2 .exception info* > > 2024-04-13 08:45:39,909 [DataClient-3] WARN > o.apache.drill.exec.rpc.RequestIdMap - Failure while attempting to fail rpc > response. > java.lang.IllegalArgumentException: Self-suppression not permitted > at java.lang.Throwable.addSuppressed(Throwable.java:1072) > at > org.apache.drill.common.DeferredException.addException(DeferredException.java:88) > at > org.apache.drill.common.DeferredException.addThrowable(DeferredException.java:97) > at > org.apache.drill.exec.work.fragment.FragmentExecutor.fail(FragmentExecutor.java:502) > at > org.apache.drill.exec.work.fragment.FragmentExecutor.access$400(FragmentExecutor.java:131) > at > org.apache.drill.exec.work.fragment.FragmentExecutor$ExecutorStateImpl.fail(FragmentExecutor.java:518) > at > org.apache.drill.exec.ops.FragmentContextImpl.fail(FragmentContextImpl.java:298) > at > org.apache.drill.exec.ops.FragmentContextImpl$1.accept(FragmentContextImpl.java:152) > at > org.apache.drill.exec.ops.FragmentContextImpl$1.accept(FragmentContextImpl.java:149) > at > org.apache.drill.exec.ops.DataTunnelStatusHandler.failed(DataTunnelStatusHandler.java:45) > at > 
org.apache.drill.exec.rpc.data.DataTunnel$ThrottlingOutcomeListener.failed(DataTunnel.java:125) > at > org.apache.drill.exec.rpc.RequestIdMap$RpcListener.setException(RequestIdMap.java:145) > at > org.apache.drill.exec.rpc.RequestIdMap$SetExceptionProcedure.apply(RequestIdMap.java:78) > at > org.apache.drill.exec.rpc.RequestIdMap$SetExceptionProcedure.apply(RequestIdMap.java:68) > at > com.carrotsearch.hppc.IntObjectHashMap.forEach(IntObjectHashMap.java:692) > at > org.apache.drill.exec.rpc.RequestIdMap.channelClosed(RequestIdMap.java:64) > at > org.apache.drill.exec.rpc.AbstractRemoteConnection.channelClosed(AbstractRemoteConnection.java:192) > at > org.apache.drill.exec.rpc.AbstractClientConnection.channelClosed(AbstractClientConnection.java:97) > at > org.apache.drill.exec.rpc.RpcBus$ChannelClosedHandler.operationComplete(RpcBus.java:158) > at > org.apache.drill.exec.rpc.RpcBus$ChannelClosedHandler.operationComplete(RpcBus.java:135) > at > org.apache.drill.exec.rpc.ReconnectingConnection$CloseHandler.operationComplete(ReconnectingConnection.java:205) > at > org.apache.drill.exec.rpc.ReconnectingConnection$CloseHandler.operationComplete(ReconnectingConnection.java:192) > at > io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:578) > at > io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:552) > at > io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:491) > at > io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:616) > at > io.netty.util.concurrent.DefaultPromise.setSuccess0(DefaultPromise.java:605) > at > io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:104) > at > io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:84) > at > io.netty.channel.AbstractChannel$CloseFuture.setClosed(AbstractChannel.java:1164) > at > io.netty.channel.AbstractChannel$AbstractUnsafe.doClose0(AbstractChannel.java:755) > > at > 
io.netty.channel.AbstractChannel$AbstractUnsafe.close(AbstractChannel.java:731) > at > io.netty.channel.AbstractChannel$AbstractUnsafe.handleWriteError(AbstractChannel.java:950) >
[jira] [Commented] (DRILL-8501) Json Conversion UDF Not Respecting System JSON Options
[ https://issues.apache.org/jira/browse/DRILL-8501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17863057#comment-17863057 ] ASF GitHub Bot commented on DRILL-8501: --- cgivre opened a new pull request, #2921: URL: https://github.com/apache/drill/pull/2921 # [DRILL-8501](https://issues.apache.org/jira/browse/DRILL-8501): Json Conversion UDF Not Respecting System JSON Options ## Description The `convert_fromJSON()` function was ignoring Drill system configuration variables for reading JSON. This PR adds support for `allTextMode` and `readNumbersAsDouble` to this function. Once merged, the `convert_fromJSON()` function will follow the system settings. I also split one of the unit test files because it had all the UDF tests mixed with NaN tests. ## Documentation No user facing changes. ## Testing Added unit tests. > Json Conversion UDF Not Respecting System JSON Options > -- > > Key: DRILL-8501 > URL: https://issues.apache.org/jira/browse/DRILL-8501 > Project: Apache Drill > Issue Type: Bug > Components: Storage - JSON >Affects Versions: 1.21.2 >Reporter: Charles Givre >Assignee: Charles Givre >Priority: Major > Fix For: 1.22.0 > > > The convert_fromJSON() UDF does not respect the system JSON options of > allTextMode and readAllNumbersAsDouble. > This PR fixes that. -- This message was sent by Atlassian Jira (v8.20.10#820010)
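For reference, a minimal invocation of the UDF under discussion. The JSON literal is illustrative, and the newer variants that take explicit mode arguments (mentioned later in this thread) are not shown because their exact signatures are not spelled out in these comments:

```sql
SELECT convert_fromJSON('{"a": 1, "b": "two"}') AS parsed
FROM (VALUES(1));
```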
[jira] [Created] (DRILL-8501) Json Conversion UDF Not Respecting System JSON Options
Charles Givre created DRILL-8501: Summary: Json Conversion UDF Not Respecting System JSON Options Key: DRILL-8501 URL: https://issues.apache.org/jira/browse/DRILL-8501 Project: Apache Drill Issue Type: Bug Components: Storage - JSON Affects Versions: 1.21.2 Reporter: Charles Givre Assignee: Charles Givre Fix For: 1.22.0 The convert_fromJSON() UDF does not respect the system JSON options of allTextMode and readAllNumbersAsDouble. This PR fixes that. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (DRILL-8500) review 3rd party source code borrowed into Apache Drill
[ https://issues.apache.org/jira/browse/DRILL-8500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] PJ Fanning updated DRILL-8500: -- Description: based on the comment: https://github.com/apache/drill/pull/2918#pullrequestreview-2141938793 Any source that Apache Drill has borrowed from a 3rd party code base needs to be documented in our LICENSE and possibly NOTICE (if that 3rd party code base has a NOTICE file - we need to copy its contents into ours). I used https://github.com/scanoss/sbom-workbench to look at the Drill source and there are files that we should investigate. In general, the biggest issues seem to be with files in the 'contrib' area and a lot of them are Javascript files. Also test data files, many are binaries and the SBOM Workbench tool is suspicious that some of them have licensing implications. was: based on the comment: https://github.com/apache/drill/pull/2918#pullrequestreview-2141938793 Any source that Apache Drill has borrowed from a 3rd party code base needs to be documented in our LICENSE and possibly NOTICE (if that 3rd party code base has a NOTICE file - we need to copy its contents into ours). I used https://github.com/scanoss/sbom-workbench to look at the Drill source and there are files that we should investigate. In general, the biggest issues seem to be with files in the 'contrib' area and a lot of them are Javascript files. > review 3rd party source code borrowed into Apache Drill > --- > > Key: DRILL-8500 > URL: https://issues.apache.org/jira/browse/DRILL-8500 > Project: Apache Drill > Issue Type: Task >Reporter: PJ Fanning >Priority: Major > > based on the comment: > https://github.com/apache/drill/pull/2918#pullrequestreview-2141938793 > Any source that Apache Drill has borrowed from a 3rd party code base needs to > be documented in our LICENSE and possibly NOTICE (if that 3rd party code base > has a NOTICE file - we need to copy its contents into ours). 
> I used https://github.com/scanoss/sbom-workbench to look at the Drill source > and there are files that we should investigate. > In general, the biggest issues seem to be with files in the 'contrib' area, > and a lot of them are JavaScript files. There are also test data files, many > of them binaries, which the SBOM Workbench tool flags as having possible > licensing implications. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8500) review 3rd party source code borrowed into Apache Drill
PJ Fanning created DRILL-8500: - Summary: review 3rd party source code borrowed into Apache Drill Key: DRILL-8500 URL: https://issues.apache.org/jira/browse/DRILL-8500 Project: Apache Drill Issue Type: Task Reporter: PJ Fanning based on the comment: https://github.com/apache/drill/pull/2918#pullrequestreview-2141938793 Any source that Apache Drill has borrowed from a 3rd party code base needs to be documented in our LICENSE and possibly NOTICE (if that 3rd party code base has a NOTICE file - we need to copy its contents into ours). I used https://github.com/scanoss/sbom-workbench to look at the Drill source and there are files that we should investigate. In general, the biggest issues seem to be with files in the 'contrib' area and a lot of them are Javascript files. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8499) new util for generating random text
PJ Fanning created DRILL-8499: - Summary: new util for generating random text Key: DRILL-8499 URL: https://issues.apache.org/jira/browse/DRILL-8499 Project: Apache Drill Issue Type: Task Reporter: PJ Fanning Centralise the code for generating random text. -- This message was sent by Atlassian Jira (v8.20.10#820010)
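A minimal sketch of what a centralised random-text helper might look like (the class and method names below are hypothetical, not the actual Drill utility):

```java
import java.util.Random;

// Hypothetical sketch of a shared random-text helper: one seedable generator
// that callers reuse instead of duplicating ad-hoc generation code in tests.
public final class RandomTextUtil {

  private static final char[] ALPHABET = "abcdefghijklmnopqrstuvwxyz ".toCharArray();

  private RandomTextUtil() {}

  /** Returns a random string of the given length drawn from ALPHABET. */
  public static String randomText(Random rng, int length) {
    StringBuilder sb = new StringBuilder(length);
    for (int i = 0; i < length; i++) {
      sb.append(ALPHABET[rng.nextInt(ALPHABET.length)]);
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    // Passing a seeded Random keeps generated test data reproducible.
    System.out.println(randomText(new Random(42), 16).length()); // 16
  }
}
```

Taking the Random as a parameter rather than creating one internally lets tests pass a fixed seed and reproduce failures.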
[jira] [Updated] (DRILL-8490) Sender operator fake memory leak result to sql failed and memory statistics error when ChannelClosedException
[ https://issues.apache.org/jira/browse/DRILL-8490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shihuafeng updated DRILL-8490: -- Summary: Sender operator fake memory leak result to sql failed and memory statistics error when ChannelClosedException (was: Sender operator fake memory leak result to sql when ChannelClosedException) > Sender operator fake memory leak result to sql failed and memory statistics > error when ChannelClosedException > -- > > Key: DRILL-8490 > URL: https://issues.apache.org/jira/browse/DRILL-8490 > Project: Apache Drill > Issue Type: Bug > Components: Server >Affects Versions: 1.21.1 >Reporter: shihuafeng >Priority: Major > Fix For: 1.22.0 > > > *1. Description* > When a ChannelClosedException occurs, ReconnectingConnection#CloseHandler > releases the sendingAccountor reference counter before Netty releases the buffer, so > the operator is closed before its memory is released by Netty. > > *2. Exception info* > > 2024-04-13 08:45:39,909 [DataClient-3] WARN > o.apache.drill.exec.rpc.RequestIdMap - Failure while attempting to fail rpc > response. 
> java.lang.IllegalArgumentException: Self-suppression not permitted > at java.lang.Throwable.addSuppressed(Throwable.java:1072) > at > org.apache.drill.common.DeferredException.addException(DeferredException.java:88) > at > org.apache.drill.common.DeferredException.addThrowable(DeferredException.java:97) > at > org.apache.drill.exec.work.fragment.FragmentExecutor.fail(FragmentExecutor.java:502) > at > org.apache.drill.exec.work.fragment.FragmentExecutor.access$400(FragmentExecutor.java:131) > at > org.apache.drill.exec.work.fragment.FragmentExecutor$ExecutorStateImpl.fail(FragmentExecutor.java:518) > at > org.apache.drill.exec.ops.FragmentContextImpl.fail(FragmentContextImpl.java:298) > at > org.apache.drill.exec.ops.FragmentContextImpl$1.accept(FragmentContextImpl.java:152) > at > org.apache.drill.exec.ops.FragmentContextImpl$1.accept(FragmentContextImpl.java:149) > at > org.apache.drill.exec.ops.DataTunnelStatusHandler.failed(DataTunnelStatusHandler.java:45) > at > org.apache.drill.exec.rpc.data.DataTunnel$ThrottlingOutcomeListener.failed(DataTunnel.java:125) > at > org.apache.drill.exec.rpc.RequestIdMap$RpcListener.setException(RequestIdMap.java:145) > at > org.apache.drill.exec.rpc.RequestIdMap$SetExceptionProcedure.apply(RequestIdMap.java:78) > at > org.apache.drill.exec.rpc.RequestIdMap$SetExceptionProcedure.apply(RequestIdMap.java:68) > at > com.carrotsearch.hppc.IntObjectHashMap.forEach(IntObjectHashMap.java:692) > at > org.apache.drill.exec.rpc.RequestIdMap.channelClosed(RequestIdMap.java:64) > at > org.apache.drill.exec.rpc.AbstractRemoteConnection.channelClosed(AbstractRemoteConnection.java:192) > at > org.apache.drill.exec.rpc.AbstractClientConnection.channelClosed(AbstractClientConnection.java:97) > at > org.apache.drill.exec.rpc.RpcBus$ChannelClosedHandler.operationComplete(RpcBus.java:158) > at > org.apache.drill.exec.rpc.RpcBus$ChannelClosedHandler.operationComplete(RpcBus.java:135) > at > 
org.apache.drill.exec.rpc.ReconnectingConnection$CloseHandler.operationComplete(ReconnectingConnection.java:205) > at > org.apache.drill.exec.rpc.ReconnectingConnection$CloseHandler.operationComplete(ReconnectingConnection.java:192) > at > io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:578) > at > io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:552) > at > io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:491) > at > io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:616) > at > io.netty.util.concurrent.DefaultPromise.setSuccess0(DefaultPromise.java:605) > at > io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:104) > at > io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:84) > at > io.netty.channel.AbstractChannel$CloseFuture.setClosed(AbstractChannel.java:1164) > at > io.netty.channel.AbstractChannel$AbstractUnsafe.doClose0(AbstractChannel.java:755) > > at > io.netty.channel.AbstractChannel$AbstractUnsafe.close(AbstractChannel.java:731) > at &
[jira] [Updated] (DRILL-8490) Sender operator fake memory leak result to sql when ChannelClosedException
[ https://issues.apache.org/jira/browse/DRILL-8490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shihuafeng updated DRILL-8490: -- Description: *1.DES* ** when ChannelClosedException, .ReconnectingConnection#CloseHandler release sendingAccountor reference counter before netty release buffer, so operator was closed before memory is released by netty . *2 .exception info* 2024-04-13 08:45:39,909 [DataClient-3] WARN o.apache.drill.exec.rpc.RequestIdMap - Failure while attempting to fail rpc response. java.lang.IllegalArgumentException: Self-suppression not permitted at java.lang.Throwable.addSuppressed(Throwable.java:1072) at org.apache.drill.common.DeferredException.addException(DeferredException.java:88) at org.apache.drill.common.DeferredException.addThrowable(DeferredException.java:97) at org.apache.drill.exec.work.fragment.FragmentExecutor.fail(FragmentExecutor.java:502) at org.apache.drill.exec.work.fragment.FragmentExecutor.access$400(FragmentExecutor.java:131) at org.apache.drill.exec.work.fragment.FragmentExecutor$ExecutorStateImpl.fail(FragmentExecutor.java:518) at org.apache.drill.exec.ops.FragmentContextImpl.fail(FragmentContextImpl.java:298) at org.apache.drill.exec.ops.FragmentContextImpl$1.accept(FragmentContextImpl.java:152) at org.apache.drill.exec.ops.FragmentContextImpl$1.accept(FragmentContextImpl.java:149) at org.apache.drill.exec.ops.DataTunnelStatusHandler.failed(DataTunnelStatusHandler.java:45) at org.apache.drill.exec.rpc.data.DataTunnel$ThrottlingOutcomeListener.failed(DataTunnel.java:125) at org.apache.drill.exec.rpc.RequestIdMap$RpcListener.setException(RequestIdMap.java:145) at org.apache.drill.exec.rpc.RequestIdMap$SetExceptionProcedure.apply(RequestIdMap.java:78) at org.apache.drill.exec.rpc.RequestIdMap$SetExceptionProcedure.apply(RequestIdMap.java:68) at com.carrotsearch.hppc.IntObjectHashMap.forEach(IntObjectHashMap.java:692) at org.apache.drill.exec.rpc.RequestIdMap.channelClosed(RequestIdMap.java:64) at 
org.apache.drill.exec.rpc.AbstractRemoteConnection.channelClosed(AbstractRemoteConnection.java:192) at org.apache.drill.exec.rpc.AbstractClientConnection.channelClosed(AbstractClientConnection.java:97) at org.apache.drill.exec.rpc.RpcBus$ChannelClosedHandler.operationComplete(RpcBus.java:158) at org.apache.drill.exec.rpc.RpcBus$ChannelClosedHandler.operationComplete(RpcBus.java:135) at org.apache.drill.exec.rpc.ReconnectingConnection$CloseHandler.operationComplete(ReconnectingConnection.java:205) at org.apache.drill.exec.rpc.ReconnectingConnection$CloseHandler.operationComplete(ReconnectingConnection.java:192) at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:578) at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:552) at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:491) at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:616) at io.netty.util.concurrent.DefaultPromise.setSuccess0(DefaultPromise.java:605) at io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:104) at io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:84) at io.netty.channel.AbstractChannel$CloseFuture.setClosed(AbstractChannel.java:1164) at io.netty.channel.AbstractChannel$AbstractUnsafe.doClose0(AbstractChannel.java:755) at io.netty.channel.AbstractChannel$AbstractUnsafe.close(AbstractChannel.java:731) at io.netty.channel.AbstractChannel$AbstractUnsafe.handleWriteError(AbstractChannel.java:950) at io.netty.channel.AbstractChannel$AbstractUnsafe.flush0(AbstractChannel.java:933) at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.forceFlush(AbstractNioChannel.java:361) at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:716) at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:658) at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:584) at 
io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:496) at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986) at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.drill.exec.rpc.ChannelClosedException: Channel closed /10.32.112.138:51108 <--> /10.32.112.138:31012. at org.apache.drill.exec.rpc.RpcBus$ChannelClosedHandler.operationComplete(RpcBus.java:156)
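The release-ordering problem described in this report can be illustrated with a small stand-alone sketch. The counters below are hypothetical stand-ins for Drill's sendingAccountor and a Netty buffer, not the real classes; the point is only the ordering of "account for the send" versus "free the memory":

```java
import java.util.concurrent.atomic.AtomicInteger;

public class OrderingSketch {

  // Buggy order: the close handler decrements the in-flight send counter
  // before the buffer backing the RPC message is released, so an
  // operator-close check running in between sees outstanding allocated
  // bytes and reports a fake memory leak.
  static boolean fakeLeakWithBuggyOrder() {
    AtomicInteger inFlightSends = new AtomicInteger(1);   // sendingAccountor stand-in
    AtomicInteger allocatedBytes = new AtomicInteger(1024); // Netty buffer stand-in
    inFlightSends.decrementAndGet();                      // accounted as done...
    boolean leak = inFlightSends.get() == 0 && allocatedBytes.get() > 0;
    allocatedBytes.set(0);                                // ...buffer freed only afterwards
    return leak;
  }

  // Fixed order: free the buffer first, then decrement the counter, so the
  // operator-close check never observes "done sending, memory still held".
  static boolean fakeLeakWithFixedOrder() {
    AtomicInteger inFlightSends = new AtomicInteger(1);
    AtomicInteger allocatedBytes = new AtomicInteger(1024);
    allocatedBytes.set(0);
    inFlightSends.decrementAndGet();
    return inFlightSends.get() == 0 && allocatedBytes.get() > 0;
  }

  public static void main(String[] args) {
    System.out.println(fakeLeakWithBuggyOrder() + " " + fakeLeakWithFixedOrder()); // true false
  }
}
```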
[jira] [Commented] (DRILL-8498) Sqlline illegal reflective access warning
[ https://issues.apache.org/jira/browse/DRILL-8498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17854278#comment-17854278 ] ASF GitHub Bot commented on DRILL-8498: --- jnturton merged PR #2915: URL: https://github.com/apache/drill/pull/2915 > Sqlline illegal reflective access warning > - > > Key: DRILL-8498 > URL: https://issues.apache.org/jira/browse/DRILL-8498 > Project: Apache Drill > Issue Type: Bug > Components: Client - CLI >Affects Versions: 1.21.1 >Reporter: Maksym Rymar >Assignee: Maksym Rymar >Priority: Minor > > Sqlline has the following warnings on connection to Drill > {code:java} > apache drill> !connect jdbc:drill:drillbit=localhost; > WARNING: An illegal reflective access operation has occurred > WARNING: Illegal reflective access by > javassist.util.proxy.SecurityActions > ([file:/apache-drill-1.21.2/jars/3rdparty/javassist-3.28.0-GA.jar]file:/apache-drill-1.21.2/jars/3rdparty/javassist-3.28.0-GA.jar > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (DRILL-8498) Sqlline illegal reflective access warning
[ https://issues.apache.org/jira/browse/DRILL-8498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17854277#comment-17854277 ] ASF GitHub Bot commented on DRILL-8498: --- jnturton commented on PR #2915: URL: https://github.com/apache/drill/pull/2915#issuecomment-2162255485 Thank you! > Sqlline illegal reflective access warning > - > > Key: DRILL-8498 > URL: https://issues.apache.org/jira/browse/DRILL-8498 > Project: Apache Drill > Issue Type: Bug > Components: Client - CLI >Affects Versions: 1.21.1 >Reporter: Maksym Rymar >Assignee: Maksym Rymar >Priority: Minor > > Sqlline has the following warnings on connection to Drill > {code:java} > apache drill> !connect jdbc:drill:drillbit=localhost; > WARNING: An illegal reflective access operation has occurred > WARNING: Illegal reflective access by > javassist.util.proxy.SecurityActions > ([file:/apache-drill-1.21.2/jars/3rdparty/javassist-3.28.0-GA.jar]file:/apache-drill-1.21.2/jars/3rdparty/javassist-3.28.0-GA.jar > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (DRILL-8498) Sqlline illegal reflective access warning
[ https://issues.apache.org/jira/browse/DRILL-8498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17853622#comment-17853622 ] Maksym Rymar edited comment on DRILL-8498 at 6/10/24 11:30 AM: --- PR to review: [https://github.com/apache/drill/pull/2915] was (Author: JIRAUSER297250): PR to review: https://github.com/apache/drill/pull/2915 > Sqlline illegal reflective access warning > - > > Key: DRILL-8498 > URL: https://issues.apache.org/jira/browse/DRILL-8498 > Project: Apache Drill > Issue Type: Bug > Components: Client - CLI >Affects Versions: 1.21.1 >Reporter: Maksym Rymar >Assignee: Maksym Rymar >Priority: Minor > > Sqlline has the following warnings on connection to Drill > {code:java} > apache drill> !connect jdbc:drill:drillbit=localhost; > WARNING: An illegal reflective access operation has occurred > WARNING: Illegal reflective access by > javassist.util.proxy.SecurityActions > ([file:/apache-drill-1.21.2/jars/3rdparty/javassist-3.28.0-GA.jar]file:/apache-drill-1.21.2/jars/3rdparty/javassist-3.28.0-GA.jar > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (DRILL-8498) Sqlline illegal reflective access warning
[ https://issues.apache.org/jira/browse/DRILL-8498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17853622#comment-17853622 ] Maksym Rymar commented on DRILL-8498: - PR to review: https://github.com/apache/drill/pull/2915 > Sqlline illegal reflective access warning > - > > Key: DRILL-8498 > URL: https://issues.apache.org/jira/browse/DRILL-8498 > Project: Apache Drill > Issue Type: Bug > Components: Client - CLI >Affects Versions: 1.21.1 >Reporter: Maksym Rymar >Assignee: Maksym Rymar >Priority: Minor > > Sqlline has the following warnings on connection to Drill > {code:java} > apache drill> !connect jdbc:drill:drillbit=localhost; > WARNING: An illegal reflective access operation has occurred > WARNING: Illegal reflective access by > javassist.util.proxy.SecurityActions > ([file:/apache-drill-1.21.2/jars/3rdparty/javassist-3.28.0-GA.jar]file:/apache-drill-1.21.2/jars/3rdparty/javassist-3.28.0-GA.jar > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (DRILL-8498) Sqlline illegal reflective access warning
[ https://issues.apache.org/jira/browse/DRILL-8498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maksym Rymar updated DRILL-8498: Summary: Sqlline illegal reflective access warning (was: Sqlline illegal reflective access waring) > Sqlline illegal reflective access warning > - > > Key: DRILL-8498 > URL: https://issues.apache.org/jira/browse/DRILL-8498 > Project: Apache Drill > Issue Type: Bug > Components: Client - CLI >Affects Versions: 1.21.1 >Reporter: Maksym Rymar >Assignee: Maksym Rymar >Priority: Minor > > Sqlline has the following warnings on connection to Drill > {code:java} > apache drill> !connect jdbc:drill:drillbit=localhost; > WARNING: An illegal reflective access operation has occurred > WARNING: Illegal reflective access by > javassist.util.proxy.SecurityActions > ([file:/apache-drill-1.21.2/jars/3rdparty/javassist-3.28.0-GA.jar]file:/apache-drill-1.21.2/jars/3rdparty/javassist-3.28.0-GA.jar > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8498) Sqlline illegal reflective access waring
Maksym Rymar created DRILL-8498: --- Summary: Sqlline illegal reflective access waring Key: DRILL-8498 URL: https://issues.apache.org/jira/browse/DRILL-8498 Project: Apache Drill Issue Type: Bug Components: Client - CLI Affects Versions: 1.21.1 Reporter: Maksym Rymar Assignee: Maksym Rymar Sqlline has the following warnings on connection to Drill {code:java} apache drill> !connect jdbc:drill:drillbit=localhost; WARNING: An illegal reflective access operation has occurred WARNING: Illegal reflective access by javassist.util.proxy.SecurityActions (file:/apache-drill-1.21.2/jars/3rdparty/javassist-3.28.0-GA.jar {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8497) Drill JDBC driver emits reflective access warnings under Java 9+
James Turton created DRILL-8497: --- Summary: Drill JDBC driver emits reflective access warnings under Java 9+ Key: DRILL-8497 URL: https://issues.apache.org/jira/browse/DRILL-8497 Project: Apache Drill Issue Type: Bug Affects Versions: 1.21.1 Reporter: James Turton Assignee: James Turton The failed code patching appears inconsequential to the JDBC driver's functioning but results in log noise for applications. Example warning {code:java} 10:48:27.903 [main] WARN oadd.org.apache.drill.common.util.ProtobufPatcher -- Unable to patch Protobuf. java.lang.reflect.InaccessibleObjectException: Unable to make protected final java.lang.Class java.lang.ClassLoader.defineClass(java.lang.String,byte[],int,int,java.security.ProtectionDomain) throws java.lang.ClassFormatError accessible: module java.base does not "opens java.lang" to unnamed module @5d5baec3{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
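The access that the ProtobufPatcher attempts can be probed outside Drill with a few lines of reflection. This sketch is only an illustration of the JPMS behaviour: on JDK 9-15 setAccessible on ClassLoader.defineClass succeeds with a one-time illegal-access warning under the default settings, while on JDK 16+ it throws InaccessibleObjectException unless the JVM is started with --add-opens java.base/java.lang=ALL-UNNAMED.

```java
import java.lang.reflect.InaccessibleObjectException;
import java.lang.reflect.Method;
import java.security.ProtectionDomain;

public class DefineClassAccess {

  // Probes the same reflective access the patcher needs: making the
  // protected ClassLoader.defineClass(String, byte[], int, int,
  // ProtectionDomain) method accessible from an unnamed module.
  static String probe() {
    try {
      Method defineClass = ClassLoader.class.getDeclaredMethod(
          "defineClass", String.class, byte[].class, int.class, int.class,
          ProtectionDomain.class);
      defineClass.setAccessible(true); // warns on JDK 9-15, throws on JDK 16+
      return "accessible";
    } catch (InaccessibleObjectException e) {
      return "InaccessibleObjectException";
    } catch (NoSuchMethodException e) {
      return "NoSuchMethodException";
    }
  }

  public static void main(String[] args) {
    System.out.println(probe()); // JDK-dependent: "accessible" or "InaccessibleObjectException"
  }
}
```

Which branch runs depends on the JDK version and JVM flags, which is why the warning is log noise rather than a functional failure when patching is skipped.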
[jira] [Updated] (DRILL-8496) Drill Query fails when the escape character(which is part of the data) is just before the quote
[ https://issues.apache.org/jira/browse/DRILL-8496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] achyut09 updated DRILL-8496: Description: I have the following csv- {code:java} "id"^"first_name"^"last_name"^"email"^"gender" "1"^"John"^"143 \\"^" ewilk...@buzzfeed.com"^"Male" "2"^"Willaim"^"Khan"^"bmacdona...@microsoft.com"^"Male"{code} and when i run a drill query (SELECT * FROM dfs.`C:\Users\achyu\Documents\dir2`)- I am getting the following error- {code:java} UserRemoteException : DATA_READ ERROR: Unexpected character '101' following quoted value of CSV field. Expecting '94'. Cannot parse CSV input." {code} This is my dfs configuration for csv in apache drill.I am using the version 1.21.1- {code:java} "csv": { "type": "text", "extensions": [ "csv" ], "lineDelimiter": "\n", "fieldDelimiter": "^", "quote": "\"", "escape": "\\", "comment": "#", "extractHeader": true }{code} Turns out this is because of this particular portion- {code:java} "143 \\"{code} In this csv {code:java} 143 \\{code} is part of the data and its not an escape character, But as this character is before the quote its failing. If i just give a space between the escape and " and quote then it works completely fine. I guess this is a bug. Any insights(for escaping the escape character before the quote) or workaround on the same? was: I have the following csv- {code:java} "id"^"first_name"^"last_name"^"email"^"gender" "1"^"John"^"143 \\"^" ewilk...@buzzfeed.com"^"Male" "2"^"Willaim"^"Khan"^"bmacdona...@microsoft.com"^"Male"{code} and when i run a drill query (SELECT * FROM dfs.`C:\Users\achyu\Documents\dir2`)- I am getting the following error- {code:java} UserRemoteException : DATA_READ ERROR: Unexpected character '101' following quoted value of CSV field. Expecting '94'. Cannot parse CSV input." 
{code} This is my dfs configuration for csv in apache drill.I am using the version 1.21.1- {code:java} "csv": { "type": "text", "extensions": [ "csv" ], "lineDelimiter": "\n", "fieldDelimiter": "^", "quote": "\"", "escape": "\\", "comment": "#", "extractHeader": true }{code} Turns out this is because of this particular portion- {code:java} "143 \\"{code} In this csv {code:java} 143 \\{code} is part of the data and its not an escape character, But as this character is before the quote its failing. If i just give a space between "\\" and quote then it works completely fine. I guess this is a bug. Any insights(for escaping the escape character before the quote) or workaround on the same? > Drill Query fails when the escape character(which is part of the data) is > just before the quote > --- > > Key: DRILL-8496 > URL: https://issues.apache.org/jira/browse/DRILL-8496 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.21.1 >Reporter: achyut09 >Priority: Critical > Labels: Drill > > I have the following csv- > > {code:java} > "id"^"first_name"^"last_name"^"email"^"gender" > "1"^"John"^"143 \\"^" > ewilk...@buzzfeed.com"^"Male" > "2"^"Willaim"^"Khan"^"bmacdona...@microsoft.com"^"Male"{code} > and when i run a drill query (SELECT * > FROM dfs.`C:\Users\achyu\Documents\dir2`)- > I am getting the following error- > {code:java} > UserRemoteException : DATA_READ ERROR: Unexpected character '101' following > quoted value of CSV field. Expecting '94'. Cannot parse CSV input." 
{code} > This is my dfs configuration for csv in apache drill.I am using the version > 1.21.1- > {code:java} > "csv": { "type": "text", "extensions": [ "csv" ], "lineDelimiter": "\n", > "fieldDelimiter": "^", "quote": "\"", "escape": "\\", "comment": "#", > "extractHeader": true }{code} > Turns out this is because of this particular portion- > {code:java} > "143 \\"{code} > In this csv > {code:java} > 143 \\{code} > is part of the data and its not an escape character, But as this character is > before the quote its failing. If i just give a space between the escape and " > and quote then it works completely fine. > I guess this is a bug. > Any insights(for escaping the escape character before the quote) or > workaround on the same? > -- This message was sent by Atlassian Jira (v8.20.10#820010)
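The ambiguity the reporter hits can be reproduced with a toy quote-aware field reader (hypothetical methods, not Drill's actual text reader, which is built on univocity-parsers): a parser that looks for a backslash-quote pair without first consuming a backslash-backslash pair runs past the real closing quote of "143 \\" and swallows the field delimiter.

```java
public class CsvEscapeSketch {

  // Buggy: scans for the escaped-quote pair without resolving the escaped
  // escape first, so in "143 \\" the second backslash plus the real closing
  // quote are misread as an escaped quote and the scan runs on.
  static String buggyRead(String in) {
    StringBuilder out = new StringBuilder();
    int i = 1; // skip opening quote
    while (i < in.length()) {
      char c = in.charAt(i);
      if (c == '\\' && i + 1 < in.length() && in.charAt(i + 1) == '"') {
        out.append('"'); i += 2; continue; // treated as escaped quote (wrong here)
      }
      if (c == '"') break;                 // closing quote
      out.append(c); i++;
    }
    return out.toString();
  }

  // Fixed: resolve backslash-backslash (an escaped escape) before checking
  // for an escaped quote, so the closing quote is found where it should be.
  static String fixedRead(String in) {
    StringBuilder out = new StringBuilder();
    int i = 1;
    while (i < in.length()) {
      char c = in.charAt(i);
      if (c == '\\' && i + 1 < in.length()) {
        char n = in.charAt(i + 1);
        if (n == '\\') { out.append('\\'); i += 2; continue; } // escaped escape
        if (n == '"')  { out.append('"');  i += 2; continue; } // escaped quote
      }
      if (c == '"') break;
      out.append(c); i++;
    }
    return out.toString();
  }

  public static void main(String[] args) {
    String raw = "\"143 \\\\\"^\"next\""; // raw CSV text: "143 \\"^"next"
    System.out.println(buggyRead(raw));   // runs past the closing quote and eats '^'
    System.out.println(fixedRead(raw));   // yields the intended value: 143 \
  }
}
```

This mirrors why inserting a space before the closing quote "fixes" the file: the space breaks up the backslash-quote pair the buggy scan is matching.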
[jira] [Updated] (DRILL-8496) Drill Query fails when the escape character(which is part of the data) is just before the quote
[ https://issues.apache.org/jira/browse/DRILL-8496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] achyut09 updated DRILL-8496: Description: I have the following csv- {code:java} "id"^"first_name"^"last_name"^"email"^"gender" "1"^"John"^"143 \\"^" ewilk...@buzzfeed.com"^"Male" "2"^"Willaim"^"Khan"^"bmacdona...@microsoft.com"^"Male"{code} and when i run a drill query (SELECT * FROM dfs.`C:\Users\achyu\Documents\dir2`)- I am getting the following error- {code:java} UserRemoteException : DATA_READ ERROR: Unexpected character '101' following quoted value of CSV field. Expecting '94'. Cannot parse CSV input." {code} This is my dfs configuration for csv in apache drill.I am using the version 1.21.1- {code:java} "csv": { "type": "text", "extensions": [ "csv" ], "lineDelimiter": "\n", "fieldDelimiter": "^", "quote": "\"", "escape": "\\", "comment": "#", "extractHeader": true }{code} Turns out this is because of this particular portion- {code:java} "143 \\"{code} In this csv {code:java} 143 \\{code} is part of the data and its not an escape character, But as this character is before the quote its failing. If i just give a space between "\\" and quote then it works completely fine. I guess this is a bug. Any insights(for escaping the escape character before the quote) or workaround on the same? was: I have the following csv- {code:java} "id"^"first_name"^"last_name"^"email"^"gender" "1"^"John"^"143 \\"^" ewilk...@buzzfeed.com"^"Male" "2"^"Willaim"^"Khan"^"bmacdona...@microsoft.com"^"Male"{code} and when i run a drill query (SELECT * FROM dfs.`C:\Users\achyu\Documents\dir2`)- I am getting the following error- {code:java} UserRemoteException : DATA_READ ERROR: Unexpected character '101' following quoted value of CSV field. Expecting '94'. Cannot parse CSV input." 
{code} This is my dfs configuration for csv in apache drill.I am using the version 1.21.1- "csv": { "type": "text", "extensions": [ "csv" ], "lineDelimiter": "\n", "fieldDelimiter": "^", "quote": "\"", "escape": " ", "comment": "#", "extractHeader": true } Turns out this is because of this particular portion- "143 \\" In this csv 143 \\ is part of the data and its not an escape character, But as this character is before the quote its failing. If i just give a space between \\ and quote then it works completely fine. I guess this is a bug. Any insights(for escaping the escape character before the quote) or workaround on the same? > Drill Query fails when the escape character(which is part of the data) is > just before the quote > --- > > Key: DRILL-8496 > URL: https://issues.apache.org/jira/browse/DRILL-8496 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.21.1 >Reporter: achyut09 >Priority: Critical > Labels: Drill > > I have the following csv- > > {code:java} > "id"^"first_name"^"last_name"^"email"^"gender" > "1"^"John"^"143 \\"^" > ewilk...@buzzfeed.com"^"Male" > "2"^"Willaim"^"Khan"^"bmacdona...@microsoft.com"^"Male"{code} > and when i run a drill query (SELECT * > FROM dfs.`C:\Users\achyu\Documents\dir2`)- > I am getting the following error- > {code:java} > UserRemoteException : DATA_READ ERROR: Unexpected character '101' following > quoted value of CSV field. Expecting '94'. Cannot parse CSV input." {code} > This is my dfs configuration for csv in apache drill.I am using the version > 1.21.1- > {code:java} > "csv": { "type": "text", "extensions": [ "csv" ], "lineDelimiter": "\n", > "fieldDelimiter": "^", "quote": "\"", "escape": "\\", "comment": "#", > "extractHeader": true }{code} > Turns out this is because of this particular portion- > {code:java} > "143 \\"{code} > In this csv > {code:java} > 143 \\{code} > is part of the data and its not an escape character, But as this character is > before the quote its failing. 
If i just give a space between "\\" and quote > then it works completely fine. > I guess this is a bug. > Any insights(for escaping the escape character before the quote) or > workaround on the same? > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (DRILL-8496) Drill Query fails when the escape character(which is part of the data) is just before the quote
[ https://issues.apache.org/jira/browse/DRILL-8496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] achyut09 updated DRILL-8496: Description: I have the following csv- {code:java} "id"^"first_name"^"last_name"^"email"^"gender" "1"^"John"^"143 \\"^" ewilk...@buzzfeed.com"^"Male" "2"^"Willaim"^"Khan"^"bmacdona...@microsoft.com"^"Male"{code} and when i run a drill query (SELECT * FROM dfs.`C:\Users\achyu\Documents\dir2`)- I am getting the following error- {code:java} UserRemoteException : DATA_READ ERROR: Unexpected character '101' following quoted value of CSV field. Expecting '94'. Cannot parse CSV input." {code} This is my dfs configuration for csv in apache drill.I am using the version 1.21.1- "csv": { "type": "text", "extensions": [ "csv" ], "lineDelimiter": "\n", "fieldDelimiter": "^", "quote": "\"", "escape": " ", "comment": "#", "extractHeader": true } Turns out this is because of this particular portion- "143 \\" In this csv 143 \\ is part of the data and its not an escape character, But as this character is before the quote its failing. If i just give a space between \\ and quote then it works completely fine. I guess this is a bug. Any insights(for escaping the escape character before the quote) or workaround on the same? was: I have the following csv- {code:java} "id"^"first_name"^"last_name"^"email"^"gender" "1"^"John"^"143 \\"^" ewilk...@buzzfeed.com"^"Male" "2"^"Willaim"^"Khan"^"bmacdona...@microsoft.com"^"Male"{code} and when i run a drill query (SELECT * FROM dfs.`C:\Users\achyu\Documents\dir2`)- I am getting the following error- {code:java} UserRemoteException : DATA_READ ERROR: Unexpected character '101' following quoted value of CSV field. Expecting '94'. Cannot parse CSV input." 
{code} This is my dfs configuration for csv in apache drill.I am using the version 1.21.1- {quote}{quote}"csv": \{ "type": "text", "extensions": [ "csv" ], "lineDelimiter": "\n", "fieldDelimiter": "^", "quote": "\"", "escape": "\\", "comment": "#", "extractHeader": true }{quote}{quote} Turns out this is because of this particular portion- "143 " In this csv is part of the data and its not an escape character,But as this character is before the quote its failing. If i just give a space between and quote then it works completely fine. I guess this is a bug. Any insights(for escaping the escape character before the quote) or workaround on the same? > Drill Query fails when the escape character(which is part of the data) is > just before the quote > --- > > Key: DRILL-8496 > URL: https://issues.apache.org/jira/browse/DRILL-8496 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.21.1 >Reporter: achyut09 >Priority: Critical > Labels: Drill > > I have the following csv- > > {code:java} > "id"^"first_name"^"last_name"^"email"^"gender" > "1"^"John"^"143 \\"^" > ewilk...@buzzfeed.com"^"Male" > "2"^"Willaim"^"Khan"^"bmacdona...@microsoft.com"^"Male"{code} > and when i run a drill query (SELECT * > FROM dfs.`C:\Users\achyu\Documents\dir2`)- > I am getting the following error- > {code:java} > UserRemoteException : DATA_READ ERROR: Unexpected character '101' following > quoted value of CSV field. Expecting '94'. Cannot parse CSV input." {code} > This is my dfs configuration for csv in apache drill.I am using the version > 1.21.1- > "csv": { "type": "text", "extensions": [ "csv" ], "lineDelimiter": "\n", > "fieldDelimiter": "^", "quote": "\"", "escape": " > ", "comment": "#", "extractHeader": true } > > Turns out this is because of this particular portion- "143 \\" > In this csv 143 \\ is part of the data and its not an escape character, But > as this character is before the quote its failing. 
If i just give a space > between \\ and quote then it works completely fine. > I guess this is a bug. > Any insights(for escaping the escape character before the quote) or > workaround on the same? > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (DRILL-8496) Drill Query fails when the escape character(which is part of the data) is just before the quote
[ https://issues.apache.org/jira/browse/DRILL-8496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] achyut09 updated DRILL-8496: Description: I have the following CSV: {code:java} "id"^"first_name"^"last_name"^"email"^"gender" "1"^"John"^"143 \\"^" ewilk...@buzzfeed.com"^"Male" "2"^"Willaim"^"Khan"^"bmacdona...@microsoft.com"^"Male"{code} When I run a Drill query (SELECT * FROM dfs.`C:\Users\achyu\Documents\dir2`) I get the following error: {code:java} UserRemoteException : DATA_READ ERROR: Unexpected character '101' following quoted value of CSV field. Expecting '94'. Cannot parse CSV input." {code} This is my dfs configuration for CSV in Apache Drill; I am using version 1.21.1: {quote}"csv": { "type": "text", "extensions": [ "csv" ], "lineDelimiter": "\n", "fieldDelimiter": "^", "quote": "\"", "escape": "\\", "comment": "#", "extractHeader": true }{quote} It turns out the cause is this particular portion: "143 \\". In this CSV, \\ is part of the data and not an escape character, but because it sits just before the closing quote the parse fails. If I just put a space between \\ and the quote then it works completely fine. I guess this is a bug. Any insights (on escaping the escape character before the quote) or a workaround for the same?
was: I have the following CSV: {code:java} "id"^"first_name"^"last_name"^"email"^"gender" "1"^"John"^"143 \\"^" ewilk...@buzzfeed.com"^"Male" "2"^"Willaim"^"Khan"^"bmacdona...@microsoft.com"^"Male"{code} When I run a Drill query (SELECT * FROM dfs.`C:\Users\achyu\Documents\dir2`) I get the following error: {code:java} UserRemoteException : DATA_READ ERROR: Unexpected character '101' following quoted value of CSV field. Expecting '94'. Cannot parse CSV input." {code} This is my dfs configuration for CSV in Apache Drill; I am using version 1.21.1: {quote}"csv": { "type": "text", "extensions": [ "csv" ], "lineDelimiter": "\n", "fieldDelimiter": "^", "quote": "\"", "escape": "\\", "comment": "#", "extractHeader": true }{quote} It turns out the cause is this particular portion: "143 \\". In this CSV, \\ is part of the data and not an escape character, but because it sits just before the closing quote the parse fails. If I just put a space between \\ and the quote then it works completely fine. I guess this is a bug. Any insights (on escaping the escape character before the quote) or a workaround for the same?
> Drill Query fails when the escape character (which is part of the data) is just before the quote
> ---
>
> Key: DRILL-8496
> URL: https://issues.apache.org/jira/browse/DRILL-8496
> Project: Apache Drill
> Issue Type: Bug
> Affects Versions: 1.21.1
> Reporter: achyut09
> Priority: Critical
> Labels: Drill
>
> I have the following CSV:
> {code:java}
> "id"^"first_name"^"last_name"^"email"^"gender"
> "1"^"John"^"143 \\"^" ewilk...@buzzfeed.com"^"Male"
> "2"^"Willaim"^"Khan"^"bmacdona...@microsoft.com"^"Male"{code}
> When I run a Drill query (SELECT * FROM dfs.`C:\Users\achyu\Documents\dir2`) I get the following error:
> {code:java}
> UserRemoteException : DATA_READ ERROR: Unexpected character '101' following quoted value of CSV field. Expecting '94'. Cannot parse CSV input." {code}
> This is my dfs configuration for CSV in Apache Drill; I am using version 1.21.1:
> {quote}"csv": { "type": "text", "extensions": [ "csv" ], "lineDelimiter": "\n", "fieldDelimiter": "^", "quote": "\"", "escape": "\\", "comment": "#", "extractHeader": true }{quote}
> It turns out the cause is this particular portion: "143 \\". In this CSV, \\ is part of the data and not an escape character, but because it sits just before the closing quote the parse fails.
> If I just put a space between \\ and the quote then it works completely fine.
> I guess this is a bug.
> Any insights (on escaping the escape character before the quote) or a workaround for the same? -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8496) Drill Query fails when the escape character(which is part of the data) is just before the quote
achyut09 created DRILL-8496: --- Summary: Drill Query fails when the escape character (which is part of the data) is just before the quote Key: DRILL-8496 URL: https://issues.apache.org/jira/browse/DRILL-8496 Project: Apache Drill Issue Type: Bug Affects Versions: 1.21.1 Reporter: achyut09 I have the following CSV: {code:java} "id"^"first_name"^"last_name"^"email"^"gender" "1"^"John"^"143 \\"^" ewilk...@buzzfeed.com"^"Male" "2"^"Willaim"^"Khan"^"bmacdona...@microsoft.com"^"Male"{code} When I run a Drill query (SELECT * FROM dfs.`C:\Users\achyu\Documents\dir2`) I get the following error: {code:java} UserRemoteException : DATA_READ ERROR: Unexpected character '101' following quoted value of CSV field. Expecting '94'. Cannot parse CSV input." {code} This is my dfs configuration for CSV in Apache Drill; I am using version 1.21.1: {quote}"csv": { "type": "text", "extensions": [ "csv" ], "lineDelimiter": "\n", "fieldDelimiter": "^", "quote": "\"", "escape": "\\", "comment": "#", "extractHeader": true }{quote} It turns out the cause is this particular portion: "143 \\". In this CSV, \\ is part of the data and not an escape character, but because it sits just before the closing quote the parse fails. If I just put a space between \\ and the quote then it works completely fine. I guess this is a bug. Any insights (on escaping the escape character before the quote) or a workaround for the same? -- This message was sent by Atlassian Jira (v8.20.10#820010)
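The ambiguity DRILL-8496 describes is easy to reproduce outside Drill. The sketch below uses Python's csv module purely as a stand-in (it is not Drill's text reader): with escapechar set to a backslash, a literal backslash that happens to sit just before the closing quote swallows that quote, while the same bytes parse cleanly when no escape character is configured.

```python
import csv
import io

# Sample mirroring the report: '^' delimiter, '"' quotes, and a field whose
# last data character is a literal backslash ("143 \").
data = '"id"^"name"\n"1"^"143 \\"\n'

# With escapechar='\\' the backslash escapes the closing quote, so the
# parser keeps reading past the end of the field -- the same failure mode
# the reporter hit in Drill.
broken = list(csv.reader(io.StringIO(data), delimiter='^',
                         quotechar='"', escapechar='\\'))

# Without an escapechar the backslash is ordinary data and parsing succeeds.
ok = list(csv.reader(io.StringIO(data), delimiter='^', quotechar='"'))

assert ok[1] == ['1', '143 \\']  # backslash preserved as data
assert '"' in broken[1][1]       # closing quote was consumed into the field
```

This also explains the reporter's workaround: adding a space after the backslash makes the escape apply to the space instead of the quote, so the quote closes the field normally.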
[jira] [Closed] (DRILL-8489) Sender memory leak when rpc encode exception
[ https://issues.apache.org/jira/browse/DRILL-8489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Turton closed DRILL-8489. --- Resolution: Fixed > Sender memory leak when rpc encode exception > > > Key: DRILL-8489 > URL: https://issues.apache.org/jira/browse/DRILL-8489 > Project: Apache Drill > Issue Type: Bug > Components: Server >Affects Versions: 1.21.1 >Reporter: shihuafeng >Priority: Major > Fix For: 1.21.2 > > > When encode throw Exception, if encode msg instanceof ReferenceCounted, netty > can release msg, but drill convert msg to OutboundRpcMessage, so netty can > not release msg. this causes sender memory leaks > exception info > {code:java} > 2024-04-16 16:25:57,998 [DataClient-7] ERROR > o.a.d.exec.rpc.RpcExceptionHandler - Exception in RPC communication. > Connection: /10.32.112.138:47924 <--> /10.32.112.138:31012 (data client). > Closing connection. > io.netty.handler.codec.EncoderException: > org.apache.drill.exec.exception.OutOfMemoryException: Unable to allocate > buffer of size 4096 due to memory limit (9223372036854775807). 
Current > allocation: 0 > at > io.netty.handler.codec.MessageToMessageEncoder.write(MessageToMessageEncoder.java:107) > at > io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:881) > at > io.netty.channel.AbstractChannelHandlerContext.invokeWriteAndFlush(AbstractChannelHandlerContext.java:940) > at > io.netty.channel.AbstractChannelHandlerContext$WriteTask.run(AbstractChannelHandlerContext.java:1247) > at > io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173) > at > io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166) > at > io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470) > at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569) > at > io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) > at > io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) > at java.lang.Thread.run(Thread.java:748) > Caused by: org.apache.drill.exec.exception.OutOfMemoryException: Unable to > allocate buffer of size 4096 due to memory limit (9223372036854775807). > Current allocation: 0 > at > org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:245) > at > org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:220) > at > org.apache.drill.exec.memory.DrillByteBufAllocator.buffer(DrillByteBufAllocator.java:55) > at > org.apache.drill.exec.memory.DrillByteBufAllocator.buffer(DrillByteBufAllocator.java:50) > at org.apache.drill.exec.rpc.RpcEncoder.encode(safeRelease.java:87) > at org.apache.drill.exec.rpc.RpcEncoder.encode(RpcEncoder.java:38) > at > io.netty.handler.codec.MessageToMessageEncoder.write(MessageToMessageEncoder.java:90){code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
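The ownership problem described in DRILL-8489 can be sketched language-neutrally. All names in this Python mock are hypothetical stand-ins for the Netty and Drill classes in the report; the point is that a framework which only auto-releases reference-counted messages cannot see a buffer hidden inside a wrapper, so the sender must release it itself when encoding fails.

```python
class RefCountedBuf:
    """Minimal stand-in for a reference-counted buffer (hypothetical)."""
    def __init__(self):
        self.refcnt = 1

    def release(self):
        self.refcnt -= 1


class OutboundRpcMessage:
    """Wrapper that hides the buffer from the framework, as in the report."""
    def __init__(self, body):
        self.body = body


def encode(msg):
    # Simulates the OutOfMemoryException thrown inside the encoder.
    raise MemoryError("Unable to allocate buffer")


def write(msg):
    # The pattern a fix needs: if encode fails, release the wrapped buffer
    # here, because the framework only auto-releases messages that are
    # themselves reference-counted and cannot look inside the wrapper.
    try:
        return encode(msg)
    except Exception:
        msg.body.release()
        raise


buf = RefCountedBuf()
try:
    write(OutboundRpcMessage(buf))
except MemoryError:
    pass
assert buf.refcnt == 0  # buffer was released despite the encode failure
```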
[jira] [Closed] (DRILL-8488) HashJoinPOP memory leak is caused by OutOfMemoryException
[ https://issues.apache.org/jira/browse/DRILL-8488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Turton closed DRILL-8488. --- Resolution: Fixed > HashJoinPOP memory leak is caused by OutOfMemoryException > -- > > Key: DRILL-8488 > URL: https://issues.apache.org/jira/browse/DRILL-8488 > Project: Apache Drill > Issue Type: Bug > Components: Server > Affects Versions: 1.21.1 > Reporter: shihuafeng > Priority: Major > Fix For: 1.21.2 > > > [DRILL-8485] HashJoinPOP memory leak is caused by an OOM exception when reading data from InputStream - ASF JIRA (apache.org) -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (DRILL-8489) Sender memory leak when rpc encode exception
[ https://issues.apache.org/jira/browse/DRILL-8489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Turton updated DRILL-8489: Fix Version/s: 1.21.2 (was: 1.22.0) > Sender memory leak when rpc encode exception > > > Key: DRILL-8489 > URL: https://issues.apache.org/jira/browse/DRILL-8489 > Project: Apache Drill > Issue Type: Bug > Components: Server >Affects Versions: 1.21.1 >Reporter: shihuafeng >Priority: Major > Fix For: 1.21.2 > > > When encode throw Exception, if encode msg instanceof ReferenceCounted, netty > can release msg, but drill convert msg to OutboundRpcMessage, so netty can > not release msg. this causes sender memory leaks > exception info > {code:java} > 2024-04-16 16:25:57,998 [DataClient-7] ERROR > o.a.d.exec.rpc.RpcExceptionHandler - Exception in RPC communication. > Connection: /10.32.112.138:47924 <--> /10.32.112.138:31012 (data client). > Closing connection. > io.netty.handler.codec.EncoderException: > org.apache.drill.exec.exception.OutOfMemoryException: Unable to allocate > buffer of size 4096 due to memory limit (9223372036854775807). 
Current > allocation: 0 > at > io.netty.handler.codec.MessageToMessageEncoder.write(MessageToMessageEncoder.java:107) > at > io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:881) > at > io.netty.channel.AbstractChannelHandlerContext.invokeWriteAndFlush(AbstractChannelHandlerContext.java:940) > at > io.netty.channel.AbstractChannelHandlerContext$WriteTask.run(AbstractChannelHandlerContext.java:1247) > at > io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173) > at > io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166) > at > io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470) > at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569) > at > io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) > at > io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) > at java.lang.Thread.run(Thread.java:748) > Caused by: org.apache.drill.exec.exception.OutOfMemoryException: Unable to > allocate buffer of size 4096 due to memory limit (9223372036854775807). > Current allocation: 0 > at > org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:245) > at > org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:220) > at > org.apache.drill.exec.memory.DrillByteBufAllocator.buffer(DrillByteBufAllocator.java:55) > at > org.apache.drill.exec.memory.DrillByteBufAllocator.buffer(DrillByteBufAllocator.java:50) > at org.apache.drill.exec.rpc.RpcEncoder.encode(safeRelease.java:87) > at org.apache.drill.exec.rpc.RpcEncoder.encode(RpcEncoder.java:38) > at > io.netty.handler.codec.MessageToMessageEncoder.write(MessageToMessageEncoder.java:90){code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (DRILL-8488) HashJoinPOP memory leak is caused by OutOfMemoryException
[ https://issues.apache.org/jira/browse/DRILL-8488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Turton updated DRILL-8488: Fix Version/s: 1.21.2 (was: 1.22.0) > HashJoinPOP memory leak is caused by OutOfMemoryException > -- > > Key: DRILL-8488 > URL: https://issues.apache.org/jira/browse/DRILL-8488 > Project: Apache Drill > Issue Type: Bug > Components: Server > Affects Versions: 1.21.1 > Reporter: shihuafeng > Priority: Major > Fix For: 1.21.2 > > > [DRILL-8485] HashJoinPOP memory leak is caused by an OOM exception when reading data from InputStream - ASF JIRA (apache.org) -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (DRILL-8480) Cleanup before finished. 0 out of 1 streams have finished
[ https://issues.apache.org/jira/browse/DRILL-8480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Turton closed DRILL-8480. --- > Cleanup before finished. 0 out of 1 streams have finished > - > > Key: DRILL-8480 > URL: https://issues.apache.org/jira/browse/DRILL-8480 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.21.1 >Reporter: Maksym Rymar >Assignee: Maksym Rymar >Priority: Major > Fix For: 1.21.2 > > Attachments: 1a349ff1-d1f9-62bf-ed8c-26346c548005.sys.drill, > tableWithNumber2.parquet > > > Drill fails to execute a query with the following exception: > {code:java} > org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: > IllegalStateException: Cleanup before finished. 0 out of 1 streams have > finished > Fragment: 1:0 > Please, refer to logs for more information. > [Error Id: 270da8f4-0bb6-4985-bf4f-34853138881c on > compute7.vmcluster.com:31010] > at > org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:657) > at > org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:395) > at > org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:245) > at > org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:362) > at > org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:829) > Caused by: java.lang.IllegalStateException: Cleanup before finished. 
0 out of > 1 streams have finished > at > org.apache.drill.exec.work.batch.BaseRawBatchBuffer.close(BaseRawBatchBuffer.java:111) > at > org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:91) > at > org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:71) > at > org.apache.drill.exec.work.batch.AbstractDataCollector.close(AbstractDataCollector.java:121) > at > org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:91) > at > org.apache.drill.exec.work.batch.IncomingBuffers.close(IncomingBuffers.java:144) > at > org.apache.drill.exec.ops.FragmentContextImpl.suppressingClose(FragmentContextImpl.java:581) > at > org.apache.drill.exec.ops.FragmentContextImpl.close(FragmentContextImpl.java:567) > at > org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources(FragmentExecutor.java:417) > at > org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:240) > ... 5 common frames omitted > Suppressed: java.lang.IllegalStateException: Cleanup before finished. > 0 out of 1 streams have finished > ... 15 common frames omitted > Suppressed: java.lang.IllegalStateException: Memory was leaked by > query. Memory leaked: (32768) > Allocator(op:1:0:8:UnorderedReceiver) 100/32768/32768/100 > (res/actual/peak/limit) > at > org.apache.drill.exec.memory.BaseAllocator.close(BaseAllocator.java:519) > at > org.apache.drill.exec.ops.BaseOperatorContext.close(BaseOperatorContext.java:159) > at > org.apache.drill.exec.ops.OperatorContextImpl.close(OperatorContextImpl.java:77) > at > org.apache.drill.exec.ops.FragmentContextImpl.suppressingClose(FragmentContextImpl.java:581) > at > org.apache.drill.exec.ops.FragmentContextImpl.close(FragmentContextImpl.java:571) > ... 7 common frames omitted > Suppressed: java.lang.IllegalStateException: Memory was leaked by > query. 
Memory leaked: (1016640) > Allocator(frag:1:0) 3000/1016640/30016640/90715827882 > (res/actual/peak/limit) > at > org.apache.drill.exec.memory.BaseAllocator.close(BaseAllocator.java:519) > at > org.apache.drill.exec.ops.FragmentContextImpl.suppressingClose(FragmentContextImpl.java:581) > at > org.apache.drill.exec.ops.FragmentContextImpl.close(FragmentContextImpl.java:574) > ... 7 common frames omitted {code} > Steps to reproduce: > 1.Enable unequal join: > {code:java} > alter session set `planner.enable_nljoin_for_scalar_only`=false;
[jira] [Updated] (DRILL-8480) Cleanup before finished. 0 out of 1 streams have finished
[ https://issues.apache.org/jira/browse/DRILL-8480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Turton updated DRILL-8480: Fix Version/s: 1.21.2 > Cleanup before finished. 0 out of 1 streams have finished > - > > Key: DRILL-8480 > URL: https://issues.apache.org/jira/browse/DRILL-8480 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.21.1 >Reporter: Maksym Rymar >Assignee: Maksym Rymar >Priority: Major > Fix For: 1.21.2 > > Attachments: 1a349ff1-d1f9-62bf-ed8c-26346c548005.sys.drill, > tableWithNumber2.parquet > > > Drill fails to execute a query with the following exception: > {code:java} > org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: > IllegalStateException: Cleanup before finished. 0 out of 1 streams have > finished > Fragment: 1:0 > Please, refer to logs for more information. > [Error Id: 270da8f4-0bb6-4985-bf4f-34853138881c on > compute7.vmcluster.com:31010] > at > org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:657) > at > org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:395) > at > org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:245) > at > org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:362) > at > org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:829) > Caused by: java.lang.IllegalStateException: Cleanup before finished. 
0 out of > 1 streams have finished > at > org.apache.drill.exec.work.batch.BaseRawBatchBuffer.close(BaseRawBatchBuffer.java:111) > at > org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:91) > at > org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:71) > at > org.apache.drill.exec.work.batch.AbstractDataCollector.close(AbstractDataCollector.java:121) > at > org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:91) > at > org.apache.drill.exec.work.batch.IncomingBuffers.close(IncomingBuffers.java:144) > at > org.apache.drill.exec.ops.FragmentContextImpl.suppressingClose(FragmentContextImpl.java:581) > at > org.apache.drill.exec.ops.FragmentContextImpl.close(FragmentContextImpl.java:567) > at > org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources(FragmentExecutor.java:417) > at > org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:240) > ... 5 common frames omitted > Suppressed: java.lang.IllegalStateException: Cleanup before finished. > 0 out of 1 streams have finished > ... 15 common frames omitted > Suppressed: java.lang.IllegalStateException: Memory was leaked by > query. Memory leaked: (32768) > Allocator(op:1:0:8:UnorderedReceiver) 100/32768/32768/100 > (res/actual/peak/limit) > at > org.apache.drill.exec.memory.BaseAllocator.close(BaseAllocator.java:519) > at > org.apache.drill.exec.ops.BaseOperatorContext.close(BaseOperatorContext.java:159) > at > org.apache.drill.exec.ops.OperatorContextImpl.close(OperatorContextImpl.java:77) > at > org.apache.drill.exec.ops.FragmentContextImpl.suppressingClose(FragmentContextImpl.java:581) > at > org.apache.drill.exec.ops.FragmentContextImpl.close(FragmentContextImpl.java:571) > ... 7 common frames omitted > Suppressed: java.lang.IllegalStateException: Memory was leaked by > query. 
Memory leaked: (1016640) > Allocator(frag:1:0) 3000/1016640/30016640/90715827882 > (res/actual/peak/limit) > at > org.apache.drill.exec.memory.BaseAllocator.close(BaseAllocator.java:519) > at > org.apache.drill.exec.ops.FragmentContextImpl.suppressingClose(FragmentContextImpl.java:581) > at > org.apache.drill.exec.ops.FragmentContextImpl.close(FragmentContextImpl.java:574) > ... 7 common frames omitted {code} > Steps to reproduce: > 1.Enable unequal join: > {code:java} > alter session set `planner.ena
[jira] [Closed] (DRILL-8487) HTTP Caching
[ https://issues.apache.org/jira/browse/DRILL-8487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Turton closed DRILL-8487. --- Resolution: Duplicate > HTTP Caching > > > Key: DRILL-8487 > URL: https://issues.apache.org/jira/browse/DRILL-8487 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.21.1 >Reporter: Sena >Priority: Major > > I am using http storage plugin and I want to activate the caching. In the > documentation it says that this requires adding cacheResults. So I added this > to my config. When I test using an older version(1.20.1), I can see the query > result files under the tmp/http-cache directory, but when I test using a > newer version(1.21.1), there are no query result files in that directory, it > only contains the journal. This PR > [https://github.com/apache/drill/pull/2669] may have caused the issue. > Also, is it possible to implement maximum cache size? -- This message was sent by Atlassian Jira (v8.20.10#820010)
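For context on the report above, result caching in the HTTP storage plugin is switched on with the cacheResults field the reporter mentions. The fragment below is only a sketch: the connection name and URL are placeholders, and the exact placement of cacheResults should be checked against the HTTP plugin documentation for the Drill version in use.

```json
{
  "type": "http",
  "cacheResults": true,
  "timeout": 5,
  "connections": {
    "example": {
      "url": "https://example.com/api/",
      "method": "GET",
      "requireTail": false
    }
  },
  "enabled": true
}
```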
[jira] [Reopened] (DRILL-8487) HTTP Caching
[ https://issues.apache.org/jira/browse/DRILL-8487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Turton reopened DRILL-8487: - > HTTP Caching > > > Key: DRILL-8487 > URL: https://issues.apache.org/jira/browse/DRILL-8487 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.21.1 >Reporter: Sena >Priority: Major > > I am using http storage plugin and I want to activate the caching. In the > documentation it says that this requires adding cacheResults. So I added this > to my config. When I test using an older version(1.20.1), I can see the query > result files under the tmp/http-cache directory, but when I test using a > newer version(1.21.1), there are no query result files in that directory, it > only contains the journal. This PR > [https://github.com/apache/drill/pull/2669] may have caused the issue. > Also, is it possible to implement maximum cache size? -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (DRILL-8487) HTTP Caching
[ https://issues.apache.org/jira/browse/DRILL-8487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Turton closed DRILL-8487. --- Resolution: Fixed > HTTP Caching > > > Key: DRILL-8487 > URL: https://issues.apache.org/jira/browse/DRILL-8487 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.21.1 >Reporter: Sena >Priority: Major > > I am using http storage plugin and I want to activate the caching. In the > documentation it says that this requires adding cacheResults. So I added this > to my config. When I test using an older version(1.20.1), I can see the query > result files under the tmp/http-cache directory, but when I test using a > newer version(1.21.1), there are no query result files in that directory, it > only contains the journal. This PR > [https://github.com/apache/drill/pull/2669] may have caused the issue. > Also, is it possible to implement maximum cache size? -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (DRILL-8487) HTTP Caching
[ https://issues.apache.org/jira/browse/DRILL-8487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Turton updated DRILL-8487: Fix Version/s: (was: 1.20.1) > HTTP Caching > > > Key: DRILL-8487 > URL: https://issues.apache.org/jira/browse/DRILL-8487 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.21.1 >Reporter: Sena >Priority: Major > > I am using http storage plugin and I want to activate the caching. In the > documentation it says that this requires adding cacheResults. So I added this > to my config. When I test using an older version(1.20.1), I can see the query > result files under the tmp/http-cache directory, but when I test using a > newer version(1.21.1), there are no query result files in that directory, it > only contains the journal. This PR > [https://github.com/apache/drill/pull/2669] may have caused the issue. > Also, is it possible to implement maximum cache size? -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (DRILL-8494) HTTP Caching Not Saving Pages
[ https://issues.apache.org/jira/browse/DRILL-8494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Turton closed DRILL-8494. --- Resolution: Fixed > HTTP Caching Not Saving Pages > - > > Key: DRILL-8494 > URL: https://issues.apache.org/jira/browse/DRILL-8494 > Project: Apache Drill > Issue Type: Bug > Components: Storage - HTTP >Affects Versions: 1.21.1 >Reporter: Charles Givre >Assignee: Charles Givre >Priority: Major > Fix For: 1.21.2 > > > A minor bugfix, but the HTTP storage plugin was not actually caching results > even when caching was set to true. This bug was introduced in DRILL-8329. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (DRILL-8495) Tried to remove unmanaged buffer
[ https://issues.apache.org/jira/browse/DRILL-8495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Turton updated DRILL-8495: Fix Version/s: 1.21.2 > Tried to remove unmanaged buffer > > > Key: DRILL-8495 > URL: https://issues.apache.org/jira/browse/DRILL-8495 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.21.1 >Reporter: Maksym Rymar >Assignee: Maksym Rymar >Priority: Major > Fix For: 1.21.2 > > > > Drill throws an exception on Hive table: > {code:java} > (java.lang.IllegalStateException) Tried to remove unmanaged buffer. > org.apache.drill.exec.ops.BufferManagerImpl.replace():51 > io.netty.buffer.DrillBuf.reallocIfNeeded():101 > > org.apache.drill.exec.store.hive.writers.primitive.HiveStringWriter.write():38 > > org.apache.drill.exec.store.hive.readers.HiveDefaultRecordReader.readHiveRecordAndInsertIntoRecordBatch():416 > > org.apache.drill.exec.store.hive.readers.HiveDefaultRecordReader.next():402 > org.apache.drill.exec.physical.impl.ScanBatch.internalNext():235 > org.apache.drill.exec.physical.impl.ScanBatch.next():299 > > org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():237 > org.apache.drill.exec.record.AbstractRecordBatch.next():109 > org.apache.drill.exec.record.AbstractRecordBatch.next():101 > org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():59 > > org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():93 > org.apache.drill.exec.record.AbstractRecordBatch.next():160 > > org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():237 > org.apache.drill.exec.physical.impl.BaseRootExec.next():103 > > org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():81 > org.apache.drill.exec.physical.impl.BaseRootExec.next():93 > org.apache.drill.exec.work.fragment.FragmentExecutor.lambda$run$0():321 > java.security.AccessController.doPrivileged():-2 > javax.security.auth.Subject.doAs():422 > 
org.apache.hadoop.security.UserGroupInformation.doAs():1899 > org.apache.drill.exec.work.fragment.FragmentExecutor.run():310 > org.apache.drill.common.SelfCleaningRunnable.run():38 > java.util.concurrent.ThreadPoolExecutor.runWorker():1149 > java.util.concurrent.ThreadPoolExecutor$Worker.run():624 > java.lang.Thread.run():748 {code} > > > Reproduce: > # Create Hive table: > {code:java} > create table if NOT EXISTS students(id int, name string, surname string) > stored as parquet;{code} > # Insert a new row with 2 string values of size > 256 bytes: > {code:java} > insert into students values (1, > 'Veeery > long name', > 'biiiig > surname');{code} > # Execute Drill query: > {code:java} > select * from hive.`students` {code} > > > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
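The "Tried to remove unmanaged buffer" failure in DRILL-8495 comes down to an ownership invariant: a buffer manager may only replace (realloc) buffers it handed out itself. The Python sketch below uses hypothetical names to illustrate that invariant; it is not Drill's BufferManagerImpl, just the shape of the check that throws in the stack trace above.

```python
class Buf:
    """Stand-in for a DrillBuf (hypothetical)."""
    def __init__(self, size):
        self.size = size


class BufferManager:
    """Tracks the buffers it allocated; replace() enforces ownership."""
    def __init__(self):
        self._managed = set()

    def allocate(self, size):
        buf = Buf(size)
        self._managed.add(buf)
        return buf

    def replace(self, old, new_size):
        # Mirrors the check behind "Tried to remove unmanaged buffer":
        # only a buffer this manager handed out may be swapped for a
        # larger one during a realloc.
        if old not in self._managed:
            raise RuntimeError("Tried to remove unmanaged buffer.")
        self._managed.remove(old)
        return self.allocate(new_size)


mgr = BufferManager()
small = mgr.allocate(256)
bigger = mgr.replace(small, 512)   # fine: realloc of a managed buffer

foreign = Buf(256)                 # came from somewhere else entirely
try:
    mgr.replace(foreign, 512)
except RuntimeError as e:
    assert "unmanaged" in str(e)   # the error the Hive reader triggered
```

In the reported repro, writing string values larger than 256 bytes forces a realloc (reallocIfNeeded in the trace), and the buffer being reallocated was not registered with the manager doing the replacing, which trips exactly this check.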
[jira] [Closed] (DRILL-8493) Drill Unable to Read XML Files with Namespaces
[ https://issues.apache.org/jira/browse/DRILL-8493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Turton closed DRILL-8493. --- Resolution: Fixed > Drill Unable to Read XML Files with Namespaces > -- > > Key: DRILL-8493 > URL: https://issues.apache.org/jira/browse/DRILL-8493 > Project: Apache Drill > Issue Type: Bug > Components: Format - XML >Affects Versions: 1.21.1 >Reporter: Charles Givre >Assignee: Charles Givre >Priority: Major > Fix For: 1.21.2 > > > This is a bug fix whereby Drill ignores all data when an XML file has a > namespace. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (DRILL-8495) Tried to remove unmanaged buffer
[ https://issues.apache.org/jira/browse/DRILL-8495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847325#comment-17847325 ] ASF GitHub Bot commented on DRILL-8495: --- jnturton merged PR #2913: URL: https://github.com/apache/drill/pull/2913 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (DRILL-8492) Allow Parquet TIME_MICROS and TIMESTAMP_MICROS columns to be read as 64-bit integer values
[ https://issues.apache.org/jira/browse/DRILL-8492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847324#comment-17847324 ] ASF GitHub Bot commented on DRILL-8492: --- jnturton commented on PR #2907: URL: https://github.com/apache/drill/pull/2907#issuecomment-2117674516 It's always bugged me that we don't have a globally accessible way of accessing at least one of DrillbitContext, QueryContext, FragmentContext or just OptionManager. We hardly want to have to spray these things through APIs everywhere in Drill. I'll take a look at whether something can be done... > Allow Parquet TIME_MICROS and TIMESTAMP_MICROS columns to be read as 64-bit > integer values > --- > > Key: DRILL-8492 > URL: https://issues.apache.org/jira/browse/DRILL-8492 > Project: Apache Drill > Issue Type: Improvement > Components: Storage - Parquet >Affects Versions: 1.21.1 >Reporter: Peter Franzen >Priority: Major > > When reading Parquet columns of type {{time_micros}} and > {{{}timestamp_micros{}}}, Drill truncates the microsecond values to > milliseconds in order to convert them to SQL timestamps. > It is currently not possible to read the original microsecond values (as > 64-bit values, not SQL timestamps) through Drill. > One solution for allowing reading the original 64-bit values is to add two > options similar to “store.parquet.reader.int96_as_timestamp" to control > whether microsecond > times and timestamps are truncated to millisecond timestamps or read as > non-truncated 64-bit values. > These options would be added to {{org.apache.drill.exec.ExecConstants}} and > {{{}org.apache.drill.exec.server.options.SystemOptionManager{}}}. 
> They would also be added to "drill-module.conf": > {{ store.parquet.reader.time_micros_as_int64: false,}} > {{ store.parquet.reader.timestamp_micros_as_int64: false,}} > These options would then be used in the same places as > {{{}store.parquet.reader.int96_as_timestamp{}}}: > * org.apache.drill.exec.store.parquet.columnreaders.ColumnReaderFactory > * > org.apache.drill.exec.store.parquet.columnreaders.ParquetToDrillTypeConverter > * org.apache.drill.exec.store.parquet2.DrillParquetGroupConverter > to create an int64 reader instead of a time/timestamp reader when the > corresponding option is set to true. > In addition to this, > {{org.apache.drill.exec.store.parquet.metadata.FileMetadataCollector}} must > be altered to _not_ truncate the min and max values for > time_micros/timestamp_micros if the corresponding option is true. This class > doesn't have a reference to an {{{}OptionManager{}}}, so the two new options > must be extracted from the {{OptionManager}} when the {{ParquetReaderConfig}} > instance is created. > Filtering on microsecond columns would be done using 64-bit values rather > than TIME/TIMESTAMP values when the new options are true, e.g. > {{SELECT * FROM WHERE = 1705914906694751;}} -- This message was sent by Atlassian Jira (v8.20.10#820010)
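The precision loss described in the issue is plain integer arithmetic. A minimal standalone sketch (not Drill code) using the microsecond value from the filtering example:

```java
public class MicrosTruncationDemo {
  public static void main(String[] args) {
    // TIMESTAMP_MICROS value taken from the filtering example in the issue.
    long micros = 1705914906694751L;

    // Today Drill truncates to milliseconds to build a SQL TIMESTAMP.
    long millis = micros / 1000;               // 1705914906694

    // Converting back shows the sub-millisecond part is unrecoverable.
    long roundTripped = millis * 1000;         // 1705914906694000
    System.out.println(micros - roundTripped); // prints 751

    // With the proposed *_as_int64 options the raw 64-bit value would be
    // surfaced unchanged, so equality predicates on microseconds can match.
  }
}
```

This is why equality filters on microsecond columns cannot work through the current TIMESTAMP path: 751 microseconds of information are discarded before comparison.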
[jira] [Commented] (DRILL-8495) Tried to remove unmanaged buffer
[ https://issues.apache.org/jira/browse/DRILL-8495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847241#comment-17847241 ] ASF GitHub Bot commented on DRILL-8495: --- rymarm commented on PR #2913: URL: https://github.com/apache/drill/pull/2913#issuecomment-2117347650 @jnturton I addressed the checkstyle issues and failed Java tests. Should be fine now) -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (DRILL-8495) Tried to remove unmanaged buffer
[ https://issues.apache.org/jira/browse/DRILL-8495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846934#comment-17846934 ] ASF GitHub Bot commented on DRILL-8495: --- jnturton commented on PR #2913: URL: https://github.com/apache/drill/pull/2913#issuecomment-2115109847 P.S. I see that checkstyle is still upset. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (DRILL-8495) Tried to remove unmanaged buffer
[ https://issues.apache.org/jira/browse/DRILL-8495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846594#comment-17846594 ] ASF GitHub Bot commented on DRILL-8495: --- rymarm opened a new pull request, #2913: URL: https://github.com/apache/drill/pull/2913 # [DRILL-8495](https://issues.apache.org/jira/browse/DRILL-8495): Tried to remove unmanaged buffer The root cause of the issue is that multiple HiveWriters use the same `DrillBuf`, and during execution they may reallocate the buffer if its size is not enough for a value (256+ bytes). Since `drillBuf.reallocIfNeeded(int size)` returns a new instance of `DrillBuf`, all other writers still hold a reference to the old buffer, which becomes unmanaged after the `reallocIfNeeded` call. ## Description `HiveValueWriterFactory` now creates a unique `DrillBuf` for each writer. HiveWriters are actually used one by one, so a single buffer could serve all the writers. To do this, I could create a holder class for `DrillBuf`, so that each writer references the same holder, which would store the new buffer from every `drillBuf.reallocIfNeeded(int size)` call. But such logic looked slightly confusing, so I decided to let each HiveWriter use its own buffer. ## Documentation \- ## Testing Added a new unit test that queries a Hive table with variable-length values of Binary, VarChar, Char and String types. -- This message was sent by Atlassian Jira (v8.20.10#820010)
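For reference, the holder-class alternative the PR description considers and rejects can be sketched in a few lines. BufHolder and Writer are illustrative names, not Drill classes, and a plain int[] stands in for DrillBuf: every writer dereferences one shared mutable holder, so a reallocation performed by any writer is visible to all of them and no stale reference survives.

```java
// Hypothetical sketch of the rejected holder-class alternative.
final class BufHolder {
  int[] buf = new int[256]; // the single shared buffer
}

class Writer {
  private final BufHolder holder; // every writer points at the same holder

  Writer(BufHolder holder) { this.holder = holder; }

  void write(int needed) {
    if (holder.buf.length < needed) {
      // Stand-in for drillBuf.reallocIfNeeded(needed): the replacement is
      // stored back into the holder, so no writer keeps a stale reference.
      holder.buf = new int[needed];
    }
  }
}

public class HolderDemo {
  public static void main(String[] args) {
    BufHolder shared = new BufHolder();
    Writer a = new Writer(shared);
    Writer b = new Writer(shared);

    a.write(512);                          // writer A grows the shared buffer
    System.out.println(shared.buf.length); // 512 — writer B sees the growth too
    b.write(300);                          // already large enough; no realloc
    System.out.println(shared.buf.length); // still 512
  }
}
```

The merged fix instead gives each writer its own buffer, which achieves the same safety with less indirection.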
[jira] [Created] (DRILL-8495) Tried to remove unmanaged buffer
Maksym Rymar created DRILL-8495: --- Summary: Tried to remove unmanaged buffer Key: DRILL-8495 URL: https://issues.apache.org/jira/browse/DRILL-8495 Project: Apache Drill Issue Type: Bug Affects Versions: 1.21.1 Reporter: Maksym Rymar Assignee: Maksym Rymar Drill throws an exception on Hive table: {code:java} (java.lang.IllegalStateException) Tried to remove unmanaged buffer. org.apache.drill.exec.ops.BufferManagerImpl.replace():51 io.netty.buffer.DrillBuf.reallocIfNeeded():101 org.apache.drill.exec.store.hive.writers.primitive.HiveStringWriter.write():38 org.apache.drill.exec.store.hive.readers.HiveDefaultRecordReader.readHiveRecordAndInsertIntoRecordBatch():416 org.apache.drill.exec.store.hive.readers.HiveDefaultRecordReader.next():402 org.apache.drill.exec.physical.impl.ScanBatch.internalNext():235 org.apache.drill.exec.physical.impl.ScanBatch.next():299 org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():237 org.apache.drill.exec.record.AbstractRecordBatch.next():109 org.apache.drill.exec.record.AbstractRecordBatch.next():101 org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():59 org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():93 org.apache.drill.exec.record.AbstractRecordBatch.next():160 org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():237 org.apache.drill.exec.physical.impl.BaseRootExec.next():103 org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():81 org.apache.drill.exec.physical.impl.BaseRootExec.next():93 org.apache.drill.exec.work.fragment.FragmentExecutor.lambda$run$0():321 java.security.AccessController.doPrivileged():-2 javax.security.auth.Subject.doAs():422 org.apache.hadoop.security.UserGroupInformation.doAs():1899 org.apache.drill.exec.work.fragment.FragmentExecutor.run():310 org.apache.drill.common.SelfCleaningRunnable.run():38 java.util.concurrent.ThreadPoolExecutor.runWorker():1149 
java.util.concurrent.ThreadPoolExecutor$Worker.run():624 java.lang.Thread.run():748 {code} Reproduce: # Create Hive table: {code:java} create table if NOT EXISTS students(id int, name string, surname string) stored as parquet;{code} # Insert a new row with 2 string values of size > 256 bytes: {code:java} insert into students values (1, 'Veeery long name', 'bg surname');{code} # Execute Drill query: {code:java} select * from hive.`students` {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (DRILL-8492) Allow Parquet TIME_MICROS and TIMESTAMP_MICROS columns to be read as 64-bit integer values
[ https://issues.apache.org/jira/browse/DRILL-8492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845996#comment-17845996 ] ASF GitHub Bot commented on DRILL-8492: --- handmadecode commented on PR #2907: URL: https://github.com/apache/drill/pull/2907#issuecomment-2108103905 > Follow up: I see use of the FragmentContext class for accessing config options in the old Parquet reader, perhaps it's a good vehicle... `FragmentContext` is used to access the new config options where an instance already was available, i.e. in `ColumnReaderFactory`, `ParquetToDrillTypeConverter`, and `DrillParquetGroupConverter`. `FileMetadataCollector` doesn't have access to a `FragmentContext`, only to a `ParquetReaderConfig`. I can only find a FragmentContext in one of the two call paths to `FileMetadataCollector::addColumnMetadata`, so trying to inject a `FragmentContext` into `FileMetadataCollector` will probably have an impact on quite a few other classes. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (DRILL-8492) Allow Parquet TIME_MICROS and TIMESTAMP_MICROS columns to be read as 64-bit integer values
[ https://issues.apache.org/jira/browse/DRILL-8492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845816#comment-17845816 ] ASF GitHub Bot commented on DRILL-8492: --- jnturton commented on PR #2907: URL: https://github.com/apache/drill/pull/2907#issuecomment-2106890548 > However, I could very well have overlooked some way to access the global configuration, and I'd be grateful for any pointers. We should find existing examples in the Parquet format plugin. E.g. the "old" reader is affected by the option store.parquet.reader.pagereader.async. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (DRILL-8492) Allow Parquet TIME_MICROS and TIMESTAMP_MICROS columns to be read as 64-bit integer values
[ https://issues.apache.org/jira/browse/DRILL-8492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845701#comment-17845701 ] ASF GitHub Bot commented on DRILL-8492: --- handmadecode commented on PR #2907: URL: https://github.com/apache/drill/pull/2907#issuecomment-2106221734 > Awesome work. I can backport this too because you've left default behaviour unchanged (and it's self contained). My only question is about ParquetReaderConfig -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (DRILL-8480) Cleanup before finished. 0 out of 1 streams have finished
[ https://issues.apache.org/jira/browse/DRILL-8480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845661#comment-17845661 ] ASF GitHub Bot commented on DRILL-8480: --- jnturton merged PR #2897: URL: https://github.com/apache/drill/pull/2897 > Cleanup before finished. 0 out of 1 streams have finished > - > > Key: DRILL-8480 > URL: https://issues.apache.org/jira/browse/DRILL-8480 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.21.1 >Reporter: Maksym Rymar >Assignee: Maksym Rymar >Priority: Major > Attachments: 1a349ff1-d1f9-62bf-ed8c-26346c548005.sys.drill, > tableWithNumber2.parquet > > > Drill fails to execute a query with the following exception: > {code:java} > org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: > IllegalStateException: Cleanup before finished. 0 out of 1 streams have > finished > Fragment: 1:0 > Please, refer to logs for more information. > [Error Id: 270da8f4-0bb6-4985-bf4f-34853138881c on > compute7.vmcluster.com:31010] > at > org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:657) > at > org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:395) > at > org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:245) > at > org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:362) > at > org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:829) > Caused by: java.lang.IllegalStateException: Cleanup before finished. 
0 out of > 1 streams have finished > at > org.apache.drill.exec.work.batch.BaseRawBatchBuffer.close(BaseRawBatchBuffer.java:111) > at > org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:91) > at > org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:71) > at > org.apache.drill.exec.work.batch.AbstractDataCollector.close(AbstractDataCollector.java:121) > at > org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:91) > at > org.apache.drill.exec.work.batch.IncomingBuffers.close(IncomingBuffers.java:144) > at > org.apache.drill.exec.ops.FragmentContextImpl.suppressingClose(FragmentContextImpl.java:581) > at > org.apache.drill.exec.ops.FragmentContextImpl.close(FragmentContextImpl.java:567) > at > org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources(FragmentExecutor.java:417) > at > org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:240) > ... 5 common frames omitted > Suppressed: java.lang.IllegalStateException: Cleanup before finished. > 0 out of 1 streams have finished > ... 15 common frames omitted > Suppressed: java.lang.IllegalStateException: Memory was leaked by > query. Memory leaked: (32768) > Allocator(op:1:0:8:UnorderedReceiver) 100/32768/32768/100 > (res/actual/peak/limit) > at > org.apache.drill.exec.memory.BaseAllocator.close(BaseAllocator.java:519) > at > org.apache.drill.exec.ops.BaseOperatorContext.close(BaseOperatorContext.java:159) > at > org.apache.drill.exec.ops.OperatorContextImpl.close(OperatorContextImpl.java:77) > at > org.apache.drill.exec.ops.FragmentContextImpl.suppressingClose(FragmentContextImpl.java:581) > at > org.apache.drill.exec.ops.FragmentContextImpl.close(FragmentContextImpl.java:571) > ... 7 common frames omitted > Suppressed: java.lang.IllegalStateException: Memory was leaked by > query. 
Memory leaked: (1016640) > Allocator(frag:1:0) 3000/1016640/30016640/90715827882 > (res/actual/peak/limit) > at > org.apache.drill.exec.memory.BaseAllocator.close(BaseAllocator.java:519) > at > org.apache.drill.exec.ops.FragmentContextImpl.suppressingClose(FragmentContextImpl.java:581) > at > org.apache.drill.exec.ops.FragmentContextImpl.close(FragmentContextImpl.java:574) > ... 7 common frames omitted {code} > Steps to reproduce: > 1.Enable unequal join: >
[jira] [Commented] (DRILL-8474) Add Daffodil Format Plugin
[ https://issues.apache.org/jira/browse/DRILL-8474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845113#comment-17845113 ] ASF GitHub Bot commented on DRILL-8474: --- mbeckerle commented on code in PR #2909: URL: https://github.com/apache/drill/pull/2909#discussion_r1595903385 ## contrib/format-daffodil/src/main/java/org/apache/drill/exec/store/daffodil/schema/DaffodilDataProcessorFactory.java: ## @@ -0,0 +1,165 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.daffodil.schema;
+
+import org.apache.daffodil.japi.Compiler;
+import org.apache.daffodil.japi.Daffodil;
+import org.apache.daffodil.japi.DataProcessor;
+import org.apache.daffodil.japi.Diagnostic;
+import org.apache.daffodil.japi.InvalidParserException;
+import org.apache.daffodil.japi.InvalidUsageException;
+import org.apache.daffodil.japi.ProcessorFactory;
+import org.apache.daffodil.japi.ValidationMode;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.net.URI;
+import java.net.URISyntaxException;
+import java.nio.channels.Channels;
+import java.util.List;
+import java.util.Objects;
+
+/**
+ * Compiles a DFDL schema (mostly for tests) or loads a pre-compiled DFDL schema so that one can
+ * obtain a DataProcessor for use with DaffodilMessageParser.
+ *
+ * TODO: Needs to use a cache to avoid reloading/recompiling every time.
+ */
+public class DaffodilDataProcessorFactory {
+  // Default constructor is used.
+
+  private static final Logger logger = LoggerFactory.getLogger(DaffodilDataProcessorFactory.class);
+
+  private DataProcessor dp;
+
+  /**
+   * Gets a Daffodil DataProcessor given the necessary arguments to compile or reload it.
+   *
+   * @param schemaFileURI
+   *     pre-compiled dfdl schema (.bin extension) or DFDL schema source (.xsd extension)
+   * @param validationMode
+   *     Use true to request Daffodil built-in 'limited' validation. Use false for no validation.
+   * @param rootName
+   *     Local name of root element of the message. Can be null to use the first element declaration
+   *     of the primary schema file. Ignored if reloading a pre-compiled schema.
+   * @param rootNS
+   *     Namespace URI as a string. Can be null to use the target namespace of the primary schema
+   *     file or if it is unambiguous what element is the rootName. Ignored if reloading a
+   *     pre-compiled schema.
+   * @return the DataProcessor
+   * @throws CompileFailure
+   *     - if schema compilation fails
+   */
+  public DataProcessor getDataProcessor(URI schemaFileURI, boolean validationMode, String rootName,
+      String rootNS)
+      throws CompileFailure {
+
+    DaffodilDataProcessorFactory dmp = new DaffodilDataProcessorFactory();
+    boolean isPrecompiled = schemaFileURI.toString().endsWith(".bin");
+    if (isPrecompiled) {
+      if (Objects.nonNull(rootName) && !rootName.isEmpty()) {
+        // A usage error. You shouldn't supply the name and optionally namespace if loading
+        // precompiled schema because those are built into it. Should be null or "".
+        logger.warn("Root element name '{}' is ignored when used with precompiled DFDL schema.",
+            rootName);
+      }
+      try {
+        dmp.loadSchema(schemaFileURI);
+      } catch (IOException | InvalidParserException e) {
+        throw new CompileFailure(e);
+      }
+      dmp.setupDP(validationMode, null);
+    } else {
+      List<Diagnostic> pfDiags;
+      try {
+        pfDiags = dmp.compileSchema(schemaFileURI, rootName, rootNS);
+      } catch (URISyntaxException | IOException e) {
+        throw new CompileFailure(e);
+      }
+      dmp.setupDP(validationMode, pfDiags);
+    }
+    return dmp.dp;
+  }
+
+  private void loadSchema(URI schemaFileURI) throws IOException, InvalidParserException {
+    Compiler c = Daffodil.compiler();
+    dp = c.reload(Channels.newChannel(schemaFileURI.toURL().openStream()));
+  }
+
+  private List<Diagnostic> compileSchema(URI schemaFileURI, String rootName, String rootNS)
+      throws URISyntaxException, IOExcep
[jira] [Commented] (DRILL-8492) Allow Parquet TIME_MICROS and TIMESTAMP_MICROS columns to be read as 64-bit integer values
[ https://issues.apache.org/jira/browse/DRILL-8492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17844316#comment-17844316 ] ASF GitHub Bot commented on DRILL-8492: --- cgivre commented on PR #2907: URL: https://github.com/apache/drill/pull/2907#issuecomment-2098546012 > > LGTM +1 Thanks for the contribution! Can you please update the documentation for the Parquet reader to include this? Otherwise looks good! > > Happy to contribute! > > Do you mean the documentation in the `drill-site` repo? (https://github.com/apache/drill-site/blob/master/_docs/en/data-sources-and-file-formats/040-parquet-format.md) That's the one! > Allow Parquet TIME_MICROS and TIMESTAMP_MICROS columns to be read as 64-bit > integer values > --- > > Key: DRILL-8492 > URL: https://issues.apache.org/jira/browse/DRILL-8492 > Project: Apache Drill > Issue Type: Improvement > Components: Storage - Parquet >Affects Versions: 1.21.1 >Reporter: Peter Franzen >Priority: Major > > When reading Parquet columns of type {{time_micros}} and > {{{}timestamp_micros{}}}, Drill truncates the microsecond values to > milliseconds in order to convert them to SQL timestamps. > It is currently not possible to read the original microsecond values (as > 64-bit values, not SQL timestamps) through Drill. > One solution for allowing reading the original 64-bit values is to add two > options similar to “store.parquet.reader.int96_as_timestamp" to control > whether microsecond > times and timestamps are truncated to millisecond timestamps or read as > non-truncated 64-bit values. > These options would be added to {{org.apache.drill.exec.ExecConstants}} and > {{{}org.apache.drill.exec.server.options.SystemOptionManager{}}}. 
> They would also be added to "drill-module.conf": > {{ store.parquet.reader.time_micros_as_int64: false,}} > {{ store.parquet.reader.timestamp_micros_as_int64: false,}} > These options would then be used in the same places as > {{{}store.parquet.reader.int96_as_timestamp{}}}: > * org.apache.drill.exec.store.parquet.columnreaders.ColumnReaderFactory > * > org.apache.drill.exec.store.parquet.columnreaders.ParquetToDrillTypeConverter > * org.apache.drill.exec.store.parquet2.DrillParquetGroupConverter > to create an int64 reader instead of a time/timestamp reader when the > corresponding option is set to true. > In addition to this, > {{org.apache.drill.exec.store.parquet.metadata.FileMetadataCollector}} must > be altered to _not_ truncate the min and max values for > time_micros/timestamp_micros if the corresponding option is true. This class > doesn’t have a reference to an {{{}OptionManager{}}}, so the two new options > must be extracted from the {{OptionManager}} when the {{ParquetReaderConfig}} > instance is created. > Filtering on microsecond columns would be done using 64-bit values rather > than TIME/TIMESTAMP values when the new options are true, e.g. > {{SELECT * FROM WHERE = 1705914906694751;}} -- This message was sent by Atlassian Jira (v8.20.10#820010)
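As a sketch of how the proposed options would combine with a query (the quoted description elides the real table and column names; `dfs.`/data/events.parquet`` and `ts_micros` below are hypothetical, chosen only for illustration):

```sql
-- Proposed session options (both default to false, per the description above)
ALTER SESSION SET `store.parquet.reader.time_micros_as_int64` = true;
ALTER SESSION SET `store.parquet.reader.timestamp_micros_as_int64` = true;

-- With the options enabled, the column is read as a 64-bit integer, so the
-- filter compares raw microsecond values rather than truncated TIMESTAMPs.
SELECT * FROM dfs.`/data/events.parquet` WHERE ts_micros = 1705914906694751;
```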
[jira] [Commented] (DRILL-8492) Allow Parquet TIME_MICROS and TIMESTAMP_MICROS columns to be read as 64-bit integer values
[ https://issues.apache.org/jira/browse/DRILL-8492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17844315#comment-17844315 ] ASF GitHub Bot commented on DRILL-8492: --- handmadecode commented on PR #2907: URL: https://github.com/apache/drill/pull/2907#issuecomment-2098543073 > LGTM +1 Thanks for the contribution! Can you please update the documentation for the Parquet reader to include this? Otherwise looks good! Happy to contribute! Do you mean the documentation in the `drill-site` repo? (https://github.com/apache/drill-site/blob/master/_docs/en/data-sources-and-file-formats/040-parquet-format.md) > Allow Parquet TIME_MICROS and TIMESTAMP_MICROS columns to be read as 64-bit > integer values > --- > > Key: DRILL-8492 > URL: https://issues.apache.org/jira/browse/DRILL-8492 > Project: Apache Drill > Issue Type: Improvement > Components: Storage - Parquet >Affects Versions: 1.21.1 >Reporter: Peter Franzen >Priority: Major > > When reading Parquet columns of type {{time_micros}} and > {{{}timestamp_micros{}}}, Drill truncates the microsecond values to > milliseconds in order to convert them to SQL timestamps. > It is currently not possible to read the original microsecond values (as > 64-bit values, not SQL timestamps) through Drill. > One solution for allowing reading the original 64-bit values is to add two > options similar to “store.parquet.reader.int96_as_timestamp" to control > whether microsecond > times and timestamps are truncated to millisecond timestamps or read as > non-truncated 64-bit values. > These options would be added to {{org.apache.drill.exec.ExecConstants}} and > {{{}org.apache.drill.exec.server.options.SystemOptionManager{}}}. 
> They would also be added to "drill-module.conf": > {{ store.parquet.reader.time_micros_as_int64: false,}} > {{ store.parquet.reader.timestamp_micros_as_int64: false,}} > These options would then be used in the same places as > {{{}store.parquet.reader.int96_as_timestamp{}}}: > * org.apache.drill.exec.store.parquet.columnreaders.ColumnReaderFactory > * > org.apache.drill.exec.store.parquet.columnreaders.ParquetToDrillTypeConverter > * org.apache.drill.exec.store.parquet2.DrillParquetGroupConverter > to create an int64 reader instead of a time/timestamp reader when the > corresponding option is set to true. > In addition to this, > {{org.apache.drill.exec.store.parquet.metadata.FileMetadataCollector}} must > be altered to _not_ truncate the min and max values for > time_micros/timestamp_micros if the corresponding option is true. This class > doesn’t have a reference to an {{{}OptionManager{}}}, so the two new options > must be extracted from the {{OptionManager}} when the {{ParquetReaderConfig}} > instance is created. > Filtering on microsecond columns would be done using 64-bit values rather > than TIME/TIMESTAMP values when the new options are true, e.g. > {{SELECT * FROM WHERE = 1705914906694751;}} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (DRILL-8474) Add Daffodil Format Plugin
[ https://issues.apache.org/jira/browse/DRILL-8474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17843832#comment-17843832 ] ASF GitHub Bot commented on DRILL-8474: --- mbeckerle commented on PR #2909: URL: https://github.com/apache/drill/pull/2909#issuecomment-209976 > Hi Mike, Are you free at all this week? My apologies... We're in the middle of putting an offer on a house and my life is very hectic at the moment. Best, > Add Daffodil Format Plugin > -- > > Key: DRILL-8474 > URL: https://issues.apache.org/jira/browse/DRILL-8474 > Project: Apache Drill > Issue Type: New Feature >Affects Versions: 1.21.1 >Reporter: Charles Givre >Priority: Major > Fix For: 1.22.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8494) HTTP Caching Not Saving Pages
Charles Givre created DRILL-8494: Summary: HTTP Caching Not Saving Pages Key: DRILL-8494 URL: https://issues.apache.org/jira/browse/DRILL-8494 Project: Apache Drill Issue Type: Bug Components: Storage - HTTP Affects Versions: 1.21.1 Reporter: Charles Givre Assignee: Charles Givre Fix For: 1.21.2 A minor bugfix, but the HTTP storage plugin was not actually caching results even when caching was set to true. This bug was introduced in DRILL-8329. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (DRILL-8474) Add Daffodil Format Plugin
[ https://issues.apache.org/jira/browse/DRILL-8474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17843601#comment-17843601 ] ASF GitHub Bot commented on DRILL-8474: --- cgivre commented on PR #2909: URL: https://github.com/apache/drill/pull/2909#issuecomment-2095044801 Hi Mike, Are you free at all this week? My apologies... We're in the middle of putting an offer on a house and my life is very hectic at the moment. Best, > Add Daffodil Format Plugin > -- > > Key: DRILL-8474 > URL: https://issues.apache.org/jira/browse/DRILL-8474 > Project: Apache Drill > Issue Type: New Feature >Affects Versions: 1.21.1 >Reporter: Charles Givre >Priority: Major > Fix For: 1.22.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (DRILL-8488) HashJoinPOP memory leak is caused by OutOfMemoryException
[ https://issues.apache.org/jira/browse/DRILL-8488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17842680#comment-17842680 ] ASF GitHub Bot commented on DRILL-8488: --- cgivre merged PR #2900: URL: https://github.com/apache/drill/pull/2900 > HashJoinPOP memory leak is caused by OutOfMemoryException > -- > > Key: DRILL-8488 > URL: https://issues.apache.org/jira/browse/DRILL-8488 > Project: Apache Drill > Issue Type: Bug > Components: Server >Affects Versions: 1.21.1 >Reporter: shihuafeng >Priority: Major > Fix For: 1.22.0 > > > [DRILL-8485|[DRILL-8485] HashJoinPOP memory leak is caused by an oom > exception when read data from InputStream - ASF JIRA (apache.org)] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (DRILL-8489) Sender memory leak when rpc encode exception
[ https://issues.apache.org/jira/browse/DRILL-8489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17842679#comment-17842679 ] ASF GitHub Bot commented on DRILL-8489: --- cgivre merged PR #2901: URL: https://github.com/apache/drill/pull/2901 > Sender memory leak when rpc encode exception > > > Key: DRILL-8489 > URL: https://issues.apache.org/jira/browse/DRILL-8489 > Project: Apache Drill > Issue Type: Bug > Components: Server > Affects Versions: 1.21.1 > Reporter: shihuafeng > Priority: Major > Fix For: 1.22.0 > > > When encode throws an exception and the msg is an instance of ReferenceCounted, Netty can release it; but Drill has converted the msg to an OutboundRpcMessage, so Netty cannot release it. This causes sender memory leaks. > Exception info:
> {code:java}
> 2024-04-16 16:25:57,998 [DataClient-7] ERROR o.a.d.exec.rpc.RpcExceptionHandler - Exception in RPC communication. Connection: /10.32.112.138:47924 <--> /10.32.112.138:31012 (data client). Closing connection.
> io.netty.handler.codec.EncoderException: org.apache.drill.exec.exception.OutOfMemoryException: Unable to allocate buffer of size 4096 due to memory limit (9223372036854775807). Current allocation: 0
> at io.netty.handler.codec.MessageToMessageEncoder.write(MessageToMessageEncoder.java:107)
> at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:881)
> at io.netty.channel.AbstractChannelHandlerContext.invokeWriteAndFlush(AbstractChannelHandlerContext.java:940)
> at io.netty.channel.AbstractChannelHandlerContext$WriteTask.run(AbstractChannelHandlerContext.java:1247)
> at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173)
> at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166)
> at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
> at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569)
> at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
> at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.drill.exec.exception.OutOfMemoryException: Unable to allocate buffer of size 4096 due to memory limit (9223372036854775807). Current allocation: 0
> at org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:245)
> at org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:220)
> at org.apache.drill.exec.memory.DrillByteBufAllocator.buffer(DrillByteBufAllocator.java:55)
> at org.apache.drill.exec.memory.DrillByteBufAllocator.buffer(DrillByteBufAllocator.java:50)
> at org.apache.drill.exec.rpc.RpcEncoder.encode(RpcEncoder.java:87)
> at org.apache.drill.exec.rpc.RpcEncoder.encode(RpcEncoder.java:38)
> at io.netty.handler.codec.MessageToMessageEncoder.write(MessageToMessageEncoder.java:90){code}
-- This message was sent by Atlassian Jira (v8.20.10#820010)
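The leak mechanism described in DRILL-8489 can be sketched in plain Java. This is an illustrative stand-in, not Drill's actual code: `RefCounted` and `Wrapper` below are hypothetical substitutes for Netty's `ReferenceCounted` and Drill's `OutboundRpcMessage`. Once the reference-counted payload is hidden inside a wrapper the pipeline cannot recognize, an exception thrown during encode propagates without anyone calling `release()`, so the buffer's count never drops to zero:

```java
// Sketch of the leak pattern and its guard; names are hypothetical.
public class EncoderLeakSketch {
  /** Minimal stand-in for Netty's ReferenceCounted. */
  static class RefCounted {
    private int refCnt = 1;
    void release() { refCnt--; }
    int refCnt() { return refCnt; }
  }

  /** Wrapper analogous to OutboundRpcMessage: hides the ref-counted payload. */
  static class Wrapper {
    final RefCounted payload;
    Wrapper(RefCounted payload) { this.payload = payload; }
  }

  /** Encoder that fails without releasing: the payload leaks. */
  static void encodeWithoutGuard(Wrapper msg) {
    throw new RuntimeException("simulated OutOfMemoryException");
  }

  /** Encoder that releases the wrapped payload before propagating the failure. */
  static void encodeWithGuard(Wrapper msg) {
    try {
      throw new RuntimeException("simulated OutOfMemoryException");
    } catch (RuntimeException e) {
      msg.payload.release();  // the fix: release before rethrowing
      throw e;
    }
  }

  public static void main(String[] args) {
    RefCounted buf = new RefCounted();
    try { encodeWithoutGuard(new Wrapper(buf)); } catch (RuntimeException ignored) { }
    System.out.println("without guard, refCnt = " + buf.refCnt());  // stays 1: leaked

    RefCounted buf2 = new RefCounted();
    try { encodeWithGuard(new Wrapper(buf2)); } catch (RuntimeException ignored) { }
    System.out.println("with guard, refCnt = " + buf2.refCnt());    // 0: released
  }
}
```

The guard mirrors what Netty's `MessageToMessageEncoder` can only do for messages it knows are reference-counted; once the message type changes, the encoder itself has to take over the release.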
[jira] [Commented] (DRILL-8474) Add Daffodil Format Plugin
[ https://issues.apache.org/jira/browse/DRILL-8474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841807#comment-17841807 ] ASF GitHub Bot commented on DRILL-8474: --- mbeckerle commented on PR #2909: URL: https://github.com/apache/drill/pull/2909#issuecomment-2081781546 Tests are now failing due to these two things in TestDaffodilReader.scala
```
String schemaURIRoot = "file:///opt/drill/contrib/format-daffodil/src/test/resources/";
```
That's an absolute URI that is used to obtain access to the schema files in this statement:
```
private String selectRow(String schema, String file) {
  return "SELECT * FROM table(dfs.`data/" + file + "` " +
      " (type => 'daffodil'," +
      " " + "validationMode => 'true', " +
      " schemaURI => '" + schemaURIRoot + "schema/" + schema + ".dfdl.xsd'," +
      " rootName => 'row'," +
      " rootNamespace => null " +
      "))";
}
```
This assembles a select statement and puts this absolute schemaURI into the schemaURI part of the select. What should I be doing to arrange for these schema URIs to be found? The schemas are a large, complex set of files (potentially hundreds), not just a single file; many files must be found relative to the initial root schema file, as they include/import other schema files using relative paths. > Add Daffodil Format Plugin > -- > > Key: DRILL-8474 > URL: https://issues.apache.org/jira/browse/DRILL-8474 > Project: Apache Drill > Issue Type: New Feature >Affects Versions: 1.21.1 >Reporter: Charles Givre >Priority: Major > Fix For: 1.22.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (DRILL-8474) Add Daffodil Format Plugin
[ https://issues.apache.org/jira/browse/DRILL-8474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841775#comment-17841775 ] ASF GitHub Bot commented on DRILL-8474: --- cgivre commented on code in PR #2909: URL: https://github.com/apache/drill/pull/2909#discussion_r1582375084 ## exec/vector/src/main/java/org/apache/drill/exec/record/metadata/MapBuilder.java:
## @@ -185,6 +192,26 @@ public MapBuilder resumeMap() {
     return (MapBuilder) parent;
   }

+  /**
+   * Depending on whether the parent is a schema builder or map builder
+   * we resume appropriately.
+   */
+  @Override
+  public void resume() {
+    if (Objects.isNull(parent))

Review Comment: I just built Drill using the following command:
```sh
mvn clean install -DskipTests
```
When I did that, I was getting the same error as on GitHub. After adding the braces as described above, it built without issues. With that said, I think you can just run the checkstyle check with:
```sh
mvn checkstyle:checkstyle
```
> Add Daffodil Format Plugin > -- > > Key: DRILL-8474 > URL: https://issues.apache.org/jira/browse/DRILL-8474 > Project: Apache Drill > Issue Type: New Feature >Affects Versions: 1.21.1 >Reporter: Charles Givre >Priority: Major > Fix For: 1.22.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (DRILL-8474) Add Daffodil Format Plugin
[ https://issues.apache.org/jira/browse/DRILL-8474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841774#comment-17841774 ] ASF GitHub Bot commented on DRILL-8474: --- cgivre commented on code in PR #2909: URL: https://github.com/apache/drill/pull/2909#discussion_r1582375084 ## exec/vector/src/main/java/org/apache/drill/exec/record/metadata/MapBuilder.java:
## @@ -185,6 +192,26 @@ public MapBuilder resumeMap() {
     return (MapBuilder) parent;
   }

+  /**
+   * Depending on whether the parent is a schema builder or map builder
+   * we resume appropriately.
+   */
+  @Override
+  public void resume() {
+    if (Objects.isNull(parent))

Review Comment: I just built Drill using the following command:
```sh
mvn clean install -DskipTests
```
I think you can just run the checkstyle check with:
```sh
mvn checkstyle:checkstyle
```
> Add Daffodil Format Plugin > -- > > Key: DRILL-8474 > URL: https://issues.apache.org/jira/browse/DRILL-8474 > Project: Apache Drill > Issue Type: New Feature >Affects Versions: 1.21.1 >Reporter: Charles Givre >Priority: Major > Fix For: 1.22.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (DRILL-8474) Add Daffodil Format Plugin
[ https://issues.apache.org/jira/browse/DRILL-8474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841768#comment-17841768 ] ASF GitHub Bot commented on DRILL-8474: --- mbeckerle commented on code in PR #2909: URL: https://github.com/apache/drill/pull/2909#discussion_r1582367382 ## exec/vector/src/main/java/org/apache/drill/exec/record/metadata/MapBuilder.java:
## @@ -185,6 +192,26 @@ public MapBuilder resumeMap() {
     return (MapBuilder) parent;
   }

+  /**
+   * Depending on whether the parent is a schema builder or map builder
+   * we resume appropriately.
+   */
+  @Override
+  public void resume() {
+    if (Objects.isNull(parent))

Review Comment: What is the maven command line to just make it run this checkstyle? > Add Daffodil Format Plugin > -- > > Key: DRILL-8474 > URL: https://issues.apache.org/jira/browse/DRILL-8474 > Project: Apache Drill > Issue Type: New Feature >Affects Versions: 1.21.1 >Reporter: Charles Givre >Priority: Major > Fix For: 1.22.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (DRILL-8474) Add Daffodil Format Plugin
[ https://issues.apache.org/jira/browse/DRILL-8474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841667#comment-17841667 ] ASF GitHub Bot commented on DRILL-8474: --- cgivre commented on code in PR #2909: URL: https://github.com/apache/drill/pull/2909#discussion_r1582206247 ## exec/vector/src/main/java/org/apache/drill/exec/record/metadata/MapBuilder.java:
## @@ -185,6 +192,26 @@ public MapBuilder resumeMap() {
     return (MapBuilder) parent;
   }

+  /**
+   * Depending on whether the parent is a schema builder or map builder
+   * we resume appropriately.
+   */
+  @Override
+  public void resume() {
+    if (Objects.isNull(parent))

Review Comment: @mbeckerle Confirmed. I successfully built your branch by adding the aforementioned braces. I'll save you some additional trouble. There's another check style violation in `DaffodilBatchReader`. Drill doesn't like star imports for some reason. > Add Daffodil Format Plugin > -- > > Key: DRILL-8474 > URL: https://issues.apache.org/jira/browse/DRILL-8474 > Project: Apache Drill > Issue Type: New Feature >Affects Versions: 1.21.1 >Reporter: Charles Givre >Priority: Major > Fix For: 1.22.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (DRILL-8474) Add Daffodil Format Plugin
[ https://issues.apache.org/jira/browse/DRILL-8474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841663#comment-17841663 ] ASF GitHub Bot commented on DRILL-8474: --- cgivre commented on code in PR #2909: URL: https://github.com/apache/drill/pull/2909#discussion_r1582202511 ## exec/vector/src/main/java/org/apache/drill/exec/record/metadata/MapBuilder.java:
## @@ -185,6 +192,26 @@ public MapBuilder resumeMap() {
     return (MapBuilder) parent;
   }

+  /**
+   * Depending on whether the parent is a schema builder or map builder
+   * we resume appropriately.
+   */
+  @Override
+  public void resume() {
+    if (Objects.isNull(parent))

Review Comment: @mbeckerle I don't know why the checkstyle is telling you the wrong file, but here, you'll need braces as well as at line 203. ie:
```java
if (parent instanceof MapBuilder) {
  resumeMap();
}
```
> Add Daffodil Format Plugin > -- > > Key: DRILL-8474 > URL: https://issues.apache.org/jira/browse/DRILL-8474 > Project: Apache Drill > Issue Type: New Feature >Affects Versions: 1.21.1 >Reporter: Charles Givre >Priority: Major > Fix For: 1.22.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (DRILL-8474) Add Daffodil Format Plugin
[ https://issues.apache.org/jira/browse/DRILL-8474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841637#comment-17841637 ] ASF GitHub Bot commented on DRILL-8474: --- shfshihuafeng commented on PR #2909: URL: https://github.com/apache/drill/pull/2909#issuecomment-2081475418 > This fails its tests due to a maven checkstyle failure. It's complaining about Drill:Exec:Vectors, which my code has no changes to. > > Can someone advise on what is wrong here?
```
if (Objects.isNull(parent)) {
  throw new IllegalStateException("Call to resume() on MapBuilder with no parent.");
}
```
> Add Daffodil Format Plugin > -- > > Key: DRILL-8474 > URL: https://issues.apache.org/jira/browse/DRILL-8474 > Project: Apache Drill > Issue Type: New Feature >Affects Versions: 1.21.1 >Reporter: Charles Givre >Priority: Major > Fix For: 1.22.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (DRILL-8474) Add Daffodil Format Plugin
[ https://issues.apache.org/jira/browse/DRILL-8474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841636#comment-17841636 ] ASF GitHub Bot commented on DRILL-8474: --- shfshihuafeng commented on PR #2909: URL: https://github.com/apache/drill/pull/2909#issuecomment-2081475241 > This fails its tests due to a maven checkstyle failure. It's complaining about Drill:Exec:Vectors, which my code has no changes to. > > Can someone advise on what is wrong here? The violation is at /home/runner/work/drill/drill/exec/vector/src/main/java/org/apache/drill/exec/record/metadata/MapBuilder.java:201:5 — the 'if' construct must use '{}'. I think you need to add braces to the if, like the following:
```
if (Objects.isNull(parent)) {
  throw new IllegalStateException("Call to resume() on MapBuilder with no parent.");
}
```
> Add Daffodil Format Plugin > -- > > Key: DRILL-8474 > URL: https://issues.apache.org/jira/browse/DRILL-8474 > Project: Apache Drill > Issue Type: New Feature >Affects Versions: 1.21.1 >Reporter: Charles Givre >Priority: Major > Fix For: 1.22.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (DRILL-8493) Drill Unable to Read XML Files with Namespaces
[ https://issues.apache.org/jira/browse/DRILL-8493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841556#comment-17841556 ] ASF GitHub Bot commented on DRILL-8493: --- cgivre merged PR #2908: URL: https://github.com/apache/drill/pull/2908 > Drill Unable to Read XML Files with Namespaces > -- > > Key: DRILL-8493 > URL: https://issues.apache.org/jira/browse/DRILL-8493 > Project: Apache Drill > Issue Type: Bug > Components: Format - XML >Affects Versions: 1.21.1 >Reporter: Charles Givre >Assignee: Charles Givre >Priority: Major > Fix For: 1.21.2 > > > This is a bug fix whereby Drill ignores all data when an XML file has a > namespace. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (DRILL-8474) Add Daffodil Format Plugin
[ https://issues.apache.org/jira/browse/DRILL-8474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841537#comment-17841537 ] Mike Beckerle commented on DRILL-8474: -- PR for this ticket is now https://github.com/apache/drill/pull/2909 > Add Daffodil Format Plugin > -- > > Key: DRILL-8474 > URL: https://issues.apache.org/jira/browse/DRILL-8474 > Project: Apache Drill > Issue Type: New Feature >Affects Versions: 1.21.1 >Reporter: Charles Givre >Priority: Major > Fix For: 1.22.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (DRILL-2835) IndexOutOfBoundsException in partition sender when doing streaming aggregate with LIMIT
[ https://issues.apache.org/jira/browse/DRILL-2835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841535#comment-17841535 ] ASF GitHub Bot commented on DRILL-2835: --- mbeckerle commented on PR #2909: URL: https://github.com/apache/drill/pull/2909#issuecomment-2081179778 This fails its tests due to a maven checkstyle failure. It's complaining about Drill:Exec:Vectors, which my code has no changes to. Can someone advise on what is wrong here? > IndexOutOfBoundsException in partition sender when doing streaming aggregate > with LIMIT > > > Key: DRILL-2835 > URL: https://issues.apache.org/jira/browse/DRILL-2835 > Project: Apache Drill > Issue Type: Bug > Components: Execution - RPC >Affects Versions: 0.8.0 >Reporter: Aman Sinha >Assignee: Venki Korukanti >Priority: Major > Fix For: 0.9.0 > > Attachments: DRILL-2835-1.patch, DRILL-2835-2.patch > > > Following CTAS run on a TPC-DS 100GB scale factor on a 10-node cluster:
> {code}
> alter session set `planner.enable_hashagg` = false;
> alter session set `planner.enable_multiphase_agg` = true;
> create table dfs.tmp.stream9 as
> select cr_call_center_sk , cr_catalog_page_sk , cr_item_sk , cr_reason_sk , cr_refunded_addr_sk , count(*) from catalog_returns_dri100
> group by cr_call_center_sk , cr_catalog_page_sk , cr_item_sk , cr_reason_sk , cr_refunded_addr_sk
> limit 100
> ;
> {code}
> {code}
> Caused by: java.lang.IndexOutOfBoundsException: index: 1023, length: 1 (expected: range(0, 0))
> at io.netty.buffer.DrillBuf.checkIndexD(DrillBuf.java:200) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
> at io.netty.buffer.DrillBuf.chk(DrillBuf.java:222) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
> at io.netty.buffer.DrillBuf.setByte(DrillBuf.java:621) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
> at org.apache.drill.exec.vector.UInt1Vector$Mutator.set(UInt1Vector.java:342) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
> at org.apache.drill.exec.vector.NullableBigIntVector$Mutator.set(NullableBigIntVector.java:372) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
> at org.apache.drill.exec.vector.NullableBigIntVector.copyFrom(NullableBigIntVector.java:284) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
> at org.apache.drill.exec.test.generated.PartitionerGen4$OutgoingRecordBatch.doEval(PartitionerTemplate.java:370) ~[na:na]
> at org.apache.drill.exec.test.generated.PartitionerGen4$OutgoingRecordBatch.copy(PartitionerTemplate.java:249) ~[na:na]
> at org.apache.drill.exec.test.generated.PartitionerGen4.doCopy(PartitionerTemplate.java:208) ~[na:na]
> at org.apache.drill.exec.test.generated.PartitionerGen4.partitionBatch(PartitionerTemplate.java:176) ~[na:na]
> {code}
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (DRILL-8474) Add Daffodil Format Plugin
[ https://issues.apache.org/jira/browse/DRILL-8474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841530#comment-17841530 ] ASF GitHub Bot commented on DRILL-8474: --- mbeckerle closed pull request #2836: DRILL-8474: Add Daffodil Format Plugin URL: https://github.com/apache/drill/pull/2836 > Add Daffodil Format Plugin > -- > > Key: DRILL-8474 > URL: https://issues.apache.org/jira/browse/DRILL-8474 > Project: Apache Drill > Issue Type: New Feature >Affects Versions: 1.21.1 >Reporter: Charles Givre >Priority: Major > Fix For: 1.22.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (DRILL-8474) Add Daffodil Format Plugin
[ https://issues.apache.org/jira/browse/DRILL-8474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841531#comment-17841531 ] ASF GitHub Bot commented on DRILL-8474: --- mbeckerle commented on PR #2836: URL: https://github.com/apache/drill/pull/2836#issuecomment-2081176156 Creating a new squashed PR so as to avoid loss of the comments on this PR. > Add Daffodil Format Plugin > -- > > Key: DRILL-8474 > URL: https://issues.apache.org/jira/browse/DRILL-8474 > Project: Apache Drill > Issue Type: New Feature >Affects Versions: 1.21.1 >Reporter: Charles Givre >Priority: Major > Fix For: 1.22.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (DRILL-8474) Add Daffodil Format Plugin
[ https://issues.apache.org/jira/browse/DRILL-8474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841528#comment-17841528 ] ASF GitHub Bot commented on DRILL-8474: --- mbeckerle commented on PR #2836: URL: https://github.com/apache/drill/pull/2836#issuecomment-2081164073 This now passes all the daffodil contrib tests using the published official Daffodil 3.7.0. It does not yet run in any scalable fashion, but the metadata/data interfacing is complete. I would like to squash this to a single commit before merging, and it needs to be tested rebased onto the latest Drill commit. > Add Daffodil Format Plugin > -- > > Key: DRILL-8474 > URL: https://issues.apache.org/jira/browse/DRILL-8474 > Project: Apache Drill > Issue Type: New Feature >Affects Versions: 1.21.1 >Reporter: Charles Givre >Priority: Major > Fix For: 1.22.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (DRILL-8493) Drill Unable to Read XML Files with Namespaces
[ https://issues.apache.org/jira/browse/DRILL-8493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841368#comment-17841368 ] ASF GitHub Bot commented on DRILL-8493: --- cgivre opened a new pull request, #2908: URL: https://github.com/apache/drill/pull/2908 # [DRILL-8493](https://issues.apache.org/jira/browse/DRILL-8493): Drill Unable to Read XML Files with Namespaces ## Description This PR fixes an issue whereby if an XML file has a namespace defined, Drill may ignore all data. ## Documentation No user facing changes. ## Testing Added unit test. > Drill Unable to Read XML Files with Namespaces > -- > > Key: DRILL-8493 > URL: https://issues.apache.org/jira/browse/DRILL-8493 > Project: Apache Drill > Issue Type: Bug > Components: Format - XML >Affects Versions: 1.21.1 >Reporter: Charles Givre >Assignee: Charles Givre >Priority: Major > Fix For: 1.21.2 > > > This is a bug fix whereby Drill ignores all data when an XML file has a > namespace. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8493) Drill Unable to Read XML Files with Namespaces
Charles Givre created DRILL-8493: Summary: Drill Unable to Read XML Files with Namespaces Key: DRILL-8493 URL: https://issues.apache.org/jira/browse/DRILL-8493 Project: Apache Drill Issue Type: Bug Components: Format - XML Affects Versions: 1.21.1 Reporter: Charles Givre Assignee: Charles Givre Fix For: 1.21.2 This is a bug fix whereby Drill ignores all data when an XML file has a namespace. -- This message was sent by Atlassian Jira (v8.20.10#820010)
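For context on the namespace bug class described above: a namespace-aware pull parser reports namespaced elements through their local names, so a reader that compares raw qualified tag text against a document with a default namespace can end up matching nothing and ignoring all data. A minimal StAX illustration (JDK only; this is a sketch, not Drill's actual XML reader):

```java
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import java.io.StringReader;

// With a default namespace declared, a namespace-aware parser still exposes
// the element's local name; field matching should use getLocalName() rather
// than assuming un-namespaced tag text.
public class NamespaceSketch {
    public static String firstLocalName(String xml) throws XMLStreamException {
        XMLStreamReader r = XMLInputFactory.newInstance()
            .createXMLStreamReader(new StringReader(xml));
        while (r.hasNext()) {
            if (r.next() == XMLStreamConstants.START_ELEMENT) {
                return r.getLocalName(); // "book", even though it is namespaced
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(firstLocalName(
            "<book xmlns=\"http://example.com/ns\"><title>t</title></book>"));
    }
}
```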
[jira] [Created] (DRILL-8492) Allow Parquet TIME_MICROS and TIMESTAMP_MICROS columns to be read as 64-bit integer values
Peter Franzen created DRILL-8492: Summary: Allow Parquet TIME_MICROS and TIMESTAMP_MICROS columns to be read as 64-bit integer values Key: DRILL-8492 URL: https://issues.apache.org/jira/browse/DRILL-8492 Project: Apache Drill Issue Type: Improvement Components: Storage - Parquet Affects Versions: 1.21.1 Reporter: Peter Franzen When reading Parquet columns of type {{time_micros}} and {{{}timestamp_micros{}}}, Drill truncates the microsecond values to milliseconds in order to convert them to SQL timestamps. It is currently not possible to read the original microsecond values (as 64-bit values, not SQL timestamps) through Drill. One solution for allowing reading the original 64-bit values is to add two options similar to "store.parquet.reader.int96_as_timestamp" to control whether microsecond times and timestamps are truncated to millisecond timestamps or read as non-truncated 64-bit values. These options would be added to {{org.apache.drill.exec.ExecConstants}} and {{{}org.apache.drill.exec.server.options.SystemOptionManager{}}}. They would also be added to "drill-module.conf": {{ store.parquet.reader.time_micros_as_int64: false,}} {{ store.parquet.reader.timestamp_micros_as_int64: false,}} These options would then be used in the same places as {{{}store.parquet.reader.int96_as_timestamp{}}}: * org.apache.drill.exec.store.parquet.columnreaders.ColumnReaderFactory * org.apache.drill.exec.store.parquet.columnreaders.ParquetToDrillTypeConverter * org.apache.drill.exec.store.parquet2.DrillParquetGroupConverter to create an int64 reader instead of a time/timestamp reader when the corresponding option is set to true. In addition to this, {{org.apache.drill.exec.store.parquet.metadata.FileMetadataCollector }}must be altered to _not_ truncate the min and max values for time_micros/timestamp_micros if the corresponding option is true. 
This class doesn’t have a reference to an {{{}OptionManager{}}}, so the two new options must be extracted from the {{OptionManager}} when the {{ParquetReaderConfig}} instance is created. Filtering on microsecond columns would be done using 64-bit values rather than TIME/TIMESTAMP values when the new options are true, e.g. {{SELECT * FROM WHERE = 1705914906694751;}} -- This message was sent by Atlassian Jira (v8.20.10#820010)
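A sketch of how the proposed options would change the type mapping (the option names are taken from the issue text; the converter logic below is a simplified stand-in, not Drill's actual ParquetToDrillTypeConverter):

```java
// When store.parquet.reader.timestamp_micros_as_int64 (or the time_micros
// variant) is true, the column would map to BIGINT and keep full microsecond
// precision; otherwise the current behavior applies: truncation to a
// millisecond TIMESTAMP/TIME.
public class MicrosMappingSketch {
    static String drillTypeFor(String parquetLogicalType, boolean microsAsInt64) {
        switch (parquetLogicalType) {
            case "TIMESTAMP_MICROS":
                return microsAsInt64 ? "BIGINT" : "TIMESTAMP";
            case "TIME_MICROS":
                return microsAsInt64 ? "BIGINT" : "TIME";
            default:
                return "UNSUPPORTED";
        }
    }

    // The truncation the reader performs today: microseconds -> milliseconds.
    static long toMillis(long micros) {
        return micros / 1000;
    }

    public static void main(String[] args) {
        System.out.println(drillTypeFor("TIMESTAMP_MICROS", true)); // BIGINT
        System.out.println(toMillis(1705914906694751L));            // 1705914906694
    }
}
```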
[jira] [Updated] (DRILL-8491) MongoDB | Queries Conversion optimisation & using various mongoDB features
[ https://issues.apache.org/jira/browse/DRILL-8491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piyush Shama updated DRILL-8491: Priority: Critical (was: Major) > MongoDB | Queries Conversion optimisation & using various mongoDB features > -- > > Key: DRILL-8491 > URL: https://issues.apache.org/jira/browse/DRILL-8491 > Project: Apache Drill > Issue Type: Improvement >Reporter: Piyush Shama >Priority: Critical > > {*}Title{*}: Inefficient Query Translation and Underutilised Functions in SQL > to MongoDB Conversion Using Apache Drill > {*}Description{*}: We have been experiencing significant performance issues > when using Apache Drill to convert SQL queries for use with MongoDB. It > appears that the SQL to MongoDB query translation process is not optimally > executed, leading to inefficient query operations and slow response times. > {*}Details{*}: > # {*}Inefficient Query Translation{*}: > ** The translation of SQL queries into MongoDB-specific queries by Apache > Drill seems sub optimal. This inefficiency is particularly noticeable with > complex queries, where the expected execution plan does not align with > MongoDB's capabilities, resulting in slower query performance. > # {*}Underutilization of MongoDB Capabilities{*}: > ** Several MongoDB functionalities are not being fully utilised in the > translation process: > *** {*}Aggregation Operations{*}: Functions like {{{}SUM(){}}}, > {{{}AVG(){}}}, {{{}MIN(){}}}, and {{MAX()}} are either poorly translated or > not utilised, leading to potential performance degradation. > *** {*}Date Handling{*}: Extraction of date components (e.g., day from an > ISO date) within queries is not handled efficiently, forcing additional > processing overhead or client-side computations. > *** {*}Count Queries{*}: Simple count operations are not optimised, possibly > translating into more complex query forms than necessary. 
> {*}Impact{*}: The current issues significantly affect the performance and > scalability of applications relying on Apache Drill for interacting with > MongoDB, particularly in data-heavy environments. > {*}Expected Behaviour{*}: > * Queries translated from SQL to MongoDB should utilise MongoDB's native > query capabilities more effectively, ensuring that operations such as > aggregations, date extractions, and counts are executed in the most efficient > manner possible. > * The translation engine should optimise the query structure to leverage > MongoDB's strengths, particularly in handling large datasets. > {*}Steps to Reproduce{*}: > # Set up Apache Drill with a MongoDB data source. > # Execute complex SQL queries involving aggregation, date extraction, and > count operations. > # Observe the generated MongoDB queries and resulting performance. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (DRILL-8491) MongoDB | Queries Conversion optimisation & using various mongoDB features
[ https://issues.apache.org/jira/browse/DRILL-8491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piyush Shama updated DRILL-8491: Priority: Major (was: Critical) > MongoDB | Queries Conversion optimisation & using various mongoDB features > -- > > Key: DRILL-8491 > URL: https://issues.apache.org/jira/browse/DRILL-8491 > Project: Apache Drill > Issue Type: Improvement >Reporter: Piyush Shama >Priority: Major > > {*}Title{*}: Inefficient Query Translation and Underutilised Functions in SQL > to MongoDB Conversion Using Apache Drill > {*}Description{*}: We have been experiencing significant performance issues > when using Apache Drill to convert SQL queries for use with MongoDB. It > appears that the SQL to MongoDB query translation process is not optimally > executed, leading to inefficient query operations and slow response times. > {*}Details{*}: > # {*}Inefficient Query Translation{*}: > ** The translation of SQL queries into MongoDB-specific queries by Apache > Drill seems sub optimal. This inefficiency is particularly noticeable with > complex queries, where the expected execution plan does not align with > MongoDB's capabilities, resulting in slower query performance. > # {*}Underutilization of MongoDB Capabilities{*}: > ** Several MongoDB functionalities are not being fully utilised in the > translation process: > *** {*}Aggregation Operations{*}: Functions like {{{}SUM(){}}}, > {{{}AVG(){}}}, {{{}MIN(){}}}, and {{MAX()}} are either poorly translated or > not utilised, leading to potential performance degradation. > *** {*}Date Handling{*}: Extraction of date components (e.g., day from an > ISO date) within queries is not handled efficiently, forcing additional > processing overhead or client-side computations. > *** {*}Count Queries{*}: Simple count operations are not optimised, possibly > translating into more complex query forms than necessary. 
> {*}Impact{*}: The current issues significantly affect the performance and > scalability of applications relying on Apache Drill for interacting with > MongoDB, particularly in data-heavy environments. > {*}Expected Behaviour{*}: > * Queries translated from SQL to MongoDB should utilise MongoDB's native > query capabilities more effectively, ensuring that operations such as > aggregations, date extractions, and counts are executed in the most efficient > manner possible. > * The translation engine should optimise the query structure to leverage > MongoDB's strengths, particularly in handling large datasets. > {*}Steps to Reproduce{*}: > # Set up Apache Drill with a MongoDB data source. > # Execute complex SQL queries involving aggregation, date extraction, and > count operations. > # Observe the generated MongoDB queries and resulting performance. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (DRILL-8491) MongoDB | Queries Conversion optimisation & using various mongoDB features
[ https://issues.apache.org/jira/browse/DRILL-8491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piyush Shama updated DRILL-8491: Description: {*}Title{*}: Inefficient Query Translation and Underutilised Functions in SQL to MongoDB Conversion Using Apache Drill {*}Description{*}: We have been experiencing significant performance issues when using Apache Drill to convert SQL queries for use with MongoDB. It appears that the SQL to MongoDB query translation process is not optimally executed, leading to inefficient query operations and slow response times. {*}Details{*}: # {*}Inefficient Query Translation{*}: ** The translation of SQL queries into MongoDB-specific queries by Apache Drill seems sub optimal. This inefficiency is particularly noticeable with complex queries, where the expected execution plan does not align with MongoDB's capabilities, resulting in slower query performance. # {*}Underutilization of MongoDB Capabilities{*}: ** Several MongoDB functionalities are not being fully utilised in the translation process: *** {*}Aggregation Operations{*}: Functions like {{{}SUM(){}}}, {{{}AVG(){}}}, {{{}MIN(){}}}, and {{MAX()}} are either poorly translated or not utilised, leading to potential performance degradation. *** {*}Date Handling{*}: Extraction of date components (e.g., day from an ISO date) within queries is not handled efficiently, forcing additional processing overhead or client-side computations. *** {*}Count Queries{*}: Simple count operations are not optimised, possibly translating into more complex query forms than necessary. {*}Impact{*}: The current issues significantly affect the performance and scalability of applications relying on Apache Drill for interacting with MongoDB, particularly in data-heavy environments. 
{*}Expected Behaviour{*}: * Queries translated from SQL to MongoDB should utilise MongoDB's native query capabilities more effectively, ensuring that operations such as aggregations, date extractions, and counts are executed in the most efficient manner possible. * The translation engine should optimise the query structure to leverage MongoDB's strengths, particularly in handling large datasets. {*}Steps to Reproduce{*}: # Set up Apache Drill with a MongoDB data source. # Execute complex SQL queries involving aggregation, date extraction, and count operations. # Observe the generated MongoDB queries and resulting performance. > MongoDB | Queries Conversion optimisation & using various mongoDB features > -- > > Key: DRILL-8491 > URL: https://issues.apache.org/jira/browse/DRILL-8491 > Project: Apache Drill > Issue Type: Improvement >Reporter: Piyush Shama >Priority: Major > > {*}Title{*}: Inefficient Query Translation and Underutilised Functions in SQL > to MongoDB Conversion Using Apache Drill > {*}Description{*}: We have been experiencing significant performance issues > when using Apache Drill to convert SQL queries for use with MongoDB. It > appears that the SQL to MongoDB query translation process is not optimally > executed, leading to inefficient query operations and slow response times. > {*}Details{*}: > # {*}Inefficient Query Translation{*}: > ** The translation of SQL queries into MongoDB-specific queries by Apache > Drill seems sub optimal. This inefficiency is particularly noticeable with > complex queries, where the expected execution plan does not align with > MongoDB's capabilities, resulting in slower query performance. 
> # {*}Underutilization of MongoDB Capabilities{*}: > ** Several MongoDB functionalities are not being fully utilised in the > translation process: > *** {*}Aggregation Operations{*}: Functions like {{{}SUM(){}}}, > {{{}AVG(){}}}, {{{}MIN(){}}}, and {{MAX()}} are either poorly translated or > not utilised, leading to potential performance degradation. > *** {*}Date Handling{*}: Extraction of date components (e.g., day from an > ISO date) within queries is not handled efficiently, forcing additional > processing overhead or client-side computations. > *** {*}Count Queries{*}: Simple count operations are not optimised, possibly > translating into more complex query forms than necessary. > {*}Impact{*}: The current issues significantly affect the performance and > scalability of applications relying on Apache Drill for interacting with > MongoDB, particularly in data-heavy environments. > {*}Expected Behaviour{*}: > * Queries translated from SQL to MongoDB should utilise MongoDB's native > query capabilities more effectively, ensuring that operations such as > aggregations, date extractions, and counts are executed in the most effi
[jira] [Created] (DRILL-8491) MongoDB | Queries Conversion optimisation & using various mongoDB features
Piyush Shama created DRILL-8491: --- Summary: MongoDB | Queries Conversion optimisation & using various mongoDB features Key: DRILL-8491 URL: https://issues.apache.org/jira/browse/DRILL-8491 Project: Apache Drill Issue Type: Improvement Reporter: Piyush Shama -- This message was sent by Atlassian Jira (v8.20.10#820010)
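To make the requested count-query pushdown concrete, here is a hypothetical before/after for a simple COUNT(*) (illustrative strings only; this is not Drill's planner output, and the collection name is made up):

```java
// Sketch of the difference the reporter describes: a SQL COUNT(*) should
// translate to a server-side aggregation stage ($count) instead of streaming
// every document back and counting in the client.
public class MongoPushdownSketch {
    // Naive translation: fetch everything, count client-side.
    static String naive(String collection) {
        return "db." + collection + ".find({})  /* count in the client */";
    }

    // Pushed-down translation: let MongoDB do the counting.
    static String pushedDown(String collection) {
        return "db." + collection + ".aggregate([{ \"$count\": \"n\" }])";
    }

    public static void main(String[] args) {
        System.out.println(naive("orders"));
        System.out.println(pushedDown("orders"));
    }
}
```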
[jira] [Updated] (DRILL-8490) Sender operator fake memory leak result to sql failed and parent allocator exception
[ https://issues.apache.org/jira/browse/DRILL-8490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shihuafeng updated DRILL-8490: -- Summary: Sender operator fake memory leak result to sql failed and parent allocator exception (was: Sender operator fake memory leak result to sql failed or other exception) > Sender operator fake memory leak result to sql failed and parent allocator > exception > - > > Key: DRILL-8490 > URL: https://issues.apache.org/jira/browse/DRILL-8490 > Project: Apache Drill > Issue Type: Bug > Components: Server >Affects Versions: 1.21.1 >Reporter: shihuafeng >Priority: Major > Fix For: 1.22.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8490) Sender operator fake memory leak result to sql failed or other exception
shihuafeng created DRILL-8490: - Summary: Sender operator fake memory leak result to sql failed or other exception Key: DRILL-8490 URL: https://issues.apache.org/jira/browse/DRILL-8490 Project: Apache Drill Issue Type: Bug Components: Server Affects Versions: 1.21.1 Reporter: shihuafeng Fix For: 1.22.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (DRILL-8489) Sender memory leak when rpc encode exception
[ https://issues.apache.org/jira/browse/DRILL-8489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shihuafeng updated DRILL-8489: -- Description: When encode throw Exception, if encode msg instanceof ReferenceCounted, netty can release msg, but drill convert msg to OutboundRpcMessage, so netty can not release msg. this causes sender memory leaks exception info {code:java} 2024-04-16 16:25:57,998 [DataClient-7] ERROR o.a.d.exec.rpc.RpcExceptionHandler - Exception in RPC communication. Connection: /10.32.112.138:47924 <--> /10.32.112.138:31012 (data client). Closing connection. io.netty.handler.codec.EncoderException: org.apache.drill.exec.exception.OutOfMemoryException: Unable to allocate buffer of size 4096 due to memory limit (9223372036854775807). Current allocation: 0 at io.netty.handler.codec.MessageToMessageEncoder.write(MessageToMessageEncoder.java:107) at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:881) at io.netty.channel.AbstractChannelHandlerContext.invokeWriteAndFlush(AbstractChannelHandlerContext.java:940) at io.netty.channel.AbstractChannelHandlerContext$WriteTask.run(AbstractChannelHandlerContext.java:1247) at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173) at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166) at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470) at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569) at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.drill.exec.exception.OutOfMemoryException: Unable to allocate buffer of size 4096 due to memory limit (9223372036854775807). 
Current allocation: 0 at org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:245) at org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:220) at org.apache.drill.exec.memory.DrillByteBufAllocator.buffer(DrillByteBufAllocator.java:55) at org.apache.drill.exec.memory.DrillByteBufAllocator.buffer(DrillByteBufAllocator.java:50) at org.apache.drill.exec.rpc.RpcEncoder.encode(safeRelease.java:87) at org.apache.drill.exec.rpc.RpcEncoder.encode(RpcEncoder.java:38) at io.netty.handler.codec.MessageToMessageEncoder.write(MessageToMessageEncoder.java:90){code} > Sender memory leak when rpc encode exception > > > Key: DRILL-8489 > URL: https://issues.apache.org/jira/browse/DRILL-8489 > Project: Apache Drill > Issue Type: Bug > Components: Server >Affects Versions: 1.21.1 >Reporter: shihuafeng >Priority: Major > Fix For: 1.22.0 > > > When encode throw Exception, if encode msg instanceof ReferenceCounted, netty > can release msg, but drill convert msg to OutboundRpcMessage, so netty can > not release msg. this causes sender memory leaks > exception info > {code:java} > 2024-04-16 16:25:57,998 [DataClient-7] ERROR > o.a.d.exec.rpc.RpcExceptionHandler - Exception in RPC communication. > Connection: /10.32.112.138:47924 <--> /10.32.112.138:31012 (data client). > Closing connection. > io.netty.handler.codec.EncoderException: > org.apache.drill.exec.exception.OutOfMemoryException: Unable to allocate > buffer of size 4096 due to memory limit (9223372036854775807). 
Current > allocation: 0 > at > io.netty.handler.codec.MessageToMessageEncoder.write(MessageToMessageEncoder.java:107) > at > io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:881) > at > io.netty.channel.AbstractChannelHandlerContext.invokeWriteAndFlush(AbstractChannelHandlerContext.java:940) > at > io.netty.channel.AbstractChannelHandlerContext$WriteTask.run(AbstractChannelHandlerContext.java:1247) > at > io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173) > at > io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166) > at > io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470) > at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569) > at > io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) > at > io.netty.util.internal.ThreadExecutorMap
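A minimal model of the leak and the fix described above (plain Java with a toy reference-count class standing in for Netty's ReferenceCounted; this is not the actual Drill RpcEncoder):

```java
// If the encoder fails after taking ownership of a reference-counted payload,
// the payload must be released on the failure path or its buffer leaks --
// Netty only auto-releases messages it still recognizes as ReferenceCounted,
// and Drill's OutboundRpcMessage wrapper hides that.
public class EncodeReleaseSketch {
    static class RefCountedMsg {
        int refCnt = 1;
        void release() { refCnt--; }
    }

    static void encode(RefCountedMsg msg, boolean failAllocation) {
        try {
            if (failAllocation) {
                throw new RuntimeException("Unable to allocate buffer");
            }
            // ...write msg into the allocated buffer...
        } catch (RuntimeException e) {
            msg.release(); // without this, refCnt stays at 1 => leak
            throw e;
        }
    }

    public static void main(String[] args) {
        RefCountedMsg msg = new RefCountedMsg();
        try { encode(msg, true); } catch (RuntimeException ignored) { }
        System.out.println("refCnt after failed encode: " + msg.refCnt);
    }
}
```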
[jira] [Commented] (DRILL-8489) Sender memory leak when rpc encode exception
[ https://issues.apache.org/jira/browse/DRILL-8489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17837949#comment-17837949 ] ASF GitHub Bot commented on DRILL-8489: --- shfshihuafeng opened a new pull request, #2901: URL: https://github.com/apache/drill/pull/2901 # [DRILL-8489](https://issues.apache.org/jira/browse/DRILL-8489): Sender memory leak when rpc encode exception ## Description When encode throw Exception, if encode msg instanceof ReferenceCounted, netty can release msg, but drill convert msg to OutboundRpcMessage, so netty can not release msg. ## Documentation (Please describe user-visible changes similar to what should appear in the Drill documentation.) ## Testing 1. export DRILL_MAX_DIRECT_MEMORY=${DRILL_MAX_DIRECT_MEMORY:-"2G"} 2. tpch 1s 3. tpch sql 8 ``` select o_year, sum(case when nation = 'CHINA' then volume else 0 end) / sum(volume) as mkt_share from ( select extract(year from o_orderdate) as o_year, l_extendedprice * 1.0 as volume, n2.n_name as nation from hive.tpch1s.part, hive.tpch1s.supplier, hive.tpch1s.lineitem, hive.tpch1s.orders, hive.tpch1s.customer, hive.tpch1s.nation n1, hive.tpch1s.nation n2, hive.tpch1s.region where p_partkey = l_partkey and s_suppkey = l_suppkey and l_orderkey = o_orderkey and o_custkey = c_custkey and c_nationkey = n1.n_nationkey and n1.n_regionkey = r_regionkey and r_name = 'ASIA' and s_nationkey = n2.n_nationkey and o_orderdate between date '1995-01-01' and date '1996-12-31' and p_type = 'LARGE BRUSHED BRASS') as all_nations group by o_year order by o_year; ``` 5. 
This scenario is relatively easy to reproduce by running the following script ``` drill_home=/data/shf/apache-drill-1.22.0-SNAPSHOT/bin fileName=/data/shf/1s/shf.txt random_sql(){ #for i in `seq 1 3` while true do num=$((RANDOM%22+1)) if [ -f $fileName ]; then echo "$fileName" " exists" exit 0 else $drill_home/sqlline -u \"jdbc:drill:zk=jupiter-2:2181/drill_shf/jupiterbits_shf1\" -f tpch_sql8.sql >> sql8.log 2>&1 fi done } main(){ unset HADOOP_CLASSPATH #TPCH power test for i in `seq 1 25` do random_sql & done } main ``` > Sender memory leak when rpc encode exception > > > Key: DRILL-8489 > URL: https://issues.apache.org/jira/browse/DRILL-8489 > Project: Apache Drill > Issue Type: Bug > Components: Server >Affects Versions: 1.21.1 >Reporter: shihuafeng >Priority: Major > Fix For: 1.22.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
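Placeholder - replaced below.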
[jira] [Updated] (DRILL-8489) Sender memory leak when rpc encode exception
[ https://issues.apache.org/jira/browse/DRILL-8489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shihuafeng updated DRILL-8489: -- Summary: Sender memory leak when rpc encode exception (was: sender memory) > Sender memory leak when rpc encode exception > > > Key: DRILL-8489 > URL: https://issues.apache.org/jira/browse/DRILL-8489 > Project: Apache Drill > Issue Type: Bug > Components: Server >Affects Versions: 1.21.1 >Reporter: shihuafeng >Priority: Major > Fix For: 1.22.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8489) sender memory
shihuafeng created DRILL-8489: - Summary: sender memory Key: DRILL-8489 URL: https://issues.apache.org/jira/browse/DRILL-8489 Project: Apache Drill Issue Type: Bug Components: Server Affects Versions: 1.21.1 Reporter: shihuafeng Fix For: 1.22.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (DRILL-8488) HashJoinPOP memory leak is caused by OutOfMemoryException
[ https://issues.apache.org/jira/browse/DRILL-8488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17837525#comment-17837525 ] ASF GitHub Bot commented on DRILL-8488: --- shfshihuafeng opened a new pull request, #2900: URL: https://github.com/apache/drill/pull/2900 # [DRILL-8488](https://issues.apache.org/jira/browse/DRILL-8488): HashJoinPOP memory leak is caused by OutOfMemoryException (Please replace `PR Title` with actual PR Title) ## Description We should catch the OutOfMemoryException instead of OutOfMemoryError ``` public DrillBuf buffer(final int initialRequestSize, BufferManager manager) { assertOpen(); Preconditions.checkArgument(initialRequestSize >= 0, "the requested size must be non-negative"); if (initialRequestSize == 0) { return empty; } // round to next largest power of two if we're within a chunk since that is how our allocator operates final int actualRequestSize = initialRequestSize < CHUNK_SIZE ? nextPowerOfTwo(initialRequestSize) : initialRequestSize; AllocationOutcome outcome = allocateBytes(actualRequestSize); if (!outcome.isOk()) { **throw new OutOfMemoryException**(createErrorMsg(this, actualRequestSize, initialRequestSize)); } } ``` ## Documentation (Please describe user-visible changes similar to what should appear in the Drill documentation.) ## Testing [drill-848](https://issues.apache.org/jira/browse/DRILL-8485)) > HashJoinPOP memory leak is caused by OutOfMemoryException > -- > > Key: DRILL-8488 > URL: https://issues.apache.org/jira/browse/DRILL-8488 > Project: Apache Drill > Issue Type: Bug > Components: Server >Affects Versions: 1.21.1 >Reporter: shihuafeng >Priority: Major > Fix For: 1.22.0 > > > [DRILL-8485|[DRILL-8485] HashJoinPOP memory leak is caused by an oom > exception when read data from InputStream - ASF JIRA (apache.org)] -- This message was sent by Atlassian Jira (v8.20.10#820010)
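The one-line cause behind this PR: Drill's allocator throws org.apache.drill.exec.exception.OutOfMemoryException, an unchecked exception, not java.lang.OutOfMemoryError, so a handler written for the Error never runs and cleanup is skipped. A self-contained sketch (the nested exception class below stands in for Drill's):

```java
// Catching OutOfMemoryError does not catch an allocator's OutOfMemoryException:
// they sit in unrelated branches of the Throwable hierarchy.
public class OomCatchSketch {
    static class OutOfMemoryException extends RuntimeException {
        OutOfMemoryException(String m) { super(m); }
    }

    static String handle() {
        try {
            throw new OutOfMemoryException("Unable to allocate buffer of size 4096");
        } catch (OutOfMemoryError e) {      // never matches: it's an Exception, not an Error
            return "freed via Error handler";
        } catch (OutOfMemoryException e) {  // this is the branch the fix adds
            return "freed via Exception handler";
        }
    }

    public static void main(String[] args) {
        System.out.println(handle());
    }
}
```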
[jira] [Updated] (DRILL-8488) HashJoinPOP memory leak is caused by OutOfMemoryException
[ https://issues.apache.org/jira/browse/DRILL-8488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shihuafeng updated DRILL-8488: -- Summary: HashJoinPOP memory leak is caused by OutOfMemoryException (was: HashJoinPOP memory leak is caused by an oom exception) > HashJoinPOP memory leak is caused by OutOfMemoryException > -- > > Key: DRILL-8488 > URL: https://issues.apache.org/jira/browse/DRILL-8488 > Project: Apache Drill > Issue Type: Bug > Components: Server >Affects Versions: 1.21.1 >Reporter: shihuafeng >Priority: Major > Fix For: 1.22.0 > > > [DRILL-8485|[DRILL-8485] HashJoinPOP memory leak is caused by an oom > exception when read data from InputStream - ASF JIRA (apache.org)] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8488) HashJoinPOP memory leak is caused by an oom exception
shihuafeng created DRILL-8488: - Summary: HashJoinPOP memory leak is caused by an oom exception Key: DRILL-8488 URL: https://issues.apache.org/jira/browse/DRILL-8488 Project: Apache Drill Issue Type: Bug Components: Server Affects Versions: 1.21.1 Reporter: shihuafeng Fix For: 1.22.0 [DRILL-8485|[DRILL-8485] HashJoinPOP memory leak is caused by an oom exception when read data from InputStream - ASF JIRA (apache.org)] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (DRILL-8446) Incorrect use of OperatingSystemMXBean
[ https://issues.apache.org/jira/browse/DRILL-8446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Turton closed DRILL-8446. --- Resolution: Fixed > Incorrect use of OperatingSystemMXBean > -- > > Key: DRILL-8446 > URL: https://issues.apache.org/jira/browse/DRILL-8446 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.21.1 >Reporter: Mahmoud Ouali Alami >Assignee: James Turton >Priority: Major > Fix For: 1.21.2 > > Attachments: image-2023-07-04-15-36-42-905.png, > image-2023-07-04-16-24-59-662.png > > > *Context :* > In Drill "CpuGaugeSet" class, we use an internal class instead of a public > class : com.sun.management.OperatingSystemMXBean; > !image-2023-07-04-15-36-42-905.png|width=387,height=257! > This can result to a NoClassDefFoundError: > !image-2023-07-04-16-24-59-662.png|width=845,height=108! > *To do :* > Replace the private class "com.sun.managemenet.OperatingSystemMXBean" with > "java.lang.management.OperatingSystemMXBean", > > Kind regards, > Mahmoud > -- This message was sent by Atlassian Jira (v8.20.10#820010)
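The replacement API named in the issue is plain JDK and needs no internal classes. A minimal, runnable illustration of reading CPU facts through the public interface (a sketch, not Drill's actual CpuGaugeSet):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Depend only on the public java.lang.management interface, which every JVM
// provides, instead of casting to the internal
// com.sun.management.OperatingSystemMXBean (absent on some runtimes, hence
// the NoClassDefFoundError in the report).
public class CpuGaugeSketch {
    public static void main(String[] args) {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        System.out.println("arch=" + os.getArch()
            + " cpus=" + os.getAvailableProcessors()
            + " load=" + os.getSystemLoadAverage()); // -1.0 where unsupported
    }
}
```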
[jira] [Closed] (DRILL-8479) Merge Join Memory Leak Depleting Incoming Batches Throw Exception
[ https://issues.apache.org/jira/browse/DRILL-8479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Turton closed DRILL-8479. --- Resolution: Fixed > Merge Join Memory Leak Depleting Incoming Batches Throw Exception > - > > Key: DRILL-8479 > URL: https://issues.apache.org/jira/browse/DRILL-8479 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Affects Versions: 1.21.1 >Reporter: shihuafeng >Priority: Critical > Fix For: 1.21.2 > > Attachments: 0001-mergejoin-leak.patch > > > *Describe the bug* > mergejoin leaks when RecordIterator allocate memory exception with > OutOfMemoryException{*}{*} > {*}Steps to reproduce the behavior{*}: > # prepare data for tpch 1s > # set direct memory 5g > # set planner.enable_hashjoin =false to ensure use mergejoin operator. > # set drill.memory.debug.allocator =true (Check for memory leaks ) > # 20 concurrent for tpch sql8 > # when it had OutOfMemoryException or null EXCEPTION , stopped all sql. > # finding memory leak > *Expected behavior* > when all sql stop , we should find direct memory is 0 AND could not > find leak log like following. > {code:java} > Allocator(op:2:0:11:MergeJoinPOP) 100/73728/4874240/100 > (res/actual/peak/limit){code} > *Error detail, log output or screenshots* > {code:java} > Unable to allocate buffer of size XX (rounded from XX) due to memory limit > (). 
Current allocation: xx{code} > [^0001-mergejoin-leak.patch] > sql > {code:java} > // code placeholder > select o_year, sum(case when nation = 'CHINA' then volume else 0 end) / > sum(volume) as mkt_share from ( select extract(year from o_orderdate) as > o_year, l_extendedprice * 1.0 as volume, n2.n_name as nation from > hive.tpch1s.part, hive.tpch1s.supplier, hive.tpch1s.lineitem, > hive.tpch1s.orders, hive.tpch1s.customer, hive.tpch1s.nation n1, > hive.tpch1s.nation n2, hive.tpch1s.region where p_partkey = l_partkey and > s_suppkey = l_suppkey and l_orderkey = o_orderkey and o_custkey = c_custkey > and c_nationkey = n1.n_nationkey and n1.n_regionkey = r_regionkey and r_name > = 'ASIA' and s_nationkey = n2.n_nationkey and o_orderdate between date > '1995-01-01' and date '1996-12-31' and p_type = 'LARGE BRUSHED BRASS') as > all_nations group by o_year order by o_year > {code} > > -- This message was sent by Atlassian Jira (v8.20.10#820010)