[jira] [Commented] (DRILL-8503) Add Configuration Option to Skip Host Validation for Splunk

2024-07-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17868406#comment-17868406
 ] 

ASF GitHub Bot commented on DRILL-8503:
---

cgivre merged PR #2927:
URL: https://github.com/apache/drill/pull/2927




> Add Configuration Option to Skip Host Validation for Splunk
> ---
>
> Key: DRILL-8503
> URL: https://issues.apache.org/jira/browse/DRILL-8503
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Splunk
>Affects Versions: 1.21.2
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 1.22.0
>
>
> This PR adds an option to skip host validation for SSL connections to Splunk. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8502) Some boot options with drill.exec.options prefix are missed in configuration options

2024-07-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17868404#comment-17868404
 ] 

ASF GitHub Bot commented on DRILL-8502:
---

jnturton commented on PR #2923:
URL: https://github.com/apache/drill/pull/2923#issuecomment-2248307027

   Sorry I missed this at the time. Thanks for the cleanup.




> Some boot options with drill.exec.options prefix are missed in configuration 
> options
> 
>
> Key: DRILL-8502
> URL: https://issues.apache.org/jira/browse/DRILL-8502
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.21.2
>Reporter: Maksym Rymar
>Assignee: Maksym Rymar
>Priority: Minor
> Fix For: 1.22.0
>
>
> Drill has boot options with {{drill.exec.options}} prefix which are missed in 
> configuration options. It can be easily checked by comparing the system 
> tables:
> {code:java}
> apache drill> select name from sys.boot where name like 'drill.exec.options%' 
> AND name not in (select concat('drill.exec.options.', name) from 
> sys.internal_options union all select concat('drill.exec.options.', name) 
> from sys.options);
> +-------------------------------------------------------+
> | name                                                  |
> +-------------------------------------------------------+
> | drill.exec.options.drill.exec.testing.controls        |
> | drill.exec.options.exec.hashagg.max_batches_in_memory |
> | drill.exec.options.exec.hashagg.num_rows_in_batch     |
> | drill.exec.options.exec.hashjoin.mem_limit            |
> | drill.exec.options.exec.return_result_set_for_ddl     |
> +-------------------------------------------------------+{code}
> Expected – empty result.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8503) Add Configuration Option to Skip Host Validation for Splunk

2024-07-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17868366#comment-17868366
 ] 

ASF GitHub Bot commented on DRILL-8503:
---

cgivre opened a new pull request, #2927:
URL: https://github.com/apache/drill/pull/2927

   # [DRILL-8503](https://issues.apache.org/jira/browse/DRILL-8503): Add 
Configuration Option to Skip Host Validation for Splunk
   
   ## Description
   In corporate installations, organizations sometimes use self-signed 
certificates, which can cause connection problems.  This PR adds an option to 
bypass host validation for SSL connections to Splunk.  The "correct" way to fix 
this would be to provide better SSL information by including the certificate in 
the Splunk connection; however, Splunk's SDK does not allow for this, and there 
are several open issues and PRs relating to that limitation.  
   
   This PR also bumps the Splunk SDK to the latest version, 1.9.5.  
   
   ## Documentation
   An additional configuration option, `validateHostnames`, has been added to 
the Splunk configuration.  I updated the README.md file with this information 
and will update the documentation once this has been merged.
   
   ## Testing
   Ran existing unit tests and tested manually. 
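
   For reference, the new setting would sit in the Splunk storage plugin 
configuration roughly as below.  This is a hedged sketch: only 
`validateHostnames` is confirmed by this PR; the surrounding field names and 
values are illustrative placeholders, not the plugin's full schema.

```json
{
  "type": "splunk",
  "scheme": "https",
  "hostname": "splunk.example.com",
  "port": 8089,
  "validateHostnames": false,
  "enabled": true
}
```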




> Add Configuration Option to Skip Host Validation for Splunk
> ---
>
> Key: DRILL-8503
> URL: https://issues.apache.org/jira/browse/DRILL-8503
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Splunk
>Affects Versions: 1.21.2
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 1.22.0
>
>
> This PR adds an option to skip host validation for SSL connections to Splunk. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (DRILL-8503) Add Configuration Option to Skip Host Validation for Splunk

2024-07-24 Thread Charles Givre (Jira)
Charles Givre created DRILL-8503:


 Summary: Add Configuration Option to Skip Host Validation for 
Splunk
 Key: DRILL-8503
 URL: https://issues.apache.org/jira/browse/DRILL-8503
 Project: Apache Drill
  Issue Type: Improvement
  Components: Storage - Splunk
Affects Versions: 1.21.2
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 1.22.0


This PR adds an option to skip host validation for SSL connections to Splunk. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (DRILL-8502) Some boot options with drill.exec.options prefix are missed in configuration options

2024-07-24 Thread Maksym Rymar (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maksym Rymar resolved DRILL-8502.
-
Fix Version/s: 1.22.0
   Resolution: Fixed

Merged to master: https://github.com/apache/drill/pull/2923

> Some boot options with drill.exec.options prefix are missed in configuration 
> options
> 
>
> Key: DRILL-8502
> URL: https://issues.apache.org/jira/browse/DRILL-8502
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.21.2
>Reporter: Maksym Rymar
>Assignee: Maksym Rymar
>Priority: Minor
> Fix For: 1.22.0
>
>
> Drill has boot options with {{drill.exec.options}} prefix which are missed in 
> configuration options. It can be easily checked by comparing the system 
> tables:
> {code:java}
> apache drill> select name from sys.boot where name like 'drill.exec.options%' 
> AND name not in (select concat('drill.exec.options.', name) from 
> sys.internal_options union all select concat('drill.exec.options.', name) 
> from sys.options);
> +-------------------------------------------------------+
> | name                                                  |
> +-------------------------------------------------------+
> | drill.exec.options.drill.exec.testing.controls        |
> | drill.exec.options.exec.hashagg.max_batches_in_memory |
> | drill.exec.options.exec.hashagg.num_rows_in_batch     |
> | drill.exec.options.exec.hashjoin.mem_limit            |
> | drill.exec.options.exec.return_result_set_for_ddl     |
> +-------------------------------------------------------+{code}
> Expected – empty result.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8502) Some boot options with drill.exec.options prefix are missed in configuration options

2024-07-18 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17867138#comment-17867138
 ] 

ASF GitHub Bot commented on DRILL-8502:
---

cgivre merged PR #2923:
URL: https://github.com/apache/drill/pull/2923




> Some boot options with drill.exec.options prefix are missed in configuration 
> options
> 
>
> Key: DRILL-8502
> URL: https://issues.apache.org/jira/browse/DRILL-8502
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.21.2
>Reporter: Maksym Rymar
>Assignee: Maksym Rymar
>Priority: Minor
>
> Drill has boot options with {{drill.exec.options}} prefix which are missed in 
> configuration options. It can be easily checked by comparing the system 
> tables:
> {code:java}
> apache drill> select name from sys.boot where name like 'drill.exec.options%' 
> AND name not in (select concat('drill.exec.options.', name) from 
> sys.internal_options union all select concat('drill.exec.options.', name) 
> from sys.options);
> +-------------------------------------------------------+
> | name                                                  |
> +-------------------------------------------------------+
> | drill.exec.options.drill.exec.testing.controls        |
> | drill.exec.options.exec.hashagg.max_batches_in_memory |
> | drill.exec.options.exec.hashagg.num_rows_in_batch     |
> | drill.exec.options.exec.hashjoin.mem_limit            |
> | drill.exec.options.exec.return_result_set_for_ddl     |
> +-------------------------------------------------------+{code}
> Expected – empty result.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8502) Some boot options with drill.exec.options prefix are missed in configuration options

2024-07-18 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17867006#comment-17867006
 ] 

ASF GitHub Bot commented on DRILL-8502:
---

rymarm commented on PR #2923:
URL: https://github.com/apache/drill/pull/2923#issuecomment-2236471316

   @cgivre yes, sure. I've updated the PR with Jira ticket information. 




> Some boot options with drill.exec.options prefix are missed in configuration 
> options
> 
>
> Key: DRILL-8502
> URL: https://issues.apache.org/jira/browse/DRILL-8502
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.21.2
>Reporter: Maksym Rymar
>Assignee: Maksym Rymar
>Priority: Minor
>
> Drill has boot options with {{drill.exec.options}} prefix which are missed in 
> configuration options. It can be easily checked by comparing the system 
> tables:
> {code:java}
> apache drill> select name from sys.boot where name like 'drill.exec.options%' 
> AND name not in (select concat('drill.exec.options.', name) from 
> sys.internal_options union all select concat('drill.exec.options.', name) 
> from sys.options);
> +-------------------------------------------------------+
> | name                                                  |
> +-------------------------------------------------------+
> | drill.exec.options.drill.exec.testing.controls        |
> | drill.exec.options.exec.hashagg.max_batches_in_memory |
> | drill.exec.options.exec.hashagg.num_rows_in_batch     |
> | drill.exec.options.exec.hashjoin.mem_limit            |
> | drill.exec.options.exec.return_result_set_for_ddl     |
> +-------------------------------------------------------+{code}
> Expected – empty result.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (DRILL-8502) Some boot options with drill.exec.options prefix are missed in configuration options

2024-07-18 Thread Maksym Rymar (Jira)
Maksym Rymar created DRILL-8502:
---

 Summary: Some boot options with drill.exec.options prefix are 
missed in configuration options
 Key: DRILL-8502
 URL: https://issues.apache.org/jira/browse/DRILL-8502
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.21.2
Reporter: Maksym Rymar
Assignee: Maksym Rymar


Drill has boot options with the {{drill.exec.options}} prefix that are missing 
from the configuration options. This can easily be checked by comparing the 
system tables:
{code:java}
apache drill> select name from sys.boot where name like 'drill.exec.options%' 
AND name not in (select concat('drill.exec.options.', name) from 
sys.internal_options union all select concat('drill.exec.options.', name) from 
sys.options);
+-------------------------------------------------------+
| name                                                  |
+-------------------------------------------------------+
| drill.exec.options.drill.exec.testing.controls        |
| drill.exec.options.exec.hashagg.max_batches_in_memory |
| drill.exec.options.exec.hashagg.num_rows_in_batch     |
| drill.exec.options.exec.hashjoin.mem_limit            |
| drill.exec.options.exec.return_result_set_for_ddl     |
+-------------------------------------------------------+{code}
Expected – empty result.
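
For context, options under the {{drill.exec.options}} scope are declared in 
Drill's HOCON boot configuration. A minimal sketch of how such entries could be 
declared (the keys are taken from the query output above; the values shown are 
illustrative assumptions, not Drill's actual defaults):

```hocon
// Hypothetical drill-override.conf fragment; keys from the query output
// above, values are placeholders.
drill.exec.options: {
    exec.hashjoin.mem_limit: 0,
    exec.return_result_set_for_ddl: true
}
```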



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8316) Convert Druid Storage Plugin to EVF & V2 JSON Reader

2024-07-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17865827#comment-17865827
 ] 

ASF GitHub Bot commented on DRILL-8316:
---

cgivre commented on PR #2657:
URL: https://github.com/apache/drill/pull/2657#issuecomment-2227577005

   @jnturton 
   Could you review this?  I realized that this has been languishing, and we 
might as well merge it if it can be.
   The one area I'm a little hesitant about is the ScanBatchCreator.  Since I 
didn't write this storage plugin and it was a bit more complicated than some of 
the ones I've written, I'd like another set of eyes on it. 




> Convert Druid Storage Plugin to EVF & V2 JSON Reader
> 
>
> Key: DRILL-8316
> URL: https://issues.apache.org/jira/browse/DRILL-8316
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Druid
>Affects Versions: 1.20.2
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8492) Allow Parquet TIME_MICROS and TIMESTAMP_MICROS columns to be read as 64-bit integer values

2024-07-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17865788#comment-17865788
 ] 

ASF GitHub Bot commented on DRILL-8492:
---

cgivre commented on PR #2907:
URL: https://github.com/apache/drill/pull/2907#issuecomment-2227415938

   @jnturton Can we merge this?




> Allow Parquet TIME_MICROS and TIMESTAMP_MICROS  columns to be read as 64-bit 
> integer values
> ---
>
> Key: DRILL-8492
> URL: https://issues.apache.org/jira/browse/DRILL-8492
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Parquet
>Affects Versions: 1.21.1
>Reporter: Peter Franzen
>Priority: Major
>
> When reading Parquet columns of type {{time_micros}} and 
> {{{}timestamp_micros{}}}, Drill truncates the microsecond values to 
> milliseconds in order to convert them to SQL timestamps.
> It is currently not possible to read the original microsecond values (as 
> 64-bit values, not SQL timestamps) through Drill.
> One solution for allowing reading the original 64-bit values is to add two 
> options similar to “store.parquet.reader.int96_as_timestamp" to control 
> whether microsecond
> times and timestamps are truncated to millisecond timestamps or read as 
> non-truncated 64-bit values.
> These options would be added to {{org.apache.drill.exec.ExecConstants}} and
> {{{}org.apache.drill.exec.server.options.SystemOptionManager{}}}.
> They would also be added to "drill-module.conf":
> {{   store.parquet.reader.time_micros_as_int64: false,}}
> {{   store.parquet.reader.timestamp_micros_as_int64: false,}}
> These options would then be used in the same places as 
> {{{}store.parquet.reader.int96_as_timestamp{}}}:
>  * org.apache.drill.exec.store.parquet.columnreaders.ColumnReaderFactory
>  * 
> org.apache.drill.exec.store.parquet.columnreaders.ParquetToDrillTypeConverter
>  * org.apache.drill.exec.store.parquet2.DrillParquetGroupConverter
> to create an int64 reader instead of a time/timestamp reader when the 
> corresponding option is set to true.
> In addition to this, 
> {{org.apache.drill.exec.store.parquet.metadata.FileMetadataCollector}} must 
> be altered to _not_ truncate the min and max values for 
> time_micros/timestamp_micros if the corresponding option is true. This class 
> doesn’t have a reference to an {{{}OptionManager{}}}, so the two new options 
> must be extracted from the {{OptionManager}} when the {{ParquetReaderConfig}} 
> instance is created.
> Filtering on microsecond columns would be done using 64-bit values rather 
> than TIME/TIMESTAMP values when the new options are true, e.g.
> {{SELECT *  FROM  WHERE  = 1705914906694751;}}
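
If implemented as proposed, the new options would be toggled like any other 
Drill option. A hedged sketch of the intended usage (the option names are as 
proposed in the ticket and not yet merged; `t` and `ts_col` are placeholder 
table and column names):

```sql
-- Proposed options from this ticket (not yet merged); `t` and `ts_col`
-- are placeholder names.
ALTER SESSION SET `store.parquet.reader.time_micros_as_int64` = true;
ALTER SESSION SET `store.parquet.reader.timestamp_micros_as_int64` = true;

-- Filtering on a microsecond column as a 64-bit value, per the ticket:
SELECT ts_col FROM t WHERE ts_col = 1705914906694751;
```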



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8501) Json Conversion UDF Not Respecting System JSON Options

2024-07-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17864245#comment-17864245
 ] 

ASF GitHub Bot commented on DRILL-8501:
---

cgivre merged PR #2921:
URL: https://github.com/apache/drill/pull/2921




> Json Conversion UDF Not Respecting System JSON Options
> --
>
> Key: DRILL-8501
> URL: https://issues.apache.org/jira/browse/DRILL-8501
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - JSON
>Affects Versions: 1.21.2
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 1.22.0
>
>
> The convert_fromJSON() UDF does not respect the system JSON options of 
> allTextMode and readAllNumbersAsDouble.  
> This PR fixes that.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8501) Json Conversion UDF Not Respecting System JSON Options

2024-07-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17864237#comment-17864237
 ] 

ASF GitHub Bot commented on DRILL-8501:
---

jnturton commented on PR #2921:
URL: https://github.com/apache/drill/pull/2921#issuecomment-2217905543

   Oh that's great, thanks for the enhancements +1.




> Json Conversion UDF Not Respecting System JSON Options
> --
>
> Key: DRILL-8501
> URL: https://issues.apache.org/jira/browse/DRILL-8501
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - JSON
>Affects Versions: 1.21.2
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 1.22.0
>
>
> The convert_fromJSON() UDF does not respect the system JSON options of 
> allTextMode and readAllNumbersAsDouble.  
> This PR fixes that.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8501) Json Conversion UDF Not Respecting System JSON Options

2024-07-07 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17863589#comment-17863589
 ] 

ASF GitHub Bot commented on DRILL-8501:
---

cgivre commented on PR #2921:
URL: https://github.com/apache/drill/pull/2921#issuecomment-2212491446

   @jnturton I added new versions of the UDF so that the user can specify in 
the function call whether they want `allTextMode` and the other option.  




> Json Conversion UDF Not Respecting System JSON Options
> --
>
> Key: DRILL-8501
> URL: https://issues.apache.org/jira/browse/DRILL-8501
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - JSON
>Affects Versions: 1.21.2
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 1.22.0
>
>
> The convert_fromJSON() UDF does not respect the system JSON options of 
> allTextMode and readAllNumbersAsDouble.  
> This PR fixes that.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8501) Json Conversion UDF Not Respecting System JSON Options

2024-07-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17863534#comment-17863534
 ] 

ASF GitHub Bot commented on DRILL-8501:
---

cgivre commented on PR #2921:
URL: https://github.com/apache/drill/pull/2921#issuecomment-2212305704

   > Before I approve, did you consider making these JSON parsing settings 
parameters of the function itself? It feels odd to me that `store.json.*` 
settings could influence UDFs too. I'm not sure why they aren't storage plugin 
config, rather than global config, in the first place...
   
   @jnturton I thought about doing exactly what you're describing.  Here's the 
thing.  We started some work a while ago to get rid of all the non-EVF2 readers 
in Drill.  It turns out that there are a few places which still use the old 
non-EVF JSON reader.  Specifically, this UDF, the Druid Storage Plugin and the 
MongoDB storage plugin.   I started work on 
[Drill-8316](https://github.com/apache/drill/pull/2657) which addresses the 
Druid plugin and [Drill-8329](https://github.com/apache/drill/pull/2567) 
addresses converting the UDF.  Neither of these was a high priority, so 
they're kind of sitting at the moment. 
   
   I agree with your premise that having global settings for file formats 
(including Parquet) is not the best idea.
   





> Json Conversion UDF Not Respecting System JSON Options
> --
>
> Key: DRILL-8501
> URL: https://issues.apache.org/jira/browse/DRILL-8501
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - JSON
>Affects Versions: 1.21.2
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 1.22.0
>
>
> The convert_fromJSON() UDF does not respect the system JSON options of 
> allTextMode and readAllNumbersAsDouble.  
> This PR fixes that.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8501) Json Conversion UDF Not Respecting System JSON Options

2024-07-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17863481#comment-17863481
 ] 

ASF GitHub Bot commented on DRILL-8501:
---

jnturton commented on PR #2921:
URL: https://github.com/apache/drill/pull/2921#issuecomment-2211746498

   Before I approve, did you consider making these JSON parsing settings 
parameters of the function itself? It feels odd to me that `store.json.*` 
settings could influence UDFs too. I'm not sure why they aren't storage plugin 
config, rather than global config, in the first place...




> Json Conversion UDF Not Respecting System JSON Options
> --
>
> Key: DRILL-8501
> URL: https://issues.apache.org/jira/browse/DRILL-8501
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - JSON
>Affects Versions: 1.21.2
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 1.22.0
>
>
> The convert_fromJSON() UDF does not respect the system JSON options of 
> allTextMode and readAllNumbersAsDouble.  
> This PR fixes that.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8474) Add Daffodil Format Plugin

2024-07-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17863373#comment-17863373
 ] 

ASF GitHub Bot commented on DRILL-8474:
---

mbeckerle commented on code in PR #2909:
URL: https://github.com/apache/drill/pull/2909#discussion_r1666968774


##
contrib/format-daffodil/src/test/java/org/apache/drill/exec/store/daffodil/TestDaffodilReader.java:
##
@@ -0,0 +1,250 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.daffodil;
+
+import org.apache.drill.categories.RowSetTest;
+import org.apache.drill.common.types.TypeProtos.MinorType;
+import org.apache.drill.exec.physical.rowSet.RowSet;
+import org.apache.drill.exec.physical.rowSet.RowSetReader;
+import org.apache.drill.exec.record.metadata.SchemaBuilder;
+import org.apache.drill.exec.record.metadata.TupleMetadata;
+import org.apache.drill.test.ClusterFixture;
+import org.apache.drill.test.ClusterTest;
+import org.apache.drill.test.QueryBuilder;
+import org.apache.drill.test.rowSet.RowSetComparison;
+import org.junit.BeforeClass;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+
+import java.nio.file.Paths;
+
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertFalse;
+
+@Category(RowSetTest.class)
+public class TestDaffodilReader extends ClusterTest {
+
+  String schemaURIRoot = 
"file:///opt/drill/contrib/format-daffodil/src/test/resources/";

Review Comment:
   What, exactly, do I change this to, if I want to retrieve files from 
$DRILL_CONFIG_DIR/lib ? 



##
contrib/format-daffodil/src/test/java/org/apache/drill/exec/store/daffodil/TestDaffodilReader.java:
##
@@ -0,0 +1,250 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.daffodil;
+
+import org.apache.drill.categories.RowSetTest;
+import org.apache.drill.common.types.TypeProtos.MinorType;
+import org.apache.drill.exec.physical.rowSet.RowSet;
+import org.apache.drill.exec.physical.rowSet.RowSetReader;
+import org.apache.drill.exec.record.metadata.SchemaBuilder;
+import org.apache.drill.exec.record.metadata.TupleMetadata;
+import org.apache.drill.test.ClusterFixture;
+import org.apache.drill.test.ClusterTest;
+import org.apache.drill.test.QueryBuilder;
+import org.apache.drill.test.rowSet.RowSetComparison;
+import org.junit.BeforeClass;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+
+import java.nio.file.Paths;
+
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertFalse;
+
+@Category(RowSetTest.class)
+public class TestDaffodilReader extends ClusterTest {
+
+  String schemaURIRoot = 
"file:///opt/drill/contrib/format-daffodil/src/test/resources/";
+
+  @BeforeClass
+  public static void setup() throws Exception {
+// boilerplate call to start test rig
+ClusterTest.startCluster(ClusterFixture.builder(dirTestWatcher));
+
+DaffodilFormatConfig formatConfig = new DaffodilFormatConfig(null, "", "", 
"", false);
+
+cluster.defineFormat("dfs", "daffodil", formatConfig);
+
+// Needed to test against compressed files.
+// Copies data from src/test/resources to the dfs root.
+dirTestWatcher.copyResourceToRoot(Paths.get("data/

[jira] [Commented] (DRILL-8490) Sender operator fake memory leak result to sql failed and memory statistics error when ChannelClosedException

2024-07-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17863218#comment-17863218
 ] 

ASF GitHub Bot commented on DRILL-8490:
---

cgivre merged PR #2917:
URL: https://github.com/apache/drill/pull/2917




> Sender operator fake memory leak result to sql failed  and memory statistics 
> error when ChannelClosedException
> --
>
> Key: DRILL-8490
> URL: https://issues.apache.org/jira/browse/DRILL-8490
> Project: Apache Drill
>  Issue Type: Bug
>  Components:  Server
>Affects Versions: 1.21.1
>Reporter: shihuafeng
>Priority: Major
> Fix For: 1.22.0
>
>
> *1. Description*
> When a ChannelClosedException occurs, ReconnectingConnection#CloseHandler 
> releases the sendingAccountor reference counter before Netty releases the 
> buffer, so the operator is closed before its memory is released by Netty.
>  
> *2. Exception info*
>  
> 2024-04-13 08:45:39,909 [DataClient-3] WARN  
> o.apache.drill.exec.rpc.RequestIdMap - Failure while attempting to fail rpc 
> response.
> java.lang.IllegalArgumentException: Self-suppression not permitted
>         at java.lang.Throwable.addSuppressed(Throwable.java:1072)
>         at 
> org.apache.drill.common.DeferredException.addException(DeferredException.java:88)
>         at 
> org.apache.drill.common.DeferredException.addThrowable(DeferredException.java:97)
>         at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.fail(FragmentExecutor.java:502)
>         at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.access$400(FragmentExecutor.java:131)
>         at 
> org.apache.drill.exec.work.fragment.FragmentExecutor$ExecutorStateImpl.fail(FragmentExecutor.java:518)
>         at 
> org.apache.drill.exec.ops.FragmentContextImpl.fail(FragmentContextImpl.java:298)
>         at 
> org.apache.drill.exec.ops.FragmentContextImpl$1.accept(FragmentContextImpl.java:152)
>         at 
> org.apache.drill.exec.ops.FragmentContextImpl$1.accept(FragmentContextImpl.java:149)
>         at 
> org.apache.drill.exec.ops.DataTunnelStatusHandler.failed(DataTunnelStatusHandler.java:45)
>         at 
> org.apache.drill.exec.rpc.data.DataTunnel$ThrottlingOutcomeListener.failed(DataTunnel.java:125)
>         at 
> org.apache.drill.exec.rpc.RequestIdMap$RpcListener.setException(RequestIdMap.java:145)
>         at 
> org.apache.drill.exec.rpc.RequestIdMap$SetExceptionProcedure.apply(RequestIdMap.java:78)
>         at 
> org.apache.drill.exec.rpc.RequestIdMap$SetExceptionProcedure.apply(RequestIdMap.java:68)
>         at 
> com.carrotsearch.hppc.IntObjectHashMap.forEach(IntObjectHashMap.java:692)
>         at 
> org.apache.drill.exec.rpc.RequestIdMap.channelClosed(RequestIdMap.java:64)
>         at 
> org.apache.drill.exec.rpc.AbstractRemoteConnection.channelClosed(AbstractRemoteConnection.java:192)
>         at 
> org.apache.drill.exec.rpc.AbstractClientConnection.channelClosed(AbstractClientConnection.java:97)
>         at 
> org.apache.drill.exec.rpc.RpcBus$ChannelClosedHandler.operationComplete(RpcBus.java:158)
>         at 
> org.apache.drill.exec.rpc.RpcBus$ChannelClosedHandler.operationComplete(RpcBus.java:135)
>         at 
> org.apache.drill.exec.rpc.ReconnectingConnection$CloseHandler.operationComplete(ReconnectingConnection.java:205)
>         at 
> org.apache.drill.exec.rpc.ReconnectingConnection$CloseHandler.operationComplete(ReconnectingConnection.java:192)
>         at 
> io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:578)
>         at 
> io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:552)
>         at 
> io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:491)
>         at 
> io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:616)
>         at 
> io.netty.util.concurrent.DefaultPromise.setSuccess0(DefaultPromise.java:605)
>         at 
> io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:104)
>         at 
> io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:84)
>         at 
> io.netty.channel.AbstractChannel$CloseFuture.setClosed(AbstractChannel.java:1164)
>         at 
> io.netty.channel.AbstractChannel$AbstractUnsafe.doClose0(AbstractChannel.java:755)
>  
>         at 
> io.netty.channel.AbstractChannel$AbstractUnsafe.close(AbstractChannel.java:731)
>         at 
> io.netty.channel.AbstractChannel$AbstractUnsafe.handleWriteError(AbstractChannel.java:950)
>

[jira] [Commented] (DRILL-8501) Json Conversion UDF Not Respecting System JSON Options

2024-07-04 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17863057#comment-17863057
 ] 

ASF GitHub Bot commented on DRILL-8501:
---

cgivre opened a new pull request, #2921:
URL: https://github.com/apache/drill/pull/2921

   # [DRILL-8501](https://issues.apache.org/jira/browse/DRILL-): Json 
Conversion UDF Not Respecting System JSON Options
   
   ## Description
   The `convert_fromJSON()` function was ignoring Drill system configuration 
variables for reading JSON.  This PR adds support for `allTextMode` and 
`readNumbersAsDouble` to this function.  Once merged, the `convert_fromJSON()` 
function will follow the system settings.
   
   I also split one of the unit test files because it had all the UDF tests 
mixed with NaN tests. 
   
   ## Documentation
   No user facing changes.
   
   ## Testing
   Added unit tests.  




> Json Conversion UDF Not Respecting System JSON Options
> --
>
> Key: DRILL-8501
> URL: https://issues.apache.org/jira/browse/DRILL-8501
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - JSON
>Affects Versions: 1.21.2
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 1.22.0
>
>
> The convert_fromJSON() UDF does not respect the system JSON options of 
> allTextMode and readAllNumbersAsDouble.  
> This PR fixes that.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (DRILL-8501) Json Conversion UDF Not Respecting System JSON Options

2024-07-04 Thread Charles Givre (Jira)
Charles Givre created DRILL-8501:


 Summary: Json Conversion UDF Not Respecting System JSON Options
 Key: DRILL-8501
 URL: https://issues.apache.org/jira/browse/DRILL-8501
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - JSON
Affects Versions: 1.21.2
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 1.22.0


The convert_fromJSON() UDF does not respect the system JSON options of 
allTextMode and readAllNumbersAsDouble.  

This PR fixes that.





[jira] [Updated] (DRILL-8500) review 3rd party source code borrowed into Apache Drill

2024-06-26 Thread PJ Fanning (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

PJ Fanning updated DRILL-8500:
--
Description: 
based on the comment:
https://github.com/apache/drill/pull/2918#pullrequestreview-2141938793

Any source that Apache Drill has borrowed from a 3rd party code base needs to 
be documented in our LICENSE and possibly NOTICE (if that 3rd party code base 
has a NOTICE file - we need to copy its contents into ours).

I used https://github.com/scanoss/sbom-workbench to look at the Drill source 
and there are files that we should investigate.
In general, the biggest issues seem to be with files in the 'contrib' area and 
a lot of them are Javascript files. Also test data files: many are binaries, and 
the SBOM Workbench tool suspects that some of them have licensing 
implications.

  was:
based on the comment:
https://github.com/apache/drill/pull/2918#pullrequestreview-2141938793

Any source that Apache Drill has borrowed from a 3rd party code base needs to 
be documented in our LICENSE and possibly NOTICE (if that 3rd party code base 
has a NOTICE file - we need to copy its contents into ours).

I used https://github.com/scanoss/sbom-workbench to look at the Drill source 
and there are files that we should investigate.
In general, the biggest issues seem to be with files in the 'contrib' area and 
a lot of them are Javascript files.


> review 3rd party source code borrowed into Apache Drill
> ---
>
> Key: DRILL-8500
> URL: https://issues.apache.org/jira/browse/DRILL-8500
> Project: Apache Drill
>  Issue Type: Task
>Reporter: PJ Fanning
>Priority: Major
>
> based on the comment:
> https://github.com/apache/drill/pull/2918#pullrequestreview-2141938793
> Any source that Apache Drill has borrowed from a 3rd party code base needs to 
> be documented in our LICENSE and possibly NOTICE (if that 3rd party code base 
> has a NOTICE file - we need to copy its contents into ours).
> I used https://github.com/scanoss/sbom-workbench to look at the Drill source 
> and there are files that we should investigate.
> In general, the biggest issues seem to be with files in the 'contrib' area 
> and a lot of them are Javascript files. Also test data files: many are 
> binaries, and the SBOM Workbench tool suspects that some of them have 
> licensing implications.





[jira] [Created] (DRILL-8500) review 3rd party source code borrowed into Apache Drill

2024-06-26 Thread PJ Fanning (Jira)
PJ Fanning created DRILL-8500:
-

 Summary: review 3rd party source code borrowed into Apache Drill
 Key: DRILL-8500
 URL: https://issues.apache.org/jira/browse/DRILL-8500
 Project: Apache Drill
  Issue Type: Task
Reporter: PJ Fanning


based on the comment:
https://github.com/apache/drill/pull/2918#pullrequestreview-2141938793

Any source that Apache Drill has borrowed from a 3rd party code base needs to 
be documented in our LICENSE and possibly NOTICE (if that 3rd party code base 
has a NOTICE file - we need to copy its contents into ours).

I used https://github.com/scanoss/sbom-workbench to look at the Drill source 
and there are files that we should investigate.
In general, the biggest issues seem to be with files in the 'contrib' area and 
a lot of them are Javascript files.





[jira] [Created] (DRILL-8499) new util for generating random text

2024-06-25 Thread PJ Fanning (Jira)
PJ Fanning created DRILL-8499:
-

 Summary: new util for generating random text
 Key: DRILL-8499
 URL: https://issues.apache.org/jira/browse/DRILL-8499
 Project: Apache Drill
  Issue Type: Task
Reporter: PJ Fanning


Centralise the code for generating random text.





[jira] [Updated] (DRILL-8490) Sender operator fake memory leak result to sql failed and memory statistics error when ChannelClosedException

2024-06-20 Thread shihuafeng (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

shihuafeng updated DRILL-8490:
--
Summary: Sender operator fake memory leak result to sql failed  and memory 
statistics error when ChannelClosedException  (was:  Sender operator fake 
memory leak result to sql when ChannelClosedException)

> Sender operator fake memory leak result to sql failed  and memory statistics 
> error when ChannelClosedException
> --
>
> Key: DRILL-8490
> URL: https://issues.apache.org/jira/browse/DRILL-8490
> Project: Apache Drill
>  Issue Type: Bug
>  Components:  Server
>Affects Versions: 1.21.1
>Reporter: shihuafeng
>Priority: Major
> Fix For: 1.22.0
>
>
> *1. Description*
> When a ChannelClosedException occurs, ReconnectingConnection#CloseHandler 
> releases the sendingAccountor reference counter before Netty releases the 
> buffer, so the operator is closed before its memory is released by Netty.
>  
> *2. Exception info*
>  
> 2024-04-13 08:45:39,909 [DataClient-3] WARN  
> o.apache.drill.exec.rpc.RequestIdMap - Failure while attempting to fail rpc 
> response.
> java.lang.IllegalArgumentException: Self-suppression not permitted
>         at java.lang.Throwable.addSuppressed(Throwable.java:1072)
>         at 
> org.apache.drill.common.DeferredException.addException(DeferredException.java:88)
>         at 
> org.apache.drill.common.DeferredException.addThrowable(DeferredException.java:97)
>         at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.fail(FragmentExecutor.java:502)
>         at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.access$400(FragmentExecutor.java:131)
>         at 
> org.apache.drill.exec.work.fragment.FragmentExecutor$ExecutorStateImpl.fail(FragmentExecutor.java:518)
>         at 
> org.apache.drill.exec.ops.FragmentContextImpl.fail(FragmentContextImpl.java:298)
>         at 
> org.apache.drill.exec.ops.FragmentContextImpl$1.accept(FragmentContextImpl.java:152)
>         at 
> org.apache.drill.exec.ops.FragmentContextImpl$1.accept(FragmentContextImpl.java:149)
>         at 
> org.apache.drill.exec.ops.DataTunnelStatusHandler.failed(DataTunnelStatusHandler.java:45)
>         at 
> org.apache.drill.exec.rpc.data.DataTunnel$ThrottlingOutcomeListener.failed(DataTunnel.java:125)
>         at 
> org.apache.drill.exec.rpc.RequestIdMap$RpcListener.setException(RequestIdMap.java:145)
>         at 
> org.apache.drill.exec.rpc.RequestIdMap$SetExceptionProcedure.apply(RequestIdMap.java:78)
>         at 
> org.apache.drill.exec.rpc.RequestIdMap$SetExceptionProcedure.apply(RequestIdMap.java:68)
>         at 
> com.carrotsearch.hppc.IntObjectHashMap.forEach(IntObjectHashMap.java:692)
>         at 
> org.apache.drill.exec.rpc.RequestIdMap.channelClosed(RequestIdMap.java:64)
>         at 
> org.apache.drill.exec.rpc.AbstractRemoteConnection.channelClosed(AbstractRemoteConnection.java:192)
>         at 
> org.apache.drill.exec.rpc.AbstractClientConnection.channelClosed(AbstractClientConnection.java:97)
>         at 
> org.apache.drill.exec.rpc.RpcBus$ChannelClosedHandler.operationComplete(RpcBus.java:158)
>         at 
> org.apache.drill.exec.rpc.RpcBus$ChannelClosedHandler.operationComplete(RpcBus.java:135)
>         at 
> org.apache.drill.exec.rpc.ReconnectingConnection$CloseHandler.operationComplete(ReconnectingConnection.java:205)
>         at 
> org.apache.drill.exec.rpc.ReconnectingConnection$CloseHandler.operationComplete(ReconnectingConnection.java:192)
>         at 
> io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:578)
>         at 
> io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:552)
>         at 
> io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:491)
>         at 
> io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:616)
>         at 
> io.netty.util.concurrent.DefaultPromise.setSuccess0(DefaultPromise.java:605)
>         at 
> io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:104)
>         at 
> io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:84)
>         at 
> io.netty.channel.AbstractChannel$CloseFuture.setClosed(AbstractChannel.java:1164)
>         at 
> io.netty.channel.AbstractChannel$AbstractUnsafe.doClose0(AbstractChannel.java:755)
>  
>         at 
> io.netty.channel.AbstractChannel$AbstractUnsafe.close(AbstractChannel.java:731)
>         at 
&

[jira] [Updated] (DRILL-8490) Sender operator fake memory leak result to sql when ChannelClosedException

2024-06-20 Thread shihuafeng (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

shihuafeng updated DRILL-8490:
--
Description: 
*1. Description*

When a ChannelClosedException occurs, ReconnectingConnection#CloseHandler 
releases the sendingAccountor reference counter before Netty releases the 
buffer, so the operator is closed before its memory is released by Netty.

 

*2. Exception info*
 
2024-04-13 08:45:39,909 [DataClient-3] WARN  
o.apache.drill.exec.rpc.RequestIdMap - Failure while attempting to fail rpc 
response.
java.lang.IllegalArgumentException: Self-suppression not permitted
        at java.lang.Throwable.addSuppressed(Throwable.java:1072)
        at 
org.apache.drill.common.DeferredException.addException(DeferredException.java:88)
        at 
org.apache.drill.common.DeferredException.addThrowable(DeferredException.java:97)
        at 
org.apache.drill.exec.work.fragment.FragmentExecutor.fail(FragmentExecutor.java:502)
        at 
org.apache.drill.exec.work.fragment.FragmentExecutor.access$400(FragmentExecutor.java:131)
        at 
org.apache.drill.exec.work.fragment.FragmentExecutor$ExecutorStateImpl.fail(FragmentExecutor.java:518)
        at 
org.apache.drill.exec.ops.FragmentContextImpl.fail(FragmentContextImpl.java:298)
        at 
org.apache.drill.exec.ops.FragmentContextImpl$1.accept(FragmentContextImpl.java:152)
        at 
org.apache.drill.exec.ops.FragmentContextImpl$1.accept(FragmentContextImpl.java:149)
        at 
org.apache.drill.exec.ops.DataTunnelStatusHandler.failed(DataTunnelStatusHandler.java:45)
        at 
org.apache.drill.exec.rpc.data.DataTunnel$ThrottlingOutcomeListener.failed(DataTunnel.java:125)
        at 
org.apache.drill.exec.rpc.RequestIdMap$RpcListener.setException(RequestIdMap.java:145)
        at 
org.apache.drill.exec.rpc.RequestIdMap$SetExceptionProcedure.apply(RequestIdMap.java:78)
        at 
org.apache.drill.exec.rpc.RequestIdMap$SetExceptionProcedure.apply(RequestIdMap.java:68)
        at 
com.carrotsearch.hppc.IntObjectHashMap.forEach(IntObjectHashMap.java:692)
        at 
org.apache.drill.exec.rpc.RequestIdMap.channelClosed(RequestIdMap.java:64)
        at 
org.apache.drill.exec.rpc.AbstractRemoteConnection.channelClosed(AbstractRemoteConnection.java:192)
        at 
org.apache.drill.exec.rpc.AbstractClientConnection.channelClosed(AbstractClientConnection.java:97)
        at 
org.apache.drill.exec.rpc.RpcBus$ChannelClosedHandler.operationComplete(RpcBus.java:158)
        at 
org.apache.drill.exec.rpc.RpcBus$ChannelClosedHandler.operationComplete(RpcBus.java:135)
        at 
org.apache.drill.exec.rpc.ReconnectingConnection$CloseHandler.operationComplete(ReconnectingConnection.java:205)
        at 
org.apache.drill.exec.rpc.ReconnectingConnection$CloseHandler.operationComplete(ReconnectingConnection.java:192)
        at 
io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:578)
        at 
io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:552)
        at 
io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:491)
        at 
io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:616)
        at 
io.netty.util.concurrent.DefaultPromise.setSuccess0(DefaultPromise.java:605)
        at 
io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:104)
        at 
io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:84)
        at 
io.netty.channel.AbstractChannel$CloseFuture.setClosed(AbstractChannel.java:1164)
        at 
io.netty.channel.AbstractChannel$AbstractUnsafe.doClose0(AbstractChannel.java:755)
 
        at 
io.netty.channel.AbstractChannel$AbstractUnsafe.close(AbstractChannel.java:731)
        at 
io.netty.channel.AbstractChannel$AbstractUnsafe.handleWriteError(AbstractChannel.java:950)
        at 
io.netty.channel.AbstractChannel$AbstractUnsafe.flush0(AbstractChannel.java:933)
        at 
io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.forceFlush(AbstractNioChannel.java:361)
        at 
io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:716)
        at 
io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:658)
        at 
io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:584)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:496)
        at 
io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
        at 
io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
        at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.drill.exec.rpc.ChannelClosedException: Channel closed 
/10.32.112.138:51108 <--> /10.32.112.138:31012.
        at 
org.apache.drill.exec.rpc.RpcBus$ChannelClosedHandler.operationComplete(RpcBus.java:156)
Summary:  

[jira] [Commented] (DRILL-8498) Sqlline illegal reflective access warning

2024-06-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17854278#comment-17854278
 ] 

ASF GitHub Bot commented on DRILL-8498:
---

jnturton merged PR #2915:
URL: https://github.com/apache/drill/pull/2915




> Sqlline illegal reflective access warning
> -
>
> Key: DRILL-8498
> URL: https://issues.apache.org/jira/browse/DRILL-8498
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - CLI
>Affects Versions: 1.21.1
>Reporter: Maksym Rymar
>Assignee: Maksym Rymar
>Priority: Minor
>
> Sqlline has the following warnings on connection to Drill
> {code:java}
> apache drill> !connect jdbc:drill:drillbit=localhost;
> WARNING: An illegal reflective access operation has occurred
> WARNING: Illegal reflective access by 
> javassist.util.proxy.SecurityActions 
> (file:/apache-drill-1.21.2/jars/3rdparty/javassist-3.28.0-GA.jar
>  {code}
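
A common mitigation for warnings like this on Java 9+ (an assumption added for illustration, not something prescribed in this issue) is to open the package being accessed reflectively via JVM flags, for example in drill-env.sh:
{code:bash}
# Hypothetical drill-env.sh fragment: --add-opens is a standard JDK 9+
# option; that Sqlline picks it up via DRILL_SHELL_JAVA_OPTS is an
# assumption about the Drill launch scripts.
export DRILL_SHELL_JAVA_OPTS="$DRILL_SHELL_JAVA_OPTS --add-opens=java.base/java.lang=ALL-UNNAMED"
{code}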





[jira] [Commented] (DRILL-8498) Sqlline illegal reflective access warning

2024-06-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17854277#comment-17854277
 ] 

ASF GitHub Bot commented on DRILL-8498:
---

jnturton commented on PR #2915:
URL: https://github.com/apache/drill/pull/2915#issuecomment-2162255485

   Thank you!




> Sqlline illegal reflective access warning
> -
>
> Key: DRILL-8498
> URL: https://issues.apache.org/jira/browse/DRILL-8498
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - CLI
>Affects Versions: 1.21.1
>Reporter: Maksym Rymar
>Assignee: Maksym Rymar
>Priority: Minor
>
> Sqlline has the following warnings on connection to Drill
> {code:java}
> apache drill> !connect jdbc:drill:drillbit=localhost;
> WARNING: An illegal reflective access operation has occurred
> WARNING: Illegal reflective access by 
> javassist.util.proxy.SecurityActions 
> (file:/apache-drill-1.21.2/jars/3rdparty/javassist-3.28.0-GA.jar
>  {code}





[jira] [Comment Edited] (DRILL-8498) Sqlline illegal reflective access warning

2024-06-10 Thread Maksym Rymar (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17853622#comment-17853622
 ] 

Maksym Rymar edited comment on DRILL-8498 at 6/10/24 11:30 AM:
---

PR to review: [https://github.com/apache/drill/pull/2915]


was (Author: JIRAUSER297250):
PR to review: https://github.com/apache/drill/pull/2915
 
 
 

 

> Sqlline illegal reflective access warning
> -
>
> Key: DRILL-8498
> URL: https://issues.apache.org/jira/browse/DRILL-8498
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - CLI
>Affects Versions: 1.21.1
>Reporter: Maksym Rymar
>Assignee: Maksym Rymar
>Priority: Minor
>
> Sqlline has the following warnings on connection to Drill
> {code:java}
> apache drill> !connect jdbc:drill:drillbit=localhost;
> WARNING: An illegal reflective access operation has occurred
> WARNING: Illegal reflective access by 
> javassist.util.proxy.SecurityActions 
> (file:/apache-drill-1.21.2/jars/3rdparty/javassist-3.28.0-GA.jar
>  {code}





[jira] [Commented] (DRILL-8498) Sqlline illegal reflective access warning

2024-06-10 Thread Maksym Rymar (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17853622#comment-17853622
 ] 

Maksym Rymar commented on DRILL-8498:
-

PR to review: https://github.com/apache/drill/pull/2915
 
 
 

 

> Sqlline illegal reflective access warning
> -
>
> Key: DRILL-8498
> URL: https://issues.apache.org/jira/browse/DRILL-8498
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - CLI
>Affects Versions: 1.21.1
>Reporter: Maksym Rymar
>Assignee: Maksym Rymar
>Priority: Minor
>
> Sqlline has the following warnings on connection to Drill
> {code:java}
> apache drill> !connect jdbc:drill:drillbit=localhost;
> WARNING: An illegal reflective access operation has occurred
> WARNING: Illegal reflective access by 
> javassist.util.proxy.SecurityActions 
> (file:/apache-drill-1.21.2/jars/3rdparty/javassist-3.28.0-GA.jar
>  {code}





[jira] [Updated] (DRILL-8498) Sqlline illegal reflective access warning

2024-06-10 Thread Maksym Rymar (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maksym Rymar updated DRILL-8498:

Summary: Sqlline illegal reflective access warning  (was: Sqlline illegal 
reflective access waring)

> Sqlline illegal reflective access warning
> -
>
> Key: DRILL-8498
> URL: https://issues.apache.org/jira/browse/DRILL-8498
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - CLI
>Affects Versions: 1.21.1
>Reporter: Maksym Rymar
>Assignee: Maksym Rymar
>Priority: Minor
>
> Sqlline has the following warnings on connection to Drill
> {code:java}
> apache drill> !connect jdbc:drill:drillbit=localhost;
> WARNING: An illegal reflective access operation has occurred
> WARNING: Illegal reflective access by 
> javassist.util.proxy.SecurityActions 
> (file:/apache-drill-1.21.2/jars/3rdparty/javassist-3.28.0-GA.jar
>  {code}





[jira] [Created] (DRILL-8498) Sqlline illegal reflective access waring

2024-06-10 Thread Maksym Rymar (Jira)
Maksym Rymar created DRILL-8498:
---

 Summary: Sqlline illegal reflective access waring
 Key: DRILL-8498
 URL: https://issues.apache.org/jira/browse/DRILL-8498
 Project: Apache Drill
  Issue Type: Bug
  Components: Client - CLI
Affects Versions: 1.21.1
Reporter: Maksym Rymar
Assignee: Maksym Rymar


Sqlline has the following warnings on connection to Drill
{code:java}
apache drill> !connect jdbc:drill:drillbit=localhost;
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by 
javassist.util.proxy.SecurityActions 
(file:/apache-drill-1.21.2/jars/3rdparty/javassist-3.28.0-GA.jar
 {code}





[jira] [Created] (DRILL-8497) Drill JDBC driver emits reflective access warnings under Java 9+

2024-06-09 Thread James Turton (Jira)
James Turton created DRILL-8497:
---

 Summary: Drill JDBC driver emits reflective access warnings under 
Java 9+
 Key: DRILL-8497
 URL: https://issues.apache.org/jira/browse/DRILL-8497
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.21.1
Reporter: James Turton
Assignee: James Turton


The failed code patching appears inconsequential to the JDBC driver's 
functioning but results in log noise for applications. Example warning
{code:java}
10:48:27.903 [main] WARN oadd.org.apache.drill.common.util.ProtobufPatcher -- 
Unable to patch Protobuf.
java.lang.reflect.InaccessibleObjectException: Unable to make protected final 
java.lang.Class 
java.lang.ClassLoader.defineClass(java.lang.String,byte[],int,int,java.security.ProtectionDomain)
 throws java.lang.ClassFormatError accessible: module java.base does not "opens 
java.lang" to unnamed module @5d5baec3{code}





[jira] [Updated] (DRILL-8496) Drill Query fails when the escape character(which is part of the data) is just before the quote

2024-05-25 Thread achyut09 (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

achyut09 updated DRILL-8496:

Description: 
I have the following csv-

 
{code:java}
"id"^"first_name"^"last_name"^"email"^"gender"
"1"^"John"^"143 \\"^"
ewilk...@buzzfeed.com"^"Male"
"2"^"Willaim"^"Khan"^"bmacdona...@microsoft.com"^"Male"{code}
and when i run a drill query (SELECT *
FROM dfs.`C:\Users\achyu\Documents\dir2`)-
I am getting the following error-
{code:java}
UserRemoteException :  DATA_READ ERROR: Unexpected character '101' following 
quoted value of CSV field. Expecting '94'. Cannot parse CSV input." {code}
This is my dfs configuration for csv in apache drill.I am using the version 
1.21.1-
{code:java}
"csv": { "type": "text", "extensions": [ "csv" ], "lineDelimiter": "\n", 
"fieldDelimiter": "^", "quote": "\"", "escape": "\\", "comment": "#", 
"extractHeader": true }{code}
Turns out this is because of this particular portion- 
{code:java}
"143 \\"{code}
In this csv 
{code:java}
143 \\{code}
is part of the data and its not an escape character, But as this character is 
before the quote its failing. If i just give a space between the escape and " 
and quote then it works completely fine.
I guess this is a bug.
Any insights(for escaping the escape character before the quote) or workaround 
on the same?

 

  was:
I have the following csv-

 
{code:java}
"id"^"first_name"^"last_name"^"email"^"gender"
"1"^"John"^"143 \\"^"
ewilk...@buzzfeed.com"^"Male"
"2"^"Willaim"^"Khan"^"bmacdona...@microsoft.com"^"Male"{code}
and when i run a drill query (SELECT *
FROM dfs.`C:\Users\achyu\Documents\dir2`)-
I am getting the following error-
{code:java}
UserRemoteException :  DATA_READ ERROR: Unexpected character '101' following 
quoted value of CSV field. Expecting '94'. Cannot parse CSV input." {code}
This is my dfs configuration for csv in apache drill.I am using the version 
1.21.1-
{code:java}
"csv": { "type": "text", "extensions": [ "csv" ], "lineDelimiter": "\n", 
"fieldDelimiter": "^", "quote": "\"", "escape": "\\", "comment": "#", 
"extractHeader": true }{code}

Turns out this is because of this particular portion- 
{code:java}
"143 \\"{code}
In this csv 
{code:java}
143 \\{code}
is part of the data and its not an escape character, But as this character is 
before the quote its failing. If i just give a space between "\\" and quote 
then it works completely fine.
I guess this is a bug.
Any insights(for escaping the escape character before the quote) or workaround 
on the same?

 


> Drill Query fails when the escape character(which is part of the data) is 
> just before the quote
> ---
>
> Key: DRILL-8496
> URL: https://issues.apache.org/jira/browse/DRILL-8496
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.21.1
>Reporter: achyut09
>Priority: Critical
>  Labels: Drill
>
> I have the following csv-
>  
> {code:java}
> "id"^"first_name"^"last_name"^"email"^"gender"
> "1"^"John"^"143 \\"^"
> ewilk...@buzzfeed.com"^"Male"
> "2"^"Willaim"^"Khan"^"bmacdona...@microsoft.com"^"Male"{code}
> and when i run a drill query (SELECT *
> FROM dfs.`C:\Users\achyu\Documents\dir2`)-
> I am getting the following error-
> {code:java}
> UserRemoteException :  DATA_READ ERROR: Unexpected character '101' following 
> quoted value of CSV field. Expecting '94'. Cannot parse CSV input." {code}
> This is my dfs configuration for csv in apache drill.I am using the version 
> 1.21.1-
> {code:java}
> "csv": { "type": "text", "extensions": [ "csv" ], "lineDelimiter": "\n", 
> "fieldDelimiter": "^", "quote": "\"", "escape": "\\", "comment": "#", 
> "extractHeader": true }{code}
> Turns out this is because of this particular portion- 
> {code:java}
> "143 \\"{code}
> In this csv 
> {code:java}
> 143 \\{code}
> is part of the data and its not an escape character, But as this character is 
> before the quote its failing. If i just give a space between the escape and " 
> and quote then it works completely fine.
> I guess this is a bug.
> Any insights(for escaping the escape character before the quote) or 
> workaround on the same?
>  
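
To see why a parser trips on this input, here is a minimal sketch of quoted-field parsing where the escape character immediately precedes the closing quote. It is invented for illustration only; Drill's actual CSV parsing (the univocity parser) is more involved.
```python
# Minimal, invented sketch of quoted-field parsing with a configurable
# escape character; it only illustrates the ambiguity from this report.
def parse_quoted_field(s, quote='"', escape='\\'):
    """Parse s, which starts just after an opening quote; return
    (field_value, number_of_characters_consumed)."""
    out = []
    i = 0
    while i < len(s):
        c = s[i]
        if c == escape and i + 1 < len(s) and s[i + 1] == quote:
            # Escape directly before a quote: the quote is taken as
            # literal data, so the field does NOT end here. This is
            # what happens to the trailing backslash in "143 \\".
            out.append(quote)
            i += 2
        elif c == quote:
            return ''.join(out), i + 1  # unescaped quote ends the field
        else:
            out.append(c)
            i += 1
    raise ValueError("unterminated quoted field")

# The reported case: the field data ends in a backslash, immediately
# followed by the closing quote and the '^' delimiter.
value, consumed = parse_quoted_field('143 \\\\"^"next field...')
# The parser has swallowed the real closing quote and the delimiter:
print(value)  # 143 \"^
```
With a space inserted between the backslash and the quote (as the reporter observed), the closing quote is no longer preceded by the escape character, so the field terminates where intended.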





[jira] [Updated] (DRILL-8496) Drill Query fails when the escape character(which is part of the data) is just before the quote

2024-05-25 Thread achyut09 (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

achyut09 updated DRILL-8496:

Description: 
I have the following csv-

 
{code:java}
"id"^"first_name"^"last_name"^"email"^"gender"
"1"^"John"^"143 \\"^"
ewilk...@buzzfeed.com"^"Male"
"2"^"Willaim"^"Khan"^"bmacdona...@microsoft.com"^"Male"{code}
and when i run a drill query (SELECT *
FROM dfs.`C:\Users\achyu\Documents\dir2`)-
I am getting the following error-
{code:java}
UserRemoteException :  DATA_READ ERROR: Unexpected character '101' following 
quoted value of CSV field. Expecting '94'. Cannot parse CSV input." {code}
This is my dfs configuration for csv in apache drill.I am using the version 
1.21.1-
{code:java}
"csv": { "type": "text", "extensions": [ "csv" ], "lineDelimiter": "\n", 
"fieldDelimiter": "^", "quote": "\"", "escape": "\\", "comment": "#", 
"extractHeader": true }{code}

Turns out this is because of this particular portion- 
{code:java}
"143 \\"{code}
In this csv 
{code:java}
143 \\{code}
is part of the data and its not an escape character, But as this character is 
before the quote its failing. If i just give a space between "\\" and quote 
then it works completely fine.
I guess this is a bug.
Any insights(for escaping the escape character before the quote) or workaround 
on the same?

 

  was:
I have the following csv-

 
{code:java}
"id"^"first_name"^"last_name"^"email"^"gender"
"1"^"John"^"143 \\"^"
ewilk...@buzzfeed.com"^"Male"
"2"^"Willaim"^"Khan"^"bmacdona...@microsoft.com"^"Male"{code}
and when i run a drill query (SELECT *
FROM dfs.`C:\Users\achyu\Documents\dir2`)-
I am getting the following error-
{code:java}
UserRemoteException :  DATA_READ ERROR: Unexpected character '101' following 
quoted value of CSV field. Expecting '94'. Cannot parse CSV input." {code}
This is my dfs configuration for csv in apache drill.I am using the version 
1.21.1-

"csv": { "type": "text", "extensions": [ "csv" ], "lineDelimiter": "\n", 
"fieldDelimiter": "^", "quote": "\"", "escape": "\\", "comment": "#", 
"extractHeader": true }
 
Turns out this is because of this particular portion- "143 \\"
In this csv 143 \\ is part of the data and its not an escape character, But as 
this character is before the quote its failing. If i just give a space between 
\\ and quote then it works completely fine.
I guess this is a bug.
Any insights(for escaping the escape character before the quote) or workaround 
on the same?

 


> Drill Query fails when the escape character(which is part of the data) is 
> just before the quote
> ---
>
> Key: DRILL-8496
> URL: https://issues.apache.org/jira/browse/DRILL-8496
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.21.1
>Reporter: achyut09
>Priority: Critical
>  Labels: Drill
>
> I have the following csv-
>  
> {code:java}
> "id"^"first_name"^"last_name"^"email"^"gender"
> "1"^"John"^"143 \\"^"
> ewilk...@buzzfeed.com"^"Male"
> "2"^"Willaim"^"Khan"^"bmacdona...@microsoft.com"^"Male"{code}
> and when i run a drill query (SELECT *
> FROM dfs.`C:\Users\achyu\Documents\dir2`)-
> I am getting the following error-
> {code:java}
> UserRemoteException :  DATA_READ ERROR: Unexpected character '101' following 
> quoted value of CSV field. Expecting '94'. Cannot parse CSV input." {code}
> This is my dfs configuration for csv in apache drill.I am using the version 
> 1.21.1-
> {code:java}
> "csv": { "type": "text", "extensions": [ "csv" ], "lineDelimiter": "\n", 
> "fieldDelimiter": "^", "quote": "\"", "escape": "\\", "comment": "#", 
> "extractHeader": true }{code}
> Turns out this is because of this particular portion- 
> {code:java}
> "143 \\"{code}
> In this csv 
> {code:java}
> 143 \\{code}
> is part of the data and its not an escape character, But as this character is 
> before the quote its failing. If i just give a space between "\\" and quote 
> then it works completely fine.
> I guess this is a bug.
> Any insights(for escaping the escape character before the quote) or 
> workaround on the same?
>  





[jira] [Updated] (DRILL-8496) Drill Query fails when the escape character(which is part of the data) is just before the quote

2024-05-25 Thread achyut09 (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

achyut09 updated DRILL-8496:

Description: 
I have the following csv-

 
{code:java}
"id"^"first_name"^"last_name"^"email"^"gender"
"1"^"John"^"143 \\"^"
ewilk...@buzzfeed.com"^"Male"
"2"^"Willaim"^"Khan"^"bmacdona...@microsoft.com"^"Male"{code}
and when i run a drill query (SELECT *
FROM dfs.`C:\Users\achyu\Documents\dir2`)-
I am getting the following error-
{code:java}
UserRemoteException :  DATA_READ ERROR: Unexpected character '101' following 
quoted value of CSV field. Expecting '94'. Cannot parse CSV input." {code}
This is my dfs configuration for csv in apache drill.I am using the version 
1.21.1-

"csv": { "type": "text", "extensions": [ "csv" ], "lineDelimiter": "\n", 
"fieldDelimiter": "^", "quote": "\"", "escape": "
", "comment": "#", "extractHeader": true }
 
Turns out this is because of this particular portion- "143 \\"
In this csv 143 \\ is part of the data and its not an escape character, But as 
this character is before the quote its failing. If i just give a space between 
\\ and quote then it works completely fine.
I guess this is a bug.
Any insights(for escaping the escape character before the quote) or workaround 
on the same?

 

  was:
I have the following csv-

 
{code:java}
"id"^"first_name"^"last_name"^"email"^"gender"
"1"^"John"^"143 \\"^"
ewilk...@buzzfeed.com"^"Male"
"2"^"Willaim"^"Khan"^"bmacdona...@microsoft.com"^"Male"{code}
and when i run a drill query (SELECT *
FROM dfs.`C:\Users\achyu\Documents\dir2`)-
I am getting the following error-
{code:java}
UserRemoteException :  DATA_READ ERROR: Unexpected character '101' following 
quoted value of CSV field. Expecting '94'. Cannot parse CSV input." {code}
This is my dfs configuration for csv in apache drill.I am using the version 
1.21.1-
{quote}{quote}"csv": \{ "type": "text", "extensions": [ "csv" ], 
"lineDelimiter": "\n", "fieldDelimiter": "^", "quote": "\"", "escape": "\\", 
"comment": "#", "extractHeader": true }{quote}{quote}
 
Turns out this is because of this particular portion- "143 \\"
In this csv \\ is part of the data and its not an escape character, But as 
this character is before the quote its failing. If i just give a space 
between \\ and quote then it works completely fine.
I guess this is a bug.
Any insights(for escaping the escape character before the quote) or workaround 
on the same?


> Drill Query fails when the escape character(which is part of the data) is 
> just before the quote
> ---
>
> Key: DRILL-8496
> URL: https://issues.apache.org/jira/browse/DRILL-8496
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.21.1
>Reporter: achyut09
>Priority: Critical
>  Labels: Drill
>
> I have the following csv-
>  
> {code:java}
> "id"^"first_name"^"last_name"^"email"^"gender"
> "1"^"John"^"143 \\"^"
> ewilk...@buzzfeed.com"^"Male"
> "2"^"Willaim"^"Khan"^"bmacdona...@microsoft.com"^"Male"{code}
> and when i run a drill query (SELECT *
> FROM dfs.`C:\Users\achyu\Documents\dir2`)-
> I am getting the following error-
> {code:java}
> UserRemoteException :  DATA_READ ERROR: Unexpected character '101' following 
> quoted value of CSV field. Expecting '94'. Cannot parse CSV input." {code}
> This is my dfs configuration for csv in apache drill.I am using the version 
> 1.21.1-
> "csv": { "type": "text", "extensions": [ "csv" ], "lineDelimiter": "\n", 
> "fieldDelimiter": "^", "quote": "\"", "escape": "
> ", "comment": "#", "extractHeader": true }
>  
> Turns out this is because of this particular portion- "143 \\"
> In this csv 143 \\ is part of the data and its not an escape character, But 
> as this character is before the quote its failing. If i just give a space 
> between \\ and quote then it works completely fine.
> I guess this is a bug.
> Any insights(for escaping the escape character before the quote) or 
> workaround on the same?
>  





[jira] [Updated] (DRILL-8496) Drill Query fails when the escape character(which is part of the data) is just before the quote

2024-05-25 Thread achyut09 (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

achyut09 updated DRILL-8496:

Description: 
I have the following csv-

 
{code:java}
"id"^"first_name"^"last_name"^"email"^"gender"
"1"^"John"^"143 \\"^"
ewilk...@buzzfeed.com"^"Male"
"2"^"Willaim"^"Khan"^"bmacdona...@microsoft.com"^"Male"{code}
and when i run a drill query (SELECT *
FROM dfs.`C:\Users\achyu\Documents\dir2`)-
I am getting the following error-
{code:java}
UserRemoteException :  DATA_READ ERROR: Unexpected character '101' following 
quoted value of CSV field. Expecting '94'. Cannot parse CSV input." {code}
This is my dfs configuration for csv in apache drill.I am using the version 
1.21.1-
{quote}"csv": { "type": "text", "extensions": [ "csv" ], "lineDelimiter": "\n", 
"fieldDelimiter": "^", "quote": "\"", "escape": "
", "comment": "#", "extractHeader": true }
{quote}
 
Turns out this is because of this particular portion- "143 \\"
In this csv \\ is part of the data and its not an escape character, But as 
this character is before the quote its failing. If i just give a space 
between \\ and quote then it works completely fine.
I guess this is a bug.
Any insights(for escaping the escape character before the quote) or workaround 
on the same?

  was:
I have the following csv-

{code:java}
"id"^"first_name"^"last_name"^"email"^"gender"
"1"^"John"^"143 \\"^"
ewilk...@buzzfeed.com"^"Male"
"2"^"Willaim"^"Khan"^"bmacdona...@microsoft.com"^"Male"{code}
and when i run a drill query (SELECT *
FROM dfs.`C:\Users\achyu\Documents\dir2`)-
I am getting the following error-
{code:java}
UserRemoteException :  DATA_READ ERROR: Unexpected character '101' following 
quoted value of CSV field. Expecting '94'. Cannot parse CSV input." {code}
This is my dfs configuration for csv in apache drill.I am using the version 
1.21.1-
{quote}"csv": \{ "type": "text", "extensions": [ "csv" ], "lineDelimiter": 
"\n", "fieldDelimiter": "^", "quote": "\"", "escape": "\\", "comment": "#", 
"extractHeader": true }{quote}
 
Turns out this is because of this particular portion- "143 \\"
In this csv \\ is part of the data and its not an escape character,But as this 
character is before the quote its failing. If i just give a space between \\ 
and quote then it works completely fine.
I guess this is a bug.
Any insights(for escaping the escape character before the quote) or workaround 
on the same?


> Drill Query fails when the escape character(which is part of the data) is 
> just before the quote
> ---
>
> Key: DRILL-8496
> URL: https://issues.apache.org/jira/browse/DRILL-8496
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.21.1
>Reporter: achyut09
>Priority: Critical
>  Labels: Drill
>
> I have the following csv-
>  
> {code:java}
> "id"^"first_name"^"last_name"^"email"^"gender"
> "1"^"John"^"143 \\"^"
> ewilk...@buzzfeed.com"^"Male"
> "2"^"Willaim"^"Khan"^"bmacdona...@microsoft.com"^"Male"{code}
> and when i run a drill query (SELECT *
> FROM dfs.`C:\Users\achyu\Documents\dir2`)-
> I am getting the following error-
> {code:java}
> UserRemoteException :  DATA_READ ERROR: Unexpected character '101' following 
> quoted value of CSV field. Expecting '94'. Cannot parse CSV input." {code}
> This is my dfs configuration for csv in apache drill.I am using the version 
> 1.21.1-
> {quote}"csv": { "type": "text", "extensions": [ "csv" ], "lineDelimiter": 
> "\n", "fieldDelimiter": "^", "quote": "\"", "escape": "
> ", "comment": "#", "extractHeader": true }
> {quote}
>  
> Turns out this is because of this particular portion- "143 \\"
> In this csv \\ is part of the data and its not an escape character, But as 
> this character is before the quote its failing. If i just give a space 
> between \\ and quote then it works completely fine.
> I guess this is a bug.
> Any insights(for escaping the escape character before the quote) or 
> workaround on the same?





[jira] [Updated] (DRILL-8496) Drill Query fails when the escape character(which is part of the data) is just before the quote

2024-05-25 Thread achyut09 (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

achyut09 updated DRILL-8496:

Description: 
I have the following csv-

 
{code:java}
"id"^"first_name"^"last_name"^"email"^"gender"
"1"^"John"^"143 \\"^"
ewilk...@buzzfeed.com"^"Male"
"2"^"Willaim"^"Khan"^"bmacdona...@microsoft.com"^"Male"{code}
and when i run a drill query (SELECT *
FROM dfs.`C:\Users\achyu\Documents\dir2`)-
I am getting the following error-
{code:java}
UserRemoteException :  DATA_READ ERROR: Unexpected character '101' following 
quoted value of CSV field. Expecting '94'. Cannot parse CSV input." {code}
This is my dfs configuration for csv in apache drill.I am using the version 
1.21.1-
{quote}{quote}"csv": \{ "type": "text", "extensions": [ "csv" ], 
"lineDelimiter": "\n", "fieldDelimiter": "^", "quote": "\"", "escape": "\\", 
"comment": "#", "extractHeader": true }{quote}{quote}
 
Turns out this is because of this particular portion- "143 \\"
In this csv \\ is part of the data and its not an escape character, But as 
this character is before the quote its failing. If i just give a space 
between \\ and quote then it works completely fine.
I guess this is a bug.
Any insights(for escaping the escape character before the quote) or workaround 
on the same?

  was:
I have the following csv-

 
{code:java}
"id"^"first_name"^"last_name"^"email"^"gender"
"1"^"John"^"143 \\"^"
ewilk...@buzzfeed.com"^"Male"
"2"^"Willaim"^"Khan"^"bmacdona...@microsoft.com"^"Male"{code}
and when i run a drill query (SELECT *
FROM dfs.`C:\Users\achyu\Documents\dir2`)-
I am getting the following error-
{code:java}
UserRemoteException :  DATA_READ ERROR: Unexpected character '101' following 
quoted value of CSV field. Expecting '94'. Cannot parse CSV input." {code}
This is my dfs configuration for csv in apache drill.I am using the version 
1.21.1-
{quote}"csv": { "type": "text", "extensions": [ "csv" ], "lineDelimiter": "\n", 
"fieldDelimiter": "^", "quote": "\"", "escape": "
", "comment": "#", "extractHeader": true }
{quote}
 
Turns out this is because of this particular portion- "143 \\"
In this csv \\ is part of the data and its not an escape character, But as 
this character is before the quote its failing. If i just give a space 
between \\ and quote then it works completely fine.
I guess this is a bug.
Any insights(for escaping the escape character before the quote) or workaround 
on the same?


> Drill Query fails when the escape character(which is part of the data) is 
> just before the quote
> ---
>
> Key: DRILL-8496
> URL: https://issues.apache.org/jira/browse/DRILL-8496
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.21.1
>Reporter: achyut09
>Priority: Critical
>  Labels: Drill
>
> I have the following csv-
>  
> {code:java}
> "id"^"first_name"^"last_name"^"email"^"gender"
> "1"^"John"^"143 \\"^"
> ewilk...@buzzfeed.com"^"Male"
> "2"^"Willaim"^"Khan"^"bmacdona...@microsoft.com"^"Male"{code}
> and when i run a drill query (SELECT *
> FROM dfs.`C:\Users\achyu\Documents\dir2`)-
> I am getting the following error-
> {code:java}
> UserRemoteException :  DATA_READ ERROR: Unexpected character '101' following 
> quoted value of CSV field. Expecting '94'. Cannot parse CSV input." {code}
> This is my dfs configuration for csv in apache drill.I am using the version 
> 1.21.1-
> {quote}{quote}"csv": \{ "type": "text", "extensions": [ "csv" ], 
> "lineDelimiter": "\n", "fieldDelimiter": "^", "quote": "\"", "escape": "\\", 
> "comment": "#", "extractHeader": true }{quote}{quote}
>  
> Turns out this is because of this particular portion- "143 \\"
> In this csv \\ is part of the data and its not an escape character, But as 
> this character is before the quote its failing. If i just give a space 
> between \\ and quote then it works completely fine.
> I guess this is a bug.
> Any insights(for escaping the escape character before the quote) or 
> workaround on the same?





[jira] [Created] (DRILL-8496) Drill Query fails when the escape character(which is part of the data) is just before the quote

2024-05-25 Thread achyut09 (Jira)
achyut09 created DRILL-8496:
---

 Summary: Drill Query fails when the escape character(which is part 
of the data) is just before the quote
 Key: DRILL-8496
 URL: https://issues.apache.org/jira/browse/DRILL-8496
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.21.1
Reporter: achyut09


I have the following csv-

{code:java}
"id"^"first_name"^"last_name"^"email"^"gender"
"1"^"John"^"143 \\"^"
ewilk...@buzzfeed.com"^"Male"
"2"^"Willaim"^"Khan"^"bmacdona...@microsoft.com"^"Male"{code}
and when i run a drill query (SELECT *
FROM dfs.`C:\Users\achyu\Documents\dir2`)-
I am getting the following error-
{code:java}
UserRemoteException :  DATA_READ ERROR: Unexpected character '101' following 
quoted value of CSV field. Expecting '94'. Cannot parse CSV input." {code}
This is my dfs configuration for csv in apache drill.I am using the version 
1.21.1-
{quote}"csv": \{ "type": "text", "extensions": [ "csv" ], "lineDelimiter": 
"\n", "fieldDelimiter": "^", "quote": "\"", "escape": "\\", "comment": "#", 
"extractHeader": true }{quote}
 
Turns out this is because of this particular portion- "143 \\"
In this csv \\ is part of the data and its not an escape character,But as this 
character is before the quote its failing. If i just give a space between \\ 
and quote then it works completely fine.
I guess this is a bug.
Any insights(for escaping the escape character before the quote) or workaround 
on the same?





[jira] [Closed] (DRILL-8489) Sender memory leak when rpc encode exception

2024-05-17 Thread James Turton (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Turton closed DRILL-8489.
---
Resolution: Fixed

> Sender memory leak when rpc encode exception
> 
>
> Key: DRILL-8489
> URL: https://issues.apache.org/jira/browse/DRILL-8489
> Project: Apache Drill
>  Issue Type: Bug
>  Components:  Server
>Affects Versions: 1.21.1
>Reporter: shihuafeng
>Priority: Major
> Fix For: 1.21.2
>
>
> When encode throw Exception, if encode msg instanceof ReferenceCounted, netty 
> can release msg, but drill convert msg to OutboundRpcMessage, so netty can 
> not release msg. this  causes sender memory leaks
> exception info 
> {code:java}
> 2024-04-16 16:25:57,998 [DataClient-7] ERROR 
> o.a.d.exec.rpc.RpcExceptionHandler - Exception in RPC communication.  
> Connection: /10.32.112.138:47924 <--> /10.32.112.138:31012 (data client).  
> Closing connection.
> io.netty.handler.codec.EncoderException: 
> org.apache.drill.exec.exception.OutOfMemoryException: Unable to allocate 
> buffer of size 4096 due to memory limit (9223372036854775807). Current 
> allocation: 0
>         at 
> io.netty.handler.codec.MessageToMessageEncoder.write(MessageToMessageEncoder.java:107)
>         at 
> io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:881)
>         at 
> io.netty.channel.AbstractChannelHandlerContext.invokeWriteAndFlush(AbstractChannelHandlerContext.java:940)
>         at 
> io.netty.channel.AbstractChannelHandlerContext$WriteTask.run(AbstractChannelHandlerContext.java:1247)
>         at 
> io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173)
>         at 
> io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166)
>         at 
> io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
>         at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569)
>         at 
> io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
>         at 
> io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
>         at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.drill.exec.exception.OutOfMemoryException: Unable to 
> allocate buffer of size 4096 due to memory limit (9223372036854775807). 
> Current allocation: 0
>         at 
> org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:245)
>         at 
> org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:220)
>         at 
> org.apache.drill.exec.memory.DrillByteBufAllocator.buffer(DrillByteBufAllocator.java:55)
>         at 
> org.apache.drill.exec.memory.DrillByteBufAllocator.buffer(DrillByteBufAllocator.java:50)
>         at org.apache.drill.exec.rpc.RpcEncoder.encode(safeRelease.java:87)
>         at org.apache.drill.exec.rpc.RpcEncoder.encode(RpcEncoder.java:38)
>         at 
> io.netty.handler.codec.MessageToMessageEncoder.write(MessageToMessageEncoder.java:90){code}



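The leak described above follows from how netty's MessageToMessageEncoder handles failures: it only auto-releases a message that is itself reference-counted, so once the payload is wrapped (as in Drill's OutboundRpcMessage) the encoder has to release the wrapped body on failure itself. A minimal sketch of that release-on-failure pattern, with illustrative class names rather than Drill's or netty's real API:

```python
# Sketch of the DRILL-8489 fix pattern (names are illustrative).
class RefCountedBody:
    """Netty-style reference-counted buffer stand-in."""
    def __init__(self):
        self.refcnt = 1
    def release(self):
        self.refcnt -= 1
        return self.refcnt == 0

class OutboundMessage:
    """Wrapper around the body; not reference-counted itself, so the
    framework cannot release the body for us."""
    def __init__(self, body):
        self.body = body

def encode(msg, allocate):
    """Encode msg; on allocation failure, release the wrapped body
    instead of leaking it."""
    try:
        # May raise, like the OutOfMemoryException in the stack trace.
        return allocate()
    except MemoryError:
        msg.body.release()  # the fix: release what the framework can't see
        raise
```

Without the except branch the wrapped buffer's reference count would stay at 1 after a failed encode, which is exactly the sender-side leak reported here.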


[jira] [Closed] (DRILL-8488) HashJoinPOP memory leak is caused by OutOfMemoryException

2024-05-17 Thread James Turton (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Turton closed DRILL-8488.
---
Resolution: Fixed

> HashJoinPOP memory leak is caused by  OutOfMemoryException
> --
>
> Key: DRILL-8488
> URL: https://issues.apache.org/jira/browse/DRILL-8488
> Project: Apache Drill
>  Issue Type: Bug
>  Components:  Server
>Affects Versions: 1.21.1
>Reporter: shihuafeng
>Priority: Major
> Fix For: 1.21.2
>
>
> [DRILL-8485|https://issues.apache.org/jira/browse/DRILL-8485] HashJoinPOP 
> memory leak is caused by an oom exception when read data from InputStream





[jira] [Updated] (DRILL-8489) Sender memory leak when rpc encode exception

2024-05-17 Thread James Turton (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Turton updated DRILL-8489:

Fix Version/s: 1.21.2
   (was: 1.22.0)

> Sender memory leak when rpc encode exception
> 
>
> Key: DRILL-8489
> URL: https://issues.apache.org/jira/browse/DRILL-8489
> Project: Apache Drill
>  Issue Type: Bug
>  Components:  Server
>Affects Versions: 1.21.1
>Reporter: shihuafeng
>Priority: Major
> Fix For: 1.21.2
>
>
> When encode throw Exception, if encode msg instanceof ReferenceCounted, netty 
> can release msg, but drill convert msg to OutboundRpcMessage, so netty can 
> not release msg. this  causes sender memory leaks
> exception info 
> {code:java}
> 2024-04-16 16:25:57,998 [DataClient-7] ERROR 
> o.a.d.exec.rpc.RpcExceptionHandler - Exception in RPC communication.  
> Connection: /10.32.112.138:47924 <--> /10.32.112.138:31012 (data client).  
> Closing connection.
> io.netty.handler.codec.EncoderException: 
> org.apache.drill.exec.exception.OutOfMemoryException: Unable to allocate 
> buffer of size 4096 due to memory limit (9223372036854775807). Current 
> allocation: 0
>         at 
> io.netty.handler.codec.MessageToMessageEncoder.write(MessageToMessageEncoder.java:107)
>         at 
> io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:881)
>         at 
> io.netty.channel.AbstractChannelHandlerContext.invokeWriteAndFlush(AbstractChannelHandlerContext.java:940)
>         at 
> io.netty.channel.AbstractChannelHandlerContext$WriteTask.run(AbstractChannelHandlerContext.java:1247)
>         at 
> io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173)
>         at 
> io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166)
>         at 
> io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
>         at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569)
>         at 
> io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
>         at 
> io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
>         at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.drill.exec.exception.OutOfMemoryException: Unable to 
> allocate buffer of size 4096 due to memory limit (9223372036854775807). 
> Current allocation: 0
>         at 
> org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:245)
>         at 
> org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:220)
>         at 
> org.apache.drill.exec.memory.DrillByteBufAllocator.buffer(DrillByteBufAllocator.java:55)
>         at 
> org.apache.drill.exec.memory.DrillByteBufAllocator.buffer(DrillByteBufAllocator.java:50)
>         at org.apache.drill.exec.rpc.RpcEncoder.encode(safeRelease.java:87)
>         at org.apache.drill.exec.rpc.RpcEncoder.encode(RpcEncoder.java:38)
>         at 
> io.netty.handler.codec.MessageToMessageEncoder.write(MessageToMessageEncoder.java:90){code}





[jira] [Updated] (DRILL-8488) HashJoinPOP memory leak is caused by OutOfMemoryException

2024-05-17 Thread James Turton (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Turton updated DRILL-8488:

Fix Version/s: 1.21.2
   (was: 1.22.0)

> HashJoinPOP memory leak is caused by  OutOfMemoryException
> --
>
> Key: DRILL-8488
> URL: https://issues.apache.org/jira/browse/DRILL-8488
> Project: Apache Drill
>  Issue Type: Bug
>  Components:  Server
>Affects Versions: 1.21.1
>Reporter: shihuafeng
>Priority: Major
> Fix For: 1.21.2
>
>
> [DRILL-8485|https://issues.apache.org/jira/browse/DRILL-8485] HashJoinPOP 
> memory leak is caused by an oom exception when read data from InputStream





[jira] [Closed] (DRILL-8480) Cleanup before finished. 0 out of 1 streams have finished

2024-05-17 Thread James Turton (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Turton closed DRILL-8480.
---

> Cleanup before finished. 0 out of 1 streams have finished
> -
>
> Key: DRILL-8480
> URL: https://issues.apache.org/jira/browse/DRILL-8480
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.21.1
>Reporter: Maksym Rymar
>Assignee: Maksym Rymar
>Priority: Major
> Fix For: 1.21.2
>
> Attachments: 1a349ff1-d1f9-62bf-ed8c-26346c548005.sys.drill, 
> tableWithNumber2.parquet
>
>
> Drill fails to execute a query with the following exception:
> {code:java}
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> IllegalStateException: Cleanup before finished. 0 out of 1 streams have 
> finished
> Fragment: 1:0
> Please, refer to logs for more information.
> [Error Id: 270da8f4-0bb6-4985-bf4f-34853138881c on 
> compute7.vmcluster.com:31010]
>         at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:657)
>         at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:395)
>         at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:245)
>         at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:362)
>         at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>         at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>         at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>         at java.base/java.lang.Thread.run(Thread.java:829)
> Caused by: java.lang.IllegalStateException: Cleanup before finished. 0 out of 
> 1 streams have finished
>         at 
> org.apache.drill.exec.work.batch.BaseRawBatchBuffer.close(BaseRawBatchBuffer.java:111)
>         at 
> org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:91)
>         at 
> org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:71)
>         at 
> org.apache.drill.exec.work.batch.AbstractDataCollector.close(AbstractDataCollector.java:121)
>         at 
> org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:91)
>         at 
> org.apache.drill.exec.work.batch.IncomingBuffers.close(IncomingBuffers.java:144)
>         at 
> org.apache.drill.exec.ops.FragmentContextImpl.suppressingClose(FragmentContextImpl.java:581)
>         at 
> org.apache.drill.exec.ops.FragmentContextImpl.close(FragmentContextImpl.java:567)
>         at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources(FragmentExecutor.java:417)
>         at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:240)
>         ... 5 common frames omitted
>         Suppressed: java.lang.IllegalStateException: Cleanup before finished. 
> 0 out of 1 streams have finished
>                 ... 15 common frames omitted
>         Suppressed: java.lang.IllegalStateException: Memory was leaked by 
> query. Memory leaked: (32768)
> Allocator(op:1:0:8:UnorderedReceiver) 100/32768/32768/100 
> (res/actual/peak/limit)
>                 at 
> org.apache.drill.exec.memory.BaseAllocator.close(BaseAllocator.java:519)
>                 at 
> org.apache.drill.exec.ops.BaseOperatorContext.close(BaseOperatorContext.java:159)
>                 at 
> org.apache.drill.exec.ops.OperatorContextImpl.close(OperatorContextImpl.java:77)
>                 at 
> org.apache.drill.exec.ops.FragmentContextImpl.suppressingClose(FragmentContextImpl.java:581)
>                 at 
> org.apache.drill.exec.ops.FragmentContextImpl.close(FragmentContextImpl.java:571)
>                 ... 7 common frames omitted
>         Suppressed: java.lang.IllegalStateException: Memory was leaked by 
> query. Memory leaked: (1016640)
> Allocator(frag:1:0) 3000/1016640/30016640/90715827882 
> (res/actual/peak/limit)
>                 at 
> org.apache.drill.exec.memory.BaseAllocator.close(BaseAllocator.java:519)
>                 at 
> org.apache.drill.exec.ops.FragmentContextImpl.suppressingClose(FragmentContextImpl.java:581)
>                 at 
> org.apache.drill.exec.ops.FragmentContextImpl.close(FragmentContextImpl.java:574)
>                 ... 7 common frames omitted {code}
> Steps to reproduce:
>   1.Enable unequal join:
> {code:java}
> alter session set `planner.enable_nljoin_for_scalar_only`=false;

[jira] [Updated] (DRILL-8480) Cleanup before finished. 0 out of 1 streams have finished

2024-05-17 Thread James Turton (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Turton updated DRILL-8480:

Fix Version/s: 1.21.2

> Cleanup before finished. 0 out of 1 streams have finished
> -
>
> Key: DRILL-8480
> URL: https://issues.apache.org/jira/browse/DRILL-8480
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.21.1
>Reporter: Maksym Rymar
>Assignee: Maksym Rymar
>Priority: Major
> Fix For: 1.21.2
>
> Attachments: 1a349ff1-d1f9-62bf-ed8c-26346c548005.sys.drill, 
> tableWithNumber2.parquet
>
>
> Drill fails to execute a query with the following exception:
> {code:java}
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> IllegalStateException: Cleanup before finished. 0 out of 1 streams have 
> finished
> Fragment: 1:0
> Please, refer to logs for more information.
> [Error Id: 270da8f4-0bb6-4985-bf4f-34853138881c on 
> compute7.vmcluster.com:31010]
>         at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:657)
>         at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:395)
>         at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:245)
>         at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:362)
>         at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>         at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>         at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>         at java.base/java.lang.Thread.run(Thread.java:829)
> Caused by: java.lang.IllegalStateException: Cleanup before finished. 0 out of 
> 1 streams have finished
>         at 
> org.apache.drill.exec.work.batch.BaseRawBatchBuffer.close(BaseRawBatchBuffer.java:111)
>         at 
> org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:91)
>         at 
> org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:71)
>         at 
> org.apache.drill.exec.work.batch.AbstractDataCollector.close(AbstractDataCollector.java:121)
>         at 
> org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:91)
>         at 
> org.apache.drill.exec.work.batch.IncomingBuffers.close(IncomingBuffers.java:144)
>         at 
> org.apache.drill.exec.ops.FragmentContextImpl.suppressingClose(FragmentContextImpl.java:581)
>         at 
> org.apache.drill.exec.ops.FragmentContextImpl.close(FragmentContextImpl.java:567)
>         at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources(FragmentExecutor.java:417)
>         at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:240)
>         ... 5 common frames omitted
>         Suppressed: java.lang.IllegalStateException: Cleanup before finished. 
> 0 out of 1 streams have finished
>                 ... 15 common frames omitted
>         Suppressed: java.lang.IllegalStateException: Memory was leaked by 
> query. Memory leaked: (32768)
> Allocator(op:1:0:8:UnorderedReceiver) 100/32768/32768/100 
> (res/actual/peak/limit)
>                 at 
> org.apache.drill.exec.memory.BaseAllocator.close(BaseAllocator.java:519)
>                 at 
> org.apache.drill.exec.ops.BaseOperatorContext.close(BaseOperatorContext.java:159)
>                 at 
> org.apache.drill.exec.ops.OperatorContextImpl.close(OperatorContextImpl.java:77)
>                 at 
> org.apache.drill.exec.ops.FragmentContextImpl.suppressingClose(FragmentContextImpl.java:581)
>                 at 
> org.apache.drill.exec.ops.FragmentContextImpl.close(FragmentContextImpl.java:571)
>                 ... 7 common frames omitted
>         Suppressed: java.lang.IllegalStateException: Memory was leaked by 
> query. Memory leaked: (1016640)
> Allocator(frag:1:0) 3000/1016640/30016640/90715827882 
> (res/actual/peak/limit)
>                 at 
> org.apache.drill.exec.memory.BaseAllocator.close(BaseAllocator.java:519)
>                 at 
> org.apache.drill.exec.ops.FragmentContextImpl.suppressingClose(FragmentContextImpl.java:581)
>                 at 
> org.apache.drill.exec.ops.FragmentContextImpl.close(FragmentContextImpl.java:574)
>                 ... 7 common frames omitted {code}
> Steps to reproduce:
>   1.Enable unequal join:
> {code:java}
> alter session set `planner.ena

[jira] [Closed] (DRILL-8487) HTTP Caching

2024-05-17 Thread James Turton (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Turton closed DRILL-8487.
---
Resolution: Duplicate

> HTTP Caching
> 
>
> Key: DRILL-8487
> URL: https://issues.apache.org/jira/browse/DRILL-8487
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.21.1
>Reporter: Sena
>Priority: Major
>
> I am using http storage plugin and I want to activate the caching. In the 
> documentation it says that this requires adding cacheResults. So I added this 
> to my config. When I test using an older version(1.20.1), I can see the query 
> result files under the tmp/http-cache directory, but when I test using a 
> newer version(1.21.1), there are no query result files in that directory, it 
> only contains the journal. This PR 
> [https://github.com/apache/drill/pull/2669] may have caused the issue.
> Also, is it possible to implement maximum cache size?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Reopened] (DRILL-8487) HTTP Caching

2024-05-17 Thread James Turton (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Turton reopened DRILL-8487:
-



[jira] [Closed] (DRILL-8487) HTTP Caching

2024-05-17 Thread James Turton (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Turton closed DRILL-8487.
---
Resolution: Fixed



[jira] [Updated] (DRILL-8487) HTTP Caching

2024-05-17 Thread James Turton (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Turton updated DRILL-8487:

Fix Version/s: (was: 1.20.1)



[jira] [Closed] (DRILL-8494) HTTP Caching Not Saving Pages

2024-05-17 Thread James Turton (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Turton closed DRILL-8494.
---
Resolution: Fixed

> HTTP Caching Not Saving Pages
> -
>
> Key: DRILL-8494
> URL: https://issues.apache.org/jira/browse/DRILL-8494
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - HTTP
>Affects Versions: 1.21.1
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 1.21.2
>
>
> A minor bugfix, but the HTTP storage plugin was not actually caching results 
> even when caching was set to true.  This bug was introduced in DRILL-8329.





[jira] [Updated] (DRILL-8495) Tried to remove unmanaged buffer

2024-05-17 Thread James Turton (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Turton updated DRILL-8495:

Fix Version/s: 1.21.2

> Tried to remove unmanaged buffer
> 
>
> Key: DRILL-8495
> URL: https://issues.apache.org/jira/browse/DRILL-8495
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.21.1
>Reporter: Maksym Rymar
>Assignee: Maksym Rymar
>Priority: Major
> Fix For: 1.21.2
>
>
>  
> Drill throws an exception on Hive table:
> {code:java}
>   (java.lang.IllegalStateException) Tried to remove unmanaged buffer.
>     org.apache.drill.exec.ops.BufferManagerImpl.replace():51
>     io.netty.buffer.DrillBuf.reallocIfNeeded():101
>     
> org.apache.drill.exec.store.hive.writers.primitive.HiveStringWriter.write():38
>     
> org.apache.drill.exec.store.hive.readers.HiveDefaultRecordReader.readHiveRecordAndInsertIntoRecordBatch():416
>     
> org.apache.drill.exec.store.hive.readers.HiveDefaultRecordReader.next():402
>     org.apache.drill.exec.physical.impl.ScanBatch.internalNext():235
>     org.apache.drill.exec.physical.impl.ScanBatch.next():299
>     
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():237
>     org.apache.drill.exec.record.AbstractRecordBatch.next():109
>     org.apache.drill.exec.record.AbstractRecordBatch.next():101
>     org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():59
>     
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():93
>     org.apache.drill.exec.record.AbstractRecordBatch.next():160
>     
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():237
>     org.apache.drill.exec.physical.impl.BaseRootExec.next():103
>     
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():81
>     org.apache.drill.exec.physical.impl.BaseRootExec.next():93
>     org.apache.drill.exec.work.fragment.FragmentExecutor.lambda$run$0():321
>     java.security.AccessController.doPrivileged():-2
>     javax.security.auth.Subject.doAs():422
>     org.apache.hadoop.security.UserGroupInformation.doAs():1899
>     org.apache.drill.exec.work.fragment.FragmentExecutor.run():310
>     org.apache.drill.common.SelfCleaningRunnable.run():38
>     java.util.concurrent.ThreadPoolExecutor.runWorker():1149
>     java.util.concurrent.ThreadPoolExecutor$Worker.run():624
>     java.lang.Thread.run():748 {code}
>  
>  
> Reproduce:
>  # Create Hive table:
> {code:java}
> create table if NOT EXISTS students(id int, name string, surname string) 
> stored as parquet;{code}
>  # Insert a new row with 2 string values of size > 256 bytes:
> {code:java}
> insert into students values (1, 
> 'Veeery
>  long name', 
> 'biiiig
>  surname');{code}
>  # Execute Drill query:
> {code:java}
> select * from hive.`students` {code}
>  
>  
>  
>  





[jira] [Closed] (DRILL-8493) Drill Unable to Read XML Files with Namespaces

2024-05-17 Thread James Turton (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Turton closed DRILL-8493.
---
Resolution: Fixed

> Drill Unable to Read XML Files with Namespaces
> --
>
> Key: DRILL-8493
> URL: https://issues.apache.org/jira/browse/DRILL-8493
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Format - XML
>Affects Versions: 1.21.1
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 1.21.2
>
>
> This is a bug fix whereby Drill ignores all data when an XML file has a 
> namespace.  





[jira] [Commented] (DRILL-8495) Tried to remove unmanaged buffer

2024-05-17 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847325#comment-17847325
 ] 

ASF GitHub Bot commented on DRILL-8495:
---

jnturton merged PR #2913:
URL: https://github.com/apache/drill/pull/2913






[jira] [Commented] (DRILL-8492) Allow Parquet TIME_MICROS and TIMESTAMP_MICROS columns to be read as 64-bit integer values

2024-05-17 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847324#comment-17847324
 ] 

ASF GitHub Bot commented on DRILL-8492:
---

jnturton commented on PR #2907:
URL: https://github.com/apache/drill/pull/2907#issuecomment-2117674516

   It's always bugged me that we don't have a globally accessible way of 
accessing at least one of DrillbitContext, QueryContext, FragmentContext or 
just OptionManager. We hardly want to have to spray these things through APIs 
everywhere in Drill. I'll take a look at whether something can be done...




> Allow Parquet TIME_MICROS and TIMESTAMP_MICROS  columns to be read as 64-bit 
> integer values
> ---
>
> Key: DRILL-8492
> URL: https://issues.apache.org/jira/browse/DRILL-8492
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Parquet
>Affects Versions: 1.21.1
>Reporter: Peter Franzen
>Priority: Major
>
> When reading Parquet columns of type {{time_micros}} and 
> {{{}timestamp_micros{}}}, Drill truncates the microsecond values to 
> milliseconds in order to convert them to SQL timestamps.
> It is currently not possible to read the original microsecond values (as 
> 64-bit values, not SQL timestamps) through Drill.
> One solution for allowing reading the original 64-bit values is to add two 
> options similar to “store.parquet.reader.int96_as_timestamp" to control 
> whether microsecond
> times and timestamps are truncated to millisecond timestamps or read as 
> non-truncated 64-bit values.
> These options would be added to {{org.apache.drill.exec.ExecConstants}} and
> {{{}org.apache.drill.exec.server.options.SystemOptionManager{}}}.
> They would also be added to "drill-module.conf":
> {{   store.parquet.reader.time_micros_as_int64: false,}}
> {{   store.parquet.reader.timestamp_micros_as_int64: false,}}
> These options would then be used in the same places as 
> {{{}store.parquet.reader.int96_as_timestamp{}}}:
>  * org.apache.drill.exec.store.parquet.columnreaders.ColumnReaderFactory
>  * 
> org.apache.drill.exec.store.parquet.columnreaders.ParquetToDrillTypeConverter
>  * org.apache.drill.exec.store.parquet2.DrillParquetGroupConverter
> to create an int64 reader instead of a time/timestamp reader when the 
> corresponding option is set to true.
> In addition to this, 
> {{org.apache.drill.exec.store.parquet.metadata.FileMetadataCollector }}must 
> be altered to _not_ truncate the min and max values for 
> time_micros/timestamp_micros if the corresponding option is true. This class 
> doesn’t have a reference to an {{{}OptionManager{}}}, so the two new options 
> must be extracted from the {{OptionManager}} when the {{ParquetReaderConfig}} 
> instance is created.
> Filtering on microsecond columns would be done using 64-bit values rather 
> than TIME/TIMESTAMP values when the new options are true, e.g.
> {{SELECT *  FROM  WHERE  = 1705914906694751;}}
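To make the precision loss concrete, here is a small Python sketch (for illustration only; Drill itself is Java, and the timestamp is the sample value from the issue text) showing that a microsecond value truncated to milliseconds cannot be recovered afterwards:

```python
# Illustration only (not Drill code): truncating a Parquet TIMESTAMP_MICROS
# value to milliseconds, as Drill does today, discards sub-millisecond
# precision, so the original 64-bit value cannot be reconstructed.
ts_micros = 1705914906694751       # sample microsecond timestamp from the issue

ts_millis = ts_micros // 1000      # truncated value stored as a SQL timestamp
recovered = ts_millis * 1000       # best possible reconstruction from millis

assert recovered != ts_micros
assert ts_micros - recovered == 751    # 751 microseconds were lost
```

This is why equality filters like the one above must compare raw 64-bit values when the proposed options are enabled.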





[jira] [Commented] (DRILL-8495) Tried to remove unmanaged buffer

2024-05-17 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847241#comment-17847241
 ] 

ASF GitHub Bot commented on DRILL-8495:
---

rymarm commented on PR #2913:
URL: https://github.com/apache/drill/pull/2913#issuecomment-2117347650

   @jnturton I addressed checkstyle issues and failed java tests. Should be 
fine now)






[jira] [Commented] (DRILL-8495) Tried to remove unmanaged buffer

2024-05-16 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846934#comment-17846934
 ] 

ASF GitHub Bot commented on DRILL-8495:
---

jnturton commented on PR #2913:
URL: https://github.com/apache/drill/pull/2913#issuecomment-2115109847

   P.S. I see that checkstyle is still upset.






[jira] [Commented] (DRILL-8495) Tried to remove unmanaged buffer

2024-05-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846594#comment-17846594
 ] 

ASF GitHub Bot commented on DRILL-8495:
---

rymarm opened a new pull request, #2913:
URL: https://github.com/apache/drill/pull/2913

   # [DRILL-8495](https://issues.apache.org/jira/browse/DRILL-8495): Tried to 
remove unmanaged buffer
   
   The root cause of the issue is that multiple HiveWriters use the same 
`DrillBuf` and during execution they may reallocate the buffer if size of the 
buffer is not enough for a value (256 bytes+). Since 
`drillBuf.reallocIfNeeded(int size)` returns a new instance of `DrillBuf`, all 
other writers still have a reference for the old one buffer, which after 
`drillBuf.reallocIfNeeded(int size)` execution is unmanaged now.
   
   ## Description
   
   `HiveValueWriterFactory` now creates a unique `DrillBuf` for each writer. 
   
   HiveWriters are actually used one by one, so a single buffer could serve 
all of them. To do this, I could create a holder class for `DrillBuf` so that 
each writer references the same holder, which would store the new buffer 
returned by every `drillBuf.reallocIfNeeded(int size)` call. But that logic 
looked slightly confusing, so I decided instead to let each HiveWriter use its 
own buffer.
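   The stale-reference scenario can be sketched in miniature (a hypothetical 
Python model for illustration only; `Buf` and `realloc_if_needed` mimic the 
behavior of `DrillBuf.reallocIfNeeded`, they are not Drill APIs):

```python
# Hypothetical model of the bug: realloc returns a *new* buffer object,
# so any writer still holding the old reference now points at a buffer
# that the allocator no longer manages.
class Buf:
    def __init__(self, cap, managed):
        self.cap = cap
        self.managed = managed            # set of buffers the manager tracks

    def realloc_if_needed(self, size):
        if size <= self.cap:
            return self                   # capacity is enough, same buffer
        self.managed.discard(self)        # old buffer leaves management
        new = Buf(size, self.managed)
        self.managed.add(new)             # replacement is tracked instead
        return new

managed = set()
shared = Buf(256, managed)
managed.add(shared)

writer_a = shared
writer_b = shared                         # both writers hold the same buffer

writer_a = writer_a.realloc_if_needed(512)    # writer A gets the new buffer
assert writer_a in managed
assert writer_b not in managed            # writer B's reference is now stale
```

   Giving each writer its own buffer, as this PR does, removes the shared 
reference entirely.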
   
   ## Documentation
   \-
   
   ## Testing
   Add a new unit test to query a Hive table with variable-length values of 
Binary, VarChar, Char and String types.
   





[jira] [Created] (DRILL-8495) Tried to remove unmanaged buffer

2024-05-13 Thread Maksym Rymar (Jira)
Maksym Rymar created DRILL-8495:
---

 Summary: Tried to remove unmanaged buffer
 Key: DRILL-8495
 URL: https://issues.apache.org/jira/browse/DRILL-8495
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.21.1
Reporter: Maksym Rymar
Assignee: Maksym Rymar


 



[jira] [Commented] (DRILL-8492) Allow Parquet TIME_MICROS and TIMESTAMP_MICROS columns to be read as 64-bit integer values

2024-05-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845996#comment-17845996
 ] 

ASF GitHub Bot commented on DRILL-8492:
---

handmadecode commented on PR #2907:
URL: https://github.com/apache/drill/pull/2907#issuecomment-2108103905

   > Follow up: I see use of the FragmentContext class for accessing config 
options in the old Parquet reader, perhaps it's a good a vehicle...
   
   `FragmentContext` is used to access the new config options where an instance 
already was available, i.e. in `ColumnReaderFactory`, 
`ParquetToDrillTypeConverter`, and `DrillParquetGroupConverter`.
   
   `FileMetadataCollector` doesn't have access to a `FragmentContext`, only to 
a `ParquetReaderConfig`.
   I can only find a FragmentContext in one of the two call paths to 
`FileMetadataCollector::addColumnMetadata`, so trying to inject a 
`FragmentContext` into `FileMetadataCollector` will probably have an impact on 
quite a few other classes.






[jira] [Commented] (DRILL-8492) Allow Parquet TIME_MICROS and TIMESTAMP_MICROS columns to be read as 64-bit integer values

2024-05-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845816#comment-17845816
 ] 

ASF GitHub Bot commented on DRILL-8492:
---

jnturton commented on PR #2907:
URL: https://github.com/apache/drill/pull/2907#issuecomment-2106890548

   > However, I could very well have overlooked some way to access the global 
configuration, and I'd be grateful for any pointers.
   
   We should find existing examples in the Parquet format plugin. E.g. the 
"old" reader is affected by the option store.parquet.reader.pagereader.async.




> Allow Parquet TIME_MICROS and TIMESTAMP_MICROS  columns to be read as 64-bit 
> integer values
> ---
>
> Key: DRILL-8492
> URL: https://issues.apache.org/jira/browse/DRILL-8492
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Parquet
>Affects Versions: 1.21.1
>Reporter: Peter Franzen
>Priority: Major
>
> When reading Parquet columns of type {{time_micros}} and 
> {{{}timestamp_micros{}}}, Drill truncates the microsecond values to 
> milliseconds in order to convert them to SQL timestamps.
> It is currently not possible to read the original microsecond values (as 
> 64-bit values, not SQL timestamps) through Drill.
> One solution for allowing reading the original 64-bit values is to add two 
> options similar to “store.parquet.reader.int96_as_timestamp" to control 
> whether microsecond
> times and timestamps are truncated to millisecond timestamps or read as 
> non-truncated 64-bit values.
> These options would be added to {{org.apache.drill.exec.ExecConstants}} and
> {{{}org.apache.drill.exec.server.options.SystemOptionManager{}}}.
> They would also be added to "drill-module.conf":
> {{   store.parquet.reader.time_micros_as_int64: false,}}
> {{   store.parquet.reader.timestamp_micros_as_int64: false,}}
> These options would then be used in the same places as 
> {{{}store.parquet.reader.int96_as_timestamp{}}}:
>  * org.apache.drill.exec.store.parquet.columnreaders.ColumnReaderFactory
>  * 
> org.apache.drill.exec.store.parquet.columnreaders.ParquetToDrillTypeConverter
>  * org.apache.drill.exec.store.parquet2.DrillParquetGroupConverter
> to create an int64 reader instead of a time/timestamp reader when the 
> corresponding option is set to true.
> In addition to this, 
> {{org.apache.drill.exec.store.parquet.metadata.FileMetadataCollector }}must 
> be altered to _not_ truncate the min and max values for 
> time_micros/timestamp_micros if the corresponding option is true. This class 
> doesn’t have a reference to an {{{}OptionManager{}}}, so the two new options 
> must be extracted from the {{OptionManager}} when the {{ParquetReaderConfig}} 
> instance is created.
> Filtering on microsecond columns would be done using 64-bit values rather 
> than TIME/TIMESTAMP values when the new options are true, e.g.
> {{SELECT *  FROM  WHERE  = 1705914906694751;}}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8492) Allow Parquet TIME_MICROS and TIMESTAMP_MICROS columns to be read as 64-bit integer values

2024-05-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845701#comment-17845701
 ] 

ASF GitHub Bot commented on DRILL-8492:
---

handmadecode commented on PR #2907:
URL: https://github.com/apache/drill/pull/2907#issuecomment-2106221734

   > Awesome work. I can backport this too because you've left default 
behaviour unchanged (and it's self contained). My only question is about 
ParquetReaderConfig 

> Allow Parquet TIME_MICROS and TIMESTAMP_MICROS  columns to be read as 64-bit 
> integer values
> ---
>
> Key: DRILL-8492
> URL: https://issues.apache.org/jira/browse/DRILL-8492
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Parquet
>Affects Versions: 1.21.1
>Reporter: Peter Franzen
>Priority: Major
>
> When reading Parquet columns of type {{time_micros}} and 
> {{{}timestamp_micros{}}}, Drill truncates the microsecond values to 
> milliseconds in order to convert them to SQL timestamps.
> It is currently not possible to read the original microsecond values (as 
> 64-bit values, not SQL timestamps) through Drill.
> One solution for allowing reading the original 64-bit values is to add two 
> options similar to “store.parquet.reader.int96_as_timestamp" to control 
> whether microsecond
> times and timestamps are truncated to millisecond timestamps or read as 
> non-truncated 64-bit values.
> These options would be added to {{org.apache.drill.exec.ExecConstants}} and
> {{{}org.apache.drill.exec.server.options.SystemOptionManager{}}}.
> They would also be added to "drill-module.conf":
> {{   store.parquet.reader.time_micros_as_int64: false,}}
> {{   store.parquet.reader.timestamp_micros_as_int64: false,}}
> These options would then be used in the same places as 
> {{{}store.parquet.reader.int96_as_timestamp{}}}:
>  * org.apache.drill.exec.store.parquet.columnreaders.ColumnReaderFactory
>  * 
> org.apache.drill.exec.store.parquet.columnreaders.ParquetToDrillTypeConverter
>  * org.apache.drill.exec.store.parquet2.DrillParquetGroupConverter
> to create an int64 reader instead of a time/timestamp reader when the 
> corresponding option is set to true.
> In addition to this, 
> {{org.apache.drill.exec.store.parquet.metadata.FileMetadataCollector }}must 
> be altered to _not_ truncate the min and max values for 
> time_micros/timestamp_micros if the corresponding option is true. This class 
> doesn’t have a reference to an {{{}OptionManager{}}}, so the two new options 
> must be extracted from the {{OptionManager}} when the {{ParquetReaderConfig}} 
> instance is created.
> Filtering on microsecond columns would be done using 64-bit values rather 
> than TIME/TIMESTAMP values when the new options are true, e.g.
> {{SELECT *  FROM  WHERE  = 1705914906694751;}}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8480) Cleanup before finished. 0 out of 1 streams have finished

2024-05-11 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845661#comment-17845661
 ] 

ASF GitHub Bot commented on DRILL-8480:
---

jnturton merged PR #2897:
URL: https://github.com/apache/drill/pull/2897




> Cleanup before finished. 0 out of 1 streams have finished
> -
>
> Key: DRILL-8480
> URL: https://issues.apache.org/jira/browse/DRILL-8480
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.21.1
>Reporter: Maksym Rymar
>Assignee: Maksym Rymar
>Priority: Major
> Attachments: 1a349ff1-d1f9-62bf-ed8c-26346c548005.sys.drill, 
> tableWithNumber2.parquet
>
>
> Drill fails to execute a query with the following exception:
> {code:java}
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> IllegalStateException: Cleanup before finished. 0 out of 1 streams have 
> finished
> Fragment: 1:0
> Please, refer to logs for more information.
> [Error Id: 270da8f4-0bb6-4985-bf4f-34853138881c on 
> compute7.vmcluster.com:31010]
>         at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:657)
>         at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:395)
>         at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:245)
>         at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:362)
>         at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>         at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>         at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>         at java.base/java.lang.Thread.run(Thread.java:829)
> Caused by: java.lang.IllegalStateException: Cleanup before finished. 0 out of 
> 1 streams have finished
>         at 
> org.apache.drill.exec.work.batch.BaseRawBatchBuffer.close(BaseRawBatchBuffer.java:111)
>         at 
> org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:91)
>         at 
> org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:71)
>         at 
> org.apache.drill.exec.work.batch.AbstractDataCollector.close(AbstractDataCollector.java:121)
>         at 
> org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:91)
>         at 
> org.apache.drill.exec.work.batch.IncomingBuffers.close(IncomingBuffers.java:144)
>         at 
> org.apache.drill.exec.ops.FragmentContextImpl.suppressingClose(FragmentContextImpl.java:581)
>         at 
> org.apache.drill.exec.ops.FragmentContextImpl.close(FragmentContextImpl.java:567)
>         at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources(FragmentExecutor.java:417)
>         at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:240)
>         ... 5 common frames omitted
>         Suppressed: java.lang.IllegalStateException: Cleanup before finished. 
> 0 out of 1 streams have finished
>                 ... 15 common frames omitted
>         Suppressed: java.lang.IllegalStateException: Memory was leaked by 
> query. Memory leaked: (32768)
> Allocator(op:1:0:8:UnorderedReceiver) 100/32768/32768/100 
> (res/actual/peak/limit)
>                 at 
> org.apache.drill.exec.memory.BaseAllocator.close(BaseAllocator.java:519)
>                 at 
> org.apache.drill.exec.ops.BaseOperatorContext.close(BaseOperatorContext.java:159)
>                 at 
> org.apache.drill.exec.ops.OperatorContextImpl.close(OperatorContextImpl.java:77)
>                 at 
> org.apache.drill.exec.ops.FragmentContextImpl.suppressingClose(FragmentContextImpl.java:581)
>                 at 
> org.apache.drill.exec.ops.FragmentContextImpl.close(FragmentContextImpl.java:571)
>                 ... 7 common frames omitted
>         Suppressed: java.lang.IllegalStateException: Memory was leaked by 
> query. Memory leaked: (1016640)
> Allocator(frag:1:0) 3000/1016640/30016640/90715827882 
> (res/actual/peak/limit)
>                 at 
> org.apache.drill.exec.memory.BaseAllocator.close(BaseAllocator.java:519)
>                 at 
> org.apache.drill.exec.ops.FragmentContextImpl.suppressingClose(FragmentContextImpl.java:581)
>                 at 
> org.apache.drill.exec.ops.FragmentContextImpl.close(FragmentContextImpl.java:574)
>                 ... 7 common frames omitted {code}
> Steps to reproduce:
>   1. Enable unequal join:
>

[jira] [Commented] (DRILL-8474) Add Daffodil Format Plugin

2024-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845113#comment-17845113
 ] 

ASF GitHub Bot commented on DRILL-8474:
---

mbeckerle commented on code in PR #2909:
URL: https://github.com/apache/drill/pull/2909#discussion_r1595903385


##
contrib/format-daffodil/src/main/java/org/apache/drill/exec/store/daffodil/schema/DaffodilDataProcessorFactory.java:
##
@@ -0,0 +1,165 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.daffodil.schema;
+
+import org.apache.daffodil.japi.Compiler;
+import org.apache.daffodil.japi.Daffodil;
+import org.apache.daffodil.japi.DataProcessor;
+import org.apache.daffodil.japi.Diagnostic;
+import org.apache.daffodil.japi.InvalidParserException;
+import org.apache.daffodil.japi.InvalidUsageException;
+import org.apache.daffodil.japi.ProcessorFactory;
+import org.apache.daffodil.japi.ValidationMode;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.net.URI;
+import java.net.URISyntaxException;
+import java.nio.channels.Channels;
+import java.util.List;
+import java.util.Objects;
+
+/**
+ * Compiles a DFDL schema (mostly for tests) or loads a pre-compiled DFDL 
schema so that one can
+ * obtain a DataProcessor for use with DaffodilMessageParser.
+ * 
+ * TODO: Needs to use a cache to avoid reloading/recompiling every time.
+ */
+public class DaffodilDataProcessorFactory {
+  // Default constructor is used.
+
+  private static final Logger logger = 
LoggerFactory.getLogger(DaffodilDataProcessorFactory.class);
+
+  private DataProcessor dp;
+
+  /**
+   * Gets a Daffodil DataProcessor given the necessary arguments to compile or 
reload it.
+   *
+   * @param schemaFileURI
+   * pre-compiled dfdl schema (.bin extension) or DFDL schema source (.xsd 
extension)
+   * @param validationMode
+   * Use true to request Daffodil built-in 'limited' validation. Use false 
for no validation.
+   * @param rootName
+   * Local name of root element of the message. Can be null to use the 
first element declaration
+   * of the primary schema file. Ignored if reloading a pre-compiled 
schema.
+   * @param rootNS
+   * Namespace URI as a string. Can be null to use the target namespace of 
the primary schema
+   * file or if it is unambiguous what element is the rootName. Ignored if 
reloading a
+   * pre-compiled schema.
+   * @return the DataProcessor
+   * @throws CompileFailure
+   * - if schema compilation fails
+   */
+  public DataProcessor getDataProcessor(URI schemaFileURI, boolean 
validationMode, String rootName,
+  String rootNS)
+  throws CompileFailure {
+
+DaffodilDataProcessorFactory dmp = new DaffodilDataProcessorFactory();
+boolean isPrecompiled = schemaFileURI.toString().endsWith(".bin");
+if (isPrecompiled) {
+  if (Objects.nonNull(rootName) && !rootName.isEmpty()) {
+// A usage error. You shouldn't supply the name and optionally 
namespace if loading
+// precompiled schema because those are built into it. Should be null 
or "".
+logger.warn("Root element name '{}' is ignored when used with 
precompiled DFDL schema.",
+rootName);
+  }
+  try {
+dmp.loadSchema(schemaFileURI);
+  } catch (IOException | InvalidParserException e) {
+throw new CompileFailure(e);
+  }
+  dmp.setupDP(validationMode, null);
+} else {
+  List<Diagnostic> pfDiags;
+  try {
+pfDiags = dmp.compileSchema(schemaFileURI, rootName, rootNS);
+  } catch (URISyntaxException | IOException e) {
+throw new CompileFailure(e);
+  }
+  dmp.setupDP(validationMode, pfDiags);
+}
+return dmp.dp;
+  }
+
+  private void loadSchema(URI schemaFileURI) throws IOException, 
InvalidParserException {
+Compiler c = Daffodil.compiler();
+dp = c.reload(Channels.newChannel(schemaFileURI.toURL().openStream()));
+  }
+
+  private List<Diagnostic> compileSchema(URI schemaFileURI, String rootName,
String rootNS)
+  throws URISyntaxException, IOExcep

[jira] [Commented] (DRILL-8492) Allow Parquet TIME_MICROS and TIMESTAMP_MICROS columns to be read as 64-bit integer values

2024-05-07 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17844316#comment-17844316
 ] 

ASF GitHub Bot commented on DRILL-8492:
---

cgivre commented on PR #2907:
URL: https://github.com/apache/drill/pull/2907#issuecomment-2098546012

   > > LGTM +1 Thanks for the contribution! Can you please update the 
documentation for the Parquet reader to include this? Otherwise looks good!
   > 
   > Happy to contribute!
   > 
   > Do you mean the documentation in the `drill-site` repo? 
(https://github.com/apache/drill-site/blob/master/_docs/en/data-sources-and-file-formats/040-parquet-format.md)
   
   That's the one!




> Allow Parquet TIME_MICROS and TIMESTAMP_MICROS  columns to be read as 64-bit 
> integer values
> ---
>
> Key: DRILL-8492
> URL: https://issues.apache.org/jira/browse/DRILL-8492
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Parquet
>Affects Versions: 1.21.1
>Reporter: Peter Franzen
>Priority: Major
>
> When reading Parquet columns of type {{time_micros}} and 
> {{{}timestamp_micros{}}}, Drill truncates the microsecond values to 
> milliseconds in order to convert them to SQL timestamps.
> It is currently not possible to read the original microsecond values (as 
> 64-bit values, not SQL timestamps) through Drill.
> One solution for allowing reading the original 64-bit values is to add two 
> options similar to “store.parquet.reader.int96_as_timestamp" to control 
> whether microsecond
> times and timestamps are truncated to millisecond timestamps or read as 
> non-truncated 64-bit values.
> These options would be added to {{org.apache.drill.exec.ExecConstants}} and
> {{{}org.apache.drill.exec.server.options.SystemOptionManager{}}}.
> They would also be added to "drill-module.conf":
> {{   store.parquet.reader.time_micros_as_int64: false,}}
> {{   store.parquet.reader.timestamp_micros_as_int64: false,}}
> These options would then be used in the same places as 
> {{{}store.parquet.reader.int96_as_timestamp{}}}:
>  * org.apache.drill.exec.store.parquet.columnreaders.ColumnReaderFactory
>  * 
> org.apache.drill.exec.store.parquet.columnreaders.ParquetToDrillTypeConverter
>  * org.apache.drill.exec.store.parquet2.DrillParquetGroupConverter
> to create an int64 reader instead of a time/timestamp reader when the 
> corresponding option is set to true.
> In addition to this, 
> {{org.apache.drill.exec.store.parquet.metadata.FileMetadataCollector }}must 
> be altered to _not_ truncate the min and max values for 
> time_micros/timestamp_micros if the corresponding option is true. This class 
> doesn’t have a reference to an {{{}OptionManager{}}}, so the two new options 
> must be extracted from the {{OptionManager}} when the {{ParquetReaderConfig}} 
> instance is created.
> Filtering on microsecond columns would be done using 64-bit values rather 
> than TIME/TIMESTAMP values when the new options are true, e.g.
> {{SELECT *  FROM  WHERE  = 1705914906694751;}}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8492) Allow Parquet TIME_MICROS and TIMESTAMP_MICROS columns to be read as 64-bit integer values

2024-05-07 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17844315#comment-17844315
 ] 

ASF GitHub Bot commented on DRILL-8492:
---

handmadecode commented on PR #2907:
URL: https://github.com/apache/drill/pull/2907#issuecomment-2098543073

   > LGTM +1 Thanks for the contribution! Can you please update the 
documentation for the Parquet reader to include this? Otherwise looks good!
   
   Happy to contribute!
   
   Do you mean the documentation in the `drill-site` repo? 
(https://github.com/apache/drill-site/blob/master/_docs/en/data-sources-and-file-formats/040-parquet-format.md)




> Allow Parquet TIME_MICROS and TIMESTAMP_MICROS  columns to be read as 64-bit 
> integer values
> ---
>
> Key: DRILL-8492
> URL: https://issues.apache.org/jira/browse/DRILL-8492
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Parquet
>Affects Versions: 1.21.1
>Reporter: Peter Franzen
>Priority: Major
>
> When reading Parquet columns of type {{time_micros}} and 
> {{{}timestamp_micros{}}}, Drill truncates the microsecond values to 
> milliseconds in order to convert them to SQL timestamps.
> It is currently not possible to read the original microsecond values (as 
> 64-bit values, not SQL timestamps) through Drill.
> One solution for allowing reading the original 64-bit values is to add two 
> options similar to “store.parquet.reader.int96_as_timestamp" to control 
> whether microsecond
> times and timestamps are truncated to millisecond timestamps or read as 
> non-truncated 64-bit values.
> These options would be added to {{org.apache.drill.exec.ExecConstants}} and
> {{{}org.apache.drill.exec.server.options.SystemOptionManager{}}}.
> They would also be added to "drill-module.conf":
> {{   store.parquet.reader.time_micros_as_int64: false,}}
> {{   store.parquet.reader.timestamp_micros_as_int64: false,}}
> These options would then be used in the same places as 
> {{{}store.parquet.reader.int96_as_timestamp{}}}:
>  * org.apache.drill.exec.store.parquet.columnreaders.ColumnReaderFactory
>  * 
> org.apache.drill.exec.store.parquet.columnreaders.ParquetToDrillTypeConverter
>  * org.apache.drill.exec.store.parquet2.DrillParquetGroupConverter
> to create an int64 reader instead of a time/timestamp reader when the 
> corresponding option is set to true.
> In addition to this, 
> {{org.apache.drill.exec.store.parquet.metadata.FileMetadataCollector }}must 
> be altered to _not_ truncate the min and max values for 
> time_micros/timestamp_micros if the corresponding option is true. This class 
> doesn’t have a reference to an {{{}OptionManager{}}}, so the two new options 
> must be extracted from the {{OptionManager}} when the {{ParquetReaderConfig}} 
> instance is created.
> Filtering on microsecond columns would be done using 64-bit values rather 
> than TIME/TIMESTAMP values when the new options are true, e.g.
> {{SELECT *  FROM  WHERE  = 1705914906694751;}}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8474) Add Daffodil Format Plugin

2024-05-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17843832#comment-17843832
 ] 

ASF GitHub Bot commented on DRILL-8474:
---

mbeckerle commented on PR #2909:
URL: https://github.com/apache/drill/pull/2909#issuecomment-209976

   > Hi Mike, Are you free at all this week? My apologies... We're in the 
middle of putting an offer on a house and my life is very hectic at the moment. 
Best, 

> Add Daffodil Format Plugin
> --
>
> Key: DRILL-8474
> URL: https://issues.apache.org/jira/browse/DRILL-8474
> Project: Apache Drill
>  Issue Type: New Feature
>Affects Versions: 1.21.1
>Reporter: Charles Givre
>Priority: Major
> Fix For: 1.22.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (DRILL-8494) HTTP Caching Not Saving Pages

2024-05-06 Thread Charles Givre (Jira)
Charles Givre created DRILL-8494:


 Summary: HTTP Caching Not Saving Pages
 Key: DRILL-8494
 URL: https://issues.apache.org/jira/browse/DRILL-8494
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - HTTP
Affects Versions: 1.21.1
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 1.21.2


A minor bugfix, but the HTTP storage plugin was not actually caching results 
even when caching was set to true.  This bug was introduced in DRILL-8329.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8474) Add Daffodil Format Plugin

2024-05-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17843601#comment-17843601
 ] 

ASF GitHub Bot commented on DRILL-8474:
---

cgivre commented on PR #2909:
URL: https://github.com/apache/drill/pull/2909#issuecomment-2095044801

   Hi Mike, 
   Are you free at all this week?  My apologies... We're in the middle of 
putting an offer on a house and my life is very hectic at the moment.
   Best,
   

> Add Daffodil Format Plugin
> --
>
> Key: DRILL-8474
> URL: https://issues.apache.org/jira/browse/DRILL-8474
> Project: Apache Drill
>  Issue Type: New Feature
>Affects Versions: 1.21.1
>Reporter: Charles Givre
>Priority: Major
> Fix For: 1.22.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8488) HashJoinPOP memory leak is caused by OutOfMemoryException

2024-05-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17842680#comment-17842680
 ] 

ASF GitHub Bot commented on DRILL-8488:
---

cgivre merged PR #2900:
URL: https://github.com/apache/drill/pull/2900




> HashJoinPOP memory leak is caused by  OutOfMemoryException
> --
>
> Key: DRILL-8488
> URL: https://issues.apache.org/jira/browse/DRILL-8488
> Project: Apache Drill
>  Issue Type: Bug
>  Components:  Server
>Affects Versions: 1.21.1
>Reporter: shihuafeng
>Priority: Major
> Fix For: 1.22.0
>
>
> See DRILL-8485: HashJoinPOP memory leak is caused by an OOM exception when
> reading data from an InputStream.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8489) Sender memory leak when rpc encode exception

2024-05-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17842679#comment-17842679
 ] 

ASF GitHub Bot commented on DRILL-8489:
---

cgivre merged PR #2901:
URL: https://github.com/apache/drill/pull/2901




> Sender memory leak when rpc encode exception
> 
>
> Key: DRILL-8489
> URL: https://issues.apache.org/jira/browse/DRILL-8489
> Project: Apache Drill
>  Issue Type: Bug
>  Components:  Server
>Affects Versions: 1.21.1
>Reporter: shihuafeng
>Priority: Major
> Fix For: 1.22.0
>
>
> When encode throws an Exception and the msg is an instance of ReferenceCounted, Netty
> can release the msg; but Drill converts the msg to OutboundRpcMessage, so Netty cannot
> release it. This causes sender memory leaks.
> exception info 
> {code:java}
> 2024-04-16 16:25:57,998 [DataClient-7] ERROR 
> o.a.d.exec.rpc.RpcExceptionHandler - Exception in RPC communication.  
> Connection: /10.32.112.138:47924 <--> /10.32.112.138:31012 (data client).  
> Closing connection.
> io.netty.handler.codec.EncoderException: 
> org.apache.drill.exec.exception.OutOfMemoryException: Unable to allocate 
> buffer of size 4096 due to memory limit (9223372036854775807). Current 
> allocation: 0
>         at 
> io.netty.handler.codec.MessageToMessageEncoder.write(MessageToMessageEncoder.java:107)
>         at 
> io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:881)
>         at 
> io.netty.channel.AbstractChannelHandlerContext.invokeWriteAndFlush(AbstractChannelHandlerContext.java:940)
>         at 
> io.netty.channel.AbstractChannelHandlerContext$WriteTask.run(AbstractChannelHandlerContext.java:1247)
>         at 
> io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173)
>         at 
> io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166)
>         at 
> io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
>         at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569)
>         at 
> io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
>         at 
> io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
>         at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.drill.exec.exception.OutOfMemoryException: Unable to 
> allocate buffer of size 4096 due to memory limit (9223372036854775807). 
> Current allocation: 0
>         at 
> org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:245)
>         at 
> org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:220)
>         at 
> org.apache.drill.exec.memory.DrillByteBufAllocator.buffer(DrillByteBufAllocator.java:55)
>         at 
> org.apache.drill.exec.memory.DrillByteBufAllocator.buffer(DrillByteBufAllocator.java:50)
>         at org.apache.drill.exec.rpc.RpcEncoder.encode(safeRelease.java:87)
>         at org.apache.drill.exec.rpc.RpcEncoder.encode(RpcEncoder.java:38)
>         at 
> io.netty.handler.codec.MessageToMessageEncoder.write(MessageToMessageEncoder.java:90){code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8474) Add Daffodil Format Plugin

2024-04-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841807#comment-17841807
 ] 

ASF GitHub Bot commented on DRILL-8474:
---

mbeckerle commented on PR #2909:
URL: https://github.com/apache/drill/pull/2909#issuecomment-2081781546

   Tests are now failing due to these two things in TestDaffodilReader.scala
   ```
 String schemaURIRoot = 
"file:///opt/drill/contrib/format-daffodil/src/test/resources/";
   ```
   That's an absolute URI that is used to obtain access to the schema files in 
this statement:
   ```
 private String selectRow(String schema, String file) {
   return "SELECT * FROM table(dfs.`data/" + file + "` " + " (type => 
'daffodil'," + " " +
   "validationMode => 'true', " + " schemaURI => '" + schemaURIRoot + 
"schema/" + schema +
   ".dfdl.xsd'," + " rootName => 'row'," + " rootNamespace => null " + 
"))";
 }
   ```
   This is assembling a select statement, and puts this absolute schemaURI into 
the schemaURI part of the select. 
   
   What should I be doing to arrange for these schema URIs to be found. 
   
   The schemas are a large complex set of files, not just a single file. Many 
files must be found relative to the initial root schema file. (Hundreds of 
files potentially). As they include/import other schema files using relative 
paths. 
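   One possible direction, sketched under the assumption that the test schemas could be shipped on the test classpath (the resource path below is hypothetical, and this is not the plugin's actual resolution logic):
   
   ```java
   import java.net.URI;
   import java.net.URL;
   import java.nio.file.Paths;
   
   // Sketch: resolve a root schema URI from the classpath when available,
   // falling back to a working-directory-relative path, so tests need not
   // hard-code an absolute location like /opt/drill/...
   public class SchemaUriSketch {
     public static void main(String[] args) throws Exception {
       URL res = SchemaUriSketch.class.getResource("/schema/row.dfdl.xsd");
       URI uri = (res != null)
           ? res.toURI()
           : Paths.get("src/test/resources/schema/row.dfdl.xsd").toUri();
       // Either branch yields a file: URI that relative include/import
       // paths inside the schema can be resolved against.
       System.out.println(uri.getScheme());
     }
   }
   ```
   
   Because DFDL schemas import each other via relative paths, whatever root URI is chosen only needs to preserve the directory layout around the primary schema file.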
   




> Add Daffodil Format Plugin
> --
>
> Key: DRILL-8474
> URL: https://issues.apache.org/jira/browse/DRILL-8474
> Project: Apache Drill
>  Issue Type: New Feature
>Affects Versions: 1.21.1
>Reporter: Charles Givre
>Priority: Major
> Fix For: 1.22.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8474) Add Daffodil Format Plugin

2024-04-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841775#comment-17841775
 ] 

ASF GitHub Bot commented on DRILL-8474:
---

cgivre commented on code in PR #2909:
URL: https://github.com/apache/drill/pull/2909#discussion_r1582375084


##
exec/vector/src/main/java/org/apache/drill/exec/record/metadata/MapBuilder.java:
##
@@ -185,6 +192,26 @@ public MapBuilder resumeMap() {
 return (MapBuilder) parent;
   }
 
+  /**
+   * Depending on whether the parent is a schema builder or map builder
+   * we resume appropriately.
+   */
+  @Override
+  public void resume() {
+if (Objects.isNull(parent))

Review Comment:
   I just built Drill using the following command:
   
   ```sh
   mvn clean install -DskipTests
   ```
   When I did that, I was getting the same error as on GitHub.  After adding 
the braces as described above, it built without issues.
   With that said, I think you can do just run the check style with:
   
   ```sh
   mvn checkstyle:checkstyle
   ```
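   For reference, the braced form that satisfies the checkstyle rule (likely `NeedBraces`) can be sketched as below; the `parent` field and `resume` method only mirror the discussion and are illustrative:
   
   ```java
   import java.util.Objects;
   
   public class BraceStyle {
     // Hypothetical stand-in for MapBuilder's parent reference.
     static Object parent = null;
   
     static String resume() {
       // Checkstyle requires braces even for a single-statement if:
       if (Objects.isNull(parent)) {
         return "root";
       }
       return "child";
     }
   
     public static void main(String[] args) {
       System.out.println(resume());
     }
   }
   ```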





> Add Daffodil Format Plugin
> --
>
> Key: DRILL-8474
> URL: https://issues.apache.org/jira/browse/DRILL-8474
> Project: Apache Drill
>  Issue Type: New Feature
>Affects Versions: 1.21.1
>Reporter: Charles Givre
>Priority: Major
> Fix For: 1.22.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8474) Add Daffodil Format Plugin

2024-04-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841774#comment-17841774
 ] 

ASF GitHub Bot commented on DRILL-8474:
---

cgivre commented on code in PR #2909:
URL: https://github.com/apache/drill/pull/2909#discussion_r1582375084


##
exec/vector/src/main/java/org/apache/drill/exec/record/metadata/MapBuilder.java:
##
@@ -185,6 +192,26 @@ public MapBuilder resumeMap() {
 return (MapBuilder) parent;
   }
 
+  /**
+   * Depending on whether the parent is a schema builder or map builder
+   * we resume appropriately.
+   */
+  @Override
+  public void resume() {
+if (Objects.isNull(parent))

Review Comment:
   I just built Drill using the following command:
   
   ```sh
   mvn clean install -DskipTests
   ```
   
   I think you can do just run the check style with:
   
   ```sh
   mvn checkstyle:checkstyle
   ```





> Add Daffodil Format Plugin
> --
>
> Key: DRILL-8474
> URL: https://issues.apache.org/jira/browse/DRILL-8474
> Project: Apache Drill
>  Issue Type: New Feature
>Affects Versions: 1.21.1
>Reporter: Charles Givre
>Priority: Major
> Fix For: 1.22.0
>
>






[jira] [Commented] (DRILL-8474) Add Daffodil Format Plugin

2024-04-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841768#comment-17841768
 ] 

ASF GitHub Bot commented on DRILL-8474:
---

mbeckerle commented on code in PR #2909:
URL: https://github.com/apache/drill/pull/2909#discussion_r1582367382


##
exec/vector/src/main/java/org/apache/drill/exec/record/metadata/MapBuilder.java:
##
@@ -185,6 +192,26 @@ public MapBuilder resumeMap() {
 return (MapBuilder) parent;
   }
 
+  /**
+   * Depending on whether the parent is a schema builder or map builder
+   * we resume appropriately.
+   */
+  @Override
+  public void resume() {
+if (Objects.isNull(parent))

Review Comment:
   What is the maven command line to just make it run this checkstyle?





> Add Daffodil Format Plugin
> --
>
> Key: DRILL-8474
> URL: https://issues.apache.org/jira/browse/DRILL-8474
> Project: Apache Drill
>  Issue Type: New Feature
>Affects Versions: 1.21.1
>Reporter: Charles Givre
>Priority: Major
> Fix For: 1.22.0
>
>






[jira] [Commented] (DRILL-8474) Add Daffodil Format Plugin

2024-04-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841667#comment-17841667
 ] 

ASF GitHub Bot commented on DRILL-8474:
---

cgivre commented on code in PR #2909:
URL: https://github.com/apache/drill/pull/2909#discussion_r1582206247


##
exec/vector/src/main/java/org/apache/drill/exec/record/metadata/MapBuilder.java:
##
@@ -185,6 +192,26 @@ public MapBuilder resumeMap() {
 return (MapBuilder) parent;
   }
 
+  /**
+   * Depending on whether the parent is a schema builder or map builder
+   * we resume appropriately.
+   */
+  @Override
+  public void resume() {
+if (Objects.isNull(parent))

Review Comment:
   @mbeckerle Confirmed.  I successfully built your branch by adding the 
aforementioned braces.  I'll save you some additional trouble.  There's another 
checkstyle violation in `DaffodilBatchReader`.  Drill doesn't like star 
imports for some reason.





> Add Daffodil Format Plugin
> --
>
> Key: DRILL-8474
> URL: https://issues.apache.org/jira/browse/DRILL-8474
> Project: Apache Drill
>  Issue Type: New Feature
>Affects Versions: 1.21.1
>Reporter: Charles Givre
>Priority: Major
> Fix For: 1.22.0
>
>






[jira] [Commented] (DRILL-8474) Add Daffodil Format Plugin

2024-04-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841663#comment-17841663
 ] 

ASF GitHub Bot commented on DRILL-8474:
---

cgivre commented on code in PR #2909:
URL: https://github.com/apache/drill/pull/2909#discussion_r1582202511


##
exec/vector/src/main/java/org/apache/drill/exec/record/metadata/MapBuilder.java:
##
@@ -185,6 +192,26 @@ public MapBuilder resumeMap() {
 return (MapBuilder) parent;
   }
 
+  /**
+   * Depending on whether the parent is a schema builder or map builder
+   * we resume appropriately.
+   */
+  @Override
+  public void resume() {
+if (Objects.isNull(parent))

Review Comment:
   @mbeckerle I don't know why the checkstyle is telling you the wrong file, 
but here, you'll need braces as well as at line 203. 
   
   ie:
   ```java
   if (parent instanceof MapBuilder) {
 resumeMap();
   }
   ```
   
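   For reference, the brace requirement being flagged in this thread corresponds to Checkstyle's `NeedBraces` check. A minimal configuration sketch enabling it is shown below (this is an illustration only; Drill's actual checkstyle configuration may differ):

   ```xml
   <!-- Minimal Checkstyle config sketch enabling the NeedBraces check,
        which flags if/else/for/while/do bodies that omit '{}'.
        Illustrative only; Drill's real checkstyle config may differ. -->
   <module name="Checker">
     <module name="TreeWalker">
       <module name="NeedBraces"/>
     </module>
   </module>
   ```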





> Add Daffodil Format Plugin
> --
>
> Key: DRILL-8474
> URL: https://issues.apache.org/jira/browse/DRILL-8474
> Project: Apache Drill
>  Issue Type: New Feature
>Affects Versions: 1.21.1
>Reporter: Charles Givre
>Priority: Major
> Fix For: 1.22.0
>
>






[jira] [Commented] (DRILL-8474) Add Daffodil Format Plugin

2024-04-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841637#comment-17841637
 ] 

ASF GitHub Bot commented on DRILL-8474:
---

shfshihuafeng commented on PR #2909:
URL: https://github.com/apache/drill/pull/2909#issuecomment-2081475418

   > This fails its tests due to a maven checkstyle failure. It's complaining 
about Drill:Exec:Vectors, which my code has no changes to.
   > 
   > Can someone advise on what is wrong here?
   
   ```java
   if (Objects.isNull(parent)) {
     throw new IllegalStateException("Call to resume() on MapBuilder with no parent.");
   }
   ```




> Add Daffodil Format Plugin
> --
>
> Key: DRILL-8474
>     URL: https://issues.apache.org/jira/browse/DRILL-8474
> Project: Apache Drill
>  Issue Type: New Feature
>Affects Versions: 1.21.1
>Reporter: Charles Givre
>Priority: Major
> Fix For: 1.22.0
>
>






[jira] [Commented] (DRILL-8474) Add Daffodil Format Plugin

2024-04-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841636#comment-17841636
 ] 

ASF GitHub Bot commented on DRILL-8474:
---

shfshihuafeng commented on PR #2909:
URL: https://github.com/apache/drill/pull/2909#issuecomment-2081475241

   > This fails its tests due to a maven checkstyle failure. It's complaining 
about Drill:Exec:Vectors, which my code has no changes to.
   > 
   > Can someone advise on what is wrong here?
   
/home/runner/work/drill/drill/exec/vector/src/main/java/org/apache/drill/exec/record/metadata/MapBuilder.java:201:5
   You need to add braces; the 'if' construct must use '{}', like the following:
   
   ```java
   if (Objects.isNull(parent)) {
     throw new IllegalStateException("Call to resume() on MapBuilder with no parent.");
   }
   ```
 
   
   > This fails its tests due to a maven checkstyle failure. It's complaining 
about Drill:Exec:Vectors, which my code has no changes to.
   > 
   > Can someone advise on what is wrong here?
   
   
exec/vector/src/main/java/org/apache/drill/exec/record/metadata/MapBuilder.java:201
   
   I think you need to add {} for the if:
   ```java
   if (Objects.isNull(parent)) {
     throw new IllegalStateException("Call to resume() on MapBuilder with no parent.");
   }
   ```




> Add Daffodil Format Plugin
> --
>
> Key: DRILL-8474
> URL: https://issues.apache.org/jira/browse/DRILL-8474
> Project: Apache Drill
>  Issue Type: New Feature
>Affects Versions: 1.21.1
>Reporter: Charles Givre
>Priority: Major
> Fix For: 1.22.0
>
>






[jira] [Commented] (DRILL-8493) Drill Unable to Read XML Files with Namespaces

2024-04-27 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841556#comment-17841556
 ] 

ASF GitHub Bot commented on DRILL-8493:
---

cgivre merged PR #2908:
URL: https://github.com/apache/drill/pull/2908




> Drill Unable to Read XML Files with Namespaces
> --
>
> Key: DRILL-8493
> URL: https://issues.apache.org/jira/browse/DRILL-8493
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Format - XML
>Affects Versions: 1.21.1
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 1.21.2
>
>
> This is a bug fix whereby Drill ignores all data when an XML file has a 
> namespace.  





[jira] [Commented] (DRILL-8474) Add Daffodil Format Plugin

2024-04-27 Thread Mike Beckerle (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841537#comment-17841537
 ] 

Mike Beckerle commented on DRILL-8474:
--

PR for this ticket is now https://github.com/apache/drill/pull/2909

> Add Daffodil Format Plugin
> --
>
> Key: DRILL-8474
> URL: https://issues.apache.org/jira/browse/DRILL-8474
> Project: Apache Drill
>  Issue Type: New Feature
>Affects Versions: 1.21.1
>Reporter: Charles Givre
>Priority: Major
> Fix For: 1.22.0
>
>






[jira] [Commented] (DRILL-2835) IndexOutOfBoundsException in partition sender when doing streaming aggregate with LIMIT

2024-04-27 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-2835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841535#comment-17841535
 ] 

ASF GitHub Bot commented on DRILL-2835:
---

mbeckerle commented on PR #2909:
URL: https://github.com/apache/drill/pull/2909#issuecomment-2081179778

   This fails its tests due to a maven checkstyle failure. It's complaining 
about Drill:Exec:Vectors, which my code has no changes to. 
   
   Can someone advise on what is wrong here?
   




> IndexOutOfBoundsException in partition sender when doing streaming aggregate 
> with LIMIT 
> 
>
> Key: DRILL-2835
> URL: https://issues.apache.org/jira/browse/DRILL-2835
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - RPC
>Affects Versions: 0.8.0
>Reporter: Aman Sinha
>Assignee: Venki Korukanti
>Priority: Major
> Fix For: 0.9.0
>
> Attachments: DRILL-2835-1.patch, DRILL-2835-2.patch
>
>
> Following CTAS run on a TPC-DS 100GB scale factor on a 10-node cluster: 
> {code}
> alter session set `planner.enable_hashagg` = false;
> alter session set `planner.enable_multiphase_agg` = true;
> create table dfs.tmp.stream9 as 
> select cr_call_center_sk , cr_catalog_page_sk ,  cr_item_sk , cr_reason_sk , 
> cr_refunded_addr_sk , count(*) from catalog_returns_dri100 
>  group by cr_call_center_sk , cr_catalog_page_sk ,  cr_item_sk , cr_reason_sk 
> , cr_refunded_addr_sk
>  limit 100
> ;
> {code}
> {code}
> Caused by: java.lang.IndexOutOfBoundsException: index: 1023, length: 1 
> (expected: range(0, 0))
> at io.netty.buffer.DrillBuf.checkIndexD(DrillBuf.java:200) 
> ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
> at io.netty.buffer.DrillBuf.chk(DrillBuf.java:222) 
> ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
> at io.netty.buffer.DrillBuf.setByte(DrillBuf.java:621) 
> ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
> at 
> org.apache.drill.exec.vector.UInt1Vector$Mutator.set(UInt1Vector.java:342) 
> ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
> at 
> org.apache.drill.exec.vector.NullableBigIntVector$Mutator.set(NullableBigIntVector.java:372)
>  ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
> at 
> org.apache.drill.exec.vector.NullableBigIntVector.copyFrom(NullableBigIntVector.java:284)
>  ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
> at 
> org.apache.drill.exec.test.generated.PartitionerGen4$OutgoingRecordBatch.doEval(PartitionerTemplate.java:370)
>  ~[na:na]
> at 
> org.apache.drill.exec.test.generated.PartitionerGen4$OutgoingRecordBatch.copy(PartitionerTemplate.java:249)
>  ~[na:na]
> at 
> org.apache.drill.exec.test.generated.PartitionerGen4.doCopy(PartitionerTemplate.java:208)
>  ~[na:na]
> at 
> org.apache.drill.exec.test.generated.PartitionerGen4.partitionBatch(PartitionerTemplate.java:176)
>  ~[na:na]
> {code}





[jira] [Commented] (DRILL-8474) Add Daffodil Format Plugin

2024-04-27 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841530#comment-17841530
 ] 

ASF GitHub Bot commented on DRILL-8474:
---

mbeckerle closed pull request #2836: DRILL-8474: Add Daffodil Format Plugin
URL: https://github.com/apache/drill/pull/2836




> Add Daffodil Format Plugin
> --
>
> Key: DRILL-8474
> URL: https://issues.apache.org/jira/browse/DRILL-8474
> Project: Apache Drill
>  Issue Type: New Feature
>Affects Versions: 1.21.1
>Reporter: Charles Givre
>Priority: Major
> Fix For: 1.22.0
>
>






[jira] [Commented] (DRILL-8474) Add Daffodil Format Plugin

2024-04-27 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841531#comment-17841531
 ] 

ASF GitHub Bot commented on DRILL-8474:
---

mbeckerle commented on PR #2836:
URL: https://github.com/apache/drill/pull/2836#issuecomment-2081176156

   Creating a new squashed PR so as to avoid loss of the comments on this PR. 




> Add Daffodil Format Plugin
> --
>
> Key: DRILL-8474
> URL: https://issues.apache.org/jira/browse/DRILL-8474
> Project: Apache Drill
>  Issue Type: New Feature
>Affects Versions: 1.21.1
>Reporter: Charles Givre
>Priority: Major
> Fix For: 1.22.0
>
>






[jira] [Commented] (DRILL-8474) Add Daffodil Format Plugin

2024-04-27 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841528#comment-17841528
 ] 

ASF GitHub Bot commented on DRILL-8474:
---

mbeckerle commented on PR #2836:
URL: https://github.com/apache/drill/pull/2836#issuecomment-2081164073

   This now passes all the daffodil contrib tests using the published official 
Daffodil 3.7.0.
   
   It does not yet run in any scalable fashion, but the metadata/data 
interfacing is complete. 
   
   I would like to squash this to a single commit before merging, and it needs 
to be tested rebased onto the latest Drill commit. 




> Add Daffodil Format Plugin
> --
>
> Key: DRILL-8474
> URL: https://issues.apache.org/jira/browse/DRILL-8474
> Project: Apache Drill
>  Issue Type: New Feature
>Affects Versions: 1.21.1
>Reporter: Charles Givre
>Priority: Major
> Fix For: 1.22.0
>
>






[jira] [Commented] (DRILL-8493) Drill Unable to Read XML Files with Namespaces

2024-04-26 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841368#comment-17841368
 ] 

ASF GitHub Bot commented on DRILL-8493:
---

cgivre opened a new pull request, #2908:
URL: https://github.com/apache/drill/pull/2908

   # [DRILL-8493](https://issues.apache.org/jira/browse/DRILL-8493): Drill 
Unable to Read XML Files with Namespaces
   
   
   ## Description
   This PR fixes an issue whereby if an XML file has a namespace defined, Drill 
may ignore all data.
   
   
   ## Documentation
   No user facing changes.
   
   ## Testing
   Added unit test.




> Drill Unable to Read XML Files with Namespaces
> --
>
> Key: DRILL-8493
> URL: https://issues.apache.org/jira/browse/DRILL-8493
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Format - XML
>Affects Versions: 1.21.1
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 1.21.2
>
>
> This is a bug fix whereby Drill ignores all data when an XML file has a 
> namespace.  





[jira] [Created] (DRILL-8493) Drill Unable to Read XML Files with Namespaces

2024-04-26 Thread Charles Givre (Jira)
Charles Givre created DRILL-8493:


 Summary: Drill Unable to Read XML Files with Namespaces
 Key: DRILL-8493
 URL: https://issues.apache.org/jira/browse/DRILL-8493
 Project: Apache Drill
  Issue Type: Bug
  Components: Format - XML
Affects Versions: 1.21.1
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 1.21.2


This is a bug fix whereby Drill ignores all data when an XML file has a 
namespace.  





[jira] [Created] (DRILL-8492) Allow Parquet TIME_MICROS and TIMESTAMP_MICROS columns to be read as 64-bit integer values

2024-04-26 Thread Peter Franzen (Jira)
Peter Franzen created DRILL-8492:


 Summary: Allow Parquet TIME_MICROS and TIMESTAMP_MICROS  columns 
to be read as 64-bit integer values
 Key: DRILL-8492
 URL: https://issues.apache.org/jira/browse/DRILL-8492
 Project: Apache Drill
  Issue Type: Improvement
  Components: Storage - Parquet
Affects Versions: 1.21.1
Reporter: Peter Franzen


When reading Parquet columns of type {{time_micros}} and 
{{{}timestamp_micros{}}}, Drill truncates the microsecond values to 
milliseconds in order to convert them to SQL timestamps.

It is currently not possible to read the original microsecond values (as 64-bit 
values, not SQL timestamps) through Drill.

One solution for allowing reading the original 64-bit values is to add two 
options similar to "store.parquet.reader.int96_as_timestamp" to control whether 
microsecond
times and timestamps are truncated to millisecond timestamps or read as 
non-truncated 64-bit values.

These options would be added to {{org.apache.drill.exec.ExecConstants}} and
{{{}org.apache.drill.exec.server.options.SystemOptionManager{}}}.

They would also be added to "drill-module.conf":

{{   store.parquet.reader.time_micros_as_int64: false,}}
{{   store.parquet.reader.timestamp_micros_as_int64: false,}}

These options would then be used in the same places as 
{{{}store.parquet.reader.int96_as_timestamp{}}}:


 * org.apache.drill.exec.store.parquet.columnreaders.ColumnReaderFactory
 * org.apache.drill.exec.store.parquet.columnreaders.ParquetToDrillTypeConverter
 * org.apache.drill.exec.store.parquet2.DrillParquetGroupConverter



to create an int64 reader instead of a time/timestamp reader when the 
corresponding option is set to true.

In addition to this, 
{{org.apache.drill.exec.store.parquet.metadata.FileMetadataCollector }}must be 
altered to _not_ truncate the min and max values for 
time_micros/timestamp_micros if the corresponding option is true. This class 
doesn’t have a reference to an {{{}OptionManager{}}}, so the two new options 
must be extracted from the {{OptionManager}} when the {{ParquetReaderConfig}} 
instance is created.

Filtering on microsecond columns would be done using 64-bit values rather than 
TIME/TIMESTAMP values when the new options are true, e.g.

{{SELECT *  FROM  WHERE  = 1705914906694751;}}
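If the proposed options were adopted, enabling them would presumably follow Drill's usual session-option syntax. The sketch below uses the option names from the proposal above (they do not exist in released Drill) and a hypothetical table and column name for illustration:

{code:sql}
ALTER SESSION SET `store.parquet.reader.time_micros_as_int64` = true;
ALTER SESSION SET `store.parquet.reader.timestamp_micros_as_int64` = true;
-- event_time (hypothetical column) is then read as a raw BIGINT microsecond value
SELECT * FROM dfs.tmp.`events.parquet` WHERE event_time = 1705914906694751;
{code}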





[jira] [Updated] (DRILL-8491) MongoDB | Queries Conversion optimisation & using various mongoDB features

2024-04-18 Thread Piyush Shama (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Piyush Shama updated DRILL-8491:

Priority: Critical  (was: Major)

> MongoDB | Queries Conversion optimisation & using various mongoDB features
> --
>
> Key: DRILL-8491
> URL: https://issues.apache.org/jira/browse/DRILL-8491
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Piyush Shama
>Priority: Critical
>
> {*}Title{*}: Inefficient Query Translation and Underutilised Functions in SQL 
> to MongoDB Conversion Using Apache Drill
> {*}Description{*}: We have been experiencing significant performance issues 
> when using Apache Drill to convert SQL queries for use with MongoDB. It 
> appears that the SQL to MongoDB query translation process is not optimally 
> executed, leading to inefficient query operations and slow response times.
> {*}Details{*}:
>  # {*}Inefficient Query Translation{*}:
>  ** The translation of SQL queries into MongoDB-specific queries by Apache 
> Drill seems sub optimal. This inefficiency is particularly noticeable with 
> complex queries, where the expected execution plan does not align with 
> MongoDB's capabilities, resulting in slower query performance.
>  # {*}Underutilization of MongoDB Capabilities{*}:
>  ** Several MongoDB functionalities are not being fully utilised in the 
> translation process:
>  *** {*}Aggregation Operations{*}: Functions like {{{}SUM(){}}}, 
> {{{}AVG(){}}}, {{{}MIN(){}}}, and {{MAX()}} are either poorly translated or 
> not utilised, leading to potential performance degradation.
>  *** {*}Date Handling{*}: Extraction of date components (e.g., day from an 
> ISO date) within queries is not handled efficiently, forcing additional 
> processing overhead or client-side computations.
>  *** {*}Count Queries{*}: Simple count operations are not optimised, possibly 
> translating into more complex query forms than necessary.
> {*}Impact{*}: The current issues significantly affect the performance and 
> scalability of applications relying on Apache Drill for interacting with 
> MongoDB, particularly in data-heavy environments.
> {*}Expected Behaviour{*}:
>  * Queries translated from SQL to MongoDB should utilise MongoDB's native 
> query capabilities more effectively, ensuring that operations such as 
> aggregations, date extractions, and counts are executed in the most efficient 
> manner possible.
>  * The translation engine should optimise the query structure to leverage 
> MongoDB's strengths, particularly in handling large datasets.
> {*}Steps to Reproduce{*}:
>  # Set up Apache Drill with a MongoDB data source.
>  # Execute complex SQL queries involving aggregation, date extraction, and 
> count operations.
>  # Observe the generated MongoDB queries and resulting performance.
>  





[jira] [Updated] (DRILL-8491) MongoDB | Queries Conversion optimisation & using various mongoDB features

2024-04-18 Thread Piyush Shama (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Piyush Shama updated DRILL-8491:

Priority: Major  (was: Critical)

> MongoDB | Queries Conversion optimisation & using various mongoDB features
> --
>
> Key: DRILL-8491
> URL: https://issues.apache.org/jira/browse/DRILL-8491
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Piyush Shama
>Priority: Major
>
> {*}Title{*}: Inefficient Query Translation and Underutilised Functions in SQL 
> to MongoDB Conversion Using Apache Drill
> {*}Description{*}: We have been experiencing significant performance issues 
> when using Apache Drill to convert SQL queries for use with MongoDB. It 
> appears that the SQL to MongoDB query translation process is not optimally 
> executed, leading to inefficient query operations and slow response times.
> {*}Details{*}:
>  # {*}Inefficient Query Translation{*}:
>  ** The translation of SQL queries into MongoDB-specific queries by Apache 
> Drill seems sub optimal. This inefficiency is particularly noticeable with 
> complex queries, where the expected execution plan does not align with 
> MongoDB's capabilities, resulting in slower query performance.
>  # {*}Underutilization of MongoDB Capabilities{*}:
>  ** Several MongoDB functionalities are not being fully utilised in the 
> translation process:
>  *** {*}Aggregation Operations{*}: Functions like {{{}SUM(){}}}, 
> {{{}AVG(){}}}, {{{}MIN(){}}}, and {{MAX()}} are either poorly translated or 
> not utilised, leading to potential performance degradation.
>  *** {*}Date Handling{*}: Extraction of date components (e.g., day from an 
> ISO date) within queries is not handled efficiently, forcing additional 
> processing overhead or client-side computations.
>  *** {*}Count Queries{*}: Simple count operations are not optimised, possibly 
> translating into more complex query forms than necessary.
> {*}Impact{*}: The current issues significantly affect the performance and 
> scalability of applications relying on Apache Drill for interacting with 
> MongoDB, particularly in data-heavy environments.
> {*}Expected Behaviour{*}:
>  * Queries translated from SQL to MongoDB should utilise MongoDB's native 
> query capabilities more effectively, ensuring that operations such as 
> aggregations, date extractions, and counts are executed in the most efficient 
> manner possible.
>  * The translation engine should optimise the query structure to leverage 
> MongoDB's strengths, particularly in handling large datasets.
> {*}Steps to Reproduce{*}:
>  # Set up Apache Drill with a MongoDB data source.
>  # Execute complex SQL queries involving aggregation, date extraction, and 
> count operations.
>  # Observe the generated MongoDB queries and resulting performance.
>  





[jira] [Updated] (DRILL-8491) MongoDB | Queries Conversion optimisation & using various mongoDB features

2024-04-18 Thread Piyush Shama (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Piyush Shama updated DRILL-8491:

Description: 
{*}Title{*}: Inefficient Query Translation and Underutilised Functions in SQL 
to MongoDB Conversion Using Apache Drill

{*}Description{*}: We have been experiencing significant performance issues 
when using Apache Drill to convert SQL queries for use with MongoDB. It appears 
that the SQL to MongoDB query translation process is not optimally executed, 
leading to inefficient query operations and slow response times.

{*}Details{*}:
 # {*}Inefficient Query Translation{*}:

 ** The translation of SQL queries into MongoDB-specific queries by Apache 
Drill seems sub optimal. This inefficiency is particularly noticeable with 
complex queries, where the expected execution plan does not align with 
MongoDB's capabilities, resulting in slower query performance.
 # {*}Underutilization of MongoDB Capabilities{*}:

 ** Several MongoDB functionalities are not being fully utilised in the 
translation process:
 *** {*}Aggregation Operations{*}: Functions like {{{}SUM(){}}}, {{{}AVG(){}}}, 
{{{}MIN(){}}}, and {{MAX()}} are either poorly translated or not utilised, 
leading to potential performance degradation.
 *** {*}Date Handling{*}: Extraction of date components (e.g., day from an ISO 
date) within queries is not handled efficiently, forcing additional processing 
overhead or client-side computations.
 *** {*}Count Queries{*}: Simple count operations are not optimised, possibly 
translating into more complex query forms than necessary.

{*}Impact{*}: The current issues significantly affect the performance and 
scalability of applications relying on Apache Drill for interacting with 
MongoDB, particularly in data-heavy environments.

{*}Expected Behaviour{*}:
 * Queries translated from SQL to MongoDB should utilise MongoDB's native query 
capabilities more effectively, ensuring that operations such as aggregations, 
date extractions, and counts are executed in the most efficient manner possible.
 * The translation engine should optimise the query structure to leverage 
MongoDB's strengths, particularly in handling large datasets.

{*}Steps to Reproduce{*}:
 # Set up Apache Drill with a MongoDB data source.
 # Execute complex SQL queries involving aggregation, date extraction, and 
count operations.
 # Observe the generated MongoDB queries and resulting performance.

 

> MongoDB | Queries Conversion optimisation & using various mongoDB features
> --
>
> Key: DRILL-8491
> URL: https://issues.apache.org/jira/browse/DRILL-8491
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Piyush Shama
>Priority: Major
>
> {*}Title{*}: Inefficient Query Translation and Underutilised Functions in SQL 
> to MongoDB Conversion Using Apache Drill
> {*}Description{*}: We have been experiencing significant performance issues 
> when using Apache Drill to convert SQL queries for use with MongoDB. It 
> appears that the SQL to MongoDB query translation process is not optimally 
> executed, leading to inefficient query operations and slow response times.
> {*}Details{*}:
>  # {*}Inefficient Query Translation{*}:
>  ** The translation of SQL queries into MongoDB-specific queries by Apache 
> Drill seems sub optimal. This inefficiency is particularly noticeable with 
> complex queries, where the expected execution plan does not align with 
> MongoDB's capabilities, resulting in slower query performance.
>  # {*}Underutilization of MongoDB Capabilities{*}:
>  ** Several MongoDB functionalities are not being fully utilised in the 
> translation process:
>  *** {*}Aggregation Operations{*}: Functions like {{{}SUM(){}}}, 
> {{{}AVG(){}}}, {{{}MIN(){}}}, and {{MAX()}} are either poorly translated or 
> not utilised, leading to potential performance degradation.
>  *** {*}Date Handling{*}: Extraction of date components (e.g., day from an 
> ISO date) within queries is not handled efficiently, forcing additional 
> processing overhead or client-side computations.
>  *** {*}Count Queries{*}: Simple count operations are not optimised, possibly 
> translating into more complex query forms than necessary.
> {*}Impact{*}: The current issues significantly affect the performance and 
> scalability of applications relying on Apache Drill for interacting with 
> MongoDB, particularly in data-heavy environments.
> {*}Expected Behaviour{*}:
>  * Queries translated from SQL to MongoDB should utilise MongoDB's native 
> query capabilities more effectively, ensuring that operations such as 
> aggregations, date extractions, and counts are executed in the most effi

[jira] [Created] (DRILL-8491) MongoDB | Queries Conversion optimisation & using various mongoDB features

2024-04-18 Thread Piyush Shama (Jira)
Piyush Shama created DRILL-8491:
---

 Summary: MongoDB | Queries Conversion optimisation & using various 
mongoDB features
 Key: DRILL-8491
 URL: https://issues.apache.org/jira/browse/DRILL-8491
 Project: Apache Drill
  Issue Type: Improvement
Reporter: Piyush Shama








[jira] [Updated] (DRILL-8490) Sender operator fake memory leak result to sql failed and parent allocator exception

2024-04-17 Thread shihuafeng (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

shihuafeng updated DRILL-8490:
--
Summary: Sender operator fake memory leak result to sql  failed and parent 
allocator exception  (was: Sender operator fake memory leak result to sql  
failed or other exception)

> Sender operator fake memory leak result to sql  failed and parent allocator 
> exception
> -
>
> Key: DRILL-8490
> URL: https://issues.apache.org/jira/browse/DRILL-8490
> Project: Apache Drill
>  Issue Type: Bug
>  Components:  Server
>Affects Versions: 1.21.1
>Reporter: shihuafeng
>Priority: Major
> Fix For: 1.22.0
>
>






[jira] [Created] (DRILL-8490) Sender operator fake memory leak result to sql failed or other exception

2024-04-16 Thread shihuafeng (Jira)
shihuafeng created DRILL-8490:
-

 Summary: Sender operator fake memory leak result to sql  failed or 
other exception
 Key: DRILL-8490
 URL: https://issues.apache.org/jira/browse/DRILL-8490
 Project: Apache Drill
  Issue Type: Bug
  Components:  Server
Affects Versions: 1.21.1
Reporter: shihuafeng
 Fix For: 1.22.0








[jira] [Updated] (DRILL-8489) Sender memory leak when rpc encode exception

2024-04-16 Thread shihuafeng (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

shihuafeng updated DRILL-8489:
--
Description: 
When encode() throws an exception, Netty can release the message if it is an 
instance of ReferenceCounted; but Drill wraps the message in an 
OutboundRpcMessage, so Netty cannot release it. This causes sender memory leaks.
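The failure mode described above can be illustrated with a minimal, Netty-free sketch. The `RefCounted` interface and `Wrapper` class below are hypothetical stand-ins for Netty's `ReferenceCounted` and Drill's `OutboundRpcMessage`: once the ref-counted payload is hidden inside a plain wrapper, a framework-style `instanceof` cleanup no longer sees it, and its reference count is never decremented.

{code:java}
// Sketch of the leak: a ref-counted payload wrapped in a plain object is
// invisible to cleanup code that only releases RefCounted messages.
// RefCounted and Wrapper are hypothetical stand-ins for Netty's
// ReferenceCounted and Drill's OutboundRpcMessage.
class LeakSketch {
  interface RefCounted {
    void release();
    int refCnt();
  }

  static class Payload implements RefCounted {
    private int refCnt = 1;
    public void release() { refCnt--; }
    public int refCnt() { return refCnt; }
  }

  // Not RefCounted: hides the payload from the cleanup path.
  static class Wrapper {
    final Payload payload;
    Wrapper(Payload p) { this.payload = p; }
  }

  // Framework-style cleanup: only releases what it can recognize.
  static void releaseIfRefCounted(Object msg) {
    if (msg instanceof RefCounted) {
      ((RefCounted) msg).release();
    }
  }

  public static void main(String[] args) {
    Payload direct = new Payload();
    releaseIfRefCounted(direct);               // recognized -> released
    Payload wrapped = new Payload();
    releaseIfRefCounted(new Wrapper(wrapped)); // wrapper hides it -> leak
    System.out.println(direct.refCnt());       // 0
    System.out.println(wrapped.refCnt());      // 1 (leaked)
    // The fix direction discussed in DRILL-8489: on encode failure,
    // unwrap and release the inner payload explicitly.
    releaseIfRefCounted(wrapped);
    System.out.println(wrapped.refCnt());      // 0
  }
}
{code}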

exception info 
{code:java}
2024-04-16 16:25:57,998 [DataClient-7] ERROR o.a.d.exec.rpc.RpcExceptionHandler 
- Exception in RPC communication.  Connection: /10.32.112.138:47924 <--> 
/10.32.112.138:31012 (data client).  Closing connection.
io.netty.handler.codec.EncoderException: 
org.apache.drill.exec.exception.OutOfMemoryException: Unable to allocate buffer 
of size 4096 due to memory limit (9223372036854775807). Current allocation: 0
        at 
io.netty.handler.codec.MessageToMessageEncoder.write(MessageToMessageEncoder.java:107)
        at 
io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:881)
        at 
io.netty.channel.AbstractChannelHandlerContext.invokeWriteAndFlush(AbstractChannelHandlerContext.java:940)
        at 
io.netty.channel.AbstractChannelHandlerContext$WriteTask.run(AbstractChannelHandlerContext.java:1247)
        at 
io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173)
        at 
io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166)
        at 
io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569)
        at 
io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
        at 
io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
        at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.drill.exec.exception.OutOfMemoryException: Unable to 
allocate buffer of size 4096 due to memory limit (9223372036854775807). Current 
allocation: 0
        at 
org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:245)
        at 
org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:220)
        at 
org.apache.drill.exec.memory.DrillByteBufAllocator.buffer(DrillByteBufAllocator.java:55)
        at 
org.apache.drill.exec.memory.DrillByteBufAllocator.buffer(DrillByteBufAllocator.java:50)
        at org.apache.drill.exec.rpc.RpcEncoder.encode(safeRelease.java:87)
        at org.apache.drill.exec.rpc.RpcEncoder.encode(RpcEncoder.java:38)
        at 
io.netty.handler.codec.MessageToMessageEncoder.write(MessageToMessageEncoder.java:90){code}
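The wrapping problem can be sketched without Netty or Drill: an encoder that releases its message on failure only when the message itself is reference counted cannot see a buffer hidden inside a plain wrapper object. All names below (RefCounted, Wrapper, LeakSketch) are illustrative stand-ins, not Netty or Drill APIs:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Minimal stand-in for a reference-counted buffer (illustrative only).
class RefCounted {
    final AtomicInteger refCnt = new AtomicInteger(1);
    void release() { refCnt.decrementAndGet(); }
}

// Plain wrapper, analogous to OutboundRpcMessage: it holds the buffer
// but is not itself reference counted.
class Wrapper {
    final RefCounted payload;
    Wrapper(RefCounted payload) { this.payload = payload; }
}

public class LeakSketch {
    // On encode failure, release msg only if it is itself RefCounted.
    static void encode(Object msg) {
        try {
            throw new RuntimeException("simulated allocation failure during encode");
        } catch (RuntimeException e) {
            if (msg instanceof RefCounted) {
                ((RefCounted) msg).release();  // released: no leak
            }
            // a Wrapper is not RefCounted, so its payload is never released: leak
        }
    }

    public static void main(String[] args) {
        RefCounted direct = new RefCounted();
        encode(direct);
        System.out.println("direct refCnt = " + direct.refCnt.get());   // 0

        RefCounted wrapped = new RefCounted();
        encode(new Wrapper(wrapped));
        System.out.println("wrapped refCnt = " + wrapped.refCnt.get()); // 1, leaked
    }
}
```

This is why the wrapped buffer's count never reaches zero once encoding fails: the release logic inspects only the outermost message object.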

> Sender memory leak when rpc encode exception
> 
>
> Key: DRILL-8489
> URL: https://issues.apache.org/jira/browse/DRILL-8489
> Project: Apache Drill
>  Issue Type: Bug
>  Components:  Server
>Affects Versions: 1.21.1
>Reporter: shihuafeng
>Priority: Major
> Fix For: 1.22.0
>
>
> When encode throws an exception, Netty can release the message if it is an 
> instance of ReferenceCounted. However, Drill converts the message to an 
> OutboundRpcMessage, which is not reference counted, so Netty cannot release 
> it. This causes sender memory leaks.
> exception info 
> {code:java}
> 2024-04-16 16:25:57,998 [DataClient-7] ERROR 
> o.a.d.exec.rpc.RpcExceptionHandler - Exception in RPC communication.  
> Connection: /10.32.112.138:47924 <--> /10.32.112.138:31012 (data client).  
> Closing connection.
> io.netty.handler.codec.EncoderException: 
> org.apache.drill.exec.exception.OutOfMemoryException: Unable to allocate 
> buffer of size 4096 due to memory limit (9223372036854775807). Current 
> allocation: 0
>         at 
> io.netty.handler.codec.MessageToMessageEncoder.write(MessageToMessageEncoder.java:107)
>         at 
> io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:881)
>         at 
> io.netty.channel.AbstractChannelHandlerContext.invokeWriteAndFlush(AbstractChannelHandlerContext.java:940)
>         at 
> io.netty.channel.AbstractChannelHandlerContext$WriteTask.run(AbstractChannelHandlerContext.java:1247)
>         at 
> io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173)
>         at 
> io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166)
>         at 
> io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
>         at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569)
>         at 
> io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
>         at 
> io.netty.util.internal.ThreadExecutorMap

[jira] [Commented] (DRILL-8489) Sender memory leak when rpc encode exception

2024-04-16 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17837949#comment-17837949
 ] 

ASF GitHub Bot commented on DRILL-8489:
---

shfshihuafeng opened a new pull request, #2901:
URL: https://github.com/apache/drill/pull/2901

   # [DRILL-8489](https://issues.apache.org/jira/browse/DRILL-8489): Sender 
memory leak when rpc encode exception
   
   ## Description
   
   When encode throws an exception, Netty can release the message if it is an 
instance of ReferenceCounted. However, Drill converts the message to an 
OutboundRpcMessage, so Netty cannot release it. 
   
   ## Documentation
   (Please describe user-visible changes similar to what should appear in the 
Drill documentation.)
   
   ## Testing
   1. export DRILL_MAX_DIRECT_MEMORY=${DRILL_MAX_DIRECT_MEMORY:-"2G"}
   2. prepare TPC-H scale factor 1 data
   3. run TPC-H query 8
   
   ```
   select
   o_year,
   sum(case when nation = 'CHINA' then volume else 0 end) / sum(volume) as 
mkt_share
   from (
   select
   extract(year from o_orderdate) as o_year,
   l_extendedprice * 1.0 as volume,
   n2.n_name as nation
   from hive.tpch1s.part, hive.tpch1s.supplier, hive.tpch1s.lineitem, 
hive.tpch1s.orders, hive.tpch1s.customer, hive.tpch1s.nation n1, 
hive.tpch1s.nation n2, hive.tpch1s.region
   where
   p_partkey = l_partkey
   and s_suppkey = l_suppkey
   and l_orderkey = o_orderkey
   and o_custkey = c_custkey
   and c_nationkey = n1.n_nationkey
   and n1.n_regionkey = r_regionkey
   and r_name = 'ASIA'
   and s_nationkey = n2.n_nationkey
   and o_orderdate between date '1995-01-01'
   and date '1996-12-31'
   and p_type = 'LARGE BRUSHED BRASS') as all_nations
   group by o_year
   order by o_year;   
   ```
   
   5. This scenario is relatively easy to reproduce by running the following 
script:
   ```
   drill_home=/data/shf/apache-drill-1.22.0-SNAPSHOT/bin
   fileName=/data/shf/1s/shf.txt

   random_sql(){
     while true
     do
       if [ -f "$fileName" ]; then
         echo "$fileName exists"
         exit 0
       else
         $drill_home/sqlline -u \"jdbc:drill:zk=jupiter-2:2181/drill_shf/jupiterbits_shf1\" -f tpch_sql8.sql >> sql8.log 2>&1
       fi
     done
   }

   main(){
     unset HADOOP_CLASSPATH
     # TPC-H power test: launch 25 concurrent query loops
     for i in `seq 1 25`
     do
       random_sql &
     done
   }

   main
   ```




> Sender memory leak when rpc encode exception
> 
>
> Key: DRILL-8489
> URL: https://issues.apache.org/jira/browse/DRILL-8489
> Project: Apache Drill
>  Issue Type: Bug
>  Components:  Server
>Affects Versions: 1.21.1
>Reporter: shihuafeng
>Priority: Major
> Fix For: 1.22.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (DRILL-8489) Sender memory leak when rpc encode exception

2024-04-16 Thread shihuafeng (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

shihuafeng updated DRILL-8489:
--
Summary: Sender memory leak when rpc encode exception  (was: sender memory)

> Sender memory leak when rpc encode exception
> 
>
> Key: DRILL-8489
> URL: https://issues.apache.org/jira/browse/DRILL-8489
> Project: Apache Drill
>  Issue Type: Bug
>  Components:  Server
>Affects Versions: 1.21.1
>Reporter: shihuafeng
>Priority: Major
> Fix For: 1.22.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (DRILL-8489) sender memory

2024-04-16 Thread shihuafeng (Jira)
shihuafeng created DRILL-8489:
-

 Summary: sender memory
 Key: DRILL-8489
 URL: https://issues.apache.org/jira/browse/DRILL-8489
 Project: Apache Drill
  Issue Type: Bug
  Components:  Server
Affects Versions: 1.21.1
Reporter: shihuafeng
 Fix For: 1.22.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8488) HashJoinPOP memory leak is caused by OutOfMemoryException

2024-04-16 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17837525#comment-17837525
 ] 

ASF GitHub Bot commented on DRILL-8488:
---

shfshihuafeng opened a new pull request, #2900:
URL: https://github.com/apache/drill/pull/2900

   # [DRILL-8488](https://issues.apache.org/jira/browse/DRILL-8488): 
HashJoinPOP memory leak is caused by  OutOfMemoryException
   
   
   ## Description
   
   We should catch the OutOfMemoryException instead of OutOfMemoryError
   
   ```
   public DrillBuf buffer(final int initialRequestSize, BufferManager manager) {
     assertOpen();

     Preconditions.checkArgument(initialRequestSize >= 0,
         "the requested size must be non-negative");

     if (initialRequestSize == 0) {
       return empty;
     }

     // round to next largest power of two if we're within a chunk since that
     // is how our allocator operates
     final int actualRequestSize = initialRequestSize < CHUNK_SIZE
         ? nextPowerOfTwo(initialRequestSize)
         : initialRequestSize;
     AllocationOutcome outcome = allocateBytes(actualRequestSize);
     if (!outcome.isOk()) {
       // throws OutOfMemoryException (unchecked), not java.lang.OutOfMemoryError
       throw new OutOfMemoryException(createErrorMsg(this, actualRequestSize, initialRequestSize));
     }
   }
   ```
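The distinction matters because Drill's OutOfMemoryException is an unchecked exception unrelated to java.lang.OutOfMemoryError, so a `catch (OutOfMemoryError e)` handler never sees it. A minimal stand-alone illustration — the OutOfMemoryException class below is a stand-in for Drill's, assumed only to extend RuntimeException:

```java
// Stand-in for org.apache.drill.exec.exception.OutOfMemoryException,
// which extends RuntimeException, not java.lang.OutOfMemoryError.
class OutOfMemoryException extends RuntimeException {
    OutOfMemoryException(String msg) { super(msg); }
}

public class CatchSketch {
    static String tryAllocate() {
        try {
            throw new OutOfMemoryException("Unable to allocate buffer");
        } catch (OutOfMemoryError e) {
            // OutOfMemoryException does not extend Error, so this never matches it
            return "caught Error";
        } catch (OutOfMemoryException e) {
            // this is the handler that actually runs
            return "caught Exception";
        }
    }

    public static void main(String[] args) {
        System.out.println(tryAllocate()); // prints "caught Exception"
    }
}
```

An operator that only catches OutOfMemoryError would therefore let the allocator's exception propagate past its cleanup path, which is the leak scenario this PR addresses.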
   
   ## Documentation
   (Please describe user-visible changes similar to what should appear in the 
Drill documentation.)
   
   ## Testing
   [DRILL-8485](https://issues.apache.org/jira/browse/DRILL-8485)
   




> HashJoinPOP memory leak is caused by  OutOfMemoryException
> --
>
> Key: DRILL-8488
> URL: https://issues.apache.org/jira/browse/DRILL-8488
> Project: Apache Drill
>  Issue Type: Bug
>  Components:  Server
>Affects Versions: 1.21.1
>Reporter: shihuafeng
>Priority: Major
> Fix For: 1.22.0
>
>
> [DRILL-8485|https://issues.apache.org/jira/browse/DRILL-8485] HashJoinPOP 
> memory leak is caused by an oom exception when read data from InputStream



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (DRILL-8488) HashJoinPOP memory leak is caused by OutOfMemoryException

2024-04-15 Thread shihuafeng (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

shihuafeng updated DRILL-8488:
--
Summary: HashJoinPOP memory leak is caused by  OutOfMemoryException  (was: 
HashJoinPOP memory leak is caused by  an oom exception)

> HashJoinPOP memory leak is caused by  OutOfMemoryException
> --
>
> Key: DRILL-8488
> URL: https://issues.apache.org/jira/browse/DRILL-8488
> Project: Apache Drill
>  Issue Type: Bug
>  Components:  Server
>Affects Versions: 1.21.1
>Reporter: shihuafeng
>Priority: Major
> Fix For: 1.22.0
>
>
> [DRILL-8485|https://issues.apache.org/jira/browse/DRILL-8485] HashJoinPOP 
> memory leak is caused by an oom exception when read data from InputStream



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (DRILL-8488) HashJoinPOP memory leak is caused by an oom exception

2024-04-15 Thread shihuafeng (Jira)
shihuafeng created DRILL-8488:
-

 Summary: HashJoinPOP memory leak is caused by  an oom exception
 Key: DRILL-8488
 URL: https://issues.apache.org/jira/browse/DRILL-8488
 Project: Apache Drill
  Issue Type: Bug
  Components:  Server
Affects Versions: 1.21.1
Reporter: shihuafeng
 Fix For: 1.22.0


[DRILL-8485|https://issues.apache.org/jira/browse/DRILL-8485] HashJoinPOP memory 
leak is caused by an oom exception when read data from InputStream



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (DRILL-8446) Incorrect use of OperatingSystemMXBean

2024-04-15 Thread James Turton (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Turton closed DRILL-8446.
---
Resolution: Fixed

> Incorrect use of OperatingSystemMXBean
> --
>
> Key: DRILL-8446
> URL: https://issues.apache.org/jira/browse/DRILL-8446
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.21.1
>Reporter: Mahmoud Ouali Alami
>Assignee: James Turton
>Priority: Major
> Fix For: 1.21.2
>
> Attachments: image-2023-07-04-15-36-42-905.png, 
> image-2023-07-04-16-24-59-662.png
>
>
> *Context:* 
> In Drill's "CpuGaugeSet" class, we use an internal class instead of a public 
> one: com.sun.management.OperatingSystemMXBean.
> !image-2023-07-04-15-36-42-905.png|width=387,height=257!
> This can result in a NoClassDefFoundError:
> !image-2023-07-04-16-24-59-662.png|width=845,height=108!  
> *To do:* 
> Replace the internal class "com.sun.management.OperatingSystemMXBean" with 
> "java.lang.management.OperatingSystemMXBean".
>  
> Kind regards,
> Mahmoud
>  
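A minimal sketch of the suggested replacement, using only the public java.lang.management API. Note that the richer per-process CPU counters live only on the com.sun.management subtype, so code that needs them typically guards with an instanceof check rather than importing the internal interface unconditionally:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

public class OsBeanSketch {
    public static void main(String[] args) {
        // Public, always-present interface; no dependency on com.sun.management,
        // so no risk of NoClassDefFoundError on JVMs that lack the internal class.
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        System.out.println("arch = " + os.getArch());
        System.out.println("processors = " + os.getAvailableProcessors());
        // May be -1.0 on platforms where the load average is unavailable.
        System.out.println("load avg = " + os.getSystemLoadAverage());
    }
}
```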



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (DRILL-8479) Merge Join Memory Leak Depleting Incoming Batches Throw Exception

2024-04-15 Thread James Turton (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Turton closed DRILL-8479.
---
Resolution: Fixed

> Merge Join Memory Leak Depleting Incoming Batches Throw Exception
> -
>
> Key: DRILL-8479
> URL: https://issues.apache.org/jira/browse/DRILL-8479
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.21.1
>Reporter: shihuafeng
>Priority: Critical
> Fix For: 1.21.2
>
> Attachments: 0001-mergejoin-leak.patch
>
>
> *Describe the bug*
> MergeJoin leaks memory when RecordIterator fails to allocate memory with an 
> OutOfMemoryException.
> *Steps to reproduce the behavior:*
>  # prepare data for TPC-H scale factor 1
>  # set direct memory to 5g
>  # set planner.enable_hashjoin = false to ensure the MergeJoin operator is used
>  # set drill.memory.debug.allocator = true (to check for memory leaks)
>  # run 20 concurrent TPC-H sql8 queries
>  # when an OutOfMemoryException or a null exception occurs, stop all queries
>  # look for the memory leak
> *Expected behavior*
> When all queries have stopped, direct memory should be 0 and no leak log like 
> the following should appear.
> {code:java}
> Allocator(op:2:0:11:MergeJoinPOP) 100/73728/4874240/100 
> (res/actual/peak/limit){code}
> *Error detail, log output or screenshots*
> {code:java}
> Unable to allocate buffer of size XX (rounded from XX) due to memory limit 
> (). Current allocation: xx{code}
> [^0001-mergejoin-leak.patch]
> sql 
> {code:java}
> // code placeholder
> select o_year, sum(case when nation = 'CHINA' then volume else 0 end) / 
> sum(volume) as mkt_share from ( select extract(year from o_orderdate) as 
> o_year, l_extendedprice * 1.0 as volume, n2.n_name as nation from 
> hive.tpch1s.part, hive.tpch1s.supplier, hive.tpch1s.lineitem, 
> hive.tpch1s.orders, hive.tpch1s.customer, hive.tpch1s.nation n1, 
> hive.tpch1s.nation n2, hive.tpch1s.region where p_partkey = l_partkey and 
> s_suppkey = l_suppkey and l_orderkey = o_orderkey and o_custkey = c_custkey 
> and c_nationkey = n1.n_nationkey and n1.n_regionkey = r_regionkey and r_name 
> = 'ASIA' and s_nationkey = n2.n_nationkey and o_orderdate between date 
> '1995-01-01' and date '1996-12-31' and p_type = 'LARGE BRUSHED BRASS') as 
> all_nations group by o_year order by o_year
> {code}
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


  1   2   3   4   5   6   7   8   9   10   >