[jira] [Created] (DRILL-8501) Json Conversion UDF Not Respecting System JSON Options

2024-07-04 Thread Charles Givre (Jira)
Charles Givre created DRILL-8501:


 Summary: Json Conversion UDF Not Respecting System JSON Options
 Key: DRILL-8501
 URL: https://issues.apache.org/jira/browse/DRILL-8501
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - JSON
Affects Versions: 1.21.2
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 1.22.0


The convert_fromJSON() UDF does not respect the system JSON options of 
allTextMode and readAllNumbersAsDouble.  

This PR fixes that.
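
As a rough illustration of what the two options mean (not Drill's implementation), the sketch below mimics them with the number hooks of Python's standard json module. The helper name parse_json and its keyword arguments are hypothetical, and only numeric scalars are covered; booleans and nulls are left as they are.

```python
import json


def parse_json(text, all_text_mode=False, read_all_numbers_as_double=False):
    """Parse JSON text while mimicking the two reader options (hypothetical helper)."""
    if all_text_mode:
        # Keep every numeric scalar as the text that appeared in the input.
        return json.loads(text, parse_int=str, parse_float=str)
    if read_all_numbers_as_double:
        # Promote integers to floats so a column never mixes the two types.
        return json.loads(text, parse_int=float)
    return json.loads(text)
```

A conversion function that ignores these switches would behave like the last branch regardless of the settings, which is the symptom described above.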



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (DRILL-8494) HTTP Caching Not Saving Pages

2024-05-06 Thread Charles Givre (Jira)
Charles Givre created DRILL-8494:


 Summary: HTTP Caching Not Saving Pages
 Key: DRILL-8494
 URL: https://issues.apache.org/jira/browse/DRILL-8494
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - HTTP
Affects Versions: 1.21.1
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 1.21.2


A minor bug fix: the HTTP storage plugin was not actually caching results 
even when caching was set to true.  This bug was introduced in DRILL-8329.





[jira] [Created] (DRILL-8493) Drill Unable to Read XML Files with Namespaces

2024-04-26 Thread Charles Givre (Jira)
Charles Givre created DRILL-8493:


 Summary: Drill Unable to Read XML Files with Namespaces
 Key: DRILL-8493
 URL: https://issues.apache.org/jira/browse/DRILL-8493
 Project: Apache Drill
  Issue Type: Bug
  Components: Format - XML
Affects Versions: 1.21.1
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 1.21.2


This fixes a bug whereby Drill ignored all data when an XML file had a 
namespace.  





[jira] [Updated] (DRILL-8481) Ability to query XML root attributes

2024-02-28 Thread Charles Givre (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Givre updated DRILL-8481:
-
Fix Version/s: 1.21.2

> Ability to query XML root attributes
> 
>
> Key: DRILL-8481
> URL: https://issues.apache.org/jira/browse/DRILL-8481
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - XML
>Affects Versions: 1.21.1
>Reporter: benj
>Assignee: Charles Givre
>Priority: Major
> Fix For: 1.21.2
>
>
> Hi,
> It is possible to retrieve field attributes, except those of the root.
> It would be useful to be able to retrieve the attributes found in the 
> root node of XML files.
> In my common use cases, I have many XML files, each containing a single XML 
> frame, often with one or more attributes in the root tag.
> To recover this value, I am currently forced to preprocess the files to 
> "copy" this attribute into the fields of the XML record.
> Even with multiple XML records under the root, it would be useful for 
> the root attributes to be accessible for each record.
> Example (file aaa.xml): 
> {noformat}
> 
> 
> blue
> 
> {noformat}
> With the query: 
> {code:sql}
> SELECT * FROM(SELECT filename, * FROM TABLE(dfs.test.`/aaa.xml`(type=>'xml', 
> dataLevel=>1)) as xml) AS x;
> {code}
> I can access:
> * P1_SubVersion
> * P1_MID
> * P1_PN
> * P1_SL
> * P2_SubVersion
> * P2.Color
> But I cannot access:
> * PPP_Version
> * PPP_TimeStamp
> Changing dataLevel does not solve the problem.
> Regards,





[jira] [Commented] (DRILL-8481) Ability to query XML root attributes

2024-02-28 Thread Charles Givre (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17821723#comment-17821723
 ] 

Charles Givre commented on DRILL-8481:
--

[~benj641] 

I just submitted a bug fix.  [https://github.com/apache/drill/pull/2884]

If you can review and test it, I'd appreciate it. 



[jira] [Assigned] (DRILL-8481) Ability to query XML root attributes

2024-02-28 Thread Charles Givre (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Givre reassigned DRILL-8481:


Assignee: Charles Givre



[jira] [Commented] (DRILL-8481) Ability to query XML root attributes

2024-02-27 Thread Charles Givre (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17821285#comment-17821285
 ] 

Charles Givre commented on DRILL-8481:
--

[~benj641] Thanks for submitting.  Are you actively working on this or is this 
just a bug report?



[jira] [Commented] (DRILL-8474) Add Daffodil Format Plugin

2024-01-03 Thread Charles Givre (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17802191#comment-17802191
 ] 

Charles Givre commented on DRILL-8474:
--

[https://github.com/apache/drill/pull/2836]

> Add Daffodil Format Plugin
> --
>
> Key: DRILL-8474
> URL: https://issues.apache.org/jira/browse/DRILL-8474
> Project: Apache Drill
>  Issue Type: New Feature
>Affects Versions: 1.21.1
>Reporter: Charles Givre
>Priority: Major
> Fix For: 1.22.0
>
>






[jira] [Created] (DRILL-8474) Add Daffodil Format Plugin

2024-01-03 Thread Charles Givre (Jira)
Charles Givre created DRILL-8474:


 Summary: Add Daffodil Format Plugin
 Key: DRILL-8474
 URL: https://issues.apache.org/jira/browse/DRILL-8474
 Project: Apache Drill
  Issue Type: New Feature
Affects Versions: 1.21.1
Reporter: Charles Givre
 Fix For: 1.22.0








[jira] [Created] (DRILL-8472) Bump Image Metadata Library to Latest Version

2024-01-01 Thread Charles Givre (Jira)
Charles Givre created DRILL-8472:


 Summary: Bump Image Metadata Library to Latest Version
 Key: DRILL-8472
 URL: https://issues.apache.org/jira/browse/DRILL-8472
 Project: Apache Drill
  Issue Type: Task
Affects Versions: 1.21.1
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 1.21.2


Bump Metadata Extractor dependency to latest version.





[jira] [Created] (DRILL-8471) Bump DeltaLake Driver to Version 3.0.0

2024-01-01 Thread Charles Givre (Jira)
Charles Givre created DRILL-8471:


 Summary: Bump DeltaLake Driver to Version 3.0.0
 Key: DRILL-8471
 URL: https://issues.apache.org/jira/browse/DRILL-8471
 Project: Apache Drill
  Issue Type: Task
  Components: Format - DeltaLake
Reporter: Charles Givre


Bump DeltaLake Driver to Version 3.0.0





[jira] [Created] (DRILL-8470) Bump MongoDB Driver to Latest Version

2024-01-01 Thread Charles Givre (Jira)
Charles Givre created DRILL-8470:


 Summary: Bump MongoDB Driver to Latest Version
 Key: DRILL-8470
 URL: https://issues.apache.org/jira/browse/DRILL-8470
 Project: Apache Drill
  Issue Type: Task
  Components: Storage - MongoDB
Affects Versions: 1.21.1
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 1.21.2


Bump mongoDB driver to latest version.





[jira] [Created] (DRILL-8461) Prevent XXE Attacks in XML Format Plugin

2023-11-03 Thread Charles Givre (Jira)
Charles Givre created DRILL-8461:


 Summary: Prevent XXE Attacks in XML Format Plugin
 Key: DRILL-8461
 URL: https://issues.apache.org/jira/browse/DRILL-8461
 Project: Apache Drill
  Issue Type: Bug
  Components: Format - XML
Affects Versions: 1.21.1
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 1.22.0


Drill's XML reader would allow a maliciously crafted XML file to perform an 
_XML eXternal Entity injection_ (XXE)  attack.  This fix disables DTD parsing 
in the XML format plugin and prevents XXE attacks.
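
The idea of refusing DTDs can be sketched in a few lines. Drill's reader is written in Java, so the Python below is only an analogy built on the standard expat bindings, not the actual patch; the names parse_without_dtd and DtdForbidden are hypothetical.

```python
import xml.parsers.expat


class DtdForbidden(ValueError):
    """Raised when a document contains a DOCTYPE declaration."""


def parse_without_dtd(xml_text):
    """Collect element names, refusing any document that declares a DTD."""
    names = []
    parser = xml.parsers.expat.ParserCreate()

    def reject_doctype(name, system_id, public_id, has_internal_subset):
        # Custom entities (including external ones) can only be declared in a
        # DTD, so refusing the DOCTYPE removes the XXE vector.
        raise DtdForbidden("DOCTYPE is not allowed: " + name)

    parser.StartDoctypeDeclHandler = reject_doctype
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    parser.Parse(xml_text, True)
    return names
```

Rejecting the DOCTYPE outright is stricter than merely disabling external entity resolution, but it also rules out entity expansion attacks that rely on internal DTD subsets.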





[jira] [Created] (DRILL-8453) Add XSD Support to XML Reader (Part 1)

2023-08-21 Thread Charles Givre (Jira)
Charles Givre created DRILL-8453:


 Summary: Add XSD Support to XML Reader (Part 1)
 Key: DRILL-8453
 URL: https://issues.apache.org/jira/browse/DRILL-8453
 Project: Apache Drill
  Issue Type: Improvement
  Components: Format - XML
Affects Versions: 1.21.1
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 1.21.2


This PR is a part of a series to add better support for reading XML data to 
Drill.  One of the main challenges is that XML data does not have a way of 
inferring data types, nor does it have a way of detecting arrays.  

The only way to do this really well is to have a schema.  Some XML files link a 
schema definition file to the data.  This PR adds the capability for Drill to 
map XSD schema files into Drill schemas.  

The current plan is as follows: Part 1 of this PR simply adds the reader but 
adds no new user detectable functionality.  Part 2 will include the actual 
integration with the XML reader.  Part 3 will include the ability to read 
arrays.





[jira] [Updated] (DRILL-8450) Add Data Type Inference to XML Format Plugin

2023-08-08 Thread Charles Givre (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Givre updated DRILL-8450:
-
Description: 
This PR adds data type inference to the XML format plugin.  In similar fashion 
to other plugins, it adds a new configuration parameter: allTextMode, which 
when set to true, reads all data as strings.  The default is true.

Note that the inference is limited to doubles, date, timestamps, boolean and 
strings.
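
A minimal sketch of this kind of inference follows, assuming ISO-8601 dates and timestamps (the formats the plugin actually recognizes are not stated here). The function name infer_value is hypothetical and the number parsing follows Python's float() rules rather than Drill's.

```python
import math
from datetime import date, datetime


def infer_value(text, all_text_mode=False):
    """Convert an XML text value to boolean, double, date, timestamp or string."""
    if all_text_mode:
        return text  # all-text mode: leave every value as a string
    stripped = text.strip()
    if stripped.lower() in ("true", "false"):
        return stripped.lower() == "true"
    try:
        number = float(stripped)  # every number becomes a double
        if math.isfinite(number):
            return number
    except ValueError:
        pass
    try:
        return date.fromisoformat(stripped)  # assumed format: YYYY-MM-DD
    except ValueError:
        pass
    try:
        return datetime.fromisoformat(stripped)  # assumed ISO-8601 timestamp
    except ValueError:
        pass
    return text  # fall back to a string
```

Because each value is inspected on its own, a column whose values disagree would need a further reconciliation step, which this sketch leaves out.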

> Add Data Type Inference to XML Format Plugin
> 
>
> Key: DRILL-8450
> URL: https://issues.apache.org/jira/browse/DRILL-8450
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Format - XML
>Affects Versions: 1.21.1
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 1.22.0


[jira] [Created] (DRILL-8450) Add Data Type Inference to XML Format Plugin

2023-08-08 Thread Charles Givre (Jira)
Charles Givre created DRILL-8450:


 Summary: Add Data Type Inference to XML Format Plugin
 Key: DRILL-8450
 URL: https://issues.apache.org/jira/browse/DRILL-8450
 Project: Apache Drill
  Issue Type: Improvement
  Components: Format - XML
Affects Versions: 1.21.1
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 1.22.0








[jira] [Commented] (DRILL-8439) Getting col__ prefix for columns that are not special when extractHeader is enabled

2023-05-31 Thread Charles Givre (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17728079#comment-17728079
 ] 

Charles Givre commented on DRILL-8439:
--

Can you please verify that the affected column in the CSV file doesn't have any 
other leading characters?  Please check for carriage returns and other 
invisible Unicode characters.  The fact that Drill is inserting an extra 
underscore leads me to believe there could be some extra garbage in that field.

In any event, can't you just query this by giving it an alias?

For example:

{{SELECT `col__PRODUCTID_` AS product_id ...}}
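
One way to run the check suggested above is a small script that names any control, format, or unusual space characters in the header row. This is only a sketch: the function name suspicious_header_chars is hypothetical, and the choice of Unicode categories (Cc, Cf, Zs other than a plain space) is an assumption about what counts as "invisible".

```python
import csv
import io
import unicodedata


def suspicious_header_chars(csv_text):
    """Map each header field to the names of odd characters it contains."""
    header = next(csv.reader(io.StringIO(csv_text, newline="")))
    report = {}
    for name in header:
        odd = [
            # Control characters have no Unicode name, so fall back to U+XXXX.
            unicodedata.name(ch, "U+%04X" % ord(ch))
            for ch in name
            if unicodedata.category(ch) in ("Cc", "Cf", "Zs") and ch != " "
        ]
        if odd:
            report[name] = odd
    return report
```

A file saved with a byte order mark, for instance, would report ZERO WIDTH NO-BREAK SPACE on the first column only, which would match a symptom that affects just one column.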

> Getting col__ prefix for columns that are not special when extractHeader is 
> enabled
> ---
>
> Key: DRILL-8439
> URL: https://issues.apache.org/jira/browse/DRILL-8439
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Metadata, SQL Parser
>Affects Versions: 1.21.0
> Environment: Enabled {{extractHeader}} in the csv config of dfs 
> plugin.
> No. of drillbits: Single
> OS: Windows
>Reporter: Diksha Chaturvedi
>Priority: Major
>  Labels: drill, extractHeader
>
> As per the documentation, Drill prepends col_ to column names that start with a 
> number or a special character.
> {code:java}
> /**
>  * Prefix used to replace non-alphabetic characters at the start of
>  * a column name. For example, $foo becomes col_foo. Used
>  * because SQL does not allow _foo.
>  */
> public static final String COLUMN_PREFIX = "col_";
> {code}
> But in my case I'm getting it even for an all-alphabetic column name.
> 
> I have the following data in the CSV file,
> ||PRODUCTID||PRODUCTNAME||SUPPLIERID||CATEGORYID||UNIT||PRICE||
> |1|Chais|1|1|10 boxes x 20 bags|18|
> |2|Chang|1|1|24 - 12 oz bottles|19|
> |3|Aniseed Syrup|1|2|12 - 550 ml bottles|10|
> |4|Chef Anton's Cajun Seasoning|2|2|48 - 6 oz jars|22|
> |5|Chef Anton's Gumbo Mix|2|2|36 boxes|21.35|
>  
> While querying on the csv file using following query:
> {code:sql}
> SELECT * FROM dfs.`/var/lib/PRODUCT.csv`{code}
> The output is 
> [!https://i.stack.imgur.com/FBNmn.png|width=611,height=130!|https://i.stack.imgur.com/FBNmn.png]
> 
> I know about other criteria, such as:
> {{#UNITS}} is changed to {{col_UNITS}}
> {{FINANCIAL$RECORD}} is changed to {{FINANCIAL_RECORD}}
> But what about {{PRODUCTID}}? Why is it changed to 
> {{col___PRODUCTID__}}? In this case extra underscores have also been added. 





[jira] [Created] (DRILL-8438) Bump YAUAA to 7.19.2

2023-05-23 Thread Charles Givre (Jira)
Charles Givre created DRILL-8438:


 Summary: Bump YAUAA to 7.19.2
 Key: DRILL-8438
 URL: https://issues.apache.org/jira/browse/DRILL-8438
 Project: Apache Drill
  Issue Type: Task
  Components: Functions - Drill
Reporter: Charles Givre
Assignee: Niels Basjes


Bump YAUAA to latest version.





[jira] [Created] (DRILL-8437) Add Header Index Pagination

2023-05-21 Thread Charles Givre (Jira)
Charles Givre created DRILL-8437:


 Summary: Add Header Index Pagination
 Key: DRILL-8437
 URL: https://issues.apache.org/jira/browse/DRILL-8437
 Project: Apache Drill
  Issue Type: Improvement
  Components: Storage - HTTP
Affects Versions: 1.21.1
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 1.22.0


Some APIs include pagination fields in the HTTP response headers.  This PR adds 
a new pagination method called Header Index which supports that.





[jira] [Created] (DRILL-8434) Add Median Function

2023-05-15 Thread Charles Givre (Jira)
Charles Givre created DRILL-8434:


 Summary: Add Median Function
 Key: DRILL-8434
 URL: https://issues.apache.org/jira/browse/DRILL-8434
 Project: Apache Drill
  Issue Type: Improvement
  Components: Functions - Drill
Affects Versions: 1.21.1
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 1.22.0


Adds a median function to Drill. 





[jira] [Created] (DRILL-8433) Add Percent Change UDF to Drill

2023-05-09 Thread Charles Givre (Jira)
Charles Givre created DRILL-8433:


 Summary: Add Percent Change UDF to Drill
 Key: DRILL-8433
 URL: https://issues.apache.org/jira/browse/DRILL-8433
 Project: Apache Drill
  Issue Type: Improvement
  Components: Functions - Drill
Affects Versions: 1.21.1
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 1.22.0


Adds a function to calculate the percent change between two columns.  Doing 
this without a custom function is cumbersome because you have to include a 
check for division by zero.
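
To make the division-by-zero concern concrete, here is a minimal sketch of the calculation. The formula (difference divided by the absolute value of the old value, times 100) and the choice to return None for a zero or missing baseline are assumptions; the UDF's actual semantics are not described here, and the name percent_change is illustrative.

```python
def percent_change(old, new):
    """Percent change from old to new, or None when it is undefined."""
    if old is None or new is None or old == 0:
        # A zero baseline would divide by zero; report "no value" instead.
        return None
    return (new - old) / abs(old) * 100.0
```

Written inline in a query, the same guard has to be repeated as a CASE expression around every such division, which is the cumbersome part.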





[jira] [Created] (DRILL-8428) ElasticSearch Config Missing Getters

2023-05-04 Thread Charles Givre (Jira)
Charles Givre created DRILL-8428:


 Summary: ElasticSearch Config Missing Getters
 Key: DRILL-8428
 URL: https://issues.apache.org/jira/browse/DRILL-8428
 Project: Apache Drill
  Issue Type: Bug
Reporter: Charles Givre
Assignee: Charles Givre


The ElasticSearch config was missing some getters, which prevented users from 
setting certain config variables.  This PR fixes this.





[jira] [Assigned] (DRILL-8385) Add support for disabling SSL certificate verification in the Elasticsearch plugin

2023-04-23 Thread Charles Givre (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Givre reassigned DRILL-8385:


Assignee: Charles Givre

> Add support for disabling SSL certificate verification in the Elasticsearch 
> plugin
> --
>
> Key: DRILL-8385
> URL: https://issues.apache.org/jira/browse/DRILL-8385
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - ElasticSearch
>Affects Versions: 1.20.3
>Reporter: James Turton
>Assignee: Charles Givre
>Priority: Minor
> Fix For: Future
>
>
> In Calcite, provide a custom TrustManager that trusts every certificate to 
> the ES RestClient builder in ElasticsearchSchemaFactory if a corresponding 
> config option has been set by application code.
> In Drill, add a config option to the ES plugin allowing certificate 
> verification to be toggled and pass it through to the Calcite option 
> mentioned above.





[jira] [Closed] (DRILL-4223) PIVOT and UNPIVOT to rotate table valued expressions

2023-04-18 Thread Charles Givre (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-4223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Givre closed DRILL-4223.


> PIVOT and UNPIVOT to rotate table valued expressions
> 
>
> Key: DRILL-4223
> URL: https://issues.apache.org/jira/browse/DRILL-4223
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Execution - Codegen, SQL Parser
>Reporter: Ashwin Aravind
>Priority: Major
> Fix For: 1.21.0
>
>
> Capability to PIVOT and UNPIVOT table values expressions which are results of 
> a SELECT query





[jira] [Updated] (DRILL-4223) PIVOT and UNPIVOT to rotate table valued expressions

2023-04-18 Thread Charles Givre (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-4223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Givre updated DRILL-4223:
-
Fix Version/s: 1.21.0


[jira] [Resolved] (DRILL-4223) PIVOT and UNPIVOT to rotate table valued expressions

2023-04-18 Thread Charles Givre (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-4223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Givre resolved DRILL-4223.
--
Resolution: Fixed

Added in Drill 1.21. 



[jira] [Updated] (DRILL-8417) Allow Excel Reader to Ignore Formula Errors

2023-03-30 Thread Charles Givre (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Givre updated DRILL-8417:
-
Description: 
If Drill encounters an Excel formula which is invalid somehow, such as a DIV/0, 
Drill is unable to proceed and throws a number format exception. 

This PR adds a config parameter called ignoreErrors which, when set to true, 
allows Drill to skip such records and return null for that cell.  Drill will 
also output a log warning.  When set to false, the original behavior is retained.

> Allow Excel Reader to Ignore Formula Errors
> ---
>
> Key: DRILL-8417
> URL: https://issues.apache.org/jira/browse/DRILL-8417
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Excel
>Affects Versions: 1.21.0
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 1.21.1


[jira] [Created] (DRILL-8417) Allow Excel Reader to Ignore Formula Errors

2023-03-30 Thread Charles Givre (Jira)
Charles Givre created DRILL-8417:


 Summary: Allow Excel Reader to Ignore Formula Errors
 Key: DRILL-8417
 URL: https://issues.apache.org/jira/browse/DRILL-8417
 Project: Apache Drill
  Issue Type: Improvement
  Components: Storage - Excel
Affects Versions: 1.21.0
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 1.21.1








[jira] [Created] (DRILL-8414) Index Paginator Not Working When Provided URL

2023-03-18 Thread Charles Givre (Jira)
Charles Givre created DRILL-8414:


 Summary: Index Paginator Not Working When Provided URL
 Key: DRILL-8414
 URL: https://issues.apache.org/jira/browse/DRILL-8414
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - HTTP
Affects Versions: 1.21.0
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 1.21.1


The index paginator offers two options: one where the API returns an index or 
offset, and another where it returns a URL.  The second was not fully 
implemented.  This PR also adds functionality in the case where the API returns 
a path rather than a URL.  In that case, the path will replace the pre-existing 
path segments.
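
The URL-or-path handling can be sketched as follows. This is an illustration rather than the plugin's code: the function name next_page_url is hypothetical, and taking the query string from the returned value (instead of keeping the original one) is an assumption.

```python
from urllib.parse import urlsplit, urlunsplit


def next_page_url(current_url, next_value):
    """Build the next request URL from a full URL or a bare path."""
    nxt = urlsplit(next_value)
    if nxt.scheme and nxt.netloc:
        return next_value  # the API already returned a complete URL
    cur = urlsplit(current_url)
    # Keep the scheme and host of the current request; the returned path
    # replaces all pre-existing path segments.
    path = nxt.path if nxt.path.startswith("/") else "/" + nxt.path
    return urlunsplit((cur.scheme, cur.netloc, path, nxt.query, ""))
```

For example, a current URL of https://api.example.com/v1/items?page=1 combined with a returned value of /v1/items?page=2 yields https://api.example.com/v1/items?page=2.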





[jira] [Created] (DRILL-8413) Add DNS Lookup Functions

2023-03-17 Thread Charles Givre (Jira)
Charles Givre created DRILL-8413:


 Summary: Add DNS Lookup Functions
 Key: DRILL-8413
 URL: https://issues.apache.org/jira/browse/DRILL-8413
 Project: Apache Drill
  Issue Type: New Feature
  Components: Functions - Drill
Affects Versions: 1.21.0
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 1.22


This PR adds additional DNS lookup functions to Drill:

 

 





[jira] [Created] (DRILL-8411) GoogleSheets Reader Will Not Read More than 1K Rows

2023-03-14 Thread Charles Givre (Jira)
Charles Givre created DRILL-8411:


 Summary: GoogleSheets Reader Will Not Read More than 1K Rows
 Key: DRILL-8411
 URL: https://issues.apache.org/jira/browse/DRILL-8411
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - GoogleSheets
Affects Versions: 1.21.0
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 1.21.1


The GoogleSheets reader hits the batch limit from the GoogleSheets SDK of 1000 
rows and stops.   This PR fixes that.  

It also fixes a minor but annoying issue whereby the GoogleSheets reader 
determines a column is a date/time, but is then unable to parse it because it 
is in a non-standard format.
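
The general shape of a fix for a fixed batch limit is to keep requesting consecutive ranges until a short batch comes back. The sketch below assumes a hypothetical fetch_batch(start, size) callable standing in for the SDK call; it does not use the real GoogleSheets API and is not taken from the PR.

```python
def read_all_rows(fetch_batch, batch_size=1000):
    """Yield every row by requesting consecutive ranges until one comes back short."""
    start = 0
    while True:
        rows = fetch_batch(start, batch_size)
        yield from rows
        if len(rows) < batch_size:
            break  # a short (or empty) batch means the sheet is exhausted
        start += batch_size
```

A reader that stops after the first call would return exactly batch_size rows for any larger sheet, which matches the 1000-row symptom described above.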





[jira] [Created] (DRILL-8408) Allow Implicit Casts on Join

2023-03-08 Thread Charles Givre (Jira)
Charles Givre created DRILL-8408:


 Summary: Allow Implicit Casts on Join
 Key: DRILL-8408
 URL: https://issues.apache.org/jira/browse/DRILL-8408
 Project: Apache Drill
  Issue Type: Improvement
  Components: Execution - Data Types
Affects Versions: 1.21.0
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 1.21.1


Currently, Drill does not allow implicit casts on joins.  DRILL-8136 
significantly improved implicit casting, so it may now make sense to allow them. 





[jira] [Created] (DRILL-8407) Add Support for SFTP File Systems

2023-03-05 Thread Charles Givre (Jira)
Charles Givre created DRILL-8407:


 Summary: Add Support for SFTP File Systems
 Key: DRILL-8407
 URL: https://issues.apache.org/jira/browse/DRILL-8407
 Project: Apache Drill
  Issue Type: Improvement
  Components: Storage - File
Affects Versions: 1.20.0
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: Future


Add support for SFTP File Systems. 





[jira] [Created] (DRILL-8402) Add REGEXP_EXTRACT Function

2023-02-20 Thread Charles Givre (Jira)
Charles Givre created DRILL-8402:


 Summary: Add REGEXP_EXTRACT Function
 Key: DRILL-8402
 URL: https://issues.apache.org/jira/browse/DRILL-8402
 Project: Apache Drill
  Issue Type: Improvement
  Components: Functions - Drill
Affects Versions: 1.21.0
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 1.21.0


This PR adds two UDFs to Drill:

regexp_extract(, ) which returns an array of strings which were 
captured by capturing groups in the regex.

regexp_extract(, , ) returns the text captured by a 
specific capturing group. 
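
The two behaviors can be mimicked with Python's re module. This is a sketch only: the parameter names text, pattern and index are assumptions (the argument names are missing from the description above), Python's regex dialect differs from Java's in places, and the return values for a non-matching input are guesses.

```python
import re


def regexp_extract(text, pattern, index=None):
    """Return all captured groups, or only the group at `index` when given."""
    match = re.search(pattern, text)
    if match is None:
        # Assumption: empty list / None when the pattern does not match.
        return [] if index is None else None
    if index is None:
        return list(match.groups())
    return match.group(index)
```

With the pattern (\d{4})-(\d{2})-(\d{2}) and the input 2023-02-20, the two-argument form returns the three captured strings and the three-argument form with index 2 returns only the month.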





[jira] [Created] (DRILL-8399) MS Access Reader Misinterprets Data Types

2023-02-10 Thread Charles Givre (Jira)
Charles Givre created DRILL-8399:


 Summary: MS Access Reader Misinterprets Data Types
 Key: DRILL-8399
 URL: https://issues.apache.org/jira/browse/DRILL-8399
 Project: Apache Drill
  Issue Type: Bug
  Components: Format - MS Access
Affects Versions: 1.21.0
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 1.21.0


The MS Access reader was assigning certain data types incorrectly, resulting in 
various errors.  This minor PR fixes that.





[jira] [Created] (DRILL-8395) Add Support for INSERT and Drop Table to GoogleSheets Plugin

2023-02-05 Thread Charles Givre (Jira)
Charles Givre created DRILL-8395:


 Summary: Add Support for INSERT and Drop Table to GoogleSheets 
Plugin
 Key: DRILL-8395
 URL: https://issues.apache.org/jira/browse/DRILL-8395
 Project: Apache Drill
  Issue Type: Improvement
  Components: Storage - GoogleSheets
Affects Versions: 1.20.3
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 1.21.0


This PR adds support for INSERT queries which allow a user to append data to an 
existing GoogleSheets tab.  It also:
 * Adds support for DROP TABLE queries which were not implemented
 * Modifies CTAS queries so that if a user executes a CTAS query with a file 
token, Drill will add a new tab to an existing document, but if the user 
executes a CTAS with a file name, it will create an entirely new document.





[jira] [Created] (DRILL-8392) Empty Tables Causes Index Out of Bounds Exception on PDF Reader

2023-01-24 Thread Charles Givre (Jira)
Charles Givre created DRILL-8392:


 Summary: Empty Tables Causes Index Out of Bounds Exception on PDF 
Reader
 Key: DRILL-8392
 URL: https://issues.apache.org/jira/browse/DRILL-8392
 Project: Apache Drill
  Issue Type: Bug
  Components: Format - PDF
Affects Versions: 1.20.3
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 1.21.0








[jira] [Created] (DRILL-8390) Minor Improvements to PDF Reader

2023-01-18 Thread Charles Givre (Jira)
Charles Givre created DRILL-8390:


 Summary: Minor Improvements to PDF Reader
 Key: DRILL-8390
 URL: https://issues.apache.org/jira/browse/DRILL-8390
 Project: Apache Drill
  Issue Type: Improvement
  Components: Format - PDF
Reporter: Charles Givre
Assignee: Charles Givre


This PR makes some minor improvements to the PDF reader including:
 * Fixes a minor bug where, in certain configurations, the first row of data 
was skipped
 * Fixes a minor bug where empty tables were causing crashes when the 
spreadsheet extraction algorithm was used
 * Adds a table_count metadata field
 * Adds a table_index metadata field to reflect the current table.





[jira] [Created] (DRILL-8387) Add Support for User Translation to ElasticSearch Plugin

2023-01-12 Thread Charles Givre (Jira)
Charles Givre created DRILL-8387:


 Summary: Add Support for User Translation to ElasticSearch Plugin
 Key: DRILL-8387
 URL: https://issues.apache.org/jira/browse/DRILL-8387
 Project: Apache Drill
  Issue Type: Improvement
  Components: Storage - ElasticSearch
Affects Versions: 1.20.3
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 1.21.0


Add support for user translation to ElasticSearch. 





[jira] [Created] (DRILL-8386) Add Support for User Translation for Cassandra

2023-01-11 Thread Charles Givre (Jira)
Charles Givre created DRILL-8386:


 Summary: Add Support for User Translation for Cassandra
 Key: DRILL-8386
 URL: https://issues.apache.org/jira/browse/DRILL-8386
 Project: Apache Drill
  Issue Type: Improvement
  Components: Storage - Cassandra
Affects Versions: 1.20.3
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 1.21.0


Adds support for user translation to the Cassandra plugin. 





[jira] [Created] (DRILL-8384) Add Format Plugin for Microsoft Access

2023-01-10 Thread Charles Givre (Jira)
Charles Givre created DRILL-8384:


 Summary: Add Format Plugin for Microsoft Access
 Key: DRILL-8384
 URL: https://issues.apache.org/jira/browse/DRILL-8384
 Project: Apache Drill
  Issue Type: Improvement
  Components: Format - MS Access
Affects Versions: 1.21.0
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 1.21.0


Shockingly, MS Access is still in widespread use.  This plugin enables Drill to 
read MS Access files. 





[jira] [Commented] (DRILL-5033) Query on JSON that has null as value for each key

2022-12-29 Thread Charles Givre (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-5033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17652916#comment-17652916
 ] 

Charles Givre commented on DRILL-5033:
--

[https://github.com/apache/drill/pull/2731]

> Query on JSON that has null as value for each key
> -
>
> Key: DRILL-5033
> URL: https://issues.apache.org/jira/browse/DRILL-5033
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - JSON
>Affects Versions: 1.9.0
>Reporter: Khurram Faraaz
>Priority: Major
>
> Drill 1.9.0 git commit ID : 83513daf
> Drill returns same result with or without `store.json.all_text_mode`=true
> Note that each key in the JSON has null as its value.
> [root@cent01 null_eq_joins]# cat right_all_nulls.json
> {
>  "intKey" : null,
>  "bgintKey": null,
>  "strKey": null,
>  "boolKey": null,
>  "fltKey": null,
>  "dblKey": null,
>  "timKey": null,
>  "dtKey": null,
>  "tmstmpKey": null,
>  "intrvldyKey": null,
>  "intrvlyrKey": null
> }
> [root@cent01 null_eq_joins]#
> Querying the above JSON file results in null as query result.
>  -  We should see each of the keys in the JSON as a column in query result.
>  -  And in each column the value should be a null value. 
> Current behavior does not look right.
> {noformat}
> 0: jdbc:drill:schema=dfs.tmp> select * from `right_all_nulls.json`;
> +---+
> |   *   |
> +---+
> | null  |
> +---+
> 1 row selected (0.313 seconds)
> {noformat}
> Adding comment from [~julianhyde] 
> IMHO it is similar but not the same as DRILL-1256. Worth logging an issue and 
> let [~jnadeau] (or someone) put on the record what should be the behavior of 
> an empty record (empty JSON map) when it is top-level (as in this case) or in 
> a collection.





[jira] [Created] (DRILL-8376) Add Distribution UDFs

2022-12-24 Thread Charles Givre (Jira)
Charles Givre created DRILL-8376:


 Summary: Add Distribution UDFs
 Key: DRILL-8376
 URL: https://issues.apache.org/jira/browse/DRILL-8376
 Project: Apache Drill
  Issue Type: Improvement
  Components: Functions - Drill
Affects Versions: 1.21
Reporter: Charles Givre
Assignee: Charles Givre


Add `width_bucket`, `pearson_correlation` and `kendall_correlation` to Drill
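
For readers unfamiliar with these functions, here is a minimal Java sketch of 
what `width_bucket` and `pearson_correlation` conventionally compute.  The 
class name, method signatures, argument order and boundary handling are 
assumptions made for illustration (they follow the usual SQL WIDTH_BUCKET 
convention); the ticket does not specify how the Drill UDFs are declared.

```java
// Illustrative only: hypothetical helpers, not the Drill UDF implementations.
final class DistributionSketch {

    // Equal-width bucketing over [low, high): returns 1..buckets inside the
    // range, 0 below it, and buckets + 1 at or above the upper bound
    // (assumed to match the common SQL WIDTH_BUCKET behavior).
    static int widthBucket(double value, double low, double high, int buckets) {
        if (buckets <= 0 || !(low < high)) {
            throw new IllegalArgumentException("need buckets > 0 and low < high");
        }
        if (value < low) {
            return 0;
        }
        if (value >= high) {
            return buckets + 1;
        }
        return (int) ((value - low) / (high - low) * buckets) + 1;
    }

    // Pearson correlation coefficient of two equal-length samples.
    // Returns NaN when either sample has zero variance.
    static double pearson(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two equal-length samples of size >= 2");
        }
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < x.length; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= x.length;
        meanY /= y.length;

        double cov = 0.0;
        double varX = 0.0;
        double varY = 0.0;
        for (int i = 0; i < x.length; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            cov += dx * dy;
            varX += dx * dx;
            varY += dy * dy;
        }
        return cov / Math.sqrt(varX * varY);
    }
}
```

Kendall's correlation is left out of the sketch because it depends on how ties 
are handled, which the ticket does not describe.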





[jira] [Closed] (DRILL-7554) Convert LTSV Format Plugin to EVF2

2022-12-19 Thread Charles Givre (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Givre closed DRILL-7554.

Resolution: Duplicate

> Convert LTSV Format Plugin to EVF2
> --
>
> Key: DRILL-7554
> URL: https://issues.apache.org/jira/browse/DRILL-7554
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Text & CSV
>Affects Versions: 1.17.0
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
>






[jira] [Assigned] (DRILL-8179) Convert LTSV Format Plugin to EVF2

2022-12-19 Thread Charles Givre (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Givre reassigned DRILL-8179:


Assignee: Charles Givre

> Convert LTSV Format Plugin to EVF2
> --
>
> Key: DRILL-8179
> URL: https://issues.apache.org/jira/browse/DRILL-8179
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.20.1
>Reporter: Jingchuan Hu
>Assignee: Charles Givre
>Priority: Major
> Fix For: 2.0.0
>
>
> Authorized by Charles to continue the conversion of the LTSV plugin to EVF2 directly.





[jira] [Resolved] (DRILL-8198) XML EVF2 reader provideSchema usage

2022-12-19 Thread Charles Givre (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Givre resolved DRILL-8198.
--
Resolution: Fixed

> XML EVF2 reader provideSchema usage
> ---
>
> Key: DRILL-8198
> URL: https://issues.apache.org/jira/browse/DRILL-8198
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Storage - XML
>Affects Versions: 1.20.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Major
> Fix For: 2.0.0
>
>
> XMLBatchReader has been converted to an EVF2 reader, but it does not use 
> provideSchema for the Schema Provision feature





[jira] [Updated] (DRILL-7554) Convert LTSV Format Plugin to EVF2

2022-12-19 Thread Charles Givre (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Givre updated DRILL-7554:
-
Summary: Convert LTSV Format Plugin to EVF2  (was: Convert LTSV Format 
Plugin to EVF)

> Convert LTSV Format Plugin to EVF2
> --
>
> Key: DRILL-7554
> URL: https://issues.apache.org/jira/browse/DRILL-7554
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Text & CSV
>Affects Versions: 1.17.0
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
>






[jira] [Created] (DRILL-8371) Add Write/Append Capability to Splunk Plugin

2022-12-14 Thread Charles Givre (Jira)
Charles Givre created DRILL-8371:


 Summary: Add Write/Append Capability to Splunk Plugin
 Key: DRILL-8371
 URL: https://issues.apache.org/jira/browse/DRILL-8371
 Project: Apache Drill
  Issue Type: Improvement
  Components: Storage - Splunk
Affects Versions: 1.20.2
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 2.0.0


While Drill can currently read from Splunk indexes, it cannot write to them or 
create them.  This proposed PR adds support for CTAS queries for Splunk as well 
as INSERT and DROP TABLE. 





[jira] [Created] (DRILL-8365) HTTP Plugin Places Parameters in Wrong Place

2022-12-04 Thread Charles Givre (Jira)
Charles Givre created DRILL-8365:


 Summary: HTTP Plugin Places Parameters in Wrong Place
 Key: DRILL-8365
 URL: https://issues.apache.org/jira/browse/DRILL-8365
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - HTTP
Affects Versions: 1.20.2
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 1.20.3


When the requireTail option is set to true, and pagination is enabled, the HTTP 
plugin puts the required parameters in the wrong place in the URL.  This PR 
fixes that.





[jira] [Created] (DRILL-8364) Add Support for OAuth Enabled File Systems

2022-12-03 Thread Charles Givre (Jira)
Charles Givre created DRILL-8364:


 Summary: Add Support for OAuth Enabled File Systems
 Key: DRILL-8364
 URL: https://issues.apache.org/jira/browse/DRILL-8364
 Project: Apache Drill
  Issue Type: Improvement
  Components: Storage - File
Affects Versions: 1.20.2
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 2.0.0


Currently Drill supports reading from file systems such as HDFS, S3 and others 
that use token based authentication.  This PR extends Drill's plugin 
architecture so that Drill can connect with other file systems which use OAuth 
2.0 for authentication.

This PR also adds support for Drill to query Box. 





[jira] [Created] (DRILL-8360) Add Provided Schema for XML Reader

2022-11-27 Thread Charles Givre (Jira)
Charles Givre created DRILL-8360:


 Summary: Add Provided Schema for XML Reader
 Key: DRILL-8360
 URL: https://issues.apache.org/jira/browse/DRILL-8360
 Project: Apache Drill
  Issue Type: Improvement
  Components: Format - XML
Affects Versions: 1.20.2
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 2.0.0


The XML reader does not support provided schemas.  This PR adds that support.





[jira] [Created] (DRILL-8356) Add File Name to GoogleSheets Plugin

2022-11-09 Thread Charles Givre (Jira)
Charles Givre created DRILL-8356:


 Summary: Add File Name to GoogleSheets Plugin
 Key: DRILL-8356
 URL: https://issues.apache.org/jira/browse/DRILL-8356
 Project: Apache Drill
  Issue Type: Improvement
  Components: Storage - GoogleSheets
Affects Versions: 2.0.0
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 2.0.0


GoogleSheets uses tokens to identify the individual files.  These tokens are 
not human readable and will make it difficult for a user to know which file 
they are accessing.  

This PR adds a metadata field called `_title` which identifies the document 
they are working with.





[jira] [Created] (DRILL-8354) Add IS_EMPTY Function.

2022-11-08 Thread Charles Givre (Jira)
Charles Givre created DRILL-8354:


 Summary: Add IS_EMPTY Function.
 Key: DRILL-8354
 URL: https://issues.apache.org/jira/browse/DRILL-8354
 Project: Apache Drill
  Issue Type: Improvement
  Components: Functions - Drill
Affects Versions: 1.20.2
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 2.0.0


When analyzing data, there is currently no single function to evaluate whether 
a given field is empty.  With scalar fields, this can be accomplished with the 
`IS NOT NULL` operator, but with complex fields, this is more challenging as 
complex fields are never null. 

This PR adds a UDF called IS_EMPTY() which accepts any type of field and 
returns true if the field does not contain data.  

 

For scalar fields, the function returns true if the field is `null`.  Complex 
fields can never be `null`: for lists, the function returns true if the list is 
empty, and for maps, it returns true if all of the map's fields are 
unpopulated. 
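
The rules above can be sketched in plain Java as follows.  This is only an 
illustration of the described semantics using ordinary Java collections; the 
class and method names are hypothetical, and treating nested empty lists or 
maps as "unpopulated" is an assumption, since the ticket does not say how 
nested values are handled.  The real UDF operates on Drill's value vectors 
rather than on `java.util` types.

```java
import java.util.List;
import java.util.Map;

// Illustrative only: hypothetical model of the IS_EMPTY() rules.
final class IsEmptySketch {

    static boolean isEmpty(Object value) {
        // Scalar case: only null counts as empty.
        if (value == null) {
            return true;
        }
        // List case: empty when it has no elements.
        if (value instanceof List) {
            return ((List<?>) value).isEmpty();
        }
        // Map case: empty when every field is itself unpopulated
        // (recursing into nested values is an assumption).
        if (value instanceof Map) {
            for (Object field : ((Map<?, ?>) value).values()) {
                if (!isEmpty(field)) {
                    return false;
                }
            }
            return true;
        }
        // Any other non-null scalar holds data.
        return false;
    }
}
```

Note that under this reading an empty string is not empty, because the ticket 
only mentions `null` for scalar fields.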





[jira] [Created] (DRILL-8350) Convert PCAP Format Plugin to EVF2

2022-11-02 Thread Charles Givre (Jira)
Charles Givre created DRILL-8350:


 Summary: Convert PCAP Format Plugin to EVF2
 Key: DRILL-8350
 URL: https://issues.apache.org/jira/browse/DRILL-8350
 Project: Apache Drill
  Issue Type: Task
  Components: Format - PCAP
Affects Versions: 1.20.2
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 2.0.0


Convert the PCAP format plugin to EVF2





[jira] [Created] (DRILL-8349) GoogleSheets Not Registering Schemas with Non Default Name

2022-11-01 Thread Charles Givre (Jira)
Charles Givre created DRILL-8349:


 Summary: GoogleSheets Not Registering Schemas with Non Default Name
 Key: DRILL-8349
 URL: https://issues.apache.org/jira/browse/DRILL-8349
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - GoogleSheets
Affects Versions: 2.0.0
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 2.0.0


GoogleSheets plugin fails to register plugin instances with names other than 
`GoogleSheets`. 





[jira] [Created] (DRILL-8342) Add Automatic Retry for Rate Limited APIs

2022-10-22 Thread Charles Givre (Jira)
Charles Givre created DRILL-8342:


 Summary: Add Automatic Retry for Rate Limited APIs
 Key: DRILL-8342
 URL: https://issues.apache.org/jira/browse/DRILL-8342
 Project: Apache Drill
  Issue Type: Improvement
  Components: Storage - HTTP
Affects Versions: 1.20.2
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 2.0.0


Many APIs have a burst limit on the number of requests.  This PR adds a retry 
capability to the HTTP Storage Plugin, whereby if a 429 response code is 
received, Drill will wait a configurable amount of time and retry the request 
once. 

To prevent runaway pagination, this retry will only happen once per request. 

This PR adds a new configuration option called retryDelay, which is the number 
of milliseconds that Drill should wait before retrying.
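
A minimal sketch of the retry-once behavior described above.  The class name, 
the method name and the `IntSupplier` standing in for "send the request and 
return its status code" are hypothetical; only the 429 status code and the 
retryDelay option come from this ticket.

```java
import java.util.function.IntSupplier;

// Illustrative only: not the HTTP plugin's actual request code.
final class RetryOnceSketch {

    static final int TOO_MANY_REQUESTS = 429;

    // Sends the request; on a 429 waits retryDelayMs and sends it exactly
    // one more time.  Whatever status the second attempt returns is final,
    // so a persistently rate-limited API cannot cause an endless loop.
    static int sendWithSingleRetry(IntSupplier send, long retryDelayMs) {
        int status = send.getAsInt();
        if (status == TOO_MANY_REQUESTS) {
            try {
                Thread.sleep(retryDelayMs);
            } catch (InterruptedException e) {
                // Preserve the interrupt and give up on the retry.
                Thread.currentThread().interrupt();
                return status;
            }
            status = send.getAsInt();
        }
        return status;
    }
}
```

If the second attempt also returns 429, the caller sees the 429 and can fail 
the query as before.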





[jira] [Created] (DRILL-8341) Add Scanned Plugin List to Sys Profiles Table

2022-10-21 Thread Charles Givre (Jira)
Charles Givre created DRILL-8341:


 Summary: Add Scanned Plugin List to Sys Profiles Table
 Key: DRILL-8341
 URL: https://issues.apache.org/jira/browse/DRILL-8341
 Project: Apache Drill
  Issue Type: Improvement
  Components: Execution - Monitoring
Affects Versions: 1.20.2
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 2.0.0


In DRILL-8322, [~dzamo] added the list of scanned plugins to the query 
profiles.  This information is extremely useful in query analysis.  This minor 
PR adds this same information to the sys.profiles table. 





[jira] [Created] (DRILL-8340) Add Additional Date Manipulation Functions (Part 1)

2022-10-20 Thread Charles Givre (Jira)
Charles Givre created DRILL-8340:


 Summary: Add Additional Date Manipulation Functions (Part 1)
 Key: DRILL-8340
 URL: https://issues.apache.org/jira/browse/DRILL-8340
 Project: Apache Drill
  Issue Type: Improvement
  Components: Functions - Drill
Affects Versions: 1.20.2
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 2.0.0


This PR adds several utility functions to facilitate working with dates and 
times.  These are modeled after the date/time functionality in MySQL.

Specifically this adds:
 * YEARWEEK():  Returns the year and week number as an int, e.g. 202002.
 * TIME_STAMP():  Converts almost anything that looks like a date string into a 
timestamp.
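
To illustrate the shape of the YEARWEEK() result, here is a small Java sketch 
built on `java.time`.  It assumes ISO-8601 week numbering (weeks start on 
Monday), which is an assumption: MySQL's YEARWEEK supports several week modes, 
and the ticket does not say which rule the Drill function follows.  The class 
and method names are hypothetical.

```java
import java.time.LocalDate;
import java.time.temporal.IsoFields;

// Illustrative only: ISO week numbering is assumed, not confirmed by the ticket.
final class YearWeekSketch {

    // Packs the week-based year and the week number into one int, YYYYWW.
    static int yearWeek(LocalDate date) {
        int year = date.get(IsoFields.WEEK_BASED_YEAR);
        int week = date.get(IsoFields.WEEK_OF_WEEK_BASED_YEAR);
        return year * 100 + week;
    }
}
```

Under this rule `yearWeek(LocalDate.of(2020, 1, 8))` gives 202002, and a date 
such as 2019-12-30 is reported as 202001 because it falls in the first ISO 
week of 2020.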





[jira] [Closed] (DRILL-8328) HTTP UDF Not Resolving Storage Aliases

2022-10-19 Thread Charles Givre (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Givre closed DRILL-8328.


> HTTP UDF Not Resolving Storage Aliases
> --
>
> Key: DRILL-8328
> URL: https://issues.apache.org/jira/browse/DRILL-8328
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - HTTP
>Affects Versions: 1.20.0
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Blocker
> Fix For: 1.20.3
>
>
> The http_request function currently does not resolve plugin aliases 
> correctly.  This PR fixes that issue. 





[jira] [Created] (DRILL-8335) Add Ability to Query GoogleSheets Tabs by Index

2022-10-14 Thread Charles Givre (Jira)
Charles Givre created DRILL-8335:


 Summary: Add Ability to Query GoogleSheets Tabs by Index
 Key: DRILL-8335
 URL: https://issues.apache.org/jira/browse/DRILL-8335
 Project: Apache Drill
  Issue Type: Improvement
  Components: Storage - GoogleSheets
Affects Versions: 1.20.2
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 2.0.0


The GoogleSheets plugin does not provide a way for a user to query data if they 
do not know the available tab names.  This adds the ability to query by index 
of the tabs.





[jira] [Created] (DRILL-8333) Fix Resource Leaks in HTTP Plugin

2022-10-13 Thread Charles Givre (Jira)
Charles Givre created DRILL-8333:


 Summary: Fix Resource Leaks in HTTP Plugin
 Key: DRILL-8333
 URL: https://issues.apache.org/jira/browse/DRILL-8333
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - HTTP
Affects Versions: 1.20.2
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 1.20.3


The HTTP plugin has several methods which collect a `ResponseBody` object but 
do not close these objects.  This is causing a resource leak and will cause 
Drill to fail in the event that queries fire off many API calls. 
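
The usual remedy for this kind of leak is try-with-resources.  The sketch 
below uses a hypothetical `FakeBody` class as a stand-in for a closeable 
response body so that it stays self-contained; it is not the plugin's code and 
does not reproduce any HTTP client's API.

```java
// Illustrative only: FakeBody is a hypothetical stand-in for a response body.
final class CloseBodySketch {

    static final class FakeBody implements AutoCloseable {
        private final String content;
        boolean closed = false;

        FakeBody(String content) {
            this.content = content;
        }

        String string() {
            return content;
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    // Reads what it needs from the body and always closes it, even if
    // reading throws, because the body is bound in try-with-resources.
    static int contentLength(FakeBody body) {
        try (FakeBody b = body) {
            return b.string().length();
        }
    }
}
```

The design point is that whichever method obtains the body is also the one 
that closes it, so no caller can forget to.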





[jira] [Created] (DRILL-8330) Convert ESRI Shape File Reader to EVF2

2022-10-04 Thread Charles Givre (Jira)
Charles Givre created DRILL-8330:


 Summary: Convert ESRI Shape File Reader to EVF2 
 Key: DRILL-8330
 URL: https://issues.apache.org/jira/browse/DRILL-8330
 Project: Apache Drill
  Issue Type: Task
  Components: Format - ESRI
Affects Versions: 1.20.2
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 2.0.0


Converts the ESRI Shape File reader to EVF V2.





[jira] [Created] (DRILL-8329) Close HTTP Caching Resources

2022-10-03 Thread Charles Givre (Jira)
Charles Givre created DRILL-8329:


 Summary: Close HTTP Caching Resources 
 Key: DRILL-8329
 URL: https://issues.apache.org/jira/browse/DRILL-8329
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - HTTP
Affects Versions: 1.20.2
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 1.20.3


The HTTP plugin has the ability to cache API responses.  However, the storage 
plugin was not closing the connection to the file cache.  This minor PR fixes 
that. 





[jira] [Created] (DRILL-8328) HTTP UDF Not Resolving Storage Aliases

2022-10-02 Thread Charles Givre (Jira)
Charles Givre created DRILL-8328:


 Summary: HTTP UDF Not Resolving Storage Aliases
 Key: DRILL-8328
 URL: https://issues.apache.org/jira/browse/DRILL-8328
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - HTTP
Affects Versions: 1.20.0
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 1.20.3


The http_request function currently does not resolve plugin aliases correctly.  
This PR fixes that issue. 





[jira] [Created] (DRILL-8327) GoogleSheets not Reporting Schemata to Info_Schema

2022-10-01 Thread Charles Givre (Jira)
Charles Givre created DRILL-8327:


 Summary: GoogleSheets not Reporting Schemata to Info_Schema
 Key: DRILL-8327
 URL: https://issues.apache.org/jira/browse/DRILL-8327
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - GoogleSheets
Affects Versions: 2.0.0
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 2.0.0


The GoogleSheets (GS) plugin was not reporting the available documents to the 
info schema.  This PR makes some modifications so that users can determine 
which documents are available via the information schema. 

The GS plugin does not report the tabs as tables to the information schema 
because that can cause Drill to exceed Google's rate quota.





[jira] [Created] (DRILL-8325) Convert PDF Format Plugin to EVF V2

2022-09-29 Thread Charles Givre (Jira)
Charles Givre created DRILL-8325:


 Summary: Convert PDF Format Plugin to EVF V2
 Key: DRILL-8325
 URL: https://issues.apache.org/jira/browse/DRILL-8325
 Project: Apache Drill
  Issue Type: Task
  Components: Format - PDF
Affects Versions: 1.20.2
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 2.0.0


Converts the PDF Format Reader to EVF V2. 





[jira] [Created] (DRILL-8320) Prevent Infinite Pagination for Index Paginator

2022-09-27 Thread Charles Givre (Jira)
Charles Givre created DRILL-8320:


 Summary: Prevent Infinite Pagination for Index Paginator
 Key: DRILL-8320
 URL: https://issues.apache.org/jira/browse/DRILL-8320
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - HTTP
Affects Versions: 1.20.2
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 2.0.0


In some cases that use keyset/index pagination, if the API does not have a 
boolean column that indicates when to stop, Drill will send requests until the 
API stops returning data.  This PR fixes this by making the boolean parameter 
optional.  

If that parameter is not present, pagination ends when the index result is 
blank or the same as in the previous request.

Note, if the pagination parameters are buried in nested objects, this cannot be 
configured with a dataPath.  If the user uses a dataPath, pagination will stop 
at the first page.
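
The stop condition described above can be sketched as a single predicate.  The 
class, method and parameter names are hypothetical and chosen only for this 
example; `hasMore` models the optional boolean column (null when the API does 
not provide one).

```java
// Illustrative only: hypothetical model of the index paginator's stop rule.
final class IndexPaginationSketch {

    static boolean shouldStop(Boolean hasMore, String previousIndex, String currentIndex) {
        // When the API reports a boolean flag, trust it.
        if (hasMore != null) {
            return !hasMore;
        }
        // Otherwise stop on a blank index or one that did not advance.
        return currentIndex == null
            || currentIndex.trim().isEmpty()
            || currentIndex.equals(previousIndex);
    }
}
```

Comparing against the previous index is what guards against an API that keeps 
returning the same page forever.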





[jira] [Resolved] (DRILL-8317) Convert LogRegex Format Plugin to EVF V2

2022-09-24 Thread Charles Givre (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Givre resolved DRILL-8317.
--
Resolution: Done

> Convert LogRegex Format Plugin to EVF V2
> 
>
> Key: DRILL-8317
> URL: https://issues.apache.org/jira/browse/DRILL-8317
> Project: Apache Drill
>  Issue Type: Task
>  Components: Format - Log Reader
>Affects Versions: 1.20.2
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 2.0.0
>
>
> Converts the existing logRegex reader to EVF V2.





[jira] [Commented] (DRILL-8318) httpd format parser throws exception on log item with malformed query string

2022-09-24 Thread Charles Givre (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17609078#comment-17609078
 ] 

Charles Givre commented on DRILL-8318:
--

[~nielsbasjes], could you take a look?

> httpd format parser throws exception on log item with malformed query string
> 
>
> Key: DRILL-8318
> URL: https://issues.apache.org/jira/browse/DRILL-8318
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.19.0
> Environment: drill-embedded
> openjdk version "1.8.0_342"
> OpenJDK Runtime Environment Corretto-8.342.07.1 (build 1.8.0_342-b07)
> OpenJDK 64-Bit Server VM Corretto-8.342.07.1 (build 25.342-b07, mixed mode)
> Ubuntu 20.04.4 LTS (Focal Fossa)
> Running under WSL on Windows 11
>Reporter: Richard Downer
>Priority: Major
> Attachments: testcase
>
>
> I am running Apache Drill over my httpd-style access logs. These are 
> collecting data from requests on the open Internet, which sometimes means 
> questionable requests made by remote Internet users (sometimes with hostile 
> intent).
> One such style of request looks like this:
> {{151.236.216.243 - - [15/Sep/2022:20:18:07 +] "GET 
> /?=PHPE9568F36-D428-11d2-A769-00AA001ACF42 HTTP/1.1" 301 178 "-" 
> "curl/7.54.0"}}
> I have put this request into a new log file containing only this line, as a 
> test case. I initiate a query:
> {{select request_receive_time, request_status_last, request_firstline_method, 
> request_firstline_uri from 
> table(dfs.`/home/richard/drill/access-logs/nginx/testcase`(type=>'httpd', 
> logFormat=>'combined')) where request_status_last = 404;}}
> This produces this error:
> {{Error: DATA_READ ERROR: Error reading HTTPD file at line number 0}}
> {{Error occurred during setter call: null caused by 
> "java.lang.StringIndexOutOfBoundsException: String index out of range: -1" 
> when calling "public void 
> org.apache.drill.exec.store.httpd.HttpdLogRecord.setWildcard(java.lang.String,java.lang.String)"
>  for  key = "STRING:request.firstline.uri.query.*"  name = 
> "STRING:request.firstline.uri.query"  value = "Value\{filled=STRING, 
> s='PHPE9568F36-D428-11d2-A769-00AA001ACF42', l=null, d=null}" castsTo = 
> "[STRING]"}}
> {{Format plugin: httpd}}
> {{Format plugin: HttpdLogFormatPlugin}}
> {{Plugin config name: null}}
> {{Fragment: 0:0}}
> While I appreciate that the query string part of the request is probably 
> malformed according to a strict interpretation, this is a real request seen 
> "in the wild" and I would prefer that Drill is robust enough to deal with the 
> type of garbage requests frequently seen on real web server.
> Thank you for your assistance - if I can provide any more information that 
> would help please let me know!
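
For context, a query string such as `?=PHPE9568F36-D428-11d2-A769-00AA001ACF42` 
has a value but no parameter name.  The sketch below shows one tolerant way to 
parse such input in plain Java; it is a hypothetical illustration of the 
defensive approach, not the httpd parser's code and not the fix that was 
eventually applied.  All names in it are made up for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: a lenient query-string parser that never throws on
// malformed pairs.
final class LenientQuerySketch {

    static Map<String, String> parse(String query) {
        Map<String, String> params = new LinkedHashMap<>();
        if (query == null || query.isEmpty()) {
            return params;
        }
        String q = query.startsWith("?") ? query.substring(1) : query;
        for (String pair : q.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String key = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            if (key.isEmpty()) {
                // "=value" has no name to bind the value to, so skip it
                // instead of failing the whole log line.
                continue;
            }
            params.put(key, value);
        }
        return params;
    }
}
```

With this approach the request in the report yields an empty parameter map 
instead of an exception, so the rest of the log line can still be read.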





[jira] [Created] (DRILL-8317) Convert LogRegex Format Plugin to EVF V2

2022-09-22 Thread Charles Givre (Jira)
Charles Givre created DRILL-8317:


 Summary: Convert LogRegex Format Plugin to EVF V2
 Key: DRILL-8317
 URL: https://issues.apache.org/jira/browse/DRILL-8317
 Project: Apache Drill
  Issue Type: Task
  Components: Format - Log Reader
Affects Versions: 1.20.2
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 2.0.0


Converts the existing logRegex reader to EVF V2.





[jira] [Updated] (DRILL-8241) Remove Deprecated JSON Reader

2022-09-20 Thread Charles Givre (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Givre updated DRILL-8241:
-
Description: 
This is a master ticket to remove the deprecated v1 JSON reader from Drill.  
This JSON reader is used in several places and removing it will ensure 
consistent behavior across all data sources. 

The V2, EVF based JSON reader has several advantages, including the possibility 
of schema provisioning, limit pushdowns and others.

Here are the tasks which need to be completed to fully remove the v1 JSON 
reader.
 * Complete DRILL-5955 which adds support for the UNION vector to the EVF Json 
reader.
 * Convert the convert_fromJSON functions to V2 (DRILL-8239)
 * Convert the Druid Storage Plugin to V2 (DRILL-8316)
 * Convert MongoDB Storage Plugin to V2.  (Note the MongoDB plugin uses an 
EVF-based BSON reader as well as the V1 JSON reader)
 * Remove all V1-based unit tests
 * Migrate the JsonOptions from the HTTP Storage Plugin to global location to 
allow other plugins and users of JSON to set JSON configuration at a more 
granular level. (DRILL-8243)
 * Remove extraneous configuration options.
 * Bug fix HTTP UDFs (DRILL-8242)

  was:
This is a master ticket to remove the deprecated v1 JSON reader from Drill.  
This JSON reader is used in several places and removing it will ensure 
consistent behavior across all data sources. 

The V2, EVF based JSON reader has several advantages, including the possibility 
of schema provisioning, limit pushdowns and others.

Here are the tasks which need to be completed to fully remove the v1 JSON 
reader.
 * Complete DRILL-5955 which adds support for the UNION vector to the EVF Json 
reader.
 * Convert the convert_fromJSON functions to V2 (DRILL-8239)
 * Convert the Druid Storage Plugin to V2
 * Convert MongoDB Storage Plugin to V2.  (Note the MongoDB plugin uses an 
EVF-based BSON reader as well as the V1 JSON reader)
 * Remove all V1-based unit tests
 * Migrate the JsonOptions from the HTTP Storage Plugin to global location to 
allow other plugins and users of JSON to set JSON configuration at a more 
granular level. (DRILL-8243)
 * Remove extraneous configuration options.
 * Bug fix HTTP UDFs (DRILL-8242)


> Remove Deprecated JSON Reader
> -
>
> Key: DRILL-8241
> URL: https://issues.apache.org/jira/browse/DRILL-8241
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - JSON
>Affects Versions: 1.20.1
>Reporter: Charles Givre
>Priority: Major
> Fix For: 2.0.0
>
>
> This is a master ticket to remove the deprecated v1 JSON reader from Drill.  
> This JSON reader is used in several places and removing it will ensure 
> consistent behavior across all data sources. 
> The V2, EVF based JSON reader has several advantages, including the 
> possibility of schema provisioning, limit pushdowns and others.
> Here are the tasks which need to be completed to fully remove the v1 JSON 
> reader.
>  * Complete DRILL-5955 which adds support for the UNION vector to the EVF 
> Json reader.
>  * Convert the convert_fromJSON functions to V2 (DRILL-8239)
>  * Convert the Druid Storage Plugin to V2 (DRILL-8316)
>  * Convert MongoDB Storage Plugin to V2.  (Note the MongoDB plugin uses an 
> EVF-based BSON reader as well as the V1 JSON reader)
>  * Remove all V1-based unit tests
>  * Migrate the JsonOptions from the HTTP Storage Plugin to global location to 
> allow other plugins and users of JSON to set JSON configuration at a more 
> granular level. (DRILL-8243)
>  * Remove extraneous configuration options.
>  * Bug fix HTTP UDFs (DRILL-8242)





[jira] [Created] (DRILL-8316) Convert Druid Storage Plugin to EVF & V2 JSON Reader

2022-09-20 Thread Charles Givre (Jira)
Charles Givre created DRILL-8316:


 Summary: Convert Druid Storage Plugin to EVF & V2 JSON Reader
 Key: DRILL-8316
 URL: https://issues.apache.org/jira/browse/DRILL-8316
 Project: Apache Drill
  Issue Type: Improvement
  Components: Storage - Druid
Affects Versions: 1.20.2
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 2.0.0








[jira] [Created] (DRILL-8315) Convert SAS Format Plugin to EVF V2

2022-09-20 Thread Charles Givre (Jira)
Charles Givre created DRILL-8315:


 Summary: Convert SAS Format Plugin to EVF V2
 Key: DRILL-8315
 URL: https://issues.apache.org/jira/browse/DRILL-8315
 Project: Apache Drill
  Issue Type: Improvement
  Components: Format - SAS
Affects Versions: 1.20.2, 1.20.1
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 2.0.0


Convert the SAS Format Plugin to EVF V2.  No user facing changes. 





[jira] [Created] (DRILL-8312) Convert Format Plugins to EVF V2

2022-09-19 Thread Charles Givre (Jira)
Charles Givre created DRILL-8312:


 Summary: Convert Format Plugins to EVF V2
 Key: DRILL-8312
 URL: https://issues.apache.org/jira/browse/DRILL-8312
 Project: Apache Drill
  Issue Type: Improvement
Affects Versions: 1.20.2
Reporter: Charles Givre
 Fix For: 2.0.0


This is a blanket ticket to convert all format plugins to EVF V2.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (DRILL-8159) Upgrade HTTPD reader to use EVF V2

2022-09-19 Thread Charles Givre (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Givre resolved DRILL-8159.
--
Resolution: Done

> Upgrade HTTPD reader to use EVF V2
> --
>
> Key: DRILL-8159
> URL: https://issues.apache.org/jira/browse/DRILL-8159
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
>
> Continuation of work originally in the DRILL-8085 PR.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (DRILL-8311) Convert SPSS Format Plugin to EVF V2

2022-09-19 Thread Charles Givre (Jira)
Charles Givre created DRILL-8311:


 Summary: Convert SPSS Format Plugin to EVF V2
 Key: DRILL-8311
 URL: https://issues.apache.org/jira/browse/DRILL-8311
 Project: Apache Drill
  Issue Type: Improvement
  Components: Storage - SPSS
Affects Versions: 1.20.2
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 2.0.0


This PR converts the SPSS format plugin to use EVF V2. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (DRILL-8310) Convert Syslog Format to EVF V2

2022-09-19 Thread Charles Givre (Jira)
Charles Givre created DRILL-8310:


 Summary: Convert Syslog Format to EVF V2
 Key: DRILL-8310
 URL: https://issues.apache.org/jira/browse/DRILL-8310
 Project: Apache Drill
  Issue Type: Improvement
  Components: Storage - Syslog
Affects Versions: 1.20.2
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 2.0.0


This PR proposes converting the Syslog format plugin to EVF V2.  No user-facing changes.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8307) Druid storage plugin's use of Apache HttpClient is not thread safe

2022-09-15 Thread Charles Givre (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17605352#comment-17605352
 ] 

Charles Givre commented on DRILL-8307:
--

[~dzamo] I don't know if you're planning on taking this one, but I had two 
thoughts here:
 # Would it be worth looking to see where else in Drill we use the Apache 
HttpClient and switching them all over to OkHttp?
 # I started a branch ([https://github.com/cgivre/drill/tree/druid_evf]) which 
converts the Druid plugin to EVF.  It was almost done; the remaining work 
was all in the ScanBatchCreator.  Would you want to incorporate this 
work as well? 

> Druid storage plugin's use of Apache HttpClient is not thread safe
> --
>
> Key: DRILL-8307
> URL: https://issues.apache.org/jira/browse/DRILL-8307
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Other
>Affects Versions: 1.20.2
>Reporter: James Turton
>Priority: Major
> Fix For: 1.20.3
>
>
> When multiple concurrent queries are run against a single Druid storage 
> plugin then an error such as is shown below is reported by the Apache 
> HttpClient used in that plugin. The Druid storage plugin uses a single static 
> HttpClient instance which should be replaced with something like 
> PoolingHttpClientConnectionManager for multithreaded access.
>  [1cdd2b75-1310---5a638567ed07:foreman] INFO
> o.a.d.e.s.d.s.DruidSchemaFactory
> - User Error Occurred: Failure while loading druid datasources for database
> 'druid-egsmd300'. (Invalid use of BasicClientConnManager: connection still
> allocated.
> Make sure to release the connection before allocating another one.)
> org.apache.drill.common.exceptions.UserException: DATA_READ ERROR: Failure
> while loading druid datasources for database ''.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (DRILL-8289) Add Threat Hunting Functions

2022-09-12 Thread Charles Givre (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Givre resolved DRILL-8289.
--
Resolution: Done

> Add Threat Hunting Functions
> 
>
> Key: DRILL-8289
> URL: https://issues.apache.org/jira/browse/DRILL-8289
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Functions - Drill
>Affects Versions: 2.0.0
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 2.0.0
>
>
> # Threat Hunting Functions
> These functions are useful for doing threat hunting with Apache Drill. These 
> were inspired by huntlib.[1]
> The functions are: 
> * `punctuation_pattern()`: Extracts the pattern of punctuation in 
> text.
> * `entropy()`: This function calculates the Shannon Entropy of a 
> given string of text.
> * `entropyPerByte()`: This function calculates the Shannon Entropy of 
> a given string of text, normed for the string length.
> [1]: https://github.com/target/huntlib



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (DRILL-8305) Add Implicit Fields to Google Sheets Reader

2022-09-11 Thread Charles Givre (Jira)
Charles Givre created DRILL-8305:


 Summary: Add Implicit Fields to Google Sheets Reader
 Key: DRILL-8305
 URL: https://issues.apache.org/jira/browse/DRILL-8305
 Project: Apache Drill
  Issue Type: Improvement
  Components: Storage - GoogleSheets
Affects Versions: 2.0.0
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 2.0.0


GoogleSheets needs additional metadata fields to access the available data.  
This PR adds a framework for implicit metadata fields.  

This PR also adds the _sheets field which lists the available tabs within a 
Google Sheets document.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (DRILL-8291) Allow case sensitive Filters in HTTP Plugin

2022-09-03 Thread Charles Givre (Jira)
Charles Givre created DRILL-8291:


 Summary: Allow case sensitive Filters in HTTP Plugin
 Key: DRILL-8291
 URL: https://issues.apache.org/jira/browse/DRILL-8291
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - HTTP
Affects Versions: 1.20.2
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 1.20.3


Some APIs will reject filter pushdowns if they are not in the correct case.  
This PR adds a config option `caseSensitiveFilters` to the API config and when 
set to true, preserves the case of the filters pushed down. 
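
As a rough illustration only (this is not the plugin's code), the Python sketch 
below shows the idea behind such a flag.  It assumes that filter names are 
lower-cased by default before being added to the request URL; the function 
name and the sample filter are hypothetical.

```python
from urllib.parse import urlencode


def build_filter_query(filters: dict, case_sensitive_filters: bool = False) -> str:
    """Build a URL query string from pushed-down filters.

    Assumption: without the flag, filter names are normalized to lower case.
    With the flag set, the names are passed through unchanged.
    """
    if not case_sensitive_filters:
        filters = {name.lower(): value for name, value in filters.items()}
    return urlencode(filters)


# Default behavior (assumed): the filter name is lower-cased.
print(build_filter_query({"OrderID": "42"}))
# With the flag set, the original case reaches the API.
print(build_filter_query({"OrderID": "42"}, case_sensitive_filters=True))
```

An API that only recognizes `OrderID` would ignore or reject the first form, 
which is the situation the option is meant to avoid.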



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (DRILL-8289) Add Threat Hunting Functions

2022-08-28 Thread Charles Givre (Jira)
Charles Givre created DRILL-8289:


 Summary: Add Threat Hunting Functions
 Key: DRILL-8289
 URL: https://issues.apache.org/jira/browse/DRILL-8289
 Project: Apache Drill
  Issue Type: New Feature
  Components: Functions - Drill
Affects Versions: 2.0.0
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 2.0.0


# Threat Hunting Functions
These functions are useful for doing threat hunting with Apache Drill. These 
were inspired by huntlib.[1]

The functions are: 
* `punctuation_pattern()`: Extracts the pattern of punctuation in text.
* `entropy()`: This function calculates the Shannon Entropy of a given 
string of text.
* `entropyPerByte()`: This function calculates the Shannon Entropy of a 
given string of text, normed for the string length.

[1]: https://github.com/target/huntlib
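
For readers unfamiliar with these measures, here is a minimal Python sketch of 
what the three functions compute.  It is not the Drill implementation: the 
function names, the use of log base 2, the reading of "normed for the string 
length" as division by the length, and the definition of punctuation as 
Python's `string.punctuation` are all assumptions made for illustration.

```python
import math
import string
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    # H = -sum(p * log2(p)) over the observed characters.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def entropy_per_byte(text: str) -> float:
    """Entropy divided by the string length (assumed meaning of 'normed')."""
    if not text:
        return 0.0
    return shannon_entropy(text) / len(text)


def punctuation_pattern(text: str) -> str:
    """Keep only the punctuation characters, in their original order."""
    return "".join(ch for ch in text if ch in string.punctuation)


print(shannon_entropy("aabb"))            # two equally likely symbols: 1.0
print(entropy_per_byte("aabb"))           # 1.0 / 4 = 0.25
print(punctuation_pattern("Hello, world! (test)"))  # ,!()
```

High entropy in a hostname or file name is a common hint of machine-generated 
strings, which is why these measures are useful when hunting.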



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (DRILL-8288) Null Columns not being Written to GoogleSheets

2022-08-28 Thread Charles Givre (Jira)
Charles Givre created DRILL-8288:


 Summary: Null Columns not being Written to GoogleSheets
 Key: DRILL-8288
 URL: https://issues.apache.org/jira/browse/DRILL-8288
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - GoogleSheets
Affects Versions: 2.0.0
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 2.0.0


When writing to GoogleSheets, null columns are not written, which results in 
incorrect data in the sheet. 
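
The ticket does not say how the data goes wrong.  One plausible failure mode, 
assumed here purely for illustration, is that skipping a null cell shifts the 
later values in the row to the left.  The Python sketch below (hypothetical 
helper names, not the plugin's code) contrasts dropping nulls with writing an 
empty placeholder.

```python
def row_without_nulls(row: dict, columns: list) -> list:
    """Buggy variant: null cells are skipped, so later values shift left."""
    return [row[c] for c in columns if row.get(c) is not None]


def row_with_placeholders(row: dict, columns: list) -> list:
    """Fixed variant: a null cell becomes an empty string, keeping alignment."""
    return ["" if row.get(c) is None else row[c] for c in columns]


columns = ["id", "name", "score"]
row = {"id": 1, "name": None, "score": 3.5}

print(row_without_nulls(row, columns))      # [1, 3.5]  (score lands under name)
print(row_with_placeholders(row, columns))  # [1, '', 3.5]
```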



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (DRILL-8287) Add Support for Keyset Based Pagination

2022-08-25 Thread Charles Givre (Jira)
Charles Givre created DRILL-8287:


 Summary: Add Support for Keyset Based Pagination
 Key: DRILL-8287
 URL: https://issues.apache.org/jira/browse/DRILL-8287
 Project: Apache Drill
  Issue Type: New Feature
  Components: Storage - HTTP
Affects Versions: 1.20.2
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 2.0.0


Some APIs such as HubSpot use values in the result set to indicate whether 
there are additional pages.  This PR adds support for this kind of pagination.  
Note that the current implementation only works for JSON-based APIs.
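
To make the pattern concrete, here is a minimal Python sketch of keyset 
pagination against an in-memory stand-in for an API.  The field names 
`results` and `after`, the `fetch_all` helper and the fake pages are 
assumptions chosen for the example; they are not the plugin's configuration 
keys.

```python
from typing import Callable, Iterator, Optional

# Fake API: each page carries its records plus the key of the next page.
PAGES = {
    None: {"results": [{"id": 1}, {"id": 2}], "after": "2"},
    "2": {"results": [{"id": 3}], "after": None},
}


def fake_fetch(cursor: Optional[str]) -> dict:
    """Stand-in for an HTTP call that returns one parsed JSON page."""
    return PAGES[cursor]


def fetch_all(fetch_page: Callable[[Optional[str]], dict],
              records_key: str = "results",
              cursor_key: str = "after") -> Iterator[dict]:
    """Yield every record, following the cursor found in each response."""
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield from page.get(records_key, [])
        cursor = page.get(cursor_key)
        if not cursor:  # a missing or empty cursor means no more pages
            break


print(list(fetch_all(fake_fetch)))  # [{'id': 1}, {'id': 2}, {'id': 3}]
```

Unlike offset or page-number pagination, the client cannot know in advance how 
many requests it will make, since each request depends on the previous 
response.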



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (DRILL-8286) GoogleSheets StoragePlugin displaying ClientID and ClientSecret in Config

2022-08-25 Thread Charles Givre (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Givre updated DRILL-8286:
-
Component/s: Storage - GoogleSheets

> GoogleSheets StoragePlugin displaying ClientID and ClientSecret in Config
> -
>
> Key: DRILL-8286
> URL: https://issues.apache.org/jira/browse/DRILL-8286
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - GoogleSheets
>Affects Versions: 2.0.0
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 2.0.0
>
>
> The GoogleSheets storage plugin is rendering the `clientID` and 
> `clientSecret` in the config body instead of in the credential provider.
> This minor PR fixes that.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (DRILL-8286) GoogleSheets StoragePlugin displaying ClientID and ClientSecret in Config

2022-08-25 Thread Charles Givre (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Givre updated DRILL-8286:
-
Affects Version/s: 2.0.0

> GoogleSheets StoragePlugin displaying ClientID and ClientSecret in Config
> -
>
> Key: DRILL-8286
> URL: https://issues.apache.org/jira/browse/DRILL-8286
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 2.0.0
>
>
> The GoogleSheets storage plugin is rendering the `clientID` and 
> `clientSecret` in the config body instead of in the credential provider.
> This minor PR fixes that.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (DRILL-8286) GoogleSheets StoragePlugin displaying ClientID and ClientSecret in Config

2022-08-25 Thread Charles Givre (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Givre updated DRILL-8286:
-
Fix Version/s: 2.0.0

> GoogleSheets StoragePlugin displaying ClientID and ClientSecret in Config
> -
>
> Key: DRILL-8286
> URL: https://issues.apache.org/jira/browse/DRILL-8286
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 2.0.0
>
>
> The GoogleSheets storage plugin is rendering the `clientID` and 
> `clientSecret` in the config body instead of in the credential provider.
> This minor PR fixes that.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (DRILL-8286) GoogleSheets StoragePlugin displaying ClientID and ClientSecret in Config

2022-08-25 Thread Charles Givre (Jira)
Charles Givre created DRILL-8286:


 Summary: GoogleSheets StoragePlugin displaying ClientID and 
ClientSecret in Config
 Key: DRILL-8286
 URL: https://issues.apache.org/jira/browse/DRILL-8286
 Project: Apache Drill
  Issue Type: Bug
Reporter: Charles Givre
Assignee: Charles Givre


The GoogleSheets storage plugin is rendering the `clientID` and `clientSecret` 
in the config body instead of in the credential provider.

This minor PR fixes that.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (DRILL-8284) Apache SQL Query failing while accessing the Json with complex data model

2022-08-23 Thread Charles Givre (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Givre closed DRILL-8284.

Resolution: Not A Bug

> Apache SQL Query failing while accessing the Json with complex data model
> -
>
> Key: DRILL-8284
> URL: https://issues.apache.org/jira/browse/DRILL-8284
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: SHUBHAM KUMAR
>Priority: Major
>
> Apache SQL Query failing while accessing the Json with complex data model. 
> Complex Json: 
> Map object inside another map object then Array Object. 
> Case1: When we have nested objects within array map, and map within map. 
> {"attributes": [
>                     {
>                         "name": "webBrandName",
>                         "value": {
>                             "en-US": "Smashbox"
>                         }
>                     },
>                     {
>                         "name": "startDate",
>                         "value": "2011-07-25T15:30:00.000Z"
>                     }
>                 ]
> }
> Case2: Having array with multiple map items with diff data types. eg. String 
> and Boolean both type. 
> {"attributes": [
>                     {
>                         "name": "startDate",
>                         "value": "2011-07-25T15:30:00.000Z"
>                     },
>                     {
>                         "name": "hasCBD",
>                         "value": false
>                     }
>                 ]
> }
> Query: 
> select flatten(attributes) as Var from dfs.`/filepath/filename.json`
>  
> Error: 
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: 
> IndexOutOfBoundsException: readerIndex: 0, writerIndex: 1764642048 (expected: 
> 0 <= readerIndex <= writerIndex <= capacity(0)) Fragment: 0:0 Please, refer 
> to logs for more information. [Error Id: c5a3b8fa-cad1-4c9a-8673-de5745e9170b 
> on GGNUWT461535L.ad.infosys.com:31010]
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8284) Apache SQL Query failing while accessing the Json with complex data model

2022-08-23 Thread Charles Givre (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17583946#comment-17583946
 ] 

Charles Givre commented on DRILL-8284:
--

[~shubhamsmvdu] This is normal behavior for Drill.  The issue you are 
encountering is a schema change exception on the `value` field.  In both cases, 
what is happening is that Drill first encounters one data type and creates a 
vector for that, then in the next row encounters the same field with a 
different data type and throws an exception. 

There are a few options:
 # If you use the V1 JSON reader, you can enable the UNION data type, which 
allows heterogeneous data types.  We are working on enabling this for the V2 
JSON reader, but for the moment it is not available there.  This is a variable 
which must be set at the system level.
 # Provide a schema: you can provide a schema for the field `value` and set 
`mode` to JSON.  I'd have to dig up the documentation for this, but what this 
does is force the field to a string.  If JSON objects are encountered, those 
will be rendered as strings. 

I'm going to close this as this is expected behavior.  Please use GitHub issues 
or Slack to continue the conversation. 
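
To see the conflict outside of Drill, the minimal Python sketch below (not 
Drill code; the helper names are made up for the example) scans the 
`attributes` array from Case 1 and reports every field that appears with more 
than one JSON type.

```python
import json
from collections import defaultdict


def field_types(records: list) -> dict:
    """Map each field name to the set of Python type names seen for it."""
    seen = defaultdict(set)
    for record in records:
        for name, value in record.items():
            seen[name].add(type(value).__name__)
    return dict(seen)


def conflicting_fields(records: list) -> dict:
    """Return only the fields that occur with more than one type."""
    return {name: sorted(types)
            for name, types in field_types(records).items()
            if len(types) > 1}


doc = json.loads("""
{"attributes": [
    {"name": "webBrandName", "value": {"en-US": "Smashbox"}},
    {"name": "startDate", "value": "2011-07-25T15:30:00.000Z"}
]}
""")

print(conflicting_fields(doc["attributes"]))  # {'value': ['dict', 'str']}
```

Running the same check on Case 2 reports `value` as both `bool` and `str`.  
Any field listed here is one that needs the UNION type or a provided schema.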

> Apache SQL Query failing while accessing the Json with complex data model
> -
>
> Key: DRILL-8284
> URL: https://issues.apache.org/jira/browse/DRILL-8284
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: SHUBHAM KUMAR
>Priority: Major
>
> Apache SQL Query failing while accessing the Json with complex data model. 
> Complex Json: 
> Map object inside another map object then Array Object. 
> Case1: When we have nested objects within array map, and map within map. 
> {"attributes": [
>                     {
>                         "name": "webBrandName",
>                         "value": {
>                             "en-US": "Smashbox"
>                         }
>                     },
>                     {
>                         "name": "startDate",
>                         "value": "2011-07-25T15:30:00.000Z"
>                     }
>                 ]
> }
> Case2: Having array with multiple map items with diff data types. eg. String 
> and Boolean both type. 
> {"attributes": [
>                     {
>                         "name": "startDate",
>                         "value": "2011-07-25T15:30:00.000Z"
>                     },
>                     {
>                         "name": "hasCBD",
>                         "value": false
>                     }
>                 ]
> }
> Query: 
> select flatten(attributes) as Var from dfs.`/filepath/filename.json`
>  
> Error: 
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: 
> IndexOutOfBoundsException: readerIndex: 0, writerIndex: 1764642048 (expected: 
> 0 <= readerIndex <= writerIndex <= capacity(0)) Fragment: 0:0 Please, refer 
> to logs for more information. [Error Id: c5a3b8fa-cad1-4c9a-8673-de5745e9170b 
> on GGNUWT461535L.ad.infosys.com:31010]
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (DRILL-8276) Add Support for User Translation for Splunk

2022-08-07 Thread Charles Givre (Jira)
Charles Givre created DRILL-8276:


 Summary: Add Support for User Translation for Splunk
 Key: DRILL-8276
 URL: https://issues.apache.org/jira/browse/DRILL-8276
 Project: Apache Drill
  Issue Type: Task
  Components: Storage - Other
Affects Versions: 1.20.2
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 2.0.0


This PR adds support for user translation to Splunk.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (DRILL-8271) Make Storage and Format Config Case Insensitive

2022-07-25 Thread Charles Givre (Jira)
Charles Givre created DRILL-8271:


 Summary: Make Storage and Format Config Case Insensitive
 Key: DRILL-8271
 URL: https://issues.apache.org/jira/browse/DRILL-8271
 Project: Apache Drill
  Issue Type: Task
Reporter: Charles Givre






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (DRILL-8270) Delete obsolete zookeeper patch (tech debt)

2022-07-25 Thread Charles Givre (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Givre reassigned DRILL-8270:


Assignee: Charles Givre

> Delete obsolete zookeeper patch (tech debt)
> ---
>
> Key: DRILL-8270
> URL: https://issues.apache.org/jira/browse/DRILL-8270
> Project: Apache Drill
>  Issue Type: Task
>Affects Versions: 1.20.1
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Minor
> Fix For: 2.0.0
>
>
> Patch files are in the `.gitignore` and yet a .patch file 
> ([contrib/native/client/patches/zookeeper-3.4.6-x64.patch|https://github.com/apache/drill/pull/2585/files/06625708f0419442d823d0025afa6e043fffcc4e#diff-0b6d0330fc567658b83263c83e902ec72dc0e95bb0ad0830736dc5cae8449168])
>  somehow has been included in the Drill build.  This PR removes it. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (DRILL-8270) Delete obsolete zookeeper patch (tech debt)

2022-07-25 Thread Charles Givre (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Givre updated DRILL-8270:
-
Affects Version/s: 1.20.1

> Delete obsolete zookeeper patch (tech debt)
> ---
>
> Key: DRILL-8270
> URL: https://issues.apache.org/jira/browse/DRILL-8270
> Project: Apache Drill
>  Issue Type: Task
>Affects Versions: 1.20.1
>Reporter: Charles Givre
>Priority: Minor
>
> Patch files are in the `.gitignore` and yet a .patch file 
> ([contrib/native/client/patches/zookeeper-3.4.6-x64.patch|https://github.com/apache/drill/pull/2585/files/06625708f0419442d823d0025afa6e043fffcc4e#diff-0b6d0330fc567658b83263c83e902ec72dc0e95bb0ad0830736dc5cae8449168])
>  somehow has been included in the Drill build.  This PR removes it. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (DRILL-8270) Delete obsolete zookeeper patch (tech debt)

2022-07-25 Thread Charles Givre (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Givre updated DRILL-8270:
-
Description: Patch files are in the `.gitignore` and yet a .patch file 
([contrib/native/client/patches/zookeeper-3.4.6-x64.patch|https://github.com/apache/drill/pull/2585/files/06625708f0419442d823d0025afa6e043fffcc4e#diff-0b6d0330fc567658b83263c83e902ec72dc0e95bb0ad0830736dc5cae8449168])
 somehow has been included in the Drill build.  This PR removes it. 

> Delete obsolete zookeeper patch (tech debt)
> ---
>
> Key: DRILL-8270
> URL: https://issues.apache.org/jira/browse/DRILL-8270
> Project: Apache Drill
>  Issue Type: Task
>Reporter: Charles Givre
>Priority: Minor
>
> Patch files are in the `.gitignore` and yet a .patch file 
> ([contrib/native/client/patches/zookeeper-3.4.6-x64.patch|https://github.com/apache/drill/pull/2585/files/06625708f0419442d823d0025afa6e043fffcc4e#diff-0b6d0330fc567658b83263c83e902ec72dc0e95bb0ad0830736dc5cae8449168])
>  somehow has been included in the Drill build.  This PR removes it. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (DRILL-8270) Delete obsolete zookeeper patch (tech debt)

2022-07-25 Thread Charles Givre (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Givre updated DRILL-8270:
-
Fix Version/s: 2.0.0

> Delete obsolete zookeeper patch (tech debt)
> ---
>
> Key: DRILL-8270
> URL: https://issues.apache.org/jira/browse/DRILL-8270
> Project: Apache Drill
>  Issue Type: Task
>Affects Versions: 1.20.1
>Reporter: Charles Givre
>Priority: Minor
> Fix For: 2.0.0
>
>
> Patch files are in the `.gitignore` and yet a .patch file 
> ([contrib/native/client/patches/zookeeper-3.4.6-x64.patch|https://github.com/apache/drill/pull/2585/files/06625708f0419442d823d0025afa6e043fffcc4e#diff-0b6d0330fc567658b83263c83e902ec72dc0e95bb0ad0830736dc5cae8449168])
>  somehow has been included in the Drill build.  This PR removes it. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (DRILL-8270) Delete obsolete zookeeper patch (tech debt)

2022-07-25 Thread Charles Givre (Jira)
Charles Givre created DRILL-8270:


 Summary: Delete obsolete zookeeper patch (tech debt)
 Key: DRILL-8270
 URL: https://issues.apache.org/jira/browse/DRILL-8270
 Project: Apache Drill
  Issue Type: Task
Reporter: Charles Givre






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8185) EVF 2 doesn't handle map arrays or nested maps

2022-07-18 Thread Charles Givre (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17568259#comment-17568259
 ] 

Charles Givre commented on DRILL-8185:
--

Hey [~Paul.Rogers], 

Just as an FYSA, we've been doing some work to consolidate and improve the 
overall handling of JSON in Drill (DRILL-8241).  The overarching goal is to 
remove all the deprecated JSON readers and use the EVF2 JSON reader throughout 
Drill.  [~vitalii] has been working on DRILL-5955, with the goal of 
"re-enabling" union vectors in EVF2.  He has a draft PR, but I'm not sure how 
close we are to completion. 

We've also done some work to make the JSON configuration more granular and 
introduced a JSONOptions class which standardizes the JSON configuration for 
all plugins that use JSON (DRILL-8243).

> EVF 2 doesn't handle map arrays or nested maps
> -
>
> Key: DRILL-8185
> URL: https://issues.apache.org/jira/browse/DRILL-8185
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.20.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
> Fix For: 2.0.0
>
>
> When converting Avro, Luoc found two bugs in how EVF 2 (the projection 
> mechanism) handles map array and nested maps



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (DRILL-8244) HTTP_Request Not Passing Down Config Variable

2022-06-08 Thread Charles Givre (Jira)
Charles Givre created DRILL-8244:


 Summary: HTTP_Request Not Passing Down Config Variable
 Key: DRILL-8244
 URL: https://issues.apache.org/jira/browse/DRILL-8244
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - Other
Affects Versions: 1.20.1
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 2.0.0


The http_request UDF was not passing the provided schema and other config 
parameters down to the jsonLoader.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (DRILL-8241) Remove Deprecated JSON Reader

2022-06-03 Thread Charles Givre (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Givre updated DRILL-8241:
-
Description: 
This is a master ticket to remove the deprecated v1 JSON reader from Drill.  
This JSON reader is used in several places and removing it will ensure 
consistent behavior across all data sources. 

The V2, EVF based JSON reader has several advantages, including the possibility 
of schema provisioning, limit pushdowns and others.

Here are the tasks which need to be completed to fully remove the v1 JSON 
reader.
 * Complete DRILL-5955 which adds support for the UNION vector to the EVF Json 
reader.
 * Convert the convert_fromJSON functions to V2 (DRILL-8239)
 * Convert the Druid Storage Plugin to V2
 * Convert MongoDB Storage Plugin to V2.  (Note the MongoDB plugin uses an 
EVF-based BSON reader as well as the V1 JSON reader)
 * Remove all V1-based unit tests
 * Migrate the JsonOptions from the HTTP Storage Plugin to global location to 
allow other plugins and users of JSON to set JSON configuration at a more 
granular level. (DRILL-8243)
 * Remove extraneous configuration options.
 * Bug fix HTTP UDFs (DRILL-8242)

  was:
This is a master ticket to remove the deprecated v1 JSON reader from Drill.  
This JSON reader is used in several places and removing it will ensure 
consistent behavior across all data sources. 

The V2, EVF based JSON reader has several advantages, including the possibility 
of schema provisioning, limit pushdowns and others.

Here are the tasks which need to be completed to fully remove the v1 JSON 
reader.
 * Complete DRILL-5955 which adds support for the UNION vector to the EVF Json 
reader.
 * Convert the convert_fromJSON functions to V2 (DRILL-8239)
 * Convert the Druid Storage Plugin to V2
 * Convert MongoDB Storage Plugin to V2.  (Note the MongoDB plugin uses an 
EVF-based BSON reader as well as the V1 JSON reader)
 * Remove all V1-based unit tests
 * Migrate the JsonOptions from the HTTP Storage Plugin to global location to 
allow other plugins and users of JSON to set JSON configuration at a more 
granular level.
 * Remove extraneous configuration options.
 * Bug fix HTTP UDFs (DRILL-8242)


> Remove Deprecated JSON Reader
> -
>
> Key: DRILL-8241
> URL: https://issues.apache.org/jira/browse/DRILL-8241
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - JSON
>Affects Versions: 1.20.1
>Reporter: Charles Givre
>Priority: Major
> Fix For: 2.0.0
>
>
> This is a master ticket to remove the deprecated v1 JSON reader from Drill.  
> This JSON reader is used in several places and removing it will ensure 
> consistent behavior across all data sources. 
> The V2, EVF based JSON reader has several advantages, including the 
> possibility of schema provisioning, limit pushdowns and others.
> Here are the tasks which need to be completed to fully remove the v1 JSON 
> reader.
>  * Complete DRILL-5955 which adds support for the UNION vector to the EVF 
> Json reader.
>  * Convert the convert_fromJSON functions to V2 (DRILL-8239)
>  * Convert the Druid Storage Plugin to V2
>  * Convert MongoDB Storage Plugin to V2.  (Note the MongoDB plugin uses an 
> EVF-based BSON reader as well as the V1 JSON reader)
>  * Remove all V1-based unit tests
>  * Migrate the JsonOptions from the HTTP Storage Plugin to global location to 
> allow other plugins and users of JSON to set JSON configuration at a more 
> granular level. (DRILL-8243)
>  * Remove extraneous configuration options.
>  * Bug fix HTTP UDFs (DRILL-8242)



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

