[jira] [Updated] (DRILL-8321) Change kafka_2.13 dependency scope to test

2022-09-29 Thread James Turton (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Turton updated DRILL-8321:

Fix Version/s: 1.20.3
   (was: 2.0.0)

> Change kafka_2.13 dependency scope to test 
> ---
>
> Key: DRILL-8321
> URL: https://issues.apache.org/jira/browse/DRILL-8321
> Project: Apache Drill
>  Issue Type: Task
>Affects Versions: 1.20.2
>Reporter: Maksym Rymar
>Assignee: Maksym Rymar
>Priority: Minor
> Fix For: 1.20.3
>
>
> Drill has 2 Scala dependencies:
>  * {{org.apache.kafka.kafka_2.13}}
>  * {{com.madhukaraphatak.java-sizeof_2.11}}
> which target different Scala versions, 2.13 and 2.11. Scala does not keep
> binary compatibility across these releases, so Drill can't safely ship 2
> libraries compiled against different Scala versions.
> There are only 2 ways to solve the issue:
>  # Compile both libraries against the same Scala version.
>  # Remove one of the libraries from Drill.
> {{kafka_2.13}} is a Kafka server-side dependency and is unnecessary on the
> client side (Drill). It was probably added to Drill in compile scope by
> mistake, while it is needed only in test scope.
> So {{kafka_2.13}} can be removed from compile scope. This will reduce the
> Drill package size and, most importantly, resolve the Scala version conflict.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8324) remove dependency on java-sizeof jar

2022-09-29 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17611239#comment-17611239
 ] 

ASF GitHub Bot commented on DRILL-8324:
---

pjfanning opened a new pull request, #2665:
URL: https://github.com/apache/drill/pull/2665

   ## Description
   
   Remove use of the java-sizeof jar. It is an old copy of Spark code. I have
ported the latest Spark code to Java and inlined it.
   
   ## Documentation
   (Please describe user-visible changes similar to what should appear in the 
Drill documentation.)
   
   ## Testing
   (Please describe how this PR has been tested.)
   




> remove dependency on java-sizeof jar
> 
>
> Key: DRILL-8324
> URL: https://issues.apache.org/jira/browse/DRILL-8324
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: PJ Fanning
>Priority: Major
>
> [https://github.com/phatak-dev/java-sizeof] is not maintained and ties us to 
> a very old version of Scala.
> It looks like it should be easy to rewrite the code in Java and have it in 
> Drill itself.





[jira] [Commented] (DRILL-8325) Convert PDF Format Plugin to EVF V2

2022-09-29 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17611208#comment-17611208
 ] 

ASF GitHub Bot commented on DRILL-8325:
---

cgivre opened a new pull request, #2664:
URL: https://github.com/apache/drill/pull/2664

   # [DRILL-8325](https://issues.apache.org/jira/browse/DRILL-8325): Convert 
PDF Format Plugin to EVF V2
   
   ## Description
   Converts the PDF Format reader to use EVF V2. 
   
   ## Documentation
   No user-facing changes.
   
   ## Testing
   Ran existing unit tests.  For star queries, it seemed that EVF2 doesn't 
always return columns in the same order, so I had to refactor one unit test.




> Convert PDF Format Plugin to EVF V2
> ---
>
> Key: DRILL-8325
> URL: https://issues.apache.org/jira/browse/DRILL-8325
> Project: Apache Drill
>  Issue Type: Task
>  Components: Format - PDF
>Affects Versions: 1.20.2
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Minor
> Fix For: 2.0.0
>
>
> Converts the PDF Format Reader to EVF V2. 





[jira] [Commented] (DRILL-8323) upgrade commons-text to 1.10.0

2022-09-29 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17611209#comment-17611209
 ] 

ASF GitHub Bot commented on DRILL-8323:
---

cgivre merged PR #2663:
URL: https://github.com/apache/drill/pull/2663




> upgrade commons-text to 1.10.0
> --
>
> Key: DRILL-8323
> URL: https://issues.apache.org/jira/browse/DRILL-8323
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: PJ Fanning
>Priority: Major
>
> [https://commons.apache.org/proper/commons-text/changes-report.html#a1.10.0]
> https://issues.apache.org/jira/browse/TEXT-191 affects one of our tests. I
> have fixed the test in my PR; the old expected value was wrong due to the
> TEXT-191 bug.





[jira] [Created] (DRILL-8325) Convert PDF Format Plugin to EVF V2

2022-09-29 Thread Charles Givre (Jira)
Charles Givre created DRILL-8325:


 Summary: Convert PDF Format Plugin to EVF V2
 Key: DRILL-8325
 URL: https://issues.apache.org/jira/browse/DRILL-8325
 Project: Apache Drill
  Issue Type: Task
  Components: Format - PDF
Affects Versions: 1.20.2
Reporter: Charles Givre
Assignee: Charles Givre
 Fix For: 2.0.0


Converts the PDF Format Reader to EVF V2. 





[jira] [Updated] (DRILL-8323) upgrade commons-text to 1.10.0

2022-09-29 Thread PJ Fanning (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

PJ Fanning updated DRILL-8323:
--
Description: 
[https://commons.apache.org/proper/commons-text/changes-report.html#a1.10.0]

https://issues.apache.org/jira/browse/TEXT-191 affects one of our tests. I
have fixed the test in my PR; the old expected value was wrong due to the
TEXT-191 bug.

  was:https://commons.apache.org/proper/commons-text/changes-report.html#a1.10.0


> upgrade commons-text to 1.10.0
> --
>
> Key: DRILL-8323
> URL: https://issues.apache.org/jira/browse/DRILL-8323
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: PJ Fanning
>Priority: Major
>
> [https://commons.apache.org/proper/commons-text/changes-report.html#a1.10.0]
> https://issues.apache.org/jira/browse/TEXT-191 affects one of our tests. I
> have fixed the test in my PR; the old expected value was wrong due to the
> TEXT-191 bug.





[jira] [Commented] (DRILL-8323) upgrade commons-text to 1.10.0

2022-09-29 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17611142#comment-17611142
 ] 

ASF GitHub Bot commented on DRILL-8323:
---

pjfanning commented on code in PR #2663:
URL: https://github.com/apache/drill/pull/2663#discussion_r983816977


##
contrib/udfs/src/test/java/org/apache/drill/exec/udfs/TestStringDistanceFunctions.java:
##
@@ -59,15 +59,15 @@ public void testJaccardDistance() throws Exception {
     double result = queryBuilder()
         .sql("select jaccard_distance( 'Big car', 'red car' ) as distance FROM (VALUES(1))")
         .singletonDouble();
-    assertEquals(0.56, result, 0.0);
+    assertEquals(0.5556, result, 0.0);
   }
 
   @Test
   public void testJaroDistance() throws Exception {
     double result = queryBuilder()
         .sql("select jaro_distance( 'Big car', 'red car' ) as distance FROM (VALUES(1))")
         .singletonDouble();
-    assertEquals(0.7142857142857143, result, 0.0);
+    assertEquals(0.2857142857142857, result, 0.0);
   }

Review Comment:
   change needed because of https://issues.apache.org/jira/browse/TEXT-191
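
   The new Jaccard expectation in the diff above can be checked by hand. The
   sketch below is only an illustration, assuming the distance is taken over
   the sets of distinct characters in each string; `JaccardSketch` and its
   methods are hypothetical names and are not Drill or commons-text code.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper for illustration; not part of Drill or commons-text.
class JaccardSketch {

    // Collects the distinct characters of a string into a set.
    static Set<Character> chars(String s) {
        Set<Character> set = new HashSet<>();
        for (char c : s.toCharArray()) {
            set.add(c);
        }
        return set;
    }

    // Jaccard distance over character sets: 1 - |A ∩ B| / |A ∪ B|.
    static double distance(String left, String right) {
        Set<Character> a = chars(left);
        Set<Character> b = chars(right);

        Set<Character> union = new HashSet<>(a);
        union.addAll(b);
        Set<Character> intersection = new HashSet<>(a);
        intersection.retainAll(b);

        if (union.isEmpty()) {
            return 0.0; // assumption: two empty strings are treated as identical
        }
        // Written as (|union| - |intersection|) / |union| to avoid a subtraction from 1.0.
        return (double) (union.size() - intersection.size()) / union.size();
    }

    public static void main(String[] args) {
        // 'Big car' has 7 distinct characters, 'red car' has 6, and they share 4,
        // so the union has 9 and the distance is 5/9, roughly 0.5556.
        System.out.println(distance("Big car", "red car"));
    }
}
```

   Under that assumption the unrounded value is 5/9, which is closer to the
   new expectation of 0.5556 than to the old one of 0.56.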





> upgrade commons-text to 1.10.0
> --
>
> Key: DRILL-8323
> URL: https://issues.apache.org/jira/browse/DRILL-8323
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: PJ Fanning
>Priority: Major
>
> [https://commons.apache.org/proper/commons-text/changes-report.html#a1.10.0]
> https://issues.apache.org/jira/browse/TEXT-191 affects one of our tests. I
> have fixed the test in my PR; the old expected value was wrong due to the
> TEXT-191 bug.





[jira] [Created] (DRILL-8324) remove dependency on java-sizeof jar

2022-09-29 Thread PJ Fanning (Jira)
PJ Fanning created DRILL-8324:
-

 Summary: remove dependency on java-sizeof jar
 Key: DRILL-8324
 URL: https://issues.apache.org/jira/browse/DRILL-8324
 Project: Apache Drill
  Issue Type: Improvement
Reporter: PJ Fanning


[https://github.com/phatak-dev/java-sizeof] is not maintained and ties us to a 
very old version of Scala.

It looks like it should be easy to rewrite the code in Java and have it in 
Drill itself.





[jira] [Commented] (DRILL-8323) upgrade commons-text to 1.10.0

2022-09-29 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17611082#comment-17611082
 ] 

ASF GitHub Bot commented on DRILL-8323:
---

pjfanning opened a new pull request, #2663:
URL: https://github.com/apache/drill/pull/2663

   (and excel-streaming-reader that has a transitive dependency on it)
   
   ## Description
   
   Some important bugs are fixed in the latest version.




> upgrade commons-text to 1.10.0
> --
>
> Key: DRILL-8323
> URL: https://issues.apache.org/jira/browse/DRILL-8323
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: PJ Fanning
>Priority: Major
>
> https://commons.apache.org/proper/commons-text/changes-report.html#a1.10.0





[jira] [Created] (DRILL-8323) upgrade commons-text to 1.10.0

2022-09-29 Thread PJ Fanning (Jira)
PJ Fanning created DRILL-8323:
-

 Summary: upgrade commons-text to 1.10.0
 Key: DRILL-8323
 URL: https://issues.apache.org/jira/browse/DRILL-8323
 Project: Apache Drill
  Issue Type: Improvement
Reporter: PJ Fanning


https://commons.apache.org/proper/commons-text/changes-report.html#a1.10.0





[jira] [Commented] (DRILL-8321) Change kafka_2.13 dependency scope to test

2022-09-29 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17611063#comment-17611063
 ] 

ASF GitHub Bot commented on DRILL-8321:
---

rymarm opened a new pull request, #2662:
URL: https://github.com/apache/drill/pull/2662

   # [DRILL-8321](https://issues.apache.org/jira/browse/DRILL-8321): Change 
kafka_2.13 dependency scope to test 
   
   ## Description
   Move `org.apache.kafka.kafka_2.13` to test dependency scope. This prevents a
conflict between the Scala versions of `org.apache.kafka.kafka_2.13` (Scala 2.13)
and `com.madhukaraphatak.java-sizeof_2.11` (Scala 2.11). It fixes exceptions
such as the following, which could appear during Drill-on-YARN startup:
   ```
   Caused by: java.util.ServiceConfigurationError: 
com.fasterxml.jackson.databind.Module: Provider 
com.fasterxml.jackson.module.scala.DefaultScalaModule could not be instantiated
   ...
   Caused by: java.lang.NoSuchMethodError: 'scala.collection.immutable.Seq$ 
scala.package$.Seq()'
   ```
   
   It also removes unnecessary dependencies from Drill's classpath.
   
   Probably `org.apache.kafka.kafka_2.13` was added to compile scope by 
mistake. 
   
   ## Documentation
   No changes.
   
   ## Testing
   Unit tests pass successfully. Tried to connect to Kafka and read a topic - 
everything works fine.
   




> Change kafka_2.13 dependency scope to test 
> ---
>
> Key: DRILL-8321
> URL: https://issues.apache.org/jira/browse/DRILL-8321
> Project: Apache Drill
>  Issue Type: Task
>Affects Versions: 1.20.2
>Reporter: Maksym Rymar
>Assignee: Maksym Rymar
>Priority: Minor
> Fix For: 2.0.0
>
>
> Drill has 2 Scala dependencies:
>  * {{org.apache.kafka.kafka_2.13}}
>  * {{com.madhukaraphatak.java-sizeof_2.11}}
> which target different Scala versions, 2.13 and 2.11. Scala does not keep
> binary compatibility across these releases, so Drill can't safely ship 2
> libraries compiled against different Scala versions.
> There are only 2 ways to solve the issue:
>  # Compile both libraries against the same Scala version.
>  # Remove one of the libraries from Drill.
> {{kafka_2.13}} is a Kafka server-side dependency and is unnecessary on the
> client side (Drill). It was probably added to Drill in compile scope by
> mistake, while it is needed only in test scope.
> So {{kafka_2.13}} can be removed from compile scope. This will reduce the
> Drill package size and, most importantly, resolve the Scala version conflict.





[jira] [Commented] (DRILL-8321) Change kafka_2.13 dependency scope to test

2022-09-29 Thread PJ Fanning (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17611053#comment-17611053
 ] 

PJ Fanning commented on DRILL-8321:
---

Any chance we can drop [https://github.com/phatak-dev/java-sizeof] ? If it 
isn't published for recent scala versions, it is a real millstone around our 
necks.

> Change kafka_2.13 dependency scope to test 
> ---
>
> Key: DRILL-8321
> URL: https://issues.apache.org/jira/browse/DRILL-8321
> Project: Apache Drill
>  Issue Type: Task
>Affects Versions: 1.20.2
>Reporter: Maksym Rymar
>Assignee: Maksym Rymar
>Priority: Minor
> Fix For: 2.0.0
>
>
> Drill has 2 Scala dependencies:
>  * {{org.apache.kafka.kafka_2.13}}
>  * {{com.madhukaraphatak.java-sizeof_2.11}}
> which target different Scala versions, 2.13 and 2.11. Scala does not keep
> binary compatibility across these releases, so Drill can't safely ship 2
> libraries compiled against different Scala versions.
> There are only 2 ways to solve the issue:
>  # Compile both libraries against the same Scala version.
>  # Remove one of the libraries from Drill.
> {{kafka_2.13}} is a Kafka server-side dependency and is unnecessary on the
> client side (Drill). It was probably added to Drill in compile scope by
> mistake, while it is needed only in test scope.
> So {{kafka_2.13}} can be removed from compile scope. This will reduce the
> Drill package size and, most importantly, resolve the Scala version conflict.





[jira] [Commented] (DRILL-7878) Fix LGTM Alerts

2022-09-29 Thread PJ Fanning (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17611051#comment-17611051
 ] 

PJ Fanning commented on DRILL-7878:
---

[~yaybeNo] can this be closed? lgtm is closing down and many of the issues have 
been dealt with anyway

> Fix LGTM Alerts
> ---
>
> Key: DRILL-7878
> URL: https://issues.apache.org/jira/browse/DRILL-7878
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Evan Wong
>Priority: Major
>
> Try and deal with all alerts from LGTM badge
> [https://lgtm.com/projects/g/apache/drill/alerts/?mode=list]
>  





[jira] [Updated] (DRILL-8322) Add a list of scanned plugin names to the query profile

2022-09-29 Thread James Turton (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Turton updated DRILL-8322:

Description: A useful piece of information about a query is the set of 
plugins that it attempted to scan. While some such information is present in 
the physical plan text in the query profile, an easy to parse, sorted and 
deduplicated list of plugin names is not.  (was: A useful piece of information 
about a query is the set of plugins that it attempted to scan. While some such 
information is present in the physical plan text in the query profile, an easy 
to parse sorted and deuplicated list of plugin names is not.)

> Add a list of scanned plugin names to the query profile
> ---
>
> Key: DRILL-8322
> URL: https://issues.apache.org/jira/browse/DRILL-8322
> Project: Apache Drill
>  Issue Type: Improvement
>  Components:  Server
>Affects Versions: 1.20.2
>Reporter: James Turton
>Priority: Minor
> Fix For: 2.0.0
>
>
> A useful piece of information about a query is the set of plugins that it 
> attempted to scan. While some such information is present in the physical 
> plan text in the query profile, an easy to parse, sorted and deduplicated 
> list of plugin names is not.





[jira] [Created] (DRILL-8322) Add a list of scanned plugin names to the query profile

2022-09-29 Thread James Turton (Jira)
James Turton created DRILL-8322:
---

 Summary: Add a list of scanned plugin names to the query profile
 Key: DRILL-8322
 URL: https://issues.apache.org/jira/browse/DRILL-8322
 Project: Apache Drill
  Issue Type: Improvement
  Components:  Server
Affects Versions: 1.20.2
Reporter: James Turton
 Fix For: 2.0.0


A useful piece of information about a query is the set of plugins that it 
attempted to scan. While some such information is present in the physical plan 
text in the query profile, an easy to parse sorted and deuplicated list of 
plugin names is not.
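
As a rough illustration of what an easy-to-parse, sorted and deduplicated
list could look like, here is a minimal sketch. `ScannedPluginsSketch`, its
method, and the plugin names are hypothetical and do not reflect Drill's
actual query profile code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Hypothetical helper for illustration; not Drill's profile implementation.
class ScannedPluginsSketch {

    // A TreeSet drops duplicates and keeps its elements in natural (sorted) order.
    static List<String> scannedPluginNames(List<String> pluginPerScan) {
        return new ArrayList<>(new TreeSet<>(pluginPerScan));
    }

    public static void main(String[] args) {
        // Assumed input: one plugin name per scan operator in the plan.
        List<String> scans = Arrays.asList("dfs", "kafka", "dfs", "cp");
        System.out.println(scannedPluginNames(scans)); // prints [cp, dfs, kafka]
    }
}
```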





[jira] [Created] (DRILL-8321) Change kafka_2.13 dependency scope to test

2022-09-29 Thread Maksym Rymar (Jira)
Maksym Rymar created DRILL-8321:
---

 Summary: Change kafka_2.13 dependency scope to test 
 Key: DRILL-8321
 URL: https://issues.apache.org/jira/browse/DRILL-8321
 Project: Apache Drill
  Issue Type: Task
Affects Versions: 1.20.2
Reporter: Maksym Rymar
Assignee: Maksym Rymar
 Fix For: 2.0.0


Drill has 2 Scala dependencies:
 * {{org.apache.kafka.kafka_2.13}}

 * {{com.madhukaraphatak.java-sizeof_2.11}}

which target different Scala versions, 2.13 and 2.11. Scala does not keep
binary compatibility across these releases, so Drill can't safely ship 2
libraries compiled against different Scala versions.

There are only 2 ways to solve the issue:
 # Compile both libraries against the same Scala version.

 # Remove one of the libraries from Drill.

{{kafka_2.13}} is a Kafka server-side dependency and is unnecessary on the
client side (Drill). It was probably added to Drill in compile scope by
mistake, while it is needed only in test scope.

So {{kafka_2.13}} can be removed from compile scope. This will reduce the
Drill package size and, most importantly, resolve the Scala version conflict.
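
The conflict described above is visible in the artifact names, because Scala
libraries conventionally carry a `_<Scala binary version>` suffix. The sketch
below groups artifact IDs by that suffix and flags a list that mixes versions.
`ScalaSuffixSketch` is a hypothetical helper, and the `kafka-clients` entry is
an assumed example of a non-Scala artifact rather than something taken from
Drill's build.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Drill's build.
class ScalaSuffixSketch {

    // Matches a trailing Scala binary-version suffix such as "_2.11" or "_2.13".
    private static final Pattern SUFFIX = Pattern.compile("_(\\d+\\.\\d+)$");

    // Groups artifact IDs by Scala suffix; IDs without a suffix are ignored.
    static Map<String, List<String>> byScalaVersion(List<String> artifactIds) {
        Map<String, List<String>> groups = new TreeMap<>();
        for (String id : artifactIds) {
            Matcher m = SUFFIX.matcher(id);
            if (m.find()) {
                groups.computeIfAbsent(m.group(1), k -> new ArrayList<>()).add(id);
            }
        }
        return groups;
    }

    // A dependency list conflicts if it mixes more than one Scala version.
    static boolean hasConflict(List<String> artifactIds) {
        return byScalaVersion(artifactIds).size() > 1;
    }

    public static void main(String[] args) {
        List<String> before = Arrays.asList("kafka_2.13", "java-sizeof_2.11", "kafka-clients");
        List<String> after = Arrays.asList("java-sizeof_2.11", "kafka-clients");

        System.out.println(byScalaVersion(before)); // prints {2.11=[java-sizeof_2.11], 2.13=[kafka_2.13]}
        System.out.println(hasConflict(before));    // prints true
        System.out.println(hasConflict(after));     // prints false
    }
}
```

This models only the compile-scope list. It is a naming heuristic, not a
substitute for inspecting the resolved dependency tree.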


