Re: drill tests not passing

2023-07-13 Thread Charles Givre
Hi Mike,
You can just build Drill with the -DskipTests=true option and that should
work.  I do my development in IntelliJ, and just run the relevant unit
tests there.  Then, to test Drill in its entirety, I use the CI in GitHub,
which works most of the time. ;-)


-- C
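
As a rough illustration of that workflow, the sketch below shows a build without tests followed by a run of a single test class. It is not an official Drill recipe: `drill_test_cmd` is a hypothetical helper that only prints a command, `-pl` and `-Dtest` are standard Maven and Surefire options, and the module path and test class name are taken from the log excerpt quoted in this thread.

```shell
# Build all modules but skip test execution:
#   mvn clean install -DskipTests

# Hypothetical helper: print the Maven command that runs one test class
# in one module. It only prints the command; it does not run Maven.
drill_test_cmd() {
  module=$1      # module path relative to the repo root, e.g. exec/java-exec
  test_class=$2  # simple class name of the test to run
  printf 'mvn test -pl %s -Dtest=%s\n' "$module" "$test_class"
}

drill_test_cmd exec/java-exec TestDynamicUDFSupport
```

Running a single module with `-pl` generally assumes the modules it depends on have already been installed into the local Maven repository by an earlier full build.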

On Thu, Jul 13, 2023 at 1:14 PM Mike Beckerle  wrote:

> To answer questions:
>
> 1. Paul: This is a 100% stock build. All I have done is clone the repo
> (master branch), make a new git branch (in case I make future changes), and
> try to build (success) and test (failed so far).
>
> 2. James: The /opt/drill directory I created is owned by my userid and has
> full read/write access for all the development activities. I just put it
> there so it would have a shorter path to fix the first Hive-related glitch
> I encountered with the Linux 255 limit on file pathname length.
>
> I will try the suggested maven command line for non-UTC and see if things
> improve.
>
> The challenge for me as a newbie is: how do I know whether I have everything
> properly configured?
>
> Can I just turn off building and testing of the Hive-related stuff in some
> supported/well-known way?
>
> If so, I'd like to turn off not just Hive, but *as much as
> possible*. I really just need the embedded Drill to work.
>
> I would agree with @Charles Givre that a contrib
> package addition is the ideal approach and that's what I'll be attempting.
>
> -mikeb
>
> On Thu, Jul 13, 2023 at 10:59 AM Charles Givre  wrote:
>
> > I'll add some heresy here... IMHO, for the purposes of developing a DFDL
> > extension, you probably don't need all the Drill tests to run.  For your
> > project, my suggestion would be to add a module to the contrib package
> and
> > that way your changes are relatively self contained.
> > Best,
> > -- C
> >
> >
> >
> > > On Jul 13, 2023, at 10:27 AM, James Turton  wrote:
> > >
> > > Hi Mike
> > >
> > > Here's the command line I use to run tests on a machine that's not in
> > the UTC time zone (plus some unrelated memory size arguments).
> > >
> > > mvn test -Djunit.args="-Duser.timezone=UTC -Duser.language=en
> > -Duser.region=US" -DmemoryMb=2560 -DdirectMemoryMb=2560
> > >
> > > I have one other question to add to Paul's comments - does the OS user
> > that you're running Maven under have write access to all of the source
> tree
> > that you put at /opt/drill?
> > >
> > > On 2023/07/11 22:12, Paul Rogers wrote:
> > >> Hi Mike,
> > >>
> > >> A quick glance at the log suggests a failure in the tests for the JSON
> > >> reader, in the Mongo extended types. Drill's date/time support has
> > >> historically been fragile. Some tests only work if your machine is set
> > to
> > >> use the UTC time zone (or Java is told to pretend that the time is
> UTC.)
> > >> The Mongo types test failure seems to be around a date/time test so
> > maybe
> > >> this is the issue?
> > >>
> > >> There are also failures indicating that the Drillbit (Drill server)
> > died.
> > >> Not sure how this can happen, as tests run Drill embedded (or used
> to.)
> > >> Looking earlier in the logs, it seems that the Drillbit didn't start
> > due to
> > >> UDF (user-defined function) failures:
> > >>
> > >> Found duplicated function in drill-custom-lower.jar:
> > >> custom_lower(VARCHAR-REQUIRED)
> > >> Found duplicated function in built-in: lower(VARCHAR-REQUIRED)
> > >>
> > >> Not sure how this could occur: it should have failed in all builds.
> > >>
> > >> Also:
> > >>
> > >> File
> > >>
> >
> /opt/drill/exec/java-exec/target/org.apache.drill.exec.udf.dynamic.TestDynamicUDFSupport/home/drill/happy/udf/staging/drill-custom-lower-sources.jar
> > >> does not exist on file system file:///
> > >>
> > >> This is complaining that Drill needs the source code (not just class
> > file)
> > >> for its built-in functions. Again, this should not fail in a standard
> > >> build, because if it did, it would fail in all builds.
> > >>
> > >> There are other odd errors as well.
> > >>
> > >> Perhaps we should ask: is this a "stock" build? Check out Drill and
> run
> > >> tests? Or, have you already started making changes for your project?
> > >>
> > >> - Paul
> > >>
> > >>
> > >> On Tue, Jul 11, 2023 at 9:07 AM Mike Beckerle 
> > wrote:
> > >>
> > >>> I have drill building and running its tests. Some tests fail: [ERROR]
> > >>> Tests run: 4366, Failures: 2, Errors: 1, Skipped: 133
> > >>>
> > >>> I am wondering if there is perhaps some setup step that I missed in
> the
> > >>> instructions.
> > >>>
> > >>> I have attached the output from the 'mvn clean install
> > -DskipTests=false'
> > >>> execution. (zipped)
> > >>> I am running on Ubuntu 20.04, definitely have Java 8 setup.
> > >>>
> > >>> I'm hoping someone can skim it and spot the issue(s).
> > >>>
> > >>> Thanks for any help
> > >>>
> > >>> Mike Beckerle
> > >>> Apache Daffodil PMC | daffodil.apache.org
> > >>> OGF DFDL Workgroup Co-Chair |
> > www.ogf.org/ogf/doku.php/standards/dfdl/dfdl
> > >>> Owl Cyber Defense | www.owlcyberdefense.com

Re: drill tests not passing

2023-07-13 Thread Mike Beckerle
To answer questions:

1. Paul: This is a 100% stock build. All I have done is clone the repo
(master branch), make a new git branch (in case I make future changes), and
try to build (success) and test (failed so far).

2. James: The /opt/drill directory I created is owned by my userid and has
full read/write access for all the development activities. I just put it
there so it would have a shorter path to fix the first Hive-related glitch
I encountered with the Linux 255 limit on file pathname length.

I will try the suggested maven command line for non-UTC and see if things
improve.

The challenge for me as a newbie is: how do I know whether I have everything
properly configured?

Can I just turn off building and testing of the Hive-related stuff in some
supported/well-known way?

If so, I'd like to turn off not just Hive, but *as much as
possible*. I really just need the embedded Drill to work.
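
One generic Maven mechanism for this is module exclusion: `-pl` (`--projects`) accepts entries prefixed with `!` to leave modules out of the reactor (Maven 3.2.1 and later). The sketch below only builds such an argument; `exclude_modules` is a hypothetical helper, the module path is the one that appears in the build log in this thread, and whether the rest of Drill builds cleanly without it is an assumption that has not been verified here.

```shell
# Hypothetical helper: turn a list of module paths into a Maven -pl
# argument that excludes each of them ("!" means "exclude").
exclude_modules() {
  list=''
  for m in "$@"; do
    # Append ",!<module>" (or "!<module>" for the first entry).
    list="${list:+$list,}"'!'"$m"
  done
  printf '%s\n' "-pl $list"
}

# Prints the argument; it could then be appended to a command such as
#   mvn clean install -DskipTests
exclude_modules contrib/storage-hive/hive-exec-shade
```

Modules that depend on an excluded module would need to be excluded as well, or resolved from a repository. The opposite approach, `mvn install -pl <module> -am`, builds one module plus the modules it depends on, which may suit a self-contained contrib module better.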

I would agree with @Charles Givre that a contrib
package addition is the ideal approach and that's what I'll be attempting.

-mikeb

On Thu, Jul 13, 2023 at 10:59 AM Charles Givre  wrote:

> I'll add some heresy here... IMHO, for the purposes of developing a DFDL
> extension, you probably don't need all the Drill tests to run.  For your
> project, my suggestion would be to add a module to the contrib package and
> that way your changes are relatively self contained.
> Best,
> -- C
>
>
>
> > On Jul 13, 2023, at 10:27 AM, James Turton  wrote:
> >
> > Hi Mike
> >
> > Here's the command line I use to run tests on a machine that's not in
> the UTC time zone (plus some unrelated memory size arguments).
> >
> > mvn test -Djunit.args="-Duser.timezone=UTC -Duser.language=en
> -Duser.region=US" -DmemoryMb=2560 -DdirectMemoryMb=2560
> >
> > I have one other question to add to Paul's comments - does the OS user
> that you're running Maven under have write access to all of the source tree
> that you put at /opt/drill?
> >
> > On 2023/07/11 22:12, Paul Rogers wrote:
> >> Hi Mike,
> >>
> >> A quick glance at the log suggests a failure in the tests for the JSON
> >> reader, in the Mongo extended types. Drill's date/time support has
> >> historically been fragile. Some tests only work if your machine is set
> to
> >> use the UTC time zone (or Java is told to pretend that the time is UTC.)
> >> The Mongo types test failure seems to be around a date/time test so
> maybe
> >> this is the issue?
> >>
> >> There are also failures indicating that the Drillbit (Drill server)
> died.
> >> Not sure how this can happen, as tests run Drill embedded (or used to.)
> >> Looking earlier in the logs, it seems that the Drillbit didn't start
> due to
> >> UDF (user-defined function) failures:
> >>
> >> Found duplicated function in drill-custom-lower.jar:
> >> custom_lower(VARCHAR-REQUIRED)
> >> Found duplicated function in built-in: lower(VARCHAR-REQUIRED)
> >>
> >> Not sure how this could occur: it should have failed in all builds.
> >>
> >> Also:
> >>
> >> File
> >>
> /opt/drill/exec/java-exec/target/org.apache.drill.exec.udf.dynamic.TestDynamicUDFSupport/home/drill/happy/udf/staging/drill-custom-lower-sources.jar
> >> does not exist on file system file:///
> >>
> >> This is complaining that Drill needs the source code (not just class
> file)
> >> for its built-in functions. Again, this should not fail in a standard
> >> build, because if it did, it would fail in all builds.
> >>
> >> There are other odd errors as well.
> >>
> >> Perhaps we should ask: is this a "stock" build? Check out Drill and run
> >> tests? Or, have you already started making changes for your project?
> >>
> >> - Paul
> >>
> >>
> >> On Tue, Jul 11, 2023 at 9:07 AM Mike Beckerle 
> wrote:
> >>
> >>> I have drill building and running its tests. Some tests fail: [ERROR]
> >>> Tests run: 4366, Failures: 2, Errors: 1, Skipped: 133
> >>>
> >>> I am wondering if there is perhaps some setup step that I missed in the
> >>> instructions.
> >>>
> >>> I have attached the output from the 'mvn clean install
> -DskipTests=false'
> >>> execution. (zipped)
> >>> I am running on Ubuntu 20.04, definitely have Java 8 setup.
> >>>
> >>> I'm hoping someone can skim it and spot the issue(s).
> >>>
> >>> Thanks for any help
> >>>
> >>> Mike Beckerle
> >>> Apache Daffodil PMC | daffodil.apache.org
> >>> OGF DFDL Workgroup Co-Chair |
> www.ogf.org/ogf/doku.php/standards/dfdl/dfdl
> >>> Owl Cyber Defense | www.owlcyberdefense.com
> >>>
> >>>
> >>>
> >
>
>


Re: drill tests not passing

2023-07-13 Thread Charles Givre
I'll add some heresy here... IMHO, for the purposes of developing a DFDL 
extension, you probably don't need all the Drill tests to run.  For your 
project, my suggestion would be to add a module to the contrib package and that 
way your changes are relatively self contained.  
Best,
-- C



> On Jul 13, 2023, at 10:27 AM, James Turton  wrote:
> 
> Hi Mike
> 
> Here's the command line I use to run tests on a machine that's not in the UTC 
> time zone (plus some unrelated memory size arguments).
> 
> mvn test -Djunit.args="-Duser.timezone=UTC -Duser.language=en 
> -Duser.region=US" -DmemoryMb=2560 -DdirectMemoryMb=2560
> 
> I have one other question to add to Paul's comments - does the OS user that 
> you're running Maven under have write access to all of the source tree that 
> you put at /opt/drill?
> 
> On 2023/07/11 22:12, Paul Rogers wrote:
>> Hi Mike,
>> 
>> A quick glance at the log suggests a failure in the tests for the JSON
>> reader, in the Mongo extended types. Drill's date/time support has
>> historically been fragile. Some tests only work if your machine is set to
>> use the UTC time zone (or Java is told to pretend that the time is UTC.)
>> The Mongo types test failure seems to be around a date/time test so maybe
>> this is the issue?
>> 
>> There are also failures indicating that the Drillbit (Drill server) died.
>> Not sure how this can happen, as tests run Drill embedded (or used to.)
>> Looking earlier in the logs, it seems that the Drillbit didn't start due to
>> UDF (user-defined function) failures:
>> 
>> Found duplicated function in drill-custom-lower.jar:
>> custom_lower(VARCHAR-REQUIRED)
>> Found duplicated function in built-in: lower(VARCHAR-REQUIRED)
>> 
>> Not sure how this could occur: it should have failed in all builds.
>> 
>> Also:
>> 
>> File
>> /opt/drill/exec/java-exec/target/org.apache.drill.exec.udf.dynamic.TestDynamicUDFSupport/home/drill/happy/udf/staging/drill-custom-lower-sources.jar
>> does not exist on file system file:///
>> 
>> This is complaining that Drill needs the source code (not just class file)
>> for its built-in functions. Again, this should not fail in a standard
>> build, because if it did, it would fail in all builds.
>> 
>> There are other odd errors as well.
>> 
>> Perhaps we should ask: is this a "stock" build? Check out Drill and run
>> tests? Or, have you already started making changes for your project?
>> 
>> - Paul
>> 
>> 
>> On Tue, Jul 11, 2023 at 9:07 AM Mike Beckerle  wrote:
>> 
>>> I have drill building and running its tests. Some tests fail: [ERROR]
>>> Tests run: 4366, Failures: 2, Errors: 1, Skipped: 133
>>> 
>>> I am wondering if there is perhaps some setup step that I missed in the
>>> instructions.
>>> 
>>> I have attached the output from the 'mvn clean install -DskipTests=false'
>>> execution. (zipped)
>>> I am running on Ubuntu 20.04, definitely have Java 8 setup.
>>> 
>>> I'm hoping someone can skim it and spot the issue(s).
>>> 
>>> Thanks for any help
>>> 
>>> Mike Beckerle
>>> Apache Daffodil PMC | daffodil.apache.org
>>> OGF DFDL Workgroup Co-Chair | www.ogf.org/ogf/doku.php/standards/dfdl/dfdl
>>> Owl Cyber Defense | www.owlcyberdefense.com
>>> 
>>> 
>>> 
> 



Re: drill tests not passing

2023-07-13 Thread James Turton

Hi Mike

Here's the command line I use to run tests on a machine that's not in 
the UTC time zone (plus some unrelated memory size arguments).


mvn test -Djunit.args="-Duser.timezone=UTC -Duser.language=en 
-Duser.region=US" -DmemoryMb=2560 -DdirectMemoryMb=2560
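
A small wrapper can decide whether those JVM overrides are needed on a given machine. This is only a sketch: `junit_args_for_offset` is a hypothetical helper, it treats a zero UTC offset from `date +%z` as "already UTC" (which also skips the language and region flags, and that may not be what you want), and it prints the Maven command rather than running it.

```shell
# Hypothetical helper: given a UTC offset in `date +%z` form, print the
# JVM flags from the command above, or nothing if the offset is zero.
junit_args_for_offset() {
  case "$1" in
    +0000|-0000) ;;   # offset is already zero: no override printed
    *) printf '%s\n' '-Duser.timezone=UTC -Duser.language=en -Duser.region=US' ;;
  esac
}

args=$(junit_args_for_offset "$(date +%z)")
if [ -n "$args" ]; then
  echo "mvn test -Djunit.args=\"$args\" -DmemoryMb=2560 -DdirectMemoryMb=2560"
else
  echo "mvn test -DmemoryMb=2560 -DdirectMemoryMb=2560"
fi
```

Passing the flags unconditionally is also harmless, so the check is mainly useful for spotting machines where time-zone-sensitive tests are likely to fail.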


I have one other question to add to Paul's comments - does the OS user 
that you're running Maven under have write access to all of the source 
tree that you put at /opt/drill?


On 2023/07/11 22:12, Paul Rogers wrote:

Hi Mike,

A quick glance at the log suggests a failure in the tests for the JSON
reader, in the Mongo extended types. Drill's date/time support has
historically been fragile. Some tests only work if your machine is set to
use the UTC time zone (or Java is told to pretend that the time is UTC.)
The Mongo types test failure seems to be around a date/time test so maybe
this is the issue?

There are also failures indicating that the Drillbit (Drill server) died.
Not sure how this can happen, as tests run Drill embedded (or used to.)
Looking earlier in the logs, it seems that the Drillbit didn't start due to
UDF (user-defined function) failures:

Found duplicated function in drill-custom-lower.jar:
custom_lower(VARCHAR-REQUIRED)
Found duplicated function in built-in: lower(VARCHAR-REQUIRED)

Not sure how this could occur: it should have failed in all builds.

Also:

File
/opt/drill/exec/java-exec/target/org.apache.drill.exec.udf.dynamic.TestDynamicUDFSupport/home/drill/happy/udf/staging/drill-custom-lower-sources.jar
does not exist on file system file:///

This is complaining that Drill needs the source code (not just class file)
for its built-in functions. Again, this should not fail in a standard
build, because if it did, it would fail in all builds.

There are other odd errors as well.

Perhaps we should ask: is this a "stock" build? Check out Drill and run
tests? Or, have you already started making changes for your project?

- Paul


On Tue, Jul 11, 2023 at 9:07 AM Mike Beckerle  wrote:


I have drill building and running its tests. Some tests fail: [ERROR]
Tests run: 4366, Failures: 2, Errors: 1, Skipped: 133
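
When skimming a large build log for the culprits, filtering the Surefire summary lines is a quick first pass. The sketch below assumes the log uses the same "Tests run: ..., Failures: ..., Errors: ..." format as the line above; `failed_summaries` is a hypothetical helper, and the first sample line is made up for illustration.

```shell
# Hypothetical helper: read Maven output on stdin and print only the
# "Tests run:" lines that report at least one failure or error.
failed_summaries() {
  grep -E 'Tests run:.*(Failures|Errors): [1-9]'
}

# Sample input: one passing summary (invented) and the failing one from
# this thread. Only the second line is printed.
printf '%s\n' \
  'Tests run: 12, Failures: 0, Errors: 0, Skipped: 0' \
  'Tests run: 4366, Failures: 2, Errors: 1, Skipped: 133' | failed_summaries
```

Against a saved log the same function would be used as `failed_summaries < build.log`, where `build.log` is a placeholder file name.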

I am wondering if there is perhaps some setup step that I missed in the
instructions.

I have attached the output from the 'mvn clean install -DskipTests=false'
execution. (zipped)
I am running on Ubuntu 20.04, definitely have Java 8 setup.

I'm hoping someone can skim it and spot the issue(s).

Thanks for any help

Mike Beckerle
Apache Daffodil PMC | daffodil.apache.org
OGF DFDL Workgroup Co-Chair | www.ogf.org/ogf/doku.php/standards/dfdl/dfdl
Owl Cyber Defense | www.owlcyberdefense.com







Re: Newby: First attempt to build drill - failure

2023-07-13 Thread James Turton
That class comes from the Hive codebase where it looks like they have 
made liberal use of anonymous inner classes. The Drill build gets 
exposed to it because it uses the Maven shade plugin to repackage Hive 
and its dependencies. I don't think we can easily change the name of 
that class.


On 2023/07/11 16:38, Mike Beckerle wrote:

Should there be a ticket created about this:

/home/mbeckerle/dataiti/opensource/drill/contrib/storage-hive/hive-exec-shade/target/classes/org/apache/hadoop/hive/metastore/api/ThriftHiveMetastore$drop_partition_by_name_with_environment_context_args$drop_partition_by_name_with_environment_context_argsTupleSchemeFactory.class

The largest part of that path is the file name part which has 
"drop_partition_by_name_with_environment_context_args" appearing twice 
in the class file name. This appears to be a generated name so we 
should be able to shorten it.
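
To see which limit a path is actually hitting, it can help to measure both the whole path and its longest single component. On common Linux file systems such as ext4, the 255-byte limit (NAME_MAX) applies to each path component, while the whole path is bounded by PATH_MAX (typically 4096); other file systems may differ. The sketch below uses a hypothetical helper, `path_lengths`, and counts characters, which matches bytes for ASCII names like the ones in this thread.

```shell
# Hypothetical helper: print "<total length> <longest component length>"
# for a path given as the first argument.
path_lengths() {
  printf '%s\n' "$1" | awk -F/ '{
    max = 0
    for (i = 1; i <= NF; i++) if (length($i) > max) max = length($i)
    print length($0), max
  }'
}

path_lengths /a/bcd/ef    # prints: 9 3

# The limits for the file system holding the current directory can be
# queried with:
#   getconf NAME_MAX .
#   getconf PATH_MAX .
```

When passing a class file path that contains `$`, wrap it in single quotes so the shell does not try to expand it.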



On Tue, Jul 11, 2023 at 12:27 AM James Turton  wrote:

Good news and welcome to Drill!

I haven't heard of anyone running into this problem before, and I build
Drill under the directory /home/james/Development/apache/drill, which
isn't far off of what you tried in terms of length. I do see the
280-character path cited by Maven below though. Perhaps in your case the
drill-hive-exec-shaded was downloaded from the Apache Snapshots repo,
rather than built locally, and this issue only presents itself if the
maven-dependency-plugin must unpack a very long file path from a
downloaded jar.


On 2023/07/10 18:23, Mike Beckerle wrote:
> Never mind. The file name was > 255 long, so I have installed the drill
> build tree in /opt and now the path is shorter than 255.
>
>
> On Mon, Jul 10, 2023 at 12:00 PM Mike Beckerle wrote:
>
>> I'm trying to build the current master branch as of today 2023-07-10.
>>
>> It fails due to a file-name too long issue.
>>
>> The command I issued is just "mvn clean install -DskipTests" per the
>> instructions.
>>
>> I'm running on Linux, Ubuntu 20.04. Java 8.
>>
>> [INFO] --- maven-dependency-plugin:3.4.0:unpack (unpack) @ drill-hive-exec-shaded ---
>> [INFO] Configured Artifact: org.apache.drill.contrib.storage-hive:drill-hive-exec-shaded:1.22.0-SNAPSHOT:jar
>> [INFO] Unpacking /home/mbeckerle/dataiti/opensource/drill/contrib/storage-hive/hive-exec-shade/target/drill-hive-exec-shaded-1.22.0-SNAPSHOT.jar
>> to /home/mbeckerle/dataiti/opensource/drill/contrib/storage-hive/hive-exec-shade/target/classes
>> with includes "**/**" and excludes ""
>> [INFO]
>> [INFO] Reactor Summary for Drill : 1.22.0-SNAPSHOT:
>> [INFO]
>> [INFO] Drill : .......................................... SUCCESS [  3.974 s]
>> [INFO] Drill : Tools : .................................. SUCCESS [  0.226 s]
>> [INFO] Drill : Tools : Freemarker codegen ............... SUCCESS [  3.762 s]
>> [INFO] Drill : Protocol ................................. SUCCESS [  5.001 s]
>> [INFO] Drill : Common ................................... SUCCESS [  4.944 s]
>> [INFO] Drill : Logical Plan ............................. SUCCESS [  5.991 s]
>> [INFO] Drill : Exec : ................................... SUCCESS [  0.210 s]
>> [INFO] Drill : Exec : Memory : .......................... SUCCESS [  0.179 s]
>> [INFO] Drill : Exec : Memory : Base ..................... SUCCESS [  2.373 s]
>> [INFO] Drill : Exec : RPC ............................... SUCCESS [  2.436 s]
>> [INFO] Drill : Exec : Vectors ........................... SUCCESS [ 54.917 s]
>> [INFO] Drill : Contrib : ................................ SUCCESS [  0.138 s]
>> [INFO] Drill : Contrib : Data : ......................... SUCCESS [  0.143 s]
>> [INFO] Drill : Contrib : Data : TPCH Sample ............. SUCCESS [  1.473 s]
>> [INFO] Drill : Metastore : .............................. SUCCESS [  0.144 s]
>> [INFO] Drill : Metastore : API .......................... SUCCESS [  4.366 s]
>> [INFO] Drill : Metastore : Iceberg ...................... SUCCESS [  3.940 s]
>> [INFO] Drill : Exec : Java Execution Engine ............. SUCCESS [01:04 min]
>> [INFO] Drill : Exec : JDBC Driver using dependencies .... SUCCESS [  7.332 s]
>> [INFO] Drill : Exec : JDBC JAR with all dependencies .... SUCCESS [ 16.304 s]
>> [INFO] Drill : On-YARN .................................. SUCCESS [  5.477 s]
>> [INFO] Drill : Metastore : RDBMS ........................ SUCCESS [