Re: [Discuss] Integrate Arrow gandiva into Drill

2019-04-22 Thread Parth Chandra
Is there a way to provide Drill's memory allocator to Gandiva/Arrow? If
not, then how do we keep a proper accounting of any memory used by
Gandiva/Arrow?

On Sat, Apr 20, 2019 at 7:05 PM Paul Rogers 
wrote:

> Hi Weijie,
>
> Thanks much for the explanation. Sounds like you are making good progress.
>
>
> For which operator is the filter pushed into the scan? Although Impala
> does this for all scans, AFAIK, Drill does not do so. For example, the text
> and JSON reader do not handle filtering. Filtering is instead done by the
> Filter operator in these cases. Perhaps you have your own special scan
> which handles filtering?
>
>
> The concern in DRILL-6340 was the user might do a project operation that
> causes the output batch to be much larger than the input batch. Someone
> suggested flatten as one example. String concatenation is another example.
> The input batch might be large. The result of the concatenation could be
> too large for available memory. So, the idea was to project the single
> input batch into two (or more) output batches to control batch size.
>
>
> I like how you've categorized the vectors into the set that Gandiva can
> project, and the set that Drill must handle. Maybe you can extend this idea
> for the case where input batches are split into multiple output batches.
>
> Let Drill handle VarChar expressions that could increase column width
> (such as the concatenate operator.) Let Drill decide the number of rows in
> the output batch. Then, for the columns that Gandiva can handle, project
> just those rows needed for the current output batch.
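
A minimal sketch of the batch-splitting idea above, assuming Drill already has an estimate of the output row width and a per-batch memory budget. The class name `BatchSplitter`, the method `splitRanges`, and both size parameters are hypothetical and are not Drill or Gandiva APIs.

```java
import java.util.ArrayList;
import java.util.List;

final class BatchSplitter {
  /**
   * Splits inputRows into [start, end) row ranges so that each output batch
   * stays within maxBatchBytes, given an estimated output row width in bytes.
   */
  static List<int[]> splitRanges(int inputRows, long estRowBytes, long maxBatchBytes) {
    if (inputRows < 0 || estRowBytes <= 0 || maxBatchBytes <= 0) {
      throw new IllegalArgumentException("row count must be >= 0 and sizes must be > 0");
    }
    // Always emit at least one row per batch, even if one row exceeds the budget.
    long rowsPerBatch = Math.max(1, maxBatchBytes / estRowBytes);
    List<int[]> ranges = new ArrayList<>();
    for (long start = 0; start < inputRows; start += rowsPerBatch) {
      int end = (int) Math.min(inputRows, start + rowsPerBatch);
      ranges.add(new int[] {(int) start, end});
    }
    return ranges;
  }
}
```

With ranges like these, Drill would evaluate the width-changing VarChar expressions itself and ask Gandiva to project only the rows of the current range for the remaining columns.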
>
> Your solution might also be extended to handle the Gandiva library issue.
> Since you are splitting vectors into the Drill group and the Gandiva group,
> if Drill runs on a platform without Gandiva support, or if the Gandiva
> library can't be found, just let all vectors fall into the Drill vector
> group.
>
> If the user wants to use Gandiva, he/she could set a config option to
> point to the Gandiva library (and supporting files, if any.) Or, use the
> existing LD_LIBRARY_PATH env. variable.
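
The fallback described above could be sketched as follows. This only assumes that the native library is loaded through the standard `System.loadLibrary` call, which searches `java.library.path` (on Linux that path normally includes `LD_LIBRARY_PATH`). The class name `NativeLibraryProbe` and any library name passed to it are hypothetical.

```java
final class NativeLibraryProbe {
  /**
   * Returns true if the named native library could be loaded, false otherwise.
   * A false result means every vector should fall into the Drill group.
   */
  static boolean tryLoad(String libName) {
    try {
      System.loadLibrary(libName);
      return true;
    } catch (UnsatisfiedLinkError | SecurityException e) {
      // Library missing, wrong platform, or loading not permitted:
      // the caller keeps using Drill's own expression evaluation.
      return false;
    }
  }
}
```

Probing once at setup time keeps the decision out of the per-batch path: the result simply determines whether the Gandiva vector group is ever populated.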
>
> Thanks,
> - Paul
>
>
>
> On Thursday, April 18, 2019, 11:45:08 PM PDT, weijie tong <
> tongweijie...@gmail.com> wrote:
>
>  Hi Paul:
> Currently Gandiva supports only Project and Filter operations. My work is to
> integrate the Project operator, since most Filter operators will be
> pushed down to the Scan.
>
> The Gandiva project interface works at the RecordBatch level. It accepts
> the memory addresses of the input RecordBatch's vectors. Before that,
> it also needs to construct a binary schema object to describe the input
> RecordBatch schema.
>
> The integration work mainly has two parts:
>   1. At the setup step, find the expressions that Gandiva can evaluate.
> The matched expressions are evaluated by Gandiva; the others are
> still evaluated by Drill.
>   2. Invoke the Gandiva native project method. For each matched expression's
> ValueVectors, a null-representation ValueVector of the corresponding Arrow
> type is allocated, and the bits of the input null vector are set.
> The same is done for the output ValueVectors: the Arrow output null
> vector is transferred to Drill's null vector. Since the native method
> only cares about physical memory addresses, invoking it is not
> hard.
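
The null-vector conversion described above can be illustrated with a small sketch. It assumes the usual layouts: Arrow keeps a validity bitmap with one bit per value (least-significant bit first, 1 = non-null), while Drill's nullable vectors keep one "is set" byte per value. Plain on-heap arrays stand in for the real off-heap buffers, and the class and method names are hypothetical.

```java
final class NullMaskConverter {
  /** Packs a byte-per-value "is set" array (non-zero = non-null) into a validity bitmap. */
  static byte[] toValidityBitmap(byte[] isSet) {
    byte[] bitmap = new byte[(isSet.length + 7) / 8];
    for (int i = 0; i < isSet.length; i++) {
      if (isSet[i] != 0) {
        // Value i lives in byte i / 8, at bit position i % 8 (LSB first).
        bitmap[i >> 3] |= (byte) (1 << (i & 7));
      }
    }
    return bitmap;
  }

  /** Expands a validity bitmap back into a byte-per-value "is set" array. */
  static byte[] toIsSet(byte[] bitmap, int valueCount) {
    byte[] isSet = new byte[valueCount];
    for (int i = 0; i < valueCount; i++) {
      isSet[i] = (byte) ((bitmap[i >> 3] >> (i & 7)) & 1);
    }
    return isSet;
  }
}
```

The first direction corresponds to preparing input vectors for the native call, the second to copying the output null vector back into Drill's representation.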
>
> Since my current implementation predates DRILL-6340, it does not handle the
> case where the output size of the project is less than the input size. Covering
> that case requires more work, which I have not focused on yet.
>
> To contribute this to the community, there is also a test-case problem that
> needs to be considered, since the Gandiva jar is platform dependent.
>
>
>
>
> On Fri, Apr 19, 2019 at 8:43 AM Paul Rogers 
> wrote:
>
> > Hi Weijie,
> >
> > Thanks much for the update on your Gandiva work. It is great work.
> >
> > Can you say more about how you are doing the integration?
> >
> > As you mentioned the memory layout of Arrow's null vector differs from the
> > "is set" vector in Drill. How did you work around that?
> >
> > The Project operator is pretty simple if we are just copying or removing
> > columns. However, much of Project deals with invoking Drill-provided
> > functions: simple ones (add two ints) and complex ones (perform a regex
> > match). To be useful, the integration would have to mimic Drill's behavior
> > for each of these many functions.
> >
> > Project currently works row-by-row. But, to get the maximum performance,
> > it would work column-by-column to take full advantage of vectorization.
> > Doing that would require large changes to the code that sets up codegen,
> > and iterates over the batch.
> >
> >
> > For operators such as Sort, the only vector-based operations are 1) sort a
> > batch using defined keys to get an offset vector, and 2) create a new
> > vector by copying values, row-by-row, from one batch to another according
> > to the offset vector.
> >
> > The join and agg

[GitHub] [drill] aravi5 commented on issue #1760: DRILL-7164: KafkaFilterPushdownTest is sometimes failing to pattern m…

2019-04-22 Thread GitBox
aravi5 commented on issue #1760: DRILL-7164: KafkaFilterPushdownTest is 
sometimes failing to pattern m…
URL: https://github.com/apache/drill/pull/1760#issuecomment-485531648
 
 
   The changes look good! +1


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (DRILL-7193) Integration changes of the Distributed RM queue configuration with Simple Parallelizer.

2019-04-22 Thread Hanumath Rao Maduri (JIRA)
Hanumath Rao Maduri created DRILL-7193:
--

 Summary: Integration changes of the Distributed RM queue 
configuration with Simple Parallelizer.
 Key: DRILL-7193
 URL: https://issues.apache.org/jira/browse/DRILL-7193
 Project: Apache Drill
  Issue Type: Sub-task
  Components: Query Planning & Optimization
Affects Versions: 1.17.0
Reporter: Hanumath Rao Maduri
Assignee: Hanumath Rao Maduri
 Fix For: 1.17.0


Refactoring fragment generation code for the RM to accommodate non RM, ZK based 
queue RM and Distributed RM.
Calling the Distributed RM for queue selection based on memory requirements.
Adjustment of the operator memory based on the memory limits of the selected 
queue.
Setting of the optimal memory allocation per operator in each minor fragment. 
This shows up in the query profile.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [VOTE] Apache Drill Release 1.16.0 - RC1

2019-04-22 Thread SorabhApache
*< Please disregard the previous email; one of the links in it is not correct.
Use the information in this email instead >*

Hi Drillers,
I'd like to propose the second release candidate (RC1) for Apache Drill,
version 1.16.0.

Changes since the previous release candidate:
DRILL-7185: Drill Fails to Read Large Packets
DRILL-7186: Missing storage.json REST endpoint
DRILL-7190: Missing backward compatibility for REST API with DRILL-6562

Also, the 2 JIRAs below were created to separately track the revert of protobuf
changes in 1.16.0:
DRILL-7188: Revert DRILL-6642: Update protocol-buffers version
DRILL-7189: Revert DRILL-7105 Error while building the Drill native client

RC1 includes a total of 215 resolved JIRAs [1].
Thanks to everyone for their hard work to contribute to this release.

The tarball artifacts are hosted at [2] and the maven artifacts are hosted
at [3].

This release candidate is based on commit
cf5b758e0a4c22b75bfb02ac2653ff09415ddf53 located at [4].

Please download and try out the release candidate.

The vote ends at 06:00 PM UTC (11:00 AM PDT, 09:00 PM EET, 11:30 PM IST),
Apr 25th, 2019

[ ] +1
[ ] +0
[ ] -1

Here is my vote: +1
  [1]
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313820&version=12344284
  [2] http://home.apache.org/~sorabh/drill/releases/1.16.0/rc1/
  [3]
https://repository.apache.org/content/repositories/orgapachedrill-1067/
  [4] https://github.com/sohami/drill/commits/drill-1.16.0

Thanks,
Sorabh



Re: Rename 'project.artifactId'

2019-04-22 Thread Abhishek Girish
Also, looking at the apache project examples you shared (like 'hive' &
'hbase'), 'drill' might be a better choice, in my opinion, to keep it
consistent. I don't understand how that would impact the distribution
tarball file name. If it helps, we could have a full name variable in the
main pom.xml for such reasons?



Re: Rename 'project.artifactId'

2019-04-22 Thread Kunal Khatua
Hi Vitalii

I think this would be a temporary headache for people to re-setup/update their 
IDEs' projects. 

Is it possible to have this done together with
https://issues.apache.org/jira/browse/DRILL-6956 (Maintain a single entry for
Drill Version in the pom file)?

~ Kunal


[GitHub] [drill] sohami commented on a change in pull request #1760: DRILL-7164: KafkaFilterPushdownTest is sometimes failing to pattern m…

2019-04-22 Thread GitBox
sohami commented on a change in pull request #1760: DRILL-7164: 
KafkaFilterPushdownTest is sometimes failing to pattern m…
URL: https://github.com/apache/drill/pull/1760#discussion_r277334942
 
 

 ##
 File path: exec/java-exec/src/test/java/org/apache/drill/PlanTestBase.java
 ##
 @@ -92,7 +91,20 @@ public static void testPlanMatchingPatterns(String query, 
String[] expectedPatte
 
   public static void testPlanMatchingPatterns(String query, Pattern[] 
expectedPatterns, Pattern[] excludedPatterns)
 throws Exception {
-final String plan = getPlanInString("EXPLAIN PLAN for " + 
QueryTestUtil.normalizeQuery(query), OPTIQ_FORMAT);
+testPlanMatchingPatterns(query, OPTIQ_FORMAT, expectedPatterns, 
excludedPatterns);
+  }
+
+  public static void testPlanMatchingPatterns(String query, String planFormat,
 
 Review comment:
   Done




[GitHub] [drill] sohami commented on issue #1756: Drill-7185: Drill Fails to Read Large Packets

2019-04-22 Thread GitBox
sohami commented on issue #1756: Drill-7185: Drill Fails to Read Large Packets
URL: https://github.com/apache/drill/pull/1756#issuecomment-485453582
 
 
   > Commit message does not contain Jira number and unfortunately PR was merged
with such message.
   > Please try to ensure Jira number is included the next time.
   
   My bad, will ensure it from next time onwards.




Re: Apache Avro Version

2019-04-22 Thread AYUSH SHARMA
Hi,
I tried clearing out the .m2 repository and built Drill again; however, I
again saw the same Avro 1.7.7 being used.
Similarly, the Parquet version is set to 1.10.0, but 1.8.1 is being used during
the build.
I really need to figure this out as soon as possible. I would appreciate
any help that I can get right now.

Regards
Ayush


On Mon, Apr 22, 2019 at 2:30 PM Vova Vysotskyi  wrote:

> Hi Ayush,
>
> If for some reason you cannot subscribe to the mailing list, you can use
> the Apache mailing list archives [1] to find the required message or response.
>
> [1]
> https://lists.apache.org/thread.html/a28d9fb33a03796032b04ea5b49ab190e9485e2e57113f6b0af02413@%3Cdev.drill.apache.org%3E
>
> Kind regards,
> Volodymyr Vysotskyi
>
>
> On Mon, Apr 22, 2019 at 8:34 AM AYUSH SHARMA 
> wrote:
>
>> Hi Kunal,
>> I checked, but I am unable to see any mails. Also, I did not subscribe to
>> the distribution list; I simply mailed this address from the Drill
>> home page.
>> It would be great if you could simply forward me the reply here.
>> Thanking you in anticipation.
>>
>> Regards
>> Ayush
>>
>> On Mon, Apr 22, 2019 at 4:31 AM Kunal Khatua  wrote:
>>
>> > There was a response to your mail. Please check your subscription to see
>> > if you've accidentally filtered out mails from the distribution.
>> >
>> >
>> >
>> > On Sun, Apr 21, 2019, 3:51 PM AYUSH SHARMA 
>> > wrote:
>> >
>> >> Hi,
>> >> Still waiting to hear from anyone on this distribution list.
>> >> The application is in a critical state and there are more than 300
>> >> CVEs. I can contribute fixes for almost 250 of them, but I need people to
>> >> respond to some basic queries, as Maven is not my strong suit.
>> >> Please respond and let's help each other.
>> >>
>> >> Regards
>> >> Ayush
>> >>
>> >> On Mon, Apr 15, 2019 at 6:46 PM AYUSH SHARMA wrote:
>> >>
>> >> > Hi,
>> >> > I am trying to build drill-master from GitHub using Maven, and the
>> >> > Avro version is set to 1.8.2.
>> >> > However, the child project under
>> >> > \drill-master\contrib\storage-hive\hive-exec-shade contains Avro
>> >> > version 1.7.7.
>> >> > The problem here is that Avro 1.7.7 has a CVE with a score of 7.5 and I
>> >> > want to use 1.8.2; however, I am unable to find any reference in the
>> >> > pom.xml for the same.
>> >> > Can you please help me get this issue resolved?
>> >> > *Desperately need help!!!.*
>> >> >
>> >> > Regards
>> >> > Ayush
>> >> >
>> >>
>> >
>>
>


[GitHub] [drill] dgrinchenko opened a new pull request #1763: DRILL-6974: Add possibility to view option value via SET command

2019-04-22 Thread GitBox
dgrinchenko opened a new pull request #1763: DRILL-6974: Add possibility to 
view option value via SET command
URL: https://github.com/apache/drill/pull/1763
 
 
   JIRA: [DRILL-6974](https://issues.apache.org/jira/browse/DRILL-6974)
   - ALTER ... RESET ... and ALTER ... SET ... sub-parsers separated into 2
 different SqlCall classes with the same parent SqlSetOption
   - parserImpls modified to handle the new syntax of the ALTER ... SET ...
 expression:
 a) ALTER ... SET option.name - option.value - setting option value
 b) ALTER ... SET option.name - display option value
   - Handler for SqlSetOption separated into SetOptionHandler and
 ResetOptionHandler for better representation of handled statements
   - Base abstract class AbstractSqlSetHandler created to avoid repeating
 the shared implementation of the same functions
   - SetOptionHandler covered with unit tests for each statement
 form.
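
The two statement forms listed above can be told apart with a small sketch. It is not the PR's parser: it assumes Drill's existing ``ALTER SYSTEM|SESSION SET `name` = value`` spelling with a backtick-quoted option name, and the class `SetStatementSketch` and its members are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SetStatementSketch {
  // Matches: ALTER SYSTEM|SESSION SET `name` [= value] [;]
  private static final Pattern SET = Pattern.compile(
      "(?i)^\\s*ALTER\\s+(SYSTEM|SESSION)\\s+SET\\s+`([^`]+)`\\s*(?:=\\s*(.+?))?\\s*;?\\s*$");

  /** Parsed statement: the option name and, if present, the value to assign. */
  static final class Parsed {
    final String name;
    final Optional<String> value;

    Parsed(String name, Optional<String> value) {
      this.name = name;
      this.value = value;
    }

    /** Form (b): no value given, so the current option value should be displayed. */
    boolean isDisplayOnly() {
      return !value.isPresent();
    }
  }

  /** Returns the parsed statement, or empty if the text is not an ALTER ... SET statement. */
  static Optional<Parsed> parse(String sql) {
    Matcher m = SET.matcher(sql);
    if (!m.matches()) {
      return Optional.empty();
    }
    return Optional.of(new Parsed(m.group(2), Optional.ofNullable(m.group(3))));
  }
}
```

A handler built on this distinction would route form (a) to the code that stores the value and form (b) to the code that returns it as a result row.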




[GitHub] [drill] arina-ielchiieva commented on issue #1756: Drill-7185: Drill Fails to Read Large Packets

2019-04-22 Thread GitBox
arina-ielchiieva commented on issue #1756: Drill-7185: Drill Fails to Read 
Large Packets
URL: https://github.com/apache/drill/pull/1756#issuecomment-485407046
 
 
   Commit message does not contain Jira number and unfortunately PR was merged
with such message.
   Please try to ensure Jira number is included the next time.




Rename 'project.artifactId'

2019-04-22 Thread Vitalii Diravka
Hi all!

I am going to rename project.artifactId (see [1] or DRILL-7169) from
'*drill-root*' [2] to '*apache-drill*' or 'drill'.
Some other project examples:
HBase: hbase [3]
Calcite: calcite [4]
Parquet: parquet [5]
Hive: hive [6]
Avro: avro-toplevel [7]
Spark: spark-parent_2.12 [8]

Renaming the artifactId to 'apache-drill' will allow this variable to be used in a
lot of places, to avoid hard-coding the '*apache-drill*' string in different
paths [1]. A typical artifact produced by Maven would have the form
<artifactId>-<version>.<extension> (for example, myapp-1.0.jar) [9], [10].
The current final name of the different Drill artifacts is:
apache-drill-${project.version}
Therefore '*apache-drill*' looks like the best name for the Drill artifactId,
since it keeps all Drill artifacts consistent.

This will certainly change the Drill artifactId in the Maven Central
repository:
https://mvnrepository.com/artifact/org.apache.drill/drill-root
What do you think, guys? Can we update it? Are there any risks in doing so?


[1] https://github.com/apache/drill/pull/1746
[2] https://github.com/apache/drill/blob/master/pom.xml#L32

[3] https://github.com/apache/hbase/blob/master/pom.xml#L40
[4] https://github.com/apache/calcite/blob/master/pom.xml#L28
[5] https://github.com/apache/parquet-mr/blob/master/pom.xml#L11
[6] https://github.com/apache/hive/blob/master/pom.xml#L23
[7] https://github.com/apache/avro/blob/master/pom.xml#L29
[8] https://github.com/apache/spark/blob/master/pom.xml#L28
[9] https://maven.apache.org/guides/mini/guide-naming-conventions.html
[10] https://maven.apache.org/guides/getting-started/

Kind regards
Vitalii


Re: Apache Avro Version

2019-04-22 Thread Vova Vysotskyi
Let's clarify the behavior: Drill uses Avro 1.8.2 and Parquet 1.10.0 for
its own purposes.

Drill also uses Hive, which uses Avro and Parquet as well, but in different
versions.
To allow Hive to work correctly, we put some libraries, in the specific
versions used by Hive, into the hive-exec-shade jar to avoid problems with
API changes between versions.

So Drill will use Avro 1.8.2 or Parquet 1.10.0 when tables are queried, but
when querying through the Hive plugin, the versions bundled with the
hive-exec library may be used.

Kind regards,
Volodymyr Vysotskyi
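
To see which jar actually supplies a class at run time, a small check like the following can help. It is only a sketch using standard JDK reflection calls; the class name `ClassOrigin` is hypothetical, and any relocated (shaded) package name inside hive-exec-shade would have to be looked up in its pom, since it is not given in this thread.

```java
import java.net.URL;
import java.security.CodeSource;

final class ClassOrigin {
  /** Describes the jar a class was loaded from and its declared package version, if any. */
  static String describe(String className) {
    Class<?> cls;
    try {
      cls = Class.forName(className);
    } catch (ClassNotFoundException e) {
      return className + " not found on classpath";
    }
    // Classes from the JDK itself have no code source.
    CodeSource src = cls.getProtectionDomain().getCodeSource();
    URL location = (src == null) ? null : src.getLocation();
    // The implementation version comes from the jar manifest and may be absent.
    Package pkg = cls.getPackage();
    String version = (pkg == null) ? null : pkg.getImplementationVersion();
    return className + " from " + (location == null ? "<bootstrap>" : location.toString())
        + ", version " + (version == null ? "<unknown>" : version);
  }
}
```

For example, calling `ClassOrigin.describe("org.apache.avro.Schema")` from code running on Drill's classpath would report which jar provides the unshaded Avro classes.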




Re: Apache Avro Version

2019-04-22 Thread Vova Vysotskyi
Hi Ayush,

If for some reason you cannot subscribe to the mailing list, you can use
the Apache mailing list archives [1] to find the required message or response.

[1]
https://lists.apache.org/thread.html/a28d9fb33a03796032b04ea5b49ab190e9485e2e57113f6b0af02413@%3Cdev.drill.apache.org%3E

Kind regards,
Volodymyr Vysotskyi




[GitHub] [drill] vvysotskyi commented on a change in pull request #1754: DRILL-7098: File Metadata Metastore Plugin

2019-04-22 Thread GitBox
vvysotskyi commented on a change in pull request #1754: DRILL-7098: File 
Metadata Metastore Plugin
URL: https://github.com/apache/drill/pull/1754#discussion_r277232508
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/base/AbstractGroupScanWithMetadata.java
 ##
 @@ -151,13 +152,13 @@ public long getColumnValueCount(SchemaPath column) {
 long colNulls;
 if (columnStats != null) {
   Long nulls = (Long) 
columnStats.getStatistic(ColumnStatisticsKind.NULLS_COUNT);
-  colNulls = nulls != null ? nulls : GroupScan.NO_COLUMN_STATS;
+  colNulls = nulls != null ? nulls :  Statistic.NO_COLUMN_STATS;
 } else {
   return 0;
 }
-return GroupScan.NO_COLUMN_STATS == tableRowCount
-|| GroupScan.NO_COLUMN_STATS == colNulls
-? GroupScan.NO_COLUMN_STATS : tableRowCount - colNulls;
+return  Statistic.NO_COLUMN_STATS == tableRowCount
 
 Review comment:
   And here
   ```suggestion
   return Statistic.NO_COLUMN_STATS == tableRowCount
   ```




[GitHub] [drill] vvysotskyi commented on a change in pull request #1754: DRILL-7098: File Metadata Metastore Plugin

2019-04-22 Thread GitBox
vvysotskyi commented on a change in pull request #1754: DRILL-7098: File 
Metadata Metastore Plugin
URL: https://github.com/apache/drill/pull/1754#discussion_r277232441
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/base/AbstractGroupScanWithMetadata.java
 ##
 @@ -151,13 +152,13 @@ public long getColumnValueCount(SchemaPath column) {
 long colNulls;
 if (columnStats != null) {
   Long nulls = (Long) 
columnStats.getStatistic(ColumnStatisticsKind.NULLS_COUNT);
-  colNulls = nulls != null ? nulls : GroupScan.NO_COLUMN_STATS;
+  colNulls = nulls != null ? nulls :  Statistic.NO_COLUMN_STATS;
 
 Review comment:
   Please remove extra space here:
   ```suggestion
 colNulls = nulls != null ? nulls : Statistic.NO_COLUMN_STATS;
   ```




[GitHub] [drill] vvysotskyi commented on a change in pull request #1754: DRILL-7098: File Metadata Metastore Plugin

2019-04-22 Thread GitBox
vvysotskyi commented on a change in pull request #1754: DRILL-7098: File 
Metadata Metastore Plugin
URL: https://github.com/apache/drill/pull/1754#discussion_r277233417
 
 

 ##
 File path: 
metastore/metastore-api/src/main/java/org/apache/drill/metastore/TableStatisticsKind.java
 ##
 @@ -37,8 +36,8 @@ public Long mergeStatistics(Collection statistics) {
   long rowCount = 0;
   for (BaseMetadata statistic : statistics) {
 Long statRowCount = getValue(statistic);
-if (statRowCount == null || statRowCount == GroupScan.NO_COLUMN_STATS) 
{
-  rowCount = GroupScan.NO_COLUMN_STATS;
+if (statRowCount == null || statRowCount ==  
Statistic.NO_COLUMN_STATS) {
 
 Review comment:
   ```suggestion
   if (statRowCount == null || statRowCount == 
Statistic.NO_COLUMN_STATS) {
   ```




[GitHub] [drill] vvysotskyi commented on a change in pull request #1754: DRILL-7098: File Metadata Metastore Plugin

2019-04-22 Thread GitBox
vvysotskyi commented on a change in pull request #1754: DRILL-7098: File 
Metadata Metastore Plugin
URL: https://github.com/apache/drill/pull/1754#discussion_r277232563
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/base/AbstractGroupScanWithMetadata.java
 ##
 @@ -151,13 +152,13 @@ public long getColumnValueCount(SchemaPath column) {
 long colNulls;
 if (columnStats != null) {
   Long nulls = (Long) columnStats.getStatistic(ColumnStatisticsKind.NULLS_COUNT);
-  colNulls = nulls != null ? nulls : GroupScan.NO_COLUMN_STATS;
+  colNulls = nulls != null ? nulls :  Statistic.NO_COLUMN_STATS;
 } else {
   return 0;
 }
-return GroupScan.NO_COLUMN_STATS == tableRowCount
-|| GroupScan.NO_COLUMN_STATS == colNulls
-? GroupScan.NO_COLUMN_STATS : tableRowCount - colNulls;
+return  Statistic.NO_COLUMN_STATS == tableRowCount
+||  Statistic.NO_COLUMN_STATS == colNulls
+?  Statistic.NO_COLUMN_STATS : tableRowCount - colNulls;
 
 Review comment:
   ```suggestion
   ? Statistic.NO_COLUMN_STATS : tableRowCount - colNulls;
   ```




[GitHub] [drill] vvysotskyi commented on a change in pull request #1754: DRILL-7098: File Metadata Metastore Plugin

2019-04-22 Thread GitBox
vvysotskyi commented on a change in pull request #1754: DRILL-7098: File 
Metadata Metastore Plugin
URL: https://github.com/apache/drill/pull/1754#discussion_r277232544
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/base/AbstractGroupScanWithMetadata.java
 ##
 @@ -151,13 +152,13 @@ public long getColumnValueCount(SchemaPath column) {
 long colNulls;
 if (columnStats != null) {
   Long nulls = (Long) columnStats.getStatistic(ColumnStatisticsKind.NULLS_COUNT);
-  colNulls = nulls != null ? nulls : GroupScan.NO_COLUMN_STATS;
+  colNulls = nulls != null ? nulls :  Statistic.NO_COLUMN_STATS;
 } else {
   return 0;
 }
-return GroupScan.NO_COLUMN_STATS == tableRowCount
-|| GroupScan.NO_COLUMN_STATS == colNulls
-? GroupScan.NO_COLUMN_STATS : tableRowCount - colNulls;
+return  Statistic.NO_COLUMN_STATS == tableRowCount
+||  Statistic.NO_COLUMN_STATS == colNulls
 
 Review comment:
   ```suggestion
   || Statistic.NO_COLUMN_STATS == colNulls
   ```




[GitHub] [drill] vvysotskyi commented on a change in pull request #1754: DRILL-7098: File Metadata Metastore Plugin

2019-04-22 Thread GitBox
vvysotskyi commented on a change in pull request #1754: DRILL-7098: File 
Metadata Metastore Plugin
URL: https://github.com/apache/drill/pull/1754#discussion_r277233443
 
 

 ##
 File path: 
metastore/metastore-api/src/main/java/org/apache/drill/metastore/TableStatisticsKind.java
 ##
 @@ -37,8 +36,8 @@ public Long mergeStatistics(Collection statistics) {
   long rowCount = 0;
   for (BaseMetadata statistic : statistics) {
 Long statRowCount = getValue(statistic);
-if (statRowCount == null || statRowCount == GroupScan.NO_COLUMN_STATS) {
-  rowCount = GroupScan.NO_COLUMN_STATS;
+if (statRowCount == null || statRowCount ==  Statistic.NO_COLUMN_STATS) {
+  rowCount =  Statistic.NO_COLUMN_STATS;
 
 Review comment:
   ```suggestion
 rowCount = Statistic.NO_COLUMN_STATS;
   ```




[GitHub] [drill] vvysotskyi commented on a change in pull request #1754: DRILL-7098: File Metadata Metastore Plugin

2019-04-22 Thread GitBox
vvysotskyi commented on a change in pull request #1754: DRILL-7098: File 
Metadata Metastore Plugin
URL: https://github.com/apache/drill/pull/1754#discussion_r277233476
 
 

 ##
 File path: 
metastore/metastore-api/src/main/java/org/apache/drill/metastore/TableStatisticsKind.java
 ##
 @@ -50,7 +49,7 @@ public Long mergeStatistics(Collection statistics) {
 @Override
 public Long getValue(BaseMetadata metadata) {
   Long rowCount = (Long) metadata.getStatistic(this);
-  return rowCount != null ? rowCount : GroupScan.NO_COLUMN_STATS;
+  return rowCount != null ? rowCount :  Statistic.NO_COLUMN_STATS;
 
 Review comment:
   ```suggestion
 return rowCount != null ? rowCount : Statistic.NO_COLUMN_STATS;
   ```




[GitHub] [drill] vvysotskyi commented on a change in pull request #1754: DRILL-7098: File Metadata Metastore Plugin

2019-04-22 Thread GitBox
vvysotskyi commented on a change in pull request #1754: DRILL-7098: File 
Metadata Metastore Plugin
URL: https://github.com/apache/drill/pull/1754#discussion_r277232692
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/BaseParquetMetadataProvider.java
 ##
 @@ -190,12 +191,12 @@ public TableMetadata getTableMetadata() {
 
   if (this.schema == null) {
 schema = new TupleSchema();
-fields.forEach((schemaPath, majorType) -> SchemaPathUtils.addColumnMetadata(schema, schemaPath, majorType));
+fields.forEach((schemaPath, majorType) -> MetadataUtils.addColumnMetadata(schema, schemaPath, majorType));
   } else {
 // merges specified schema with schema from table
 fields.forEach((schemaPath, majorType) -> {
   if (SchemaPathUtils.getColumnMetadata(schemaPath, schema) == null) {
-SchemaPathUtils.addColumnMetadata(schema, schemaPath, majorType);
+  MetadataUtils.addColumnMetadata(schema, schemaPath, majorType);
 
 Review comment:
   ```suggestion
   MetadataUtils.addColumnMetadata(schema, schemaPath, majorType);
   ```




[jira] [Created] (DRILL-7192) Drill limits rows when autoLimit is disabled

2019-04-22 Thread Volodymyr Vysotskyi (JIRA)
Volodymyr Vysotskyi created DRILL-7192:
--

 Summary: Drill limits rows when autoLimit is disabled
 Key: DRILL-7192
 URL: https://issues.apache.org/jira/browse/DRILL-7192
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.16.0
Reporter: Volodymyr Vysotskyi
 Fix For: Future


autoLimit for JDBC and REST clients was implemented in DRILL-7048.

*Steps to reproduce the issue:*
 1. Check that autoLimit is disabled; if it is not, disable it and restart Drill.
 2. Submit any query and verify that the row count is correct. For example,
{code:sql}
SELECT * FROM cp.`employee.json`;
{code}
returns 1,155 rows
 3. Enable autoLimit for the sqlLine client:
{code:sql}
!set rowLimit 10
{code}
4. Submit the same query and verify that the result has 10 rows.
 5. Disable autoLimit:
{code:sql}
!set rowLimit 0
{code}
6. Submit the same query again. This time, *it returns 10 rows instead of 
1,155*.

The correct row count is returned only after creating a new connection.

The same issue is also observed with the SQuirreL SQL client when it is connected 
to Drill; with other databases, for example Postgres, it works correctly.
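
The steps above could also be replayed from plain JDBC. The following is only a 
sketch, assuming that the client row limit is applied through the standard 
java.sql.Statement.setMaxRows call, where 0 means "no limit". The class and 
method names (RowLimitCheck, expectedRows, countRows, limitResets) are 
hypothetical and not part of Drill or sqlLine. It uses a fresh Statement per 
query, so it may not trigger the issue if the stale limit is tied to a reused 
statement rather than to the connection.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class RowLimitCheck {

    // JDBC convention: maxRows == 0 means the result set is not limited.
    static long expectedRows(long totalRows, int maxRows) {
        return maxRows == 0 ? totalRows : Math.min(totalRows, (long) maxRows);
    }

    // Runs the query with the given row limit and counts the rows returned.
    static long countRows(Connection conn, String sql, int maxRows) throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            stmt.setMaxRows(maxRows);
            try (ResultSet rs = stmt.executeQuery(sql)) {
                long count = 0;
                while (rs.next()) {
                    count++;
                }
                return count;
            }
        }
    }

    // Replays steps 2-6 on a single connection: unlimited, limited to 10, unlimited again.
    // Returns true when the limit is honoured and then correctly removed.
    static boolean limitResets(Connection conn, String sql) throws SQLException {
        long unlimited = countRows(conn, sql, 0);
        long limited = countRows(conn, sql, 10);
        long afterReset = countRows(conn, sql, 0);
        return limited == expectedRows(unlimited, 10) && afterReset == unlimited;
    }
}
```

With a connection to an affected Drill instance, calling 
RowLimitCheck.limitResets(conn, "SELECT * FROM cp.`employee.json`") would be 
expected to return false if the behaviour described in this issue carries over 
to this code path.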



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)