Re: QUESTION: Packet Parser for PCAP Plugin

2019-04-23 Thread Charles Givre
Hi Ted, 
I thought about that approach as well.  My concern was cluttering up the plugin 
with lots of columns, especially as we add different protocols.  However, if 
that's not a concern, I can have a go at it.  

I was thinking the same thing about the Kaitai struct.  Would it be possible to 
have a generic reader such that you provide the schema, and Drill would map 
that to columns as appropriate?  That way you could use pretty much all the 
formats from the Kaitai format gallery instantly. 
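
To make that concrete, here is a rough, untested sketch of what such a generic reader could do with the kaitai-struct-runtime library: load the class that kaitai-struct-compiler generated for a given spec, parse the raw bytes, and walk the generated no-arg accessors as candidate Drill columns. Only the ByteBufferKaitaiStream/KaitaiStream/KaitaiStruct calls come from the runtime; the class name, the filtering, and the column mapping are assumptions on my side.

import io.kaitai.struct.ByteBufferKaitaiStream;
import io.kaitai.struct.KaitaiStream;
import io.kaitai.struct.KaitaiStruct;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

public class GenericKaitaiReaderSketch {

  // generatedClassName would come from the format plugin config, e.g. a class
  // compiled from one of the format gallery specs (hypothetical name).
  static void dumpColumns(String generatedClassName, byte[] payload) throws Exception {
    Class<?> clazz = Class.forName(generatedClassName);
    KaitaiStruct parsed = (KaitaiStruct) clazz
        .getConstructor(KaitaiStream.class)
        .newInstance(new ByteBufferKaitaiStream(payload));

    for (Method m : clazz.getDeclaredMethods()) {
      // Generated field accessors are public and take no arguments; a real reader
      // would also skip internals such as _io()/_root()/_parent().
      if (m.getParameterCount() == 0 && Modifier.isPublic(m.getModifiers())) {
        Object value = m.invoke(parsed);
        // A real reader would map `value` onto a Drill vector based on its Java type.
        System.out.println(m.getName() + " = " + value);
      }
    }
  }
}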



> On Apr 23, 2019, at 5:08 PM, Ted Dunning  wrote:
> 
> Wow. Kaitai looks fabulous. It would be tempting to define a generic format
> that could use a kaitai spec to define the format of a file.
> 
> Regarding the map output, I think we solved the same problem in the PCAP
> parser itself by simply putting all of the fields at the top level and
> making them nullable. This means that the UDP stuff is null for TCP packets.
> 
> The same approach could be taken for other packets. If parsing is lazy,
> then reference to a parsed column would be required to trigger the parsing
> of a packet.
> 
> 
> 
> On Tue, Apr 23, 2019 at 10:52 AM Charles Givre  wrote:
> 
>> Hi Ted
>> The library that gave me the idea is the Kaitai struct.  The java library
>> itself is released under the Apache or MIT license.  It can parse a number
>> of binary formats including DNS packets, ICMP and many others.  It accepts
>> a byte[] as input. I already wrote working code that reads it but I’m not
>> sure how to output these results in Drill.
>> 
>> Sent from my iPhone
>> 
>>> On Apr 23, 2019, at 12:45, Ted Dunning  wrote:
>>> 
>>> I think this would be very useful, particularly if it is easy to add
>>> additional parsing methods.
>>> 
>>> When I started to pcap work, I couldn't find any libraries that combined
>>> what we needed in terms of function and license.
>>> 
>>>> On Tue, Apr 23, 2019, 9:34 AM Charles Givre  wrote:
>>>> 
>>>> Hello all,
>>>> I saw a few open source libraries that parse actual packet content and was
>>>> interested in incorporating this into Drill's PCAP parser.  I was thinking
>>>> initially of writing this as a UDF, however, I think it would be much
>>>> better to include this directly in Drill.  What I was thinking was to
>>>> create a field called parsed_packet that would be a Drill Map.  The
>>>> contents of this field would vary depending on the type of packet.  For
>>>> instance, if it is a DNS packet, you get all the DNS info, ICMP etc...
>>>> Does the community think this is a good idea?   Also, given the structure
>>>> of the PCAP plugin, I'm not quite sure how to create a Map field with
>>>> variable contents.  Are there any examples that use the same architecture
>>>> as the PCAP plugin?
>>>> Thanks,
>>>> -- C
>> 



Re: [Discuss] Integrate Arrow gandiva into Drill

2019-04-23 Thread Parth Chandra
Finally!
We can definitely go ahead with Gandiva if it doesn't depend on the memory
allocator.
Last time I checked the C++ version of Arrow still had its own memory
allocation, but Gandiva likely does not use that for its C++ code.



On Tue, Apr 23, 2019 at 5:46 AM weijie tong  wrote:

> Gandiva's Project does not allocate any additional memory to execute. It just
> computes over the input memory data, whether the columns are var-length or
> fixed-width. The output memory is also allocated by Drill ahead of time,
> which requires the outputs to be fixed-width vectors. Var-width output
> vector cases should not be handed to Gandiva to evaluate, since that would
> require Gandiva to allocate additional memory that is not controlled by the JVM.
>
> I guess that's why Gandiva does not implement operators like HashJoin or
> HashAggregate, which need to allocate additional memory. But
> Arrow's WIP PR ARROW-3191 https://github.com/apache/arrow/pull/4151 will
> make that possible.
>
> On Tue, Apr 23, 2019 at 7:15 AM Parth Chandra  wrote:
>
> > Is there a way to provide Drill's memory allocator to Gandiva/Arrow? If
> > not, then how do we keep a proper accounting of any memory used by
> > Gandiva/Arrow?
> >
> > On Sat, Apr 20, 2019 at 7:05 PM Paul Rogers 
> > wrote:
> >
> > > Hi Weijie,
> > >
> > > Thanks much for the explanation. Sounds like you are making good
> > progress.
> > >
> > >
> > > For which operator is the filter pushed into the scan? Although Impala
> > > does this for all scans, AFAIK, Drill does not do so. For example, the
> > text
> > > and JSON reader do not handle filtering. Filtering is instead done by
> the
> > > Filter operator in these cases. Perhaps you have your own special scan
> > > which handles filtering?
> > >
> > >
> > > The concern in DRILL-6340 was the user might do a project operation
> that
> > > causes the output batch to be much larger than the input batch. Someone
> > > suggested flatten as one example. String concatenation is another
> > example.
> > > The input batch might be large. The result of the concatenation could
> be
> > > too large for available memory. So, the idea was to project the single
> > > input batch into two (or more) output batches to control batch size.
> > >
> > >
> > > I like how you've categorized the vectors into the set that Gandiva
> can
> > > project, and the set that Drill must handle. Maybe you can extend this
> > idea
> > > for the case where input batches are split into multiple output
> batches.
> > >
> > >  Let Drill handle VarChar expressions that could increase column width
> > > (such as the concatenate operator.) Let Drill decide the number of rows
> > in
> > > the output batch. Then, for the columns that Gandiva can handle,
> project
> > > just those rows needed for the current output batch.
> > >
> > > Your solution might also be extended to handle the Gandiva library
> issue.
> > > Since you are splitting vectors into the Drill group and the Gandiva
> > group,
> > > if Drill runs on a platform without Gandiva support, or if the Gandiva
> > > library can't be found, just let all vectors fall into the Drill vector
> > > group.
> > >
> > > If the user wants to use Gandiva, he/she could set a config option to
> > > point to the Gandiva library (and supporting files, if any.) Or, use
> the
> > > existing LD_LIBRARY_PATH env. variable.
> > >
> > > Thanks,
> > > - Paul
> > >
> > >
> > >
> > > On Thursday, April 18, 2019, 11:45:08 PM PDT, weijie tong <
> > > tongweijie...@gmail.com> wrote:
> > >
> > >  Hi Paul:
> > > Currently Gandiva only supports Project and Filter operations. My work is
> > > to integrate the Project operator, since most Filter operations will be
> > > pushed down to the Scan.
> > >
> > > The Gandiva project interface works at the RecordBatch level. It accepts
> > > the memory addresses of the vectors of the input RecordBatch. Before that,
> > > it also needs to construct a binary schema object to describe the input
> > > RecordBatch schema.
> > >
> > > The integration work mainly has two parts:
> > >   1. At the setup step, find the expressions which can be solved by
> > > Gandiva. The matched expressions will be solved by Gandiva; the others
> > > will still be solved by Drill.
> > >   2. Invoke the Gandiva native project method. The matched expressions'
> > > ValueVectors will each get a corresponding Arrow-style null-representation
> > > ValueVector allocated, and the null input vectors' validity bits will also
> > > be set. The same work is done for the output ValueVectors, transferring the
> > > Arrow output null vector back to Drill's null vector. Since the native
> > > method only cares about the physical memory addresses, invoking it is not
> > > hard work.
> > >
> > > Since my current implementation predates DRILL-6340, it does not handle
> > > the case where the project's output batch needs to be smaller than the
> > > input batch. To cover that case, there is some more work to do which I
> > > have not focused on yet.
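
A minimal sketch of the setup-step split described in point 1 above, with the actual Gandiva capability check left abstract as a predicate (all names here are illustrative, not Drill or Gandiva APIs): expressions the predicate accepts go to Gandiva, everything else stays with Drill's own code generation.

import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

final class ProjectionSplitSketch<E> {

  final List<E> gandivaExpressions = new ArrayList<>();
  final List<E> drillExpressions = new ArrayList<>();

  ProjectionSplitSketch(List<E> projectExpressions, Predicate<E> gandivaCanEvaluate) {
    for (E expr : projectExpressions) {
      // Unsupported functions, and var-width outputs that would force Gandiva to
      // allocate memory, should fail the predicate and stay on the Drill side.
      if (gandivaCanEvaluate.test(expr)) {
        gandivaExpressions.add(expr);
      } else {
        drillExpressions.add(expr);
      }
    }
  }
}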

[jira] [Created] (DRILL-7199) Optimize the time taken to populate column statistics for non-interesting columns

2019-04-23 Thread Venkata Jyothsna Donapati (JIRA)
Venkata Jyothsna Donapati created DRILL-7199:


 Summary: Optimize the time taken to populate column statistics for 
non-interesting columns
 Key: DRILL-7199
 URL: https://issues.apache.org/jira/browse/DRILL-7199
 Project: Apache Drill
  Issue Type: Bug
Reporter: Venkata Jyothsna Donapati
Assignee: Venkata Jyothsna Donapati


Currently, populating column statistics for non-existent columns takes very long 
since it is done for every row group. Since non-existent column statistics are 
common for the whole table, they can be populated once and reused.
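
A minimal illustration of the reuse idea (not the actual metadata code; ColumnStatistics below is a hypothetical stand-in): build the placeholder statistics for a non-existent column once per column name and hand the same object to every row group.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class NonExistentColumnStatsCache {

  private final Map<String, ColumnStatistics> cache = new ConcurrentHashMap<>();

  // Called once per (row group, column); the expensive construction runs once per column.
  ColumnStatistics forColumn(String columnName) {
    return cache.computeIfAbsent(columnName, ColumnStatistics::allNulls);
  }

  /** Hypothetical stand-in for Drill's per-column statistics holder. */
  static final class ColumnStatistics {
    final String column;
    private ColumnStatistics(String column) { this.column = column; }
    static ColumnStatistics allNulls(String column) { return new ColumnStatistics(column); }
  }
}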



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: QUESTION: Packet Parser for PCAP Plugin

2019-04-23 Thread Ted Dunning
Wow. Kaitai looks fabulous. It would be tempting to define a generic format
that could use a kaitai spec to define the format of a file.

Regarding the map output, I think we solved the same problem in the PCAP
parser itself by simply putting all of the fields at the top level and
making them nullable. This means that the UDP stuff is null for TCP packets.

The same approach could be taken for other packets. If parsing is lazy,
then reference to a parsed column would be required to trigger the parsing
of a packet.
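
A minimal sketch of that combination of flat nullable columns and lazy parsing (none of this is the actual PCAP plugin code; every type and method name is illustrative): keep the raw bytes per row, decode only the first time a parsed column is read, and return null from protocol-specific columns when the packet is a different protocol.

final class LazyPacketRowSketch {

  /** Hypothetical stand-in for whatever the packet parser produces. */
  interface ParsedPacket {
    boolean isUdp();
    boolean isDns();
    int udpLength();
    int dnsQueryCount();
  }

  private final byte[] rawPacket;
  private ParsedPacket parsed;   // filled in only on first access

  LazyPacketRowSketch(byte[] rawPacket) {
    this.rawPacket = rawPacket;
  }

  /** Nullable top-level column: null unless the packet is UDP. */
  Integer udpLength() {
    return parse().isUdp() ? parse().udpLength() : null;
  }

  /** Nullable top-level column: null unless the packet is DNS. */
  Integer dnsQueryCount() {
    return parse().isDns() ? parse().dnsQueryCount() : null;
  }

  private ParsedPacket parse() {
    if (parsed == null) {
      parsed = decode(rawPacket);   // triggered only by a reference to a parsed column
    }
    return parsed;
  }

  private static ParsedPacket decode(byte[] bytes) {
    // Placeholder: real code would dispatch on the protocol headers in `bytes`.
    throw new UnsupportedOperationException("not implemented in this sketch");
  }
}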



On Tue, Apr 23, 2019 at 10:52 AM Charles Givre  wrote:

> Hi Ted
> The library that gave me the idea is the Kaitai struct.  The java library
> itself is released under the Apache or MIT license.  It can parse a number
> of binary formats including DNS packets, ICMP and many others.  It accepts
> a byte[] as input. I already wrote working code that reads it but I’m not
> sure how to output these results in Drill.
>
> Sent from my iPhone
>
> > On Apr 23, 2019, at 12:45, Ted Dunning  wrote:
> >
> > I think this would be very useful, particularly if it is easy to add
> > additional parsing methods.
> >
> > When I started to pcap work, I couldn't find any libraries that combined
> > what we needed in terms of function and license.
> >
> >> On Tue, Apr 23, 2019, 9:34 AM Charles Givre  wrote:
> >>
> >> Hello all,
> >> I saw a few open source libraries that parse actual packet content and
> was
> >> interested in incorporating this into Drill's PCAP parser.  I was
> thinking
> >> initially of writing this as a UDF, however, I think it would be much
> >> better to include this directly in Drill.  What I was thinking was to
> >> create a field called parsed_packet that would be a Drill Map.  The
> >> contents of this field would vary depending on the type of packet.  For
> >> instance, if it is a DNS packet, you get all the DNS info, ICMP etc...
> >> Does the community think this is a good idea?   Also, given the
> structure
> >> of the PCAP plugin, I'm not quite sure how to create a Map field with
> >> variable contents.  Are there any examples that use the same
> architecture
> >> as the PCAP plugin?
> >> Thanks,
> >> -- C
>


Re: [VOTE] Apache Drill Release 1.16.0 - RC1

2019-04-23 Thread Sorabh Hamirwasia
Hi Volodymyr,
The KEYS file on svn will be updated when a release candidate is approved
and all the artifacts are copied to the svn.

NOTICE is not updated per release, so I won't treat it as a blocker. But it
would be good to add this to the wiki below to ensure it's updated from next
time onwards.

For the release I am following this wiki [1], which is part of Parth's
repository. I will update it to include both of the steps above as well.

[1]: https://github.com/parthchandra/drill/wiki/Drill-Release-Process

Thanks,
Sorabh

On Tue, Apr 23, 2019 at 1:19 PM Volodymyr Vysotskyi 
wrote:

> Sorabh, could you please add your key to the
> https://dist.apache.org/repos/dist/release/drill/KEYS file?
>
> Not sure that it is a blocker, but the year in NOTICE is 2018.
>
> Do we have any guides for basic checks for release? If no, it would be good
> to introduce such a list of things to check for the release manager.
>
> Kind regards,
> Volodymyr Vysotskyi
>
>
> On Tue, Apr 23, 2019 at 11:09 PM Aman Sinha  wrote:
>
> > Downloaded source tarball on my Linux VM and built and ran unit tests
> > successfully (elapsed time 46 mins).
> > Downloaded binary tarball on my Mac  and ran in embedded mode.
> > Verified Sorabh's release signature using  gpg --verify
> > Checked the maven artifacts are published
> > Ran a few queries against TPC-DS SF1 and examined query profiles
> in
> > the Web UI.  Looked good.
> > Did a few manual tests with REFRESH METADATA by creating the new V4
> > metadata cache and checked EXPLAIN plans and query results.
> > Found an issue with control-c handling and filed DRILL-7198  and noted in
> > the JIRA that I don't think it is a blocker.
> >
> > Overall, release looks good !  +1
> >
> > Aman
> >
> >
> > On Tue, Apr 23, 2019 at 10:01 AM SorabhApache  wrote:
> >
> > > Thanks Aman and Volodymyr for discussing on this issue. Just to clarify
> > on
> > > the thread that RC1 still stands as valid, since the issue is not
> blocker
> > > anymore.
> > >
> > > On Tue, Apr 23, 2019 at 9:18 AM Volodymyr Vysotskyi <
> > volody...@apache.org>
> > > wrote:
> > >
> > > > Discussed with Aman and concluded that this issue is not a blocker
> for
> > > the
> > > > release.
> > > >
> > > > Kind regards,
> > > > Volodymyr Vysotskyi
> > > >
> > > >
> > > > On Tue, Apr 23, 2019 at 6:39 PM Aman Sinha 
> > wrote:
> > > >
> > > > > Hi Vova,
> > > > > I added some thoughts in the DRILL-7195 JIRA.
> > > > >
> > > > > Aman
> > > > >
> > > > > On Tue, Apr 23, 2019 at 6:06 AM Volodymyr Vysotskyi <
> > > > volody...@apache.org>
> > > > > wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > I did some checks and found the following issues:
> > > > > > - DRILL-7195 
> > > > > > - DRILL-7194 
> > > > > > - DRILL-7192 
> > > > > >
> > > > > > One of them (DRILL-7194) is also reproduced on the previous
> > version,
> > > > > > another is connected with the new feature (DRILL-7192), so I
> don't
> > > > think
> > > > > > that we should treat them as blockers.
> > > > > > The third one (DRILL-7195) is a regression and in some cases may
> > > cause
> > > > > the
> > > > > > wrong results, so I think that it should be fixed before the
> > release.
> > > > > > Any thoughts?
> > > > > >
> > > > > > Kind regards,
> > > > > > Volodymyr Vysotskyi
> > > > > >
> > > > > >
> > > > > > On Mon, Apr 22, 2019 at 8:58 PM SorabhApache 
> > > > wrote:
> > > > > >
> > > > > > > *< Please disregard previous email, one of the link is not
> > correct
> > > in
> > > > > it.
> > > > > > > Use the information in this email instead >*
> > > > > > >
> > > > > > > Hi Drillers,
> > > > > > > I'd like to propose the second release candidate (RC1) for the
> > > Apache
> > > > > > > Drill,
> > > > > > > version 1.16.0.
> > > > > > >
> > > > > > > Changes since the previous release candidate:
> > > > > > > DRILL-7185: Drill Fails to Read Large Packets
> > > > > > > DRILL-7186: Missing storage.json REST endpoint
> > > > > > > DRILL-7190: Missing backward compatibility for REST API with
> > > > DRILL-6562
> > > > > > >
> > > > > > > Also below 2 JIRA's were created to separately track revert of
> > > > protobuf
> > > > > > > changes in 1.16.0:
> > > > > > > DRILL-7188: Revert DRILL-6642: Update protocol-buffers version
> > > > > > > DRILL-7189: Revert DRILL-7105 Error while building the Drill
> > native
> > > > > > client
> > > > > > >
> > > > > > > The RC1 includes total of 215 resolved JIRAs [1].
> > > > > > > Thanks to everyone for their hard work to contribute to this
> > > release.
> > > > > > >
> > > > > > > The tarball artifacts are hosted at [2] and the maven artifacts
> > are
> > > > > > hosted
> > > > > > > at [3].
> > > > > > >
> > > > > > > This release candidate is based on commit
> > > > > > > cf5b758e0a4c22b75bfb02ac2653ff09415ddf53 located at [4].
> > > > > > >
> > > > > > > Please download and try out the 

Re: [VOTE] Apache Drill Release 1.16.0 - RC1

2019-04-23 Thread Volodymyr Vysotskyi
Sorabh, could you please add your key to the
https://dist.apache.org/repos/dist/release/drill/KEYS file?

Not sure that it is a blocker, but the year in NOTICE is 2018.

Do we have any guides for basic checks for release? If no, it would be good
to introduce such a list of things to check for the release manager.

Kind regards,
Volodymyr Vysotskyi


On Tue, Apr 23, 2019 at 11:09 PM Aman Sinha  wrote:

> Downloaded source tarball on my Linux VM and built and ran unit tests
> successfully (elapsed time 46 mins).
> Downloaded binary tarball on my Mac  and ran in embedded mode.
> Verified Sorabh's release signature using  gpg --verify
> Checked the maven artifacts are published
> Ran a few queries against TPC-DS SF1 and examined query profiles in
> the Web UI.  Looked good.
> Did a few manual tests with REFRESH METADATA by creating the new V4
> metadata cache and checked EXPLAIN plans and query results.
> Found an issue with control-c handling and filed DRILL-7198  and noted in
> the JIRA that I don't think it is a blocker.
>
> Overall, release looks good !  +1
>
> Aman
>
>
> On Tue, Apr 23, 2019 at 10:01 AM SorabhApache  wrote:
>
> > Thanks Aman and Volodymyr for discussing on this issue. Just to clarify
> on
> > the thread that RC1 still stands as valid, since the issue is not blocker
> > anymore.
> >
> > On Tue, Apr 23, 2019 at 9:18 AM Volodymyr Vysotskyi <
> volody...@apache.org>
> > wrote:
> >
> > > Discussed with Aman and concluded that this issue is not a blocker for
> > the
> > > release.
> > >
> > > Kind regards,
> > > Volodymyr Vysotskyi
> > >
> > >
> > > On Tue, Apr 23, 2019 at 6:39 PM Aman Sinha 
> wrote:
> > >
> > > > Hi Vova,
> > > > I added some thoughts in the DRILL-7195 JIRA.
> > > >
> > > > Aman
> > > >
> > > > On Tue, Apr 23, 2019 at 6:06 AM Volodymyr Vysotskyi <
> > > volody...@apache.org>
> > > > wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > I did some checks and found the following issues:
> > > > > - DRILL-7195 
> > > > > - DRILL-7194 
> > > > > - DRILL-7192 
> > > > >
> > > > > One of them (DRILL-7194) is also reproduced on the previous
> version,
> > > > > another is connected with the new feature (DRILL-7192), so I don't
> > > think
> > > > > that we should treat them as blockers.
> > > > > The third one (DRILL-7195) is a regression and in some cases may
> > cause
> > > > the
> > > > > wrong results, so I think that it should be fixed before the
> release.
> > > > > Any thoughts?
> > > > >
> > > > > Kind regards,
> > > > > Volodymyr Vysotskyi
> > > > >
> > > > >
> > > > > On Mon, Apr 22, 2019 at 8:58 PM SorabhApache 
> > > wrote:
> > > > >
> > > > > > *< Please disregard previous email, one of the link is not
> correct
> > in
> > > > it.
> > > > > > Use the information in this email instead >*
> > > > > >
> > > > > > Hi Drillers,
> > > > > > I'd like to propose the second release candidate (RC1) for the
> > Apache
> > > > > > Drill,
> > > > > > version 1.16.0.
> > > > > >
> > > > > > Changes since the previous release candidate:
> > > > > > DRILL-7185: Drill Fails to Read Large Packets
> > > > > > DRILL-7186: Missing storage.json REST endpoint
> > > > > > DRILL-7190: Missing backward compatibility for REST API with
> > > DRILL-6562
> > > > > >
> > > > > > Also below 2 JIRA's were created to separately track revert of
> > > protobuf
> > > > > > changes in 1.16.0:
> > > > > > DRILL-7188: Revert DRILL-6642: Update protocol-buffers version
> > > > > > DRILL-7189: Revert DRILL-7105 Error while building the Drill
> native
> > > > > client
> > > > > >
> > > > > > The RC1 includes total of 215 resolved JIRAs [1].
> > > > > > Thanks to everyone for their hard work to contribute to this
> > release.
> > > > > >
> > > > > > The tarball artifacts are hosted at [2] and the maven artifacts
> are
> > > > > hosted
> > > > > > at [3].
> > > > > >
> > > > > > This release candidate is based on commit
> > > > > > cf5b758e0a4c22b75bfb02ac2653ff09415ddf53 located at [4].
> > > > > >
> > > > > > Please download and try out the release candidate.
> > > > > >
> > > > > > The vote ends at 06:00 PM UTC (11:00 AM PDT, 09:00 PM EET, 11:30
> PM
> > > > IST),
> > > > > > Apr 25th, 2019
> > > > > >
> > > > > > [ ] +1
> > > > > > [ ] +0
> > > > > > [ ] -1
> > > > > >
> > > > > > Here is my vote: +1
> > > > > >   [1]
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313820&version=12344284
> > > > > >   [2] http://home.apache.org/~sorabh/drill/releases/1.16.0/rc1/
> > > > > >   [3]
> > > > > >
> > > >
> > https://repository.apache.org/content/repositories/orgapachedrill-1067/
> > > > > >   [4] https://github.com/sohami/drill/commits/drill-1.16.0
> > > > > >
> > > > > > Thanks,
> > > > > > Sorabh
> > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: [VOTE] Apache Drill Release 1.16.0 - RC1

2019-04-23 Thread Aman Sinha
Downloaded source tarball on my Linux VM and built and ran unit tests
successfully (elapsed time 46 mins).
Downloaded binary tarball on my Mac  and ran in embedded mode.
Verified Sorabh's release signature using  gpg --verify
Checked the maven artifacts are published
Ran a few queries against TPC-DS SF1 and examined query profiles in
the Web UI.  Looked good.
Did a few manual tests with REFRESH METADATA by creating the new V4
metadata cache and checked EXPLAIN plans and query results.
Found an issue with control-c handling and filed DRILL-7198  and noted in
the JIRA that I don't think it is a blocker.

Overall, release looks good !  +1

Aman


On Tue, Apr 23, 2019 at 10:01 AM SorabhApache  wrote:

> Thanks Aman and Volodymyr for discussing on this issue. Just to clarify on
> the thread that RC1 still stands as valid, since the issue is not blocker
> anymore.
>
> On Tue, Apr 23, 2019 at 9:18 AM Volodymyr Vysotskyi 
> wrote:
>
> > Discussed with Aman and concluded that this issue is not a blocker for
> the
> > release.
> >
> > Kind regards,
> > Volodymyr Vysotskyi
> >
> >
> > On Tue, Apr 23, 2019 at 6:39 PM Aman Sinha  wrote:
> >
> > > Hi Vova,
> > > I added some thoughts in the DRILL-7195 JIRA.
> > >
> > > Aman
> > >
> > > On Tue, Apr 23, 2019 at 6:06 AM Volodymyr Vysotskyi <
> > volody...@apache.org>
> > > wrote:
> > >
> > > > Hi all,
> > > >
> > > > I did some checks and found the following issues:
> > > > - DRILL-7195 
> > > > - DRILL-7194 
> > > > - DRILL-7192 
> > > >
> > > > One of them (DRILL-7194) is also reproduced on the previous version,
> > > > another is connected with the new feature (DRILL-7192), so I don't
> > think
> > > > that we should treat them as blockers.
> > > > The third one (DRILL-7195) is a regression and in some cases may
> cause
> > > the
> > > > wrong results, so I think that it should be fixed before the release.
> > > > Any thoughts?
> > > >
> > > > Kind regards,
> > > > Volodymyr Vysotskyi
> > > >
> > > >
> > > > On Mon, Apr 22, 2019 at 8:58 PM SorabhApache 
> > wrote:
> > > >
> > > > > *< Please disregard previous email, one of the link is not correct
> in
> > > it.
> > > > > Use the information in this email instead >*
> > > > >
> > > > > Hi Drillers,
> > > > > I'd like to propose the second release candidate (RC1) for the
> Apache
> > > > > Drill,
> > > > > version 1.16.0.
> > > > >
> > > > > Changes since the previous release candidate:
> > > > > DRILL-7185: Drill Fails to Read Large Packets
> > > > > DRILL-7186: Missing storage.json REST endpoint
> > > > > DRILL-7190: Missing backward compatibility for REST API with
> > DRILL-6562
> > > > >
> > > > > Also below 2 JIRA's were created to separately track revert of
> > protobuf
> > > > > changes in 1.16.0:
> > > > > DRILL-7188: Revert DRILL-6642: Update protocol-buffers version
> > > > > DRILL-7189: Revert DRILL-7105 Error while building the Drill native
> > > > client
> > > > >
> > > > > The RC1 includes total of 215 resolved JIRAs [1].
> > > > > Thanks to everyone for their hard work to contribute to this
> release.
> > > > >
> > > > > The tarball artifacts are hosted at [2] and the maven artifacts are
> > > > hosted
> > > > > at [3].
> > > > >
> > > > > This release candidate is based on commit
> > > > > cf5b758e0a4c22b75bfb02ac2653ff09415ddf53 located at [4].
> > > > >
> > > > > Please download and try out the release candidate.
> > > > >
> > > > > The vote ends at 06:00 PM UTC (11:00 AM PDT, 09:00 PM EET, 11:30 PM
> > > IST),
> > > > > Apr 25th, 2019
> > > > >
> > > > > [ ] +1
> > > > > [ ] +0
> > > > > [ ] -1
> > > > >
> > > > > Here is my vote: +1
> > > > >   [1]
> > > > >
> > > > >
> > > >
> > >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313820&version=12344284
> > > > >   [2] http://home.apache.org/~sorabh/drill/releases/1.16.0/rc1/
> > > > >   [3]
> > > > >
> > >
> https://repository.apache.org/content/repositories/orgapachedrill-1067/
> > > > >   [4] https://github.com/sohami/drill/commits/drill-1.16.0
> > > > >
> > > > > Thanks,
> > > > > Sorabh
> > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>


[jira] [Resolved] (DRILL-7197) Drill attempts to create table on Amazon when configured for on premise S3A

2019-04-23 Thread Matt Keranen (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Keranen resolved DRILL-7197.
-
Resolution: Not A Problem

S3 configuration needs to be in core-site.xml, not the storage configuration.

> Drill attempts to create table on Amazon when configured for on premise S3A
> ---
>
> Key: DRILL-7197
> URL: https://issues.apache.org/jira/browse/DRILL-7197
> Project: Apache Drill
>  Issue Type: Bug
>  Components:  Server
>Affects Versions: 1.15.0
> Environment: On premises 4 node Minio S3A storage cluster:
> {
>  {{ "type": "file",}}
>  {{ "connection": "s3a://logs",}}
>  {{ "config": {}}
>  {{  "fs.s3a.endpoint": "x.x.x.x:9001",}}
>  {{  "fs.s3a.connection.ssl.enabled": "false",}}
>  {{  "fs.s3a.access.key": "xxx",}}
>  {{  "fs.s3a.secret.key": "xxx"}}
>  {{ },}}
>  {{ "workspaces": {}}
>  {{  "tmp": {}}
>  {{   "location": "/tmp",}}
>  {{   "writable": true,}}
>  {{   "defaultInputFormat": null,}}
>  {{   "allowAccessOutsideWorkspace": false}}
>  {{  },}}
>  {{  "logs": {}}
>  {{  "location": "/",}}
>  {{  "writable": true,}}
>  {{  "defaultInputFormat": null,}}
>  {{  "allowAccessOutsideWorkspace": false }}
>  {{ }}}
> },
>Reporter: Matt Keranen
>Priority: Major
>
> Attempting to create Parquet tables from JSON data, both within an on-premises 
> Minio S3A cluster.
> Able to SELECT from local S3 storage (Minio), but CTAS seems to be trying to 
> go to Amazon S3:
>     SELECT * FROM s3.logs LIMIT 1;
> works, but
>     CREATE TABLE s3.logs.`test` SELECT ... FROM s3.logs
> results in the following error. Drill appears to attempt a connection to 
> Amazon on a CTAS but not a SELECT alone:
>  
> {{Error: SYSTEM ERROR: AmazonS3Exception: Status Code: 403, AWS Service: 
> Amazon S3, AWS Request ID: XXX, AWS Error Code: InvalidAccessKeyId, AWS Error 
> Message: The AWS Access Key Id you provided does not exist in our records.}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (DRILL-7198) Issuing a control-C in Sqlline exits the session (it does cancel the query)

2019-04-23 Thread Aman Sinha (JIRA)
Aman Sinha created DRILL-7198:
-

 Summary: Issuing a control-C in Sqlline exits the session (it does 
cancel the query)
 Key: DRILL-7198
 URL: https://issues.apache.org/jira/browse/DRILL-7198
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.15.0, 1.16.0
Reporter: Aman Sinha


This behavior is observed both in Drill 1.15.0 and the RC1 of 1.16.0.   Run a 
long-running query in sqlline and cancel it using control-c.  It exits the 
sqlline session although it does cancel the query.  Behavior is seen in both 
embedded mode and distributed mode.  If the query is submitted through sqlline  
and cancelled from the Web UI, it does behave correctly: the session does not 
get killed and subsequent queries can be submitted in the same sqlline session. 

Same query in Drill 1.14.0 works correctly and returns the column headers while 
canceling the query. 

Since the query can be cancelled just fine through the Web UI,  I am not 
considering this a blocker for 1.16.   Very likely the sqlline upgrade in 
1.15.0 changed the behavior.  




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (DRILL-7197) Drill attempts to create table on Amazon when configured for on premise S3A

2019-04-23 Thread Matt Keranen (JIRA)
Matt Keranen created DRILL-7197:
---

 Summary: Drill attempts to create table on Amazon when configured 
for on premise S3A
 Key: DRILL-7197
 URL: https://issues.apache.org/jira/browse/DRILL-7197
 Project: Apache Drill
  Issue Type: Bug
  Components:  Server
Affects Versions: 1.15.0
 Environment: {{{}}
{{ "type": "file",}}
{{ "connection": "s3a://logs",}}
{{ "config": {}}
{{  "fs.s3a.endpoint": "x.x.x.x:9001",}}
{{  "fs.s3a.connection.ssl.enabled": "false",}}
{{  "fs.s3a.access.key": "xxx",}}
{{  "fs.s3a.secret.key": "xxx"}}
{{ },}}
{{ "workspaces": {}}
{{  "tmp": {}}
{{   "location": "/tmp",}}
{{   "writable": true,}}
{{   "defaultInputFormat": null,}}
{{   "allowAccessOutsideWorkspace": false}}
{{  },}}
{{  "logs": {}}
{{  "location": "/",}}
{{  "writable": true,}}
{{  "defaultInputFormat": null,}}
{{  "allowAccessOutsideWorkspace": false }}
{{ }}}
{{ },}}
Reporter: Matt Keranen


Able to SELECT from local S3 storage (Minio), but CTAS seems to be trying to go 
to Amazon S3:

    SELECT * FROM s3.logs LIMIT 1;

works, but

    CREATE TABLE s3.logs.`test` SELECT ... FROM s3.logs

results in error:

{{Error: SYSTEM ERROR: AmazonS3Exception: Status Code: 403, AWS Service: Amazon 
S3, AWS Request ID: XXX, AWS Error Code: InvalidAccessKeyId, AWS Error Message: 
The AWS Access Key Id you provided does not exist in our records.}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: QUESTION: Packet Parser for PCAP Plugin

2019-04-23 Thread Paul Rogers
Hi Charles,

Two comments. 

First, Drill "maps" are actually structs (nested tuples): every record must 
have the same set of columns within the "map." That is, though the Drill type 
is called a "map", and you might assume that, given that name, it would act 
like a JSON, Python, or Java map, the actual implementation is, in fact, a 
struct. (I saw a JIRA ticket to rename the Map type in some context because of 
this unfortunate mismatch of name and implementation.)

By contrast, Hive defines both Map and Struct types. A Drill "Map" is like a 
Hive Struct, and Drill has no equivalent of a Hive Map. Still, there are 
solutions.

To use a single parsed_packet map column, you'd have to know the union of all 
the columns you'll create across all the packet types and define a map schema 
that includes all these columns. Define this map in all batches so you have a 
consistent schema. This means including all columns for all packet types, even 
if the data does not happen to have all packet types.

Or, you could define a different map for each packet type; but you'd still have 
to define the needed ones up front. You could do this if you had columns 
called, say, parsed_x_packet, parsed_y_packet, etc. If that packet type is 
projected (appears in the SELECT ... clause), then define the required schema 
for all records. The user just selects the packet types of interest.

This brings us to the second comment. The long work to merge the row set 
framework into Drill is coming to a close, and it is now available for you to 
use. The row set framework provides a very simple way to define your map 
schemas (once you know what they are). It also handles projection: the user 
selects some of your parsed packets, but not others, or projects some of the 
packet map columns, but not others.
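
For the single parsed_packet map, a rough sketch using the row set framework's schema builder (the package locations and builder methods are quoted from memory, so treat the imports as approximate rather than exact): the map carries the union of columns for all packet types, all nullable, defined up front so every batch has the same schema.

import org.apache.drill.common.types.TypeProtos.MinorType;
import org.apache.drill.exec.record.metadata.SchemaBuilder;
import org.apache.drill.exec.record.metadata.TupleMetadata;

public class ParsedPacketSchemaSketch {

  // Union-of-all-packet-types map; members for other protocols simply stay null.
  static TupleMetadata buildSchema() {
    return new SchemaBuilder()
        .addNullable("packet_type", MinorType.VARCHAR)
        .addMap("parsed_packet")
          .addNullable("dns_query_name", MinorType.VARCHAR)   // DNS-only
          .addNullable("dns_answer_count", MinorType.INT)     // DNS-only
          .addNullable("icmp_type", MinorType.INT)            // ICMP-only
          .resumeSchema()
        .buildSchema();
  }
}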

Drill 1.16 migrates the CSV reader to the new framework (where it also supports 
user-defined schemas and type conversions.) The next step in the row set work 
is to migrate a few other readers to the new framework. Perhaps, PCAP might be 
a good candidate to enable your new packet-parsing feature.


Thanks,
- Paul

 

On Tuesday, April 23, 2019, 9:34:16 AM PDT, Charles Givre 
 wrote:  
 
 Hello all,
I saw a few open source libraries that parse actual packet content and was 
interested in incorporating this into Drill's PCAP parser.  I was thinking 
initially of writing this as a UDF, however, I think it would be much better to 
include this directly in Drill.  What I was thinking was to create a field 
called parsed_packet that would be a Drill Map.  The contents of this field 
would vary depending on the type of packet.  For instance, if it is a DNS 
packet, you get all the DNS info, ICMP etc...
Does the community think this is a good idea?  Also, given the structure of the 
PCAP plugin, I'm not quite sure how to create a Map field with variable 
contents.  Are there any examples that use the same architecture as the PCAP 
plugin?
Thanks,
-- C  

Re: QUESTION: Packet Parser for PCAP Plugin

2019-04-23 Thread Charles Givre
Hi Ted
The library that gave me the idea is the Kaitai struct.  The java library 
itself is released under the Apache or MIT license.  It can parse a number of 
binary formats including DNS packets, ICMP and many others.  It accepts a 
byte[] as input. I already wrote working code that reads it but I’m not sure 
how to output these results in Drill. 

Sent from my iPhone

> On Apr 23, 2019, at 12:45, Ted Dunning  wrote:
> 
> I think this would be very useful, particularly if it is easy to add
> additional parsing methods.
> 
> When I started to pcap work, I couldn't find any libraries that combined
> what we needed in terms of function and license.
> 
>> On Tue, Apr 23, 2019, 9:34 AM Charles Givre  wrote:
>> 
>> Hello all,
>> I saw a few open source libraries that parse actual packet content and was
>> interested in incorporating this into Drill's PCAP parser.  I was thinking
>> initially of writing this as a UDF, however, I think it would be much
>> better to include this directly in Drill.  What I was thinking was to
>> create a field called parsed_packet that would be a Drill Map.  The
>> contents of this field would vary depending on the type of packet.  For
>> instance, if it is a DNS packet, you get all the DNS info, ICMP etc...
>> Does the community think this is a good idea?   Also, given the structure
>> of the PCAP plugin, I'm not quite sure how to create a Map field with
>> variable contents.  Are there any examples that use the same architecture
>> as the PCAP plugin?
>> Thanks,
>> -- C


[GitHub] [drill] vvysotskyi commented on a change in pull request #1763: DRILL-6974: Add possibility to view option value via SET command

2019-04-23 Thread GitBox
vvysotskyi commented on a change in pull request #1763: DRILL-6974: Add 
possibility to view option value via SET command
URL: https://github.com/apache/drill/pull/1763#discussion_r277788929
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/exec/planner/sql/handlers/SetOptionHandlerTest.java
 ##
 @@ -0,0 +1,91 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.sql.handlers;
+
+import org.apache.drill.categories.SqlTest;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.compile.ClassCompilerSelector;
+import org.apache.drill.test.ClusterFixture;
+import org.apache.drill.test.ClusterTest;
+import org.junit.Assert;
+import org.junit.BeforeClass;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+
+@Category(SqlTest.class)
+public class SetOptionHandlerTest extends ClusterTest {
+
+  @BeforeClass
+  public static void setup() throws Exception {
+startCluster(ClusterFixture.builder(dirTestWatcher).maxParallelization(1));
+  }
+
+  @Test
+  public void testSimpleSetQuery() throws Exception {
+String defaultValue = client.queryBuilder()
+.sql("SELECT val from sys.options where name = '%s' limit 1",
+ClassCompilerSelector.JAVA_COMPILER_DEBUG_OPTION)
+.singletonString();
+
+boolean newValue = !Boolean.valueOf(defaultValue);
+try {
+client.alterSession(ClassCompilerSelector.JAVA_COMPILER_DEBUG_OPTION, 
newValue);
 
 Review comment:
   Please fix indentation


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] vvysotskyi commented on a change in pull request #1763: DRILL-6974: Add possibility to view option value via SET command

2019-04-23 Thread GitBox
vvysotskyi commented on a change in pull request #1763: DRILL-6974: Add 
possibility to view option value via SET command
URL: https://github.com/apache/drill/pull/1763#discussion_r277785713
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/parser/DrillSqlSetOption.java
 ##
 @@ -0,0 +1,114 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.sql.parser;
+
+import java.util.ArrayList;
+import java.util.List;
+
+import org.apache.calcite.sql.SqlCall;
+import org.apache.calcite.sql.SqlIdentifier;
+import org.apache.calcite.sql.SqlKind;
+import org.apache.calcite.sql.SqlLiteral;
+import org.apache.calcite.sql.SqlNode;
+import org.apache.calcite.sql.SqlOperator;
+import org.apache.calcite.sql.SqlSetOption;
+import org.apache.calcite.sql.SqlSpecialOperator;
+import org.apache.calcite.sql.SqlWriter;
+import org.apache.calcite.sql.parser.SqlParserPos;
+import org.apache.calcite.util.ImmutableNullableList;
+import org.apache.drill.exec.planner.sql.handlers.SetOptionHandler;
+
+/**
+ * Sql parse tree node to represent statement: {@code SET <NAME> [ = VALUE ]}.
+ * Statement handled in: {@link SetOptionHandler}
+ */
+public final class DrillSqlSetOption extends SqlSetOption {
+
+  public static final SqlSpecialOperator OPERATOR = new 
SqlSpecialOperator("SET_OPTION", SqlKind.SET_OPTION) {
+@Override
+public SqlCall createCall(SqlLiteral functionQualifier, SqlParserPos pos, 
SqlNode... operands) {
+  SqlNode scopeNode = operands[0];
+  String scope = scopeNode == null ? null : scopeNode.toString();
+  return new DrillSqlSetOption(pos, scope, (SqlIdentifier) operands[1], 
operands[2]);
+}
+  };
+
+public DrillSqlSetOption(SqlParserPos pos, String scope, SqlIdentifier 
name, SqlNode value) {
+super(pos, scope, name, value);
+  }
+
+  @Override
+  public SqlKind getKind() {
+return SqlKind.SET_OPTION;
+  }
+
+  @Override
+  public SqlOperator getOperator() {
+return OPERATOR;
+  }
+
+  @Override
+  public List getOperandList() {
+List operandList = new ArrayList<>();
+if (this.getScope() == null) {
+  operandList.add(null);
+} else {
+  operandList.add(new SqlIdentifier(this.getScope(), SqlParserPos.ZERO));
+}
+
+operandList.add(this.getName());
+operandList.add(this.getValue());
+return ImmutableNullableList.copyOf(operandList);
+  }
+
+  @Override
+  public void setOperand(int i, SqlNode operand) {
 
 Review comment:
   Are there any differences in logic between this method and method from the 
parent class `SqlSetOption`?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] vvysotskyi commented on a change in pull request #1763: DRILL-6974: Add possibility to view option value via SET command

2019-04-23 Thread GitBox
vvysotskyi commented on a change in pull request #1763: DRILL-6974: Add 
possibility to view option value via SET command
URL: https://github.com/apache/drill/pull/1763#discussion_r277786213
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/parser/DrillSqlSetOption.java
 ##
 @@ -0,0 +1,114 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.sql.parser;
+
+import java.util.ArrayList;
+import java.util.List;
+
+import org.apache.calcite.sql.SqlCall;
+import org.apache.calcite.sql.SqlIdentifier;
+import org.apache.calcite.sql.SqlKind;
+import org.apache.calcite.sql.SqlLiteral;
+import org.apache.calcite.sql.SqlNode;
+import org.apache.calcite.sql.SqlOperator;
+import org.apache.calcite.sql.SqlSetOption;
+import org.apache.calcite.sql.SqlSpecialOperator;
+import org.apache.calcite.sql.SqlWriter;
+import org.apache.calcite.sql.parser.SqlParserPos;
+import org.apache.calcite.util.ImmutableNullableList;
+import org.apache.drill.exec.planner.sql.handlers.SetOptionHandler;
+
+/**
+ * Sql parse tree node to represent statement: {@code SET <NAME> [ = VALUE ]}.
+ * Statement handled in: {@link SetOptionHandler}
+ */
+public final class DrillSqlSetOption extends SqlSetOption {
+
+  public static final SqlSpecialOperator OPERATOR = new 
SqlSpecialOperator("SET_OPTION", SqlKind.SET_OPTION) {
+@Override
+public SqlCall createCall(SqlLiteral functionQualifier, SqlParserPos pos, 
SqlNode... operands) {
+  SqlNode scopeNode = operands[0];
+  String scope = scopeNode == null ? null : scopeNode.toString();
+  return new DrillSqlSetOption(pos, scope, (SqlIdentifier) operands[1], 
operands[2]);
+}
+  };
+
+public DrillSqlSetOption(SqlParserPos pos, String scope, SqlIdentifier 
name, SqlNode value) {
+super(pos, scope, name, value);
+  }
+
+  @Override
+  public SqlKind getKind() {
+return SqlKind.SET_OPTION;
+  }
+
+  @Override
+  public SqlOperator getOperator() {
+return OPERATOR;
+  }
+
+  @Override
+  public List getOperandList() {
 
 Review comment:
   The same question here.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] vvysotskyi commented on a change in pull request #1763: DRILL-6974: Add possibility to view option value via SET command

2019-04-23 Thread GitBox
vvysotskyi commented on a change in pull request #1763: DRILL-6974: Add 
possibility to view option value via SET command
URL: https://github.com/apache/drill/pull/1763#discussion_r277787941
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/exec/planner/sql/handlers/ResetOptionHandlerTest.java
 ##
 @@ -0,0 +1,66 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.sql.handlers;
+
+import org.apache.drill.categories.SqlTest;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.test.ClusterFixture;
+import org.apache.drill.test.ClusterTest;
+import org.junit.Assert;
+import org.junit.BeforeClass;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+
+@Category(SqlTest.class)
+public class ResetOptionHandlerTest extends ClusterTest {
+
+  @BeforeClass
+  public static void setup() throws Exception {
+startCluster(ClusterFixture.builder(dirTestWatcher).maxParallelization(1));
+  }
+
+  @Test
+  public void testReset() throws Exception {
+String testValue = "100";
+String defaultValue = client.queryBuilder()
 
 Review comment:
   Is it required to check this? We can obtain the option value and, for example, 
increase it to be used in the test, so it will not fail if the default value is 
changed to 100.
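
A sketch of that suggestion, as a fragment meant to slot into a test like the one in the diff (the resetSession call and the choice of option are assumptions on my side; queryBuilder/alterSession are the helpers already used above):

// Derive the test value from whatever the current default is, so the test cannot
// become vacuous if the default ever changes to the hard-coded number.
String defaultValue = client.queryBuilder()
    .sql("SELECT val FROM sys.options WHERE name = '%s' LIMIT 1", ExecConstants.SLICE_TARGET)
    .singletonString();
long testValue = Long.parseLong(defaultValue) + 100;   // guaranteed to differ from the default

try {
  client.alterSession(ExecConstants.SLICE_TARGET, testValue);
  // ... run the RESET statement under test, then assert the value is back to defaultValue
} finally {
  client.resetSession(ExecConstants.SLICE_TARGET);
}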


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] vvysotskyi commented on a change in pull request #1763: DRILL-6974: Add possibility to view option value via SET command

2019-04-23 Thread GitBox
vvysotskyi commented on a change in pull request #1763: DRILL-6974: Add 
possibility to view option value via SET command
URL: https://github.com/apache/drill/pull/1763#discussion_r277789557
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/exec/planner/sql/handlers/SetOptionHandlerTest.java
 ##
 @@ -0,0 +1,91 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.sql.handlers;
+
+import org.apache.drill.categories.SqlTest;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.compile.ClassCompilerSelector;
+import org.apache.drill.test.ClusterFixture;
+import org.apache.drill.test.ClusterTest;
+import org.junit.Assert;
+import org.junit.BeforeClass;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+
+@Category(SqlTest.class)
+public class SetOptionHandlerTest extends ClusterTest {
+
+  @BeforeClass
+  public static void setup() throws Exception {
+startCluster(ClusterFixture.builder(dirTestWatcher).maxParallelization(1));
+  }
+
+  @Test
+  public void testSimpleSetQuery() throws Exception {
+String defaultValue = client.queryBuilder()
+.sql("SELECT val from sys.options where name = '%s' limit 1",
+ClassCompilerSelector.JAVA_COMPILER_DEBUG_OPTION)
+.singletonString();
+
+boolean newValue = !Boolean.valueOf(defaultValue);
+try {
+client.alterSession(ClassCompilerSelector.JAVA_COMPILER_DEBUG_OPTION, 
newValue);
+
+String changedValue = client.queryBuilder()
+.sql("SELECT val from sys.options where name = '%s' limit 1",
+ClassCompilerSelector.JAVA_COMPILER_DEBUG_OPTION)
+.singletonString();
+
+Assert.assertEquals(String.valueOf(newValue), changedValue);
+Assert.assertNotEquals(defaultValue, changedValue);
 
 Review comment:
   This assertion is redundant, the first one handles such a case: 
`newValue == changedValue && newValue != defaultValue -> changedValue != defaultValue`


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] vvysotskyi commented on a change in pull request #1763: DRILL-6974: Add possibility to view option value via SET command

2019-04-23 Thread GitBox
vvysotskyi commented on a change in pull request #1763: DRILL-6974: Add 
possibility to view option value via SET command
URL: https://github.com/apache/drill/pull/1763#discussion_r277784626
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/parser/DrillSqlResetOption.java
 ##
 @@ -0,0 +1,103 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.sql.parser;
+
+import java.util.ArrayList;
+import java.util.List;
+
+import org.apache.calcite.sql.SqlCall;
+import org.apache.calcite.sql.SqlIdentifier;
+import org.apache.calcite.sql.SqlKind;
+import org.apache.calcite.sql.SqlLiteral;
+import org.apache.calcite.sql.SqlNode;
+import org.apache.calcite.sql.SqlOperator;
+import org.apache.calcite.sql.SqlSetOption;
+import org.apache.calcite.sql.SqlSpecialOperator;
+import org.apache.calcite.sql.SqlWriter;
+import org.apache.calcite.sql.parser.SqlParserPos;
+import org.apache.calcite.util.ImmutableNullableList;
+import org.apache.drill.exec.planner.sql.handlers.SetOptionHandler;
+
+/**
+ * Sql parse tree node to represent statement: {@code RESET { <NAME> | ALL } }.
+ * Statement handled in: {@link SetOptionHandler}
 
 Review comment:
   `SetOptionHandler` -> Reset...


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] vvysotskyi commented on a change in pull request #1763: DRILL-6974: Add possibility to view option value via SET command

2019-04-23 Thread GitBox
vvysotskyi commented on a change in pull request #1763: DRILL-6974: Add 
possibility to view option value via SET command
URL: https://github.com/apache/drill/pull/1763#discussion_r27915
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/SetOptionHandler.java
 ##
 @@ -20,108 +20,68 @@
 import java.math.BigDecimal;
 
 import org.apache.calcite.sql.type.SqlTypeName;
-import org.apache.calcite.tools.ValidationException;
-
 import org.apache.calcite.util.NlsString;
 import org.apache.drill.common.exceptions.UserException;
-import org.apache.drill.exec.ExecConstants;
 import org.apache.drill.exec.ops.QueryContext;
 import org.apache.drill.exec.physical.PhysicalPlan;
 import org.apache.drill.exec.planner.sql.DirectPlan;
+import org.apache.drill.exec.planner.sql.parser.DrillSqlSetOption;
 import org.apache.drill.exec.server.options.OptionManager;
 import org.apache.drill.exec.server.options.OptionValue;
 import org.apache.drill.exec.server.options.OptionValue.OptionScope;
-import org.apache.drill.exec.server.options.QueryOptionManager;
-import org.apache.drill.exec.util.ImpersonationUtil;
 import org.apache.drill.exec.work.foreman.ForemanSetupException;
 import org.apache.calcite.sql.SqlLiteral;
 import org.apache.calcite.sql.SqlNode;
 import org.apache.calcite.sql.SqlSetOption;
 
 /**
- * Converts a {@link SqlNode} representing "ALTER .. SET option = value" and 
"ALTER ... RESET ..." statements to a
- * {@link PhysicalPlan}. See {@link SqlSetOption}. These statements have side 
effects i.e. the options within the
- * system context or the session context are modified. The resulting {@link 
DirectPlan} returns to the client a string
- * that is the name of the option that was updated.
+ * Converts a {@link SqlNode} representing: "ALTER .. SET option = value" or 
"ALTER ... SET option"
+ * statement to a {@link PhysicalPlan}. See {@link DrillSqlSetOption}
+ * 
+ * These statements have side effects i.e. the options within the system 
context or the session context are modified.
+ * The resulting {@link DirectPlan} returns to the client a string that is the 
name of the option that was updated
+ * or a value of the property
  */
-public class SetOptionHandler extends AbstractSqlHandler {
-  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(SetOptionHandler.class);
-
-  private final QueryContext context;
+public class SetOptionHandler extends AbstractSqlSetHandler {
+  private static final org.slf4j.Logger LOGGER = 
org.slf4j.LoggerFactory.getLogger(SetOptionHandler.class);
 
 Review comment:
   In `AbstractSqlSetHandler` you have added a logger named in lower case, but 
in this class, it is in upper case. Could you please use a single style?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] vvysotskyi commented on a change in pull request #1763: DRILL-6974: Add possibility to view option value via SET command

2019-04-23 Thread GitBox
vvysotskyi commented on a change in pull request #1763: DRILL-6974: Add 
possibility to view option value via SET command
URL: https://github.com/apache/drill/pull/1763#discussion_r28188
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/SetOptionHandler.java
 ##
 @@ -20,108 +20,68 @@
 import java.math.BigDecimal;
 
 import org.apache.calcite.sql.type.SqlTypeName;
-import org.apache.calcite.tools.ValidationException;
-
 import org.apache.calcite.util.NlsString;
 import org.apache.drill.common.exceptions.UserException;
-import org.apache.drill.exec.ExecConstants;
 import org.apache.drill.exec.ops.QueryContext;
 import org.apache.drill.exec.physical.PhysicalPlan;
 import org.apache.drill.exec.planner.sql.DirectPlan;
+import org.apache.drill.exec.planner.sql.parser.DrillSqlSetOption;
 import org.apache.drill.exec.server.options.OptionManager;
 import org.apache.drill.exec.server.options.OptionValue;
 import org.apache.drill.exec.server.options.OptionValue.OptionScope;
-import org.apache.drill.exec.server.options.QueryOptionManager;
-import org.apache.drill.exec.util.ImpersonationUtil;
 import org.apache.drill.exec.work.foreman.ForemanSetupException;
 import org.apache.calcite.sql.SqlLiteral;
 import org.apache.calcite.sql.SqlNode;
 import org.apache.calcite.sql.SqlSetOption;
 
 /**
- * Converts a {@link SqlNode} representing "ALTER .. SET option = value" and "ALTER ... RESET ..." statements to a
- * {@link PhysicalPlan}. See {@link SqlSetOption}. These statements have side effects i.e. the options within the
- * system context or the session context are modified. The resulting {@link DirectPlan} returns to the client a string
- * that is the name of the option that was updated.
+ * Converts a {@link SqlNode} representing: "ALTER .. SET option = value" or "ALTER ... SET option"
+ * statement to a {@link PhysicalPlan}. See {@link DrillSqlSetOption}
+ * 
+ * These statements have side effects i.e. the options within the system context or the session context are modified.
+ * The resulting {@link DirectPlan} returns to the client a string that is the name of the option that was updated
+ * or a value of the property
  */
-public class SetOptionHandler extends AbstractSqlHandler {
-  private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(SetOptionHandler.class);
-
-  private final QueryContext context;
+public class SetOptionHandler extends AbstractSqlSetHandler {
+  private static final org.slf4j.Logger LOGGER = org.slf4j.LoggerFactory.getLogger(SetOptionHandler.class);
 
   public SetOptionHandler(QueryContext context) {
-this.context = context;
+super(context);
   }
 
+  /**
+   * {@inheritDoc}
 
 Review comment:
   The same as for `ResetOptionHandler`


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] vvysotskyi commented on a change in pull request #1763: DRILL-6974: Add possibility to view option value via SET command

2019-04-23 Thread GitBox
vvysotskyi commented on a change in pull request #1763: DRILL-6974: Add 
possibility to view option value via SET command
URL: https://github.com/apache/drill/pull/1763#discussion_r24762
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/ResetOptionHandler.java
 ##
 @@ -0,0 +1,73 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.sql.handlers;
+
+import org.apache.drill.exec.ops.QueryContext;
+import org.apache.drill.exec.physical.PhysicalPlan;
+import org.apache.drill.exec.planner.sql.DirectPlan;
+import org.apache.drill.exec.planner.sql.parser.DrillSqlResetOption;
+import org.apache.drill.exec.server.options.OptionManager;
+import org.apache.drill.exec.server.options.OptionValue;
+import org.apache.drill.exec.server.options.OptionValue.OptionScope;
+import org.apache.drill.exec.server.options.QueryOptionManager;
+import org.apache.drill.exec.work.foreman.ForemanSetupException;
+import org.apache.calcite.sql.SqlNode;
+import org.apache.calcite.sql.SqlSetOption;
+
+/**
+ * Converts a {@link SqlNode} representing: "ALTER .. RESET option | ALL" statement to a {@link PhysicalPlan}.
+ * See {@link DrillSqlResetOption}.
+ * 
+ * These statements have side effects i.e. the options within the system context or the session context are modified.
+ * The resulting {@link DirectPlan} returns to the client a string that is the name of the option that was updated
+ * or a value of the property
+ */
+public class ResetOptionHandler extends AbstractSqlSetHandler {
+
+  /**
+   * Class constructor.
+   * @param context Context of the Query
+   */
+  public ResetOptionHandler(QueryContext context) {
+super(context);
+  }
+
+  /**
+   * {@inheritDoc}
 
 Review comment:
   Could you please add Javadoc for this method, since the parent class does 
not have it? I don't see a reason for using the `{@inheritDoc}` annotation here.
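  For example, something along these lines (the wording and the method below are only a sketch, not Drill's actual signature):

  // Hypothetical sketch of documenting the method directly instead of
  // relying on an inherited description.
  class ResetOptionHandlerDocExample {

    /**
     * Resets the option named in the "ALTER ... RESET" statement back to its
     * default value within the requested scope and reports the affected option.
     *
     * @param optionName name of the option to reset
     * @return the name of the option that was reset
     */
    String resetOption(String optionName) {
      return optionName;
    }
  }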


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


Re: [VOTE] Apache Drill Release 1.16.0 - RC1

2019-04-23 Thread SorabhApache
Thanks Aman and Volodymyr for discussing this issue. Just to clarify on
the thread: RC1 still stands as valid, since the issue is no longer a
blocker.

On Tue, Apr 23, 2019 at 9:18 AM Volodymyr Vysotskyi 
wrote:

> Discussed with Aman and concluded that this issue is not a blocker for the
> release.
>
> Kind regards,
> Volodymyr Vysotskyi
>
>
> On Tue, Apr 23, 2019 at 6:39 PM Aman Sinha  wrote:
>
> > Hi Vova,
> > I added some thoughts in the DRILL-7195 JIRA.
> >
> > Aman
> >
> > On Tue, Apr 23, 2019 at 6:06 AM Volodymyr Vysotskyi <
> volody...@apache.org>
> > wrote:
> >
> > > Hi all,
> > >
> > > I did some checks and found the following issues:
> > > - DRILL-7195 
> > > - DRILL-7194 
> > > - DRILL-7192 
> > >
> > > One of them (DRILL-7194) is also reproduced on the previous version,
> > > another is connected with the new feature (DRILL-7192), so I don't
> think
> > > that we should treat them as blockers.
> > > The third one (DRILL-7195) is a regression and in some cases may cause
> > the
> > > wrong results, so I think that it should be fixed before the release.
> > > Any thoughts?
> > >
> > > Kind regards,
> > > Volodymyr Vysotskyi
> > >
> > >
> > > On Mon, Apr 22, 2019 at 8:58 PM SorabhApache 
> wrote:
> > >
> > > > *< Please disregard previous email, one of the link is not correct in
> > it.
> > > > Use the information in this email instead >*
> > > >
> > > > Hi Drillers,
> > > > I'd like to propose the second release candidate (RC1) for the Apache
> > > > Drill,
> > > > version 1.16.0.
> > > >
> > > > Changes since the previous release candidate:
> > > > DRILL-7185: Drill Fails to Read Large Packets
> > > > DRILL-7186: Missing storage.json REST endpoint
> > > > DRILL-7190: Missing backward compatibility for REST API with
> DRILL-6562
> > > >
> > > > Also below 2 JIRA's were created to separately track revert of
> protbuf
> > > > changes in 1.16.0:
> > > > DRILL-7188: Revert DRILL-6642: Update protocol-buffers version
> > > > DRILL-7189: Revert DRILL-7105 Error while building the Drill native
> > > client
> > > >
> > > > The RC1 includes total of 215 resolved JIRAs [1].
> > > > Thanks to everyone for their hard work to contribute to this release.
> > > >
> > > > The tarball artifacts are hosted at [2] and the maven artifacts are
> > > hosted
> > > > at [3].
> > > >
> > > > This release candidate is based on commit
> > > > cf5b758e0a4c22b75bfb02ac2653ff09415ddf53 located at [4].
> > > >
> > > > Please download and try out the release candidate.
> > > >
> > > > The vote ends at 06:00 PM UTC (11:00 AM PDT, 09:00 PM EET, 11:30 PM
> > IST),
> > > > Apr 25th, 2019
> > > >
> > > > [ ] +1
> > > > [ ] +0
> > > > [ ] -1
> > > >
> > > > Here is my vote: +1
> > > >   [1]
> > > >
> > > >
> > >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313820=12344284
> > > >   [2] http://home.apache.org/~sorabh/drill/releases/1.16.0/rc1/
> > > >   [3]
> > > >
> > https://repository.apache.org/content/repositories/orgapachedrill-1067/
> > > >   [4] https://github.com/sohami/drill/commits/drill-1.16.0
> > > >
> > > > Thanks,
> > > > Sorabh
> > > >
> > > > >
> > > >
> > >
> >
>


Re: QUESTION: Packet Parser for PCAP Plugin

2019-04-23 Thread Ted Dunning
I think this would be very useful, particularly if it is easy to add
additional parsing methods.

When I started to pcap work, I couldn't find any libraries that combined
what we needed in terms of function and license.

On Tue, Apr 23, 2019, 9:34 AM Charles Givre  wrote:

> Hello all,
> I saw a few open source libraries that parse actual packet content and was
> interested in incorporating this into Drill's PCAP parser.  I was thinking
> initially of writing this as a UDF, however, I think it would be much
> better to include this directly in Drill.  What I was thinking was to
> create a field called parsed_packet that would be a Drill Map.  The
> contents of this field would vary depending on the type of packet.  For
> instance, if it is a DNS packet, you get all the DNS info, ICMP etc...
> Does the community think this is a good idea?   Also, given the structure
> of the PCAP plugin, I'm not quite sure how to create a Map field with
> variable contents.  Are there any examples that use the same architecture
> as the PCAP plugin?
> Thanks,
> -- C


QUESTION: Packet Parser for PCAP Plugin

2019-04-23 Thread Charles Givre
Hello all,
I saw a few open source libraries that parse actual packet content and was 
interested in incorporating this into Drill's PCAP parser.  I was thinking 
initially of writing this as a UDF, however, I think it would be much better to 
include this directly in Drill.  What I was thinking was to create a field 
called parsed_packet that would be a Drill Map.  The contents of this field 
would vary depending on the type of packet.  For instance, if it is a DNS 
packet, you get all the DNS info, ICMP etc...
Does the community think this is a good idea?   Also, given the structure of 
the PCAP plugin, I'm not quite sure how to create a Map field with variable 
contents.  Are there any examples that use the same architecture as the PCAP 
plugin?
Thanks,
-- C

Re: [VOTE] Apache Drill Release 1.16.0 - RC1

2019-04-23 Thread Volodymyr Vysotskyi
Discussed with Aman and concluded that this issue is not a blocker for the
release.

Kind regards,
Volodymyr Vysotskyi


On Tue, Apr 23, 2019 at 6:39 PM Aman Sinha  wrote:

> Hi Vova,
> I added some thoughts in the DRILL-7195 JIRA.
>
> Aman
>
> On Tue, Apr 23, 2019 at 6:06 AM Volodymyr Vysotskyi 
> wrote:
>
> > Hi all,
> >
> > I did some checks and found the following issues:
> > - DRILL-7195 
> > - DRILL-7194 
> > - DRILL-7192 
> >
> > One of them (DRILL-7194) is also reproduced on the previous version,
> > another is connected with the new feature (DRILL-7192), so I don't think
> > that we should treat them as blockers.
> > The third one (DRILL-7195) is a regression and in some cases may cause
> the
> > wrong results, so I think that it should be fixed before the release.
> > Any thoughts?
> >
> > Kind regards,
> > Volodymyr Vysotskyi
> >
> >
> > On Mon, Apr 22, 2019 at 8:58 PM SorabhApache  wrote:
> >
> > > *< Please disregard previous email, one of the link is not correct in
> it.
> > > Use the information in this email instead >*
> > >
> > > Hi Drillers,
> > > I'd like to propose the second release candidate (RC1) for the Apache
> > > Drill,
> > > version 1.16.0.
> > >
> > > Changes since the previous release candidate:
> > > DRILL-7185: Drill Fails to Read Large Packets
> > > DRILL-7186: Missing storage.json REST endpoint
> > > DRILL-7190: Missing backward compatibility for REST API with DRILL-6562
> > >
> > > Also below 2 JIRA's were created to separately track revert of protbuf
> > > changes in 1.16.0:
> > > DRILL-7188: Revert DRILL-6642: Update protocol-buffers version
> > > DRILL-7189: Revert DRILL-7105 Error while building the Drill native
> > client
> > >
> > > The RC1 includes total of 215 resolved JIRAs [1].
> > > Thanks to everyone for their hard work to contribute to this release.
> > >
> > > The tarball artifacts are hosted at [2] and the maven artifacts are
> > hosted
> > > at [3].
> > >
> > > This release candidate is based on commit
> > > cf5b758e0a4c22b75bfb02ac2653ff09415ddf53 located at [4].
> > >
> > > Please download and try out the release candidate.
> > >
> > > The vote ends at 06:00 PM UTC (11:00 AM PDT, 09:00 PM EET, 11:30 PM
> IST),
> > > Apr 25th, 2019
> > >
> > > [ ] +1
> > > [ ] +0
> > > [ ] -1
> > >
> > > Here is my vote: +1
> > >   [1]
> > >
> > >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313820=12344284
> > >   [2] http://home.apache.org/~sorabh/drill/releases/1.16.0/rc1/
> > >   [3]
> > >
> https://repository.apache.org/content/repositories/orgapachedrill-1067/
> > >   [4] https://github.com/sohami/drill/commits/drill-1.16.0
> > >
> > > Thanks,
> > > Sorabh
> > >
> > > >
> > >
> >
>


Re: [VOTE] Apache Drill Release 1.16.0 - RC1

2019-04-23 Thread Aman Sinha
Hi Vova,
I added some thoughts in the DRILL-7195 JIRA.

Aman

On Tue, Apr 23, 2019 at 6:06 AM Volodymyr Vysotskyi 
wrote:

> Hi all,
>
> I did some checks and found the following issues:
> - DRILL-7195 
> - DRILL-7194 
> - DRILL-7192 
>
> One of them (DRILL-7194) is also reproduced on the previous version,
> another is connected with the new feature (DRILL-7192), so I don't think
> that we should treat them as blockers.
> The third one (DRILL-7195) is a regression and in some cases may cause the
> wrong results, so I think that it should be fixed before the release.
> Any thoughts?
>
> Kind regards,
> Volodymyr Vysotskyi
>
>
> On Mon, Apr 22, 2019 at 8:58 PM SorabhApache  wrote:
>
> > *< Please disregard previous email, one of the link is not correct in it.
> > Use the information in this email instead >*
> >
> > Hi Drillers,
> > I'd like to propose the second release candidate (RC1) for the Apache
> > Drill,
> > version 1.16.0.
> >
> > Changes since the previous release candidate:
> > DRILL-7185: Drill Fails to Read Large Packets
> > DRILL-7186: Missing storage.json REST endpoint
> > DRILL-7190: Missing backward compatibility for REST API with DRILL-6562
> >
> > Also below 2 JIRA's were created to separately track revert of protbuf
> > changes in 1.16.0:
> > DRILL-7188: Revert DRILL-6642: Update protocol-buffers version
> > DRILL-7189: Revert DRILL-7105 Error while building the Drill native
> client
> >
> > The RC1 includes total of 215 resolved JIRAs [1].
> > Thanks to everyone for their hard work to contribute to this release.
> >
> > The tarball artifacts are hosted at [2] and the maven artifacts are
> hosted
> > at [3].
> >
> > This release candidate is based on commit
> > cf5b758e0a4c22b75bfb02ac2653ff09415ddf53 located at [4].
> >
> > Please download and try out the release candidate.
> >
> > The vote ends at 06:00 PM UTC (11:00 AM PDT, 09:00 PM EET, 11:30 PM IST),
> > Apr 25th, 2019
> >
> > [ ] +1
> > [ ] +0
> > [ ] -1
> >
> > Here is my vote: +1
> >   [1]
> >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313820=12344284
> >   [2] http://home.apache.org/~sorabh/drill/releases/1.16.0/rc1/
> >   [3]
> > https://repository.apache.org/content/repositories/orgapachedrill-1067/
> >   [4] https://github.com/sohami/drill/commits/drill-1.16.0
> >
> > Thanks,
> > Sorabh
> >
> > >
> >
>


[jira] [Created] (DRILL-7196) Drillbit should raise more clear exception message when query is executed on a disabled storage plugin

2019-04-23 Thread Dmytriy Grinchenko (JIRA)
Dmytriy Grinchenko created DRILL-7196:
-

 Summary: Drillbit should raise more clear exception message when 
query is executed on a disabled storage plugin
 Key: DRILL-7196
 URL: https://issues.apache.org/jira/browse/DRILL-7196
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 1.12.0
Reporter: Dmytriy Grinchenko
Assignee: Dmytriy Grinchenko
 Fix For: 1.17.0


Currently the Drillbit returns an exception like the one below: 
{code}
Error: SYSTEM ERROR: AssertionError: Rule's description should be unique; 
existing rule=TestRuleName; new rule=TestRuleName
{code}

This error message does not reflect the real failure cause. Instead, we need to 
provide a clearer exception, for example: "Unable to execute query using 
disabled storage plugin."
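
A rough sketch of what raising such an error could look like (purely illustrative: the class name, method and the place where the check happens are hypothetical, and the choice of a validation error over another UserException kind is an assumption):
{code:java}
import org.apache.drill.common.exceptions.UserException;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Hypothetical helper: the real check would live in the planning code that
// resolves the storage plugin for the query.
public class DisabledPluginCheck {
  private static final Logger logger = LoggerFactory.getLogger(DisabledPluginCheck.class);

  public static void checkPluginEnabled(String pluginName, boolean enabled) {
    if (!enabled) {
      throw UserException.validationError()
          .message("Unable to execute query: storage plugin '%s' is disabled.", pluginName)
          .build(logger);
    }
  }
}
{code}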




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [VOTE] Apache Drill Release 1.16.0 - RC1

2019-04-23 Thread Volodymyr Vysotskyi
Hi all,

I did some checks and found the following issues:
- DRILL-7195 
- DRILL-7194 
- DRILL-7192 

One of them (DRILL-7194) is also reproduced on the previous version,
another is connected with the new feature (DRILL-7192), so I don't think
that we should treat them as blockers.
The third one (DRILL-7195) is a regression and in some cases may cause the
wrong results, so I think that it should be fixed before the release.
Any thoughts?

Kind regards,
Volodymyr Vysotskyi


On Mon, Apr 22, 2019 at 8:58 PM SorabhApache  wrote:

> *< Please disregard previous email, one of the link is not correct in it.
> Use the information in this email instead >*
>
> Hi Drillers,
> I'd like to propose the second release candidate (RC1) for the Apache
> Drill,
> version 1.16.0.
>
> Changes since the previous release candidate:
> DRILL-7185: Drill Fails to Read Large Packets
> DRILL-7186: Missing storage.json REST endpoint
> DRILL-7190: Missing backward compatibility for REST API with DRILL-6562
>
> Also below 2 JIRA's were created to separately track revert of protbuf
> changes in 1.16.0:
> DRILL-7188: Revert DRILL-6642: Update protocol-buffers version
> DRILL-7189: Revert DRILL-7105 Error while building the Drill native client
>
> The RC1 includes total of 215 resolved JIRAs [1].
> Thanks to everyone for their hard work to contribute to this release.
>
> The tarball artifacts are hosted at [2] and the maven artifacts are hosted
> at [3].
>
> This release candidate is based on commit
> cf5b758e0a4c22b75bfb02ac2653ff09415ddf53 located at [4].
>
> Please download and try out the release candidate.
>
> The vote ends at 06:00 PM UTC (11:00 AM PDT, 09:00 PM EET, 11:30 PM IST),
> Apr 25th, 2019
>
> [ ] +1
> [ ] +0
> [ ] -1
>
> Here is my vote: +1
>   [1]
>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313820=12344284
>   [2] http://home.apache.org/~sorabh/drill/releases/1.16.0/rc1/
>   [3]
> https://repository.apache.org/content/repositories/orgapachedrill-1067/
>   [4] https://github.com/sohami/drill/commits/drill-1.16.0
>
> Thanks,
> Sorabh
>
> >
>


Re: [Discuss] Integrate Arrow gandiva into Drill

2019-04-23 Thread weijie tong
Gandiva's Project does not allocate any additional memory to execute. It just
computes over the input memory data, whether the columns are var-length or
fixed-width. The output memory will also be allocated by Drill ahead of time,
which means the output needs to be fixed-width vectors. Var-width output
vector cases should not be given to Gandiva to evaluate, since that would
require Gandiva to allocate additional memory which is not controlled by the
JVM.

I guess that's why Gandiva does not implement operators like HashJoin or
HashAggregate, which need to allocate additional memory. But Arrow's WIP PR
ARROW-3191 https://github.com/apache/arrow/pull/4151 will make that possible.
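
To make the setup concrete, here is a rough sketch of the call sequence I have
in mind, assuming the Arrow Gandiva Java bindings (Projector/TreeBuilder). The
field names, the add(a, b) expression and the way the input ArrowRecordBatch is
produced are only illustrative, and exact signatures may differ between Arrow
versions:

import java.util.Arrays;
import java.util.List;

import org.apache.arrow.gandiva.evaluator.Projector;
import org.apache.arrow.gandiva.expression.ExpressionTree;
import org.apache.arrow.gandiva.expression.TreeBuilder;
import org.apache.arrow.gandiva.expression.TreeNode;
import org.apache.arrow.memory.BufferAllocator;
import org.apache.arrow.vector.IntVector;
import org.apache.arrow.vector.ValueVector;
import org.apache.arrow.vector.ipc.message.ArrowRecordBatch;
import org.apache.arrow.vector.types.Types.MinorType;
import org.apache.arrow.vector.types.pojo.Field;
import org.apache.arrow.vector.types.pojo.Schema;

public class GandivaProjectSketch {

  // Projects "sum = add(a, b)" over an already-built input batch. The output
  // vector is allocated by the caller (Drill, in this discussion), so Gandiva
  // only writes into memory it was handed and never allocates on its own.
  public static IntVector projectSum(ArrowRecordBatch inputBatch,
                                     BufferAllocator allocator,
                                     int rowCount) throws Exception {
    Field a = Field.nullable("a", MinorType.INT.getType());
    Field b = Field.nullable("b", MinorType.INT.getType());
    Field sum = Field.nullable("sum", MinorType.INT.getType());
    Schema inputSchema = new Schema(Arrays.asList(a, b));

    // Expression tree for add(a, b), built once at setup time.
    TreeNode add = TreeBuilder.makeFunction("add",
        Arrays.asList(TreeBuilder.makeField(a), TreeBuilder.makeField(b)),
        MinorType.INT.getType());
    ExpressionTree expr = TreeBuilder.makeExpression(add, sum);
    Projector projector = Projector.make(inputSchema, Arrays.asList(expr));

    // Fixed-width output vector, sized up front by the caller.
    IntVector out = new IntVector("sum", allocator);
    out.allocateNew(rowCount);
    List<ValueVector> outputVectors = Arrays.asList((ValueVector) out);

    // Gandiva evaluates the expression directly against the buffers it is given.
    projector.evaluate(inputBatch, outputVectors);
    out.setValueCount(rowCount);
    projector.close();
    return out;
  }
}

The point is that the only allocation happens in the caller's allocateNew
call; Gandiva just fills the buffers it is given.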

On Tue, Apr 23, 2019 at 7:15 AM Parth Chandra  wrote:

> Is there a way to provide Drill's memory allocator to Gandiva/Arrow? If
> not, then how do we keep a proper accounting of any memory used by
> Gandiva/Arrow?
>
> On Sat, Apr 20, 2019 at 7:05 PM Paul Rogers 
> wrote:
>
> > Hi Weijie,
> >
> > Thanks much for the explanation. Sounds like you are making good
> progress.
> >
> >
> > For which operator is the filter pushed into the scan? Although Impala
> > does this for all scans, AFAIK, Drill does not do so. For example, the
> text
> > and JSON reader do not handle filtering. Filtering is instead done by the
> > Filter operator in these cases. Perhaps you have your own special scan
> > which handles filtering?
> >
> >
> > The concern in DRILL-6340 was the user might do a project operation that
> > causes the output batch to be much larger than the input batch. Someone
> > suggested flatten as one example. String concatenation is another
> example.
> > The input batch might be large. The result of the concatenation could be
> > too large for available memory. So, the idea was to project the single
> > input batch into two (or more) output batches to control batch size.
> >
> >
> > II like how you've categorized the vectors into the set that Gandiva can
> > project, and the set that Drill must handle. Maybe you can extend this
> idea
> > for the case where input batches are split into multiple output batches.
> >
> >  Let Drill handle VarChar expressions that could increase column width
> > (such as the concatenate operator.) Let Drill decide the number of rows
> in
> > the output batch. Then, for the columns that Gandiva can handle, project
> > just those rows needed for the current output batch.
> >
> > Your solution might also be extended to handle the Gandiva library issue.
> > Since you are splitting vectors into the Drill group and the Gandiva
> group,
> > if Drill runs on a platform without Gandiva support, or if the Gandiva
> > library can't be found, just let all vectors fall into the Drill vector
> > group.
> >
> > If the user wants to use Gandiva, he/she could set a config option to
> > point to the Gandiva library (and supporting files, if any.) Or, use the
> > existing LD_LIBRARY_PATH env. variable.
> >
> > Thanks,
> > - Paul
> >
> >
> >
> > On Thursday, April 18, 2019, 11:45:08 PM PDT, weijie tong <
> > tongweijie...@gmail.com> wrote:
> >
> >  Hi Paul:
> > Currently Gandiva only supports Project ,Filter operations. My work is to
> > integrate Project operator. Since most of the Filter operator will be
> > pushed down to the Scan.
> >
> > The Gandiva project interface works at the RecordBatch level. It accepts
> > the memory address of the vectors of  input RecordBatch and . Before that
> > it also need to construct a binary schema object to describe the input
> > RecordBatch schema.
> >
> > The integration work mainly has two parts:
> >   1. at the setup step, find the expressions which can be solved by the
> > Gandiva . The matched expression will be solved by the Gandiva, others
> will
> > still be solved by Drill.
> >   2. invoking the Gandiva native project method. The matched expressions'
> > ValueVectors will all be allocated corresponding Arrow type null
> > representation ValueVector. The null input vector's bit  will also be
> set.
> > The same work will also be done to the output ValueVectors, transfer the
> > arrow output null vector to Drill's null vector. Since the native method
> > only care the physical memory address, invoking that native method is
> not a
> > hard work.
> >
> > Since my current implementation is before DRILL-6340, it does not solve
> the
> > output size of the project which is less than the input size case. To
> cover
> > that case , there's some more work to do which I have not focused on.
> >
> > To contribute to community , there's also some test case problem which
> > needs to be considered, since the Gandiva jar is platform dependent.
> >
> >
> >
> >
> > On Fri, Apr 19, 2019 at 8:43 AM Paul Rogers 
> > wrote:
> >
> > > Hi Weijie,
> > >
> > > Thanks much for the update on your Gandiva work. It is great work.
> > >
> > > Can you say more about how you are doing the integration?
> > >
> > > As you mentioned the memory layout of Arrow's null vector differs from
> > the
> > > "is set" vector in Drill. 

[jira] [Created] (DRILL-7195) Query returns incorrect result or does not fail when cast with is null is used in filter condition

2019-04-23 Thread Volodymyr Vysotskyi (JIRA)
Volodymyr Vysotskyi created DRILL-7195:
--

 Summary: Query returns incorrect result or does not fail when cast 
with is null is used in filter condition
 Key: DRILL-7195
 URL: https://issues.apache.org/jira/browse/DRILL-7195
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.16.0
Reporter: Volodymyr Vysotskyi
 Fix For: 1.16.0


1. For the case when a query contains a filter with a {{cast}} which cannot be 
performed, used together with {{is null}}, the query does not fail:
{code:sql}
select * from dfs.tmp.`a.json` as t where cast(t.a as integer) is null;
+---+
| a |
+---+
+---+
No rows selected (0.142 seconds)
{code}
where
{noformat}
cat /tmp/a.json
{"a":"aaa"}
{noformat}
But for the case when this condition is specified in the project, the query 
fails, as expected:
{code:sql}
select cast(t.a as integer) is null from dfs.tmp.`a.json` t;
Error: SYSTEM ERROR: NumberFormatException: aaa

Fragment 0:0

Please, refer to logs for more information.

[Error Id: ed3982ce-a12f-4d63-bc6e-cafddf28cc24 on user515050-pc:31010] 
(state=,code=0)
{code}
This is a regression: for Drill 1.15 both the first and the second queries 
fail:
{code:sql}
select * from dfs.tmp.`a.json` as t where cast(t.a as integer) is null;
Error: SYSTEM ERROR: NumberFormatException: aaa

Fragment 0:0

Please, refer to logs for more information.

[Error Id: 2f878f15-ddaa-48cd-9dfb-45c04db39048 on user515050-pc:31010] 
(state=,code=0)
{code}
2. For the case when {{drill.exec.functions.cast_empty_string_to_null}} is 
enabled, this issue will cause wrong results:
{code:sql}
alter system set `drill.exec.functions.cast_empty_string_to_null`=true;

select * from dfs.tmp.`a1.json` t where cast(t.a as integer) is null;
+---+
| a |
+---+
+---+
No rows selected (1.759 seconds)
{code}
where
{noformat}
cat /tmp/a1.json 
{"a":"1"}
{"a":""}
{noformat}
Result for Drill 1.15.0:
{code:sql}
select * from dfs.tmp.`a1.json` t where cast(t.a as integer) is null;
++
| a  |
++
||
++
1 row selected (1.724 seconds)
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (DRILL-7194) Wrong result when non-deterministic functions are used in filter

2019-04-23 Thread Volodymyr Vysotskyi (JIRA)
Volodymyr Vysotskyi created DRILL-7194:
--

 Summary: Wrong result when non-deterministic functions are used in 
filter
 Key: DRILL-7194
 URL: https://issues.apache.org/jira/browse/DRILL-7194
 Project: Apache Drill
  Issue Type: Bug
Reporter: Volodymyr Vysotskyi
 Fix For: Future


Drill returns the wrong result when non-deterministic functions are used in a 
filter condition. For example, the following query:
{code:sql}
select 1 from (values(1)) where random()=random();
{code}
returns
{noformat}
++
| EXPR$0 |
++
| 1  |
++
1 row selected (0.105 seconds)
{noformat}
but {{random()=random()}} should be {{false}}, and therefore the query 
shouldn't return any rows.

If this condition is used in a projection, it returns the correct result:
{code:sql}
select random()=random();
{code}
returns
{noformat}
++
| EXPR$0 |
++
| false  |
++
1 row selected (1.558 seconds)
{noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)