[jira] [Created] (DRILL-5589) JDBC client crashes after successful authentication if trace logging is enabled.

2017-06-14 Thread Sorabh Hamirwasia (JIRA)
Sorabh Hamirwasia created DRILL-5589:


         Summary: JDBC client crashes after successful authentication if trace logging is enabled.
             Key: DRILL-5589
             URL: https://issues.apache.org/jira/browse/DRILL-5589
         Project: Apache Drill
      Issue Type: Bug
Affects Versions: 1.11.0
        Reporter: Sorabh Hamirwasia
        Assignee: Sorabh Hamirwasia


With the latest changes, once authentication completes we [dispose the 
saslClient instance | 
https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/rpc/security/AuthenticationOutcomeListener.java#L295]
 if encryption is not enabled. Later, the caller tries to [log the 
mechanism name | 
https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/rpc/security/AuthenticationOutcomeListener.java#L136]
 using the saslClient instance at trace logging level. This causes the client 
to crash, because the saslClient instance has already been disposed by the 
time the logging call runs. 
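
The ordering problem described above can be sketched roughly as follows. This 
is not the Drill code: StubSaslClient, disposeThenLog and logThenDispose are 
hypothetical names, and the stub's behavior of throwing after dispose() is an 
assumption made to imitate the reported crash. Only the javax.security.sasl 
interface is real.

```java
import javax.security.sasl.SaslClient;
import javax.security.sasl.SaslException;

final class SaslTraceSketch {

  /** Hypothetical stand-in for a real SASL client: most calls fail once disposed. */
  static final class StubSaslClient implements SaslClient {
    private boolean disposed;

    private void checkNotDisposed() {
      if (disposed) {
        throw new IllegalStateException("saslClient already disposed");
      }
    }

    @Override public String getMechanismName() { checkNotDisposed(); return "PLAIN"; }
    @Override public boolean hasInitialResponse() { checkNotDisposed(); return false; }
    @Override public byte[] evaluateChallenge(byte[] challenge) { checkNotDisposed(); return new byte[0]; }
    @Override public boolean isComplete() { checkNotDisposed(); return true; }
    @Override public byte[] unwrap(byte[] incoming, int offset, int len) throws SaslException {
      throw new SaslException("no security layer negotiated");
    }
    @Override public byte[] wrap(byte[] outgoing, int offset, int len) throws SaslException {
      throw new SaslException("no security layer negotiated");
    }
    @Override public Object getNegotiatedProperty(String name) { checkNotDisposed(); return null; }
    @Override public void dispose() { disposed = true; }
  }

  /** Mirrors the reported order: dispose first, then read the mechanism name for logging. */
  static String disposeThenLog(SaslClient client, boolean encryptionEnabled) {
    if (!encryptionEnabled) {
      disposeUnchecked(client);
    }
    return client.getMechanismName(); // fails with the stub: client is already disposed
  }

  /** One possible fix (an assumption): capture the name while the client is still live. */
  static String logThenDispose(SaslClient client, boolean encryptionEnabled) {
    String mechanism = client.getMechanismName();
    if (!encryptionEnabled) {
      disposeUnchecked(client);
    }
    return mechanism;
  }

  private static void disposeUnchecked(SaslClient client) {
    try {
      client.dispose();
    } catch (SaslException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("mechanism: " + logThenDispose(new StubSaslClient(), false));
    try {
      disposeThenLog(new StubSaslClient(), false);
    } catch (IllegalStateException e) {
      System.out.println("crash reproduced: " + e.getMessage());
    }
  }
}
```

Reading the mechanism name into a local variable before dispose() is only one 
way to avoid the problem; skipping the trace message or deferring the dispose 
would work as well.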



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (DRILL-5588) Hash Aggregate: Avoid copy on output of aggregate columns

2017-06-14 Thread Boaz Ben-Zvi (JIRA)
Boaz Ben-Zvi created DRILL-5588:

         Summary: Hash Aggregate: Avoid copy on output of aggregate columns
             Key: DRILL-5588
             URL: https://issues.apache.org/jira/browse/DRILL-5588
         Project: Apache Drill
      Issue Type: Improvement
      Components: Execution - Relational Operators
Affects Versions: 1.10.0
        Reporter: Boaz Ben-Zvi


 When the Hash Aggregate operator outputs its result batches downstream, the 
key columns (value vectors) are returned as is, but for the aggregate columns 
new value vectors are allocated and the values are copied (see the method 
allocateOutgoing()). This has an impact on performance. A second effect is on 
memory management, as this allocation is not planned for by the code that 
controls spilling, etc.
   For some simple aggregate functions (e.g. SUM), the stored value vectors for 
the aggregate values can be returned as is. For functions like AVG, there is a 
need to divide the SUM values by the COUNT values. Still, this can be done 
in place (overwriting the SUM values), avoiding a new allocation and copy. 
   For VarChar type aggregate values (only used by MAX or MIN), there is 
another issue: currently any such value vector is allocated as an 
ObjectVector (see BatchHolder()), on the JVM heap rather than in direct memory. 
This is to manage the sizes of the values, which could change as the 
aggregation progresses (e.g., for MAX(name), the first record has 'abe', but the 
next record has 'benjamin', which is both bigger ('b' > 'a') and longer). For 
the final output, this requires a new allocation and a copy in order to have a 
compact value vector in direct memory. Maybe the ObjectVector could be replaced 
with some direct memory implementation that is optimized for "good" values 
(e.g., all are of similar size) but penalizes "bad" values (e.g., reallocates 
or moves values when needed)?
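
As a rough illustration of the in-place AVG idea above, here is a minimal 
sketch that uses plain Java arrays as stand-ins for the SUM and COUNT value 
vectors. The class and method names are hypothetical, Drill's value vector API 
is not used, and leaving groups with a zero count untouched is an assumption 
of the sketch rather than Drill's behavior.

```java
import java.util.Arrays;

final class InPlaceAvgSketch {

  /**
   * Overwrites each SUM slot with SUM / COUNT, so the same buffer can be
   * handed downstream as the AVG column without a second allocation.
   */
  static void averageInPlace(double[] sums, long[] counts) {
    if (sums.length != counts.length) {
      throw new IllegalArgumentException("sums and counts must have the same length");
    }
    for (int i = 0; i < sums.length; i++) {
      if (counts[i] != 0) { // assumption: groups with no rows are left as is
        sums[i] /= counts[i];
      }
    }
  }

  public static void main(String[] args) {
    double[] sums = {10.0, 9.0};
    long[] counts = {4, 3};
    averageInPlace(sums, counts);
    System.out.println(Arrays.toString(sums)); // prints [2.5, 3.0]
  }
}
```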


Re: Drill Summit/Conference Proposal

2017-06-14 Thread Julian Hyde
I like the idea of co-hosting a conference. ApacheCon in particular is a good 
venue, and they explicitly encourage sub-conferences (there are “Big Data” and 
“IoT” tracks, and this year there were sub-conferences for Tomcat and 
CloudStack). If DrillCon were part of ApacheCon, people could attend a whole 
day of Drill talks, or they could go to talks about other Apache projects in 
the larger conference, and connect with Apache members.

Also, the conference is professionally organized, at a large hotel with good 
facilities.

Unfortunately, ApacheCon just happened (it was in Miami in May), but it’s 
something to consider next year.

Julian


> On Jun 14, 2017, at 9:18 AM, Charles Givre wrote:
> 
> Hi Bob,
> Good to hear from you. I agree that there could be value in having a joint
> Presto/Drill/Redshift conference, but how would you describe the overall
> theme?
> 
> In essence (not looking to start a flame war here...) these tools are
> similar in terms of what the user experiences and I can definitely see
> value in bringing the communities together.  I also like the idea of
> multiple tracks.  I was thinking of having something like developer/analyst
> tracks.
> -- C
> 
> On Wed, Jun 14, 2017 at 11:27 AM, Bob Rudis wrote:
> 
>> I grok this is the Drill list and I'm also a big user of Drill (and
>> have made some UDFs) but there might be some efficacy in expanding the
>> scope to the Presto and Redshift Spectrum communities. I'm not
>> claiming there's 100% equivalence, but the broader view of being able
>> to access multiple types of data sources from a central platform is
>> compelling and -- at least in my circles -- not widely known. It could
>> be an event with a primary central track but three separate
>> specializations that have a couple intraday time pockets.
>> 
>> On Wed, Jun 14, 2017 at 8:55 AM, Charles Givre wrote:
>>> Hello fellow Drill users and developers,
>>> I've been involved with the Drill community for some time, and I was
>>> thinking that it might be time to start exploring the idea of a Drill
>>> Summit or Conference.  If you're interested, please send me a note and I'll
>>> start having some conversations about what's next.
>>> 
>>> Personally, I think it could be extremely valuable to get Drill developers
>>> and users together and share ideas about where things are and how people
>>> are using Drill.
>>> Thanks!!
>>> -- C
>> 



Re: Drill Summit/Conference Proposal

2017-06-14 Thread Bob Rudis
I grok this is the Drill list and I'm also a big user of Drill (and
have made some UDFs) but there might be some efficacy in expanding the
scope to the Presto and Redshift Spectrum communities. I'm not
claiming there's 100% equivalence, but the broader view of being able
to access multiple types of data sources from a central platform is
compelling and -- at least in my circles -- not widely known. It could
be an event with a primary central track but three separate
specializations that have a couple intraday time pockets.

On Wed, Jun 14, 2017 at 8:55 AM, Charles Givre wrote:
> Hello fellow Drill users and developers,
> I've been involved with the Drill community for some time, and I was
> thinking that it might be time to start exploring the idea of a Drill
> Summit or Conference.  If you're interested, please send me a note and I'll
> start having some conversations about what's next.
>
> Personally, I think it could be extremely valuable to get Drill developers
> and users together and share ideas about where things are and how people
> are using Drill.
> Thanks!!
> -- C


Re: Drill Summit/Conference Proposal

2017-06-14 Thread Charles Givre
Hi Bob,
Good to hear from you. I agree that there could be value in having a joint
Presto/Drill/Redshift conference, but how would you describe the overall
theme?

In essence (not looking to start a flame war here...) these tools are
similar in terms of what the user experiences and I can definitely see
value in bringing the communities together.  I also like the idea of
multiple tracks.  I was thinking of having something like developer/analyst
tracks.
-- C

On Wed, Jun 14, 2017 at 11:27 AM, Bob Rudis wrote:

> I grok this is the Drill list and I'm also a big user of Drill (and
> have made some UDFs) but there might be some efficacy in expanding the
> scope to the Presto and Redshift Spectrum communities. I'm not
> claiming there's 100% equivalence, but the broader view of being able
> to access multiple types of data sources from a central platform is
> compelling and -- at least in my circles -- not widely known. It could
> be an event with a primary central track but three separate
> specializations that have a couple intraday time pockets.
>
> On Wed, Jun 14, 2017 at 8:55 AM, Charles Givre wrote:
> > Hello fellow Drill users and developers,
> > I've been involved with the Drill community for some time, and I was
> > thinking that it might be time to start exploring the idea of a Drill
> > Summit or Conference.  If you're interested, please send me a note and I'll
> > start having some conversations about what's next.
> >
> > Personally, I think it could be extremely valuable to get Drill developers
> > and users together and share ideas about where things are and how people
> > are using Drill.
> > Thanks!!
> > -- C
>


Drill Summit/Conference Proposal

2017-06-14 Thread Charles Givre
Hello fellow Drill users and developers,
I've been involved with the Drill community for some time, and I was
thinking that it might be time to start exploring the idea of a Drill
Summit or Conference.  If you're interested, please send me a note and I'll
start having some conversations about what's next.

Personally, I think it could be extremely valuable to get Drill developers
and users together and share ideas about where things are and how people
are using Drill.
Thanks!!
-- C