Re: Dynamic UDFs support

2016-06-21 Thread yuliya Feldman
Just thoughts:
You can try to reuse the distributed cache. Let the Drill AM do the needful in 
terms of orchestrating UDF jar distribution.
But I would be inclined to have a common path that is independent of whether 
it is Drill on YARN or not, as maintaining two separate ways of dealing with 
loading/unloading UDFs will be painful and error-prone.
One more note (I left a comment in the doc): I am not sure about the 
authorization model here. We need to have one.
Just my 2c.
Thanks

  From: Paul Rogers 
 To: "dev@drill.apache.org"  
 Sent: Monday, June 20, 2016 7:32 PM
 Subject: Re: Dynamic UDFs support
   
Hi Neeraja,

The proposal calls for the user to copy the jar file to each Drillbit node. The 
jar would go into a new $DRILL_HOME/jars/3rdparty/udf directory.

In Drill-on-YARN (DoY), YARN is responsible for copying Drill code to each node 
(which is good.) YARN puts that code in a location known only to YARN. Since 
the location is private to YARN, the user can’t easily hunt down the location 
in order to add the udf jar. Even if the user did find the location, the next 
Drillbit to start would create a new copy of the Drill software, without the 
udf jar.

Second, in DoY we have separated user files from Drill software. This makes it 
much easier to distribute the software to each node: we give the Drill 
distribution tar archive to YARN, and YARN copies it to each node and untars 
the Drill files. We make a separate copy of the (far smaller) set of user 
config files.

If the udf jar goes into a Drill folder ($DRILL_HOME/jars/3rdparty/udf), then 
the user would have to rebuild the Drill tar file each time they add a udf jar. 
When I tried this myself when building DoY, I found it to be slow and 
error-prone.

So, the solution is to place the udf code in the new “site” directory: 
$DRILL_SITE/jars. That’s what that is for. Then, let DoY automatically 
distribute the code to every node. Perfect! Except that it does not work to 
dynamically distribute code after Drill starts.

For DoY, the solution requirements are:

1. Distribute code using Drill itself, rather than manually copying jars to 
(unknown) Drill directories.
2. Ensure the solution works even if another Drillbit is spun up later, and 
uses the original Drill tar file.

I’m thinking we want to leverage DFS: place udf files into a well-known DFS 
directory. Register the udf into, say, ZK. When a new Drillbit starts, it looks 
for new udf jars in ZK, copies the file to a temporary location, and launches. 
An existing Drill is notified of the change and does the same download process. 
Clean-up is needed at some point to remove ZK entries if the udf jar becomes 
statically available on the next launch. That needs more thought.

We’d still need the phases mentioned earlier to ensure consistency.
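
The phases referred to here (load everywhere, await confirmation, then enable planning, as listed in the quoted message further down) can be sketched roughly as follows. This is only an illustration under stated assumptions: `PhasedRegistrationSketch`, `Node`, `cluster`, and `register` are hypothetical names, and the in-memory sets stand in for whatever registry and messaging Drill would actually use.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class PhasedRegistrationSketch {

    /** Hypothetical stand-in for one Drillbit's view of registered functions. */
    static final class Node {
        final boolean healthy;                          // false simulates a node that cannot load the jar
        final Set<String> loaded = new HashSet<>();     // usable by the execution engine
        final Set<String> plannable = new HashSet<>();  // visible to the planner

        Node(boolean healthy) {
            this.healthy = healthy;
        }

        /** Pass 1: load the function; the return value is the Pass 2 confirmation. */
        boolean load(String fn) {
            if (healthy) {
                loaded.add(fn);
            }
            return healthy;
        }

        /** Pass 3: the planner may now use the function. */
        void enablePlanning(String fn) {
            plannable.add(fn);
        }
    }

    /** Builds a list of healthy nodes (helper for experiments only). */
    static List<Node> cluster(int size) {
        List<Node> nodes = new ArrayList<>();
        for (int i = 0; i < size; i++) {
            nodes.add(new Node(true));
        }
        return nodes;
    }

    /** Returns true only if every node confirmed the load before planning was enabled anywhere. */
    static boolean register(String fn, List<Node> nodes) {
        for (Node n : nodes) {          // Pass 1 and Pass 2
            if (!n.load(fn)) {
                return false;           // no node has been told to plan with fn yet
            }
        }
        for (Node n : nodes) {          // Pass 3
            n.enablePlanning(fn);
        }
        return true;
    }
}
```

The point of the ordering is that no node can plan a query with the function until every node is able to execute it, which is what closes the race described in the quoted message.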

Suggestions anyone as to how to do this super simply & still get it to work 
with DoY?

Thanks,

- Paul
 
> On Jun 20, 2016, at 7:18 PM, Neeraja Rentachintala 
>  wrote:
> 
> This will need to work with YARN (Once Drill is YARN enabled, I would
> expect a lot of users using it in conjunction with YARN).
> Paul, I am not clear why this wouldn't work with YARN. Can you elaborate?
> 
> -Neeraja
> 
> On Mon, Jun 20, 2016 at 7:01 PM, Paul Rogers  wrote:
> 
>> Good enough, as long as we document the limitation that this feature can’t
>> work with YARN deployment as users generally do not have access to the
>> temporary “localization” directories where the Drill code is placed by YARN.
>> 
>> Note that the jar distribution race condition issue occurs with the
>> proposed design: I believe I sketched out a scenario in one of the earlier
>> comments. Drillbit A receives the CREATE FUNCTION command. It tells
>> Drillbit B. While A is still informing the other Drillbits, Drillbit B plans
>> and launches a query that uses the function. Drillbit Z starts execution of
>> the query before it learns from A about the new function. This will be rare:
>> just rare enough to create very hard-to-reproduce bugs.
>> 
>> The only reliable solution is to do the work in multiple passes:
>> 
>> Pass 1: Ask each node to load the function, but not make it available to
>> the planner. (it would be available to the execution engine.)
>> Pass 2: Await confirmation from each node that this is done.
>> Pass 3: Alert every node that it is now free to plan queries with the
>> function.
>> 
>> Finally, I wonder if we should design the SQL syntax based on a long-term
>> design, even if the feature itself is a short-term work-around. Changing
>> the syntax later might break scripts that users might write.
>> 
>> So, the question for the group is this: is the value of a semi-complete
>> feature sufficient to justify the potential problems?
>> 
>> - Paul
>> 
>>> On Jun 20, 2016, at 6:15 PM, Parth Chandra 
>> wrote:
>>> 
>>> Moving discussion to dev.
>>> 
>>> I believe the aim is to do a simple implementation without the complexity
>>> of distributing the UDF. I think the document should make this limitation
>>>

Re: Getting java.lang.VerifyError: class io.netty.buffer.UnsafeDirectLittleEndian

2016-07-19 Thread yuliya Feldman
This could be a Netty version mismatch between the version Drill is using and 
the one your project is using.
In netty-4.0.27.Final, clear() is not "final".
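
One way to check for such a mismatch is to ask the JVM which jar a class was actually loaded from. The following is a minimal sketch using only standard JDK calls; the class name `ClassOrigin` and the method `describe` are made up for illustration.

```java
import java.security.CodeSource;

class ClassOrigin {

    /** Reports the jar or directory a class is loaded from, without initializing the class. */
    static String describe(String className) {
        try {
            Class<?> c = Class.forName(className, false, ClassOrigin.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            String where = (src == null)
                    ? "(bootstrap or unknown code source)"
                    : String.valueOf(src.getLocation());
            return className + " -> " + where;
        } catch (ClassNotFoundException | LinkageError e) {
            // A VerifyError is a LinkageError, so a class that fails to link ends up here too.
            return className + " -> not loadable: " + e;
        }
    }
}
```

Printing `ClassOrigin.describe("io.netty.buffer.ByteBuf")` from inside the failing application should show whether `ByteBuf` comes from the Netty jar Drill expects or from the one pulled in by the other library.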


  From: Rajesh Chejerla 
 To: u...@drill.apache.org; dev@drill.apache.org 
 Sent: Tuesday, July 19, 2016 6:27 AM
 Subject: Getting java.lang.VerifyError: class 
io.netty.buffer.UnsafeDirectLittleEndian
   
Hi,

I'm getting a "java.lang.VerifyError: class
io.netty.buffer.UnsafeDirectLittleEndian" error while getting a connection to
the database. This happens when I use another library (vert.x-web) along
with Apache Drill.

Could you please help with this issue?

java.lang.VerifyError: class io.netty.buffer.UnsafeDirectLittleEndian overrides final method clear.()Lio/netty/buffer/ByteBuf;
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at io.netty.buffer.PooledByteBufAllocatorL.(PooledByteBufAllocatorL.java:56)
    at org.apache.drill.exec.memory.AllocationManager.(AllocationManager.java:60)
    at org.apache.drill.exec.memory.BaseAllocator.(BaseAllocator.java:44)
    at org.apache.drill.exec.memory.RootAllocatorFactory.newRoot(RootAllocatorFactory.java:38)
    at org.apache.drill.jdbc.impl.DrillConnectionImpl.(DrillConnectionImpl.java:140)
    at org.apache.drill.jdbc.impl.DrillJdbc41Factory.newDrillConnection(DrillJdbc41Factory.java:64)
    at org.apache.drill.jdbc.impl.DrillFactory.newConnection(DrillFactory.java:69)
    at net.hydromatic.avatica.UnregisteredDriver.connect(UnregisteredDriver.java:126)
    at org.apache.drill.jdbc.Driver.connect(Driver.java:72)
    at java.sql.DriverManager.getConnection(DriverManager.java:664)
    at java.sql.DriverManager.getConnection(DriverManager.java:270)
    at com.gainsight.services.data.transformer.drill.api.impl.DrillServiceImpl.submitDrillQuerySync(DrillServiceImpl.java:48)
    at com.gainsight.services.dataprocessing.dataprocessor.dagdataprocessor.nirmata.utils.NirmataUtils.transformQuery(NirmataUtils.java:54)
    at com.gainsight.services.dataprocessing.dataprocessor.dagdataprocessor.nirmata.executors.TrasnsformExecutor$1.execute(TrasnsformExecutor.java:30)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at com.google.common.util.concurrent.MoreExecutors$DirectExecutorService.execute(MoreExecutors.java:310)
    at com.nirmata.workflow.details.WorkflowManagerImpl.executeTask(WorkflowManagerImpl.java:553)
    at com.nirmata.workflow.details.WorkflowManagerImpl.lambda$null$9(WorkflowManagerImpl.java:591)
    at com.nirmata.workflow.queue.zookeeper.SimpleQueue.processNode(SimpleQueue.java:274)
    at com.nirmata.workflow.queue.zookeeper.SimpleQueue.runLoop(SimpleQueue.java:228)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)


-- 

Thanks & Regards,
Rajesh Chejerla


  

Re: Dynamic UDFs support

2016-07-26 Thread yuliya Feldman
I want to make sure (also will make a note in the design doc) that we have an 
option to disable dynamic loading/unloading of UDFs until we are able to do 
proper authentication AND authorization of the user(s).

  From: Arina Yelchiyeva 
 To: dev@drill.apache.org 
 Sent: Monday, July 25, 2016 9:09 AM
 Subject: Re: Dynamic UDFs support
   
My fault, agree, DROP is more appropriate.
Thanks Julian!

On Mon, Jul 25, 2016 at 7:07 PM Julian Hyde  wrote:

> But don't call it DELETE. In SQL the opposite of CREATE is DROP.
>
> Julian
>
> > On Jul 25, 2016, at 8:48 AM, Keys Botzum  wrote:
> >
> > I like the approach to handling DELETE. This is very useful. I think an
> > implementation that does not guarantee consistent behavior is perfectly
> > fine for use that is targeted at developers that are working on UDFs. As
> > long as the docs make the intent clear this makes me very happy.
> >
> > I'll defer to others more expert than I on the remainder of the design.
> >
> > Keys
> > ___
> > Keys Botzum
> > Senior Principal Technologist
> > kbot...@maprtech.com 
> > 443-718-0098
> > MapR Technologies
> > http://www.mapr.com 
> >> On Jul 25, 2016, at 9:55 AM, Arina Yelchiyeva
> >> <arina.yelchiy...@gmail.com> wrote:
> >>
> >> Taking into account all previous comments and the discussion we had with
> >> Parth and Paul, please find below my design notes (I am going to prepare
> >> a proper design document, I just want to see if all agree with the raw
> >> version).
> >> I propose to use lazy-init for dynamically loaded UDFs: when the user
> >> issues the CREATE UDF command, the foreman will only validate the jar and
> >> update the ZK function registry, and only if a function is needed will it
> >> be loaded to the appropriate drillbit (during the planning stage or
> >> fragment execution). We might add listeners (as Paul proposed) to pre-load
> >> UDFs; I didn't include this in the current release to simplify the
> >> solution, but we might reconsider.
> >> I have looked at the issue with class loading and unloading, and if we
> >> ship each jar with its own classloader, DELETE functionality can be
> >> introduced in the current release, at least marked as experimental or for
> >> developer use only, to ease the UDF development process.
> >>
> >> Any comments are welcome.
> >>
> >> *Invariants*
> >>
> >> 1. DFS staging area where the user copies the jar to be loaded
> >>
> >> 2. DFS udf area (former registration area) where all validated jars are
> >> present
> >>
> >> 3. ZK function registry - contains the list of all dynamically loaded
> >> UDFs and their jars. A UDF name will be represented as a combination of
> >> name and input parameters.
> >>
> >> 4. Lazy-init - all dynamically loaded UDFs will be loaded to a drillbit
> >> upon request, i.e. if the drillbit receives a query or fragment that
> >> contains such a UDF
> >>
> >> 5. Currently only CREATE and DELETE statements are supported
> >>
> >>
> >> *Adding UDFs*
> >>
> >> 1. User copies source and binary (hereinafter jar) to DFS staging area
> >> 2. User issues CREATE UDF command
> >> 3. Foreman receives request to create UDF:
> >>    a) checks if jar is present in staging area
> >>    b) copies jar to temporary DFS location
> >>    c) validates UDFs present in jar locally:
> >>       1) copies jar to temporary local fs
> >>       2) scans jar using temporary classloader
> >>       3) checks if there are any duplicates in local function registry
> >>       4) returns list of UDFs to be registered
> >>    d) validates UDFs present in jar in ZK:
> >>       1) takes list of dynamically loaded UDFs from ZK
> >>       2) checks if there are no duplicates either by jar name or among UDFs
> >>       3) moves jar from DFS temporary area to DFS udf area
> >>       4) updates ZK with list of new dynamic UDFs
> >>       5) removes jar from staging area
> >>       6) returns confirmation to user that UDFs were registered
> >>
> >>
> >> *Lazy-init*
> >>
> >> 1. User issues query with dynamically loaded UDF.
> >>
> >> 2. During planning stage or fragment execution, if UDF is not present in
> >> local function registry, drillbit:
> >>
> >> a) checks if such UDF is present in ZK function registry
> >>
> >> b) if present, loads UDF using jar name, otherwise returns an error
> >>
> >> c) proceeds with planning stage or fragment execution
> >>
> >>
> >> *New drillbit registration / Drillbit re-start*
> >>
> >> Local udf directory is re-created, to clean up previously loaded jars
> >> if any
> >>
> >>
> >> *Delete UDF*
> >>
> >> Each jar that is going to be loaded dynamically will have its own
> >> classloader, which will solve the problem of loading and unloading
> >> classes with the same name.
> >>
> >>
> >> 1. User issues DELETE command (delete will operate on jar name level)
> >>
> >> 2. Foreman receives DELETE request:
> >>
> >> a) checks if such jar is present in ZK function registry
> >>
> >> b) creates ephemeral znode /udf/delete/jar_name
> >>
> >> c) removes record in ZK function registry
> >>
> >> d) removes 
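
The Lazy-init steps in the design notes above (check the local registry, fall back to the ZK registry, load by jar name, otherwise fail) might look roughly like this. It is a sketch under assumptions, not the proposed implementation: `LazyInitSketch`, `resolve`, and `jarLoads` are hypothetical names, and a plain `Map` stands in for the ZK function registry.

```java
import java.util.HashMap;
import java.util.Map;

class LazyInitSketch {
    // function signature -> jar name, for functions already loaded on this drillbit
    private final Map<String, String> localRegistry = new HashMap<>();
    // function signature -> jar name; hypothetical stand-in for the ZK function registry
    private final Map<String, String> remoteRegistry;
    // counts simulated jar loads so the lazy behaviour is observable
    int jarLoads = 0;

    LazyInitSketch(Map<String, String> remoteRegistry) {
        this.remoteRegistry = remoteRegistry;
    }

    /** Returns the jar providing the function, loading it on first use. */
    String resolve(String signature) {
        String jar = localRegistry.get(signature);
        if (jar != null) {
            return jar;                          // already loaded locally
        }
        jar = remoteRegistry.get(signature);     // step a) check the shared registry
        if (jar == null) {
            throw new IllegalArgumentException("Unknown function: " + signature);
        }
        jarLoads++;                              // step b) stand-in for loading the jar by name
        localRegistry.put(signature, jar);
        return jar;                              // step c) caller proceeds with planning or execution
    }
}
```

A function that is never used by a query on a given drillbit is never loaded there, and a function used repeatedly is loaded only once.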

Re: Dynamic UDFs support

2016-07-26 Thread yuliya Feldman
Thank you Arina
Yuliya

  From: Arina Yelchiyeva 
 To: dev@drill.apache.org; yuliya Feldman  
 Sent: Tuesday, July 26, 2016 10:11 AM
 Subject: Re: Dynamic UDFs support
   
Sure, I'll add this option. I'll send a link to final document once it's
done.

On Tue, Jul 26, 2016 at 8:06 PM Keys Botzum  wrote:

> +1
>
> Keys
> ___
> Keys Botzum
> Senior Principal Technologist
> kbot...@maprtech.com
> 443-718-0098
> MapR Technologies
> http://www.mapr.com
> > On Jul 26, 2016, at 1:05 PM, yuliya Feldman 
> wrote:
> >
> > I want to make sure (also will make a note in the design doc) that we
> > have an option to disable dynamic loading/unloading of UDFs until we are
> > able to do proper authentication AND authorization of the user(s).

Re: [ANNOUNCE] - New Apache Drill Committer - Neeraja Rentachintala

2016-11-18 Thread yuliya Feldman
Congratulations Neeraja!!!

  From: Parth Chandra 
 To: dev  
 Sent: Thursday, November 17, 2016 11:10 AM
 Subject: [ANNOUNCE] - New Apache Drill Committer - Neeraja Rentachintala
   
On behalf of the Apache Drill PMC, I am very pleased to announce that
Neeraja Rentachintala has accepted the invitation to become a committer in
the project.


Welcome Neeraja !


   

Re: [ANNOUNCE] - New Apache Drill Committer - Chris Westin

2016-12-01 Thread yuliya Feldman
Congratulations Chris!!!

  From: Jacques Nadeau 
 To: dev  
 Sent: Thursday, December 1, 2016 8:54 AM
 Subject: [ANNOUNCE] - New Apache Drill Committer - Chris Westin
   
On behalf of the Apache Drill PMC, I am very pleased to announce that Chris
Westin has accepted the invitation to become a committer in the project.

Welcome Chris and thanks for your great contributions!


--
Jacques Nadeau
CTO and Co-Founder, Dremio


   

Re: [ANNOUNCE] New Committer: Arina Ielchiieva

2017-02-26 Thread yuliya Feldman
Congratulations Arina!!!

  From: Arina Yelchiyeva 
 To: dev@drill.apache.org 
 Sent: Sunday, February 26, 2017 5:23 AM
 Subject: Re: [ANNOUNCE] New Committer: Arina Ielchiieva
   
Thank you all for congratulations! I really appreciate that.

Kind regards
Arina

On Sat, Feb 25, 2017 at 3:30 PM, Parth Chandra  wrote:

> Congratulations Arina. Welcome and thank you for your great work so far.
>
>
>
> On Fri, Feb 24, 2017 at 9:06 AM, Sudheesh Katkam 
> wrote:
>
> > The Project Management Committee (PMC) for Apache Drill has invited Arina
> > Ielchiieva to become a committer, and we are pleased to announce that she
> > has accepted.
> >
> > Arina has a long list of contributions [1] that have touched many aspects
> > of the product. Her work includes features such as dynamic UDF support
> and
> > temporary tables support.
> >
> > Welcome Arina, and thank you for your contributions.
> >
> > - Sudheesh, on behalf of the Apache Drill PMC
> >
> > [1] https://github.com/apache/drill/commits/master?author=
> arina-ielchiieva
> >
>


   

Re: [HANGOUT] Topics for 7/25/17

2017-07-25 Thread yuliya Feldman
Sorry for the late chime-in.
Just a note regarding S3: even after upgrading to Hadoop 2.8.x you may need to 
separately update the version of the AWS libraries, as the one provided with 
the upgrade does not support all the newly added regions.
Thanks,
Yuliya

  From: Arina Yelchiyeva 
 To: dev@drill.apache.org 
Cc: user 
 Sent: Tuesday, July 25, 2017 10:35 AM
 Subject: Re: [HANGOUT] Topics for 7/25/17
   
Meeting minutes 25 July 2017:

Attendees:
Rob, Vova, Sorabh, Pritesh, Paul, Aman, Padma, Jyothsna, Sindhuri.

Two topics were discussed.
1. Release candidate for 1.11.0.
Everybody is encouraged to test the release candidate and vote.
Aman asked about the release candidate performance testing.
Asked Kunal via email and he confirmed that performance testing is in
progress.

2. Upgrade to hadoop version 2.8.
Padma was looking into S3 connectivity issues and found out that switching
to Hadoop version 2.8.1 will solve these problems.
However, the hadoop release notes for 2.8.1 (and 2.8.0 as well) say the
following:
"Please note that 2.8.x release line continues to be not yet ready for
production use".
It was decided to wait until the next stable Hadoop release (hopefully
before the Drill 1.12.0 release)
and for now to document that users may switch to 2.8.1 themselves.


Thank you all for attending the hangout today.

Kind regards
Arina

On Tue, Jul 25, 2017 at 8:04 PM, Arina Yelchiyeva <
arina.yelchiy...@gmail.com> wrote:

> Hangouts is starting now...
>
> On Tue, Jul 25, 2017 at 7:41 AM, Padma Penumarthy 
> wrote:
>
>> I have a topic to discuss. A lot of folks on the user mailing list raised
>> the issue of not being able to access all S3 regions using Drill.
>> We need hadoop version 2.8 or higher to be able to connect to
>> regions which support only Version 4 signature.
>> I tried with 2.8.1, which just got released and it works i.e. I am able to
>> connect to both old and new regions (by specifying the endpoint in the
>> config).
>> There are some failures in unit tests, which can be fixed.
>>
>> Fixing S3 connectivity issues is important.
>> However, the hadoop release notes for 2.8.1 (and 2.8.0 as well) say the
>> following:
>> "Please note that 2.8.x release line continues to be not yet ready for
>> production use".
>>
>> So, should we move to 2.8.1 or not?
>>
>> Thanks,
>> Padma
>>
>>
>> On Jul 24, 2017, at 9:46 AM, Arina Yelchiyeva wrote:
>>
>> Hi all,
>>
>> We'll have the hangout tomorrow at the usual time [1]. Any topics to be
>> discussed?
>>
>> [1] https://drill.apache.org/community-resources/
>>
>> Kind regards
>> Arina
>>
>>
>

   

Re: [ANNOUNCE] New PMC member: Arina Ielchiieva

2017-08-03 Thread yuliya Feldman
Congrats Arina!!!
Very glad to see this happening.
Yuliya

  From: Arina Yelchiyeva 
 To: dev@drill.apache.org 
 Sent: Thursday, August 3, 2017 2:53 AM
 Subject: Re: [ANNOUNCE] New PMC member: Arina Ielchiieva
   
Thank you all!

Kind regards
Arina

On Thu, Aug 3, 2017 at 5:58 AM, Sudheesh Katkam  wrote:

> Congratulations and thank you, Arina.
>
> On Wed, Aug 2, 2017 at 1:38 PM, Paul Rogers  wrote:
>
> > The success of the Drill 1.11 release proves this is a well-deserved
> move.
> > Congratulations!
> >
> > - Paul
> >
> > > On Aug 2, 2017, at 11:23 AM, Aman Sinha  wrote:
> > >
> > > I am pleased to announce that Drill PMC invited Arina Ielchiieva to the
> > PMC
> > > and she has accepted the invitation.
> > >
> > > Congratulations Arina and thanks for your contributions !
> > >
> > > -Aman
> > > (on behalf of Drill PMC)
> >
> >
>


   

Re: compiling drill 1.11.0 with cdh profile

2017-08-08 Thread yuliya Feldman
Feels like you can't access: 
https://repo.maven.apache.org/maven2/org/apache/apache/14/apache-14.pom
and no other repo contains that POM.



  From: Dor Ben Dov 
 To: "dev@drill.apache.org"  
 Sent: Tuesday, August 8, 2017 1:12 AM
 Subject: RE: compiling drill 1.11.0 with cdh profile
   
Can anyone help me with this?

Dor

-----Original Message-----
From: Dor Ben Dov 
Sent: Monday, August 7, 2017 13:44
To: dev@drill.apache.org
Subject: compiling drill 1.11.0 with cdh profile

Hi all,

I tried to compile the source code of branch 1.11.0 with the cdh profile for 
Cloudera and am getting this exception. Any ideas?

[dor@dor-fedora64 drill]$ mvn -U -DskipTests clean install -Pcdh
[INFO] Scanning for projects...
Downloading: https://repository.cloudera.com/artifactory/cloudera-repos/org/apache/apache/14/apache-14.pom
Downloading: http://conjars.org/repo/org/apache/apache/14/apache-14.pom
Downloading: http://repository.mapr.com/maven/org/apache/apache/14/apache-14.pom
Downloading: http://repo.dremio.com/release/org/apache/apache/14/apache-14.pom
Downloading: http://repository.mapr.com/nexus/content/repositories/drill/org/apache/apache/14/apache-14.pom
Downloading: https://repo.maven.apache.org/maven2/org/apache/apache/14/apache-14.pom
[ERROR] [ERROR] Some problems were encountered while processing the POMs:
[FATAL] Non-resolvable parent POM for org.apache.drill:drill-root:1.11.0: Could not transfer artifact org.apache:apache:pom:14 from/to cloudera (https://repository.cloudera.com/artifactory/cloudera-repos/): repository.cloudera.com: Name or service not known and 'parent.relativePath' points at wrong local POM @ line 15, column 11
 @
[ERROR] The build could not read 1 project -> [Help 1]
[ERROR]
[ERROR]   The project org.apache.drill:drill-root:1.11.0 (/home/dor/Downloads/drill/pom.xml) has 1 error
[ERROR]     Non-resolvable parent POM for org.apache.drill:drill-root:1.11.0: Could not transfer artifact org.apache:apache:pom:14 from/to cloudera (https://repository.cloudera.com/artifactory/cloudera-repos/): repository.cloudera.com: Name or service not known and 'parent.relativePath' points at wrong local POM @ line 15, column 11: Unknown host repository.cloudera.com: Name or service not known -> [Help 2]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/ProjectBuildingException
[ERROR] [Help 2] http://cwiki.apache.org/confluence/display/MAVEN/UnresolvableModelException
[dor@dor-fedora64 drill]$


** I am using fedora 26 **

Regards,
Dor
This message and the information contained herein is proprietary and 
confidential and subject to the Amdocs policy statement,

you may review at https://www.amdocs.com/about/email-disclaimer 





   

Re: compiling drill 1.11.0 with cdh profile

2017-08-09 Thread yuliya Feldman
Could you click on the following link [1], or try to wget it from the machine 
you are building on?
As apache-14.pom is NOT found in any of the repos except [1], Maven tries to 
download it from there and fails. You may or may not have access to the other 
repos, but [1] does not look accessible from your build.
You say it works fine with the HW profile; or did you build that on a different 
machine? Maybe the HW Maven repo hosts apache-14.pom, or you have a different 
internet access pattern on the HW machine (if it is indeed a different machine).
[1]  https://repo.maven.apache.org/maven2/org/apache/apache/14/apache-14.pom

  From: Dor Ben Dov 
 To: "dev@drill.apache.org" ; yuliya Feldman 
 
 Sent: Tuesday, August 8, 2017 11:32 PM
 Subject: RE: compiling drill 1.11.0 with cdh profile
   
Yuliya, 
I ran 'mvn -U -Dskip.Tests clean install -Pcdh'. Isn't this enough?
What am I missing for Cloudera? 
When I build Drill for Hortonworks, it works well, by the way.

Regards,
Dor


   

Review Request 29835: DRILL-1926 - Fix back pressure logic

2015-01-12 Thread Yuliya Feldman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29835/
---

Review request for drill, Aditya Kishore, Jacques Nadeau, and Steven Phillips.


Bugs: DRILL-1926
https://issues.apache.org/jira/browse/DRILL-1926


Repository: drill-git


Description
---

Fix the back pressure logic, which was kicking in only when the number of 
elements in the queue was exactly equal to softLimit.
Also change it to release back pressure gradually rather than in a single shot.
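
As a rough illustration of the two changes described (this is not the code under review; `BackPressureSketch` and its members are hypothetical, and the "one held acknowledgement per consumed batch" release policy is an assumption), a receive queue could hold back acknowledgements whenever it is at or above the soft limit and release them one at a time as it drains:

```java
import java.util.ArrayDeque;
import java.util.Deque;

class BackPressureSketch {
    private final int softLimit;
    private final Deque<String> batches = new ArrayDeque<>();   // received, not yet consumed
    private final Deque<String> heldAcks = new ArrayDeque<>();  // acknowledgements withheld from senders
    private int acksSent = 0;

    BackPressureSketch(int softLimit) {
        this.softLimit = softLimit;
    }

    /** Receives a batch; the ack is withheld while the queue is at or above the soft limit. */
    void enqueue(String batch) {
        batches.addLast(batch);
        if (batches.size() >= softLimit) {   // ">=" rather than "==", so overshoot is still throttled
            heldAcks.addLast(batch);
        } else {
            acksSent++;
        }
    }

    /** Consumes a batch and, once below the soft limit, releases a single held ack. */
    String poll() {
        String batch = batches.pollFirst();
        if (batches.size() < softLimit && !heldAcks.isEmpty()) {
            heldAcks.pollFirst();            // gradual release: one per consumed batch
            acksSent++;
        }
        return batch;
    }

    int heldAcks() {
        return heldAcks.size();
    }

    int acksSent() {
        return acksSent;
    }
}
```

With an equality check, a third batch arriving at a queue with a soft limit of two would have been acknowledged immediately; with the comparison above it is held until the consumer catches up.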


Diffs
-

  
exec/java-exec/src/main/java/org/apache/drill/exec/work/batch/ResponseSenderQueue.java
 5a9316ebac83e39c6a66b3d4835bb2cea303908b 
  
exec/java-exec/src/main/java/org/apache/drill/exec/work/batch/UnlimitedRawBatchBuffer.java
 623a719d8ff54554f7e77bcda7b50de66b48c766 
  
exec/java-exec/src/test/java/org/apache/drill/exec/work/batch/TestUnlimitedBatchBuffer.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/29835/diff/


Testing
---

all suites


Thanks,

Yuliya Feldman



Review Request 30965: Follow up on DRILL-133 (LocalExchange) to save CPU cycles on hash generation when using in HashToLocalExchange

2015-02-12 Thread Yuliya Feldman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30965/
---

Review request for drill, Jacques Nadeau, Steven Phillips, and Venki Korukanti.


Bugs: DRILL-2209
https://issues.apache.org/jira/browse/DRILL-2209


Repository: drill-git


Description
---

Insert a Project operator to add a new column "EXPRHASH" with the hash expression 
for the fields that are used for HashToRandomExchange.
Remove the Project operator after HashToRandomExchange (or Demux), since it would 
otherwise create problems with field ordering in HashJoin.

Tie this to MuxExchange: if MuxExchange is enabled, the Project is inserted.


Diffs
-

  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/HashToRandomExchangePrel.java
 372c75d 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/visitor/InsertLocalExchangeVisitor.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/30965/diff/


Testing
---

Need to add unit tests. Tested live; ran Functional and TPCH tests.


Thanks,

Yuliya Feldman



Review Request 31107: Ability to make PartitionSender multithreaded - useful in case of LocalExchange being enabled, as it allows to deal with high volume of incoming data

2015-02-17 Thread Yuliya Feldman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31107/
---

Review request for drill, Chris Westin, Jacques Nadeau, Steven Phillips, and 
Venki Korukanti.


Bugs: DRILL-2210
https://issues.apache.org/jira/browse/DRILL-2210


Repository: drill-git


Description
---

In addition to the description in the summary:

Fixed a few classes that did not handle multithreading well.
Added/changed some Stats behavior to allow stats to be merged from multiple threads, 
since this class is not suitable for use in a multithreaded environment.
Introduced a new decorator class to handle multithreading (or not), to minimize 
changes to the PartitionSenderRootExec class.


Diffs
-

  exec/java-exec/src/main/java/org/apache/drill/exec/ops/OperatorStats.java 
0e9da0e 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/SendingAccountor.java
 7af7b65 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionSenderRootExec.java
 f09acaa 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/Partitioner.java
 5ed9c39 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionerDecorator.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionerTemplate.java
 4292c09 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/PlannerSettings.java
 faa8546 
  
exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java
 aa0a5ad 

Diff: https://reviews.apache.org/r/31107/diff/


Testing
---

Still need to provide Unit Tests.

Functional tests are passing

Performance tests were run and look promising for some queries


Thanks,

Yuliya Feldman



Re: Review Request 31107: Ability to make PartitionSender multithreaded - useful in case of LocalExchange being enabled, as it allows to deal with high volume of incoming data

2015-02-19 Thread Yuliya Feldman


> On Feb. 19, 2015, 5:12 p.m., Chris Westin wrote:
> >

Thank you very much, Chris, for the review - see my replies to your comments.


> On Feb. 19, 2015, 5:12 p.m., Chris Westin wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/ops/OperatorStats.java, 
> > line 69
> > <https://reviews.apache.org/r/31107/diff/1/?file=866190#file866190line69>
> >
> > Why not call this(original, false) to avoid duplicating the code?

oops - forgot to remove this method - will do


> On Feb. 19, 2015, 5:12 p.m., Chris Westin wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/ops/OperatorStats.java, 
> > line 115
> > <https://reviews.apache.org/r/31107/diff/1/?file=866190#file866190line115>
> >
> > how about
> > 
> > final IntLongOpenHashmap fromMetrics = from.longMetrics;
> > final long[] fromValues = fromMetrics.values;
> > for(int i : fromMetrics.keys) {
> >   final long value = fromValues[i];
> >   longMetrics.putOrAdd(i, value, value);
> > } 
> > 
> > This avoids multiple evaluation of the member accesses on every pass of 
> > the loop, and multiple bounds checks within the loop body.
> > 
> > Similar improvements can be made to the doubleMetrics loop below.

OK


> On Feb. 19, 2015, 5:12 p.m., Chris Westin wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/SendingAccountor.java,
> >  line 32
> > <https://reviews.apache.org/r/31107/diff/1/?file=866191#file866191line32>
> >
> > I'd make these final too, while you're here.

OK


> On Feb. 19, 2015, 5:12 p.m., Chris Westin wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionSenderRootExec.java,
> >  line 286
> > <https://reviews.apache.org/r/31107/diff/1/?file=866192#file866192line286>
> >
> > Reuse the id variable from above (which looks like it should be made 
> > final).

yes - definitely


> On Feb. 19, 2015, 5:12 p.m., Chris Westin wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionerDecorator.java,
> >  line 46
> > <https://reviews.apache.org/r/31107/diff/1/?file=866194#file866194line46>
> >
> > Could this be final?

Don't see a reason why not, but will double check


> On Feb. 19, 2015, 5:12 p.m., Chris Westin wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionerDecorator.java,
> >  line 164
> > <https://reviews.apache.org/r/31107/diff/1/?file=866194#file866194line164>
> >
> > Call Thread.currentThread() once at the beginning, and then reuse.

This is the beginning, as I am within the callable. I am trying to get the name 
of the thread that executes the "Callable" so that it includes the name of its 
parent thread (tname).
Some parts can be reused, but not the full name.


> On Feb. 19, 2015, 5:12 p.m., Chris Westin wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionerDecorator.java,
> >  line 166
> > <https://reviews.apache.org/r/31107/diff/1/?file=866194#file866194line166>
> >
> > Would it make sense to restore the original thread name here? If so, 
> > wrap the iface.execute() in a try and restore the name in a finally.

There is not much point in restoring the original state, as those threads are 
only used to call the "callable", so each subsequent use will rename the thread 
to whatever it needs every time.
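
For reference, a small sketch of the pattern under discussion, not the code in 
PartitionerDecorator. ThreadNameSketch and its named() helper are hypothetical. 
The task looks up the executing thread once, prefixes its name with the parent 
thread's name, and (as the reviewer suggested) restores the original name in a 
finally block; the restore step is optional if every task renames the thread 
anyway.

```java
/** Hypothetical illustration only; not the PartitionerDecorator code. */
final class ThreadNameSketch {
  /** Wraps a task so the pool thread running it carries the parent thread's name. */
  static Runnable named(final String parentName, final Runnable task) {
    return new Runnable() {
      @Override
      public void run() {
        final Thread current = Thread.currentThread(); // looked up once, inside the task
        final String original = current.getName();
        current.setName(parentName + "-" + original);
        try {
          task.run();
        } finally {
          current.setName(original); // optional: the next task may rename it anyway
        }
      }
    };
  }
}
```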


> On Feb. 19, 2015, 5:12 p.m., Chris Westin wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionerDecorator.java,
> >  line 178
> > <https://reviews.apache.org/r/31107/diff/1/?file=866194#file866194line178>
> >
> > Is IOException the best choice here? Would RuntimeException make more 
> > sense? Either way, would a message be helpful in case there are problems? 
> > How would a user know what to do if this exception fired?

Since the method that eventually calls this one throws only IOException, I am 
throwing it. Will add a message.
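
A sketch of what "IOException with a message" could look like, assuming the 
failure arrives wrapped in an ExecutionException from a worker thread. The class 
RethrowSketch and the wording of the message are hypothetical, not taken from 
the patch.

```java
import java.io.IOException;
import java.util.concurrent.ExecutionException;

/** Hypothetical illustration only; not the code under review. */
final class RethrowSketch {
  /** Converts a worker failure into an IOException that says what was being done. */
  static IOException asIOException(String what, Throwable failure) {
    // Unwrap the executor's wrapper so the real cause is reported.
    final Throwable root =
        (failure instanceof ExecutionException && failure.getCause() != null)
            ? failure.getCause()
            : failure;
    return new IOException("Failure while " + what + ": " + root.getMessage(), root);
  }
}
```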


> On Feb. 19, 2015, 5:12 p.m., Chris Westin wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/SendingAccountor.java,
> >  line 47
> > <https://reviews.apache.org/r/31107/diff/1/?file=866191#file866191line47>
> >
> > This should do something other than swallow InterruptedException. See 
> > http://www.ibm.com/developerworks/library/j-jtp05236/ .

I did not touch this code, but I need to see what should be done here.
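
One common alternative to swallowing InterruptedException is to restore the 
thread's interrupt flag so that callers can still observe the interruption. The 
sketch below illustrates that idiom with a CountDownLatch; InterruptSketch and 
awaitQuietly() are hypothetical names, and this is not the SendingAccountor 
code.

```java
import java.util.concurrent.CountDownLatch;

/** Hypothetical illustration only; not the SendingAccountor code. */
final class InterruptSketch {
  /** Waits for the latch; returns false if interrupted, leaving the interrupt flag set. */
  static boolean awaitQuietly(CountDownLatch latch) {
    try {
      latch.await();
      return true;
    } catch (InterruptedException e) {
      // Do not swallow: re-assert the flag so code further up the stack can react.
      Thread.currentThread().interrupt();
      return false;
    }
  }
}
```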


- Yuliya



Re: Review Request 31107: Ability to make PartitionSender multithreaded - useful in case of LocalExchange being enabled, as it allows to deal with high volume of incoming data

2015-02-20 Thread Yuliya Feldman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31107/
---

(Updated Feb. 20, 2015, 5:39 p.m.)


Review request for drill, Chris Westin, Jacques Nadeau, Steven Phillips, and 
Venki Korukanti.


Changes
---

Changes based on first round of code review


Bugs: DRILL-2210
https://issues.apache.org/jira/browse/DRILL-2210


Repository: drill-git


Description
---

In addition to the description in the summary:

Fixed a few classes that did not handle multithreading well.
Added/changed some Stats behavior to allow stats to be merged from multiple threads, 
since this class is not suitable for use in a multithreaded environment.
Introduced a new decorator class to handle multithreading (or not), to minimize 
changes to the PartitionSenderRootExec class.


Diffs (updated)
-

  exec/java-exec/src/main/java/org/apache/drill/exec/ops/OperatorStats.java 
0e9da0e 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/SendingAccountor.java
 7af7b65 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionSenderRootExec.java
 f09acaa 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/Partitioner.java
 5ed9c39 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionerDecorator.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionerTemplate.java
 4292c09 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/PlannerSettings.java
 faa8546 
  
exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java
 aa0a5ad 

Diff: https://reviews.apache.org/r/31107/diff/


Testing
---

Still need to provide Unit Tests.

Functional tests are passing

Performance tests were run and look promising for some queries


Thanks,

Yuliya Feldman



Re: Review Request 30965: Follow up on DRILL-133 (LocalExchange) to save CPU cycles on hash generation when using in HashToLocalExchange

2015-02-20 Thread Yuliya Feldman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30965/
---

(Updated Feb. 20, 2015, 5:46 p.m.)


Review request for drill, Jacques Nadeau, Steven Phillips, and Venki Korukanti.


Changes
---

Updated based on the first code review round


Bugs: DRILL-2209
https://issues.apache.org/jira/browse/DRILL-2209


Repository: drill-git


Description
---

Insert a Project operator to add a new column "EXPRHASH" with the hash expression 
for the fields that are used for HashToRandomExchange.
Remove the Project operator after HashToRandomExchange (or Demux), since it would 
otherwise create problems with field ordering in HashJoin.

Tie this to MuxExchange: if MuxExchange is enabled, the Project is inserted.


Diffs (updated)
-

  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/HashToRandomExchangePrel.java
 372c75d 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/visitor/InsertLocalExchangeVisitor.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/30965/diff/


Testing
---

Need to add unit tests. Tested live; ran Functional and TPCH tests.


Thanks,

Yuliya Feldman



Re: Review Request 31107: Ability to make PartitionSender multithreaded - useful in case of LocalExchange being enabled, as it allows to deal with high volume of incoming data

2015-02-23 Thread Yuliya Feldman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31107/
---

(Updated Feb. 23, 2015, 3:29 p.m.)


Review request for drill, Chris Westin, Jacques Nadeau, Steven Phillips, and 
Venki Korukanti.


Changes
---

Addressing review comments


Bugs: DRILL-2210
https://issues.apache.org/jira/browse/DRILL-2210


Repository: drill-git


Description
---

In addition to the description in the summary:

Fixed a few classes that did not handle multithreading well.
Added/changed some Stats behavior to allow stats to be merged from multiple threads, 
since this class is not suitable for use in a multithreaded environment.
Introduced a new decorator class to handle multithreading (or not), to minimize 
changes to the PartitionSenderRootExec class.


Diffs (updated)
-

  exec/java-exec/src/main/java/org/apache/drill/exec/ops/OperatorStats.java 
0e9da0e 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/SendingAccountor.java
 7af7b65 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionSenderRootExec.java
 f09acaa 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/Partitioner.java
 5ed9c39 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionerDecorator.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionerTemplate.java
 4292c09 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/PlannerSettings.java
 faa8546 
  
exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java
 aa0a5ad 

Diff: https://reviews.apache.org/r/31107/diff/


Testing
---

Still need to provide Unit Tests.

Functional tests are passing

Performance tests were run and look promising for some queries


Thanks,

Yuliya Feldman



Re: Review Request 30965: Follow up on DRILL-133 (LocalExchange) to save CPU cycles on hash generation when using in HashToLocalExchange

2015-02-23 Thread Yuliya Feldman


> On Feb. 23, 2015, 5:44 p.m., Jinfeng Ni wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/visitor/InsertLocalExchangeVisitor.java,
> >  line 112
> > <https://reviews.apache.org/r/30965/diff/2/?file=871428#file871428line112>
> >
> > MuxExchange has Project as its child. So, MuxExchange will have same 
> > traits as Project (addColumnprojectPrel), instead of its parent (prel).

Will definitely fix it - thank you for pointing out


> On Feb. 23, 2015, 5:44 p.m., Jinfeng Ni wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/visitor/InsertLocalExchangeVisitor.java,
> >  line 127
> > <https://reviews.apache.org/r/30965/diff/2/?file=871428#file871428line127>
> >
> > I'm not fully clear about the motivation for inserting the hash 
> > expression into Project. But here, if we remove the computed hash 
> > expression, does it mean that the downstream operator will not be able to 
> > refer to this computed value, and will have to re-compute it?

The problem is that if we have a HashJoin later on, it is not aware of the 
additional column and will fail. So, after discussion with Jacques, we decided 
to add a Project before the HashExchange and remove it after, so that to the 
world outside of Mux/HashExchange/Demux it looks as if the Project was never 
inserted.
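
To make the add-then-remove idea concrete, here is a toy sketch that models rows 
as Object[] arrays. It is not Drill planner code: HashColumnSketch and its three 
methods are hypothetical stand-ins for the inserted Project, the exchange, and 
the removing Project.

```java
import java.util.Arrays;

/** Hypothetical illustration only; rows are plain arrays, not Drill record batches. */
final class HashColumnSketch {
  /** First Project: append one extra column holding the hash of the key fields. */
  static Object[] addHashColumn(Object[] row, int... keyIdx) {
    final Object[] keys = new Object[keyIdx.length];
    for (int i = 0; i < keyIdx.length; i++) {
      keys[i] = row[keyIdx[i]];
    }
    final Object[] out = Arrays.copyOf(row, row.length + 1);
    out[row.length] = Arrays.hashCode(keys); // computed once, here
    return out;
  }

  /** Exchange: pick a receiver from the precomputed column; nothing is re-hashed. */
  static int destination(Object[] rowWithHash, int receivers) {
    final int hash = (Integer) rowWithHash[rowWithHash.length - 1];
    return Math.floorMod(hash, receivers);
  }

  /** Second Project: drop the helper column so downstream operators see the original schema. */
  static Object[] removeHashColumn(Object[] rowWithHash) {
    return Arrays.copyOf(rowWithHash, rowWithHash.length - 1);
  }
}
```

After removeHashColumn the row has exactly the original fields in the original 
order, which is the property HashJoin depends on.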


- Yuliya


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30965/#review73732
---


On Feb. 23, 2015, 4:09 p.m., Yuliya Feldman wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30965/
> ---
> 
> (Updated Feb. 23, 2015, 4:09 p.m.)
> 
> 
> Review request for drill, Jacques Nadeau, Jinfeng Ni, Steven Phillips, and 
> Venki Korukanti.
> 
> 
> Bugs: DRILL-2209
> https://issues.apache.org/jira/browse/DRILL-2209
> 
> 
> Repository: drill-git
> 
> 
> Description
> ---
> 
> Insert a Project operator to add a new column "EXPRHASH" with the hash expression 
> for the fields that are used for HashToRandomExchange.
> Remove the Project operator after HashToRandomExchange (or Demux), since it would 
> otherwise create problems with field ordering in HashJoin.
> 
> Tie this to MuxExchange: if MuxExchange is enabled, the Project is inserted.
> 
> 
> Diffs
> -
> 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/HashToRandomExchangePrel.java
>  372c75d 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/visitor/InsertLocalExchangeVisitor.java
>  PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/30965/diff/
> 
> 
> Testing
> ---
> 
> Need to add unit tests. Tested live; ran Functional and TPCH tests.
> 
> 
> Thanks,
> 
> Yuliya Feldman
> 
>



Re: Review Request 30965: Follow up on DRILL-133 (LocalExchange) to save CPU cycles on hash generation when using in HashToLocalExchange

2015-02-23 Thread Yuliya Feldman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30965/
---

(Updated Feb. 23, 2015, 11:32 p.m.)


Review request for drill, Jacques Nadeau, Jinfeng Ni, Steven Phillips, and 
Venki Korukanti.


Changes
---

Addressing review comments


Bugs: DRILL-2209
https://issues.apache.org/jira/browse/DRILL-2209


Repository: drill-git


Description
---

Insert a Project operator to add a new column "EXPRHASH" with the hash expression 
for the fields that are used for HashToRandomExchange.
Remove the Project operator after HashToRandomExchange (or Demux), since it would 
otherwise create problems with field ordering in HashJoin.

Tie this to MuxExchange: if MuxExchange is enabled, the Project is inserted.


Diffs (updated)
-

  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/HashToRandomExchangePrel.java
 372c75d 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/visitor/InsertLocalExchangeVisitor.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/30965/diff/


Testing
---

Need to add unit tests. Tested live; ran Functional and TPCH tests.


Thanks,

Yuliya Feldman



Re: Review Request 31107: Ability to make PartitionSender multithreaded - useful in case of LocalExchange being enabled, as it allows to deal with high volume of incoming data

2015-02-25 Thread Yuliya Feldman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31107/
---

(Updated Feb. 25, 2015, 11:24 p.m.)


Review request for drill, Chris Westin, Jacques Nadeau, Steven Phillips, and 
Venki Korukanti.


Changes
---

Changes based on latest code review with Jacques and Venki

1. Reuse of ExecutorService from WorkManager
2. Not using anonymous objects
3. Not using callables in favor of runnables
4. other small corrections
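
Items 1 to 3 can be pictured with the sketch below. It is an illustration under 
assumptions, not the patch itself: SharedExecutorSketch, PartitionTask and 
runAll() are hypothetical names, and the ExecutorService is simply passed in 
rather than obtained from WorkManager.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;

/** Hypothetical illustration only; not the PartitionerDecorator code. */
final class SharedExecutorSketch {
  /** A named (non-anonymous) Runnable; counts down the latch even if the work fails. */
  static final class PartitionTask implements Runnable {
    private final Runnable work;
    private final CountDownLatch done;

    PartitionTask(Runnable work, CountDownLatch done) {
      this.work = work;
      this.done = done;
    }

    @Override
    public void run() {
      try {
        work.run();
      } finally {
        done.countDown();
      }
    }
  }

  /** Runs all work items on the shared executor; returns false if the wait was interrupted. */
  static boolean runAll(ExecutorService shared, Runnable... work) {
    final CountDownLatch done = new CountDownLatch(work.length);
    for (Runnable w : work) {
      shared.execute(new PartitionTask(w, done)); // reuse the pool, no per-call thread creation
    }
    try {
      done.await();
      return true;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }
}
```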


Bugs: DRILL-2210
https://issues.apache.org/jira/browse/DRILL-2210


Repository: drill-git


Description
---

In addition to the description in the summary:

Fixed a few classes that did not handle multithreading well.
Added/changed some Stats behavior to allow stats to be merged from multiple threads, 
since this class is not suitable for use in a multithreaded environment.
Introduced a new decorator class to handle multithreading (or not), to minimize 
changes to the PartitionSenderRootExec class.


Diffs (updated)
-

  exec/java-exec/src/main/java/org/apache/drill/exec/compile/CodeCompiler.java 
7cc350e 
  exec/java-exec/src/main/java/org/apache/drill/exec/ops/FragmentContext.java 
e413921 
  exec/java-exec/src/main/java/org/apache/drill/exec/ops/OperatorStats.java 
0e9da0e 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/SendingAccountor.java
 7af7b65 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionSenderRootExec.java
 f09acaa 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/Partitioner.java
 5ed9c39 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionerDecorator.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionerTemplate.java
 4292c09 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/PlannerSettings.java
 faa8546 
  
exec/java-exec/src/main/java/org/apache/drill/exec/server/DrillbitContext.java 
83a89df 
  
exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java
 aa0a5ad 
  exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java 
99c6ab8 
  
exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/TestOptiqPlans.java
 478 

Diff: https://reviews.apache.org/r/31107/diff/


Testing
---

Still need to provide Unit Tests.

Functional tests are passing

Performance tests were run and look promising for some queries


Thanks,

Yuliya Feldman



Re: Review Request 30965: Follow up on DRILL-133 (LocalExchange) to save CPU cycles on hash generation when using in HashToLocalExchange

2015-02-26 Thread Yuliya Feldman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30965/
---

(Updated Feb. 26, 2015, 12:34 a.m.)


Review request for drill, Jacques Nadeau, Jinfeng Ni, Steven Phillips, and 
Venki Korukanti.


Changes
---

Addressing review comments after code review with Jacques and Venki


Bugs: DRILL-2209
https://issues.apache.org/jira/browse/DRILL-2209


Repository: drill-git


Description
---

Insert a Project operator to add a new column "EXPRHASH" with the hash expression 
for the fields that are used for HashToRandomExchange.
Remove the Project operator after HashToRandomExchange (or Demux), since it would 
otherwise create problems with field ordering in HashJoin.

Tie this to MuxExchange: if MuxExchange is enabled, the Project is inserted.


Diffs (updated)
-

  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/HashToRandomExchangePrel.java
 372c75d 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/PrelUtil.java
 1adc54f 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/visitor/InsertLocalExchangeVisitor.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/30965/diff/


Testing
---

Need to add unit tests. Tested live; ran Functional and TPCH tests.


Thanks,

Yuliya Feldman



Re: Review Request 31107: Ability to make PartitionSender multithreaded - useful in case of LocalExchange being enabled, as it allows to deal with high volume of incoming data

2015-02-26 Thread Yuliya Feldman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31107/
---

(Updated Feb. 26, 2015, 7:19 p.m.)


Review request for drill, Chris Westin, Jacques Nadeau, Steven Phillips, and 
Venki Korukanti.


Changes
---

Added calculation of number of threads based on the cost (number of rows).

Formula is:  cost/slicetarget/#senders/threadfactor

threadfactor is set to 4 by default

Additional config param is max number of threads - by default set to 32

Will need to play around with those params to figure out good combination
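
As a worked example of the formula above, here is a small sketch. The formula, 
the thread factor default of 4 and the cap of 32 come from this message; the 
class name ThreadCountSketch, the lower bound of one thread, and the sample cost 
and slice target values are assumptions made for illustration.

```java
/** Hypothetical illustration only; not the PartitionSenderRootExec code. */
final class ThreadCountSketch {
  /** threads = cost / sliceTarget / senders / threadFactor, clamped to [1, maxThreads]. */
  static int threads(double cost, long sliceTarget, int senders, int threadFactor, int maxThreads) {
    final long raw = (long) (cost / sliceTarget / senders / threadFactor);
    // Assumption: never go below one thread, and never above the configured maximum.
    return (int) Math.max(1, Math.min(maxThreads, raw));
  }
}
```

For instance, a cost of 3,200,000 rows with a slice target of 100,000, 2 senders 
and a thread factor of 4 gives 3,200,000 / 100,000 / 2 / 4 = 4 threads.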


Bugs: DRILL-2210
https://issues.apache.org/jira/browse/DRILL-2210


Repository: drill-git


Description
---

In addition to the description in the summary:

Fixed a few classes that did not handle multithreading well.
Added/changed some Stats behavior to allow stats to be merged from multiple threads, 
since this class is not suitable for use in a multithreaded environment.
Introduced a new decorator class to handle multithreading (or not), to minimize 
changes to the PartitionSenderRootExec class.


Diffs (updated)
-

  exec/java-exec/src/main/java/org/apache/drill/exec/compile/CodeCompiler.java 
7cc350e 
  exec/java-exec/src/main/java/org/apache/drill/exec/ops/FragmentContext.java 
e413921 
  exec/java-exec/src/main/java/org/apache/drill/exec/ops/OperatorStats.java 
0e9da0e 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/IteratorValidator.java
 64cf7c5 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/SendingAccountor.java
 7af7b65 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionSenderRootExec.java
 f09acaa 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/Partitioner.java
 5ed9c39 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionerDecorator.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionerTemplate.java
 4292c09 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/fragment/Materializer.java
 961b603 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/PlannerSettings.java
 faa8546 
  
exec/java-exec/src/main/java/org/apache/drill/exec/server/DrillbitContext.java 
83a89df 
  
exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java
 aa0a5ad 
  exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java 
99c6ab8 
  
exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/TestOptiqPlans.java
 478 

Diff: https://reviews.apache.org/r/31107/diff/


Testing
---

Still need to provide Unit Tests.

Functional tests are passing

Performance tests were run and look promising for some queries


Thanks,

Yuliya Feldman



Re: Review Request 30965: Follow up on DRILL-133 (LocalExchange) to save CPU cycles on hash generation when using in HashToLocalExchange

2015-03-02 Thread Yuliya Feldman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30965/
---

(Updated March 2, 2015, 1:56 p.m.)


Review request for drill, Jacques Nadeau, Jinfeng Ni, Steven Phillips, and 
Venki Korukanti.


Changes
---

Added unit tests for the additional Project operator + HashExpression


Bugs: DRILL-2209
https://issues.apache.org/jira/browse/DRILL-2209


Repository: drill-git


Description
---

Insert a Project operator to add a new column "EXPRHASH" with the hash expression 
for the fields that are used for HashToRandomExchange.
Remove the Project operator after HashToRandomExchange (or Demux), since it would 
otherwise create problems with field ordering in HashJoin.

Tie this to MuxExchange: if MuxExchange is enabled, the Project is inserted.


Diffs (updated)
-

  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/HashToRandomExchangePrel.java
 372c75d 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/PrelUtil.java
 1adc54f 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/visitor/InsertLocalExchangeVisitor.java
 PRE-CREATION 
  
exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/TestLocalExchange.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/30965/diff/


Testing
---

Need to add unit tests. Tested live; ran Functional and TPCH tests.


Thanks,

Yuliya Feldman



Re: Review Request 30965: Follow up on DRILL-133 (LocalExchange) to save CPU cycles on hash generation when using in HashToLocalExchange

2015-03-03 Thread Yuliya Feldman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30965/
---

(Updated March 3, 2015, 12:59 a.m.)


Review request for drill, Jacques Nadeau, Jinfeng Ni, Steven Phillips, and 
Venki Korukanti.


Changes
---

Addressing review comments for UnitTests


Bugs: DRILL-2209
https://issues.apache.org/jira/browse/DRILL-2209


Repository: drill-git


Description
---

Insert a Project operator to add a new column "EXPRHASH" with the hash expression 
for the fields that are used for HashToRandomExchange.
Remove the Project operator after HashToRandomExchange (or Demux), since it would 
otherwise create problems with field ordering in HashJoin.

Tie this to MuxExchange: if MuxExchange is enabled, the Project is inserted.


Diffs (updated)
-

  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/HashToRandomExchangePrel.java
 372c75d 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/PrelUtil.java
 1adc54f 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/visitor/InsertLocalExchangeVisitor.java
 PRE-CREATION 
  
exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/TestLocalExchange.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/30965/diff/


Testing
---

Need to add unit tests. Tested live; ran Functional and TPCH tests.


Thanks,

Yuliya Feldman



Re: Review Request 31107: Ability to make PartitionSender multithreaded - useful in case of LocalExchange being enabled, as it allows to deal with high volume of incoming data

2015-03-06 Thread Yuliya Feldman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31107/
---

(Updated March 6, 2015, 2:59 p.m.)


Review request for drill, Chris Westin, Jacques Nadeau, Steven Phillips, and 
Venki Korukanti.


Changes
---

Added Unit tests
Added number of threads as a metric to PartitionSender operator
Fixed partitioners distribution algorithm to do even distribution between 
threads
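
One way to get an even distribution is to hand each thread a contiguous range 
of partitioners whose sizes differ by at most one. The sketch below shows that 
approach; it is an assumption about what "even distribution" means here, and 
EvenSplitSketch and split() are hypothetical names rather than code from the 
patch.

```java
/** Hypothetical illustration only; not the PartitionerDecorator code. */
final class EvenSplitSketch {
  /** Returns one [start, end) index range per thread; range sizes differ by at most one. */
  static int[][] split(int items, int threads) {
    if (items <= 0 || threads <= 0) {
      return new int[0][];
    }
    final int used = Math.min(items, threads); // never create empty ranges
    final int base = items / used;
    final int extra = items % used;             // the first 'extra' ranges get one more item
    final int[][] ranges = new int[used][2];
    int start = 0;
    for (int t = 0; t < used; t++) {
      final int size = base + (t < extra ? 1 : 0);
      ranges[t][0] = start;
      ranges[t][1] = start + size;
      start += size;
    }
    return ranges;
  }
}
```

For 10 partitioners and 4 threads this yields ranges of sizes 3, 3, 2 and 2.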


Bugs: DRILL-2210
https://issues.apache.org/jira/browse/DRILL-2210


Repository: drill-git


Description
---

In addition to the description in the summary:

Fixed a few classes that did not handle multithreading well.
Added/changed some Stats behavior to allow stats to be merged from multiple threads, 
since this class is not suitable for use in a multithreaded environment.
Introduced a new decorator class to handle multithreading (or not), to minimize 
changes to the PartitionSenderRootExec class.


Diffs (updated)
-

  exec/java-exec/src/main/java/org/apache/drill/exec/compile/CodeCompiler.java 
7cc350e 
  exec/java-exec/src/main/java/org/apache/drill/exec/ops/FragmentContext.java 
e413921 
  exec/java-exec/src/main/java/org/apache/drill/exec/ops/OperatorStats.java 
0e9da0e 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/IteratorValidator.java
 64cf7c5 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/SendingAccountor.java
 7af7b65 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionSenderRootExec.java
 a23bd7a 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/Partitioner.java
 5ed9c39 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionerDecorator.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionerTemplate.java
 71ffd41 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/fragment/Materializer.java
 961b603 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/PlannerSettings.java
 bbfbbcb 
  
exec/java-exec/src/main/java/org/apache/drill/exec/server/DrillbitContext.java 
0fb10ff 
  
exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java
 3d3e96f 
  exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java 
99c6ab8 
  
exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/TestOptiqPlans.java
 478 
  
exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/partitionsender/TestPartitionSender.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/31107/diff/


Testing
---

Still need to provide Unit Tests.

Functional tests are passing

Performance tests were run and look promising for some queries


Thanks,

Yuliya Feldman



Re: Review Request 31107: Ability to make PartitionSender multithreaded - useful in case of LocalExchange being enabled, as it allows to deal with high volume of incoming data

2015-03-09 Thread Yuliya Feldman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31107/
---

(Updated March 9, 2015, 9:31 a.m.)


Review request for drill, Chris Westin, Jacques Nadeau, Steven Phillips, and 
Venki Korukanti.


Changes
---

More unit tests for failure scenarios
Further optimized the partitioner distribution algorithm
Fixed OperatorStats merging metrics


Bugs: DRILL-2210
https://issues.apache.org/jira/browse/DRILL-2210


Repository: drill-git


Description
---

In addition to the description in the summary:

Fixed a few classes that did not handle multithreading well.
Added/changed some Stats behavior to allow stats to be merged from multiple threads, 
since this class is not suitable for use in a multithreaded environment.
Introduced a new decorator class to handle multithreading (or not), to minimize 
changes to the PartitionSenderRootExec class.


Diffs (updated)
-

  exec/java-exec/src/main/java/org/apache/drill/exec/compile/CodeCompiler.java 7cc350e
  exec/java-exec/src/main/java/org/apache/drill/exec/ops/FragmentContext.java e413921
  exec/java-exec/src/main/java/org/apache/drill/exec/ops/OperatorStats.java 0e9da0e
  exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/IteratorValidator.java 64cf7c5
  exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/SendingAccountor.java 7af7b65
  exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionSenderRootExec.java a23bd7a
  exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/Partitioner.java 5ed9c39
  exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionerDecorator.java PRE-CREATION
  exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionerTemplate.java 71ffd41
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/fragment/Materializer.java 961b603
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/PlannerSettings.java bbfbbcb
  exec/java-exec/src/main/java/org/apache/drill/exec/server/DrillbitContext.java 0fb10ff
  exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java 3d3e96f
  exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java 99c6ab8
  exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/TestOptiqPlans.java 478
  exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/partitionsender/TestPartitionSender.java PRE-CREATION

Diff: https://reviews.apache.org/r/31107/diff/


Testing
---

Still need to provide Unit Tests.

Functional tests are passing

Performance tests were run and look promising for some queries


Thanks,

Yuliya Feldman



Re: Review Request 31107: Ability to make PartitionSender multithreaded - useful in case of LocalExchange being enabled, as it allows to deal with high volume of incoming data

2015-03-15 Thread Yuliya Feldman


> On March 15, 2015, 1:28 p.m., Jacques Nadeau wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionerTemplate.java,
> >  line 85
> > <https://reviews.apache.org/r/31107/diff/7/?file=889340#file889340line85>
> >
> > returning null here seems weird.  When would that happen?

Since there can be more than one partitioner, the current partitioner may not 
correspond to the index, in which case it returns null. There is a method in 
PartitionerDecorator that loops over the partitioners and returns the one that 
matches the index. I will add comments to the method.
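
The lookup described in this reply can be sketched roughly as follows. This is only an illustration, not the code under review: the classes Partitioner and OutgoingBatch below are simplified, hypothetical stand-ins, and the assumption that each partitioner owns a contiguous range of destination indexes is mine.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PartitionerLookupSketch {

  /** Hypothetical stand-in for an outgoing batch bound to one destination. */
  static final class OutgoingBatch {
    final int destinationIndex;

    OutgoingBatch(int destinationIndex) {
      this.destinationIndex = destinationIndex;
    }
  }

  /** Hypothetical partitioner that owns destinations in the range [start, end). */
  static final class Partitioner {
    private final int start;
    private final int end;
    private final List<OutgoingBatch> batches = new ArrayList<>();

    Partitioner(int start, int end) {
      this.start = start;
      this.end = end;
      for (int i = start; i < end; i++) {
        batches.add(new OutgoingBatch(i));
      }
    }

    /** Returns the batch for the index, or null when this partitioner does not own it. */
    OutgoingBatch getOutgoingBatch(int index) {
      if (index < start || index >= end) {
        return null;
      }
      return batches.get(index - start);
    }
  }

  /** Decorator-style lookup: ask each partitioner in turn and return the first match. */
  static OutgoingBatch findOutgoingBatch(List<Partitioner> partitioners, int index) {
    for (Partitioner partitioner : partitioners) {
      OutgoingBatch batch = partitioner.getOutgoingBatch(index);
      if (batch != null) {
        return batch;
      }
    }
    return null; // no partitioner owns this index
  }

  public static void main(String[] args) {
    List<Partitioner> partitioners =
        Arrays.asList(new Partitioner(0, 2), new Partitioner(2, 3));
    System.out.println(findOutgoingBatch(partitioners, 2).destinationIndex); // prints 2
    System.out.println(findOutgoingBatch(partitioners, 5) == null);          // prints true
  }
}
```

In this sketch a null from a single partitioner only means "not mine"; the caller sees null only when no partitioner owns the index.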


> On March 15, 2015, 1:28 p.m., Jacques Nadeau wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/ops/FragmentContext.java,
> >  line 234
> > <https://reviews.apache.org/r/31107/diff/7/?file=889333#file889333line234>
> >
> > shouldn't this parameter be instanceCount?  instanceNumber seems like 
> > an index.

will do


> On March 15, 2015, 1:28 p.m., Jacques Nadeau wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/ops/OperatorStats.java, 
> > line 70
> > <https://reviews.apache.org/r/31107/diff/7/?file=889334#file889334line70>
> >
> > missing description of what isClean means.

will do


- Yuliya


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31107/#review76510
---





Re: Review Request 31107: Ability to make PartitionSender multithreaded - useful in case of LocalExchange being enabled, as it allows to deal with high volume of incoming data

2015-03-15 Thread Yuliya Feldman


> On March 15, 2015, 8:18 p.m., Jacques Nadeau wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionerTemplate.java,
> >  line 85
> > <https://reviews.apache.org/r/31107/diff/7/?file=889340#file889340line85>
> >
> > Just to confirm, this will only be called in rare cases, right? E.g. 
> > not for every record or even every batch.

It is used only by the method on which you left your efficiency comment, and 
that method is used only by receivingFragmentFinished().


> On March 15, 2015, 8:18 p.m., Jacques Nadeau wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionerDecorator.java,
> >  line 106
> > <https://reviews.apache.org/r/31107/diff/7/?file=889339#file889339line106>
> >
> > I really think you should make this logic more efficient.  Why not just 
> > have a direct array of outgoing batches by index?

I am not sure exactly what you mean, but I think the confusion comes from the 
name of the method: it should have been called getOutgoingBatch (not 
getOutgoingBatches), as it is a decorator for the one in PartitionerTemplate. 
Since we now have a list of partitioners, and each of them has a list of 
outgoing batches, we need to get the right partitioner first.
The corresponding method in PartitionerTemplate may not be needed, since all the 
logic can be done inside this one. Picking the Partitioner from the index alone 
may not always be accurate, because I am trying to balance the load, so some 
partitioners will have an extra OutgoingBatch. For example, with 2 
threads/partitioners and just 3 destinations/outgoing batches, partitioner0 has 
batches[1,2] and partitioner1 has batches[3].
Let's discuss if I am still missing something.
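
One way to picture the balancing described above is the sketch below. It is a minimal illustration under my own assumptions, not the algorithm from the patch: the names splitBounds and ownerOf are hypothetical, indexes are 0-based (the reply uses 1-based batch numbers), and the extra batches are assumed to go to the first partitioners.

```java
import java.util.Arrays;

public class BalancedPartitionSketch {

  /**
   * Splits the destinations across the partitioners as evenly as possible.
   * Partitioner i owns the index range [bounds[i], bounds[i + 1]).
   */
  static int[] splitBounds(int destinations, int partitioners) {
    if (partitioners <= 0 || destinations < 0) {
      throw new IllegalArgumentException("need partitioners > 0 and destinations >= 0");
    }
    int base = destinations / partitioners;  // every partitioner gets at least this many
    int extra = destinations % partitioners; // the first 'extra' partitioners get one more
    int[] bounds = new int[partitioners + 1];
    for (int i = 0; i < partitioners; i++) {
      bounds[i + 1] = bounds[i] + base + (i < extra ? 1 : 0);
    }
    return bounds;
  }

  /** Returns the partitioner that owns the given destination index, or -1 if none does. */
  static int ownerOf(int[] bounds, int index) {
    for (int i = 0; i + 1 < bounds.length; i++) {
      if (index >= bounds[i] && index < bounds[i + 1]) {
        return i;
      }
    }
    return -1;
  }

  public static void main(String[] args) {
    int[] bounds = splitBounds(3, 2);
    System.out.println(Arrays.toString(bounds)); // prints [0, 2, 3]
    System.out.println(ownerOf(bounds, 2));      // prints 1
  }
}
```

With 3 destinations and 2 partitioners this gives partitioner 0 two batches and partitioner 1 one batch, which matches the example in the reply. Because the ranges are uneven, the owner has to be found from the bounds rather than computed from the index alone.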


- Yuliya


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31107/#review76516
-------





Re: Review Request 31107: Ability to make PartitionSender multithreaded - useful in case of LocalExchange being enabled, as it allows to deal with high volume of incoming data

2015-03-16 Thread Yuliya Feldman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31107/
---

(Updated March 16, 2015, 9:17 p.m.)


Review request for drill, Chris Westin, Jacques Nadeau, Steven Phillips, and 
Venki Korukanti.


Changes
---

Latest (hopefully)

Addressed review comments
Changed default settings for threads
Added cost to metrics
Added an additional parameter to completely override the number of threads


Bugs: DRILL-2210
https://issues.apache.org/jira/browse/DRILL-2210


Repository: drill-git


Description
---

In addition to the description:

Fixed a few classes that did not handle multithreading well
Added/changed some Stats behavior to allow stats to be merged from multiple 
threads, since this class is not suitable for use in a multithreaded environment
Introduced a new decorator class to handle multithreading (or not), to minimize 
changes to the PartitionSenderRootExec class


Diffs (updated)
-

  exec/java-exec/src/main/java/org/apache/drill/exec/compile/CodeCompiler.java 7cc350e
  exec/java-exec/src/main/java/org/apache/drill/exec/ops/FragmentContext.java 108f5bb
  exec/java-exec/src/main/java/org/apache/drill/exec/ops/OperatorStats.java 0e9da0e
  exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/IteratorValidator.java 64cf7c5
  exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/SendingAccountor.java 7af7b65
  exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionSenderRootExec.java ccbd289
  exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/Partitioner.java 5ed9c39
  exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionerDecorator.java PRE-CREATION
  exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionerTemplate.java 1d9088a
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/fragment/Materializer.java 9b0944e
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/PlannerSettings.java abbc910
  exec/java-exec/src/main/java/org/apache/drill/exec/server/DrillbitContext.java 0fb10ff
  exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java d821af8
  exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java 99c6ab8
  exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/TestOptiqPlans.java 478
  exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/partitionsender/TestPartitionSender.java PRE-CREATION

Diff: https://reviews.apache.org/r/31107/diff/


Testing
---

Still need to provide Unit Tests.

Functional tests are passing

Performance tests were run and look promising for some queries


Thanks,

Yuliya Feldman



Re: [VOTE] Release Apache Drill 0.8.0

2015-03-21 Thread yuliya Feldman
# Downloaded src tar from [2].
# Built Drill from the source distribution using Java 7 - completed
successfully.
# Started Drill in embedded mode and ran test queries - working on Mac.
# Started Drillbit, connected using sqlline and ran test queries - working
on Mac.
# Verified that LocalExchange (UnorderedMux) is triggered.
# Verified that ProjectOperator with Hash is inserted/removed correctly.
# Verified that the number of threads used in Partition Sender is set based
on the formula.

+1 (non-binding)

Thanks,
Yuliya

[2] http://people.apache.org/~jacques/apache-drill-0.8.0.rc0/

  From: Aditya 
 To: "dev@drill.apache.org"  
 Sent: Friday, March 20, 2015 11:08 PM
 Subject: Re: [VOTE] Release Apache Drill 0.8.0
   
# Verified that the source distribution contains the snapshot of Drill
source code at git commit id f1b59ed[1].
# Signature verification passed with Jacques' public key (6B5FA695) for
both source and binary tarballs.
# Building Drill from the source distribution using Java 7 completed
successfully in 05:27 min.
# All unit tests passed (5 times in a row).
# Started Drill in embedded mode and ran test queries - working on Windows.
# Started Drillbit, connected using sqlline and ran test queries - working
on Windows.

Overall looks a solid release, well done team.

+1 (binding)

aditya...

[1]
https://github.com/apache/drill/tree/f1b59ed4467ddaf75bc986ec095a20d6c28e9d15



On Fri, Mar 20, 2015 at 6:18 PM, Jacques Nadeau  wrote:

> Good evening,
>
> I would like to propose the release of Apache Drill, version 0.8.0.
>
> This release includes 230 resolved JIRAs [1].
>
> The artifacts are hosted at [2].
>
> The vote will be open for 72 hours, ending 6PM Pacific, March 23, 2015
>
> [ ] +1
> [ ] +0
> [ ] -1
>
>
> Thank you,
> Jacques
>
> [1]
>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313820&version=12328812
> [2] http://people.apache.org/~jacques/apache-drill-0.8.0.rc0/
>


  

Re: eclipse:eclipse failing

2015-06-10 Thread yuliya Feldman
I have been using mvn eclipse:eclipse w/o issues for quite a long time.
As Jacques pointed out you need to run "install" target first.
  From: Jacques Nadeau 
 To: "dev@drill.apache.org"  
 Sent: Wednesday, June 10, 2015 12:36 PM
 Subject: Re: eclipse:eclipse failing
   
The problem is you'll need to first run a complete mvn install -DskipTests
command before you can use eclipse:eclipse.

 Furthermore, I strongly recommend using Eclipse's import capability as we
haven't tested the eclipse:eclipse behavior with Drill.



On Mon, Jun 8, 2015 at 6:27 PM, 오진박  wrote:

> hi,when using mvn eclipse:eclipse,I got following errors, I need help to
> resolve it.
>

  

Re: Suspicious direct memory consumption when running queries concurrently

2015-07-31 Thread yuliya Feldman
How much memory is your JVM taking?
Do you even have enough disk space to dump it?
  From: Abdel Hakim Deneche 
 To: "dev@drill.apache.org"  
 Sent: Friday, July 31, 2015 9:19 PM
 Subject: Re: Suspicious direct memory consumption when running queries 
concurrently
   
I tried getting a jmap dump multiple times without success, each time it
crashes the jvm with the following exception:

Dumping heap to /home/mapr/private-sql-hadoop-test/framework/myfile.hprof
> ...
> Exception in thread "main" java.io.IOException: Premature EOF
>        at
> sun.tools.attach.HotSpotVirtualMachine.readInt(HotSpotVirtualMachine.java:248)
>        at
> sun.tools.attach.LinuxVirtualMachine.execute(LinuxVirtualMachine.java:199)
>        at
> sun.tools.attach.HotSpotVirtualMachine.executeCommand(HotSpotVirtualMachine.java:217)
>        at
> sun.tools.attach.HotSpotVirtualMachine.dumpHeap(HotSpotVirtualMachine.java:180)
>        at sun.tools.jmap.JMap.dump(JMap.java:242)
>        at sun.tools.jmap.JMap.main(JMap.java:140)


On Mon, Jul 27, 2015 at 3:45 PM, Jacques Nadeau  wrote:

> An allocate -> release cycle all on the same thread goes into a per-thread
> cache.
>
> A bunch of Netty arena settings are configurable.  The big issue I believe
> is that the limits are soft limits implemented by the allocation-time
> release mechanism.  As such, if you allocate a bunch of memory, then
> release it all, that won't necessarily trigger any actual chunk releases.
>
> --
> Jacques Nadeau
> CTO and Co-Founder, Dremio
>
> On Mon, Jul 27, 2015 at 12:47 PM, Abdel Hakim Deneche <
> adene...@maprtech.com
> > wrote:
>
> > @Jacques, my understanding is that chunks are not owned by a specific
> > thread but are part of a specific memory arena, which is in turn only
> > accessed by specific threads. Do you want me to find which threads are
> > associated with the same arena where we have hanging chunks?
> >
> >
> > On Mon, Jul 27, 2015 at 11:04 AM, Jacques Nadeau 
> > wrote:
> >
> > > It sounds like your statement is that we're caching too many unused
> > > chunks.  Hanifi and I previously discussed implementing a separate
> > > flushing mechanism to release unallocated chunks that are hanging
> > > around.  The main question is, why are so many chunks hanging around
> > > and what threads are they associated with.  A jmap dump and analysis
> > > should allow you to determine which thread owns the excess chunks.
> > > My guess would be the RPC pool since those are long lasting (as
> > > opposed to the WorkManager pool, which is contracting).
> > >
> > > --
> > > Jacques Nadeau
> > > CTO and Co-Founder, Dremio
> > >
> > > On Mon, Jul 27, 2015 at 9:53 AM, Abdel Hakim Deneche <
> > > adene...@maprtech.com>
> > > wrote:
> > >
> > > > When running a set of (mostly window function) queries concurrently
> > > > on a single drillbit with 8GB max direct memory, we are seeing a
> > > > continuous increase in direct memory allocation.
> > > >
> > > > We repeat the following steps multiple times:
> > > > - we launch an "iteration" of tests that runs all queries in a
> > > >   random order, 10 queries at a time
> > > > - after the iteration finishes, we wait for a couple of minutes to
> > > >   give Drill time to release the memory being held by the finishing
> > > >   fragments
> > > >
> > > > Using Drill's memory logger ("drill.allocator") we were able to get
> > > > snapshots of how memory was internally used by Netty. We only focused
> > > > on the number of allocated chunks: if we take this number and
> > > > multiply it by 16MB (Netty's chunk size) we get approximately the
> > > > same value reported by Drill's direct memory allocation.
> > > > Here is a graph that shows the evolution of the number of allocated
> > > > chunks on a 500-iteration run (I'm working on improving the plots):
> > > >
> > > > http://bit.ly/1JL6Kp3
> > > >
> > > > In this specific case, after the first iteration Drill was allocating
> > > > ~2GB of direct memory; this number kept rising after each iteration
> > > > to ~6GB. We suspect this caused one of our previous runs to crash the
> > > > JVM.
> > > >
> > > > If we only focus on the log lines between iterations (when Drill's
> > > > memory usage is below 10MB) then all allocated chunks are at most 2%
> > > > usage. At some point we end up with 288 nearly empty chunks, yet the
> > > > next iteration will cause more chunks to be allocated!!!
> > > >
> > > > Is this expected?
> > > >
> > > > PS: I am running more tests and will update this thread with more
> > > > information.
> > > >
> > > > --
> > > >
> > > > Abdelhakim Deneche
> > > >
> > > > Software Engineer
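
The chunk arithmetic in the quoted report (allocated chunks multiplied by the 16MB Netty chunk size) can be checked with a few lines. This is just a worked calculation: the class and method names are hypothetical, and the only inputs taken from the thread are the 16MB chunk size and the count of 288 nearly empty chunks.

```java
public class ChunkMemoryEstimate {

  /** Netty chunk size as stated in the report: 16MB. */
  static final long CHUNK_SIZE_BYTES = 16L * 1024 * 1024;

  /** Estimated direct memory held by the given number of allocated chunks. */
  static long estimateBytes(int allocatedChunks) {
    return allocatedChunks * CHUNK_SIZE_BYTES;
  }

  public static void main(String[] args) {
    long bytes = estimateBytes(288);
    System.out.println(bytes / (1024 * 1024) + " MB"); // prints 4608 MB
  }
}
```

So 288 nearly empty chunks alone would pin about 4.5GB of the 8GB direct memory limit.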

Re: Zookeeper (semi-)automated cluster setup

2015-09-26 Thread yuliya Feldman
What exactly do you have in mind?
1. Deploying/Setting up ZK servers
2. What is the optimal number of ZK servers for cluster of certain size
3. When you say "configured for drill" - what do you mean by that?
  From: Edmon Begoli 
 To: u...@drill.apache.org; dev@drill.apache.org 
 Sent: Saturday, September 26, 2015 3:54 PM
 Subject: Zookeeper (semi-)automated cluster setup
   
Hey folks,

I am exploring a setup for large cluster deployment for Drill.

Is anyone aware of any ZooKeeper utility for simplified and (semi-)automated
setup, and, ideally, configured for Drill?

Thank you,
Edmon


  

Re: [DISCUSS] Design Documents

2015-10-18 Thread yuliya Feldman
+1
  From: Parth Chandra 
 To: dev@drill.apache.org 
 Sent: Friday, October 16, 2015 10:21 AM
 Subject: [DISCUSS] Design Documents
   
Hi guys,

Now that 1.2 is out I wanted to bring up the exciting topic of design
documents for Drill. As the project gets more contributors, we definitely
need to start documenting our designs and also allow for a more substantial
review process. In particular, we need to make sure that there is
sufficient time for comment as well as a time limit for comments so that
developers are not left stranded. It is understood that committers should
ensure they spend enough time in reviewing designs.

I can see some substantial improvements in the works (some may even have
pull requests for initial work) and I think that this is a good time to
make sure that the design is done and understood by all before we get too
far ahead with the implementation.

[1] is an example from Spark, though that might be asking for a lot.

[2] is an example from Drill - Hash Aggregation in Drill - This is an ideal
design document. It could be improved even further perhaps by adding some
implementation level details (for example parameters that could be used to
tune Hash aggregation) that could aid QA/documentation.

What do people think? Can we start enforcing the requirement to have
reviewed design docs before submitting pull requests for *advanced*
features?

Parth

[1] http://people.csail.mit.edu/matei/papers/2012/nsdi_spark.pdf
[2] https://issues.apache.org/jira/secure/attachment/12622804/DrillAggrs.pdf


   

Re: Zookeeper down before query starts/after query finishes

2015-11-08 Thread yuliya Feldman
In reality, if you cannot connect to ZK (and ConnectionLoss is a client-side 
error), it means either network issues on the client node itself or issues 
with the ZK quorum. In those situations, until you (eventually) receive 
"Session Expiration" or "Connection reestablished", you don't know what is 
going on. It would probably be prudent to time out if, after ConnectionLoss, 
you hear nothing back from the ZK server for longer than the ZK client timeout 
(30 sec. by default, I think).
And again, it will depend on the client: in your example it is a good idea to 
fail; in some other cases it may be a good idea to wait (e.g. if you deal with 
non-idempotent operations).
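
The timeout idea above can be sketched without any ZooKeeper dependency. Everything here is hypothetical and only illustrates the decision logic: the Event and Decision enums are my own names rather than ZooKeeper or Curator types, and the 30 second value is the default guessed at in the message, not a verified setting.

```java
public class ConnectionLossTimeoutSketch {

  /** Hypothetical connection events, loosely modeled on what a ZK client reports. */
  enum Event { CONNECTION_LOSS, RECONNECTED, SESSION_EXPIRED }

  /** What the caller should do with a pending operation. */
  enum Decision { PROCEED, WAIT, FAIL }

  private final long timeoutMillis;
  private long lossSinceMillis = -1; // -1 means no outstanding connection loss
  private boolean sessionExpired;

  ConnectionLossTimeoutSketch(long timeoutMillis) {
    this.timeoutMillis = timeoutMillis;
  }

  /** Records a connection event observed at the given time. */
  void onEvent(Event event, long nowMillis) {
    switch (event) {
      case CONNECTION_LOSS:
        if (lossSinceMillis < 0) {
          lossSinceMillis = nowMillis; // remember when the loss was first seen
        }
        break;
      case RECONNECTED:
        lossSinceMillis = -1;
        break;
      case SESSION_EXPIRED:
        sessionExpired = true; // a new session would need a fresh tracker
        break;
    }
  }

  /** Fails once the session expired or the loss has lasted longer than the timeout. */
  Decision decide(long nowMillis) {
    if (sessionExpired) {
      return Decision.FAIL;
    }
    if (lossSinceMillis < 0) {
      return Decision.PROCEED;
    }
    return (nowMillis - lossSinceMillis > timeoutMillis) ? Decision.FAIL : Decision.WAIT;
  }

  public static void main(String[] args) {
    ConnectionLossTimeoutSketch tracker = new ConnectionLossTimeoutSketch(30_000L);
    tracker.onEvent(Event.CONNECTION_LOSS, 0L);
    System.out.println(tracker.decide(10_000L)); // prints WAIT
    System.out.println(tracker.decide(31_000L)); // prints FAIL
  }
}
```

A client dealing with non-idempotent operations could treat WAIT differently from one that can safely fail fast, which is the distinction drawn in the message.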
  From: Hsuan Yi Chu 
 To: dev@drill.apache.org 
 Sent: Sunday, November 8, 2015 9:36 AM
 Subject: Re: Zookeeper down before query starts/after query finishes
   
I just submitted a pull request to address DRILL-3751, which focuses on the
scenario where query already finishes and zookeeper dies. So Foreman cannot
delete the profiles of running queries in zookeeper.

I think in this case, after a few retries, Foreman can assume Zookeeper is
down. And, this query is assumed to fail since client might not be able to
receive the result (see the behavior in DRILL-3751).

Does this make sense?




On Fri, Nov 6, 2015 at 10:43 AM, Hsuan Yi Chu  wrote:

> My understanding is :
> Before query starts/After query finishes, Foreman will put/delete running
> query profiles in zookeeper.
>
> However, if zookeeper is down before the put/delete is successful, Drill
> would be blocked at the put/delete operation.
>
> See https://issues.apache.org/jira/browse/DRILL-3751
>
> I think it is not quite right to let Drill just wait for Zookeeper to
> respond. Does it make sense to use "time-out" here?
>
>
>


  

Re: Zookeeper down before query starts/after query finishes

2015-11-08 Thread yuliya Feldman
Did not notice your reply :)
Yes, I agree with Jacques: we should consider a variety of scenarios here.
Thanks,
Yuliya
  From: Jacques Nadeau 
 To: dev  
 Sent: Sunday, November 8, 2015 11:56 AM
 Subject: Re: Zookeeper down before query starts/after query finishes
   
I think we need to talk through a couple of different scenarios and decide
on Drill behavior in each.

Client Based
1) Initial connection to ZK from client fails
2) Client loses ZK Connection
  a) Reconnects within session timeout
  b) Cannot reconnect within session timeout (loses session)
3) ZK connection gets reconnected with new session (2b)

Drillbit Based
4) Drillbit initial connection fails to complete
5) Drillbit loses connection
  a) reconnects within session timeout
  b) cannot reconnect within session timeout (loses session)
6) Drillbit reestablishes connection after timeout (5b)

It seems like your initial proposal is entirely focused on item (5b) in the
list above. However, the code change affects all items 1-6. I think it
would be worthwhile to come up with clear definition of desired behavior
for all items 1-6. I also think the behavior in 2b should probably be very
different than in 5b.

Note, I'm not suggesting that this initial fix needs to resolve all items
to the desired behavior. However, it is hard to review the patch without
measuring against what our target is across the items. My hope out of this
is a clear framework to review the patch as well as a number of jiras to
resolve issues across each of these issues where there are gaps.

thanks!
jacques



--
Jacques Nadeau
CTO and Co-Founder, Dremio



On Sun, Nov 8, 2015 at 9:36 AM, Hsuan Yi Chu  wrote:

> I just submitted a pull request to address DRILL-3751, which focuses on the
> scenario where query already finishes and zookeeper dies. So Foreman cannot
> delete the profiles of running queries in zookeeper.
>
> I think in this case, after a few retries, Foreman can assume Zookeeper is
> down. And, this query is assumed to fail since client might not be able to
> receive the result (see the behavior in DRILL-3751).
>
> Does this make sense?
>
>
> On Fri, Nov 6, 2015 at 10:43 AM, Hsuan Yi Chu  wrote:
>
> > My understanding is :
> > Before query starts/After query finishes, Foreman will put/delete running
> > query profiles in zookeeper.
> >
> > However, if zookeeper is down before the put/delete is successful, Drill
> > would be blocked at the put/delete operation.
> >
> > See https://issues.apache.org/jira/browse/DRILL-3751
> >
> > I think it is not quite right to let Drill just wait for Zookeeper to
> > respond. Does it make sense to use "time-out" here?
> >
> >
> >
>


  

Re: Codehale Metrics JMXReporter Disabled?

2015-12-02 Thread yuliya Feldman
I know I enabled it for a small project I did with Drill for Strata in Feb.
If you enable JMX, I believe there was a bug lurking somewhere - I think a 
static initialization order issue. I may have code somewhere that enables JMX 
and fixes the issue I described.
  From: Jacques Nadeau 
 To: dev  
 Sent: Wednesday, December 2, 2015 1:54 PM
 Subject: Re: Codehale Metrics JMXReporter Disabled?
   
Afraid not. I think it may have been debug reasons and shouldn't have been
merged.

--
Jacques Nadeau
CTO and Co-Founder, Dremio



On Wed, Dec 2, 2015 at 1:44 PM, Sudheesh Katkam 
wrote:

> Jacques,
>
> Do you happen to remember why JMXReporter was disabled <
> https://github.com/apache/drill/commit/4eea03a052d1b0a9190b9d1512088da9f81cc037#diff-5a33b44e2c23b1f09d338938c8f1e742R47
> >?
>
> Thank you,
> Sudheesh


  

[DISCUSS] DRILL-4132

2016-01-18 Thread yuliya Feldman

Hello here,
I wanted to start discussion on [1]
Would be nice to have a hangout session with @jacques-n, @hnfgns, 
@StevenMPhillips
Let me know suitable time
Thanks,
Yuliya
[1] https://issues.apache.org/jira/browse/DRILL-4132

Re: [DISCUSS] DRILL-4132

2016-01-19 Thread yuliya Feldman
Great idea.
Created a poll [1]. Anybody interested can vote on time.
Thanks,
Yuliya
[1] http://doodle.com/poll/c7w37sbvxh36k576


 

  From: Hanifi Gunes 
 To: dev@drill.apache.org; yuliya Feldman  
 Sent: Tuesday, January 19, 2016 11:44 AM
 Subject: Re: [DISCUSS] DRILL-4132
   
Do you want to create a doodle for this? [1]

-Hanifi

1: http://doodle.com/create

On Mon, Jan 18, 2016 at 11:02 PM, yuliya Feldman <
yufeld...@yahoo.com.invalid> wrote:

>
> Hello here,
> I wanted to start discussion on [1]
> Would be nice to have a hangout session with @jacques-n,
> @hnfgns, @StevenMPhillips
> Let me know suitable time
> Thanks,Yuliya
> [1] https://issues.apache.org/jira/browse/DRILL-4132


  

[jira] [Created] (DRILL-4597) Calcite type validation assertions when planner.enable_type_inference is enabled for system tables

2016-04-08 Thread Yuliya Feldman (JIRA)
Yuliya Feldman created DRILL-4597:
-

 Summary: Calcite type validation assertions when 
planner.enable_type_inference is enabled for system tables
 Key: DRILL-4597
 URL: https://issues.apache.org/jira/browse/DRILL-4597
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Reporter: Yuliya Feldman


With calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11 and type inference 
enabled, the following query fails:
select concat(hostname, ':', user_port) from sys.drillbits where `current`=true;

with the exception below:

at org.apache.calcite.sql.type.SqlTypeFactoryImpl.createSqlType(SqlTypeFactoryImpl.java:62) ~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
at org.apache.drill.exec.planner.sql.TypeInferenceUtils$DrillConcatSqlReturnTypeInference.inferReturnType(TypeInferenceUtils.java:420) ~[classes/:na]
at org.apache.calcite.sql.SqlOperator.inferReturnType(SqlOperator.java:468) ~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
at org.apache.calcite.sql.SqlOperator.validateOperands(SqlOperator.java:435) ~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
at org.apache.calcite.sql.SqlFunction.deriveType(SqlFunction.java:287) ~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
at org.apache.calcite.sql.SqlFunction.deriveType(SqlFunction.java:222) ~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
at org.apache.calcite.sql.validate.SqlValidatorImpl$DeriveTypeVisitor.visit(SqlValidatorImpl.java:4288) ~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
at org.apache.calcite.sql.validate.SqlValidatorImpl$DeriveTypeVisitor.visit(SqlValidatorImpl.java:4275) ~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
at org.apache.calcite.sql.SqlCall.accept(SqlCall.java:130) ~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
at org.apache.calcite.sql.validate.SqlValidatorImpl.deriveTypeImpl(SqlValidatorImpl.java:1495) ~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
at org.apache.calcite.sql.validate.SqlValidatorImpl.deriveType(SqlValidatorImpl.java:1478) ~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
at org.apache.calcite.sql.validate.SqlValidatorImpl.expandSelectItem(SqlValidatorImpl.java:440) ~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
at org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelectList(SqlValidatorImpl.java:3447) ~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
at org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect(SqlValidatorImpl.java:2995) ~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
at org.apache.calcite.sql.validate.SelectNamespace.validateImpl(SelectNamespace.java:60) ~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
at org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:86) ~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
at org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:877) ~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
at org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:863) ~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
at org.apache.calcite.sql.SqlSelect.validate(SqlSelect.java:210) ~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
at org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression(SqlValidatorImpl.java:837) ~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
at org.apache.calcite.sql.validate.SqlValidatorImpl.validate(SqlValidatorImpl.java:551) ~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
at org.apache.drill.exec.planner.sql.SqlConverter.validate(SqlConverter.java:155) ~[classes/:na]
at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateNode(DefaultSqlHandler.java:596) ~[classes/:na]
at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateAndConvert(DefaultSqlHandler.java:192) ~[classes/:na]
at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan(DefaultSqlHandler.java:164) ~[classes/:na]
at org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:94) ~[classes/:na]
at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:970) [classes/:na]
at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:254) [classes/:na]

So far we could not reproduce it on non-system tables, but any type inference 
on a system table leads to this exception.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-4809) Drill to provide ability to support parameterized conditions

2016-07-26 Thread Yuliya Feldman (JIRA)
Yuliya Feldman created DRILL-4809:
-

 Summary: Drill to provide ability to support parameterized 
conditions
 Key: DRILL-4809
 URL: https://issues.apache.org/jira/browse/DRILL-4809
 Project: Apache Drill
  Issue Type: Improvement
  Components: Execution - Flow, Query Planning & Optimization, SQL 
Parser
Reporter: Yuliya Feldman


Currently Drill does not provide a way to specify variables in the WHERE 
clause, which means that the user has to create a new query to handle each new 
condition.

For example, suppose someone wants to execute the following query:
select id, name from foo where dir0=$1 and dir1=$2

(S)he is unable to do so. Thus, if dir0 and dir1 are created on the fly (by 
day, month, and so on), a new query needs to be created to handle the data in 
the new directories.
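
To illustrate what callers have to do today, here is a minimal sketch of 
client-side substitution of the $1, $2 placeholders from the example above. 
QueryTemplate and bind are hypothetical names, not Drill APIs, and the sketch 
assumes every value is a string literal. It does not skip placeholders that 
appear inside quoted text, so it is only an illustration of the workaround.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Drill: builds a new query string per set of values.
final class QueryTemplate {
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$(\\d+)");

    /** Replaces $1, $2, ... with single-quoted SQL string literals. */
    static String bind(String template, String... values) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            int index = Integer.parseInt(m.group(1));
            if (index < 1 || index > values.length) {
                throw new IllegalArgumentException("No value for $" + index);
            }
            // Copy the text before the placeholder, then the quoted value.
            out.append(template, last, m.start());
            out.append('\'').append(values[index - 1].replace("'", "''")).append('\'');
            last = m.end();
        }
        out.append(template, last, template.length());
        return out.toString();
    }
}
```

Each new pair of directory values still yields a brand new query text, which 
is the overhead this JIRA asks Drill to remove.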








[jira] [Created] (DRILL-1926) Fix back pressure logic

2015-01-04 Thread Yuliya Feldman (JIRA)
Yuliya Feldman created DRILL-1926:
-

 Summary: Fix back pressure logic
 Key: DRILL-1926
 URL: https://issues.apache.org/jira/browse/DRILL-1926
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Reporter: Yuliya Feldman
Assignee: Yuliya Feldman


While enqueueing incoming requests in UnlimitedRawBatchBuffer, replies to the 
sender(s) are queued up only if the size of the queue is equal to the soft 
limit, not when it is >= the soft limit. This means the back pressure works 
only once, while requests keep piling up.
This change also improves the logic of sending responses back to the senders: 
instead of sending them all in one shot, which can create a flood of requests 
again, they are sent in batches based on the difference between the soft limit 
and the queue size.
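
The two points above can be sketched roughly as follows. This is not the code 
of UnlimitedRawBatchBuffer; SoftLimitBuffer and its methods are hypothetical 
names, and batches and replies are modeled as plain strings and Runnables. It 
only shows the ">=" comparison and the release of withheld replies in batches 
sized by the gap between the soft limit and the queue size.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical sketch; not Drill's UnlimitedRawBatchBuffer.
final class SoftLimitBuffer {
    private final int softLimit;
    private final Queue<String> batches = new ArrayDeque<>();       // incoming data batches
    private final Queue<Runnable> pendingAcks = new ArrayDeque<>(); // replies being withheld

    SoftLimitBuffer(int softLimit) {
        this.softLimit = softLimit;
    }

    /** Enqueues a batch; withholds the reply whenever the queue is at or above the soft limit. */
    void enqueue(String batch, Runnable ack) {
        batches.add(batch);
        if (batches.size() >= softLimit) { // ">=" rather than "==", so it applies every time
            pendingAcks.add(ack);
        } else {
            ack.run();
        }
    }

    /** Dequeues a batch, then releases at most (softLimit - queue size) withheld replies. */
    String dequeue() {
        String batch = batches.poll();
        int room = Math.max(0, softLimit - batches.size());
        int toRelease = Math.min(pendingAcks.size(), room);
        for (int i = 0; i < toRelease; i++) {
            pendingAcks.remove().run();
        }
        return batch;
    }

    int pendingAckCount() {
        return pendingAcks.size();
    }
}
```

Releasing only as many replies as there is room below the soft limit keeps the 
senders from refilling the queue in a single burst.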





[jira] [Resolved] (DRILL-1859) IllegalReferenceCountException in the decoder inside Netty

2015-01-15 Thread Yuliya Feldman (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuliya Feldman resolved DRILL-1859.
---
Resolution: Fixed

> IllegalReferenceCountException in the decoder inside Netty
> --
>
> Key: DRILL-1859
> URL: https://issues.apache.org/jira/browse/DRILL-1859
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - RPC
>Affects Versions: 0.7.0
>Reporter: Aman Sinha
>    Assignee: Yuliya Feldman
> Fix For: 0.8.0
>
> Attachments: 0001-DRILL-1926-Fix-for-back-pressure-logic.patch, 
> 0002-DRILL-1859-Issue-with-killing-stopping-operator-proc.patch, 
> decoder_err.txt
>
>
> The following query does a LIMIT inside a subquery to force a UnionExchange 
> and then does an ORDER-BY outside that will first re-distribute the data 
> before sorting.  It results in a DecoderException in netty.
> {code}
> 0: jdbc:drill:zk=local> alter session set `planner.slice_target` = 10;
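> +-------+--------------------------------+
> |  ok   |            summary             |
> +-------+--------------------------------+
> | true  | planner.slice_target updated.  |
> +-------+--------------------------------+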
> 0: jdbc:drill:zk=local> select t2.o_custkey from (select o_orderkey, 
> o_custkey from cp.`tpch/orders.parquet` t1 group by o_orderkey, o_custkey 
> limit 10) t2 order by t2.o_custkey;
> Query failed: Query failed: Failure while running fragment., refCnt: 0, 
> decrement: 1 
> {code}
> Here's partial output from the logs: (will attach full error log).  
> {code}
> io.netty.handler.codec.DecoderException: 
> io.netty.util.IllegalReferenceCountException: refCnt: 0, decrement: 1
> 
> io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:99)
>  [netty-codec-4.0.24.Final.jar:4.0.24.Final]
> 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
>  [netty-transport-4.0.24.Final.jar:4.0.24.Final]
> {code}





[jira] [Created] (DRILL-2209) Save on CPU cycles by adding Project with column that has hash calculated

2015-02-10 Thread Yuliya Feldman (JIRA)
Yuliya Feldman created DRILL-2209:
-

 Summary: Save on CPU cycles by adding Project with column that has 
hash calculated
 Key: DRILL-2209
 URL: https://issues.apache.org/jira/browse/DRILL-2209
 Project: Apache Drill
  Issue Type: Improvement
  Components: Execution - Flow, Query Planning & Optimization
Reporter: Yuliya Feldman
Assignee: Yuliya Feldman


Related to DRILL-133. Wrap HashToRandomExchange and/or LocalExchange with a 
Project operator that adds a column holding the hash of the column(s) we are 
hashing on. This saves CPU cycles by not recalculating the hash every time.
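
A rough sketch of the idea, under simplifying assumptions: rows are plain 
Object arrays, and HashedRow and HashProject are hypothetical names rather 
than Drill operators. The hash of the key columns is computed once in a 
project step and carried along as an extra column, so later partitioning 
steps reuse it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical sketch; HashedRow and HashProject are illustrative, not Drill classes.
final class HashedRow {
    final Object[] values; // original column values
    final int hash;        // extra projected column: hash of the key columns

    HashedRow(Object[] values, int hash) {
        this.values = values;
        this.hash = hash;
    }
}

final class HashProject {
    /** Project step: computes the hash of the key columns once per row. */
    static List<HashedRow> project(List<Object[]> rows, int... keyColumns) {
        List<HashedRow> out = new ArrayList<>(rows.size());
        for (Object[] row : rows) {
            int h = 17;
            for (int col : keyColumns) {
                h = 31 * h + Objects.hashCode(row[col]);
            }
            out.add(new HashedRow(row, h));
        }
        return out;
    }

    /** Exchange step: reuses the stored hash instead of rehashing the key columns. */
    static int partition(HashedRow row, int partitionCount) {
        return Math.floorMod(row.hash, partitionCount);
    }
}
```

Both a local and a remote exchange could call partition on the same HashedRow 
with different partition counts without touching the key columns again.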





[jira] [Created] (DRILL-2210) Allow multithreaded copy and/or flush in PartitionSender

2015-02-10 Thread Yuliya Feldman (JIRA)
Yuliya Feldman created DRILL-2210:
-

 Summary: Allow multithreaded copy and/or flush in PartitionSender
 Key: DRILL-2210
 URL: https://issues.apache.org/jira/browse/DRILL-2210
 Project: Apache Drill
  Issue Type: Improvement
  Components: Execution - Flow
Reporter: Yuliya Feldman
Assignee: Yuliya Feldman


Related to DRILL-133. Because LocalExchange merges data from multiple 
receivers and later fans it out to multiple Senders, the amount of data that 
needs to be sent out increases. Add the ability to copy/flush data in multiple 
threads.
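
One possible shape for this, as a sketch only: each outgoing partition is 
flushed as its own task on a fixed-size thread pool. ParallelFlusher, 
flushAll, and the sender callback are hypothetical names, and outgoing 
batches are modeled as lists of strings rather than Drill record batches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

// Hypothetical sketch; not the actual partition sender implementation.
final class ParallelFlusher {
    /** Flushes every outgoing batch on a thread pool and returns the number of records sent. */
    static int flushAll(List<List<String>> outgoing, int threads, Consumer<List<String>> sender) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (List<String> batch : outgoing) {
                // One task per outgoing partition; the sender must be thread-safe.
                results.add(pool.submit(() -> {
                    sender.accept(batch);
                    return batch.size();
                }));
            }
            int total = 0;
            for (Future<Integer> result : results) {
                total += result.get(); // wait for each flush to finish
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while flushing", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Flush failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```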





[jira] [Created] (DRILL-4132) Ability to submit simple type of physical plan directly to EndPoint DrillBit for execution

2015-11-25 Thread Yuliya Feldman (JIRA)
Yuliya Feldman created DRILL-4132:
-

 Summary: Ability to submit simple type of physical plan directly 
to EndPoint DrillBit for execution
 Key: DRILL-4132
 URL: https://issues.apache.org/jira/browse/DRILL-4132
 Project: Apache Drill
  Issue Type: New Feature
  Components: Execution - Flow, Execution - RPC
Reporter: Yuliya Feldman
Assignee: Yuliya Feldman


Today Drill query execution is optimistic and stateful (at least due to data 
exchanges): if any stage of query execution fails, the whole query fails. 
If a query is just a simple scan, filter push-down, and project, where no data 
exchange happens between DrillBits, there is no need to fail the whole query 
when one DrillBit fails, as the minor fragments running on that DrillBit can 
be rerun on another DrillBit. There are probably multiple ways to achieve 
this. This JIRA is to open discussion on: 
1. agreement that we need to support the above use case 
2. means of achieving it.





[jira] [Created] (DRILL-4412) Have an array of DrillBitEndPoints (at least) for leaf fragments instead of single one

2016-02-17 Thread Yuliya Feldman (JIRA)
Yuliya Feldman created DRILL-4412:
-

 Summary: Have an array of DrillBitEndPoints (at least) for leaf 
fragments instead of single one
 Key: DRILL-4412
 URL: https://issues.apache.org/jira/browse/DRILL-4412
 Project: Apache Drill
  Issue Type: Improvement
  Components: Query Planning & Optimization
Reporter: Yuliya Feldman
Assignee: Yuliya Feldman


To follow up on the ability to submit a simple physical plan directly to a 
DrillBit for execution 
[JIRA-4132|https://issues.apache.org/jira/browse/DRILL-4132], it would be 
beneficial to have an array of DrillBitEndPoint in PlanFragment. Leaf fragments 
that scan the data can have an array of DrillBitEndPoint based on data 
locality: since data may be replicated, a Scan fragment that needs to be 
restarted can be restarted on DrillBits that hold a replica of the data, 
rather than always retrying the same DrillBit.
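
A minimal sketch of the retry behavior described above, assuming endpoints 
are identified by plain strings ordered by data locality. ReplicaRetry and 
runOnFirstAvailable are hypothetical names; this is not Drill's PlanFragment 
or scheduler code.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch; endpoints are plain strings ordered by data locality.
final class ReplicaRetry {
    /** Runs the fragment on the first endpoint that succeeds, trying replicas in order. */
    static <T> T runOnFirstAvailable(List<String> endpoints, Function<String, T> runFragment) {
        RuntimeException last = null;
        for (String endpoint : endpoints) {
            try {
                return runFragment.apply(endpoint);
            } catch (RuntimeException e) {
                last = e; // remember the failure and move on to the next replica
            }
        }
        throw new IllegalStateException("Fragment failed on all endpoints", last);
    }
}
```

With a single endpoint per fragment the loop degenerates to today's behavior, 
where a failure on that DrillBit fails the fragment.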


