[GitHub] weijietong commented on issue #1504: DRILL-6792: Find the right probe side fragment wrapper & fix DrillBuf…

2018-11-27 Thread GitBox
weijietong commented on issue #1504: DRILL-6792: Find the right probe side 
fragment wrapper & fix DrillBuf…
URL: https://github.com/apache/drill/pull/1504#issuecomment-442308688
 
 
   @sohami @vdiravka ready to merge now.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ilooner commented on issue #1515: DRILL-6806: Moving code for a HashAgg partition into separate class.

2018-11-27 Thread GitBox
ilooner commented on issue #1515: DRILL-6806: Moving code for a HashAgg 
partition into separate class.
URL: https://github.com/apache/drill/pull/1515#issuecomment-442278894
 
 
   @vdiravka Got overloaded with some deadlines for another project. I won't be 
able to finish this for this release.




[jira] [Created] (DRILL-6873) Cluster without dfs throws DATA_READ ERROR file does not exist

2018-11-27 Thread Matt Keranen (JIRA)
Matt Keranen created DRILL-6873:
---

 Summary: Cluster without dfs throws DATA_READ ERROR file does not 
exist
 Key: DRILL-6873
 URL: https://issues.apache.org/jira/browse/DRILL-6873
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - JSON
Affects Versions: 1.14.0
 Environment: Drill v1.14.0

Zookeeper 3.4.13

Centos 7.5

 
Reporter: Matt Keranen


Running drillbits on multiple servers with Zookeeper but without HDFS. When 
file storage is configured to a common path, but not all filenames are present 
on all nodes, errors are thrown:

    Error: DATA_READ ERROR: Failure reading JSON file - File 
file:/localdata/logs/fileX.json.gz does not exist

Example use case: Querying log files on multiple machines as a ZK cluster from 
their local filesystems without moving them to a distributed file system which 
may not be in use.

Is there a (planned) configuration option to simply skip filenames that exist 
on some but not all nodes?
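
A minimal sketch of the behavior being asked about, assuming each node simply
drops candidate paths that are absent from its local filesystem before
reading. This is not an existing Drill option or API; LocalFileFilter and
existingOnly are hypothetical names used only for illustration.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Drill: keeps only the paths that
// exist on this node's local filesystem and silently skips the rest.
final class LocalFileFilter {
  static List<Path> existingOnly(List<Path> candidates) {
    List<Path> present = new ArrayList<>();
    for (Path candidate : candidates) {
      if (Files.exists(candidate)) {
        present.add(candidate);
      }
    }
    return present;
  }
}
```

With such a filter, a node holding only some of the files would read just
those instead of failing with DATA_READ ERROR. Whether silently skipping
files is acceptable depends on the use case, since results then vary with
which node reads which path.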



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Apache Drill Meetup on Nov 14th!

2018-11-27 Thread Pritesh Maker
Thanks again to the speakers Nitin and Aman for fantastic talks! I have
updated the meetup with the links to the slides from the presenters as well
as a recording of the meetup.

Take a look at the Meetup to see the slides and a few pictures of the event.
https://www.meetup.com/Bay-Area-Apache-Drill-User-Group/events/255727785/

Link to slides/recording:
https://drive.google.com/drive/folders/10HAyVVUSq8LsFOYG8J8beeIUloG34o_6

Aman's Slideshare Link:
https://www.slideshare.net/AmanSinha6/accelerating-sql-queries-in-nosql-databases-using-apache-drill-and-secondary-indexes


Thanks,
Pritesh

On Sun, Nov 4, 2018 at 2:46 PM Pritesh Maker wrote:

> Hello, Drillers!
>
> We are restarting meetups for Apache Drill! The next meet up will be on
> Nov 14th at 6:30 PM at the MapR Headquarters.
>
> We will have two speakers for the meetup
> - Nitin Sharma @ Netflix who will talk about Netflix's Personalization
> Infrastructure
> - Aman Sinha @ MapR who will talk about a brand new feature in Apache
> Drill to leverage Secondary Indexes
>
> You can find more details of their proposed talks at the meetup site.
> Please register soon since we have limited seating!
> https://www.meetup.com/Bay-Area-Apache-Drill-User-Group/events/255727785/
>
> We look forward to seeing you there!
>
> Thank you,
> Pritesh
>


[GitHub] Ben-Zvi closed pull request #1555: DRILL-6864: Upgrade the git-commit-id plugin to 2.2.5

2018-11-27 Thread GitBox
Ben-Zvi closed pull request #1555: DRILL-6864: Upgrade the git-commit-id plugin 
to 2.2.5
URL: https://github.com/apache/drill/pull/1555
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

diff --git a/pom.xml b/pom.xml
index bd512c4d6f7..6f80581ae57 100644
--- a/pom.xml
+++ b/pom.xml
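@@ -527,7 +527,7 @@
         <plugin>
           <groupId>pl.project13.maven</groupId>
           <artifactId>git-commit-id-plugin</artifactId>
-          <version>2.1.9</version>
+          <version>2.2.5</version>
           <executions>
             <execution>
               <id>for-jars</id>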


 




[jira] [Created] (DRILL-6872) Add support for Join with small build side table between Lateral & Unnest subquery.

2018-11-27 Thread Sorabh Hamirwasia (JIRA)
Sorabh Hamirwasia created DRILL-6872:


 Summary: Add support for Join with small build side table between 
Lateral & Unnest subquery.
 Key: DRILL-6872
 URL: https://issues.apache.org/jira/browse/DRILL-6872
 Project: Apache Drill
  Issue Type: Improvement
  Components: Execution - Flow
Reporter: Sorabh Hamirwasia


We want to support Hash Join in a Lateral & Unnest subquery for the special 
case of a small table on the build side of the Hash Join. In this case the 
build side is a small table that is scanned once by the Hash Join; then, for 
each left-side batch with an EMIT outcome, the same build-side information is 
reused. An example plan looks like this:
{code:java}
          LJ
         /  \
     Scan    HashJoin
             /      \
       Unnest        Scan (Build)
{code}
 
We should also consider the sniffing logic in the Hash Join operator, which 
can cause a deadlock in this scenario.
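
To illustrate the build-once, probe-many idea described above, here is a
minimal Java sketch. It is not Drill's HashJoin operator: BuildOnceProbeMany,
its constructor arguments, and probe are hypothetical names, and rows are
reduced to an integer key plus a string value. The hash table is built from
the small side a single time, and every later left-side batch (for example,
each batch arriving with an EMIT outcome) probes the same table.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not Drill's HashJoin implementation.
final class BuildOnceProbeMany {
  // Join key -> all build-side values with that key.
  private final Map<Integer, List<String>> buildTable = new HashMap<>();

  // The small build side is consumed exactly once, here.
  BuildOnceProbeMany(int[] buildKeys, String[] buildValues) {
    for (int i = 0; i < buildKeys.length; i++) {
      buildTable.computeIfAbsent(buildKeys[i], k -> new ArrayList<>())
          .add(buildValues[i]);
    }
  }

  // Called once per left-side batch; reuses the table built above.
  List<String> probe(int[] probeKeys) {
    List<String> joined = new ArrayList<>();
    for (int key : probeKeys) {
      for (String value : buildTable.getOrDefault(key, Collections.emptyList())) {
        joined.add(key + ":" + value);
      }
    }
    return joined;
  }
}
```

Each call to probe stands in for one left-side batch. The sketch does not
model the sniffing logic or the deadlock concern raised above.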





Re: Hangout Discussion Topics

2018-11-27 Thread Vitalii Diravka
The link for the hangout is:
http://meet.google.com/yki-iqdf-tai

Kind regards
Vitalii


On Tue, Nov 27, 2018 at 6:29 AM Boaz Ben-Zvi wrote:

>  I can present the list of Performance Projects (this was scheduled
> for the Developers Day two weeks ago, but was set aside for the lack of
> time then).
>
> We can dive deeper into any specific project, or discuss a couple of
> general mechanisms that may be needed (preview: these are "shared
> memory" and "pass planner information to the operators").
>
>Thanks,
>
> Boaz
>
> On 11/26/18 10:29 AM, Vitalii Diravka wrote:
> > Hi All,
> >
> > Does anyone have any topics to discuss during the hangout tomorrow?
> >
> > Kind regards
> > Vitalii
> >
>
>


[GitHub] sohami commented on issue #1504: DRILL-6792: Find the right probe side fragment wrapper & fix DrillBuf…

2018-11-27 Thread GitBox
sohami commented on issue #1504: DRILL-6792: Find the right probe side fragment 
wrapper & fix DrillBuf…
URL: https://github.com/apache/drill/pull/1504#issuecomment-442147737
 
 
   Latest commit looks fine with just 1 minor comment.




[GitHub] sohami commented on a change in pull request #1504: DRILL-6792: Find the right probe side fragment wrapper & fix DrillBuf…

2018-11-27 Thread GitBox
sohami commented on a change in pull request #1504: DRILL-6792: Find the right 
probe side fragment wrapper & fix DrillBuf…
URL: https://github.com/apache/drill/pull/1504#discussion_r236770022
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/filter/RuntimeFilterSink.java
 ##
 @@ -17,206 +17,250 @@
  */
 package org.apache.drill.exec.work.filter;
 
-import org.apache.drill.exec.memory.BufferAllocator;
+import io.netty.buffer.DrillBuf;
+import org.apache.drill.common.exceptions.DrillRuntimeException;
+import org.apache.drill.exec.ops.AccountingDataTunnel;
+import org.apache.drill.exec.ops.Consumer;
+import org.apache.drill.exec.ops.SendingAccountor;
+import org.apache.drill.exec.ops.StatusHandler;
+import org.apache.drill.exec.proto.BitData;
+import org.apache.drill.exec.proto.CoordinationProtos;
+import org.apache.drill.exec.proto.GeneralRPCProtos;
+import org.apache.drill.exec.proto.UserBitShared;
+import org.apache.drill.exec.rpc.RpcException;
+import org.apache.drill.exec.rpc.RpcOutcomeListener;
+import org.apache.drill.exec.rpc.data.DataTunnel;
+import org.apache.drill.exec.server.DrillbitContext;
+import org.apache.drill.shaded.guava.com.google.common.base.Stopwatch;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
+
+import java.io.Closeable;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
 import java.util.concurrent.BlockingQueue;
-import java.util.concurrent.ExecutorService;
-import java.util.concurrent.Future;
 import java.util.concurrent.LinkedBlockingQueue;
+import java.util.concurrent.TimeUnit;
 import java.util.concurrent.atomic.AtomicBoolean;
-import java.util.concurrent.atomic.AtomicInteger;
-import java.util.concurrent.locks.Condition;
-import java.util.concurrent.locks.ReentrantLock;
 
 /**
  * This sink receives the RuntimeFilters from the netty thread,
- * aggregates them in an async thread, supplies the aggregated
- * one to the fragment running thread.
+ * aggregates them in an async thread, broadcast the final aggregated
+ * one to the RuntimeFilterRecordBatch.
  */
-public class RuntimeFilterSink implements AutoCloseable {
-
-  private AtomicInteger currentBookId = new AtomicInteger(0);
+public class RuntimeFilterSink implements Closeable
+{
 
-  private int staleBookId = 0;
+  private BlockingQueue rfQueue = new 
LinkedBlockingQueue<>();
 
-  /**
-   * RuntimeFilterWritable holding the aggregated version of all the received 
filter
-   */
-  private RuntimeFilterWritable aggregated = null;
+  private Map joinMjId2rfNumber;
 
-  private BlockingQueue rfQueue = new 
LinkedBlockingQueue<>();
+  //HashJoin node's major fragment id to its corresponding probe side nodes's 
endpoints
+  private Map> 
joinMjId2probeScanEps = new HashMap<>();
 
-  /**
-   * Flag used by Minor Fragment thread to indicate it has encountered error
-   */
-  private AtomicBoolean running = new AtomicBoolean(true);
+  //HashJoin node's major fragment id to its corresponding probe side scan 
node's belonging major fragment id
+  private Map joinMjId2ScanMjId = new HashMap<>();
 
-  /**
-   * Lock used to synchronize between producer (Netty Thread) and consumer 
(AsyncAggregateThread) of elements of this
-   * queue. This is needed because in error condition running flag can be 
consumed by producer and consumer thread at
-   * different times. Whoever sees it first will take this lock and clear all 
elements and set the queue to null to
-   * indicate producer not to put any new elements in it.
-   */
-  private ReentrantLock queueLock = new ReentrantLock();
+  //HashJoin node's major fragment id to its aggregated RuntimeFilterWritable
+  private Map joinMjId2AggregatedRF = new 
HashMap<>();
+  //for debug usage
+  private Map joinMjId2Stopwatch = new HashMap<>();
 
-  private Condition notEmpty = queueLock.newCondition();
+  private DrillbitContext drillbitContext;
 
-  private ReentrantLock aggregatedRFLock = new ReentrantLock();
+  private SendingAccountor sendingAccountor;
 
-  private BufferAllocator bufferAllocator;
+  private  AsyncAggregateWorker asyncAggregateWorker;
 
-  private Future future;
+  private AtomicBoolean running = new AtomicBoolean(true);
 
   private static final Logger logger = 
LoggerFactory.getLogger(RuntimeFilterSink.class);
 
 
-  public RuntimeFilterSink(BufferAllocator bufferAllocator, ExecutorService 
executorService) {
-this.bufferAllocator = bufferAllocator;
-AsyncAggregateWorker asyncAggregateWorker = new AsyncAggregateWorker();
-future = executorService.submit(asyncAggregateWorker);
+  public RuntimeFilterSink(DrillbitContext drillbitContext, SendingAccountor 
sendingAccountor)
+  {
+this.drillbitContext = drillbitContext;
+this.sendingAccountor = sendingAccountor;
+asyncAggregateWorker = new AsyncAggregateWorker();
+drillbitContext.getExecutor().submit(asyncAggregateWorker);
   }
 
-  public void aggregate(RuntimeFilterWritable runtimeFilterWritable) {
-if (running.

Re: [DISCUSS] 1.15.0 release

2018-11-27 Thread Vitalii Diravka
Hi all!
Thanks for updating tickets.

A couple of tickets are almost done, and it would be nice to include them in
the 1.15 release.
Can be merged:
DRILL-6864 ben-zvi Root POM: Update the git-commit-id plugin
DRILL-6039 vdonapati drillbit.sh graceful_stop does not wait for fragments
to complete before stopping the drillbit
Almost ready to commit:
DRILL-6867 le.louch WebUI Query editor cursor position
DRILL-6792 weijie Find the right probe side fragment to any storage plugin
DRILL-6806 timothyfarkas Start moving code for handling a partition in
HashAgg into a separate class
I am also planning to include DRILL-6562: Upgrade to SqlLine 1.6.0 and my
work - DRILL-6562: Plugin Management improvements.
I expect these tickets to be addressed in the next few days; then I will
start the release process.

Kind regards
Vitalii


On Mon, Nov 26, 2018 at 7:50 PM Vitalii Diravka wrote:

> Hi all!
>
> I found the issue DRILL-6828 [1], which is introduced in DRILL-6381. I
> think it is an important one, since the issue is a degradation and it
> blocks working with HashPartitionSender exchange operator.
> Boaz, since you are assigned to the jira, could you please take a look?
>
> Charles, your "REST metadata" work is merged. Regarding "Syslog plugin" I
> will do review, but as usual process. I mean it should not block the
> release.
>
> Team, these are tickets, which should be updated to make the release in
> time. Please change the release version to 1.16, if it is not a blocker and
> requires additional work.
> Issue key Assignee Summary
> DRILL-6845 ben-zvi Eliminate duplicates for Semi Hash Join
> DRILL-6864 ben-zvi Root POM: Update the git-commit-id plugin
> DRILL-6867 le.louch WebUI Query editor cursor position
> DRILL-6849 weijie Runtime filter queries with nested broadcast returns
> wrong results
> DRILL-6838 weijie Query with Runtime Filter fails with
> IllegalStateException: Memory was leaked by query
> DRILL-6792 weijie Find the right probe side fragment to any storage plugin
> DRILL-6791 Paul.Rogers Merge scan projection framework into master
> DRILL-6039 vdonapati drillbit.sh graceful_stop does not wait for
> fragments to complete before stopping the drillbit
> DRILL-6806 timothyfarkas Start moving code for handling a partition in
> HashAgg into a separate class
> DRILL-6543 ben-zvi Option for memory mgmt: Reserve allowance for
> non-buffered
> DRILL-6863 KazydubB Drop table is not working if path within workspace
> starts with '/'
> DRILL-6253 timothyfarkas HashAgg Unit Testing And Refactoring
> DRILL-6032 timothyfarkas Use RecordBatchSizer to estimate size of columns
> in HashAgg
> DRILL-6623 karthikm Drill encounters exception IndexOutOfBoundsException:
> writerIndex: -8373248 (expected: readerIndex(0) <= writerIndex <=
> capacity(32768))
> DRILL-6245 vdonapati Clicking on anything redirects to main login page
> [1]
> https://issues.apache.org/jira/browse/DRILL-6828?focusedCommentId=16699152&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16699152
>
>
> Kind regards
> Vitalii
>
>
> On Tue, Nov 20, 2018 at 3:06 PM Charles Givre  wrote:
>
>> Hi @Vitalii
>> Are you sure?  The metadata commit is really pretty minor now that I
>> cleaned it up.  If nobody can review the Syslog format plugin until the
>> next version, that’s fine, but I don’t think it should be a big deal to
>> review either.
>> Best,
>> — C
>>
>> > On Nov 20, 2018, at 05:20, Vitalii Diravka  wrote:
>> >
>> > @Charles Your changes require some time to pass review, it is better to
>> > consider them to the next release.
>> > All other mentioned issues are resolved.
>> >
>> > Team, please verify the tickets which you are responsible for and update
>> > them accordingly [1].
>> > If there are no blockers we can consider to include one more
>> batch-commit
>> > to the 1.15.0 Drill release.
>> > Therefore 27.11 can be the final cut-off date.
>> >
>> > [1] https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=185
>> >
>> > Kind regards
>> > Vitalii
>> >
>> >
>> > On Wed, Nov 7, 2018 at 2:24 AM Vitalii Diravka 
>> wrote:
>> >
>> >> @Charles I think we can include it into 1.15.0 release, it also will
>> >> depend on the review process.
>> >> It looks like it is DRILL-6582 Jira ticket. I have updated "Fix
>> Version/s"
>> >> for it.
>> >>
>> >> @Khurram Vova and I posted the info in the ticket. Please try to use
>> >> *PreparedStatement.executeQuery()*. If it works, please close the jira.
>> >>
>> >> @Karthik For sure, it should be included to 1.15.0 Drill release.
>> >>
>> >> @Gautam You are right, it will be good to include this issue into
>> current
>> >> release,
>> >> since Drill SI based on MapR-DB data source and it should work
>> properly.
>> >>
>> >> Therefore we have more than one week to reduce the number of tickets
>> from
>> >> [1]
>> >>
>> >> [1]
>> >>
>> https://issues.apache.org/jira/issues/?jql=project%20%3D%20DRILL%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened%2C%20R

[GitHub] vdiravka commented on issue #1504: DRILL-6792: Find the right probe side fragment wrapper & fix DrillBuf…

2018-11-27 Thread GitBox
vdiravka commented on issue #1504: DRILL-6792: Find the right probe side 
fragment wrapper & fix DrillBuf…
URL: https://github.com/apache/drill/pull/1504#issuecomment-442070498
 
 
   @sohami Is the newly added commit fine? If so, @weijietong, please squash 
the commits.




[GitHub] vdiravka commented on issue #1555: DRILL-6864: Upgrade the git-commit-id plugin to 2.2.5

2018-11-27 Thread GitBox
vdiravka commented on issue #1555: DRILL-6864: Upgrade the git-commit-id plugin 
to 2.2.5
URL: https://github.com/apache/drill/pull/1555#issuecomment-442022887
 
 
   +1 LGTM
   `jackson.databind` is specified in the `dependencyManagement` block, so 
version 2.9.5 is used. It can be updated in the future if necessary.




[GitHub] vdiravka edited a comment on issue #1555: DRILL-6864: Upgrade the git-commit-id plugin to 2.2.5

2018-11-27 Thread GitBox
vdiravka edited a comment on issue #1555: DRILL-6864: Upgrade the git-commit-id 
plugin to 2.2.5
URL: https://github.com/apache/drill/pull/1555#issuecomment-442022887
 
 
   +1 LGTM
   `jackson.databind` is specified in the `dependencyManagement` block, so 
version 2.9.5 is used by Maven. It can be updated in the future if 
necessary.

