[GitHub] weijietong commented on issue #1504: DRILL-6792: Find the right probe side fragment wrapper & fix DrillBuf…
weijietong commented on issue #1504: DRILL-6792: Find the right probe side fragment wrapper & fix DrillBuf… URL: https://github.com/apache/drill/pull/1504#issuecomment-442308688 @sohami @vdiravka ready to merge now. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] ilooner commented on issue #1515: DRILL-6806: Moving code for a HashAgg partition into separate class.
ilooner commented on issue #1515: DRILL-6806: Moving code for a HashAgg partition into separate class. URL: https://github.com/apache/drill/pull/1515#issuecomment-442278894 @vdiravka Got overloaded with some deadlines for another project. I won't be able to finish this for this release.
[jira] [Created] (DRILL-6873) Cluster without dfs throws DATA_READ ERROR file does not exist
Matt Keranen created DRILL-6873:
---
Summary: Cluster without dfs throws DATA_READ ERROR file does not exist
Key: DRILL-6873
URL: https://issues.apache.org/jira/browse/DRILL-6873
Project: Apache Drill
Issue Type: Bug
Components: Storage - JSON
Affects Versions: 1.14.0
Environment: Drill v1.14.0, Zookeeper 3.4.13, Centos 7.5
Reporter: Matt Keranen

Running drillbits on multiple servers with Zookeeper but without HDFS. When file storage is configured to a common path, but not all filenames are present on all nodes, errors are thrown:

Error: DATA_READ ERROR: Failure reading JSON file - File file:/localdata/logs/fileX.json.gz does not exist

Example use case: Querying log files on multiple machines as a ZK cluster from their local filesystems without moving them to a distributed file system which may not be in use.

Is there a (planned) configuration option to simply skip filenames that exist on some but not all nodes?

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: Apache Drill Meetup on Nov 14th!
Thanks again to the speakers Nitin and Aman for fantastic talks! I have updated the meetup with the links to the slides from the presenters as well as a recording of the meetup. Take a look at the Meetup to see the slides and a few pictures of the event.
https://www.meetup.com/Bay-Area-Apache-Drill-User-Group/events/255727785/

Link to slides/recording: https://drive.google.com/drive/folders/10HAyVVUSq8LsFOYG8J8beeIUloG34o_6
Aman's Slideshare Link: https://www.slideshare.net/AmanSinha6/accelerating-sql-queries-in-nosql-databases-using-apache-drill-and-secondary-indexes

Thanks,
Pritesh

On Sun, Nov 4, 2018 at 2:46 PM Pritesh Maker wrote:
> Hello, Drillers!
>
> We are restarting meetups for Apache Drill! The next meet up will be on
> Nov 14th at 6:30 PM at the MapR Headquarters.
>
> We will have two speakers for the meetup
> - Nitin Sharma @ Netflix who will talk about Netflix's Personalization
>   Infrastructure
> - Aman Sinha @ MapR who will talk about a brand new feature in Apache
>   Drill to leverage Secondary Indexes
>
> You can find more details of their proposed talks at the meetup site.
> Please register soon since we have limited seating!
> https://www.meetup.com/Bay-Area-Apache-Drill-User-Group/events/255727785/
>
> We look forward to seeing you there!
>
> Thank you,
> Pritesh
[GitHub] Ben-Zvi closed pull request #1555: DRILL-6864: Upgrade the git-commit-id plugin to 2.2.5
Ben-Zvi closed pull request #1555: DRILL-6864: Upgrade the git-commit-id plugin to 2.2.5 URL: https://github.com/apache/drill/pull/1555

This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic):

diff --git a/pom.xml b/pom.xml
index bd512c4d6f7..6f80581ae57 100644
--- a/pom.xml
+++ b/pom.xml
@@ -527,7 +527,7 @@
           <plugin>
             <groupId>pl.project13.maven</groupId>
             <artifactId>git-commit-id-plugin</artifactId>
-            <version>2.1.9</version>
+            <version>2.2.5</version>
             <executions>
               <execution>
                 <id>for-jars</id>
[jira] [Created] (DRILL-6872) Add support for Join with small build side table between Lateral & Unnest subquery.
Sorabh Hamirwasia created DRILL-6872:
Summary: Add support for Join with small build side table between Lateral & Unnest subquery.
Key: DRILL-6872
URL: https://issues.apache.org/jira/browse/DRILL-6872
Project: Apache Drill
Issue Type: Improvement
Components: Execution - Flow
Reporter: Sorabh Hamirwasia

We want to support Hash Join in a Lateral & Unnest subquery for the special case of a small table on the build side of the Hash Join. In this case the build side is a small table that is scanned once by the Hash Join; then, for each left side batch with EMIT outcome, the same build side information is reused. An example plan looks like this:

{code:java}
      LJ
     /  \
  Scan  HashJoin
         /    \
    Unnest    Scan (Build)
{code}

Here we should also think about the sniffing logic in the Hash Join operator, which can cause a deadlock in this scenario.
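The build-once, probe-many behaviour described in DRILL-6872 can be illustrated with a small sketch. This is not Drill's Hash Join operator: the class `BuildOnceHashJoin`, its methods, and the integer/string row shapes are all hypothetical, chosen only to show a build side that is scanned a single time and then reused for every incoming probe batch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names are Drill classes.
final class BuildOnceHashJoin {
    // Build-side rows indexed by join key, populated exactly once.
    private final Map<Integer, List<String>> buildTable = new HashMap<>();
    // Counts how many times the build side was scanned (stays at 1).
    private int buildScans = 0;

    // Scans the (small) build side once and keeps the resulting hash table.
    BuildOnceHashJoin(int[] buildKeys, String[] buildValues) {
        buildScans++;
        for (int i = 0; i < buildKeys.length; i++) {
            buildTable.computeIfAbsent(buildKeys[i], k -> new ArrayList<>())
                      .add(buildValues[i]);
        }
    }

    // Probes one left-side batch against the retained hash table.
    // Nothing is rebuilt here, so the call can be repeated per batch.
    List<String> probeBatch(List<Integer> probeKeys) {
        List<String> joined = new ArrayList<>();
        for (Integer key : probeKeys) {
            List<String> matches =
                buildTable.getOrDefault(key, Collections.<String>emptyList());
            for (String match : matches) {
                joined.add(key + ":" + match);
            }
        }
        return joined;
    }

    int buildScans() {
        return buildScans;
    }
}
```

Each call to `probeBatch` stands in for one left-side batch arriving with an EMIT outcome. The sketch deliberately leaves out the sniffing logic and the deadlock concern raised in the ticket.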
Re: Hangout Discussion Topics
The link for hangout is: http://meet.google.com/yki-iqdf-tai

Kind regards
Vitalii

On Tue, Nov 27, 2018 at 6:29 AM Boaz Ben-Zvi wrote:
> I can present the list of Performance Projects (this was scheduled
> for the Developers Day two weeks ago, but was set aside for the lack of
> time then).
>
> We can dive deeper into any specific project, or discuss a couple of
> general mechanisms that may be needed (preview: these are "shared
> memory" and "pass planner information to the operators").
>
> Thanks,
>
> Boaz
>
> On 11/26/18 10:29 AM, Vitalii Diravka wrote:
> > Hi All,
> >
> > Does anyone have any topics to discuss during the hangout tomorrow?
> >
> > Kind regards
> > Vitalii
[GitHub] sohami commented on issue #1504: DRILL-6792: Find the right probe side fragment wrapper & fix DrillBuf…
sohami commented on issue #1504: DRILL-6792: Find the right probe side fragment wrapper & fix DrillBuf… URL: https://github.com/apache/drill/pull/1504#issuecomment-442147737 Latest commit looks fine with just 1 minor comment.
[GitHub] sohami commented on a change in pull request #1504: DRILL-6792: Find the right probe side fragment wrapper & fix DrillBuf…
sohami commented on a change in pull request #1504: DRILL-6792: Find the right probe side fragment wrapper & fix DrillBuf… URL: https://github.com/apache/drill/pull/1504#discussion_r236770022

## File path: exec/java-exec/src/main/java/org/apache/drill/exec/work/filter/RuntimeFilterSink.java ##
@@ -17,206 +17,250 @@
  */
 package org.apache.drill.exec.work.filter;

-import org.apache.drill.exec.memory.BufferAllocator;
+import io.netty.buffer.DrillBuf;
+import org.apache.drill.common.exceptions.DrillRuntimeException;
+import org.apache.drill.exec.ops.AccountingDataTunnel;
+import org.apache.drill.exec.ops.Consumer;
+import org.apache.drill.exec.ops.SendingAccountor;
+import org.apache.drill.exec.ops.StatusHandler;
+import org.apache.drill.exec.proto.BitData;
+import org.apache.drill.exec.proto.CoordinationProtos;
+import org.apache.drill.exec.proto.GeneralRPCProtos;
+import org.apache.drill.exec.proto.UserBitShared;
+import org.apache.drill.exec.rpc.RpcException;
+import org.apache.drill.exec.rpc.RpcOutcomeListener;
+import org.apache.drill.exec.rpc.data.DataTunnel;
+import org.apache.drill.exec.server.DrillbitContext;
+import org.apache.drill.shaded.guava.com.google.common.base.Stopwatch;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
+
+import java.io.Closeable;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
 import java.util.concurrent.BlockingQueue;
-import java.util.concurrent.ExecutorService;
-import java.util.concurrent.Future;
 import java.util.concurrent.LinkedBlockingQueue;
+import java.util.concurrent.TimeUnit;
 import java.util.concurrent.atomic.AtomicBoolean;
-import java.util.concurrent.atomic.AtomicInteger;
-import java.util.concurrent.locks.Condition;
-import java.util.concurrent.locks.ReentrantLock;

 /**
  * This sink receives the RuntimeFilters from the netty thread,
- * aggregates them in an async thread, supplies the aggregated
- * one to the fragment running thread.
+ * aggregates them in an async thread, broadcast the final aggregated
+ * one to the RuntimeFilterRecordBatch.
  */
-public class RuntimeFilterSink implements AutoCloseable {
-
-  private AtomicInteger currentBookId = new AtomicInteger(0);
+public class RuntimeFilterSink implements Closeable
+{

-  private int staleBookId = 0;
+  private BlockingQueue rfQueue = new LinkedBlockingQueue<>();

-  /**
-   * RuntimeFilterWritable holding the aggregated version of all the received filter
-   */
-  private RuntimeFilterWritable aggregated = null;
+  private Map joinMjId2rfNumber;

-  private BlockingQueue rfQueue = new LinkedBlockingQueue<>();
+  //HashJoin node's major fragment id to its corresponding probe side nodes's endpoints
+  private Map> joinMjId2probeScanEps = new HashMap<>();

-  /**
-   * Flag used by Minor Fragment thread to indicate it has encountered error
-   */
-  private AtomicBoolean running = new AtomicBoolean(true);
+  //HashJoin node's major fragment id to its corresponding probe side scan node's belonging major fragment id
+  private Map joinMjId2ScanMjId = new HashMap<>();

-  /**
-   * Lock used to synchronize between producer (Netty Thread) and consumer (AsyncAggregateThread) of elements of this
-   * queue. This is needed because in error condition running flag can be consumed by producer and consumer thread at
-   * different times. Whoever sees it first will take this lock and clear all elements and set the queue to null to
-   * indicate producer not to put any new elements in it.
-   */
-  private ReentrantLock queueLock = new ReentrantLock();
+  //HashJoin node's major fragment id to its aggregated RuntimeFilterWritable
+  private Map joinMjId2AggregatedRF = new HashMap<>();
+  //for debug usage
+  private Map joinMjId2Stopwatch = new HashMap<>();

-  private Condition notEmpty = queueLock.newCondition();
+  private DrillbitContext drillbitContext;

-  private ReentrantLock aggregatedRFLock = new ReentrantLock();
+  private SendingAccountor sendingAccountor;

-  private BufferAllocator bufferAllocator;
+  private AsyncAggregateWorker asyncAggregateWorker;

-  private Future future;
+  private AtomicBoolean running = new AtomicBoolean(true);

   private static final Logger logger = LoggerFactory.getLogger(RuntimeFilterSink.class);

-  public RuntimeFilterSink(BufferAllocator bufferAllocator, ExecutorService executorService) {
-    this.bufferAllocator = bufferAllocator;
-    AsyncAggregateWorker asyncAggregateWorker = new AsyncAggregateWorker();
-    future = executorService.submit(asyncAggregateWorker);
+  public RuntimeFilterSink(DrillbitContext drillbitContext, SendingAccountor sendingAccountor)
+  {
+    this.drillbitContext = drillbitContext;
+    this.sendingAccountor = sendingAccountor;
+    asyncAggregateWorker = new AsyncAggregateWorker();
+    drillbitContext.getExecutor().submit(asyncAggregateWorker);
   }

-  public void aggregate(RuntimeFilterWritable runtimeFilterWritable) {
-    if (running.
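The class comment in the diff above describes a sink that receives items on one thread and aggregates them on an asynchronous worker. The general pattern can be sketched as below. This is not the `RuntimeFilterSink` from the pull request: the class `AggregatingSink` is hypothetical, plain `long` bitmasks stand in for runtime filters, and merging by bitwise OR is an assumption made for the example.

```java
import java.io.Closeable;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch; not the RuntimeFilterSink from the pull request.
final class AggregatingSink implements Closeable {
    private final BlockingQueue<Long> queue = new LinkedBlockingQueue<>();
    private final AtomicBoolean running = new AtomicBoolean(true);
    // Assumption: aggregation is a bitwise OR of the received bitmasks.
    private final AtomicLong aggregated = new AtomicLong();
    private final Thread worker;

    AggregatingSink() {
        worker = new Thread(() -> {
            try {
                // Keep draining until close() was called and the queue is empty.
                while (running.get() || !queue.isEmpty()) {
                    Long item = queue.poll(10, TimeUnit.MILLISECONDS);
                    if (item != null) {
                        aggregated.accumulateAndGet(item, (a, b) -> a | b);
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "async-aggregate-worker");
        worker.start();
    }

    // Called by producer threads; returns false once the sink is closed.
    boolean add(long bitmask) {
        if (!running.get()) {
            return false;
        }
        return queue.offer(bitmask);
    }

    long aggregated() {
        return aggregated.get();
    }

    // Stops accepting input, lets the worker drain the queue, then waits for it.
    @Override
    public void close() {
        running.set(false);
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

The sketch uses only a queue and an `AtomicBoolean`, with none of the explicit locks and conditions that the diff removes. A producer that passes the `running` check just as `close()` is called could still enqueue an item after the worker has exited, so a production version would need to handle that race.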
Re: [DISCUSS] 1.15.0 release
Hi all!

Thanks for updating the tickets. A couple of tickets are almost done, and it would be nice to include them in the 1.15 release.

Can be merged:
DRILL-6864 ben-zvi Root POM: Update the git-commit-id plugin
DRILL-6039 vdonapati drillbit.sh graceful_stop does not wait for fragments to complete before stopping the drillbit

Almost ready to commit:
DRILL-6867 le.louch WebUI Query editor cursor position
DRILL-6792 weijie Find the right probe side fragment to any storage plugin
DRILL-6806 timothyfarkas Start moving code for handling a partition in HashAgg into a separate class

I am also planning to include DRILL-6562: Upgrade to SqlLine 1.6.0 and my own work, DRILL-6562: Plugin Management improvements. I expect these tickets to be addressed in the next few days, after which I will start the release process.

Kind regards
Vitalii

On Mon, Nov 26, 2018 at 7:50 PM Vitalii Diravka wrote:
> Hi all!
>
> I found the issue DRILL-6828 [1], which is introduced in DRILL-6381. I
> think it is an important one, since the issue is a degradation and it
> blocks working with HashPartitionSender exchange operator.
> Boaz, since you are assigned to the jira, could you please take a look?
>
> Charles, your "REST metadata" work is merged. Regarding "Syslog plugin" I
> will do review, but as usual process. I mean it should not block the
> release.
>
> Team, these are tickets, which should be updated to make the release in
> time. Please change the release version to 1.16, if it is not a blocker and
> requires additional work.
> Issue key   Assignee       Summary
> DRILL-6845  ben-zvi        Eliminate duplicates for Semi Hash Join
> DRILL-6864  ben-zvi        Root POM: Update the git-commit-id plugin
> DRILL-6867  le.louch       WebUI Query editor cursor position
> DRILL-6849  weijie         Runtime filter queries with nested broadcast returns wrong results
> DRILL-6838  weijie         Query with Runtime Filter fails with IllegalStateException: Memory was leaked by query
> DRILL-6792  weijie         Find the right probe side fragment to any storage plugin
> DRILL-6791  Paul.Rogers    Merge scan projection framework into master
> DRILL-6039  vdonapati      drillbit.sh graceful_stop does not wait for fragments to complete before stopping the drillbit
> DRILL-6806  timothyfarkas  Start moving code for handling a partition in HashAgg into a separate class
> DRILL-6543  ben-zvi        Option for memory mgmt: Reserve allowance for non-buffered
> DRILL-6863  KazydubB       Drop table is not working if path within workspace starts with '/'
> DRILL-6253  timothyfarkas  HashAgg Unit Testing And Refactoring
> DRILL-6032  timothyfarkas  Use RecordBatchSizer to estimate size of columns in HashAgg
> DRILL-6623  karthikm       Drill encounters exception IndexOutOfBoundsException: writerIndex: -8373248 (expected: readerIndex(0) <= writerIndex <= capacity(32768))
> DRILL-6245  vdonapati      Clicking on anything redirects to main login page
>
> [1]
> https://issues.apache.org/jira/browse/DRILL-6828?focusedCommentId=16699152&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16699152
>
> Kind regards
> Vitalii
>
> On Tue, Nov 20, 2018 at 3:06 PM Charles Givre wrote:
>
>> Hi @Vitalii
>> Are you sure? The metadata commit is really pretty minor now that I
>> cleaned it up. If nobody can review the Syslog format plugin until the
>> next version, that’s fine, but I don’t think it should be a big deal to
>> review either.
>> Best,
>> -- C
>>
>> > On Nov 20, 2018, at 05:20, Vitalii Diravka wrote:
>> >
>> > @Charles Your changes require some time to pass review, it is better to
>> > consider them to the next release.
>> > All other mentioned issues are resolved.
>> >
>> > Team, please verify the tickets which you are responsible for and update
>> > them accordingly [1].
>> > If there are no blockers we can consider to include one more batch-commit
>> > to the 1.15.0 Drill release.
>> > Therefore 27.11 can be the final cut-off date.
>> >
>> > [1] https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=185
>> >
>> > Kind regards
>> > Vitalii
>> >
>> > On Wed, Nov 7, 2018 at 2:24 AM Vitalii Diravka wrote:
>> >
>> >> @Charles I think we can include it into 1.15.0 release, it also will
>> >> depend on the review process.
>> >> It looks like it is DRILL-6582 Jira ticket. I have updated "Fix Version/s"
>> >> for it.
>> >>
>> >> @Khurram Vova and me posted the info in the ticket. Please try to use
>> >> PreparedStatement.executeQuery(). If it works, please close the jira.
>> >>
>> >> @Karthik For sure, it should be included to 1.15.0 Drill release.
>> >>
>> >> @Gautam You are right, it will be good to include this issue into current
>> >> release, since Drill SI based on MapR-DB data source and it should work
>> >> properly.
>> >>
>> >> Therefore we have more than one week to reduce the number of tickets from
>> >> [1]
>> >>
>> >> [1]
>> >> https://issues.apache.org/jira/issues/?jql=project%20%3D%20DRILL%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened%2C%20R
[GitHub] vdiravka commented on issue #1504: DRILL-6792: Find the right probe side fragment wrapper & fix DrillBuf…
vdiravka commented on issue #1504: DRILL-6792: Find the right probe side fragment wrapper & fix DrillBuf… URL: https://github.com/apache/drill/pull/1504#issuecomment-442070498 @sohami Is the newly added commit fine? If so, @weijietong, please squash the commits.
[GitHub] vdiravka commented on issue #1555: DRILL-6864: Upgrade the git-commit-id plugin to 2.2.5
vdiravka commented on issue #1555: DRILL-6864: Upgrade the git-commit-id plugin to 2.2.5 URL: https://github.com/apache/drill/pull/1555#issuecomment-442022887 +1 LGTM `jackson.databind` is specified in the `dependencyManagement` block, therefore version 2.9.5 is used. It can be updated in the future if necessary.
[GitHub] vdiravka edited a comment on issue #1555: DRILL-6864: Upgrade the git-commit-id plugin to 2.2.5
vdiravka edited a comment on issue #1555: DRILL-6864: Upgrade the git-commit-id plugin to 2.2.5 URL: https://github.com/apache/drill/pull/1555#issuecomment-442022887 +1 LGTM `jackson.databind` is specified in the `dependencyManagement` block, therefore version 2.9.5 is used by Maven. It can be updated in the future if necessary.