Build failure on 0.13.SNAPSHOT

2018-07-18 Thread Dongjin Lee
Hello. I am trying to build Druid, but the build fails. My environment is as
follows:

- CPU: Intel(R) Core(TM) i7-7560U CPU @ 2.40GHz
- RAM: 7704 MB
- OS: Ubuntu 18.04
- JDK: openjdk version "1.8.0_171" (default configuration, with MaxHeapSize
= 1928 MB)
- Branch: master (commit: cd8ea3d)

The error message I got is:

> [INFO]
> [INFO] Reactor Summary:
> [INFO]
> [INFO] io.druid:druid . SUCCESS [ 50.258 s]
> [INFO] java-util .. SUCCESS [03:57 min]
> [INFO] druid-api .. SUCCESS [ 22.694 s]
> [INFO] druid-common ... SUCCESS [ 14.083 s]
> [INFO] druid-hll .. SUCCESS [ 17.126 s]
> [INFO] extendedset  SUCCESS [ 10.856 s]
> *[INFO] druid-processing ... FAILURE [04:36 min]*
> [INFO] druid-aws-common ... SKIPPED
> [INFO] druid-server ... SKIPPED
> [INFO] druid-examples . SKIPPED
> ...
> [INFO]
> [INFO] BUILD FAILURE
> [INFO]
> [INFO] Total time: 10:29 min
> [INFO] Finished at: 2018-07-19T13:23:31+09:00
> [INFO] Final Memory: 88M/777M
> [INFO]
>
> *[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.19.1:test (default-test) on project druid-processing: Execution default-test of goal org.apache.maven.plugins:maven-surefire-plugin:2.19.1:test failed: The forked VM terminated without properly saying goodbye. VM crash or System.exit called?*
> [ERROR] Command was /bin/sh -c cd /home/djlee/workspace/java/druid/processing && /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -Xmx3000m -Duser.language=en -Duser.country=US -Dfile.encoding=UTF-8 -Duser.timezone=UTC -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager -Ddruid.indexing.doubleStorage=double -jar /home/djlee/workspace/java/druid/processing/target/surefire/surefirebooter1075382243904099051.jar /home/djlee/workspace/java/druid/processing/target/surefire/surefire559351134757209tmp /home/djlee/workspace/java/druid/processing/target/surefire/surefire_5173894389718744688tmp


It seems to fail while running the tests for the `druid-processing` module, but
I'm not certain. Can anyone give me some hints? Thanks in
advance.

Best,
Dongjin

-- 
*Dongjin Lee*

*A hitchhiker in the mathematical world.*

*github:  github.com/dongjinleekr
linkedin: kr.linkedin.com/in/dongjinleekr
slideshare:
www.slideshare.net/dongjinleekr
*


Subscription Request

2018-07-18 Thread Dongjin Lee
-- 
*Dongjin Lee*

*A hitchhiker in the mathematical world.*

*github:  github.com/dongjinleekr
linkedin: kr.linkedin.com/in/dongjinleekr
slideshare:
www.slideshare.net/dongjinleekr
*


[GitHub] jihoonson opened a new pull request #6022: Log the full stack trace when an HTTP request fails

2018-07-18 Thread GitBox
jihoonson opened a new pull request #6022: Log the full stack trace when an 
HTTP request fails
URL: https://github.com/apache/incubator-druid/pull/6022
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



[GitHub] drcrallen commented on issue #6014: Optionally refuse to consume new data until the prior chunk is being consumed

2018-07-18 Thread GitBox
drcrallen commented on issue #6014: Optionally refuse to consume new data until 
the prior chunk is being consumed
URL: https://github.com/apache/incubator-druid/pull/6014#issuecomment-40659
 
 
   I refactored things a bit and am going to try out a slightly modified direct 
druid client which only uses the HttpClient code to feed in InputStreams, 
rather than trying to wait for an entire result to succeed or fail.





[GitHub] leventov commented on issue #3236: gitter community channel?

2018-07-18 Thread GitBox
leventov commented on issue #3236: gitter community channel?
URL: 
https://github.com/apache/incubator-druid/issues/3236#issuecomment-406110033
 
 
   @gianm I'm against a Gitter channel for the same reasons as you mentioned, 
plus I dislike any chats for several more reasons (chats are time sinks; 
promote shallow thinking; unstructured, it's hard to follow conversations 
there; not indexed by search engines; etc.)
   
   There are plenty of mediums where somebody could ask questions: Github 
issues, mailing list, 
[StackOverflow](https://stackoverflow.com/questions/tagged/druid).





[GitHub] jihoonson opened a new issue #6021: NPE in KafkaSupervisor.checkpointTaskGroup

2018-07-18 Thread GitBox
jihoonson opened a new issue #6021: NPE in KafkaSupervisor.checkpointTaskGroup
URL: https://github.com/apache/incubator-druid/issues/6021
 
 
   ```
   2018-07-18T23:41:26,739 ERROR [KafkaSupervisor-] io.druid.indexing.kafka.supervisor.KafkaSupervisor - KafkaSupervisor[] failed to handle notice: {class=io.druid.indexing.kafka.supervisor.KafkaSupervisor, exceptionType=class java.lang.NullPointerException, exceptionMessage=null, noticeClass=GracefulShutdownNotice}
   java.lang.NullPointerException
       at io.druid.indexing.kafka.supervisor.KafkaSupervisor.checkpointTaskGroup(KafkaSupervisor.java:1434) ~[druid-kafka-indexing-service-0.12.1-iap8.jar:0.12.1-iap8]
       at io.druid.indexing.kafka.supervisor.KafkaSupervisor.checkTaskDuration(KafkaSupervisor.java:1382) ~[druid-kafka-indexing-service-0.12.1-iap8.jar:0.12.1-iap8]
       at io.druid.indexing.kafka.supervisor.KafkaSupervisor.gracefulShutdownInternal(KafkaSupervisor.java:813) ~[druid-kafka-indexing-service-0.12.1-iap8.jar:0.12.1-iap8]
       at io.druid.indexing.kafka.supervisor.KafkaSupervisor$GracefulShutdownNotice.handle(KafkaSupervisor.java:584) ~[druid-kafka-indexing-service-0.12.1-iap8.jar:0.12.1-iap8]
       at io.druid.indexing.kafka.supervisor.KafkaSupervisor$2.run(KafkaSupervisor.java:367) [druid-kafka-indexing-service-0.12.1-iap8.jar:0.12.1-iap8]
       at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_163]
       at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_163]
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_163]
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_163]
       at java.lang.Thread.run(Thread.java:748) [?:1.8.0_163]
   ```
   
   `0.12.1-iap8` is available at 
https://github.com/implydata/druid/tree/druid-0.12.1-iap8, and is very similar to 
https://github.com/apache/incubator-druid/tree/0.12.2.





[GitHub] jihoonson opened a new issue #6020: Race in taskMaster when the overlord becomes the leader

2018-07-18 Thread GitBox
jihoonson opened a new issue #6020: Race in taskMaster when the overlord 
becomes the leader
URL: https://github.com/apache/incubator-druid/issues/6020
 
 
   `TaskMaster` exposes getters for fields (`taskRunner`, `taskQueue`, etc.) 
that are initialized only when the overlord becomes the leader. The getters 
look like this:
   
   ```java
   public Optional<TaskRunner> getTaskRunner()
   {
     if (overlordLeaderSelector.isLeader()) {
       return Optional.of(taskRunner);
     } else {
       return Optional.absent();
     }
   }
   ```
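
As a hedged illustration of how a getter of this shape behaves while the leader flag is already set but the field is not yet assigned, here is a minimal self-contained sketch. `LeaderGatedHolder`, `markLeader()`, and the `String` runner are hypothetical stand-ins, and `java.util.Optional` is used in place of Guava's `Optional` (both reject null in `of()`). The null-tolerant variant is only one possible guard, not necessarily the fix for this issue.

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for TaskMaster; this is not the Druid class.
final class LeaderGatedHolder
{
  private final AtomicBoolean leader = new AtomicBoolean(false);

  // Stays null until becomeLeader() has run.
  private volatile String taskRunner;

  // Models the leader selector reporting leadership before the listener runs.
  void markLeader()
  {
    leader.set(true);
  }

  // Models Listener.becomeLeader(): the field is assigned only here.
  void becomeLeader()
  {
    taskRunner = "runner";
  }

  // Same shape as the getter above: Optional.of(null) throws an NPE.
  Optional<String> getTaskRunnerUnchecked()
  {
    return leader.get() ? Optional.of(taskRunner) : Optional.empty();
  }

  // Null-tolerant variant: "leader but not yet initialized" maps to empty.
  Optional<String> getTaskRunner()
  {
    return leader.get() ? Optional.ofNullable(taskRunner) : Optional.empty();
  }
}
```

In this sketch, a caller arriving between `markLeader()` and `becomeLeader()` gets an empty `Optional` from `getTaskRunner()`, while `getTaskRunnerUnchecked()` throws.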
   
   However, `taskRunner` is initialized in 
`DruidLeaderSelector.Listener.becomeLeader()`, which is called after the 
overlord becomes the leader. A request that arrives in between sees 
`isLeader()` return true while `taskRunner` is still null, and thus 
`Optional.of()` throws an NPE. The full stack trace is as follows:
   
   ```
   java.lang.NullPointerException
       at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:213) ~[guava-16.0.1.jar:?]
       at com.google.common.base.Optional.of(Optional.java:85) ~[guava-16.0.1.jar:?]
       at io.druid.indexing.overlord.TaskMaster.getTaskRunner(TaskMaster.java:214) ~[druid-indexing-service-0.12.1-iap8.jar:0.12.1-iap8]
       at io.druid.indexing.overlord.http.OverlordResource.getWorkers(OverlordResource.java:810) ~[druid-indexing-service-0.12.1-iap8.jar:0.12.1-iap8]
       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_163]
       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_163]
       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_163]
       at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_163]
       at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60) ~[jersey-server-1.19.3.jar:1.19.3]
       at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:205) ~[jersey-server-1.19.3.jar:1.19.3]
       at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75) ~[jersey-server-1.19.3.jar:1.19.3]
       at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:302) ~[jersey-server-1.19.3.jar:1.19.3]
       at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147) ~[jersey-server-1.19.3.jar:1.19.3]
       at com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108) ~[jersey-server-1.19.3.jar:1.19.3]
       at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147) ~[jersey-server-1.19.3.jar:1.19.3]
       at com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84) ~[jersey-server-1.19.3.jar:1.19.3]
       at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1542) ~[jersey-server-1.19.3.jar:1.19.3]
       at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1473) [jersey-server-1.19.3.jar:1.19.3]
       at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1419) [jersey-server-1.19.3.jar:1.19.3]
       at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1409) [jersey-server-1.19.3.jar:1.19.3]
       at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:409) [jersey-servlet-1.19.3.jar:1.19.3]
       at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:558) [jersey-servlet-1.19.3.jar:1.19.3]
       at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:733) [jersey-servlet-1.19.3.jar:1.19.3]
       at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) [javax.servlet-api-3.1.0.jar:3.1.0]
       at com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:286) [guice-servlet-4.1.0.jar:?]
       at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:276) [guice-servlet-4.1.0.jar:?]
       at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:181) [guice-servlet-4.1.0.jar:?]
       at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91) [guice-servlet-4.1.0.jar:?]
       at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:85) [guice-servlet-4.1.0.jar:?]
       at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:120) [guice-servlet-4.1.0.jar:?]
       at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:135) [gui

Re: Late June/July podling reports

2018-07-18 Thread Jonathan Wei
Thanks for the info, Julian/Taylor.

> It seems more useful to write about the current state today.

> Since Druid has missed those reports, it’s due to report next month. I’d
> recommend merging them into one, and make sure they get posted on time.

Sounds good, I'll submit a combined report covering June/July/August for
the upcoming August report date then.

- Jon


On Wed, Jul 18, 2018 at 3:22 PM, P. Taylor Goetz  wrote:

> No need to panic, this is not the end of the world. I’ll have to check, but
> reminders may not have gone out this month.
>
> If that’s the case, I’ll mention it in my sign-off for the next report so
> the IPMC is aware of what happened.
>
> -Taylor
>
> > On Jul 18, 2018, at 5:45 PM, Gian Merlino  wrote:
> >
> > OMG! I don't see a report reminder for July. Is that not something that is
> > happening anymore? I was relying on getting one of those…
> >
> > IMO, there is no reason to write the older June report. We missed it, and
> > that is sad, but it is probably not super interesting to look back and see
> > what was happening then. It seems more useful to write about the current
> > state today. I'd rewrite the first bullet ("1. Move the source code and
> > website to Apache infrastructure.") to reflect that we actually have moved
> > source code already, and include "done migrating source code" in the "how
> > has the project developed" section. Also the
> > "https://github.com/druid-io/druid" link is wrong now, it should be the
> > incubator repo.
> >
> >> On Wed, Jul 18, 2018 at 1:57 PM Jonathan Wei  wrote:
> >>
> >> We neglected to submit podling reports for June and July, so I put together
> >> reports for those two months.
> >>
> >> I'm putting them here for internal review first, please comment if you have
> >> any feedback/changes.
> >>
> >> June:
> >> 
> >> Druid (As of June 01, 2018)
> >>
> >> Druid is a high-performance, column-oriented, distributed data store.
> >>
> >> Druid has been incubating since 2018-02-28.
> >>
> >> Three most important issues to address in the move towards graduation:
> >>
> >> 1. Move the source code and website to Apache infrastructure.
> >> 2. Plan and execute our first Apache release.
> >> 3. Expanding the community and adding more committers
> >>
> >> Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware
> >> of?
> >>
> >> - None.
> >>
> >> How has the community developed since the last report?
> >>
> >> - A healthy, constant flow of bug fixes, quality improvements and new
> >>  features are still ongoing on https://github.com/druid-io/druid.
> >>
> >> How has the project developed since the last report?
> >>
> >> - SGA and ICLA status sorted out, ready to migrate source to Apache repo
> >> - Since the last report there have been 22 commits from 12 individuals.
> >> - We have conducted a vote to put out the 0.12.1 release. This release
> >> candidate is being done outside the Incubator.
> >>
> >> How would you assess the podling's maturity?
> >> Please feel free to add your own commentary.
> >>
> >>  [X] Initial setup
> >>  [ ] Working towards first release
> >>  [ ] Community building
> >>  [ ] Nearing graduation
> >>  [ ] Other:
> >>
> >> Date of last release:
> >>
> >> - Druid 0.12.0 on 2018-03-06 (non-Apache release)
> >> - No official Apache release yet since beginning Apache Incubation
> >>
> >> When were the last committers or PPMC members elected?
> >>
> >> - Project is still functioning with the initial set of committers.
> >>
> >>
> >>
> >>
> >>
> >> July
> >> 
> >>
> >> Druid (As of July 01, 2018)
> >>
> >> Druid is a high-performance, column-oriented, distributed data store.
> >>
> >> Druid has been incubating since 2018-02-28.
> >>
> >> Three most important issues to address in the move towards graduation:
> >>
> >> 1. Move the source code and website to Apache infrastructure.
> >> 2. Plan and execute our first Apache release.
> >> 3. Expanding the community and adding more committers
> >>
> >> Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware
> >> of?
> >>
> >> - None.
> >>
> >> How has the community developed since the last report?
> >>
> >> - A healthy, constant flow of bug fixes, quality improvements and new
> >>  features are still ongoing on https://github.com/druid-io/druid.
> >>
> >> How has the project developed since the last report?
> >>
> >> - Source migration to Apache infrastructure is in progress (
> >> https://issues.apache.org/jira/browse/INFRA-16674)
> >> - Since the last report there have been 47 commits from 14 individuals.
> >> - We have released 0.12.1, a non-incubator release.
> >> - We are working on 0.12.2, a bug fix release. To get the bug fixes to
> >> users faster, this will be another non-incubator release.
> >>
> >> How would you assess the podling's maturity?
> >> Please feel free to add your own commentary.
> >>
> >>  [ ] Initial setup
> >>  [X]

[GitHub] himanshug commented on issue #5492: Native parallel batch indexing without shuffle

2018-07-18 Thread GitBox
himanshug commented on issue #5492: Native parallel batch indexing without 
shuffle
URL: https://github.com/apache/incubator-druid/pull/5492#issuecomment-406096803
 
 
   LGTM for the overall design and high level working.





[GitHub] jihoonson edited a comment on issue #5492: Native parallel batch indexing without shuffle

2018-07-18 Thread GitBox
jihoonson edited a comment on issue #5492: Native parallel batch indexing 
without shuffle
URL: https://github.com/apache/incubator-druid/pull/5492#issuecomment-405766673
 
 
   @himanshug thank you for reviewing this PR! Yes, I have tested it in our 
cluster by ingesting 100 GB of the TPC-H lineitem table. The number of splits was 
100. I also moved the parallel indexing code to the `batch.parallel` package.
   
   [Himanshu] :+1:





[GitHub] himanshug commented on a change in pull request #5492: Native parallel batch indexing without shuffle

2018-07-18 Thread GitBox
himanshug commented on a change in pull request #5492: Native parallel batch 
indexing without shuffle
URL: https://github.com/apache/incubator-druid/pull/5492#discussion_r203553781
 
 

 ##
 File path: 
indexing-service/src/main/java/io/druid/indexing/common/Counters.java
 ##
 @@ -0,0 +1,72 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package io.druid.indexing.common;
+
+import com.google.common.util.concurrent.AtomicDouble;
+
+import javax.annotation.Nullable;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.ConcurrentMap;
+import java.util.concurrent.atomic.AtomicInteger;
+import java.util.concurrent.atomic.AtomicReference;
+import java.util.function.BinaryOperator;
+
+public class Counters
+{
+  private final ConcurrentMap intCounters = new ConcurrentHashMap<>();
+  private final ConcurrentMap doubleCounters = new ConcurrentHashMap<>();
+  private final ConcurrentMap objectCounters = new ConcurrentHashMap<>();
 
 Review comment:
   ok, comment came because it looks like dead code at this point.





[GitHub] himanshug commented on a change in pull request #5492: Native parallel batch indexing without shuffle

2018-07-18 Thread GitBox
himanshug commented on a change in pull request #5492: Native parallel batch 
indexing without shuffle
URL: https://github.com/apache/incubator-druid/pull/5492#discussion_r203553472
 
 

 ##
 File path: 
indexing-service/src/main/java/io/druid/indexing/common/task/ParallelIndexSubTask.java
 ##
 @@ -0,0 +1,431 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package io.druid.indexing.common.task;
+
+import com.fasterxml.jackson.annotation.JacksonInject;
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.google.common.base.Optional;
+import com.google.common.base.Preconditions;
+import io.druid.client.indexing.IndexingServiceClient;
+import io.druid.data.input.Firehose;
+import io.druid.data.input.FirehoseFactory;
+import io.druid.data.input.InputRow;
+import io.druid.indexer.TaskStatus;
+import io.druid.indexing.appenderator.ActionBasedSegmentAllocator;
+import io.druid.indexing.appenderator.ActionBasedUsedSegmentChecker;
+import io.druid.indexing.common.TaskLockType;
+import io.druid.indexing.common.TaskToolbox;
+import io.druid.indexing.common.actions.LockTryAcquireAction;
+import io.druid.indexing.common.actions.SegmentAllocateAction;
+import io.druid.indexing.common.actions.SurrogateAction;
+import io.druid.indexing.common.actions.TaskActionClient;
+import io.druid.indexing.firehose.IngestSegmentFirehoseFactory;
+import io.druid.java.util.common.ISE;
+import io.druid.java.util.common.Intervals;
+import io.druid.java.util.common.StringUtils;
+import io.druid.java.util.common.logger.Logger;
+import io.druid.java.util.common.parsers.ParseException;
+import io.druid.query.DruidMetrics;
+import io.druid.segment.indexing.DataSchema;
+import io.druid.segment.indexing.RealtimeIOConfig;
+import io.druid.segment.indexing.granularity.GranularitySpec;
+import io.druid.segment.realtime.FireDepartment;
+import io.druid.segment.realtime.FireDepartmentMetrics;
+import io.druid.segment.realtime.RealtimeMetricsMonitor;
+import io.druid.segment.realtime.appenderator.Appenderator;
+import io.druid.segment.realtime.appenderator.AppenderatorDriverAddResult;
+import io.druid.segment.realtime.appenderator.Appenderators;
+import io.druid.segment.realtime.appenderator.BaseAppenderatorDriver;
+import io.druid.segment.realtime.appenderator.BatchAppenderatorDriver;
+import io.druid.segment.realtime.appenderator.SegmentAllocator;
+import io.druid.segment.realtime.appenderator.SegmentsAndMetadata;
+import io.druid.timeline.DataSegment;
+import org.apache.commons.io.FileUtils;
+import org.joda.time.Interval;
+
+import javax.annotation.Nullable;
+import java.io.File;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.SortedSet;
+import java.util.concurrent.ExecutionException;
+import java.util.concurrent.TimeoutException;
+
+/**
+ * A worker task of {@link ParallelIndexSupervisorTask}. Similar to {@link IndexTask}, but this task
+ * generates and pushes segments, and reports them to the {@link ParallelIndexSupervisorTask} instead of
+ * publishing on its own.
+ */
+public class ParallelIndexSubTask extends AbstractTask
 
 Review comment:
   ok.
   
   Do we already have a design/plan somewhere for perfect rollup?
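
For context on the question above, the gap between best-effort and perfect rollup can be illustrated with a toy sketch. This is not Druid code: `RollupSketch`, its methods, and the two-element `{key, metric}` row layout are hypothetical, and a string key stands in for the timestamp plus dimension values that rollup groups on. It only shows why rolling up each split independently, with no shuffle, can leave more rows than rolling up all the data together.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model of rollup; hypothetical names, not part of Druid.
final class RollupSketch
{
  // Rolls up one batch of {key, metric} rows by summing metrics per key.
  static Map<String, Long> rollup(List<String[]> rows)
  {
    Map<String, Long> out = new TreeMap<>();
    for (String[] row : rows) {
      out.merge(row[0], Long.parseLong(row[1]), Long::sum);
    }
    return out;
  }

  // Single phase: every split is rolled up on its own, so the same key
  // can survive once per split.
  static int bestEffortRowCount(List<List<String[]>> splits)
  {
    int count = 0;
    for (List<String[]> split : splits) {
      count += rollup(split).size();
    }
    return count;
  }

  // With a shuffle: rows sharing a key are brought together before rollup,
  // so each key survives exactly once.
  static int perfectRowCount(List<List<String[]>> splits)
  {
    List<String[]> all = new ArrayList<>();
    for (List<String[]> split : splits) {
      all.addAll(split);
    }
    return rollup(all).size();
  }
}
```

With two splits that both contain the key `a`, the single-phase count keeps `a` twice, while the shuffled count keeps it once.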





[GitHub] himanshug commented on a change in pull request #5492: Native parallel batch indexing without shuffle

2018-07-18 Thread GitBox
himanshug commented on a change in pull request #5492: Native parallel batch 
indexing without shuffle
URL: https://github.com/apache/incubator-druid/pull/5492#discussion_r203553320
 
 

 ##
 File path: 
indexing-service/src/main/java/io/druid/indexing/common/task/SinglePhaseParallelIndexTaskRunner.java
 ##
 @@ -0,0 +1,484 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package io.druid.indexing.common.task;
+
+import com.google.common.annotations.VisibleForTesting;
+import com.google.common.base.Preconditions;
+import com.google.common.util.concurrent.FutureCallback;
+import com.google.common.util.concurrent.Futures;
+import com.google.common.util.concurrent.ListenableFuture;
+import io.druid.client.indexing.IndexingServiceClient;
+import io.druid.data.input.FiniteFirehoseFactory;
+import io.druid.data.input.FirehoseFactory;
+import io.druid.data.input.InputSplit;
+import io.druid.indexer.TaskState;
+import io.druid.indexer.TaskStatusPlus;
+import io.druid.indexing.appenderator.ActionBasedUsedSegmentChecker;
+import io.druid.indexing.common.TaskToolbox;
+import io.druid.indexing.common.actions.SegmentTransactionalInsertAction;
+import io.druid.indexing.common.task.TaskMonitor.MonitorEntry;
+import io.druid.indexing.common.task.TaskMonitor.SubTaskCompleteEvent;
+import io.druid.java.util.common.ISE;
+import io.druid.java.util.common.logger.Logger;
+import io.druid.segment.realtime.appenderator.SegmentIdentifier;
+import io.druid.segment.realtime.appenderator.TransactionalSegmentPublisher;
+import io.druid.segment.realtime.appenderator.UsedSegmentChecker;
+import io.druid.timeline.DataSegment;
+
+import javax.annotation.Nullable;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.concurrent.BlockingQueue;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.ConcurrentMap;
+import java.util.concurrent.LinkedBlockingDeque;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+import java.util.stream.Stream;
+
+/**
+ * An implementation of {@link ParallelIndexTaskRunner} to support best-effort roll-up. This runner can submit and
+ * monitor multiple {@link ParallelIndexSubTask}s.
+ *
+ * As its name indicates, distributed indexing is done in a single phase, i.e., without shuffling intermediate data. As
+ * a result, this task can't be used for perfect rollup.
+ */
+public class SinglePhaseParallelIndexTaskRunner implements ParallelIndexTaskRunner
+{
+  private static final Logger log = new Logger(SinglePhaseParallelIndexTaskRunner.class);
+
+  private final TaskToolbox toolbox;
+  private final String taskId;
+  private final String groupId;
+  private final ParallelIndexIngestionSpec ingestionSchema;
+  private final Map context;
+  private final FiniteFirehoseFactory baseFirehoseFactory;
+  private final int maxNumTasks;
+  private final IndexingServiceClient indexingServiceClient;
+
+  private final BlockingQueue> taskCompleteEvents =
+      new LinkedBlockingDeque<>();
+
+  // subTaskId -> report
+  private final ConcurrentMap segmentsMap = new ConcurrentHashMap<>();
+
+  private volatile boolean stopped;
+  private volatile TaskMonitor taskMonitor;
+
+  private int nextSpecId = 0;
+
+  SinglePhaseParallelIndexTaskRunner(
+      TaskToolbox toolbox,
+      String taskId,
+      String groupId,
+      ParallelIndexIngestionSpec ingestionSchema,
+      Map context,
+      IndexingServiceClient indexingServiceClient
+  )
+  {
+    this.toolbox = toolbox;
+    this.taskId = taskId;
+    this.groupId = groupId;
+    this.ingestionSchema = ingestionSchema;
+    this.context = context;
+    this.baseFirehoseFactory = (FiniteFirehoseFactory) ingestionSchema.getIOConfig().getFirehoseFactory();
+    this.maxNumTasks = ingestionSchema.getTuningConfig().getMaxNumSubTasks();
+    this.indexingServiceClient = Preconditions.checkNotNull(indexingServiceClient, "indexingServiceClient");
+  }
+
+  @Override
+  public TaskState run() throws Exception
+  {
+    final Iterator subTaskSpecIterator = subTaskSpec

[GitHub] himanshug commented on a change in pull request #5492: Native parallel batch indexing without shuffle

2018-07-18 Thread GitBox
himanshug commented on a change in pull request #5492: Native parallel batch 
indexing without shuffle
URL: https://github.com/apache/incubator-druid/pull/5492#discussion_r203553220
 
 

 ##
 File path: 
indexing-service/src/main/java/io/druid/indexing/common/task/SinglePhaseParallelIndexTaskRunner.java
 ##
 @@ -0,0 +1,484 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package io.druid.indexing.common.task;
+
+import com.google.common.annotations.VisibleForTesting;
+import com.google.common.base.Preconditions;
+import com.google.common.util.concurrent.FutureCallback;
+import com.google.common.util.concurrent.Futures;
+import com.google.common.util.concurrent.ListenableFuture;
+import io.druid.client.indexing.IndexingServiceClient;
+import io.druid.data.input.FiniteFirehoseFactory;
+import io.druid.data.input.FirehoseFactory;
+import io.druid.data.input.InputSplit;
+import io.druid.indexer.TaskState;
+import io.druid.indexer.TaskStatusPlus;
+import io.druid.indexing.appenderator.ActionBasedUsedSegmentChecker;
+import io.druid.indexing.common.TaskToolbox;
+import io.druid.indexing.common.actions.SegmentTransactionalInsertAction;
+import io.druid.indexing.common.task.TaskMonitor.MonitorEntry;
+import io.druid.indexing.common.task.TaskMonitor.SubTaskCompleteEvent;
+import io.druid.java.util.common.ISE;
+import io.druid.java.util.common.logger.Logger;
+import io.druid.segment.realtime.appenderator.SegmentIdentifier;
+import io.druid.segment.realtime.appenderator.TransactionalSegmentPublisher;
+import io.druid.segment.realtime.appenderator.UsedSegmentChecker;
+import io.druid.timeline.DataSegment;
+
+import javax.annotation.Nullable;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.concurrent.BlockingQueue;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.ConcurrentMap;
+import java.util.concurrent.LinkedBlockingDeque;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+import java.util.stream.Stream;
+
+/**
+ * An implementation of {@link ParallelIndexTaskRunner} to support best-effort roll-up. This runner can submit and
+ * monitor multiple {@link ParallelIndexSubTask}s.
+ *
+ * As its name indicates, distributed indexing is done in a single phase, i.e., without shuffling intermediate data. As
+ * a result, this task can't be used for perfect rollup.
+ */
+public class SinglePhaseParallelIndexTaskRunner implements ParallelIndexTaskRunner
+{
+  private static final Logger log = new Logger(SinglePhaseParallelIndexTaskRunner.class);
+
+  private final TaskToolbox toolbox;
+  private final String taskId;
+  private final String groupId;
+  private final ParallelIndexIngestionSpec ingestionSchema;
+  private final Map context;
+  private final FiniteFirehoseFactory baseFirehoseFactory;
+  private final int maxNumTasks;
+  private final IndexingServiceClient indexingServiceClient;
+
+  private final BlockingQueue> 
taskCompleteEvents =
+  new LinkedBlockingDeque<>();
+
+  // subTaskId -> report
+  private final ConcurrentMap segmentsMap = new 
ConcurrentHashMap<>();
+
+  private volatile boolean stopped;
+  private volatile TaskMonitor taskMonitor;
+
+  private int nextSpecId = 0;
+
+  SinglePhaseParallelIndexTaskRunner(
+      TaskToolbox toolbox,
+      String taskId,
+      String groupId,
+      ParallelIndexIngestionSpec ingestionSchema,
+      Map context,
+      IndexingServiceClient indexingServiceClient
+  )
+  {
+    this.toolbox = toolbox;
+    this.taskId = taskId;
+    this.groupId = groupId;
+    this.ingestionSchema = ingestionSchema;
+    this.context = context;
+    this.baseFirehoseFactory = (FiniteFirehoseFactory) ingestionSchema.getIOConfig().getFirehoseFactory();
+    this.maxNumTasks = ingestionSchema.getTuningConfig().getMaxNumSubTasks();
+    this.indexingServiceClient = Preconditions.checkNotNull(indexingServiceClient, "indexingServiceClient");
+  }
+
+  @Override
+  public TaskState run() throws Exception
+  {
+final Iterator subTaskSpecIterator = 
subTaskSpec

[GitHub] himanshug commented on a change in pull request #5492: Native parallel batch indexing without shuffle

2018-07-18 Thread GitBox
himanshug commented on a change in pull request #5492: Native parallel batch 
indexing without shuffle
URL: https://github.com/apache/incubator-druid/pull/5492#discussion_r203552226
 
 

 ##
 File path: indexing-service/src/main/java/io/druid/indexing/common/task/ParallelIndexSupervisorTask.java
 ##
 @@ -0,0 +1,541 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package io.druid.indexing.common.task;
+
+import com.fasterxml.jackson.annotation.JacksonInject;
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.jaxrs.smile.SmileMediaTypes;
+import com.google.common.annotations.VisibleForTesting;
+import com.google.common.base.Optional;
+import com.google.common.base.Preconditions;
+import com.google.common.base.Throwables;
+import io.druid.client.indexing.IndexingServiceClient;
+import io.druid.data.input.FiniteFirehoseFactory;
+import io.druid.data.input.FirehoseFactory;
+import io.druid.indexer.TaskStatus;
+import io.druid.indexing.common.Counters;
+import io.druid.indexing.common.TaskLock;
+import io.druid.indexing.common.TaskToolbox;
+import io.druid.indexing.common.actions.LockListAction;
+import io.druid.indexing.common.actions.TaskActionClient;
+import io.druid.indexing.common.stats.RowIngestionMetersFactory;
+import io.druid.indexing.common.task.IndexTask.IndexIngestionSpec;
+import io.druid.indexing.common.task.IndexTask.IndexTuningConfig;
+import io.druid.indexing.common.task.ParallelIndexTaskRunner.SubTaskSpecStatus;
+import io.druid.java.util.common.IAE;
+import io.druid.java.util.common.ISE;
+import io.druid.java.util.common.logger.Logger;
+import io.druid.segment.indexing.granularity.GranularitySpec;
+import io.druid.segment.realtime.appenderator.SegmentIdentifier;
+import io.druid.segment.realtime.firehose.ChatHandler;
+import io.druid.segment.realtime.firehose.ChatHandlerProvider;
+import io.druid.segment.realtime.firehose.ChatHandlers;
+import io.druid.server.security.Action;
+import io.druid.server.security.AuthorizerMapper;
+import io.druid.timeline.partition.NumberedShardSpec;
+import org.joda.time.DateTime;
+import org.joda.time.Interval;
+
+import javax.annotation.Nullable;
+import javax.servlet.http.HttpServletRequest;
+import javax.ws.rs.Consumes;
+import javax.ws.rs.GET;
+import javax.ws.rs.POST;
+import javax.ws.rs.Path;
+import javax.ws.rs.PathParam;
+import javax.ws.rs.Produces;
+import javax.ws.rs.core.Context;
+import javax.ws.rs.core.MediaType;
+import javax.ws.rs.core.Response;
+import javax.ws.rs.core.Response.Status;
+import java.io.IOException;
+import java.util.List;
+import java.util.Map;
+import java.util.Map.Entry;
+import java.util.SortedSet;
+import java.util.stream.Collectors;
+
+/**
+ * ParallelIndexSupervisorTask is capable of running multiple subTasks for parallel indexing. This is
+ * applicable if the input {@link FiniteFirehoseFactory} is splittable. While this task is running, it can submit
+ * multiple child tasks to overlords. This task succeeds only when all its child tasks succeed; otherwise it fails.
+ *
+ * @see ParallelIndexTaskRunner
+ */
+public class ParallelIndexSupervisorTask extends AbstractTask implements ChatHandler
+{
+  static final String TYPE = "index_parallel";
+
+  private static final Logger log = new Logger(ParallelIndexSupervisorTask.class);
+
+  private final ParallelIndexIngestionSpec ingestionSchema;
+  private final FiniteFirehoseFactory baseFirehoseFactory;
+  private final IndexingServiceClient indexingServiceClient;
+  private final ChatHandlerProvider chatHandlerProvider;
+  private final AuthorizerMapper authorizerMapper;
+  private final RowIngestionMetersFactory rowIngestionMetersFactory;
+
+  private final Counters counters = new Counters();
+
+  private volatile ParallelIndexTaskRunner runner;
+
+  // toolbox is initialized when run() is called, and can be used for processing HTTP endpoint requests.
+  private volatile TaskToolbox toolbox;
+
+  @JsonCreator
+  public ParallelIndexSupervisorTask(
+  @JsonProperty("id") String id,
+  @JsonProperty("resource") TaskResource taskResource,
+  @JsonProperty("spec") ParallelIndexIngestionSpec ing

Re: Late June/July podling reports

2018-07-18 Thread P. Taylor Goetz
No need to panic, this is not the end of the world. I’ll have to check, but 
reminders may not have gone out this month.

If that’s the case, I’ll mention it in my sign-off for the next report so the 
IPMC is aware of what happened.

-Taylor

> On Jul 18, 2018, at 5:45 PM, Gian Merlino  wrote:
> 
> OMG! I don't see a report reminder for July. Is that not something that is
> happening anymore? I was relying on getting one of those…
> 
> IMO, there is no reason to write the older June report. We missed it, and
> that is sad, but it is probably not super interesting to look back and see
> what was happening then. It seems more useful to write about the current
> state today. I'd rewrite the first bullet ("1. Move the source code and
> website to Apache infrastructure.") to reflect that we actually have moved
> source code already, and include "done migrating source code" in the "how
> has the project developed" section. Also the
> "https://github.com/druid-io/druid" link is wrong now; it should be the
> incubator repo.
> 
>> On Wed, Jul 18, 2018 at 1:57 PM Jonathan Wei  wrote:
>> 
>> We neglected to submit podling reports for June and July, so I put together
>> reports for those two months.
>> 
>> I'm putting them here for internal review first, please comment if you have
>> any feedback/changes.
>> 
>> June:
>> 
>> Druid (As of June 01, 2018)
>> 
>> Druid is a high-performance, column-oriented, distributed data store.
>> 
>> Druid has been incubating since 2018-02-28.
>> 
>> Three most important issues to address in the move towards graduation:
>> 
>> 1. Move the source code and website to Apache infrastructure.
>> 2. Plan and execute our first Apache release.
>> 3. Expanding the community and adding more committers
>> 
>> Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware
>> of?
>> 
>> - None.
>> 
>> How has the community developed since the last report?
>> 
>> - A healthy, constant flow of bug fixes, quality improvements and new
>> features
>>  are still ongoing on https://github.com/druid-io/druid.
>> 
>> How has the project developed since the last report?
>> 
>> - SGA and ICLA status sorted out, ready to migrate source to Apache repo
>> - Since the last report there have been 22 commits from 12 individuals.
>> - We have conducted a vote to put out the 0.12.1 release. This release
>> candidate is being done outside the Incubator.
>> 
>> How would you assess the podling's maturity?
>> Please feel free to add your own commentary.
>> 
>>  [X] Initial setup
>>  [ ] Working towards first release
>>  [ ] Community building
>>  [ ] Nearing graduation
>>  [ ] Other:
>> 
>> Date of last release:
>> 
>> - Druid 0.12.0 on 2018-03-06 (non-Apache release)
>> - No official Apache release yet since beginning Apache Incubation
>> 
>> When were the last committers or PPMC members elected?
>> 
>> - Project is still functioning with the initial set of committers.
>> 
>> 
>> 
>> 
>> 
>> July
>> 
>> 
>> Druid (As of July 01, 2018)
>> 
>> Druid is a high-performance, column-oriented, distributed data store.
>> 
>> Druid has been incubating since 2018-02-28.
>> 
>> Three most important issues to address in the move towards graduation:
>> 
>> 1. Move the source code and website to Apache infrastructure.
>> 2. Plan and execute our first Apache release.
>> 3. Expanding the community and adding more committers
>> 
>> Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware
>> of?
>> 
>> - None.
>> 
>> How has the community developed since the last report?
>> 
>> - A healthy, constant flow of bug fixes, quality improvements and new
>> features
>>  are still ongoing on https://github.com/druid-io/druid.
>> 
>> How has the project developed since the last report?
>> 
>> - Source migration to Apache infrastructure is in progress (
>> https://issues.apache.org/jira/browse/INFRA-16674)
>> - Since the last report there have been 47 commits from 14 individuals.
>> - We have released 0.12.1, a non-incubator release.
>> - We are working on 0.12.2, a bug fix release. To get the bug fixes to
>> users faster, this will be another non-incubator release.
>> 
>> How would you assess the podling's maturity?
>> Please feel free to add your own commentary.
>> 
>>  [ ] Initial setup
>>  [X] Working towards first release
>>  [ ] Community building
>>  [ ] Nearing graduation
>>  [ ] Other:
>> 
>> Date of last release:
>> 
>> - Druid 0.12.1 on 2018-06-08 (non-Apache release)
>> - No official Apache release yet since beginning Apache Incubation
>> 
>> When were the last committers or PPMC members elected?
>> 
>> - Project is still functioning with the initial set of committers.
>> 

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



Re: Late June/July podling reports

2018-07-18 Thread P. Taylor Goetz
Thanks for putting together and sharing the reports on dev@ Jonathan.

Since Druid has missed those reports, it’s due to report next month. I’d 
recommend merging them into one and making sure it gets posted on time.

As a mentor I would recommend making every effort possible to report on time. 
It may seem like a pain, but quarterly reports are important because they 
document for the IPMC how a project is progressing.

IMHO, failing to report sends a bad signal to the IPMC. It’s a quarterly task 
that shouldn’t take much longer than an hour or so. That’s not a big time 
commitment, and any member of the PPMC can do it.

I’d suggest having someone (or several) volunteer as the report coordinator for 
the reporting period. You could rotate it, or not. It can be a collaborative 
effort, or one person. It’s up to the project to decide, but I’d recommend 
figuring out something since the same reporting requirement applies to TLPs.

Just a friendly suggestion from a mentor. ;)

-Taylor

> On Jul 18, 2018, at 4:57 PM, Jonathan Wei  wrote:
> 
> We neglected to submit podling reports for June and July, so I put together
> reports for those two months.
> 
> I'm putting them here for internal review first, please comment if you have
> any feedback/changes.
> 
> June:
> 
> Druid (As of June 01, 2018)
> 
> Druid is a high-performance, column-oriented, distributed data store.
> 
> Druid has been incubating since 2018-02-28.
> 
> Three most important issues to address in the move towards graduation:
> 
> 1. Move the source code and website to Apache infrastructure.
> 2. Plan and execute our first Apache release.
> 3. Expanding the community and adding more committers
> 
> Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware
> of?
> 
> - None.
> 
> How has the community developed since the last report?
> 
> - A healthy, constant flow of bug fixes, quality improvements and new
> features
>  are still ongoing on https://github.com/druid-io/druid.
> 
> How has the project developed since the last report?
> 
> - SGA and ICLA status sorted out, ready to migrate source to Apache repo
> - Since the last report there have been 22 commits from 12 individuals.
> - We have conducted a vote to put out the 0.12.1 release. This release
> candidate is being done outside the Incubator.
> 
> How would you assess the podling's maturity?
> Please feel free to add your own commentary.
> 
>  [X] Initial setup
>  [ ] Working towards first release
>  [ ] Community building
>  [ ] Nearing graduation
>  [ ] Other:
> 
> Date of last release:
> 
> - Druid 0.12.0 on 2018-03-06 (non-Apache release)
> - No official Apache release yet since beginning Apache Incubation
> 
> When were the last committers or PPMC members elected?
> 
> - Project is still functioning with the initial set of committers.
> 
> 
> 
> 
> 
> July
> 
> 
> Druid (As of July 01, 2018)
> 
> Druid is a high-performance, column-oriented, distributed data store.
> 
> Druid has been incubating since 2018-02-28.
> 
> Three most important issues to address in the move towards graduation:
> 
> 1. Move the source code and website to Apache infrastructure.
> 2. Plan and execute our first Apache release.
> 3. Expanding the community and adding more committers
> 
> Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware
> of?
> 
> - None.
> 
> How has the community developed since the last report?
> 
> - A healthy, constant flow of bug fixes, quality improvements and new
> features
>  are still ongoing on https://github.com/druid-io/druid.
> 
> How has the project developed since the last report?
> 
> - Source migration to Apache infrastructure is in progress (
> https://issues.apache.org/jira/browse/INFRA-16674)
> - Since the last report there have been 47 commits from 14 individuals.
> - We have released 0.12.1, a non-incubator release.
> - We are working on 0.12.2, a bug fix release. To get the bug fixes to
> users faster, this will be another non-incubator release.
> 
> How would you assess the podling's maturity?
> Please feel free to add your own commentary.
> 
>  [ ] Initial setup
>  [X] Working towards first release
>  [ ] Community building
>  [ ] Nearing graduation
>  [ ] Other:
> 
> Date of last release:
> 
> - Druid 0.12.1 on 2018-06-08 (non-Apache release)
> - No official Apache release yet since beginning Apache Incubation
> 
> When were the last committers or PPMC members elected?
> 
> - Project is still functioning with the initial set of committers.




Re: Late June/July podling reports

2018-07-18 Thread Julian Hyde
For the first 3 months podlings report monthly, then switch to quarterly. IIRC, 
Druid just transitioned to quarterly. See 
https://incubator.apache.org/guides/ppmc.html#podling_status_reports and 
podlings.xml.

It’s a nice idea to submit a report (since you missed last month’s), but it’s a 
bit late - the board meeting was today. You should file the report 2 weeks 
before the meeting. That gives the incubator a week to compose its report, then 
a week to get mentor sign off and for the board to read it before the meeting.

The board meeting is generally the 3rd Wednesday of the month: see 
https://www.apache.org/foundation/board/calendar.html. So plan on writing the 
report on the first Wednesday of the month.
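
As a rough illustration of that timing (a sketch only: `ReportScheduleSketch` and its method names are made up for this example, and since the meeting is only *generally* on the third Wednesday, the board calendar remains the authority), the dates can be computed like this:

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.temporal.TemporalAdjusters;

// Hypothetical helper for illustration; not part of any ASF or Druid tooling.
class ReportScheduleSketch {
  // Third Wednesday of the month: the usual (not guaranteed) board meeting day.
  static LocalDate thirdWednesday(int year, int month) {
    return LocalDate.of(year, month, 1)
        .with(TemporalAdjusters.dayOfWeekInMonth(3, DayOfWeek.WEDNESDAY));
  }

  // Filing date: two weeks before the meeting, which is the first Wednesday.
  static LocalDate reportDue(int year, int month) {
    return thirdWednesday(year, month).minusWeeks(2);
  }

  public static void main(String[] args) {
    System.out.println("meeting:    " + thirdWednesday(2018, 7)); // 2018-07-18
    System.out.println("report due: " + reportDue(2018, 7));      // 2018-07-04
  }
}
```

For July 2018 this gives a meeting on the 18th (the date of this thread) and a filing date of the 4th.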

> On Jul 18, 2018, at 2:45 PM, Gian Merlino  wrote:
> 
> OMG! I don't see a report reminder for July. Is that not something that is
> happening anymore? I was relying on getting one of those…
> 
> IMO, there is no reason to write the older June report. We missed it, and
> that is sad, but it is probably not super interesting to look back and see
> what was happening then. It seems more useful to write about the current
> state today. I'd rewrite the first bullet ("1. Move the source code and
> website to Apache infrastructure.") to reflect that we actually have moved
> source code already, and include "done migrating source code" in the "how
> has the project developed" section. Also the
> "https://github.com/druid-io/druid" link is wrong now; it should be the
> incubator repo.
> 
> On Wed, Jul 18, 2018 at 1:57 PM Jonathan Wei  wrote:
> 
>> We neglected to submit podling reports for June and July, so I put together
>> reports for those two months.
>> 
>> I'm putting them here for internal review first, please comment if you have
>> any feedback/changes.
>> 
>> June:
>> 
>> Druid (As of June 01, 2018)
>> 
>> Druid is a high-performance, column-oriented, distributed data store.
>> 
>> Druid has been incubating since 2018-02-28.
>> 
>> Three most important issues to address in the move towards graduation:
>> 
>> 1. Move the source code and website to Apache infrastructure.
>> 2. Plan and execute our first Apache release.
>> 3. Expanding the community and adding more committers
>> 
>> Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware
>> of?
>> 
>> - None.
>> 
>> How has the community developed since the last report?
>> 
>> - A healthy, constant flow of bug fixes, quality improvements and new
>> features
>>  are still ongoing on https://github.com/druid-io/druid.
>> 
>> How has the project developed since the last report?
>> 
>> - SGA and ICLA status sorted out, ready to migrate source to Apache repo
>> - Since the last report there have been 22 commits from 12 individuals.
>> - We have conducted a vote to put out the 0.12.1 release. This release
>> candidate is being done outside the Incubator.
>> 
>> How would you assess the podling's maturity?
>> Please feel free to add your own commentary.
>> 
>>  [X] Initial setup
>>  [ ] Working towards first release
>>  [ ] Community building
>>  [ ] Nearing graduation
>>  [ ] Other:
>> 
>> Date of last release:
>> 
>> - Druid 0.12.0 on 2018-03-06 (non-Apache release)
>> - No official Apache release yet since beginning Apache Incubation
>> 
>> When were the last committers or PPMC members elected?
>> 
>> - Project is still functioning with the initial set of committers.
>> 
>> 
>> 
>> 
>> 
>> July
>> 
>> 
>> Druid (As of July 01, 2018)
>> 
>> Druid is a high-performance, column-oriented, distributed data store.
>> 
>> Druid has been incubating since 2018-02-28.
>> 
>> Three most important issues to address in the move towards graduation:
>> 
>> 1. Move the source code and website to Apache infrastructure.
>> 2. Plan and execute our first Apache release.
>> 3. Expanding the community and adding more committers
>> 
>> Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware
>> of?
>> 
>> - None.
>> 
>> How has the community developed since the last report?
>> 
>> - A healthy, constant flow of bug fixes, quality improvements and new
>> features
>>  are still ongoing on https://github.com/druid-io/druid.
>> 
>> How has the project developed since the last report?
>> 
>> - Source migration to Apache infrastructure is in progress (
>> https://issues.apache.org/jira/browse/INFRA-16674)
>> - Since the last report there have been 47 commits from 14 individuals.
>> - We have released 0.12.1, a non-incubator release.
>> - We are working on 0.12.2, a bug fix release. To get the bug fixes to
>> users faster, this will be another non-incubator release.
>> 
>> How would you assess the podling's maturity?
>> Please feel free to add your own commentary.
>> 
>>  [ ] Initial setup
>>  [X] Working towards first release
>> 

[GitHub] jihoonson opened a new issue #6019: Race in TaskQueue when updating status of complete tasks

2018-07-18 Thread GitBox
jihoonson opened a new issue #6019: Race in TaskQueue when updating status of 
complete tasks
URL: https://github.com/apache/incubator-druid/issues/6019
 
 
   In 
[TaskQueue.notifyStatus()](https://github.com/apache/incubator-druid/blob/master/indexing-service/src/main/java/io/druid/indexing/overlord/TaskQueue.java#L380),
 the overlord updates the status of a complete task. To do so, it first calls 
[taskRunner.shutdown(task.getId())](https://github.com/apache/incubator-druid/blob/master/indexing-service/src/main/java/io/druid/indexing/overlord/TaskQueue.java#L396)
 to remove the complete task from the taskRunner, and then calls 
[taskStorage.setStatus(taskStatus)](https://github.com/apache/incubator-druid/blob/master/indexing-service/src/main/java/io/druid/indexing/overlord/TaskQueue.java#L424)
 to update the status of the task in metastore. So, between them, the complete 
task is not in taskRunner, but its status in metastore is `RUNNING`.
   
   This might break the overlord APIs that return task status. Since we don't 
have a good system to track all task status changes yet, the overlord tries to 
find waiting tasks by computing `(all not-complete tasks - tasks in taskRunner)`. 
But, because of the above race, a complete task might no longer be in taskRunner 
while its status in metastore is still `RUNNING`, so the overlord might wrongly 
report its status as `WAITING`.
   
   The short and simple solution is to fix this race, but I would say that it 
would be better to build a system that tracks all task status changes and can 
record the waiting task status correctly.
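
   A minimal sketch of the window described above, assuming a heavily simplified model: `TaskQueueRaceSketch`, its `metastore` map and its `runner` set are hypothetical stand-ins, not the real `TaskQueue`, `TaskStorage` or `TaskRunner` classes. The two steps are run one after the other so that the intermediate state is visible deterministically.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical, simplified model of the race; not Druid code.
class TaskQueueRaceSketch {
  enum Status { RUNNING, SUCCESS }

  // Stand-in for task statuses kept in the metastore.
  final Map<String, Status> metastore = new HashMap<>();
  // Stand-in for the set of tasks currently known to the taskRunner.
  final Set<String> runner = new HashSet<>();

  // Mirrors "(all not-complete tasks - tasks in taskRunner)".
  Set<String> waitingTasks() {
    Set<String> waiting = new TreeSet<>();
    for (Map.Entry<String, Status> e : metastore.entrySet()) {
      if (e.getValue() == Status.RUNNING && !runner.contains(e.getKey())) {
        waiting.add(e.getKey());
      }
    }
    return waiting;
  }

  public static void main(String[] args) {
    TaskQueueRaceSketch s = new TaskQueueRaceSketch();
    s.metastore.put("task1", Status.RUNNING);
    s.runner.add("task1");
    System.out.println("before:    " + s.waitingTasks());

    // Step 1: the complete task is removed from the runner first.
    s.runner.remove("task1");
    // A status query landing here sees RUNNING in the metastore but no
    // entry in the runner, so the task is misclassified as waiting.
    System.out.println("in window: " + s.waitingTasks());

    // Step 2: only now is the final status written to the metastore.
    s.metastore.put("task1", Status.SUCCESS);
    System.out.println("after:     " + s.waitingTasks());
  }
}
```

   In this model the task shows up as waiting only between the two steps, which is the misreport described above.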


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services




[GitHub] jihoonson commented on issue #6019: Race in TaskQueue when updating status of complete tasks

2018-07-18 Thread GitBox
jihoonson commented on issue #6019: Race in TaskQueue when updating status of 
complete tasks
URL: 
https://github.com/apache/incubator-druid/issues/6019#issuecomment-406085513
 
 
   Related to https://github.com/apache/incubator-druid/issues/5523.





Re: Late June/July podling reports

2018-07-18 Thread Gian Merlino
OMG! I don't see a report reminder for July. Is that not something that is
happening anymore? I was relying on getting one of those…

IMO, there is no reason to write the older June report. We missed it, and
that is sad, but it is probably not super interesting to look back and see
what was happening then. It seems more useful to write about the current
state today. I'd rewrite the first bullet ("1. Move the source code and
website to Apache infrastructure.") to reflect that we actually have moved
source code already, and include "done migrating source code" in the "how
has the project developed" section. Also the
"https://github.com/druid-io/druid" link is wrong now; it should be the
incubator repo.

On Wed, Jul 18, 2018 at 1:57 PM Jonathan Wei  wrote:

> We neglected to submit podling reports for June and July, so I put together
> reports for those two months.
>
> I'm putting them here for internal review first, please comment if you have
> any feedback/changes.
>
> June:
> 
> Druid (As of June 01, 2018)
>
> Druid is a high-performance, column-oriented, distributed data store.
>
> Druid has been incubating since 2018-02-28.
>
> Three most important issues to address in the move towards graduation:
>
>  1. Move the source code and website to Apache infrastructure.
>  2. Plan and execute our first Apache release.
>  3. Expanding the community and adding more committers
>
> Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware
> of?
>
> - None.
>
> How has the community developed since the last report?
>
> - A healthy, constant flow of bug fixes, quality improvements and new
> features
>   are still ongoing on https://github.com/druid-io/druid.
>
> How has the project developed since the last report?
>
> - SGA and ICLA status sorted out, ready to migrate source to Apache repo
> - Since the last report there have been 22 commits from 12 individuals.
> - We have conducted a vote to put out the 0.12.1 release. This release
> candidate is being done outside the Incubator.
>
> How would you assess the podling's maturity?
> Please feel free to add your own commentary.
>
>   [X] Initial setup
>   [ ] Working towards first release
>   [ ] Community building
>   [ ] Nearing graduation
>   [ ] Other:
>
> Date of last release:
>
> - Druid 0.12.0 on 2018-03-06 (non-Apache release)
> - No official Apache release yet since beginning Apache Incubation
>
> When were the last committers or PPMC members elected?
>
> - Project is still functioning with the initial set of committers.
>
>
>
>
>
> July
> 
>
> Druid (As of July 01, 2018)
>
> Druid is a high-performance, column-oriented, distributed data store.
>
> Druid has been incubating since 2018-02-28.
>
> Three most important issues to address in the move towards graduation:
>
>  1. Move the source code and website to Apache infrastructure.
>  2. Plan and execute our first Apache release.
>  3. Expanding the community and adding more committers
>
> Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware
> of?
>
> - None.
>
> How has the community developed since the last report?
>
> - A healthy, constant flow of bug fixes, quality improvements and new
> features
>   are still ongoing on https://github.com/druid-io/druid.
>
> How has the project developed since the last report?
>
> - Source migration to Apache infrastructure is in progress (
> https://issues.apache.org/jira/browse/INFRA-16674)
> - Since the last report there have been 47 commits from 14 individuals.
> - We have released 0.12.1, a non-incubator release.
> - We are working on 0.12.2, a bug fix release. To get the bug fixes to
> users faster, this will be another non-incubator release.
>
> How would you assess the podling's maturity?
> Please feel free to add your own commentary.
>
>   [ ] Initial setup
>   [X] Working towards first release
>   [ ] Community building
>   [ ] Nearing graduation
>   [ ] Other:
>
> Date of last release:
>
> - Druid 0.12.1 on 2018-06-08 (non-Apache release)
> - No official Apache release yet since beginning Apache Incubation
>
> When were the last committers or PPMC members elected?
>
> - Project is still functioning with the initial set of committers.
>


[GitHub] gianm commented on issue #5998: Add support to filter on datasource for active tasks

2018-07-18 Thread GitBox
gianm commented on issue #5998: Add support to filter on datasource for active 
tasks
URL: https://github.com/apache/incubator-druid/pull/5998#issuecomment-406072235
 
 
   LGTM after Travis. thanks @surekhasaharan!





Late June/July podling reports

2018-07-18 Thread Jonathan Wei
We neglected to submit podling reports for June and July, so I put together
reports for those two months.

I'm putting them here for internal review first; please comment if you have
any feedback or changes.

June:

Druid (As of June 01, 2018)

Druid is a high-performance, column-oriented, distributed data store.

Druid has been incubating since 2018-02-28.

Three most important issues to address in the move towards graduation:

 1. Move the source code and website to Apache infrastructure.
 2. Plan and execute our first Apache release.
 3. Expand the community and add more committers.

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware
of?

- None.

How has the community developed since the last report?

- A healthy, constant flow of bug fixes, quality improvements and new
  features continues on https://github.com/druid-io/druid.

How has the project developed since the last report?

- SGA and ICLA status sorted out, ready to migrate source to Apache repo
- Since the last report there have been 22 commits from 12 individuals.
- We have conducted a vote to put out the 0.12.1 release. This release
candidate is being done outside the Incubator.

How would you assess the podling's maturity?
Please feel free to add your own commentary.

  [X] Initial setup
  [ ] Working towards first release
  [ ] Community building
  [ ] Nearing graduation
  [ ] Other:

Date of last release:

- Druid 0.12.0 on 2018-03-06 (non-Apache release)
- No official Apache release yet since beginning Apache Incubation

When were the last committers or PPMC members elected?

- Project is still functioning with the initial set of committers.





July:


Druid (As of July 01, 2018)

Druid is a high-performance, column-oriented, distributed data store.

Druid has been incubating since 2018-02-28.

Three most important issues to address in the move towards graduation:

 1. Move the source code and website to Apache infrastructure.
 2. Plan and execute our first Apache release.
 3. Expand the community and add more committers.

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware
of?

- None.

How has the community developed since the last report?

- A healthy, constant flow of bug fixes, quality improvements and new
  features continues on https://github.com/druid-io/druid.

How has the project developed since the last report?

- Source migration to Apache infrastructure is in progress
  (https://issues.apache.org/jira/browse/INFRA-16674)
- Since the last report there have been 47 commits from 14 individuals.
- We have released 0.12.1, a non-incubator release.
- We are working on 0.12.2, a bug fix release. To get the bug fixes to
users faster, this will be another non-incubator release.

How would you assess the podling's maturity?
Please feel free to add your own commentary.

  [ ] Initial setup
  [X] Working towards first release
  [ ] Community building
  [ ] Nearing graduation
  [ ] Other:

Date of last release:

- Druid 0.12.1 on 2018-06-08 (non-Apache release)
- No official Apache release yet since beginning Apache Incubation

When were the last committers or PPMC members elected?

- Project is still functioning with the initial set of committers.


[GitHub] surekhasaharan commented on a change in pull request #5998: Add support to filter on datasource for active tasks

2018-07-18 Thread GitBox
surekhasaharan commented on a change in pull request #5998: Add support to 
filter on datasource for active tasks
URL: https://github.com/apache/incubator-druid/pull/5998#discussion_r203514243
 
 

 ##
 File path: 
indexing-service/src/main/java/io/druid/indexing/overlord/http/OverlordResource.java
 ##
 @@ -632,14 +632,14 @@ public Response getTasks(
 List> allActiveTaskInfo = Lists.newArrayList();
 final List allActiveTasks = Lists.newArrayList();
 if (state == null || !"complete".equals(StringUtils.toLowerCase(state))) {
-  allActiveTaskInfo = taskStorageQueryAdapter.getActiveTaskInfo();
+  allActiveTaskInfo = taskStorageQueryAdapter.getActiveTaskInfo(dataSource);
   for (final TaskInfo task : allActiveTaskInfo) {
 allActiveTasks.add(
 new AnyTask(
 task.getId(),
 task.getTask() == null ? null : task.getTask().getType(),
 SettableFuture.create(),
-task.getDataSource(),
+task.getDataSource(), // ?
 
 Review comment:
   removed.





[GitHub] surekhasaharan commented on a change in pull request #5998: Add support to filter on datasource for active tasks

2018-07-18 Thread GitBox
surekhasaharan commented on a change in pull request #5998: Add support to 
filter on datasource for active tasks
URL: https://github.com/apache/incubator-druid/pull/5998#discussion_r203513943
 
 

 ##
 File path: 
indexing-service/src/main/java/io/druid/indexing/overlord/HeapMemoryTaskStorage.java
 ##
 @@ -170,7 +170,7 @@ public void setStatus(TaskStatus status)
   }
 
   @Override
-  public List> getActiveTaskInfo()
+  public List> getActiveTaskInfo(@Nullable String datasource)
 
 Review comment:
   replaced at all the places.
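
A minimal sketch of the behavior the new `@Nullable` parameter suggests: a null `dataSource` means "no filter", and a non-null one keeps only matching tasks. `TaskEntry` and the list-based lookup are hypothetical stand-ins for illustration, not Druid's actual `TaskInfo` or task-storage classes, and the PR's real implementation may filter elsewhere (for example in the metadata query).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the PR's task-storage lookup; not Druid's actual classes.
class ActiveTaskFilterSketch
{
  // Minimal task record holding only the fields this sketch needs.
  static final class TaskEntry
  {
    final String id;
    final String dataSource;

    TaskEntry(String id, String dataSource)
    {
      this.id = id;
      this.dataSource = dataSource;
    }
  }

  // Returns every active task when dataSource is null, otherwise only the matching ones.
  static List<TaskEntry> getActiveTaskInfo(List<TaskEntry> active, String dataSource)
  {
    final List<TaskEntry> result = new ArrayList<>();
    for (TaskEntry task : active) {
      if (dataSource == null || dataSource.equals(task.dataSource)) {
        result.add(task);
      }
    }
    return result;
  }

  public static void main(String[] args)
  {
    List<TaskEntry> active = Arrays.asList(
        new TaskEntry("index_wikipedia_1", "wikipedia"),
        new TaskEntry("index_metrics_1", "metrics"),
        new TaskEntry("index_wikipedia_2", "wikipedia")
    );
    System.out.println(getActiveTaskInfo(active, null).size());        // prints 3
    System.out.println(getActiveTaskInfo(active, "wikipedia").size()); // prints 2
  }
}
```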





[GitHub] gianm closed pull request #5993: SQL: Add server-wide default time zone config.

2018-07-18 Thread GitBox
gianm closed pull request #5993: SQL: Add server-wide default time zone config.
URL: https://github.com/apache/incubator-druid/pull/5993
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

diff --git a/docs/content/querying/sql.md b/docs/content/querying/sql.md
index 09778119f2c..314a3ed74e3 100644
--- a/docs/content/querying/sql.md
+++ b/docs/content/querying/sql.md
@@ -416,10 +416,10 @@ Connection context can be specified as JDBC connection properties or as a "conte
 
 |Parameter|Description|Default value|
 |-|---|-|
-|`sqlTimeZone`|Sets the time zone for this connection, which will affect how time functions and timestamp literals behave. Should be a time zone name like "America/Los_Angeles" or offset like "-08:00".|UTC|
-|`useApproximateCountDistinct`|Whether to use an approximate cardinalty algorithm for `COUNT(DISTINCT foo)`.|druid.sql.planner.useApproximateCountDistinct on the broker|
-|`useApproximateTopN`|Whether to use approximate [TopN queries](topnquery.html) when a SQL query could be expressed as such. If false, exact [GroupBy queries](groupbyquery.html) will be used instead.|druid.sql.planner.useApproximateTopN on the broker|
-|`useFallback`|Whether to evaluate operations on the broker when they cannot be expressed as Druid queries. This option is not recommended for production since it can generate unscalable query plans. If false, SQL queries that cannot be translated to Druid queries will fail.|druid.sql.planner.useFallback on the broker|
+|`sqlTimeZone`|Sets the time zone for this connection, which will affect how time functions and timestamp literals behave. Should be a time zone name like "America/Los_Angeles" or offset like "-08:00".|druid.sql.planner.sqlTimeZone on the broker (default: UTC)|
+|`useApproximateCountDistinct`|Whether to use an approximate cardinalty algorithm for `COUNT(DISTINCT foo)`.|druid.sql.planner.useApproximateCountDistinct on the broker (default: true)|
+|`useApproximateTopN`|Whether to use approximate [TopN queries](topnquery.html) when a SQL query could be expressed as such. If false, exact [GroupBy queries](groupbyquery.html) will be used instead.|druid.sql.planner.useApproximateTopN on the broker (default: true)|
+|`useFallback`|Whether to evaluate operations on the broker when they cannot be expressed as Druid queries. This option is not recommended for production since it can generate unscalable query plans. If false, SQL queries that cannot be translated to Druid queries will fail.|druid.sql.planner.useFallback on the broker (default: false)|
 
 ### Retrieving metadata
 
@@ -500,3 +500,4 @@ The Druid SQL server is configured through the following properties on the broke
 |`druid.sql.planner.useApproximateCountDistinct`|Whether to use an approximate cardinalty algorithm for `COUNT(DISTINCT foo)`.|true|
 |`druid.sql.planner.useApproximateTopN`|Whether to use approximate [TopN queries](../querying/topnquery.html) when a SQL query could be expressed as such. If false, exact [GroupBy queries](../querying/groupbyquery.html) will be used instead.|true|
 |`druid.sql.planner.useFallback`|Whether to evaluate operations on the broker when they cannot be expressed as Druid queries. This option is not recommended for production since it can generate unscalable query plans. If false, SQL queries that cannot be translated to Druid queries will fail.|false|
+|`druid.sql.planner.sqlTimeZone`|Sets the default time zone for the server, which will affect how time functions and timestamp literals behave. Should be a time zone name like "America/Los_Angeles" or offset like "-08:00".|UTC|
diff --git a/sql/src/main/java/io/druid/sql/calcite/planner/PlannerConfig.java b/sql/src/main/java/io/druid/sql/calcite/planner/PlannerConfig.java
index 6453010b693..c8aa8f3a2c7 100644
--- a/sql/src/main/java/io/druid/sql/calcite/planner/PlannerConfig.java
+++ b/sql/src/main/java/io/druid/sql/calcite/planner/PlannerConfig.java
@@ -21,9 +21,11 @@
 
 import com.fasterxml.jackson.annotation.JsonProperty;
 import io.druid.java.util.common.IAE;
+import org.joda.time.DateTimeZone;
 import org.joda.time.Period;
 
 import java.util.Map;
+import java.util.Objects;
 
 public class PlannerConfig
 {
@@ -55,6 +57,9 @@
   @JsonProperty
   private boolean useFallback = false;
 
+  @JsonProperty
+  private DateTimeZone sqlTimeZone = DateTimeZone.UTC;
+
   public Period getMetadataRefreshPeriod()
   {
 return metadataRefreshPeriod;
@@ -95,6 +100,11 @@ public boolean isUseFallback()
 return useFallback;
   }
 
+  public DateTimeZone getSqlTimeZone()
+  {
+return sqlTimeZone;
+  }
+
   public PlannerConfig withOverrides(final Map context)
   {
 if (context == null) {
@@ -122,6 +132,7 @@ public PlannerConfig withOve
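
The documentation change in this PR describes a fallback: the per-connection `sqlTimeZone` context value wins, and otherwise the server-wide `druid.sql.planner.sqlTimeZone` applies (default UTC). The diff is truncated before the override logic, so the following is only a hypothetical sketch of that documented behavior, not the PR's code. It uses `java.time.ZoneId` from the JDK as a stand-in for Joda's `DateTimeZone`, and the `resolve` helper is an invented name.

```java
import java.time.ZoneId;
import java.util.Collections;
import java.util.Map;

// Hypothetical illustration of "connection context overrides server default".
class SqlTimeZoneFallbackSketch
{
  // Context key name taken from the documentation table in the diff.
  static final String CTX_SQL_TIME_ZONE = "sqlTimeZone";

  // Uses the context value when present, otherwise the server-wide default.
  static ZoneId resolve(Map<String, Object> context, ZoneId serverDefault)
  {
    if (context == null) {
      return serverDefault;
    }
    final Object value = context.get(CTX_SQL_TIME_ZONE);
    return value == null ? serverDefault : ZoneId.of(String.valueOf(value));
  }

  public static void main(String[] args)
  {
    final ZoneId serverDefault = ZoneId.of("UTC");
    final Map<String, Object> named = Collections.singletonMap(CTX_SQL_TIME_ZONE, "America/Los_Angeles");
    final Map<String, Object> offset = Collections.singletonMap(CTX_SQL_TIME_ZONE, "-08:00");

    System.out.println(resolve(null, serverDefault));   // prints UTC
    System.out.println(resolve(named, serverDefault));  // prints America/Los_Angeles
    System.out.println(resolve(offset, serverDefault)); // prints -08:00
  }
}
```

Both accepted forms from the table, a zone name and a fixed offset, go through the same parsing call in this sketch.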

[GitHub] jihoonson commented on a change in pull request #5958: Part 2 of changes for SQL Compatible Null Handling

2018-07-18 Thread GitBox
jihoonson commented on a change in pull request #5958: Part 2 of changes for 
SQL Compatible Null Handling
URL: https://github.com/apache/incubator-druid/pull/5958#discussion_r203510498
 
 

 ##
 File path: 
extensions-core/lookups-cached-single/src/main/java/io/druid/server/lookup/LoadingLookup.java
 ##
 @@ -62,15 +63,19 @@ public LoadingLookup(
 
 
   @Override
-  public String apply(final String key)
+  public String apply(@Nullable final String key)
   {
-if (key == null) {
+String keyEquivalent = NullHandling.nullToEmptyIfNeeded(key);
 
 Review comment:
   Thanks.





[GitHub] jihoonson commented on a change in pull request #5958: Part 2 of changes for SQL Compatible Null Handling

2018-07-18 Thread GitBox
jihoonson commented on a change in pull request #5958: Part 2 of changes for 
SQL Compatible Null Handling
URL: https://github.com/apache/incubator-druid/pull/5958#discussion_r203510258
 
 

 ##
 File path: 
extensions-core/histogram/src/main/java/io/druid/query/aggregation/histogram/ApproximateHistogramAggregator.java
 ##
 @@ -59,7 +60,9 @@ public ApproximateHistogramAggregator(
   @Override
   public void aggregate()
   {
-histogram.offer(selector.getFloat());
+if (NullHandling.replaceWithDefault() || !selector.isNull()) {
 
 Review comment:
   Got it. Thanks. Would you add a comment here?
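
For readers following the review, the guard in the diff can be sketched in isolation. This is an illustration under stated assumptions, not the aggregator's code: the boolean `replaceWithDefault` stands in for `NullHandling.replaceWithDefault()`, a null `Float` stands in for `selector.isNull()`, and reading a null as `0.0f` in default-value mode is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch only: plain Java stand-ins for the NullHandling and selector calls in the diff.
class NullAwareAggregationSketch
{
  // Collects the values an aggregator would be offered under each null-handling mode.
  static List<Float> aggregate(List<Float> column, boolean replaceWithDefault)
  {
    final List<Float> offered = new ArrayList<>();
    for (Float value : column) {
      // Same shape as the guard in the diff: always aggregate in default-value mode,
      // otherwise skip rows whose value is null.
      if (replaceWithDefault || value != null) {
        // Assumption: in default-value mode a null row reads as 0.0f.
        offered.add(value == null ? 0.0f : value);
      }
    }
    return offered;
  }

  public static void main(String[] args)
  {
    final List<Float> column = Arrays.asList(1.5f, null, 3.0f);
    System.out.println(aggregate(column, true));  // prints [1.5, 0.0, 3.0]
    System.out.println(aggregate(column, false)); // prints [1.5, 3.0]
  }
}
```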





[GitHub] drcrallen commented on issue #6014: Optionally refuse to consume new data until the prior chunk is being consumed

2018-07-18 Thread GitBox
drcrallen commented on issue #6014: Optionally refuse to consume new data until 
the prior chunk is being consumed
URL: https://github.com/apache/incubator-druid/pull/6014#issuecomment-406050564
 
 
   As far as I can tell, the problem is that the Channel Future doesn't return 
from a `get` until the call is completed, but the call can't complete because 
it is waiting for the queue to free up. 
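
The blocking pattern described above can be reproduced in a few lines. This is a minimal sketch, not Druid's code: a `CompletableFuture` stands in for the channel future, a capacity-1 `LinkedBlockingQueue` stands in for the chunk queue, and the 500 ms timeout exists only so the demonstration terminates instead of hanging.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class BlockedProducerSketch
{
  // Returns true if the future completed within the timeout, false if it stayed blocked.
  static boolean run(boolean drainWhileWaiting) throws Exception
  {
    final LinkedBlockingQueue<String> queue = new LinkedBlockingQueue<>(1);
    final CompletableFuture<Void> done = new CompletableFuture<>();

    // Producer: completes the future only after every chunk has been queued.
    final Thread producer = new Thread(() -> {
      try {
        for (int i = 0; i < 3; i++) {
          queue.put("chunk-" + i); // blocks once the bounded queue is full
        }
        done.complete(null);
      }
      catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
    producer.setDaemon(true);
    producer.start();

    // Optional consumer: frees queue space so the producer can make progress.
    Thread drainer = null;
    if (drainWhileWaiting) {
      drainer = new Thread(() -> {
        try {
          while (true) {
            queue.take();
          }
        }
        catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      });
      drainer.setDaemon(true);
      drainer.start();
    }

    try {
      // Waiting on the future without draining the queue never succeeds.
      done.get(500, TimeUnit.MILLISECONDS);
      return true;
    }
    catch (TimeoutException e) {
      return false;
    }
    finally {
      producer.interrupt();
      if (drainer != null) {
        drainer.interrupt();
      }
    }
  }

  public static void main(String[] args) throws Exception
  {
    System.out.println(run(false)); // prints false: producer stuck in put(), future never completes
    System.out.println(run(true));  // prints true: queue is drained, producer finishes
  }
}
```

In the sketch, the wait succeeds only when something else consumes from the queue while the caller is blocked on the future.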





[GitHub] drcrallen edited a comment on issue #6014: Optionally refuse to consume new data until the prior chunk is being consumed

2018-07-18 Thread GitBox
drcrallen edited a comment on issue #6014: Optionally refuse to consume new 
data until the prior chunk is being consumed
URL: https://github.com/apache/incubator-druid/pull/6014#issuecomment-406045358
 
 
   I'm a bit baffled what is blocking here. There are a few competing threads 
for locks:
   
   
   ```
   "HttpClient-Netty-Worker-87" - Thread t@123
      java.lang.Thread.State: WAITING
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <28fa59c3> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
        at java.util.concurrent.LinkedBlockingQueue.put(LinkedBlockingQueue.java:350)
        at io.druid.client.DirectDruidClient$1.handleChunk(DirectDruidClient.java:335)
        at io.druid.java.util.http.client.NettyHttpClient$1.messageReceived(NettyHttpClient.java:225)
        at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
        at org.jboss.netty.handler.timeout.ReadTimeoutHandler.messageReceived(ReadTimeoutHandler.java:184)
        at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
        at org.jboss.netty.handler.codec.http.HttpContentDecoder.messageReceived(HttpContentDecoder.java:135)
        at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
        at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:459)
        at org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:536)
        at org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:485)
        at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
        at org.jboss.netty.handler.codec.http.HttpClientCodec.handleUpstream(HttpClientCodec.java:92)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
        at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
        at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
        at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
        at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
        at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:748)

      Locked ownable synchronizers:
        - locked <576b7c74> (a java.util.concurrent.ThreadPoolExecutor$Worker)
   ```
   
   for producing into the inputstream queue, and 
   
   ```
   "processing-fjp-3" - Thread t@437
      java.lang.Thread.State: WAITING
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <4e912efa> (a com.google.common.util.concurrent.AbstractFuture$Sync)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
        at java.util.concurr

[GitHub] drcrallen commented on issue #6014: Optionally refuse to consume new data until the prior chunk is being consumed

2018-07-18 Thread GitBox
drcrallen commented on issue #6014: Optionally refuse to consume new data until 
the prior chunk is being consumed
URL: https://github.com/apache/incubator-druid/pull/6014#issuecomment-406045358
 
 
   I'm a bit baffled what is blocking here. There are two competing threads for 
locks:
   
   
   ```
   "HttpClient-Netty-Worker-87" - Thread t@123
      java.lang.Thread.State: WAITING
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <28fa59c3> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
        at java.util.concurrent.LinkedBlockingQueue.put(LinkedBlockingQueue.java:350)
        at io.druid.client.DirectDruidClient$1.handleChunk(DirectDruidClient.java:335)
        at io.druid.java.util.http.client.NettyHttpClient$1.messageReceived(NettyHttpClient.java:225)
        at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
        at org.jboss.netty.handler.timeout.ReadTimeoutHandler.messageReceived(ReadTimeoutHandler.java:184)
        at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
        at org.jboss.netty.handler.codec.http.HttpContentDecoder.messageReceived(HttpContentDecoder.java:135)
        at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
        at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:459)
        at org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:536)
        at org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:485)
        at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
        at org.jboss.netty.handler.codec.http.HttpClientCodec.handleUpstream(HttpClientCodec.java:92)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
        at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
        at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
        at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
        at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
        at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:748)

      Locked ownable synchronizers:
        - locked <576b7c74> (a java.util.concurrent.ThreadPoolExecutor$Worker)
   ```
   
   for producing into the queue, and 
   
   ```
   "processing-fjp-3" - Thread t@437
      java.lang.Thread.State: WAITING
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <4e912efa> (a com.google.common.util.concurrent.AbstractFuture$Sync)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
        at java.util.concurrent.locks.AbstractQue

[GitHub] gianm commented on a change in pull request #5998: Add support to filter on datasource for active tasks

2018-07-18 Thread GitBox
gianm commented on a change in pull request #5998: Add support to filter on 
datasource for active tasks
URL: https://github.com/apache/incubator-druid/pull/5998#discussion_r203491384
 
 

 ##
 File path: 
indexing-service/src/main/java/io/druid/indexing/overlord/HeapMemoryTaskStorage.java
 ##
 @@ -170,7 +170,7 @@ public void setStatus(TaskStatus status)
   }
 
   @Override
-  public List> getActiveTaskInfo()
+  public List> getActiveTaskInfo(@Nullable String datasource)
 
 Review comment:
   Please spell it "dataSource". Similar comment elsewhere.





[GitHub] gianm commented on a change in pull request #5998: Add support to filter on datasource for active tasks

2018-07-18 Thread GitBox
gianm commented on a change in pull request #5998: Add support to filter on 
datasource for active tasks
URL: https://github.com/apache/incubator-druid/pull/5998#discussion_r203495466
 
 

 ##
 File path: 
indexing-service/src/main/java/io/druid/indexing/overlord/http/OverlordResource.java
 ##
 @@ -632,14 +632,14 @@ public Response getTasks(
 List> allActiveTaskInfo = Lists.newArrayList();
 final List allActiveTasks = Lists.newArrayList();
 if (state == null || !"complete".equals(StringUtils.toLowerCase(state))) {
-  allActiveTaskInfo = taskStorageQueryAdapter.getActiveTaskInfo();
+  allActiveTaskInfo = taskStorageQueryAdapter.getActiveTaskInfo(dataSource);
   for (final TaskInfo task : allActiveTaskInfo) {
 allActiveTasks.add(
 new AnyTask(
 task.getId(),
 task.getTask() == null ? null : task.getTask().getType(),
 SettableFuture.create(),
-task.getDataSource(),
+task.getDataSource(), // ?
 
 Review comment:
   What does the `// ?` comment mean?





[GitHub] gianm edited a comment on issue #3236: gitter community channel?

2018-07-18 Thread GitBox
gianm edited a comment on issue #3236: gitter community channel?
URL: 
https://github.com/apache/incubator-druid/issues/3236#issuecomment-406042168
 
 
   I have hesitated to set it up myself because I am not sure if it's a good 
idea to have a real time chat channel. We had one in the past and it was a 
mixed blessing. It was great for being able to get people connected super fast, 
like a turbo-charged version of the mailing lists. But it was also difficult to 
keep the channel "staffed", if you will, since Druid committers tend to be busy 
people. So lots of times, someone would come in looking for help and there 
would be nobody there to help them. That side of things ended up being 
frustrating for everyone, compared to the mailing lists, where it was more 
likely that someone would chime in with thoughts when they could.
   
   I probably wouldn't personally join a chat channel, because if I joined it, 
then it would make me feel bad when someone asks for help and I'm not available 
to help them. And I feel that a channel should be created by someone who is 
also volunteering to join and monitor it.
   
   @leventov could you elaborate more on why you're against a channel?
   
   @drcrallen my understanding of ASF policy is that decisions must be recorded 
on ASF infra (i.e. mailing lists or wiki), for legal and recordkeeping reasons. 
(Btw: github counts, since we archive all github activity to 
comm...@druid.apache.org.) But I assume that this is less strict for general, 
non-decision-making communication. Fwiw, Kafka has an IRC channel 
(https://kafka.apache.org/contact). It seems to get a couple of messages a day.





[GitHub] fjy commented on issue #6015: Check the kafka topic when comparing checkpoints from tasks with the one stored in metastore

2018-07-18 Thread GitBox
fjy commented on issue #6015: Check the kafka topic when comparing checkpoints 
from tasks with the one stored in metastore
URL: https://github.com/apache/incubator-druid/pull/6015#issuecomment-406043066
 
 
   👍 





[GitHub] gianm commented on issue #3236: gitter community channel?

2018-07-18 Thread GitBox
gianm commented on issue #3236: gitter community channel?
URL: 
https://github.com/apache/incubator-druid/issues/3236#issuecomment-406042168
 
 
   I have hesitated to set it up myself because I am not sure if it's a good 
idea to have a real time chat channel. We had one in the past and it was a 
mixed blessing. It was great for being able to get people connected super fast, 
like a turbo-charged version of the mailing lists. But it was also difficult to 
keep the channel "staffed", if you will, since Druid committers tend to be busy 
people. So lots of times, someone would come in looking for help and there 
would be nobody there to help them. That side of things ended up being 
frustrating for everyone, compared to the mailing lists, where it was more 
likely that someone would chime in with thoughts when they could.
   
   I probably wouldn't personally join a chat channel, because if I joined it, then 
it would make me feel bad when someone asks for help and I'm not available to 
help them. And I feel that a channel should be created by someone who is also 
volunteering to join and monitor it.
   
   @leventov could you elaborate more on why you're against a channel?
   
   @drcrallen my understanding of ASF policy is that decisions must be recorded 
on ASF infra (i.e. mailing lists or wiki), for legal and recordkeeping reasons. 
(Btw: github counts, since we archive all github activity to 
comm...@druid.apache.org.) But I assume that this is less strict for general, 
non-decision-making communication. Fwiw, Kafka has an IRC channel 
(https://kafka.apache.org/contact). It seems to get a couple of messages a day.





[GitHub] matanster commented on issue #3236: gitter community channel?

2018-07-18 Thread GitBox
matanster commented on issue #3236: gitter community channel?
URL: 
https://github.com/apache/incubator-druid/issues/3236#issuecomment-406041657
 
 
   Dangers? your comment left me curious





[GitHub] jihoonson commented on a change in pull request #5958: Part 2 of changes for SQL Compatible Null Handling

2018-07-18 Thread GitBox
jihoonson commented on a change in pull request #5958: Part 2 of changes for 
SQL Compatible Null Handling
URL: https://github.com/apache/incubator-druid/pull/5958#discussion_r203492914
 
 

 ##
 File path: common/src/main/java/io/druid/math/expr/ExprEval.java
 ##
 @@ -245,36 +252,63 @@ public final ExprType type()
 @Override
 public final int asInt()
 {
-  if (value == null) {
+  Number number = asNumber();
+  if (number == null) {
 assert NullHandling.replaceWithDefault();
 return 0;
   }
-
-  final Integer theInt = Ints.tryParse(value);
-  assert NullHandling.replaceWithDefault() || theInt != null;
-  return theInt == null ? 0 : theInt;
+  return number.intValue();
 }
 
 @Override
 public final long asLong()
 {
-  // GuavaUtils.tryParseLong handles nulls, no need for special null handling here.
-  final Long theLong = GuavaUtils.tryParseLong(value);
-  assert NullHandling.replaceWithDefault() || theLong != null;
-  return theLong == null ? 0L : theLong;
+  Number number = asNumber();
+  if (number == null) {
+assert NullHandling.replaceWithDefault();
 
 Review comment:
   So, would you point me out where the discussion is? I guess it's in 
https://github.com/apache/incubator-druid/pull/5278?





[GitHub] jihoonson commented on a change in pull request #5958: Part 2 of changes for SQL Compatible Null Handling

2018-07-18 Thread GitBox
jihoonson commented on a change in pull request #5958: Part 2 of changes for 
SQL Compatible Null Handling
URL: https://github.com/apache/incubator-druid/pull/5958#discussion_r203490303
 
 

 ##
 File path: common/src/main/java/io/druid/math/expr/Expr.java
 ##
 @@ -365,15 +371,21 @@ public ExprEval eval(ObjectBinding bindings)
 
 // Result of any Binary expressions is null if any of the argument is null.
 // e.g "select null * 2 as c;" or "select null + 1 as c;" will return null as per Standard SQL spec.
-if (NullHandling.sqlCompatible() && (leftVal.isNull() || rightVal.isNull())) {
+if (NullHandling.sqlCompatible() && (leftVal.value() == null || rightVal.value() == null)) {
   return ExprEval.of(null);
 }
 
 if (leftVal.type() == ExprType.STRING && rightVal.type() == ExprType.STRING) {
   return evalString(leftVal.asString(), rightVal.asString());
 } else if (leftVal.type() == ExprType.LONG && rightVal.type() == ExprType.LONG) {
+  if (NullHandling.sqlCompatible() && (leftVal.isNumericNull() || rightVal.isNumericNull())) {
 
 Review comment:
   Thanks.





[GitHub] gianm commented on issue #5993: SQL: Add server-wide default time zone config.

2018-07-18 Thread GitBox
gianm commented on issue #5993: SQL: Add server-wide default time zone config.
URL: https://github.com/apache/incubator-druid/pull/5993#issuecomment-406035985
 
 
   @asdf2014 @jihoonson @nishantmonu51 Thanks for your comments -- I fixed the 
call to DateTimeZone.forID and added the new config parameter to the PR 
description. I also merged from master to try to help the tests to pass.





[GitHub] gianm commented on a change in pull request #5993: SQL: Add server-wide default time zone config.

2018-07-18 Thread GitBox
gianm commented on a change in pull request #5993: SQL: Add server-wide default 
time zone config.
URL: https://github.com/apache/incubator-druid/pull/5993#discussion_r203489841
 
 

 ##
 File path: sql/src/test/java/io/druid/sql/calcite/CalciteQueryTest.java
 ##
 @@ -169,6 +169,14 @@ public int getMaxQueryCount()
   return 1;
 }
   };
+  private static final PlannerConfig PLANNER_CONFIG_LOS_ANGELES = new PlannerConfig()
+  {
+@Override
+public DateTimeZone getSqlTimeZone()
+{
+  return DateTimeZone.forID("America/Los_Angeles");
 
 Review comment:
   I'll change this





[GitHub] drcrallen commented on issue #6018: RegisteredLookup java api

2018-07-18 Thread GitBox
drcrallen commented on issue #6018: RegisteredLookup java api
URL: 
https://github.com/apache/incubator-druid/issues/6018#issuecomment-406028004
 
 
   it should be in the druid server jar. Can you explain your use case some 
more?





[GitHub] drcrallen commented on issue #3236: gitter community channel?

2018-07-18 Thread GitBox
drcrallen commented on issue #3236: gitter community channel?
URL: 
https://github.com/apache/incubator-druid/issues/3236#issuecomment-406027596
 
 
   There are inherent dangers to communications regarding the project that are 
not part of official ASF channels (i.e. the mailing list). How do other ASF 
projects handle such IM style communications? Are other ASF projects permissive 
of non-mailing-list interactions on a regular basis?





[GitHub] drcrallen commented on issue #6014: Optionally refuse to consume new data until the prior chunk is being consumed

2018-07-18 Thread GitBox
drcrallen commented on issue #6014: Optionally refuse to consume new data until 
the prior chunk is being consumed
URL: https://github.com/apache/incubator-druid/pull/6014#issuecomment-406020572
 
 
   I'm looking to see if there's a fix





[GitHub] drcrallen commented on issue #6014: Optionally refuse to consume new data until the prior chunk is being consumed

2018-07-18 Thread GitBox
drcrallen commented on issue #6014: Optionally refuse to consume new data until 
the prior chunk is being consumed
URL: https://github.com/apache/incubator-druid/pull/6014#issuecomment-406020525
 
 
   This seems to cause deadlocks. Right now it looks like the "parallel 
merging" doesn't let the FJP know that iterating through the results is a 
blocking operation, so the FJP can get clogged with TransferQueue poll 
operations that don't match with the http client `transfer` operations, causing 
http and fjp threads to deadlock each other.
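   
   For readers unfamiliar with how a ForkJoinPool is told about blocking work: 
the JDK mechanism is `ForkJoinPool.ManagedBlocker`. The following is only a 
minimal sketch and not the code from this PR. The class name `ManagedTake` and 
the `demo` method are hypothetical, and it assumes the blocking step is a plain 
`TransferQueue` take.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;
import java.util.concurrent.LinkedTransferQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TransferQueue;

// Hypothetical helper, not Druid code: wraps a blocking TransferQueue take in a
// ManagedBlocker so the ForkJoinPool is informed that the worker may block.
final class ManagedTake<T> implements ForkJoinPool.ManagedBlocker {
  private final TransferQueue<T> queue;
  private T item;

  ManagedTake(TransferQueue<T> queue) {
    this.queue = queue;
  }

  @Override
  public boolean isReleasable() {
    // Non-blocking attempt first; returning true means no blocking is needed.
    if (item == null) {
      item = queue.poll();
    }
    return item != null;
  }

  @Override
  public boolean block() throws InterruptedException {
    // Blocking path; the pool may add a spare worker while this thread waits.
    if (item == null) {
      item = queue.take();
    }
    return true;
  }

  // Takes one element, going through ForkJoinPool.managedBlock.
  static <T> T take(TransferQueue<T> queue) throws InterruptedException {
    ManagedTake<T> blocker = new ManagedTake<>(queue);
    ForkJoinPool.managedBlock(blocker);
    return blocker.item;
  }

  // Demo: a pool with a single worker waits for a chunk handed over by a producer.
  static String demo() {
    TransferQueue<String> queue = new LinkedTransferQueue<>();
    ForkJoinPool pool = new ForkJoinPool(1);
    try {
      Callable<String> task = () -> ManagedTake.take(queue);
      ForkJoinTask<String> consumer = pool.submit(task);
      queue.tryTransfer("chunk", 5, TimeUnit.SECONDS);
      return consumer.get(5, TimeUnit.SECONDS);
    } catch (Exception e) {
      throw new IllegalStateException(e);
    } finally {
      pool.shutdown();
    }
  }
}
```

   Whether this would resolve the deadlock described above is not established 
here; it only shows the API shape for declaring a blocking wait to the pool.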





Re: Multi-threaded Druid Tests/Benchmarks

2018-07-18 Thread Charles Allen
io.druid.benchmark.query.TopNBenchmark is the one that tore up the heap when I
was trying to test alternate strategies for
https://github.com/apache/incubator-druid/pull/5913 and
https://github.com/apache/incubator-druid/pull/6014 locally. You can
control the number of segments created.

On Wed, Jul 18, 2018 at 12:35 AM Anastasia Braginsky
 wrote:

>  So this is probably where we can help with the Oak-based incremental
> index. Can you please give me any reference to those tests? Any descriptions?
> Thanks!
>
> On Tuesday, July 17, 2018, 8:59:57 PM GMT+3, Charles Allen <
> charles.al...@snap.com.INVALID> wrote:
>
>  Unfortunately I think multi-threaded test coverage is kind of weak and
> historically very hard to test. There are some topN benchmarks but they are
> very limited as they don't scale well (heap gets blasted from incremental
> index) with a large concurrency level.
>
> On Sun, Jul 15, 2018 at 6:35 AM Anastasia Braginsky
>  wrote:
>
> > Hi Everybody,
> > From last Tuesday Druid's meeting I recall Charles mentioned some Druid's
> > multi-threaded tests/benchmarks that can be applied end-to-end to check
> the
> > performance.
> > Can I get some references/names so I can start investigating this
> > direction from multi-threaded Oak-in-Druid perspective? Thanks!
> >
> >
>


[GitHub] leventov commented on issue #3236: gitter community channel?

2018-07-18 Thread GitBox
leventov commented on issue #3236: gitter community channel?
URL: 
https://github.com/apache/incubator-druid/issues/3236#issuecomment-406001070
 
 
   I'm against creating a Gitter channel (or Slack, or IRC, FWIW).





[GitHub] drcrallen commented on issue #6014: Optionally refuse to consume new data until the prior chunk is being consumed

2018-07-18 Thread GitBox
drcrallen commented on issue #6014: Optionally refuse to consume new data until 
the prior chunk is being consumed
URL: https://github.com/apache/incubator-druid/pull/6014#issuecomment-40550
 
 
   @gianm yes, the main risk is blocking the http threads of historicals. Since 
results are merged in http threads and not processing threads, other queries 
should be able to continue to do work, but it could block new queries from 
coming in.





[GitHub] joshlemer commented on issue #3236: gitter community channel?

2018-07-18 Thread GitBox
joshlemer commented on issue #3236: gitter community channel?
URL: 
https://github.com/apache/incubator-druid/issues/3236#issuecomment-405998680
 
 
   @gianm nobody except the people who own this repository can make the gitter 
channel. Can you just make one? It takes 30 seconds: 
https://gitter.im/#createcommunity





[GitHub] asdf2014 edited a comment on issue #5980: Various changes about a few coding specifications

2018-07-18 Thread GitBox
asdf2014 edited a comment on issue #5980: Various changes about a few coding 
specifications
URL: https://github.com/apache/incubator-druid/pull/5980#issuecomment-405439513
 
 
   @leventov PTAL.
   
   1. Set the level of the `ArraysAsListWithZeroOrOneArgument` inspection to 
`warning`, because using `Collections.singletonList` causes a 
ClassCastException in 
`io.druid.server.coordinator.CostBalancerStrategyBenchmark#factoryClasses` and 
`io.druid.collections.bitmap.WrappedRoaringBitmapTest#factoryClasses`.
   
   2. Also set the level of the `ToArrayCallWithZeroLengthArrayArgument` coding 
inspection to `warning` and add the `BY_LEVEL` option, because 
[teamcity](https://teamcity.jetbrains.com/viewLog.html?buildId=1526603&tab=Inspection&buildTypeId=OpenSourceProjects_Druid_InspectionsPullRequests)
 misreads this inspection: TeamCity considers the `pre-size` style better, 
but it is not.
   
   ```java
   @State(Scope.Thread)
   @OutputTimeUnit(TimeUnit.NANOSECONDS)
   @BenchmarkMode(Mode.AverageTime)
   public class ToArrayBenchmark {
   
   @Param({"1", "100", "1000", "5000", "1", "10"})
   private int n;
   
   private final List list = new ArrayList<>();
   
   @Setup
   public void populateList() {
   for (int i = 0; i < n; i++) {
   list.add(0);
   }
   }
   
   @Benchmark
   public Object[] preSize() {
   return list.toArray(new Object[n]);
   }
   
   @Benchmark
   public Object[] resize() {
   return list.toArray(new Object[0]);
   }
   
   /*
   Integer List:
   Benchmark                  (n)  Mode  Cnt       Score         Error  Units
   ToArrayBenchmark.preSize     1  avgt    3      41.552 ±     108.030  ns/op
   ToArrayBenchmark.preSize   100  avgt    3     216.449 ±     799.501  ns/op
   ToArrayBenchmark.preSize  1000  avgt    3    2087.965 ±    6027.778  ns/op
   ToArrayBenchmark.preSize  5000  avgt    3    9098.358 ±   14603.493  ns/op
   ToArrayBenchmark.preSize     1  avgt    3   24204.199 ±  121468.232  ns/op
   ToArrayBenchmark.preSize    10  avgt    3  188183.618 ±  369455.090  ns/op
   ToArrayBenchmark.resize      1  avgt    3      18.987 ±      36.449  ns/op
   ToArrayBenchmark.resize    100  avgt    3     265.549 ±    1125.008  ns/op
   ToArrayBenchmark.resize   1000  avgt    3    1560.713 ±    2922.186  ns/op
   ToArrayBenchmark.resize   5000  avgt    3    7804.810 ±    8333.390  ns/op
   ToArrayBenchmark.resize      1  avgt    3   24791.026 ±   78459.936  ns/op
   ToArrayBenchmark.resize     10  avgt    3  158891.642 ±   56055.895  ns/op
   Object List:
   Benchmark                  (n)  Mode  Cnt       Score         Error  Units
   ToArrayBenchmark.preSize     1  avgt    3      36.306 ±      96.612  ns/op
   ToArrayBenchmark.preSize   100  avgt    3      52.372 ±      84.159  ns/op
   ToArrayBenchmark.preSize  1000  avgt    3     449.807 ±     215.692  ns/op
   ToArrayBenchmark.preSize  5000  avgt    3    2080.172 ±    2003.726  ns/op
   ToArrayBenchmark.preSize     1  avgt    3    4657.937 ±    8432.624  ns/op
   ToArrayBenchmark.preSize    10  avgt    3   51980.829 ±   46920.314  ns/op
   ToArrayBenchmark.resize      1  avgt    3      16.747 ±      85.131  ns/op
   ToArrayBenchmark.resize    100  avgt    3      43.803 ±      28.704  ns/op
   ToArrayBenchmark.resize   1000  avgt    3     404.681 ±     132.986  ns/op
   ToArrayBenchmark.resize   5000  avgt    3    1972.649 ±     174.691  ns/op
   ToArrayBenchmark.resize      1  avgt    3    4021.440 ±    1114.212  ns/op
   ToArrayBenchmark.resize     10  avgt    3   44204.167 ±   76714.850  ns/op
   */
   public static void main(String[] args) throws Exception {
   Options opt = new OptionsBuilder()
   .include(ToArrayBenchmark.class.getSimpleName())
   .forks(1)
   .warmupIterations(1)
   .measurementIterations(3)
   .threads(1)
   .build();
   new Runner(opt).run();
   }
   }
   ```
   Tips: Full code is 
[here](https://github.com/asdf2014/yuzhouwan/blob/master/yuzhouwan-hacker/src/main/java/com/yuzhouwan/hacker/algorithms/collection/ToArrayBenchmark.java).
   
   3. Added `[a-z][a-zA-Z0-9_]*\.equals\((\"|[A-Z_]+\))` into checkstyle config 
file.
   
   4. Added `java.io.File#toURL()` into druid-forbidden-apis.
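   
   To illustrate what the pattern in item 3 catches, here is a small sketch 
that runs the same regular expression with `java.util.regex`. The class name 
`EqualsOrderCheck` and the sample lines are made up for illustration, and it is 
an assumption that the checkstyle rule flags any line where the pattern is 
found.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; only the regular expression comes from item 3 above.
final class EqualsOrderCheck {
  // The pattern from item 3, written as a Java string literal.
  static final Pattern VARIABLE_FIRST_EQUALS =
      Pattern.compile("[a-z][a-zA-Z0-9_]*\\.equals\\((\\\"|[A-Z_]+\\))");

  // True when the line calls equals() on a variable with a string literal
  // or an UPPER_CASE constant as the argument.
  static boolean isFlagged(String sourceLine) {
    return VARIABLE_FIRST_EQUALS.matcher(sourceLine).find();
  }

  public static void main(String[] args) {
    System.out.println(isFlagged("state.equals(\"complete\")"));   // true
    System.out.println(isFlagged("\"complete\".equals(state)"));   // false
    System.out.println(isFlagged("type.equals(DEFAULT_TYPE)"));    // true
    System.out.println(isFlagged("left.equals(right)"));           // false
  }
}
```

   The flagged forms throw a NullPointerException when the variable is null, 
while the literal-first form does not.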



[GitHub] asdf2014 commented on issue #5980: Various changes about a few coding specifications

2018-07-18 Thread GitBox
asdf2014 commented on issue #5980: Various changes about a few coding 
specifications
URL: https://github.com/apache/incubator-druid/pull/5980#issuecomment-405974955
 
 
   Hi @leventov. It has been fixed. BTW, 
[jobs-405249198](https://travis-ci.org/apache/incubator-druid/jobs/405249198) 
failed because the VM crashed. Could you please re-run it once?





[GitHub] gianm commented on issue #6014: Optionally refuse to consume new data until the prior chunk is being consumed

2018-07-18 Thread GitBox
gianm commented on issue #6014: Optionally refuse to consume new data until the 
prior chunk is being consumed
URL: https://github.com/apache/incubator-druid/pull/6014#issuecomment-405970508
 
 
   Is this meant to be addressing 
https://github.com/apache/incubator-druid/issues/4933? What effect does 
enabling this option have on historicals -- I guess backpressure will extend 
all the way to them, potentially blocking their http threads?





Implementing "no-RollUp" (aka PlainFactsHolder) in Oak Incremental Index

2018-07-18 Thread Anastasia Braginsky
Hi Again,
Just to summarize once again the way we are implementing the "no-RollUp" (aka 
PlainFactsHolder) in Oak Incremental Index. If someone is familiar with this 
part of incremental index, please see if we are wrong in some assumptions. As a 
reminder, in Oak Incremental Index we are not using Facts Holder at all. The 
Incremental Index Row (aka Time&Dims) is mapped directly to the Row 
(Aggregators). However, we want to keep functionality of "RollUp" (aggregate 
the metrics of the same IncrementalIndexRow up to some time granularity) and 
"Plain" (no aggregation, each IncrementalIndexRow and its Row are held ordered 
by their timestamps).

Bottom line, PlainFactsHolder gives you the mapping from timestamp to 
IncrementalIndexRow+Row disregarding the order among the same timestamp. This 
is what we wanted to do, just without mapping timestamp to queues as it is done 
via PlainFactsHolder. We plan to have the same map just giving it a comparator 
that will order the IncrementalIndexRows according to their timestamps and 
disregarding the dimensions. The iterators will start from the first 
timestamp-to-start (with any dimensions) and end with the last timestamp-to-end 
(with any dimensions). This looks to us like the simplest way, and it could 
also be implemented in the original incremental index. Any ideas why 
PlainFactsHolder was implemented with those queues?
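
To make the two layouts concrete, here is a minimal sketch. It is not Druid or 
Oak code: the class `TimestampQueueFacts` and its methods are hypothetical, and 
it assumes the timestamp-to-queue mapping is a sorted map from timestamp to a 
queue of rows, as described above.

```java
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedDeque;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical sketch of a "plain" (no roll-up) holder: each timestamp maps to
// a queue of rows, so rows sharing a timestamp are all kept.
final class TimestampQueueFacts<R> {
  private final ConcurrentSkipListMap<Long, Deque<R>> facts = new ConcurrentSkipListMap<>();

  void add(long timestamp, R row) {
    facts.computeIfAbsent(timestamp, t -> new ConcurrentLinkedDeque<>()).add(row);
  }

  // Rows with startInclusive <= timestamp < endExclusive, in timestamp order.
  List<R> range(long startInclusive, long endExclusive) {
    List<R> out = new ArrayList<>();
    for (Deque<R> rows : facts.subMap(startInclusive, true, endExclusive, false).values()) {
      out.addAll(rows);
    }
    return out;
  }

  // For contrast: a java.util sorted map whose comparator looks only at the
  // timestamp (element 0) treats two keys with the same timestamp as one key.
  static int sizeWithTimestampOnlyComparator() {
    ConcurrentSkipListMap<long[], String> byTimestampOnly =
        new ConcurrentSkipListMap<>((a, b) -> Long.compare(a[0], b[0]));
    byTimestampOnly.put(new long[] {10L, 1L}, "row A");
    byTimestampOnly.put(new long[] {10L, 2L}, "row B"); // replaces "row A"
    return byTimestampOnly.size(); // 1
  }
}
```

In a `java.util` sorted map, keys that compare as equal share a single entry, 
so a timestamp-only comparator keeps one row per timestamp. That may be one 
reason for the queues, though this is a guess, and the Oak map may well behave 
differently.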

If I am not clear enough I would be happy to know and to explain myself 
better. Looking forward to hearing your opinion!

Thanks,
Anastasia


[GitHub] nicolasblaye opened a new issue #6018: RegisteredLookup java api

2018-07-18 Thread GitBox
nicolasblaye opened a new issue #6018: RegisteredLookup java api
URL: https://github.com/apache/incubator-druid/issues/6018
 
 
   Hi everyone,
   
   We use registered lookup tables. I did a query using the json api and I 
could use it, but I can't find the correct ExtractionFn for the java api.
   
   I imported both `"io.druid.extensions" % "druid-lookups-cached-global"` and 
`"io.druid.extensions" % "druid-lookups-cached-single"` and I didn't see it, 
though I know it exists somewhere (I presume it is server side): 
https://github.com/apache/incubator-druid/blob/master/server/src/main/java/io/druid/query/lookup/RegisteredLookupExtractionFn.java.
   
   I can easily stub this to have the correct json for the query, but I was 
wondering if I was missing something and the class was present client side 
somewhere?
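   
   For reference, the kind of stub I mean would look roughly like the sketch 
below. The class name `RegisteredLookupStub` is hypothetical, the lookup name 
in the usage is made up, and the JSON shape (`type` set to `registeredLookup` 
plus a `lookup` field) is what I believe the json api expects, so treat it as 
an assumption.

```java
// Hypothetical client-side stub; it only renders the JSON for the query and
// does not depend on any Druid classes.
final class RegisteredLookupStub {
  private final String lookup;

  RegisteredLookupStub(String lookup) {
    this.lookup = lookup;
  }

  // Renders the extraction function as JSON, escaping backslashes and quotes.
  String toJson() {
    String escaped = lookup.replace("\\", "\\\\").replace("\"", "\\\"");
    return "{\"type\":\"registeredLookup\",\"lookup\":\"" + escaped + "\"}";
  }

  public static void main(String[] args) {
    System.out.println(new RegisteredLookupStub("country_names").toJson());
  }
}
```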
   





[GitHub] erankor opened a new issue #6017: Logging of invalid queries

2018-07-18 Thread GitBox
erankor opened a new issue #6017: Logging of invalid queries
URL: https://github.com/apache/incubator-druid/issues/6017
 
 
   Hi all,
   
   When Druid gets an invalid query, is there a way to see the query that was 
submitted?
   I don't see the query body in the broker log - I only see the exception (I'm 
using the default log4j config, that has 'info' severity). And also - it seems 
that invalid queries are not logged to the request logger 
(druid.request.logging).
   The motivation here is that I have a monitor on the Druid logs, so I know 
when an invalid query is somehow sent. But when it happens, I don't know what 
query it was... (the query can come from many servers)
   
   Thanks!
   
   Eran





[GitHub] salman028 closed issue #6008: groupBy Query took too much time

2018-07-18 Thread GitBox
salman028 closed issue #6008: groupBy Query took too much time 
URL: https://github.com/apache/incubator-druid/issues/6008
 
 
   





[GitHub] hellobabygogo commented on issue #5871: fix push supervisor error

2018-07-18 Thread GitBox
hellobabygogo commented on issue #5871: fix push supervisor error
URL: https://github.com/apache/incubator-druid/pull/5871#issuecomment-405858208
 
 
   @jihoonson yes, it avoids pushing continuously, which causes the tasks to fail.
   
   






Re: Multi-threaded Druid Tests/Benchmarks

2018-07-18 Thread Anastasia Braginsky
 So this is probably where we can help with the Oak-based incremental index. Can 
you please give me any reference to those tests? Any descriptions?
Thanks!

On Tuesday, July 17, 2018, 8:59:57 PM GMT+3, Charles Allen 
 wrote:  
 
 Unfortunately I think multi-threaded test coverage is kind of weak and
historically very hard to test. There are some topN benchmarks but they are
very limited as they don't scale well (heap gets blasted from incremental
index) with a large concurrency level.

On Sun, Jul 15, 2018 at 6:35 AM Anastasia Braginsky
 wrote:

> Hi Everybody,
> From last Tuesday Druid's meeting I recall Charles mentioned some Druid's
> multi-threaded tests/benchmarks that can be applied end-to-end to check the
> performance.
> Can I get some references/names so I can start investigating this
> direction from multi-threaded Oak-in-Druid perspective? Thanks!
>
>