[GitHub] clintropolis commented on a change in pull request #5957: Various changes

2018-07-19 Thread GitBox
clintropolis commented on a change in pull request #5957: Various changes
URL: https://github.com/apache/incubator-druid/pull/5957#discussion_r203922348
 
 

 ##
 File path: processing/src/main/java/io/druid/segment/ColumnSelector.java
 ##
 @@ -31,5 +31,5 @@
   List getColumnNames();
 
   @Nullable
-  Column getColumn(String columnName);
+  ColumnHolder getColumn(String columnName);
 
 Review comment:
  Nit, maybe this should be called `getColumnHolder` or something, to be a bit 
clearer: `ColumnHolder.getColumn` also exists, and there are a handful of 
places that are effectively calling `selector.getColumn().getColumn()`, which 
reads kind of funny.
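The naming concern above can be illustrated with a toy sketch (hypothetical stand-in types, not Druid's actual interfaces; `ColumnHolder` here holds just a `String` for brevity):

```java
// Hypothetical sketch of the naming issue raised in the review comment.
// Interface names follow the diff; bodies are stand-ins for illustration.
interface ColumnHolder {
    String getColumn(); // the actual column (a String here for brevity)
}

interface ColumnSelector {
    // Current name: callers end up writing getColumn().getColumn().
    ColumnHolder getColumn(String columnName);

    // Suggested name: the chained call reads naturally.
    default ColumnHolder getColumnHolder(String columnName) {
        return getColumn(columnName);
    }
}

public class NamingSketch {
    public static void main(String[] args) {
        ColumnSelector selector = name -> () -> name.toUpperCase();
        // Before: reads as if a column were fetched twice.
        System.out.println(selector.getColumn("ts").getColumn());       // TS
        // After: the holder/column distinction is explicit at the call site.
        System.out.println(selector.getColumnHolder("ts").getColumn()); // TS
    }
}
```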


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



Re: Build failure on 0.13.SNAPSHOT

2018-07-19 Thread Dongjin Lee
Hi Jihoon,

I ran `mvn clean package`, following the development/build docs.

Dongjin

On Fri, Jul 20, 2018 at 12:30 AM Jihoon Son  wrote:

> Hi Dongjin,
>
> what maven command did you run?
>
> Jihoon
>
> On Wed, Jul 18, 2018 at 10:38 PM Dongjin Lee  wrote:
>
> > Hello. I am trying to build Druid, but it fails. My environment is as
> > follows:
> >
> > - CPU: Intel(R) Core(TM) i7-7560U CPU @ 2.40GHz
> > - RAM: 7704 MB
> > - OS: Ubuntu 18.04
> > - JDK: openjdk version "1.8.0_171" (default configuration, with MaxHeapSize = 1928 MB)
> > - Branch: master (commit: cd8ea3d)
> >
> > The error message I got is:
> >
> > > [INFO] ------------------------------------------------------------------------
> > > [INFO] Reactor Summary:
> > > [INFO]
> > > [INFO] io.druid:druid ..................................... SUCCESS [ 50.258 s]
> > > [INFO] java-util .......................................... SUCCESS [03:57 min]
> > > [INFO] druid-api .......................................... SUCCESS [ 22.694 s]
> > > [INFO] druid-common ....................................... SUCCESS [ 14.083 s]
> > > [INFO] druid-hll .......................................... SUCCESS [ 17.126 s]
> > > [INFO] extendedset ........................................ SUCCESS [ 10.856 s]
> > > [INFO] druid-processing ................................... FAILURE [04:36 min]
> > > [INFO] druid-aws-common ................................... SKIPPED
> > > [INFO] druid-server ....................................... SKIPPED
> > > [INFO] druid-examples ..................................... SKIPPED
> > > ...
> > > [INFO] ------------------------------------------------------------------------
> > > [INFO] BUILD FAILURE
> > > [INFO] ------------------------------------------------------------------------
> > > [INFO] Total time: 10:29 min
> > > [INFO] Finished at: 2018-07-19T13:23:31+09:00
> > > [INFO] Final Memory: 88M/777M
> > > [INFO] ------------------------------------------------------------------------
> > > [ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.19.1:test (default-test) on project druid-processing: Execution default-test of goal org.apache.maven.plugins:maven-surefire-plugin:2.19.1:test failed: The forked VM terminated without properly saying goodbye. VM crash or System.exit called?
> > > [ERROR] Command was /bin/sh -c cd /home/djlee/workspace/java/druid/processing && /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -Xmx3000m -Duser.language=en -Duser.country=US -Dfile.encoding=UTF-8 -Duser.timezone=UTC -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager -Ddruid.indexing.doubleStorage=double -jar /home/djlee/workspace/java/druid/processing/target/surefire/surefirebooter1075382243904099051.jar /home/djlee/workspace/java/druid/processing/target/surefire/surefire559351134757209tmp /home/djlee/workspace/java/druid/processing/target/surefire/surefire_5173894389718744688tmp
> >
> > It seems like it fails when it runs tests on the `druid-processing` module,
> > but I can't be certain. Is there anyone who can give me some hints? Thanks in
> > advance.
> >
> > Best,
> > Dongjin
> >
> > --
> > *Dongjin Lee*
> >
> > *A hitchhiker in the mathematical world.*
> >
> > *github: github.com/dongjinleekr
> > linkedin: kr.linkedin.com/in/dongjinleekr
> > slideshare: www.slideshare.net/dongjinleekr*
> >
>
> --
> *Dongjin Lee*
>
> *A hitchhiker in the mathematical world.*
>
> *github: github.com/dongjinleekr
> linkedin: kr.linkedin.com/in/dongjinleekr
> slideshare: www.slideshare.net/dongjinleekr*
>


[GitHub] jihoonson commented on issue #5471: Implement force push down for nested group by query

2018-07-19 Thread GitBox
jihoonson commented on issue #5471: Implement force push down for nested group 
by query
URL: https://github.com/apache/incubator-druid/pull/5471#issuecomment-406445172
 
 
   Hi @samarthjain, sorry for the delay. I'm looking at this PR again, and the 
size of this PR looks huge. Would you check that this PR contains only your 
changes?





[GitHub] jon-wei closed pull request #5998: Add support to filter on datasource for active tasks

2018-07-19 Thread GitBox
jon-wei closed pull request #5998: Add support to filter on datasource for 
active tasks
URL: https://github.com/apache/incubator-druid/pull/5998
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/common/src/main/java/io/druid/metadata/MetadataStorageActionHandler.java b/common/src/main/java/io/druid/metadata/MetadataStorageActionHandler.java
index 892bf761716..dca2dd006b9 100644
--- a/common/src/main/java/io/druid/metadata/MetadataStorageActionHandler.java
+++ b/common/src/main/java/io/druid/metadata/MetadataStorageActionHandler.java
@@ -120,7 +120,7 @@ void insert(
   *
   * @return list of {@link TaskInfo}
   */
-  List> getActiveTaskInfo();
+  List> getActiveTaskInfo(@Nullable String dataSource);
 
  /**
   * Return createdDate and dataSource for the given id
diff --git a/indexing-service/src/main/java/io/druid/indexing/overlord/HeapMemoryTaskStorage.java b/indexing-service/src/main/java/io/druid/indexing/overlord/HeapMemoryTaskStorage.java
index 2e91c3f6026..341cf8ae41d 100644
--- a/indexing-service/src/main/java/io/druid/indexing/overlord/HeapMemoryTaskStorage.java
+++ b/indexing-service/src/main/java/io/druid/indexing/overlord/HeapMemoryTaskStorage.java
@@ -170,7 +170,7 @@ public void setStatus(TaskStatus status)
  }
 
  @Override
-  public List> getActiveTaskInfo()
+  public List> getActiveTaskInfo(@Nullable String dataSource)
  {
    giant.lock();
 
diff --git a/indexing-service/src/main/java/io/druid/indexing/overlord/MetadataTaskStorage.java b/indexing-service/src/main/java/io/druid/indexing/overlord/MetadataTaskStorage.java
index e8489cffdb1..b927497b022 100644
--- a/indexing-service/src/main/java/io/druid/indexing/overlord/MetadataTaskStorage.java
+++ b/indexing-service/src/main/java/io/druid/indexing/overlord/MetadataTaskStorage.java
@@ -214,10 +214,10 @@ public Task apply(@Nullable Pair input)
  }
 
  @Override
-  public List> getActiveTaskInfo()
+  public List> getActiveTaskInfo(@Nullable String dataSource)
  {
    return ImmutableList.copyOf(
-        handler.getActiveTaskInfo()
+        handler.getActiveTaskInfo(dataSource)
    );
  }
 
diff --git a/indexing-service/src/main/java/io/druid/indexing/overlord/TaskStorage.java b/indexing-service/src/main/java/io/druid/indexing/overlord/TaskStorage.java
index da7a07ab76f..b24dd35a123 100644
--- a/indexing-service/src/main/java/io/druid/indexing/overlord/TaskStorage.java
+++ b/indexing-service/src/main/java/io/druid/indexing/overlord/TaskStorage.java
@@ -127,9 +127,11 @@
   * Returns a list of currently running or pending tasks as stored in the storage facility as {@link TaskInfo}. No particular order
   * is guaranteed, but implementations are encouraged to return tasks in ascending order of creation.
   *
+   * @param datasource datasource
+   *
   * @return list of {@link TaskInfo}
   */
-  List> getActiveTaskInfo();
+  List> getActiveTaskInfo(@Nullable String dataSource);
 
  /**
   * Returns up to {@code maxTaskStatuses} {@link TaskInfo} objects of recently finished tasks as stored in the storage facility. No
@@ -137,7 +139,8 @@
   * No particular standard of "recent" is guaranteed, and in fact, this method is permitted to simply return nothing.
   *
   * @param maxTaskStatuses maxTaskStatuses
-   * @param duration duration
+   * @param duration        duration
+   * @param datasource      datasource
   *
   * @return list of {@link TaskInfo}
   */
diff --git a/indexing-service/src/main/java/io/druid/indexing/overlord/TaskStorageQueryAdapter.java b/indexing-service/src/main/java/io/druid/indexing/overlord/TaskStorageQueryAdapter.java
index 4c3815cf9aa..c1cd2b3b7df 100644
--- a/indexing-service/src/main/java/io/druid/indexing/overlord/TaskStorageQueryAdapter.java
+++ b/indexing-service/src/main/java/io/druid/indexing/overlord/TaskStorageQueryAdapter.java
@@ -55,9 +55,9 @@ public TaskStorageQueryAdapter(TaskStorage storage)
    return storage.getActiveTasks();
  }
 
-  public List> getActiveTaskInfo()
+  public List> getActiveTaskInfo(@Nullable String dataSource)
  {
-    return storage.getActiveTaskInfo();
+    return storage.getActiveTaskInfo(dataSource);
  }
 
  public List> getRecentlyCompletedTaskInfo(
diff --git a/indexing-service/src/main/java/io/druid/indexing/overlord/http/OverlordResource.java b/indexing-service/src/main/java/io/druid/indexing/overlord/http/OverlordResource.java
index 895ea8ff3d5..68b5ff69167 100644
--- a/indexing-service/src/main/java/io/druid/indexing/overlord/http/OverlordResource.java
+++ b/indexing-service/src/main/java/io/druid/indexing/overlord/http/OverlordResource.java
@@ -629,10 +629,10 @@ public Response getTasks(
      finalTaskList.addAll(completedTasks);
    }
 
-    List> allActiveTaskInfo = 
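The behavior this PR adds (filtering active tasks by datasource, with `null` meaning "no filter") can be sketched in isolation as follows. This is a hedged toy version: `TaskInfo` here is a stand-in carrying just `id` and `dataSource`, not Druid's real class.

```java
import java.util.List;
import java.util.stream.Collectors;

public class ActiveTaskFilterSketch {
    // Simplified stand-in for Druid's TaskInfo.
    static class TaskInfo {
        final String id;
        final String dataSource;
        TaskInfo(String id, String dataSource) {
            this.id = id;
            this.dataSource = dataSource;
        }
    }

    // Mirrors the new getActiveTaskInfo(@Nullable String dataSource) contract:
    // null returns everything (old behavior), otherwise only matching tasks.
    static List<TaskInfo> getActiveTaskInfo(List<TaskInfo> active, String dataSource) {
        if (dataSource == null) {
            return active; // no filter
        }
        return active.stream()
                     .filter(t -> dataSource.equals(t.dataSource))
                     .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<TaskInfo> active = List.of(
            new TaskInfo("t1", "wiki"),
            new TaskInfo("t2", "logs"),
            new TaskInfo("t3", "wiki"));
        System.out.println(getActiveTaskInfo(active, null).size());   // 3
        System.out.println(getActiveTaskInfo(active, "wiki").size()); // 2
    }
}
```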

Re: Question on GroupBy query results merging process

2018-07-19 Thread Jisoo Kim
Hi Jihoon,

Thanks for the reply. So what I ended up doing for merging a list of
serialized result Sequences (each a byte array) was:

1) Create a stream of serialized sequences out of the list.
2) For each serialized sequence in the list, create a query runner that
deserializes the byte array and returns a Sequence (along with applying
PreComputeManipulatorFn). The stream now becomes a stream of QueryRunners.
3) Call queryRunnerFactory.mergeRunners() (the factory is created from the
injector and the given query) on the materialized list of QueryRunners.
4) Create a FluentQueryRunner out of 3) and add the necessary steps, including
mergeResults(), which essentially calls queryToolChest.mergeResults() on
queryRunnerFactory.mergeRunners().
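The pipeline in steps 1)-4) can be sketched roughly like this. All types are stand-ins, not Druid's actual API: a `Supplier<List<String>>` plays the role of a QueryRunner producing a sorted Sequence, and "deserialization" is just splitting a byte array.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;
import java.util.function.Supplier;
import java.util.stream.Collectors;

public class MergeSketch {
    // Step 2: a "query runner" that deserializes one byte[] into sorted rows.
    static Supplier<List<String>> runnerFor(byte[] serialized) {
        return () -> Arrays.asList(
            new String(serialized, StandardCharsets.UTF_8).split(","));
    }

    // Steps 3-4: merge all runners' outputs into one sorted, de-duplicated
    // result; the per-runner deserialization runs in parallel.
    static List<String> mergeRunners(List<Supplier<List<String>>> runners) {
        return new ArrayList<>(
            runners.parallelStream()
                   .map(Supplier::get)
                   .flatMap(List::stream)
                   .collect(Collectors.toCollection(TreeSet::new)));
    }

    public static void main(String[] args) {
        List<Supplier<List<String>>> runners = List.of(
            runnerFor("a,c".getBytes(StandardCharsets.UTF_8)),
            runnerFor("b,c".getBytes(StandardCharsets.UTF_8)));
        System.out.println(mergeRunners(runners)); // [a, b, c]
    }
}
```

Note the parallelism here comes only from deserializing inputs concurrently, which mirrors the open question below about where the observed speedup actually comes from.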

Does my approach look valid, or is it something I shouldn't be doing
for merging query results? Before I changed the merging logic to the above,
I encountered a problem with merging sub-query results properly for very
heavy groupBy queries.

I haven't had much chance to read through all the groupBy query processing
logic, but I am wondering whether the improvement I gained from changing the
logic was mainly from deserializing the sub-query results in parallel (by
calling queryRunnerFactory.mergeRunners(), which seems to enable
parallelism), or whether it was also benefitting from using
GroupByMergingQueryRunnerV2, which has parallel combining threads enabled.

Thanks,
Jisoo

On Thu, Jul 19, 2018 at 3:06 PM, Jihoon Son  wrote:

> Hi Jisoo,
>
> sorry, the previous email was sent by accident.
>
> The initial version of groupBy v2 wasn't capable of combining intermediates
> in parallel. Some of our customers met a similar issue to yours, and so I
> was working on improving groupBy v2 performance for a while.
>
> Parallel combining on brokers definitely makes sense. I was thinking of adding
> a sort of ParallelMergeSequence, which is a parallel version of
> MergeSequence, but it can be anything that supports parallel combining on
> brokers.
>
> One thing I'm worried about is that most query processing interfaces in
> brokers use Sequence, and thus using something else for a specific
> query type might make the code complicated. I think we need to avoid that if
> possible.
>
> Best,
> Jihoon
>
> On Thu, Jul 19, 2018 at 2:58 PM Jihoon Son  wrote:
>
> > Hi Jisoo,
> >
> > the initial version of groupBy v2
> >
> > On Thu, Jul 19, 2018 at 2:42 PM Jisoo Kim 
> > wrote:
> >
> >> Hi all,
> >>
> >> I am currently working on a project that uses Druid's QueryRunner and
> >> other
> >> druid-processing classes. It uses Druid's own classes to calculate query
> >> results. I have been testing large GroupBy queries (using v2), and it
> >> seems
> >> like parallel combining threads for GroupBy queries are only enabled on
> >> the
> >> historical level. I think it is only getting called by
> >> GroupByStrategyV2.mergeRunners()
> >> <
> >> https://github.com/apache/incubator-druid/blob/druid-0.
> 12.1/processing/src/main/java/io/druid/query/groupby/
> strategy/GroupByStrategyV2.java#L335
> >> >
> >> which is only called by GroupByQueryRunnerFactory.mergeRunners() on
> >> historicals.
> >>
> >> Are GroupByMergingQueryRunnerV2 and parallel combining threads meant for
> >> computing and merging per-segment results only, or can they also be used
> >> on
> >> the broker level? I changed the logic of my project from calling
> >> queryToolChest.mergeResults() on MergeSequence (created by providing a
> >> list
> >> of per-segment/per-server sequences) to calling
> >> queryToolChest.mergeResults() on queryRunnerFactory.mergeRunners()
> (where
> >> each runner returns a deserialized result sequence), and that seemed to
> >> have reduced really heavy groupby query computation time or failures by
> >> quite a lot. Or is this just a coincidence and there shouldn't be a
> >> performance difference in merging groupby query results, and the only
> >> difference could've been by parallelizing the deserialization of result
> >> sequences from sub-queries?
> >>
> >> Thanks,
> >> Jisoo
> >>
> >
>


Re: Question on GroupBy query results merging process

2018-07-19 Thread Jihoon Son
Hi Jisoo,

sorry, the previous email was sent by accident.

The initial version of groupBy v2 wasn't capable of combining intermediates
in parallel. Some of our customers met a similar issue to yours, and so I
was working on improving groupBy v2 performance for a while.

Parallel combining on brokers definitely makes sense. I was thinking of adding
a sort of ParallelMergeSequence, which is a parallel version of
MergeSequence, but it can be anything that supports parallel combining on
brokers.

One thing I'm worried about is that most query processing interfaces in
brokers use Sequence, and thus using something else for a specific
query type might make the code complicated. I think we need to avoid that if
possible.

Best,
Jihoon
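The ParallelMergeSequence idea described above can be sketched as merging sorted inputs pairwise and concurrently, then merging the partial results. This is a hedged illustration of the concept only, not Druid's Sequence API; types and structure are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class ParallelMergeSketch {
    // Standard two-way merge of sorted lists.
    static List<Integer> merge(List<Integer> a, List<Integer> b) {
        List<Integer> out = new ArrayList<>(a.size() + b.size());
        int i = 0, j = 0;
        while (i < a.size() && j < b.size()) {
            out.add(a.get(i) <= b.get(j) ? a.get(i++) : b.get(j++));
        }
        out.addAll(a.subList(i, a.size()));
        out.addAll(b.subList(j, b.size()));
        return out;
    }

    // Merge many sorted lists, combining pairs concurrently each round.
    static List<Integer> parallelMerge(List<List<Integer>> inputs) {
        List<List<Integer>> round = inputs;
        while (round.size() > 1) {
            List<CompletableFuture<List<Integer>>> futures = new ArrayList<>();
            for (int k = 0; k + 1 < round.size(); k += 2) {
                List<Integer> a = round.get(k), b = round.get(k + 1);
                futures.add(CompletableFuture.supplyAsync(() -> merge(a, b)));
            }
            List<List<Integer>> next = new ArrayList<>();
            if (round.size() % 2 == 1) {
                next.add(round.get(round.size() - 1)); // carry the odd one out
            }
            futures.forEach(f -> next.add(f.join()));
            round = next;
        }
        return round.get(0);
    }

    public static void main(String[] args) {
        System.out.println(parallelMerge(List.of(
            List.of(1, 4), List.of(2, 5), List.of(3, 6))));
        // prints [1, 2, 3, 4, 5, 6]
    }
}
```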

On Thu, Jul 19, 2018 at 2:58 PM Jihoon Son  wrote:

> Hi Jisoo,
>
> the initial version of groupBy v2
>
> On Thu, Jul 19, 2018 at 2:42 PM Jisoo Kim 
> wrote:
>
>> Hi all,
>>
>> I am currently working on a project that uses Druid's QueryRunner and
>> other
>> druid-processing classes. It uses Druid's own classes to calculate query
>> results. I have been testing large GroupBy queries (using v2), and it
>> seems
>> like parallel combining threads for GroupBy queries are only enabled on
>> the
>> historical level. I think it is only getting called by
>> GroupByStrategyV2.mergeRunners()
>> <
>> https://github.com/apache/incubator-druid/blob/druid-0.12.1/processing/src/main/java/io/druid/query/groupby/strategy/GroupByStrategyV2.java#L335
>> >
>> which is only called by GroupByQueryRunnerFactory.mergeRunners() on
>> historicals.
>>
>> Are GroupByMergingQueryRunnerV2 and parallel combining threads meant for
>> computing and merging per-segment results only, or can they also be used
>> on
>> the broker level? I changed the logic of my project from calling
>> queryToolChest.mergeResults() on MergeSequence (created by providing a
>> list
>> of per-segment/per-server sequences) to calling
>> queryToolChest.mergeResults() on queryRunnerFactory.mergeRunners() (where
>> each runner returns a deserialized result sequence), and that seemed to
>> have reduced really heavy groupby query computation time or failures by
>> quite a lot. Or is this just a coincidence and there shouldn't be a
>> performance difference in merging groupby query results, and the only
>> difference could've been by parallelizing the deserialization of result
>> sequences from sub-queries?
>>
>> Thanks,
>> Jisoo
>>
>


Re: Question on GroupBy query results merging process

2018-07-19 Thread Jihoon Son
Hi Jisoo,

the initial version of groupBy v2

On Thu, Jul 19, 2018 at 2:42 PM Jisoo Kim 
wrote:

> Hi all,
>
> I am currently working on a project that uses Druid's QueryRunner and other
> druid-processing classes. It uses Druid's own classes to calculate query
> results. I have been testing large GroupBy queries (using v2), and it seems
> like parallel combining threads for GroupBy queries are only enabled on the
> historical level. I think it is only getting called by
> GroupByStrategyV2.mergeRunners()
> <
> https://github.com/apache/incubator-druid/blob/druid-0.12.1/processing/src/main/java/io/druid/query/groupby/strategy/GroupByStrategyV2.java#L335
> >
> which is only called by GroupByQueryRunnerFactory.mergeRunners() on
> historicals.
>
> Are GroupByMergingQueryRunnerV2 and parallel combining threads meant for
> computing and merging per-segment results only, or can they also be used on
> the broker level? I changed the logic of my project from calling
> queryToolChest.mergeResults() on MergeSequence (created by providing a list
> of per-segment/per-server sequences) to calling
> queryToolChest.mergeResults() on queryRunnerFactory.mergeRunners() (where
> each runner returns a deserialized result sequence), and that seemed to
> have reduced really heavy groupby query computation time or failures by
> quite a lot. Or is this just a coincidence and there shouldn't be a
> performance difference in merging groupby query results, and the only
> difference could've been by parallelizing the deserialization of result
> sequences from sub-queries?
>
> Thanks,
> Jisoo
>


Question on GroupBy query results merging process

2018-07-19 Thread Jisoo Kim
Hi all,

I am currently working on a project that uses Druid's QueryRunner and other
druid-processing classes. It uses Druid's own classes to calculate query
results. I have been testing large GroupBy queries (using v2), and it seems
like parallel combining threads for GroupBy queries are only enabled on the
historical level. I think it is only getting called by
GroupByStrategyV2.mergeRunners()

which is only called by GroupByQueryRunnerFactory.mergeRunners() on
historicals.

Are GroupByMergingQueryRunnerV2 and parallel combining threads meant for
computing and merging per-segment results only, or can they also be used on
the broker level? I changed the logic of my project from calling
queryToolChest.mergeResults() on MergeSequence (created by providing a list
of per-segment/per-server sequences) to calling
queryToolChest.mergeResults() on queryRunnerFactory.mergeRunners() (where
each runner returns a deserialized result sequence), and that seemed to
reduce really heavy groupBy query computation time or failures by
quite a lot. Or is this just a coincidence and there shouldn't be a
performance difference in merging groupBy query results, with the only
difference coming from parallelizing the deserialization of result
sequences from sub-queries?

Thanks,
Jisoo


Re: Issue with segments not loading/taking a long time

2018-07-19 Thread Jihoon Son
Hi Samarth,

IIRC, nothing has changed around loading local segments when
historicals start up.
The log above suggests that you have 4816 segments to load.
How long does it take to load all of them? How slow is it compared to
before?

Jihoon

On Thu, Jul 19, 2018 at 1:37 PM Samarth Jain  wrote:

> Thanks for the reply, Clint. It does look related.
>
> We also noticed that historicals are taking a long time to download the
> segments after a restart. At least in 0.10.1, restart of a historical
> wouldn't be a big deal as the segments it is responsible for serving were
> still available on the local disk and so after a restart, the segment load
> process would finish up fast.
> With 0.12.1, I am seeing that a restarted historical is spending a lot of
> time loading segments.
>
> Sample log line:
>
> *@40005b50ee941f5ead84 2018-07-19T20:03:22,526 INFO
> [Segment-Load-Startup-5]
> io.druid.server.coordination.SegmentLoadDropHandler - Loading
>
> segment[3681/4816][xyz_druid_raw_2015-10-11T00:00:00.000Z_2015-10-12T00:00:00.000Z_2018-03-14T23:58:34.091Z]*
>
>
> Has something changed around how segment assignment works since 0.10.1
> that could cause this change in behavior? I could always dig into the code
> and try to figure it out, but considering you have actively worked in this
> area recently, I thought you might have an idea?
>
> On Thu, Jul 19, 2018 at 1:26 PM, Clint Wylie  wrote:
>
> > You might be running into something related to these issues
> > https://github.com/apache/incubator-druid/issues/5531 and
> > https://github.com/apache/incubator-druid/issues/5882, the former of
> which
> > should be fixed in 0.12.2. The effects of these issues can be at least
> > partially mitigated by setting maxSegmentsInNodeLoadingQueue and
> > maxSegmentsToMove (http://druid.io/docs/latest/configuration/coordinator.html)
> > to limit how deep load queues get and minimizing the number of bad
> > decisions the coordinator makes when a historical disappears due to zk
> > blip, upgrade, or anything else.
> >
> > On Thu, Jul 19, 2018 at 1:10 PM, Samarth Jain 
> > wrote:
> >
> > > Hi Jihoon,
> > >
> > > I have a 6 node historical test cluster. 3 nodes are at ~80% and the
> > other
> > > two at ~60 and ~50% disk utilization.
> > >
> > > The interesting thing is that the 6th node ended up getting into zk
> > timeout
> > > (because of large GC pause) and is no longer part of the cluster (which
> > is
> > > a separate issue I am trying to figure out).
> > > On this 6th node, I see that it is busy loading segments. However, once
> > it
> > > is done downloading, I am not sure if it will report back to zk as
> being
> > > available.
> > >
> > >
> > >
> > >
> > >
> > > On Thu, Jul 19, 2018 at 12:58 PM, Jihoon Son 
> wrote:
> > >
> > > > Hi Samarth,
> > > >
> > > > have you had a chance to check the segment balancing status of your
> > > > cluster?
> > > > Do you see any significant imbalance between historicals?
> > > >
> > > > Jihoon
> > > >
> > > > On Thu, Jul 19, 2018 at 12:28 PM Samarth Jain <
> samarth.j...@gmail.com>
> > > > wrote:
> > > >
> > > > > I am working on upgrading our internal cluster to 0.12.1 release
> and
> > > > seeing
> > > > > that a few data sources fail to load. Looking at coordinator logs,
> I
> > am
> > > > > seeing messages like this for the datasource:
> > > > >
> > > > > @40005b50dbc637061cec 2018-07-19T18:43:08,923 INFO
> > > > > [Coordinator-Exec--0] io.druid.server.coordinator.
> > CuratorLoadQueuePeon
> > > -
> > > > > Asking server peon[/druid-test--001/loadQueue/127.0.0.1:7103] to
> > drop
> > > > > segment[*datasource*
> > > > >
> > > > > _2015-09-03T00:00:00.000Z_2015-09-04T00:00:00.000Z_2018-
> > > > 04-23T21:24:04.910Z]
> > > > >
> > > > >
> > > > >
> > > > > @40005b50dbc637391f84 2018-07-19T18:43:08,926 WARN
> > > > > [Coordinator-Exec--0] io.druid.server.coordinator.rules.LoadRule -
> > No
> > > > > available [_default_tier] servers or node capacity to assign
> primary
> > > > >
> > > > > segment[*datasource*-08-10T00:00:00.000Z_2015-08-11T00:00:
> > > > 00.000Z_2018-04-23T21:24:04.910Z]!
> > > > > Expected Replicants[1]
> > > > >
> > > > >
> > > > > The datasource failed to load for a long time and then eventually
> was
> > > > > loaded successfully. Has anyone else seen this? I see a few fixes
> > > around
> > > > > segment loading and coordination in 0.12.2 (which I am hoping will
> be
> > > out
> > > > > soon) but I am not sure if they address this issue.
> > > > >
> > > >
> > >
> >
>


[GitHub] jihoonson edited a comment on issue #4434: [Proposal] Automatic background segment compaction

2018-07-19 Thread GitBox
jihoonson edited a comment on issue #4434: [Proposal] Automatic background 
segment compaction
URL: 
https://github.com/apache/incubator-druid/issues/4434#issuecomment-311573364
 
 
   I have to admit that this proposal is quite complicated and tricky. I'll 
raise another proposal for this problem soon.





Re: Issue with segments not loading/taking a long time

2018-07-19 Thread Samarth Jain
Thanks for the reply, Clint. It does look related.

We also noticed that historicals are taking a long time to download the
segments after a restart. At least in 0.10.1, restart of a historical
wouldn't be a big deal as the segments it is responsible for serving were
still available on the local disk and so after a restart, the segment load
process would finish up fast.
With 0.12.1, I am seeing that a restarted historical is spending a lot of
time loading segments.

Sample log line:

@40005b50ee941f5ead84 2018-07-19T20:03:22,526 INFO [Segment-Load-Startup-5]
io.druid.server.coordination.SegmentLoadDropHandler - Loading
segment[3681/4816][xyz_druid_raw_2015-10-11T00:00:00.000Z_2015-10-12T00:00:00.000Z_2018-03-14T23:58:34.091Z]


Has something changed around how segment assignment works since 0.10.1
that could cause this change in behavior? I could always dig into the code
and try to figure it out, but considering you have actively worked in this
area recently, I thought you might have an idea?

On Thu, Jul 19, 2018 at 1:26 PM, Clint Wylie  wrote:

> You might be running into something related to these issues
> https://github.com/apache/incubator-druid/issues/5531 and
> https://github.com/apache/incubator-druid/issues/5882, the former of which
> should be fixed in 0.12.2. The effects of these issues can be at least
> partially mitigated by setting maxSegmentsInNodeLoadingQueue and
> maxSegmentsToMove (http://druid.io/docs/latest/configuration/coordinator.html)
> to limit how deep load queues get and minimizing the number of bad
> decisions the coordinator makes when a historical disappears due to zk
> blip, upgrade, or anything else.
>
> On Thu, Jul 19, 2018 at 1:10 PM, Samarth Jain 
> wrote:
>
> > Hi Jihoon,
> >
> > I have a 6 node historical test cluster. 3 nodes are at ~80% and the
> other
> > two at ~60 and ~50% disk utilization.
> >
> > The interesting thing is that the 6th node ended up getting into zk
> timeout
> > (because of large GC pause) and is no longer part of the cluster (which
> is
> > a separate issue I am trying to figure out).
> > On this 6th node, I see that it is busy loading segments. However, once
> it
> > is done downloading, I am not sure if it will report back to zk as being
> > available.
> >
> >
> >
> >
> >
> > On Thu, Jul 19, 2018 at 12:58 PM, Jihoon Son  wrote:
> >
> > > Hi Samarth,
> > >
> > > have you had a chance to check the segment balancing status of your
> > > cluster?
> > > Do you see any significant imbalance between historicals?
> > >
> > > Jihoon
> > >
> > > On Thu, Jul 19, 2018 at 12:28 PM Samarth Jain 
> > > wrote:
> > >
> > > > I am working on upgrading our internal cluster to 0.12.1 release and
> > > seeing
> > > > that a few data sources fail to load. Looking at coordinator logs, I
> am
> > > > seeing messages like this for the datasource:
> > > >
> > > > @40005b50dbc637061cec 2018-07-19T18:43:08,923 INFO
> > > > [Coordinator-Exec--0] io.druid.server.coordinator.
> CuratorLoadQueuePeon
> > -
> > > > Asking server peon[/druid-test--001/loadQueue/127.0.0.1:7103] to
> drop
> > > > segment[*datasource*
> > > >
> > > > _2015-09-03T00:00:00.000Z_2015-09-04T00:00:00.000Z_2018-
> > > 04-23T21:24:04.910Z]
> > > >
> > > >
> > > >
> > > > @40005b50dbc637391f84 2018-07-19T18:43:08,926 WARN
> > > > [Coordinator-Exec--0] io.druid.server.coordinator.rules.LoadRule -
> No
> > > > available [_default_tier] servers or node capacity to assign primary
> > > >
> > > > segment[*datasource*-08-10T00:00:00.000Z_2015-08-11T00:00:
> > > 00.000Z_2018-04-23T21:24:04.910Z]!
> > > > Expected Replicants[1]
> > > >
> > > >
> > > > The datasource failed to load for a long time and then eventually was
> > > > loaded successfully. Has anyone else seen this? I see a few fixes
> > around
> > > > segment loading and coordination in 0.12.2 (which I am hoping will be
> > out
> > > > soon) but I am not sure if they address this issue.
> > > >
> > >
> >
>


Re: Issue with segments not loading/taking a long time

2018-07-19 Thread Clint Wylie
You might be running into something related to these issues:
https://github.com/apache/incubator-druid/issues/5531 and
https://github.com/apache/incubator-druid/issues/5882, the former of which
should be fixed in 0.12.2. The effects of these issues can be at least
partially mitigated by setting maxSegmentsInNodeLoadingQueue and
maxSegmentsToMove (http://druid.io/docs/latest/configuration/coordinator.html)
to limit how deep load queues get and to minimize the number of bad
decisions the coordinator makes when a historical disappears due to a zk
blip, an upgrade, or anything else.
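For reference, both knobs live in the coordinator dynamic configuration (posted to the coordinator's /druid/coordinator/v1/config endpoint). A minimal sketch with illustrative values; the numbers below are assumptions to tune per cluster, not recommendations:

```json
{
  "maxSegmentsToMove": 5,
  "maxSegmentsInNodeLoadingQueue": 100
}
```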

On Thu, Jul 19, 2018 at 1:10 PM, Samarth Jain 
wrote:

> Hi Jihoon,
>
> I have a 6 node historical test cluster. 3 nodes are at ~80% and the other
> two at ~60 and ~50% disk utilization.
>
> The interesting thing is that the 6th node ended up getting into zk timeout
> (because of large GC pause) and is no longer part of the cluster (which is
> a separate issue I am trying to figure out).
> On this 6th node, I see that it is busy loading segments. However, once it
> is done downloading, I am not sure if it will report back to zk as being
> available.
>
>
>
>
>
> On Thu, Jul 19, 2018 at 12:58 PM, Jihoon Son  wrote:
>
> > Hi Samarth,
> >
> > have you had a chance to check the segment balancing status of your
> > cluster?
> > Do you see any significant imbalance between historicals?
> >
> > Jihoon
> >
> > On Thu, Jul 19, 2018 at 12:28 PM Samarth Jain 
> > wrote:
> >
> > > I am working on upgrading our internal cluster to 0.12.1 release and
> > seeing
> > > that a few data sources fail to load. Looking at coordinator logs, I am
> > > seeing messages like this for the datasource:
> > >
> > > @40005b50dbc637061cec 2018-07-19T18:43:08,923 INFO
> > > [Coordinator-Exec--0] io.druid.server.coordinator.CuratorLoadQueuePeon
> -
> > > Asking server peon[/druid-test--001/loadQueue/127.0.0.1:7103] to drop
> > > segment[*datasource*
> > >
> > > _2015-09-03T00:00:00.000Z_2015-09-04T00:00:00.000Z_2018-
> > 04-23T21:24:04.910Z]
> > >
> > >
> > >
> > > @40005b50dbc637391f84 2018-07-19T18:43:08,926 WARN
> > > [Coordinator-Exec--0] io.druid.server.coordinator.rules.LoadRule - No
> > > available [_default_tier] servers or node capacity to assign primary
> > >
> > > segment[*datasource*-08-10T00:00:00.000Z_2015-08-11T00:00:
> > 00.000Z_2018-04-23T21:24:04.910Z]!
> > > Expected Replicants[1]
> > >
> > >
> > > The datasource failed to load for a long time and then eventually was
> > > loaded successfully. Has anyone else seen this? I see a few fixes
> around
> > > segment loading and coordination in 0.12.2 (which I am hoping will be
> out
> > > soon) but I am not sure if they address this issue.
> > >
> >
>


Re: Issue with segments not loading/taking a long time

2018-07-19 Thread Samarth Jain
Hi Jihoon,

I have a 6 node historical test cluster. 3 nodes are at ~80% and the other
two at ~60 and ~50% disk utilization.

The interesting thing is that the 6th node ended up getting into zk timeout
(because of large GC pause) and is no longer part of the cluster (which is
a separate issue I am trying to figure out).
On this 6th node, I see that it is busy loading segments. However, once it
is done downloading, I am not sure if it will report back to zk as being
available.





On Thu, Jul 19, 2018 at 12:58 PM, Jihoon Son  wrote:

> Hi Samarth,
>
> have you had a chance to check the segment balancing status of your
> cluster?
> Do you see any significant imbalance between historicals?
>
> Jihoon
>
> On Thu, Jul 19, 2018 at 12:28 PM Samarth Jain 
> wrote:
>
> > I am working on upgrading our internal cluster to 0.12.1 release and
> seeing
> > that a few data sources fail to load. Looking at coordinator logs, I am
> > seeing messages like this for the datasource:
> >
> > @40005b50dbc637061cec 2018-07-19T18:43:08,923 INFO
> > [Coordinator-Exec--0] io.druid.server.coordinator.CuratorLoadQueuePeon -
> > Asking server peon[/druid-test--001/loadQueue/127.0.0.1:7103] to drop
> > segment[*datasource*
> >
> > _2015-09-03T00:00:00.000Z_2015-09-04T00:00:00.000Z_2018-
> 04-23T21:24:04.910Z]
> >
> >
> >
> > @40005b50dbc637391f84 2018-07-19T18:43:08,926 WARN
> > [Coordinator-Exec--0] io.druid.server.coordinator.rules.LoadRule - No
> > available [_default_tier] servers or node capacity to assign primary
> >
> > segment[*datasource*-08-10T00:00:00.000Z_2015-08-11T00:00:
> 00.000Z_2018-04-23T21:24:04.910Z]!
> > Expected Replicants[1]
> >
> >
> > The datasource failed to load for a long time and then eventually was
> > loaded successfully. Has anyone else seen this? I see a few fixes around
> > segment loading and coordination in 0.12.2 (which I am hoping will be out
> > soon) but I am not sure if they address this issue.
> >
>


Druid 0.12.2-rc1 vote

2018-07-19 Thread Jihoon Son
Hi all,

we have no open issues and PRs for 0.12.2 (
https://github.com/apache/incubator-druid/milestone/27). The 0.12.2 branch
is already available and all PRs for 0.12.2 have merged into that branch.

Let's vote on releasing RC1. Here is my +1.

This is a non-ASF release.

Best,
Jihoon


Re: Druid 0.12.2-rc1 vote

2018-07-19 Thread Jihoon Son
Hi guys,

I think we're ready for releasing 0.12.2.
I'm closing this vote and creating a new one.

Best,
Jihoon

On Wed, Jul 11, 2018 at 1:43 PM Gian Merlino  wrote:

> Well, it's never good if a WTH?! message actually gets logged. They are
> usually meant to be things that should "never" happen. I am ok with holding
> off 0.12.2-rc1 until this fix is in.
>
> On Wed, Jul 11, 2018 at 1:04 PM Jihoon Son  wrote:
>
> > Thanks everyone for voting.
> >
> > Unfortunately, I found another bug in Kafka indexing service (
> > https://github.com/apache/incubator-druid/issues/5992). I think it's
> > worth including in 0.12.2.
> > I'm currently working on that issue and can probably finish at least by
> > this week.
> >
> > Can we add it to 0.12.2 and vote again once a patch to fix is merged?
> >
> > Jihoon
> >
> > On Wed, Jul 11, 2018 at 10:02 AM Jonathan Wei  wrote:
> >
> > > +1
> > >
> > > On Wed, Jul 11, 2018 at 9:44 AM, Gian Merlino  wrote:
> > >
> > > > +1 from me too!
> > > >
> > > > On Wed, Jul 11, 2018 at 7:28 AM Charles Allen 
> > > wrote:
> > > >
> > > > > That is very helpful, thank you!
> > > > >
> > > > > +1 for continuing with 0.12.2-RC1
> > > > >
> > > > > On Tue, Jul 10, 2018 at 6:51 PM Clint Wylie 
> > > > wrote:
> > > > >
> > > > > > Heya, sorry for the delay (and missing the sync, i'll try to get
> > > better
> > > > > > about showing up). I've fixed a handful of coordinator bugs post
> > > 0.12.0
> > > > > > (and
> > > > > > not backported to 0.12.1), some of these issues go far back, some
> > > back
> > > > to
> > > > > > when segment assignment priority for different tiers of
> historicals
> > > was
> > > > > > introduced, some are just some oddities on the behavior of the
> > > balancer
> > > > > > that I am unsure when were introduced. This is the complete list
> of
> > > > fixes
> > > > > > that are currently in 0.12.2 afaik, with a small description (see
> > PRs
> > > > and
> > > > > > associated issues for more details)
> > > > > >
> > > > > > https://github.com/apache/incubator-druid/pull/5528 fixed an
> issue
> > > > that
> > > > > > movement did not drop the segment from the server the segment was
> > > being
> > > > > > moved from (this one goes way back, to batch segment
> > > announcements)
> > > > > >
> > > > > > https://github.com/apache/incubator-druid/pull/5529 changed
> > behavior
> > > > of
> > > > > > drop to use the balancer to choose where to drop segments from,
> > based
> > > > on
> > > > > > behavior observed caused by the issue of 5528
> > > > > >
> > > > > > https://github.com/apache/incubator-druid/pull/5532 fixes an
> issue
> > > > where
> > > > > > primary assignment during load rule processing would assign an
> > > > > unavailable
> > > > > > segment to every server with capacity until at least 1 historical
> > had
> > > > the
> > > > > > segment (and drop it from all the others if they all loaded at
> the
> > > same
> > > > > > time), choking load queues from doing useful things
> > > > > >
> > > > > > https://github.com/apache/incubator-druid/pull/ fixed a way
> > for
> > > > http
> > > > > > based coordinator to get stuck loading or dropping segments and a
> > > > > companion
> > > > > > PR that fixed a lambda that wasn't friendly to older jvm versions
> > > > > > https://github.com/apache/incubator-druid/pull/5591
> > > > > >
> > > > > > https://github.com/apache/incubator-druid/pull/5888 makes
> > balancing
> > > > > honor
> > > > > > a
> > > > > > load rule max load queue depth setting to help prevent movement
> > from
> > > > > > starving loading
> > > > > >
> > > > > > https://github.com/apache/incubator-druid/pull/5928 doesn't
> really
> > > fix
> > > > > > anything, just does an early return to avoid doing pointless work
> > > > > >
> > > > > > Additionally, there are a couple of pairs of PRs that are not
> > > currently
> > > > > in
> > > > > > 0.12.2: https://github.com/druid-io/druid/pull/5927 and
> > > > > > https://github.com/apache/incubator-druid/pull/5929 and their
> > > > respective
> > > > > > fixes which have yet to be merged, but have been performing well
> on
> > > our
> > > > > > test cluster,
> https://github.com/apache/incubator-druid/pull/5987
> > > and
> > > > > > https://github.com/apache/incubator-druid/pull/5988. One of them
> > > makes
> > > > > > balancing behave in a way more consistent with expectations by
> > always
> > > > > > trying to move maxSegmentsToMove and more correctly tracking what
> > the
> > > > > > balancer is doing, and one just adds better logging (without much
> > > extra
> > > > > log
> > > > > > volume) due to frustrations I had chasing down all these other
> > > issues.
> > > > > Both
> > > > > > of these were slated for 0.12.2 but were pulled out because of
> the
> > > > issues
> > > > > > (which the open PRs fix afaict). I would be in favor of sliding
> > them
> > > in
> > > > > > there, pending review of the fixes, but understand if they won't
> > make
> > > > the
> > > > > > cut since they maybe fall 

Issue with segments not loading/taking a long time

2018-07-19 Thread Samarth Jain
I am working on upgrading our internal cluster to 0.12.1 release and seeing
that a few data sources fail to load. Looking at coordinator logs, I am
seeing messages like this for the datasource:

@40005b50dbc637061cec 2018-07-19T18:43:08,923 INFO
[Coordinator-Exec--0] io.druid.server.coordinator.CuratorLoadQueuePeon -
Asking server peon[/druid-test--001/loadQueue/127.0.0.1:7103] to drop
segment[*datasource*
_2015-09-03T00:00:00.000Z_2015-09-04T00:00:00.000Z_2018-04-23T21:24:04.910Z]



@40005b50dbc637391f84 2018-07-19T18:43:08,926 WARN
[Coordinator-Exec--0] io.druid.server.coordinator.rules.LoadRule - No
available [_default_tier] servers or node capacity to assign primary
segment[*datasource*-08-10T00:00:00.000Z_2015-08-11T00:00:00.000Z_2018-04-23T21:24:04.910Z]!
Expected Replicants[1]


The datasource failed to load for a long time and then eventually was
loaded successfully. Has anyone else seen this? I see a few fixes around
segment loading and coordination in 0.12.2 (which I am hoping will be out
soon) but I am not sure if they address this issue.


[GitHub] gianm commented on issue #3236: gitter community channel?

2018-07-19 Thread GitBox
gianm commented on issue #3236: gitter community channel?
URL: 
https://github.com/apache/incubator-druid/issues/3236#issuecomment-406387424
 
 
   We do already have an IRC channel, and there are some nice web based IRC 
clients. IMO if you are interested in a chat channel being a more active part 
of the community, the best way to accomplish that is to take charge: join the 
IRC channel, encourage others to do so in the easiest way possible, and help 
people when they show up asking about stuff. If enough people do that then it 
will become a thing.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



Re: synchronization question about datasketches aggregator

2018-07-19 Thread Gian Merlino
Hi Will,

Check out also this thread for related discussion:

https://lists.apache.org/thread.html/9899aa790a7eb561ab66f47b35c8f66ffe695432719251351339521a@%3Cdev.druid.apache.org%3E

On Thu, Jul 19, 2018 at 11:21 AM Will Lauer  wrote:

> A colleague recently pointed out to me that all the sketch operations that
> take place in SketchAggregator (in the datasketches module) use a
> SynchronizedUnion class that basically wraps a normal sketch Union and
> synchronizes all operations. From what I can tell with other aggregators in
> the Druid code base, there doesn't appear to be a need to synchronize. It
> looks like Aggregators are always processed from within a single thread. Is
> it reasonable to remove all the synchronizations from the SketchAggregator
> and avoid the performance hit that they impose at runtime?
>
> Will
>
> Will Lauer
> Senior Principal Architect
>
> Progress requires pain
>
> m: 508.561.6427
>
> o: 217.255.4262
>


[GitHub] gianm commented on issue #3956: Thread safe reads for aggregators in IncrementalIndex

2018-07-19 Thread GitBox
gianm commented on issue #3956: Thread safe reads for aggregators in 
IncrementalIndex
URL: https://github.com/apache/incubator-druid/pull/3956#issuecomment-406384627
 
 
   I was just looking at this issue again after the conversations on the 
mailing list about sketch synchronization: 
https://lists.apache.org/thread.html/9899aa790a7eb561ab66f47b35c8f66ffe695432719251351339521a@%3Cdev.druid.apache.org%3E
   
   I was wondering, does it make more sense for thread-safety here to be 
handled systematically (at the IncrementalIndex) or for each aggregator to be 
thread safe? Currently we do different approaches: the sketch aggregators 
endeavor to be thread-safe on their own. The primitive aggregators don't bother 
to even try, and they're probably fine, since they're primitives. 
HyperLogLogAggregator tries a little bit -- it at least makes sure the 
different calls use different buffer objects -- but I bet it has a bug where 
"get" could potentially read something weird and corrupt in some rare 
situations. (Like if the offset is being updated while a "get" is going on.)
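The "different buffer objects" point above can be illustrated with plain java.nio.ByteBuffer (a standalone sketch, not the actual HyperLogLog aggregator code): duplicate() gives each caller an independent position and limit over the same backing content, so concurrent calls don't corrupt each other's cursor state — though the shared bytes themselves remain unprotected, which is exactly the residual race being discussed.

```java
import java.nio.ByteBuffer;

public class DuplicateDemo
{
  public static void main(String[] args)
  {
    ByteBuffer shared = ByteBuffer.allocate(16);
    shared.putInt(0, 42); // absolute write; does not move the position

    // Each caller works on its own view: independent position/limit,
    // but the same underlying content.
    ByteBuffer viewA = shared.duplicate();
    ByteBuffer viewB = shared.duplicate();

    viewA.position(4); // moving viewA's cursor...
    System.out.println(viewB.position()); // ...leaves viewB's at 0
    System.out.println(viewB.getInt(0));  // both views see the content: 42
  }
}
```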





[GitHub] gianm closed pull request #6022: Log the full stack trace when an HTTP request fails

2018-07-19 Thread GitBox
gianm closed pull request #6022: Log the full stack trace when an HTTP request 
fails
URL: https://github.com/apache/incubator-druid/pull/6022
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/server/src/main/java/io/druid/discovery/DruidLeaderClient.java 
b/server/src/main/java/io/druid/discovery/DruidLeaderClient.java
index 0541d6edbba..094d04f527e 100644
--- a/server/src/main/java/io/druid/discovery/DruidLeaderClient.java
+++ b/server/src/main/java/io/druid/discovery/DruidLeaderClient.java
@@ -159,8 +159,7 @@ public FullResponseHolder go(
   }
   catch (IOException | ChannelException ex) {
 // can happen if the node is stopped.
-log.info("Request[%s] failed with msg [%s].", request.getUrl(), 
ex.getMessage());
-log.debug(ex, "Request[%s] failed.", request.getUrl());
+log.warn(ex, "Request[%s] failed.", request.getUrl());
 
 try {
   if (request.getUrl().getQuery() == null) {


 





Re: synchronization question about datasketches aggregator

2018-07-19 Thread Roman Leventov
There is a race between aggregators and ingestion updates. Actually, many
aggregators are vulnerable now. See this issue:
https://github.com/apache/incubator-druid/pull/3956 and a conversation
starting from this message:
https://github.com/apache/incubator-druid/pull/5148#discussion_r170906998.

However, you could replace a simple synchronized with ReadWriteLock or
Striped (see ArrayOfDoublesSketchMergeBufferAggregator for
example), that would be a useful contribution to Druid.
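As a standalone illustration of that suggestion (a minimal JDK-only sketch — the class and method names here are hypothetical, not Druid's actual aggregator interfaces), a read-write lock lets many query threads read concurrently while ingestion writes stay exclusive:

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch: guard a mutable accumulator with a read-write lock
// so concurrent query-thread reads don't block each other, while
// ingestion-thread writes remain mutually exclusive with all readers.
public class RwLockedAggregator
{
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private long sum = 0;

  // Called from the ingestion thread; takes the exclusive write lock.
  public void aggregate(long value)
  {
    lock.writeLock().lock();
    try {
      sum += value;
    }
    finally {
      lock.writeLock().unlock();
    }
  }

  // Called from query threads; multiple readers may hold the lock at once.
  public long get()
  {
    lock.readLock().lock();
    try {
      return sum;
    }
    finally {
      lock.readLock().unlock();
    }
  }

  public static void main(String[] args)
  {
    RwLockedAggregator agg = new RwLockedAggregator();
    for (int i = 1; i <= 10; i++) {
      agg.aggregate(i);
    }
    System.out.println(agg.get()); // prints 55
  }
}
```

Guava's Striped can spread such locks over many aggregator slots to reduce contention, which is presumably what the truncated "Striped" reference above means.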

On Thu, 19 Jul 2018 at 13:21, Will Lauer  wrote:

> A colleague recently pointed out to me that all the sketch operations that
> take place in SketchAggregator (in the datasketches module) use a
> SynchronizedUnion class that basically wraps a normal sketch Union and
> synchronizes all operations. From what I can tell with other aggregators in
> the Druid code base, there doesn't appear to be a need to synchronize. It
> looks like Aggregators are always processed from within a single thread. Is
> it reasonable to remove all the synchronizations from the SketchAggregator
> and avoid the performance hit that they impose at runtime?
>
> Will
>
> Will Lauer
> Senior Principal Architect
>
> Progress requires pain
>
> m: 508.561.6427
>
> o: 217.255.4262
>


[GitHub] jihoonson commented on a change in pull request #5492: Native parallel batch indexing without shuffle

2018-07-19 Thread GitBox
jihoonson commented on a change in pull request #5492: Native parallel batch 
indexing without shuffle
URL: https://github.com/apache/incubator-druid/pull/5492#discussion_r203823447
 
 

 ##
 File path: 
indexing-service/src/main/java/io/druid/indexing/common/task/ParallelIndexSubTask.java
 ##
 @@ -0,0 +1,431 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package io.druid.indexing.common.task;
+
+import com.fasterxml.jackson.annotation.JacksonInject;
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.google.common.base.Optional;
+import com.google.common.base.Preconditions;
+import io.druid.client.indexing.IndexingServiceClient;
+import io.druid.data.input.Firehose;
+import io.druid.data.input.FirehoseFactory;
+import io.druid.data.input.InputRow;
+import io.druid.indexer.TaskStatus;
+import io.druid.indexing.appenderator.ActionBasedSegmentAllocator;
+import io.druid.indexing.appenderator.ActionBasedUsedSegmentChecker;
+import io.druid.indexing.common.TaskLockType;
+import io.druid.indexing.common.TaskToolbox;
+import io.druid.indexing.common.actions.LockTryAcquireAction;
+import io.druid.indexing.common.actions.SegmentAllocateAction;
+import io.druid.indexing.common.actions.SurrogateAction;
+import io.druid.indexing.common.actions.TaskActionClient;
+import io.druid.indexing.firehose.IngestSegmentFirehoseFactory;
+import io.druid.java.util.common.ISE;
+import io.druid.java.util.common.Intervals;
+import io.druid.java.util.common.StringUtils;
+import io.druid.java.util.common.logger.Logger;
+import io.druid.java.util.common.parsers.ParseException;
+import io.druid.query.DruidMetrics;
+import io.druid.segment.indexing.DataSchema;
+import io.druid.segment.indexing.RealtimeIOConfig;
+import io.druid.segment.indexing.granularity.GranularitySpec;
+import io.druid.segment.realtime.FireDepartment;
+import io.druid.segment.realtime.FireDepartmentMetrics;
+import io.druid.segment.realtime.RealtimeMetricsMonitor;
+import io.druid.segment.realtime.appenderator.Appenderator;
+import io.druid.segment.realtime.appenderator.AppenderatorDriverAddResult;
+import io.druid.segment.realtime.appenderator.Appenderators;
+import io.druid.segment.realtime.appenderator.BaseAppenderatorDriver;
+import io.druid.segment.realtime.appenderator.BatchAppenderatorDriver;
+import io.druid.segment.realtime.appenderator.SegmentAllocator;
+import io.druid.segment.realtime.appenderator.SegmentsAndMetadata;
+import io.druid.timeline.DataSegment;
+import org.apache.commons.io.FileUtils;
+import org.joda.time.Interval;
+
+import javax.annotation.Nullable;
+import java.io.File;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.SortedSet;
+import java.util.concurrent.ExecutionException;
+import java.util.concurrent.TimeoutException;
+
+/**
+ * A worker task of {@link ParallelIndexSupervisorTask}. Similar to {@link 
IndexTask}, but this task
+ * generates and pushes segments, and reports them to the {@link 
ParallelIndexSupervisorTask} instead of
+ * publishing on its own.
+ */
+public class ParallelIndexSubTask extends AbstractTask
 
 Review comment:
   We have a rough plan for that. It's 'Two phase parallel indexing with 
shuffle' in https://github.com/apache/incubator-druid/issues/5543.





[GitHub] jihoonson commented on a change in pull request #5492: Native parallel batch indexing without shuffle

2018-07-19 Thread GitBox
jihoonson commented on a change in pull request #5492: Native parallel batch 
indexing without shuffle
URL: https://github.com/apache/incubator-druid/pull/5492#discussion_r203215000
 
 

 ##
 File path: 
indexing-service/src/main/java/io/druid/indexing/common/task/SinglePhaseParallelIndexTaskRunner.java
 ##
 @@ -0,0 +1,484 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package io.druid.indexing.common.task;
+
+import com.google.common.annotations.VisibleForTesting;
+import com.google.common.base.Preconditions;
+import com.google.common.util.concurrent.FutureCallback;
+import com.google.common.util.concurrent.Futures;
+import com.google.common.util.concurrent.ListenableFuture;
+import io.druid.client.indexing.IndexingServiceClient;
+import io.druid.data.input.FiniteFirehoseFactory;
+import io.druid.data.input.FirehoseFactory;
+import io.druid.data.input.InputSplit;
+import io.druid.indexer.TaskState;
+import io.druid.indexer.TaskStatusPlus;
+import io.druid.indexing.appenderator.ActionBasedUsedSegmentChecker;
+import io.druid.indexing.common.TaskToolbox;
+import io.druid.indexing.common.actions.SegmentTransactionalInsertAction;
+import io.druid.indexing.common.task.TaskMonitor.MonitorEntry;
+import io.druid.indexing.common.task.TaskMonitor.SubTaskCompleteEvent;
+import io.druid.java.util.common.ISE;
+import io.druid.java.util.common.logger.Logger;
+import io.druid.segment.realtime.appenderator.SegmentIdentifier;
+import io.druid.segment.realtime.appenderator.TransactionalSegmentPublisher;
+import io.druid.segment.realtime.appenderator.UsedSegmentChecker;
+import io.druid.timeline.DataSegment;
+
+import javax.annotation.Nullable;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.concurrent.BlockingQueue;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.ConcurrentMap;
+import java.util.concurrent.LinkedBlockingDeque;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+import java.util.stream.Stream;
+
+/**
+ * An implementation of {@link ParallelIndexTaskRunner} to support best-effort 
roll-up. This runner can submit and
+ * monitor multiple {@link ParallelIndexSubTask}s.
+ *
+ * As its name indicates, distributed indexing is done in a single phase, 
i.e., without shuffling intermediate data. As
+ * a result, this task can't be used for perfect rollup.
+ */
+public class SinglePhaseParallelIndexTaskRunner implements 
ParallelIndexTaskRunner
+{
+  private static final Logger log = new 
Logger(SinglePhaseParallelIndexTaskRunner.class);
+
+  private final TaskToolbox toolbox;
+  private final String taskId;
+  private final String groupId;
+  private final ParallelIndexIngestionSpec ingestionSchema;
+  private final Map context;
+  private final FiniteFirehoseFactory baseFirehoseFactory;
+  private final int maxNumTasks;
+  private final IndexingServiceClient indexingServiceClient;
+
+  private final BlockingQueue> 
taskCompleteEvents =
+  new LinkedBlockingDeque<>();
+
+  // subTaskId -> report
+  private final ConcurrentMap segmentsMap = new 
ConcurrentHashMap<>();
+
+  private volatile boolean stopped;
+  private volatile TaskMonitor taskMonitor;
+
+  private int nextSpecId = 0;
+
+  SinglePhaseParallelIndexTaskRunner(
+  TaskToolbox toolbox,
+  String taskId,
+  String groupId,
+  ParallelIndexIngestionSpec ingestionSchema,
+  Map context,
+  IndexingServiceClient indexingServiceClient
+  )
+  {
+this.toolbox = toolbox;
+this.taskId = taskId;
+this.groupId = groupId;
+this.ingestionSchema = ingestionSchema;
+this.context = context;
+this.baseFirehoseFactory = (FiniteFirehoseFactory) 
ingestionSchema.getIOConfig().getFirehoseFactory();
+this.maxNumTasks = ingestionSchema.getTuningConfig().getMaxNumSubTasks();
+this.indexingServiceClient = 
Preconditions.checkNotNull(indexingServiceClient, "indexingServiceClient");
+  }
+
+  @Override
+  public TaskState run() throws Exception
+  {
+final Iterator subTaskSpecIterator = 

[GitHub] jihoonson commented on a change in pull request #5492: Native parallel batch indexing without shuffle

2018-07-19 Thread GitBox
jihoonson commented on a change in pull request #5492: Native parallel batch 
indexing without shuffle
URL: https://github.com/apache/incubator-druid/pull/5492#discussion_r203822871
 
 

 ##
 File path: 
indexing-service/src/main/java/io/druid/indexing/common/task/ParallelIndexSupervisorTask.java
 ##
 @@ -0,0 +1,541 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package io.druid.indexing.common.task;
+
+import com.fasterxml.jackson.annotation.JacksonInject;
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.jaxrs.smile.SmileMediaTypes;
+import com.google.common.annotations.VisibleForTesting;
+import com.google.common.base.Optional;
+import com.google.common.base.Preconditions;
+import com.google.common.base.Throwables;
+import io.druid.client.indexing.IndexingServiceClient;
+import io.druid.data.input.FiniteFirehoseFactory;
+import io.druid.data.input.FirehoseFactory;
+import io.druid.indexer.TaskStatus;
+import io.druid.indexing.common.Counters;
+import io.druid.indexing.common.TaskLock;
+import io.druid.indexing.common.TaskToolbox;
+import io.druid.indexing.common.actions.LockListAction;
+import io.druid.indexing.common.actions.TaskActionClient;
+import io.druid.indexing.common.stats.RowIngestionMetersFactory;
+import io.druid.indexing.common.task.IndexTask.IndexIngestionSpec;
+import io.druid.indexing.common.task.IndexTask.IndexTuningConfig;
+import io.druid.indexing.common.task.ParallelIndexTaskRunner.SubTaskSpecStatus;
+import io.druid.java.util.common.IAE;
+import io.druid.java.util.common.ISE;
+import io.druid.java.util.common.logger.Logger;
+import io.druid.segment.indexing.granularity.GranularitySpec;
+import io.druid.segment.realtime.appenderator.SegmentIdentifier;
+import io.druid.segment.realtime.firehose.ChatHandler;
+import io.druid.segment.realtime.firehose.ChatHandlerProvider;
+import io.druid.segment.realtime.firehose.ChatHandlers;
+import io.druid.server.security.Action;
+import io.druid.server.security.AuthorizerMapper;
+import io.druid.timeline.partition.NumberedShardSpec;
+import org.joda.time.DateTime;
+import org.joda.time.Interval;
+
+import javax.annotation.Nullable;
+import javax.servlet.http.HttpServletRequest;
+import javax.ws.rs.Consumes;
+import javax.ws.rs.GET;
+import javax.ws.rs.POST;
+import javax.ws.rs.Path;
+import javax.ws.rs.PathParam;
+import javax.ws.rs.Produces;
+import javax.ws.rs.core.Context;
+import javax.ws.rs.core.MediaType;
+import javax.ws.rs.core.Response;
+import javax.ws.rs.core.Response.Status;
+import java.io.IOException;
+import java.util.List;
+import java.util.Map;
+import java.util.Map.Entry;
+import java.util.SortedSet;
+import java.util.stream.Collectors;
+
+/**
+ * ParallelIndexSupervisorTask is capable of running multiple subTasks for 
parallel indexing. This is
+ * applicable if the input {@link FiniteFirehoseFactory} is splittable. While 
this task is running, it can submit
+ * multiple child tasks to overlords. This task succeeds only when all its 
child tasks succeed; otherwise it fails.
+ *
+ * @see ParallelIndexTaskRunner
+ */
+public class ParallelIndexSupervisorTask extends AbstractTask implements 
ChatHandler
+{
+  static final String TYPE = "index_parallel";
+
+  private static final Logger log = new 
Logger(ParallelIndexSupervisorTask.class);
+
+  private final ParallelIndexIngestionSpec ingestionSchema;
+  private final FiniteFirehoseFactory baseFirehoseFactory;
+  private final IndexingServiceClient indexingServiceClient;
+  private final ChatHandlerProvider chatHandlerProvider;
+  private final AuthorizerMapper authorizerMapper;
+  private final RowIngestionMetersFactory rowIngestionMetersFactory;
+
+  private final Counters counters = new Counters();
+
+  private volatile ParallelIndexTaskRunner runner;
+
+  // toolbox is initialized when run() is called, and can be used for processing 
HTTP endpoint requests.
+  private volatile TaskToolbox toolbox;
+
+  @JsonCreator
+  public ParallelIndexSupervisorTask(
+  @JsonProperty("id") String id,
+  @JsonProperty("resource") TaskResource taskResource,
+  @JsonProperty("spec") ParallelIndexIngestionSpec 

Re: list polluted by gitbox messages

2018-07-19 Thread Gian Merlino
We're working with infra to redirect the notifications:
https://issues.apache.org/jira/browse/INFRA-16674

In the meantime, I have been using these filters to keep myself sane:
https://gist.github.com/gianm/0eb410915c02e3844e11235172894c62 (it's a gist
because the filters are partially based on content, and if paste them in
this message, they'll miscategorize it…)

On Thu, Jul 19, 2018 at 9:56 AM Prashant Deva 
wrote:

> seems like every bit of activity on gitbox is being posted to the dev
> mailing list. its impossible to see any real messages since all i see are
> gitbox mails.
>
> Prashant
>


list polluted by gitbox messages

2018-07-19 Thread Prashant Deva
seems like every bit of activity on gitbox is being posted to the dev
mailing list. it's impossible to see any real messages since all I see are
gitbox mails.

Prashant


[GitHub] drcrallen opened a new issue #6024: Missing exception handling as part of `io.druid.java.util.http.client.netty.HttpClientPipelineFactory`

2018-07-19 Thread GitBox
drcrallen opened a new issue #6024: Missing exception handling as part of 
`io.druid.java.util.http.client.netty.HttpClientPipelineFactory`
URL: https://github.com/apache/incubator-druid/issues/6024
 
 
   The `io.druid.java.util.http.client.netty.HttpClientPipelineFactory` class 
constructs the netty pipeline for handling request channels. But whenever an 
exception happens in reading (for example a historical shuts down during a 
query) The following log line is printed in the logs:
   
   
   ```
   EXCEPTION, please implement 
org.jboss.netty.handler.codec.http.HttpContentDecompressor.exceptionCaught() 
for proper handling.
   ```
   
   As can be seen in 
`org.jboss.netty.channel.SimpleChannelUpstreamHandler#exceptionCaught`.
   
   The ask is that "proper" exception handling be implemented in the HTTP 
client workflow so that the logs do not contain such an error when there are 
exceptions in the Netty channel.
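A minimal sketch of what such handling could look like. The two stand-in classes below only mimic the shape of Netty 3's `org.jboss.netty.channel` types so the example is self-contained; the override pattern is the point, not the stand-ins themselves:

```java
// Stand-ins that mimic the shape of Netty 3's upstream-handler API.
// Illustrative only; the real classes live in org.jboss.netty.channel.
class ExceptionEvent
{
  private final Throwable cause;

  ExceptionEvent(Throwable cause)
  {
    this.cause = cause;
  }

  Throwable getCause()
  {
    return cause;
  }
}

class SimpleChannelUpstreamHandler
{
  // Netty 3's default implementation is what emits the
  // "EXCEPTION, please implement ..." line seen in the logs.
  public String exceptionCaught(ExceptionEvent e)
  {
    return "EXCEPTION, please implement exceptionCaught() for proper handling.";
  }
}

// "Proper handling" amounts to overriding exceptionCaught: report the cause
// explicitly (and, in real code, typically close the channel) instead of
// falling through to the noisy default.
class QuietExceptionHandler extends SimpleChannelUpstreamHandler
{
  @Override
  public String exceptionCaught(ExceptionEvent e)
  {
    return "channel failed: " + e.getCause().getMessage();
  }
}
```

In the real client, such an override would live on a handler installed by `HttpClientPipelineFactory`, e.g. a dedicated last-in-pipeline handler.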


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



[GitHub] drcrallen commented on a change in pull request #5913: Move Caching Cluster Client to java streams and allow parallel intermediate merges

2018-07-19 Thread GitBox
drcrallen commented on a change in pull request #5913: Move Caching Cluster 
Client to java streams and allow parallel intermediate merges
URL: https://github.com/apache/incubator-druid/pull/5913#discussion_r203792145
 
 

 ##
 File path: processing/src/main/java/io/druid/guice/ForkJoinPoolProvider.java
 ##
 @@ -0,0 +1,53 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package io.druid.guice;
+
+import io.druid.java.util.common.StringUtils;
+import io.druid.java.util.common.concurrent.Execs;
+import io.druid.java.util.common.logger.Logger;
+
+import javax.inject.Provider;
+import java.util.concurrent.ForkJoinPool;
+
+public class ForkJoinPoolProvider implements Provider
+{
+  private static final Logger LOG = new Logger(ForkJoinPoolProvider.class);
+
+  private final String nameFormat;
+
+  public ForkJoinPoolProvider(String nameFormat)
+  {
+// Fail fast on bad name format
+StringUtils.format(nameFormat, 3);
 
 Review comment:
   I can do that
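For reference, the fail-fast trick in the constructor works because the format call itself throws immediately on a bad pattern. A stdlib-only sketch (assuming io.druid's `StringUtils.format` ultimately delegates to `String.format`):

```java
// Fail-fast validation of a thread-name format, as in ForkJoinPoolProvider's
// constructor: formatting a dummy value surfaces a malformed pattern at
// construction time instead of at first worker-thread creation.
public class NameFormatValidator
{
  public static void validate(String nameFormat)
  {
    // Throws a java.util.IllegalFormatException subclass for bad patterns,
    // e.g. an unknown conversion like "pool-%q".
    String.format(nameFormat, 3);
  }
}
```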





[GitHub] drcrallen commented on a change in pull request #5913: Move Caching Cluster Client to java streams and allow parallel intermediate merges

2018-07-19 Thread GitBox
drcrallen commented on a change in pull request #5913: Move Caching Cluster 
Client to java streams and allow parallel intermediate merges
URL: https://github.com/apache/incubator-druid/pull/5913#discussion_r203791629
 
 

 ##
 File path: server/src/main/java/io/druid/client/CachingClusteredClient.java
 ##
 @@ -249,74 +296,88 @@ public CachingClusteredClient(
 contextBuilder.put(CacheConfig.POPULATE_CACHE, false);
 contextBuilder.put("bySegment", true);
   }
-  return contextBuilder.build();
+  return Collections.unmodifiableMap(contextBuilder);
 }
 
-Sequence run(final UnaryOperator> timelineConverter)
+Stream> run(final UnaryOperator> timelineConverter)
 {
   @Nullable
   TimelineLookup timeline = 
serverView.getTimeline(query.getDataSource());
   if (timeline == null) {
-return Sequences.empty();
+return Stream.empty();
   }
   timeline = timelineConverter.apply(timeline);
   if (uncoveredIntervalsLimit > 0) {
 computeUncoveredIntervals(timeline);
   }
 
-  final Set segments = computeSegmentsToQuery(timeline);
+  Stream segments = computeSegmentsToQuery(timeline);
   @Nullable
   final byte[] queryCacheKey = computeQueryCacheKey();
   if (query.getContext().get(QueryResource.HEADER_IF_NONE_MATCH) != null) {
+// Materialize then re-stream
+List materializedSegments = 
segments.collect(Collectors.toList());
+segments = materializedSegments.stream();
+
 @Nullable
 final String prevEtag = (String) 
query.getContext().get(QueryResource.HEADER_IF_NONE_MATCH);
 @Nullable
-final String currentEtag = computeCurrentEtag(segments, queryCacheKey);
+final String currentEtag = computeCurrentEtag(materializedSegments, 
queryCacheKey);
 if (currentEtag != null && currentEtag.equals(prevEtag)) {
-  return Sequences.empty();
+  return Stream.empty();
 }
   }
 
-  final List> alreadyCachedResults = 
pruneSegmentsWithCachedResults(queryCacheKey, segments);
-  final SortedMap> segmentsByServer = 
groupSegmentsByServer(segments);
-  return new LazySequence<>(() -> {
-List> sequencesByInterval = new 
ArrayList<>(alreadyCachedResults.size() + segmentsByServer.size());
-addSequencesFromCache(sequencesByInterval, alreadyCachedResults);
-addSequencesFromServer(sequencesByInterval, segmentsByServer);
-return Sequences
-.simple(sequencesByInterval)
-.flatMerge(seq -> seq, query.getResultOrdering());
-  });
+  // This pipeline follows a few general steps:
+  // 1. Fetch cache results - Unfortunately this is an eager operation so 
that the non cached items can
+  // be batched per server. Cached results are assigned to a mock server 
ALREADY_CACHED_SERVER
+  // 2. Group the segment information by server
+  // 3. Per server (including the ALREADY_CACHED_SERVER) create the 
appropriate Sequence results - cached results
+  // are handled in their own merge
+  final Stream>> 
cacheResolvedResults = deserializeFromCache(
+  maybeFetchCacheResults(
+  queryCacheKey,
+  segments
+  )
+  );
+  return groupCachedResultsByServer(
+  cacheResolvedResults
+  ).map(
+  this::runOnServer
+  ).collect(
+  // We do a hard materialization here so that the resulting 
spliterators have properties that we want
+  // Otherwise the stream's spliterator is of a hash map entry 
spliterator from the group-by-server operation
+  // This also causes eager initialization of the **sequences**, aka 
forking off the direct druid client requests
+  // Sequence result accumulation should still be lazy
+  Collectors.toList()
+  ).stream();
 }
 
-private Set computeSegmentsToQuery(TimelineLookup timeline)
+private Stream 
computeSegmentsToQuery(TimelineLookup timeline)
 
 Review comment:
   I find the Stream workflow easier to follow since the steps taken are 
explicitly called out and you don't have to figure out whether a delegated item 
passed in is executed first or last. The API is well documented and does not 
depend on any libraries outside the standard Java library. Any areas that 
*need* eager materialization of the stream should do so, and they do in some 
places (like in the caching part). 
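The lazy-versus-eager distinction being discussed can be seen with plain `java.util.stream` (a generic illustration, not the Druid code itself):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class StreamLaziness
{
  public static void main(String[] args)
  {
    List<String> sideEffects = new ArrayList<>();

    // Intermediate operations are lazy: the map lambda does not run yet.
    Stream<Integer> lazy = Stream.of(1, 2, 3).map(i -> {
      sideEffects.add("mapped " + i);
      return i * 2;
    });
    System.out.println(sideEffects.size()); // 0 - nothing has executed

    // collect(...) is the "hard materialization": it pulls every element
    // through the pipeline, like the Collectors.toList() call in the patch.
    List<Integer> materialized = lazy.collect(Collectors.toList());
    System.out.println(sideEffects.size()); // 3
    System.out.println(materialized);       // [2, 4, 6]
  }
}
```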





[GitHub] drcrallen commented on a change in pull request #5913: Move Caching Cluster Client to java streams and allow parallel intermediate merges

2018-07-19 Thread GitBox
drcrallen commented on a change in pull request #5913: Move Caching Cluster 
Client to java streams and allow parallel intermediate merges
URL: https://github.com/apache/incubator-druid/pull/5913#discussion_r203791186
 
 

 ##
 File path: server/src/main/java/io/druid/client/CachingClusteredClient.java
 ##
 @@ -471,94 +552,162 @@ private CachePopulator getCachePopulator(String 
segmentId, Interval segmentInter
   return cachePopulatorMap.get(StringUtils.format("%s_%s", segmentId, 
segmentInterval));
 }
 
-private SortedMap> 
groupSegmentsByServer(Set segments)
+/**
+ * Check the input stream to see what was cached and what was not. For the 
ones that were cached, merge the results
+ * and return the merged sequence. For the ones that were NOT cached, get 
the server result sequence queued up into
+ * the stream response
+ *
+ * @param segmentOrResult A list that is traversed in order to determine 
what should be sent back. All segments
+ *should be on the same server.
+ *
+ * @return A sequence of either the merged cached results, or the server 
results from any particular server
+ */
+private Sequence runOnServer(List> 
segmentOrResult)
 {
-  final SortedMap> serverSegments = 
Maps.newTreeMap();
-  for (ServerToSegment serverToSegment : segments) {
-final QueryableDruidServer queryableDruidServer = 
serverToSegment.getServer().pick();
-
-if (queryableDruidServer == null) {
-  log.makeAlert(
-  "No servers found for SegmentDescriptor[%s] for DataSource[%s]?! 
How can this be?!",
-  serverToSegment.getSegmentDescriptor(),
-  query.getDataSource()
-  ).emit();
-} else {
-  final DruidServer server = queryableDruidServer.getServer();
-  serverSegments.computeIfAbsent(server, s -> new 
ArrayList<>()).add(serverToSegment.getSegmentDescriptor());
-}
+  final List segmentsOfServer = segmentOrResult.stream(
+  ).map(
+  ServerMaybeSegmentMaybeCache::getSegmentDescriptor
+  ).filter(
+  Optional::isPresent
+  ).map(
+  Optional::get
+  ).collect(
+  Collectors.toList()
+  );
+
+  // We should only ever have cache or queries to run, not both. So if we 
have no segments, try caches
+  if (segmentsOfServer.isEmpty()) {
+// Have a special sequence for the cache results so the merge doesn't 
go all crazy.
+// See 
io.druid.java.util.common.guava.MergeSequenceTest.testScrewsUpOnOutOfOrder for 
an example
+// With zero results actually being found (no segments no caches) this 
should essentially return a no-op
+// merge sequence
+return new MergeSequence<>(query.getResultOrdering(), Sequences.simple(
+segmentOrResult.stream(
+).map(
+ServerMaybeSegmentMaybeCache::getCachedValue
+).filter(
+Optional::isPresent
+).map(
+Optional::get
+).map(
+Collections::singletonList
+).map(
+Sequences::simple
+)
+));
   }
-  return serverSegments;
-}
 
-private void addSequencesFromCache(
-final List> listOfSequences,
-final List> cachedResults
-)
-{
-  if (strategy == null) {
-return;
+  final DruidServer server = segmentOrResult.get(0).getServer();
+  final QueryRunner serverRunner = serverView.getQueryRunner(server);
+
+  if (serverRunner == null) {
+log.error("Server[%s] doesn't have a query runner", server);
+return Sequences.empty();
   }
 
-  final Function pullFromCacheFunction = 
strategy.pullFromSegmentLevelCache();
-  final TypeReference cacheObjectClazz = 
strategy.getCacheObjectClazz();
-  for (Pair cachedResultPair : cachedResults) {
-final byte[] cachedResult = cachedResultPair.rhs;
-Sequence cachedSequence = new BaseSequence<>(
-new BaseSequence.IteratorMaker>()
-{
-  @Override
-  public Iterator make()
-  {
-try {
-  if (cachedResult.length == 0) {
-return Collections.emptyIterator();
-  }
+  final MultipleSpecificSegmentSpec segmentsOfServerSpec = new 
MultipleSpecificSegmentSpec(segmentsOfServer);
 
-  return objectMapper.readValues(
-  objectMapper.getFactory().createParser(cachedResult),
-  cacheObjectClazz
-  );
-}
-catch (IOException e) {
-  throw new RuntimeException(e);
-}
-  }
+  final Sequence serverResults;
+  if (isBySegment) {
+serverResults = getBySegmentServerResults(serverRunner, 
segmentsOfServerSpec);
+  } else if 

[GitHub] gianm commented on issue #6014: Optionally refuse to consume new data until the prior chunk is being consumed

2018-07-19 Thread GitBox
gianm commented on issue #6014: Optionally refuse to consume new data until the 
prior chunk is being consumed
URL: https://github.com/apache/incubator-druid/pull/6014#issuecomment-406326473
 
 
   @drcrallen, a question: what happens when one query has a huge set of data 
to pull in, but the others don't? Will the entire broker start to block, or 
will the "nice" queries get to keep gathering data while the "not nice" one 
blocks?





[GitHub] gianm commented on issue #4949: Add limit to query result buffering queue

2018-07-19 Thread GitBox
gianm commented on issue #4949: Add limit to query result buffering queue
URL: https://github.com/apache/incubator-druid/pull/4949#issuecomment-406325484
 
 
   It looks like #6014 is attempting to solve the same problem.





Re: Build failure on 0.13.SNAPSHOT

2018-07-19 Thread Jihoon Son
Hi Dongjin,

what maven command did you run?

Jihoon

On Wed, Jul 18, 2018 at 10:38 PM Dongjin Lee  wrote:

> Hello. I am trying to build druid, but it fails. My environment is like the
> following:
>
> - CPU: Intel(R) Core(TM) i7-7560U CPU @ 2.40GHz
> - RAM: 7704 MB
> - OS: ubuntu 18.04
> - JDK: openjdk version "1.8.0_171" (default configuration, with MaxHeapSize
> = 1928 MB)
> - Branch: master (commit: cd8ea3d)
>
> The error message I got is:
>
> [INFO]
> > 
> > [INFO] Reactor Summary:
> > [INFO]
> > [INFO] io.druid:druid . SUCCESS [
> > 50.258 s]
> > [INFO] java-util .. SUCCESS
> [03:57
> > min]
> > [INFO] druid-api .. SUCCESS [
> > 22.694 s]
> > [INFO] druid-common ... SUCCESS [
> > 14.083 s]
> > [INFO] druid-hll .. SUCCESS [
> > 17.126 s]
> > [INFO] extendedset  SUCCESS [
> > 10.856 s]
> >
> > [INFO] druid-processing ... FAILURE [04:36 min]
> > [INFO] druid-aws-common ... SKIPPED
> > [INFO] druid-server ... SKIPPED
> > [INFO] druid-examples . SKIPPED
> > ...
> > [INFO]
> > 
> > [INFO] BUILD FAILURE
> > [INFO]
> > 
> > [INFO] Total time: 10:29 min
> > [INFO] Finished at: 2018-07-19T13:23:31+09:00
> > [INFO] Final Memory: 88M/777M
> > [INFO]
> > 
> >
> > [ERROR] Failed to execute goal
> > org.apache.maven.plugins:maven-surefire-plugin:2.19.1:test (default-test)
> > on project druid-processing: Execution default-test of goal
> > org.apache.maven.plugins:maven-surefire-plugin:2.19.1:test failed: The
> > forked VM terminated without properly saying goodbye. VM crash or
> > System.exit called?
> > [ERROR] Command was /bin/sh -c cd
> > /home/djlee/workspace/java/druid/processing &&
> > /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -Xmx3000m
> -Duser.language=en
> > -Duser.country=US -Dfile.encoding=UTF-8 -Duser.timezone=UTC
> > -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager
> > -Ddruid.indexing.doubleStorage=double -jar
> >
> /home/djlee/workspace/java/druid/processing/target/surefire/surefirebooter1075382243904099051.jar
> >
> /home/djlee/workspace/java/druid/processing/target/surefire/surefire559351134757209tmp
> >
> /home/djlee/workspace/java/druid/processing/target/surefire/surefire_5173894389718744688tmp
>
>
> It seems like it fails when it runs tests on the `druid-processing` module, but
> I can't be certain. Is there anyone who can give me some hints? Thanks in
> advance.
>
> Best,
> Dongjin
>
> --
> *Dongjin Lee*
>
> *A hitchhiker in the mathematical world.*
>
> *github:  github.com/dongjinleekr
> linkedin: kr.linkedin.com/in/dongjinleekr
> slideshare:
> www.slideshare.net/dongjinleekr
> *
>


[GitHub] asdf2014 commented on issue #5980: Various changes about a few coding specifications

2018-07-19 Thread GitBox
asdf2014 commented on issue #5980: Various changes about a few coding 
specifications
URL: https://github.com/apache/incubator-druid/pull/5980#issuecomment-406300864
 
 
   @leventov It seems that both Travis and TeamCity have succeeded. Any other 
good suggestions?





Re: Subscription Request

2018-07-19 Thread Gian Merlino
Hi Dongjin,

To subscribe, just send a mail to dev-subscr...@druid.apache.org.

On Wed, Jul 18, 2018 at 9:55 PM Dongjin Lee  wrote:

> --
> *Dongjin Lee*
>
> *A hitchhiker in the mathematical world.*
>
> *github:  github.com/dongjinleekr
> linkedin: kr.linkedin.com/in/dongjinleekr
> slideshare:
> www.slideshare.net/dongjinleekr
> *
>


[GitHub] sascha-coenen opened a new issue #6023: Full ISO 8601 compatibility in query intervals

2018-07-19 Thread GitBox
sascha-coenen opened a new issue #6023: Full ISO 8601 compatibility in query 
intervals
URL: https://github.com/apache/incubator-druid/issues/6023
 
 
   The current Druid documentation states in several places that queries have a 
mandatory "intervals" or "interval" attribute which can contain ISO-8601 
compatible intervals.
   
   For instance the documentation on the groupby query 
(http://druid.io/docs/latest/querying/groupbyquery.html)
   states: "intervals : A JSON Object representing ISO-8601 Intervals. This 
defines the time ranges to run the query over"
   
   According to the following wikipedia source 
(https://en.wikipedia.org/wiki/ISO_8601#Time_intervals), there are four valid 
forms of ISO-8601 intervals:
   > <start>/<end>
   > <start>/<duration>
   > <duration>/<end>
   > <duration>
   
   It seems that Druid only supports the first three variants. When specifying 
a duration only, e.g. "P2D", Druid returns an exception saying that the 
"/" character is required within each interval. 
   
   I believe that supporting the last interval format, which is a duration 
relative to the current time (or the latest data present in Druid), would be a 
very useful addition. Most queries involve relative date ranges. If people are 
forced to use absolute dates, every query needs to be templated so that the 
time intervals get computed by some external logic and then inserted into the 
query. If the relative duration format were supported, many queries could work 
without the need for templated injection of external metadata.
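To illustrate the proposal: a duration-only spec could be expanded into the start/end form Druid accepts today by anchoring it at a reference instant. A sketch with `java.time` (the helper name is hypothetical, not an existing Druid API):

```java
import java.time.Duration;
import java.time.Instant;

public class RelativeIntervals
{
  /**
   * Hypothetical helper: expand a duration-only spec such as "P2D" into the
   * absolute "start/end" form, anchored at a reference instant (e.g. now).
   */
  public static String toAbsoluteInterval(String spec, Instant now)
  {
    if (spec.contains("/")) {
      return spec; // already one of the three supported forms
    }
    Duration d = Duration.parse(spec); // "P2D" parses as 48 hours
    return now.minus(d) + "/" + now;
  }

  public static void main(String[] args)
  {
    Instant now = Instant.parse("2018-07-19T00:00:00Z");
    System.out.println(toAbsoluteInterval("P2D", now));
    // 2018-07-17T00:00:00Z/2018-07-19T00:00:00Z
  }
}
```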
   
   





[GitHub] nicolasblaye commented on issue #6018: RegisteredLookup java api

2018-07-19 Thread GitBox
nicolasblaye commented on issue #6018: RegisteredLookup java api
URL: 
https://github.com/apache/incubator-druid/issues/6018#issuecomment-406204255
 
 
   By the way, I did a quick fix for our use case, but it's far from clean. I 
created a `RegisteredExtractionFn` (instead of a `RegisteredExtractor`). I would 
gladly try to make a PR to add the `RegisteredExtractor` to the Druid extensions 
if it's really not there.





[GitHub] nicolasblaye commented on issue #6018: RegisteredLookup java api

2018-07-19 Thread GitBox
nicolasblaye commented on issue #6018: RegisteredLookup java api
URL: 
https://github.com/apache/incubator-druid/issues/6018#issuecomment-406201106
 
 
   Hi drcrallen,
   
   Thank you for your quick response. Indeed, this class is in the druid server 
jar, but I need it in the druid-api jar (or at least in the extension).
   
   My use case:
   
   I have globally cached lookups and a Java API. The Java API serves dashboards 
built from Druid queries. The queries are made via the Java API. In this API, I 
have no way to call a registered lookup because there is no extractor for it.
   
   Some code for more precision:
   
   This is the request I want to make (and it works using the JSON API):
   ``` json
   {
 "dimensions": [
   {
 "type" : "extraction",
 "dimension" : "a_dimension",
 "outputName" : "toto",
 "extractionFn": {
 "type":"registeredLookup",
 "lookup": "a_lookup",
 "retainMissingValue":true,
 "injective":true
 }
   }
 ], 
 "aggregations": [
   {
 "type": "longSum", 
 "fieldName": "count", 
 "name": "sum__count"
   }
 ], 
 "intervals": "2018-07-13T00:00:00+00:00/2018-07-18T12:56:20+00:00", 
 "limitSpec": {
   "limit": 5000, 
   "type": "default", 
   "columns": [
 {
   "direction": "descending", 
   "dimension": "sum__count"
 }
   ]
 }, 
 "granularity": "day", 
 "postAggregations": [], 
 "queryType": "groupBy", 
 "dataSource": "datasource"
   }
   ```
   
   This is how we try to get the lookup in our Java API. Notice the lookup 
extractor inside the `LookupExtractionFn`:
   
   ``` java
   /**
    * Return the dimension spec of the given column name: it can be a
    * classic column name but also a lookup.
    */
   private static DimensionSpec getDimensionSpec(String name)
   {
     String lookupFrom = lookups.get(name);
     if (!(lookupFrom == null || lookupFrom.length() == 0)) {
       return new ExtractionDimensionSpec(
           lookupFrom,
           name,
           new LookupExtractionFn(new LookupExtractor(name), true, null, null, null)
       );
     } else {
       return new DefaultDimensionSpec(name, name);
     }
   }
   ```
   And these are the only available lookup extractors:
   ```
   @JsonTypeInfo(use = JsonTypeInfo.Id.NAME, property = "type")
   @JsonSubTypes(value = {
   @JsonSubTypes.Type(name = "map", value = MapLookupExtractor.class)
   })
   public abstract class LookupExtractor
   ```
   
   I looked into 
https://mvnrepository.com/artifact/io.druid.extensions/druid-lookups-cached-global
 but couldn't find the `registeredLookup`, only 
`@JsonTypeName("cachedNamespace")`
   
   Hope this was clearer.

