[GitHub] [druid] maytasm commented on a change in pull request #10371: Auto-compaction snapshot status API

2020-09-14 Thread GitBox


maytasm commented on a change in pull request #10371:
URL: https://github.com/apache/druid/pull/10371#discussion_r488392899



##
File path: server/src/main/java/org/apache/druid/server/coordinator/AutoCompactionSnapshot.java
##
@@ -0,0 +1,192 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.server.coordinator;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+
+import javax.annotation.Nullable;
+import javax.validation.constraints.NotNull;
+import java.util.Objects;
+
+public class AutoCompactionSnapshot
+{
+  public enum AutoCompactionScheduleStatus
+  {
+    NOT_ENABLED,
+    RUNNING
+  }
+
+  @JsonProperty
+  private String dataSource;
+  @JsonProperty
+  private AutoCompactionScheduleStatus scheduleStatus;
+  @JsonProperty
+  private String latestScheduledTaskId;
+  @JsonProperty
+  private long byteCountAwaitingCompaction;

Review comment:
   I'm fine either way. "byteCount" matches the other counts.








[GitHub] [druid] maytasm commented on a change in pull request #10371: Auto-compaction snapshot status API

2020-09-14 Thread GitBox


maytasm commented on a change in pull request #10371:
URL: https://github.com/apache/druid/pull/10371#discussion_r488392581



##
File path: server/src/main/java/org/apache/druid/server/coordinator/duty/CompactSegments.java
##
@@ -238,25 +272,102 @@ private CoordinatorStats makeStats(int numCompactionTasks, CompactionSegmentIter
   {
     final CoordinatorStats stats = new CoordinatorStats();
     stats.addToGlobalStat(COMPACTION_TASK_COUNT, numCompactionTasks);
-    totalSizesOfSegmentsAwaitingCompactionPerDataSource = iterator.totalRemainingSegmentsSizeBytes();
-    totalSizesOfSegmentsAwaitingCompactionPerDataSource.object2LongEntrySet().fastForEach(
-        entry -> {
-          final String dataSource = entry.getKey();
-          final long totalSizeOfSegmentsAwaitingCompaction = entry.getLongValue();
-          stats.addToDataSourceStat(
-              TOTAL_SIZE_OF_SEGMENTS_AWAITING_COMPACTION,
-              dataSource,
-              totalSizeOfSegmentsAwaitingCompaction
-          );
-        }
-    );
+
+    // Make sure that the iterator iterates through all the remaining segments so that we can get accurate and
+    // correct statistics (remaining, skipped, processed, etc.). The reason we have to do this explicitly here is
+    // that earlier (when we were iterating to submit compaction tasks) we may have run out of task slots and were
+    // not able to iterate to the first segment that needs compaction for some datasource.
+    iterator.flushAllSegments();

Review comment:
   Moved the auto-compaction duty to the end of the set of Coordinator duties, and changed the scheduling of the Coordinator duties from scheduleWithFixedDelay to scheduleWithFixedRate. This should ensure that the other duties still run every 30 minutes (assuming auto compaction doesn't take longer than 30 minutes, which I doubt it will).
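
   For context, a minimal sketch of the difference between the two scheduling modes, using the plain `java.util.concurrent` API (where the corresponding methods are named `scheduleWithFixedDelay` and `scheduleAtFixedRate`) rather than Druid's own scheduling helpers:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class SchedulingModes
{
  public static void main(String[] args)
  {
    ScheduledExecutorService exec = Executors.newSingleThreadScheduledExecutor();

    // Fixed delay: the next cycle starts 30 minutes AFTER the previous cycle
    // finishes, so one slow duty pushes every later cycle back.
    exec.scheduleWithFixedDelay(SchedulingModes::runDuties, 0, 30, TimeUnit.MINUTES);

    // Fixed rate (the alternative): cycles are scheduled every 30 minutes from
    // the start time, so as long as a cycle finishes within 30 minutes, the
    // duties at the front of the next cycle still begin on schedule.
    // exec.scheduleAtFixedRate(SchedulingModes::runDuties, 0, 30, TimeUnit.MINUTES);
  }

  private static void runDuties()
  {
    // stand-in for the Coordinator duty cycle, with compaction last
  }
}
```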








[GitHub] [druid] maytasm commented on a change in pull request #10371: Auto-compaction snapshot status API

2020-09-14 Thread GitBox


maytasm commented on a change in pull request #10371:
URL: https://github.com/apache/druid/pull/10371#discussion_r488392405



##
File path: server/src/main/java/org/apache/druid/server/coordinator/AutoCompactionSnapshot.java
##
@@ -0,0 +1,192 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.server.coordinator;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+
+import javax.annotation.Nullable;
+import javax.validation.constraints.NotNull;
+import java.util.Objects;
+
+public class AutoCompactionSnapshot
+{
+  public enum AutoCompactionScheduleStatus
+  {
+    NOT_ENABLED,
+    RUNNING
+  }
+
+  @JsonProperty
+  private String dataSource;
+  @JsonProperty
+  private AutoCompactionScheduleStatus scheduleStatus;
+  @JsonProperty
+  private String latestScheduledTaskId;
+  @JsonProperty
+  private long byteCountAwaitingCompaction;

Review comment:
   Initially I had it as "bytes", but @jihoonson suggested "byteCount".








[GitHub] [druid] maytasm commented on pull request #10371: Auto-compaction snapshot status API

2020-09-14 Thread GitBox


maytasm commented on pull request #10371:
URL: https://github.com/apache/druid/pull/10371#issuecomment-692471451


   > > I was thinking of "snapshot" as the snapshot taken at the point in time 
the API is called. I think "status" also works. Let me know if you still think 
"status" is better. I'm open to both
   > 
   > I think "status" is better; "snapshot" could be misinterpreted as snapshotting the data itself, vs. getting a snapshot of the stats
   
   Changed the API endpoint to "status"






[GitHub] [druid] maytasm commented on a change in pull request #10371: Auto-compaction snapshot status API

2020-09-14 Thread GitBox


maytasm commented on a change in pull request #10371:
URL: https://github.com/apache/druid/pull/10371#discussion_r488390448



##
File path: server/src/main/java/org/apache/druid/server/coordinator/duty/CompactSegments.java
##
@@ -238,25 +272,102 @@ private CoordinatorStats makeStats(int numCompactionTasks, CompactionSegmentIter
   {
     final CoordinatorStats stats = new CoordinatorStats();
     stats.addToGlobalStat(COMPACTION_TASK_COUNT, numCompactionTasks);
-    totalSizesOfSegmentsAwaitingCompactionPerDataSource = iterator.totalRemainingSegmentsSizeBytes();
-    totalSizesOfSegmentsAwaitingCompactionPerDataSource.object2LongEntrySet().fastForEach(
-        entry -> {
-          final String dataSource = entry.getKey();
-          final long totalSizeOfSegmentsAwaitingCompaction = entry.getLongValue();
-          stats.addToDataSourceStat(
-              TOTAL_SIZE_OF_SEGMENTS_AWAITING_COMPACTION,
-              dataSource,
-              totalSizeOfSegmentsAwaitingCompaction
-          );
-        }
-    );
+
+    // Make sure that the iterator iterates through all the remaining segments so that we can get accurate and
+    // correct statistics (remaining, skipped, processed, etc.). The reason we have to do this explicitly here is
+    // that earlier (when we were iterating to submit compaction tasks) we may have run out of task slots and were
+    // not able to iterate to the first segment that needs compaction for some datasource.
+    iterator.flushAllSegments();

Review comment:
   I also thought of having two implementations of the statistics calculation. We could keep the old method, which simply sums all segments from the earliest up to the segment we iterated to. We could call this the approximation; its downside is that it can overestimate when there are already-compacted intervals between the earliest segment and the segment we iterated to. Then we could have this new implementation, which is an exact calculation (i.e. we check each and every interval). Each datasource could choose which implementation to use for its stats calculation.
   
   However, I discussed this with @jihoonson and we think this new implementation is actually OK to use as the default for all datasources. This code runs as part of a Coordinator duty, so it is not on the critical path and there is no client waiting on it. While it does go through all segments, it does so at the interval level (we process a batch of segments per interval at a time), so the looping is over every interval of every datasource, not every segment. Also, auto compaction runs every 30 minutes by default, which should be plenty of time for this code to finish before the next cycle begins.
   Furthermore, we can move the auto-compaction duty to the end of the set of Coordinator duties and change the scheduling of the Coordinator duties from scheduleWithFixedDelay to scheduleWithFixedRate. This should ensure that the other duties still run every 30 minutes (assuming auto compaction doesn't take longer than 30 minutes, which I doubt it will).
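
   To make the tradeoff concrete, a toy sketch of the two calculations (the types and names here are hypothetical stand-ins for Druid's per-interval timeline holders, not actual Druid classes):

```java
import java.util.List;

public class RemainingBytesSketch
{
  // Hypothetical per-interval summary: total segment bytes in the interval,
  // plus whether the interval already satisfies its compaction config.
  static class IntervalStats
  {
    final long bytes;
    final boolean alreadyCompacted;

    IntervalStats(long bytes, boolean alreadyCompacted)
    {
      this.bytes = bytes;
      this.alreadyCompacted = alreadyCompacted;
    }
  }

  // Old approach: sum everything from the earliest interval up to the
  // iteration cursor. Overestimates when already-compacted intervals sit
  // inside that range.
  static long approximate(List<IntervalStats> earliestToCursor)
  {
    return earliestToCursor.stream().mapToLong(s -> s.bytes).sum();
  }

  // New approach: check each interval and count only those still awaiting
  // compaction. Exact, at the cost of walking every interval.
  static long exact(List<IntervalStats> earliestToCursor)
  {
    return earliestToCursor.stream()
                           .filter(s -> !s.alreadyCompacted)
                           .mapToLong(s -> s.bytes)
                           .sum();
  }
}
```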








[GitHub] [druid] maytasm commented on a change in pull request #10371: Auto-compaction snapshot status API

2020-09-14 Thread GitBox


maytasm commented on a change in pull request #10371:
URL: https://github.com/apache/druid/pull/10371#discussion_r488385897



##
File path: server/src/main/java/org/apache/druid/server/coordinator/duty/NewestSegmentFirstIterator.java
##
@@ -112,27 +114,38 @@
   }
 
   @Override
-  public Object2LongOpenHashMap totalRemainingSegmentsSizeBytes()
+  public Map totalRemainingStatistics()
   {
-    final Object2LongOpenHashMap resultMap = new Object2LongOpenHashMap<>();
-    resultMap.defaultReturnValue(UNKNOWN_TOTAL_REMAINING_SEGMENTS_SIZE);
-    for (QueueEntry entry : queue) {
-      final VersionedIntervalTimeline timeline = dataSources.get(entry.getDataSource());
-      final Interval interval = new Interval(timeline.first().getInterval().getStart(), entry.interval.getEnd());
-
-      final List holders = timeline.lookup(interval);
-
-      long size = 0;
-      for (DataSegment segment : FluentIterable
-          .from(holders)
-          .transformAndConcat(TimelineObjectHolder::getObject)
-          .transform(PartitionChunk::getObject)) {
-        size += segment.getSize();
-      }
+    return remainingSegments;
+  }
+
+  @Override
+  public Map totalProcessedStatistics()
+  {
+    return processedSegments;
+  }
 
-      resultMap.put(entry.getDataSource(), size);
+  @Override
+  public void flushAllSegments()
+  {
+    if (queue.isEmpty()) {
+      return;
+    }
+    QueueEntry entry;
+    while ((entry = queue.poll()) != null) {
+      final List resultSegments = entry.segments;
+      final String dataSourceName = resultSegments.get(0).getDataSource();
+      // This entry was in the queue, meaning that it was not processed. Hence, also aggregate its
+      // statistics into the remaining segment counts.
+      collectSegmentStatistics(remainingSegments, dataSourceName, new SegmentsToCompact(entry.segments));
+      final CompactibleTimelineObjectHolderCursor compactibleTimelineObjectHolderCursor = timelineIterators.get(
+          dataSourceName
+      );
+      // WARNING: This iterates the compactibleTimelineObjectHolderCursor.
+      // Since this method is intended to only be used after all necessary iteration is done on this iterator

Review comment:
   I mean that this method (`flushAllSegments`) will iterate the `compactibleTimelineObjectHolderCursor` via the `iterateAllSegments` method call. This class (`NewestSegmentFirstIterator`), when used as an iterator, also iterates the `compactibleTimelineObjectHolderCursor` in its next() method. Hence, this iterator (`NewestSegmentFirstIterator`) cannot be used for iteration after this method (`flushAllSegments`) is called.
   
   Basically, you cannot call `flushAllSegments` while iterating the NewestSegmentFirstIterator, and you cannot call `flushAllSegments` and then go back to iterating the NewestSegmentFirstIterator.
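
   As a standalone toy illustrating that contract (this is not the real `NewestSegmentFirstIterator`, which does not guard against misuse; the guard here just makes the ordering rule explicit):

```java
import java.util.Iterator;
import java.util.List;

class DrainableIterator implements Iterator<String>
{
  private final Iterator<String> cursor;  // stand-in for the shared timeline cursor
  private boolean flushed = false;

  DrainableIterator(List<String> entries)
  {
    this.cursor = entries.iterator();
  }

  @Override
  public boolean hasNext()
  {
    checkNotFlushed();
    return cursor.hasNext();
  }

  @Override
  public String next()
  {
    checkNotFlushed();
    return cursor.next();
  }

  // Drains whatever remains, e.g. to aggregate statistics. After this, the
  // shared cursor is consumed, so normal iteration must not resume.
  void flushAllSegments()
  {
    while (cursor.hasNext()) {
      cursor.next();  // a real implementation would collect stats here
    }
    flushed = true;
  }

  private void checkNotFlushed()
  {
    if (flushed) {
      throw new IllegalStateException("cannot iterate after flushAllSegments()");
    }
  }
}
```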








[GitHub] [druid] jon-wei commented on pull request #10371: Auto-compaction snapshot status API

2020-09-14 Thread GitBox


jon-wei commented on pull request #10371:
URL: https://github.com/apache/druid/pull/10371#issuecomment-692462764


   > I was thinking of "snapshot" as the snapshot taken at the point in time 
the API is called. I think "status" also works. Let me know if you still think 
"status" is better. I'm open to both
   
   I think "status" is better, "snapshot" could be misinterpreted as 
snapshotting the data itself, vs getting a snapshot of the stats






[GitHub] [druid] jon-wei commented on a change in pull request #10371: Auto-compaction snapshot status API

2020-09-14 Thread GitBox


jon-wei commented on a change in pull request #10371:
URL: https://github.com/apache/druid/pull/10371#discussion_r488322171



##
File path: server/src/main/java/org/apache/druid/server/coordinator/AutoCompactionSnapshot.java
##
@@ -0,0 +1,192 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.server.coordinator;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+
+import javax.annotation.Nullable;
+import javax.validation.constraints.NotNull;
+import java.util.Objects;
+
+public class AutoCompactionSnapshot
+{
+  public enum AutoCompactionScheduleStatus
+  {
+    NOT_ENABLED,
+    RUNNING
+  }
+
+  @JsonProperty
+  private String dataSource;
+  @JsonProperty
+  private AutoCompactionScheduleStatus scheduleStatus;
+  @JsonProperty
+  private String latestScheduledTaskId;
+  @JsonProperty
+  private long byteCountAwaitingCompaction;

Review comment:
   suggest just "bytes" instead of "byteCount" for the property name

##
File path: server/src/main/java/org/apache/druid/server/coordinator/duty/NewestSegmentFirstIterator.java
##
@@ -112,27 +114,38 @@
   }
 
   @Override
-  public Object2LongOpenHashMap totalRemainingSegmentsSizeBytes()
+  public Map totalRemainingStatistics()
   {
-    final Object2LongOpenHashMap resultMap = new Object2LongOpenHashMap<>();
-    resultMap.defaultReturnValue(UNKNOWN_TOTAL_REMAINING_SEGMENTS_SIZE);
-    for (QueueEntry entry : queue) {
-      final VersionedIntervalTimeline timeline = dataSources.get(entry.getDataSource());
-      final Interval interval = new Interval(timeline.first().getInterval().getStart(), entry.interval.getEnd());
-
-      final List holders = timeline.lookup(interval);
-
-      long size = 0;
-      for (DataSegment segment : FluentIterable
-          .from(holders)
-          .transformAndConcat(TimelineObjectHolder::getObject)
-          .transform(PartitionChunk::getObject)) {
-        size += segment.getSize();
-      }
+    return remainingSegments;
+  }
+
+  @Override
+  public Map totalProcessedStatistics()
+  {
+    return processedSegments;
+  }
 
-      resultMap.put(entry.getDataSource(), size);
+  @Override
+  public void flushAllSegments()
+  {
+    if (queue.isEmpty()) {
+      return;
+    }
+    QueueEntry entry;
+    while ((entry = queue.poll()) != null) {
+      final List resultSegments = entry.segments;
+      final String dataSourceName = resultSegments.get(0).getDataSource();
+      // This entry was in the queue, meaning that it was not processed. Hence, also aggregate its
+      // statistics into the remaining segment counts.
+      collectSegmentStatistics(remainingSegments, dataSourceName, new SegmentsToCompact(entry.segments));
+      final CompactibleTimelineObjectHolderCursor compactibleTimelineObjectHolderCursor = timelineIterators.get(
+          dataSourceName
+      );
+      // WARNING: This iterates the compactibleTimelineObjectHolderCursor.
+      // Since this method is intended to only be used after all necessary iteration is done on this iterator

Review comment:
   I don't think I understand this comment. If all iteration is done (by that, do you mean `compactibleTimelineObjectHolderCursor.hasNext` returns false?), then iterateAllSegments would do nothing.

##
File path: server/src/main/java/org/apache/druid/server/coordinator/duty/CompactSegments.java
##
@@ -238,25 +272,102 @@ private CoordinatorStats makeStats(int numCompactionTasks, CompactionSegmentIter
   {
     final CoordinatorStats stats = new CoordinatorStats();
     stats.addToGlobalStat(COMPACTION_TASK_COUNT, numCompactionTasks);
-    totalSizesOfSegmentsAwaitingCompactionPerDataSource = iterator.totalRemainingSegmentsSizeBytes();
-    totalSizesOfSegmentsAwaitingCompactionPerDataSource.object2LongEntrySet().fastForEach(
-        entry -> {
-          final String dataSource = entry.getKey();
-          final long totalSizeOfSegmentsAwaitingCompaction = entry.getLongValue();
-          stats.addToDataSourceStat(
-              TOTAL_SIZE_OF_SEGMENTS_AWAITING_COMPACTION,
-              dataSource,
-

[GitHub] [druid] tianzhipeng-git commented on issue #10384: OutOfMemoryError when hadoop ingest

2020-09-14 Thread GitBox


tianzhipeng-git commented on issue #10384:
URL: https://github.com/apache/druid/issues/10384#issuecomment-692437817


   How many reducers does this Hadoop task have? And which `segmentGranularity` and `partitionsSpec` do you use?
   
   Maybe you should adjust your ingestion JSON to produce more segments; see [partition](https://druid.apache.org/docs/latest/ingestion/index.html#partitioning) and [hadoop-based partition](https://druid.apache.org/docs/latest/ingestion/hadoop.html#partitionsspec).
   
   (BTW, `mapreduce.map.memory.mb` must be a little bigger than the heap set in `mapred.map.child.java.opts`; the same goes for reduce.)
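
   To illustrate the relationship between the two settings (the values below are hypothetical, using the current `mapreduce.*` property names; in a Druid Hadoop ingestion spec these would typically go under `tuningConfig.jobProperties`):

```java
import org.apache.hadoop.conf.Configuration;

public class MapMemorySettings
{
  public static void main(String[] args)
  {
    Configuration conf = new Configuration();
    // Container size requested from YARN, in MB.
    conf.set("mapreduce.map.memory.mb", "4096");
    // JVM heap for the map task; it must stay comfortably below the container
    // size, or YARN kills the task for exceeding its memory limit.
    conf.set("mapreduce.map.java.opts", "-Xmx3277m");
    // The same relationship applies to mapreduce.reduce.memory.mb and
    // mapreduce.reduce.java.opts.
  }
}
```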
   






[druid] branch master updated (5751d0e -> e465f05)

2020-09-14 Thread cwylie
This is an automated email from the ASF dual-hosted git repository.

cwylie pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/druid.git.


from 5751d0e  Skip coverage check for tag builds (#10397)
 add e465f05  Web console: Improve number alignment in tables (#10389)

No new revisions were added by this update.

Summary of changes:
 .../e2e-tests/component/datasources/datasource.ts  |   2 +-
 .../e2e-tests/component/datasources/overview.ts|  12 +-
 web-console/e2e-tests/tutorial-batch.spec.ts   |   2 +-
 web-console/e2e-tests/util/table.ts|   7 +-
 web-console/src/bootstrap/react-table-defaults.tsx |  16 +-
 .../__snapshots__/braced-text.spec.tsx.snap|  18 ++
 .../braced-text/braced-text.scss}  |  32 ++-
 .../braced-text/braced-text.spec.tsx}  |   9 +-
 .../src/components/braced-text/braced-text.tsx |  55 +
 .../datasource-columns-table.tsx   |   6 +-
 web-console/src/components/index.ts|   1 +
 .../__snapshots__/segment-timeline.spec.tsx.snap   |   8 +-
 .../segment-timeline/segment-timeline.tsx  |   8 +-
 .../src/components/show-history/show-history.tsx   |   5 -
 web-console/src/components/show-json/show-json.tsx |   6 +-
 .../dialogs/retention-dialog/retention-dialog.tsx  |   5 -
 web-console/src/utils/general.tsx  |   8 +-
 .../__snapshots__/datasource-view.spec.tsx.snap|  85 +--
 .../src/views/datasource-view/datasource-view.tsx  | 250 ++---
 .../views/home-view/status-card/status-card.tsx|   7 +-
 web-console/src/views/query-view/query-view.tsx|   2 +-
 .../src/views/segments-view/segments-view.tsx  |  62 +++--
 22 files changed, 396 insertions(+), 210 deletions(-)
 create mode 100644 web-console/src/components/braced-text/__snapshots__/braced-text.spec.tsx.snap
 copy web-console/src/{console-application.scss => components/braced-text/braced-text.scss} (72%)
 copy web-console/src/{views/load-data-view/load-data-view.spec.tsx => components/braced-text/braced-text.spec.tsx} (81%)
 create mode 100644 web-console/src/components/braced-text/braced-text.tsx





[GitHub] [druid] clintropolis merged pull request #10389: Web console: Improve number alignment in tables

2020-09-14 Thread GitBox


clintropolis merged pull request #10389:
URL: https://github.com/apache/druid/pull/10389


   






[druid] branch master updated (f71ba6f -> 5751d0e)

2020-09-14 Thread fjy
This is an automated email from the ASF dual-hosted git repository.

fjy pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/druid.git.


from f71ba6f  Vectorized ANY aggregators (#10338)
 add 5751d0e  Skip coverage check for tag builds (#10397)

No new revisions were added by this update.

Summary of changes:
 .travis.yml | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)





[GitHub] [druid] fjy merged pull request #10397: Skip coverage check for tag builds

2020-09-14 Thread GitBox


fjy merged pull request #10397:
URL: https://github.com/apache/druid/pull/10397


   






[druid] branch master updated (e012d5c -> f71ba6f)

2020-09-14 Thread suneet
This is an automated email from the ASF dual-hosted git repository.

suneet pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/druid.git.


from e012d5c  allow vectorized query engines to utilize vectorized virtual 
columns (#10388)
 add f71ba6f  Vectorized ANY aggregators (#10338)

No new revisions were added by this update.

Summary of changes:
 docs/querying/query-context.md |   4 +-
 .../any/DoubleAnyAggregatorFactory.java|  21 +++
 .../DoubleAnyVectorAggregator.java}|  29 +--
 .../aggregation/any/FloatAnyAggregatorFactory.java |  21 +++
 ...gregator.java => FloatAnyVectorAggregator.java} |  46 ++---
 .../aggregation/any/LongAnyAggregatorFactory.java  |  21 +++
 ...ggregator.java => LongAnyVectorAggregator.java} |  45 ++---
 .../any/NumericAnyVectorAggregator.java| 139 ++
 .../any/NumericNilVectorAggregator.java| 113 +++
 .../any/StringAnyAggregatorFactory.java|  30 +++
 .../aggregation/any/StringAnyVectorAggregator.java | 145 ++
 .../any/DoubleAnyAggregatorFactoryTest.java| 101 ++
 .../any/DoubleAnyVectorAggregatorTest.java | 104 ++
 .../any/FloatAnyAggregatorFactoryTest.java | 101 ++
 .../any/FloatAnyVectorAggregatorTest.java  | 104 ++
 .../any/LongAnyAggregatorFactoryTest.java  | 101 ++
 .../any/LongAnyVectorAggregatorTest.java   | 103 ++
 .../any/NumericAnyVectorAggregatorTest.java| 210 +
 .../any/StringAnyAggregatorFactoryTest.java| 110 +++
 .../any/StringAnyVectorAggregatorTest.java | 189 +++
 .../apache/druid/sql/calcite/CalciteQueryTest.java |  31 +--
 website/.spelling  |   4 +
 22 files changed, 1679 insertions(+), 93 deletions(-)
 copy processing/src/main/java/org/apache/druid/query/aggregation/{DoubleMaxBufferAggregator.java => any/DoubleAnyVectorAggregator.java} (53%)
 copy processing/src/main/java/org/apache/druid/query/aggregation/any/{LongAnyAggregator.java => FloatAnyVectorAggregator.java} (52%)
 copy processing/src/main/java/org/apache/druid/query/aggregation/any/{DoubleAnyAggregator.java => LongAnyVectorAggregator.java} (53%)
 create mode 100644 processing/src/main/java/org/apache/druid/query/aggregation/any/NumericAnyVectorAggregator.java
 create mode 100644 processing/src/main/java/org/apache/druid/query/aggregation/any/NumericNilVectorAggregator.java
 create mode 100644 processing/src/main/java/org/apache/druid/query/aggregation/any/StringAnyVectorAggregator.java
 create mode 100644 processing/src/test/java/org/apache/druid/query/aggregation/any/DoubleAnyAggregatorFactoryTest.java
 create mode 100644 processing/src/test/java/org/apache/druid/query/aggregation/any/DoubleAnyVectorAggregatorTest.java
 create mode 100644 processing/src/test/java/org/apache/druid/query/aggregation/any/FloatAnyAggregatorFactoryTest.java
 create mode 100644 processing/src/test/java/org/apache/druid/query/aggregation/any/FloatAnyVectorAggregatorTest.java
 create mode 100644 processing/src/test/java/org/apache/druid/query/aggregation/any/LongAnyAggregatorFactoryTest.java
 create mode 100644 processing/src/test/java/org/apache/druid/query/aggregation/any/LongAnyVectorAggregatorTest.java
 create mode 100644 processing/src/test/java/org/apache/druid/query/aggregation/any/NumericAnyVectorAggregatorTest.java
 create mode 100644 processing/src/test/java/org/apache/druid/query/aggregation/any/StringAnyAggregatorFactoryTest.java
 create mode 100644 processing/src/test/java/org/apache/druid/query/aggregation/any/StringAnyVectorAggregatorTest.java





[GitHub] [druid] suneet-s merged pull request #10338: Vectorized ANY aggregators

2020-09-14 Thread GitBox


suneet-s merged pull request #10338:
URL: https://github.com/apache/druid/pull/10338


   






[GitHub] [druid] suneet-s commented on a change in pull request #10388: allow vectorized query engines to utilize vectorized virtual columns

2020-09-14 Thread GitBox


suneet-s commented on a change in pull request #10388:
URL: https://github.com/apache/druid/pull/10388#discussion_r488348990



##
File path: processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/vector/VectorGroupByEngine.java
##
@@ -70,24 +73,47 @@ public static boolean canVectorize(
       @Nullable final Filter filter
   )
   {
-    // Multi-value dimensions are not yet supported.
-    //
-    // Two notes here about how we're handling this check:
-    //   1) After multi-value dimensions are supported, we could alter "GroupByQueryEngineV2.isAllSingleValueDims"
-    //      to accept a ColumnSelectorFactory, which makes more sense than using a StorageAdapter (see #8013).
-    //   2) Technically using StorageAdapter here is bad since it only looks at real columns, but they might
-    //      be shadowed by virtual columns (again, see #8013). But it's fine for now since adapter.canVectorize
-    //      always returns false if there are any virtual columns.
-    //
-    // This situation should sort itself out pretty well once this engine supports multi-valued columns. Then we
-    // won't have to worry about having this all-single-value-dims check here.
-
-    return GroupByQueryEngineV2.isAllSingleValueDims(adapter::getColumnCapabilities, query.getDimensions(), true)
+    Function capabilitiesFunction = name ->
+        query.getVirtualColumns().getColumnCapabilitiesWithFallback(adapter, name);
+
+    return canVectorizeDimensions(capabilitiesFunction, query.getDimensions())
           && query.getDimensions().stream().allMatch(DimensionSpec::canVectorize)
           && query.getAggregatorSpecs().stream().allMatch(aggregatorFactory -> aggregatorFactory.canVectorize(adapter))
           && adapter.canVectorize(filter, query.getVirtualColumns(), false);
   }
 
+  public static boolean canVectorizeDimensions(
+      final Function capabilitiesFunction,
+      final List dimensions
+  )
+  {
+    return dimensions
+        .stream()
+        .allMatch(
+            dimension -> {
+              if (dimension.mustDecorate()) {
+                // group by on multi value dimensions are not currently supported
+                // DimensionSpecs that decorate may turn singly-valued columns into multi-valued selectors.
+                // To be safe, we must return false here.
+                return false;
+              }
+
+              // Now check column capabilities.
+              final ColumnCapabilities columnCapabilities = capabilitiesFunction.apply(dimension.getDimension());
+              // null here currently means the column does not exist, nil columns can be vectorized
+              if (columnCapabilities == null) {
+                return true;
+              }
+              // strings must be single valued, dictionary encoded, and have unique dictionary entries
+              if (ValueType.STRING.equals(columnCapabilities.getType())) {
+                return columnCapabilities.hasMultipleValues().isFalse() &&

Review comment:
   Well, I guess the example I gave is equivalent. Maybe a better way to ask the question is: should `UNKNOWN` be treated as multi-value or single-value?
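
   For reference, a minimal reduction of the tri-state capability semantics under discussion (my own sketch, not the actual `ColumnCapabilities.Capable` source):

```java
// Toy tri-state flag mirroring the semantics discussed above.
enum Capable
{
  FALSE,
  TRUE,
  UNKNOWN;

  boolean isTrue()
  {
    return this == TRUE;
  }

  boolean isFalse()
  {
    return this == FALSE;
  }

  // UNKNOWN counts as "maybe true", which is what distinguishes this from isTrue().
  boolean isMaybeTrue()
  {
    return this == TRUE || this == UNKNOWN;
  }
}
```

   Under these semantics, `hasMultipleValues().isFalse()` and `!hasMultipleValues().isMaybeTrue()` both accept only `FALSE`, i.e. both treat `UNKNOWN` as potentially multi-valued.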








[GitHub] [druid] suneet-s commented on a change in pull request #10388: allow vectorized query engines to utilize vectorized virtual columns

2020-09-14 Thread GitBox


suneet-s commented on a change in pull request #10388:
URL: https://github.com/apache/druid/pull/10388#discussion_r488348359



##
File path: processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/vector/VectorGroupByEngine.java
##
@@ -70,24 +73,47 @@ public static boolean canVectorize(
       @Nullable final Filter filter
   )
   {
-    // Multi-value dimensions are not yet supported.
-    //
-    // Two notes here about how we're handling this check:
-    //   1) After multi-value dimensions are supported, we could alter "GroupByQueryEngineV2.isAllSingleValueDims"
-    //      to accept a ColumnSelectorFactory, which makes more sense than using a StorageAdapter (see #8013).
-    //   2) Technically using StorageAdapter here is bad since it only looks at real columns, but they might
-    //      be shadowed by virtual columns (again, see #8013). But it's fine for now since adapter.canVectorize
-    //      always returns false if there are any virtual columns.
-    //
-    // This situation should sort itself out pretty well once this engine supports multi-valued columns. Then we
-    // won't have to worry about having this all-single-value-dims check here.
-
-    return GroupByQueryEngineV2.isAllSingleValueDims(adapter::getColumnCapabilities, query.getDimensions(), true)
+    Function capabilitiesFunction = name ->
+        query.getVirtualColumns().getColumnCapabilitiesWithFallback(adapter, name);
+
+    return canVectorizeDimensions(capabilitiesFunction, query.getDimensions())
           && query.getDimensions().stream().allMatch(DimensionSpec::canVectorize)
           && query.getAggregatorSpecs().stream().allMatch(aggregatorFactory -> aggregatorFactory.canVectorize(adapter))
           && adapter.canVectorize(filter, query.getVirtualColumns(), false);
   }
 
+  public static boolean canVectorizeDimensions(
+      final Function capabilitiesFunction,
+      final List dimensions
+  )
+  {
+    return dimensions
+        .stream()
+        .allMatch(
+            dimension -> {
+              if (dimension.mustDecorate()) {
+                // group by on multi value dimensions are not currently supported
+                // DimensionSpecs that decorate may turn singly-valued columns into multi-valued selectors.
+                // To be safe, we must return false here.
+                return false;
+              }
+
+              // Now check column capabilities.
+              final ColumnCapabilities columnCapabilities = capabilitiesFunction.apply(dimension.getDimension());
+              // null here currently means the column does not exist, nil columns can be vectorized
+              if (columnCapabilities == null) {
+                return true;
+              }
+              // strings must be single valued, dictionary encoded, and have unique dictionary entries
+              if (ValueType.STRING.equals(columnCapabilities.getType())) {
+                return columnCapabilities.hasMultipleValues().isFalse() &&

Review comment:
   How do you decide between `columnCapabilities.hasMultipleValues().isFalse()` and `!columnCapabilities.hasMultipleValues().isMaybeTrue()`? I'm never sure which one to check.








[druid] branch master updated (184b202 -> e012d5c)

2020-09-14 Thread cwylie
This is an automated email from the ASF dual-hosted git repository.

cwylie pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/druid.git.


from 184b202  add computed Expr output types (#10370)
 add e012d5c  allow vectorized query engines to utilize vectorized virtual 
columns (#10388)

No new revisions were added by this update.

Summary of changes:
 .../epinephelinae/GroupByQueryEngineV2.java|  21 +-
 .../epinephelinae/vector/VectorGroupByEngine.java  |  52 +++-
 .../QueryableIndexCursorSequenceBuilder.java   |  51 ++--
 .../segment/QueryableIndexStorageAdapter.java  |   5 +-
 .../org/apache/druid/segment/VirtualColumn.java| 183 +--
 .../org/apache/druid/segment/VirtualColumns.java   | 249 +++
 .../QueryableIndexVectorColumnSelectorFactory.java | 225 +-
 .../virtual/AlwaysTwoCounterAggregatorFactory.java | 257 
 .../virtual/AlwaysTwoVectorizedVirtualColumn.java  | 321 
 .../virtual/VectorizedVirtualColumnTest.java   | 334 +
 10 files changed, 1490 insertions(+), 208 deletions(-)
 create mode 100644 processing/src/test/java/org/apache/druid/segment/virtual/AlwaysTwoCounterAggregatorFactory.java
 create mode 100644 processing/src/test/java/org/apache/druid/segment/virtual/AlwaysTwoVectorizedVirtualColumn.java
 create mode 100644 processing/src/test/java/org/apache/druid/segment/virtual/VectorizedVirtualColumnTest.java





[GitHub] [druid] clintropolis merged pull request #10388: allow vectorized query engines to utilize vectorized virtual columns

2020-09-14 Thread GitBox


clintropolis merged pull request #10388:
URL: https://github.com/apache/druid/pull/10388


   






[druid] branch master updated (084b23d -> 184b202)

2020-09-14 Thread cwylie
This is an automated email from the ASF dual-hosted git repository.

cwylie pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/druid.git.


from 084b23d  benchmark for indexed table experiments (#10327)
 add 184b202  add computed Expr output types (#10370)

No new revisions were added by this update.

Summary of changes:
 .../org/apache/druid/math/expr/ApplyFunction.java  |  86 +-
 .../druid/math/expr/BinaryLogicalOperatorExpr.java |  79 +-
 .../apache/druid/math/expr/BinaryOperatorExpr.java |   9 +-
 .../org/apache/druid/math/expr/ConstantExpr.java   |  41 +-
 .../main/java/org/apache/druid/math/expr/Expr.java |  73 +-
 .../java/org/apache/druid/math/expr/ExprEval.java  |   5 +
 .../apache/druid/math/expr/ExprListenerImpl.java   |   2 +-
 .../org/apache/druid/math/expr/ExprMacroTable.java |  14 +-
 .../java/org/apache/druid/math/expr/ExprType.java  | 148 +++-
 .../java/org/apache/druid/math/expr/Function.java  | 933 ++---
 .../org/apache/druid/math/expr/FunctionalExpr.java |  55 +-
 .../org/apache/druid/math/expr/IdentifierExpr.java |  10 +-
 .../java/org/apache/druid/math/expr/Parser.java|  28 +-
 .../apache/druid/math/expr/UnaryOperatorExpr.java  |  21 +-
 .../java/org/apache/druid/math/expr/ExprTest.java  |  40 +-
 .../org/apache/druid/math/expr/OutputTypeTest.java | 463 ++
 .../org/apache/druid/math/expr/ParserTest.java |   8 +-
 .../query/expressions/BloomFilterExprMacro.java|  10 +-
 .../druid/query/expression/ContainsExpr.java   |  12 +-
 .../expression/IPv4AddressMatchExprMacro.java  |  10 +-
 .../expression/IPv4AddressParseExprMacro.java  |   9 +
 .../expression/IPv4AddressStringifyExprMacro.java  |   9 +
 .../druid/query/expression/LikeExprMacro.java  |  10 +-
 .../druid/query/expression/LookupExprMacro.java|   9 +
 .../query/expression/RegexpExtractExprMacro.java   |   9 +
 .../query/expression/RegexpLikeExprMacro.java  |  12 +-
 .../query/expression/TimestampCeilExprMacro.java   |  16 +
 .../expression/TimestampExtractExprMacro.java  |  15 +
 .../query/expression/TimestampFloorExprMacro.java  |  16 +
 .../query/expression/TimestampFormatExprMacro.java |   9 +
 .../query/expression/TimestampParseExprMacro.java  |   9 +
 .../query/expression/TimestampShiftExprMacro.java  |  16 +
 .../druid/query/expression/TrimExprMacro.java  |  18 +-
 .../druid/segment/filter/ExpressionFilter.java |   4 +-
 .../join/filter/JoinFilterCorrelations.java|   4 +-
 .../druid/segment/virtual/ExpressionSelectors.java |  57 +-
 .../RowBasedExpressionColumnValueSelector.java |  10 +-
 .../query/expression/RegexpLikeExprMacroTest.java  |  17 +-
 .../builtin/ReductionOperatorConversionHelper.java |   4 +-
 .../apache/druid/sql/calcite/rel/Projection.java   |   4 +-
 40 files changed, 1838 insertions(+), 466 deletions(-)
 create mode 100644 core/src/test/java/org/apache/druid/math/expr/OutputTypeTest.java






[GitHub] [druid] clintropolis commented on a change in pull request #10370: add computed Expr output types

2020-09-14 Thread GitBox


clintropolis commented on a change in pull request #10370:
URL: https://github.com/apache/druid/pull/10370#discussion_r488324915



##
File path: core/src/main/java/org/apache/druid/math/expr/BinaryLogicalOperatorExpr.java
##
@@ -57,6 +57,17 @@ protected final double evalDouble(double left, double right)
     // Use Double.compare for more consistent NaN handling.
     return Evals.asDouble(Double.compare(left, right) < 0);
   }
+
+  @Nullable
+  @Override
+  public ExprType getOutputType(InputBindingTypes inputTypes)
+  {
+    ExprType implicitCast = super.getOutputType(inputTypes);
+    if (ExprType.STRING.equals(implicitCast)) {
+      return ExprType.LONG;
+    }
+    return implicitCast;

Review comment:
   I'm going to merge this PR, but we can continue this discussion in the next one, so this stuff can actually be tied into things in a concrete manner. Looking at Postgres, its math functions do, at least somewhat, allow implicit conversion of string numbers to numbers: `2.0 + '3.1' = 5.1` and `cos('1')` both work, but `'2.0' + '3.1'` does not. In Druid currently, the first would behave the same as Postgres, the 2nd would be null or 0 depending on the value of `druid.generic.useDefaultValueForNull`, and the 3rd expression would do string concatenation.








[GitHub] [druid] clintropolis merged pull request #10370: add computed Expr output types

2020-09-14 Thread GitBox


clintropolis merged pull request #10370:
URL: https://github.com/apache/druid/pull/10370


   






[GitHub] [druid] suneet-s commented on a change in pull request #10390: Vectorized variance aggregators

2020-09-14 Thread GitBox


suneet-s commented on a change in pull request #10390:
URL: https://github.com/apache/druid/pull/10390#discussion_r488319436



##
File path: extensions-core/stats/src/main/java/org/apache/druid/query/aggregation/variance/VarianceAggregatorFactory.java
##
@@ -162,10 +167,36 @@ public BufferAggregator factorizeBuffered(ColumnSelectorFactory metricFactory)
     throw new IAE(
         "Incompatible type for metric[%s], expected a float, double, long, or variance, but got a %s",
         fieldName,
-        inputType
+        type
     );
   }
 
+  @Override
+  public VectorAggregator factorizeVector(VectorColumnSelectorFactory selectorFactory)
+  {
+    final String type = getTypeString(selectorFactory);
+    if (ValueType.FLOAT.name().equalsIgnoreCase(type)) {

Review comment:
   Fix was easy for everything. Done with some tests. Waiting on a full 
review, and I'll push up a new patch.








[GitHub] [druid] jihoonson commented on issue #10321: [Proposal] Design for a Configurable Index Type

2020-09-14 Thread GitBox


jihoonson commented on issue #10321:
URL: https://github.com/apache/druid/issues/10321#issuecomment-692390625


   @liran-funaro thanks for your answer.
   
   > I'll look into that. Improving query performance is on our agenda, and we 
considered comparing the groupBy v1 with Oak to the groupBy v2 mechanism. We 
will report our results in a different issue in the future.
   
   Sounds good. I'm looking forward to seeing your report. 
:slightly_smiling_face: 






[GitHub] [druid] clintropolis commented on a change in pull request #10390: Vectorized variance aggregators

2020-09-14 Thread GitBox


clintropolis commented on a change in pull request #10390:
URL: https://github.com/apache/druid/pull/10390#discussion_r488305025



##
File path: extensions-core/stats/src/main/java/org/apache/druid/query/aggregation/variance/VarianceAggregatorFactory.java
##
@@ -162,10 +167,36 @@ public BufferAggregator factorizeBuffered(ColumnSelectorFactory metricFactory)
     throw new IAE(
         "Incompatible type for metric[%s], expected a float, double, long, or variance, but got a %s",
         fieldName,
-        inputType
+        type
     );
   }
 
+  @Override
+  public VectorAggregator factorizeVector(VectorColumnSelectorFactory selectorFactory)
+  {
+    final String type = getTypeString(selectorFactory);
+    if (ValueType.FLOAT.name().equalsIgnoreCase(type)) {

Review comment:
   Ah, this is actually only true if `inputType` isn't specified, so it's not _never_ true: it just will never be true when the input type isn't explicitly specified as `"variance"` and can't be autodetected from the inputs.
   
   I think it would be nice if you fixed it up at least for the vectorized codepath, but it would probably also be nice to fix it up for all of them if it isn't too much trouble. I don't think waiting on #10277 is necessary, since there are a handful of aggregator factories that could be improved to have a stronger check on their complex input type.








[GitHub] [druid] suneet-s commented on a change in pull request #10390: Vectorized variance aggregators

2020-09-14 Thread GitBox


suneet-s commented on a change in pull request #10390:
URL: https://github.com/apache/druid/pull/10390#discussion_r488292797



##
File path: extensions-core/stats/src/main/java/org/apache/druid/query/aggregation/variance/VarianceAggregatorFactory.java
##
@@ -162,10 +167,36 @@ public BufferAggregator factorizeBuffered(ColumnSelectorFactory metricFactory)
     throw new IAE(
         "Incompatible type for metric[%s], expected a float, double, long, or variance, but got a %s",
         fieldName,
-        inputType
+        type
     );
   }
 
+  @Override
+  public VectorAggregator factorizeVector(VectorColumnSelectorFactory selectorFactory)
+  {
+    final String type = getTypeString(selectorFactory);
+    if (ValueType.FLOAT.name().equalsIgnoreCase(type)) {

Review comment:
   Can I file a separate bug for this issue since it appears to be a bug 
independent of the vectorization code path? I can look at fixing that issue 
after #10277 goes in








[GitHub] [druid] suneet-s commented on a change in pull request #10338: Vectorized ANY aggregators

2020-09-14 Thread GitBox


suneet-s commented on a change in pull request #10338:
URL: https://github.com/apache/druid/pull/10338#discussion_r488280699



##
File path: processing/src/main/java/org/apache/druid/query/aggregation/any/NumericNilVectorAggregator.java
##
@@ -0,0 +1,146 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.aggregation.any;
+
+import org.apache.druid.common.config.NullHandling;
+import org.apache.druid.query.aggregation.VectorAggregator;
+
+import javax.annotation.Nullable;
+import java.nio.ByteBuffer;
+
+/**
+ * A vector aggregator that returns the default numeric value.
+ */
+public abstract class NumericNilVectorAggregator implements VectorAggregator

Review comment:
   I decided not to extend `NoopVectorAggregator` since its constructor was private, and I didn't want to think through what the right inheritance model should be. I did take your suggestion of instantiating one singleton for each numeric type to reduce the boilerplate code. New patch incoming...








[GitHub] [druid] clintropolis commented on a change in pull request #10390: Vectorized variance aggregators

2020-09-14 Thread GitBox


clintropolis commented on a change in pull request #10390:
URL: https://github.com/apache/druid/pull/10390#discussion_r488175527



##
File path: extensions-core/stats/src/main/java/org/apache/druid/query/aggregation/variance/VarianceAggregatorFactory.java
##
@@ -162,10 +167,36 @@ public BufferAggregator factorizeBuffered(ColumnSelectorFactory metricFactory)
     throw new IAE(
         "Incompatible type for metric[%s], expected a float, double, long, or variance, but got a %s",
         fieldName,
-        inputType
+        type
     );
   }
 
+  @Override
+  public VectorAggregator factorizeVector(VectorColumnSelectorFactory selectorFactory)
+  {
+    final String type = getTypeString(selectorFactory);
+    if (ValueType.FLOAT.name().equalsIgnoreCase(type)) {

Review comment:
   Oh, this isn't actually correct. This method is backed by `getType`, which means `VARIANCE_TYPE_NAME.equalsIgnoreCase(type)` will never be true. I suggest getting the column capabilities instead of using `getTypeString`; then this method can switch on `getType` directly instead of converting to and from strings, and if the type is `ValueType.COMPLEX`, treating it as the object aggregator factory.
   
   After #10277 goes in, we can then check that `getComplexTypeName` matches `VARIANCE_TYPE_NAME`.








[GitHub] [druid] clintropolis commented on a change in pull request #10338: Vectorized ANY aggregators

2020-09-14 Thread GitBox


clintropolis commented on a change in pull request #10338:
URL: https://github.com/apache/druid/pull/10338#discussion_r488262030



##
File path: 
processing/src/main/java/org/apache/druid/query/aggregation/any/NumericNilVectorAggregator.java
##
@@ -0,0 +1,146 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.aggregation.any;
+
+import org.apache.druid.common.config.NullHandling;
+import org.apache.druid.query.aggregation.VectorAggregator;
+
+import javax.annotation.Nullable;
+import java.nio.ByteBuffer;
+
+/**
+ * A vector aggregator that returns the default numeric value.
+ */
+public abstract class NumericNilVectorAggregator implements VectorAggregator

Review comment:
   nit: can extend `NoopVectorAggregator` to reduce boilerplate. I also 
don't think you need separate classes for each type; since the number is an 
object anyway, you can just instantiate each singleton using 
`NullHandling.defaultLongValue()` etc.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] clintropolis merged pull request #10327: benchmark for indexed table experiments

2020-09-14 Thread GitBox


clintropolis merged pull request #10327:
URL: https://github.com/apache/druid/pull/10327


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[druid] branch master updated (f5e2645 -> 084b23d)

2020-09-14 Thread cwylie
This is an automated email from the ASF dual-hosted git repository.

cwylie pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/druid.git.


from f5e2645  Support SearchQueryDimFilter in sql via new methods (#10350)
 add 084b23d  benchmark for indexed table experiments (#10327)

No new revisions were added by this update.

Summary of changes:
 .../benchmark/IndexedTableJoinCursorBenchmark.java | 416 +
 ...mark.java => IndexedTableLoadingBenchmark.java} |  83 ++--
 2 files changed, 460 insertions(+), 39 deletions(-)
 create mode 100644 
benchmarks/src/test/java/org/apache/druid/benchmark/IndexedTableJoinCursorBenchmark.java
 copy 
benchmarks/src/test/java/org/apache/druid/benchmark/{ConsistentHasherBenchmark.java
 => IndexedTableLoadingBenchmark.java} (52%)


-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] ccaominh commented on pull request #10397: Skip coverage check for tag builds

2020-09-14 Thread GitBox


ccaominh commented on pull request #10397:
URL: https://github.com/apache/druid/pull/10397#issuecomment-692336327


   > LGTM
   > 
   > I found some tags that had failed CI
   > `druid-0.19.0-rc1` -> 
https://travis-ci.org/github/apache/druid/builds/707574622
   > `druid-0.19.0` -> 
https://travis-ci.org/github/apache/druid/builds/709850305
   > 
   > It looks like the code coverage bot did not run for the 0.18 tag
   > `druid-0.18.1` -> 
https://travis-ci.org/github/apache/druid/builds/686661101
   
   The coverage bot was added in 0.19: https://github.com/apache/druid/pull/9863



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] ccaominh opened a new pull request #10397: Skip coverage check for tag builds

2020-09-14 Thread GitBox


ccaominh opened a new pull request #10397:
URL: https://github.com/apache/druid/pull/10397


   ### Description
   
   The code coverage diff calculation assumes the `TRAVIS_BRANCH` environment 
variable is the name of a branch; however, for tag builds it is the name of the 
tag so the diff calculation fails. Since builds triggered by tags do not have a 
code diff, the coverage check should be skipped to avoid the error and to save 
some CI resources.
   
   
   
   This PR has:
   - [x] been self-reviewed.
   - [x] added comments explaining the "why" and the intent of the code 
wherever would not be obvious for an unfamiliar reader.
   - [x] tested new tag does not cause coverage check to fail:
  - https://travis-ci.org/github/ccaominh/druid/builds/726445429
   - [x] tested coverage check is still able to flag uncovered changes in PR:
  - https://github.com/ccaominh/druid/pull/5
  - https://travis-ci.org/github/ccaominh/druid/jobs/726474544



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] mitchlloyd commented on pull request #10386: Upgrade AWS SDK

2020-09-14 Thread GitBox


mitchlloyd commented on pull request #10386:
URL: https://github.com/apache/druid/pull/10386#issuecomment-692269623


   @suneet-s I'm running the tests following [the 
documentation](https://github.com/apache/druid/blob/master/integration-tests/README.md#running-a-test-that-uses-cloud):
   
   However, I'm running into some issues. First, the documented command seems 
to run all tests, which fail at different points on different runs:
   
   ```sh
   mvn verify -P integration-tests \
 -Dgroups=s3-deep-storage \
 -Doverride.config.path=s3-creds-override \
 -Ddruid.test.config.streamEndpoint=kinesis.us-west-2.amazonaws.com \
 -Ddruid.test.config.cloudBucket= \
 -Ddruid.test.config.cloudPath=s3-tests/
   ```
   
   I added the `-pl integration-tests` option and this seems to start up Docker:
   
   ```sh
   mvn verify -pl integration-tests \
  -P integration-tests \
 -Dgroups=s3-deep-storage \
 -Doverride.config.path=s3-creds-override \
 -Ddruid.test.config.streamEndpoint=kinesis.us-west-2.amazonaws.com \
 -Ddruid.test.config.cloudBucket= \
 -Ddruid.test.config.cloudPath=s3-tests/
   ```
   
   I notice that during startup there is a message that says:
   
   `Could not set docker host IP - integration tests can not run`
   
   I also see this error:
   
   
 Error
 
   Step 8/36 : RUN find /var/lib/mysql -type f -exec touch {} \; && 
/etc/init.d/mysql start   && echo "CREATE USER 'druid'@'%' IDENTIFIED BY 
'diurd'; GRANT ALL ON druid.* TO 'druid'@'%'; CREATE database druid DEFAULT 
CHARACTER SET utf8mb4;" | mysql -u root   && /etc/init.d/mysql stop
---> Running in d25c117522f4
* /etc/init.d/mysql: ERROR: The partition with /var/lib/mysql is too full!
   The command '/bin/sh -c find /var/lib/mysql -type f -exec touch {} \; && 
/etc/init.d/mysql start   && echo "CREATE USER 'druid'@'%' IDENTIFIED BY 
'diurd'; GRANT ALL ON druid.* TO 'druid'@'%'; CREATE database druid DEFAULT 
CHARACTER SET utf8mb4;" | mysql -u root   && /etc/init.d/mysql stop' 
returned a non-zero code: 1
   Error response from daemon: No such container: druid-historical
   Error: No such container: druid-historical
   Error response from daemon: No such container: 
druid-historical-for-query-retry-test
   Error: No such container: druid-historical-for-query-retry-test
   Error response from daemon: No such container: druid-coordinator
   Error: No such container: druid-coordinator
   Error response from daemon: No such container: druid-overlord
   Error: No such container: druid-overlord
   Error response from daemon: No such container: druid-router
   Error: No such container: druid-router
   Error response from daemon: No such container: druid-router-permissive-tls
   Error: No such container: druid-router-permissive-tls
   Error response from daemon: No such container: 
druid-router-no-client-auth-tls
   Error: No such container: druid-router-no-client-auth-tls
   Error response from daemon: No such container: druid-router-custom-check-tls
   Error: No such container: druid-router-custom-check-tls
   Error response from daemon: No such container: druid-broker
   Error: No such container: druid-broker
   Error response from daemon: No such container: druid-middlemanager
   Error: No such container: druid-middlemanager
   Error response from daemon: No such container: druid-zookeeper-kafka
   Error: No such container: druid-zookeeper-kafka
   Error response from daemon: No such container: druid-metadata-storage
   Error: No such container: druid-metadata-storage
   Error response from daemon: No such container: druid-it-hadoop
   Error: No such container: druid-it-hadoop
   druid-it-net
 
   
   
   Possibly of note, after that the test stops at what I know to be a user 
prompt:
   
   ```
   Pulling druid-zookeeper-kafka (druid/cluster:)...
   The image for the service you're trying to recreate has been removed. If you 
continue, volume data could be lost. Consider backing up your data before 
continuing.
   ```
   
   I can get past that by pressing `y` and enter to bypass the prompt.
   
   Then I get:
   
   ```
   Pulling druid-zookeeper-kafka (druid/cluster:)...
   pull access denied for druid/cluster, repository does not exist or may 
require 'docker login': denied: requested access to the resource is denied
   ```
   
   After that, a test section runs but runs no tests:
   
   ```
   Tests run: 0, Failures: 0, Errors: 0, Skipped: 0
   ```
   
   Then I get errors from 
`org.apache.maven.plugins:maven-failsafe-plugin:2.22.0:verify`
   
   
 Error
 
   [INFO] BUILD FAILURE
   [INFO] 

   [INFO] Total time:  01:16 min
   [INFO] Finished at: 2020-09-14T12:27:30-07:00
   [INFO] 

   [ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-failsafe-plugin:2.22.0:verify 
(integration-tests) on 

[GitHub] [druid] clintropolis commented on a change in pull request #10370: add computed Expr output types

2020-09-14 Thread GitBox


clintropolis commented on a change in pull request #10370:
URL: https://github.com/apache/druid/pull/10370#discussion_r488172223



##
File path: 
core/src/main/java/org/apache/druid/math/expr/BinaryLogicalOperatorExpr.java
##
@@ -57,6 +57,17 @@ protected final double evalDouble(double left, double right)
 // Use Double.compare for more consistent NaN handling.
 return Evals.asDouble(Double.compare(left, right) < 0);
   }
+
+  @Nullable
+  @Override
+  public ExprType getOutputType(InputBindingTypes inputTypes)
+  {
+ExprType implicitCast = super.getOutputType(inputTypes);
+if (ExprType.STRING.equals(implicitCast)) {
+  return ExprType.LONG;
+}
+return implicitCast;

Review comment:
   > can you elaborate more on this? What will that look like?
   
   Ah, it's going to be based on using these methods; the processors will just 
be specialized to deal with the correct type based on the set of input types. 
Since the output type information isn't used for non-vectorized expressions, 
I'm trying to model this as the change I want to see in the world and will 
ensure that this matches the behavior of the vectorized expressions. That said, 
I've gone ahead and split operator auto conversion from function auto 
conversion and changed it to match existing behavior for now, in case that is 
necessary, and we can always consolidate them again in the future.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[druid-website] branch asf-site updated (1b56e53 -> 375870d)

2020-09-14 Thread vogievetsky
This is an automated email from the ASF dual-hosted git repository.

vogievetsky pushed a change to branch asf-site
in repository https://gitbox.apache.org/repos/asf/druid-website.git.


from 1b56e53  Merge pull request #94 from implydata/autobuild
 new 60c2ccd  Autobuild
 new 9adff91  restore files
 new 375870d  Merge pull request #95 from implydata/autobuild

The 239 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 index.html | 19 ++-
 1 file changed, 18 insertions(+), 1 deletion(-)


-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid-website] vogievetsky merged pull request #95: Autobuild

2020-09-14 Thread GitBox


vogievetsky merged pull request #95:
URL: https://github.com/apache/druid-website/pull/95


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid-website] vogievetsky opened a new pull request #95: Autobuild

2020-09-14 Thread GitBox


vogievetsky opened a new pull request #95:
URL: https://github.com/apache/druid-website/pull/95


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[druid-website-src] branch master updated: add 9/28 meetup

2020-09-14 Thread vogievetsky
This is an automated email from the ASF dual-hosted git repository.

vogievetsky pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/druid-website-src.git


The following commit(s) were added to refs/heads/master by this push:
 new 852c067  add 9/28 meetup
 new 96bc747  Merge pull request #161 from druid-matt/patch-14
852c067 is described below

commit 852c0673747103d00424d598874c0d5449d0b307
Author: Matt Sarrel <59710709+druid-m...@users.noreply.github.com>
AuthorDate: Mon Sep 14 11:33:22 2020 -0700

add 9/28 meetup
---
 _data/events.yml | 5 +
 1 file changed, 5 insertions(+)

diff --git a/_data/events.yml b/_data/events.yml
index 61edc77..1410bc3 100644
--- a/_data/events.yml
+++ b/_data/events.yml
@@ -9,6 +9,11 @@
   info: "Bay Area Apache Druid Meetup"
   link: https://www.meetup.com/druidio/events/272938350/
   
+- date: 2020-09-28
+  name: "September Office Hours: All About Apache Druid"
+  info: "Bay Area Apache Druid Meetup"
+  link: https://www.meetup.com/druidio/events/273120070/
+  
 - date: 2020-11-02
   name: "Druid Summit"
   info: "San Francisco Airport Marriott Waterfront, 1800 Old Bayshore Highway, 
Burlingame, CA"


-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[druid-website-src] branch master updated: fix typo on index.html

2020-09-14 Thread vogievetsky
This is an automated email from the ASF dual-hosted git repository.

vogievetsky pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/druid-website-src.git


The following commit(s) were added to refs/heads/master by this push:
 new 09e6842  fix typo on index.html
 new 02e699a  Merge pull request #160 from druid-matt/patch-13
09e6842 is described below

commit 09e6842649aa464f5521d56abe031ed546bfe1b7
Author: Matt Sarrel <59710709+druid-m...@users.noreply.github.com>
AuthorDate: Sun Sep 13 15:33:38 2020 -0700

fix typo on index.html

outpeform to outperform
---
 index.html | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/index.html b/index.html
index 7c66563..03f16c9 100644
--- a/index.html
+++ b/index.html
@@ -46,7 +46,7 @@ canonical: 'https://druid.apache.org/'
   
   Up to 100x faster than traditional solutions
   
-Druid has been benchmarked
 to greatly outpeform legacy solutions for data ingestion and data querying. 
Druid's novel architecture combines the best of data warehouses, timeseries 
databases, and search 
systems.
+Druid has been benchmarked
 to greatly outperform legacy solutions for data ingestion and data querying. 
Druid's novel architecture combines the best of data warehouses, timeseries 
databases, and search 
systems.
   
 
 


-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid-website-src] vogievetsky merged pull request #161: add 9/28 meetup

2020-09-14 Thread GitBox


vogievetsky merged pull request #161:
URL: https://github.com/apache/druid-website-src/pull/161


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid-website-src] vogievetsky merged pull request #160: fix typo on index.html

2020-09-14 Thread GitBox


vogievetsky merged pull request #160:
URL: https://github.com/apache/druid-website-src/pull/160


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid-website-src] druid-matt closed pull request #150: fix broken links to docs

2020-09-14 Thread GitBox


druid-matt closed pull request #150:
URL: https://github.com/apache/druid-website-src/pull/150


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid-website-src] druid-matt opened a new pull request #161: add 9/28 meetup

2020-09-14 Thread GitBox


druid-matt opened a new pull request #161:
URL: https://github.com/apache/druid-website-src/pull/161


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[druid] branch master updated (3d4b48e -> f5e2645)

2020-09-14 Thread suneet
This is an automated email from the ASF dual-hosted git repository.

suneet pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/druid.git.


from 3d4b48e  TransformSpecTest should extends InitializedNullHandlingTest 
(#10392)
 add f5e2645  Support SearchQueryDimFilter in sql via new methods (#10350)

No new revisions were added by this update.

Summary of changes:
 docs/misc/math-expr.md |   2 +
 docs/querying/sql.md   |   2 +
 .../CaseInsensitiveContainsExprMacro.java  |  63 ++
 .../druid/query/expression/ContainsExpr.java   | 101 +
 .../druid/query/expression/ContainsExprMacro.java  |  62 ++
 ...Test.java => CaseInsensitiveExprMacroTest.java} |  66 +++---
 ...prMacroTest.java => ContainsExprMacroTest.java} |  50 ++---
 .../org/apache/druid/guice/ExpressionModule.java   |   4 +
 ...ersion.java => ContainsOperatorConversion.java} |  93 ++---
 .../sql/calcite/planner/DruidOperatorTable.java|   3 +
 .../calcite/expression/ExpressionTestHelper.java   |  15 +-
 .../sql/calcite/expression/ExpressionsTest.java| 228 +
 website/.spelling  |   2 +
 13 files changed, 611 insertions(+), 80 deletions(-)
 create mode 100644 
processing/src/main/java/org/apache/druid/query/expression/CaseInsensitiveContainsExprMacro.java
 create mode 100644 
processing/src/main/java/org/apache/druid/query/expression/ContainsExpr.java
 create mode 100644 
processing/src/main/java/org/apache/druid/query/expression/ContainsExprMacro.java
 copy 
processing/src/test/java/org/apache/druid/query/expression/{RegexpLikeExprMacroTest.java
 => CaseInsensitiveExprMacroTest.java} (53%)
 copy 
processing/src/test/java/org/apache/druid/query/expression/{RegexpLikeExprMacroTest.java
 => ContainsExprMacroTest.java} (58%)
 copy 
sql/src/main/java/org/apache/druid/sql/calcite/expression/builtin/{RegexpLikeOperatorConversion.java
 => ContainsOperatorConversion.java} (51%)


-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] suneet-s merged pull request #10350: Support SearchQueryDimFilter in sql via new methods

2020-09-14 Thread GitBox


suneet-s merged pull request #10350:
URL: https://github.com/apache/druid/pull/10350


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] suneet-s commented on a change in pull request #10350: Support SearchQueryDimFilter in sql via new methods

2020-09-14 Thread GitBox


suneet-s commented on a change in pull request #10350:
URL: https://github.com/apache/druid/pull/10350#discussion_r488085625



##
File path: 
processing/src/test/java/org/apache/druid/query/expression/ContainsExprMacroTest.java
##
@@ -0,0 +1,142 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.expression;
+
+import com.google.common.collect.ImmutableMap;
+import org.apache.druid.common.config.NullHandling;
+import org.apache.druid.math.expr.ExprEval;
+import org.apache.druid.math.expr.ExprType;
+import org.apache.druid.math.expr.Parser;
+import org.junit.Assert;
+import org.junit.Test;
+
+public class ContainsExprMacroTest extends MacroTestBase
+{
+  public ContainsExprMacroTest()
+  {
+super(new ContainsExprMacro());
+  }
+
+  @Test
+  public void testErrorZeroArguments()
+  {
+expectException(IllegalArgumentException.class, "Function[contains_string] 
must have 2 arguments");
+eval("contains_string()", Parser.withMap(ImmutableMap.of()));
+  }
+
+  @Test
+  public void testErrorThreeArguments()
+  {
+expectException(IllegalArgumentException.class, "Function[contains_string] 
must have 2 arguments");
+eval("contains_string('a', 'b', 'c')", Parser.withMap(ImmutableMap.of()));
+  }
+
+  @Test
+  public void testMatch()
+  {
+final ExprEval result = eval("contains_string(a, 'oba')", 
Parser.withMap(ImmutableMap.of("a", "foobar")));
+Assert.assertEquals(
+ExprEval.of(true, ExprType.LONG).value(),
+result.value()
+);
+  }
+
+  @Test
+  public void testNoMatch()
+  {
+final ExprEval result = eval("contains_string(a, 'bar')", 
Parser.withMap(ImmutableMap.of("a", "foo")));
+Assert.assertEquals(
+ExprEval.of(false, ExprType.LONG).value(),
+result.value()
+);
+  }
+
+  @Test
+  public void testNullSearch()
+  {
+if (NullHandling.sqlCompatible()) {
+  expectException(IllegalArgumentException.class, 
"Function[contains_string] substring must be a string literal");
+}
+
+final ExprEval result = eval("contains_string(a, null)", 
Parser.withMap(ImmutableMap.of("a", "foo")));
+Assert.assertEquals(
+ExprEval.of(true, ExprType.LONG).value(),
+result.value()
+);
+  }
+
+  @Test
+  public void testEmptyStringSearch()
+  {
+final ExprEval result = eval("contains_string(a, '')", 
Parser.withMap(ImmutableMap.of("a", "foo")));
+Assert.assertEquals(
+ExprEval.of(true, ExprType.LONG).value(),
+result.value()
+);
+  }
+
+  @Test
+  public void testNullSearchOnEmptyString()
+  {
+if (NullHandling.sqlCompatible()) {
+  expectException(IllegalArgumentException.class, 
"Function[contains_string] substring must be a string literal");
+}
+
+final ExprEval result = eval("contains_string(a, null)", 
Parser.withMap(ImmutableMap.of("a", "")));
+Assert.assertEquals(
+ExprEval.of(true, ExprType.LONG).value(),
+result.value()
+);
+  }
+
+  @Test
+  public void testEmptyStringSearchOnEmptyString()
+  {
+final ExprEval result = eval("contains_string(a, '')", 
Parser.withMap(ImmutableMap.of("a", "")));
+Assert.assertEquals(
+ExprEval.of(true, ExprType.LONG).value(),
+result.value()
+);
+  }
+
+  @Test
+  public void testNullSearchOnNull()
+  {
+if (NullHandling.sqlCompatible()) {
+  expectException(IllegalArgumentException.class, 
"Function[contains_string] substring must be a string literal");
+}
+
+final ExprEval result = eval("contains_string(a, null)", 
Parser.withSuppliers(ImmutableMap.of("a", () -> null)));
+Assert.assertEquals(
+ExprEval.of(true, ExprType.LONG).value(),

Review comment:
   > In non-sql compatible mode, both of these will be empty instead of 
null and hence the function would return `true`.
   > In SQL compatible mode, we would see an exception since `null` is not a 
valid literal.
   
   Thanks for the explanation





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this 

[GitHub] [druid] clintropolis commented on a change in pull request #10327: benchmark for indexed table experiments

2020-09-14 Thread GitBox


clintropolis commented on a change in pull request #10327:
URL: https://github.com/apache/druid/pull/10327#discussion_r488079104



##
File path: 
benchmarks/src/test/java/org/apache/druid/benchmark/IndexedTableLoadingBenchmark.java
##
@@ -0,0 +1,97 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.benchmark;
+
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.ImmutableSet;
+import org.apache.druid.java.util.common.io.Closer;
+import org.apache.druid.segment.QueryableIndexSegment;
+import org.apache.druid.segment.join.table.IndexedTable;
+import org.openjdk.jmh.annotations.Benchmark;
+import org.openjdk.jmh.annotations.BenchmarkMode;
+import org.openjdk.jmh.annotations.Fork;
+import org.openjdk.jmh.annotations.Level;
+import org.openjdk.jmh.annotations.Measurement;
+import org.openjdk.jmh.annotations.Mode;
+import org.openjdk.jmh.annotations.OutputTimeUnit;
+import org.openjdk.jmh.annotations.Param;
+import org.openjdk.jmh.annotations.Scope;
+import org.openjdk.jmh.annotations.Setup;
+import org.openjdk.jmh.annotations.State;
+import org.openjdk.jmh.annotations.TearDown;
+import org.openjdk.jmh.annotations.Warmup;
+import org.openjdk.jmh.infra.Blackhole;
+
+import java.io.IOException;
+import java.util.List;
+import java.util.Set;
+import java.util.concurrent.TimeUnit;
+
+@State(Scope.Benchmark)
+@Fork(value = 1)
+@Warmup(iterations = 3)
+@Measurement(iterations = 5)
+public class IndexedTableLoadingBenchmark
+{
+  private static List<Set<String>> KEY_COLUMN_SETS = ImmutableList.of(
+  ImmutableSet.of("stringKey", "longKey")
+  );
+
+  @Param({"0"})
+  int keyColumns;
+
+  @Param({"5", "50", "500"})
+  int rowsPerSegment;
+
+  @Param({"segment"})
+  String indexedTableType;
+
+  Closer closer = Closer.create();
+
+  QueryableIndexSegment tableSegment = null;
+
+  @Setup(Level.Trial)
+  public void setup()
+  {
+tableSegment = 
IndexedTableJoinCursorBenchmark.makeQueryableIndexSegment(closer, "join", 
rowsPerSegment);
+  }
+
+  @TearDown
+  public void tearDown() throws IOException
+  {
+closer.close();
+  }
+
+  @Benchmark
+  @BenchmarkMode(Mode.AverageTime)
+  @OutputTimeUnit(TimeUnit.MILLISECONDS)
+  public void loadTable(Blackhole blackhole)
+  {
+try (
+IndexedTable table =

Review comment:
   Yeah, that makes sense; modified to close in a teardown method that runs 
after each iteration.
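   
   A sketch of that arrangement, assuming `IndexedTable` is `Closeable` and 
with a hypothetical `makeTable` helper standing in for however the benchmark 
actually builds the table:
   
   ```java
   // Sketch only: tables built inside the measured body are registered with a
   // Closer, and a per-iteration teardown releases them outside the timed path.
   private Closer iterationCloser = Closer.create();
   
   @Benchmark
   @BenchmarkMode(Mode.AverageTime)
   @OutputTimeUnit(TimeUnit.MILLISECONDS)
   public void loadTable(Blackhole blackhole)
   {
     // makeTable is a hypothetical helper; only table construction is measured.
     IndexedTable table = iterationCloser.register(makeTable(tableSegment));
     blackhole.consume(table);
   }
   
   @TearDown(Level.Iteration)
   public void tearDownIteration() throws IOException
   {
     // Runs after each measurement iteration, so closing never counts against the timing.
     iterationCloser.close();
     iterationCloser = Closer.create();
   }
   ```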





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] abhishekagarwal87 commented on a change in pull request #10370: add computed Expr output types

2020-09-14 Thread GitBox


abhishekagarwal87 commented on a change in pull request #10370:
URL: https://github.com/apache/druid/pull/10370#discussion_r488075026



##
File path: 
core/src/main/java/org/apache/druid/math/expr/BinaryLogicalOperatorExpr.java
##
@@ -57,6 +57,17 @@ protected final double evalDouble(double left, double right)
 // Use Double.compare for more consistent NaN handling.
 return Evals.asDouble(Double.compare(left, right) < 0);
   }
+
+  @Nullable
+  @Override
+  public ExprType getOutputType(InputBindingTypes inputTypes)
+  {
+ExprType implicitCast = super.getOutputType(inputTypes);
+if (ExprType.STRING.equals(implicitCast)) {
+  return ExprType.LONG;
+}
+return implicitCast;

Review comment:
   > This PR is setting up for vectorized expressions, which are going to 
be strongly typed before processing, which I think makes this 
inconsistent/confusing/magical behavior not necessary.
   
   Can you elaborate more on this? What will that look like?
   
   While I understand the right behavior is debatable, it may still be best to 
keep the type in sync with `eval` even if it's not very intuitive. It may mean 
that the logic for figuring out the type is not reusable as a single function 
like `ExprType.autoTypeConversion` and may have to be written differently for 
some operators. We can, later on, change the different functions, fixing both 
`eval` and `getOutputType` together; until then, they will stay in sync.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] gianm commented on pull request #10377: Include Sequence-building time in CPU time metric.

2020-09-14 Thread GitBox


gianm commented on pull request #10377:
URL: https://github.com/apache/druid/pull/10377#issuecomment-692168297


   > Nice. It would be useful to call this out in the Release notes as well.
   
   Good point, I added the "Release Notes" label.
   
   I haven't had time to fix the coverage issue yet, but I'll be able to get to 
it soonish.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] suneet-s commented on pull request #10386: Upgrade AWS SDK

2020-09-14 Thread GitBox


suneet-s commented on pull request #10386:
URL: https://github.com/apache/druid/pull/10386#issuecomment-692145515


   @mitchlloyd it looks like the license check job is complaining about the 
licenses file not being updated
   
   ```
   Error: found 3 missing licenses. These licenses are reported, but missing in 
the registry
   druid_module: kinesis-indexing-service, groupId: com.amazonaws, artifactId: 
jmespath-java, version: 1.11.854, license: Apache License version 2.0
   druid_module: kinesis-indexing-service, groupId: com.amazonaws, artifactId: 
aws-java-sdk-sts, version: 1.11.854, license: Apache License version 2.0
   druid_module: kinesis-indexing-service, groupId: com.amazonaws, artifactId: 
aws-java-sdk-kinesis, version: 1.11.854, license: Apache License version 2.0
   ```
   
   Have you been able to run the AWS integration tests with the new sdk version?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] AshishKapoor commented on a change in pull request #10357: [web-console] Update README.md

2020-09-14 Thread GitBox


AshishKapoor commented on a change in pull request #10357:
URL: https://github.com/apache/druid/pull/10357#discussion_r488034132



##
File path: web-console/README.md
##
@@ -25,7 +25,9 @@ This is the unified Druid web console that servers as a data 
management layer fo
 
 1. You need to be withing the `web-console` directory
 2. Install the modules with `npm install`
-3. Run `npm start` will start in development mode and will proxy druid 
requests to `localhost:`
+3. Run `npm run compile` to compile the scss files

Review comment:
   Wait, @suneet-s, let me verify this again. I believe it's because 
there's no Xcode on the machine in use.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] AshishKapoor commented on a change in pull request #10357: [web-console] Update README.md

2020-09-14 Thread GitBox


AshishKapoor commented on a change in pull request #10357:
URL: https://github.com/apache/druid/pull/10357#discussion_r488031187



##
File path: web-console/README.md
##
@@ -25,7 +25,9 @@ This is the unified Druid web console that servers as a data 
management layer fo
 
 1. You need to be withing the `web-console` directory
 2. Install the modules with `npm install`
-3. Run `npm start` will start in development mode and will proxy druid 
requests to `localhost:`
+3. Run `npm run compile` to compile the scss files

Review comment:
   Hi @suneet-s, as requested:
   ```
   Build failed with error code: 1
   npm WARN optional SKIPPING OPTIONAL DEPENDENCY: fsevents@1.2.9 
(node_modules/fsevents):
   npm WARN optional SKIPPING OPTIONAL DEPENDENCY: fsevents@1.2.9 install: 
`node install`
   npm WARN optional SKIPPING OPTIONAL DEPENDENCY: Exit status 1
   
   npm ERR! code ELIFECYCLE
   npm ERR! errno 1
   npm ERR! node-sass@4.13.1 postinstall: `node scripts/build.js`
   npm ERR! Exit status 1
   npm ERR! 
   npm ERR! Failed at the node-sass@4.13.1 postinstall script.
   npm ERR! This is probably not a problem with npm. There is likely additional 
logging output above.
   
   npm ERR! A complete log of this run can be found in:
   npm ERR! 
/Users/ashishkapoor/.npm/_logs/2020-09-14T15_36_01_509Z-debug.log
   ➜  web-console 
   ```





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] suneet-s commented on a change in pull request #10357: [web-console] Update README.md

2020-09-14 Thread GitBox


suneet-s commented on a change in pull request #10357:
URL: https://github.com/apache/druid/pull/10357#discussion_r488025313



##
File path: web-console/README.md
##
@@ -25,7 +25,9 @@ This is the unified Druid web console that servers as a data 
management layer fo
 
 1. You need to be withing the `web-console` directory
 2. Install the modules with `npm install`
-3. Run `npm start` will start in development mode and will proxy druid 
requests to `localhost:`
+3. Run `npm run compile` to compile the scss files

Review comment:
   I don't remember having to do this step when I ran the web console 
locally. Can you describe what wasn't working with the instructions before this 
change?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] suneet-s merged pull request #10392: TransformSpecTest should extends InitializedNullHandlingTest

2020-09-14 Thread GitBox


suneet-s merged pull request #10392:
URL: https://github.com/apache/druid/pull/10392


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] suneet-s closed issue #10391: Unit test fail due to missing extend InitializedNullHandlingTest

2020-09-14 Thread GitBox


suneet-s closed issue #10391:
URL: https://github.com/apache/druid/issues/10391


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[druid] branch master updated: TransformSpecTest should extends InitializedNullHandlingTest (#10392)

2020-09-14 Thread suneet
This is an automated email from the ASF dual-hosted git repository.

suneet pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/druid.git


The following commit(s) were added to refs/heads/master by this push:
 new 3d4b48e  TransformSpecTest should extends InitializedNullHandlingTest 
(#10392)
3d4b48e is described below

commit 3d4b48e0aa5109e3b09c6d1c8d04264f8d9121cd
Author: Cheng Pan <379377...@qq.com>
AuthorDate: Mon Sep 14 23:22:24 2020 +0800

TransformSpecTest should extends InitializedNullHandlingTest (#10392)
---
 .../test/java/org/apache/druid/segment/indexing/TransformSpecTest.java | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git 
a/server/src/test/java/org/apache/druid/segment/indexing/TransformSpecTest.java 
b/server/src/test/java/org/apache/druid/segment/indexing/TransformSpecTest.java
index 8102a71..763a924 100644
--- 
a/server/src/test/java/org/apache/druid/segment/indexing/TransformSpecTest.java
+++ 
b/server/src/test/java/org/apache/druid/segment/indexing/TransformSpecTest.java
@@ -35,12 +35,13 @@ import org.apache.druid.query.filter.SelectorDimFilter;
 import org.apache.druid.segment.TestHelper;
 import org.apache.druid.segment.transform.ExpressionTransform;
 import org.apache.druid.segment.transform.TransformSpec;
+import org.apache.druid.testing.InitializedNullHandlingTest;
 import org.junit.Assert;
 import org.junit.Test;
 
 import java.util.Map;
 
-public class TransformSpecTest
+public class TransformSpecTest extends InitializedNullHandlingTest
 {
   private static final MapInputRowParser PARSER = new MapInputRowParser(
   new TimeAndDimsParseSpec(


-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] a2l007 commented on pull request #10377: Include Sequence-building time in CPU time metric.

2020-09-14 Thread GitBox


a2l007 commented on pull request #10377:
URL: https://github.com/apache/druid/pull/10377#issuecomment-692096884


   Nice. It would be useful to call this out in the Release notes as well.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] tlblessing opened a new pull request #10396: Update index.md

2020-09-14 Thread GitBox


tlblessing opened a new pull request #10396:
URL: https://github.com/apache/druid/pull/10396


   Cool product.
   
   Minor suggested copy edits:
   - avoid excessive use of double quotes for terms that are well understood.
   - avoid future tense when possible.
   - fix missing punctuation.
   
   
   
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] a2l007 commented on pull request #9832: Disable sending server version in response headers

2020-09-14 Thread GitBox


a2l007 commented on pull request #9832:
URL: https://github.com/apache/druid/pull/9832#issuecomment-692057094


   Yeah, makes sense. I've removed the config and related changes.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] AshishKapoor commented on issue #10349: Druid data loader does not work with input time column name - '__time'

2020-09-14 Thread GitBox


AshishKapoor commented on issue #10349:
URL: https://github.com/apache/druid/issues/10349#issuecomment-692018903


   > `__time` is a reserved name representing the time column of the segment. 
Currently it requires a unix epoch value. You could change the property 
name of `__time` in json to another name.
   
   I've noticed it works with other databases.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] AshishKapoor edited a comment on issue #10349: Druid data loader does not work with input time column name - '__time'

2020-09-14 Thread GitBox


AshishKapoor edited a comment on issue #10349:
URL: https://github.com/apache/druid/issues/10349#issuecomment-692018903


   > `__time` is a reserved name representing the time column of the segment. 
Currently it requires a unix epoch value. You could change the property 
name of `__time` in json to another name.
   
   @FrankChen021 I've noticed it works with other databases.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] maytasm commented on a change in pull request #10371: Auto-compaction snapshot status API

2020-09-14 Thread GitBox


maytasm commented on a change in pull request #10371:
URL: https://github.com/apache/druid/pull/10371#discussion_r487839440



##
File path: server/src/main/java/org/apache/druid/server/coordinator/AutoCompactionSnapshot.java
##
@@ -0,0 +1,192 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.server.coordinator;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+
+import javax.annotation.Nullable;
+import javax.validation.constraints.NotNull;
+import java.util.Objects;
+
+public class AutoCompactionSnapshot
+{
+  public enum AutoCompactionScheduleStatus
+  {
+    NOT_ENABLED,
+    RUNNING
+  }
+
+  @JsonProperty
+  private String dataSource;
+  @JsonProperty
+  private AutoCompactionScheduleStatus scheduleStatus;
+  @JsonProperty
+  private String latestScheduledTaskId;
+  @JsonProperty
+  private long byteAwaitingCompaction;
+  @JsonProperty
+  private long byteProcessed;

Review comment:
   Done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] maytasm commented on a change in pull request #10371: Auto-compaction snapshot status API

2020-09-14 Thread GitBox


maytasm commented on a change in pull request #10371:
URL: https://github.com/apache/druid/pull/10371#discussion_r487838791



##
File path: server/src/main/java/org/apache/druid/server/coordinator/duty/EmitClusterStatsAndMetrics.java
##
@@ -296,18 +296,87 @@ public DruidCoordinatorRuntimeParams run(DruidCoordinatorRuntimeParams params)
 
     emitter.emit(
         new ServiceMetricEvent.Builder().build(
-            "compact/task/count",
+            "compact/task/scheduled/count",
             stats.getGlobalStat(CompactSegments.COMPACTION_TASK_COUNT)
         )
     );
 
+    emitter.emit(
+        new ServiceMetricEvent.Builder().build(
+            "compact/task/maxSlot/count",
+            stats.getGlobalStat(CompactSegments.MAX_COMPACTION_TASK_SLOT)
+        )
+    );
+
+    emitter.emit(
+        new ServiceMetricEvent.Builder().build(
+            "compact/task/availableSlot/count",
+            stats.getGlobalStat(CompactSegments.AVAILABLE_COMPACTION_TASK_SLOT)
+        )
+    );
+
+    stats.forEachDataSourceStat(
+        CompactSegments.TOTAL_SIZE_OF_SEGMENTS_AWAITING_COMPACTION,
+        (final String dataSource, final long count) -> {
+          emitter.emit(
+              new ServiceMetricEvent.Builder()
+                  .setDimension(DruidMetrics.DATASOURCE, dataSource)
+                  .build("segment/waitCompact/segmentByte", count)
+          );
+        }
+    );
+
+    stats.forEachDataSourceStat(
+        CompactSegments.TOTAL_COUNT_OF_SEGMENTS_AWAITING_COMPACTION,
+        (final String dataSource, final long count) -> {
+          emitter.emit(
+              new ServiceMetricEvent.Builder()
+                  .setDimension(DruidMetrics.DATASOURCE, dataSource)
+                  .build("segment/waitCompact/segmentCount", count)
+          );
+        }
+    );
+
+    stats.forEachDataSourceStat(
+        CompactSegments.TOTAL_INTERVAL_OF_SEGMENTS_AWAITING_COMPACTION,
+        (final String dataSource, final long count) -> {
+          emitter.emit(
+              new ServiceMetricEvent.Builder()
+                  .setDimension(DruidMetrics.DATASOURCE, dataSource)
+                  .build("segment/waitCompact/intervalCount", count)
+          );
+        }
+    );
+
+    stats.forEachDataSourceStat(
+        CompactSegments.TOTAL_SIZE_OF_SEGMENTS_COMPACTED,
+        (final String dataSource, final long count) -> {
+          emitter.emit(
+              new ServiceMetricEvent.Builder()
+                  .setDimension(DruidMetrics.DATASOURCE, dataSource)
+                  .build("segment/compacted/segmentByte", count)
+          );
+        }
+    );
+
+    stats.forEachDataSourceStat(
+        CompactSegments.TOTAL_COUNT_OF_SEGMENTS_COMPACTED,
+        (final String dataSource, final long count) -> {
+          emitter.emit(
+              new ServiceMetricEvent.Builder()
+                  .setDimension(DruidMetrics.DATASOURCE, dataSource)
+                  .build("segment/compacted/segmentCount", count)
+          );
+        }
+    );
+
     stats.forEachDataSourceStat(
-        "segmentsWaitCompact",
+        CompactSegments.TOTAL_INTERVAL_OF_SEGMENTS_COMPACTED,
         (final String dataSource, final long count) -> {
           emitter.emit(
               new ServiceMetricEvent.Builder()
                   .setDimension(DruidMetrics.DATASOURCE, dataSource)
-                  .build("segment/waitCompact/count", count)
+                  .build("segment/compacted/intervalCount", count)

Review comment:
   Yep  





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] maytasm commented on a change in pull request #10371: Auto-compaction snapshot status API

2020-09-14 Thread GitBox


maytasm commented on a change in pull request #10371:
URL: https://github.com/apache/druid/pull/10371#discussion_r487838589



##
File path: server/src/main/java/org/apache/druid/server/coordinator/duty/EmitClusterStatsAndMetrics.java
##
@@ -296,18 +296,87 @@ public DruidCoordinatorRuntimeParams run(DruidCoordinatorRuntimeParams params)
 
     emitter.emit(
         new ServiceMetricEvent.Builder().build(
-            "compact/task/count",
+            "compact/task/scheduled/count",

Review comment:
   Revert back





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] abhishekagarwal87 commented on a change in pull request #10366: Proposed changes for making joins cacheable

2020-09-14 Thread GitBox


abhishekagarwal87 commented on a change in pull request #10366:
URL: https://github.com/apache/druid/pull/10366#discussion_r487837336



##
File path: server/src/main/java/org/apache/druid/query/ResultLevelCachingQueryRunner.java
##
@@ -86,7 +92,15 @@ public ResultLevelCachingQueryRunner(
     if (useResultCache || populateResultCache) {
 
       final String cacheKeyStr = StringUtils.fromUtf8(strategy.computeResultLevelCacheKey(query));
-      final byte[] cachedResultSet = fetchResultsFromResultLevelCache(cacheKeyStr);
+      DataSourceAnalysis analysis = DataSourceAnalysis.forDataSource(query.getDataSource());
+      byte[] dataSourceCacheKey = Joinables.computeDataSourceCacheKey(analysis, joinableFactory).orElse(null);
+      if (null == dataSourceCacheKey) {
+        return baseRunner.run(
+            queryPlus,
+            responseContext
+        );
+      }
+      final byte[] cachedResultSet = fetchResultsFromResultLevelCache(cacheKeyStr, dataSourceCacheKey);

Review comment:
   Earlier, `cacheKeyStr` was being used as the namespace as well as the key, 
which was just unnecessary duplication. Internally, the namespace and key 
are composed together into another, bigger key. By the way, this too would 
cause a one-time cache bust on an upgrade. 
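
   To make that concrete, here is a minimal sketch of the namespace-plus-key 
composition (hypothetical class and method names, not Druid's actual cache 
code); the length prefix keeps the pair unambiguous:

   ```java
   import java.nio.ByteBuffer;
   import java.nio.charset.StandardCharsets;

   public final class ComposedCacheKey
   {
     private ComposedCacheKey() {}

     // Compose a namespace and a key into one cache key. The length prefix
     // guarantees that ("ab", "c") and ("a", "bc") produce different bytes.
     public static byte[] compose(String namespace, byte[] key)
     {
       final byte[] ns = namespace.getBytes(StandardCharsets.UTF_8);
       return ByteBuffer.allocate(Integer.BYTES + ns.length + key.length)
                        .putInt(ns.length)
                        .put(ns)
                        .put(key)
                        .array();
     }
   }
   ```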





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] abhishekagarwal87 commented on a change in pull request #10366: Proposed changes for making joins cacheable

2020-09-14 Thread GitBox


abhishekagarwal87 commented on a change in pull request #10366:
URL: https://github.com/apache/druid/pull/10366#discussion_r487835902



##
File path: processing/src/main/java/org/apache/druid/segment/join/table/BroadcastSegmentIndexedTable.java
##
@@ -241,6 +244,22 @@ public ColumnSelectorFactory makeColumnSelectorFactory(ReadableOffset offset, bo
 );
   }
 
+  @Override
+  public Optional<byte[]> computeCacheKey()
+  {
+    SegmentId segmentId = segment.getId();

Review comment:
   I wanted to use `CacheUtil` here, but that is in the `server` module and 
not accessible here. Maybe I can move that class to `core`?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] AshishKapoor closed issue #10349: Druid data loader does not work with input time column name - '__time'

2020-09-14 Thread GitBox


AshishKapoor closed issue #10349:
URL: https://github.com/apache/druid/issues/10349


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] abhishekagarwal87 commented on a change in pull request #10366: Proposed changes for making joins cacheable

2020-09-14 Thread GitBox


abhishekagarwal87 commented on a change in pull request #10366:
URL: https://github.com/apache/druid/pull/10366#discussion_r487819771



##
File path: processing/src/main/java/org/apache/druid/segment/join/Joinables.java
##
@@ -118,6 +126,47 @@ public static boolean isPrefixedBy(final String 
columnName, final String prefix)
 );
   }
 
+  /**
+   * Compute a cache key prefix for data sources that is to be used in segment level and result level caches.
+   * The data source can either be a base data source (clauses is empty) or the RHS of a join (clauses is
+   * non-empty). In both cases, a non-null cache key is returned. However, the cache key is empty if there is
+   * a join and some of the right data sources participating in the join do not support caching yet.
+   *
+   * @param dataSourceAnalysis
+   * @param joinableFactory
+   * @return
+   */
+  public static Optional<byte[]> computeDataSourceCacheKey(
+      final DataSourceAnalysis dataSourceAnalysis,
+      final JoinableFactory joinableFactory
+  )
+  {
+    final CacheKeyBuilder keyBuilder;
+    final List<PreJoinableClause> clauses = dataSourceAnalysis.getPreJoinableClauses();
+    if (clauses.isEmpty()) {
+      keyBuilder = new CacheKeyBuilder(REGULAR_OPERATION);

Review comment:
   You are right. I was thinking that it would only be a one-time cache bust, 
and that having a distinct prefix would eliminate the possibility of a cache 
key collision. 
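
   To illustrate the distinct-prefix point, a hedged sketch (the id bytes 
below are made up, not the PR's actual `REGULAR_OPERATION`/`JOIN_OPERATION` 
values, and it assumes `CacheKeyBuilder`'s usual `appendString`/`build` API): 
the two keys can never be equal, even for the same data source, because the 
builders start from different id bytes.

   ```java
   import org.apache.druid.query.cache.CacheKeyBuilder;

   class CacheKeyPrefixDemo
   {
     // Distinct first bytes partition the key space between plain reads and
     // joins, so these two keys can never collide with each other.
     static final byte[] REGULAR_KEY = new CacheKeyBuilder((byte) 0x00).appendString("myDataSource").build();
     static final byte[] JOIN_KEY = new CacheKeyBuilder((byte) 0x01).appendString("myDataSource").build();
   }
   ```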





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] jagtapsantosh opened a new issue #10395: Getting error when accessing materialized view query in web console and in logs using druid 0.19.0: Could not resolve type id 'view' as a subtype of org.apache.druid.query.Query

2020-09-14 Thread GitBox


jagtapsantosh opened a new issue #10395:
URL: https://github.com/apache/druid/issues/10395


   Hello Guys,
   I am using Druid 0.19.0 and trying to set up the Materialized View and 
Materialized View Selection features. I am following the instructions given at 
[this 
link](https://druid.apache.org/docs/latest/development/extensions-contrib/materialized-view.html)
 in the official Druid documentation. I have successfully loaded the 
materialized-view-selection extension on the Broker and the 
materialized-view-maintenance extension on the Overlord. I have set up Hadoop 
on the same machine as Druid and store the deep storage data in Hadoop.
   I was able to submit the derivativeDataSource supervisor spec and create 
the derivativeDataSource as described under Materialized-view-maintenance in 
the official Druid document. However, I am getting the following error while 
submitting the sample view-type query spec described under 
Materialized-view-selection at [this 
link](https://druid.apache.org/docs/latest/development/extensions-contrib/materialized-view.html).
   
   Exception Details: 
   com.fasterxml.jackson.databind.exc.InvalidTypeIdException: Could not resolve type id 'view' as a subtype of `org.apache.druid.query.Query`: known type ids = [dataSourceMetadata, groupBy, scan, search, segmentMetadata, select, timeBoundary, timeseries, topN]
at [Source: (org.eclipse.jetty.server.HttpInputOverHTTP); line: 1, column: 14]
   at com.fasterxml.jackson.databind.exc.InvalidTypeIdException.from(InvalidTypeIdException.java:43) ~[jackson-databind-2.10.2.jar:2.10.2]
   at com.fasterxml.jackson.databind.DeserializationContext.invalidTypeIdException(DeserializationContext.java:1758) ~[jackson-databind-2.10.2.jar:2.10.2]
   at com.fasterxml.jackson.databind.DeserializationContext.handleUnknownTypeId(DeserializationContext.java:1265) ~[jackson-databind-2.10.2.jar:2.10.2]
   at com.fasterxml.jackson.databind.jsontype.impl.TypeDeserializerBase._handleUnknownTypeId(TypeDeserializerBase.java:290) ~[jackson-databind-2.10.2.jar:2.10.2]
   at com.fasterxml.jackson.databind.jsontype.impl.TypeDeserializerBase._findDeserializer(TypeDeserializerBase.java:162) ~[jackson-databind-2.10.2.jar:2.10.2]
   at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer._deserializeTypedForId(AsPropertyTypeDeserializer.java:113) ~[jackson-databind-2.10.2.jar:2.10.2]
   at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer.deserializeTypedFromObject(AsPropertyTypeDeserializer.java:97) ~[jackson-databind-2.10.2.jar:2.10.2]
   at com.fasterxml.jackson.databind.deser.AbstractDeserializer.deserializeWithType(AbstractDeserializer.java:254) ~[jackson-databind-2.10.2.jar:2.10.2]
   at com.fasterxml.jackson.databind.deser.impl.TypeWrappedDeserializer.deserialize(TypeWrappedDeserializer.java:68) ~[jackson-databind-2.10.2.jar:2.10.2]
   at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:4202) ~[jackson-databind-2.10.2.jar:2.10.2]
   at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3242) ~[jackson-databind-2.10.2.jar:2.10.2]
   at org.apache.druid.server.AsyncQueryForwardingServlet.service(AsyncQueryForwardingServlet.java:235) [druid-server-0.19.0.jar:0.19.0]
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) [javax.servlet-api-3.1.0.jar:3.1.0]
   at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:865) [jetty-servlet-9.4.12.v20180830.jar:9.4.12.v20180830]
   at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1655) [jetty-servlet-9.4.12.v20180830.jar:9.4.12.v20180830]
   at org.apache.druid.server.security.PreResponseAuthorizationCheckFilter.doFilter(PreResponseAuthorizationCheckFilter.java:82) [druid-server-0.19.0.jar:0.19.0]
   at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1642) [jetty-servlet-9.4.12.v20180830.jar:9.4.12.v20180830]
   at org.apache.druid.server.security.AllowHttpMethodsResourceFilter.doFilter(AllowHttpMethodsResourceFilter.java:78) [druid-server-0.19.0.jar:0.19.0]
   at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1642) [jetty-servlet-9.4.12.v20180830.jar:9.4.12.v20180830]
   at org.apache.druid.server.security.AllowOptionsResourceFilter.doFilter(AllowOptionsResourceFilter.java:75) [druid-server-0.19.0.jar:0.19.0]
   at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1642) [jetty-servlet-9.4.12.v20180830.jar:9.4.12.v20180830]
   at org.apache.druid.server.security.AllowAllAuthenticator$1.doFilter(AllowAllAuthenticator.java:84) [druid-server-0.19.0.jar:0.19.0]
   at 

[GitHub] [druid] abhishekagarwal87 commented on a change in pull request #10366: Proposed changes for making joins cacheable

2020-09-14 Thread GitBox


abhishekagarwal87 commented on a change in pull request #10366:
URL: https://github.com/apache/druid/pull/10366#discussion_r487818427



##
File path: server/src/main/java/org/apache/druid/server/coordination/ServerManager.java
##
@@ -293,21 +302,28 @@ public ServerManager(
 queryMetrics -> queryMetrics.segment(segmentIdString)
 );
 
-    CachingQueryRunner<T> cachingQueryRunner = new CachingQueryRunner<>(
-        segmentIdString,
-        segmentDescriptor,
-        objectMapper,
-        cache,
-        toolChest,
-        metricsEmittingQueryRunnerInner,
-        cachePopulator,
-        cacheConfig
-    );
+    QueryRunner<T> queryRunner;

Review comment:
   I was thinking that the fewer places worry about the different meanings of 
the `cacheKeyPrefix` content, the better. So for CachingQueryRunner, 
`cacheKeyPrefix` is just a plain, dumb byte array. 
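
   A small sketch of what treating it as opaque buys (an illustrative helper, 
not the PR's code): the caching runner can prepend the prefix without ever 
inspecting its contents.

   ```java
   class OpaquePrefix
   {
     // Prepend an opaque prefix to a segment-level cache key. The runner never
     // needs to know whether the prefix came from a plain table or a join.
     static byte[] withPrefix(byte[] cacheKeyPrefix, byte[] segmentCacheKey)
     {
       final byte[] combined = new byte[cacheKeyPrefix.length + segmentCacheKey.length];
       System.arraycopy(cacheKeyPrefix, 0, combined, 0, cacheKeyPrefix.length);
       System.arraycopy(segmentCacheKey, 0, combined, cacheKeyPrefix.length, segmentCacheKey.length);
       return combined;
     }
   }
   ```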





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] pan3793 commented on pull request #9507: optionally disable all of hardcoded zookeeper use

2020-09-14 Thread GitBox


pan3793 commented on pull request #9507:
URL: https://github.com/apache/druid/pull/9507#issuecomment-691971352


   It would be nice if Druid supported k8s in the mainline. Will it be 
addressed in the next release?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] abhishekagarwal87 commented on a change in pull request #10366: Proposed changes for making joins cacheable

2020-09-14 Thread GitBox


abhishekagarwal87 commented on a change in pull request #10366:
URL: https://github.com/apache/druid/pull/10366#discussion_r48760



##
File path: server/src/main/java/org/apache/druid/segment/join/BroadcastTableJoinableFactory.java
##
@@ -76,4 +76,27 @@ public boolean isDirectlyJoinable(DataSource dataSource)
 }
 return Optional.empty();
   }
+
+  @Override
+  public Optional<byte[]> computeJoinCacheKey(DataSource dataSource)

Review comment:
   Yes, good point. 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] abhishekagarwal87 commented on a change in pull request #10366: Proposed changes for making joins cacheable

2020-09-14 Thread GitBox


abhishekagarwal87 commented on a change in pull request #10366:
URL: https://github.com/apache/druid/pull/10366#discussion_r487699662



##
File path: server/src/main/java/org/apache/druid/query/ResultLevelCachingQueryRunner.java
##
@@ -86,7 +92,15 @@ public ResultLevelCachingQueryRunner(
     if (useResultCache || populateResultCache) {
 
       final String cacheKeyStr = StringUtils.fromUtf8(strategy.computeResultLevelCacheKey(query));
-      final byte[] cachedResultSet = fetchResultsFromResultLevelCache(cacheKeyStr);
+      DataSourceAnalysis analysis = DataSourceAnalysis.forDataSource(query.getDataSource());
+      byte[] dataSourceCacheKey = Joinables.computeDataSourceCacheKey(analysis, joinableFactory).orElse(null);
+      if (null == dataSourceCacheKey) {
+        return baseRunner.run(
+            queryPlus,
+            responseContext
+        );
+      }
+      final byte[] cachedResultSet = fetchResultsFromResultLevelCache(cacheKeyStr, dataSourceCacheKey);

Review comment:
   Yeah, this is not well thought out right now. I will revisit this. 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] abhishekagarwal87 commented on a change in pull request #10366: Proposed changes for making joins cacheable

2020-09-14 Thread GitBox


abhishekagarwal87 commented on a change in pull request #10366:
URL: https://github.com/apache/druid/pull/10366#discussion_r487699118



##
File path: processing/src/main/java/org/apache/druid/segment/join/table/BroadcastSegmentIndexedTable.java
##
@@ -241,6 +244,22 @@ public ColumnSelectorFactory makeColumnSelectorFactory(ReadableOffset offset, bo
 );
   }
 
+  @Override
+  public Optional<byte[]> computeCacheKey()
+  {
+    SegmentId segmentId = segment.getId();

Review comment:
   Are you suggesting to move this logic to the `SegmentId` class?
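
   For illustration, that move could look roughly like the sketch below 
(assuming `SegmentId`'s `getDataSource()`/`getInterval()`/`getVersion()` 
accessors; the key-space byte is made up):

   ```java
   import org.apache.druid.query.cache.CacheKeyBuilder;
   import org.apache.druid.timeline.SegmentId;

   class SegmentIdCacheKeys
   {
     // Serialize the identifying fields of a segment into a cache key, so the
     // computation lives next to the identifier rather than in each table impl.
     static byte[] toCacheKey(SegmentId segmentId)
     {
       return new CacheKeyBuilder((byte) 0x02)
           .appendString(segmentId.getDataSource())
           .appendString(segmentId.getInterval().toString())
           .appendString(segmentId.getVersion())
           .build();
     }
   }
   ```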





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] abhishekagarwal87 commented on a change in pull request #10366: Proposed changes for making joins cacheable

2020-09-14 Thread GitBox


abhishekagarwal87 commented on a change in pull request #10366:
URL: https://github.com/apache/druid/pull/10366#discussion_r487698508



##
File path: processing/src/main/java/org/apache/druid/segment/join/MapJoinableFactory.java
##
@@ -80,4 +80,21 @@ public boolean isDirectlyJoinable(DataSource dataSource)
 }
 return maybeJoinable;
   }
+
+  @Override
+  public Optional<byte[]> computeJoinCacheKey(DataSource dataSource)

Review comment:
   Right now, as I see it, that additional info comes from the data source 
itself. The joinable factory decides based on the `dataSource` whether it 
can build the Joinable. Can you share any hypothetical example wherein one 
factory decides to build a Joinable but another one does not, on the basis 
of `JoinConditionAnalysis`?
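
   The delegation shape being described is roughly the following (a 
simplified sketch with an assumed, pared-down `JoinableFactory` interface, 
not the PR's exact code):

   ```java
   import java.util.Map;
   import java.util.Optional;
   import org.apache.druid.query.DataSource;

   class DelegatingJoinCacheKey
   {
     // Pick the factory registered for the data source's concrete class and
     // forward the cache-key computation to it; empty means "not cacheable".
     static Optional<byte[]> computeJoinCacheKey(
         DataSource dataSource,
         Map<Class<? extends DataSource>, JoinableFactory> factories
     )
     {
       final JoinableFactory factory = factories.get(dataSource.getClass());
       return factory == null ? Optional.empty() : factory.computeJoinCacheKey(dataSource);
     }

     interface JoinableFactory
     {
       Optional<byte[]> computeJoinCacheKey(DataSource dataSource);
     }
   }
   ```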





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] abhishekagarwal87 commented on a change in pull request #10366: Proposed changes for making joins cacheable

2020-09-14 Thread GitBox


abhishekagarwal87 commented on a change in pull request #10366:
URL: https://github.com/apache/druid/pull/10366#discussion_r487695326



##
File path: processing/src/main/java/org/apache/druid/segment/join/Joinables.java
##
@@ -118,6 +126,47 @@ public static boolean isPrefixedBy(final String 
columnName, final String prefix)
 );
   }
 
+  /**
+   * Compute a cache key prefix for data sources that is to be used in segment level and result level caches.
+   * The data source can either be a base data source (clauses is empty) or the RHS of a join (clauses is
+   * non-empty). In both cases, a non-null cache key is returned. However, the cache key is empty if there is
+   * a join and some of the right data sources participating in the join do not support caching yet.
+   *
+   * @param dataSourceAnalysis
+   * @param joinableFactory
+   * @return
+   */
+  public static Optional<byte[]> computeDataSourceCacheKey(
+      final DataSourceAnalysis dataSourceAnalysis,
+      final JoinableFactory joinableFactory
+  )
+  {
+    final CacheKeyBuilder keyBuilder;
+    final List<PreJoinableClause> clauses = dataSourceAnalysis.getPreJoinableClauses();
+    if (clauses.isEmpty()) {
+      keyBuilder = new CacheKeyBuilder(REGULAR_OPERATION);
+    } else {
+      keyBuilder = new CacheKeyBuilder(JOIN_OPERATION);
+      for (PreJoinableClause clause : clauses) {
+        if (!clause.getCondition().canHashJoin()) {

Review comment:
   I was thinking to keep it here so that I could just pass the 
`data-source` into `computeJoinCacheKey`. The engine for hash join works 
outside the boundaries of joinable factories, so it felt simpler to just 
keep it here and have joinable factories worry only about the data source. 
The other cache info from the condition (the original expression, etc.) is 
derived here in this class as well. 
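
   In other words, the condition's contribution stays in `Joinables`, along 
these lines (a sketch; the getter names are taken from the diff above and 
should be treated as assumptions):

   ```java
   import java.util.List;
   import org.apache.druid.query.cache.CacheKeyBuilder;
   import org.apache.druid.query.planning.PreJoinableClause;

   class JoinClauseKeys
   {
     // Fold each clause's prefix and the join condition's original expression
     // into the builder; joinable factories only ever see the data source.
     static void appendClauses(CacheKeyBuilder keyBuilder, List<PreJoinableClause> clauses)
     {
       for (PreJoinableClause clause : clauses) {
         keyBuilder.appendString(clause.getPrefix());
         keyBuilder.appendString(clause.getCondition().getOriginalExpression());
       }
     }
   }
   ```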





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] liran-funaro commented on issue #10321: [Proposal] Design for a Configurable Index Type

2020-09-14 Thread GitBox


liran-funaro commented on issue #10321:
URL: https://github.com/apache/druid/issues/10321#issuecomment-691852058


   @jihoonson Thank you for taking the time to review this proposal.
   
   >I'm assuming that those knobs will have default values and users usually 
don't have to worry about them. Is this right?
   
   Yes. They will be `@Nullable` and will be set to the default value in that 
case. E.g., for `OffheapIncrementalIndex.Spec`:
   ```java
   public Spec(
       final @JsonProperty("bufferSize") @Nullable Integer bufferSize,
       final @JsonProperty("cacheSize") @Nullable Integer cacheSize
   )
   {
     this.bufferSize = bufferSize != null && bufferSize > 0 ? bufferSize : DEFAULT_BUFFER_SIZE;
     this.cacheSize = cacheSize != null && cacheSize > this.bufferSize ? cacheSize : DEFAULT_CACHE_SIZE;
     // ...
   }
   ```
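
   For example (hypothetical usage, following the constructor above):

   ```java
   // Both knobs omitted: the spec falls back to DEFAULT_BUFFER_SIZE and DEFAULT_CACHE_SIZE.
   OffheapIncrementalIndex.Spec defaults = new OffheapIncrementalIndex.Spec(null, null);

   // Only bufferSize supplied: cacheSize still falls back because null fails the guard.
   OffheapIncrementalIndex.Spec custom = new OffheapIncrementalIndex.Spec(1 << 20, null);
   ```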
   
   > do you think it's stable enough or do you perhaps need to make some 
changes in the near future?
   
   Using the `@UnstableApi` annotation seems like a good idea at this point.
   
   > I'm not sure how much it's slower compared to groupBy v2 or how much using 
Oak can help with its performance. You may want to look into concurrent 
processing in groupBy v2 more closely such as `ConcurrentGrouper`.
   
   I'll look into that. Improving query performance is on our agenda, and we 
considered comparing the groupBy v1 with Oak to the groupBy v2 mechanism. We 
will report our results in a different issue in the future.
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org