[GitHub] [druid] maytasm commented on a change in pull request #10371: Auto-compaction snapshot status API
maytasm commented on a change in pull request #10371: URL: https://github.com/apache/druid/pull/10371#discussion_r488392899 ## File path: server/src/main/java/org/apache/druid/server/coordinator/AutoCompactionSnapshot.java ## @@ -0,0 +1,192 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.druid.server.coordinator; + +import com.fasterxml.jackson.annotation.JsonCreator; +import com.fasterxml.jackson.annotation.JsonProperty; + +import javax.annotation.Nullable; +import javax.validation.constraints.NotNull; +import java.util.Objects; + +public class AutoCompactionSnapshot +{ + public enum AutoCompactionScheduleStatus + { +NOT_ENABLED, +RUNNING + } + + @JsonProperty + private String dataSource; + @JsonProperty + private AutoCompactionScheduleStatus scheduleStatus; + @JsonProperty + private String latestScheduledTaskId; + @JsonProperty + private long byteCountAwaitingCompaction; Review comment: Im fine either way. "byteCount" matches it with the other Counts This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] maytasm commented on a change in pull request #10371: Auto-compaction snapshot status API
maytasm commented on a change in pull request #10371: URL: https://github.com/apache/druid/pull/10371#discussion_r488392581 ## File path: server/src/main/java/org/apache/druid/server/coordinator/duty/CompactSegments.java ## @@ -238,25 +272,102 @@ private CoordinatorStats makeStats(int numCompactionTasks, CompactionSegmentIter { final CoordinatorStats stats = new CoordinatorStats(); stats.addToGlobalStat(COMPACTION_TASK_COUNT, numCompactionTasks); -totalSizesOfSegmentsAwaitingCompactionPerDataSource = iterator.totalRemainingSegmentsSizeBytes(); - totalSizesOfSegmentsAwaitingCompactionPerDataSource.object2LongEntrySet().fastForEach( -entry -> { - final String dataSource = entry.getKey(); - final long totalSizeOfSegmentsAwaitingCompaction = entry.getLongValue(); - stats.addToDataSourceStat( - TOTAL_SIZE_OF_SEGMENTS_AWAITING_COMPACTION, - dataSource, - totalSizeOfSegmentsAwaitingCompaction - ); -} -); + +// Make sure that the iterator iterate through all the remaining segments so that we can get accurate and correct +// statistics (remaining, skipped, processed, etc.). The reason we have to do this explicitly here is because +// earlier (when we are iterating to submit compaction tasks) we may have ran out of task slot and were not able +// to iterate to the first segment that needs compaction for some datasource. +iterator.flushAllSegments(); Review comment: Made the change for auto compaction duty to the last of the set of Coordinator duty and change the scheduling of the Coordinator duty from scheduleWithFixedDelay to scheduleWithFixedRate. This should ensures that the other duty still get run at every 30 mins (assuming the auto compaction doesn't go pass 30 mins which I doubt it will). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] maytasm commented on a change in pull request #10371: Auto-compaction snapshot status API
maytasm commented on a change in pull request #10371: URL: https://github.com/apache/druid/pull/10371#discussion_r488392405 ## File path: server/src/main/java/org/apache/druid/server/coordinator/AutoCompactionSnapshot.java ## @@ -0,0 +1,192 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.druid.server.coordinator; + +import com.fasterxml.jackson.annotation.JsonCreator; +import com.fasterxml.jackson.annotation.JsonProperty; + +import javax.annotation.Nullable; +import javax.validation.constraints.NotNull; +import java.util.Objects; + +public class AutoCompactionSnapshot +{ + public enum AutoCompactionScheduleStatus + { +NOT_ENABLED, +RUNNING + } + + @JsonProperty + private String dataSource; + @JsonProperty + private AutoCompactionScheduleStatus scheduleStatus; + @JsonProperty + private String latestScheduledTaskId; + @JsonProperty + private long byteCountAwaitingCompaction; Review comment: Initially I had it as "bytes" but @jihoonson suggested "byteCount". This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] maytasm commented on pull request #10371: Auto-compaction snapshot status API
maytasm commented on pull request #10371: URL: https://github.com/apache/druid/pull/10371#issuecomment-692471451 > > I was thinking of "snapshot" as the snapshot taken at the point in time the API is called. I think "status" also works. Let me know if you still think "status" is better. I'm open to both > > I think "status" is better, "snapshot" could be misinterpreted as snapshotting the data itself, vs getting a snapshot of the stats Changed the API endpoint to "status" This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] maytasm commented on a change in pull request #10371: Auto-compaction snapshot status API
maytasm commented on a change in pull request #10371: URL: https://github.com/apache/druid/pull/10371#discussion_r488390448 ## File path: server/src/main/java/org/apache/druid/server/coordinator/duty/CompactSegments.java ## @@ -238,25 +272,102 @@ private CoordinatorStats makeStats(int numCompactionTasks, CompactionSegmentIter { final CoordinatorStats stats = new CoordinatorStats(); stats.addToGlobalStat(COMPACTION_TASK_COUNT, numCompactionTasks); -totalSizesOfSegmentsAwaitingCompactionPerDataSource = iterator.totalRemainingSegmentsSizeBytes(); - totalSizesOfSegmentsAwaitingCompactionPerDataSource.object2LongEntrySet().fastForEach( -entry -> { - final String dataSource = entry.getKey(); - final long totalSizeOfSegmentsAwaitingCompaction = entry.getLongValue(); - stats.addToDataSourceStat( - TOTAL_SIZE_OF_SEGMENTS_AWAITING_COMPACTION, - dataSource, - totalSizeOfSegmentsAwaitingCompaction - ); -} -); + +// Make sure that the iterator iterate through all the remaining segments so that we can get accurate and correct +// statistics (remaining, skipped, processed, etc.). The reason we have to do this explicitly here is because +// earlier (when we are iterating to submit compaction tasks) we may have ran out of task slot and were not able +// to iterate to the first segment that needs compaction for some datasource. +iterator.flushAllSegments(); Review comment: I also thought of having two implementation of statistic calculation. We can have the old method which is just a sum of all segments from the earliest to the segment that we iterated up to. We can call this approximation and the downside is that it can overestimate if there are compacted intervals between the earliest to the segment that we iterated up to. Then we can have this new implementation which is an exact calculation (i.e. we check each and every interval). Each datasource can choose which implementation to use for its stats calculation. However, I discussed this with @jihoonson and we think that this new implementation is actually ok to use as a default for all datasources. This code is run as part of Coordinator duty hence it is not on critical path and there is no client waiting, etc. While it is going through all segments, it is doing this at a interval level (we do batch of segments for each interval at a time). Thus, the looping is for every interval of every datasources (not every segments). Also, the auto compaction is run every 30 mins by default. This should be plenty of time for this code to finish before the next cycle begins. Furthermore, we can move the auto compaction duty to the last of the set of Coordinator duty and change the scheduling of the Coordinator duty from scheduleWithFixedDelay to scheduleWithFixedRate. This should ensures that the other duty still get run at every 30 mins (assuming the auto compaction doesn't go pass 30 mins which I doubt it will). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] maytasm commented on a change in pull request #10371: Auto-compaction snapshot status API
maytasm commented on a change in pull request #10371: URL: https://github.com/apache/druid/pull/10371#discussion_r488385897 ## File path: server/src/main/java/org/apache/druid/server/coordinator/duty/NewestSegmentFirstIterator.java ## @@ -112,27 +114,38 @@ } @Override - public Object2LongOpenHashMap totalRemainingSegmentsSizeBytes() + public Map totalRemainingStatistics() { -final Object2LongOpenHashMap resultMap = new Object2LongOpenHashMap<>(); -resultMap.defaultReturnValue(UNKNOWN_TOTAL_REMAINING_SEGMENTS_SIZE); -for (QueueEntry entry : queue) { - final VersionedIntervalTimeline timeline = dataSources.get(entry.getDataSource()); - final Interval interval = new Interval(timeline.first().getInterval().getStart(), entry.interval.getEnd()); - - final List> holders = timeline.lookup(interval); - - long size = 0; - for (DataSegment segment : FluentIterable - .from(holders) - .transformAndConcat(TimelineObjectHolder::getObject) - .transform(PartitionChunk::getObject)) { -size += segment.getSize(); - } +return remainingSegments; + } + + @Override + public Map totalProcessedStatistics() + { +return processedSegments; + } - resultMap.put(entry.getDataSource(), size); + @Override + public void flushAllSegments() + { +if (queue.isEmpty()) { + return; +} +QueueEntry entry; +while ((entry = queue.poll()) != null) { + final List resultSegments = entry.segments; + final String dataSourceName = resultSegments.get(0).getDataSource(); + // This entry was in the queue, meaning that it was not processed. Hence, also aggregates it's + // statistic to the remaining segments counts. + collectSegmentStatistics(remainingSegments, dataSourceName, new SegmentsToCompact(entry.segments)); + final CompactibleTimelineObjectHolderCursor compactibleTimelineObjectHolderCursor = timelineIterators.get( + dataSourceName + ); + // WARNING: This iterates the compactibleTimelineObjectHolderCursor. + // Since this method is intended to only be use after all necessary iteration is done on this iterator Review comment: I mean that this method (`flushAllSegments`), will iterates the `compactibleTimelineObjectHolderCursor` in `iterateAllSegments` method call. This class `NewestSegmentFirstIterator` when use as a iterator in the next() method will also iterate the `compactibleTimelineObjectHolderCursor`. Hence, this iterator (`NewestSegmentFirstIterator) cannot be use to iterate after this method (`flushAllSegments`) is called. Basically, you cannot call `flushAllSegments` while iterating the NewestSegmentFirstIterator and you cannot call `flushAllSegments` then go back to iterating the NewestSegmentFirstIterator This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] jon-wei commented on pull request #10371: Auto-compaction snapshot status API
jon-wei commented on pull request #10371: URL: https://github.com/apache/druid/pull/10371#issuecomment-692462764 > I was thinking of "snapshot" as the snapshot taken at the point in time the API is called. I think "status" also works. Let me know if you still think "status" is better. I'm open to both I think "status" is better, "snapshot" could be misinterpreted as snapshotting the data itself, vs getting a snapshot of the stats This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] jon-wei commented on a change in pull request #10371: Auto-compaction snapshot status API
jon-wei commented on a change in pull request #10371: URL: https://github.com/apache/druid/pull/10371#discussion_r488322171 ## File path: server/src/main/java/org/apache/druid/server/coordinator/AutoCompactionSnapshot.java ## @@ -0,0 +1,192 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.druid.server.coordinator; + +import com.fasterxml.jackson.annotation.JsonCreator; +import com.fasterxml.jackson.annotation.JsonProperty; + +import javax.annotation.Nullable; +import javax.validation.constraints.NotNull; +import java.util.Objects; + +public class AutoCompactionSnapshot +{ + public enum AutoCompactionScheduleStatus + { +NOT_ENABLED, +RUNNING + } + + @JsonProperty + private String dataSource; + @JsonProperty + private AutoCompactionScheduleStatus scheduleStatus; + @JsonProperty + private String latestScheduledTaskId; + @JsonProperty + private long byteCountAwaitingCompaction; Review comment: suggest just "bytes" instead of "byteCount" for the property name ## File path: server/src/main/java/org/apache/druid/server/coordinator/duty/NewestSegmentFirstIterator.java ## @@ -112,27 +114,38 @@ } @Override - public Object2LongOpenHashMap totalRemainingSegmentsSizeBytes() + public Map totalRemainingStatistics() { -final Object2LongOpenHashMap resultMap = new Object2LongOpenHashMap<>(); -resultMap.defaultReturnValue(UNKNOWN_TOTAL_REMAINING_SEGMENTS_SIZE); -for (QueueEntry entry : queue) { - final VersionedIntervalTimeline timeline = dataSources.get(entry.getDataSource()); - final Interval interval = new Interval(timeline.first().getInterval().getStart(), entry.interval.getEnd()); - - final List> holders = timeline.lookup(interval); - - long size = 0; - for (DataSegment segment : FluentIterable - .from(holders) - .transformAndConcat(TimelineObjectHolder::getObject) - .transform(PartitionChunk::getObject)) { -size += segment.getSize(); - } +return remainingSegments; + } + + @Override + public Map totalProcessedStatistics() + { +return processedSegments; + } - resultMap.put(entry.getDataSource(), size); + @Override + public void flushAllSegments() + { +if (queue.isEmpty()) { + return; +} +QueueEntry entry; +while ((entry = queue.poll()) != null) { + final List resultSegments = entry.segments; + final String dataSourceName = resultSegments.get(0).getDataSource(); + // This entry was in the queue, meaning that it was not processed. Hence, also aggregates it's + // statistic to the remaining segments counts. + collectSegmentStatistics(remainingSegments, dataSourceName, new SegmentsToCompact(entry.segments)); + final CompactibleTimelineObjectHolderCursor compactibleTimelineObjectHolderCursor = timelineIterators.get( + dataSourceName + ); + // WARNING: This iterates the compactibleTimelineObjectHolderCursor. + // Since this method is intended to only be use after all necessary iteration is done on this iterator Review comment: I don't think I understand this comment, if all iteration is done (by that do you mean `compactibleTimelineObjectHolderCursor.hasNext` returns false?), then iterateAllSegments would do nothing. ## File path: server/src/main/java/org/apache/druid/server/coordinator/duty/CompactSegments.java ## @@ -238,25 +272,102 @@ private CoordinatorStats makeStats(int numCompactionTasks, CompactionSegmentIter { final CoordinatorStats stats = new CoordinatorStats(); stats.addToGlobalStat(COMPACTION_TASK_COUNT, numCompactionTasks); -totalSizesOfSegmentsAwaitingCompactionPerDataSource = iterator.totalRemainingSegmentsSizeBytes(); - totalSizesOfSegmentsAwaitingCompactionPerDataSource.object2LongEntrySet().fastForEach( -entry -> { - final String dataSource = entry.getKey(); - final long totalSizeOfSegmentsAwaitingCompaction = entry.getLongValue(); - stats.addToDataSourceStat( - TOTAL_SIZE_OF_SEGMENTS_AWAITING_COMPACTION, - dataSource, -
[GitHub] [druid] tianzhipeng-git commented on issue #10384: OutOfMemoryError when hadoop ingest
tianzhipeng-git commented on issue #10384: URL: https://github.com/apache/druid/issues/10384#issuecomment-692437817 How many reducers does this hadoop task have? And which `segmentGranularity` and `partitionsSpec` you use? Maybe you should adjust your ingestion json to make `more segments`, see [partition](https://druid.apache.org/docs/latest/ingestion/index.html#partitioning) and [hadoop-base partition](https://druid.apache.org/docs/latest/ingestion/hadoop.html#partitionsspec) (BTW, `mapreduce.map.memory.mb` must be a little bigger than `mapred.map.child.java.opts`, same as reduce) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[druid] branch master updated (5751d0e -> e465f05)
This is an automated email from the ASF dual-hosted git repository. cwylie pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/druid.git. from 5751d0e Skip coverage check for tag builds (#10397) add e465f05 Web console: Improve number alignment in tables (#10389) No new revisions were added by this update. Summary of changes: .../e2e-tests/component/datasources/datasource.ts | 2 +- .../e2e-tests/component/datasources/overview.ts| 12 +- web-console/e2e-tests/tutorial-batch.spec.ts | 2 +- web-console/e2e-tests/util/table.ts| 7 +- web-console/src/bootstrap/react-table-defaults.tsx | 16 +- .../__snapshots__/braced-text.spec.tsx.snap| 18 ++ .../braced-text/braced-text.scss} | 32 ++- .../braced-text/braced-text.spec.tsx} | 9 +- .../src/components/braced-text/braced-text.tsx | 55 + .../datasource-columns-table.tsx | 6 +- web-console/src/components/index.ts| 1 + .../__snapshots__/segment-timeline.spec.tsx.snap | 8 +- .../segment-timeline/segment-timeline.tsx | 8 +- .../src/components/show-history/show-history.tsx | 5 - web-console/src/components/show-json/show-json.tsx | 6 +- .../dialogs/retention-dialog/retention-dialog.tsx | 5 - web-console/src/utils/general.tsx | 8 +- .../__snapshots__/datasource-view.spec.tsx.snap| 85 +-- .../src/views/datasource-view/datasource-view.tsx | 250 ++--- .../views/home-view/status-card/status-card.tsx| 7 +- web-console/src/views/query-view/query-view.tsx| 2 +- .../src/views/segments-view/segments-view.tsx | 62 +++-- 22 files changed, 396 insertions(+), 210 deletions(-) create mode 100644 web-console/src/components/braced-text/__snapshots__/braced-text.spec.tsx.snap copy web-console/src/{console-application.scss => components/braced-text/braced-text.scss} (72%) copy web-console/src/{views/load-data-view/load-data-view.spec.tsx => components/braced-text/braced-text.spec.tsx} (81%) create mode 100644 web-console/src/components/braced-text/braced-text.tsx - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] clintropolis merged pull request #10389: Web console: Improve number alignment in tables
clintropolis merged pull request #10389: URL: https://github.com/apache/druid/pull/10389 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[druid] branch master updated (f71ba6f -> 5751d0e)
This is an automated email from the ASF dual-hosted git repository. fjy pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/druid.git. from f71ba6f Vectorized ANY aggregators (#10338) add 5751d0e Skip coverage check for tag builds (#10397) No new revisions were added by this update. Summary of changes: .travis.yml | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] fjy merged pull request #10397: Skip coverage check for tag builds
fjy merged pull request #10397: URL: https://github.com/apache/druid/pull/10397 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[druid] branch master updated (e012d5c -> f71ba6f)
This is an automated email from the ASF dual-hosted git repository. suneet pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/druid.git. from e012d5c allow vectorized query engines to utilize vectorized virtual columns (#10388) add f71ba6f Vectorized ANY aggregators (#10338) No new revisions were added by this update. Summary of changes: docs/querying/query-context.md | 4 +- .../any/DoubleAnyAggregatorFactory.java| 21 +++ .../DoubleAnyVectorAggregator.java}| 29 +-- .../aggregation/any/FloatAnyAggregatorFactory.java | 21 +++ ...gregator.java => FloatAnyVectorAggregator.java} | 46 ++--- .../aggregation/any/LongAnyAggregatorFactory.java | 21 +++ ...ggregator.java => LongAnyVectorAggregator.java} | 45 ++--- .../any/NumericAnyVectorAggregator.java| 139 ++ .../any/NumericNilVectorAggregator.java| 113 +++ .../any/StringAnyAggregatorFactory.java| 30 +++ .../aggregation/any/StringAnyVectorAggregator.java | 145 ++ .../any/DoubleAnyAggregatorFactoryTest.java| 101 ++ .../any/DoubleAnyVectorAggregatorTest.java | 104 ++ .../any/FloatAnyAggregatorFactoryTest.java | 101 ++ .../any/FloatAnyVectorAggregatorTest.java | 104 ++ .../any/LongAnyAggregatorFactoryTest.java | 101 ++ .../any/LongAnyVectorAggregatorTest.java | 103 ++ .../any/NumericAnyVectorAggregatorTest.java| 210 + .../any/StringAnyAggregatorFactoryTest.java| 110 +++ .../any/StringAnyVectorAggregatorTest.java | 189 +++ .../apache/druid/sql/calcite/CalciteQueryTest.java | 31 +-- website/.spelling | 4 + 22 files changed, 1679 insertions(+), 93 deletions(-) copy processing/src/main/java/org/apache/druid/query/aggregation/{DoubleMaxBufferAggregator.java => any/DoubleAnyVectorAggregator.java} (53%) copy processing/src/main/java/org/apache/druid/query/aggregation/any/{LongAnyAggregator.java => FloatAnyVectorAggregator.java} (52%) copy processing/src/main/java/org/apache/druid/query/aggregation/any/{DoubleAnyAggregator.java => LongAnyVectorAggregator.java} (53%) create mode 100644 processing/src/main/java/org/apache/druid/query/aggregation/any/NumericAnyVectorAggregator.java create mode 100644 processing/src/main/java/org/apache/druid/query/aggregation/any/NumericNilVectorAggregator.java create mode 100644 processing/src/main/java/org/apache/druid/query/aggregation/any/StringAnyVectorAggregator.java create mode 100644 processing/src/test/java/org/apache/druid/query/aggregation/any/DoubleAnyAggregatorFactoryTest.java create mode 100644 processing/src/test/java/org/apache/druid/query/aggregation/any/DoubleAnyVectorAggregatorTest.java create mode 100644 processing/src/test/java/org/apache/druid/query/aggregation/any/FloatAnyAggregatorFactoryTest.java create mode 100644 processing/src/test/java/org/apache/druid/query/aggregation/any/FloatAnyVectorAggregatorTest.java create mode 100644 processing/src/test/java/org/apache/druid/query/aggregation/any/LongAnyAggregatorFactoryTest.java create mode 100644 processing/src/test/java/org/apache/druid/query/aggregation/any/LongAnyVectorAggregatorTest.java create mode 100644 processing/src/test/java/org/apache/druid/query/aggregation/any/NumericAnyVectorAggregatorTest.java create mode 100644 processing/src/test/java/org/apache/druid/query/aggregation/any/StringAnyAggregatorFactoryTest.java create mode 100644 processing/src/test/java/org/apache/druid/query/aggregation/any/StringAnyVectorAggregatorTest.java - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] suneet-s merged pull request #10338: Vectorized ANY aggregators
suneet-s merged pull request #10338: URL: https://github.com/apache/druid/pull/10338 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] suneet-s commented on a change in pull request #10388: allow vectorized query engines to utilize vectorized virtual columns
suneet-s commented on a change in pull request #10388: URL: https://github.com/apache/druid/pull/10388#discussion_r488348990 ## File path: processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/vector/VectorGroupByEngine.java ## @@ -70,24 +73,47 @@ public static boolean canVectorize( @Nullable final Filter filter ) { -// Multi-value dimensions are not yet supported. -// -// Two notes here about how we're handling this check: -// 1) After multi-value dimensions are supported, we could alter "GroupByQueryEngineV2.isAllSingleValueDims" -// to accept a ColumnSelectorFactory, which makes more sense than using a StorageAdapter (see #8013). -// 2) Technically using StorageAdapter here is bad since it only looks at real columns, but they might -// be shadowed by virtual columns (again, see #8013). But it's fine for now since adapter.canVectorize -// always returns false if there are any virtual columns. -// -// This situation should sort itself out pretty well once this engine supports multi-valued columns. Then we -// won't have to worry about having this all-single-value-dims check here. - -return GroupByQueryEngineV2.isAllSingleValueDims(adapter::getColumnCapabilities, query.getDimensions(), true) +Function capabilitiesFunction = name -> +query.getVirtualColumns().getColumnCapabilitiesWithFallback(adapter, name); + +return canVectorizeDimensions(capabilitiesFunction, query.getDimensions()) && query.getDimensions().stream().allMatch(DimensionSpec::canVectorize) && query.getAggregatorSpecs().stream().allMatch(aggregatorFactory -> aggregatorFactory.canVectorize(adapter)) && adapter.canVectorize(filter, query.getVirtualColumns(), false); } + public static boolean canVectorizeDimensions( + final Function capabilitiesFunction, + final List dimensions + ) + { +return dimensions +.stream() +.allMatch( +dimension -> { + if (dimension.mustDecorate()) { +// group by on multi value dimensions are not currently supported +// DimensionSpecs that decorate may turn singly-valued columns into multi-valued selectors. +// To be safe, we must return false here. +return false; + } + + // Now check column capabilities. + final ColumnCapabilities columnCapabilities = capabilitiesFunction.apply(dimension.getDimension()); + // null here currently means the column does not exist, nil columns can be vectorized + if (columnCapabilities == null) { +return true; + } + // strings must be single valued, dictionary encoded, and have unique dictionary entries + if (ValueType.STRING.equals(columnCapabilities.getType())) { +return columnCapabilities.hasMultipleValues().isFalse() && Review comment: Well, I guess the example I gave is equivalent. Maybe a better way to ask that question is should `UNKNOWN` be treated as multi-value or single value? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] suneet-s commented on a change in pull request #10388: allow vectorized query engines to utilize vectorized virtual columns
suneet-s commented on a change in pull request #10388: URL: https://github.com/apache/druid/pull/10388#discussion_r488348359 ## File path: processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/vector/VectorGroupByEngine.java ## @@ -70,24 +73,47 @@ public static boolean canVectorize( @Nullable final Filter filter ) { -// Multi-value dimensions are not yet supported. -// -// Two notes here about how we're handling this check: -// 1) After multi-value dimensions are supported, we could alter "GroupByQueryEngineV2.isAllSingleValueDims" -// to accept a ColumnSelectorFactory, which makes more sense than using a StorageAdapter (see #8013). -// 2) Technically using StorageAdapter here is bad since it only looks at real columns, but they might -// be shadowed by virtual columns (again, see #8013). But it's fine for now since adapter.canVectorize -// always returns false if there are any virtual columns. -// -// This situation should sort itself out pretty well once this engine supports multi-valued columns. Then we -// won't have to worry about having this all-single-value-dims check here. - -return GroupByQueryEngineV2.isAllSingleValueDims(adapter::getColumnCapabilities, query.getDimensions(), true) +Function capabilitiesFunction = name -> +query.getVirtualColumns().getColumnCapabilitiesWithFallback(adapter, name); + +return canVectorizeDimensions(capabilitiesFunction, query.getDimensions()) && query.getDimensions().stream().allMatch(DimensionSpec::canVectorize) && query.getAggregatorSpecs().stream().allMatch(aggregatorFactory -> aggregatorFactory.canVectorize(adapter)) && adapter.canVectorize(filter, query.getVirtualColumns(), false); } + public static boolean canVectorizeDimensions( + final Function capabilitiesFunction, + final List dimensions + ) + { +return dimensions +.stream() +.allMatch( +dimension -> { + if (dimension.mustDecorate()) { +// group by on multi value dimensions are not currently supported +// DimensionSpecs that decorate may turn singly-valued columns into multi-valued selectors. +// To be safe, we must return false here. +return false; + } + + // Now check column capabilities. + final ColumnCapabilities columnCapabilities = capabilitiesFunction.apply(dimension.getDimension()); + // null here currently means the column does not exist, nil columns can be vectorized + if (columnCapabilities == null) { +return true; + } + // strings must be single valued, dictionary encoded, and have unique dictionary entries + if (ValueType.STRING.equals(columnCapabilities.getType())) { +return columnCapabilities.hasMultipleValues().isFalse() && Review comment: How do you decide between `columnCapabilities.hasMultipleValues().isFalse()` and `!columnCapabilities.hasMultipleValues().isMaybeTrue()` - I'm never sure which one to check This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[druid] branch master updated (184b202 -> e012d5c)
This is an automated email from the ASF dual-hosted git repository. cwylie pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/druid.git. from 184b202 add computed Expr output types (#10370) add e012d5c allow vectorized query engines to utilize vectorized virtual columns (#10388) No new revisions were added by this update. Summary of changes: .../epinephelinae/GroupByQueryEngineV2.java| 21 +- .../epinephelinae/vector/VectorGroupByEngine.java | 52 +++- .../QueryableIndexCursorSequenceBuilder.java | 51 ++-- .../segment/QueryableIndexStorageAdapter.java | 5 +- .../org/apache/druid/segment/VirtualColumn.java| 183 +-- .../org/apache/druid/segment/VirtualColumns.java | 249 +++ .../QueryableIndexVectorColumnSelectorFactory.java | 225 +- .../virtual/AlwaysTwoCounterAggregatorFactory.java | 257 .../virtual/AlwaysTwoVectorizedVirtualColumn.java | 321 .../virtual/VectorizedVirtualColumnTest.java | 334 + 10 files changed, 1490 insertions(+), 208 deletions(-) create mode 100644 processing/src/test/java/org/apache/druid/segment/virtual/AlwaysTwoCounterAggregatorFactory.java create mode 100644 processing/src/test/java/org/apache/druid/segment/virtual/AlwaysTwoVectorizedVirtualColumn.java create mode 100644 processing/src/test/java/org/apache/druid/segment/virtual/VectorizedVirtualColumnTest.java - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] clintropolis merged pull request #10388: allow vectorized query engines to utilize vectorized virtual columns
clintropolis merged pull request #10388: URL: https://github.com/apache/druid/pull/10388 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[druid] branch master updated (084b23d -> 184b202)
This is an automated email from the ASF dual-hosted git repository. cwylie pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/druid.git. from 084b23d benchmark for indexed table experiments (#10327) add 184b202 add computed Expr output types (#10370) No new revisions were added by this update. Summary of changes: .../org/apache/druid/math/expr/ApplyFunction.java | 86 +- .../druid/math/expr/BinaryLogicalOperatorExpr.java | 79 +- .../apache/druid/math/expr/BinaryOperatorExpr.java | 9 +- .../org/apache/druid/math/expr/ConstantExpr.java | 41 +- .../main/java/org/apache/druid/math/expr/Expr.java | 73 +- .../java/org/apache/druid/math/expr/ExprEval.java | 5 + .../apache/druid/math/expr/ExprListenerImpl.java | 2 +- .../org/apache/druid/math/expr/ExprMacroTable.java | 14 +- .../java/org/apache/druid/math/expr/ExprType.java | 148 +++- .../java/org/apache/druid/math/expr/Function.java | 933 ++--- .../org/apache/druid/math/expr/FunctionalExpr.java | 55 +- .../org/apache/druid/math/expr/IdentifierExpr.java | 10 +- .../java/org/apache/druid/math/expr/Parser.java| 28 +- .../apache/druid/math/expr/UnaryOperatorExpr.java | 21 +- .../java/org/apache/druid/math/expr/ExprTest.java | 40 +- .../org/apache/druid/math/expr/OutputTypeTest.java | 463 ++ .../org/apache/druid/math/expr/ParserTest.java | 8 +- .../query/expressions/BloomFilterExprMacro.java| 10 +- .../druid/query/expression/ContainsExpr.java | 12 +- .../expression/IPv4AddressMatchExprMacro.java | 10 +- .../expression/IPv4AddressParseExprMacro.java | 9 + .../expression/IPv4AddressStringifyExprMacro.java | 9 + .../druid/query/expression/LikeExprMacro.java | 10 +- .../druid/query/expression/LookupExprMacro.java| 9 + .../query/expression/RegexpExtractExprMacro.java | 9 + .../query/expression/RegexpLikeExprMacro.java | 12 +- .../query/expression/TimestampCeilExprMacro.java | 16 + .../expression/TimestampExtractExprMacro.java | 15 + .../query/expression/TimestampFloorExprMacro.java | 16 + .../query/expression/TimestampFormatExprMacro.java | 9 + .../query/expression/TimestampParseExprMacro.java | 9 + .../query/expression/TimestampShiftExprMacro.java | 16 + .../druid/query/expression/TrimExprMacro.java | 18 +- .../druid/segment/filter/ExpressionFilter.java | 4 +- .../join/filter/JoinFilterCorrelations.java| 4 +- .../druid/segment/virtual/ExpressionSelectors.java | 57 +- .../RowBasedExpressionColumnValueSelector.java | 10 +- .../query/expression/RegexpLikeExprMacroTest.java | 17 +- .../builtin/ReductionOperatorConversionHelper.java | 4 +- .../apache/druid/sql/calcite/rel/Projection.java | 4 +- 40 files changed, 1838 insertions(+), 466 deletions(-) create mode 100644 core/src/test/java/org/apache/druid/math/expr/OutputTypeTest.java - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[druid] branch master updated (084b23d -> 184b202)
This is an automated email from the ASF dual-hosted git repository. cwylie pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/druid.git. from 084b23d benchmark for indexed table experiments (#10327) add 184b202 add computed Expr output types (#10370) No new revisions were added by this update. Summary of changes: .../org/apache/druid/math/expr/ApplyFunction.java | 86 +- .../druid/math/expr/BinaryLogicalOperatorExpr.java | 79 +- .../apache/druid/math/expr/BinaryOperatorExpr.java | 9 +- .../org/apache/druid/math/expr/ConstantExpr.java | 41 +- .../main/java/org/apache/druid/math/expr/Expr.java | 73 +- .../java/org/apache/druid/math/expr/ExprEval.java | 5 + .../apache/druid/math/expr/ExprListenerImpl.java | 2 +- .../org/apache/druid/math/expr/ExprMacroTable.java | 14 +- .../java/org/apache/druid/math/expr/ExprType.java | 148 +++- .../java/org/apache/druid/math/expr/Function.java | 933 ++--- .../org/apache/druid/math/expr/FunctionalExpr.java | 55 +- .../org/apache/druid/math/expr/IdentifierExpr.java | 10 +- .../java/org/apache/druid/math/expr/Parser.java| 28 +- .../apache/druid/math/expr/UnaryOperatorExpr.java | 21 +- .../java/org/apache/druid/math/expr/ExprTest.java | 40 +- .../org/apache/druid/math/expr/OutputTypeTest.java | 463 ++ .../org/apache/druid/math/expr/ParserTest.java | 8 +- .../query/expressions/BloomFilterExprMacro.java| 10 +- .../druid/query/expression/ContainsExpr.java | 12 +- .../expression/IPv4AddressMatchExprMacro.java | 10 +- .../expression/IPv4AddressParseExprMacro.java | 9 + .../expression/IPv4AddressStringifyExprMacro.java | 9 + .../druid/query/expression/LikeExprMacro.java | 10 +- .../druid/query/expression/LookupExprMacro.java| 9 + .../query/expression/RegexpExtractExprMacro.java | 9 + .../query/expression/RegexpLikeExprMacro.java | 12 +- .../query/expression/TimestampCeilExprMacro.java | 16 + .../expression/TimestampExtractExprMacro.java | 15 + .../query/expression/TimestampFloorExprMacro.java | 16 + .../query/expression/TimestampFormatExprMacro.java | 9 + .../query/expression/TimestampParseExprMacro.java | 9 + .../query/expression/TimestampShiftExprMacro.java | 16 + .../druid/query/expression/TrimExprMacro.java | 18 +- .../druid/segment/filter/ExpressionFilter.java | 4 +- .../join/filter/JoinFilterCorrelations.java| 4 +- .../druid/segment/virtual/ExpressionSelectors.java | 57 +- .../RowBasedExpressionColumnValueSelector.java | 10 +- .../query/expression/RegexpLikeExprMacroTest.java | 17 +- .../builtin/ReductionOperatorConversionHelper.java | 4 +- .../apache/druid/sql/calcite/rel/Projection.java | 4 +- 40 files changed, 1838 insertions(+), 466 deletions(-) create mode 100644 core/src/test/java/org/apache/druid/math/expr/OutputTypeTest.java - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] clintropolis commented on a change in pull request #10370: add computed Expr output types
clintropolis commented on a change in pull request #10370: URL: https://github.com/apache/druid/pull/10370#discussion_r488324915 ## File path: core/src/main/java/org/apache/druid/math/expr/BinaryLogicalOperatorExpr.java ## @@ -57,6 +57,17 @@ protected final double evalDouble(double left, double right) // Use Double.compare for more consistent NaN handling. return Evals.asDouble(Double.compare(left, right) < 0); } + + @Nullable + @Override + public ExprType getOutputType(InputBindingTypes inputTypes) + { +ExprType implicitCast = super.getOutputType(inputTypes); +if (ExprType.STRING.equals(implicitCast)) { + return ExprType.LONG; +} +return implicitCast; Review comment: I'm going to merge this PR, but we can continue this discussion in the next one, so this stuff can actually be tied into things in a concrete manner. Looking at postgres, its math functions do, at least somewhat, allow for implicit conversion of sting numbers to numbers. So `2.0 + '3.1' = 5.1` and `cos('1')` work, but `'2.0' + '3.1'` does not. In druid currently the first would have the same behavior as postgres, but the 2nd would be null or 0 depending on the value of `druid.generic.useDefaultValueForNull`, while the 3rd expression would do string concatenation. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] clintropolis merged pull request #10370: add computed Expr output types
clintropolis merged pull request #10370: URL: https://github.com/apache/druid/pull/10370 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] suneet-s commented on a change in pull request #10390: Vectorized variance aggregators
suneet-s commented on a change in pull request #10390: URL: https://github.com/apache/druid/pull/10390#discussion_r488319436 ## File path: extensions-core/stats/src/main/java/org/apache/druid/query/aggregation/variance/VarianceAggregatorFactory.java ## @@ -162,10 +167,36 @@ public BufferAggregator factorizeBuffered(ColumnSelectorFactory metricFactory) throw new IAE( "Incompatible type for metric[%s], expected a float, double, long, or variance, but got a %s", fieldName, -inputType +type ); } + @Override + public VectorAggregator factorizeVector(VectorColumnSelectorFactory selectorFactory) + { +final String type = getTypeString(selectorFactory); +if (ValueType.FLOAT.name().equalsIgnoreCase(type)) { Review comment: Fix was easy for everything. Done with some tests. Waiting on a full review, and I'll push up a new patch. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] jihoonson commented on issue #10321: [Proposal] Design for a Configurable Index Type
jihoonson commented on issue #10321: URL: https://github.com/apache/druid/issues/10321#issuecomment-692390625 @liran-funaro thanks for your answer. > I'll look into that. Improving query performance is on our agenda, and we considered comparing the groupBy v1 with Oak to the groupBy v2 mechanism. We will report our results in a different issue in the future. Sounds good. I'm looking forward to seeing your report. :slightly_smiling_face: This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] clintropolis commented on a change in pull request #10390: Vectorized variance aggregators
clintropolis commented on a change in pull request #10390: URL: https://github.com/apache/druid/pull/10390#discussion_r488305025 ## File path: extensions-core/stats/src/main/java/org/apache/druid/query/aggregation/variance/VarianceAggregatorFactory.java ## @@ -162,10 +167,36 @@ public BufferAggregator factorizeBuffered(ColumnSelectorFactory metricFactory) throw new IAE( "Incompatible type for metric[%s], expected a float, double, long, or variance, but got a %s", fieldName, -inputType +type ); } + @Override + public VectorAggregator factorizeVector(VectorColumnSelectorFactory selectorFactory) + { +final String type = getTypeString(selectorFactory); +if (ValueType.FLOAT.name().equalsIgnoreCase(type)) { Review comment: Ah, this is actually only true if `inputType` isn't specified, so it's not _never_ true, just will never be true if the input type isn't explicitly specified as `"variance"`, and cant autodetect from its inputs. I think it would be nice if you fix it up at least for the vectorized codepath, but would also probably be nice to just fix it up for all of them if it isn't too much trouble, and I don't think waiting on #10277 is necessary since there are a handful of aggregator factories which could be improved to have a stronger check on their complex input type. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] suneet-s commented on a change in pull request #10390: Vectorized variance aggregators
suneet-s commented on a change in pull request #10390: URL: https://github.com/apache/druid/pull/10390#discussion_r488292797 ## File path: extensions-core/stats/src/main/java/org/apache/druid/query/aggregation/variance/VarianceAggregatorFactory.java ## @@ -162,10 +167,36 @@ public BufferAggregator factorizeBuffered(ColumnSelectorFactory metricFactory) throw new IAE( "Incompatible type for metric[%s], expected a float, double, long, or variance, but got a %s", fieldName, -inputType +type ); } + @Override + public VectorAggregator factorizeVector(VectorColumnSelectorFactory selectorFactory) + { +final String type = getTypeString(selectorFactory); +if (ValueType.FLOAT.name().equalsIgnoreCase(type)) { Review comment: Can I file a separate bug for this issue since it appears to be a bug independent of the vectorization code path? I can look at fixing that issue after #10277 goes in This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] suneet-s commented on a change in pull request #10338: Vectorized ANY aggregators
suneet-s commented on a change in pull request #10338: URL: https://github.com/apache/druid/pull/10338#discussion_r488280699 ## File path: processing/src/main/java/org/apache/druid/query/aggregation/any/NumericNilVectorAggregator.java ## @@ -0,0 +1,146 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.druid.query.aggregation.any; + +import org.apache.druid.common.config.NullHandling; +import org.apache.druid.query.aggregation.VectorAggregator; + +import javax.annotation.Nullable; +import java.nio.ByteBuffer; + +/** + * A vector aggregator that returns the default numeric value. + */ +public abstract class NumericNilVectorAggregator implements VectorAggregator Review comment: I decided not to extend `NoopVectorAggregator` since it's constructor was private, and I didn't want to think through what the right inheritance model should be. I did take your suggestion on just instantiating 1 singleton for each numeric type to reduce the boiler plate code. New patch incoming... This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] clintropolis commented on a change in pull request #10390: Vectorized variance aggregators
clintropolis commented on a change in pull request #10390: URL: https://github.com/apache/druid/pull/10390#discussion_r488175527 ## File path: extensions-core/stats/src/main/java/org/apache/druid/query/aggregation/variance/VarianceAggregatorFactory.java ## @@ -162,10 +167,36 @@ public BufferAggregator factorizeBuffered(ColumnSelectorFactory metricFactory) throw new IAE( "Incompatible type for metric[%s], expected a float, double, long, or variance, but got a %s", fieldName, -inputType +type ); } + @Override + public VectorAggregator factorizeVector(VectorColumnSelectorFactory selectorFactory) + { +final String type = getTypeString(selectorFactory); +if (ValueType.FLOAT.name().equalsIgnoreCase(type)) { Review comment: Oh, this isn't actually correct. This method is backed by `getType`, which means `VARIANCE_TYPE_NAME.equalsIgnoreCase(type)` will never be true. I suggest instead of using `getTypeString` to get the column capabilities, and then this enum can switch on `getType` directly instead of converting them to and from strings, and if type is `ValueType.COMPLEX` then to treat it as the object aggregator factory. After #10277 goes in, then we can check `getComplexTypeName` matches `VARIANCE_TYPE_NAME`. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] clintropolis commented on a change in pull request #10338: Vectorized ANY aggregators
clintropolis commented on a change in pull request #10338: URL: https://github.com/apache/druid/pull/10338#discussion_r488262030 ## File path: processing/src/main/java/org/apache/druid/query/aggregation/any/NumericNilVectorAggregator.java ## @@ -0,0 +1,146 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.druid.query.aggregation.any; + +import org.apache.druid.common.config.NullHandling; +import org.apache.druid.query.aggregation.VectorAggregator; + +import javax.annotation.Nullable; +import java.nio.ByteBuffer; + +/** + * A vector aggregator that returns the default numeric value. + */ +public abstract class NumericNilVectorAggregator implements VectorAggregator Review comment: nit: can extend `NoopVectorAggregator` to reduce boilerplate. I also don't think you need separate classes for each type, since the number is an object anyway, can just instantiate each singleton using `NullHandling.defaultLongValue()` etc. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] clintropolis merged pull request #10327: benchmark for indexed table experiments
clintropolis merged pull request #10327: URL: https://github.com/apache/druid/pull/10327 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[druid] branch master updated (f5e2645 -> 084b23d)
This is an automated email from the ASF dual-hosted git repository. cwylie pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/druid.git. from f5e2645 Support SearchQueryDimFilter in sql via new methods (#10350) add 084b23d benchmark for indexed table experiments (#10327) No new revisions were added by this update. Summary of changes: .../benchmark/IndexedTableJoinCursorBenchmark.java | 416 + ...mark.java => IndexedTableLoadingBenchmark.java} | 83 ++-- 2 files changed, 460 insertions(+), 39 deletions(-) create mode 100644 benchmarks/src/test/java/org/apache/druid/benchmark/IndexedTableJoinCursorBenchmark.java copy benchmarks/src/test/java/org/apache/druid/benchmark/{ConsistentHasherBenchmark.java => IndexedTableLoadingBenchmark.java} (52%) - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] ccaominh commented on pull request #10397: Skip coverage check for tag builds
ccaominh commented on pull request #10397: URL: https://github.com/apache/druid/pull/10397#issuecomment-692336327 > LGTM > > I found some tags that had failed CI > `druid-0.19.0-rc1` -> https://travis-ci.org/github/apache/druid/builds/707574622 > `druid-0.19.0` -> https://travis-ci.org/github/apache/druid/builds/709850305 > > It looks like the code coverage bot did not run for the 0.18 tag > `druid-0.18.1` -> https://travis-ci.org/github/apache/druid/builds/686661101 The coverage bot was added in 0.19: https://github.com/apache/druid/pull/9863 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] ccaominh opened a new pull request #10397: Skip coverage check for tag builds
ccaominh opened a new pull request #10397: URL: https://github.com/apache/druid/pull/10397 ### Description The code coverage diff calculation assumes the `TRAVIS_BRANCH` environment variable is the name of a branch; however, for tag builds it is the name of the tag so the diff calculation fails. Since builds triggered by tags do not have a code diff, the coverage check should be skipped to avoid the error and to save some CI resources. This PR has: - [x] been self-reviewed. - [x] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader. - [x] tested new tag does not cause coverage check to fail: - https://travis-ci.org/github/ccaominh/druid/builds/726445429 - [x] tested coverage check is still able to flag uncovered changes in PR: - https://github.com/ccaominh/druid/pull/5 - https://travis-ci.org/github/ccaominh/druid/jobs/726474544 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] mitchlloyd commented on pull request #10386: Upgrade AWS SDK
mitchlloyd commented on pull request #10386: URL: https://github.com/apache/druid/pull/10386#issuecomment-692269623 @suneet-s I'm running the tests following [the documentation](https://github.com/apache/druid/blob/master/integration-tests/README.md#running-a-test-that-uses-cloud): However I'm running into some issues. First the documented command to run seems to run all tests which fail at different points on different runs: ```sh mvn verify -P integration-tests \ -Dgroups=s3-deep-storage \ -Doverride.config.path=s3-creds-override \ -Ddruid.test.config.streamEndpoint=kinesis.us-west-2.amazonaws.com \ -Ddruid.test.config.cloudBucket= \ -Ddruid.test.config.cloudPath=s3-tests/ ``` I added the `-pl integration-tests` option and this seems to start up Docker: ```sh mvn verify -pl integration-tests \ -P integration-tests \ -Dgroups=s3-deep-storage \ -Doverride.config.path=s3-creds-override \ -Ddruid.test.config.streamEndpoint=kinesis.us-west-2.amazonaws.com \ -Ddruid.test.config.cloudBucket= \ -Ddruid.test.config.cloudPath=s3-tests/ ``` I notice that during start up there is a message that says: `Could not set docker host IP - integration tests can not run` I also see this error: Error Step 8/36 : RUN find /var/lib/mysql -type f -exec touch {} \; && /etc/init.d/mysql start && echo "CREATE USER 'druid'@'%' IDENTIFIED BY 'diurd'; GRANT ALL ON druid.* TO 'druid'@'%'; CREATE database druid DEFAULT CHARACTER SET utf8mb4;" | mysql -u root && /etc/init.d/mysql stop ---> Running in d25c117522f4 * /etc/init.d/mysql: ERROR: The partition with /var/lib/mysql is too full! The command '/bin/sh -c find /var/lib/mysql -type f -exec touch {} \; && /etc/init.d/mysql start && echo "CREATE USER 'druid'@'%' IDENTIFIED BY 'diurd'; GRANT ALL ON druid.* TO 'druid'@'%'; CREATE database druid DEFAULT CHARACTER SET utf8mb4;" | mysql -u root && /etc/init.d/mysql stop' returned a non-zero code: 1 Error response from daemon: No such container: druid-historical Error: No such container: druid-historical Error response from daemon: No such container: druid-historical-for-query-retry-test Error: No such container: druid-historical-for-query-retry-test Error response from daemon: No such container: druid-coordinator Error: No such container: druid-coordinator Error response from daemon: No such container: druid-overlord Error: No such container: druid-overlord Error response from daemon: No such container: druid-router Error: No such container: druid-router Error response from daemon: No such container: druid-router-permissive-tls Error: No such container: druid-router-permissive-tls Error response from daemon: No such container: druid-router-no-client-auth-tls Error: No such container: druid-router-no-client-auth-tls Error response from daemon: No such container: druid-router-custom-check-tls Error: No such container: druid-router-custom-check-tls Error response from daemon: No such container: druid-broker Error: No such container: druid-broker Error response from daemon: No such container: druid-middlemanager Error: No such container: druid-middlemanager Error response from daemon: No such container: druid-zookeeper-kafka Error: No such container: druid-zookeeper-kafka Error response from daemon: No such container: druid-metadata-storage Error: No such container: druid-metadata-storage Error response from daemon: No such container: druid-it-hadoop Error: No such container: druid-it-hadoop druid-it-net Possibly of note, after that the test stops with what I know is a user prompt: ``` Pulling druid-zookeeper-kafka (druid/cluster:)... The image for the service you're trying to recreate has been removed. If you continue, volume data could be lost. Consider backing up your data before continuing. ``` I can get past that by pressing `y` and enter to bypass the prompt. Then I get ``` Pulling druid-zookeeper-kafka (druid/cluster:)... pull access denied for druid/cluster, repository does not exist or may require 'docker login': denied: requested access to the resource is denied ``` After that a test section runs but runs no tests ``` Tests run: 0, Failures: 0, Errors: 0, Skipped: 0 ``` Then I get errors from `org.apache.maven.plugins:maven-failsafe-plugin:2.22.0:verify` Error [INFO] BUILD FAILURE [INFO] [INFO] Total time: 01:16 min [INFO] Finished at: 2020-09-14T12:27:30-07:00 [INFO] [ERROR] Failed to execute goal org.apache.maven.plugins:maven-failsafe-plugin:2.22.0:verify (integration-tests) on
[GitHub] [druid] clintropolis commented on a change in pull request #10370: add computed Expr output types
clintropolis commented on a change in pull request #10370: URL: https://github.com/apache/druid/pull/10370#discussion_r488172223 ## File path: core/src/main/java/org/apache/druid/math/expr/BinaryLogicalOperatorExpr.java ## @@ -57,6 +57,17 @@ protected final double evalDouble(double left, double right) // Use Double.compare for more consistent NaN handling. return Evals.asDouble(Double.compare(left, right) < 0); } + + @Nullable + @Override + public ExprType getOutputType(InputBindingTypes inputTypes) + { +ExprType implicitCast = super.getOutputType(inputTypes); +if (ExprType.STRING.equals(implicitCast)) { + return ExprType.LONG; +} +return implicitCast; Review comment: >can you elaborate more on this? How will that look like? Ah, its going to be based on using these methods, just the processors will be specialized to deal with the correct type based on the set of input types. Since the output type information isn't used for non-vectorized expressions, I'm trying to model this as the change I want to see in the world and will ensure that this matches the behavior of the vectorized expressions, but I've gone ahead and split operator from function auto conversion and changed it to match existing behavior for now in case it is necessary, and can always consolidate them again in the future. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[druid-website] branch asf-site updated (1b56e53 -> 375870d)
This is an automated email from the ASF dual-hosted git repository. vogievetsky pushed a change to branch asf-site in repository https://gitbox.apache.org/repos/asf/druid-website.git. from 1b56e53 Merge pull request #94 from implydata/autobuild new 60c2ccd Autobuild new 9adff91 restore files new 375870d Merge pull request #95 from implydata/autobuild The 239 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: index.html | 19 ++- 1 file changed, 18 insertions(+), 1 deletion(-) - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid-website] vogievetsky merged pull request #95: Autobuild
vogievetsky merged pull request #95: URL: https://github.com/apache/druid-website/pull/95 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid-website] vogievetsky opened a new pull request #95: Autobuild
vogievetsky opened a new pull request #95: URL: https://github.com/apache/druid-website/pull/95 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[druid-website-src] branch master updated: add 9/28 meetup
This is an automated email from the ASF dual-hosted git repository. vogievetsky pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/druid-website-src.git The following commit(s) were added to refs/heads/master by this push: new 852c067 add 9/28 meetup new 96bc747 Merge pull request #161 from druid-matt/patch-14 852c067 is described below commit 852c0673747103d00424d598874c0d5449d0b307 Author: Matt Sarrel <59710709+druid-m...@users.noreply.github.com> AuthorDate: Mon Sep 14 11:33:22 2020 -0700 add 9/28 meetup --- _data/events.yml | 5 + 1 file changed, 5 insertions(+) diff --git a/_data/events.yml b/_data/events.yml index 61edc77..1410bc3 100644 --- a/_data/events.yml +++ b/_data/events.yml @@ -9,6 +9,11 @@ info: "Bay Area Apache Druid Meetup" link: https://www.meetup.com/druidio/events/272938350/ +- date: 2020-09-28 + name: "September Office Hours: All About Apache Druid" + info: "Bay Area Apache Druid Meetup" + link: https://www.meetup.com/druidio/events/273120070/ + - date: 2020-11-02 name: "Druid Summit" info: "San Francisco Airport Marriott Waterfront, 1800 Old Bayshore Highway, Burlingame, CA" - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[druid-website-src] branch master updated: fix typo on index.html
This is an automated email from the ASF dual-hosted git repository. vogievetsky pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/druid-website-src.git The following commit(s) were added to refs/heads/master by this push: new 09e6842 fix typo on index.html new 02e699a Merge pull request #160 from druid-matt/patch-13 09e6842 is described below commit 09e6842649aa464f5521d56abe031ed546bfe1b7 Author: Matt Sarrel <59710709+druid-m...@users.noreply.github.com> AuthorDate: Sun Sep 13 15:33:38 2020 -0700 fix typo on index.html outpeform to outperform --- index.html | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/index.html b/index.html index 7c66563..03f16c9 100644 --- a/index.html +++ b/index.html @@ -46,7 +46,7 @@ canonical: 'https://druid.apache.org/' Up to 100x faster than traditional solutions -Druid has been benchmarked to greatly outpeform legacy solutions for data ingestion and data querying. Druid's novel architecture combines the best of data warehouses, timeseries databases, and search systems. +Druid has been benchmarked to greatly outperform legacy solutions for data ingestion and data querying. Druid's novel architecture combines the best of data warehouses, timeseries databases, and search systems. - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid-website-src] vogievetsky merged pull request #161: add 9/28 meetup
vogievetsky merged pull request #161: URL: https://github.com/apache/druid-website-src/pull/161 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid-website-src] vogievetsky merged pull request #160: fix typo on index.html
vogievetsky merged pull request #160: URL: https://github.com/apache/druid-website-src/pull/160 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid-website-src] druid-matt closed pull request #150: fix broken links to docs
druid-matt closed pull request #150: URL: https://github.com/apache/druid-website-src/pull/150 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid-website-src] druid-matt opened a new pull request #161: add 9/28 meetup
druid-matt opened a new pull request #161: URL: https://github.com/apache/druid-website-src/pull/161 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[druid] branch master updated (3d4b48e -> f5e2645)
This is an automated email from the ASF dual-hosted git repository. suneet pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/druid.git. from 3d4b48e TransformSpecTest should extends InitializedNullHandlingTest (#10392) add f5e2645 Support SearchQueryDimFilter in sql via new methods (#10350) No new revisions were added by this update. Summary of changes: docs/misc/math-expr.md | 2 + docs/querying/sql.md | 2 + .../CaseInsensitiveContainsExprMacro.java | 63 ++ .../druid/query/expression/ContainsExpr.java | 101 + .../druid/query/expression/ContainsExprMacro.java | 62 ++ ...Test.java => CaseInsensitiveExprMacroTest.java} | 66 +++--- ...prMacroTest.java => ContainsExprMacroTest.java} | 50 ++--- .../org/apache/druid/guice/ExpressionModule.java | 4 + ...ersion.java => ContainsOperatorConversion.java} | 93 ++--- .../sql/calcite/planner/DruidOperatorTable.java| 3 + .../calcite/expression/ExpressionTestHelper.java | 15 +- .../sql/calcite/expression/ExpressionsTest.java| 228 + website/.spelling | 2 + 13 files changed, 611 insertions(+), 80 deletions(-) create mode 100644 processing/src/main/java/org/apache/druid/query/expression/CaseInsensitiveContainsExprMacro.java create mode 100644 processing/src/main/java/org/apache/druid/query/expression/ContainsExpr.java create mode 100644 processing/src/main/java/org/apache/druid/query/expression/ContainsExprMacro.java copy processing/src/test/java/org/apache/druid/query/expression/{RegexpLikeExprMacroTest.java => CaseInsensitiveExprMacroTest.java} (53%) copy processing/src/test/java/org/apache/druid/query/expression/{RegexpLikeExprMacroTest.java => ContainsExprMacroTest.java} (58%) copy sql/src/main/java/org/apache/druid/sql/calcite/expression/builtin/{RegexpLikeOperatorConversion.java => ContainsOperatorConversion.java} (51%) - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[druid] branch master updated (3d4b48e -> f5e2645)
This is an automated email from the ASF dual-hosted git repository. suneet pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/druid.git. from 3d4b48e TransformSpecTest should extends InitializedNullHandlingTest (#10392) add f5e2645 Support SearchQueryDimFilter in sql via new methods (#10350) No new revisions were added by this update. Summary of changes: docs/misc/math-expr.md | 2 + docs/querying/sql.md | 2 + .../CaseInsensitiveContainsExprMacro.java | 63 ++ .../druid/query/expression/ContainsExpr.java | 101 + .../druid/query/expression/ContainsExprMacro.java | 62 ++ ...Test.java => CaseInsensitiveExprMacroTest.java} | 66 +++--- ...prMacroTest.java => ContainsExprMacroTest.java} | 50 ++--- .../org/apache/druid/guice/ExpressionModule.java | 4 + ...ersion.java => ContainsOperatorConversion.java} | 93 ++--- .../sql/calcite/planner/DruidOperatorTable.java| 3 + .../calcite/expression/ExpressionTestHelper.java | 15 +- .../sql/calcite/expression/ExpressionsTest.java| 228 + website/.spelling | 2 + 13 files changed, 611 insertions(+), 80 deletions(-) create mode 100644 processing/src/main/java/org/apache/druid/query/expression/CaseInsensitiveContainsExprMacro.java create mode 100644 processing/src/main/java/org/apache/druid/query/expression/ContainsExpr.java create mode 100644 processing/src/main/java/org/apache/druid/query/expression/ContainsExprMacro.java copy processing/src/test/java/org/apache/druid/query/expression/{RegexpLikeExprMacroTest.java => CaseInsensitiveExprMacroTest.java} (53%) copy processing/src/test/java/org/apache/druid/query/expression/{RegexpLikeExprMacroTest.java => ContainsExprMacroTest.java} (58%) copy sql/src/main/java/org/apache/druid/sql/calcite/expression/builtin/{RegexpLikeOperatorConversion.java => ContainsOperatorConversion.java} (51%) - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[druid] branch master updated (3d4b48e -> f5e2645)
This is an automated email from the ASF dual-hosted git repository. suneet pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/druid.git. from 3d4b48e TransformSpecTest should extends InitializedNullHandlingTest (#10392) add f5e2645 Support SearchQueryDimFilter in sql via new methods (#10350) No new revisions were added by this update. Summary of changes: docs/misc/math-expr.md | 2 + docs/querying/sql.md | 2 + .../CaseInsensitiveContainsExprMacro.java | 63 ++ .../druid/query/expression/ContainsExpr.java | 101 + .../druid/query/expression/ContainsExprMacro.java | 62 ++ ...Test.java => CaseInsensitiveExprMacroTest.java} | 66 +++--- ...prMacroTest.java => ContainsExprMacroTest.java} | 50 ++--- .../org/apache/druid/guice/ExpressionModule.java | 4 + ...ersion.java => ContainsOperatorConversion.java} | 93 ++--- .../sql/calcite/planner/DruidOperatorTable.java| 3 + .../calcite/expression/ExpressionTestHelper.java | 15 +- .../sql/calcite/expression/ExpressionsTest.java| 228 + website/.spelling | 2 + 13 files changed, 611 insertions(+), 80 deletions(-) create mode 100644 processing/src/main/java/org/apache/druid/query/expression/CaseInsensitiveContainsExprMacro.java create mode 100644 processing/src/main/java/org/apache/druid/query/expression/ContainsExpr.java create mode 100644 processing/src/main/java/org/apache/druid/query/expression/ContainsExprMacro.java copy processing/src/test/java/org/apache/druid/query/expression/{RegexpLikeExprMacroTest.java => CaseInsensitiveExprMacroTest.java} (53%) copy processing/src/test/java/org/apache/druid/query/expression/{RegexpLikeExprMacroTest.java => ContainsExprMacroTest.java} (58%) copy sql/src/main/java/org/apache/druid/sql/calcite/expression/builtin/{RegexpLikeOperatorConversion.java => ContainsOperatorConversion.java} (51%) - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[druid] branch master updated (3d4b48e -> f5e2645)
This is an automated email from the ASF dual-hosted git repository. suneet pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/druid.git. from 3d4b48e TransformSpecTest should extends InitializedNullHandlingTest (#10392) add f5e2645 Support SearchQueryDimFilter in sql via new methods (#10350) No new revisions were added by this update. Summary of changes: docs/misc/math-expr.md | 2 + docs/querying/sql.md | 2 + .../CaseInsensitiveContainsExprMacro.java | 63 ++ .../druid/query/expression/ContainsExpr.java | 101 + .../druid/query/expression/ContainsExprMacro.java | 62 ++ ...Test.java => CaseInsensitiveExprMacroTest.java} | 66 +++--- ...prMacroTest.java => ContainsExprMacroTest.java} | 50 ++--- .../org/apache/druid/guice/ExpressionModule.java | 4 + ...ersion.java => ContainsOperatorConversion.java} | 93 ++--- .../sql/calcite/planner/DruidOperatorTable.java| 3 + .../calcite/expression/ExpressionTestHelper.java | 15 +- .../sql/calcite/expression/ExpressionsTest.java| 228 + website/.spelling | 2 + 13 files changed, 611 insertions(+), 80 deletions(-) create mode 100644 processing/src/main/java/org/apache/druid/query/expression/CaseInsensitiveContainsExprMacro.java create mode 100644 processing/src/main/java/org/apache/druid/query/expression/ContainsExpr.java create mode 100644 processing/src/main/java/org/apache/druid/query/expression/ContainsExprMacro.java copy processing/src/test/java/org/apache/druid/query/expression/{RegexpLikeExprMacroTest.java => CaseInsensitiveExprMacroTest.java} (53%) copy processing/src/test/java/org/apache/druid/query/expression/{RegexpLikeExprMacroTest.java => ContainsExprMacroTest.java} (58%) copy sql/src/main/java/org/apache/druid/sql/calcite/expression/builtin/{RegexpLikeOperatorConversion.java => ContainsOperatorConversion.java} (51%) - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[druid] branch master updated (3d4b48e -> f5e2645)
This is an automated email from the ASF dual-hosted git repository. suneet pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/druid.git. from 3d4b48e TransformSpecTest should extends InitializedNullHandlingTest (#10392) add f5e2645 Support SearchQueryDimFilter in sql via new methods (#10350) No new revisions were added by this update. Summary of changes: docs/misc/math-expr.md | 2 + docs/querying/sql.md | 2 + .../CaseInsensitiveContainsExprMacro.java | 63 ++ .../druid/query/expression/ContainsExpr.java | 101 + .../druid/query/expression/ContainsExprMacro.java | 62 ++ ...Test.java => CaseInsensitiveExprMacroTest.java} | 66 +++--- ...prMacroTest.java => ContainsExprMacroTest.java} | 50 ++--- .../org/apache/druid/guice/ExpressionModule.java | 4 + ...ersion.java => ContainsOperatorConversion.java} | 93 ++--- .../sql/calcite/planner/DruidOperatorTable.java| 3 + .../calcite/expression/ExpressionTestHelper.java | 15 +- .../sql/calcite/expression/ExpressionsTest.java| 228 + website/.spelling | 2 + 13 files changed, 611 insertions(+), 80 deletions(-) create mode 100644 processing/src/main/java/org/apache/druid/query/expression/CaseInsensitiveContainsExprMacro.java create mode 100644 processing/src/main/java/org/apache/druid/query/expression/ContainsExpr.java create mode 100644 processing/src/main/java/org/apache/druid/query/expression/ContainsExprMacro.java copy processing/src/test/java/org/apache/druid/query/expression/{RegexpLikeExprMacroTest.java => CaseInsensitiveExprMacroTest.java} (53%) copy processing/src/test/java/org/apache/druid/query/expression/{RegexpLikeExprMacroTest.java => ContainsExprMacroTest.java} (58%) copy sql/src/main/java/org/apache/druid/sql/calcite/expression/builtin/{RegexpLikeOperatorConversion.java => ContainsOperatorConversion.java} (51%) - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] suneet-s merged pull request #10350: Support SearchQueryDimFilter in sql via new methods
suneet-s merged pull request #10350: URL: https://github.com/apache/druid/pull/10350 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] suneet-s commented on a change in pull request #10350: Support SearchQueryDimFilter in sql via new methods
suneet-s commented on a change in pull request #10350: URL: https://github.com/apache/druid/pull/10350#discussion_r488085625 ## File path: processing/src/test/java/org/apache/druid/query/expression/ContainsExprMacroTest.java ## @@ -0,0 +1,142 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.druid.query.expression; + +import com.google.common.collect.ImmutableMap; +import org.apache.druid.common.config.NullHandling; +import org.apache.druid.math.expr.ExprEval; +import org.apache.druid.math.expr.ExprType; +import org.apache.druid.math.expr.Parser; +import org.junit.Assert; +import org.junit.Test; + +public class ContainsExprMacroTest extends MacroTestBase +{ + public ContainsExprMacroTest() + { +super(new ContainsExprMacro()); + } + + @Test + public void testErrorZeroArguments() + { +expectException(IllegalArgumentException.class, "Function[contains_string] must have 2 arguments"); +eval("contains_string()", Parser.withMap(ImmutableMap.of())); + } + + @Test + public void testErrorThreeArguments() + { +expectException(IllegalArgumentException.class, "Function[contains_string] must have 2 arguments"); +eval("contains_string('a', 'b', 'c')", Parser.withMap(ImmutableMap.of())); + } + + @Test + public void testMatch() + { +final ExprEval result = eval("contains_string(a, 'oba')", Parser.withMap(ImmutableMap.of("a", "foobar"))); +Assert.assertEquals( +ExprEval.of(true, ExprType.LONG).value(), +result.value() +); + } + + @Test + public void testNoMatch() + { +final ExprEval result = eval("contains_string(a, 'bar')", Parser.withMap(ImmutableMap.of("a", "foo"))); +Assert.assertEquals( +ExprEval.of(false, ExprType.LONG).value(), +result.value() +); + } + + @Test + public void testNullSearch() + { +if (NullHandling.sqlCompatible()) { + expectException(IllegalArgumentException.class, "Function[contains_string] substring must be a string literal"); +} + +final ExprEval result = eval("contains_string(a, null)", Parser.withMap(ImmutableMap.of("a", "foo"))); +Assert.assertEquals( +ExprEval.of(true, ExprType.LONG).value(), +result.value() +); + } + + @Test + public void testEmptyStringSearch() + { +final ExprEval result = eval("contains_string(a, '')", Parser.withMap(ImmutableMap.of("a", "foo"))); +Assert.assertEquals( +ExprEval.of(true, ExprType.LONG).value(), +result.value() +); + } + + @Test + public void testNullSearchOnEmptyString() + { +if (NullHandling.sqlCompatible()) { + expectException(IllegalArgumentException.class, "Function[contains_string] substring must be a string literal"); +} + +final ExprEval result = eval("contains_string(a, null)", Parser.withMap(ImmutableMap.of("a", ""))); +Assert.assertEquals( +ExprEval.of(true, ExprType.LONG).value(), +result.value() +); + } + + @Test + public void testEmptyStringSearchOnEmptyString() + { +final ExprEval result = eval("contains_string(a, '')", Parser.withMap(ImmutableMap.of("a", ""))); +Assert.assertEquals( +ExprEval.of(true, ExprType.LONG).value(), +result.value() +); + } + + @Test + public void testNullSearchOnNull() + { +if (NullHandling.sqlCompatible()) { + expectException(IllegalArgumentException.class, "Function[contains_string] substring must be a string literal"); +} + +final ExprEval result = eval("contains_string(a, null)", Parser.withSuppliers(ImmutableMap.of("a", () -> null))); +Assert.assertEquals( +ExprEval.of(true, ExprType.LONG).value(), Review comment: > In non-sql compatible mode, both of these will be empty instead of null and hence the function would return `true`. > In SQL compatible mode, we would see an exception since `null` is not a valid literal. Thanks for the explanation This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this
[GitHub] [druid] clintropolis commented on a change in pull request #10327: benchmark for indexed table experiments
clintropolis commented on a change in pull request #10327: URL: https://github.com/apache/druid/pull/10327#discussion_r488079104 ## File path: benchmarks/src/test/java/org/apache/druid/benchmark/IndexedTableLoadingBenchmark.java ## @@ -0,0 +1,97 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.druid.benchmark; + +import com.google.common.collect.ImmutableList; +import com.google.common.collect.ImmutableSet; +import org.apache.druid.java.util.common.io.Closer; +import org.apache.druid.segment.QueryableIndexSegment; +import org.apache.druid.segment.join.table.IndexedTable; +import org.openjdk.jmh.annotations.Benchmark; +import org.openjdk.jmh.annotations.BenchmarkMode; +import org.openjdk.jmh.annotations.Fork; +import org.openjdk.jmh.annotations.Level; +import org.openjdk.jmh.annotations.Measurement; +import org.openjdk.jmh.annotations.Mode; +import org.openjdk.jmh.annotations.OutputTimeUnit; +import org.openjdk.jmh.annotations.Param; +import org.openjdk.jmh.annotations.Scope; +import org.openjdk.jmh.annotations.Setup; +import org.openjdk.jmh.annotations.State; +import org.openjdk.jmh.annotations.TearDown; +import org.openjdk.jmh.annotations.Warmup; +import org.openjdk.jmh.infra.Blackhole; + +import java.io.IOException; +import java.util.List; +import java.util.Set; +import java.util.concurrent.TimeUnit; + +@State(Scope.Benchmark) +@Fork(value = 1) +@Warmup(iterations = 3) +@Measurement(iterations = 5) +public class IndexedTableLoadingBenchmark +{ + private static List> KEY_COLUMN_SETS = ImmutableList.of( + ImmutableSet.of("stringKey", "longKey") + ); + + @Param({"0"}) + int keyColumns; + + @Param({"5", "50", "500"}) + int rowsPerSegment; + + @Param({"segment"}) + String indexedTableType; + + Closer closer = Closer.create(); + + QueryableIndexSegment tableSegment = null; + + @Setup(Level.Trial) + public void setup() + { +tableSegment = IndexedTableJoinCursorBenchmark.makeQueryableIndexSegment(closer, "join", rowsPerSegment); + } + + @TearDown + public void tearDown() throws IOException + { +closer.close(); + } + + @Benchmark + @BenchmarkMode(Mode.AverageTime) + @OutputTimeUnit(TimeUnit.MILLISECONDS) + public void loadTable(Blackhole blackhole) + { +try ( +IndexedTable table = Review comment: yeah, that makes sense, modified to close in a teardown method that runs after each iteration This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] abhishekagarwal87 commented on a change in pull request #10370: add computed Expr output types
abhishekagarwal87 commented on a change in pull request #10370: URL: https://github.com/apache/druid/pull/10370#discussion_r488075026 ## File path: core/src/main/java/org/apache/druid/math/expr/BinaryLogicalOperatorExpr.java ## @@ -57,6 +57,17 @@ protected final double evalDouble(double left, double right) // Use Double.compare for more consistent NaN handling. return Evals.asDouble(Double.compare(left, right) < 0); } + + @Nullable + @Override + public ExprType getOutputType(InputBindingTypes inputTypes) + { +ExprType implicitCast = super.getOutputType(inputTypes); +if (ExprType.STRING.equals(implicitCast)) { + return ExprType.LONG; +} +return implicitCast; Review comment: > This PR is setting up for vectorized expressions, which are going to be strongly typed before processing, which I think makes this inconsistent/confusing/magical behavior not necessary. can you elaborate more on this? How will that look like? While I understand the right behavior is debatable, it may still be best to keep the type in sync with eval even if it's not very intuitive. It may require that the logic of figuring out the type is not reusable as a function `ExprType.autoTypeConversion` and may have to be written differently for some operators. We can, later on, change the different functions, fixing both the `eval` and `getOutputType` together. But till then, they will be in sync. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] abhishekagarwal87 commented on a change in pull request #10370: add computed Expr output types
abhishekagarwal87 commented on a change in pull request #10370: URL: https://github.com/apache/druid/pull/10370#discussion_r488075026 ## File path: core/src/main/java/org/apache/druid/math/expr/BinaryLogicalOperatorExpr.java ## @@ -57,6 +57,17 @@ protected final double evalDouble(double left, double right) // Use Double.compare for more consistent NaN handling. return Evals.asDouble(Double.compare(left, right) < 0); } + + @Nullable + @Override + public ExprType getOutputType(InputBindingTypes inputTypes) + { +ExprType implicitCast = super.getOutputType(inputTypes); +if (ExprType.STRING.equals(implicitCast)) { + return ExprType.LONG; +} +return implicitCast; Review comment: can you elaborate more on this? How will that look like? > This PR is setting up for vectorized expressions, which are going to be strongly typed before processing, which I think makes this inconsistent/confusing/magical behavior not necessary. While I understand the right behavior is debatable, it may still be best to keep the type in sync with eval even if it's not very intuitive. It may require that the logic of figuring out the type is not reusable as a function `ExprType.autoTypeConversion` and may have to be written differently for some operators. We can, later on, change the different functions, fixing both the `eval` and `getOutputType` together. But till then, they will be in sync. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] gianm commented on pull request #10377: Include Sequence-building time in CPU time metric.
gianm commented on pull request #10377: URL: https://github.com/apache/druid/pull/10377#issuecomment-692168297 > Nice. It would be useful to call this out in the Release notes as well. Good point, I added the "Release Notes" label. I haven't had time to fix the coverage issue yet, but I'll be able to get to it soonish. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] suneet-s commented on pull request #10386: Upgrade AWS SDK
suneet-s commented on pull request #10386: URL: https://github.com/apache/druid/pull/10386#issuecomment-692145515 @mitchlloyd it looks like the license check job is complaining about the licenses file not being updated ``` Error: found 3 missing licenses. These licenses are reported, but missing in the registry druid_module: kinesis-indexing-service, groupId: com.amazonaws, artifactId: jmespath-java, version: 1.11.854, license: Apache License version 2.0 druid_module: kinesis-indexing-service, groupId: com.amazonaws, artifactId: aws-java-sdk-sts, version: 1.11.854, license: Apache License version 2.0 druid_module: kinesis-indexing-service, groupId: com.amazonaws, artifactId: aws-java-sdk-kinesis, version: 1.11.854, license: Apache License version 2.0 ``` Have you been able to run the AWS integration tests with the new sdk version? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] AshishKapoor commented on a change in pull request #10357: [web-console] Update README.md
AshishKapoor commented on a change in pull request #10357: URL: https://github.com/apache/druid/pull/10357#discussion_r488034132 ## File path: web-console/README.md ## @@ -25,7 +25,9 @@ This is the unified Druid web console that servers as a data management layer fo 1. You need to be withing the `web-console` directory 2. Install the modules with `npm install` -3. Run `npm start` will start in development mode and will proxy druid requests to `localhost:` +3. Run `npm run compile` to compile the scss files Review comment: Wait, @suneet-s Let me verify this again. I believe it's because there's no Xcode on the machine in use. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] AshishKapoor commented on a change in pull request #10357: [web-console] Update README.md
AshishKapoor commented on a change in pull request #10357: URL: https://github.com/apache/druid/pull/10357#discussion_r488031187 ## File path: web-console/README.md ## @@ -25,7 +25,9 @@ This is the unified Druid web console that servers as a data management layer fo 1. You need to be withing the `web-console` directory 2. Install the modules with `npm install` -3. Run `npm start` will start in development mode and will proxy druid requests to `localhost:` +3. Run `npm run compile` to compile the scss files Review comment: Hi, @suneet-s as requested: ``` Build failed with error code: 1 npm WARN optional SKIPPING OPTIONAL DEPENDENCY: fsevents@1.2.9 (node_modules/fsevents): npm WARN optional SKIPPING OPTIONAL DEPENDENCY: fsevents@1.2.9 install: `node install` npm WARN optional SKIPPING OPTIONAL DEPENDENCY: Exit status 1 npm ERR! code ELIFECYCLE npm ERR! errno 1 npm ERR! node-sass@4.13.1 postinstall: `node scripts/build.js` npm ERR! Exit status 1 npm ERR! npm ERR! Failed at the node-sass@4.13.1 postinstall script. npm ERR! This is probably not a problem with npm. There is likely additional logging output above. npm ERR! A complete log of this run can be found in: npm ERR! /Users/ashishkapoor/.npm/_logs/2020-09-14T15_36_01_509Z-debug.log ➜ web-console ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] AshishKapoor commented on a change in pull request #10357: [web-console] Update README.md
AshishKapoor commented on a change in pull request #10357: URL: https://github.com/apache/druid/pull/10357#discussion_r488031187 ## File path: web-console/README.md ## @@ -25,7 +25,9 @@ This is the unified Druid web console that servers as a data management layer fo 1. You need to be withing the `web-console` directory 2. Install the modules with `npm install` -3. Run `npm start` will start in development mode and will proxy druid requests to `localhost:` +3. Run `npm run compile` to compile the scss files Review comment: ``` Build failed with error code: 1 npm WARN optional SKIPPING OPTIONAL DEPENDENCY: fsevents@1.2.9 (node_modules/fsevents): npm WARN optional SKIPPING OPTIONAL DEPENDENCY: fsevents@1.2.9 install: `node install` npm WARN optional SKIPPING OPTIONAL DEPENDENCY: Exit status 1 npm ERR! code ELIFECYCLE npm ERR! errno 1 npm ERR! node-sass@4.13.1 postinstall: `node scripts/build.js` npm ERR! Exit status 1 npm ERR! npm ERR! Failed at the node-sass@4.13.1 postinstall script. npm ERR! This is probably not a problem with npm. There is likely additional logging output above. npm ERR! A complete log of this run can be found in: npm ERR! /Users/ashishkapoor/.npm/_logs/2020-09-14T15_36_01_509Z-debug.log ➜ web-console ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] suneet-s commented on a change in pull request #10357: [web-console] Update README.md
suneet-s commented on a change in pull request #10357: URL: https://github.com/apache/druid/pull/10357#discussion_r488025313 ## File path: web-console/README.md ## @@ -25,7 +25,9 @@ This is the unified Druid web console that servers as a data management layer fo 1. You need to be withing the `web-console` directory 2. Install the modules with `npm install` -3. Run `npm start` will start in development mode and will proxy druid requests to `localhost:` +3. Run `npm run compile` to compile the scss files Review comment: I don't remember having to do this step when I ran the web console locally. Can you describe what wasn't working with the instructions before this change This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] suneet-s merged pull request #10392: TransformSpecTest should extends InitializedNullHandlingTest
suneet-s merged pull request #10392: URL: https://github.com/apache/druid/pull/10392 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] suneet-s closed issue #10391: Unit test fail due to missing extend InitializedNullHandlingTest
suneet-s closed issue #10391: URL: https://github.com/apache/druid/issues/10391 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[druid] branch master updated: TransformSpecTest should extends InitializedNullHandlingTest (#10392)
This is an automated email from the ASF dual-hosted git repository. suneet pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/druid.git The following commit(s) were added to refs/heads/master by this push: new 3d4b48e TransformSpecTest should extends InitializedNullHandlingTest (#10392) 3d4b48e is described below commit 3d4b48e0aa5109e3b09c6d1c8d04264f8d9121cd Author: Cheng Pan <379377...@qq.com> AuthorDate: Mon Sep 14 23:22:24 2020 +0800 TransformSpecTest should extends InitializedNullHandlingTest (#10392) --- .../test/java/org/apache/druid/segment/indexing/TransformSpecTest.java | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/server/src/test/java/org/apache/druid/segment/indexing/TransformSpecTest.java b/server/src/test/java/org/apache/druid/segment/indexing/TransformSpecTest.java index 8102a71..763a924 100644 --- a/server/src/test/java/org/apache/druid/segment/indexing/TransformSpecTest.java +++ b/server/src/test/java/org/apache/druid/segment/indexing/TransformSpecTest.java @@ -35,12 +35,13 @@ import org.apache.druid.query.filter.SelectorDimFilter; import org.apache.druid.segment.TestHelper; import org.apache.druid.segment.transform.ExpressionTransform; import org.apache.druid.segment.transform.TransformSpec; +import org.apache.druid.testing.InitializedNullHandlingTest; import org.junit.Assert; import org.junit.Test; import java.util.Map; -public class TransformSpecTest +public class TransformSpecTest extends InitializedNullHandlingTest { private static final MapInputRowParser PARSER = new MapInputRowParser( new TimeAndDimsParseSpec( - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] a2l007 commented on pull request #10377: Include Sequence-building time in CPU time metric.
a2l007 commented on pull request #10377: URL: https://github.com/apache/druid/pull/10377#issuecomment-692096884 Nice. It would be useful to call this out in the Release notes as well. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] tlblessing opened a new pull request #10396: Update index.md
tlblessing opened a new pull request #10396: URL: https://github.com/apache/druid/pull/10396 Cool product. Minor suggested copy edits. -avoid excessive use of double quotes for terms that are well understood. -avoid future tense when possible. -missing punctuation. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] a2l007 commented on pull request #9832: Disable sending server version in response headers
a2l007 commented on pull request #9832: URL: https://github.com/apache/druid/pull/9832#issuecomment-692057094 Yeah makes sense. I've removed the config and related changes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] AshishKapoor commented on issue #10349: Druid data loader does not work with input time column name - '__time'
AshishKapoor commented on issue #10349: URL: https://github.com/apache/druid/issues/10349#issuecomment-692018903 > `__time` is a reserved name representing time column of the segment. Currently it requires a value of of unix epoch. You could change the property name of `__time` in json to another name. I've noticed it works with other databases. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] AshishKapoor edited a comment on issue #10349: Druid data loader does not work with input time column name - '__time'
AshishKapoor edited a comment on issue #10349: URL: https://github.com/apache/druid/issues/10349#issuecomment-692018903 > `__time` is a reserved name representing time column of the segment. Currently it requires a value of of unix epoch. You could change the property name of `__time` in json to another name. @FrankChen021 I've noticed it works with other databases. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] maytasm commented on a change in pull request #10371: Auto-compaction snapshot status API
maytasm commented on a change in pull request #10371: URL: https://github.com/apache/druid/pull/10371#discussion_r487839440 ## File path: server/src/main/java/org/apache/druid/server/coordinator/AutoCompactionSnapshot.java ## @@ -0,0 +1,192 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.druid.server.coordinator; + +import com.fasterxml.jackson.annotation.JsonCreator; +import com.fasterxml.jackson.annotation.JsonProperty; + +import javax.annotation.Nullable; +import javax.validation.constraints.NotNull; +import java.util.Objects; + +public class AutoCompactionSnapshot +{ + public enum AutoCompactionScheduleStatus + { +NOT_ENABLED, +RUNNING + } + + @JsonProperty + private String dataSource; + @JsonProperty + private AutoCompactionScheduleStatus scheduleStatus; + @JsonProperty + private String latestScheduledTaskId; + @JsonProperty + private long byteAwaitingCompaction; + @JsonProperty + private long byteProcessed; Review comment: Done This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] maytasm commented on a change in pull request #10371: Auto-compaction snapshot status API
maytasm commented on a change in pull request #10371: URL: https://github.com/apache/druid/pull/10371#discussion_r487838791 ## File path: server/src/main/java/org/apache/druid/server/coordinator/duty/EmitClusterStatsAndMetrics.java ## @@ -296,18 +296,87 @@ public DruidCoordinatorRuntimeParams run(DruidCoordinatorRuntimeParams params) emitter.emit( new ServiceMetricEvent.Builder().build( -"compact/task/count", +"compact/task/scheduled/count", stats.getGlobalStat(CompactSegments.COMPACTION_TASK_COUNT) ) ); +emitter.emit( +new ServiceMetricEvent.Builder().build( +"compact/task/maxSlot/count", +stats.getGlobalStat(CompactSegments.MAX_COMPACTION_TASK_SLOT) +) +); + +emitter.emit( +new ServiceMetricEvent.Builder().build( +"compact/task/availableSlot/count", +stats.getGlobalStat(CompactSegments.AVAILABLE_COMPACTION_TASK_SLOT) +) +); + +stats.forEachDataSourceStat( +CompactSegments.TOTAL_SIZE_OF_SEGMENTS_AWAITING_COMPACTION, +(final String dataSource, final long count) -> { + emitter.emit( + new ServiceMetricEvent.Builder() + .setDimension(DruidMetrics.DATASOURCE, dataSource) + .build("segment/waitCompact/segmentByte", count) + ); +} +); + +stats.forEachDataSourceStat( +CompactSegments.TOTAL_COUNT_OF_SEGMENTS_AWAITING_COMPACTION, +(final String dataSource, final long count) -> { + emitter.emit( + new ServiceMetricEvent.Builder() + .setDimension(DruidMetrics.DATASOURCE, dataSource) + .build("segment/waitCompact/segmentCount", count) + ); +} +); + +stats.forEachDataSourceStat( +CompactSegments.TOTAL_INTERVAL_OF_SEGMENTS_AWAITING_COMPACTION, +(final String dataSource, final long count) -> { + emitter.emit( + new ServiceMetricEvent.Builder() + .setDimension(DruidMetrics.DATASOURCE, dataSource) + .build("segment/waitCompact/intervalCount", count) + ); +} +); + +stats.forEachDataSourceStat( +CompactSegments.TOTAL_SIZE_OF_SEGMENTS_COMPACTED, +(final String dataSource, final long count) -> { + emitter.emit( + new ServiceMetricEvent.Builder() + .setDimension(DruidMetrics.DATASOURCE, dataSource) + .build("segment/compacted/segmentByte", count) + ); +} +); + +stats.forEachDataSourceStat( +CompactSegments.TOTAL_COUNT_OF_SEGMENTS_COMPACTED, +(final String dataSource, final long count) -> { + emitter.emit( + new ServiceMetricEvent.Builder() + .setDimension(DruidMetrics.DATASOURCE, dataSource) + .build("segment/compacted/segmentCount", count) + ); +} +); + stats.forEachDataSourceStat( -"segmentsWaitCompact", +CompactSegments.TOTAL_INTERVAL_OF_SEGMENTS_COMPACTED, (final String dataSource, final long count) -> { emitter.emit( new ServiceMetricEvent.Builder() .setDimension(DruidMetrics.DATASOURCE, dataSource) - .build("segment/waitCompact/count", count) + .build("segment/compacted/intervalCount", count) Review comment: Yep This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] maytasm commented on a change in pull request #10371: Auto-compaction snapshot status API
maytasm commented on a change in pull request #10371: URL: https://github.com/apache/druid/pull/10371#discussion_r487838589 ## File path: server/src/main/java/org/apache/druid/server/coordinator/duty/EmitClusterStatsAndMetrics.java ## @@ -296,18 +296,87 @@ public DruidCoordinatorRuntimeParams run(DruidCoordinatorRuntimeParams params) emitter.emit( new ServiceMetricEvent.Builder().build( -"compact/task/count", +"compact/task/scheduled/count", Review comment: Revert back This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] abhishekagarwal87 commented on a change in pull request #10366: Proposed changes for making joins cacheable
abhishekagarwal87 commented on a change in pull request #10366: URL: https://github.com/apache/druid/pull/10366#discussion_r487837336 ## File path: server/src/main/java/org/apache/druid/query/ResultLevelCachingQueryRunner.java ## @@ -86,7 +92,15 @@ public ResultLevelCachingQueryRunner( if (useResultCache || populateResultCache) { final String cacheKeyStr = StringUtils.fromUtf8(strategy.computeResultLevelCacheKey(query)); - final byte[] cachedResultSet = fetchResultsFromResultLevelCache(cacheKeyStr); + DataSourceAnalysis analysis = DataSourceAnalysis.forDataSource(query.getDataSource()); + byte[] dataSourceCacheKey = Joinables.computeDataSourceCacheKey(analysis, joinableFactory).orElse(null); + if (null == dataSourceCacheKey) { +return baseRunner.run( +queryPlus, +responseContext +); + } + final byte[] cachedResultSet = fetchResultsFromResultLevelCache(cacheKeyStr, dataSourceCacheKey); Review comment: earlier `cacheKeyStr` was being used as namespace as well as the key which was just like unnecessary duplication. Internally, the namespace and key are composed together into another bigger key. By the way, this too would cause cache bust once on an upgrade. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] abhishekagarwal87 commented on a change in pull request #10366: Proposed changes for making joins cacheable
abhishekagarwal87 commented on a change in pull request #10366: URL: https://github.com/apache/druid/pull/10366#discussion_r487835902 ## File path: processing/src/main/java/org/apache/druid/segment/join/table/BroadcastSegmentIndexedTable.java ## @@ -241,6 +244,22 @@ public ColumnSelectorFactory makeColumnSelectorFactory(ReadableOffset offset, bo ); } + @Override + public Optional computeCacheKey() + { +SegmentId segmentId = segment.getId(); Review comment: I wanted to use `CacheUtil` here but that is in `server` module and not accessible here. May be I can move that class to `core`? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] AshishKapoor closed issue #10349: Druid data loader does not work with input time column name - '__time'
AshishKapoor closed issue #10349: URL: https://github.com/apache/druid/issues/10349 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] abhishekagarwal87 commented on a change in pull request #10366: Proposed changes for making joins cacheable
abhishekagarwal87 commented on a change in pull request #10366: URL: https://github.com/apache/druid/pull/10366#discussion_r487819771 ## File path: processing/src/main/java/org/apache/druid/segment/join/Joinables.java ## @@ -118,6 +126,47 @@ public static boolean isPrefixedBy(final String columnName, final String prefix) ); } + /** + * Compute a cache key prefix for data sources that is to be used in segment level and result level caches. The + * data source can either be base (clauses is empty) or RHS of a join (clauses is non-empty). In both of the cases, + * a non-null cache is returned. However, the cache key is null if there is a join and some of the right data sources + * participating in the join do not support caching yet + * + * @param dataSourceAnalysis + * @param joinableFactory + * @return + */ + public static Optional computeDataSourceCacheKey( + final DataSourceAnalysis dataSourceAnalysis, + final JoinableFactory joinableFactory + ) + { +final CacheKeyBuilder keyBuilder; +final List clauses = dataSourceAnalysis.getPreJoinableClauses(); +if (clauses.isEmpty()) { + keyBuilder = new CacheKeyBuilder(REGULAR_OPERATION); Review comment: You are right. I was thinking that it will only be one time bust. And, that having a distinct prefix will eliminate the possibility of a cache key collision. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] jagtapsantosh opened a new issue #10395: Getting error when accessing materialized view query in web console and in logs using druid 0.19.0: Could not resolve type id 'view' as a subt
jagtapsantosh opened a new issue #10395: URL: https://github.com/apache/druid/issues/10395 Hello Guys, I am using druid 0.19.0 version and trying to setup Materialized View and Materialized Selection Features. I am following the instruction given on [this link](https://druid.apache.org/docs/latest/development/extensions-contrib/materialized-view.html) of druid official documentation. I have successfully loaded the extensions like materialized-view-selection on Broker and load materialized-view-maintenance on Overlord. I have setup Hadoop on the same machine of druid and storing deep storage data in hadoop. I could successfully able to submit derivativeDataSource supervisor spec and created derivativeDataSource as per mentioned under Materialized-view-maintenance of the official druid document. However I am getting following error while submitting sample view type query spec mentioned under Materialized-view-selection under the [this link](https://druid.apache.org/docs/latest/development/extensions-contrib/materialized-view.html) Exception Details: com.fasterxml.jackson.databind.exc.InvalidTypeIdException: Could not resolve type id 'view' as a subtype of `org.apache.druid.query.Query`: known type ids = [dataSourceMetadata, groupBy, scan, search, segmentMetadata, select, timeBoundary, timeseries, topN] at [Source: (org.eclipse.jetty.server.HttpInputOverHTTP); line: 1, column: 14] at com.fasterxml.jackson.databind.exc.InvalidTypeIdException.from(InvalidTypeIdException.java:43) ~[jackson-databind-2.10.2.jar:2.10.2] at com.fasterxml.jackson.databind.DeserializationContext.invalidTypeIdException(DeserializationContext.java:1758) ~[jackson-databind-2.10.2.jar:2.10.2] at com.fasterxml.jackson.databind.DeserializationContext.handleUnknownTypeId(DeserializationContext.java:1265) ~[jackson-databind-2.10.2.jar:2.10.2] at com.fasterxml.jackson.databind.jsontype.impl.TypeDeserializerBase._handleUnknownTypeId(TypeDeserializerBase.java:290) ~[jackson-databind-2.10.2.jar:2.10.2] at com.fasterxml.jackson.databind.jsontype.impl.TypeDeserializerBase._findDeserializer(TypeDeserializerBase.java:162) ~[jackson-databind-2.10.2.jar:2.10.2] at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer._deserializeTypedForId(AsPropertyTypeDeserializer.java:113) ~[jackson-databind-2.10.2.jar:2.10.2] at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer.deserializeTypedFromObject(AsPropertyTypeDeserializer.java:97) ~[jackson-databind-2.10.2.jar:2.10.2] at com.fasterxml.jackson.databind.deser.AbstractDeserializer.deserializeWithType(AbstractDeserializer.java:254) ~[jackson-databind-2.10.2.jar:2.10.2] at com.fasterxml.jackson.databind.deser.impl.TypeWrappedDeserializer.deserialize(TypeWrappedDeserializer.java:68) ~[jackson-databind-2.10.2.jar:2.10.2] at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:4202) ~[jackson-databind-2.10.2.jar:2.10.2] at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3242) ~[jackson-databind-2.10.2.jar:2.10.2] at org.apache.druid.server.AsyncQueryForwardingServlet.service(AsyncQueryForwardingServlet.java:235) [druid-server-0.19.0.jar:0.19.0] at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) [javax.servlet-api-3.1.0.jar:3.1.0] at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:865) [jetty-servlet-9.4.12.v20180830.jar:9.4.12.v20180830] at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1655) [jetty-servlet-9.4.12.v20180830.jar:9.4.12.v20180830] at org.apache.druid.server.security.PreResponseAuthorizationCheckFilter.doFilter(PreResponseAuthorizationCheckFilter.java:82) [druid-server-0.19.0.jar:0.19.0] at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1642) [jetty-servlet-9.4.12.v20180830.jar:9.4.12.v20180830] at org.apache.druid.server.security.AllowHttpMethodsResourceFilter.doFilter(AllowHttpMethodsResourceFilter.java:78) [druid-server-0.19.0.jar:0.19.0] at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1642) [jetty-servlet-9.4.12.v20180830.jar:9.4.12.v20180830] at org.apache.druid.server.security.AllowOptionsResourceFilter.doFilter(AllowOptionsResourceFilter.java:75) [druid-server-0.19.0.jar:0.19.0] at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1642) [jetty-servlet-9.4.12.v20180830.jar:9.4.12.v20180830] at org.apache.druid.server.security.AllowAllAuthenticator$1.doFilter(AllowAllAuthenticator.java:84) [druid-server-0.19.0.jar:0.19.0] at
[GitHub] [druid] abhishekagarwal87 commented on a change in pull request #10366: Proposed changes for making joins cacheable
abhishekagarwal87 commented on a change in pull request #10366: URL: https://github.com/apache/druid/pull/10366#discussion_r487818427 ## File path: server/src/main/java/org/apache/druid/server/coordination/ServerManager.java ## @@ -293,21 +302,28 @@ public ServerManager( queryMetrics -> queryMetrics.segment(segmentIdString) ); -CachingQueryRunner cachingQueryRunner = new CachingQueryRunner<>( -segmentIdString, -segmentDescriptor, -objectMapper, -cache, -toolChest, -metricsEmittingQueryRunnerInner, -cachePopulator, -cacheConfig -); +QueryRunner queryRunner; Review comment: I was thinking that less places worry about the different meanings of `cacheKeyPrefix` content, the better it is. So for CachingQueryRunner, `cacheKeyPrefix` is just a plain dumb byte array This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] pan3793 commented on pull request #9507: optionally disable all of hardcoded zookeeper use
pan3793 commented on pull request #9507: URL: https://github.com/apache/druid/pull/9507#issuecomment-691971352 It would be nice if druid support k8s in mainline, will it be addressed in next release? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] abhishekagarwal87 commented on a change in pull request #10366: Proposed changes for making joins cacheable
abhishekagarwal87 commented on a change in pull request #10366: URL: https://github.com/apache/druid/pull/10366#discussion_r48760 ## File path: server/src/main/java/org/apache/druid/segment/join/BroadcastTableJoinableFactory.java ## @@ -76,4 +76,27 @@ public boolean isDirectlyJoinable(DataSource dataSource) } return Optional.empty(); } + + @Override + public Optional computeJoinCacheKey(DataSource dataSource) Review comment: Yes. good point. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] abhishekagarwal87 commented on a change in pull request #10366: Proposed changes for making joins cacheable
abhishekagarwal87 commented on a change in pull request #10366: URL: https://github.com/apache/druid/pull/10366#discussion_r487699662 ## File path: server/src/main/java/org/apache/druid/query/ResultLevelCachingQueryRunner.java ## @@ -86,7 +92,15 @@ public ResultLevelCachingQueryRunner( if (useResultCache || populateResultCache) { final String cacheKeyStr = StringUtils.fromUtf8(strategy.computeResultLevelCacheKey(query)); - final byte[] cachedResultSet = fetchResultsFromResultLevelCache(cacheKeyStr); + DataSourceAnalysis analysis = DataSourceAnalysis.forDataSource(query.getDataSource()); + byte[] dataSourceCacheKey = Joinables.computeDataSourceCacheKey(analysis, joinableFactory).orElse(null); + if (null == dataSourceCacheKey) { +return baseRunner.run( +queryPlus, +responseContext +); + } + final byte[] cachedResultSet = fetchResultsFromResultLevelCache(cacheKeyStr, dataSourceCacheKey); Review comment: yeah. this is not well thought out right now. I will revisit this. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] abhishekagarwal87 commented on a change in pull request #10366: Proposed changes for making joins cacheable
abhishekagarwal87 commented on a change in pull request #10366: URL: https://github.com/apache/druid/pull/10366#discussion_r487699118 ## File path: processing/src/main/java/org/apache/druid/segment/join/table/BroadcastSegmentIndexedTable.java ## @@ -241,6 +244,22 @@ public ColumnSelectorFactory makeColumnSelectorFactory(ReadableOffset offset, bo ); } + @Override + public Optional computeCacheKey() + { +SegmentId segmentId = segment.getId(); Review comment: are you suggesting to move this logic to `SegmentId` class? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] abhishekagarwal87 commented on a change in pull request #10366: Proposed changes for making joins cacheable
abhishekagarwal87 commented on a change in pull request #10366: URL: https://github.com/apache/druid/pull/10366#discussion_r487698508 ## File path: processing/src/main/java/org/apache/druid/segment/join/MapJoinableFactory.java ## @@ -80,4 +80,21 @@ public boolean isDirectlyJoinable(DataSource dataSource) } return maybeJoinable; } + + @Override + public Optional computeJoinCacheKey(DataSource dataSource) Review comment: Right now as I see that additional info comes from the data source itself. The joinable factory is deciding based on the `dataSource` whether it can build the joinable factory. can you share any hypothetical examples wherein one factory decides to build Joinable but the other one does not on the basis of `JoinConditionAnalysis`? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] abhishekagarwal87 commented on a change in pull request #10366: Proposed changes for making joins cacheable
abhishekagarwal87 commented on a change in pull request #10366: URL: https://github.com/apache/druid/pull/10366#discussion_r487695326 ## File path: processing/src/main/java/org/apache/druid/segment/join/Joinables.java ## @@ -118,6 +126,47 @@ public static boolean isPrefixedBy(final String columnName, final String prefix) ); } + /** + * Compute a cache key prefix for data sources that is to be used in segment level and result level caches. The + * data source can either be base (clauses is empty) or RHS of a join (clauses is non-empty). In both of the cases, + * a non-null cache is returned. However, the cache key is null if there is a join and some of the right data sources + * participating in the join do not support caching yet + * + * @param dataSourceAnalysis + * @param joinableFactory + * @return + */ + public static Optional computeDataSourceCacheKey( + final DataSourceAnalysis dataSourceAnalysis, + final JoinableFactory joinableFactory + ) + { +final CacheKeyBuilder keyBuilder; +final List clauses = dataSourceAnalysis.getPreJoinableClauses(); +if (clauses.isEmpty()) { + keyBuilder = new CacheKeyBuilder(REGULAR_OPERATION); +} else { + keyBuilder = new CacheKeyBuilder(JOIN_OPERATION); + for (PreJoinableClause clause : clauses) { +if (!clause.getCondition().canHashJoin()) { Review comment: I was thinking to keep it here so that I could just pass the `data-source` inside the `computeJoinCacheKey`. The engine for hash join works outside the boundaries of joinable factories. so it felt simpler to just keep it here and joinable factories just worry about the data source. The other cache info from the condition is derived here in this class as well (original expression etc) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] liran-funaro commented on issue #10321: [Proposal] Design for a Configurable Index Type
liran-funaro commented on issue #10321: URL: https://github.com/apache/druid/issues/10321#issuecomment-691852058 @jihoonson Thank you for taking the time to review this proposal. >I'm assuming that those knobs will have default values and users usually don't have to worry about them. Is this right? Yes. They will be `@Nullable` and will be set to the default value in that case. E.g., for `OffheapIncrementalIndex.Spec`: ```java public Spec( final @JsonProperty("bufferSize") @Nullable Integer bufferSize, final @JsonProperty("cacheSize") @Nullable Integer cacheSize ) { this.bufferSize = bufferSize != null && bufferSize > 0 ? bufferSize : DEFAULT_BUFFER_SIZE; this.cacheSize = cacheSize != null && cacheSize > this.bufferSize ? cacheSize : DEFAULT_CACHE_SIZE; // ... } ``` > do you think it's stable enough or do you perhaps need to make some changes in the near future? Using the `@UnstableApi` seems like a good idea at this point. > I'm not sure how much it's slower compared to groupBy v2 or how much using Oak can help with its performance. You may want to look into concurrent processing in groupBy v2 more closely such as `ConcurrentGrouper`. I'll look into that. Improving query performance is on our agenda, and we considered comparing the groupBy v1 with Oak to the groupBy v2 mechanism. We will report our results in a different issue in the future. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org