[GitHub] [incubator-druid] jihoonson commented on a change in pull request #7734: reorganizing the ToC
jihoonson commented on a change in pull request #7734: reorganizing the ToC
URL: https://github.com/apache/incubator-druid/pull/7734#discussion_r286783441

## File path: docs/content/toc.md

```diff
@@ -126,20 +132,19 @@ layout: toc
   * [Metrics and Monitoring](/docs/VERSION/operations/metrics.html)
   * [Alerts](/docs/VERSION/operations/alerts.html)
   * [Different Hadoop Versions](/docs/VERSION/operations/other-hadoop.html)
-  * [HTTP Compression](/docs/VERSION/operations/http-compression.html)
-  * [Basic Cluster Tuning](/docs/VERSION/operations/basic-cluster-tuning.html)
-  * Examples
-    * [Single-server Deployment Examples](/docs/VERSION/operations/single-server.html)
-    * [Clustered Deployment Example](/docs/VERSION/operations/example-cluster.html)
-  * [Recommendations](/docs/VERSION/operations/recommendations.html)
-* [Performance FAQ](/docs/VERSION/operations/performance-faq.html)
+  * [HTTP Compression](/docs/VERSION/operations/http-compression.html)
   * [API Reference](/docs/VERSION/operations/api-reference.html)
     * [Coordinator](/docs/VERSION/operations/api-reference.html#coordinator)
     * [Overlord](/docs/VERSION/operations/api-reference.html#overlord)
     * [MiddleManager](/docs/VERSION/operations/api-reference.html#middlemanager)
     * [Peon](/docs/VERSION/operations/api-reference.html#peon)
     * [Broker](/docs/VERSION/operations/api-reference.html#broker)
-    * [Historical](/docs/VERSION/operations/api-reference.html#historical)
+    * [Historical](/docs/VERSION/operations/api-reference.html#historical)
+  * Tuning and Recommendations
+    * [Basic Cluster Tuning](/docs/VERSION/operations/basic-cluster-tuning.html)
+    * [General Recommendations](/docs/VERSION/operations/recommendations.html)
+    * [Performance FAQ](/docs/VERSION/operations/performance-faq.html)
+    * [JVM Best Practices](/docs/VERSION/configuration/index.html#jvm-configuration-best-practices)
   * Tools
     * [Dump Segment Tool](/docs/VERSION/operations/dump-segment.html)
     * [Insert Segment Tool](/docs/VERSION/operations/insert-segment-to-db.html)
```

Review comment:
Ah I see. It looks fine to me.

---

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

With regards,
Apache Git Services

To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] viongpanzi commented on a change in pull request #7716: AggregatorUtil should cache parsed expression to avoid memory problem (OOM/FGC) when Expression is used in metricsSpec
viongpanzi commented on a change in pull request #7716: AggregatorUtil should cache parsed expression to avoid memory problem (OOM/FGC) when Expression is used in metricsSpec
URL: https://github.com/apache/incubator-druid/pull/7716#discussion_r286778106

## File path: processing/src/main/java/org/apache/druid/query/aggregation/AggregatorUtil.java

```diff
@@ -196,7 +246,7 @@ static BaseFloatColumnValueSelector makeColumnValueSelectorWithFloatDefault(
     if (fieldName != null) {
       return metricFactory.makeColumnValueSelector(fieldName);
     } else {
-      final Expr expr = Parser.parse(fieldExpression, macroTable);
+      final Expr expr = parseIfAbsent(fieldExpression, macroTable);
```

Review comment:
@himanshug Would it be better if we moved the cache to the Parser class and added a new method like parseIfAbsent? Otherwise, we would need to add a cache in each SimpleXXAggregatorFactory, respectively.

```java
public abstract class SimpleFloatAggregatorFactory extends NullableAggregatorFactory
{
  protected final String name;
  @Nullable
  protected final String fieldName;
  @Nullable
  protected final String expression;
  protected final ExprMacroTable macroTable;
  protected final Supplier fieldExpression;

  public SimpleFloatAggregatorFactory(
      ExprMacroTable macroTable,
      String name,
      @Nullable final String fieldName,
      @Nullable String expression
  )
  {
    this.macroTable = macroTable;
    this.name = name;
    this.fieldName = fieldName;
    this.expression = expression;
    this.fieldExpression = () -> expression == null ? null : Parser.parseIfAbsent(expression, macroTable);
```
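The proposed `parseIfAbsent` amounts to memoizing the parser on the expression string. A minimal sketch of the idea, assuming a `ConcurrentHashMap`-backed cache; the `parse` method and string "expression" type here are illustrative stand-ins for Druid's real `Parser`/`Expr`, and the `macroTable` argument is omitted:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ParseCacheSketch {
    // Cache keyed by expression source text; a parsed expression is
    // represented here by a plain String for illustration.
    static final Map<String, String> CACHE = new ConcurrentHashMap<>();
    static int parseCount = 0;

    // Simulated expensive parse.
    static String parse(String expression) {
        parseCount++;
        return "parsed:" + expression;
    }

    // The proposed parseIfAbsent: parse at most once per distinct expression.
    static String parseIfAbsent(String expression) {
        return CACHE.computeIfAbsent(expression, ParseCacheSketch::parse);
    }

    public static void main(String[] args) {
        String a = parseIfAbsent("x + 1");
        String b = parseIfAbsent("x + 1"); // cache hit, no re-parse
        System.out.println(a == b);          // same cached instance
        System.out.println(parseCount == 1); // parsed exactly once
    }
}
```

Either placement (a shared cache in `Parser`, or a per-factory memoized `Supplier` as in the snippet above) bounds parsing work to one parse per distinct expression, which is what avoids the OOM/full-GC pressure described in the PR title.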
[GitHub] [incubator-druid] fjy commented on a change in pull request #7734: reorganizing the ToC
fjy commented on a change in pull request #7734: reorganizing the ToC
URL: https://github.com/apache/incubator-druid/pull/7734#discussion_r286777179

## File path: docs/content/toc.md

```diff
@@ -126,20 +132,19 @@ layout: toc
   * [Metrics and Monitoring](/docs/VERSION/operations/metrics.html)
   * [Alerts](/docs/VERSION/operations/alerts.html)
   * [Different Hadoop Versions](/docs/VERSION/operations/other-hadoop.html)
-  * [HTTP Compression](/docs/VERSION/operations/http-compression.html)
-  * [Basic Cluster Tuning](/docs/VERSION/operations/basic-cluster-tuning.html)
-  * Examples
-    * [Single-server Deployment Examples](/docs/VERSION/operations/single-server.html)
-    * [Clustered Deployment Example](/docs/VERSION/operations/example-cluster.html)
-  * [Recommendations](/docs/VERSION/operations/recommendations.html)
-* [Performance FAQ](/docs/VERSION/operations/performance-faq.html)
+  * [HTTP Compression](/docs/VERSION/operations/http-compression.html)
   * [API Reference](/docs/VERSION/operations/api-reference.html)
     * [Coordinator](/docs/VERSION/operations/api-reference.html#coordinator)
     * [Overlord](/docs/VERSION/operations/api-reference.html#overlord)
     * [MiddleManager](/docs/VERSION/operations/api-reference.html#middlemanager)
     * [Peon](/docs/VERSION/operations/api-reference.html#peon)
     * [Broker](/docs/VERSION/operations/api-reference.html#broker)
-    * [Historical](/docs/VERSION/operations/api-reference.html#historical)
+    * [Historical](/docs/VERSION/operations/api-reference.html#historical)
+  * Tuning and Recommendations
+    * [Basic Cluster Tuning](/docs/VERSION/operations/basic-cluster-tuning.html)
+    * [General Recommendations](/docs/VERSION/operations/recommendations.html)
+    * [Performance FAQ](/docs/VERSION/operations/performance-faq.html)
+    * [JVM Best Practices](/docs/VERSION/configuration/index.html#jvm-configuration-best-practices)
   * Tools
     * [Dump Segment Tool](/docs/VERSION/operations/dump-segment.html)
     * [Insert Segment Tool](/docs/VERSION/operations/insert-segment-to-db.html)
```

Review comment:
I'm fine removing it as long as it is the right thing to do.
[GitHub] [incubator-druid] fjy commented on a change in pull request #7734: reorganizing the ToC
fjy commented on a change in pull request #7734: reorganizing the ToC
URL: https://github.com/apache/incubator-druid/pull/7734#discussion_r286777083

## File path: docs/content/toc.md

```diff
@@ -126,20 +132,19 @@ layout: toc
   * [Metrics and Monitoring](/docs/VERSION/operations/metrics.html)
   * [Alerts](/docs/VERSION/operations/alerts.html)
   * [Different Hadoop Versions](/docs/VERSION/operations/other-hadoop.html)
-  * [HTTP Compression](/docs/VERSION/operations/http-compression.html)
-  * [Basic Cluster Tuning](/docs/VERSION/operations/basic-cluster-tuning.html)
-  * Examples
-    * [Single-server Deployment Examples](/docs/VERSION/operations/single-server.html)
-    * [Clustered Deployment Example](/docs/VERSION/operations/example-cluster.html)
-  * [Recommendations](/docs/VERSION/operations/recommendations.html)
-* [Performance FAQ](/docs/VERSION/operations/performance-faq.html)
+  * [HTTP Compression](/docs/VERSION/operations/http-compression.html)
   * [API Reference](/docs/VERSION/operations/api-reference.html)
     * [Coordinator](/docs/VERSION/operations/api-reference.html#coordinator)
     * [Overlord](/docs/VERSION/operations/api-reference.html#overlord)
     * [MiddleManager](/docs/VERSION/operations/api-reference.html#middlemanager)
     * [Peon](/docs/VERSION/operations/api-reference.html#peon)
     * [Broker](/docs/VERSION/operations/api-reference.html#broker)
-    * [Historical](/docs/VERSION/operations/api-reference.html#historical)
+    * [Historical](/docs/VERSION/operations/api-reference.html#historical)
+  * Tuning and Recommendations
+    * [Basic Cluster Tuning](/docs/VERSION/operations/basic-cluster-tuning.html)
+    * [General Recommendations](/docs/VERSION/operations/recommendations.html)
+    * [Performance FAQ](/docs/VERSION/operations/performance-faq.html)
+    * [JVM Best Practices](/docs/VERSION/configuration/index.html#jvm-configuration-best-practices)
   * Tools
     * [Dump Segment Tool](/docs/VERSION/operations/dump-segment.html)
     * [Insert Segment Tool](/docs/VERSION/operations/insert-segment-to-db.html)
```

Review comment:
I don't see the insert segment tool removed in the ToC of that PR. Moreover, @gianm added it back in #7658.
[GitHub] [incubator-druid] jihoonson opened a new pull request #7735: Remove LegacyKafkaIndexTaskRunner
jihoonson opened a new pull request #7735: Remove LegacyKafkaIndexTaskRunner
URL: https://github.com/apache/incubator-druid/pull/7735

The Kafka indexing service has supported incremental handoff since 0.12.0 (https://github.com/apache/incubator-druid/pull/4815), and LegacyKafkaIndexTaskRunner was deprecated in 0.13.0. After this PR, rolling update from a version < 0.12.0 is not supported.
[incubator-druid] branch master updated: SeekableStreamIndexTaskRunner: Lazy init of runner. (#7729)
This is an automated email from the ASF dual-hosted git repository.

fjy pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-druid.git

The following commit(s) were added to refs/heads/master by this push:
     new 53b6467  SeekableStreamIndexTaskRunner: Lazy init of runner. (#7729)

53b6467 is described below

commit 53b6467fc83cd4a78d87b5fd1557c84b2a5b2513
Author: Gian Merlino
AuthorDate: Wed May 22 21:13:57 2019 -0700

    SeekableStreamIndexTaskRunner: Lazy init of runner. (#7729)

    The main motivation is that this fixes #7724, by making it so the overlord doesn't
    try to create a task runner and parser when all it really wants to do is create a
    task object and serialize it.
---
 .../druid/indexing/kafka/KafkaIndexTask.java       |  6 +-
 .../druid/indexing/kafka/KafkaIndexTaskTest.java   | 67 --
 .../druid/indexing/kinesis/KinesisIndexTask.java   |  3 +-
 .../seekablestream/SeekableStreamIndexTask.java    | 32 ++-
 4 files changed, 76 insertions(+), 32 deletions(-)

```diff
diff --git a/extensions-core/kafka-indexing-service/src/main/java/org/apache/druid/indexing/kafka/KafkaIndexTask.java b/extensions-core/kafka-indexing-service/src/main/java/org/apache/druid/indexing/kafka/KafkaIndexTask.java
index df56cce..314f99c 100644
--- a/extensions-core/kafka-indexing-service/src/main/java/org/apache/druid/indexing/kafka/KafkaIndexTask.java
+++ b/extensions-core/kafka-indexing-service/src/main/java/org/apache/druid/indexing/kafka/KafkaIndexTask.java
@@ -131,18 +131,20 @@ public class KafkaIndexTask extends SeekableStreamIndexTask
     if (context != null
         && context.get(SeekableStreamSupervisor.IS_INCREMENTAL_HANDOFF_SUPPORTED) != null
         && ((boolean) context.get(SeekableStreamSupervisor.IS_INCREMENTAL_HANDOFF_SUPPORTED))) {
+      //noinspection unchecked
       return new IncrementalPublishingKafkaIndexTaskRunner(
           this,
-          parser,
+          dataSchema.getParser(),
           authorizerMapper,
           chatHandlerProvider,
           savedParseExceptions,
           rowIngestionMetersFactory
       );
     } else {
+      //noinspection unchecked
       return new LegacyKafkaIndexTaskRunner(
           this,
-          parser,
+          dataSchema.getParser(),
           authorizerMapper,
           chatHandlerProvider,
           savedParseExceptions,
diff --git a/extensions-core/kafka-indexing-service/src/test/java/org/apache/druid/indexing/kafka/KafkaIndexTaskTest.java b/extensions-core/kafka-indexing-service/src/test/java/org/apache/druid/indexing/kafka/KafkaIndexTaskTest.java
index 6b9a203..4c9b39b 100644
--- a/extensions-core/kafka-indexing-service/src/test/java/org/apache/druid/indexing/kafka/KafkaIndexTaskTest.java
+++ b/extensions-core/kafka-indexing-service/src/test/java/org/apache/druid/indexing/kafka/KafkaIndexTaskTest.java
@@ -195,7 +195,7 @@ import static org.apache.druid.query.QueryPlus.wrap;
 public class KafkaIndexTaskTest
 {
   private static final Logger log = new Logger(KafkaIndexTaskTest.class);
-  private static final ObjectMapper objectMapper = TestHelper.makeJsonMapper();
+  private static final ObjectMapper OBJECT_MAPPER = new TestUtils().getTestObjectMapper();
   private static final long POLL_RETRY_MS = 100;

   private static TestingCluster zkServer;
@@ -204,6 +204,10 @@ public class KafkaIndexTaskTest
   private static ListeningExecutorService taskExec;
   private static int topicPostfix;

+  static {
+    new KafkaIndexTaskModule().getJacksonModules().forEach(OBJECT_MAPPER::registerModule);
+  }
+
   private final List runningTasks = new ArrayList<>();
   private long handoffConditionTimeout = 0;
@@ -244,7 +248,7 @@
   private static final DataSchema DATA_SCHEMA = new DataSchema(
       "test_ds",
-      objectMapper.convertValue(
+      OBJECT_MAPPER.convertValue(
           new StringInputRowParser(
               new JSONParseSpec(
                   new TimestampSpec("timestamp", "iso", null),
@@ -272,7 +276,7 @@
       },
       new UniformGranularitySpec(Granularities.DAY, Granularities.NONE, null),
       null,
-      objectMapper
+      OBJECT_MAPPER
   );

   private static List> generateRecords(String topic)
@@ -730,10 +734,11 @@
     SegmentDescriptor desc7 = sd(task, "2013/P1D", 0);
     Assert.assertEquals(ImmutableSet.of(desc1, desc2, desc3, desc4, desc5, desc6, desc7), publishedDescriptors());
     Assert.assertEquals(
-        new KafkaDataSourceMetadata(
-            new SeekableStreamEndSequenceNumbers<>(topic, ImmutableMap.of(0, 10L, 1, 2L))
-        ),
-        metadataStorageCoordinator.getDataSourceMetadata(DATA_SCHEMA.getDataSource()));
+        new KafkaDataSourceMetadata(
+            new SeekableStreamEndSequenceNumbers<>(topic, ImmutableMap.of(0, 10L, 1, 2L))
+        ),
+        metadataStorageCoordinator.getDataS
```
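The pattern behind this fix can be sketched as a memoized supplier: the heavyweight runner is built only on first use, so a process that merely deserializes and re-serializes the task (as the overlord does) never constructs a runner or parser. The names below are illustrative, not the actual Druid classes:

```java
import java.util.function.Supplier;

class LazyRunnerSketch {
    static int runnersBuilt = 0;

    // Memoizing supplier: delegates at most once, on the first get().
    static <T> Supplier<T> memoize(Supplier<T> delegate) {
        return new Supplier<T>() {
            private T value;

            @Override
            public synchronized T get() {
                if (value == null) {
                    value = delegate.get(); // build lazily, exactly once
                }
                return value;
            }
        };
    }

    public static void main(String[] args) {
        Supplier<String> runner = memoize(() -> {
            runnersBuilt++; // stands in for the expensive runner/parser setup
            return "runner";
        });
        // Task created and serialized: no runner has been built yet.
        System.out.println(runnersBuilt == 0); // prints: true
        runner.get(); // task actually starts running
        runner.get(); // subsequent calls reuse the same instance
        System.out.println(runnersBuilt == 1); // prints: true
    }
}
```

Guava's `Suppliers.memoize` provides the same behavior off the shelf; the point is only that runner construction moves from task creation time to first run time.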
[GitHub] [incubator-druid] fjy closed issue #7724: Lookup transform expression not defined during ingest from kafka
fjy closed issue #7724: Lookup transform expression not defined during ingest from kafka
URL: https://github.com/apache/incubator-druid/issues/7724
[GitHub] [incubator-druid] fjy merged pull request #7729: SeekableStreamIndexTaskRunner: Lazy init of runner.
fjy merged pull request #7729: SeekableStreamIndexTaskRunner: Lazy init of runner.
URL: https://github.com/apache/incubator-druid/pull/7729
[GitHub] [incubator-druid] fjy merged pull request #7731: fix AggregatorFactory.finalizeComputation implementations to be ok with null inputs
fjy merged pull request #7731: fix AggregatorFactory.finalizeComputation implementations to be ok with null inputs
URL: https://github.com/apache/incubator-druid/pull/7731
[incubator-druid] branch master updated: fix AggregatorFactory.finalizeComputation implementations to be ok with null inputs (#7731)
This is an automated email from the ASF dual-hosted git repository.

fjy pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-druid.git

The following commit(s) were added to refs/heads/master by this push:
     new ffc2397  fix AggregatorFactory.finalizeComputation implementations to be ok with null inputs (#7731)

ffc2397 is described below

commit ffc2397bcda67c6ba153ef072694d5bd740696f9
Author: Clint Wylie
AuthorDate: Wed May 22 21:13:09 2019 -0700

    fix AggregatorFactory.finalizeComputation implementations to be ok with null inputs (#7731)

    * AggregatorFactory finalizeComputation is nullable with nullable input, make implementations honor this
    * fixes
---
 .../aggregation/distinctcount/DistinctCountAggregatorFactory.java      | 4 +++-
 .../momentsketch/aggregator/MomentSketchAggregatorFactory.java         | 3 ++-
 .../apache/druid/query/movingaverage/AveragerFactoryWrapper.java       | 4 +++-
 .../tdigestsketch/TDigestBuildSketchAggregatorFactory.java             | 3 ++-
 .../apache/druid/query/aggregation/TimestampAggregatorFactory.java     | 6 --
 .../aggregation/datasketches/hll/HllSketchAggregatorFactory.java       | 6 +-
 .../datasketches/quantiles/DoublesSketchAggregatorFactory.java         | 5 +++--
 .../datasketches/theta/SketchMergeAggregatorFactory.java               | 7 ++-
 .../datasketches/tuple/ArrayOfDoublesSketchAggregatorFactory.java      | 5 +++--
 .../query/aggregation/bloom/BloomFilterAggregatorFactory.java          | 3 ++-
 .../histogram/ApproximateHistogramAggregatorFactory.java               | 5 +++--
 .../query/aggregation/variance/VarianceAggregatorFactory.java          | 6 --
 .../org/apache/druid/query/aggregation/CountAggregatorFactory.java     | 4 +++-
 .../apache/druid/query/aggregation/FilteredAggregatorFactory.java      | 4 +++-
 .../apache/druid/query/aggregation/HistogramAggregatorFactory.java     | 5 +++--
 .../druid/query/aggregation/JavaScriptAggregatorFactory.java           | 4 +++-
 .../druid/query/aggregation/SuppressedAggregatorFactory.java           | 3 ++-
 .../aggregation/cardinality/CardinalityAggregatorFactory.java          | 5 +++--
 .../query/aggregation/first/StringFirstAggregatorFactory.java          | 6 --
 .../aggregation/hyperloglog/HyperUniquesAggregatorFactory.java         | 6 --
 .../druid/query/aggregation/last/StringLastAggregatorFactory.java      | 6 --
 21 files changed, 69 insertions(+), 31 deletions(-)

```diff
diff --git a/extensions-contrib/distinctcount/src/main/java/org/apache/druid/query/aggregation/distinctcount/DistinctCountAggregatorFactory.java b/extensions-contrib/distinctcount/src/main/java/org/apache/druid/query/aggregation/distinctcount/DistinctCountAggregatorFactory.java
index 524cc37..08f5136 100644
--- a/extensions-contrib/distinctcount/src/main/java/org/apache/druid/query/aggregation/distinctcount/DistinctCountAggregatorFactory.java
+++ b/extensions-contrib/distinctcount/src/main/java/org/apache/druid/query/aggregation/distinctcount/DistinctCountAggregatorFactory.java
@@ -35,6 +35,7 @@
 import org.apache.druid.query.dimension.DefaultDimensionSpec;
 import org.apache.druid.segment.ColumnSelectorFactory;
 import org.apache.druid.segment.DimensionSelector;

+import javax.annotation.Nullable;
 import java.nio.ByteBuffer;
 import java.util.Collections;
 import java.util.Comparator;
@@ -147,8 +148,9 @@ public class DistinctCountAggregatorFactory extends AggregatorFactory
     return object;
   }

+  @Nullable
   @Override
-  public Object finalizeComputation(Object object)
+  public Object finalizeComputation(@Nullable Object object)
   {
     return object;
   }
diff --git a/extensions-contrib/momentsketch/src/main/java/org/apache/druid/query/aggregation/momentsketch/aggregator/MomentSketchAggregatorFactory.java b/extensions-contrib/momentsketch/src/main/java/org/apache/druid/query/aggregation/momentsketch/aggregator/MomentSketchAggregatorFactory.java
index 918ad3e..1abe6a1 100644
--- a/extensions-contrib/momentsketch/src/main/java/org/apache/druid/query/aggregation/momentsketch/aggregator/MomentSketchAggregatorFactory.java
+++ b/extensions-contrib/momentsketch/src/main/java/org/apache/druid/query/aggregation/momentsketch/aggregator/MomentSketchAggregatorFactory.java
@@ -206,8 +206,9 @@ public class MomentSketchAggregatorFactory extends AggregatorFactory
     );
   }

+  @Nullable
   @Override
-  public Object finalizeComputation(Object object)
+  public Object finalizeComputation(@Nullable Object object)
   {
     return object;
   }
diff --git a/extensions-contrib/moving-average-query/src/main/java/org/apache/druid/query/movingaverage/AveragerFactoryWrapper.java b/extensions-contrib/moving-average-query/src/main/java/org/apache/druid/query/movingaverage/AveragerFactoryWrapper.java
index f6f1d90..9e4118c 100644
--- a/extensions-contrib/moving-average-query/src/main/java/org/apache/druid/query/movingaverage/AveragerFactoryWrapper.java
+++ b/extensions-contrib/moving-average-query/src/main/java/org/apache/druid
```
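The contract change can be illustrated in isolation: with SQL-compatible null handling an aggregator's intermediate value may be null, so `finalizeComputation` implementations must tolerate a null input rather than assume a non-null object. The stand-alone methods below are a sketch of the two common cases, not the real `AggregatorFactory` API:

```java
class NullableFinalizeSketch {
    // Pass-through finalization (the pattern in DistinctCountAggregatorFactory
    // above): a null intermediate simply finalizes to null instead of throwing.
    static Object finalizeComputation(Object object) {
        return object; // null in, null out
    }

    // A finalization that transforms its input must null-check before casting.
    static Double finalizeAsDouble(Object object) {
        return object == null ? null : ((Number) object).doubleValue();
    }

    public static void main(String[] args) {
        System.out.println(finalizeComputation(null) == null); // prints: true
        System.out.println(finalizeAsDouble(3L));               // prints: 3.0
    }
}
```

Before this change, an implementation like `finalizeAsDouble` without the null check would throw a `NullPointerException` whenever the intermediate was null, which is exactly what the `@Nullable` annotations in the diff are meant to prevent.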
[GitHub] [incubator-druid] fjy merged pull request #7669: add tests to dialogs, components and views
fjy merged pull request #7669: add tests to dialogs, components and views
URL: https://github.com/apache/incubator-druid/pull/7669
[GitHub] [incubator-druid] gianm commented on issue #7728: fix Select query fails with columns called "timestamp" bug
gianm commented on issue #7728: fix Select query fails with columns called "timestamp" bug
URL: https://github.com/apache/incubator-druid/pull/7728#issuecomment-495053991

In master, Scan supports time ordering, and this will be released in 0.15.0.
[GitHub] [incubator-druid] hellobabygogo commented on a change in pull request #7728: fix Select query fails with columns called "timestamp" bug
hellobabygogo commented on a change in pull request #7728: fix Select query fails with columns called "timestamp" bug
URL: https://github.com/apache/incubator-druid/pull/7728#discussion_r286753674

## File path: processing/src/main/java/org/apache/druid/query/select/EventHolder.java

```diff
@@ -32,7 +32,7 @@
  */
 public class EventHolder
 {
-  public static final String timestampKey = "timestamp";
+  public static final String timestampKey = "__time";
```

Review comment:
Thanks for your reply! OK, done!
[GitHub] [incubator-druid] hellobabygogo commented on issue #7728: fix Select query fails with columns called "timestamp" bug
hellobabygogo commented on issue #7728: fix Select query fails with columns called "timestamp" bug
URL: https://github.com/apache/incubator-druid/pull/7728#issuecomment-495041803

> @hellobabygogo thanks for the contrib. Just an FYI that select query is now deprecated in favor of Scan query.

@fjy Hi Fangjin, thanks for your reply! Because our users need sorting, the Scan query cannot replace Select for us.
[GitHub] [incubator-druid] hellobabygogo commented on issue #7728: fix Select query fails with columns called "timestamp" bug
hellobabygogo commented on issue #7728: fix Select query fails with columns called "timestamp" bug
URL: https://github.com/apache/incubator-druid/pull/7728#issuecomment-495034525

> How about instead fixing this by keeping time in `timestamp`, but then also dropping any dimension/metric that is named `timestamp` (instead of overwriting the time)? That would mean there's no incompatibility.
>
> I don't think the API churn is a good thing in this case. It will make migration tough. If we're going to force people to change their API calls, they might as well switch to Scan, which is better than Select.

Hi gianm, because our users need sorting, the Scan query cannot replace Select for us.
[GitHub] [incubator-druid] gianm commented on issue #7726: longMin/longMax has different result, should unify?
gianm commented on issue #7726: longMin/longMax has different result, should unify?
URL: https://github.com/apache/incubator-druid/issues/7726#issuecomment-495018328

What would SQL standard behavior be? I think we're not standard for query (2); it should probably return `null`. I'm not sure what query (1) should return.
[GitHub] [incubator-druid] gianm commented on issue #7701: Bump Jackson to 2.9.9
gianm commented on issue #7701: Bump Jackson to 2.9.9
URL: https://github.com/apache/incubator-druid/pull/7701#issuecomment-495018165

Jackson and Guava are two libraries that always seem to cause dependency nightmares in connection with Hadoop. I would want to test this out with various Hadoop distributions before pulling the trigger. I know it's a pain but we have been burned here before. Is there any chance you are willing to do some of these tests?
[GitHub] [incubator-druid] gianm commented on issue #7733: Provide segment size property that includes replicated segments
gianm commented on issue #7733: Provide segment size property that includes replicated segments
URL: https://github.com/apache/incubator-druid/issues/7733#issuecomment-495017202

It sounds useful to me. If it's not computationally or memory intensive to compute, I don't see why not.
[GitHub] [incubator-druid] gianm commented on issue #7728: fix Select query fails with columns called "timestamp" bug
gianm commented on issue #7728: fix Select query fails with columns called "timestamp" bug URL: https://github.com/apache/incubator-druid/pull/7728#issuecomment-495016935 How about instead fixing this by keeping time in `timestamp`, but then also dropping any dimension/metric that is named `timestamp` (instead of overwriting the time)? That would mean there's no incompatibility. I don't think the API churn is a good thing in this case. It will make migration tough. If we're going to force people to change their API calls, they might as well switch to Scan, which is better than Select.
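gianm's suggestion can be sketched roughly as follows. This is a hypothetical illustration, not actual Druid code: when assembling a Select event map, the `timestamp` key keeps the row time, and any user column that happens to share that name is dropped rather than allowed to overwrite it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SelectEventSketch
{
  // Hypothetical sketch of the proposal: keep the row time under the
  // "timestamp" key, and drop (rather than overwrite) any dimension/metric
  // that is also named "timestamp", so existing API responses stay compatible.
  public static Map<String, Object> buildEvent(long time, Map<String, Object> columns)
  {
    final Map<String, Object> event = new LinkedHashMap<>();
    event.put("timestamp", time);
    for (Map.Entry<String, Object> e : columns.entrySet()) {
      if (!"timestamp".equals(e.getKey())) {
        event.put(e.getKey(), e.getValue());
      }
    }
    return event;
  }
}
```

Under this sketch a colliding column silently disappears from the result, which is the trade-off gianm is proposing in exchange for avoiding API churn.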
[GitHub] [incubator-druid] himanshug commented on issue #7731: fix AggregatorFactory.finalizeComputation implementations to be ok with null inputs
himanshug commented on issue #7731: fix AggregatorFactory.finalizeComputation implementations to be ok with null inputs URL: https://github.com/apache/incubator-druid/pull/7731#issuecomment-495014170 @clintropolis yeah, I forgot that the default behavior might be dependent on the aggregation being considered. This PR is fine on its own.
[GitHub] [incubator-druid] gianm commented on issue #7609: Local variable names shouldn't start with capital
gianm commented on issue #7609: Local variable names shouldn't start with capital URL: https://github.com/apache/incubator-druid/issues/7609#issuecomment-494997860 > There are several occurrences of variables which are not constants (i.e non static fields) which have variables all capitalized for instance, Pattern DOT_OR_WHITESPACE = Pattern.compile("[\s]+|[.]+"); etc. I'm changing these to lower case. Is that fine? For some of these, it might be an indication that the variable was intended to be a static constant, but the original developer just forgot to include the static keyword. So I would approach these with an open mind to fixing bugs like that too, not necessarily just fixing the names to match the current state. > For local final variables example here http://checkstyle.sourceforge.net/config_naming.html#LocalFinalVariableName suggests names that are only upper case letters and digits "^[A-Z][A-Z0-9]*$". IMO, this is silly; I don't see a reason to require this pattern. In fact, I'd be more inclined to require the _opposite_ (local final variables should start with lowercase letters).
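The kind of "forgot the static keyword" bug gianm describes might look like the following (illustrative only; the regex is adapted from the quoted comment, with its string escaping corrected for Java source):

```java
import java.util.regex.Pattern;

public class NamingFixExample
{
  // A constant-style name on a non-static field is often a sign the author
  // intended a true constant:
  //
  //   private final Pattern DOT_OR_WHITESPACE = Pattern.compile("[\\s]+|[.]+");  // suspicious
  //
  // Two possible fixes: rename it to camelCase (dotOrWhitespace) to match its
  // non-constant nature, or — as gianm suggests considering — promote it to a
  // true constant, keeping the CONSTANT_CASE name:
  private static final Pattern DOT_OR_WHITESPACE = Pattern.compile("[\\s]+|[.]+");

  public static String[] split(String s)
  {
    return DOT_OR_WHITESPACE.split(s);
  }
}
```

The promotion also avoids recompiling the pattern per instance, which is presumably what the original author wanted.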
[GitHub] [incubator-druid] clintropolis commented on issue #7731: fix AggregatorFactory.finalizeComputation implementations to be ok with null inputs
clintropolis commented on issue #7731: fix AggregatorFactory.finalizeComputation implementations to be ok with null inputs URL: https://github.com/apache/incubator-druid/pull/7731#issuecomment-494988305 I think the cardinality/hyperunique behavior also raises another thing to consider: maybe we should try to make this behavior consistent across the same types of aggregators. For example, `HllSketchAggregatorFactory.finalizeComputation` should maybe return 0 instead of null when faced with a null, so it behaves the same way. Reasoning this through is maybe a bit more involved than fixing the null pointer exceptions, so it might be better suited to a follow-up PR to try to make similar aggregators behave consistently?
[GitHub] [incubator-druid] stale[bot] commented on issue #7174: Change reservoir sampling of segments to spliterator approach
stale[bot] commented on issue #7174: Change reservoir sampling of segments to spliterator approach URL: https://github.com/apache/incubator-druid/pull/7174#issuecomment-494982136 This pull request has been closed due to lack of activity. If you think that is incorrect, or the pull request requires review, you can revive the PR at any time.
[GitHub] [incubator-druid] stale[bot] closed pull request #7174: Change reservoir sampling of segments to spliterator approach
stale[bot] closed pull request #7174: Change reservoir sampling of segments to spliterator approach URL: https://github.com/apache/incubator-druid/pull/7174
[GitHub] [incubator-druid] clintropolis commented on a change in pull request #7729: SeekableStreamIndexTaskRunner: Lazy init of runner.
clintropolis commented on a change in pull request #7729: SeekableStreamIndexTaskRunner: Lazy init of runner. URL: https://github.com/apache/incubator-druid/pull/7729#discussion_r286673679 ## File path: indexing-service/src/main/java/org/apache/druid/indexing/seekablestream/SeekableStreamIndexTask.java ## @@ -283,13 +282,20 @@ public boolean withinMinMaxRecordTime(final InputRow row) @VisibleForTesting public Appenderator getAppenderator() { -return runner.getAppenderator(); +return getRunner().getAppenderator(); } @VisibleForTesting public SeekableStreamIndexTaskRunner getRunner() { +if (runner == null) { + synchronized (runnerInitLock) { +if (runner == null) { + runner = createTaskRunner(); +} + } +} Review comment: guava has `Suppliers.memoize` that does this; we use it in a couple of places already.
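For reference, the double-checked-locking idiom in the diff above is exactly what Guava's `Suppliers.memoize` encapsulates. A minimal stdlib-only sketch of that utility (not the Guava source, just the same idea) looks like:

```java
import java.util.function.Supplier;

public final class Memoize
{
  // Stdlib-only sketch of what Guava's Suppliers.memoize provides:
  // thread-safe lazy initialization via double-checked locking.
  // Assumes the delegate never returns null (a null result is re-computed).
  public static <T> Supplier<T> memoize(final Supplier<T> delegate)
  {
    return new Supplier<T>()
    {
      private volatile T value; // volatile is required for safe publication

      @Override
      public T get()
      {
        T result = value;
        if (result == null) {
          synchronized (this) {
            result = value;
            if (result == null) {
              value = result = delegate.get();
            }
          }
        }
        return result;
      }
    };
  }
}
```

With this in place, the lazy `getRunner()` above collapses to a single `memoizedRunner.get()` call, which is presumably why the reviewer suggests it.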
[GitHub] [incubator-druid] clintropolis edited a comment on issue #7731: fix AggregatorFactory.finalizeComputation implementations to be ok with null inputs
clintropolis edited a comment on issue #7731: fix AggregatorFactory.finalizeComputation implementations to be ok with null inputs URL: https://github.com/apache/incubator-druid/pull/7731#issuecomment-494957996 > ...use above in all of druid code instead of finalizeComputation(..) . This would fix the problem for extensions not in druid codebase and also one less thing to worry by future aggregator factory writers. I considered that as well, but didn't initially make the change because it would be taking a stronger position: it would mean that aggregators are not allowed to transform a null into a different value when finalizing a computation. My initial commit failed tests because I didn't realize that the cardinality/hyperunique aggregators were doing this in the `HyperUniquesAggregatorFactory.estimateCardinality` method; because it wasn't marked as nullable, I assumed it was non-null.
[GitHub] [incubator-druid] clintropolis commented on issue #7731: fix AggregatorFactory.finalizeComputation implementations to be ok with null inputs
clintropolis commented on issue #7731: fix AggregatorFactory.finalizeComputation implementations to be ok with null inputs URL: https://github.com/apache/incubator-druid/pull/7731#issuecomment-494957996 >This would fix the problem for extensions not in druid codebase and also one less thing to worry by future aggregator factory writers. I considered that as well, but didn't initially make the change because it would be taking a stronger position: it would mean that aggregators are not allowed to transform a null into a different value when finalizing a computation. My initial commit failed tests because I didn't realize that the cardinality/hyperunique aggregators were doing this in the `HyperUniquesAggregatorFactory.estimateCardinality` method; because it wasn't marked as nullable, I assumed it was non-null.
[GitHub] [incubator-druid] vogievetsky commented on issue #7669: add tests to dialogs, components and views
vogievetsky commented on issue #7669: add tests to dialogs, components and views URL: https://github.com/apache/incubator-druid/pull/7669#issuecomment-494951826 Yay, this finally passed all the tests! This is good to go as far as I am concerned 👍
[GitHub] [incubator-druid] jihoonson commented on a change in pull request #7734: reorganizing the ToC
jihoonson commented on a change in pull request #7734: reorganizing the ToC URL: https://github.com/apache/incubator-druid/pull/7734#discussion_r286663238 ## File path: docs/content/toc.md ## @@ -126,20 +132,19 @@ layout: toc * [Metrics and Monitoring](/docs/VERSION/operations/metrics.html) * [Alerts](/docs/VERSION/operations/alerts.html) * [Different Hadoop Versions](/docs/VERSION/operations/other-hadoop.html) - * [HTTP Compression](/docs/VERSION/operations/http-compression.html) - * [Basic Cluster Tuning](/docs/VERSION/operations/basic-cluster-tuning.html) - * Examples - * [Single-server Deployment Examples](/docs/VERSION/operations/single-server.html) - * [Clustered Deployment Example](/docs/VERSION/operations/example-cluster.html) - * [Recommendations](/docs/VERSION/operations/recommendations.html) -* [Performance FAQ](/docs/VERSION/operations/performance-faq.html) + * [HTTP Compression](/docs/VERSION/operations/http-compression.html) * [API Reference](/docs/VERSION/operations/api-reference.html) * [Coordinator](/docs/VERSION/operations/api-reference.html#coordinator) * [Overlord](/docs/VERSION/operations/api-reference.html#overlord) * [MiddleManager](/docs/VERSION/operations/api-reference.html#middlemanager) * [Peon](/docs/VERSION/operations/api-reference.html#peon) * [Broker](/docs/VERSION/operations/api-reference.html#broker) - * [Historical](/docs/VERSION/operations/api-reference.html#historical) + * [Historical](/docs/VERSION/operations/api-reference.html#historical) + * Tuning and Recommendations +* [Basic Cluster Tuning](/docs/VERSION/operations/basic-cluster-tuning.html) +* [General Recommendations](/docs/VERSION/operations/recommendations.html) +* [Performance FAQ](/docs/VERSION/operations/performance-faq.html) +* [JVM Best Practices](/docs/VERSION/configuration/index.html#jvm-configuration-best-practices) * Tools * [Dump Segment Tool](/docs/VERSION/operations/dump-segment.html) * [Insert Segment 
Tool](/docs/VERSION/operations/insert-segment-to-db.html) Review comment: Would you please remove this line? It was removed in https://github.com/apache/incubator-druid/pull/6911.
[GitHub] [incubator-druid] himanshug commented on a change in pull request #7729: SeekableStreamIndexTaskRunner: Lazy init of runner.
himanshug commented on a change in pull request #7729: SeekableStreamIndexTaskRunner: Lazy init of runner. URL: https://github.com/apache/incubator-druid/pull/7729#discussion_r286661510 ## File path: indexing-service/src/main/java/org/apache/druid/indexing/seekablestream/SeekableStreamIndexTask.java ## @@ -283,13 +282,20 @@ public boolean withinMinMaxRecordTime(final InputRow row) @VisibleForTesting public Appenderator getAppenderator() { -return runner.getAppenderator(); +return getRunner().getAppenderator(); } @VisibleForTesting public SeekableStreamIndexTaskRunner getRunner() { +if (runner == null) { + synchronized (runnerInitLock) { +if (runner == null) { + runner = createTaskRunner(); +} + } +} Review comment: and here I thought that with all the java8+ fanciness we would have a method like `AtomicReference.getOrCreate(Supplier)`, but it seems this is still the way to do a lazy singleton.
[GitHub] [incubator-druid] himanshug commented on issue #7731: fix AggregatorFactory.finalizeComputation implementations to be ok with null inputs
himanshug commented on issue #7731: fix AggregatorFactory.finalizeComputation implementations to be ok with null inputs URL: https://github.com/apache/incubator-druid/pull/7731#issuecomment-494931501 LGTM in general. But I had a thought of adding the following method to AggregatorFactory ``` @Nullable public final Object finalizeComputationWithNull(@Nullable Object object) { return object == null ? null : finalizeComputation(object); } ``` and using it in all of the druid code instead of `finalizeComputation(..)`. This would fix the problem for extensions not in the druid codebase, and it would also be one less thing for future aggregator factory writers to worry about.
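The pattern himanshug proposes can be demonstrated with a toy factory (illustrative only; `ToyAggregatorFactory` is not Druid's actual `AggregatorFactory`): the final wrapper short-circuits nulls at every call site, so subclasses never see a null input.

```java
public abstract class ToyAggregatorFactory
{
  // Final null-short-circuiting wrapper, as in the proposal: callers use this
  // everywhere instead of finalizeComputation, so subclass implementations
  // never have to null-check their input.
  public final Object finalizeComputationWithNull(Object object)
  {
    return object == null ? null : finalizeComputation(object);
  }

  // Subclasses may assume a non-null argument here.
  public abstract Object finalizeComputation(Object object);
}
```

Note this bakes in the "stronger position" clintropolis mentions above: a factory can no longer map null to some other finalized value (e.g. a cardinality of 0).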
[GitHub] [incubator-druid] fjy opened a new pull request #7734: reorganizing the ToC
fjy opened a new pull request #7734: reorganizing the ToC URL: https://github.com/apache/incubator-druid/pull/7734
[incubator-druid] branch toc-fix created (now afd80e0)
This is an automated email from the ASF dual-hosted git repository. fjy pushed a change to branch toc-fix in repository https://gitbox.apache.org/repos/asf/incubator-druid.git. at afd80e0 reorganizing the ToC No new revisions were added by this update.
[GitHub] [incubator-druid] himanshug commented on a change in pull request #7728: fix Select query fails with columns called "timestamp" bug
himanshug commented on a change in pull request #7728: fix Select query fails with columns called "timestamp" bug URL: https://github.com/apache/incubator-druid/pull/7728#discussion_r286641405 ## File path: processing/src/main/java/org/apache/druid/query/select/EventHolder.java ## @@ -32,7 +32,7 @@ */ public class EventHolder { - public static final String timestampKey = "timestamp"; + public static final String timestampKey = "__time"; Review comment: well... it was a "documented feature", so the documentation for the Select query needs to be updated as well. Please also add release notes in the PR description, to be included in the release notes of whatever release this goes into, since someone might be using the key "timestamp".
[GitHub] [incubator-druid] a2l007 opened a new issue #7733: Provide segment size property that includes replicated segments
a2l007 opened a new issue #7733: Provide segment size property that includes replicated segments URL: https://github.com/apache/incubator-druid/issues/7733 ### Description I'd like to add a new property `replicatedSize` to the tier-specific result of `druid/coordinator/v1/datasources?simple` and `/druid/coordinator/v1/datasources/{dataSourceName}` which would provide the segment size that includes replication. ### Motivation Presently the datasources endpoint provides the unreplicated segment size of each tier within a datasource. A replicated segment size property for each tier would be useful for services to monitor total tier usage periodically.
[GitHub] [incubator-druid] himanshug merged pull request #7674: add s3 authentication method informations
himanshug merged pull request #7674: add s3 authentication method informations URL: https://github.com/apache/incubator-druid/pull/7674
[incubator-druid] branch master updated (1fe0de1 -> bd899b9)
himanshug pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-druid.git. from 1fe0de1 Fix currSize attribute of historical server type (#7706) add bd899b9 add s3 authentication method informations (#7674) No new revisions were added by this update. Summary of changes: docs/content/configuration/index.md| 5 ++-- docs/content/development/extensions-core/s3.md | 39 -- 2 files changed, 34 insertions(+), 10 deletions(-)
[GitHub] [incubator-druid] mcbrewster commented on a change in pull request #7705: Web-Console: add go to editor button to tasks and supervisors view
mcbrewster commented on a change in pull request #7705: Web-Console: add go to editor button to tasks and supervisors view URL: https://github.com/apache/incubator-druid/pull/7705#discussion_r286611391 ## File path: web-console/src/views/tasks-view.tsx ## @@ -303,6 +303,16 @@ ORDER BY "rank" DESC, "created_time" DESC`); onAction: () => this.setState({ terminateSupervisorId: id }) } ]; +if (type === 'kafka' || type === 'kinesis') { + actions.push( + { + icon: IconNames.WRENCH, + title: 'Open in json spec editor', + onAction: () => this.props.goToLoadDataView(id) + }); +} +// @ts-ignore Review comment: When I have `icon: IconNames.CROSS` or `icon: 'cross'`, I get an error that string is not assignable to type `IconName`.
[GitHub] [incubator-druid] fjy merged pull request #7732: [Backport] Update tutorial to delete data
fjy merged pull request #7732: [Backport] Update tutorial to delete data URL: https://github.com/apache/incubator-druid/pull/7732
[incubator-druid] branch 0.15.0-incubating updated: Update tutorial to delete data (#7577) (#7732)
fjy pushed a commit to branch 0.15.0-incubating in repository https://gitbox.apache.org/repos/asf/incubator-druid.git The following commit(s) were added to refs/heads/0.15.0-incubating by this push: new f695425 Update tutorial to delete data (#7577) (#7732) f695425 is described below commit f695425c5b4c79ed945dbe58e2d32d74e5b47503 Author: Jihoon Son AuthorDate: Wed May 22 10:48:56 2019 -0700 Update tutorial to delete data (#7577) (#7732) * Update tutorial to delete data * update tutorial, remove old ways to drop data * PR comments --- .../content/tutorials/img/tutorial-deletion-02.png | Bin 200459 -> 810422 bytes .../content/tutorials/img/tutorial-deletion-03.png | Bin 0 -> 805673 bytes docs/content/tutorials/tutorial-delete-data.md | 58 +++-- .../tutorial/deletion-disable-segments.json| 7 +++ 4 files changed, 37 insertions(+), 28 deletions(-) diff --git a/docs/content/tutorials/img/tutorial-deletion-02.png b/docs/content/tutorials/img/tutorial-deletion-02.png index fdea20f..9b84f0c 100644 Binary files a/docs/content/tutorials/img/tutorial-deletion-02.png and b/docs/content/tutorials/img/tutorial-deletion-02.png differ diff --git a/docs/content/tutorials/img/tutorial-deletion-03.png b/docs/content/tutorials/img/tutorial-deletion-03.png new file mode 100644 index 000..e6fb1f3 Binary files /dev/null and b/docs/content/tutorials/img/tutorial-deletion-03.png differ diff --git a/docs/content/tutorials/tutorial-delete-data.md b/docs/content/tutorials/tutorial-delete-data.md index 41812f7..46fbbdc 100644 --- a/docs/content/tutorials/tutorial-delete-data.md +++ b/docs/content/tutorials/tutorial-delete-data.md @@ -29,8 +29,6 @@ This tutorial demonstrates how to delete existing data. For this tutorial, we'll assume you've already downloaded Apache Druid (incubating) as described in the [single-machine quickstart](index.html) and have it running on your local machine. 
-Completing [Tutorial: Configuring retention](../tutorials/tutorial-retention.html) first is highly recommended, as we will be using retention rules in this tutorial. - ## Load initial data In this tutorial, we will use the Wikipedia edits data, with an indexing spec that creates hourly segments. This spec is located at `quickstart/tutorial/deletion-index.json`, and it creates a datasource called `deletion-tutorial`. @@ -47,30 +45,25 @@ When the load finishes, open [http://localhost:/unified-console.html#datasou Permanent deletion of a Druid segment has two steps: -1. The segment must first be marked as "unused". This occurs when a segment is dropped by retention rules, and when a user manually disables a segment through the Coordinator API. This tutorial will cover both cases. +1. The segment must first be marked as "unused". This occurs when a user manually disables a segment through the Coordinator API. 2. After segments have been marked as "unused", a Kill Task will delete any "unused" segments from Druid's metadata store as well as deep storage. -Let's drop some segments now, first with load rules, then manually. - -## Drop some data with load rules - -As with the previous retention tutorial, there are currently 24 segments in the `deletion-tutorial` datasource. - -click the blue pencil icon next to `Cluster default: loadForever` for the `deletion-tutorial` datasource. +Let's drop some segments now, by using the coordinator API to drop data by interval and segmentIds. -A rule configuration window will appear. +## Disable segments by interval -Now click the `+ New rule` button twice. +Let's disable segments in a specified interval. This will mark all segments in the interval as "unused", but not remove them from deep storage. +Let's disable segments in interval `2015-09-12T18:00:00.000Z/2015-09-12T20:00:00.000Z` i.e. between hour 18 and 20. 
-In the upper rule box, select `Load` and `by interval`, and then enter `2015-09-12T12:00:00.000Z/2015-09-13T00:00:00.000Z` in field next to `by interval`. Replicants can remain at 2 in the `_default_tier`. - -In the lower rule box, select `Drop` and `forever`. +```bash +curl -X 'POST' -H 'Content-Type:application/json' -d '{ "interval" : "2015-09-12T18:00:00.000Z/2015-09-12T20:00:00.000Z" }' http://localhost:8081/druid/coordinator/v1/datasources/deletion-tutorial/markUnused +``` -Now click `Next` and enter `tutorial` for both the user and changelog comment field. +After that command completes, you should see that the segment for hour 18 and 19 have been disabled: -This will cause the first 12 segments of `deletion-tutorial` to be dropped. However, these dropped segments are not removed from deep storage. +![Segments 2](../tutorials/img/tutorial-deletion-02.png "Segments 2") -You can see that all 24 segments are still present in deep storage by listing the contents of `apache-druid-#{DRUIDVERSION}/
[GitHub] [incubator-druid] sashidhar commented on issue #7609: Local variable names shouldn't start with capital
sashidhar commented on issue #7609: Local variable names shouldn't start with capital URL: https://github.com/apache/incubator-druid/issues/7609#issuecomment-494899188 @leventov, 3. For local final variables, the example here http://checkstyle.sourceforge.net/config_naming.html#LocalFinalVariableName suggests names that are only upper case letters and digits `"^[A-Z][A-Z0-9]*$"`. As per this pattern, the following variable would have to be changed from final ObjectMapper mapper = new ObjectMapper(); to final ObjectMapper MAPPER = new ObjectMapper(); I'm not sure it makes sense to change non-static variable names to all caps!
[GitHub] [incubator-druid] vogievetsky commented on a change in pull request #7705: Web-Console: add go to editor button to tasks and supervisors view
vogievetsky commented on a change in pull request #7705: Web-Console: add go to editor button to tasks and supervisors view URL: https://github.com/apache/incubator-druid/pull/7705#discussion_r286602310 ## File path: web-console/src/views/load-data-view.tsx ## @@ -175,14 +176,14 @@ const VIEW_TITLE: Record = { export interface LoadDataViewProps extends React.Props { initSpec: IngestionSpec | null; - goToTask: (taskId: string | null, openDialog?: string) => void; + taskId?: string | null; Review comment: I would call this `initTaskId`. Also, you need a supervisor ID in case you are coming from a supervisor.
[GitHub] [incubator-druid] vogievetsky commented on a change in pull request #7705: Web-Console: add go to editor button to tasks and supervisors view
vogievetsky commented on a change in pull request #7705: Web-Console: add go to editor button to tasks and supervisors view URL: https://github.com/apache/incubator-druid/pull/7705#discussion_r286601831 ## File path: web-console/src/views/tasks-view.tsx ## @@ -303,6 +303,16 @@ ORDER BY "rank" DESC, "created_time" DESC`); onAction: () => this.setState({ terminateSupervisorId: id }) } ]; +if (type === 'kafka' || type === 'kinesis') { + actions.push( + { + icon: IconNames.WRENCH, + title: 'Open in json spec editor', Review comment: This should say "Open in data loader"
[GitHub] [incubator-druid] vogievetsky commented on a change in pull request #7705: Web-Console: add go to editor button to tasks and supervisors view
vogievetsky commented on a change in pull request #7705: Web-Console: add go to editor button to tasks and supervisors view URL: https://github.com/apache/incubator-druid/pull/7705#discussion_r285795400 ## File path: web-console/src/views/tasks-view.tsx ## @@ -508,18 +518,24 @@ ORDER BY "rank" DESC, "created_time" DESC`); ; } - // -- - - private getTaskActions(id: string, status: string): BasicAction[] { -if (status !== 'RUNNING' && status !== 'WAITING' && status !== 'PENDING') return []; -return [ - { -icon: IconNames.CROSS, + private getTaskActions(id: string, status: string, type: string): BasicAction[] { +const actions = []; +if (type === 'index' || type === 'index_parallel') { + actions.push({ +title: 'Open in json spec editor', Review comment: this should be "Open in data loader"
[GitHub] [incubator-druid] vogievetsky commented on a change in pull request #7705: Web-Console: add go to editor button to tasks and supervisors view
vogievetsky commented on a change in pull request #7705: Web-Console: add go to editor button to tasks and supervisors view URL: https://github.com/apache/incubator-druid/pull/7705#discussion_r286601690 ## File path: web-console/src/views/tasks-view.tsx ## @@ -303,6 +303,16 @@ ORDER BY "rank" DESC, "created_time" DESC`); onAction: () => this.setState({ terminateSupervisorId: id }) } ]; +if (type === 'kafka' || type === 'kinesis') { + actions.push( + { + icon: IconNames.WRENCH, + title: 'Open in json spec editor', + onAction: () => this.props.goToLoadDataView(id) + }); +} +// @ts-ignore Review comment: what is this line?
[GitHub] [incubator-druid] vogievetsky commented on a change in pull request #7705: Web-Console: add go to editor button to tasks and supervisors view
vogievetsky commented on a change in pull request #7705: Web-Console: add go to editor button to tasks and supervisors view URL: https://github.com/apache/incubator-druid/pull/7705#discussion_r286601266 ## File path: web-console/src/console-application.tsx ## @@ -107,6 +107,7 @@ export class ConsoleApplication extends React.Component
[GitHub] [incubator-druid] vogievetsky commented on a change in pull request #7705: Web-Console: add go to editor button to tasks and supervisors view
vogievetsky commented on a change in pull request #7705: Web-Console: add go to editor button to tasks and supervisors view URL: https://github.com/apache/incubator-druid/pull/7705#discussion_r286601322 ## File path: web-console/src/views/load-data-view.tsx ## @@ -27,6 +27,7 @@ import { } from '@blueprintjs/core'; import { IconNames } from '@blueprintjs/icons'; import axios from 'axios'; +import {edit} from 'brace'; Review comment: typo?
[GitHub] [incubator-druid] jihoonson opened a new pull request #7732: [Backport] Update tutorial to delete data
jihoonson opened a new pull request #7732: [Backport] Update tutorial to delete data URL: https://github.com/apache/incubator-druid/pull/7732 Backport of #7577 to 0.15.0-incubating.
[GitHub] [incubator-druid] sashidhar edited a comment on issue #7609: Local variable names shouldn't start with capital
sashidhar edited a comment on issue #7609: Local variable names shouldn't start with capital URL: https://github.com/apache/incubator-druid/issues/7609#issuecomment-494883301 @leventov 1. This is the match pattern I'm currently using for local variable names: `^[a-z_](_?[a-zA-Z0-9_]+)*$`, which also allows variables starting or ending with underscores. Let me know if this is fine. 2. There are several occurrences of variables which are not constants (non-final) but have all-capital names, for instance `Pattern DOT_OR_WHITESPACE = Pattern.compile("[\\s]+|[.]+");`. I'm changing these to lower case. Is that fine?
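The proposed match pattern can be exercised directly with `java.util.regex`; a quick sketch (hypothetical class and method names) showing which identifiers it accepts:

```java
import java.util.regex.Pattern;

public class LocalVarNameCheck {
    // Match pattern proposed above for local variable names: must start with a
    // lowercase letter or underscore; single underscores allowed between runs
    // of alphanumerics, including a trailing underscore.
    static final Pattern LOCAL_VAR = Pattern.compile("^[a-z_](_?[a-zA-Z0-9_]+)*$");

    public static boolean isValid(String name) {
        return LOCAL_VAR.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("dotOrWhitespace"));   // true
        System.out.println(isValid("_temp"));             // true
        System.out.println(isValid("DOT_OR_WHITESPACE")); // false: starts with a capital
    }
}
```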
[GitHub] [incubator-druid] fjy commented on issue #7728: fix Select query fails with columns called "timestamp" bug
fjy commented on issue #7728: fix Select query fails with columns called "timestamp" bug URL: https://github.com/apache/incubator-druid/pull/7728#issuecomment-494850783 @hellobabygogo thanks for the contrib. Just an FYI that select query is now deprecated in favor of Scan query.
[GitHub] [incubator-druid] Fokko commented on issue #7701: Bump Jackson to 2.9.9
Fokko commented on issue #7701: Bump Jackson to 2.9.9 URL: https://github.com/apache/incubator-druid/pull/7701#issuecomment-494820707 Rebased against master
[GitHub] [incubator-druid] gocho1 commented on issue #7674: add s3 authentication method informations
gocho1 commented on issue #7674: add s3 authentication method informations URL: https://github.com/apache/incubator-druid/pull/7674#issuecomment-494753005 Thanks, it worked!
[GitHub] [incubator-druid] clintropolis opened a new pull request #7731: fix AggregatorFactory.finalizeComputation implementations to be ok with null inputs
clintropolis opened a new pull request #7731: fix AggregatorFactory.finalizeComputation implementations to be ok with null inputs URL: https://github.com/apache/incubator-druid/pull/7731 `AggregatorFactory.finalizeComputation` is nullable with nullable input, but some aggregators do not check for this and can explode if the input is null. I _think_ I got them all in this PR. I did not add unit tests to ensure all of these methods are ok, but tested by hand; the scenario can easily be replicated by using any of the aggs fixed in this PR in a timeseries query with `grandTotal` enabled and an out-of-range interval (so there is no actual segment data for input), which produces a `java.lang.NullPointerException`.
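The fix described in this PR is simply a null check before finalizing; a minimal illustrative sketch under assumed names (not the actual Druid aggregator code) of how a finalizer can propagate null rather than dereference it:

```java
public class NullSafeFinalizer {
    // Illustrative stand-in for an AggregatorFactory.finalizeComputation
    // override: with grandTotal over an interval that has no segment data,
    // the input can be null, so the finalizer must pass null through.
    public static Double finalizeComputation(Object object) {
        if (object == null) {
            return null; // no segment data contributed: nothing to finalize
        }
        return ((Number) object).doubleValue();
    }

    public static void main(String[] args) {
        System.out.println(finalizeComputation(42L));  // 42.0
        System.out.println(finalizeComputation(null)); // null instead of an NPE
    }
}
```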
[GitHub] [incubator-druid] gianm commented on a change in pull request #7653: Refactor SQLMetadataSegmentManager; Change contract of REST methods in DataSourcesResource
gianm commented on a change in pull request #7653: Refactor SQLMetadataSegmentManager; Change contract of REST methods in DataSourcesResource URL: https://github.com/apache/incubator-druid/pull/7653#discussion_r286416635 ## File path: server/src/main/java/org/apache/druid/metadata/SQLMetadataSegmentManager.java ## @@ -197,366 +320,489 @@ private Runnable createPollTaskForStartOrder(long startOrder) } @Override - @LifecycleStop - public void stop() + public void stopPollingDatabasePeriodically() { -ReentrantReadWriteLock.WriteLock lock = startStopLock.writeLock(); +ReentrantReadWriteLock.WriteLock lock = startStopPollLock.writeLock(); lock.lock(); try { - if (!isStarted()) { + if (!isPollingDatabasePeriodically()) { return; } - dataSources = null; - currentStartOrder = -1; - exec.shutdownNow(); - exec = null; + periodicPollTaskFuture.cancel(false); + latestDatabasePoll = null; + + // NOT nulling dataSources, allowing to query the latest polled data even when this SegmentsMetadata object is + // stopped. + + currentStartPollingOrder = -1; } finally { lock.unlock(); } } - private Pair usedPayloadMapper( - final int index, - final ResultSet resultSet, - final StatementContext context - ) throws SQLException + private void awaitOrPerformDatabasePoll() { +// Double-checked locking with awaitPeriodicOrFreshOnDemandDatabasePoll() call playing the role of the "check". 
+if (awaitPeriodicOrFreshOnDemandDatabasePoll()) { + return; +} +ReentrantReadWriteLock.WriteLock lock = startStopPollLock.writeLock(); +lock.lock(); try { - return new Pair<>( - jsonMapper.readValue(resultSet.getBytes("payload"), DataSegment.class), - resultSet.getBoolean("used") - ); + if (awaitPeriodicOrFreshOnDemandDatabasePoll()) { +return; + } + OnDemandDatabasePoll newOnDemandUpdate = new OnDemandDatabasePoll(); + this.latestDatabasePoll = newOnDemandUpdate; + doOnDemandPoll(newOnDemandUpdate); } -catch (IOException e) { - throw new RuntimeException(e); +finally { + lock.unlock(); } } - /** - * Gets a list of all datasegments that overlap the provided interval along with thier used status. - */ - private List> getDataSegmentsOverlappingInterval( - final String dataSource, - final Interval interval - ) - { -return connector.inReadOnlyTransaction( -(handle, status) -> handle.createQuery( -StringUtils.format( -"SELECT used, payload FROM %1$s WHERE dataSource = :dataSource AND start < :end AND %2$send%2$s > :start", -getSegmentsTable(), -connector.getQuoteString() -) -) -.setFetchSize(connector.getStreamingFetchSize()) -.bind("dataSource", dataSource) -.bind("start", interval.getStart().toString()) -.bind("end", interval.getEnd().toString()) -.map(this::usedPayloadMapper) -.list() -); - } - - private List> getDataSegments( - final String dataSource, - final Collection segmentIds, - final Handle handle - ) + private boolean awaitPeriodicOrFreshOnDemandDatabasePoll() { -return segmentIds.stream().map( -segmentId -> Optional.ofNullable( -handle.createQuery( -StringUtils.format( -"SELECT used, payload FROM %1$s WHERE dataSource = :dataSource AND id = :id", -getSegmentsTable() -) -) -.bind("dataSource", dataSource) -.bind("id", segmentId) -.map(this::usedPayloadMapper) -.first() -) -.orElseThrow(() -> new UnknownSegmentIdException(StringUtils.format("Cannot find segment id [%s]", segmentId))) -) -.collect(Collectors.toList()); +DatabasePoll latestDatabasePoll = 
this.latestDatabasePoll; +if (latestDatabasePoll instanceof PeriodicDatabasePoll) { + Futures.getUnchecked(((PeriodicDatabasePoll) latestDatabasePoll).firstPollCompletionFuture); + return true; +} +if (latestDatabasePoll instanceof OnDemandDatabasePoll) { + long periodicPollDelayNanos = TimeUnit.MILLISECONDS.toNanos(periodicPollDelay.getMillis()); + OnDemandDatabasePoll latestOnDemandUpdate = (OnDemandDatabasePoll) latestDatabasePoll; + boolean latestUpdateIsFresh = latestOnDemandUpdate.nanosElapsedFromInitiation() < periodicPollDelayNanos; + if (latestUpdateIsFresh) { +Futures.getUnchecked(latestOnDemandUpdate.pollCompletionFuture); +return true; + } + // Latest on-demand update is not fresh. Fall through to return false from this method. +} else { + assert latestDatabasePoll == null; +} +return false; } - /** - * Builds a Version
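The `awaitOrPerformDatabasePoll()` method quoted in this diff follows the classic check, lock, re-check shape of double-checked locking, with `awaitPeriodicOrFreshOnDemandDatabasePoll()` playing the role of the "check". A stripped-down generic sketch of that pattern (hypothetical names, not the Druid code):

```java
import java.util.concurrent.locks.ReentrantLock;

public class CheckLockRecheck {
    private final ReentrantLock lock = new ReentrantLock();
    // volatile so the fast-path read sees the latest published result
    private volatile Object cached;

    /** The "check": true if a usable result already exists. */
    private boolean checkFresh() {
        return cached != null;
    }

    public Object get() {
        if (checkFresh()) {        // first check, no lock taken
            return cached;
        }
        lock.lock();               // slow path
        try {
            if (checkFresh()) {    // re-check under the lock
                return cached;     // another thread polled while we waited
            }
            cached = expensivePoll();
            return cached;
        } finally {
            lock.unlock();
        }
    }

    private Object expensivePoll() {
        return new Object(); // stand-in for a database poll
    }
}
```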
[GitHub] [incubator-druid] gianm commented on a change in pull request #7653: Refactor SQLMetadataSegmentManager; Change contract of REST methods in DataSourcesResource
gianm commented on a change in pull request #7653: Refactor SQLMetadataSegmentManager; Change contract of REST methods in DataSourcesResource URL: https://github.com/apache/incubator-druid/pull/7653#discussion_r286368954 ## File path: server/src/main/java/org/apache/druid/metadata/SQLMetadataSegmentManager.java ## @@ -197,366 +320,489 @@ private Runnable createPollTaskForStartOrder(long startOrder) } @Override - @LifecycleStop - public void stop() + public void stopPollingDatabasePeriodically() { -ReentrantReadWriteLock.WriteLock lock = startStopLock.writeLock(); +ReentrantReadWriteLock.WriteLock lock = startStopPollLock.writeLock(); lock.lock(); try { - if (!isStarted()) { + if (!isPollingDatabasePeriodically()) { return; } - dataSources = null; - currentStartOrder = -1; - exec.shutdownNow(); - exec = null; + periodicPollTaskFuture.cancel(false); + latestDatabasePoll = null; + + // NOT nulling dataSources, allowing to query the latest polled data even when this SegmentsMetadata object is + // stopped. + + currentStartPollingOrder = -1; } finally { lock.unlock(); } } - private Pair usedPayloadMapper( - final int index, - final ResultSet resultSet, - final StatementContext context - ) throws SQLException + private void awaitOrPerformDatabasePoll() { +// Double-checked locking with awaitPeriodicOrFreshOnDemandDatabasePoll() call playing the role of the "check". 
+if (awaitPeriodicOrFreshOnDemandDatabasePoll()) { + return; +} +ReentrantReadWriteLock.WriteLock lock = startStopPollLock.writeLock(); +lock.lock(); try { - return new Pair<>( - jsonMapper.readValue(resultSet.getBytes("payload"), DataSegment.class), - resultSet.getBoolean("used") - ); + if (awaitPeriodicOrFreshOnDemandDatabasePoll()) { +return; + } + OnDemandDatabasePoll newOnDemandUpdate = new OnDemandDatabasePoll(); + this.latestDatabasePoll = newOnDemandUpdate; + doOnDemandPoll(newOnDemandUpdate); } -catch (IOException e) { - throw new RuntimeException(e); +finally { + lock.unlock(); } } - /** - * Gets a list of all datasegments that overlap the provided interval along with thier used status. - */ - private List> getDataSegmentsOverlappingInterval( - final String dataSource, - final Interval interval - ) - { -return connector.inReadOnlyTransaction( -(handle, status) -> handle.createQuery( -StringUtils.format( -"SELECT used, payload FROM %1$s WHERE dataSource = :dataSource AND start < :end AND %2$send%2$s > :start", -getSegmentsTable(), -connector.getQuoteString() -) -) -.setFetchSize(connector.getStreamingFetchSize()) -.bind("dataSource", dataSource) -.bind("start", interval.getStart().toString()) -.bind("end", interval.getEnd().toString()) -.map(this::usedPayloadMapper) -.list() -); - } - - private List> getDataSegments( - final String dataSource, - final Collection segmentIds, - final Handle handle - ) + private boolean awaitPeriodicOrFreshOnDemandDatabasePoll() { -return segmentIds.stream().map( -segmentId -> Optional.ofNullable( -handle.createQuery( -StringUtils.format( -"SELECT used, payload FROM %1$s WHERE dataSource = :dataSource AND id = :id", -getSegmentsTable() -) -) -.bind("dataSource", dataSource) -.bind("id", segmentId) -.map(this::usedPayloadMapper) -.first() -) -.orElseThrow(() -> new UnknownSegmentIdException(StringUtils.format("Cannot find segment id [%s]", segmentId))) -) -.collect(Collectors.toList()); +DatabasePoll latestDatabasePoll = 
this.latestDatabasePoll; Review comment: It looks like the idea is that if we are in the midst of a leadership epoch, then: - `isPollingDatabasePeriodically` would return true - `latestDatabasePoll` will be a PeriodicDatabasePoll - we are guaranteed to await for that PeriodicDatabasePoll to complete its first poll, if `awaitPeriodicOrFreshOnDemandDatabasePoll` is called after the leadership epoch starts Is that right? If so it should provide the property I mentioned in another comment (preventing leadership epoch N from using an older snapshot than leadership epoch N-1). If so IMO it'd be nice to have some comments or assertions about these class invariants. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries
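The freshness test in the quoted `awaitPeriodicOrFreshOnDemandDatabasePoll()` compares a monotonic elapsed time against the periodic poll delay; a stripped-down sketch of just that arithmetic (hypothetical class, illustrative delay value):

```java
import java.util.concurrent.TimeUnit;

public class PollFreshness {
    // System.nanoTime() is monotonic, so elapsed-time math is safe even if
    // the wall clock is adjusted; never compare nanoTime values to a date.
    final long initiationTimeNanos = System.nanoTime();

    long nanosElapsedFromInitiation() {
        return System.nanoTime() - initiationTimeNanos;
    }

    /** True while this poll is younger than the periodic poll delay. */
    boolean isFresh(long periodicPollDelayMillis) {
        return nanosElapsedFromInitiation() < TimeUnit.MILLISECONDS.toNanos(periodicPollDelayMillis);
    }

    public static void main(String[] args) {
        PollFreshness poll = new PollFreshness();
        System.out.println(poll.isFresh(60_000)); // true: just created
    }
}
```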
[GitHub] [incubator-druid] gianm commented on a change in pull request #7653: Refactor SQLMetadataSegmentManager; Change contract of REST methods in DataSourcesResource
gianm commented on a change in pull request #7653: Refactor SQLMetadataSegmentManager; Change contract of REST methods in DataSourcesResource URL: https://github.com/apache/incubator-druid/pull/7653#discussion_r286321411 ## File path: server/src/main/java/org/apache/druid/metadata/SQLMetadataSegmentManager.java ## @@ -80,51 +87,121 @@ private static final EmittingLogger log = new EmittingLogger(SQLMetadataSegmentManager.class); /** - * Use to synchronize {@link #start()}, {@link #stop()}, {@link #poll()}, and {@link #isStarted()}. These methods - * should be synchronized to prevent from being called at the same time if two different threads are calling them. - * This might be possible if a druid coordinator gets and drops leadership repeatedly in quick succession. + * Marker interface for objects stored in {@link #latestDatabasePoll}. See the comment for that field for details. */ - private final ReentrantReadWriteLock startStopLock = new ReentrantReadWriteLock(); + private interface DatabasePoll + {} + + /** Represents periodic {@link #poll}s happening from {@link #exec}. */ + private static class PeriodicDatabasePoll implements DatabasePoll + { +/** + * This future allows to wait until {@link #dataSources} is initialized in the first {@link #poll()} happening since + * {@link #startPollingDatabasePeriodically()} is called for the first time, or since the last visible (in + * happens-before terms) call to {@link #startPollingDatabasePeriodically()} in case of Coordinator's leadership + * changes. + */ +final CompletableFuture firstPollCompletionFuture = new CompletableFuture<>(); + } + + /** + * Represents on-demand {@link #poll} initiated at periods of time when SqlSegmentsMetadata doesn't poll the database + * periodically. 
+ */ + private static class OnDemandDatabasePoll implements DatabasePoll + { +final long initiationTimeNanos = System.nanoTime(); +final CompletableFuture pollCompletionFuture = new CompletableFuture<>(); + +long nanosElapsedFromInitiation() +{ + return System.nanoTime() - initiationTimeNanos; +} + } + + /** + * Use to synchronize {@link #startPollingDatabasePeriodically}, {@link #stopPollingDatabasePeriodically}, {@link + * #poll}, and {@link #isPollingDatabasePeriodically}. These methods should be synchronized to prevent from being + * called at the same time if two different threads are calling them. This might be possible if Coordinator gets and + * drops leadership repeatedly in quick succession. + * + * This lock is also used to synchronize {@link #awaitOrPerformDatabasePoll} for times when SqlSegmentsMetadata + * is not polling the database periodically (in other words, when the Coordinator is not the leader). + */ + private final ReentrantReadWriteLock startStopPollLock = new ReentrantReadWriteLock(); /** * Used to ensure that {@link #poll()} is never run concurrently. It should already be so (at least in production * code), where {@link #poll()} is called only from the task created in {@link #createPollTaskForStartOrder} and is * scheduled in a single-threaded {@link #exec}, so this lock is an additional safety net in case there are bugs in * the code, and for tests, where {@link #poll()} is called from the outside code. * - * Not using {@link #startStopLock}.writeLock() in order to still be able to run {@link #poll()} concurrently with - * {@link #isStarted()}. + * Not using {@link #startStopPollLock}.writeLock() in order to still be able to run {@link #poll()} concurrently + * with {@link #isPollingDatabasePeriodically()}. 
*/ private final Object pollLock = new Object(); private final ObjectMapper jsonMapper; - private final Supplier config; + private final Duration periodicPollDelay; private final Supplier dbTables; private final SQLMetadataConnector connector; - // Volatile since this reference is reassigned in "poll" and then read from in other threads. - // Starts null so we can differentiate "never polled" (null) from "polled, but empty" (empty map). - // Note that this is not simply a lazy-initialized variable: it starts off as null, and may transition between - // null and nonnull multiple times as stop() and start() are called. - @Nullable - private volatile ConcurrentHashMap dataSources = null; + /** + * This field is made volatile to avoid "ghost secondary reads" that may result in NPE, see + * https://github.com/code-review-checklists/java-concurrency#safe-local-dcl (note that dataSources resembles a lazily + * initialized field). Alternative is to always read the field in a snapshot local variable, but it's too easy to + * forget to do. + * + * This field may be updated from {@link #exec}, or from whatever thread calling {@link #doOnDemandPoll} via {@link + * #awaitOrPerformDatabasePoll()} via one of the public methods of
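The "ghost secondary reads" hazard named in the quoted javadoc, and the snapshot-local-variable alternative it mentions, can be shown in a few lines (hypothetical class, assuming the field flips between null and non-null as `dataSources` does in the diff):

```java
import java.util.concurrent.ConcurrentHashMap;

public class SnapshotRead {
    // May transition between null and non-null as polling stops and starts.
    private volatile ConcurrentHashMap<String, String> dataSources = null;

    // BROKEN: two separate reads of the volatile field. Another thread can
    // null it between the check and the use — a "ghost secondary read" NPE.
    String unsafeGet(String key) {
        return dataSources != null ? dataSources.get(key) : null;
    }

    // SAFE: read the volatile field once into a local snapshot, then only
    // use the snapshot; the snapshot cannot change under us.
    String safeGet(String key) {
        ConcurrentHashMap<String, String> snapshot = dataSources;
        return snapshot != null ? snapshot.get(key) : null;
    }

    void publishForDemo(ConcurrentHashMap<String, String> map) {
        dataSources = map;
    }
}
```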
[GitHub] [incubator-druid] gianm commented on a change in pull request #7653: Refactor SQLMetadataSegmentManager; Change contract of REST methods in DataSourcesResource
gianm commented on a change in pull request #7653: Refactor SQLMetadataSegmentManager; Change contract of REST methods in DataSourcesResource URL: https://github.com/apache/incubator-druid/pull/7653#discussion_r286368775 ## File path: server/src/main/java/org/apache/druid/metadata/SQLMetadataSegmentManager.java ## @@ -197,366 +320,489 @@ private Runnable createPollTaskForStartOrder(long startOrder) } @Override - @LifecycleStop - public void stop() + public void stopPollingDatabasePeriodically() { -ReentrantReadWriteLock.WriteLock lock = startStopLock.writeLock(); +ReentrantReadWriteLock.WriteLock lock = startStopPollLock.writeLock(); lock.lock(); try { - if (!isStarted()) { + if (!isPollingDatabasePeriodically()) { return; } - dataSources = null; - currentStartOrder = -1; - exec.shutdownNow(); - exec = null; + periodicPollTaskFuture.cancel(false); + latestDatabasePoll = null; + + // NOT nulling dataSources, allowing to query the latest polled data even when this SegmentsMetadata object is + // stopped. + + currentStartPollingOrder = -1; } finally { lock.unlock(); } } - private Pair usedPayloadMapper( - final int index, - final ResultSet resultSet, - final StatementContext context - ) throws SQLException + private void awaitOrPerformDatabasePoll() { +// Double-checked locking with awaitPeriodicOrFreshOnDemandDatabasePoll() call playing the role of the "check". 
+if (awaitPeriodicOrFreshOnDemandDatabasePoll()) { + return; +} +ReentrantReadWriteLock.WriteLock lock = startStopPollLock.writeLock(); +lock.lock(); try { - return new Pair<>( - jsonMapper.readValue(resultSet.getBytes("payload"), DataSegment.class), - resultSet.getBoolean("used") - ); + if (awaitPeriodicOrFreshOnDemandDatabasePoll()) { +return; + } + OnDemandDatabasePoll newOnDemandUpdate = new OnDemandDatabasePoll(); + this.latestDatabasePoll = newOnDemandUpdate; + doOnDemandPoll(newOnDemandUpdate); } -catch (IOException e) { - throw new RuntimeException(e); +finally { + lock.unlock(); } } - /** - * Gets a list of all datasegments that overlap the provided interval along with thier used status. - */ - private List> getDataSegmentsOverlappingInterval( - final String dataSource, - final Interval interval - ) - { -return connector.inReadOnlyTransaction( -(handle, status) -> handle.createQuery( -StringUtils.format( -"SELECT used, payload FROM %1$s WHERE dataSource = :dataSource AND start < :end AND %2$send%2$s > :start", -getSegmentsTable(), -connector.getQuoteString() -) -) -.setFetchSize(connector.getStreamingFetchSize()) -.bind("dataSource", dataSource) -.bind("start", interval.getStart().toString()) -.bind("end", interval.getEnd().toString()) -.map(this::usedPayloadMapper) -.list() -); - } - - private List> getDataSegments( - final String dataSource, - final Collection segmentIds, - final Handle handle - ) + private boolean awaitPeriodicOrFreshOnDemandDatabasePoll() Review comment: How about calling this `awaitLatestDatabasePoll()`? It seems more intuitive. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] gianm commented on a change in pull request #7653: Refactor SQLMetadataSegmentManager; Change contract of REST methods in DataSourcesResource
gianm commented on a change in pull request #7653: Refactor SQLMetadataSegmentManager; Change contract of REST methods in DataSourcesResource URL: https://github.com/apache/incubator-druid/pull/7653#discussion_r286365821 ## File path: server/src/main/java/org/apache/druid/metadata/SQLMetadataSegmentManager.java ## @@ -80,51 +87,121 @@ private static final EmittingLogger log = new EmittingLogger(SQLMetadataSegmentManager.class); /** - * Use to synchronize {@link #start()}, {@link #stop()}, {@link #poll()}, and {@link #isStarted()}. These methods - * should be synchronized to prevent from being called at the same time if two different threads are calling them. - * This might be possible if a druid coordinator gets and drops leadership repeatedly in quick succession. + * Marker interface for objects stored in {@link #latestDatabasePoll}. See the comment for that field for details. */ - private final ReentrantReadWriteLock startStopLock = new ReentrantReadWriteLock(); + private interface DatabasePoll + {} + + /** Represents periodic {@link #poll}s happening from {@link #exec}. */ + private static class PeriodicDatabasePoll implements DatabasePoll + { +/** + * This future allows to wait until {@link #dataSources} is initialized in the first {@link #poll()} happening since + * {@link #startPollingDatabasePeriodically()} is called for the first time, or since the last visible (in + * happens-before terms) call to {@link #startPollingDatabasePeriodically()} in case of Coordinator's leadership + * changes. + */ +final CompletableFuture firstPollCompletionFuture = new CompletableFuture<>(); + } + + /** + * Represents on-demand {@link #poll} initiated at periods of time when SqlSegmentsMetadata doesn't poll the database + * periodically. 
+ */ + private static class OnDemandDatabasePoll implements DatabasePoll + { +final long initiationTimeNanos = System.nanoTime(); +final CompletableFuture pollCompletionFuture = new CompletableFuture<>(); + +long nanosElapsedFromInitiation() +{ + return System.nanoTime() - initiationTimeNanos; +} + } + + /** + * Use to synchronize {@link #startPollingDatabasePeriodically}, {@link #stopPollingDatabasePeriodically}, {@link + * #poll}, and {@link #isPollingDatabasePeriodically}. These methods should be synchronized to prevent from being + * called at the same time if two different threads are calling them. This might be possible if Coordinator gets and + * drops leadership repeatedly in quick succession. + * + * This lock is also used to synchronize {@link #awaitOrPerformDatabasePoll} for times when SqlSegmentsMetadata + * is not polling the database periodically (in other words, when the Coordinator is not the leader). + */ + private final ReentrantReadWriteLock startStopPollLock = new ReentrantReadWriteLock(); /** * Used to ensure that {@link #poll()} is never run concurrently. It should already be so (at least in production * code), where {@link #poll()} is called only from the task created in {@link #createPollTaskForStartOrder} and is * scheduled in a single-threaded {@link #exec}, so this lock is an additional safety net in case there are bugs in * the code, and for tests, where {@link #poll()} is called from the outside code. * - * Not using {@link #startStopLock}.writeLock() in order to still be able to run {@link #poll()} concurrently with - * {@link #isStarted()}. + * Not using {@link #startStopPollLock}.writeLock() in order to still be able to run {@link #poll()} concurrently + * with {@link #isPollingDatabasePeriodically()}. 
*/ private final Object pollLock = new Object(); private final ObjectMapper jsonMapper; - private final Supplier config; + private final Duration periodicPollDelay; private final Supplier dbTables; private final SQLMetadataConnector connector; - // Volatile since this reference is reassigned in "poll" and then read from in other threads. - // Starts null so we can differentiate "never polled" (null) from "polled, but empty" (empty map). - // Note that this is not simply a lazy-initialized variable: it starts off as null, and may transition between - // null and nonnull multiple times as stop() and start() are called. - @Nullable - private volatile ConcurrentHashMap dataSources = null; + /** + * This field is made volatile to avoid "ghost secondary reads" that may result in NPE, see + * https://github.com/code-review-checklists/java-concurrency#safe-local-dcl (note that dataSources resembles a lazily + * initialized field). Alternative is to always read the field in a snapshot local variable, but it's too easy to + * forget to do. + * + * This field may be updated from {@link #exec}, or from whatever thread calling {@link #doOnDemandPoll} via {@link + * #awaitOrPerformDatabasePoll()} via one of the public methods of
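The Javadoc above warns about "ghost secondary reads" on the volatile `dataSources` field. A minimal, hypothetical sketch of the hazard and the snapshot-local fix (class and method names are illustrative, not Druid's actual code):

```java
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the "ghost secondary read" hazard the comment above refers to:
// a volatile field that other threads may reset to null must be read into a
// local variable once, then used only via that snapshot.
class SnapshotReadSketch {
    // May transition between null and non-null as polling stops and starts.
    private volatile ConcurrentHashMap<String, String> dataSources = null;

    void startPolling() { dataSources = new ConcurrentHashMap<>(); }
    void stopPolling()  { dataSources = null; }

    // Broken pattern: reads the volatile field twice; another thread may
    // null it between the check and the call, producing an NPE:
    //   return dataSources == null ? 0 : dataSources.size();

    // Safe pattern: read the field exactly once into a local snapshot.
    int numDataSources() {
        ConcurrentHashMap<String, String> snapshot = dataSources;
        return snapshot == null ? 0 : snapshot.size();
    }

    void addDataSource(String name) {
        ConcurrentHashMap<String, String> snapshot = dataSources;
        if (snapshot != null) {
            snapshot.put(name, name);
        }
    }
}
```

The alternative mentioned in the comment, always copying the field into a local variable before use, is exactly what `numDataSources()` does; making the field volatile guarantees the snapshot itself is never a stale half-published reference.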
[GitHub] [incubator-druid] gianm commented on a change in pull request #7653: Refactor SQLMetadataSegmentManager; Change contract of REST methods in DataSourcesResource
gianm commented on a change in pull request #7653: Refactor SQLMetadataSegmentManager; Change contract of REST methods in DataSourcesResource URL: https://github.com/apache/incubator-druid/pull/7653#discussion_r286321710 ## File path: server/src/main/java/org/apache/druid/metadata/SQLMetadataSegmentManager.java
(diff hunk @@ -80,51 +87,121 @@ identical to the one quoted above)
[GitHub] [incubator-druid] gianm commented on a change in pull request #7653: Refactor SQLMetadataSegmentManager; Change contract of REST methods in DataSourcesResource
gianm commented on a change in pull request #7653: Refactor SQLMetadataSegmentManager; Change contract of REST methods in DataSourcesResource URL: https://github.com/apache/incubator-druid/pull/7653#discussion_r286416399 ## File path: server/src/main/java/org/apache/druid/metadata/SQLMetadataSegmentManager.java ## @@ -197,366 +320,489 @@ private Runnable createPollTaskForStartOrder(long startOrder) } @Override - @LifecycleStop - public void stop() + public void stopPollingDatabasePeriodically() { -ReentrantReadWriteLock.WriteLock lock = startStopLock.writeLock(); +ReentrantReadWriteLock.WriteLock lock = startStopPollLock.writeLock(); lock.lock(); try { - if (!isStarted()) { + if (!isPollingDatabasePeriodically()) { return; } - dataSources = null; - currentStartOrder = -1; - exec.shutdownNow(); - exec = null; + periodicPollTaskFuture.cancel(false); + latestDatabasePoll = null; + + // NOT nulling dataSources, allowing to query the latest polled data even when this SegmentsMetadata object is + // stopped. + + currentStartPollingOrder = -1; } finally { lock.unlock(); } } - private Pair usedPayloadMapper( - final int index, - final ResultSet resultSet, - final StatementContext context - ) throws SQLException + private void awaitOrPerformDatabasePoll() { +// Double-checked locking with awaitPeriodicOrFreshOnDemandDatabasePoll() call playing the role of the "check". 
+if (awaitPeriodicOrFreshOnDemandDatabasePoll()) { + return; +} +ReentrantReadWriteLock.WriteLock lock = startStopPollLock.writeLock(); +lock.lock(); try { - return new Pair<>( - jsonMapper.readValue(resultSet.getBytes("payload"), DataSegment.class), - resultSet.getBoolean("used") - ); + if (awaitPeriodicOrFreshOnDemandDatabasePoll()) { +return; + } + OnDemandDatabasePoll newOnDemandUpdate = new OnDemandDatabasePoll(); + this.latestDatabasePoll = newOnDemandUpdate; + doOnDemandPoll(newOnDemandUpdate); } -catch (IOException e) { - throw new RuntimeException(e); +finally { + lock.unlock(); } } - /** - * Gets a list of all datasegments that overlap the provided interval along with thier used status. - */ - private List> getDataSegmentsOverlappingInterval( - final String dataSource, - final Interval interval - ) - { -return connector.inReadOnlyTransaction( -(handle, status) -> handle.createQuery( -StringUtils.format( -"SELECT used, payload FROM %1$s WHERE dataSource = :dataSource AND start < :end AND %2$send%2$s > :start", -getSegmentsTable(), -connector.getQuoteString() -) -) -.setFetchSize(connector.getStreamingFetchSize()) -.bind("dataSource", dataSource) -.bind("start", interval.getStart().toString()) -.bind("end", interval.getEnd().toString()) -.map(this::usedPayloadMapper) -.list() -); - } - - private List> getDataSegments( - final String dataSource, - final Collection segmentIds, - final Handle handle - ) + private boolean awaitPeriodicOrFreshOnDemandDatabasePoll() { -return segmentIds.stream().map( -segmentId -> Optional.ofNullable( -handle.createQuery( -StringUtils.format( -"SELECT used, payload FROM %1$s WHERE dataSource = :dataSource AND id = :id", -getSegmentsTable() -) -) -.bind("dataSource", dataSource) -.bind("id", segmentId) -.map(this::usedPayloadMapper) -.first() -) -.orElseThrow(() -> new UnknownSegmentIdException(StringUtils.format("Cannot find segment id [%s]", segmentId))) -) -.collect(Collectors.toList()); +DatabasePoll latestDatabasePoll = 
this.latestDatabasePoll; +if (latestDatabasePoll instanceof PeriodicDatabasePoll) { + Futures.getUnchecked(((PeriodicDatabasePoll) latestDatabasePoll).firstPollCompletionFuture); + return true; +} +if (latestDatabasePoll instanceof OnDemandDatabasePoll) { + long periodicPollDelayNanos = TimeUnit.MILLISECONDS.toNanos(periodicPollDelay.getMillis()); + OnDemandDatabasePoll latestOnDemandUpdate = (OnDemandDatabasePoll) latestDatabasePoll; + boolean latestUpdateIsFresh = latestOnDemandUpdate.nanosElapsedFromInitiation() < periodicPollDelayNanos; + if (latestUpdateIsFresh) { +Futures.getUnchecked(latestOnDemandUpdate.pollCompletionFuture); +return true; + } + // Latest on-demand update is not fresh. Fall through to return false from this method. +} else { + assert latestDatabasePoll == null; +} +return false; } - /** - * Builds a Version
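The `awaitOrPerformDatabasePoll()` hunk above is double-checked locking where the "check" both tests for and awaits a sufficiently fresh result. A hedged, reduced sketch of the pattern (names and the `doPoll()` stand-in are hypothetical):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Reduction of the awaitOrPerformDatabasePoll() pattern: the first check
// runs without the lock; only a caller that still sees no usable poll after
// acquiring the write lock performs the expensive work, and it publishes
// the future BEFORE doing the work so concurrent callers can await it.
class OnDemandPollSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private volatile CompletableFuture<String> latestPoll = null;

    String awaitOrPerformPoll() {
        CompletableFuture<String> poll = latestPoll;   // first check, no lock
        if (poll != null) {
            return poll.join();
        }
        lock.writeLock().lock();
        try {
            poll = latestPoll;                         // second check, under lock
            if (poll != null) {
                return poll.join();
            }
            CompletableFuture<String> newPoll = new CompletableFuture<>();
            latestPoll = newPoll;                      // publish before the slow work
            newPoll.complete(doPoll());                // stand-in for the database read
            return newPoll.join();
        } finally {
            lock.writeLock().unlock();
        }
    }

    private String doPoll() {
        return "polled";
    }
}
```

The freshness test in the real code (`nanosElapsedFromInitiation() < periodicPollDelayNanos`) is omitted here; it simply turns "is there a poll" into "is there a recent-enough poll" in both checks.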
[GitHub] [incubator-druid] gianm commented on a change in pull request #7653: Refactor SQLMetadataSegmentManager; Change contract of REST methods in DataSourcesResource
gianm commented on a change in pull request #7653: Refactor SQLMetadataSegmentManager; Change contract of REST methods in DataSourcesResource URL: https://github.com/apache/incubator-druid/pull/7653#discussion_r286322201 ## File path: server/src/main/java/org/apache/druid/metadata/SQLMetadataSegmentManager.java ## @@ -136,33 +213,70 @@ public SQLMetadataSegmentManager( ) { this.jsonMapper = jsonMapper; -this.config = config; +this.periodicPollDelay = config.get().getPollDuration().toStandardDuration(); this.dbTables = dbTables; this.connector = connector; } - @Override + /** + * Don't confuse this method with {@link #startPollingDatabasePeriodically}. This is a lifecycle starting method to Review comment: Do you think it's worth including a sanity check that this method is only called one time? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
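One way to implement the sanity check suggested in the review comment above (that a lifecycle start method is called at most once) is an `AtomicBoolean` flipped with `compareAndSet`, so a second call fails loudly. This is a hypothetical sketch, not the patch's actual code:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Start-once guard: compareAndSet succeeds only for the first caller; every
// later call (from any thread) throws instead of silently re-initializing.
class LifecycleSketch {
    private final AtomicBoolean started = new AtomicBoolean(false);

    void start() {
        if (!started.compareAndSet(false, true)) {
            throw new IllegalStateException("start() called more than once");
        }
        // ... allocate the executor, register metrics, etc.
    }
}
```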
[GitHub] [incubator-druid] gianm commented on a change in pull request #7653: Refactor SQLMetadataSegmentManager; Change contract of REST methods in DataSourcesResource
gianm commented on a change in pull request #7653: Refactor SQLMetadataSegmentManager; Change contract of REST methods in DataSourcesResource URL: https://github.com/apache/incubator-druid/pull/7653#discussion_r286335156 ## File path: server/src/main/java/org/apache/druid/metadata/SQLMetadataSegmentManager.java ## @@ -171,24 +285,33 @@ public void start() } } - private Runnable createPollTaskForStartOrder(long startOrder) + private Runnable createPollTaskForStartOrder(long startOrder, PeriodicDatabasePoll periodicPollUpdate) { return () -> { - // poll() is synchronized together with start(), stop() and isStarted() to ensure that when stop() exits, poll() - // won't actually run anymore after that (it could only enter the syncrhonized section and exit immediately - // because the localStartedOrder doesn't match the new currentStartOrder). It's needed to avoid flakiness in - // SQLMetadataSegmentManagerTest. See https://github.com/apache/incubator-druid/issues/6028 - ReentrantReadWriteLock.ReadLock lock = startStopLock.readLock(); + // poll() is synchronized together with startPollingDatabasePeriodically(), stopPollingDatabasePeriodically() and + // isPollingDatabasePeriodically() to ensure that when stopPollingDatabasePeriodically() exits, poll() won't + // actually run anymore after that (it could only enter the synchronized section and exit immediately because the + // localStartedOrder doesn't match the new currentStartPollingOrder). It's needed to avoid flakiness in + // SqlSegmentsMetadataTest. 
See https://github.com/apache/incubator-druid/issues/6028 + ReentrantReadWriteLock.ReadLock lock = startStopPollLock.readLock(); lock.lock(); try { -if (startOrder == currentStartOrder) { +if (startOrder == currentStartPollingOrder) { poll(); + periodicPollUpdate.firstPollCompletionFuture.complete(null); } else { - log.debug("startOrder = currentStartOrder = %d, skipping poll()", startOrder); + log.debug("startOrder = currentStartPollingOrder = %d, skipping poll()", startOrder); } } - catch (Exception e) { -log.makeAlert(e, "uncaught exception in segment manager polling thread").emit(); + catch (Throwable t) { +log.makeAlert(t, "Uncaught exception in " + getClass().getName() + "'s polling thread").emit(); Review comment: `makeAlert` supports embedded format strings, if you like to use that.
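The `startOrder` check in the hunk above is a generation-counter guard: each (re)start bumps a counter, and a scheduled task only runs if the generation it captured still matches, so tasks queued before a `stop()` become no-ops instead of racing with it. A hedged, self-contained sketch (names are illustrative):

```java
// Generation guard for scheduled tasks: stop() invalidates any task created
// before it by advancing the generation, mirroring how stale poll tasks from
// an earlier Coordinator leadership term skip themselves.
class GenerationGuardSketch {
    private long currentGeneration = -1;
    private int polls = 0;

    synchronized Runnable createPollTask() {
        final long myGeneration = ++currentGeneration;
        return () -> {
            synchronized (this) {
                if (myGeneration == currentGeneration) {
                    polls++; // stand-in for poll()
                }
                // else: stale task from an earlier start; skip silently
            }
        };
    }

    synchronized void stop() {
        currentGeneration++; // invalidate any tasks still queued
    }

    synchronized int pollCount() {
        return polls;
    }
}
```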
[GitHub] [incubator-druid] clintropolis opened a new pull request #7730: add support for multi-value string dimensions for 'HllSketch' build aggregator
clintropolis opened a new pull request #7730: add support for multi-value string dimensions for 'HllSketch' build aggregator URL: https://github.com/apache/incubator-druid/pull/7730 Without this patch, attempting to use `HLLSketchBuild` aggregator on a multi-value string dimension results in a query failure caused by: ``` java.lang.RuntimeException: org.apache.druid.java.util.common.IAE: Unsupported type class java.util.Arrays$ArrayList ```
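The failure above happens because a multi-value string dimension hands the aggregator a `List` rather than a single `String`. A hedged sketch of the shape of the fix — branch on the runtime type and update once per value — where `updateSketch()` is a stand-in for the real HllSketch update call, not the actual Druid API:

```java
import java.util.List;

// Aggregating a column value that may be a String OR a List of Strings:
// iterate lists element-by-element instead of failing on the List type.
class MultiValueUpdateSketch {
    private int updates = 0;

    void aggregate(Object value) {
        if (value == null) {
            return;
        }
        if (value instanceof List) {
            for (Object v : (List<?>) value) {
                if (v != null) {
                    updateSketch(v.toString());
                }
            }
        } else {
            updateSketch(value.toString());
        }
    }

    private void updateSketch(String s) {
        updates++; // stand-in for sketch.update(s)
    }

    int updateCount() {
        return updates;
    }
}
```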
[GitHub] [incubator-druid] gianm commented on issue #7674: add s3 authentication method informations
gianm commented on issue #7674: add s3 authentication method informations URL: https://github.com/apache/incubator-druid/pull/7674#issuecomment-494742673 I restarted that one, it might have flaked out.
[GitHub] [incubator-druid] gocho1 commented on issue #7674: add s3 authentication method informations
gocho1 commented on issue #7674: add s3 authentication method informations URL: https://github.com/apache/incubator-druid/pull/7674#issuecomment-494742065 Should I do something about the build not passing? It seems like a timeout during tests.
[GitHub] [incubator-druid] gianm commented on issue #7724: Lookup transform expression not defined during ingest from kafka
gianm commented on issue #7724: Lookup transform expression not defined during ingest from kafka URL: https://github.com/apache/incubator-druid/issues/7724#issuecomment-494723741 > Maybe a nice way to address that would be to extend #7222 to put the LookupSerdeModule on the overlord and MM too. Ah, it seems like you already had the same idea @clintropolis: https://github.com/apache/incubator-druid/pull/7082#discussion_r264888771
[GitHub] [incubator-druid] gianm commented on issue #7724: Lookup transform expression not defined during ingest from kafka
gianm commented on issue #7724: Lookup transform expression not defined during ingest from kafka URL: https://github.com/apache/incubator-druid/issues/7724#issuecomment-49472 I raised https://github.com/apache/incubator-druid/pull/7729 which makes the runner and parser initialization lazy, so it won't happen on the overlord. I think this is good even if we end up fixing #5727 some other way, since it's senseless for the overlord to be making these objects. By the way, I don't think that #7729 would fix #5727 too. That's happening because the overlord can't figure out how to make a LookupReferencesManager. Maybe a nice way to address that would be to extend #7222 to put the LookupSerdeModule on the overlord and MM too.
[GitHub] [incubator-druid] gianm opened a new pull request #7729: SeekableStreamIndexTaskRunner: Lazy init of runner.
gianm opened a new pull request #7729: SeekableStreamIndexTaskRunner: Lazy init of runner. URL: https://github.com/apache/incubator-druid/pull/7729 The main motivation is that this fixes #7724, by making it so the overlord doesn't try to create a task runner and parser when all it really wants to do is create a task object and serialize it.
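The lazy-init approach described above can be sketched generically: instead of building the expensive object (the runner, which needs the lookup module) in the constructor, defer construction until first use. Names here are illustrative, not Druid's actual API; the runner is modeled as a plain `String` for brevity:

```java
import java.util.function.Supplier;

// Lazy, build-at-most-once initialization: the constructor stays cheap (the
// overlord can create and serialize the task without a runner existing), and
// the factory only runs the first time getRunner() is actually needed.
class LazyRunnerSketch {
    private final Supplier<String> runnerFactory;
    private volatile String runner = null;

    LazyRunnerSketch(Supplier<String> runnerFactory) {
        this.runnerFactory = runnerFactory; // nothing constructed yet
    }

    synchronized String getRunner() {
        if (runner == null) {
            runner = runnerFactory.get(); // deferred until first use
        }
        return runner;
    }
}
```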
[GitHub] [incubator-druid] clintropolis commented on issue #7724: Lookup transform expression not defined during ingest from kafka
clintropolis commented on issue #7724: Lookup transform expression not defined during ingest from kafka URL: https://github.com/apache/incubator-druid/issues/7724#issuecomment-494718031 Potentially related #5727 and #7082
[GitHub] [incubator-druid] gianm commented on issue #7724: Lookup transform expression not defined during ingest from kafka
gianm commented on issue #7724: Lookup transform expression not defined during ingest from kafka URL: https://github.com/apache/incubator-druid/issues/7724#issuecomment-494711696 I think this is happening because the overlord doesn't load lookups and doesn't have a lookup module, and so it doesn't know about the `lookup` function. Something here should be lazier than it is… the overlord shouldn't need to be instantiating parsers.
[GitHub] [incubator-druid] sanastas commented on issue #7676: Add OakIncrementalIndex to Druid
sanastas commented on issue #7676: Add OakIncrementalIndex to Druid URL: https://github.com/apache/incubator-druid/pull/7676#issuecomment-494687467 Thank you for your input!