[GitHub] [druid] kaijianding commented on a change in pull request #10688: Historical lazy on start enhancement with segment files check

2021-01-06 Thread GitBox


kaijianding commented on a change in pull request #10688:
URL: https://github.com/apache/druid/pull/10688#discussion_r552404073



##
File path: core/src/main/java/org/apache/druid/coordination/CommonCallback.java
##
@@ -0,0 +1,26 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.coordination;

Review comment:
   I guess this interface should be in `org.apache.druid.segment`, alongside `IndexIO`.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] aaasif04 opened a new issue #10731: service is not loading

2021-01-06 Thread GitBox


aaasif04 opened a new issue #10731:
URL: https://github.com/apache/druid/issues/10731


   I have installed Druid 0.20.0 as a cluster and started all the services, but only the data services are up and shown in the Druid console. The router log shows errors like:
   ERROR [CoordinatorRuleManager-Exec--0] org.apache.druid.curator.discovery.ServerDiscoverySelector - No server instance found for [druid/coordinator]
   ERROR [CoordinatorRuleManager-Exec--0] org.apache.druid.server.router.CoordinatorRuleManager - Exception while polling for rules
   org.apache.druid.java.util.common.IOE: No known server
   
   All the expected files are created in ZooKeeper, so the connection to ZooKeeper is OK. What might be the problem?
   
   






[GitHub] [druid] liran-funaro commented on pull request #10593: IncrementalIndex Tests and Benchmarks Parametrization

2021-01-06 Thread GitBox


liran-funaro commented on pull request #10593:
URL: https://github.com/apache/druid/pull/10593#issuecomment-755211304


   Can we proceed to merge this PR?






[GitHub] [druid] loquisgon commented on a change in pull request #10685: Subtotal queries give incorrect results if query has a limit spec

2021-01-06 Thread GitBox


loquisgon commented on a change in pull request #10685:
URL: https://github.com/apache/druid/pull/10685#discussion_r552810775



##
File path: 
processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/ByteBufferMinMaxOffsetHeap.java
##
@@ -59,6 +60,35 @@ public ByteBufferMinMaxOffsetHeap(
     this.heapIndexUpdater = heapIndexUpdater;
   }
 
+  public ByteBufferMinMaxOffsetHeap copy()
+  {
+    LimitedBufferHashGrouper.BufferGrouperOffsetHeapIndexUpdater updater =
+        Optional
+            .ofNullable(heapIndexUpdater)
+            .map(LimitedBufferHashGrouper.BufferGrouperOffsetHeapIndexUpdater::copy)
+            .orElse(null);
+
+    // deep copy buf
+    ByteBuffer buffer = ByteBuffer.allocateDirect(buf.capacity());

Review comment:
   Yeah, I agree. My biggest concern with this code is that it is not 
obvious to the caller that creating new iterators of the type being changed 
here will allocate new off-heap memory in an unbounded fashion. This is ok if 
we think that not "too many" copies will be done but I cannot affirm that.








[GitHub] [druid] loquisgon commented on a change in pull request #10685: Subtotal queries give incorrect results if query has a limit spec

2021-01-06 Thread GitBox


loquisgon commented on a change in pull request #10685:
URL: https://github.com/apache/druid/pull/10685#discussion_r552811029



##
File path: 
processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/ByteBufferMinMaxOffsetHeap.java
##
@@ -59,6 +60,35 @@ public ByteBufferMinMaxOffsetHeap(
     this.heapIndexUpdater = heapIndexUpdater;
   }
 
+  public ByteBufferMinMaxOffsetHeap copy()
+  {
+    LimitedBufferHashGrouper.BufferGrouperOffsetHeapIndexUpdater updater =
+        Optional
+            .ofNullable(heapIndexUpdater)
+            .map(LimitedBufferHashGrouper.BufferGrouperOffsetHeapIndexUpdater::copy)
+            .orElse(null);
+
+    // deep copy buf
+    ByteBuffer buffer = ByteBuffer.allocateDirect(buf.capacity());
+
+    int bufPosition = buf.position();
+    int bufLimit = buf.limit();
+    buf.rewind();
+
+    buffer.put(buf);
+    buffer.position(bufPosition);
+    buffer.limit(bufLimit);
+
+    buf.position(bufPosition);
+    buf.limit(bufLimit);

Review comment:
   Yeah, agreed. The code should be refactored to abstract out the duplicated block.
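   For illustration, the duplicated position/limit bookkeeping could be pulled into a small static helper along these lines. This is only a sketch: `BufferCopyUtil` and `deepCopy` are hypothetical names, and it uses `ByteBuffer.allocate` for testability where the original off-heap code would need `allocateDirect`.

```java
import java.nio.ByteBuffer;

public class BufferCopyUtil
{
  // Returns a deep copy of src with the same contents (up to the current
  // limit, mirroring the original rewind-then-put behavior), position, and
  // limit. The source buffer's position and limit are restored before return.
  public static ByteBuffer deepCopy(ByteBuffer src)
  {
    int position = src.position();
    int limit = src.limit();

    // Heap allocation here for testability; off-heap code would use allocateDirect.
    ByteBuffer copy = ByteBuffer.allocate(src.capacity());

    src.rewind();      // copy contents from index 0 up to the current limit
    copy.put(src);

    copy.position(position);
    copy.limit(limit);
    src.position(position);   // restore the source's cursor state
    src.limit(limit);
    return copy;
  }

  public static void main(String[] args)
  {
    ByteBuffer src = ByteBuffer.allocate(8);
    src.put((byte) 1).put((byte) 2).put((byte) 3);
    src.limit(5);
    ByteBuffer copy = deepCopy(src);
    if (copy.position() != 3 || copy.limit() != 5) {
      throw new AssertionError("copy cursor state not preserved");
    }
    if (src.position() != 3 || src.limit() != 5) {
      throw new AssertionError("source cursor state not restored");
    }
    copy.put(0, (byte) 9);          // mutating the copy...
    if (src.get(0) != 1) {
      throw new AssertionError();   // ...must leave the source untouched
    }
    System.out.println("ok");
  }
}
```

   The point of the helper is that both the source's and the copy's position and limit end up identical, so iteration state survives the copy.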








[GitHub] [druid] loquisgon commented on a change in pull request #10685: Subtotal queries give incorrect results if query has a limit spec

2021-01-06 Thread GitBox


loquisgon commented on a change in pull request #10685:
URL: https://github.com/apache/druid/pull/10685#discussion_r552813435



##
File path: 
processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/ByteBufferMinMaxOffsetHeap.java
##
@@ -59,6 +60,35 @@ public ByteBufferMinMaxOffsetHeap(
     this.heapIndexUpdater = heapIndexUpdater;
   }
 
+  public ByteBufferMinMaxOffsetHeap copy()
+  {
+    LimitedBufferHashGrouper.BufferGrouperOffsetHeapIndexUpdater updater =
+        Optional
+            .ofNullable(heapIndexUpdater)
+            .map(LimitedBufferHashGrouper.BufferGrouperOffsetHeapIndexUpdater::copy)
+            .orElse(null);
+
+    // deep copy buf
+    ByteBuffer buffer = ByteBuffer.allocateDirect(buf.capacity());
+
+    int bufPosition = buf.position();
+    int bufLimit = buf.limit();
+    buf.rewind();
+
+    buffer.put(buf);
+    buffer.position(bufPosition);
+    buffer.limit(bufLimit);
+
+    buf.position(bufPosition);
+    buf.limit(bufLimit);

Review comment:
   It won't hurt to copy the order, I guess. I will look into it more.








[GitHub] [druid] suneet-s commented on pull request #10724: Coordinator Dynamic Config changes to ease upgrading with new config value

2021-01-06 Thread GitBox


suneet-s commented on pull request #10724:
URL: https://github.com/apache/druid/pull/10724#issuecomment-755627357


   @capistrant It looks like CI is complaining about a missing file for a newly added (I think) integration test 🤔 I'm not sure how this PR could have caused that failure - https://travis-ci.com/github/apache/druid/jobs/468930258 (I tried re-triggering this job a couple of times just in case)
   
   Could you maybe try merging master back into this PR to see if that will make Travis happy?






[GitHub] [druid] jihoonson commented on a change in pull request #10685: Subtotal queries give incorrect results if query has a limit spec

2021-01-06 Thread GitBox


jihoonson commented on a change in pull request #10685:
URL: https://github.com/apache/druid/pull/10685#discussion_r552969515



##
File path: 
processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/ByteBufferMinMaxOffsetHeap.java
##
@@ -59,6 +60,35 @@ public ByteBufferMinMaxOffsetHeap(
     this.heapIndexUpdater = heapIndexUpdater;
   }
 
+  public ByteBufferMinMaxOffsetHeap copy()
+  {
+    LimitedBufferHashGrouper.BufferGrouperOffsetHeapIndexUpdater updater =
+        Optional
+            .ofNullable(heapIndexUpdater)
+            .map(LimitedBufferHashGrouper.BufferGrouperOffsetHeapIndexUpdater::copy)
+            .orElse(null);
+
+    // deep copy buf
+    ByteBuffer buffer = ByteBuffer.allocateDirect(buf.capacity());

Review comment:
   > Yeah, I agree. My biggest concern with this code is that it is not 
obvious to the caller that creating new iterators of the type being changed 
here will allocate new off-heap memory in an unbounded fashion. This is ok if 
we think that not "too many" copies will be done but I cannot affirm that.
   
   It seems pretty dangerous to me as users are not expected to be aware of 
this behavior. The blocking merge buffer pool is used to avoid unexpectedly 
using "too many" copies at the same time like in this problematic scenario. 
   
   By the way, on second look, I'm wondering why we copy the buffer instead of fixing the iterator of `LimitedBufferHashGrouper` to be re-iterable. That seems like a better fix to me.
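   To illustrate the "re-iterable" idea in general terms (a hypothetical sketch, not Druid code): a one-shot iterator holds mutable cursor state, so a second pass needs either a buffer copy or an `Iterable` that hands out a fresh cursor on every `iterator()` call while sharing the read-only backing data.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// A re-iterable view over an int array: each iterator() call starts a new
// pass over the same backing data, so no copy of the data is needed.
public class ReiterableInts implements Iterable<Integer>
{
  private final int[] data;

  public ReiterableInts(int[] data)
  {
    this.data = data;
  }

  @Override
  public Iterator<Integer> iterator()
  {
    return new Iterator<Integer>()
    {
      private int cursor = 0;   // per-iterator cursor; the shared data is read-only

      @Override
      public boolean hasNext()
      {
        return cursor < data.length;
      }

      @Override
      public Integer next()
      {
        if (!hasNext()) {
          throw new NoSuchElementException();
        }
        return data[cursor++];
      }
    };
  }

  public static void main(String[] args)
  {
    ReiterableInts ints = new ReiterableInts(new int[]{1, 2, 3});
    int first = 0;
    int second = 0;
    for (int i : ints) {
      first += i;
    }
    for (int i : ints) {   // second pass works: iterator() returns a fresh cursor
      second += i;
    }
    if (first != 6 || second != 6) {
      throw new AssertionError();
    }
    System.out.println("ok");
  }
}
```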








[GitHub] [druid] jihoonson commented on a change in pull request #10685: Subtotal queries give incorrect results if query has a limit spec

2021-01-06 Thread GitBox


jihoonson commented on a change in pull request #10685:
URL: https://github.com/apache/druid/pull/10685#discussion_r552972254



##
File path: 
processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/ByteBufferMinMaxOffsetHeap.java
##
@@ -59,6 +60,35 @@ public ByteBufferMinMaxOffsetHeap(
     this.heapIndexUpdater = heapIndexUpdater;
   }
 
+  public ByteBufferMinMaxOffsetHeap copy()
+  {
+    LimitedBufferHashGrouper.BufferGrouperOffsetHeapIndexUpdater updater =
+        Optional
+            .ofNullable(heapIndexUpdater)
+            .map(LimitedBufferHashGrouper.BufferGrouperOffsetHeapIndexUpdater::copy)
+            .orElse(null);
+
+    // deep copy buf
+    ByteBuffer buffer = ByteBuffer.allocateDirect(buf.capacity());

Review comment:
   Talked with @loganlinn offline; we agreed that fixing `LimitedBufferHashGrouper` would be a better fix.








[GitHub] [druid] suneet-s merged pull request #10711: Typo: missing comma in json

2021-01-06 Thread GitBox


suneet-s merged pull request #10711:
URL: https://github.com/apache/druid/pull/10711


   






[druid] branch master updated (68bb038 -> f9fc189)

2021-01-06 Thread suneet
This is an automated email from the ASF dual-hosted git repository.

suneet pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/druid.git.


from 68bb038  Multiphase segment merge for IndexMergerV9 (#10689)
 add f9fc189  Typo: missing comma in json (#10711)

No new revisions were added by this update.

Summary of changes:
 docs/querying/aggregations.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)





[GitHub] [druid] himanshug commented on a change in pull request #9518: AWS RDS token based password provider

2021-01-06 Thread GitBox


himanshug commented on a change in pull request #9518:
URL: https://github.com/apache/druid/pull/9518#discussion_r552981751



##
File path: docs/development/extensions-core/druid-aws-rds.md
##
@@ -0,0 +1,38 @@
+---
+id: druid-aws-rds
+title: "Druid AWS RDS Module"
+---
+
+
+
+This module provides AWS RDS token [password 
provider](../../operations/password-provider.md) provides temp token for 
accessing AWS RDS DB cluster.

Review comment:
   reworded to hopefully remove the confusion








[GitHub] [druid] himanshug commented on a change in pull request #9518: AWS RDS token based password provider

2021-01-06 Thread GitBox


himanshug commented on a change in pull request #9518:
URL: https://github.com/apache/druid/pull/9518#discussion_r552983056



##
File path: 
server/src/main/java/org/apache/druid/metadata/BasicDataSourceExt.java
##
@@ -0,0 +1,179 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.metadata;
+
+import com.google.common.annotations.VisibleForTesting;
+import org.apache.commons.dbcp2.BasicDataSource;
+import org.apache.commons.dbcp2.ConnectionFactory;
+import org.apache.druid.java.util.common.RE;
+import org.apache.druid.java.util.common.logger.Logger;
+
+import java.sql.Driver;
+import java.sql.DriverManager;
+import java.sql.SQLException;
+import java.util.Properties;
+
+/**
+ * This class exists so that {@link MetadataStorageConnectorConfig} is asked 
for password every time a brand new
+ * connection is established with DB. {@link PasswordProvider} impls such as 
those based on AWS tokens refresh the
+ * underlying token periodically since each token is valid for a certain 
period of time only.
+ * So, This class overrides,ummm copies due to lack of extensibility, the 
methods from base class in order to keep
+ * track of connection properties and call {@link 
MetadataStorageConnectorConfig#getPassword()} everytime a new
+ * connection is setup.
+ */
+public class BasicDataSourceExt extends BasicDataSource
+{
+  private static final Logger LOGGER = new Logger(BasicDataSourceExt.class);
+
+  private Properties connectionProperties;
+  private final MetadataStorageConnectorConfig connectorConfig;
+
+  public BasicDataSourceExt(MetadataStorageConnectorConfig connectorConfig)
+  {
+    this.connectorConfig = connectorConfig;
+    this.connectionProperties = new Properties();
+  }
+
+  @Override
+  public void addConnectionProperty(String name, String value)
+  {
+    connectionProperties.put(name, value);
+    super.addConnectionProperty(name, value);
+  }
+
+  @Override
+  public void removeConnectionProperty(String name)
+  {
+    connectionProperties.remove(name);
+    super.removeConnectionProperty(name);
+  }
+
+  @Override
+  public void setConnectionProperties(String connectionProperties)
+  {
+    if (connectionProperties == null) {
+      throw new NullPointerException("connectionProperties is null");
+    }
+
+    String[] entries = connectionProperties.split(";");
+    Properties properties = new Properties();
+    for (String entry : entries) {
+      if (entry.length() > 0) {
+        int index = entry.indexOf('=');
+        if (index > 0) {
+          String name = entry.substring(0, index);
+          String value = entry.substring(index + 1);
+          properties.setProperty(name, value);
+        } else {
+          // no value is empty string which is how java.util.Properties works
+          properties.setProperty(entry, "");
+        }
+      }
+    }
+    this.connectionProperties = properties;
+    super.setConnectionProperties(connectionProperties);
+  }
+
+  @VisibleForTesting
+  public Properties getConnectionProperties()
+  {
+    return connectionProperties;
+  }
+
+  @Override
+  protected ConnectionFactory createConnectionFactory() throws SQLException
+  {
+    Driver driverToUse = getDriver();
+
+    if (driverToUse == null) {
+      Class driverFromCCL = null;
+      if (getDriverClassName() != null) {
+        try {
+          try {
+            if (getDriverClassLoader() == null) {
+              driverFromCCL = Class.forName(getDriverClassName());
+            } else {
+              driverFromCCL = Class.forName(
+                  getDriverClassName(), true, getDriverClassLoader());
+            }
+          }
+          catch (ClassNotFoundException cnfe) {
+            driverFromCCL = Thread.currentThread(
+            ).getContextClassLoader().loadClass(
+                getDriverClassName());
+          }
+        }
+        catch (Exception t) {
+          String message = "Cannot load JDBC driver class '" +
+              getDriverClassName() + "'";
+          LOGGER.error(t, message);
+          throw new SQLException(message, t);
+        }
+      }
+
+      try {
+        if (driverFromCCL == null) {
+          driverToUse = DriverManager.getD




[GitHub] [druid] gianm merged pull request #10630: Scan query: More accurate error message when segment per time chunk limit is exceeded.

2021-01-06 Thread GitBox


gianm merged pull request #10630:
URL: https://github.com/apache/druid/pull/10630


   






[druid] branch master updated (f9fc189 -> 48e576a)

2021-01-06 Thread gian
This is an automated email from the ASF dual-hosted git repository.

gian pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/druid.git.


from f9fc189  Typo: missing comma in json (#10711)
 add 48e576a  Scan query: More accurate error message when segment per time 
chunk limit is exceeded. (#10630)

No new revisions were added by this update.

Summary of changes:
 .../druid/query/scan/ScanQueryRunnerFactory.java   | 18 +++---
 .../query/scan/ScanQueryRunnerFactoryTest.java | 66 +-
 2 files changed, 74 insertions(+), 10 deletions(-)





[GitHub] [druid] jihoonson edited a comment on pull request #10448: Added CronScheduler support as a proof to clock drift while emitting metrics

2021-01-06 Thread GitBox


jihoonson edited a comment on pull request #10448:
URL: https://github.com/apache/druid/pull/10448#issuecomment-755111625


   Hi @leventov, since test coverage is not very great for the Druid metrics system, I'm wondering how well-tested CronScheduler is in production environments, especially when it runs for a long period of time. Could you give me some idea?






[GitHub] [druid] vogievetsky commented on a change in pull request #10710: fix web-console show json bug

2021-01-06 Thread GitBox


vogievetsky commented on a change in pull request #10710:
URL: https://github.com/apache/druid/pull/10710#discussion_r552995684



##
File path: web-console/src/components/show-json/show-json.tsx
##
@@ -41,7 +41,13 @@ export const ShowJson = React.memo(function ShowJson(props: ShowJsonProps) {
       const resp = await Api.instance.get(endpoint);
       let data = resp.data;
       if (transform) data = transform(data);
-      return typeof data === 'string' ? data : JSON.stringify(data, undefined, 2);
+      return typeof data === 'string'
+        ? data
+        : JSON.stringify(
+            data,
+            (_, value) => (typeof value === 'bigint' ? value.toString() : value),

Review comment:
   This is cool, but it would be better to use the stringify function that comes with `json-bigint`.
   
   Just do `import * as JSONBig from 'json-bigint-native';`
   
   and then replace all `JSON.stringify` calls with `JSONBig.stringify`.








[GitHub] [druid] himanshug commented on a change in pull request #10730: Introduce KafkaRecordEntity to support Kafka headers in InputFormats

2021-01-06 Thread GitBox


himanshug commented on a change in pull request #10730:
URL: https://github.com/apache/druid/pull/10730#discussion_r553063732



##
File path: 
extensions-core/kafka-indexing-service/src/main/java/org/apache/druid/indexing/kafka/KafkaRecordSupplier.java
##
@@ -119,15 +120,16 @@ public void seekToLatest(Set<StreamPartition<Integer>> partitions)
 
   @Nonnull
   @Override
-  public List<OrderedPartitionableRecord<Integer, Long>> poll(long timeout)
+  public List<OrderedPartitionableRecord<Integer, Long, KafkaRecordEntity>> poll(long timeout)
   {
-    List<OrderedPartitionableRecord<Integer, Long>> polledRecords = new ArrayList<>();
+    List<OrderedPartitionableRecord<Integer, Long, KafkaRecordEntity>> polledRecords = new ArrayList<>();
     for (ConsumerRecord<byte[], byte[]> record : consumer.poll(Duration.ofMillis(timeout))) {
+
       polledRecords.add(new OrderedPartitionableRecord<>(
           record.topic(),
           record.partition(),
           record.offset(),
-          record.value() == null ? null : ImmutableList.of(record.value())
+          ImmutableList.of(new KafkaRecordEntity(record))

Review comment:
   I think it is fine as long as it doesn't cause an unhandled NPE somewhere downstream and just counts as an unparseable record for existing pipelines.

##
File path: 
extensions-core/kafka-indexing-service/src/main/java/org/apache/druid/data/input/kafka/KafkaRecordEntity.java
##
@@ -0,0 +1,39 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.data.input.kafka;
+
+import org.apache.druid.data.input.impl.ByteEntity;
+import org.apache.kafka.clients.consumer.ConsumerRecord;
+
+public class KafkaRecordEntity extends ByteEntity
+{
+  private final ConsumerRecord<byte[], byte[]> record;
+
+  public KafkaRecordEntity(ConsumerRecord<byte[], byte[]> record)
+  {
+    super(record.value());
+    this.record = record;
+  }
+
+  public ConsumerRecord<byte[], byte[]> getRecord()

Review comment:
   This is not used anywhere; is it only to allow writing extensions with a custom InputFormat that can take advantage of the extra metadata available?

##
File path: 
indexing-service/src/main/java/org/apache/druid/indexing/seekablestream/common/OrderedPartitionableRecord.java
##
@@ -55,7 +54,7 @@ public OrderedPartitionableRecord(
     this.stream = stream;
     this.partitionId = partitionId;
     this.sequenceNumber = sequenceNumber;
-    this.data = data == null ? ImmutableList.of() : data;
+    this.data = data;

Review comment:
   Why remove the null check and the conversion to an empty list?








[GitHub] [druid] xvrl commented on a change in pull request #10730: Introduce KafkaRecordEntity to support Kafka headers in InputFormats

2021-01-06 Thread GitBox


xvrl commented on a change in pull request #10730:
URL: https://github.com/apache/druid/pull/10730#discussion_r553074022



##
File path: 
extensions-core/kafka-indexing-service/src/main/java/org/apache/druid/data/input/kafka/KafkaRecordEntity.java
##
@@ -0,0 +1,39 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.data.input.kafka;
+
+import org.apache.druid.data.input.impl.ByteEntity;
+import org.apache.kafka.clients.consumer.ConsumerRecord;
+
+public class KafkaRecordEntity extends ByteEntity
+{
+  private final ConsumerRecord<byte[], byte[]> record;
+
+  public KafkaRecordEntity(ConsumerRecord<byte[], byte[]> record)
+  {
+    super(record.value());
+    this.record = record;
+  }
+
+  public ConsumerRecord<byte[], byte[]> getRecord()

Review comment:
   That's correct. We could have a sample InputFormat for documentation purposes, though I'm not sure there is much value. A more generic input format for Kafka that wraps multiple input formats would require a lot more thought.








[GitHub] [druid] jihoonson opened a new pull request #10732: Add a config for monitorScheduler type

2021-01-06 Thread GitBox


jihoonson opened a new pull request #10732:
URL: https://github.com/apache/druid/pull/10732


   https://github.com/apache/druid/pull/10448 modified the `MonitorScheduler` to use `CronScheduler` instead of `ScheduledExecutorService`. This change looks good to me, except that I'm not sure how well-tested `CronScheduler` is. This PR adds back the previous `ScheduledExecutorService`-based `MonitorScheduler`, along with a new config, `druid.monitoring.schedulerClassName`, to determine which type of `MonitorScheduler` to use. This PR doesn't change the default `MonitorScheduler`, as I find `CronScheduler` has reasonable test coverage. However, if any unknown bug turns up there, users can still fall back to the old monitorScheduler. The new config is intentionally not documented since no one is supposed to touch it, but it should be called out in the release notes.
   
   This PR additionally fixes 3 bugs in `MonitorScheduler`.
   - When an exception is thrown in `monitor.monitor()`, the behavior 
unexpectedly changed to stop the monitor. After this PR, the monitor ignores 
exceptions and continues working, as it used to.
   - There is a race condition between [when a scheduledFuture is set in a 
monitor](https://github.com/apache/druid/pull/10448/files#diff-e1da51fa67513b22b8110a76b29aa3c5cc8d00de418d7e5afc498c3bc3a8f107R178)
 and [when the scheduledFuture is 
used](https://github.com/apache/druid/pull/10448/files#diff-e1da51fa67513b22b8110a76b29aa3c5cc8d00de418d7e5afc498c3bc3a8f107R153).
 This is unlikely to happen in production since the first cronTask is 
executed after the emitter period, but it is possible in theory.
   - https://github.com/apache/druid/pull/10448 changed monitoring to use 64 
threads, which seems like overkill to me. This PR changes it back to a 
single thread.
   
   
   
   This PR has:
   - [x] been self-reviewed.
  - [ ] using the [concurrency 
checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md)
 (Remove this item if the PR doesn't have any relation to concurrency.)
   - [x] added documentation for new or modified features or behaviors.
   - [x] added Javadocs for most classes and all non-trivial methods. Linked 
related entities via Javadoc links.
   - [ ] added or updated version, license, or notice information in 
[licenses.yaml](https://github.com/apache/druid/blob/master/licenses.yaml)
   - [ ] added comments explaining the "why" and the intent of the code 
wherever would not be obvious for an unfamiliar reader.
   - [x] added unit tests or modified existing tests to cover new code paths, 
ensuring the threshold for [code 
coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md)
 is met.
   - [ ] added integration tests.
   - [ ] been tested in a test Druid cluster.






[GitHub] [druid] xvrl commented on a change in pull request #10730: Introduce KafkaRecordEntity to support Kafka headers in InputFormats

2021-01-06 Thread GitBox


xvrl commented on a change in pull request #10730:
URL: https://github.com/apache/druid/pull/10730#discussion_r553074022



##
File path: 
extensions-core/kafka-indexing-service/src/main/java/org/apache/druid/data/input/kafka/KafkaRecordEntity.java
##
@@ -0,0 +1,39 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.data.input.kafka;
+
+import org.apache.druid.data.input.impl.ByteEntity;
+import org.apache.kafka.clients.consumer.ConsumerRecord;
+
+public class KafkaRecordEntity extends ByteEntity
+{
+  private final ConsumerRecord<byte[], byte[]> record;
+
+  public KafkaRecordEntity(ConsumerRecord<byte[], byte[]> record)
+  {
+    super(record.value());
+    this.record = record;
+  }
+
+  public ConsumerRecord<byte[], byte[]> getRecord()

Review comment:
   that's correct. We could have a sample InputFormat for documentation 
purposes, though I'm not sure if there is much value. A more generic input 
format for Kafka that wraps multiple input-formats would require a lot more 
thought, and I couldn't think of a one-size-fits-all approach that seemed very 
useful or didn't have additional complexities to deal with. 








[GitHub] [druid] xvrl commented on a change in pull request #10730: Introduce KafkaRecordEntity to support Kafka headers in InputFormats

2021-01-06 Thread GitBox


xvrl commented on a change in pull request #10730:
URL: https://github.com/apache/druid/pull/10730#discussion_r553074881



##
File path: 
indexing-service/src/main/java/org/apache/druid/indexing/seekablestream/common/OrderedPartitionableRecord.java
##
@@ -55,7 +54,7 @@ public OrderedPartitionableRecord(
 this.stream = stream;
 this.partitionId = partitionId;
 this.sequenceNumber = sequenceNumber;
-this.data = data == null ? ImmutableList.of() : data;
+this.data = data;

Review comment:
   I reverted this change after seeing that we rely on null in a bunch of 
places.








[GitHub] [druid] lgtm-com[bot] commented on pull request #10732: Add a config for monitorScheduler type

2021-01-06 Thread GitBox


lgtm-com[bot] commented on pull request #10732:
URL: https://github.com/apache/druid/pull/10732#issuecomment-755862034


   This pull request **introduces 1 alert** when merging 
6b791d120362fdcfd14e7b1f10d30e941a52a373 into 
48e576a307da0efba1cb036c274327733658c1d6 - [view on 
LGTM.com](https://lgtm.com/projects/g/apache/druid/rev/pr-ba61b23623936b01c7a11557c2a97abbdc2118cf)
   
   **new alerts:**
   
   * 1 for Dereferenced variable may be null






[GitHub] [druid] himanshug merged pull request #9518: AWS RDS token based password provider

2021-01-06 Thread GitBox


himanshug merged pull request #9518:
URL: https://github.com/apache/druid/pull/9518


   






[GitHub] [druid] himanshug commented on pull request #9518: AWS RDS token based password provider

2021-01-06 Thread GitBox


himanshug commented on pull request #9518:
URL: https://github.com/apache/druid/pull/9518#issuecomment-755889518


   thanks @clintropolis 






[druid] branch master updated (48e576a -> c7b1212)

2021-01-06 Thread himanshug
This is an automated email from the ASF dual-hosted git repository.

himanshug pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/druid.git.


from 48e576a  Scan query: More accurate error message when segment per time 
chunk limit is exceeded. (#10630)
 add c7b1212  AWS RDS token based password  provider (#9518)

No new revisions were added by this update.

Summary of changes:
 distribution/pom.xml   |   2 +
 docs/development/extensions-core/druid-aws-rds.md  |  38 +
 docs/development/extensions.md |   1 +
 .../pom.xml|  45 --
 .../org/apache/druid/aws/rds/AWSRDSModule.java}|  11 +-
 .../druid/aws/rds/AWSRDSTokenPasswordProvider.java | 123 ++
 .../org.apache.druid.initialization.DruidModule|   3 +-
 .../aws/rds/AWSRDSTokenPasswordProviderTest.java   |  82 ++
 licenses.yaml  |  10 ++
 pom.xml|   4 +-
 server/pom.xml |  11 ++
 .../apache/druid/metadata/BasicDataSourceExt.java  | 179 +
 .../metadata/SQLFirehoseDatabaseConnector.java |   2 +-
 .../druid/metadata/SQLMetadataConnector.java   |   2 +-
 .../druid/metadata/BasicDataSourceExtTest.java | 113 +
 website/.spelling  |   3 +
 16 files changed, 601 insertions(+), 28 deletions(-)
 create mode 100644 docs/development/extensions-core/druid-aws-rds.md
 copy extensions-core/{simple-client-sslcontext => 
druid-aws-rds-extensions}/pom.xml (75%)
 copy 
extensions-core/{s3-extensions/src/main/java/org/apache/druid/firehose/s3/S3FirehoseDruidModule.java
 => 
druid-aws-rds-extensions/src/main/java/org/apache/druid/aws/rds/AWSRDSModule.java}
 (84%)
 create mode 100644 
extensions-core/druid-aws-rds-extensions/src/main/java/org/apache/druid/aws/rds/AWSRDSTokenPasswordProvider.java
 copy services/src/bin/run.sh => 
extensions-core/druid-aws-rds-extensions/src/main/resources/META-INF/services/org.apache.druid.initialization.DruidModule
 (95%)
 create mode 100644 
extensions-core/druid-aws-rds-extensions/src/test/java/org/apache/druid/aws/rds/AWSRDSTokenPasswordProviderTest.java
 create mode 100644 
server/src/main/java/org/apache/druid/metadata/BasicDataSourceExt.java
 create mode 100644 
server/src/test/java/org/apache/druid/metadata/BasicDataSourceExtTest.java





[GitHub] [druid] jon-wei opened a new pull request #10733: Update deps for CVE-2020-28168 and CVE-2020-28052

2021-01-06 Thread GitBox


jon-wei opened a new pull request #10733:
URL: https://github.com/apache/druid/pull/10733


   This PR updates the axios dependency (from web-console) and a bouncy castle 
dependency coming in through the kubernetes extension, addressing the following 
CVEs that cause `mvn dependency-check:check` to fail:
   
   https://nvd.nist.gov/vuln/detail/CVE-2020-28168
   https://nvd.nist.gov/vuln/detail/CVE-2020-28052
   
   Axios is updated to version 0.21.1.
   
   For the Bouncy Castle dependency, I couldn't find a version of the 
kubernetes client (https://github.com/kubernetes-client/java) that used a 
version newer than 1.66, so I added a direct version override to use 1.68.
   
   This PR has:
   - [x] been self-reviewed.
   - [ ] added documentation for new or modified features or behaviors.
   - [ ] added Javadocs for most classes and all non-trivial methods. Linked 
related entities via Javadoc links.
   - [x] added or updated version, license, or notice information in 
[licenses.yaml](https://github.com/apache/druid/blob/master/licenses.yaml)
   - [x] added comments explaining the "why" and the intent of the code 
wherever would not be obvious for an unfamiliar reader.
   - [ ] added unit tests or modified existing tests to cover new code paths, 
ensuring the threshold for [code 
coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md)
 is met.
   - [ ] added integration tests.
   - [ ] been tested in a test Druid cluster.
   
   
   
   
   






[GitHub] [druid] clintropolis commented on a change in pull request #10724: Coordinator Dynamic Config changes to ease upgrading with new config value

2021-01-06 Thread GitBox


clintropolis commented on a change in pull request #10724:
URL: https://github.com/apache/druid/pull/10724#discussion_r553119211



##
File path: 
server/src/main/java/org/apache/druid/server/coordinator/CoordinatorDynamicConfig.java
##
@@ -87,6 +88,8 @@
   private final int maxSegmentsInNodeLoadingQueue;
   private final boolean pauseCoordination;
 
+  private static final EmittingLogger log = new EmittingLogger(CoordinatorDynamicConfig.class);

Review comment:
   nit: I don't think this needs to be `EmittingLogger` since it isn't 
doing an alert anywhere 

##
File path: 
server/src/main/java/org/apache/druid/server/coordinator/CoordinatorDynamicConfig.java
##
@@ -96,7 +99,7 @@ public CoordinatorDynamicConfig(
   @JsonProperty("mergeBytesLimit") long mergeBytesLimit,
   @JsonProperty("mergeSegmentsLimit") int mergeSegmentsLimit,
   @JsonProperty("maxSegmentsToMove") int maxSegmentsToMove,
-      @JsonProperty("percentOfSegmentsToConsiderPerMove") double percentOfSegmentsToConsiderPerMove,
+      @JsonProperty("percentOfSegmentsToConsiderPerMove") Double percentOfSegmentsToConsiderPerMove,

Review comment:
   I guess the alternative way to fix this instead of switching out the 
primitive would be to treat `0` as not configured and so use the default of 
`100`, though this way seems fine too, I think it just sticks out a bit because 
the rest of the numbery parameters are primitives.
   
   also nit: should annotate this parameter with `@Nullable`
   
   i missed the first PR, but any reason this is a double instead of integer 
similar to other percent based config 
`decommissioningMaxPercentOfMaxSegmentsToMove`? I guess there can be millions 
of segments so the numbers being dealt with here are a bit different (at least 
hopefully no one is trying to actually move on the scale of that many segments 
per coordinator run..) so maybe those decimal points really do make a 
difference, but otoh I can't really imagine operators configuring this much 
more granular than integer numbers, which I think was the reason we used 
integer for the other one iirc.
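   The primitive-vs-boxed point can be shown with plain Java: a JSON 
deserializer binding to a primitive `double` passes `0.0` when the field is 
absent, so "not configured" and "explicitly 0" collapse into one value, 
whereas a boxed `Double` arrives as `null` and a default can be substituted. 
A sketch of the pattern (illustrative only, not the actual Druid class; the 
`100` default is the one discussed in this PR):

```java
public class NullableConfigSketch
{
  // Binding to a boxed Double delivers null for an absent JSON field,
  // letting the code substitute a default. With a primitive double, absence
  // would arrive as 0.0 and be indistinguishable from an explicit 0.
  static final double DEFAULT_PERCENT_OF_SEGMENTS_TO_CONSIDER = 100.0;

  static double resolve(Double percentOfSegmentsToConsiderPerMove)
  {
    return percentOfSegmentsToConsiderPerMove == null
           ? DEFAULT_PERCENT_OF_SEGMENTS_TO_CONSIDER
           : percentOfSegmentsToConsiderPerMove;
  }

  public static void main(String[] args)
  {
    System.out.println(resolve(null));  // absent in JSON -> default, prints 100.0
    System.out.println(resolve(25.0));  // configured value, prints 25.0
  }
}
```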








[GitHub] [druid] abhishekagarwal87 opened a new pull request #10734: add offsetFetchPeriod to kinesis ingestion doc

2021-01-06 Thread GitBox


abhishekagarwal87 opened a new pull request #10734:
URL: https://github.com/apache/druid/pull/10734


   Minor doc fix






[GitHub] [druid] bananaaggle commented on pull request #10710: fix web-console show json bug

2021-01-06 Thread GitBox


bananaaggle commented on pull request #10710:
URL: https://github.com/apache/druid/pull/10710#issuecomment-755929310


   @vogievetsky I changed the code following your suggestion, and I checked 
all the places where `JSON.stringify` is used and replaced it where necessary.






[GitHub] [druid] ayushkul2910 commented on a change in pull request #10448: Added CronScheduler support as a proof to clock drift while emitting metrics

2021-01-06 Thread GitBox


ayushkul2910 commented on a change in pull request #10448:
URL: https://github.com/apache/druid/pull/10448#discussion_r553162349



##
File path: 
server/src/main/java/org/apache/druid/server/metrics/MetricsModule.java
##
@@ -108,10 +108,10 @@ public MonitorScheduler getMonitorScheduler(
 
     return new MonitorScheduler(
         config.get(),
-        CronScheduler.newBuilder(Duration.ofSeconds(1L)).setThreadName("MonitorScheduler-%s").build(),
+        CronScheduler.newBuilder(Duration.ofSeconds(1L)).setThreadName("MonitorSchedulerThread").build(),
         emitter,
         monitors,
-        Executors.newCachedThreadPool()
+        Execs.multiThreaded(64, "MonitorThread-%d")

Review comment:
   Hi @jihoonson , sorry for the delay in response. 
   I think there are currently ~20 monitors that can run concurrently under 
the `MonitorScheduler` class. Suppose the scheduling period is shorter than 
the time the executor thread takes to run `monitor.monitor(...)` (although I 
am not sure this case arises in practice; it is an edge case). With too few 
threads, this can result in tasks queuing up. I think we should at least 
have as many threads as the maximum number of monitors supported. I may be 
missing something here. What do you think?
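   The queuing scenario described above can be reproduced for a single slow 
monitor with a plain `ScheduledExecutorService`: `scheduleAtFixedRate` never 
overlaps executions of the same task, so when the body outlasts the period, 
completions back up rather than firing once per period (and with many 
monitors sharing too few threads, the backlog compounds). A JDK-only sketch 
with illustrative names:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class SlowMonitorSketch
{
  // Schedules a task every 10 ms whose body takes ~50 ms. Because executions
  // of the same task never overlap, far fewer than the nominal ~50 periods'
  // worth of runs complete within 500 ms.
  static int runDemo() throws InterruptedException
  {
    ScheduledExecutorService exec = Executors.newSingleThreadScheduledExecutor();
    AtomicInteger completed = new AtomicInteger();
    exec.scheduleAtFixedRate(
        () -> {
          try {
            Thread.sleep(50); // stand-in for a monitor.monitor() call outlasting the period
          }
          catch (InterruptedException e) {
            Thread.currentThread().interrupt();
          }
          completed.incrementAndGet();
        },
        0, 10, TimeUnit.MILLISECONDS
    );
    Thread.sleep(500);
    exec.shutdownNow();
    return completed.get();
  }

  public static void main(String[] args) throws InterruptedException
  {
    System.out.println("completed " + runDemo() + " runs in ~50 nominal periods");
  }
}
```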




