[GitHub] [druid] imply-cheddar commented on a diff in pull request #14002: lower segment heap footprint and fix bug with expression type coercion

2023-03-31 Thread via GitHub


imply-cheddar commented on code in PR #14002:
URL: https://github.com/apache/druid/pull/14002#discussion_r1154121916


##
processing/src/main/java/org/apache/druid/java/util/common/io/smoosh/SmooshedFileMapper.java:
##
@@ -48,7 +48,11 @@
  */
 public class SmooshedFileMapper implements Closeable
 {
-  private static final Interner STRING_INTERNER = Interners.newWeakInterner();
+  /**
+   * Interner for smoosh internal files, which includes all column names since very column has an internal file

Review Comment:
   typo: s/very/every/
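For context, the interner documented in this diff exists to deduplicate the repeated internal-file/column-name strings held on the heap. Here is a minimal illustration of the same deduplication idea using Python's `sys.intern`; the mechanism differs from Guava's weak interner (Python's table is not weak), but the effect on equal strings is the same:

```python
import sys

def intern_all(names):
    """Deduplicate equal strings so repeated names share one object."""
    return [sys.intern(n) for n in names]

# Build two equal strings at runtime so they start as distinct objects.
a = "".join(["segment", "_id"])
b = "".join(["segment", "_id"])
assert a == b and a is not b
# After interning, equal strings collapse to a single shared object,
# which is the heap saving an interner provides.
ia, ib = intern_all([a, b])
assert ia is ib
```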



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] tejaswini-imply opened a new pull request, #14005: Update Hadoop3 as default build version

2023-03-31 Thread via GitHub


tejaswini-imply opened a new pull request, #14005:
URL: https://github.com/apache/druid/pull/14005

   ### Description:
   Hadoop 2 often causes red security scans on the Druid distribution because of the dependencies it brings. We want to move away from Hadoop 2 and make a Hadoop 3 distribution available, so this switches Druid to building with Hadoop 3 by default. Druid remains compatible with Hadoop 2, and users can build a Hadoop 2-compatible distribution using the hadoop2 profile.
   
   
   
    Release note
   
   Druid is now built with Hadoop 3. If you need a Hadoop 2-compatible distribution, you can build one yourself from the source code, producing a tar with the hadoop2 profile.





[GitHub] [druid] github-code-scanning[bot] commented on a diff in pull request #14004: Errors take 3

2023-03-31 Thread via GitHub


github-code-scanning[bot] commented on code in PR #14004:
URL: https://github.com/apache/druid/pull/14004#discussion_r1154227559


##
sql/src/test/java/org/apache/druid/sql/avatica/DruidAvaticaHandlerTest.java:
##
@@ -249,60 +254,66 @@
   protected AbstractAvaticaHandler getAvaticaHandler(final DruidMeta druidMeta)
   {
     return new DruidAvaticaJsonHandler(
-        druidMeta,
-        new DruidNode("dummy", "dummy", false, 1, null, true, false),
-        new AvaticaMonitor()
+        druidMeta,
+        new DruidNode("dummy", "dummy", false, 1, null, true, false),
+        new AvaticaMonitor()
     );
   }
 
   @Before
   public void setUp() throws Exception
   {
-    walker = CalciteTests.createMockWalker(conglomerate, temporaryFolder.newFolder());
     final DruidSchemaCatalog rootSchema = makeRootSchema();
     testRequestLogger = new TestRequestLogger();
 
     injector = new CoreInjectorBuilder(new StartupInjectorBuilder().build())
-        .addModule(binder -> {
-          binder.bindConstant().annotatedWith(Names.named("serviceName")).to("test");
-          binder.bindConstant().annotatedWith(Names.named("servicePort")).to(0);
-          binder.bindConstant().annotatedWith(Names.named("tlsServicePort")).to(-1);
-          binder.bind(AuthenticatorMapper.class).toInstance(CalciteTests.TEST_AUTHENTICATOR_MAPPER);
-          binder.bind(AuthorizerMapper.class).toInstance(CalciteTests.TEST_AUTHORIZER_MAPPER);
-          binder.bind(Escalator.class).toInstance(CalciteTests.TEST_AUTHENTICATOR_ESCALATOR);
-          binder.bind(RequestLogger.class).toInstance(testRequestLogger);
-          binder.bind(DruidSchemaCatalog.class).toInstance(rootSchema);
-          for (NamedSchema schema : rootSchema.getNamedSchemas().values()) {
-            Multibinder.newSetBinder(binder, NamedSchema.class).addBinding().toInstance(schema);
+        .addModule(
+            binder -> {
+              binder.bindConstant().annotatedWith(Names.named("serviceName")).to("test");
+              binder.bindConstant().annotatedWith(Names.named("servicePort")).to(0);
+              binder.bindConstant().annotatedWith(Names.named("tlsServicePort")).to(-1);
+              binder.bind(AuthenticatorMapper.class).toInstance(CalciteTests.TEST_AUTHENTICATOR_MAPPER);
+              binder.bind(AuthorizerMapper.class).toInstance(CalciteTests.TEST_AUTHORIZER_MAPPER);
+              binder.bind(Escalator.class).toInstance(CalciteTests.TEST_AUTHENTICATOR_ESCALATOR);
+              binder.bind(RequestLogger.class).toInstance(testRequestLogger);
+              binder.bind(DruidSchemaCatalog.class).toInstance(rootSchema);
+              for (NamedSchema schema : rootSchema.getNamedSchemas().values()) {
+                Multibinder.newSetBinder(binder, NamedSchema.class).addBinding().toInstance(schema);
+              }
+              binder.bind(QueryLifecycleFactory.class)
+                    .toInstance(CalciteTests.createMockQueryLifecycleFactory(walker, conglomerate));
+              binder.bind(DruidOperatorTable.class).toInstance(operatorTable);
+              binder.bind(ExprMacroTable.class).toInstance(macroTable);
+              binder.bind(PlannerConfig.class).toInstance(plannerConfig);
+              binder.bind(String.class)
+                    .annotatedWith(DruidSchemaName.class)
+                    .toInstance(CalciteTests.DRUID_SCHEMA_NAME);
+              binder.bind(AvaticaServerConfig.class).toInstance(AVATICA_CONFIG);
+              binder.bind(ServiceEmitter.class).to(NoopServiceEmitter.class);
+              binder.bind(QuerySchedulerProvider.class).in(LazySingleton.class);
+              binder.bind(QueryScheduler.class)
+                    .toProvider(QuerySchedulerProvider.class)
+                    .in(LazySingleton.class);
+              binder.install(new SqlModule.SqlStatementFactoryModule());
+              binder.bind(new TypeLiteral<Supplier<DefaultQueryConfig>>()
+              {
+              }).toInstance(Suppliers.ofInstance(new DefaultQueryConfig(ImmutableMap.of())));
+              binder.bind(CalciteRulesManager.class).toInstance(new CalciteRulesManager(ImmutableSet.of()));
+              binder.bind(JoinableFactoryWrapper.class).toInstance(CalciteTests.createJoinableFactoryWrapper());
+              binder.bind(CatalogResolver.class).toInstance(CatalogResolver.NULL_RESOLVER);
             }
-        binder.bind(QueryLifecycleFactory.class)
-          .toInstance(CalciteTests.createMockQueryLifecycleFactory(walker, conglomerate));
-        binder.bind(DruidOperatorTable.class).toInstance(operatorTable);
-        binder.bind(ExprMacroTable.class).toInstance(macroTable);
-        binder.bind(PlannerConfig.class).toInstance(plannerConfig);
-        binder.bind(String.class)
-          .annotatedWith(DruidSchemaName.class)
-          .toInstance(CalciteTests.DRUID_SCHEMA_NAME);
-        binder.bind(AvaticaServe

[GitHub] [druid] clintropolis merged pull request #13996: actually backwards compatible frontCoded string encoding strategy

2023-03-31 Thread via GitHub


clintropolis merged PR #13996:
URL: https://github.com/apache/druid/pull/13996





[druid] branch master updated (51f3db2ce6 -> e3211e3be0)

2023-03-31 Thread cwylie
This is an automated email from the ASF dual-hosted git repository.

cwylie pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/druid.git


from 51f3db2ce6 Fix peon errors when executing tasks in ipv6(#13972) (#13995)
 add e3211e3be0 actually backwards compatible frontCoded string encoding strategy (#13996)

No new revisions were added by this update.

Summary of changes:
 .../src/test/java/org/apache/druid/benchmark/query/SqlBenchmark.java  | 3 ++-
 .../java/org/apache/druid/benchmark/query/SqlNestedDataBenchmark.java | 3 ++-
 docs/ingestion/ingestion-spec.md  | 4 ++--
 .../test/java/org/apache/druid/msq/indexing/MSQTuningConfigTest.java  | 3 ++-
 .../java/org/apache/druid/msq/util/MultiStageQueryContextTest.java| 2 +-
 .../main/java/org/apache/druid/segment/data/FrontCodedIndexed.java| 2 +-
 processing/src/test/java/org/apache/druid/segment/TestIndex.java  | 3 ++-
 .../org/apache/druid/segment/column/StringEncodingStrategyTest.java   | 3 ++-
 .../src/test/java/org/apache/druid/segment/filter/BaseFilterTest.java | 4 +++-
 .../test/java/org/apache/druid/segment/filter/SpatialFilterTest.java  | 3 ++-
 10 files changed, 19 insertions(+), 11 deletions(-)





[GitHub] [druid] kfaraz commented on issue #13979: how to migrate data from one historical to another in the same cluster

2023-03-31 Thread via GitHub


kfaraz commented on issue #13979:
URL: https://github.com/apache/druid/issues/13979#issuecomment-1491814724

   1. Segments are permanently stored in deep storage (e.g. S3). If a historical goes down, nothing is lost permanently and the missing segments can always be loaded on a different historical.
   2. When a historical goes down, the segments served by that historical would become under-replicated, or even unavailable if a segment had only 1 copy in the cluster.
   3. When segments are unavailable, queries would not return data for those segments.
   4. As soon as a new historical loads the missing segments, the data is available for query again.
   
   If you don't want data to be unavailable at any point, the way to do this right now is to mark the historical as "decommissioning". The Druid coordinator would then move all segments on the "decommissioning" historical to other active historicals. Once all segments are removed from the "decommissioning" historical, it can be safely terminated.
   
   Please refer to the configuration docs here for example usage of "decommissioning":
   https://druid.apache.org/docs/latest/configuration/index.html#dynamic-configuration
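The decommissioning flow described above is driven through the coordinator dynamic configuration. As a rough sketch, marking a historical for decommissioning could look like the following; the endpoint path and the `decommissioningNodes` field come from the dynamic-configuration docs linked above, but treat the exact payload shape and the host name as assumptions to verify against your Druid version:

```python
import json
from urllib import request

def decommission_payload(current_config: dict, host: str) -> dict:
    """Return a copy of the coordinator dynamic config with `host` added to
    decommissioningNodes (field name per the dynamic-configuration docs)."""
    updated = dict(current_config)
    nodes = set(updated.get("decommissioningNodes") or [])
    nodes.add(host)
    updated["decommissioningNodes"] = sorted(nodes)
    return updated

def post_dynamic_config(coordinator: str, config: dict) -> None:
    # POST the updated dynamic config back to the coordinator.
    req = request.Request(
        f"{coordinator}/druid/coordinator/v1/config",
        data=json.dumps(config).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    request.urlopen(req)

# Hypothetical usage: mark historical-2:8083 as decommissioning.
# config = json.load(request.urlopen("http://coordinator:8081/druid/coordinator/v1/config"))
# post_dynamic_config("http://coordinator:8081", decommission_payload(config, "historical-2:8083"))
```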





[GitHub] [druid] phillbrown opened a new issue, #14006: Error when using string_to_mv() from a subquery

2023-03-31 Thread via GitHub


phillbrown opened a new issue, #14006:
URL: https://github.com/apache/druid/issues/14006

   ### Affected Version
   
   25.0
   
   ### Description
   
   Steps to reproduce:
   
   1. Load the [attached](https://github.com/apache/druid/files/11123016/string-to-mv.zip) data using native batch ingestion with the [attached](https://github.com/apache/druid/files/11123016/string-to-mv.zip) spec
   2. Run the following query using the SQL native engine
   
   ```
   SELECT STRING_TO_MV(mvstring, ',')
   FROM (
 SELECT DISTINCT "parent_id", MV_TO_STRING("multi_values", ',') AS mvstring
 FROM "example"
 GROUP BY 1, 2
   )
   ```
   
   The query returns the following error:
   
   ```
   Error: Unknown exception
   
   Cannot coerce [[Ljava.lang.String;] to VARCHAR
   
   org.apache.druid.java.util.common.ISE
   ```
   
   Interestingly, the query runs as expected if we use the MSQ engine. I'm using the out-of-the-box auto configuration.





[GitHub] [druid] somu-imply opened a new pull request, #14007: Fixing NPE in window functions when order by DESC is used within OVER

2023-03-31 Thread via GitHub


somu-imply opened a new pull request, #14007:
URL: https://github.com/apache/druid/pull/14007

   
   
   This PR has:
   
   - [ ] been self-reviewed.
  - [ ] using the [concurrency checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md) (Remove this item if the PR doesn't have any relation to concurrency.)
   - [ ] added documentation for new or modified features or behaviors.
   - [ ] a release note entry in the PR description.
   - [ ] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
   - [ ] added or updated version, license, or notice information in [licenses.yaml](https://github.com/apache/druid/blob/master/dev/license.md)
   - [ ] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
   - [ ] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
   - [ ] added integration tests.
   - [ ] been tested in a test Druid cluster.
   





[GitHub] [druid] fjy merged pull request #13993: Web console: add a nice UI for overlord dynamic configs and improve the docs

2023-03-31 Thread via GitHub


fjy merged PR #13993:
URL: https://github.com/apache/druid/pull/13993





[druid] branch master updated: Web console: add a nice UI for overlord dynamic configs and improve the docs (#13993)

2023-03-31 Thread fjy
This is an automated email from the ASF dual-hosted git repository.

fjy pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/druid.git


The following commit(s) were added to refs/heads/master by this push:
 new 981662e9f4 Web console: add a nice UI for overlord dynamic configs and improve the docs (#13993)
981662e9f4 is described below

commit 981662e9f407093162d5649c9e8a223c131d8c78
Author: Vadim Ogievetsky 
AuthorDate: Fri Mar 31 10:12:25 2023 -0700

Web console: add a nice UI for overlord dynamic configs and improve the docs (#13993)

* in progress

* better form

* doc updates

* doc changes

* add inline docs

* fix tests

* Update docs/configuration/index.md

Co-authored-by: 317brian <53799971+317br...@users.noreply.github.com>

* Update docs/configuration/index.md

Co-authored-by: 317brian <53799971+317br...@users.noreply.github.com>

* Update docs/configuration/index.md

Co-authored-by: 317brian <53799971+317br...@users.noreply.github.com>

* Update docs/configuration/index.md

Co-authored-by: 317brian <53799971+317br...@users.noreply.github.com>

* Update docs/configuration/index.md

Co-authored-by: 317brian <53799971+317br...@users.noreply.github.com>

* Update docs/configuration/index.md

Co-authored-by: 317brian <53799971+317br...@users.noreply.github.com>

* Update docs/configuration/index.md

Co-authored-by: 317brian <53799971+317br...@users.noreply.github.com>

* Update docs/configuration/index.md

Co-authored-by: 317brian <53799971+317br...@users.noreply.github.com>

* Update docs/configuration/index.md

Co-authored-by: 317brian <53799971+317br...@users.noreply.github.com>

* Update docs/configuration/index.md

Co-authored-by: 317brian <53799971+317br...@users.noreply.github.com>

* Update docs/configuration/index.md

Co-authored-by: 317brian <53799971+317br...@users.noreply.github.com>

* Update docs/configuration/index.md

Co-authored-by: 317brian <53799971+317br...@users.noreply.github.com>

* Update docs/configuration/index.md

Co-authored-by: 317brian <53799971+317br...@users.noreply.github.com>

* Update docs/configuration/index.md

Co-authored-by: 317brian <53799971+317br...@users.noreply.github.com>

* Update docs/configuration/index.md

Co-authored-by: 317brian <53799971+317br...@users.noreply.github.com>

* Update docs/configuration/index.md

Co-authored-by: 317brian <53799971+317br...@users.noreply.github.com>

* Update docs/configuration/index.md

Co-authored-by: 317brian <53799971+317br...@users.noreply.github.com>

* Update docs/configuration/index.md

Co-authored-by: 317brian <53799971+317br...@users.noreply.github.com>

* Update docs/configuration/index.md

Co-authored-by: 317brian <53799971+317br...@users.noreply.github.com>

* Update docs/configuration/index.md

Co-authored-by: 317brian <53799971+317br...@users.noreply.github.com>

* final fixes

* fix case

* Update docs/configuration/index.md

Co-authored-by: Kashif Faraz 

* Update docs/configuration/index.md

Co-authored-by: Kashif Faraz 

* Update docs/configuration/index.md

Co-authored-by: Kashif Faraz 

* Update docs/configuration/index.md

Co-authored-by: Kashif Faraz 

* Update docs/configuration/index.md

Co-authored-by: Kashif Faraz 

* Update docs/configuration/index.md

Co-authored-by: Kashif Faraz 

* Update docs/configuration/index.md

Co-authored-by: Kashif Faraz 

* Update docs/configuration/index.md

Co-authored-by: Kashif Faraz 

* Update docs/configuration/index.md

Co-authored-by: Kashif Faraz 

* Update docs/configuration/index.md

Co-authored-by: Kashif Faraz 

* fix overflow

* fix spelling

-

Co-authored-by: 317brian <53799971+317br...@users.noreply.github.com>
Co-authored-by: Kashif Faraz 
---
 docs/configuration/index.md| 152 ++--
 .../form-group-with-info/form-group-with-info.scss |   8 +
 .../overload-dynamic-config-dialog.spec.tsx.snap   |   1 +
 .../overlord-dynamic-config-dialog.scss|  10 --
 .../overlord-dynamic-config-dialog.tsx |  33 +++-
 .../overlord-dynamic-config.tsx| 196 -
 6 files changed, 320 insertions(+), 80 deletions(-)

diff --git a/docs/configuration/index.md b/docs/configuration/index.md
index 5fbf4b0c07..4f86695b95 100644
--- a/docs/configuration/index.md
+++ b/docs/configuration/index.md
@@ -1168,9 +1168,9 @@ There are additional configs for autoscaling (i

[GitHub] [druid] xvrl commented on issue #12824: Upgrade Quickstart Python Scripts to Python 3

2023-03-31 Thread via GitHub


xvrl commented on issue #12824:
URL: https://github.com/apache/druid/issues/12824#issuecomment-1492394893

   I believe this has been addressed by https://github.com/apache/druid/pull/12841





[GitHub] [druid] xvrl closed issue #12824: Upgrade Quickstart Python Scripts to Python 3

2023-03-31 Thread via GitHub


xvrl closed issue #12824: Upgrade Quickstart Python Scripts to Python 3
URL: https://github.com/apache/druid/issues/12824





[GitHub] [druid] georgew5656 opened a new pull request, #14008: New error message for task deletion

2023-03-31 Thread via GitHub


georgew5656 opened a new pull request, #14008:
URL: https://github.com/apache/druid/pull/14008

   ### Description
   The fabric8 API returns false only if the K8s API returns a 404 on a delete attempt (resource not found). If there is an actual error, fabric8 will rethrow the client exception. This updates the log message to account for that.
   
   See:
   https://javadoc.io/static/io.fabric8/kubernetes-client/4.6.4/io/fabric8/kubernetes/client/dsl/base/BaseOperation.html#delete--
   
   
   # Key changed/added classes in this PR
   Update DruidKubernetesPeonClient to handle false results from job.delete() differently.
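The delete semantics described above (a `false` return means the resource was already gone, while real API failures surface as exceptions) can be sketched as follows. This is an illustrative model in Python with hypothetical names, not the actual DruidKubernetesPeonClient code:

```python
def cleanup_peon_job(delete_job, job_name, log):
    """Model of fabric8-style delete semantics: delete_job() returns False
    only when the resource was already gone (HTTP 404); real API errors are
    raised as exceptions and propagate to the caller unchanged."""
    deleted = delete_job()  # raises on a real K8s API error
    if not deleted:
        # Not a failure: the K8s API returned 404, the job no longer exists.
        log.append(f"Job [{job_name}] not found; nothing to delete")
    return deleted
```

The point of the sketch is that a `False` result should be logged as a no-op rather than as an error, which is what the updated log message conveys.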
   This PR has:
   
   - [X] been self-reviewed.
  - [ ] using the [concurrency checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md) (Remove this item if the PR doesn't have any relation to concurrency.)
   - [ ] added documentation for new or modified features or behaviors.
   - [ ] a release note entry in the PR description.
   - [ ] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
   - [ ] added or updated version, license, or notice information in [licenses.yaml](https://github.com/apache/druid/blob/master/dev/license.md)
   - [ ] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
   - [ ] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
   - [ ] added integration tests.
   - [X] been tested in a test Druid cluster.
   





[GitHub] [druid] somu-imply commented on a diff in pull request #14007: Fixing NPE in window functions when order by DESC is used within OVER

2023-03-31 Thread via GitHub


somu-imply commented on code in PR #14007:
URL: https://github.com/apache/druid/pull/14007#discussion_r1154796168


##
sql/src/test/resources/calcite/tests/window/wikipediaAggregationsMultipleOrderingDesc.sqlTest:
##
@@ -10,16 +10,22 @@ sql: |
 FROM wikipedia
 GROUP BY 1, 2
 ORDER BY 1 DESC, 2 DESC
+LIMIT 2
 
 expectedOperators:
+  - {type: "naiveSort", columns: [{column: "d0", direction: "ASC"}, {column: "d1", direction: "DESC"}]}
   - { type: "naivePartition",  partitionColumns: [ "d0" ] }
   - type: "window"
     processor:
       type: "framedAgg"
       frame: { peerType: "ROWS", lowUnbounded: false, lowOffset: 3, uppUnbounded: false, uppOffset: 2 }
       aggregations:
         - { type: "longSum", name: "w0", fieldName: "a0" }
-  - { type: "naiveSort", columns: [ { column: "d1", direction: "DESC" }, { column: "a0", direction: "DESC"} ]}

Review Comment:
   Unclear why this changed from ASC to DESC. Investigating further.











[GitHub] [druid] paul-rogers opened a new pull request, #14009: Add basic security functions to druidapi

2023-03-31 Thread via GitHub


paul-rogers opened a new pull request, #14009:
URL: https://github.com/apache/druid/pull/14009

   This PR adds a complete set of Basic security functions to the Python `druidapi`. These functions are handy for setting up security, inspecting the security setup, and learning the nuances of the basic security system. They would make a fine foundation for a Basic security tutorial notebook. If we did such a notebook:
   
   * Emphasize that users are defined twice: once in the authorizer, again in the authenticator.
   * The many config settings that have to be done just right.
   * The complexities of SQL security: sometimes one needs multiple permissions.
   
   Since the Druid console doesn't provide tools to set up basic security, doing it via Python is a handy way to get started until a user defines a more production-grade integration with an external system.
   
    Example:
   
   ```python
   # Define a coordinator-specific client, using the admin user
   coord = druidapi.jupyter_client('http://localhost:8081', auth=('admin', 'pwd'))
   # Create a basic auth client for your authenticator and authorizer:
   ac = coord.basic_security('myAuthorizer', 'myAuthenticator')
   
   # Get information
   # List users
   ac.users()
   # List roles
   ac.roles()
   # List roles for a user
   ac.authorization_user('alice')
   # List permissions for a role
   ac.role_permissions('aliceRole')
   
   # Create user
   ac.add_user('fred', 'pwd')
   # Create role
   ac.add_role('myRole')
   # Grant permissions to a role
   perms = [ac.resource_action(consts.DATASOURCE_RESOURCE, 'foo', consts.READ_ACTION)]
   ac.set_role_permissions('myRole', perms)
   # Assign a role to a user
   ac.assign_role_to_user('myRole', 'fred')
   
   # "Log in" as the new user
   fred = druidapi.jupyter_client('http://localhost:', auth=('fred', 'pwd'))
   # Perform operations as the user.
   fred.sql.sql('SELECT * FROM foo LIMIT 10')
   
   # Drop user
   ac.drop_user('fred')
   ```
   
    Release note
   
   See the description.
   
   
   
   This PR has:
   
   - [X] been self-reviewed.
   - [X] added documentation for new or modified features or behaviors.
   - [X] a release note entry in the PR description.
   - [X] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
   - [ ] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
   - [X] been tested in a test Druid cluster.
   





[GitHub] [druid] clintropolis merged pull request #14002: lower segment heap footprint and fix bug with expression type coercion

2023-03-31 Thread via GitHub


clintropolis merged PR #14002:
URL: https://github.com/apache/druid/pull/14002





[druid] branch master updated (981662e9f4 -> 518698a952)

2023-03-31 Thread cwylie
This is an automated email from the ASF dual-hosted git repository.

cwylie pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/druid.git


from 981662e9f4 Web console: add a nice UI for overlord dynamic configs and improve the docs (#13993)
 add 518698a952 lower segment heap footprint and fix bug with expression type coercion (#14002)

No new revisions were added by this update.

Summary of changes:
 .../indexing/common/task/CompactionTaskTest.java   |  3 +-
 .../util/common/io/smoosh/SmooshedFileMapper.java  |  6 ++-
 .../druid/math/expr/ExpressionTypeConversion.java  |  4 +-
 .../java/org/apache/druid/segment/IndexIO.java | 32 ---
 .../apache/druid/segment/SimpleQueryableIndex.java | 45 ++
 .../org/apache/druid/math/expr/OutputTypeTest.java | 28 ++
 6 files changed, 65 insertions(+), 53 deletions(-)





[GitHub] [druid] danprince1 commented on a diff in pull request #13758: add ingested intervals to task report

2023-03-31 Thread via GitHub


danprince1 commented on code in PR #13758:
URL: https://github.com/apache/druid/pull/13758#discussion_r1154889307


##
indexing-service/src/main/java/org/apache/druid/indexing/common/task/IndexTask.java:
##
@@ -534,7 +536,7 @@ public TaskStatus runTask(final TaskToolbox toolbox)
 catch (Exception e) {
   log.error(e, "Encountered exception in %s.", ingestionState);
   errorMsg = Throwables.getStackTraceAsString(e);
-      toolbox.getTaskReportFileWriter().write(getId(), getTaskCompletionReports());
+      toolbox.getTaskReportFileWriter().write(getId(), getTaskCompletionReports(Collections.emptyList()));

Review Comment:
   Good idea - done.



##
indexing-service/src/main/java/org/apache/druid/indexing/common/task/batch/parallel/MultiPhaseParallelIndexStatsReporter.java:
##
@@ -0,0 +1,102 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.indexing.common.task.batch.parallel;
+
+import org.apache.druid.indexing.common.IngestionStatsAndErrorsTaskReport;
+import org.apache.druid.indexing.common.IngestionStatsAndErrorsTaskReportData;
+import org.apache.druid.indexing.common.TaskReport;
+import org.apache.druid.java.util.common.Pair;
+import org.apache.druid.java.util.common.logger.Logger;
+import org.apache.druid.segment.incremental.ParseExceptionReport;
+import org.apache.druid.segment.incremental.RowIngestionMetersTotals;
+import org.apache.druid.segment.incremental.SimpleRowIngestionMeters;
+import org.joda.time.Interval;
+
+import java.util.ArrayList;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+
+public class MultiPhaseParallelIndexStatsReporter extends 
ParallelIndexStatsReporter
+{
+  private static final Logger LOG = new 
Logger(MultiPhaseParallelIndexStatsReporter.class);
+
+  @Override
+  ParallelIndexStats report(
+  ParallelIndexSupervisorTask task,
+  Object runner,
+  boolean includeUnparseable,
+  String full
+  )
+  {
+// use cached version if available
+ParallelIndexStats cached = task.getIndexGenerateRowStats();
+if (null != cached) {
+  return cached;
+}
+
+ParallelIndexTaskRunner currentRunner = (ParallelIndexTaskRunner) runner;
+if (!currentRunner.getName().equals("partial segment generation")) {

Review Comment:
   Good idea - done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] danprince1 commented on a diff in pull request #13758: add ingested intervals to task report

2023-03-31 Thread via GitHub


danprince1 commented on code in PR #13758:
URL: https://github.com/apache/druid/pull/13758#discussion_r1154889605


##
indexing-service/src/main/java/org/apache/druid/indexing/common/task/batch/parallel/ParallelIndexStatsReporter.java:
##
@@ -0,0 +1,142 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.indexing.common.task.batch.parallel;
+
+import com.google.common.collect.ImmutableMap;
+import org.apache.druid.indexing.common.IngestionStatsAndErrorsTaskReport;
+import org.apache.druid.indexing.common.IngestionStatsAndErrorsTaskReportData;
+import org.apache.druid.indexing.common.TaskReport;
+import org.apache.druid.java.util.common.Pair;
+import org.apache.druid.java.util.common.logger.Logger;
+import org.apache.druid.segment.incremental.ParseExceptionReport;
+import org.apache.druid.segment.incremental.RowIngestionMeters;
+import org.apache.druid.segment.incremental.RowIngestionMetersTotals;
+import org.apache.druid.segment.incremental.SimpleRowIngestionMeters;
+
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+
+public abstract class ParallelIndexStatsReporter
+{
+  private static final Logger LOG = new 
Logger(ParallelIndexStatsReporter.class);
+
+  abstract ParallelIndexStats report(
+  ParallelIndexSupervisorTask task,
+  Object runner,
+  boolean includeUnparseable,
+  String full
+  );
+
+  protected RowIngestionMetersTotals getBuildSegmentsStatsFromTaskReport(
+  Map taskReport,
+  boolean includeUnparseable,
+  List unparseableEvents
+  )
+  {
+IngestionStatsAndErrorsTaskReport ingestionStatsAndErrorsReport =
+(IngestionStatsAndErrorsTaskReport) taskReport.get(
+IngestionStatsAndErrorsTaskReport.REPORT_KEY);
+IngestionStatsAndErrorsTaskReportData reportData =
+(IngestionStatsAndErrorsTaskReportData) 
ingestionStatsAndErrorsReport.getPayload();
+RowIngestionMetersTotals totals = getTotalsFromBuildSegmentsRowStats(
+reportData.getRowStats().get(RowIngestionMeters.BUILD_SEGMENTS)
+);
+if (includeUnparseable) {
+  List taskUnparseableEvents =
+  (List) 
reportData.getUnparseableEvents().get(RowIngestionMeters.BUILD_SEGMENTS);
+  unparseableEvents.addAll(taskUnparseableEvents);
+}
+return totals;
+  }
+
+  private RowIngestionMetersTotals getTotalsFromBuildSegmentsRowStats(Object 
buildSegmentsRowStats)
+  {
+if (buildSegmentsRowStats instanceof RowIngestionMetersTotals) {
+  // This case is for unit tests. Normally when deserialized the row stats 
will appear as a Map.
+  return (RowIngestionMetersTotals) buildSegmentsRowStats;
+} else if (buildSegmentsRowStats instanceof Map) {
+  Map buildSegmentsRowStatsMap = (Map) 
buildSegmentsRowStats;
+  return new RowIngestionMetersTotals(
+  ((Number) buildSegmentsRowStatsMap.get("processed")).longValue(),
+  ((Number) 
buildSegmentsRowStatsMap.get("processedBytes")).longValue(),
+  ((Number) 
buildSegmentsRowStatsMap.get("processedWithError")).longValue(),
+  ((Number) buildSegmentsRowStatsMap.get("thrownAway")).longValue(),
+  ((Number) buildSegmentsRowStatsMap.get("unparseable")).longValue()

Review Comment:
   Good idea - done.



##
indexing-service/src/main/java/org/apache/druid/indexing/common/task/batch/parallel/ParallelIndexStatsReporter.java:
##
@@ -0,0 +1,142 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific languag

[GitHub] [druid] danprince1 commented on a diff in pull request #13758: add ingested intervals to task report

2023-03-31 Thread via GitHub


danprince1 commented on code in PR #13758:
URL: https://github.com/apache/druid/pull/13758#discussion_r1154889897


##
indexing-service/src/main/java/org/apache/druid/indexing/common/task/batch/parallel/ParallelIndexStatsReporter.java:
##
@@ -0,0 +1,142 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.indexing.common.task.batch.parallel;
+
+import com.google.common.collect.ImmutableMap;
+import org.apache.druid.indexing.common.IngestionStatsAndErrorsTaskReport;
+import org.apache.druid.indexing.common.IngestionStatsAndErrorsTaskReportData;
+import org.apache.druid.indexing.common.TaskReport;
+import org.apache.druid.java.util.common.Pair;
+import org.apache.druid.java.util.common.logger.Logger;
+import org.apache.druid.segment.incremental.ParseExceptionReport;
+import org.apache.druid.segment.incremental.RowIngestionMeters;
+import org.apache.druid.segment.incremental.RowIngestionMetersTotals;
+import org.apache.druid.segment.incremental.SimpleRowIngestionMeters;
+
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+
+public abstract class ParallelIndexStatsReporter
+{
+  private static final Logger LOG = new 
Logger(ParallelIndexStatsReporter.class);
+
+  abstract ParallelIndexStats report(
+  ParallelIndexSupervisorTask task,
+  Object runner,
+  boolean includeUnparseable,
+  String full
+  );
+
+  protected RowIngestionMetersTotals getBuildSegmentsStatsFromTaskReport(
+  Map taskReport,
+  boolean includeUnparseable,
+  List unparseableEvents
+  )
+  {
+IngestionStatsAndErrorsTaskReport ingestionStatsAndErrorsReport =
+(IngestionStatsAndErrorsTaskReport) taskReport.get(
+IngestionStatsAndErrorsTaskReport.REPORT_KEY);
+IngestionStatsAndErrorsTaskReportData reportData =
+(IngestionStatsAndErrorsTaskReportData) 
ingestionStatsAndErrorsReport.getPayload();
+RowIngestionMetersTotals totals = getTotalsFromBuildSegmentsRowStats(
+reportData.getRowStats().get(RowIngestionMeters.BUILD_SEGMENTS)
+);
+if (includeUnparseable) {
+  List taskUnparseableEvents =
+  (List) 
reportData.getUnparseableEvents().get(RowIngestionMeters.BUILD_SEGMENTS);
+  unparseableEvents.addAll(taskUnparseableEvents);
+}
+return totals;
+  }
+
+  private RowIngestionMetersTotals getTotalsFromBuildSegmentsRowStats(Object 
buildSegmentsRowStats)
+  {
+if (buildSegmentsRowStats instanceof RowIngestionMetersTotals) {
+  // This case is for unit tests. Normally when deserialized the row stats 
will appear as a Map.
+  return (RowIngestionMetersTotals) buildSegmentsRowStats;
+} else if (buildSegmentsRowStats instanceof Map) {
+  Map buildSegmentsRowStatsMap = (Map) 
buildSegmentsRowStats;
+  return new RowIngestionMetersTotals(
+  ((Number) buildSegmentsRowStatsMap.get("processed")).longValue(),
+  ((Number) 
buildSegmentsRowStatsMap.get("processedBytes")).longValue(),
+  ((Number) 
buildSegmentsRowStatsMap.get("processedWithError")).longValue(),
+  ((Number) buildSegmentsRowStatsMap.get("thrownAway")).longValue(),
+  ((Number) buildSegmentsRowStatsMap.get("unparseable")).longValue()
+  );
+} else {
+  // should never happen
+  throw new RuntimeException("Unrecognized buildSegmentsRowStats type: " + 
buildSegmentsRowStats.getClass());
+}
+  }
+
+  protected RowIngestionMetersTotals 
getRowStatsAndUnparseableEventsForRunningTasks(
+  ParallelIndexSupervisorTask task,
+  Set runningTaskIds,
+  List unparseableEvents,
+  boolean includeUnparseable
+  )
+  {
+final SimpleRowIngestionMeters buildSegmentsRowStats = new 
SimpleRowIngestionMeters();
+for (String runningTaskId : runningTaskIds) {
+  try {
+final Map report = task.fetchTaskReport(runningTaskId);
+if (report == null || report.isEmpty()) {
+  // task does not have a running report yet
+  continue;
+}
+
+Map ingestionStatsAndErrors = (Map) 
report.get("ingestionStatsAndErrors");
+Map payload = (Map) 
ingestionStatsAndErrors.get("payload");
+Map rowStats = (M

[GitHub] [druid] danprince1 commented on a diff in pull request #13758: add ingested intervals to task report

2023-03-31 Thread via GitHub


danprince1 commented on code in PR #13758:
URL: https://github.com/apache/druid/pull/13758#discussion_r1154890424


##
indexing-service/src/main/java/org/apache/druid/indexing/common/task/batch/parallel/ParallelIndexSupervisorTask.java:
##
@@ -767,7 +770,8 @@ TaskStatus runHashPartitionMultiPhaseParallel(TaskToolbox 
toolbox) throws Except
   );
   return TaskStatus.failure(getId(), errMsg);
 }
-indexGenerateRowStats = 
doGetRowStatsAndUnparseableEventsParallelMultiPhase(indexingRunner, true);
+
+indexGenerateRowStats = new 
MultiPhaseParallelIndexStatsReporter().report(this, indexingRunner, true, 
"full");

Review Comment:
   Turns out this string 'full' parameter was pretty wonky to begin with, so I 
changed it to a boolean, which seemed to make much more sense.






[GitHub] [druid] danprince1 commented on pull request #13758: add ingested intervals to task report

2023-03-31 Thread via GitHub


danprince1 commented on PR #13758:
URL: https://github.com/apache/druid/pull/13758#issuecomment-1492601211

   I ran coverage locally and added some tests until it passed, so hopefully 
that is resolved as well.





[GitHub] [druid] georgew5656 opened a new pull request, #14010: Fix issues with null pointers on jobResponse

2023-03-31 Thread via GitHub


georgew5656 opened a new pull request, #14010:
URL: https://github.com/apache/druid/pull/14010

   ### Description
   I was doing some additional testing on my change in 
https://github.com/apache/druid/pull/14001, and realized that I missed some 
logic to handle the null job in  getJobDuration. This causes a null pointer 
exception to be thrown when a job is manually shut down.
   
    Release note
   - Fix bug with manual task shutdown in the KubernetesTaskRunner
   - 
   # Key changed/added classes in this PR
   - Explicitly pass null as the job parameter to the JobResponse constructor
   - Add a log for when the job has been manually cleaned up
   - Check job null status when grabbing name for logs
   
   - [X] been self-reviewed.
  - [ ] using the [concurrency 
checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md)
 (Remove this item if the PR doesn't have any relation to concurrency.)
   - [ ] added documentation for new or modified features or behaviors.
   - [ ] a release note entry in the PR description.
   - [ ] added Javadocs for most classes and all non-trivial methods. Linked 
related entities via Javadoc links.
   - [ ] added or updated version, license, or notice information in 
[licenses.yaml](https://github.com/apache/druid/blob/master/dev/license.md)
   - [ ] added comments explaining the "why" and the intent of the code 
wherever would not be obvious for an unfamiliar reader.
   - [X] added unit tests or modified existing tests to cover new code paths, 
ensuring the threshold for [code 
coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md)
 is met.
   - [ ] added integration tests.
   - [X] been tested in a test Druid cluster.
   





[GitHub] [druid] nlippis commented on a diff in pull request #14010: Fix issues with null pointers on jobResponse

2023-03-31 Thread via GitHub


nlippis commented on code in PR #14010:
URL: https://github.com/apache/druid/pull/14010#discussion_r1154906934


##
extensions-contrib/kubernetes-overlord-extensions/src/main/java/org/apache/druid/k8s/overlord/common/DruidKubernetesPeonClient.java:
##
@@ -111,7 +111,8 @@ public JobResponse waitForJobCompletion(K8sTaskId taskId, 
long howLong, TimeUnit
   unit
   );
   if (job == null) {
-return new JobResponse(job, PeonPhase.FAILED);
+log.info("K8s job for task was not found %s", taskId);
+return new JobResponse(null, PeonPhase.FAILED);

Review Comment:
   Though this is an improvement over the current NPE, if we do this then a 
status won't ever be recorded for the task, and it may be rerun. We should 
report the status using the taskId passed into this function.






[GitHub] [druid] 317brian commented on a diff in pull request #13736: docs: sql unnest and cleanup unnest datasource

2023-03-31 Thread via GitHub


317brian commented on code in PR #13736:
URL: https://github.com/apache/druid/pull/13736#discussion_r1154920712


##
docs/querying/sql.md:
##
@@ -82,6 +83,43 @@ FROM clause, metadata tables are not considered datasources. 
They exist only in
 For more information about table, lookup, query, and join datasources, refer 
to the [Datasources](datasource.md)
 documentation.
 
+## UNNEST
+
+> The UNNEST SQL function is [experimental](../development/experimental.md). 
Its API and behavior are subject
+> to change in future releases. It is not recommended to use this feature in 
production at this time.
+
+The UNNEST clause unnests array values. It's the SQL equivalent to the [unnest 
datasource](./datasource.md#unnest). The source for UNNEST can be an array or 
an input that's been transformed into an array, such as with helper functions 
like MV_TO_ARRAY or ARRAY.
+
+The following is the general syntax for UNNEST, specifically a query that 
returns the column that gets unnested:
+
+```sql
+SELECT column_alias_name FROM datasource, UNNEST(source_expression1) AS 
table_alias_name1(column_alias_name1), UNNEST(source_expression2) AS 
table_alias_name2(column_alias_name2), ...
+```
+
+* The `datasource` for UNNEST can be any Druid datasource, such as the 
following:
+  * A table, such as  `FROM a_table`.
+  * A subset of a table based on a query, a filter, or a JOIN. For example, 
`FROM (SELECT columnA,columnB,columnC from a_table)`.
+* The `source_expression` for the UNNEST function must be an array and can 
come from any expression. If the dimension you are unnesting is a multi-value 
dimension, you have to specify `MV_TO_ARRAY(dimension)` to convert it to an 
implicit ARRAY type. You can also specify any expression that has an SQL array 
datatype. For example, you can call UNNEST on the following:
+  * `ARRAY[dim1,dim2]` if you want to make an array out of two dimensions. 
+  * `ARRAY_CONCAT(dim1,dim2)` if you want to concatenate two multi-value 
dimensions. 
+* The `AS table_alias_name(column_alias_name)` clause  is not required but is 
highly recommended. Use it to specify the output, which can be an existing 
column or a new one. Replace `table_alias_name` and `column_alias_name` with a 
table and column name you want to alias the unnested results to. If you don't 
provide this, Druid uses a nondescriptive name, such as `EXPR$0`.
+
+Keep the following in mind when writing your query:
+
+- You must include the context parameter `"enableUnnest": true`.
+- You can unnest multiple source expressions in a single query.
+- Notice the comma between the datasource and the UNNEST function. This is 
needed in most cases of the UNNEST function. Specifically, it is not needed 
when you're unnesting an inline array since the array itself is the datasource.
+- If you view the native explanation of a SQL UNNEST, you'll notice that Druid 
uses `j0.unnest` as a virtual column to perform the unnest. An underscore is 
added for each unnest, so you may notice virtual columns named `_j0.unnest` or 
`__j0.unnest`.
+

Review Comment:
   ```suggestion
   - If you view the native explanation of a SQL UNNEST, you'll notice that 
Druid uses `j0.unnest` as a virtual column to perform the unnest. An underscore 
is added for each unnest, so you may notice virtual columns named `_j0.unnest` 
or `__j0.unnest`.
   - UNNEST preserves the ordering of the source array that is being unnested
   
   ```
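
   As a hedged illustration of the UNNEST syntax described in this section (the 
table `events` and columns `page` and `tags` are hypothetical, not from this 
PR):

   ```sql
   -- Unnest a hypothetical multi-value dimension "tags" from a table "events".
   -- The query context must include "enableUnnest": true.
   SELECT page, tag
   FROM events, UNNEST(MV_TO_ARRAY(tags)) AS t(tag)
   ```

   Each row of `events` produces one output row per element of `tags`, with the 
element available under the alias `tag`.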



##
docs/querying/sql.md:
##
@@ -82,6 +83,43 @@ FROM clause, metadata tables are not considered datasources. 
They exist only in
 For more information about table, lookup, query, and join datasources, refer 
to the [Datasources](datasource.md)
 documentation.
 
+## UNNEST
+
+> The UNNEST SQL function is [experimental](../development/experimental.md). 
Its API and behavior are subject
+> to change in future releases. It is not recommended to use this feature in 
production at this time.
+
+The UNNEST clause unnests array values. It's the SQL equivalent to the [unnest 
datasource](./datasource.md#unnest). The source for UNNEST can be an array or 
an input that's been transformed into an array, such as with helper functions 
like MV_TO_ARRAY or ARRAY.
+
+The following is the general syntax for UNNEST, specifically a query that 
returns the column that gets unnested:
+
+```sql
+SELECT column_alias_name FROM datasource, UNNEST(source_expression1) AS 
table_alias_name1(column_alias_name1), UNNEST(source_expression2) AS 
table_alias_name2(column_alias_name2), ...
+```
+
+* The `datasource` for UNNEST can be any Druid datasource, such as the 
following:
+  * A table, such as  `FROM a_table`.
+  * A subset of a table based on a query, a filter, or a JOIN. For example, 
`FROM (SELECT columnA,columnB,columnC from a_table)`.
+* The `source_expression` for the UNNEST function must be an array and can 
come from any expression. If the dimension you are unnesting is a multi-value 
dimension, you have to specify `MV_TO_ARRAY(

[GitHub] [druid] 317brian commented on a diff in pull request #13736: docs: sql unnest and cleanup unnest datasource

2023-03-31 Thread via GitHub


317brian commented on code in PR #13736:
URL: https://github.com/apache/druid/pull/13736#discussion_r1154920870


##
docs/querying/sql.md:
##
@@ -82,6 +83,43 @@ FROM clause, metadata tables are not considered datasources. 
They exist only in
 For more information about table, lookup, query, and join datasources, refer 
to the [Datasources](datasource.md)
 documentation.
 
+## UNNEST
+
+> The UNNEST SQL function is [experimental](../development/experimental.md). 
Its API and behavior are subject
+> to change in future releases. It is not recommended to use this feature in 
production at this time.
+
+The UNNEST clause unnests array values. It's the SQL equivalent to the [unnest 
datasource](./datasource.md#unnest). The source for UNNEST can be an array or 
an input that's been transformed into an array, such as with helper functions 
like MV_TO_ARRAY or ARRAY.
+
+The following is the general syntax for UNNEST, specifically a query that 
returns the column that gets unnested:
+
+```sql
+SELECT column_alias_name FROM datasource, UNNEST(source_expression1) AS 
table_alias_name1(column_alias_name1), UNNEST(source_expression2) AS 
table_alias_name2(column_alias_name2), ...
+```
+
+* The `datasource` for UNNEST can be any Druid datasource, such as the 
following:
+  * A table, such as  `FROM a_table`.
+  * A subset of a table based on a query, a filter, or a JOIN. For example, 
`FROM (SELECT columnA,columnB,columnC from a_table)`.
+* The `source_expression` for the UNNEST function must be an array and can 
come from any expression. If the dimension you are unnesting is a multi-value 
dimension, you have to specify `MV_TO_ARRAY(dimension)` to convert it to an 
implicit ARRAY type. You can also specify any expression that has an SQL array 
datatype. For example, you can call UNNEST on the following:
+  * `ARRAY[dim1,dim2]` if you want to make an array out of two dimensions. 
+  * `ARRAY_CONCAT(dim1,dim2)` if you want to concatenate two multi-value 
dimensions. 
+* The `AS table_alias_name(column_alias_name)` clause  is not required but is 
highly recommended. Use it to specify the output, which can be an existing 
column or a new one. Replace `table_alias_name` and `column_alias_name` with a 
table and column name you want to alias the unnested results to. If you don't 
provide this, Druid uses a nondescriptive name, such as `EXPR$0`.
+
+Keep the following in mind when writing your query:
+
+- You must include the context parameter `"enableUnnest": true`.
+- You can unnest multiple source expressions in a single query.
+- Notice the comma between the datasource and the UNNEST function. This is 
needed in most cases of the UNNEST function. Specifically, it is not needed 
when you're unnesting an inline array since the array itself is the datasource.
+- If you view the native explanation of a SQL UNNEST, you'll notice that Druid 
uses `j0.unnest` as a virtual column to perform the unnest. An underscore is 
added for each unnest, so you may notice virtual columns named `_j0.unnest` or 
`__j0.unnest`.
+
+For examples, see the [Unnest arrays 
tutorial](../tutorials/tutorial-unnest-arrays.md).
+
+The UNNEST function has the following limitations:
+
+- The function does not remove any duplicates or nulls in an array. Nulls will 
be treated as any other value in an array. If there are multiple nulls within 
the array, a record corresponding to each of the nulls gets created.
+- Arrays inside complex JSON types are not supported.
+- You cannot perform an UNNEST at ingestion time, including SQL-based 
ingestion using the MSQ task engine.
+- UNNEST preserves the ordering in the source array that is being unnested.

Review Comment:
   ```suggestion
   ```






[GitHub] [druid] vtlim commented on a diff in pull request #13758: add ingested intervals to task report

2023-03-31 Thread via GitHub


vtlim commented on code in PR #13758:
URL: https://github.com/apache/druid/pull/13758#discussion_r1154926674


##
docs/ingestion/tasks.md:
##
@@ -98,6 +101,14 @@ For some task types, the indexing task can wait for the 
newly ingested segments
 |`segmentAvailabilityConfirmed`|Whether all segments generated by this 
ingestion task had been confirmed as available for queries in the cluster 
before the task completed.|
|`segmentAvailabilityWaitTimeMs`|Milliseconds the ingestion task waited for the 
newly ingested segments to be available for query after ingestion completed.|
 
+ Ingested Intervals

Review Comment:
   Docs look good. Only nit is to use sentence case for subheadings: `Ingested 
intervals`






[GitHub] [druid] paul-rogers opened a new pull request, #14011: Unit tests for the Python druidapi

2023-03-31 Thread via GitHub


paul-rogers opened a new pull request, #14011:
URL: https://github.com/apache/druid/pull/14011

   Adds a set of unit tests for the recently-added `druidapi` Python library 
used in notebooks and by those of us who like to use Jupyter notebooks for 
debugging. The tests are based on the Python `unittest` framework which is 
roughly similar to the Java JUnit framework. The tests are a bit light, but 
they do cover much of the existing functionality. It should be easy to add 
additional tests, such as for the basic security method in an adjacent PR. See 
the README file for details.
   
    Release note
   
   No user-visible changes.
   
   
   
   This PR has:
   
   - [X] been self-reviewed.
   - [X] a release note entry in the PR description.
   - [X] added comments explaining the "why" and the intent of the code 
wherever would not be obvious for an unfamiliar reader.
   - [x] added unit tests or modified existing tests to cover new code paths, 
ensuring the threshold for [code 
coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md)
 is met.
   - [X] been tested in a test Druid cluster.
   





[GitHub] [druid] techdocsmith commented on a diff in pull request #13868: Add docs contribution page

2023-03-31 Thread via GitHub


techdocsmith commented on code in PR #13868:
URL: https://github.com/apache/druid/pull/13868#discussion_r1154986298


##
docs/development/contributing-to-docs.md:
##
@@ -0,0 +1,139 @@
+---
+id: contributing-to-docs
+title: "How to contribute to Druid docs"
+---
+
+
+
+Druid is a [community-led project](https://druid.apache.org/community/) and we 
are delighted to receive contributions of anything from minor docs fixes to 
big new features.
+
+Druid docs contributors:
+
+* Improve existing content
+* Create new content
+
+## Getting started
+
+Druid docs contributors can open an issue about documentation, or contribute a 
change with a pull request (PR).
+
+The open source Druid docs are located here:
+https://druid.apache.org/docs/latest/design/index.html
+
+Some of the Druid docs are incorporated into the Imply docs:
+https://docs.imply.io/latest/apache-druid-doc/
+
+If you need to update a Druid doc, locate and update the doc in the Druid 
repo, following the instructions below. Once a month, we run a script to update 
the Imply docs repo with recent updates to the Druid docs.
+
+## Druid repo branches
+
+The Druid team works on Apache Druid master, and then branches to 0.17, 0.18, 
etc., and 017-iap (which stands for Imply Analytics Platform).
+
+See 
[CONTRIBUTING.md](https://github.com/apache/incubator-druid/blob/master/CONTRIBUTING.md)
 for instructions on contributing to Apache Druid.
+
+## Before you begin
+
+Before you can contribute to the Druid docs for the first time, you must 
complete the following steps:
+
+  1. Fork the [Druid repo](https://github.com/apache/druid). Your fork will be 
the ```origin``` remote.
+  2. Clone the Druid repo from your fork.
+  3. Set up the ```upstream``` remote locally in the Druid repo's ```.git/config```:
+  
+  [remote "upstream"]
+   url = https://github.com/apache/druid.git
+   fetch = +refs/heads/*:refs/remotes/upstream/*
+   pushurl = no_push
+  [branch "master"]
+   remote = upstream
+   merge = refs/heads/master
+  [remote "origin"]
+   url = https://github.com/{my-git-id}/druid.git
   fetch = +refs/heads/*:refs/remotes/origin/*
+  [branch "master"]
+   remote = origin
+   merge = refs/heads/master
+  
+
+  For ```upstream```, ```pushurl = no_push``` means you won’t accidentally push to upstream.
+  Make sure to replace {my-git-id} with your GitHub ID.
+  4. Run ```git config --list --show-origin``` to make sure your email is configured. If you need to set your email, you can set it per repo or globally. For global setup, see [setting your commit email address](https://docs.github.com/en/github-ae@latest/account-and-profile/setting-up-and-managing-your-github-user-account/managing-email-preferences/setting-your-commit-email-address#setting-your-commit-email-address-in-git).
+  5. Docusaurus?
+
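Step 4 above can be sketched as a quick shell session (the email address is a placeholder; use your own):

```shell
# Set the commit email for this clone only (placeholder address):
git config user.email "jdoe@example.com"

# Or set it once, globally, for all repositories:
# git config --global user.email "jdoe@example.com"

# Confirm the value and which config file it comes from:
git config --list --show-origin | grep user.email
```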
+## Contributing
+
+  1. On branch ```master```, fetch the latest commit:
+
+  
+  git fetch upstream
+
+  remote: Enumerating objects: 397, done.
+  remote: Counting objects: 100% (341/341), done.
+  remote: Compressing objects: 100% (181/181), done.
+  remote: Total 397 (delta 118), reused 255 (delta 98), pack-reused 56
+  Receiving objects: 100% (397/397), 266.19 KiB | 16.64 MiB/s, done.
+  Resolving deltas: 100% (118/118), completed with 44 local objects.
+  From https://github.com/apache/druid
+ 819d706082..6c9f926e3e  0.21.0   -> upstream/0.21.0
+ 8a3be6bccc..84aac4832d  master   -> upstream/master
+   * [new tag]   druid-0.21.0 -> druid-0.21.0
+
+  ➜ git reset --hard upstream/master
+  HEAD is now at 84aac4832d Add feature to automatically remove rules based on 
retention period (#11164)
+  
+
+  Now you're up to date.
+
+  2. Create your working branch:
+  
+  git checkout -b my-work
+  
+  3. Make your changes, add, and commit:
+  
+  git add my-change.md
+  git commit -m "i made some changes"
+  
+  4. Test changes locally.
+  5. Push your changes to your fork: ```origin```
+  
+  git push --set-upstream origin my-work
+  
+  6. Go to GitHub for the Druid repo. It should realize you have a new branch in your fork. Create a pull request from your fork's ```my-work``` branch to ```master``` in the Druid repo.
+
+## Style guide
+
+Before publishing new content or updating an existing topic, audit your 
documentation using this checklist to make sure your contributions align with 
existing documentation.
+
+* Use descriptive link text. If a link downloads a file, make sure to indicate 
this action
+* Use present tense where possible
+* Avoid negative constructions when possible
+* Use clear and direct language
+* Use descriptive headings and titles
+* Avoid using a present participle or gerund as the first word in a heading or 
title
+* Use sentence case in document titles and headings
+* Don’t use images of text or code samples
+* Use SVG over PNG if available
+* Provide an equivalent text explanation with each image
+* Use the appr

[GitHub] [druid] 317brian opened a new pull request, #14012: docs: copyedits for MSQ join algos

2023-03-31 Thread via GitHub


317brian opened a new pull request, #14012:
URL: https://github.com/apache/druid/pull/14012

   Copyedits for the MSQ reference docs for the two join types
   
    Release note
   
   N/a
   
   
   
   
   
   
   
   This PR has:
   
   - [X] been self-reviewed.
   





[GitHub] [druid] abhishekagarwal87 commented on pull request #13976: Fixing regression issues on unnest

2023-03-31 Thread via GitHub


abhishekagarwal87 commented on PR #13976:
URL: https://github.com/apache/druid/pull/13976#issuecomment-1492853961

   @cryptoe - Bugs usually don't need release notes, and UNNEST wasn't available in previous releases anyway. 





[GitHub] [druid] tejaswini-imply opened a new pull request, #14013: Remove duplicate trigger in Cron Job ITs workflow

2023-03-31 Thread via GitHub


tejaswini-imply opened a new pull request, #14013:
URL: https://github.com/apache/druid/pull/14013

   This 
[commit](https://github.com/apache/druid/commit/3c096c01a2c4554b6f107627fb55755b4f2a6cb0)
 added a duplicate pull-request trigger to the `Cron Job ITs` workflow. This change removes the duplicate, which was preventing the `Cron Job ITs` workflow from starting.

