clintropolis commented on code in PR #14956: URL: https://github.com/apache/druid/pull/14956#discussion_r1343104541
########## docs/querying/math-expr.md: ########## @@ -82,7 +82,7 @@ The following built-in functions are available. |concat|concat(expr, expr...) concatenate a list of strings| |format|format(pattern[, args...]) returns a string formatted in the manner of Java's [String.format](https://docs.oracle.com/javase/8/docs/api/java/lang/String.html#format-java.lang.String-java.lang.Object...-).| |like|like(expr, pattern[, escape]) is equivalent to SQL `expr LIKE pattern`| -|lookup|lookup(expr, lookup-name) looks up expr in a registered [query-time lookup](../querying/lookups.md)| +|lookup|lookup(expr, lookup-name[,replaceMissingValueWith]) looks up expr in a registered [query-time lookup](../querying/lookups.md)| Review Comment: nit: should probably specify `replaceMissingValueWith` must be a constant ########## docs/querying/sql-scalar.md: ########## @@ -98,7 +98,7 @@ String functions accept strings, and return a type appropriate to the function. |`CHAR_LENGTH(expr)`|Alias for `LENGTH`.| |`CHARACTER_LENGTH(expr)`|Alias for `LENGTH`.| |`STRLEN(expr)`|Alias for `LENGTH`.| -|`LOOKUP(expr, lookupName)`|Look up `expr` in a registered [query-time lookup table](lookups.md). Note that lookups can also be queried directly using the [`lookup` schema](sql.md#from).| +|`LOOKUP(expr, lookupName, [replaceMissingValueWith])`|Look up `expr` in a registered [query-time lookup table](lookups.md). Note that lookups can also be queried directly using the [`lookup` schema](sql.md#from). Optional constant replaceMissingValueWith can be passed as 3rd argument to be returned when value is missing from lookup.| Review Comment: nit: ```suggestion |`LOOKUP(expr, lookupName[, replaceMissingValueWith])`|Look up `expr` in a registered [query-time lookup table](lookups.md). Note that lookups can also be queried directly using the [`lookup` schema](sql.md#from). Optional constant `replaceMissingValueWith` can be passed as the 3rd argument to be returned when value is missing from lookup.| ``` ########## processing/src/main/java/org/apache/druid/query/expression/LookupExprMacro.java: ########## @@ -69,7 +70,9 @@ public Expr apply(final List<Expr> args) lookupExtractorFactoryContainerProvider, lookupName, false, - null, + replaceMissingValueWith != null && replaceMissingValueWith.isLiteral() + ? (String) replaceMissingValueWith.getLiteralValue() Review Comment: I suppose since other types can also be literals it might be safer to use `Evals.asString` instead of casting ########## sql/src/main/java/org/apache/druid/sql/calcite/expression/builtin/QueryLookupOperatorConversion.java: ########## @@ -91,4 +104,18 @@ public DruidExpression toDruidExpression( } ); } + + private String getReplaceMissingValueWith( + final List<DruidExpression> inputExpressions, + final PlannerContext plannerContext + ) + { + if (inputExpressions.size() > 2) { + final Expr missingValExpr = plannerContext.parseExpression(inputExpressions.get(2).getExpression()); + if (missingValExpr.isLiteral()) { + return (String) missingValExpr.getLiteralValue(); Review Comment: otoh, this one is probably fine as is and doesn't need to use `Evals.asString` like the native layer should since the operand type checker specifies the 3rd argument as character type, though it would also probably be harmless ########## sql/src/test/java/org/apache/druid/sql/calcite/CalciteQueryTest.java: ########## @@ -8967,6 +8967,65 @@ public void testFilterAndGroupByLookup() ) ); } + @Test + public void testLookupReplaceMissingValueWith() + { + // Cannot vectorize due to extraction dimension specs. + cannotVectorize(); + + final RegisteredLookupExtractionFn extractionFn1 = new RegisteredLookupExtractionFn( + null, + "lookyloo", + false, + "Missing_Value", + null, + true + ); + final RegisteredLookupExtractionFn extractionFnRMVNull = new RegisteredLookupExtractionFn( + null, + "lookyloo", + false, + null, + null, + true + ); + testQuery( + "SELECT LOOKUP(dim1, 'lookyloo', 'Missing_Value'), LOOKUP(dim1, 'lookyloo', null) as rmvNull, COUNT(*) FROM foo group by 1,2", + ImmutableList.of( + GroupByQuery.builder() + .setDataSource(CalciteTests.DATASOURCE1) + .setInterval(querySegmentSpec(Filtration.eternity())) + .setGranularity(Granularities.ALL) + .setDimensions( + dimensions( + new ExtractionDimensionSpec( + "dim1", + "d0", + ColumnType.STRING, + extractionFn1 + ), + new ExtractionDimensionSpec( + "dim1", + "d1", + ColumnType.STRING, + extractionFnRMVNull + ) + ) + ) + .setAggregatorSpecs( + aggregators( + new CountAggregatorFactory("a0") + ) + ) + .setContext(QUERY_CONTEXT_DEFAULT) + .build() + ), + ImmutableList.of( + new Object[]{"Missing_Value", NullHandling.sqlCompatible() ? null : "", 5L}, Review Comment: nit: can use `NullHandling.defaultStringValue()` ########## docs/querying/sql-functions.md: ########## @@ -885,7 +885,7 @@ Calculates the base-10 of the numeric expression. ## LOOKUP -`LOOKUP(<CHARACTER>, <CHARACTER>)` +`LOOKUP(<CHARACTER>, <CHARACTER>, [<CHARACTER>])` Review Comment: nit: ```suggestion `LOOKUP(<CHARACTER>, <CHARACTER>[, <CHARACTER>])` ``` although looking at this page, this isn't actually consistent across all functions... i think this form makes more sense because the comma wouldn't be there unless the argument is there, but :shrug: -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
