clintropolis commented on code in PR #14956:
URL: https://github.com/apache/druid/pull/14956#discussion_r1343104541


##########
docs/querying/math-expr.md:
##########
@@ -82,7 +82,7 @@ The following built-in functions are available.
 |concat|concat(expr, expr...) concatenate a list of strings|
 |format|format(pattern[, args...]) returns a string formatted in the manner of 
Java's 
[String.format](https://docs.oracle.com/javase/8/docs/api/java/lang/String.html#format-java.lang.String-java.lang.Object...-).|
 |like|like(expr, pattern[, escape]) is equivalent to SQL `expr LIKE pattern`|
-|lookup|lookup(expr, lookup-name) looks up expr in a registered [query-time 
lookup](../querying/lookups.md)|
+|lookup|lookup(expr, lookup-name[,replaceMissingValueWith]) looks up expr in a 
registered [query-time lookup](../querying/lookups.md)|

Review Comment:
   nit: should probably specify `replaceMissingValueWith` must be a constant



##########
docs/querying/sql-scalar.md:
##########
@@ -98,7 +98,7 @@ String functions accept strings, and return a type 
appropriate to the function.
 |`CHAR_LENGTH(expr)`|Alias for `LENGTH`.|
 |`CHARACTER_LENGTH(expr)`|Alias for `LENGTH`.|
 |`STRLEN(expr)`|Alias for `LENGTH`.|
-|`LOOKUP(expr, lookupName)`|Look up `expr` in a registered [query-time lookup 
table](lookups.md). Note that lookups can also be queried directly using the 
[`lookup` schema](sql.md#from).|
+|`LOOKUP(expr, lookupName, [replaceMissingValueWith])`|Look up `expr` in a 
registered [query-time lookup table](lookups.md). Note that lookups can also be 
queried directly using the [`lookup` schema](sql.md#from). Optional constant 
replaceMissingValueWith can be passed as 3rd argument to be returned when value 
is missing from lookup.|

Review Comment:
   nit:
   
   ```suggestion
   |`LOOKUP(expr, lookupName[, replaceMissingValueWith])`|Look up `expr` in a 
registered [query-time lookup table](lookups.md). Note that lookups can also be 
queried directly using the [`lookup` schema](sql.md#from). Optional constant 
`replaceMissingValueWith` can be passed as the 3rd argument to be returned when 
value is missing from lookup.|
   ```



##########
processing/src/main/java/org/apache/druid/query/expression/LookupExprMacro.java:
##########
@@ -69,7 +70,9 @@ public Expr apply(final List<Expr> args)
         lookupExtractorFactoryContainerProvider,
         lookupName,
         false,
-        null,
+        replaceMissingValueWith != null && replaceMissingValueWith.isLiteral()
+        ? (String) replaceMissingValueWith.getLiteralValue()

Review Comment:
   I suppose since other types can also be literals it might be safer to use 
`Evals.asString` instead of casting



##########
sql/src/main/java/org/apache/druid/sql/calcite/expression/builtin/QueryLookupOperatorConversion.java:
##########
@@ -91,4 +104,18 @@ public DruidExpression toDruidExpression(
         }
     );
   }
+
+  private String getReplaceMissingValueWith(
+      final List<DruidExpression> inputExpressions,
+      final PlannerContext plannerContext
+  )
+  {
+    if (inputExpressions.size() > 2) {
+      final Expr missingValExpr = 
plannerContext.parseExpression(inputExpressions.get(2).getExpression());
+      if (missingValExpr.isLiteral()) {
+        return (String) missingValExpr.getLiteralValue();

Review Comment:
   otoh, this one is probably fine as is and doesn't need to use 
`Evals.asString` like the native layer should since the operand type checker 
specifies the 3rd argument as character type, though it would also probably be 
harmless



##########
sql/src/test/java/org/apache/druid/sql/calcite/CalciteQueryTest.java:
##########
@@ -8967,6 +8967,65 @@ public void testFilterAndGroupByLookup()
         )
     );
   }
+  @Test
+  public void testLookupReplaceMissingValueWith()
+  {
+    // Cannot vectorize due to extraction dimension specs.
+    cannotVectorize();
+
+    final RegisteredLookupExtractionFn extractionFn1 = new 
RegisteredLookupExtractionFn(
+        null,
+        "lookyloo",
+        false,
+        "Missing_Value",
+        null,
+        true
+    );
+    final RegisteredLookupExtractionFn extractionFnRMVNull = new 
RegisteredLookupExtractionFn(
+        null,
+        "lookyloo",
+        false,
+        null,
+        null,
+        true
+    );
+    testQuery(
+        "SELECT LOOKUP(dim1, 'lookyloo', 'Missing_Value'), LOOKUP(dim1, 
'lookyloo', null) as rmvNull, COUNT(*) FROM foo group by 1,2",
+        ImmutableList.of(
+            GroupByQuery.builder()
+                        .setDataSource(CalciteTests.DATASOURCE1)
+                        .setInterval(querySegmentSpec(Filtration.eternity()))
+                        .setGranularity(Granularities.ALL)
+                        .setDimensions(
+                            dimensions(
+                                new ExtractionDimensionSpec(
+                                    "dim1",
+                                    "d0",
+                                    ColumnType.STRING,
+                                    extractionFn1
+                                ),
+                                new ExtractionDimensionSpec(
+                                    "dim1",
+                                    "d1",
+                                    ColumnType.STRING,
+                                    extractionFnRMVNull
+                                )
+                            )
+                        )
+                        .setAggregatorSpecs(
+                            aggregators(
+                                new CountAggregatorFactory("a0")
+                            )
+                        )
+                        .setContext(QUERY_CONTEXT_DEFAULT)
+                        .build()
+        ),
+        ImmutableList.of(
+            new Object[]{"Missing_Value", NullHandling.sqlCompatible() ? null 
: "", 5L},

Review Comment:
   nit: can use `NullHandling.defaultStringValue()`



##########
docs/querying/sql-functions.md:
##########
@@ -885,7 +885,7 @@ Calculates the base-10 of the numeric expression.
 
 ## LOOKUP
 
-`LOOKUP(<CHARACTER>, <CHARACTER>)`
+`LOOKUP(<CHARACTER>, <CHARACTER>, [<CHARACTER>])`

Review Comment:
   nit: 
   ```suggestion
   `LOOKUP(<CHARACTER>, <CHARACTER>[, <CHARACTER>])`
   ```
   
   although looking at this page, this isn't actually consistent across all 
functions... i think this form makes more sense because the comma wouldn't be 
there unless the argument is there, but :shrug:



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to