westonpace commented on a change in pull request #9323:
URL: https://github.com/apache/arrow/pull/9323#discussion_r575451640



##########
File path: cpp/src/arrow/dataset/partition.cc
##########
@@ -74,15 +74,26 @@ Status KeyValuePartitioning::SetDefaultValuesFromKeys(const Expression& expr,
                                                       RecordBatchProjector* projector) {
   ARROW_ASSIGN_OR_RAISE(auto known_values, ExtractKnownFieldValues(expr));
   for (const auto& ref_value : known_values) {
-    if (!ref_value.second.is_scalar()) {
-      return Status::Invalid("non-scalar partition key ", ref_value.second.ToString());
+    const auto& known_value = ref_value.second;
+    if (known_value.concrete() && !known_value.datum.is_scalar()) {
+      return Status::Invalid("non-scalar partition key ", known_value.datum.ToString());
     }
 
     ARROW_ASSIGN_OR_RAISE(auto match,
                           ref_value.first.FindOneOrNone(*projector->schema()));
 
     if (match.empty()) continue;
-    RETURN_NOT_OK(projector->SetDefaultValue(match, ref_value.second.scalar()));
+
+    const auto& field = projector->schema()->field(match[0]);
+    if (known_value.concrete()) {
+      RETURN_NOT_OK(projector->SetDefaultValue(match, known_value.datum.scalar()));
+    } else if (known_value.valid) {
+      return Status::Invalid(
+          "Partition expression not defined enough to set default value for ",

Review comment:
       Today we only reach this state if the expression is something like 
`pa.field("a").is_valid()`, although technically the same fact could be 
inferred from expressions like `pa.field("a") > 7` as well.  Any expression 
that evaluates to true only when the field is non-null gives us at least a hint 
that the value will be non-null.
   
   On second review, though, it's probably better to just `continue` in this 
case rather than return an error: knowing only that the field is non-null does 
not give us a value to use as the default.
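
   To make the suggestion concrete, here is a minimal, self-contained sketch of 
the `continue` behaviour. `KnownValue` and `CollectDefaults` are hypothetical 
stand-ins that I made up for illustration (an `int` replaces the datum and a 
plain map replaces the projector); this is not Arrow's actual API, only the 
control flow I have in mind.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for the PR's known-value type; not Arrow's real API.
struct KnownValue {
  bool valid = false;        // true if the field is known to be non-null
  std::optional<int> datum;  // set only when the exact value is known
  bool concrete() const { return datum.has_value(); }
};

// Collects the default values that can actually be set. A field that is only
// known to be non-null (for example from an is_valid() expression) is skipped
// instead of producing an error.
std::map<std::string, int> CollectDefaults(
    const std::map<std::string, KnownValue>& known_values,
    const std::vector<std::string>& schema_fields) {
  std::map<std::string, int> defaults;
  for (const auto& ref_value : known_values) {
    const KnownValue& known_value = ref_value.second;

    // Mirrors `if (match.empty()) continue;` for fields not in the schema.
    bool in_schema = std::find(schema_fields.begin(), schema_fields.end(),
                               ref_value.first) != schema_fields.end();
    if (!in_schema) continue;

    if (known_value.concrete()) {
      defaults[ref_value.first] = *known_value.datum;
    } else if (known_value.valid) {
      // Non-null but unknown value: nothing to use as a default, so move on.
      continue;
    }
    // Any other state (for example a field known to be null) is outside the
    // scope of this sketch.
  }
  return defaults;
}
```

   With this shape, a partition expression that only says "a is valid" simply 
contributes no default for `a`, and the remaining fields are still processed.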





