westonpace commented on a change in pull request #9323:
URL: https://github.com/apache/arrow/pull/9323#discussion_r575451640
##########
File path: cpp/src/arrow/dataset/partition.cc
##########
@@ -74,15 +74,26 @@ Status KeyValuePartitioning::SetDefaultValuesFromKeys(const Expression& expr,
                                                      RecordBatchProjector* projector) {
   ARROW_ASSIGN_OR_RAISE(auto known_values, ExtractKnownFieldValues(expr));
   for (const auto& ref_value : known_values) {
-    if (!ref_value.second.is_scalar()) {
-      return Status::Invalid("non-scalar partition key ", ref_value.second.ToString());
+    const auto& known_value = ref_value.second;
+    if (known_value.concrete() && !known_value.datum.is_scalar()) {
+      return Status::Invalid("non-scalar partition key ", known_value.datum.ToString());
     }
     ARROW_ASSIGN_OR_RAISE(auto match,
                           ref_value.first.FindOneOrNone(*projector->schema()));
     if (match.empty()) continue;
-    RETURN_NOT_OK(projector->SetDefaultValue(match, ref_value.second.scalar()));
+
+    const auto& field = projector->schema()->field(match[0]);
+    if (known_value.concrete()) {
+      RETURN_NOT_OK(projector->SetDefaultValue(match, known_value.datum.scalar()));
+    } else if (known_value.valid) {
+      return Status::Invalid(
+          "Partition expression not defined enough to set default value for ",
Review comment:
Today we only reach this state if the expression is something like
`pa.field("a").is_valid()`, although technically the same thing could be
inferred from expressions like `pa.field("a") > 7` as well. Any expression that
only evaluates to true when the field is non-null gives us at least a hint that
the result will be non-null.

Although, on second review, it's probably better to just `continue` in this
case rather than return an error.
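
A rough sketch of the `continue` behavior, using simplified stand-ins rather than the actual Arrow types. `KnownValue` with an `int` datum and `CollectDefaults` are hypothetical names for illustration; only `concrete()` and `valid` mirror the names in the diff above:

```cpp
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-in for what the partition expression tells us about a field.
struct KnownValue {
  std::optional<int> datum;  // set when the expression pins the field to one value
  bool valid = false;        // true when the expression implies the field is non-null

  bool concrete() const { return datum.has_value(); }
};

// Collects a default for every field with a concrete value and skips the rest.
std::map<std::string, int> CollectDefaults(
    const std::map<std::string, KnownValue>& known_values) {
  std::map<std::string, int> defaults;
  for (const auto& [name, known_value] : known_values) {
    if (!known_value.concrete()) {
      // e.g. field("a").is_valid(): we know the field is non-null but have no
      // value to project, so skip it instead of returning an error.
      continue;
    }
    defaults[name] = *known_value.datum;
  }
  return defaults;
}
```

With this shape, a field that is only known to be valid simply gets no default, and the remaining keys are still processed.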