kosiew commented on code in PR #20369:
URL: https://github.com/apache/datafusion/pull/20369#discussion_r2888965206
##########
datafusion/functions-table/src/generate_series.rs:
##########
@@ -238,100 +243,213 @@ impl GenerateSeriesTable {
Self { schema, args }
}
+ pub fn as_partition(&self, batch_size: usize) -> Result<Arc<dyn
LazyPartition>> {
Review Comment:
`GenerateSeriesTable::as_partition` currently constructs
`GenerateSeriesPartition` without validating timestamp timezone input.
Previously `as_generator` parsed timezone eagerly and could fail during
planning. Consider validating in `GenerateSeriesPartition::new` (or a
`try_new`) to keep error timing consistent and fail earlier.
##########
datafusion/physical-plan/src/memory.rs:
##########
@@ -246,17 +379,24 @@ impl LazyMemoryExec {
.add_orderings(std::iter::once(ordering));
}
+ /// Get the partitions.
+ pub fn partitions(&self) -> &[Arc<dyn LazyPartition>] {
+ &self.partitions
+ }
+
/// Get the batch generators
+ #[deprecated(note = "Use LazyMemoryExec::partitions instead")]
+ #[expect(deprecated)]
pub fn generators(&self) -> &Vec<Arc<RwLock<dyn LazyBatchGenerator>>> {
Review Comment:
Not the fault of this PR but
`generators()` returns `&Vec<_>`. Since this is a compatibility accessor,
returning a slice (`&[_]`) would avoid exposing container type and align with
other accessor style.
##########
datafusion/physical-plan/src/memory.rs:
##########
@@ -228,11 +361,11 @@ impl LazyMemoryExec {
pub fn try_set_partitioning(&mut self, partitioning: Partitioning) ->
Result<()> {
let partition_count = partitioning.partition_count();
- let generator_count = self.batch_generators.len();
+ let generator_count = self.partitions.len();
Review Comment:
`generator_count` although this now counts partitions?
##########
datafusion/physical-plan/src/memory.rs:
##########
@@ -268,10 +408,10 @@ impl DisplayAs for LazyMemoryExec {
write!(
f,
"LazyMemoryExec: partitions={}, batch_generators=[{}]",
Review Comment:
`DisplayAs` still prints `batch_generators=` even though the primary
abstraction is now `partitions`?
##########
datafusion/proto/src/physical_plan/mod.rs:
##########
@@ -3451,22 +3457,118 @@ impl protobuf::PhysicalPlanNode {
}
}
+ fn serialize_generate_series_args(
+ args: &GenSeriesArgs,
+ ) -> Result<protobuf::generate_series_node::Args> {
+ match args {
+ GenSeriesArgs::ContainsNull { name } => {
+ Ok(protobuf::generate_series_node::Args::ContainsNull(
+ protobuf::GenerateSeriesArgsContainsNull {
+ name: Self::str_to_generate_series_name(name)? as i32,
+ },
+ ))
+ }
+ GenSeriesArgs::Int64Args {
+ start,
+ end,
+ step,
+ include_end,
+ name,
+ } => Ok(protobuf::generate_series_node::Args::Int64Args(
+ protobuf::GenerateSeriesArgsInt64 {
+ start: *start,
+ end: *end,
+ step: *step,
+ include_end: *include_end,
+ name: Self::str_to_generate_series_name(name)? as i32,
+ },
+ )),
+ GenSeriesArgs::TimestampArgs {
+ start,
+ end,
+ step,
+ tz,
+ include_end,
+ name,
+ } => Ok(protobuf::generate_series_node::Args::TimestampArgs(
+ protobuf::GenerateSeriesArgsTimestamp {
+ start: *start,
+ end: *end,
+ step:
Some(datafusion_proto_common::IntervalMonthDayNanoValue {
+ months: step.months,
+ days: step.days,
+ nanos: step.nanoseconds,
+ }),
+ include_end: *include_end,
+ name: Self::str_to_generate_series_name(name)? as i32,
+ tz: tz.as_ref().map(ToString::to_string),
+ },
+ )),
+ GenSeriesArgs::DateArgs {
+ start,
+ end,
+ step,
+ include_end,
+ name,
+ } => Ok(protobuf::generate_series_node::Args::DateArgs(
+ protobuf::GenerateSeriesArgsDate {
+ start: *start,
+ end: *end,
+ step:
Some(datafusion_proto_common::IntervalMonthDayNanoValue {
+ months: step.months,
+ days: step.days,
+ nanos: step.nanoseconds,
+ }),
+ include_end: *include_end,
+ name: Self::str_to_generate_series_name(name)? as i32,
+ },
+ )),
+ }
+ }
+
+ #[allow(deprecated)]
fn try_from_lazy_memory_exec(exec: &LazyMemoryExec) ->
Result<Option<Self>> {
Review Comment:
The native `GenerateSeriesPartition` path uses
`serialize_generate_series_args`, but legacy generator downcasts still manually
reconstruct protobuf args.
Could we extract a helper that converts legacy generator variants into
`GenSeriesArgs` first, then reuse `serialize_generate_series_args` to reduce
duplicated conversion logic?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]