Blizzara commented on code in PR #10646: URL: https://github.com/apache/datafusion/pull/10646#discussion_r1618379915
########## datafusion/substrait/src/logical_plan/consumer.rs: ########## @@ -1160,6 +1162,24 @@ pub(crate) fn from_substrait_type(dt: &substrait::proto::Type) -> Result<DataTyp "Unsupported Substrait type variation {v} of type {s_kind:?}" ), }, + r#type::Kind::UserDefined(u) => { + match u.type_reference { + INTERVAL_YEAR_MONTH_TYPE_REF => { Review Comment: I see, actually I think both references (type_variation_reference and type_reference) have the same problem - I hadn't realized it affects the type_variation_reference too. Now, if each system defines the type/variation references as consts, then plans will look compatible, but may have different meaning (nothing tells another Substrait producer/consumer that type_variation_reference = 1 on a list means a large list, for example). I believe both should be defined as SimpleExtension files, following the [schema](https://github.com/substrait-io/substrait/blob/4f5b4ac4d473c9f03f30f86eca080073d99ef1c7/text/simple_extensions_schema.yaml#L17) (like here https://github.com/apache/arrow/blob/main/format/substrait/extension_types.yaml) rather than as constants (kinda what https://github.com/apache/datafusion/blob/09054263df1d8de06b6e77ab0cbd99027bb7ceb6/datafusion/substrait/src/variation_const.rs#L21 already says 😅). And then (I think) in the plan itself, there'd be instances of the SimpleExtensionDeclaration messages (https://github.com/substrait-io/substrait/blob/4f5b4ac4d473c9f03f30f86eca080073d99ef1c7/proto/substrait/extensions/extensions.proto#L39), and the type_reference and type_variation_reference would link to the type_anchor and type_variation_anchor, rather than to the hard-coded constants. That way one can (in theory, at least) teach another Substrait producer/consumer about the DataFusion extensions and keep plans compatible, or at least the systems will recognize that the plans are incompatible as it refers to extension URIs that the other system doesn't know about. Does that make sense? I'm not 100% sure of anything I'm saying here, so I may as be understanding something wrong. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org