Blizzara commented on code in PR #10646:
URL: https://github.com/apache/datafusion/pull/10646#discussion_r1618379915


##########
datafusion/substrait/src/logical_plan/consumer.rs:
##########
@@ -1160,6 +1162,24 @@ pub(crate) fn from_substrait_type(dt: 
&substrait::proto::Type) -> Result<DataTyp
                     "Unsupported Substrait type variation {v} of type 
{s_kind:?}"
                 ),
             },
+            r#type::Kind::UserDefined(u) => {
+                match u.type_reference {
+                    INTERVAL_YEAR_MONTH_TYPE_REF => {

Review Comment:
   I see, actually I think both references (type_variation_reference and 
type_reference) have the same problem - I hadn't realized it affects the 
type_variation_reference too. Now, if each system defines the type/variation 
references as consts, then plans will look compatible, but may have different 
meaning (nothing tells another Substrait producer/consumer that 
type_variation_reference = 1 on a list means a large list, for example).
   
   I believe both should be defined as SimpleExtension files, following the 
[schema](https://github.com/substrait-io/substrait/blob/4f5b4ac4d473c9f03f30f86eca080073d99ef1c7/text/simple_extensions_schema.yaml#L17)
 (like here 
https://github.com/apache/arrow/blob/main/format/substrait/extension_types.yaml)
 rather than as constants (kinda what 
https://github.com/apache/datafusion/blob/09054263df1d8de06b6e77ab0cbd99027bb7ceb6/datafusion/substrait/src/variation_const.rs#L21
 already says 😅). And then (I think) in the plan itself, there'd be instances 
of the SimpleExtensionDeclaration messages 
(https://github.com/substrait-io/substrait/blob/4f5b4ac4d473c9f03f30f86eca080073d99ef1c7/proto/substrait/extensions/extensions.proto#L39),
 and the type_reference and type_variation_reference would link to the 
type_anchor and type_variation_anchor, rather than to the hard-coded constants. 
   
   That way one can (in theory, at least) teach another Substrait 
producer/consumer about the DataFusion extensions and keep plans compatible, or 
at least the systems will recognize that the plans are incompatible as it 
refers to extension URIs that the other system doesn't know about.
   
   Does that make sense? I'm not 100% sure of anything I'm saying here, so I 
may as be understanding something wrong.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to