Re: Polymorphism in DataFusion

2020-08-21 Thread Andy Grove
Yes, this matches my understanding as well. Thanks, Jorge.
On Fri, Aug 21, 2020 at 4:11 PM Andrew Lamb wrote:
> Thanks for writing that up Jorge -- I read the documents and left some
> comments, but in general I would say this matches my personal understanding
> of the design of DataFusion and

Re: Polymorphism in DataFusion

2020-08-21 Thread Andrew Lamb
Thanks for writing that up Jorge -- I read the documents and left some comments, but in general I would say this matches my personal understanding of the design of DataFusion and where I think it should head.
On Fri, Aug 21, 2020 at 4:41 PM Jorge Cardoso Leitão <jorgecarlei...@gmail.com> wrote:

Re: Polymorphism in DataFusion

2020-08-21 Thread Jorge Cardoso Leitão
Hi everyone, Got it. I agree that we should aim for a proposal. As an exercise, I wrote some personal notes about DataFusion's notions and invariants, as they form the basis for any proposal. I would

Re: Polymorphism in DataFusion

2020-08-19 Thread Andrew Lamb
I think B) is closer to what I was thinking. We may be using the terms "statically typed" and "dynamically typed" a little differently -- I am sorry for the confusion. I have somewhat lost track of exactly what we are proposing, and for that I apologize. I propose a next step of sketching out a proposed API

Re: Polymorphism in DataFusion

2020-08-19 Thread Jorge Cardoso Leitão
Hi, Thank you for this enlightening discussion, Andrew! So, just to make sure I understood, are you proposing A), B) or something else? A) we should not accept / declare polymorphic operations: all types should be known based on the operation name (e.g. sum_f32, plus_f32, etc.) B) we should

Re: Polymorphism in DataFusion

2020-08-18 Thread Andrew Lamb
It is my personal opinion that actual UDF functions registered with DataFusion should take a known set of input types and a single return type (e.g. sum_i32 --> i32). I think this would: 1. Simplify the implementation of both the DataFusion optimizer and the UDFs 2. Make it easier for UDF writers
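A minimal sketch of the "known input types, single return type" idea, for illustration only. The `DataType`, `ScalarUdf`, and `check_call` names are hypothetical and are not DataFusion's actual API; the only name taken from the message is `sum_i32`.

```rust
// Hypothetical types for illustration; these are NOT DataFusion's API.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum DataType {
    Int32,
    Float32,
}

/// A UDF with exactly one signature: fixed argument types, one return type.
pub struct ScalarUdf {
    pub name: &'static str,
    pub arg_types: Vec<DataType>,
    pub return_type: DataType,
}

/// Planning-time check: the call is valid only if the argument types match
/// the declared signature exactly, so the return type is always known.
pub fn check_call(udf: &ScalarUdf, args: &[DataType]) -> Result<DataType, String> {
    if udf.arg_types.as_slice() == args {
        Ok(udf.return_type)
    } else {
        Err(format!(
            "{} expects {:?}, got {:?}",
            udf.name, udf.arg_types, args
        ))
    }
}

/// The `sum_i32 --> i32` example from the message, as a fixed signature.
pub fn sum_i32() -> ScalarUdf {
    ScalarUdf {
        name: "sum_i32",
        arg_types: vec![DataType::Int32],
        return_type: DataType::Int32,
    }
}
```

Under these assumptions, `check_call(&sum_i32(), &[DataType::Int32])` yields `Ok(DataType::Int32)`, while passing `DataType::Float32` is rejected before execution starts.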

Re: Polymorphism in DataFusion

2020-08-17 Thread Jorge Cardoso Leitão
Thanks Andrew, I am not sure I articulated this well enough, though, as I did not specify the type of polymorphism that I was thinking about. xD My question was/is about whether we should accept functions whose return type is known during planning, and constant during execution, or whether their
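A rough sketch of the first case described here, where the return type is resolved once during planning and stays constant during execution. Everything in it (`DataType`, `ReturnTypeFn`, `sqrt_return_type`, `PlannedCall`, `plan_call`) is a hypothetical name chosen for illustration, not DataFusion's actual API.

```rust
// Hypothetical types for illustration; these are NOT DataFusion's API.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum DataType {
    Int32,
    Float32,
    Float64,
}

/// The return type is a function of the argument types, evaluated at planning time.
pub type ReturnTypeFn = fn(&[DataType]) -> Result<DataType, String>;

/// Example resolver: a hypothetical `sqrt` that returns the same float type it receives.
pub fn sqrt_return_type(args: &[DataType]) -> Result<DataType, String> {
    match args {
        [DataType::Float32] => Ok(DataType::Float32),
        [DataType::Float64] => Ok(DataType::Float64),
        other => Err(format!("sqrt does not accept {:?}", other)),
    }
}

/// A call whose return type has been fixed by the planner.
pub struct PlannedCall {
    pub name: &'static str,
    pub return_type: DataType,
}

/// Resolve the return type once; execution can then rely on it without re-checking.
pub fn plan_call(
    name: &'static str,
    resolve: ReturnTypeFn,
    args: &[DataType],
) -> Result<PlannedCall, String> {
    Ok(PlannedCall {
        name,
        return_type: resolve(args)?,
    })
}
```

With this shape, one function name can serve several input types, yet each planned call still has exactly one return type that is known before any data is processed.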

Re: Polymorphism in DataFusion

2020-08-17 Thread Andrew Lamb
I suggest we do not continue down the path of (runtime) polymorphic functions unless a compelling use case for them can be articulated. You have done a great job articulating some of the implementation challenges, but I personally struggle to describe when, as a user of DataFusion,

Polymorphism in DataFusion

2020-08-17 Thread Jorge Cardoso Leitão
Hi, Recently, I have been contributing to DataFusion, and I would like to bring to your attention a question that I faced while PRing to DataFusion that IMO needs some alignment :) DataFusion supports scalar UDFs: functions that expect a type, return a type, and perform some operation on the