uros-db commented on PR #51467:
URL: https://github.com/apache/spark/pull/51467#issuecomment-3083227077

   > This prototype covers <1%, I think.
   
   But there are many things that this approach can never cover, e.g. casting 
rules, type coercion, function support, specific operations related to a 
particular type (think collated strings for example), etc. So let's exclude 
this from the equation.
   
   What I'm trying to make sense of here is the following: does this kind of 
approach actually make a large impact on new data type introduction, or does it 
just complicate the codebase with only minor advantages? To quantify this 
decision better, it may be useful to have a reasonable table of approximations, 
such as:
   
   - total time to implement a new data type: 30 weeks of work
   - implementing stuff that's *out of scope* for this prototype (casting, 
coercion, functions): 20 weeks of work
   - implementing stuff that's *in scope* for this prototype: 10 weeks of work
   - total time to implement the full TypeOps framework approach and apply it 
to TIME: 6 weeks of work
   - total time to apply the new approach to new data types in the future: 2 
weeks of work
   - total time saved in the future for each newly added data type: 8 weeks of 
work
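
   The arithmetic behind such a table could be checked with a small sketch like 
the one below. The week counts are the placeholder approximations from the list 
above, not measurements. The variable names are made up for illustration, and 
the payback figure assumes the 6 weeks of framework work is purely extra cost, 
which the list does not state.

```python
import math

# Placeholder estimates from the list above, in weeks of work (not measured data).
total_per_type = 30          # implementing a new data type end to end
out_of_scope = 20            # casting, coercion, functions
in_scope = 10                # work the prototype could cover
framework_and_time = 6       # building the framework and applying it to TIME
in_scope_with_framework = 2  # in-scope work per future type with the framework

# The first three figures should be consistent with each other.
assert total_per_type == out_of_scope + in_scope

# Savings per future data type, in weeks and as a share of the total effort.
saved_per_type = in_scope - in_scope_with_framework
saved_fraction = saved_per_type / total_per_type

# ASSUMPTION: treat the framework work as pure extra cost and count how many
# future data types are needed before the savings cover it.
types_to_recoup = math.ceil(framework_and_time / saved_per_type)

print(f"saved per future type: {saved_per_type} weeks ({saved_fraction:.0%} of total)")
print(f"future types needed to recoup the framework cost: {types_to_recoup}")
```

   With these placeholder numbers the in-scope work shrinks substantially, but 
most of the per-type effort stays in the out-of-scope part, which is the 
trade-off the table is meant to expose.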


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

