uros-db commented on PR #51467: URL: https://github.com/apache/spark/pull/51467#issuecomment-3083227077
> This prototype covers <1%, I think.

But there are many things that this approach can never cover, e.g. casting rules, type coercion, function support, specific operations related to a particular type (think collated strings, for example), etc. So let's exclude this from the equation.

What I'm trying to make sense of here is the following: does this kind of approach actually make a large impact on new data type introduction, or does it just complicate the codebase with only minor advantages?

To quantify this decision better, it may be useful to have a reasonable table of approximations, such as:
- total time to implement a new data type: 30 weeks of work
- implementing stuff that's *out of scope* for this prototype (casting, coercion, functions): 20 weeks of work
- implementing stuff that's *in scope* for this prototype: 10 weeks of work
- total time to implement the full TypeOps framework approach and apply it to TIME: 6 weeks of work
- total time to apply the new approach to new data types in the future: 2 weeks of work
- total time saved in the future per newly added data type: 8 weeks of work
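To make the arithmetic behind these example figures explicit, here is a rough Python sketch. The numbers are the illustrative approximations from the list above, not measurements. The function names and the cost model are assumptions made for this sketch: the 6 weeks is taken to already include TIME as the first type, and every later type is taken to cost 2 weeks of in-scope work.

```python
# Illustrative figures from the comment (weeks of work); these are example
# approximations, not measured values.
TOTAL_PER_TYPE = 30           # full cost of one new data type
OUT_OF_SCOPE_PER_TYPE = 20    # casting, coercion, functions
IN_SCOPE_PER_TYPE = 10        # work the prototype's approach could cover
FRAMEWORK_PLUS_TIME = 6       # build the framework and apply it to TIME
IN_SCOPE_WITH_FRAMEWORK = 2   # in-scope work per later type

# Sanity check: the per-type total splits into the two buckets.
assert TOTAL_PER_TYPE == OUT_OF_SCOPE_PER_TYPE + IN_SCOPE_PER_TYPE


def in_scope_cost_without_framework(n_types: int) -> int:
    """In-scope weeks for n_types new types with no shared framework."""
    return IN_SCOPE_PER_TYPE * n_types


def in_scope_cost_with_framework(n_types: int) -> int:
    """In-scope weeks for n_types new types, assuming TIME is the first."""
    if n_types <= 0:
        return 0
    return FRAMEWORK_PLUS_TIME + IN_SCOPE_WITH_FRAMEWORK * (n_types - 1)


def weeks_saved(n_types: int) -> int:
    """Cumulative in-scope weeks saved by the framework after n_types types."""
    return (in_scope_cost_without_framework(n_types)
            - in_scope_cost_with_framework(n_types))


if __name__ == "__main__":
    for n in range(1, 5):
        total = TOTAL_PER_TYPE * n
        saved = weeks_saved(n)
        print(f"{n} type(s): saved {saved} of {total} weeks "
              f"({saved / total:.0%} of total effort)")
```

Under these assumed numbers, each later type saves 10 - 2 = 8 in-scope weeks, and the 20 out-of-scope weeks per type are unaffected either way.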
