Wanted to throw this one out for discussion for a bug I found and how to fix it...
So we are inconsistent with how we handle sum() on windowing functions. If all the rows are null and the rows are all on "preceding" rows, we return NULL. On "following" rows, however, if all the rows are null, we return 0. This is inconsistent and I have a fix for that so that we always return null. The fix I have is here (not yet reviewed): https://github.com/apache/hive/pull/3789 My discussion though lies in a different problem which you can see in the patch I uploaded. My current fix changes behavior of the following statement: "select sum('a') from my_table". If my_table has rows, right now we are return 0.0. I've looked on postgres and it doesn't even allow a sum on a string column so I can't really compare to that database. My current fix doesn't disable this, but it does change the behavior to return NULL on this select. I kinda feel that returning NULL is more correct than return 0, but I wanted to throw this out there to see what y'all think. This would be a change in behavior and that makes me nervous. Thanks!
