[jira] [Commented] (HIVE-15982) Support the width_bucket function
[ https://issues.apache.org/jira/browse/HIVE-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1588#comment-1588 ]

Sahil Takiar commented on HIVE-15982:
-------------------------------------

Documented this under: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-MathematicalFunctions

> Support the width_bucket function
> ---------------------------------
>
>          Key: HIVE-15982
>          URL: https://issues.apache.org/jira/browse/HIVE-15982
>      Project: Hive
>   Issue Type: Sub-task
>   Components: SQL
>     Reporter: Carter Shanklin
>     Assignee: Sahil Takiar
>       Labels: TODOC3.0
>      Fix For: 3.0.0
>
>  Attachments: HIVE-15982.1.patch, HIVE-15982.2.patch,
>               HIVE-15982.3.patch, HIVE-15982.4.patch,
>               HIVE-15982.5.patch, HIVE-15982.6.patch
>
>
> Support the width_bucket(wbo, wbb1, wbb2, wbc) which returns an integer
> between 0 and wbc+1 by mapping wbo into the ith equally sized bucket made by
> dividing wbb1 and wbb2 into equally sized regions. If wbo < wbb1, return 1,
> if wbo > wbb2 return wbc+1. Reference: SQL standard section 4.4.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Commented] (HIVE-15982) Support the width_bucket function
[ https://issues.apache.org/jira/browse/HIVE-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980578#comment-15980578 ]

Carter Shanklin commented on HIVE-15982:
----------------------------------------

I ran a test suite I use against this and noted a few issues in HIVE-16513. I think the biggest challenge will be that floating point numbers can't be used. I hope I didn't confuse the issue when I said "numeric value expressions" above.
[jira] [Commented] (HIVE-15982) Support the width_bucket function
[ https://issues.apache.org/jira/browse/HIVE-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15979716#comment-15979716 ]

Lefty Leverenz commented on HIVE-15982:
---------------------------------------

Doc note: The width_bucket function needs to be documented in the UDFs wiki. I'm not sure which section it belongs in -- perhaps UDAFs.

* [Hive Operators and UDFs | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF]

Added a TODOC3.0 label.
[jira] [Commented] (HIVE-15982) Support the width_bucket function
[ https://issues.apache.org/jira/browse/HIVE-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15979383#comment-15979383 ]

Hive QA commented on HIVE-15982:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12864551/HIVE-15982.6.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10626 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_index] (batchId=225)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[show_functions] (batchId=69)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=143)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4827/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4827/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4827/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12864551 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-15982) Support the width_bucket function
[ https://issues.apache.org/jira/browse/HIVE-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15977996#comment-15977996 ]

Ashutosh Chauhan commented on HIVE-15982:
-----------------------------------------

+1
[jira] [Commented] (HIVE-15982) Support the width_bucket function
[ https://issues.apache.org/jira/browse/HIVE-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15977252#comment-15977252 ]

Sahil Takiar commented on HIVE-15982:
-------------------------------------

Sounds good, attaching updated patch. It returns 0 now.
[jira] [Commented] (HIVE-15982) Support the width_bucket function
[ https://issues.apache.org/jira/browse/HIVE-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15976984#comment-15976984 ]

Ashutosh Chauhan commented on HIVE-15982:
-----------------------------------------

Checked the standard. It doesn't say "if wbo < wbb1, return 1". Rather, it says: "Values outside the range between the second and third arguments are assigned to either 0 (zero) or the value of the final argument plus 1 (one)." So I think it's safe to assume Postgres and Oracle have the correct implementation.
[jira] [Commented] (HIVE-15982) Support the width_bucket function
[ https://issues.apache.org/jira/browse/HIVE-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15976943#comment-15976943 ]

Sahil Takiar commented on HIVE-15982:
-------------------------------------

[~cartershanklin] could we double-check the SQL standard for this? Is the statement "If wbo < wbb1, return 1" correct? It seems both Oracle and Postgres return 0 instead of 1.
[jira] [Commented] (HIVE-15982) Support the width_bucket function
[ https://issues.apache.org/jira/browse/HIVE-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15975747#comment-15975747 ]

Ashutosh Chauhan commented on HIVE-15982:
-----------------------------------------

I tested it on Postgres and it agrees with Oracle. So it's worth rechecking the standard for this.
[jira] [Commented] (HIVE-15982) Support the width_bucket function
[ https://issues.apache.org/jira/browse/HIVE-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15975605#comment-15975605 ]

Sahil Takiar commented on HIVE-15982:
-------------------------------------

Thanks for taking a look [~ashutoshc] - I've added an updated patch; the results of the qfile are now consistent with what Oracle returns. The one exception is that in the attached patch {{width_bucket(1, 5, 25, 4)}} returns 1. The spec outlined in this JIRA description says "If wbo < wbb1, return 1", but it seems Oracle returns 0 rather than 1.

I removed the {{GenericUDF#getConstantLongValue}} method because it isn't used.
[jira] [Commented] (HIVE-15982) Support the width_bucket function
[ https://issues.apache.org/jira/browse/HIVE-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15973721#comment-15973721 ]

Ashutosh Chauhan commented on HIVE-15982:
-----------------------------------------

* Executing select width_bucket(1, 5, 25, 4), width_bucket(10, 5, 25, 4), width_bucket(20, 5, 25, 4), width_bucket(30, 5, 25, 4) from dual; on Oracle yields 0 2 4 5, which is different from your test case.
* You may use PrimitiveObjectInspectorUtils::getLong() instead of writing a custom function to extract long values.
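
For readers following the boundary discussion, here is a minimal Java sketch of the integer bucketing rule that would reproduce the Oracle results quoted above (0 2 4 5). It is not the code from any HIVE-15982 patch: the class name WidthBucketSketch and the method widthBucket are hypothetical, and it assumes ascending bounds (wbb1 < wbb2) with an exclusive upper bound, so a value equal to wbb2 falls in the overflow bucket.

```java
// Hypothetical illustration only; not taken from the HIVE-15982 patches.
final class WidthBucketSketch {

    /**
     * Maps wbo into one of wbc equal-width buckets over [wbb1, wbb2).
     * Returns 0 for values below wbb1 and wbc + 1 for values at or above wbb2.
     * Assumption: wbb1 < wbb2 and wbc > 0; descending ranges are not handled.
     */
    static long widthBucket(long wbo, long wbb1, long wbb2, long wbc) {
        if (wbc <= 0) {
            throw new IllegalArgumentException("bucket count must be positive");
        }
        if (wbb1 >= wbb2) {
            throw new IllegalArgumentException("sketch assumes wbb1 < wbb2");
        }
        if (wbo < wbb1) {
            return 0;           // underflow bucket
        }
        if (wbo >= wbb2) {
            return wbc + 1;     // overflow bucket
        }
        // All operands are non-negative here, so integer division is a floor.
        // The product may overflow for very large inputs; ignored in this sketch.
        return (wbc * (wbo - wbb1)) / (wbb2 - wbb1) + 1;
    }

    public static void main(String[] args) {
        long[] operands = {1, 10, 20, 30};
        StringBuilder out = new StringBuilder();
        for (long wbo : operands) {
            out.append(widthBucket(wbo, 5, 25, 4)).append(' ');
        }
        System.out.println(out.toString().trim()); // prints "0 2 4 5"
    }
}
```

With bounds 5 and 25 and 4 buckets, each bucket is 5 wide, so 10 lands in bucket 2 and 20 in bucket 4, while 1 and 30 fall outside the range.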
[jira] [Commented] (HIVE-15982) Support the width_bucket function
[ https://issues.apache.org/jira/browse/HIVE-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15973605#comment-15973605 ]

Hive QA commented on HIVE-15982:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12863878/HIVE-15982.1.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10587 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_16] (batchId=234)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[show_functions] (batchId=68)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_order_null] (batchId=28)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4741/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4741/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4741/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12863878 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-15982) Support the width_bucket function
[ https://issues.apache.org/jira/browse/HIVE-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15971687#comment-15971687 ]

Carter Shanklin commented on HIVE-15982:
----------------------------------------

[~stakiar] Sorry for the delayed response. I checked the SQL spec; for SQL conformance only numeric value expressions are needed.

I can see some value in supporting dates or timestamps, e.g. for creating a histogram of sign-up dates, first purchase dates and so on, but it doesn't seem mandatory. Supporting intervals -- maybe a histogram of account inactivity time. Again, nice to have.
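
To illustrate the "nice to have" date case, the following Java sketch shows one way a histogram of sign-up dates could be built on top of a purely numeric width_bucket: convert each date to an epoch-day count first. This is an assumption-laden illustration, not a description of what Hive does; the class SignupDateBuckets and both methods are hypothetical names, and the numeric rule assumes ascending bounds with an exclusive upper bound.

```java
import java.time.LocalDate;

// Hypothetical illustration only; nothing here is part of Hive.
final class SignupDateBuckets {

    /** Numeric rule: 0 below the range, wbc + 1 at or above it (assumes lo < hi, wbc > 0). */
    static long widthBucket(long value, long lo, long hi, long wbc) {
        if (wbc <= 0 || lo >= hi) {
            throw new IllegalArgumentException("need wbc > 0 and lo < hi");
        }
        if (value < lo) {
            return 0;
        }
        if (value >= hi) {
            return wbc + 1;
        }
        return (wbc * (value - lo)) / (hi - lo) + 1;
    }

    /** Buckets a date by converting it and the bounds to days since 1970-01-01. */
    static long dateBucket(LocalDate date, LocalDate from, LocalDate to, long wbc) {
        return widthBucket(date.toEpochDay(), from.toEpochDay(), to.toEpochDay(), wbc);
    }

    public static void main(String[] args) {
        // A 28-day window split into 4 buckets of 7 days each.
        LocalDate from = LocalDate.of(2017, 1, 1);
        LocalDate to = LocalDate.of(2017, 1, 29);
        LocalDate[] signups = {
            LocalDate.of(2016, 12, 31),
            LocalDate.of(2017, 1, 1),
            LocalDate.of(2017, 1, 10),
            LocalDate.of(2017, 1, 28),
            LocalDate.of(2017, 1, 29)
        };
        for (LocalDate d : signups) {
            System.out.println(d + " -> bucket " + dateBucket(d, from, to, 4));
        }
    }
}
```

The same conversion idea would extend to timestamps (epoch seconds) or intervals (a duration in a fixed unit), at the cost of choosing that unit explicitly.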
[jira] [Commented] (HIVE-15982) Support the width_bucket function
[ https://issues.apache.org/jira/browse/HIVE-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15964872#comment-15964872 ]

Sahil Takiar commented on HIVE-15982:
-------------------------------------

Thanks [~cartershanklin]. I'm basing the implementation largely on https://docs.oracle.com/cd/B19306_01/server.102/b14200/functions214.htm and https://my.vertica.com/docs/7.1.x/HTML/Content/Authoring/SQLReferenceManual/Functions/Mathematical/WIDTH_BUCKET.htm - they both mention support for datetime, interval, timestamp, etc. - is that something we want to support too?
[jira] [Commented] (HIVE-15982) Support the width_bucket function
[ https://issues.apache.org/jira/browse/HIVE-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15964466#comment-15964466 ]

Carter Shanklin commented on HIVE-15982:
----------------------------------------

[~stakiar] no problem at all, thanks for looking into this
[jira] [Commented] (HIVE-15982) Support the width_bucket function
[ https://issues.apache.org/jira/browse/HIVE-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15963804#comment-15963804 ]

Sahil Takiar commented on HIVE-15982:
-------------------------------------

[~cartershanklin] if no one is working on this, I would like to take this up. If that's OK with you, I will assign this to myself.