[ 
https://issues.apache.org/jira/browse/HIVE-24762?focusedWorklogId=647811&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-647811
 ]

ASF GitHub Bot logged work on HIVE-24762:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 08/Sep/21 08:18
            Start Date: 08/Sep/21 08:18
    Worklog Time Spent: 10m 
      Work Description: abstractdog opened a new pull request #1965:
URL: https://github.com/apache/hive/pull/1965


   ### What changes were proposed in this pull request?
   StringValueBoundaryScanner.isDistanceGreater to take amt into account.
   
   
   ### Why are the changes needed?
   Described in jira.
   
   ### Does this PR introduce _any_ user-facing change?
   No.
   
   
   ### How was this patch tested?
   Added string based range window to ptf.q.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 647811)
    Time Spent: 2h  (was: 1h 50m)

>  StringValueBoundaryScanner ignores boundary which leads to incorrect results
> -----------------------------------------------------------------------------
>
>                 Key: HIVE-24762
>                 URL: https://issues.apache.org/jira/browse/HIVE-24762
>             Project: Hive
>          Issue Type: Bug
>            Reporter: László Bodor
>            Assignee: László Bodor
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/ptf/ValueBoundaryScanner.java#L901
> {code}
>   public boolean isDistanceGreater(Object v1, Object v2, int amt) {
>     ...
>     return s1 != null && s2 != null && s1.compareTo(s2) > 0;
> {code}
> Like other boundary scanners, StringValueBoundaryScanner should take amt into 
> account, otherwise it'll result in the same range regardless of the given 
> window size. This typically affects queries where the range is defined on a 
> string column:
> {code}
> select p_mfgr, p_name, p_retailprice,
> count(*) over(partition by p_mfgr order by p_name range between 1 preceding 
> and current row) as cs1,
> count(*) over(partition by p_mfgr order by p_name range between 3 preceding 
> and current row) as cs2
> from vector_ptf_part_simple_orc;
> {code} 
> with "> 0" cs1 and cs2 will be calculated on the same window, so cs1 == cs2, 
> but actually it should be different, this is the correct result (see "almond 
> antique olive coral navajo"):
> {code}
> +-----------------+---------------------------------------------+------+------+
> |     p_mfgr      |                   p_name                    | cs1  | cs2  
> |
> +-----------------+---------------------------------------------+------+------+
> | Manufacturer#1  | almond antique burnished rose metallic      | 2    | 2    
> |
> | Manufacturer#1  | almond antique burnished rose metallic      | 2    | 2    
> |
> | Manufacturer#1  | almond antique chartreuse lavender yellow   | 6    | 6    
> |
> | Manufacturer#1  | almond antique chartreuse lavender yellow   | 6    | 6    
> |
> | Manufacturer#1  | almond antique chartreuse lavender yellow   | 6    | 6    
> |
> | Manufacturer#1  | almond antique chartreuse lavender yellow   | 6    | 6    
> |
> | Manufacturer#1  | almond antique salmon chartreuse burlywood  | 1    | 1    
> |
> | Manufacturer#1  | almond aquamarine burnished black steel     | 1    | 8    
> |
> | Manufacturer#1  | almond aquamarine pink moccasin thistle     | 4    | 4    
> |
> | Manufacturer#1  | almond aquamarine pink moccasin thistle     | 4    | 4    
> |
> | Manufacturer#1  | almond aquamarine pink moccasin thistle     | 4    | 4    
> |
> | Manufacturer#1  | almond aquamarine pink moccasin thistle     | 4    | 4    
> |
> | Manufacturer#2  | almond antique violet chocolate turquoise   | 1    | 1    
> |
> | Manufacturer#2  | almond antique violet turquoise frosted     | 3    | 3    
> |
> | Manufacturer#2  | almond antique violet turquoise frosted     | 3    | 3    
> |
> | Manufacturer#2  | almond antique violet turquoise frosted     | 3    | 3    
> |
> | Manufacturer#2  | almond aquamarine midnight light salmon     | 1    | 5    
> |
> | Manufacturer#2  | almond aquamarine rose maroon antique       | 2    | 2    
> |
> | Manufacturer#2  | almond aquamarine rose maroon antique       | 2    | 2    
> |
> | Manufacturer#2  | almond aquamarine sandy cyan gainsboro      | 3    | 3    
> |
> | Manufacturer#3  | almond antique chartreuse khaki white       | 1    | 1    
> |
> | Manufacturer#3  | almond antique forest lavender goldenrod    | 4    | 5    
> |
> | Manufacturer#3  | almond antique forest lavender goldenrod    | 4    | 5    
> |
> | Manufacturer#3  | almond antique forest lavender goldenrod    | 4    | 5    
> |
> | Manufacturer#3  | almond antique forest lavender goldenrod    | 4    | 5    
> |
> | Manufacturer#3  | almond antique metallic orange dim          | 1    | 1    
> |
> | Manufacturer#3  | almond antique misty red olive              | 1    | 1    
> |
> | Manufacturer#3  | almond antique olive coral navajo           | 1    | 3    
> |
> | Manufacturer#4  | almond antique gainsboro frosted violet     | 1    | 1    
> |
> | Manufacturer#4  | almond antique violet mint lemon            | 1    | 1    
> |
> | Manufacturer#4  | almond aquamarine floral ivory bisque       | 2    | 4    
> |
> | Manufacturer#4  | almond aquamarine floral ivory bisque       | 2    | 4    
> |
> | Manufacturer#4  | almond aquamarine yellow dodger mint        | 1    | 1    
> |
> | Manufacturer#4  | almond azure aquamarine papaya violet       | 1    | 1    
> |
> | Manufacturer#5  | almond antique blue firebrick mint          | 1    | 1    
> |
> | Manufacturer#5  | almond antique medium spring khaki          | 2    | 2    
> |
> | Manufacturer#5  | almond antique medium spring khaki          | 2    | 2    
> |
> | Manufacturer#5  | almond antique sky peru orange              | 1    | 1    
> |
> | Manufacturer#5  | almond aquamarine dodger light gainsboro    | 1    | 5    
> |
> | Manufacturer#5  | almond azure blanched chiffon midnight      | 1    | 1    
> |
> +-----------------+---------------------------------------------+------+------+
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to