[ 
https://issues.apache.org/jira/browse/HIVE-25905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17483120#comment-17483120
 ] 

Alessandro Solimando commented on HIVE-25905:
---------------------------------------------

You are right [~zabetak], it works as expected and it matches both the property 
description and the summarisation table in your comment:
{noformat}
-- hive.default.nulls.last is true by default, it sets NULLS_FIRST for DESC
set hive.default.nulls.last;

OUT:
hive.default.nulls.last=true

SELECT a, b, c, row_number() OVER (PARTITION BY a, b ORDER BY b DESC, c DESC)
FROM test1;

OUT:
John Doe        1990-05-10 00:00:00     2022-01-10 00:00:00     1
John Doe        1990-05-10 00:00:00     2021-12-10 00:00:00     2
John Doe        1990-05-10 00:00:00     2021-11-10 00:00:00     3
John Doe        1990-05-10 00:00:00     2021-10-10 00:00:00     4
John Doe        1990-05-10 00:00:00     2021-09-10 00:00:00     5
John Doe        1987-05-10 00:00:00     NULL    1
John Doe        1987-05-10 00:00:00     2022-01-10 00:00:00     2
John Doe        1987-05-10 00:00:00     2021-12-10 00:00:00     3
John Doe        1987-05-10 00:00:00     2021-11-10 00:00:00     4
John Doe        1987-05-10 00:00:00     2021-10-10 00:00:00     5

-- we set hive.default.nulls.last=false, it sets NULLS_LAST for DESC
set hive.default.nulls.last=false;

SELECT a, b, c, row_number() OVER (PARTITION BY a, b ORDER BY b DESC, c DESC)
FROM test1;

OUT:
John Doe        1990-05-10 00:00:00     2022-01-10 00:00:00     1
John Doe        1990-05-10 00:00:00     2021-12-10 00:00:00     2
John Doe        1990-05-10 00:00:00     2021-11-10 00:00:00     3
John Doe        1990-05-10 00:00:00     2021-10-10 00:00:00     4
John Doe        1990-05-10 00:00:00     2021-09-10 00:00:00     5
John Doe        1987-05-10 00:00:00     2022-01-10 00:00:00     1
John Doe        1987-05-10 00:00:00     2021-12-10 00:00:00     2
John Doe        1987-05-10 00:00:00     2021-11-10 00:00:00     3
John Doe        1987-05-10 00:00:00     2021-10-10 00:00:00     4
John Doe        1987-05-10 00:00:00     NULL    5

-- we set hive.default.nulls.last=false but we have explicit NULLS_LAST, we 
expect NULLS_LAST
set hive.default.nulls.last=false;

SELECT a, b, c, row_number() OVER (PARTITION BY a, b ORDER BY b DESC, c DESC 
NULLS LAST)
FROM test1;

OUT:
John Doe        1990-05-10 00:00:00     2022-01-10 00:00:00     1
John Doe        1990-05-10 00:00:00     2021-12-10 00:00:00     2
John Doe        1990-05-10 00:00:00     2021-11-10 00:00:00     3
John Doe        1990-05-10 00:00:00     2021-10-10 00:00:00     4
John Doe        1990-05-10 00:00:00     2021-09-10 00:00:00     5
John Doe        1987-05-10 00:00:00     2022-01-10 00:00:00     1
John Doe        1987-05-10 00:00:00     2021-12-10 00:00:00     2
John Doe        1987-05-10 00:00:00     2021-11-10 00:00:00     3
John Doe        1987-05-10 00:00:00     2021-10-10 00:00:00     4
John Doe        1987-05-10 00:00:00     NULL    5

-- we have explicit NULLS_FIRST, we expect NULLS_FIRST
SELECT a, b, c, row_number() OVER (PARTITION BY a, b ORDER BY b DESC, c DESC 
NULLS FIRST)
FROM test1;

--OUT:
John Doe        1990-05-10 00:00:00     2022-01-10 00:00:00     1
John Doe        1990-05-10 00:00:00     2021-12-10 00:00:00     2
John Doe        1990-05-10 00:00:00     2021-11-10 00:00:00     3
John Doe        1990-05-10 00:00:00     2021-10-10 00:00:00     4
John Doe        1990-05-10 00:00:00     2021-09-10 00:00:00     5
John Doe        1987-05-10 00:00:00     NULL    1
John Doe        1987-05-10 00:00:00     2022-01-10 00:00:00     2
John Doe        1987-05-10 00:00:00     2021-12-10 00:00:00     3
John Doe        1987-05-10 00:00:00     2021-11-10 00:00:00     4
John Doe        1987-05-10 00:00:00     2021-10-10 00:00:00     5{noformat}

> ORDER BY colName DESC does not honour 'hive.default.nulls.last' property in 
> absence of NULLS_LAST/FIRST
> -------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-25905
>                 URL: https://issues.apache.org/jira/browse/HIVE-25905
>             Project: Hive
>          Issue Type: Bug
>          Components: Parser
>    Affects Versions: 4.0.0
>            Reporter: Alessandro Solimando
>            Assignee: Alessandro Solimando
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Consider the following table and data:
> {noformat}
> create table test1
> (
>   a string,
>   b timestamp,
>   c timestamp
> );
> INSERT INTO TABLE test1 VALUES
> ('John Doe', '1990-05-10 00:00:00.0', '2022-01-10 00:00:00.0'),
> ('John Doe', '1990-05-10 00:00:00.0', '2021-12-10 00:00:00.0'),
> ('John Doe', '1990-05-10 00:00:00.0', '2021-11-10 00:00:00.0'),
> ('John Doe', '1990-05-10 00:00:00.0', '2021-10-10 00:00:00.0'),
> ('John Doe', '1990-05-10 00:00:00.0', '2021-09-10 00:00:00.0'),
> ('John Doe', '1987-05-10 00:00:00.0', '2022-01-10 00:00:00.0'),
> ('John Doe', '1987-05-10 00:00:00.0', '2021-12-10 00:00:00.0'),
> ('John Doe', '1987-05-10 00:00:00.0', '2021-11-10 00:00:00.0'),
> ('John Doe', '1987-05-10 00:00:00.0', '2021-10-10 00:00:00.0'),
> ('John Doe', '1987-05-10 00:00:00.0', null);{noformat}
> Consider also the following query:
> {noformat}
> SELECT a, b, c, row_number() OVER (PARTITION BY a, b ORDER BY b DESC, c DESC) 
> FROM test1; 
> {noformat}
> The output is:
> {noformat}
> John Doe      10/05/1990 00:00        10/01/2022 00:00        1
> John Doe      10/05/1990 00:00        10/12/2021 00:00        2
> John Doe      10/05/1990 00:00        10/11/2021 00:00        3
> John Doe      10/05/1990 00:00        10/10/2021 00:00        4
> John Doe      10/05/1990 00:00        10/09/2021 00:00        5
> John Doe      10/05/1987 00:00        NULL                    1
> John Doe      10/05/1987 00:00        10/01/2022 00:00        2
> John Doe      10/05/1987 00:00        10/12/2021 00:00        3
> John Doe      10/05/1987 00:00        10/11/2021 00:00        4
> John Doe      10/05/1987 00:00        10/10/2021 00:00        5{noformat}
> While the expected output should be:
> {noformat}
> John Doe      10/05/1990 00:00        10/01/2022 00:00        1
> John Doe      10/05/1990 00:00        10/12/2021 00:00        2
> John Doe      10/05/1990 00:00        10/11/2021 00:00        3
> John Doe      10/05/1990 00:00        10/10/2021 00:00        4
> John Doe      10/05/1990 00:00        10/09/2021 00:00        5
> John Doe      10/05/1987 00:00        10/01/2022 00:00        1
> John Doe      10/05/1987 00:00        10/12/2021 00:00        2
> John Doe      10/05/1987 00:00        10/11/2021 00:00        3
> John Doe      10/05/1987 00:00        10/10/2021 00:00        4
> John Doe      10/05/1987 00:00        NULL                    5{noformat}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

Reply via email to