[jira] [Work logged] (HIVE-25335) Unreasonable setting reduce number, when join big size table(but small row count) and small size table

2021-09-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25335?focusedWorklogId=655346=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-655346
 ]

ASF GitHub Bot logged work on HIVE-25335:
-

Author: ASF GitHub Bot
Created on: 27/Sep/21 04:54
Start Date: 27/Sep/21 04:54
Worklog Time Spent: 10m 
  Work Description: zhengchenyu commented on pull request #2490:
URL: https://github.com/apache/hive/pull/2490#issuecomment-927529430


   @jcamachor Could you help me review this, or give me some suggestions?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 655346)
Time Spent: 0.5h  (was: 20m)

> Unreasonable setting reduce number, when join big size table(but small row 
> count) and small size table
> --
>
> Key: HIVE-25335
> URL: https://issues.apache.org/jira/browse/HIVE-25335
> Project: Hive
>  Issue Type: Improvement
>Reporter: zhengchenyu
>Assignee: zhengchenyu
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-25335.001.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> I found an application that runs slowly in our cluster because each reducer 
> processes a huge number of bytes, yet the job has only two reducers. 
> While debugging, I found the reason. In this SQL, one table is large in size 
> (about 30G) but has few rows (about 3.5M), while the other is small in size 
> (about 100M) but has more rows (about 3.6M). So JoinStatsRule.process uses only 
> the 100M to estimate the number of reducers, although we actually need to 
> process 30G of data.
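
To illustrate why the choice of input size matters here, the following is a minimal sketch of a size-based reducer estimate. It is not Hive's actual code: the class name, the helper `estimateReducers`, the 256 MB per-reducer target, and the cap of 1009 reducers are all assumptions made for this example.

```java
class ReducerEstimateSketch {
    // Hypothetical helper: ceil(inputBytes / bytesPerReducer), clamped to [1, maxReducers].
    static int estimateReducers(long inputBytes, long bytesPerReducer, int maxReducers) {
        long reducers = (inputBytes + bytesPerReducer - 1) / bytesPerReducer;
        return (int) Math.max(1, Math.min(maxReducers, reducers));
    }

    public static void main(String[] args) {
        long bytesPerReducer = 256L * 1024 * 1024;      // assumed per-reducer target
        int maxReducers = 1009;                         // assumed upper bound
        long smallTable = 100L * 1024 * 1024;           // about 100M, as in the report
        long bigTable = 30L * 1024 * 1024 * 1024;       // about 30G, as in the report

        // Estimate driven only by the small table's size.
        System.out.println(estimateReducers(smallTable, bytesPerReducer, maxReducers));
        // Estimate driven by the bytes that actually have to be processed.
        System.out.println(estimateReducers(bigTable + smallTable, bytesPerReducer, maxReducers));
    }
}
```

Under these assumed settings, the first estimate is 1 reducer and the second is 121, which shows how far apart the two inputs push the result.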



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25561) Killed task should not commit file.

2021-09-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25561?focusedWorklogId=655342=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-655342
 ]

ASF GitHub Bot logged work on HIVE-25561:
-

Author: ASF GitHub Bot
Created on: 27/Sep/21 04:37
Start Date: 27/Sep/21 04:37
Worklog Time Spent: 10m 
  Work Description: zhengchenyu opened a new pull request #2674:
URL: https://github.com/apache/hive/pull/2674


   ### What changes were proposed in this pull request?
   We should set abort to true when we catch any Exception.
   
   ### Why are the changes needed?
   
   With the Tez engine in our cluster, I found some duplicate lines, especially 
   when Tez speculation is enabled. In the partition dir, I found that both 02_0 
   and 02_1 exist.
   It is a very low-probability event. HIVE-10429 fixed some bugs related to 
   interrupts, but some exceptions were still not caught.
   
   In our cluster, the Task receives SIGTERM, then ClientFinalizer (a Hadoop 
   class) is called and the HDFS client is closed. An exception is then raised, 
   but abort may not be set to true.
   removeTempOrDuplicateFiles may then fail because of the inconsistency, and 
   the duplicate file is retained.
   (Note: the Driver first lists the dir, then the Task commits its file, then 
   the Driver removes duplicate files. This is the inconsistent case.)
   
   
   ### How was this patch tested?
   
   Tested manually in our cluster. 
   I also added some delay in our test code to increase the probability of 
   hitting the problem.
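
   The idea of the change can be sketched as follows. This is only an illustration under stated assumptions, not the code in this pull request: `AbortOnErrorSketch`, `FileSinkLike`, and `runTask` are hypothetical names, and the "commit" is reduced to setting a flag.

```java
import java.io.IOException;

class AbortOnErrorSketch {
    /** Hypothetical stand-in for an operator that commits its temp file on close. */
    static class FileSinkLike {
        boolean committed = false;

        void process(boolean failMidway) throws IOException {
            if (failMidway) {
                // For example, the file system client was closed by a shutdown hook.
                throw new IOException("Filesystem closed");
            }
        }

        void close(boolean abort) {
            // Commit only when the task was not aborted.
            if (!abort) {
                committed = true;
            }
        }
    }

    /** Runs one task and reports whether its output was committed. */
    static boolean runTask(boolean failMidway) {
        FileSinkLike sink = new FileSinkLike();
        boolean abort = false;
        try {
            sink.process(failMidway);
        } catch (Exception e) {
            // Any exception marks the task as aborted, so close() must not commit.
            abort = true;
        } finally {
            sink.close(abort);
        }
        return sink.committed;
    }

    public static void main(String[] args) {
        System.out.println(runTask(false)); // healthy task: true
        System.out.println(runTask(true));  // failed task: false
    }
}
```

   The point of the sketch is that the catch block covers every Exception, so a killed task cannot reach the commit path with abort still false.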
   




Issue Time Tracking
---

Worklog Id: (was: 655342)
Remaining Estimate: 0h
Time Spent: 10m

> Killed task should not commit file.
> ---
>
> Key: HIVE-25561
> URL: https://issues.apache.org/jira/browse/HIVE-25561
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 1.2.1, 2.3.8, 2.4.0
>Reporter: zhengchenyu
>Assignee: zhengchenyu
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> With the Tez engine in our cluster, I found some duplicate lines, especially 
> when Tez speculation is enabled. In the partition dir, I found that both 02_0 
> and 02_1 exist.
> It is a very low-probability event. HIVE-10429 fixed some bugs related to 
> interrupts, but some exceptions were still not caught.
> In our cluster, the Task receives SIGTERM, then ClientFinalizer (a Hadoop 
> class) is called and the HDFS client is closed. An exception is then raised, 
> but abort may not be set to true.
> removeTempOrDuplicateFiles may then fail because of the inconsistency, and 
> the duplicate file is retained. 
> (Note: the Driver first lists the dir, then the Task commits its file, then 
> the Driver removes duplicate files. This is the inconsistent case.)





[jira] [Updated] (HIVE-25561) Killed task should not commit file.

2021-09-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-25561:
--
Labels: pull-request-available  (was: )

> Killed task should not commit file.
> ---
>
> Key: HIVE-25561
> URL: https://issues.apache.org/jira/browse/HIVE-25561
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 1.2.1, 2.3.8, 2.4.0
>Reporter: zhengchenyu
>Assignee: zhengchenyu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>


[jira] [Assigned] (HIVE-25561) Killed task should not commit file.

2021-09-26 Thread zhengchenyu (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhengchenyu reassigned HIVE-25561:
--

Assignee: zhengchenyu

> Killed task should not commit file.
> ---
>
> Key: HIVE-25561
> URL: https://issues.apache.org/jira/browse/HIVE-25561
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 1.2.1, 2.3.8, 2.4.0
>Reporter: zhengchenyu
>Assignee: zhengchenyu
>Priority: Major
>


[jira] [Work logged] (HIVE-25497) Bump ORC to 1.7.0

2021-09-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25497?focusedWorklogId=655310=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-655310
 ]

ASF GitHub Bot logged work on HIVE-25497:
-

Author: ASF GitHub Bot
Created on: 26/Sep/21 20:02
Start Date: 26/Sep/21 20:02
Worklog Time Spent: 10m 
  Work Description: dongjoon-hyun commented on pull request #2615:
URL: https://github.com/apache/hive/pull/2615#issuecomment-927361985


   Hi, @pgaref. @williamhyun closed his PR so that you can take over. Could you 
open your PR and share your progress with us, please?
   > Have already done some preliminary work here and can take over the ticket 
but let me know if you want to invest more time into it.




Issue Time Tracking
---

Worklog Id: (was: 655310)
Time Spent: 1h 10m  (was: 1h)

> Bump ORC to 1.7.0
> -
>
> Key: HIVE-25497
> URL: https://issues.apache.org/jira/browse/HIVE-25497
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: William Hyun
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>






[jira] [Assigned] (HIVE-24289) RetryingMetaStoreClient should not retry connecting to HMS on genuine errors

2021-09-26 Thread Harshit Gupta (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harshit Gupta reassigned HIVE-24289:


Assignee: Harshit Gupta

> RetryingMetaStoreClient should not retry connecting to HMS on genuine errors
> 
>
> Key: HIVE-24289
> URL: https://issues.apache.org/jira/browse/HIVE-24289
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Harshit Gupta
>Priority: Major
>
> When there is a genuine error from HMS, it should not be retried in 
> RetryingMetaStoreClient. 
> For example, the following query is retried multiple times (20+ times) against 
> HMS, causing a huge delay in processing, even though this constraint is 
> already available in HMS. 
> In such cases it should just throw the exception to the client and stop retrying.
> {noformat}
> alter table web_sales add constraint tpcds_bin_partitioned_orc_1_ws_s_hd 
> foreign key  (ws_ship_hdemo_sk) references household_demographics 
> (hd_demo_sk) disable novalidate rely;
> org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.thrift.TApplicationException: Internal error processing add_foreign_key
>   at org.apache.hadoop.hive.ql.metadata.Hive.addForeignKey(Hive.java:5914)
> ..
> ...
> Caused by: org.apache.thrift.TApplicationException: Internal error processing add_foreign_key
>   at org.apache.thrift.TApplicationException.read(TApplicationException.java:111)
>   at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:79)
>   at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_add_foreign_key(ThriftHiveMetastore.java:1872)
> {noformat}
> https://github.com/apache/hive/blob/master/standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/RetryingMetaStoreClient.java#L256
> For example, if the exception contains "Internal error processing ", it could 
> stop retrying.
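
The suggestion above could look roughly like the sketch below. It is a simplified illustration, not the RetryingMetaStoreClient implementation: `RetrySketch`, `isRetriable`, and `callWithRetries` are hypothetical names, the marker string is the one quoted in the issue, and the delay between retries is left out.

```java
import java.util.concurrent.Callable;

class RetrySketch {
    // Message fragment quoted in the issue; treating it as non-retriable is the proposal.
    static final String NON_RETRIABLE_MARKER = "Internal error processing ";

    /** Returns false if any exception in the cause chain carries the marker. */
    static boolean isRetriable(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            String msg = c.getMessage();
            if (msg != null && msg.contains(NON_RETRIABLE_MARKER)) {
                return false;
            }
        }
        return true;
    }

    /** Calls once, then retries up to maxRetries times on retriable failures only. */
    static <T> T callWithRetries(Callable<T> call, int maxRetries) throws Exception {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must be >= 0");
        }
        Exception last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return call.call();
            } catch (Exception e) {
                if (!isRetriable(e)) {
                    throw e; // genuine error: surface it to the caller immediately
                }
                last = e;
            }
        }
        throw last;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        try {
            RetrySketch.<Void>callWithRetries(() -> {
                calls[0]++;
                throw new RuntimeException("Internal error processing add_foreign_key");
            }, 20);
        } catch (Exception e) {
            System.out.println("attempts: " + calls[0]); // attempts: 1
        }
    }
}
```

Matching on message text is fragile, so a real change might prefer to classify errors by exception type where that information is available.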





[jira] [Resolved] (HIVE-25560) unable to run hive jobs

2021-09-26 Thread Pravin Gajanan Pawar (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pravin Gajanan Pawar resolved HIVE-25560.
-
Release Note: issue resolved
  Resolution: Fixed

> unable to run hive jobs
> ---
>
> Key: HIVE-25560
> URL: https://issues.apache.org/jira/browse/HIVE-25560
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Reporter: Pravin Gajanan Pawar
>Assignee: Pravin Gajanan Pawar
>Priority: Minor
>
> unable to connect to hive





[jira] [Commented] (HIVE-25560) unable to run hive jobs

2021-09-26 Thread Pravin Gajanan Pawar (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17420241#comment-17420241
 ] 

Pravin Gajanan Pawar commented on HIVE-25560:
-

RCA =issue resolved

> unable to run hive jobs
> ---
>
> Key: HIVE-25560
> URL: https://issues.apache.org/jira/browse/HIVE-25560
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Reporter: Pravin Gajanan Pawar
>Assignee: Pravin Gajanan Pawar
>Priority: Minor
>
> unable to connect to hive





[jira] [Assigned] (HIVE-25560) unable to run hive jobs

2021-09-26 Thread Pravin Gajanan Pawar (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pravin Gajanan Pawar reassigned HIVE-25560:
---

Assignee: Pravin Gajanan Pawar

> unable to run hive jobs
> ---
>
> Key: HIVE-25560
> URL: https://issues.apache.org/jira/browse/HIVE-25560
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Reporter: Pravin Gajanan Pawar
>Assignee: Pravin Gajanan Pawar
>Priority: Minor
>
> unable to connect to hive





[jira] [Work started] (HIVE-25560) unable to run hive jobs

2021-09-26 Thread Pravin Gajanan Pawar (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-25560 started by Pravin Gajanan Pawar.
---
> unable to run hive jobs
> ---
>
> Key: HIVE-25560
> URL: https://issues.apache.org/jira/browse/HIVE-25560
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Reporter: Pravin Gajanan Pawar
>Assignee: Pravin Gajanan Pawar
>Priority: Minor
>
> unable to connect to hive





[jira] [Commented] (HIVE-25559) to_unix_timestamp udf result incorrect

2021-09-26 Thread zengxl (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17420221#comment-17420221
 ] 

zengxl commented on HIVE-25559:
---

* Execution result with the modified code:

{code:java}
// code placeholder
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/usr/local/hive/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/usr/local/hadoop-3.2.1/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Hive Session ID = 869fe53f-fc8c-4d43-8cf4-8c0ed0ab1502
Logging initialized using configuration in file:/usr/local/hive/conf/hive-log4j2.properties Async: true
Hive Session ID = f4f2d21c-c142-4efb-b5c5-9d8905eeef73
Hive-on-MR is deprecated in Hive 2 and may not be available in the future 
versions. Consider using a different execution engine (i.e. spark, tez) or 
using Hive 1.X releases.
hive> select unix_timestamp('2021-09-24 00:00:00');
-cunrrent time zone:Asia/Shanghai
-cunrrent time zone:Asia/Shanghai
-cunrrent time zone:Asia/Shanghai
-cunrrent time zone:Asia/Shanghai
OK
1632412800
Time taken: 3.637 seconds, Fetched: 1 row(s)
{code}

> to_unix_timestamp udf result incorrect
> --
>
> Key: HIVE-25559
> URL: https://issues.apache.org/jira/browse/HIVE-25559
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 3.1.2
>Reporter: zengxl
>Assignee: zengxl
>Priority: Critical
> Attachments: HIVE-25559.1.branch-3.1.2patch
>
>
> When I use the *unix_timestamp* UDF, what the function actually calls is the 
> *to_unix_timestamp* UDF. The returned result is incorrect. Here is my SQL:
> {code:java}
> // code placeholder
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/local/hive/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/local/hadoop-3.2.1/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
> Hive Session ID = 3a04a9cf-1fdb-4017-a4bb-14763a3163c7
> Logging initialized using configuration in file:/usr/local/hive/conf/hive-log4j2.properties Async: true
> Hive Session ID = 92ca916b-cfde-43b5-bd86-10d50ff7d861
> Hive-on-MR is deprecated in Hive 2 and may not be available in the future 
> versions. Consider using a different execution engine (i.e. spark, tez) or 
> using Hive 1.X releases.
> hive> select unix_timestamp('2021-09-24 00:00:00');
> OK
> 1632441600
> Time taken: 3.729 seconds, Fetched: 1 row(s)
> {code}
> Looking at the GenericUDFToUnixTimeStamp class code, I found that the time zone 
> is fixed to {color:#de350b}UTC{color} rather than following the user's time 
> zone. Time zones vary between users; mine is 
> {color:#de350b}Asia/Shanghai{color}. Therefore, the function should use the 
> user's time zone. Here is the code I modified:
> {code:java}
> // code placeholder
> // Use the session's configured time zone; fall back to the JVM's user.timezone property.
> SessionState ss = SessionState.get();
> String timeZoneStr = ss.getConf().get("hive.local.time.zone");
> if (timeZoneStr == null || timeZoneStr.trim().isEmpty()
>     || timeZoneStr.equalsIgnoreCase("local")) {
>   timeZoneStr = System.getProperty("user.timezone");
> }
> formatter.setTimeZone(TimeZone.getTimeZone(timeZoneStr));
> {code}
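
The two results reported in this thread (1632441600 and 1632412800) differ by exactly the eight-hour offset between UTC and Asia/Shanghai. The standalone sketch below reproduces that arithmetic with `java.time`. It does not use any Hive classes; the class name `EpochByZoneSketch` and the helper `toEpochSeconds` are made up for this example.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

class EpochByZoneSketch {
    /** Interprets a wall-clock timestamp in the given zone and returns epoch seconds. */
    static long toEpochSeconds(String text, String zoneId) {
        DateTimeFormatter fmt = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");
        return LocalDateTime.parse(text, fmt).atZone(ZoneId.of(zoneId)).toEpochSecond();
    }

    public static void main(String[] args) {
        String ts = "2021-09-24 00:00:00";
        System.out.println(toEpochSeconds(ts, "UTC"));           // 1632441600
        System.out.println(toEpochSeconds(ts, "Asia/Shanghai")); // 1632412800
    }
}
```

The difference is 28800 seconds, that is 8 hours, so the same input string maps to different instants depending on which zone the formatter assumes.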
>  





[jira] [Updated] (HIVE-25559) to_unix_timestamp udf result incorrect

2021-09-26 Thread zengxl (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zengxl updated HIVE-25559:
--
Attachment: HIVE-25559.1.branch-3.1.2patch
Status: Patch Available  (was: Open)

> to_unix_timestamp udf result incorrect
> --
>
> Key: HIVE-25559
> URL: https://issues.apache.org/jira/browse/HIVE-25559
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 3.1.2
>Reporter: zengxl
>Assignee: zengxl
>Priority: Critical
> Attachments: HIVE-25559.1.branch-3.1.2patch
>
>


[jira] [Updated] (HIVE-25559) to_unix_timestamp udf result incorrect

2021-09-26 Thread zengxl (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zengxl updated HIVE-25559:
--
Attachment: (was: HIVE-25559.1.branch-3.1.2patch)

> to_unix_timestamp udf result incorrect
> --
>
> Key: HIVE-25559
> URL: https://issues.apache.org/jira/browse/HIVE-25559
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 3.1.2
>Reporter: zengxl
>Assignee: zengxl
>Priority: Critical
> Attachments: HIVE-25559.1.branch-3.1.2patch
>
>


[jira] [Updated] (HIVE-25559) to_unix_timestamp udf result incorrect

2021-09-26 Thread zengxl (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zengxl updated HIVE-25559:
--
Attachment: HIVE-25559.1.branch-3.1.2patch

> to_unix_timestamp udf result incorrect
> --
>
> Key: HIVE-25559
> URL: https://issues.apache.org/jira/browse/HIVE-25559
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 3.1.2
>Reporter: zengxl
>Assignee: zengxl
>Priority: Critical
> Attachments: HIVE-25559.1.branch-3.1.2patch
>
>


[jira] [Assigned] (HIVE-25559) to_unix_timestamp udf result incorrect

2021-09-26 Thread zengxl (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zengxl reassigned HIVE-25559:
-


> to_unix_timestamp udf result incorrect
> --
>
> Key: HIVE-25559
> URL: https://issues.apache.org/jira/browse/HIVE-25559
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 3.1.2
>Reporter: zengxl
>Assignee: zengxl
>Priority: Critical
>