[jira] [Updated] (HIVE-25146) JMH tests for Multi HT and parallel load
[ https://issues.apache.org/jira/browse/HIVE-25146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-25146: -- Fix Version/s: 4.0.0 > JMH tests for Multi HT and parallel load > > > Key: HIVE-25146 > URL: https://issues.apache.org/jira/browse/HIVE-25146 > Project: Hive > Issue Type: Sub-task >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 40m > Remaining Estimate: 0h > > JMH tests for parallel HT load, configuration parameters include > LOAD_THREADS_NUM, ROWS_NUM and JOIN_TYPE. > A single thread simulates the default load behaviour while ROWS_NUM < 1M will > default to a single thread for simplicity. > Higher number of threads >=2 evaluates the benefit of parallel loading of the > HT. -- This message was sent by Atlassian Jira (v8.20.1#820001)
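A rough illustration of what the description above is measuring, not the benchmark from the patch: the sketch below uses plain threads rather than JMH, and every name in it (ParallelLoadSketch, effectiveThreads, load) is hypothetical. It assumes that inputs under 1M rows fall back to one loader thread, as the description states, and that each loader thread fills its own hash table. JOIN_TYPE is not modelled.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; not the benchmark class from the Hive patch.
class ParallelLoadSketch {
  // "ROWS_NUM < 1M will default to a single thread" (from the ticket description).
  static final long PARALLEL_THRESHOLD = 1_000_000L;

  // Number of loader threads that would be used for the given parameters.
  static int effectiveThreads(int loadThreadsNum, long rowsNum) {
    return rowsNum < PARALLEL_THRESHOLD ? 1 : Math.max(1, loadThreadsNum);
  }

  // Loads keys 0..rowsNum-1 into one hash table per thread (threads must be >= 1).
  // Thread t handles the keys whose remainder modulo `threads` equals t.
  static List<Map<Long, Long>> load(int threads, long rowsNum) {
    List<Map<Long, Long>> tables = new ArrayList<>();
    for (int i = 0; i < threads; i++) {
      tables.add(new HashMap<>());
    }
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<?>> futures = new ArrayList<>();
      for (int t = 0; t < threads; t++) {
        final int id = t;
        futures.add(pool.submit(() -> {
          Map<Long, Long> table = tables.get(id);
          for (long key = id; key < rowsNum; key += threads) {
            table.put(key, key * 2);
          }
        }));
      }
      // Waiting on each future also makes the loaded entries visible to the caller.
      for (Future<?> f : futures) {
        f.get();
      }
    } catch (InterruptedException | ExecutionException e) {
      throw new IllegalStateException("parallel load failed", e);
    } finally {
      pool.shutdown();
    }
    return tables;
  }
}
```

A real measurement would wrap the load call in a JMH benchmark method and sweep the thread and row counts as benchmark parameters; timing it by hand like this is only meant to show the shape of the workload.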
[jira] [Resolved] (HIVE-25146) JMH tests for Multi HT and parallel load
[ https://issues.apache.org/jira/browse/HIVE-25146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis resolved HIVE-25146. --- Resolution: Fixed > JMH tests for Multi HT and parallel load > > > Key: HIVE-25146 > URL: https://issues.apache.org/jira/browse/HIVE-25146 > Project: Hive > Issue Type: Sub-task >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > JMH tests for parallel HT load, configuration parameters include > LOAD_THREADS_NUM, ROWS_NUM and JOIN_TYPE. > A single thread simulates the default load behaviour while ROWS_NUM < 1M will > default to a single thread for simplicity. > Higher number of threads >=2 evaluates the benefit of parallel loading of the > HT. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HIVE-25146) JMH tests for Multi HT and parallel load
[ https://issues.apache.org/jira/browse/HIVE-25146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-25146: -- Description: JMH tests for parallel HT load, configuration parameters include LOAD_THREADS_NUM, ROWS_NUM and JOIN_TYPE. A single thread simulates the default load behaviour while ROWS_NUM < 1M will default to a single thread for simplicity. Higher number of threads >=2 evaluates the benefit of parallel loading of the HT. was: As the title suggests, add some benchmarks for Parallel HT construction feature > JMH tests for Multi HT and parallel load > > > Key: HIVE-25146 > URL: https://issues.apache.org/jira/browse/HIVE-25146 > Project: Hive > Issue Type: Sub-task >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > JMH tests for parallel HT load, configuration parameters include > LOAD_THREADS_NUM, ROWS_NUM and JOIN_TYPE. > A single thread simulates the default load behaviour while ROWS_NUM < 1M will > default to a single thread for simplicity. > Higher number of threads >=2 evaluates the benefit of parallel loading of the > HT. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Work started] (HIVE-25146) JMH tests for Multi HT and parallel load
[ https://issues.apache.org/jira/browse/HIVE-25146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-25146 started by Panagiotis Garefalakis. - > JMH tests for Multi HT and parallel load > > > Key: HIVE-25146 > URL: https://issues.apache.org/jira/browse/HIVE-25146 > Project: Hive > Issue Type: Sub-task >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > As the title suggests, add some benchmarks for Parallel HT construction > feature -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HIVE-25149) Support parallel load for Fast HT implementations
[ https://issues.apache.org/jira/browse/HIVE-25149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-25149: -- Fix Version/s: 4.0.0 > Support parallel load for Fast HT implementations > - > > Key: HIVE-25149 > URL: https://issues.apache.org/jira/browse/HIVE-25149 > Project: Hive > Issue Type: Sub-task >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 3h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HIVE-25583) Support parallel load for HastTables - Interfaces
[ https://issues.apache.org/jira/browse/HIVE-25583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-25583: -- Fix Version/s: 4.0.0 > Support parallel load for HastTables - Interfaces > - > > Key: HIVE-25583 > URL: https://issues.apache.org/jira/browse/HIVE-25583 > Project: Hive > Issue Type: Sub-task >Reporter: Ramesh Kumar Thangarajan >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 50m > Remaining Estimate: 0h > > Support parallel load for HastTables - Interfaces > * Introducing VectorMapJoinFastHashTableContainerBase class that implements > VectorMapJoinHashTable > * Each VectorMapJoinFastStringHashMapContainer is a singleton that contains > an array of HashTables (1 or more) > * VectorMapJoinFastTableContainer now initializes > VectorMapJoinFastHashTableContainers instead of HTs directly -- This message was sent by Atlassian Jira (v8.20.1#820001)
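The container idea in the bullets above can be pictured with a small sketch. This is not the Hive class: HashTableContainerSketch and its methods are hypothetical names, plain java.util.HashMap stands in for the vectorized hash tables, and routing a key by its hash modulo the number of tables is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a container that owns one or more hash tables.
class HashTableContainerSketch {
  private final List<Map<String, Long>> tables = new ArrayList<>();

  HashTableContainerSketch(int numTables) {
    for (int i = 0; i < numTables; i++) {
      tables.add(new HashMap<>());
    }
  }

  // Assumed routing rule: non-negative hash of the key modulo the number of tables.
  int partitionOf(String key) {
    return Math.floorMod(key.hashCode(), tables.size());
  }

  // Inserts go to exactly one inner table, so loaders that own
  // different partitions never touch the same table.
  void put(String key, long value) {
    tables.get(partitionOf(key)).put(key, value);
  }

  // Lookups use the same routing rule, so callers see a single logical table.
  Long lookup(String key) {
    return tables.get(partitionOf(key)).get(key);
  }

  // Total number of entries across all inner tables.
  int size() {
    int n = 0;
    for (Map<String, Long> t : tables) {
      n += t.size();
    }
    return n;
  }
}
```

With a single inner table the container behaves like the plain hash table it wraps, which matches the "1 or more" wording in the description.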
[jira] [Resolved] (HIVE-25149) Support parallel load for Fast HT implementations
[ https://issues.apache.org/jira/browse/HIVE-25149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis resolved HIVE-25149. --- Resolution: Fixed > Support parallel load for Fast HT implementations > - > > Key: HIVE-25149 > URL: https://issues.apache.org/jira/browse/HIVE-25149 > Project: Hive > Issue Type: Sub-task >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 3h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (HIVE-25149) Support parallel load for Fast HT implementations
[ https://issues.apache.org/jira/browse/HIVE-25149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17494785#comment-17494785 ] Panagiotis Garefalakis commented on HIVE-25149: --- Resolved as part of [https://github.com/apache/hive/pull/3029] Thanks [~rameshkumar] for the review > Support parallel load for Fast HT implementations > - > > Key: HIVE-25149 > URL: https://issues.apache.org/jira/browse/HIVE-25149 > Project: Hive > Issue Type: Sub-task >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Time Spent: 3h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Work started] (HIVE-25583) Support parallel load for HastTables - Interfaces
[ https://issues.apache.org/jira/browse/HIVE-25583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-25583 started by Panagiotis Garefalakis. - > Support parallel load for HastTables - Interfaces > - > > Key: HIVE-25583 > URL: https://issues.apache.org/jira/browse/HIVE-25583 > Project: Hive > Issue Type: Sub-task >Reporter: Ramesh Kumar Thangarajan >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > Support parallel load for HastTables - Interfaces > * Introducing VectorMapJoinFastHashTableContainerBase class that implements > VectorMapJoinHashTable > * Each VectorMapJoinFastStringHashMapContainer is a singleton that contains > an array of HashTables (1 or more) > * VectorMapJoinFastTableContainer now initializes > VectorMapJoinFastHashTableContainers instead of HTs directly -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (HIVE-25583) Support parallel load for HastTables - Interfaces
[ https://issues.apache.org/jira/browse/HIVE-25583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17490416#comment-17490416 ] Panagiotis Garefalakis commented on HIVE-25583: --- Resolved via [https://github.com/apache/hive/pull/2999] Thanks [~rameshkumar] for the review! > Support parallel load for HastTables - Interfaces > - > > Key: HIVE-25583 > URL: https://issues.apache.org/jira/browse/HIVE-25583 > Project: Hive > Issue Type: Sub-task >Reporter: Ramesh Kumar Thangarajan >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > Support parallel load for HastTables - Interfaces > * Introducing VectorMapJoinFastHashTableContainerBase class that implements > VectorMapJoinHashTable > * Each VectorMapJoinFastStringHashMapContainer is a singleton that contains > an array of HashTables (1 or more) > * VectorMapJoinFastTableContainer now initializes > VectorMapJoinFastHashTableContainers instead of HTs directly -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Resolved] (HIVE-25583) Support parallel load for HastTables - Interfaces
[ https://issues.apache.org/jira/browse/HIVE-25583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis resolved HIVE-25583. --- Resolution: Fixed > Support parallel load for HastTables - Interfaces > - > > Key: HIVE-25583 > URL: https://issues.apache.org/jira/browse/HIVE-25583 > Project: Hive > Issue Type: Sub-task >Reporter: Ramesh Kumar Thangarajan >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > Support parallel load for HastTables - Interfaces > * Introducing VectorMapJoinFastHashTableContainerBase class that implements > VectorMapJoinHashTable > * Each VectorMapJoinFastStringHashMapContainer is a singleton that contains > an array of HashTables (1 or more) > * VectorMapJoinFastTableContainer now initializes > VectorMapJoinFastHashTableContainers instead of HTs directly -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HIVE-25583) Support parallel load for HastTables - Interfaces
[ https://issues.apache.org/jira/browse/HIVE-25583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-25583: -- Description: Support parallel load for HastTables - Interfaces * Introducing VectorMapJoinFastHashTableContainerBase class that implements VectorMapJoinHashTable * Each VectorMapJoinFastStringHashMapContainer is a singleton that contains an array of HashTables (1 or more) * VectorMapJoinFastTableContainer now initializes VectorMapJoinFastHashTableContainers instead of HTs directly > Support parallel load for HastTables - Interfaces > - > > Key: HIVE-25583 > URL: https://issues.apache.org/jira/browse/HIVE-25583 > Project: Hive > Issue Type: Sub-task >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > > Support parallel load for HastTables - Interfaces > * Introducing VectorMapJoinFastHashTableContainerBase class that implements > VectorMapJoinHashTable > * Each VectorMapJoinFastStringHashMapContainer is a singleton that contains > an array of HashTables (1 or more) > * VectorMapJoinFastTableContainer now initializes > VectorMapJoinFastHashTableContainers instead of HTs directly -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Assigned] (HIVE-25583) Support parallel load for HastTables - Interfaces
[ https://issues.apache.org/jira/browse/HIVE-25583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis reassigned HIVE-25583: - Assignee: Panagiotis Garefalakis (was: Ramesh Kumar Thangarajan) > Support parallel load for HastTables - Interfaces > - > > Key: HIVE-25583 > URL: https://issues.apache.org/jira/browse/HIVE-25583 > Project: Hive > Issue Type: Sub-task >Reporter: Ramesh Kumar Thangarajan >Assignee: Panagiotis Garefalakis >Priority: Major > > Support parallel load for HastTables - Interfaces > * Introducing VectorMapJoinFastHashTableContainerBase class that implements > VectorMapJoinHashTable > * Each VectorMapJoinFastStringHashMapContainer is a singleton that contains > an array of HashTables (1 or more) > * VectorMapJoinFastTableContainer now initializes > VectorMapJoinFastHashTableContainers instead of HTs directly -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Resolved] (HIVE-25828) Remove unused import and method in ParseUtils
[ https://issues.apache.org/jira/browse/HIVE-25828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis resolved HIVE-25828. --- Resolution: Fixed > Remove unused import and method in ParseUtils > - > > Key: HIVE-25828 > URL: https://issues.apache.org/jira/browse/HIVE-25828 > Project: Hive > Issue Type: Improvement >Affects Versions: 4.0.0 >Reporter: zhangbutao >Assignee: zhangbutao >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > (1) Remove unused import > (2) Remove unused method _sameTree(ASTNode node, ASTNode otherNode)_ -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (HIVE-25828) Remove unused import and method in ParseUtils
[ https://issues.apache.org/jira/browse/HIVE-25828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17479290#comment-17479290 ] Panagiotis Garefalakis commented on HIVE-25828: --- Resolved via [https://github.com/apache/hive/pull/2900] Thanks [~zhangbutao] for the patch! > Remove unused import and method in ParseUtils > - > > Key: HIVE-25828 > URL: https://issues.apache.org/jira/browse/HIVE-25828 > Project: Hive > Issue Type: Improvement >Affects Versions: 4.0.0 >Reporter: zhangbutao >Assignee: zhangbutao >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > (1) Remove unused import > (2) Remove unused method _sameTree(ASTNode node, ASTNode otherNode)_ -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HIVE-25828) Remove unused import and method in ParseUtils
[ https://issues.apache.org/jira/browse/HIVE-25828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-25828: -- Affects Version/s: 4.0.0 > Remove unused import and method in ParseUtils > - > > Key: HIVE-25828 > URL: https://issues.apache.org/jira/browse/HIVE-25828 > Project: Hive > Issue Type: Improvement >Affects Versions: 4.0.0 >Reporter: zhangbutao >Assignee: zhangbutao >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > (1) Remove unused import > (2) Remove unused method _sameTree(ASTNode node, ASTNode otherNode)_ -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Assigned] (HIVE-25145) Improve Multi-HashTable EstimatedMemorySize
[ https://issues.apache.org/jira/browse/HIVE-25145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis reassigned HIVE-25145: - Assignee: Panagiotis Garefalakis > Improve Multi-HashTable EstimatedMemorySize > --- > > Key: HIVE-25145 > URL: https://issues.apache.org/jira/browse/HIVE-25145 > Project: Hive > Issue Type: Sub-task >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > > When Multi HashTable is used for parallel HT loading, we calculate the > estimatedMemorySize as the sum of all HTs. > However, each of those HTs already adds some constants to memory estimation > e.g., adding 16KB constant memory for keyBinarySortableDeserializeRead > This ticket aims to improve the memory estimation for Multi HT -- This message was sent by Atlassian Jira (v8.20.1#820001)
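To make the over-counting described above concrete, here is a small arithmetic sketch. The ticket does not say how the estimate will be improved; counting the fixed overhead once is only one possible adjustment and is an assumption here, as are the names MemoryEstimateSketch, naiveEstimate and adjustedEstimate. The 16KB constant is the one quoted in the description.

```java
// Hypothetical illustration of the estimate; not code from Hive.
class MemoryEstimateSketch {
  // The 16KB constant mentioned in the ticket for keyBinarySortableDeserializeRead.
  static final long FIXED_OVERHEAD_BYTES = 16L * 1024L;

  // Current behaviour as described: every table reports payload plus the
  // fixed overhead, and the container sums the reports.
  static long naiveEstimate(long[] payloadBytes) {
    long total = 0;
    for (long p : payloadBytes) {
      total += p + FIXED_OVERHEAD_BYTES;
    }
    return total;
  }

  // One possible adjustment (assumption): count the fixed overhead once.
  static long adjustedEstimate(long[] payloadBytes) {
    long total = FIXED_OVERHEAD_BYTES;
    for (long p : payloadBytes) {
      total += p;
    }
    return total;
  }
}
```

For three tables with payloads of 1000, 2000 and 3000 bytes, the summed estimate is 6000 + 3 * 16384 = 55152 bytes, while counting the constant once gives 6000 + 16384 = 22384 bytes.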
[jira] [Assigned] (HIVE-25148) Support parallel load for Optimized HT implementations
[ https://issues.apache.org/jira/browse/HIVE-25148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis reassigned HIVE-25148: - Assignee: Panagiotis Garefalakis > Support parallel load for Optimized HT implementations > -- > > Key: HIVE-25148 > URL: https://issues.apache.org/jira/browse/HIVE-25148 > Project: Hive > Issue Type: Sub-task >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Assigned] (HIVE-25146) JMH tests for Multi HT and parallel load
[ https://issues.apache.org/jira/browse/HIVE-25146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis reassigned HIVE-25146: - Assignee: Panagiotis Garefalakis > JMH tests for Multi HT and parallel load > > > Key: HIVE-25146 > URL: https://issues.apache.org/jira/browse/HIVE-25146 > Project: Hive > Issue Type: Sub-task >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > > As the title suggests, add some benchmarks for Parallel HT construction > feature -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Assigned] (HIVE-25149) Support parallel load for Fast HT implementations
[ https://issues.apache.org/jira/browse/HIVE-25149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis reassigned HIVE-25149: - Assignee: Panagiotis Garefalakis > Support parallel load for Fast HT implementations > - > > Key: HIVE-25149 > URL: https://issues.apache.org/jira/browse/HIVE-25149 > Project: Hive > Issue Type: Sub-task >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Time Spent: 2h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Assigned] (HIVE-25736) Close ORC readers
[ https://issues.apache.org/jira/browse/HIVE-25736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis reassigned HIVE-25736: - Assignee: Peter Vary > Close ORC readers > - > > Key: HIVE-25736 > URL: https://issues.apache.org/jira/browse/HIVE-25736 > Project: Hive > Issue Type: Bug >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 1h 40m > Remaining Estimate: 0h > > After ORC-498 the Orc readers should be closed explicitly. One of the cases > was HIVE-25683, but there are several places where the ORC readers are still > not closed. > We should go through the code and make sure that the readers are closed. -- This message was sent by Atlassian Jira (v8.20.1#820001)
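The usual Java idiom for "make sure the readers are closed" is try-with-resources. The sketch below shows the pattern with a hypothetical TrackedReader class; it is not the ORC reader API, and the open-handle counter exists only to make the effect observable.

```java
// Hypothetical illustration of explicit closing; TrackedReader is NOT an ORC class.
class ReaderCloseSketch {
  static class TrackedReader implements AutoCloseable {
    static int open = 0;          // number of readers currently open

    TrackedReader() {
      open++;
    }

    int rowCount() {
      return 42;                  // placeholder value for the sketch
    }

    @Override
    public void close() {
      open--;
    }
  }

  // The reader is closed when the try block exits, on both the normal
  // path and the exceptional path.
  static int countRows() {
    try (TrackedReader reader = new TrackedReader()) {
      return reader.rowCount();
    }
  }
}
```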
[jira] [Updated] (HIVE-25541) JsonSerDe: TBLPROPERTY treating nested json as String
[ https://issues.apache.org/jira/browse/HIVE-25541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-25541: -- Description: Native Jsonserde 'org.apache.hive.hcatalog.data.JsonSerDe' currently does not support loading nested json into a string type directly. It requires declaring the column as a complex type (struct, map, array) to unpack nested json data. Even though the data field is not a valid JSON String type there is value in treating it as plain String instead of throwing an exception as we currently do. {code:java} create table json_table(data string, messageid string, publish_time bigint, attributes string); {"data":{"H":{"event":"track_active","platform":"Android"},"B":{"device_type":"Phone","uuid":"[36ffec24-f6a4-4f5d-aa39-72e5513d2cae,11883bee-a7aa-4010-8a66-6c3c63a73f16]"}},"messageId":"2475185636801962","publish_time":1622514629783,"attributes":{"region":"IN"}}"}} {code} This JIRA introduces an extra Table Property that allows complex JSON values to be stringified instead of forcing the user to define the complete nested structure was: Native Jsonserde 'org.apache.hive.hcatalog.data.JsonSerDe' currently does not support loading nested json directly into a string type. It requires declaring the column as a complex type (struct, map, array) to unpack nested json data. Even though the data field is not a valid JSON string type, it can be treated as a plain string instead of throwing an exception as we currently do. {code:java} create table json_table(data string, messageid string, publish_time bigint, attributes string); {"data":{"H":{"event":"track_active","platform":"Android"},"B":{"device_type":"Phone","uuid":"[36ffec24-f6a4-4f5d-aa39-72e5513d2cae,11883bee-a7aa-4010-8a66-6c3c63a73f16]"}},"messageId":"2475185636801962","publish_time":1622514629783,"attributes":{"region":"IN"}}"}} {code} This JIRA introduces an extra table property that allows complex JSON values to be stringified instead of forcing the user to define the complete nested structure > JsonSerDe: TBLPROPERTY treating nested json as String > - > > Key: HIVE-25541 > URL: https://issues.apache.org/jira/browse/HIVE-25541 > Project: Hive > Issue Type: Bug >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 2.5h > Remaining Estimate: 0h > > Native Jsonserde 'org.apache.hive.hcatalog.data.JsonSerDe' currently does not > support loading nested json into a string type directly. It requires > declaring the column as a complex type (struct, map, array) to unpack nested > json data. > Even though the data field is not a valid JSON String type there is value in > treating it as plain String instead of throwing an exception as we currently > do. > {code:java} > create table json_table(data string, messageid string, publish_time bigint, > attributes string); > {"data":{"H":{"event":"track_active","platform":"Android"},"B":{"device_type":"Phone","uuid":"[36ffec24-f6a4-4f5d-aa39-72e5513d2cae,11883bee-a7aa-4010-8a66-6c3c63a73f16]"}},"messageId":"2475185636801962","publish_time":1622514629783,"attributes":{"region":"IN"}}"}} > {code} > This JIRA introduces an extra Table Property that allows complex > JSON values to be stringified instead of forcing the user to define the complete nested > structure -- This message was sent by Atlassian Jira (v8.20.1#820001)
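As a picture of what "stringify complex JSON values" means for a string column, the sketch below pulls out the raw text of a nested value instead of parsing it into a struct. It is not how the SerDe does it: StringifySketch and rawValue are hypothetical names, the scan assumes compact JSON with no whitespace after the colon, and it takes the first textual match of the key, so it is only suitable as an illustration.

```java
// Hypothetical illustration; the actual SerDe relies on a JSON parser, not on this scan.
class StringifySketch {
  // Returns the raw JSON text of the value that follows "key": (first match),
  // or null if the key is absent. Nested objects and arrays are returned verbatim.
  static String rawValue(String json, String key) {
    String marker = "\"" + key + "\":";
    int found = json.indexOf(marker);
    if (found < 0) {
      return null;
    }
    int start = found + marker.length();
    int depth = 0;
    boolean inString = false;
    for (int j = start; j < json.length(); j++) {
      char c = json.charAt(j);
      if (inString) {
        if (c == '\\') {
          j++;                               // skip the escaped character
        } else if (c == '"') {
          inString = false;
        }
      } else if (c == '"') {
        inString = true;
      } else if (c == '{' || c == '[') {
        depth++;
      } else if (c == '}' || c == ']') {
        if (depth == 0) {
          return json.substring(start, j);   // closing brace of the enclosing object
        }
        depth--;
      } else if (c == ',' && depth == 0) {
        return json.substring(start, j);     // next sibling field begins
      }
    }
    return json.substring(start);
  }
}
```

Applied to a record like the one in the description, a lookup of the attributes field would hand back the text {"region":"IN"} as a plain string, which is the behaviour the new table property is meant to allow.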
[jira] [Updated] (HIVE-25497) Bump ORC to 1.7.1
[ https://issues.apache.org/jira/browse/HIVE-25497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-25497: -- Summary: Bump ORC to 1.7.1 (was: Bump ORC to 1.7.0) > Bump ORC to 1.7.1 > - > > Key: HIVE-25497 > URL: https://issues.apache.org/jira/browse/HIVE-25497 > Project: Hive > Issue Type: Improvement >Affects Versions: 4.0.0 >Reporter: William Hyun >Priority: Major > Labels: pull-request-available > Time Spent: 1h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Assigned] (HIVE-25497) Bump ORC to 1.7.1
[ https://issues.apache.org/jira/browse/HIVE-25497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis reassigned HIVE-25497: - Assignee: Panagiotis Garefalakis > Bump ORC to 1.7.1 > - > > Key: HIVE-25497 > URL: https://issues.apache.org/jira/browse/HIVE-25497 > Project: Hive > Issue Type: Improvement >Affects Versions: 4.0.0 >Reporter: William Hyun >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Time Spent: 1h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (HIVE-25765) skip.header.line.count property skips rows of each block in FetchOperator when file size is larger
[ https://issues.apache.org/jira/browse/HIVE-25765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17453204#comment-17453204 ] Panagiotis Garefalakis commented on HIVE-25765: --- Hey [~ganeshas] – thanks for reporting this! Is this bug also visible in the latest master branch? > skip.header.line.count property skips rows of each block in FetchOperator > when file size is larger > -- > > Key: HIVE-25765 > URL: https://issues.apache.org/jira/browse/HIVE-25765 > Project: Hive > Issue Type: Bug >Affects Versions: 3.1.2 >Reporter: Ganesha Shreedhara >Assignee: Ganesha Shreedhara >Priority: Major > Labels: pull-request-available > Attachments: data.txt.gz > > Time Spent: 20m > Remaining Estimate: 0h > > When the _skip.header.line.count_ property is set in table properties, simple > select queries that get converted into a FetchTask skip rows of each block > instead of skipping only the header lines of each file. This happens when the file > is large enough to be read in multiple blocks. This issue doesn't exist when the > select query is converted into a map-only job by setting > _hive.fetch.task.conversion_ to _none_, because there the header lines are skipped > only for the first block thanks to [this > check|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/HiveContextAwareRecordReader.java#L330]. > We should have a similar check in FetchOperator to avoid this issue. 
> > *Steps to reproduce:* > {code:java} > -- Create table on top of the data file (uncompressed size: ~239M) attached > in this ticket > CREATE EXTERNAL TABLE test_table( > col1 string, > col2 string, > col3 string, > col4 string, > col5 string, > col6 string, > col7 string, > col8 string, > col9 string, > col10 string, > col11 string, > col12 string) > ROW FORMAT SERDE > 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' > STORED AS INPUTFORMAT > 'org.apache.hadoop.mapred.TextInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' > LOCATION > 'location_of_data_file' > TBLPROPERTIES ('skip.header.line.count'='1'); > -- Counting number of rows gives correct result with only one header line > skipped > select count(*) from test_table; > 3145727 > -- Select query skips more rows and the result depends upon the number of > blocks configured in underlying filesystem. 3 rows are skipped when the file > is read in 3 blocks. > select * from test_table; > . > . > Fetched 3145724 rows > {code} -- This message was sent by Atlassian Jira (v8.20.1#820001)
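The row counts in the reproduction above (3145727 versus 3145724 with 3 blocks) follow from skipping the header once per block instead of once per file. The sketch below simulates that with an in-memory list of lines; HeaderSkipSketch and its read method are hypothetical, and the "only skip when the block starts at offset 0" rule is a simplified stand-in for the check the ticket links to.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation of block-wise reading; not Hive's FetchOperator.
class HeaderSkipSketch {
  // Reads `lines` in blocks of `blockSize` lines. When onlyFirstBlock is false,
  // headerCount lines are dropped at the start of every block (the reported
  // behaviour); when true, they are dropped only in the block that starts at 0.
  static List<String> read(List<String> lines, int blockSize, int headerCount,
                           boolean onlyFirstBlock) {
    List<String> out = new ArrayList<>();
    for (int start = 0; start < lines.size(); start += blockSize) {
      int end = Math.min(start + blockSize, lines.size());
      boolean skipHere = !onlyFirstBlock || start == 0;
      int from = skipHere ? Math.min(start + headerCount, end) : start;
      out.addAll(lines.subList(from, end));
    }
    return out;
  }
}
```

With one header line, eight data rows and blocks of three lines, the per-block variant returns six rows and the first-block-only variant returns all eight, the same off-by-(blocks minus one) pattern as in the report.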
[jira] [Comment Edited] (HIVE-25765) skip.header.line.count property skips rows of each block in FetchOperator when file size is larger
[ https://issues.apache.org/jira/browse/HIVE-25765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17453204#comment-17453204 ] Panagiotis Garefalakis edited comment on HIVE-25765 at 12/3/21, 8:51 PM: - Hey [~ganeshas] – thanks for reporting this! Is this also reproducible in the latest master branch? was (Author: pgaref): Hey [~ganeshas] – thanks for reporting this! Is this bug also visible in the latest master branch? > skip.header.line.count property skips rows of each block in FetchOperator > when file size is larger > -- > > Key: HIVE-25765 > URL: https://issues.apache.org/jira/browse/HIVE-25765 > Project: Hive > Issue Type: Bug >Affects Versions: 3.1.2 >Reporter: Ganesha Shreedhara >Assignee: Ganesha Shreedhara >Priority: Major > Labels: pull-request-available > Attachments: data.txt.gz > > Time Spent: 20m > Remaining Estimate: 0h > > When the _skip.header.line.count_ property is set in table properties, simple > select queries that get converted into a FetchTask skip rows of each block > instead of skipping only the header lines of each file. This happens when the file > is large enough to be read in multiple blocks. This issue doesn't exist when the > select query is converted into a map-only job by setting > _hive.fetch.task.conversion_ to _none_, because there the header lines are skipped > only for the first block thanks to [this > check|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/HiveContextAwareRecordReader.java#L330]. > We should have a similar check in FetchOperator to avoid this issue. 
> > *Steps to reproduce:* > {code:java} > -- Create table on top of the data file (uncompressed size: ~239M) attached > in this ticket > CREATE EXTERNAL TABLE test_table( > col1 string, > col2 string, > col3 string, > col4 string, > col5 string, > col6 string, > col7 string, > col8 string, > col9 string, > col10 string, > col11 string, > col12 string) > ROW FORMAT SERDE > 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' > STORED AS INPUTFORMAT > 'org.apache.hadoop.mapred.TextInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' > LOCATION > 'location_of_data_file' > TBLPROPERTIES ('skip.header.line.count'='1'); > -- Counting number of rows gives correct result with only one header line > skipped > select count(*) from test_table; > 3145727 > -- Select query skips more rows and the result depends upon the number of > blocks configured in underlying filesystem. 3 rows are skipped when the file > is read in 3 blocks. > select * from test_table; > . > . > Fetched 3145724 rows > {code} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (HIVE-25697) Upgrade commons-compress to 1.21
[ https://issues.apache.org/jira/browse/HIVE-25697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17444055#comment-17444055 ] Panagiotis Garefalakis commented on HIVE-25697: --- Thanks for taking care of this [~kgyrtkirk] and for the patch [~rameshkumar] ! > Upgrade commons-compress to 1.21 > > > Key: HIVE-25697 > URL: https://issues.apache.org/jira/browse/HIVE-25697 > Project: Hive > Issue Type: Task >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 50m > Remaining Estimate: 0h > > Upgrade commons-compress to 1.21 due to CVEs -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HIVE-25541) JsonSerDe: TBLPROPERTY treating nested json as String
[ https://issues.apache.org/jira/browse/HIVE-25541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-25541: -- Fix Version/s: 4.0.0 > JsonSerDe: TBLPROPERTY treating nested json as String > - > > Key: HIVE-25541 > URL: https://issues.apache.org/jira/browse/HIVE-25541 > Project: Hive > Issue Type: Bug >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 2.5h > Remaining Estimate: 0h > > Native Jsonserde 'org.apache.hive.hcatalog.data.JsonSerDe' currently does not > support loading nested json into a string type directly. It requires > declaring the column as a complex type (struct, map, array) to unpack nested > json data. > Even though the data field is not a valid JSON String type there is value in > treating it as plain String instead of throwing an exception as we currently > do. > {code:java} > create table json_table(data string, messageid string, publish_time bigint, > attributes string); > {"data":{"H":{"event":"track_active","platform":"Android"},"B":{"device_type":"Phone","uuid":"[36ffec24-f6a4-4f5d-aa39-72e5513d2cae,11883bee-a7aa-4010-8a66-6c3c63a73f16]"}},"messageId":"2475185636801962","publish_time":1622514629783,"attributes":{"region":"IN"}}"}} > {code} > This JIRA introduces an extra Table Property that allows complex > JSON values to be stringified instead of forcing the user to define the complete nested > structure -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-25541) JsonSerDe: TBLPROPERTY treating nested json as String
[ https://issues.apache.org/jira/browse/HIVE-25541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17432323#comment-17432323 ] Panagiotis Garefalakis commented on HIVE-25541: --- Resolved via https://github.com/apache/hive/pull/2664 > JsonSerDe: TBLPROPERTY treating nested json as String > - > > Key: HIVE-25541 > URL: https://issues.apache.org/jira/browse/HIVE-25541 > Project: Hive > Issue Type: Bug >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Time Spent: 2.5h > Remaining Estimate: 0h > > Native Jsonserde 'org.apache.hive.hcatalog.data.JsonSerDe' currently does not > support loading nested json into a string type directly. It requires > declaring the column as a complex type (struct, map, array) to unpack nested > json data. > Even though the data field is not a valid JSON String type there is value in > treating it as plain String instead of throwing an exception as we currently > do. > {code:java} > create table json_table(data string, messageid string, publish_time bigint, > attributes string); > {"data":{"H":{"event":"track_active","platform":"Android"},"B":{"device_type":"Phone","uuid":"[36ffec24-f6a4-4f5d-aa39-72e5513d2cae,11883bee-a7aa-4010-8a66-6c3c63a73f16]"}},"messageId":"2475185636801962","publish_time":1622514629783,"attributes":{"region":"IN"}}"}} > {code} > This JIRA introduces an extra Table Property that allows complex > JSON values to be stringified instead of forcing the user to define the complete nested > structure -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-25541) JsonSerDe: TBLPROPERTY treating nested json as String
[ https://issues.apache.org/jira/browse/HIVE-25541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis resolved HIVE-25541. --- Resolution: Fixed > JsonSerDe: TBLPROPERTY treating nested json as String > - > > Key: HIVE-25541 > URL: https://issues.apache.org/jira/browse/HIVE-25541 > Project: Hive > Issue Type: Bug >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 2.5h > Remaining Estimate: 0h > > Native JsonSerDe 'org.apache.hive.hcatalog.data.JsonSerDe' currently does not > support loading nested json into a string type directly. It requires > declaring the column as a complex type (struct, map, array) to unpack nested > json data. > Even though the data field is not a valid JSON String type there is value > in treating it as a plain String instead of throwing an exception as we currently > do. > {code:java} > create table json_table(data string, messageid string, publish_time bigint, > attributes string); > {"data":{"H":{"event":"track_active","platform":"Android"},"B":{"device_type":"Phone","uuid":"[36ffec24-f6a4-4f5d-aa39-72e5513d2cae,11883bee-a7aa-4010-8a66-6c3c63a73f16]"}},"messageId":"2475185636801962","publish_time":1622514629783,"attributes":{"region":"IN"}}"}} > {code} > This JIRA introduces an extra Table Property that allows stringifying complex > JSON values instead of forcing the user to define the complete nested > structure -- This message was sent by Atlassian Jira (v8.3.4#803005)
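Editor's sketch for HIVE-25541: the idea of keeping a nested JSON value as its raw text, rather than unpacking it into a struct, can be illustrated with a small brace-matching scanner. This is not the JsonSerDe code and it does not use the new table property (whose name is not given in these messages); the class RawJsonFields and both of its methods are hypothetical, and the scanner assumes well-formed input and keys without escaped quotes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not part of Hive): maps each top-level JSON key to the raw text of its value. */
final class RawJsonFields {

    /** Returns the index just past the JSON value that begins at {@code start}. */
    static int endOfValue(String json, int start) {
        int depth = 0;
        boolean inString = false;
        for (int i = start; i < json.length(); i++) {
            char c = json.charAt(i);
            if (inString) {
                if (c == '\\') {
                    i++;                      // skip the escaped character
                } else if (c == '"') {
                    inString = false;
                    if (depth == 0) {
                        return i + 1;         // a plain string value just ended
                    }
                }
            } else if (c == '"') {
                inString = true;
            } else if (c == '{' || c == '[') {
                depth++;
            } else if (c == '}' || c == ']') {
                if (depth == 0) {
                    return i;                 // closer of the enclosing object: a scalar ended
                }
                depth--;
                if (depth == 0) {
                    return i + 1;             // the nested object or array is complete
                }
            } else if (c == ',' && depth == 0) {
                return i;                     // a scalar ended at the field separator
            }
        }
        return json.length();
    }

    /** Splits one JSON object into key -> raw value text; nested values stay unparsed strings. */
    static Map<String, String> topLevelFields(String json) {
        Map<String, String> fields = new LinkedHashMap<>();
        int i = json.indexOf('{') + 1;
        while (i < json.length()) {
            int keyStart = json.indexOf('"', i);
            if (keyStart < 0) {
                break;
            }
            int keyEnd = json.indexOf('"', keyStart + 1);
            int colon = json.indexOf(':', keyEnd);
            if (keyEnd < 0 || colon < 0) {
                break;
            }
            int valueStart = colon + 1;
            while (valueStart < json.length() && Character.isWhitespace(json.charAt(valueStart))) {
                valueStart++;
            }
            int valueEnd = endOfValue(json, valueStart);
            fields.put(json.substring(keyStart + 1, keyEnd), json.substring(valueStart, valueEnd).trim());
            i = valueEnd;
        }
        return fields;
    }
}
```

With a row shaped like the one in the description, the "data" and "attributes" keys would map to their nested JSON text unchanged, which is the kind of value a string column could then hold. Plain string values keep their surrounding quotes in this sketch.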
[jira] [Resolved] (HIVE-25505) Incorrect results with header. skip.header.line.count if first line is blank
[ https://issues.apache.org/jira/browse/HIVE-25505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis resolved HIVE-25505. --- Resolution: Fixed > Incorrect results with header. skip.header.line.count if first line is blank > > > Key: HIVE-25505 > URL: https://issues.apache.org/jira/browse/HIVE-25505 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Steve Carlin >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 1h 50m > Remaining Estimate: 0h > > A table with header. skip.header.line.count=1 does not skip the first line if > it is blank, except in a fetch task. > To reproduce, create a csv table, and set header. skip.header.line.count=1 in > table properties. > In the table location, create a single file, with a blank (empty) first line, > and say 2 further lines. > If you do a select * on it, you see 2 rows (correct) > If you do select count(*) on it, you get 3 (incorrect) > {code:java} > CREATE EXTERNAL TABLE `testcase1`(id int, name string) ROW FORMAT SERDE > 'org.apache.hadoop.hive.serde2.OpenCSVSerde' > LOCATION '${system:test.tmp.dir}/testcase1' > TBLPROPERTIES ("skip.header.line.count"="1"); > SET hive.fetch.task.conversion = more; > select * from testcase1; > select count(*) from testcase1; > set hive.fetch.task.conversion=none; > select * from testcase1; > select count(*) from testcase1; > Test file: > 1,2019-12-31 > 2,2019-12-31 > 3,2019-12-31 > Should both yield (with the above test file): > A masked pattern was here > 1 2019-12-31 > 2 2019-12-31 > 3 2019-12-31 > 3 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25505) Incorrect results with header. skip.header.line.count if first line is blank
[ https://issues.apache.org/jira/browse/HIVE-25505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-25505: -- Fix Version/s: 4.0.0 > Incorrect results with header. skip.header.line.count if first line is blank > > > Key: HIVE-25505 > URL: https://issues.apache.org/jira/browse/HIVE-25505 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Steve Carlin >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 1h 50m > Remaining Estimate: 0h > > A table with header. skip.header.line.count=1 does not skip the first line if > it is blank, except in a fetch task. > To reproduce, create a csv table, and set header. skip.header.line.count=1 in > table properties. > In the table location, create a single file, with a blank (empty) first line, > and say 2 further lines. > If you do a select * on it, you see 2 rows (correct) > If you do select count(*) on it, you get 3 (incorrect) > {code:java} > CREATE EXTERNAL TABLE `testcase1`(id int, name string) ROW FORMAT SERDE > 'org.apache.hadoop.hive.serde2.OpenCSVSerde' > LOCATION '${system:test.tmp.dir}/testcase1' > TBLPROPERTIES ("skip.header.line.count"="1"); > SET hive.fetch.task.conversion = more; > select * from testcase1; > select count(*) from testcase1; > set hive.fetch.task.conversion=none; > select * from testcase1; > select count(*) from testcase1; > Test file: > 1,2019-12-31 > 2,2019-12-31 > 3,2019-12-31 > Should both yield (with the above test file): > A masked pattern was here > 1 2019-12-31 > 2 2019-12-31 > 3 2019-12-31 > 3 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-25505) Incorrect results with header. skip.header.line.count if first line is blank
[ https://issues.apache.org/jira/browse/HIVE-25505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17431159#comment-17431159 ] Panagiotis Garefalakis commented on HIVE-25505: --- Resolved via [https://github.com/apache/hive/pull/2717] Thanks [~abstractdog] for the review! > Incorrect results with header. skip.header.line.count if first line is blank > > > Key: HIVE-25505 > URL: https://issues.apache.org/jira/browse/HIVE-25505 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Steve Carlin >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Time Spent: 1h 50m > Remaining Estimate: 0h > > A table with header. skip.header.line.count=1 does not skip the first line if > it is blank, except in a fetch task. > To reproduce, create a csv table, and set header. skip.header.line.count=1 in > table properties. > In the table location, create a single file, with a blank (empty) first line, > and say 2 further lines. > If you do a select * on it, you see 2 rows (correct) > If you do select count(*) on it, you get 3 (incorrect) > {code:java} > CREATE EXTERNAL TABLE `testcase1`(id int, name string) ROW FORMAT SERDE > 'org.apache.hadoop.hive.serde2.OpenCSVSerde' > LOCATION '${system:test.tmp.dir}/testcase1' > TBLPROPERTIES ("skip.header.line.count"="1"); > SET hive.fetch.task.conversion = more; > select * from testcase1; > select count(*) from testcase1; > set hive.fetch.task.conversion=none; > select * from testcase1; > select count(*) from testcase1; > Test file: > 1,2019-12-31 > 2,2019-12-31 > 3,2019-12-31 > Should both yield (with the above test file): > A masked pattern was here > 1 2019-12-31 > 2 2019-12-31 > 3 2019-12-31 > 3 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (HIVE-25505) Incorrect results with header. skip.header.line.count if first line is blank
[ https://issues.apache.org/jira/browse/HIVE-25505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17426212#comment-17426212 ] Panagiotis Garefalakis edited comment on HIVE-25505 at 10/8/21, 8:39 PM: - I have a repro for this – updating description and assigning to myself was (Author: pgaref): I have a repro for this -- updating description and assigned to myself > Incorrect results with header. skip.header.line.count if first line is blank > > > Key: HIVE-25505 > URL: https://issues.apache.org/jira/browse/HIVE-25505 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Steve Carlin >Assignee: Panagiotis Garefalakis >Priority: Major > > A table with header. skip.header.line.count=1 does not skip the first line if > it is blank, except in a fetch task. > To reproduce, create a csv table, and set header. skip.header.line.count=1 in > table properties. > In the table location, create a single file, with a blank (empty) first line, > and say 2 further lines. > If you do a select * on it, you see 2 rows (correct) > If you do select count(*) on it, you get 3 (incorrect) > {code:java} > CREATE EXTERNAL TABLE `testcase1`(id int, name string) ROW FORMAT SERDE > 'org.apache.hadoop.hive.serde2.OpenCSVSerde' > LOCATION '${system:test.tmp.dir}/testcase1' > TBLPROPERTIES ("skip.header.line.count"="1"); > SET hive.fetch.task.conversion = more; > select * from testcase1; > select count(*) from testcase1; > set hive.fetch.task.conversion=none; > select * from testcase1; > select count(*) from testcase1; > Test file: > 1,2019-12-31 > 2,2019-12-31 > 3,2019-12-31 > Should both yield (with the above test file): > A masked pattern was here > 1 2019-12-31 > 2 2019-12-31 > 3 2019-12-31 > 3 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
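Editor's sketch for HIVE-25505: the expectation in the description is that a header line is skipped by position, whether or not it is blank, and that select * and count(*) agree. The class and method names below are hypothetical and this is not the Hive reader code; it only models a single file as a list of lines so that both operations share one skip rule.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical model of skip.header.line.count semantics for one file; not Hive's record reader. */
final class HeaderSkipSketch {

    /** Rows a "select *" would return: the first headerCount lines are dropped by position, blank or not. */
    static List<String> selectRows(List<String> fileLines, int headerCount) {
        return fileLines.stream().skip(headerCount).collect(Collectors.toList());
    }

    /** Row count a "select count(*)" would return; built on selectRows so the two cannot disagree. */
    static long countRows(List<String> fileLines, int headerCount) {
        return selectRows(fileLines, headerCount).size();
    }
}
```

For a file whose first line is empty and which has two further lines, both methods see two rows when headerCount is 1, matching the result the report marks as correct.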
[jira] [Resolved] (HIVE-25521) Data corruption when concatenating files with different compressions in same table/partition
[ https://issues.apache.org/jira/browse/HIVE-25521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis resolved HIVE-25521. --- Resolution: Fixed > Data corruption when concatenating files with different compressions in same > table/partition > > > Key: HIVE-25521 > URL: https://issues.apache.org/jira/browse/HIVE-25521 > Project: Hive > Issue Type: Bug >Reporter: Harish JP >Assignee: Harish JP >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > Currently, if files with different compressions are in the same directory, then > concatenate can fail and cause data corruption. This happens because a file can > be moved by one task as an incompatible file, and the other tasks will fail after > this. > > This issue is addressed in this Jira by processing a file only in the one task > where offset 0 is processed and ignoring the file in all other tasks. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-25521) Data corruption when concatenating files with different compressions in same table/partition
[ https://issues.apache.org/jira/browse/HIVE-25521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17426356#comment-17426356 ] Panagiotis Garefalakis commented on HIVE-25521: --- Resolved via https://github.com/apache/hive/pull/2639 > Data corruption when concatenating files with different compressions in same > table/partition > > > Key: HIVE-25521 > URL: https://issues.apache.org/jira/browse/HIVE-25521 > Project: Hive > Issue Type: Bug >Reporter: Harish JP >Assignee: Harish JP >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > Currently, if files with different compressions are in the same directory, then > concatenate can fail and cause data corruption. This happens because a file can > be moved by one task as an incompatible file, and the other tasks will fail after > this. > > This issue is addressed in this Jira by processing a file only in the one task > where offset 0 is processed and ignoring the file in all other tasks. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25521) Data corruption when concatenating files with different compressions in same table/partition
[ https://issues.apache.org/jira/browse/HIVE-25521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-25521: -- Fix Version/s: 4.0.0 > Data corruption when concatenating files with different compressions in same > table/partition > > > Key: HIVE-25521 > URL: https://issues.apache.org/jira/browse/HIVE-25521 > Project: Hive > Issue Type: Bug >Reporter: Harish JP >Assignee: Harish JP >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > Currently, if files with different compressions are in the same directory, then > concatenate can fail and cause data corruption. This happens because a file can > be moved by one task as an incompatible file, and the other tasks will fail after > this. > > This issue is addressed in this Jira by processing a file only in the one task > where offset 0 is processed and ignoring the file in all other tasks. -- This message was sent by Atlassian Jira (v8.3.4#803005)
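Editor's sketch for HIVE-25521: the rule described above (only the task that processes offset 0 of a file handles it, all other tasks ignore it) can be modelled as an ownership check on the split's start offset. The Split type, the method names and the file names are hypothetical, and the sketch assumes that exactly one split per file starts at offset 0.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of "one task per file" ownership; not the Hive merge/concatenate code. */
final class ConcatSplitSketch {

    /** A byte range of one file, as seen by a single task. */
    static final class Split {
        final String path;
        final long start;

        Split(String path, long start) {
            this.path = path;
            this.start = start;
        }
    }

    /** Only the split starting at offset 0 owns the file, so at most one task acts on it. */
    static boolean ownsFile(Split split) {
        return split.start == 0L;
    }

    /** Paths that get handled, each exactly once, given all the splits handed out to tasks. */
    static List<String> filesHandledOnce(List<Split> splits) {
        List<String> handled = new ArrayList<>();
        for (Split split : splits) {
            if (ownsFile(split)) {
                handled.add(split.path);
            }
        }
        return handled;
    }
}
```

Because the tasks for the non-zero offsets skip the file, none of them can find it already moved by another task, which is the failure mode the description attributes the corruption to.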
[jira] [Updated] (HIVE-25505) Incorrect results with header. skip.header.line.count if first line is blank
[ https://issues.apache.org/jira/browse/HIVE-25505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-25505: -- Description: A table with header. skip.header.line.count=1 does not skip the first line if it is blank, except in a fetch task. To reproduce, create a csv table, and set header. skip.header.line.count=1 in table properties. In the table location, create a single file, with a blank (empty) first line, and say 2 further lines. If you do a select * on it, you see 2 rows (correct) If you do select count(*) on it, you get 3 (incorrect) {code:java} CREATE EXTERNAL TABLE `testcase1`(id int, name string) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde' LOCATION '${system:test.tmp.dir}/testcase1' TBLPROPERTIES ("skip.header.line.count"="1"); SET hive.fetch.task.conversion = more; select * from testcase1; select count(*) from testcase1; set hive.fetch.task.conversion=none; select * from testcase1; select count(*) from testcase1; Test file: 1,2019-12-31 2,2019-12-31 3,2019-12-31 Should both yield (with the above test file): A masked pattern was here 1 2019-12-31 2 2019-12-31 3 2019-12-31 3 {code} was: A table with header. skip.header.line.count=1 does not skip the first line if it is blank, except in a fetch task. To reproduce, create a csv table, and set header. skip.header.line.count=1 in table properties. In the table location, create a single file, with a blank (empty) first line, and say 2 further lines. If you do a select * on it, you see 2 rows (correct) If you do select count(\*) on it, you get 3 (incorrect) {code:java} // Some comments here public String getFoo() { return foo; } {code} > Incorrect results with header. skip.header.line.count if first line is blank > > > Key: HIVE-25505 > URL: https://issues.apache.org/jira/browse/HIVE-25505 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Steve Carlin >Assignee: Panagiotis Garefalakis >Priority: Major > > A table with header. skip.header.line.count=1 does not skip the first line if > it is blank, except in a fetch task. > To reproduce, create a csv table, and set header. skip.header.line.count=1 in > table properties. > In the table location, create a single file, with a blank (empty) first line, > and say 2 further lines. > If you do a select * on it, you see 2 rows (correct) > If you do select count(*) on it, you get 3 (incorrect) > {code:java} > CREATE EXTERNAL TABLE `testcase1`(id int, name string) ROW FORMAT SERDE > 'org.apache.hadoop.hive.serde2.OpenCSVSerde' > LOCATION '${system:test.tmp.dir}/testcase1' > TBLPROPERTIES ("skip.header.line.count"="1"); > SET hive.fetch.task.conversion = more; > select * from testcase1; > select count(*) from testcase1; > set hive.fetch.task.conversion=none; > select * from testcase1; > select count(*) from testcase1; > Test file: > 1,2019-12-31 > 2,2019-12-31 > 3,2019-12-31 > Should both yield (with the above test file): > A masked pattern was here > 1 2019-12-31 > 2 2019-12-31 > 3 2019-12-31 > 3 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-25505) Incorrect results with header. skip.header.line.count if first line is blank
[ https://issues.apache.org/jira/browse/HIVE-25505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis reassigned HIVE-25505: - Assignee: Panagiotis Garefalakis > Incorrect results with header. skip.header.line.count if first line is blank > > > Key: HIVE-25505 > URL: https://issues.apache.org/jira/browse/HIVE-25505 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Steve Carlin >Assignee: Panagiotis Garefalakis >Priority: Major > > A table with header. skip.header.line.count=1 does not skip the first line if > it is blank, except in a fetch task. > To reproduce, create a csv table, and set header. skip.header.line.count=1 in > table properties. > In the table location, create a single file, with a blank (empty) first line, > and say 2 further lines. > If you do a select * on it, you see 2 rows (correct) > If you do select count(\*) on it, you get 3 (incorrect) > {code:java} > // Some comments here > public String getFoo() > { > return foo; > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25505) Incorrect results with header. skip.header.line.count if first line is blank
[ https://issues.apache.org/jira/browse/HIVE-25505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-25505: -- Description: A table with header. skip.header.line.count=1 does not skip the first line if it is blank, except in a fetch task. To reproduce, create a csv table, and set header. skip.header.line.count=1 in table properties. In the table location, create a single file, with a blank (empty) first line, and say 2 further lines. If you do a select * on it, you see 2 rows (correct) If you do select count(\*) on it, you get 3 (incorrect) {code:java} // Some comments here public String getFoo() { return foo; } {code} was: A table with header. skip.header.line.count=1 does not skip the first line if it is blank, except in a fetch task. To reproduce, create a csv table, and set header. skip.header.line.count=1 in table properties. In the table location, create a single file, with a blank (empty) first line, and say 2 further lines. If you do a select * on it, you see 2 rows (correct) If you do select count(\*) on it, you get 3 (incorrect) > Incorrect results with header. skip.header.line.count if first line is blank > > > Key: HIVE-25505 > URL: https://issues.apache.org/jira/browse/HIVE-25505 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Steve Carlin >Priority: Major > > A table with header. skip.header.line.count=1 does not skip the first line if > it is blank, except in a fetch task. > To reproduce, create a csv table, and set header. skip.header.line.count=1 in > table properties. > In the table location, create a single file, with a blank (empty) first line, > and say 2 further lines. > If you do a select * on it, you see 2 rows (correct) > If you do select count(\*) on it, you get 3 (incorrect) > {code:java} > // Some comments here > public String getFoo() > { > return foo; > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-25505) Incorrect results with header. skip.header.line.count if first line is blank
[ https://issues.apache.org/jira/browse/HIVE-25505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17426212#comment-17426212 ] Panagiotis Garefalakis commented on HIVE-25505: --- I have a repro for this -- updating description and assigned to myself > Incorrect results with header. skip.header.line.count if first line is blank > > > Key: HIVE-25505 > URL: https://issues.apache.org/jira/browse/HIVE-25505 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Steve Carlin >Priority: Major > > A table with header. skip.header.line.count=1 does not skip the first line if > it is blank, except in a fetch task. > To reproduce, create a csv table, and set header. skip.header.line.count=1 in > table properties. > In the table location, create a single file, with a blank (empty) first line, > and say 2 further lines. > If you do a select * on it, you see 2 rows (correct) > If you do select count(\*) on it, you get 3 (incorrect) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25362) LLAP: ensure tasks with locality have a chance to adjust delay
[ https://issues.apache.org/jira/browse/HIVE-25362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-25362: -- Fix Version/s: 4.0.0 > LLAP: ensure tasks with locality have a chance to adjust delay > -- > > Key: HIVE-25362 > URL: https://issues.apache.org/jira/browse/HIVE-25362 > Project: Hive > Issue Type: Sub-task > Components: llap >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 1.5h > Remaining Estimate: 0h > > HIVE-24914 introduced a short-circuit optimization when all nodes are busy, > returning DELAYED_RESOURCES and resetting the locality delay for a given task. > However, this may prevent tasks from adjusting their locality delay and being > added to the DelayQueue, sometimes leading to missed locality chances when all > LLAP resources are fully utilized. > To address the issue we should handle the two cases separately. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-25362) LLAP: ensure tasks with locality have a chance to adjust delay
[ https://issues.apache.org/jira/browse/HIVE-25362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis resolved HIVE-25362. --- Resolution: Fixed > LLAP: ensure tasks with locality have a chance to adjust delay > -- > > Key: HIVE-25362 > URL: https://issues.apache.org/jira/browse/HIVE-25362 > Project: Hive > Issue Type: Sub-task > Components: llap >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 1.5h > Remaining Estimate: 0h > > HIVE-24914 introduced a short-circuit optimization when all nodes are busy, > returning DELAYED_RESOURCES and resetting the locality delay for a given task. > However, this may prevent tasks from adjusting their locality delay and being > added to the DelayQueue, sometimes leading to missed locality chances when all > LLAP resources are fully utilized. > To address the issue we should handle the two cases separately. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-25362) LLAP: ensure tasks with locality have a chance to adjust delay
[ https://issues.apache.org/jira/browse/HIVE-25362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17426197#comment-17426197 ] Panagiotis Garefalakis commented on HIVE-25362: --- Resolved via https://github.com/apache/hive/pull/2513 > LLAP: ensure tasks with locality have a chance to adjust delay > -- > > Key: HIVE-25362 > URL: https://issues.apache.org/jira/browse/HIVE-25362 > Project: Hive > Issue Type: Sub-task > Components: llap >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 1.5h > Remaining Estimate: 0h > > HIVE-24914 introduced a short-circuit optimization when all nodes are busy, > returning DELAYED_RESOURCES and resetting the locality delay for a given task. > However, this may prevent tasks from adjusting their locality delay and being > added to the DelayQueue, sometimes leading to missed locality chances when all > LLAP resources are fully utilized. > To address the issue we should handle the two cases separately. -- This message was sent by Atlassian Jira (v8.3.4#803005)
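Editor's sketch for HIVE-25362: one possible reading of "handle the two cases separately" is that, when all nodes are busy, a task with a locality preference still has its locality delay set and is placed on the delay queue, while a task without one simply short-circuits. The code below is a toy model under that assumption and is not the LLAP scheduler. LocalityDelaySketch, Task, Result and trySchedule are hypothetical names; only DELAYED_RESOURCES and the use of a DelayQueue come from the description.

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

/** Hypothetical toy scheduler; illustrates one reading of the fix, not the LLAP implementation. */
final class LocalityDelaySketch {

    enum Result { SCHEDULED, DELAYED_RESOURCES }

    static final class Task implements Delayed {
        final String name;
        final boolean wantsLocality;
        private boolean delaySet;
        private long deadlineNanos;

        Task(String name, boolean wantsLocality) {
            this.name = name;
            this.wantsLocality = wantsLocality;
        }

        /** Sets the locality deadline once; later calls keep the original deadline. */
        void adjustLocalityDelay(long delayMillis) {
            if (!delaySet) {
                deadlineNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(delayMillis);
                delaySet = true;
            }
        }

        @Override
        public long getDelay(TimeUnit unit) {
            return unit.convert(deadlineNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
        }

        @Override
        public int compareTo(Delayed other) {
            return Long.compare(getDelay(TimeUnit.NANOSECONDS), other.getDelay(TimeUnit.NANOSECONDS));
        }
    }

    final DelayQueue<Task> delayedTasks = new DelayQueue<>();

    Result trySchedule(Task task, boolean allNodesBusy, long localityDelayMillis) {
        if (!allNodesBusy) {
            return Result.SCHEDULED;
        }
        if (task.wantsLocality) {
            // Case 1: keep the locality bookkeeping even though no node is free right now.
            task.adjustLocalityDelay(localityDelayMillis);
            if (!delayedTasks.contains(task)) {
                delayedTasks.add(task);
            }
        }
        // Case 2 (and the outcome of case 1): no capacity, so report the resource delay.
        return Result.DELAYED_RESOURCES;
    }
}
```

The point of the split is that the short-circuit still saves work for tasks without locality, while tasks with locality remain in the queue and can be reconsidered once their delay expires.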
[jira] [Commented] (HIVE-25599) Addendum HIVE-25570 Hive should send full URL path for authorization for the command insert overwrite location
[ https://issues.apache.org/jira/browse/HIVE-25599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17426052#comment-17426052 ] Panagiotis Garefalakis commented on HIVE-25599: --- Already resolved via https://github.com/apache/hive/commit/988be055289becbfc37b17264edafeca3edefbec > Addendum HIVE-25570 Hive should send full URL path for authorization for the > command insert overwrite location > -- > > Key: HIVE-25599 > URL: https://issues.apache.org/jira/browse/HIVE-25599 > Project: Hive > Issue Type: Bug >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-25599) Addendum HIVE-25570 Hive should send full URL path for authorization for the command insert overwrite location
[ https://issues.apache.org/jira/browse/HIVE-25599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis resolved HIVE-25599. --- Resolution: Duplicate > Addendum HIVE-25570 Hive should send full URL path for authorization for the > command insert overwrite location > -- > > Key: HIVE-25599 > URL: https://issues.apache.org/jira/browse/HIVE-25599 > Project: Hive > Issue Type: Bug >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25599) Addendum HIVE-25570 Hive should send full URL path for authorization for the command insert overwrite location
[ https://issues.apache.org/jira/browse/HIVE-25599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-25599: -- Summary: Addendum HIVE-25570 Hive should send full URL path for authorization for the command insert overwrite location (was: Addendum of HIVE-25570 Hive should send full URL path for authorization for the command insert overwrite location) > Addendum HIVE-25570 Hive should send full URL path for authorization for the > command insert overwrite location > -- > > Key: HIVE-25599 > URL: https://issues.apache.org/jira/browse/HIVE-25599 > Project: Hive > Issue Type: Bug >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-25599) Addendum of HIVE-25570 Hive should send full URL path for authorization for the command insert overwrite location
[ https://issues.apache.org/jira/browse/HIVE-25599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis reassigned HIVE-25599: - > Addendum of HIVE-25570 Hive should send full URL path for authorization for > the command insert overwrite location > - > > Key: HIVE-25599 > URL: https://issues.apache.org/jira/browse/HIVE-25599 > Project: Hive > Issue Type: Bug >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-25520) Enable concatenate for external table.
[ https://issues.apache.org/jira/browse/HIVE-25520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17425134#comment-17425134 ] Panagiotis Garefalakis commented on HIVE-25520: --- Resolved via https://github.com/apache/hive/pull/2640 > Enable concatenate for external table. > -- > > Key: HIVE-25520 > URL: https://issues.apache.org/jira/browse/HIVE-25520 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Reporter: Harish JP >Assignee: Harish JP >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 50m > Remaining Estimate: 0h > > Concatenate for external tables is disabled; enable this under a flag. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25520) Enable concatenate for external table.
[ https://issues.apache.org/jira/browse/HIVE-25520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-25520: -- Fix Version/s: 4.0.0 > Enable concatenate for external table. > -- > > Key: HIVE-25520 > URL: https://issues.apache.org/jira/browse/HIVE-25520 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Reporter: Harish JP >Assignee: Harish JP >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 40m > Remaining Estimate: 0h > > Concatenate for external tables is disabled; enable this under a flag. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-25520) Enable concatenate for external table.
[ https://issues.apache.org/jira/browse/HIVE-25520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis resolved HIVE-25520. --- Resolution: Fixed > Enable concatenate for external table. > -- > > Key: HIVE-25520 > URL: https://issues.apache.org/jira/browse/HIVE-25520 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Reporter: Harish JP >Assignee: Harish JP >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 50m > Remaining Estimate: 0h > > Concatenate for external tables is disabled; enable this under a flag. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25541) JsonSerDe: TBLPROPERTY treating nested json as String
[ https://issues.apache.org/jira/browse/HIVE-25541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-25541: -- Description: Native JsonSerDe 'org.apache.hive.hcatalog.data.JsonSerDe' currently does not support loading nested json into a string type directly. It requires declaring the column as a complex type (struct, map, array) to unpack nested json data. Even though the data field is not a valid JSON String type there is value in treating it as a plain String instead of throwing an exception as we currently do. {code:java} create table json_table(data string, messageid string, publish_time bigint, attributes string); {"data":{"H":{"event":"track_active","platform":"Android"},"B":{"device_type":"Phone","uuid":"[36ffec24-f6a4-4f5d-aa39-72e5513d2cae,11883bee-a7aa-4010-8a66-6c3c63a73f16]"}},"messageId":"2475185636801962","publish_time":1622514629783,"attributes":{"region":"IN"}}"}} {code} This JIRA introduces an extra Table Property that allows stringifying complex JSON values instead of forcing the user to define the complete nested structure was: Native JsonSerDe 'org.apache.hive.hcatalog.data.JsonSerDe' currently does not support loading nested json into a string type directly. It requires declaring the column as a complex type (struct, map, array) to unpack nested json data. Even though the data field is not a valid JSON String type there is value in treating it as a plain String instead of throwing an exception as we currently do. {code:java} create table json_table(data string, messageid string, publish_time bigint, attributes string); {"data":{"H":{"event":"track_active","platform":"Android"},"B":{"device_type":"Phone","uuid":"[36ffec24-f6a4-4f5d-aa39-72e5513d2cae,11883bee-a7aa-4010-8a66-6c3c63a73f16]"}},"messageId":"2475185636801962","publish_time":1622514629783,"attributes":{"region":"IN"}}"}} {code} This JIRA introduces an extra Table property that allows stringifying complex JSON values instead of forcing the user to define the complete nested structure > JsonSerDe: TBLPROPERTY treating nested json as String > - > > Key: HIVE-25541 > URL: https://issues.apache.org/jira/browse/HIVE-25541 > Project: Hive > Issue Type: Bug >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > > Native JsonSerDe 'org.apache.hive.hcatalog.data.JsonSerDe' currently does not > support loading nested json into a string type directly. It requires > declaring the column as a complex type (struct, map, array) to unpack nested > json data. > Even though the data field is not a valid JSON String type there is value > in treating it as a plain String instead of throwing an exception as we currently > do. > {code:java} > create table json_table(data string, messageid string, publish_time bigint, > attributes string); > {"data":{"H":{"event":"track_active","platform":"Android"},"B":{"device_type":"Phone","uuid":"[36ffec24-f6a4-4f5d-aa39-72e5513d2cae,11883bee-a7aa-4010-8a66-6c3c63a73f16]"}},"messageId":"2475185636801962","publish_time":1622514629783,"attributes":{"region":"IN"}}"}} > {code} > This JIRA introduces an extra Table Property that allows stringifying complex > JSON values instead of forcing the user to define the complete nested > structure -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-25541) JsonSerDe: TBLPROPERTY treating nested json as String
[ https://issues.apache.org/jira/browse/HIVE-25541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis reassigned HIVE-25541: - > JsonSerDe: TBLPROPERTY treating nested json as String > - > > Key: HIVE-25541 > URL: https://issues.apache.org/jira/browse/HIVE-25541 > Project: Hive > Issue Type: Bug >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > > The native JsonSerDe 'org.apache.hive.hcatalog.data.JsonSerDe' currently does not > support loading nested JSON into a string type directly. It requires > declaring the column as a complex type (struct, map, array) to unpack nested > JSON data. > Even though the data field is not a valid JSON String type, there is value > in treating it as a plain String instead of throwing an exception as we currently > do. > {code:java} > create table json_table(data string, messageid string, publish_time bigint, > attributes string); > {"data":{"H":{"event":"track_active","platform":"Android"},"B":{"device_type":"Phone","uuid":"[36ffec24-f6a4-4f5d-aa39-72e5513d2cae,11883bee-a7aa-4010-8a66-6c3c63a73f16]"}},"messageId":"2475185636801962","publish_time":1622514629783,"attributes":{"region":"IN"}} > {code} > This JIRA introduces an extra Table property that allows complex > JSON values to be stringified instead of forcing the User to define the complete nested > structure. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25527) LLAP Scheduler task exits with fatal error if the executor node is down.
[ https://issues.apache.org/jira/browse/HIVE-25527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-25527: -- Parent: HIVE-24913 Issue Type: Sub-task (was: Bug) > LLAP Scheduler task exits with fatal error if the executor node is down. > > > Key: HIVE-25527 > URL: https://issues.apache.org/jira/browse/HIVE-25527 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > In case the executor host has gone down, activeInstances will be updated with > null. So we need to check for empty/null values before accessing it. -- This message was sent by Atlassian Jira (v8.3.4#803005)
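The null/empty check described in HIVE-25527 can be sketched roughly as below. This is a minimal illustration and not the LLAP scheduler code: the class `ActiveInstanceGuard` and its methods are hypothetical, and the active instances are modeled as a plain `Collection` that may be null when the executor host is down.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;

// Hypothetical helper: guards access to a collection that may be null or empty.
final class ActiveInstanceGuard {

    // True only when there is at least one instance that can be accessed safely.
    static boolean hasUsableInstance(Collection<?> instances) {
        return instances != null && !instances.isEmpty();
    }

    // Replaces a null collection with an empty one so callers can iterate safely.
    static <T> Collection<T> safeInstances(Collection<T> instances) {
        return instances == null ? Collections.<T>emptyList() : instances;
    }

    public static void main(String[] args) {
        Collection<String> down = null;                   // executor host went down
        Collection<String> up = Arrays.asList("node-1");  // one live executor

        System.out.println(hasUsableInstance(down));
        System.out.println(hasUsableInstance(up));
        for (String node : safeInstances(down)) {
            System.out.println(node);                     // loop body never runs for null input
        }
    }
}
```

Normalizing null to an empty collection keeps the calling code free of repeated null checks, at the cost of hiding the difference between "no instances" and "unknown".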
[jira] [Commented] (HIVE-24316) Upgrade ORC from 1.5.6 to 1.5.8 in branch-3.1
[ https://issues.apache.org/jira/browse/HIVE-24316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17403937#comment-17403937 ] Panagiotis Garefalakis commented on HIVE-24316: --- Hey [~glapark], thanks for bringing this up. Taking a look at MemoryManagerImpl, it looks like checkMemory() is the new method that determines whether the scale has changed, and since ORC-361 removed getTotalMemoryPool() calls from multiple places, we are losing the effect of controlling the memory pool. The intention behind LlapAwareMemoryManager was to size memory per executor instead of using the entire heap, since multiple writers are involved. One idea would be to restore the getTotalMemoryPool() calls where needed. > Upgrade ORC from 1.5.6 to 1.5.8 in branch-3.1 > - > > Key: HIVE-24316 > URL: https://issues.apache.org/jira/browse/HIVE-24316 > Project: Hive > Issue Type: Bug > Components: ORC >Affects Versions: 3.1.3 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Major > Labels: pull-request-available > Fix For: 3.1.3 > > Time Spent: 4h 50m > Remaining Estimate: 0h > > This will bring eleven bug fixes. > * ORC 1.5.7: [https://issues.apache.org/jira/projects/ORC/versions/12345702] > * ORC 1.5.8: [https://issues.apache.org/jira/projects/ORC/versions/12346462] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-25415) Disable auto-assign reviewer on forks
[ https://issues.apache.org/jira/browse/HIVE-25415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis resolved HIVE-25415. --- Resolution: Fixed > Disable auto-assign reviewer on forks > - > > Key: HIVE-25415 > URL: https://issues.apache.org/jira/browse/HIVE-25415 > Project: Hive > Issue Type: Bug > Components: Build Infrastructure >Reporter: Josh Soref >Assignee: Josh Soref >Priority: Trivial > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 20m > Remaining Estimate: 0h > > {code:java} > Run shufo/auto-assign-reviewer-by-files@v1.1.1 > 5{ > 6 '**/*.thrift': [ 'kgyrtkirk', 'klcopp' ], > 7 '**/*.g': [ 'kgyrtkirk' ], > 8 '**/package.jdo': [ 'kgyrtkirk' ], > 9 '**/schq/**': [ 'kgyrtkirk' ], > 10 '**/*Scheduled*': [ 'kgyrtkirk' ], > 11 '**/*[sS]ketches*': [ 'kgyrtkirk' ], > 12 Jenkinsfile: [ 'kgyrtkirk' ], > 13 '.github/**': [ 'kgyrtkirk' ], > 14 '**/ddl/**': [ 'miklosgergely' ], > 15 '**/ql/*@(Driver|Compiler|Executor)*.java': [ 'miklosgergely' ], > 16 '**/schematool/**': [ 'miklosgergely' ], > 17 '**/metatool/**': [ 'miklosgergely' ], > 18 '**/tez/**/*.java': [ 'abstractdog' ], > 19 '**/*Tez*java': [ 'abstractdog' ], > 20 '**/*TopNKey*java': [ 'kasakrisz' ], > 21 '**/*CardinalityPreserving*java': [ 'kasakrisz' ], > 22 '**/*Llap*java': [ 'pgaref' ] > 23} > 24 > beeline/src/test/org/apache/hive/beeline/schematool/TestHiveSchemaTool.java > matches **/schematool/** > 25finished! > 26(node:1453) UnhandledPromiseRejectionWarning: HttpError: Reviews may only > be requested from collaborators. One or more of the users or teams you > specified is not a collaborator of the check-spelling/hive repository. 
> 27at > /home/runner/work/_actions/shufo/auto-assign-reviewer-by-files/v1.1.1/dist/index.js:1:301912 > 28at processTicksAndRejections (internal/process/task_queues.js:93:5) > 29at async assignReviewers > (/home/runner/work/_actions/shufo/auto-assign-reviewer-by-files/v1.1.1/dist/index.js:1:39056) > 30(node:1453) UnhandledPromiseRejectionWarning: Unhandled promise rejection. > This error originated either by throwing inside of an async function without > a catch block, or by rejecting a promise which was not handled with .catch(). > (rejection id: 1) > 31(node:1453) [DEP0018] DeprecationWarning: Unhandled promise rejections are > deprecated. In the future, promise rejections that are not handled will > terminate the Node.js process with a non-zero exit code. > Complete job0s {code} > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25415) Disable auto-assign reviewer on forks
[ https://issues.apache.org/jira/browse/HIVE-25415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-25415: -- Fix Version/s: 4.0.0 > Disable auto-assign reviewer on forks > - > > Key: HIVE-25415 > URL: https://issues.apache.org/jira/browse/HIVE-25415 > Project: Hive > Issue Type: Bug > Components: Build Infrastructure >Reporter: Josh Soref >Assignee: Josh Soref >Priority: Trivial > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 20m > Remaining Estimate: 0h > > {code:java} > Run shufo/auto-assign-reviewer-by-files@v1.1.1 > 5{ > 6 '**/*.thrift': [ 'kgyrtkirk', 'klcopp' ], > 7 '**/*.g': [ 'kgyrtkirk' ], > 8 '**/package.jdo': [ 'kgyrtkirk' ], > 9 '**/schq/**': [ 'kgyrtkirk' ], > 10 '**/*Scheduled*': [ 'kgyrtkirk' ], > 11 '**/*[sS]ketches*': [ 'kgyrtkirk' ], > 12 Jenkinsfile: [ 'kgyrtkirk' ], > 13 '.github/**': [ 'kgyrtkirk' ], > 14 '**/ddl/**': [ 'miklosgergely' ], > 15 '**/ql/*@(Driver|Compiler|Executor)*.java': [ 'miklosgergely' ], > 16 '**/schematool/**': [ 'miklosgergely' ], > 17 '**/metatool/**': [ 'miklosgergely' ], > 18 '**/tez/**/*.java': [ 'abstractdog' ], > 19 '**/*Tez*java': [ 'abstractdog' ], > 20 '**/*TopNKey*java': [ 'kasakrisz' ], > 21 '**/*CardinalityPreserving*java': [ 'kasakrisz' ], > 22 '**/*Llap*java': [ 'pgaref' ] > 23} > 24 > beeline/src/test/org/apache/hive/beeline/schematool/TestHiveSchemaTool.java > matches **/schematool/** > 25finished! > 26(node:1453) UnhandledPromiseRejectionWarning: HttpError: Reviews may only > be requested from collaborators. One or more of the users or teams you > specified is not a collaborator of the check-spelling/hive repository. 
> 27at > /home/runner/work/_actions/shufo/auto-assign-reviewer-by-files/v1.1.1/dist/index.js:1:301912 > 28at processTicksAndRejections (internal/process/task_queues.js:93:5) > 29at async assignReviewers > (/home/runner/work/_actions/shufo/auto-assign-reviewer-by-files/v1.1.1/dist/index.js:1:39056) > 30(node:1453) UnhandledPromiseRejectionWarning: Unhandled promise rejection. > This error originated either by throwing inside of an async function without > a catch block, or by rejecting a promise which was not handled with .catch(). > (rejection id: 1) > 31(node:1453) [DEP0018] DeprecationWarning: Unhandled promise rejections are > deprecated. In the future, promise rejections that are not handled will > terminate the Node.js process with a non-zero exit code. > Complete job0s {code} > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-25415) Disable auto-assign reviewer on forks
[ https://issues.apache.org/jira/browse/HIVE-25415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17392092#comment-17392092 ] Panagiotis Garefalakis commented on HIVE-25415: --- Resolved via: https://github.com/apache/hive/pull/2554 > Disable auto-assign reviewer on forks > - > > Key: HIVE-25415 > URL: https://issues.apache.org/jira/browse/HIVE-25415 > Project: Hive > Issue Type: Bug > Components: Build Infrastructure >Reporter: Josh Soref >Assignee: Josh Soref >Priority: Trivial > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > {code:java} > Run shufo/auto-assign-reviewer-by-files@v1.1.1 > 5{ > 6 '**/*.thrift': [ 'kgyrtkirk', 'klcopp' ], > 7 '**/*.g': [ 'kgyrtkirk' ], > 8 '**/package.jdo': [ 'kgyrtkirk' ], > 9 '**/schq/**': [ 'kgyrtkirk' ], > 10 '**/*Scheduled*': [ 'kgyrtkirk' ], > 11 '**/*[sS]ketches*': [ 'kgyrtkirk' ], > 12 Jenkinsfile: [ 'kgyrtkirk' ], > 13 '.github/**': [ 'kgyrtkirk' ], > 14 '**/ddl/**': [ 'miklosgergely' ], > 15 '**/ql/*@(Driver|Compiler|Executor)*.java': [ 'miklosgergely' ], > 16 '**/schematool/**': [ 'miklosgergely' ], > 17 '**/metatool/**': [ 'miklosgergely' ], > 18 '**/tez/**/*.java': [ 'abstractdog' ], > 19 '**/*Tez*java': [ 'abstractdog' ], > 20 '**/*TopNKey*java': [ 'kasakrisz' ], > 21 '**/*CardinalityPreserving*java': [ 'kasakrisz' ], > 22 '**/*Llap*java': [ 'pgaref' ] > 23} > 24 > beeline/src/test/org/apache/hive/beeline/schematool/TestHiveSchemaTool.java > matches **/schematool/** > 25finished! > 26(node:1453) UnhandledPromiseRejectionWarning: HttpError: Reviews may only > be requested from collaborators. One or more of the users or teams you > specified is not a collaborator of the check-spelling/hive repository. 
> 27at > /home/runner/work/_actions/shufo/auto-assign-reviewer-by-files/v1.1.1/dist/index.js:1:301912 > 28at processTicksAndRejections (internal/process/task_queues.js:93:5) > 29at async assignReviewers > (/home/runner/work/_actions/shufo/auto-assign-reviewer-by-files/v1.1.1/dist/index.js:1:39056) > 30(node:1453) UnhandledPromiseRejectionWarning: Unhandled promise rejection. > This error originated either by throwing inside of an async function without > a catch block, or by rejecting a promise which was not handled with .catch(). > (rejection id: 1) > 31(node:1453) [DEP0018] DeprecationWarning: Unhandled promise rejections are > deprecated. In the future, promise rejections that are not handled will > terminate the Node.js process with a non-zero exit code. > Complete job0s {code} > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25415) Disable auto-assign reviewer on forks
[ https://issues.apache.org/jira/browse/HIVE-25415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-25415: -- Summary: Disable auto-assign reviewer on forks (was: auto-assign breaks on forks) > Disable auto-assign reviewer on forks > - > > Key: HIVE-25415 > URL: https://issues.apache.org/jira/browse/HIVE-25415 > Project: Hive > Issue Type: Bug > Components: Build Infrastructure >Reporter: Josh Soref >Assignee: Josh Soref >Priority: Trivial > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > {code:java} > Run shufo/auto-assign-reviewer-by-files@v1.1.1 > 5{ > 6 '**/*.thrift': [ 'kgyrtkirk', 'klcopp' ], > 7 '**/*.g': [ 'kgyrtkirk' ], > 8 '**/package.jdo': [ 'kgyrtkirk' ], > 9 '**/schq/**': [ 'kgyrtkirk' ], > 10 '**/*Scheduled*': [ 'kgyrtkirk' ], > 11 '**/*[sS]ketches*': [ 'kgyrtkirk' ], > 12 Jenkinsfile: [ 'kgyrtkirk' ], > 13 '.github/**': [ 'kgyrtkirk' ], > 14 '**/ddl/**': [ 'miklosgergely' ], > 15 '**/ql/*@(Driver|Compiler|Executor)*.java': [ 'miklosgergely' ], > 16 '**/schematool/**': [ 'miklosgergely' ], > 17 '**/metatool/**': [ 'miklosgergely' ], > 18 '**/tez/**/*.java': [ 'abstractdog' ], > 19 '**/*Tez*java': [ 'abstractdog' ], > 20 '**/*TopNKey*java': [ 'kasakrisz' ], > 21 '**/*CardinalityPreserving*java': [ 'kasakrisz' ], > 22 '**/*Llap*java': [ 'pgaref' ] > 23} > 24 > beeline/src/test/org/apache/hive/beeline/schematool/TestHiveSchemaTool.java > matches **/schematool/** > 25finished! > 26(node:1453) UnhandledPromiseRejectionWarning: HttpError: Reviews may only > be requested from collaborators. One or more of the users or teams you > specified is not a collaborator of the check-spelling/hive repository. 
> 27at > /home/runner/work/_actions/shufo/auto-assign-reviewer-by-files/v1.1.1/dist/index.js:1:301912 > 28at processTicksAndRejections (internal/process/task_queues.js:93:5) > 29at async assignReviewers > (/home/runner/work/_actions/shufo/auto-assign-reviewer-by-files/v1.1.1/dist/index.js:1:39056) > 30(node:1453) UnhandledPromiseRejectionWarning: Unhandled promise rejection. > This error originated either by throwing inside of an async function without > a catch block, or by rejecting a promise which was not handled with .catch(). > (rejection id: 1) > 31(node:1453) [DEP0018] DeprecationWarning: Unhandled promise rejections are > deprecated. In the future, promise rejections that are not handled will > terminate the Node.js process with a non-zero exit code. > Complete job0s {code} > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-25415) auto-assign breaks on forks
[ https://issues.apache.org/jira/browse/HIVE-25415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis reassigned HIVE-25415: - Assignee: Josh Soref > auto-assign breaks on forks > --- > > Key: HIVE-25415 > URL: https://issues.apache.org/jira/browse/HIVE-25415 > Project: Hive > Issue Type: Bug > Components: Build Infrastructure >Reporter: Josh Soref >Assignee: Josh Soref >Priority: Trivial > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > {code:java} > Run shufo/auto-assign-reviewer-by-files@v1.1.1 > 5{ > 6 '**/*.thrift': [ 'kgyrtkirk', 'klcopp' ], > 7 '**/*.g': [ 'kgyrtkirk' ], > 8 '**/package.jdo': [ 'kgyrtkirk' ], > 9 '**/schq/**': [ 'kgyrtkirk' ], > 10 '**/*Scheduled*': [ 'kgyrtkirk' ], > 11 '**/*[sS]ketches*': [ 'kgyrtkirk' ], > 12 Jenkinsfile: [ 'kgyrtkirk' ], > 13 '.github/**': [ 'kgyrtkirk' ], > 14 '**/ddl/**': [ 'miklosgergely' ], > 15 '**/ql/*@(Driver|Compiler|Executor)*.java': [ 'miklosgergely' ], > 16 '**/schematool/**': [ 'miklosgergely' ], > 17 '**/metatool/**': [ 'miklosgergely' ], > 18 '**/tez/**/*.java': [ 'abstractdog' ], > 19 '**/*Tez*java': [ 'abstractdog' ], > 20 '**/*TopNKey*java': [ 'kasakrisz' ], > 21 '**/*CardinalityPreserving*java': [ 'kasakrisz' ], > 22 '**/*Llap*java': [ 'pgaref' ] > 23} > 24 > beeline/src/test/org/apache/hive/beeline/schematool/TestHiveSchemaTool.java > matches **/schematool/** > 25finished! > 26(node:1453) UnhandledPromiseRejectionWarning: HttpError: Reviews may only > be requested from collaborators. One or more of the users or teams you > specified is not a collaborator of the check-spelling/hive repository. 
> 27at > /home/runner/work/_actions/shufo/auto-assign-reviewer-by-files/v1.1.1/dist/index.js:1:301912 > 28at processTicksAndRejections (internal/process/task_queues.js:93:5) > 29at async assignReviewers > (/home/runner/work/_actions/shufo/auto-assign-reviewer-by-files/v1.1.1/dist/index.js:1:39056) > 30(node:1453) UnhandledPromiseRejectionWarning: Unhandled promise rejection. > This error originated either by throwing inside of an async function without > a catch block, or by rejecting a promise which was not handled with .catch(). > (rejection id: 1) > 31(node:1453) [DEP0018] DeprecationWarning: Unhandled promise rejections are > deprecated. In the future, promise rejections that are not handled will > terminate the Node.js process with a non-zero exit code. > Complete job0s {code} > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25398) Converted external tables should be able to configure purge behaviour
[ https://issues.apache.org/jira/browse/HIVE-25398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-25398: -- Component/s: Standalone Metastore > Converted external tables should be able to configure purge behaviour > - > > Key: HIVE-25398 > URL: https://issues.apache.org/jira/browse/HIVE-25398 > Project: Hive > Issue Type: Bug > Components: Standalone Metastore >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > > Creating non-ACID MANAGED tables is not allowed in Hive, which instead > converts these tables to External: > https://issues.apache.org/jira/browse/HIVE-22158 > During table translation, both TRANSLATED_TO_EXTERNAL and > 'external.table.purge' are set to True. However, the User may have already > set the second parameter in the table properties. This > ticket adds an extra check to preserve that property if it is set. > PS: A cleaner solution would be to create these Tables as External directly, > but the User may be relying on the translation > and expecting the data NOT to be purged! > Example: > {code:java} > -- Non-ACID table will be translated to EXTERNAL > create table c(c int) LOCATION 'etp_1' > TBLPROPERTIES('transactional'='false','external.table.purge'='false'); > insert into c values(1); > -- Maintain the purge=false property set above > desc formatted c; > select count(*) from c; > drop table c; > -- Create table in same location, data should still be there > create table c(c int) LOCATION 'etp_1' > TBLPROPERTIES('transactional'='false','external.table.purge'='false'); > desc formatted c; > select count(*) from c; > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
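The extra check proposed in HIVE-25398 amounts to "set the purge flag only if the User has not set it". A minimal sketch follows, assuming table properties are held in a `Map<String, String>`. The class `PurgePropertySketch`, the method `translateToExternal`, and the "TRUE" value spelling are hypothetical and are not taken from the metastore code; only the two property keys come from the ticket.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the translation step; not the actual metastore transformer.
final class PurgePropertySketch {
    static final String TRANSLATED = "TRANSLATED_TO_EXTERNAL";
    static final String PURGE = "external.table.purge";

    // Returns a copy of the user's table properties as they might look after translation.
    static Map<String, String> translateToExternal(Map<String, String> userParams) {
        Map<String, String> params = new HashMap<>(userParams);
        params.put(TRANSLATED, "TRUE");     // translation marker is always set
        params.putIfAbsent(PURGE, "TRUE");  // a value chosen by the User wins
        return params;
    }

    public static void main(String[] args) {
        Map<String, String> user = new HashMap<>();
        user.put(PURGE, "false");
        System.out.println(translateToExternal(user).get(PURGE));
        System.out.println(translateToExternal(new HashMap<>()).get(PURGE));
    }
}
```

Using `putIfAbsent` instead of `put` is the whole change: the default still applies to tables that never mention the property.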
[jira] [Assigned] (HIVE-25398) Converted external tables should be able to configure purge behaviour
[ https://issues.apache.org/jira/browse/HIVE-25398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis reassigned HIVE-25398: - > Converted external tables should be able to configure purge behaviour > - > > Key: HIVE-25398 > URL: https://issues.apache.org/jira/browse/HIVE-25398 > Project: Hive > Issue Type: Bug >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > > Creating non-ACID MANAGED tables is not allowed in Hive, which instead > converts these tables to External: > https://issues.apache.org/jira/browse/HIVE-22158 > During table translation, both TRANSLATED_TO_EXTERNAL and > 'external.table.purge' are set to True. However, the User may have already > set the second parameter in the table properties. This > ticket adds an extra check to preserve that property if it is set. > PS: A cleaner solution would be to create these Tables as External directly, > but the User may be relying on the translation > and expecting the data NOT to be purged! > Example: > {code:java} > -- Non-ACID table will be translated to EXTERNAL > create table c(c int) LOCATION 'etp_1' > TBLPROPERTIES('transactional'='false','external.table.purge'='false'); > insert into c values(1); > -- Maintain the purge=false property set above > desc formatted c; > select count(*) from c; > drop table c; > -- Create table in same location, data should still be there > create table c(c int) LOCATION 'etp_1' > TBLPROPERTIES('transactional'='false','external.table.purge'='false'); > desc formatted c; > select count(*) from c; > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-25190) BytesColumnVector fails when the aggregate size is > 1gb
[ https://issues.apache.org/jira/browse/HIVE-25190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis resolved HIVE-25190. --- Resolution: Fixed > BytesColumnVector fails when the aggregate size is > 1gb > > > Key: HIVE-25190 > URL: https://issues.apache.org/jira/browse/HIVE-25190 > Project: Hive > Issue Type: Bug >Reporter: Owen O'Malley >Assignee: Owen O'Malley >Priority: Major > Labels: pull-request-available > Time Spent: 2h 50m > Remaining Estimate: 0h > > Currently, BytesColumnVector will allocate a buffer for small values (< 1mb), > but fails with: > {code:java} > new RuntimeException("Overflow of newLength. smallBuffer.length=" > + smallBuffer.length + ", nextElemLength=" + nextElemLength); > {code} > if the aggregate size of the buffer exceeds 1gb. -- This message was sent by Atlassian Jira (v8.3.4#803005)
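One plausible reading of the "Overflow of newLength" message is that the buffer length is an `int` grown by doubling, which wraps negative once the current length reaches 1 GiB. The sketch below shows only that arithmetic and an overflow-aware alternative. It is an assumption about the failure mode, not the BytesColumnVector source or the committed fix; `BufferGrowthSketch`, `naiveDouble`, `safeGrow`, and the `MAX_ARRAY_LENGTH` cap are hypothetical names and values.

```java
// Hypothetical sketch of int overflow when doubling a buffer length.
final class BufferGrowthSketch {
    // Conservative cap for array sizes; an assumption, not a value from Hive.
    static final int MAX_ARRAY_LENGTH = Integer.MAX_VALUE - 8;

    // Naive doubling in int arithmetic: wraps negative for lengths of 2^30 and above.
    static int naiveDouble(int length) {
        return length * 2;
    }

    // Computes the new length in long arithmetic and clamps it to the cap.
    static int safeGrow(int length, int needed) {
        if (length < 0 || needed < 0) {
            throw new IllegalArgumentException("sizes must be non-negative");
        }
        long target = Math.max(2L * length, (long) needed);
        return (int) Math.min(target, MAX_ARRAY_LENGTH);
    }

    public static void main(String[] args) {
        int oneGiB = 1 << 30;
        System.out.println(naiveDouble(oneGiB));           // negative: the int wrapped around
        System.out.println(safeGrow(oneGiB, oneGiB + 1));  // clamped to the cap instead
    }
}
```

Doing the multiplication in `long` and clamping afterwards avoids the negative length without changing behavior for small buffers.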
[jira] [Commented] (HIVE-25190) BytesColumnVector fails when the aggregate size is > 1gb
[ https://issues.apache.org/jira/browse/HIVE-25190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17387281#comment-17387281 ] Panagiotis Garefalakis commented on HIVE-25190: --- Thanks [~dongjoon]. I was hesitating to close this because the fix requires a new storage-api version, which should be 2.8.1. > BytesColumnVector fails when the aggregate size is > 1gb > > > Key: HIVE-25190 > URL: https://issues.apache.org/jira/browse/HIVE-25190 > Project: Hive > Issue Type: Bug >Reporter: Owen O'Malley >Assignee: Owen O'Malley >Priority: Major > Labels: pull-request-available > Time Spent: 2h 50m > Remaining Estimate: 0h > > Currently, BytesColumnVector will allocate a buffer for small values (< 1mb), > but fails with: > {code:java} > new RuntimeException("Overflow of newLength. smallBuffer.length=" > + smallBuffer.length + ", nextElemLength=" + nextElemLength); > {code} > if the aggregate size of the buffer exceeds 1gb. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-25386) hive-storage-api should not have guava compile dependency
[ https://issues.apache.org/jira/browse/HIVE-25386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis reassigned HIVE-25386: - Assignee: Dongjoon Hyun > hive-storage-api should not have guava compile dependency > - > > Key: HIVE-25386 > URL: https://issues.apache.org/jira/browse/HIVE-25386 > Project: Hive > Issue Type: Bug > Components: storage-api >Affects Versions: 4.0.0 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Blocker > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > https://mvnrepository.com/artifact/org.apache.hive/hive-storage-api/2.8.0 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24458) Allow access to SArgs without converting to disjunctive normal form
[ https://issues.apache.org/jira/browse/HIVE-24458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-24458: -- Fix Version/s: (was: storage-2.7.3) > Allow access to SArgs without converting to disjunctive normal form > --- > > Key: HIVE-24458 > URL: https://issues.apache.org/jira/browse/HIVE-24458 > Project: Hive > Issue Type: Improvement >Reporter: Owen O'Malley >Assignee: Owen O'Malley >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 1h 40m > Remaining Estimate: 0h > > For some use cases, it is useful to have access to the SArg expression in a > non-normalized form. Currently, the SArg only provides the fully normalized > expression. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work started] (HIVE-25362) LLAP: ensure tasks with locality have a chance to adjust delay
[ https://issues.apache.org/jira/browse/HIVE-25362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-25362 started by Panagiotis Garefalakis. - > LLAP: ensure tasks with locality have a chance to adjust delay > -- > > Key: HIVE-25362 > URL: https://issues.apache.org/jira/browse/HIVE-25362 > Project: Hive > Issue Type: Sub-task > Components: llap >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > HIVE-24914 introduced a short-circuit optimization that, when all nodes are busy, > returns DELAYED_RESOURCES and resets the locality delay for a given task. > However, this may prevent tasks from adjusting their locality delay and being > added to the DelayQueue, sometimes leading to missed locality chances when all > LLAP resources are fully utilized. > To address the issue, we should handle the two cases separately. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25362) LLAP: ensure tasks with locality have a chance to adjust delay
[ https://issues.apache.org/jira/browse/HIVE-25362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-25362: -- Component/s: llap > LLAP: ensure tasks with locality have a chance to adjust delay > -- > > Key: HIVE-25362 > URL: https://issues.apache.org/jira/browse/HIVE-25362 > Project: Hive > Issue Type: Sub-task > Components: llap >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > HIVE-24914 introduced a short-circuit optimization that, when all nodes are busy, > returns DELAYED_RESOURCES and resets the locality delay for a given task. > However, this may prevent tasks from adjusting their locality delay and being > added to the DelayQueue, sometimes leading to missed locality chances when all > LLAP resources are fully utilized. > To address the issue, we should handle the two cases separately. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25362) LLAP: ensure tasks with locality have a chance to adjust delay
[ https://issues.apache.org/jira/browse/HIVE-25362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-25362: -- Description: HIVE-24914 introduced a short-circuit optimization that, when all nodes are busy, returns DELAYED_RESOURCES and resets the locality delay for a given task. However, this may prevent tasks from adjusting their locality delay and being added to the DelayQueue, sometimes leading to missed locality chances when all LLAP resources are fully utilized. To address the issue, we should handle the two cases separately. was: HIVE-24914 introduced a short-circuit optimization that, when all nodes are busy, returns DELAYED_RESOURCES and resets the locality delay for a given task. However, this may prevent tasks from being added to the DelayQueue, leading to worse locality when all LLAP resources are fully utilized. To address the issue, we should handle the two cases separately. > LLAP: ensure tasks with locality have a chance to adjust delay > -- > > Key: HIVE-25362 > URL: https://issues.apache.org/jira/browse/HIVE-25362 > Project: Hive > Issue Type: Sub-task >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > > HIVE-24914 introduced a short-circuit optimization that, when all nodes are busy, > returns DELAYED_RESOURCES and resets the locality delay for a given task. > However, this may prevent tasks from adjusting their locality delay and being > added to the DelayQueue, sometimes leading to missed locality chances when all > LLAP resources are fully utilized. > To address the issue, we should handle the two cases separately. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25362) LLAP: ensure tasks with locality have a chance to adjust delay
[ https://issues.apache.org/jira/browse/HIVE-25362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-25362: -- Summary: LLAP: ensure tasks with locality have a chance to adjust delay (was: LLAP: ensure tasks with locality have a chance to adjust localityDelay) > LLAP: ensure tasks with locality have a chance to adjust delay > -- > > Key: HIVE-25362 > URL: https://issues.apache.org/jira/browse/HIVE-25362 > Project: Hive > Issue Type: Sub-task >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > > HIVE-24914 introduced a short-circuit optimization that, when all nodes are busy, > returns DELAYED_RESOURCES and resets the locality delay for a given task. > However, this may prevent tasks from being added to the DelayQueue, leading to > worse locality when all LLAP resources are fully utilized. > To address the issue, we should handle the two cases separately. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25362) LLAP: ensure tasks with locality have a chance to adjust localityDelay
[ https://issues.apache.org/jira/browse/HIVE-25362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-25362: -- Summary: LLAP: ensure tasks with locality have a chance to adjust localityDelay (was: LLAP: ensure tasks with locality are added to DelayQueue) > LLAP: ensure tasks with locality have a chance to adjust localityDelay > -- > > Key: HIVE-25362 > URL: https://issues.apache.org/jira/browse/HIVE-25362 > Project: Hive > Issue Type: Sub-task >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > > HIVE-24914 introduced a short-circuit optimization that, when all nodes are busy, > returns DELAYED_RESOURCES and resets the locality delay for a given task. > However, this may prevent tasks from being added to the DelayQueue, leading to > worse locality when all LLAP resources are fully utilized. > To address the issue, we should handle the two cases separately. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25362) LLAP: ensure tasks with locality are added to DelayQueue
[ https://issues.apache.org/jira/browse/HIVE-25362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-25362: -- Parent: HIVE-24913 Issue Type: Sub-task (was: Bug) > LLAP: ensure tasks with locality are added to DelayQueue > > > Key: HIVE-25362 > URL: https://issues.apache.org/jira/browse/HIVE-25362 > Project: Hive > Issue Type: Sub-task >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > > HIVE-24914 introduced a short-circuit optimization that, when all nodes are busy, > returns DELAYED_RESOURCES and resets the locality delay for a given task. > However, this may prevent tasks from being added to the DelayQueue, leading to > worse locality when all LLAP resources are fully utilized. > To address the issue, we should handle the two cases separately. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-25362) LLAP: ensure tasks with locality are added to DelayQueue
[ https://issues.apache.org/jira/browse/HIVE-25362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis reassigned HIVE-25362: - > LLAP: ensure tasks with locality are added to DelayQueue > > > Key: HIVE-25362 > URL: https://issues.apache.org/jira/browse/HIVE-25362 > Project: Hive > Issue Type: Bug >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > > HIVE-24914 introduced a short-circuit optimization that, when all nodes are busy, > returns DELAYED_RESOURCES and resets the locality delay for a given task. > However, this may prevent tasks from being added to the DelayQueue, leading to > worse locality when all LLAP resources are fully utilized. > To address the issue, we should handle the two cases separately. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-21489) EXPLAIN command throws ClassCastException in Hive
[ https://issues.apache.org/jira/browse/HIVE-21489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17379963#comment-17379963 ] Panagiotis Garefalakis commented on HIVE-21489: --- Resolved via https://github.com/apache/hive/pull/2373 Thanks [~rameshkumar] for the patch! > EXPLAIN command throws ClassCastException in Hive > - > > Key: HIVE-21489 > URL: https://issues.apache.org/jira/browse/HIVE-21489 > Project: Hive > Issue Type: Bug >Affects Versions: 2.3.4, 3.1.2 >Reporter: Ping Lu >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Labels: pull-request-available > Fix For: 3.1.3, 4.0.0 > > Attachments: HIVE-21489.1.patch, HIVE-21489.2.patch > > Time Spent: 1h 20m > Remaining Estimate: 0h > > I'm trying to run commands like explain select * from src in hive-2.3.4, but > it fails with a ClassCastException: > org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer cannot be cast to > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer > Steps to reproduce: > 1) hive.execution.engine is the default value mr > 2) hive.security.authorization.enabled is set to true, and > hive.security.authorization.manager is set to > org.apache.hadoop.hive.ql.security.authorization.DefaultHiveAuthorizationProvider > 3) start hivecli and run the command: explain select * from src > I debugged the code and found that HIVE-18778 causes the above > ClassCastException. If I set hive.in.test to true, the explain command > executes successfully. > Now I have one question: since hive.in.test can't be modified at runtime, how > can I run the explain command with the default authorization in hive-2.3.4? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-21489) EXPLAIN command throws ClassCastException in Hive
[ https://issues.apache.org/jira/browse/HIVE-21489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-21489: -- Resolution: Fixed Status: Resolved (was: Patch Available) > EXPLAIN command throws ClassCastException in Hive > - > > Key: HIVE-21489 > URL: https://issues.apache.org/jira/browse/HIVE-21489 > Project: Hive > Issue Type: Bug >Affects Versions: 2.3.4, 3.1.2 >Reporter: Ping Lu >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Labels: pull-request-available > Fix For: 3.1.3, 4.0.0 > > Attachments: HIVE-21489.1.patch, HIVE-21489.2.patch > > Time Spent: 1h 20m > Remaining Estimate: 0h > > I'm trying to run commands like explain select * from src in hive-2.3.4, but > it fails with a ClassCastException: > org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer cannot be cast to > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer > Steps to reproduce: > 1) hive.execution.engine is the default value mr > 2) hive.security.authorization.enabled is set to true, and > hive.security.authorization.manager is set to > org.apache.hadoop.hive.ql.security.authorization.DefaultHiveAuthorizationProvider > 3) start hivecli and run the command: explain select * from src > I debugged the code and found that HIVE-18778 causes the above > ClassCastException. If I set hive.in.test to true, the explain command > executes successfully. > Now I have one question: since hive.in.test can't be modified at runtime, how > can I run the explain command with the default authorization in hive-2.3.4? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-21489) EXPLAIN command throws ClassCastException in Hive
[ https://issues.apache.org/jira/browse/HIVE-21489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-21489: -- Affects Version/s: 3.1.2 > EXPLAIN command throws ClassCastException in Hive > - > > Key: HIVE-21489 > URL: https://issues.apache.org/jira/browse/HIVE-21489 > Project: Hive > Issue Type: Bug >Affects Versions: 2.3.4, 3.1.2 >Reporter: Ping Lu >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Labels: pull-request-available > Attachments: HIVE-21489.1.patch, HIVE-21489.2.patch > > Time Spent: 1h 10m > Remaining Estimate: 0h > > I'm trying to run commands like explain select * from src in hive-2.3.4, but > it fails with a ClassCastException: > org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer cannot be cast to > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer > Steps to reproduce: > 1) hive.execution.engine is the default value mr > 2) hive.security.authorization.enabled is set to true, and > hive.security.authorization.manager is set to > org.apache.hadoop.hive.ql.security.authorization.DefaultHiveAuthorizationProvider > 3) start hivecli and run the command: explain select * from src > I debugged the code and found that HIVE-18778 causes the above > ClassCastException. If I set hive.in.test to true, the explain command > executes successfully. > Now I have one question: since hive.in.test can't be modified at runtime, how > can I run the explain command with the default authorization in hive-2.3.4? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-21489) EXPLAIN command throws ClassCastException in Hive
[ https://issues.apache.org/jira/browse/HIVE-21489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-21489: -- Fix Version/s: 4.0.0 3.1.3 > EXPLAIN command throws ClassCastException in Hive > - > > Key: HIVE-21489 > URL: https://issues.apache.org/jira/browse/HIVE-21489 > Project: Hive > Issue Type: Bug >Affects Versions: 2.3.4, 3.1.2 >Reporter: Ping Lu >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Labels: pull-request-available > Fix For: 3.1.3, 4.0.0 > > Attachments: HIVE-21489.1.patch, HIVE-21489.2.patch > > Time Spent: 1h 10m > Remaining Estimate: 0h > > I'm trying to run commands like explain select * from src in hive-2.3.4, but > it fails with a ClassCastException: > org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer cannot be cast to > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer > Steps to reproduce: > 1) hive.execution.engine is the default value mr > 2) hive.security.authorization.enabled is set to true, and > hive.security.authorization.manager is set to > org.apache.hadoop.hive.ql.security.authorization.DefaultHiveAuthorizationProvider > 3) start hivecli and run the command: explain select * from src > I debugged the code and found that HIVE-18778 causes the above > ClassCastException. If I set hive.in.test to true, the explain command > executes successfully. > Now I have one question: since hive.in.test can't be modified at runtime, how > can I run the explain command with the default authorization in hive-2.3.4? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-25242) Query performs extremely slow with hive.vectorized.adaptor.usage.mode = chosen
[ https://issues.apache.org/jira/browse/HIVE-25242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis reassigned HIVE-25242: - Assignee: Attila Magyar > Query performs extremely slow with hive.vectorized.adaptor.usage.mode = > chosen > --- > > Key: HIVE-25242 > URL: https://issues.apache.org/jira/browse/HIVE-25242 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 4.0.0 >Reporter: Attila Magyar >Assignee: Attila Magyar >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 40m > Remaining Estimate: 0h > > If hive.vectorized.adaptor.usage.mode is set to chosen, only certain UDFs are > vectorized through the vectorized adaptor. > Queries like this one perform very slowly because concat is not chosen > to be vectorized. > {code:java} > select count(*) from tbl where to_date(concat(year, '-', month, '-', day)) > between to_date('2018-12-01') and to_date('2021-03-01'); {code} > The patch whitelists the concat UDF so that it uses the vectorized adaptor in > chosen mode. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-25248) Fix TestLlapTaskSchedulerService#testForcedLocalityMultiplePreemptionsSameHost1
[ https://issues.apache.org/jira/browse/HIVE-25248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis reassigned HIVE-25248: - Assignee: Panagiotis Garefalakis > Fix > TestLlapTaskSchedulerService#testForcedLocalityMultiplePreemptionsSameHost1 > --- > > Key: HIVE-25248 > URL: https://issues.apache.org/jira/browse/HIVE-25248 > Project: Hive > Issue Type: Bug >Reporter: Zoltan Haindrich >Assignee: Panagiotis Garefalakis >Priority: Major > > This test is failing randomly recently > http://ci.hive.apache.org/job/hive-flaky-check/233/testReport/org.apache.hadoop.hive.llap.tezplugins/TestLlapTaskSchedulerService/testForcedLocalityMultiplePreemptionsSameHost1/ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24458) Allow access to SArgs without converting to disjunctive normal form
[ https://issues.apache.org/jira/browse/HIVE-24458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-24458: -- Fix Version/s: storage-2.7.3 > Allow access to SArgs without converting to disjunctive normal form > --- > > Key: HIVE-24458 > URL: https://issues.apache.org/jira/browse/HIVE-24458 > Project: Hive > Issue Type: Improvement >Reporter: Owen O'Malley >Assignee: Owen O'Malley >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0, storage-2.7.3 > > Time Spent: 1h 40m > Remaining Estimate: 0h > > For some use cases, it is useful to have access to the SArg expression in a > non-normalized form. Currently, the SArg only provides the fully normalized > expression. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25117) Vector PTF ClassCastException with Decimal64
[ https://issues.apache.org/jira/browse/HIVE-25117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-25117: -- Affects Version/s: 4.0.0 > Vector PTF ClassCastException with Decimal64 > > > Key: HIVE-25117 > URL: https://issues.apache.org/jira/browse/HIVE-25117 > Project: Hive > Issue Type: Bug >Affects Versions: 4.0.0 >Reporter: Panagiotis Garefalakis >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Labels: pull-request-available > Attachments: vector_ptf_classcast_exception.q > > Time Spent: 1h > Remaining Estimate: 0h > > Only reproduces when there is at least 1 buffered batch, so needed 2 rows > with 1 row/batch: > {code:java} > set hive.vectorized.testing.reducer.batch.size=1; > {code} > {code:java} > Caused by: java.lang.ClassCastException: > org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector cannot be cast to > org.apache.hadoop.hive.ql.exec.vector.LongColumnVector > at > org.apache.hadoop.hive.ql.exec.vector.VectorizedBatchUtil.copyNonSelectedColumnVector(VectorizedBatchUtil.java:664) > at > org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFGroupBatches.forwardBufferedBatches(VectorPTFGroupBatches.java:228) > at > org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFGroupBatches.fillGroupResultsAndForward(VectorPTFGroupBatches.java:318) > at > org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFOperator.process(VectorPTFOperator.java:403) > at > org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:919) > at > org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158) > at > org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:497) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-25117) Vector PTF ClassCastException with Decimal64
[ https://issues.apache.org/jira/browse/HIVE-25117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis resolved HIVE-25117. --- Resolution: Fixed > Vector PTF ClassCastException with Decimal64 > > > Key: HIVE-25117 > URL: https://issues.apache.org/jira/browse/HIVE-25117 > Project: Hive > Issue Type: Bug >Affects Versions: 4.0.0 >Reporter: Panagiotis Garefalakis >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: vector_ptf_classcast_exception.q > > Time Spent: 1h > Remaining Estimate: 0h > > Only reproduces when there is at least 1 buffered batch, so needed 2 rows > with 1 row/batch: > {code:java} > set hive.vectorized.testing.reducer.batch.size=1; > {code} > {code:java} > Caused by: java.lang.ClassCastException: > org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector cannot be cast to > org.apache.hadoop.hive.ql.exec.vector.LongColumnVector > at > org.apache.hadoop.hive.ql.exec.vector.VectorizedBatchUtil.copyNonSelectedColumnVector(VectorizedBatchUtil.java:664) > at > org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFGroupBatches.forwardBufferedBatches(VectorPTFGroupBatches.java:228) > at > org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFGroupBatches.fillGroupResultsAndForward(VectorPTFGroupBatches.java:318) > at > org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFOperator.process(VectorPTFOperator.java:403) > at > org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:919) > at > org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158) > at > org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:497) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25117) Vector PTF ClassCastException with Decimal64
[ https://issues.apache.org/jira/browse/HIVE-25117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-25117: -- Fix Version/s: 4.0.0 > Vector PTF ClassCastException with Decimal64 > > > Key: HIVE-25117 > URL: https://issues.apache.org/jira/browse/HIVE-25117 > Project: Hive > Issue Type: Bug >Affects Versions: 4.0.0 >Reporter: Panagiotis Garefalakis >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: vector_ptf_classcast_exception.q > > Time Spent: 1h > Remaining Estimate: 0h > > Only reproduces when there is at least 1 buffered batch, so needed 2 rows > with 1 row/batch: > {code:java} > set hive.vectorized.testing.reducer.batch.size=1; > {code} > {code:java} > Caused by: java.lang.ClassCastException: > org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector cannot be cast to > org.apache.hadoop.hive.ql.exec.vector.LongColumnVector > at > org.apache.hadoop.hive.ql.exec.vector.VectorizedBatchUtil.copyNonSelectedColumnVector(VectorizedBatchUtil.java:664) > at > org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFGroupBatches.forwardBufferedBatches(VectorPTFGroupBatches.java:228) > at > org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFGroupBatches.fillGroupResultsAndForward(VectorPTFGroupBatches.java:318) > at > org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFOperator.process(VectorPTFOperator.java:403) > at > org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:919) > at > org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158) > at > org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:497) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-25117) Vector PTF ClassCastException with Decimal64
[ https://issues.apache.org/jira/browse/HIVE-25117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17358477#comment-17358477 ] Panagiotis Garefalakis commented on HIVE-25117: --- Resolved via https://github.com/apache/hive/pull/2286 Thanks [~rameshkumar] for the patch! > Vector PTF ClassCastException with Decimal64 > > > Key: HIVE-25117 > URL: https://issues.apache.org/jira/browse/HIVE-25117 > Project: Hive > Issue Type: Bug >Reporter: Panagiotis Garefalakis >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Labels: pull-request-available > Attachments: vector_ptf_classcast_exception.q > > Time Spent: 1h > Remaining Estimate: 0h > > Only reproduces when there is at least 1 buffered batch, so needed 2 rows > with 1 row/batch: > {code:java} > set hive.vectorized.testing.reducer.batch.size=1; > {code} > {code:java} > Caused by: java.lang.ClassCastException: > org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector cannot be cast to > org.apache.hadoop.hive.ql.exec.vector.LongColumnVector > at > org.apache.hadoop.hive.ql.exec.vector.VectorizedBatchUtil.copyNonSelectedColumnVector(VectorizedBatchUtil.java:664) > at > org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFGroupBatches.forwardBufferedBatches(VectorPTFGroupBatches.java:228) > at > org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFGroupBatches.fillGroupResultsAndForward(VectorPTFGroupBatches.java:318) > at > org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFOperator.process(VectorPTFOperator.java:403) > at > org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:919) > at > org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158) > at > org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:497) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (HIVE-25180) Update netty to 4.1.60.Final
[ https://issues.apache.org/jira/browse/HIVE-25180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17358449#comment-17358449 ] Panagiotis Garefalakis edited comment on HIVE-25180 at 6/7/21, 8:50 AM: Resolved via https://github.com/apache/hive/pull/2345 thanks [~Csaba] and [~kgyrtkirk] for the review! was (Author: pgaref): Resolved via https://github.com/apache/hive/pull/2345 thanks [~Csaba] > Update netty to 4.1.60.Final > > > Key: HIVE-25180 > URL: https://issues.apache.org/jira/browse/HIVE-25180 > Project: Hive > Issue Type: Improvement >Affects Versions: 4.0.0 >Reporter: Zoltan Haindrich >Assignee: Csaba Juhász >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-25180) Update netty to 4.1.60.Final
[ https://issues.apache.org/jira/browse/HIVE-25180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis resolved HIVE-25180. --- Resolution: Fixed > Update netty to 4.1.60.Final > > > Key: HIVE-25180 > URL: https://issues.apache.org/jira/browse/HIVE-25180 > Project: Hive > Issue Type: Improvement >Affects Versions: 4.0.0 >Reporter: Zoltan Haindrich >Assignee: Csaba Juhász >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-25180) Update netty to 4.1.60.Final
[ https://issues.apache.org/jira/browse/HIVE-25180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17358449#comment-17358449 ] Panagiotis Garefalakis commented on HIVE-25180: --- Resolved via https://github.com/apache/hive/pull/2345 thanks [~Csaba] > Update netty to 4.1.60.Final > > > Key: HIVE-25180 > URL: https://issues.apache.org/jira/browse/HIVE-25180 > Project: Hive > Issue Type: Improvement >Affects Versions: 4.0.0 >Reporter: Zoltan Haindrich >Assignee: Csaba Juhász >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25180) Update netty to 4.1.60.Final
[ https://issues.apache.org/jira/browse/HIVE-25180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-25180: -- Affects Version/s: 4.0.0 > Update netty to 4.1.60.Final > > > Key: HIVE-25180 > URL: https://issues.apache.org/jira/browse/HIVE-25180 > Project: Hive > Issue Type: Improvement >Affects Versions: 4.0.0 >Reporter: Zoltan Haindrich >Assignee: Csaba Juhász >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25180) Update netty to 4.1.60.Final
[ https://issues.apache.org/jira/browse/HIVE-25180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-25180: -- Fix Version/s: 4.0.0 > Update netty to 4.1.60.Final > > > Key: HIVE-25180 > URL: https://issues.apache.org/jira/browse/HIVE-25180 > Project: Hive > Issue Type: Improvement >Affects Versions: 4.0.0 >Reporter: Zoltan Haindrich >Assignee: Csaba Juhász >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-25180) Update netty to 4.1.60.Final
[ https://issues.apache.org/jira/browse/HIVE-25180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis reassigned HIVE-25180: - Assignee: Csaba Juhász > Update netty to 4.1.60.Final > > > Key: HIVE-25180 > URL: https://issues.apache.org/jira/browse/HIVE-25180 > Project: Hive > Issue Type: Improvement >Reporter: Zoltan Haindrich >Assignee: Csaba Juhász >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25202) Support decimal64 operations for PTF operators
[ https://issues.apache.org/jira/browse/HIVE-25202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-25202: -- Affects Version/s: 4.0.0 > Support decimal64 operations for PTF operators > -- > > Key: HIVE-25202 > URL: https://issues.apache.org/jira/browse/HIVE-25202 > Project: Hive > Issue Type: Bug >Affects Versions: 4.0.0 >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > > After decimal64 vectorization support was added for multiple operators, PTF > operators were found to break the decimal64 chain when they > occur between two such operators. As a result, they introduce an unnecessary cast to > decimal. To prevent this, PTF operators should handle > decimal64 data types too. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-24037) Parallelize hash table constructions in map joins
[ https://issues.apache.org/jira/browse/HIVE-24037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis reassigned HIVE-24037: - Assignee: (was: Ramesh Kumar Thangarajan) > Parallelize hash table constructions in map joins > - > > Key: HIVE-24037 > URL: https://issues.apache.org/jira/browse/HIVE-24037 > Project: Hive > Issue Type: Improvement >Reporter: Ramesh Kumar Thangarajan >Priority: Major > Labels: pull-request-available > Time Spent: 2h 20m > Remaining Estimate: 0h > > Parallelize hash table constructions in map joins -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-25163) UnsupportedTemporalTypeException when starting llap
[ https://issues.apache.org/jira/browse/HIVE-25163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis resolved HIVE-25163. --- Resolution: Fixed > UnsupportedTemporalTypeException when starting llap > --- > > Key: HIVE-25163 > URL: https://issues.apache.org/jira/browse/HIVE-25163 > Project: Hive > Issue Type: Bug > Components: llap >Affects Versions: 4.0.0 >Reporter: Istvan Toth >Assignee: Istvan Toth >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 20m > Remaining Estimate: 0h > > When trying to start the LLAP service I get > {noformat} > java.time.temporal.UnsupportedTemporalTypeException: Unsupported field: Year > at java.time.Instant.getLong(Instant.java:603) > at > java.time.format.DateTimePrintContext$1.getLong(DateTimePrintContext.java:205) > at > java.time.format.DateTimePrintContext.getValue(DateTimePrintContext.java:298) > at > java.time.format.DateTimeFormatterBuilder$NumberPrinterParser.format(DateTimeFormatterBuilder.java:2551) > at > java.time.format.DateTimeFormatterBuilder$CompositePrinterParser.format(DateTimeFormatterBuilder.java:2190) > at > java.time.format.DateTimeFormatter.formatTo(DateTimeFormatter.java:1746) > at > java.time.format.DateTimeFormatter.format(DateTimeFormatter.java:1720) > at > org.apache.hadoop.hive.llap.cli.service.LlapServiceDriver.startLlap(LlapServiceDriver.java:301) > at > org.apache.hadoop.hive.llap.cli.service.LlapServiceDriver.run(LlapServiceDriver.java:133) > at > org.apache.hadoop.hive.llap.cli.service.LlapServiceDriver.main(LlapServiceDriver.java:386) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-25163) UnsupportedTemporalTypeException when starting llap
[ https://issues.apache.org/jira/browse/HIVE-25163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17357250#comment-17357250 ] Panagiotis Garefalakis commented on HIVE-25163: --- Resolved via https://github.com/apache/hive/pull/2322 Thanks for the patch [~stoty] ! > UnsupportedTemporalTypeException when starting llap > --- > > Key: HIVE-25163 > URL: https://issues.apache.org/jira/browse/HIVE-25163 > Project: Hive > Issue Type: Bug > Components: llap >Affects Versions: 4.0.0 >Reporter: Istvan Toth >Assignee: Istvan Toth >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 20m > Remaining Estimate: 0h > > When trying to start the LLAP service I get > {noformat} > java.time.temporal.UnsupportedTemporalTypeException: Unsupported field: Year > at java.time.Instant.getLong(Instant.java:603) > at > java.time.format.DateTimePrintContext$1.getLong(DateTimePrintContext.java:205) > at > java.time.format.DateTimePrintContext.getValue(DateTimePrintContext.java:298) > at > java.time.format.DateTimeFormatterBuilder$NumberPrinterParser.format(DateTimeFormatterBuilder.java:2551) > at > java.time.format.DateTimeFormatterBuilder$CompositePrinterParser.format(DateTimeFormatterBuilder.java:2190) > at > java.time.format.DateTimeFormatter.formatTo(DateTimeFormatter.java:1746) > at > java.time.format.DateTimeFormatter.format(DateTimeFormatter.java:1720) > at > org.apache.hadoop.hive.llap.cli.service.LlapServiceDriver.startLlap(LlapServiceDriver.java:301) > at > org.apache.hadoop.hive.llap.cli.service.LlapServiceDriver.run(LlapServiceDriver.java:133) > at > org.apache.hadoop.hive.llap.cli.service.LlapServiceDriver.main(LlapServiceDriver.java:386) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25163) UnsupportedTemporalTypeException when starting llap
[ https://issues.apache.org/jira/browse/HIVE-25163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-25163: -- Fix Version/s: 4.0.0 > UnsupportedTemporalTypeException when starting llap > --- > > Key: HIVE-25163 > URL: https://issues.apache.org/jira/browse/HIVE-25163 > Project: Hive > Issue Type: Bug > Components: llap >Affects Versions: 4.0.0 >Reporter: Istvan Toth >Assignee: Istvan Toth >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 20m > Remaining Estimate: 0h > > When trying to start the LLAP service I get > {noformat} > java.time.temporal.UnsupportedTemporalTypeException: Unsupported field: Year > at java.time.Instant.getLong(Instant.java:603) > at > java.time.format.DateTimePrintContext$1.getLong(DateTimePrintContext.java:205) > at > java.time.format.DateTimePrintContext.getValue(DateTimePrintContext.java:298) > at > java.time.format.DateTimeFormatterBuilder$NumberPrinterParser.format(DateTimeFormatterBuilder.java:2551) > at > java.time.format.DateTimeFormatterBuilder$CompositePrinterParser.format(DateTimeFormatterBuilder.java:2190) > at > java.time.format.DateTimeFormatter.formatTo(DateTimeFormatter.java:1746) > at > java.time.format.DateTimeFormatter.format(DateTimeFormatter.java:1720) > at > org.apache.hadoop.hive.llap.cli.service.LlapServiceDriver.startLlap(LlapServiceDriver.java:301) > at > org.apache.hadoop.hive.llap.cli.service.LlapServiceDriver.run(LlapServiceDriver.java:133) > at > org.apache.hadoop.hive.llap.cli.service.LlapServiceDriver.main(LlapServiceDriver.java:386) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
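The stack trace in HIVE-25163 matches a well-known java.time behaviour: an Instant carries no time zone, so a DateTimeFormatter that prints calendar fields such as the year cannot format it unless the formatter is given a zone. The sketch below reproduces that behaviour in isolation. The class name InstantFormatSketch and the pattern are hypothetical; the formatter actually used in LlapServiceDriver is not shown in the ticket, and withZone is one common remedy, not necessarily what the merged patch does.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.temporal.UnsupportedTemporalTypeException;

public class InstantFormatSketch {

    // Hypothetical pattern, chosen only because it prints a year field.
    static final String PATTERN = "uuuu-MM-dd HH:mm:ss";

    /** Returns true if formatting the Instant without a zone throws. */
    static boolean failsWithoutZone(Instant instant) {
        try {
            DateTimeFormatter.ofPattern(PATTERN).format(instant);
            return false;
        } catch (UnsupportedTemporalTypeException e) {
            // Instant does not support calendar fields such as Year.
            return true;
        }
    }

    /** Formats the Instant after attaching an explicit zone to the formatter. */
    static String formatWithZone(Instant instant, ZoneId zone) {
        return DateTimeFormatter.ofPattern(PATTERN).withZone(zone).format(instant);
    }

    public static void main(String[] args) {
        System.out.println("fails without zone: " + failsWithoutZone(Instant.EPOCH));
        System.out.println(formatWithZone(Instant.EPOCH, ZoneOffset.UTC)); // 1970-01-01 00:00:00
    }
}
```

Converting the Instant to a ZonedDateTime or LocalDateTime before formatting is an equivalent alternative when the zone is known at the call site.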
[jira] [Commented] (HIVE-25169) using coalesce via vector,source column type is int and target column type is bigint,the result of target is zero
[ https://issues.apache.org/jira/browse/HIVE-25169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17353306#comment-17353306 ] Panagiotis Garefalakis commented on HIVE-25169: --- Hey [~junnan.yang] thanks for reporting this! Would it make sense to backport the ticket that resolved this from master? On a general note it would be much easier to review this with a github PR and a test case. Cheers > using coalesce via vector,source column type is int and target column type is > bigint,the result of target is zero > - > > Key: HIVE-25169 > URL: https://issues.apache.org/jira/browse/HIVE-25169 > Project: Hive > Issue Type: Bug > Components: Vectorization >Affects Versions: 3.1.2 >Reporter: junnan.yang >Priority: Major > Attachments: HIVE-25169.01.patch > > > sourceTable: > product_id int; > ### > targetTable: > product_id bigint; > ## > sql: > insert overwrite table targetTable: > select > .. > coalesce(product_id,-1), > .. > from sourceTable; > ## > explain sql : > UDFToLong(COALESCE(product_id,-1)) (type: bigint) > ## > result : > the column product_id in targetTable is zero, this is wrong result > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-25155) Bump ORC to 1.6.8
[ https://issues.apache.org/jira/browse/HIVE-25155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17352606#comment-17352606 ] Panagiotis Garefalakis commented on HIVE-25155: --- Resolved via https://github.com/apache/hive/pull/2313 > Bump ORC to 1.6.8 > - > > Key: HIVE-25155 > URL: https://issues.apache.org/jira/browse/HIVE-25155 > Project: Hive > Issue Type: Improvement >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Trivial > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > https://orc.apache.org/news/2021/05/21/ORC-1.6.8/ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-25155) Bump ORC to 1.6.8
[ https://issues.apache.org/jira/browse/HIVE-25155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis resolved HIVE-25155. --- Resolution: Fixed > Bump ORC to 1.6.8 > - > > Key: HIVE-25155 > URL: https://issues.apache.org/jira/browse/HIVE-25155 > Project: Hive > Issue Type: Improvement >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Trivial > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > https://orc.apache.org/news/2021/05/21/ORC-1.6.8/ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25155) Bump ORC to 1.6.8
[ https://issues.apache.org/jira/browse/HIVE-25155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-25155: -- Fix Version/s: 4.0.0 > Bump ORC to 1.6.8 > - > > Key: HIVE-25155 > URL: https://issues.apache.org/jira/browse/HIVE-25155 > Project: Hive > Issue Type: Improvement >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Trivial > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > https://orc.apache.org/news/2021/05/21/ORC-1.6.8/ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25148) Support parallel load for Optimized HT implementations
[ https://issues.apache.org/jira/browse/HIVE-25148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-25148: -- Summary: Support parallel load for Optimized HT implementations (was: Support parallel load for Fast HT implementations) > Support parallel load for Optimized HT implementations > -- > > Key: HIVE-25148 > URL: https://issues.apache.org/jira/browse/HIVE-25148 > Project: Hive > Issue Type: Sub-task >Reporter: Panagiotis Garefalakis >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)