[jira] [Updated] (HUDI-3664) Column Stats are computed incorrectly right now

2022-03-31 Thread Alexey Kudinkin (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-3664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Kudinkin updated HUDI-3664:
--
Status: Patch Available  (was: In Progress)

> Column Stats are computed incorrectly right now
> ---
>
> Key: HUDI-3664
> URL: https://issues.apache.org/jira/browse/HUDI-3664
> Project: Apache Hudi
> Issue Type: Bug
> Reporter: Sagar Sumit
> Assignee: Alexey Kudinkin
> Priority: Blocker
> Labels: pull-request-available
> Fix For: 0.11.0
>
>
> Currently, statistics are stored as strings rather than in the actual type of 
> the column they originate from, which requires proper decoding on the reader 
> side.
>  
> This also has direct performance implications, since storing primitive types 
> as strings is vastly inefficient.
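
To illustrate why string-typed statistics can be incorrect, here is a minimal 
sketch in plain Java. It is not Hudi code: the class and method names are 
hypothetical and only the JDK is used. It computes min/max for a numeric column 
once over the string form of the values and once over the typed values.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration, not part of Hudi.
public final class ColumnStatsOrdering {

    // Min/max over the string form of the values: ordering is lexicographic.
    public static String[] minMaxAsStrings(List<Long> values) {
        List<String> encoded = values.stream()
                .map(String::valueOf)
                .collect(Collectors.toList());
        return new String[] {Collections.min(encoded), Collections.max(encoded)};
    }

    // Min/max over the typed values: ordering is numeric.
    public static long[] minMaxTyped(List<Long> values) {
        return new long[] {Collections.min(values), Collections.max(values)};
    }

    public static void main(String[] args) {
        List<Long> column = Arrays.asList(9L, 10L, 100L, 25L);
        System.out.println("string stats: " + Arrays.toString(minMaxAsStrings(column)));
        System.out.println("typed stats:  " + Arrays.toString(minMaxTyped(column)));
    }
}
```

For the sample column the string-based pair is ("10", "9"), while the typed 
pair is (9, 100). A reader that parses the string pair back into numbers sees 
an empty range and could wrongly skip a file that contains matching rows.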



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HUDI-3664) Column Stats are computed incorrectly right now

2022-03-31 Thread Alexey Kudinkin (Jira)



Alexey Kudinkin updated HUDI-3664:
--
Story Points: 6  (was: 3)



[jira] [Updated] (HUDI-3664) Column Stats are computed incorrectly right now

2022-03-31 Thread Alexey Kudinkin (Jira)



Alexey Kudinkin updated HUDI-3664:
--
Description: 
Currently, statistics are stored as strings rather than in the actual type of 
the column they originate from, which requires proper decoding on the reader 
side.

This also has direct performance implications, since storing primitive types as 
strings is vastly inefficient.

  was: Instead of using string comparators, convert Avro schema to native Java 
type.
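
The storage cost mentioned in the updated description can be sketched with the 
JDK alone. This is an illustration under assumptions, not Hudi's actual 
encoding: the class and method names are hypothetical. It compares a long 
stored as its decimal string with the same long stored as a fixed-width 
8-byte value.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration, not part of Hudi.
public final class StatEncodingSize {

    // Decimal string form: one byte per character, variable width.
    public static byte[] encodeAsString(long value) {
        return Long.toString(value).getBytes(StandardCharsets.UTF_8);
    }

    // Fixed-width binary form: always Long.BYTES (8) bytes.
    public static byte[] encodeAsLong(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    // Reading the string form back requires parsing the digits.
    public static long decodeString(byte[] bytes) {
        return Long.parseLong(new String(bytes, StandardCharsets.UTF_8));
    }

    // Reading the binary form back is a direct fixed-width read.
    public static long decodeLong(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getLong();
    }

    public static void main(String[] args) {
        long sample = 1648684800000L; // an arbitrary 13-digit value
        System.out.println("as string: " + encodeAsString(sample).length + " bytes");
        System.out.println("as long:   " + encodeAsLong(sample).length + " bytes");
    }
}
```

The string form grows with the number of digits (up to 20 bytes for 
Long.MIN_VALUE) and must be parsed on every read. Very small values are 
shorter as strings, so the saving depends on the data.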




[jira] [Updated] (HUDI-3664) Column Stats are computed incorrectly right now

2022-03-28 Thread Raymond Xu (Jira)



Raymond Xu updated HUDI-3664:
-
Sprint: Hudi-Sprint-Mar-14, Hudi-Sprint-Mar-21, Hudi-Sprint-Mar-22  (was: 
Hudi-Sprint-Mar-14, Hudi-Sprint-Mar-21)

> Column Stats are computed incorrectly right now
> ---
>
> Key: HUDI-3664
> URL: https://issues.apache.org/jira/browse/HUDI-3664
> Project: Apache Hudi
> Issue Type: Bug
> Reporter: Sagar Sumit
> Assignee: Sagar Sumit
> Priority: Blocker
> Labels: pull-request-available
> Fix For: 0.11.0
>
>
> Instead of using string comparators, convert Avro schema to native Java type.





[jira] [Updated] (HUDI-3664) Column Stats are computed incorrectly right now

2022-03-22 Thread Raymond Xu (Jira)



Raymond Xu updated HUDI-3664:
-
Sprint: Hudi-Sprint-Mar-14, Hudi-Sprint-Mar-21  (was: Hudi-Sprint-Mar-14)



[jira] [Updated] (HUDI-3664) Column Stats are computed incorrectly right now

2022-03-21 Thread Sagar Sumit (Jira)



Sagar Sumit updated HUDI-3664:
--
Story Points: 3  (was: 1)



[jira] [Updated] (HUDI-3664) Column Stats are computed incorrectly right now

2022-03-21 Thread Sagar Sumit (Jira)



Sagar Sumit updated HUDI-3664:
--
Sprint: Hudi-Sprint-Mar-14



[jira] [Updated] (HUDI-3664) Column Stats are computed incorrectly right now

2022-03-21 Thread Sagar Sumit (Jira)



Sagar Sumit updated HUDI-3664:
--
Status: In Progress  (was: Open)



[jira] [Updated] (HUDI-3664) Column Stats are computed incorrectly right now

2022-03-20 Thread ASF GitHub Bot (Jira)



ASF GitHub Bot updated HUDI-3664:
-
Labels: pull-request-available  (was: )



[jira] [Updated] (HUDI-3664) Column Stats are computed incorrectly right now

2022-03-20 Thread Sagar Sumit (Jira)



Sagar Sumit updated HUDI-3664:
--
Description: Instead of using string comparators, convert Avro schema to 
native Java type.  (was: Instead of using string comparators, use Parquet's 
comparator while aggregating column stats.)

> Column Stats are computed incorrectly right now
> ---
>
> Key: HUDI-3664
> URL: https://issues.apache.org/jira/browse/HUDI-3664
> Project: Apache Hudi
> Issue Type: Bug
> Reporter: Sagar Sumit
> Assignee: Sagar Sumit
> Priority: Blocker
> Fix For: 0.11.0
>
>
> Instead of using string comparators, convert Avro schema to native Java type.


