[jira] [Updated] (IMPALA-12918) Do not allow non-numeric values in Hive table stats during an alter table

2024-04-22 Thread Quanlong Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Quanlong Huang updated IMPALA-12918:

Labels: alter alter-table catalog-2024 newbie ramp-up stats validation  
(was: alter alter-table stats validation)

> Do not allow non-numeric values in Hive table stats during an alter table
> -
>
> Key: IMPALA-12918
> URL: https://issues.apache.org/jira/browse/IMPALA-12918
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog
> Affects Versions: Impala 4.0.0
> Reporter: Miklos Szurap
> Priority: Major
> Labels: alter, alter-table, catalog-2024, newbie, ramp-up, stats, validation
>
> Hive table properties are strings by nature; however, some of them have
> special meaning and should hold numeric values, such as "totalSize",
> "numRows", and "rawDataSize".
> Impala currently allows these to be set to non-numeric values (including
> the empty string).
> Certain applications (Spark, for example) then fail with quite obscure
> "NumberFormatException" errors when trying to access such broken tables
> (see SPARK-47444).
> Impala should also validate "alter table" statements and reject
> non-numeric values in the "totalSize", "numRows", and "rawDataSize" table
> properties.
> For example, the following query may break the table (afterwards it can't
> be read from Spark):
> {code}
> [impalacoordinator:21000] default> alter table t1p set
> tblproperties('numRows'='', 'STATS_GENERATED_VIA_STATS_TASK'='true');
> {code}
> Note: beeline/Hive already validates "numRows" and "rawDataSize" in alter
> table statements; "totalSize" still needs validation there too.
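
For whoever picks this up, the check requested above might look roughly like the sketch below. It is written in Python purely for illustration and is not Impala's catalog code. The function names `find_invalid_stats` and `validate_stats_properties` are hypothetical, and accepting any base-10 integer in the signed 64-bit range (so that a value such as -1 would still pass) is an assumption that the issue itself does not specify.

```python
import re

# Property names taken from the issue description.
NUMERIC_STATS_KEYS = ("totalSize", "numRows", "rawDataSize")

# Assumption: values must be base-10 integers within the signed 64-bit range.
_INTEGER = re.compile(r"-?[0-9]+")
_LONG_MIN, _LONG_MAX = -(2**63), 2**63 - 1


def find_invalid_stats(tbl_properties):
    """Return {key: value} for stats properties whose value is not an integer."""
    invalid = {}
    for key in NUMERIC_STATS_KEYS:
        if key not in tbl_properties:
            continue  # Property is not being set; nothing to check.
        value = tbl_properties[key]
        is_integer = isinstance(value, str) and _INTEGER.fullmatch(value) is not None
        if not is_integer or not _LONG_MIN <= int(value) <= _LONG_MAX:
            invalid[key] = value
    return invalid


def validate_stats_properties(tbl_properties):
    """Raise ValueError if any stats property holds a non-numeric value."""
    invalid = find_invalid_stats(tbl_properties)
    if invalid:
        details = ", ".join(f"{k}={v!r}" for k, v in sorted(invalid.items()))
        raise ValueError(f"Table stats properties must be integers: {details}")
```

With this sketch, the properties from the statement above, `{"numRows": "", "STATS_GENERATED_VIA_STATS_TASK": "true"}`, would be rejected because of the empty "numRows", while other, non-stats properties are left unchecked.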



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-12918) Do not allow non-numeric values in Hive table stats during an alter table

2024-03-18 Thread Miklos Szurap (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Szurap updated IMPALA-12918:
---
Description: 
Hive table properties are strings by nature; however, some of them have
special meaning and should hold numeric values, such as "totalSize",
"numRows", and "rawDataSize".
Impala currently allows these to be set to non-numeric values (including the
empty string).
Certain applications (Spark, for example) then fail with quite obscure
"NumberFormatException" errors when trying to access such broken tables (see
SPARK-47444).

Impala should also validate "alter table" statements and reject non-numeric
values in the "totalSize", "numRows", and "rawDataSize" table properties.

For example, the following query may break the table (afterwards it can't be
read from Spark):
{code}
[impalacoordinator:21000] default> alter table t1p set
tblproperties('numRows'='', 'STATS_GENERATED_VIA_STATS_TASK'='true');
{code}
Note: beeline/Hive already validates "numRows" and "rawDataSize" in alter
table statements; "totalSize" still needs validation there too.

  was:
Hive table properties are string in their nature, however some of them have 
special meaning and should have numeric values, like the "totalSize", 
"numRows", "rawDataSize". 
Impala currently allows these to be set to non-numeric values (including empty 
string).
From certain applications (like from Spark) we get quite obscure
"NumberFormatException" errors while trying to access such broken tables. (see
SPARK-47444)

Impala should also validate "alter table" statements and not allow non-numeric 
values in the "totalSize", "numRows", "rawDataSize" table properties.

For example a query which may break the table (after it can't be read Spark):
{code}
[impalacoordinator:21000] default> alter table t1p set 
tblproperties('numRows'='', 'STATS_GENERATED_VIA_STATS_TASK'='true');
{code}
Note: beeline/Hive validates alter table statements with the "numRows" and 
"rawDataSize", the "totalSize" still needs validation there too.


> Do not allow non-numeric values in Hive table stats during an alter table
> -
>
> Key: IMPALA-12918
> URL: https://issues.apache.org/jira/browse/IMPALA-12918
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog
> Affects Versions: Impala 4.0.0
> Reporter: Miklos Szurap
> Priority: Major
> Labels: alter, alter-table, stats, validation
>
> Hive table properties are strings by nature; however, some of them have
> special meaning and should hold numeric values, such as "totalSize",
> "numRows", and "rawDataSize".
> Impala currently allows these to be set to non-numeric values (including
> the empty string).
> Certain applications (Spark, for example) then fail with quite obscure
> "NumberFormatException" errors when trying to access such broken tables
> (see SPARK-47444).
> Impala should also validate "alter table" statements and reject
> non-numeric values in the "totalSize", "numRows", and "rawDataSize" table
> properties.
> For example, the following query may break the table (afterwards it can't
> be read from Spark):
> {code}
> [impalacoordinator:21000] default> alter table t1p set
> tblproperties('numRows'='', 'STATS_GENERATED_VIA_STATS_TASK'='true');
> {code}
> Note: beeline/Hive already validates "numRows" and "rawDataSize" in alter
> table statements; "totalSize" still needs validation there too.






[jira] [Updated] (IMPALA-12918) Do not allow non-numeric values in Hive table stats during an alter table

2024-03-18 Thread Miklos Szurap (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Szurap updated IMPALA-12918:
---
Affects Version/s: Impala 4.0.0

> Do not allow non-numeric values in Hive table stats during an alter table
> -
>
> Key: IMPALA-12918
> URL: https://issues.apache.org/jira/browse/IMPALA-12918
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog
> Affects Versions: Impala 4.0.0
> Reporter: Miklos Szurap
> Priority: Major
> Labels: alter, alter-table, stats, validation
>
> Hive table properties are strings by nature; however, some of them have
> special meaning and should hold numeric values, such as "totalSize",
> "numRows", and "rawDataSize".
> Impala currently allows these to be set to non-numeric values (including
> the empty string).
> Certain applications (Spark, for example) then fail with quite obscure
> "NumberFormatException" errors when trying to access such broken tables
> (see SPARK-47444).
> Impala should also validate "alter table" statements and reject
> non-numeric values in the "totalSize", "numRows", and "rawDataSize" table
> properties.
> For example a query which may break the table (after it can't be read Spark):
> {code}
> [impalacoordinator:21000] default> alter table t1p set 
> tblproperties('numRows'='', 'STATS_GENERATED_VIA_STATS_TASK'='true');
> {code}
> Note: beeline/Hive already validates "numRows" and "rawDataSize" in alter
> table statements; "totalSize" still needs validation there too.


