[jira] [Commented] (HIVE-15670) column_stats_accurate may not fit in PARTITION_PARAMS.VALUE

2017-10-30 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16225476#comment-16225476
 ] 

Sergey Shelukhin commented on HIVE-15670:
-

Yeah, I suggested a bitmask as one of the possible solutions. Let me modify 
the description :)

> column_stats_accurate may not fit in PARTITION_PARAMS.VALUE
> ---
>
> Key: HIVE-15670
> URL: https://issues.apache.org/jira/browse/HIVE-15670
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
>
> The JSON can be too big with many columns (see the setColumnStatsState method).
> We can make the JSON more compact by storing only the list of columns with true 
> values. Or we can even store a bitmask in a dedicated column, and adjust it 
> when altering the table (rare enough). Or we can just change the VALUE column to 
> a text blob (might be a painful change wrt upgrade scripts and supporting all 
> the DBs' varied blob implementations, esp. in directsql).
> Storing denormalized flags in a separate table will probably be slow, 
> comparatively.
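
To make the size trade-off in the description concrete, here is a minimal sketch comparing three encodings of per-column "stats accurate" flags: a verbose map with one entry per column, a list of only the true columns, and a bitmask. The class name, method names, and JSON layouts are hypothetical and chosen for illustration; this is not Hive's actual setColumnStatsState code or its stored format.

```java
import java.util.ArrayList;
import java.util.Base64;
import java.util.BitSet;
import java.util.List;

// Hypothetical helper, not part of Hive. Column names are assumed to need
// no JSON escaping, which keeps the sketch short.
public class ColumnStatsFlags {

    /** Verbose form: every column name mapped to "true" or "false". */
    public static String toVerboseJson(List<String> columns, BitSet accurate) {
        StringBuilder sb = new StringBuilder("{");
        for (int i = 0; i < columns.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append('"').append(columns.get(i)).append("\":\"")
              .append(accurate.get(i)).append('"');
        }
        return sb.append('}').toString();
    }

    /** Compact form: a JSON array holding only the columns whose flag is true. */
    public static String toTrueOnlyJson(List<String> columns, BitSet accurate) {
        StringBuilder sb = new StringBuilder("[");
        boolean first = true;
        for (int i = 0; i < columns.size(); i++) {
            if (!accurate.get(i)) {
                continue;
            }
            if (!first) {
                sb.append(',');
            }
            sb.append('"').append(columns.get(i)).append('"');
            first = false;
        }
        return sb.append(']').toString();
    }

    /** Bitmask form: bit i is set when column i has accurate stats; Base64 so it fits a text column. */
    public static String toBitmask(BitSet accurate) {
        return Base64.getEncoder().encodeToString(accurate.toByteArray());
    }

    /** Inverse of toBitmask. */
    public static BitSet fromBitmask(String encoded) {
        return BitSet.valueOf(Base64.getDecoder().decode(encoded));
    }

    public static void main(String[] args) {
        // Assumed workload: 1000 columns, every other one with accurate stats.
        List<String> columns = new ArrayList<>();
        BitSet accurate = new BitSet();
        for (int i = 0; i < 1000; i++) {
            columns.add("col_" + i);
            if (i % 2 == 0) {
                accurate.set(i);
            }
        }
        System.out.println("verbose length:   " + toVerboseJson(columns, accurate).length());
        System.out.println("true-only length: " + toTrueOnlyJson(columns, accurate).length());
        System.out.println("bitmask length:   " + toBitmask(accurate).length());
    }
}
```

The bitmask depends on column position rather than name, so it would have to be rewritten whenever columns are added, dropped, or reordered, which matches the "adjust it when altering table" caveat in the description.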



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-15670) column_stats_accurate may not fit in PARTITION_PARAMS.VALUE

2017-10-27 Thread Alexander Behm (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16223239#comment-16223239
 ] 

Alexander Behm commented on HIVE-15670:
---

Thanks for the response. From the perspective of a client issuing an RPC to 
alter column stats, it seems like a rather questionable side-effect to also 
alter the table metadata with a pretty big payload. Instead of "fixing" this 
issue by changing the database schema, could we instead remove the JSON string 
altogether?

I'm definitely not familiar with the implementation details, just trying to 
provide a perspective from a Metastore client that is not Hive.



[jira] [Commented] (HIVE-15670) column_stats_accurate may not fit in PARTITION_PARAMS.VALUE

2017-10-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16217948#comment-16217948
 ] 

Sergey Shelukhin commented on HIVE-15670:
-

Beats me... the current implementation is as such.



[jira] [Commented] (HIVE-15670) column_stats_accurate may not fit in PARTITION_PARAMS.VALUE

2017-10-24 Thread Alexander Behm (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16217912#comment-16217912
 ] 

Alexander Behm commented on HIVE-15670:
---

May I ask what the purpose of storing this JSON in the table properties is? It 
seems pretty expensive to me. If you want to keep track of the accuracy of column 
stats, why not populate a "last updated" timestamp in the appropriate column 
statistic?



[jira] [Commented] (HIVE-15670) column_stats_accurate may not fit in PARTITION_PARAMS.VALUE

2017-01-19 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15831055#comment-15831055
 ] 

Sergey Shelukhin commented on HIVE-15670:
-

[~prasanth_j] [~pxiong] fyi

> column_stats_accurate may not fit in PARTITION_PARAMS.VALUE
> ---
>
> Key: HIVE-15670
> URL: https://issues.apache.org/jira/browse/HIVE-15670
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
>
> The JSON can be too big with many columns.
> We can make JSON more compact by only storing the list of columns with true 
> values. Or we can even store a bitmask in a dedicated column, and adjust it 
> when altering table (rare enough). Or we can just change the VALUE column to 
> text blob (might be a painful change wrt upgrade scripts, and supporting all 
> the DBs' varied blob implementations, esp. in directsql).
> Storing denormalized flags in a separate table will probably be slow, 
> comparatively.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)