[ 
https://issues.apache.org/jira/browse/HIVE-27163?focusedWorklogId=858831&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-858831
 ]

ASF GitHub Bot logged work on HIVE-27163:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 25/Apr/23 03:15
            Start Date: 25/Apr/23 03:15
    Worklog Time Spent: 10m 
      Work Description: dengzhhu653 commented on code in PR #4228:
URL: https://github.com/apache/hive/pull/4228#discussion_r1175967949


##########
iceberg/iceberg-handler/src/test/results/positive/use_basic_stats_from_iceberg.q.out:
##########
@@ -396,14 +396,14 @@ Stage-0
     Stage-1
       Reducer 2 vectorized
       File Output Operator [FS_8]
-        Select Operator [SEL_7] (rows=18 width=95)
+        Select Operator [SEL_7] (rows=18 width=192)
           Output:["_col0","_col1","_col2"]
         <-Map 1 [SIMPLE_EDGE] vectorized
           SHUFFLE [RS_6]
-            Select Operator [SEL_5] (rows=18 width=95)
+            Select Operator [SEL_5] (rows=18 width=192)
               Output:["_col0","_col1","_col2"]
-              TableScan [TS_0] (rows=18 width=95)
-                default@tbl_ice,tbl_ice,Tbl:COMPLETE,Col:COMPLETE,Output:["a","b","c"]

Review Comment:
   We don't remove the table directory of `tbl_ice`, so the column statistics 
should be stale by design.





Issue Time Tracking
-------------------

    Worklog Id:     (was: 858831)
    Time Spent: 1h 10m  (was: 1h)

> Column stats are not getting published after an insert query into an external 
> table with custom location
> --------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-27163
>                 URL: https://issues.apache.org/jira/browse/HIVE-27163
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>            Reporter: Taraka Rama Rao Lethavadla
>            Assignee: Zhihua Deng
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Test case details are below:
> *test.q*
> {noformat}
> set hive.stats.column.autogather=true;
> set hive.stats.autogather=true;
> dfs ${system:test.dfs.mkdir} ${system:test.tmp.dir}/test;
> create external table test_custom(age int, name string) stored as orc 
> location '/tmp/test';
> insert into test_custom select 1, 'test';
> desc formatted test_custom age;{noformat}
> *test.q.out*
>  
>  
> {noformat}
> #### A masked pattern was here ####
> PREHOOK: type: CREATETABLE
> #### A masked pattern was here ####
> PREHOOK: Output: database:default
> PREHOOK: Output: default@test_custom
> #### A masked pattern was here ####
> POSTHOOK: type: CREATETABLE
> #### A masked pattern was here ####
> POSTHOOK: Output: database:default
> POSTHOOK: Output: default@test_custom
> PREHOOK: query: insert into test_custom select 1, 'test'
> PREHOOK: type: QUERY
> PREHOOK: Input: _dummy_database@_dummy_table
> PREHOOK: Output: default@test_custom
> POSTHOOK: query: insert into test_custom select 1, 'test'
> POSTHOOK: type: QUERY
> POSTHOOK: Input: _dummy_database@_dummy_table
> POSTHOOK: Output: default@test_custom
> POSTHOOK: Lineage: test_custom.age SIMPLE []
> POSTHOOK: Lineage: test_custom.name SIMPLE []
> PREHOOK: query: desc formatted test_custom age
> PREHOOK: type: DESCTABLE
> PREHOOK: Input: default@test_custom
> POSTHOOK: query: desc formatted test_custom age
> POSTHOOK: type: DESCTABLE
> POSTHOOK: Input: default@test_custom
> col_name                age
> data_type               int
> min
> max
> num_nulls
> distinct_count
> avg_col_len
> max_col_len
> num_trues
> num_falses
> bit_vector
> comment                 from deserializer{noformat}
> As the desc formatted output shows, the column stats were not populated.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
