Re: Review Request 65415: HIVE-18571 stats issues for MM tables

2018-03-06 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65415/
---

(Updated March 6, 2018, 8:44 p.m.)


Review request for hive and Eugene Koifman.


Repository: hive-git


Description
---

f.,v fbghdscd


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/common/HiveStatsUtils.java df77a4a2f2 
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java d955b48360 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenarios.java
 41c89b1cd3 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosAcrossInstances.java
 6e8d6b62a5 
  ql/src/java/org/apache/hadoop/hive/ql/QueryPlan.java 9ea7a7c79b 
  ql/src/java/org/apache/hadoop/hive/ql/exec/CopyTask.java eee5e66ea7 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java b490325091 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 804cd7868b 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/table/LoadPartitions.java
 0a82225d4a 
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java ced84b3e15 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileWork.java 1a63d3f971 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java c0be51e0b2 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 
1c6b793e11 
  ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 
7d2de75315 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 7f446ca1de 
  ql/src/java/org/apache/hadoop/hive/ql/plan/BasicStatsWork.java a4e770ce95 
  ql/src/java/org/apache/hadoop/hive/ql/plan/ConditionalResolverMergeFiles.java 
8ce0cb05b6 
  ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsNoJobTask.java 
946c300750 
  ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsTask.java 1d7660e8b2 
  ql/src/java/org/apache/hadoop/hive/ql/stats/ColStatsProcessor.java 7591c0681b 
  ql/src/java/org/apache/hadoop/hive/ql/stats/Partish.java 05b0474e90 
  ql/src/java/org/apache/hadoop/hive/ql/stats/fs/FSStatsAggregator.java 
d84cf136d5 
  ql/src/java/org/apache/hadoop/hive/ql/stats/fs/FSStatsPublisher.java 
922cfc23c0 
  ql/src/test/queries/clientnegative/orc_change_fileformat_acid.q cc73616a32 
  ql/src/test/queries/clientnegative/orc_change_serde_acid.q 91a2be50c0 
  ql/src/test/queries/clientnegative/orc_reorder_columns1_acid.q 234e74bb74 
  ql/src/test/queries/clientnegative/orc_reorder_columns2_acid.q 57ab049c6d 
  ql/src/test/queries/clientnegative/orc_replace_columns1_acid.q 9fe9209d03 
  ql/src/test/queries/clientnegative/orc_replace_columns2_acid.q 7b37757ebf 
  ql/src/test/queries/clientnegative/orc_replace_columns3_acid.q e3cb819b62 
  ql/src/test/queries/clientnegative/orc_type_promotion1_acid.q 3a8c08a829 
  ql/src/test/queries/clientnegative/orc_type_promotion2_acid.q 1d24b1dd18 
  ql/src/test/queries/clientnegative/orc_type_promotion3_acid.q 83764e29cc 
  ql/src/test/queries/clientpositive/acid_nullscan.q d048231584 
  ql/src/test/results/clientpositive/acid_nullscan.q.out 669fa3fa47 
  ql/src/test/results/clientpositive/autoColumnStats_4.q.out 1f4c0adfc7 
  ql/src/test/results/clientpositive/llap/acid_bucket_pruning.q.out 1abd3a29f1 
  ql/src/test/results/clientpositive/llap/acid_vectorization_original.q.out 
64e5b17936 
  ql/src/test/results/clientpositive/llap/default_constraint.q.out 89b1224004 
  ql/src/test/results/clientpositive/llap/dynpart_sort_optimization_acid.q.out 
6a97736008 
  ql/src/test/results/clientpositive/llap/enforce_constraint_notnull.q.out 
352f6bad01 
  
ql/src/test/results/clientpositive/llap/materialized_view_create_rewrite_3.q.out
 d8863a2c80 
  
ql/src/test/results/clientpositive/llap/materialized_view_create_rewrite_rebuild_dummy.q.out
 d8863a2c80 
  ql/src/test/results/clientpositive/llap/mm_all.q.out 23e733b4c0 
  ql/src/test/results/clientpositive/materialized_view_create_rewrite_3.q.out 
29e408c60c 
  ql/src/test/results/clientpositive/mm_all.q.out ac6c08057c 
  ql/src/test/results/clientpositive/mm_default.q.out 1345efdfb6 
  ql/src/test/results/clientpositive/tez/acid_vectorization_original_tez.q.out 
92a04ddbf3 
  ql/src/test/results/clientpositive/tez/explainanalyze_5.q.out 7f18f2b42b 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/common/StatsSetupConst.java
 59190893e6 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java
 89354a2d34 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 2be018ba0f 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/Warehouse.java
 20c10607bb 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/utils/FileUtils.java
 b44ff8ce47 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java
 50f873a013 
  

Re: Review Request 65415: HIVE-18571 stats issues for MM tables

2018-03-05 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65415/#review198675
---




ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
Lines 12864 (patched)


this comment will be removed


- Sergey Shelukhin





Re: Review Request 65415: HIVE-18571 stats issues for MM tables

2018-03-05 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65415/
---

(Updated March 6, 2018, 1:57 a.m.)


Review request for hive and Eugene Koifman.


Repository: hive-git


Description
---

f.,v fbghdscd


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/common/HiveStatsUtils.java df77a4a2f2 
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 5f2058ebc6 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenarios.java
 41c89b1cd3 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosAcrossInstances.java
 6e8d6b62a5 
  ql/src/java/org/apache/hadoop/hive/ql/QueryPlan.java 9ea7a7c79b 
  ql/src/java/org/apache/hadoop/hive/ql/exec/CopyTask.java eee5e66ea7 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java b490325091 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java fd8423129f 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/table/LoadPartitions.java
 0a82225d4a 
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 8dc1e8a94f 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileWork.java 1a63d3f971 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 68a87e6d0f 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 
1c6b793e11 
  ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 
7d2de75315 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 7f446ca1de 
  ql/src/java/org/apache/hadoop/hive/ql/plan/BasicStatsWork.java a4e770ce95 
  ql/src/java/org/apache/hadoop/hive/ql/plan/ConditionalResolverMergeFiles.java 
8ce0cb05b6 
  ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsNoJobTask.java 
946c300750 
  ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsTask.java 1d7660e8b2 
  ql/src/java/org/apache/hadoop/hive/ql/stats/ColStatsProcessor.java 7591c0681b 
  ql/src/java/org/apache/hadoop/hive/ql/stats/Partish.java 05b0474e90 
  ql/src/java/org/apache/hadoop/hive/ql/stats/fs/FSStatsAggregator.java 
d84cf136d5 
  ql/src/java/org/apache/hadoop/hive/ql/stats/fs/FSStatsPublisher.java 
922cfc23c0 
  ql/src/test/queries/clientpositive/acid_nullscan.q d048231584 
  ql/src/test/results/clientpositive/acid_nullscan.q.out 669fa3fa47 
  ql/src/test/results/clientpositive/autoColumnStats_4.q.out 1f4c0adfc7 
  ql/src/test/results/clientpositive/llap/acid_bucket_pruning.q.out 1abd3a29f1 
  ql/src/test/results/clientpositive/llap/acid_vectorization_original.q.out 
11a99dbc33 
  ql/src/test/results/clientpositive/llap/dynpart_sort_optimization_acid.q.out 
74984852e0 
  
ql/src/test/results/clientpositive/llap/materialized_view_create_rewrite_3.q.out
 d8863a2c80 
  
ql/src/test/results/clientpositive/llap/materialized_view_create_rewrite_rebuild_dummy.q.out
 d8863a2c80 
  ql/src/test/results/clientpositive/llap/mm_all.q.out 23e733b4c0 
  ql/src/test/results/clientpositive/materialized_view_create_rewrite_3.q.out 
29e408c60c 
  ql/src/test/results/clientpositive/mm_all.q.out ac6c08057c 
  ql/src/test/results/clientpositive/mm_default.q.out 1345efdfb6 
  ql/src/test/results/clientpositive/tez/acid_vectorization_original_tez.q.out 
b7d5b40d10 
  ql/src/test/results/clientpositive/tez/explainanalyze_5.q.out 7f18f2b42b 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/common/StatsSetupConst.java
 59190893e6 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java
 89354a2d34 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 1c422ca281 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/Warehouse.java
 20c10607bb 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/utils/FileUtils.java
 b44ff8ce47 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java
 50f873a013 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
 2599ab103e 


Diff: https://reviews.apache.org/r/65415/diff/5/

Changes: https://reviews.apache.org/r/65415/diff/4-5/


Testing
---


Thanks,

Sergey Shelukhin



Re: Review Request 65415: HIVE-18571 stats issues for MM tables

2018-03-05 Thread Sergey Shelukhin


> On Feb. 2, 2018, 10 a.m., Zoltan Haindrich wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsTask.java
> > Line 157 (original), 162 (patched)
> > 
> >
> > I feel that currently the stats system is half-blind when it comes to 
> > acid tables...because the autogather operations are somewhat useless on 
> > them...
> > I was thinking about the following: removing this condition to collect 
> > stats even in case basic stats are off; would enable the stats to gather a 
> > total "rowtraffic" - which might be good enough for an estimation ; and it 
> > may give the join order optimization a chance to do its job better for 
> > acid/insert_only tables which have not been explicitly updated for 
> > a long time...
> > 
> > This could be probably done as a separate change (because it will 
> > probably rewrite every second q.out) - what do you think about it?

filed a jira


- Sergey


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65415/#review196696
---





Re: Review Request 65415: HIVE-18571 stats issues for MM tables

2018-03-02 Thread Sergey Shelukhin


> On March 1, 2018, 12:40 a.m., Eugene Koifman wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java
> > Lines 1683 (patched)
> > 
> >
> > For full acid table, if you have no ParseDelta.isDeleteDelta(), then 
> > the ((non-delete) ParseDelta from getCurrentDirectories + (base | 
> > getOriginalFiles())) fileset should be accurate

Can you elaborate? not sure what you mean wrt this code


> On March 1, 2018, 12:40 a.m., Eugene Koifman wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java
> > Lines 1695 (patched)
> > 
> >
> > adding ParseDelta.isDeleteDelta() seems wrong - if anything it should 
> > subtract from file size/row count

Well, I think at least the size is more likely to be used for scan size 
estimation, and delete deltas would need to be scanned together with other 
files.
I think a proper implementation of stats for ACID would need to be done in a 
separate jira that actually accounts properly for ACID operations.
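
To make the scan-size argument above concrete, here is a minimal sketch, not the code in the patch. It assumes the usual ACID directory name prefixes (base_, delta_, delete_delta_); the class and method names are hypothetical. A reader has to open delete deltas alongside the other files, so their bytes are added to the estimate rather than subtracted.

```java
import java.util.Map;

// Hypothetical illustration only; this is not Hive's AcidUtils.
final class ScanSizeSketch {
    private ScanSizeSketch() {}

    // Sums the bytes of every directory a reader would have to open.
    // Delete deltas are included because they are scanned with the data files.
    static long estimateScanBytes(Map<String, Long> dirSizes) {
        long total = 0;
        for (Map.Entry<String, Long> e : dirSizes.entrySet()) {
            String name = e.getKey();
            boolean readable = name.startsWith("base_")
                || name.startsWith("delta_")
                || name.startsWith("delete_delta_");
            if (readable) {
                total += e.getValue();
            }
        }
        return total;
    }
}
```

This only estimates bytes to scan. A row count that subtracts deleted rows, as the review comment suggests, would need the separate ACID-aware work mentioned above.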


> On March 1, 2018, 12:40 a.m., Eugene Koifman wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
> > Lines 1619 (patched)
> > 
> >
> > should all these todos be jiras?

Will file jira(s) after the final version of the patch, based on all the TODOs 
added, where relevant.


> On March 1, 2018, 12:40 a.m., Eugene Koifman wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
> > Lines 1697 (patched)
> > 
> >
> > I saw a number of comments/logic to this effect - probably better to 
> > wait for HIVE-18824 and remove these

Well, it can still happen due to some bug. I'll keep the checks for safety, 
we'll see 0 stats if they happen to trigger.


> On March 1, 2018, 12:40 a.m., Eugene Koifman wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java
> > Lines 647 (patched)
> > 
> >
> > why?  No ValidTxnList?

yes; also the code itself is in QL


> On March 1, 2018, 12:40 a.m., Eugene Koifman wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java
> > Line 675 (original), 671 (patched)
> > 
> >
> > Jira? Assert?  at least a Wtf...

it's impossible to assert what the file list is... 
It would be valid to call this by getting the file list from AcidUtils.
I'm going to file a follow-up JIRA for this.


- Sergey


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65415/#review198419
---



Re: Review Request 65415: HIVE-18571 stats issues for MM tables

2018-03-02 Thread Sergey Shelukhin


> On Feb. 2, 2018, 10 a.m., Zoltan Haindrich wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
> > Lines 1947 (patched)
> > 
> >
> > I think this should be in somewhere in the BasicStat related class; or 
> > this can't be moved there?

It's used in metastore, so it cannot be moved to ql currently


> On Feb. 2, 2018, 10 a.m., Zoltan Haindrich wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsTask.java
> > Line 127 (original), 127 (patched)
> > 
> >
> > It seems to me that the old conditionals have done almost the same...by 
> > changing p.isAcid to p.isTransactional ; I don't see any difference; since 
> > if it's being rewritten the flag will be turned on

Transactional is broader than full ACID (it means ACID or MM), and the name is 
more explicit. Most of these methods were changed and/or renamed in a prior 
patch to make them clearer.
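
The distinction can be sketched as below. This is an illustration, not Hive's actual API: the class and method names are hypothetical, and it assumes the table kind is read from the "transactional" and "transactional_properties" table properties, with "insert_only" marking an MM table.

```java
import java.util.Map;

// Hypothetical helper for illustration; not the real AcidUtils signatures.
final class TxnTableKind {
    private TxnTableKind() {}

    // Transactional is the broad category: full ACID or insert-only (MM).
    static boolean isTransactional(Map<String, String> tblProps) {
        return "true".equalsIgnoreCase(tblProps.get("transactional"));
    }

    // Insert-only (MM) tables are transactional but not full ACID.
    static boolean isInsertOnly(Map<String, String> tblProps) {
        return isTransactional(tblProps)
            && "insert_only".equalsIgnoreCase(tblProps.get("transactional_properties"));
    }

    // Full ACID is the remainder of the transactional category.
    static boolean isFullAcid(Map<String, String> tblProps) {
        return isTransactional(tblProps) && !isInsertOnly(tblProps);
    }
}
```

Under these assumptions a check on isTransactional covers MM tables as well, which a check on full ACID alone would miss.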


> On Feb. 2, 2018, 10 a.m., Zoltan Haindrich wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java
> > Lines 674 (patched)
> > 
> >
> > I don't understand why

Updated the comment


- Sergey


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65415/#review196696
---





Re: Review Request 65415: HIVE-18571 stats issues for MM tables

2018-02-28 Thread Eugene Koifman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65415/#review198419
---




ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java
Line 1126 (original), 1134 (patched)


I think Wei added this to skip aborted deltas.  In full acid it's not 
possible since it relies on MoveTask.  This could probably safely generalize to 
all isTransactional()



ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java
Lines 1683 (patched)


For full acid table, if you have no ParseDelta.isDeleteDelta(), then the 
((non-delete) ParseDelta from getCurrentDirectories + (base | 
getOriginalFiles())) fileset should be accurate



ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java
Lines 1695 (patched)


adding ParseDelta.isDeleteDelta() seems wrong - if anything it should 
subtract from file size/row count



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
Lines 1585 (patched)


unused



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
Lines 1597 (patched)


This needs elaboration or be removed - it will be confusing to most people 
I think



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
Lines 1619 (patched)


should all these todos be jiras?



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
Lines 1697 (patched)


I saw a number of comments/logic to this effect - probably better to wait 
for HIVE-18824 and remove these



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
Lines 2194 (patched)


Jiras?



ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java
Lines 523 (patched)


Jira?



standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java
Line 616 (original), 611 (patched)


can you elaborate?



standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java
Lines 647 (patched)


why?  No ValidTxnList?



standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java
Line 675 (original), 671 (patched)


Jira? Assert?  at least a Wtf...



standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java
Lines 758 (patched)


jira?


- Eugene Koifman



Re: Review Request 65415: HIVE-18571 stats issues for MM tables

2018-02-26 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65415/
---

(Updated Feb. 27, 2018, 3:14 a.m.)


Review request for hive and Eugene Koifman.


Repository: hive-git


Description
---

f.,v fbghdscd


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/common/HiveStatsUtils.java df77a4a2f2
  ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java b490325091
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java fd8423129f
  ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/table/LoadPartitions.java 0a82225d4a
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 70fcd2c142
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileWork.java 1a63d3f971
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 8b0af3e5c8
  ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 67d05e65dd
  ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 7d2de75315
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java cd6f1ee692
  ql/src/java/org/apache/hadoop/hive/ql/plan/BasicStatsWork.java a4e770ce95
  ql/src/java/org/apache/hadoop/hive/ql/plan/ConditionalResolverMergeFiles.java 8ce0cb05b6
  ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsNoJobTask.java 946c300750
  ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsTask.java 1d7660e8b2
  ql/src/java/org/apache/hadoop/hive/ql/stats/ColStatsProcessor.java 7591c0681b
  ql/src/java/org/apache/hadoop/hive/ql/stats/Partish.java 05b0474e90
  ql/src/java/org/apache/hadoop/hive/ql/stats/fs/FSStatsAggregator.java d84cf136d5
  ql/src/test/results/clientpositive/autoColumnStats_4.q.out 9c0e020351
  standalone-metastore/src/main/java/org/apache/hadoop/hive/common/StatsSetupConst.java 59190893e6
  standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java 89354a2d34
  standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java c6e34a8a22
  standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/Warehouse.java 20c10607bb
  standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/utils/FileUtils.java b44ff8ce47
  standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java 50f873a013
  standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java 2599ab103e


Diff: https://reviews.apache.org/r/65415/diff/4/

Changes: https://reviews.apache.org/r/65415/diff/3-4/


Testing
---


Thanks,

Sergey Shelukhin



Re: Review Request 65415: HIVE-18571 stats issues for MM tables

2018-02-23 Thread Sergey Shelukhin


> On Feb. 2, 2018, 10 a.m., Zoltan Haindrich wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java
> > Lines 650 (patched)
> > 
> >
> > I might be missing something, but I don't see why quickstats should be 
> > calculated differently for transactional tables. Quickstats are num_files 
> > and total bytes on disk; these apply to acid tables as well.

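For acid tables, files in old delta directories and the like (for example, 
directories made obsolete by compaction) do not belong to the table's current 
data set, so counting every file on disk would overstate num_files and the 
total size.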
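
To make the point above concrete, here is a minimal sketch of quickstats that
skip obsolete directories. It is not Hive's AcidUtils logic: the class, the Dir
type, and the method names are hypothetical, and it assumes plain base_<id> and
delta_<min>_<max> directory names with no open or aborted transactions, no
statement-id suffixes, and no delete deltas.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: quickstats over only the live directories of an ACID table. */
final class AcidQuickStatsSketch {

  /** One directory under the table or partition root (illustrative type). */
  static final class Dir {
    final String name;
    final int numFiles;
    final long totalBytes;

    Dir(String name, int numFiles, long totalBytes) {
      this.name = name;
      this.numFiles = numFiles;
      this.totalBytes = totalBytes;
    }
  }

  /** Highest write id covered by any base_<id> directory, or -1 if there is none. */
  static long bestBase(List<Dir> dirs) {
    long best = -1;
    for (Dir d : dirs) {
      if (d.name.startsWith("base_")) {
        best = Math.max(best, Long.parseLong(d.name.substring("base_".length())));
      }
    }
    return best;
  }

  /** A directory is live if it is the newest base or a delta written after that base. */
  static boolean isLive(Dir d, long bestBase) {
    if (d.name.startsWith("base_")) {
      return Long.parseLong(d.name.substring("base_".length())) == bestBase;
    }
    if (d.name.startsWith("delta_")) {
      String[] parts = d.name.split("_"); // delta_<min>_<max>
      return Long.parseLong(parts[2]) > bestBase;
    }
    return false; // assumption: anything else is ignored in this sketch
  }

  /** Returns {numFiles, totalSize} summed over live directories only. */
  static long[] quickStats(List<Dir> dirs) {
    long base = bestBase(dirs);
    long files = 0;
    long bytes = 0;
    for (Dir d : dirs) {
      if (isLive(d, base)) {
        files += d.numFiles;
        bytes += d.totalBytes;
      }
    }
    return new long[] {files, bytes};
  }

  public static void main(String[] args) {
    List<Dir> dirs = Arrays.asList(
        new Dir("base_0000002", 1, 500L),            // superseded by base_0000005
        new Dir("delta_0000003_0000003", 1, 100L),   // already folded into base_0000005
        new Dir("base_0000005", 2, 2000L),           // live
        new Dir("delta_0000006_0000006", 1, 300L));  // live
    long[] stats = quickStats(dirs);
    System.out.println("numFiles=" + stats[0] + " totalSize=" + stats[1]);
  }
}
```

With the sample directories in main, a naive listing of everything on disk
would report 5 files and 2900 bytes, while the live set is 3 files and 2300
bytes.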


- Sergey


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65415/#review196696
---


On Feb. 24, 2018, 2:09 a.m., Sergey Shelukhin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65415/
> ---
> 
> (Updated Feb. 24, 2018, 2:09 a.m.)
> 
> 
> Review request for hive and Eugene Koifman.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> f.,v fbghdscd
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java b490325091
>   ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/table/LoadPartitions.java 0a82225d4a
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 70fcd2c142
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 8b0af3e5c8
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 67d05e65dd
>   ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 7d2de75315
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java cd6f1ee692
>   ql/src/java/org/apache/hadoop/hive/ql/plan/BasicStatsWork.java a4e770ce95
>   ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsNoJobTask.java 946c300750
>   ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsTask.java 1d7660e8b2
>   ql/src/java/org/apache/hadoop/hive/ql/stats/Partish.java 05b0474e90
>   ql/src/java/org/apache/hadoop/hive/ql/stats/fs/FSStatsAggregator.java d84cf136d5
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java 89354a2d34
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java c6e34a8a22
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java 50f873a013
>   standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java 2599ab103e
> 
> 
> Diff: https://reviews.apache.org/r/65415/diff/3/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>



Re: Review Request 65415: HIVE-18571 stats issues for MM tables

2018-02-23 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65415/
---

(Updated Feb. 24, 2018, 2:09 a.m.)


Review request for hive and Eugene Koifman.


Repository: hive-git


Description
---

f.,v fbghdscd


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java b490325091
  ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/table/LoadPartitions.java 0a82225d4a
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 70fcd2c142
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 8b0af3e5c8
  ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 67d05e65dd
  ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 7d2de75315
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java cd6f1ee692
  ql/src/java/org/apache/hadoop/hive/ql/plan/BasicStatsWork.java a4e770ce95
  ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsNoJobTask.java 946c300750
  ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsTask.java 1d7660e8b2
  ql/src/java/org/apache/hadoop/hive/ql/stats/Partish.java 05b0474e90
  ql/src/java/org/apache/hadoop/hive/ql/stats/fs/FSStatsAggregator.java d84cf136d5
  standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java 89354a2d34
  standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java c6e34a8a22
  standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java 50f873a013
  standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java 2599ab103e


Diff: https://reviews.apache.org/r/65415/diff/3/

Changes: https://reviews.apache.org/r/65415/diff/2-3/


Testing
---


Thanks,

Sergey Shelukhin



Re: Review Request 65415: HIVE-18571 stats issues for MM tables

2018-02-02 Thread Zoltan Haindrich

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65415/#review196696
---




ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
Lines 1947 (patched)


I think this should live somewhere in a BasicStats-related class; or can it 
not be moved there?



ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsTask.java
Line 127 (original), 127 (patched)


It seems to me that the old conditionals did almost the same thing. After 
changing p.isAcid to p.isTransactional I don't see any difference, since if 
it's being rewritten the flag will be turned on.
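
For readers following the isAcid versus isTransactional question, the sketch
below shows one case where the two checks could disagree. It assumes that
isAcid means full ACID only and that isTransactional also covers insert-only
(MM) tables; the patch itself may define them differently. The keys
`transactional` and `transactional_properties` are Hive table parameters; the
class and helper names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of how transactional flags could be derived from table parameters. */
final class TxnFlagsSketch {

  /** True for any transactional table, full ACID or insert-only. */
  static boolean isTransactional(Map<String, String> params) {
    return "true".equalsIgnoreCase(params.get("transactional"));
  }

  /** True for insert-only (MM) tables. */
  static boolean isInsertOnly(Map<String, String> params) {
    return isTransactional(params)
        && "insert_only".equalsIgnoreCase(params.get("transactional_properties"));
  }

  /** True for full ACID tables only (assumed meaning of the old isAcid check). */
  static boolean isFullAcid(Map<String, String> params) {
    return isTransactional(params) && !isInsertOnly(params);
  }

  public static void main(String[] args) {
    Map<String, String> mm = new HashMap<>();
    mm.put("transactional", "true");
    mm.put("transactional_properties", "insert_only");

    // For an MM table the two checks disagree under the stated assumption.
    System.out.println("isTransactional=" + isTransactional(mm));
    System.out.println("isFullAcid=" + isFullAcid(mm));
  }
}
```

Under these assumptions the two conditionals agree for full ACID and plain
tables and differ only for insert-only tables, which is the case this review
request is about.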



ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsTask.java
Line 157 (original), 162 (patched)


I feel that the stats system is currently half-blind when it comes to acid 
tables, because the autogather operations are somewhat useless on them.
I was thinking about the following: removing this condition, so that stats are 
collected even when basic stats are off, would let the stats gather a total 
"rowtraffic". That might be good enough for an estimate, and it may give the 
join order optimization a chance to do its job better for acid/insert_only 
tables which have not been explicitly updated for a long time.

This could probably be done as a separate change (because it will probably 
rewrite every second q.out). What do you think?



standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java
Lines 650 (patched)


I might be missing something, but I don't see why quickstats should be 
calculated differently for transactional tables. Quickstats are num_files and 
total bytes on disk; these apply to acid tables as well.



standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java
Lines 674 (patched)


I don't understand why



standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java
Lines 676 (patched)


I totally agree; it's very inconvenient to have this in the metastore.


- Zoltan Haindrich


On Jan. 31, 2018, 2:15 a.m., Sergey Shelukhin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65415/
> ---
> 
> (Updated Jan. 31, 2018, 2:15 a.m.)
> 
> 
> Review request for hive and Eugene Koifman.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> f.,v fbghdscd
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 114d455ff8
>   ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/table/LoadPartitions.java bad7962373
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 63bcedc000
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 6c73dc54a7
>   ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 5868d4dd56
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java dbf9363d11
>   ql/src/java/org/apache/hadoop/hive/ql/plan/BasicStatsWork.java a4e770ce95
>   ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsNoJobTask.java 946c300750
>   ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsTask.java b48379013d
>   ql/src/java/org/apache/hadoop/hive/ql/stats/Partish.java 78f48b169a
>   ql/src/java/org/apache/hadoop/hive/ql/stats/fs/FSStatsAggregator.java d84cf136d5
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java 89354a2d34
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java ecc464418d
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java 50f873a013
>   standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java 2599ab103e
> 
> 
> Diff: https://reviews.apache.org/r/65415/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>



Re: Review Request 65415: HIVE-18571 stats issues for MM tables

2018-01-30 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65415/
---

(Updated Jan. 31, 2018, 2:15 a.m.)


Review request for hive and Eugene Koifman.


Repository: hive-git


Description
---

f.,v fbghdscd


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 114d455ff8
  ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/table/LoadPartitions.java bad7962373
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 63bcedc000
  ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 6c73dc54a7
  ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 5868d4dd56
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java dbf9363d11
  ql/src/java/org/apache/hadoop/hive/ql/plan/BasicStatsWork.java a4e770ce95
  ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsNoJobTask.java 946c300750
  ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsTask.java b48379013d
  ql/src/java/org/apache/hadoop/hive/ql/stats/Partish.java 78f48b169a
  ql/src/java/org/apache/hadoop/hive/ql/stats/fs/FSStatsAggregator.java d84cf136d5
  standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java 89354a2d34
  standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java ecc464418d
  standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java 50f873a013
  standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java 2599ab103e


Diff: https://reviews.apache.org/r/65415/diff/2/

Changes: https://reviews.apache.org/r/65415/diff/1-2/


Testing
---


Thanks,

Sergey Shelukhin



Re: Review Request 65415: HIVE-18571 stats issues for MM tables

2018-01-29 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65415/#review196500
---




ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsNoJobTask.java
Line 168 (original), 174 (patched)


should still be debug



ql/src/java/org/apache/hadoop/hive/ql/stats/fs/FSStatsAggregator.java
Lines 68 (patched)


and these should be trace


- Sergey Shelukhin


On Jan. 30, 2018, 3:19 a.m., Sergey Shelukhin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65415/
> ---
> 
> (Updated Jan. 30, 2018, 3:19 a.m.)
> 
> 
> Review request for hive and Eugene Koifman.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> f.,v fbghdscd
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 0df30f1ea0
>   ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/table/LoadPartitions.java bad7962373
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 23983d85b3
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 6c73dc54a7
>   ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 5868d4dd56
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 83dfb47e1c
>   ql/src/java/org/apache/hadoop/hive/ql/plan/BasicStatsWork.java a4e770ce95
>   ql/src/java/org/apache/hadoop/hive/ql/plan/LoadTableDesc.java 1a9c11ec98
>   ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsNoJobTask.java 946c300750
>   ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsTask.java b48379013d
>   ql/src/java/org/apache/hadoop/hive/ql/stats/Partish.java 78f48b169a
>   ql/src/java/org/apache/hadoop/hive/ql/stats/fs/FSStatsAggregator.java d84cf136d5
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java 89354a2d34
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java ecc464418d
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java 50f873a013
>   standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java 2599ab103e
> 
> 
> Diff: https://reviews.apache.org/r/65415/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>



Review Request 65415: HIVE-18571 stats issues for MM tables

2018-01-29 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65415/
---

Review request for hive and Eugene Koifman.


Repository: hive-git


Description
---

f.,v fbghdscd


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 0df30f1ea0
  ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/table/LoadPartitions.java bad7962373
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 23983d85b3
  ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 6c73dc54a7
  ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 5868d4dd56
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 83dfb47e1c
  ql/src/java/org/apache/hadoop/hive/ql/plan/BasicStatsWork.java a4e770ce95
  ql/src/java/org/apache/hadoop/hive/ql/plan/LoadTableDesc.java 1a9c11ec98
  ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsNoJobTask.java 946c300750
  ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsTask.java b48379013d
  ql/src/java/org/apache/hadoop/hive/ql/stats/Partish.java 78f48b169a
  ql/src/java/org/apache/hadoop/hive/ql/stats/fs/FSStatsAggregator.java d84cf136d5
  standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java 89354a2d34
  standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java ecc464418d
  standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java 50f873a013
  standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java 2599ab103e


Diff: https://reviews.apache.org/r/65415/diff/1/


Testing
---


Thanks,

Sergey Shelukhin