Re: Review Request 62280: HIVE-16827: Merge stats task and column stats task into a single task

2017-11-06 Thread Zoltan Haindrich

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62280/
---

(Updated Nov. 7, 2017, 7:17 a.m.)


Review request for hive and Ashutosh Chauhan.


Changes
---

address review comments; update q.outs


Bugs: HIVE-16827
https://issues.apache.org/jira/browse/HIVE-16827


Repository: hive-git


Description (updated)
---

this was originally part of HIVE-13567

other notes:

* there are also a few improvements for basic stats; it seems they were 
computed incorrectly in some cases earlier
* added a small fix for ACID transactional=false handling


Diffs (updated)
---

  
accumulo-handler/src/test/results/positive/accumulo_single_sourced_multi_insert.q.out
 59bca5024d 
  common/src/java/org/apache/hadoop/hive/common/jsonexplain/Vertex.java 
a73893faff 
  contrib/src/test/results/clientpositive/serde_typedbytes.q.out 6876ca8775 
  contrib/src/test/results/clientpositive/serde_typedbytes2.q.out 79cf8fe1e5 
  contrib/src/test/results/clientpositive/serde_typedbytes3.q.out fec58ef026 
  contrib/src/test/results/clientpositive/serde_typedbytes4.q.out 1131478a7b 
  contrib/src/test/results/clientpositive/serde_typedbytes5.q.out 8d3b95ece8 
  data/scripts/q_test_init_src.sql 56b44e0a5d 
  
hbase-handler/src/test/results/positive/hbase_single_sourced_multi_insert.q.out 
e8927e9d64 
  itests/hive-blobstore/src/test/results/clientpositive/explain.q.out 
09197f9b7b 
  
itests/hive-blobstore/src/test/results/clientpositive/insert_into_dynamic_partitions.q.out
 e55b1c257e 
  itests/hive-blobstore/src/test/results/clientpositive/insert_into_table.q.out 
f50f4af817 
  
itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_dynamic_partitions.q.out
 660cebba5f 
  
itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_table.q.out
 ba0e83d562 
  
itests/hive-blobstore/src/test/results/clientpositive/write_final_output_blobstore.q.out
 2ababb1eec 
  itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestMTQueries.java 
ad2baa2e26 
  itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniHS2.java 
2edf749399 
  itests/src/test/resources/testconfiguration.properties 46abf8abae 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
0fcb93ec89 
  metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 
a4917898e4 
  pom.xml 006e8f8611 
  ql/src/java/org/apache/hadoop/hive/ql/DriverContext.java 583d3d3893 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java 1f286887e4 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 2331498781 
  ql/src/java/org/apache/hadoop/hive/ql/exec/StatsNoJobTask.java c333c494a1 
  ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java 682b42c202 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java ab495cfae8 
  ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java 36a5eff1e3 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java b78c930cf5 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java cf4df9b3fb 
  
ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java 
78e83af4f0 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRTableScan1.java 
768640c544 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 
dbf4b8da01 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/QueryPlanPostProcessor.java 
91c6c0050b 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/MemoryDecider.java 
3a20cfe7ac 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SerializeFilter.java 
dc433fed22 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SkewJoinResolver.java 
2f9783ed18 
  ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java 
0f7ef8b46d 
  ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
3415a23dec 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ExplainSemanticAnalyzer.java 
7a0d4a752e 
  ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java 01cb2b31a5 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 
1318c18379 
  ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 
5f2a34ef57 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ProcessAnalyzeTable.java 
9309fbd0e3 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 7a7460e5c2 
  ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java a63f709eef 
  
ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkProcessAnalyzeTable.java 
6f21cae8d6 
  ql/src/java/org/apache/hadoop/hive/ql/plan/BasicStatsWork.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/plan/ColumnStatsDesc.java 97f323f4b7 
  ql/src/java/org/apache/hadoop/hive/ql/plan/ColumnStatsWork.java 842fd1a411 
  ql/src/java/org/apache/hadoop/hive/ql/plan/FileSinkDesc.java 550e6f867f 
  ql/src/java/org/apache/hadoop/hive/ql/plan/IStatsGat

Re: Review Request 62280: HIVE-16827: Merge stats task and column stats task into a single task

2017-10-31 Thread Zoltan Haindrich

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62280/
---

(Updated Oct. 31, 2017, 3:43 p.m.)


Review request for hive and Ashutosh Chauhan.


Changes
---

address review comments; relatively high volume of q.out changes because the 
zeros are coming back...


Bugs: HIVE-16827
https://issues.apache.org/jira/browse/HIVE-16827


Repository: hive-git


Description
---

this was originally part of HIVE-13567


Diffs (updated)
---

  
accumulo-handler/src/test/results/positive/accumulo_single_sourced_multi_insert.q.out
 59bca5024d 
  common/src/java/org/apache/hadoop/hive/common/jsonexplain/Vertex.java 
a73893faff 
  contrib/src/test/results/clientpositive/serde_typedbytes.q.out 6876ca8775 
  contrib/src/test/results/clientpositive/serde_typedbytes2.q.out 79cf8fe1e5 
  contrib/src/test/results/clientpositive/serde_typedbytes3.q.out fec58ef026 
  contrib/src/test/results/clientpositive/serde_typedbytes4.q.out 1131478a7b 
  contrib/src/test/results/clientpositive/serde_typedbytes5.q.out 8d3b95ece8 
  data/scripts/q_test_init_src.sql 56b44e0a5d 
  
hbase-handler/src/test/results/positive/hbase_single_sourced_multi_insert.q.out 
e8927e9d64 
  
itests/hive-blobstore/src/test/results/clientpositive/insert_into_dynamic_partitions.q.out
 e55b1c257e 
  itests/hive-blobstore/src/test/results/clientpositive/insert_into_table.q.out 
f50f4af817 
  
itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_dynamic_partitions.q.out
 660cebba5f 
  
itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_table.q.out
 ba0e83d562 
  
itests/hive-blobstore/src/test/results/clientpositive/write_final_output_blobstore.q.out
 2ababb1eec 
  itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestMTQueries.java 
ad2baa2e26 
  itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniHS2.java 
2edf749399 
  itests/src/test/resources/testconfiguration.properties 4b521f052b 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java 
70d33ff93e 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
7337c1ca6c 
  metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 
ef097acb0f 
  pom.xml 006e8f8611 
  ql/src/java/org/apache/hadoop/hive/ql/DriverContext.java 583d3d3893 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java 1f286887e4 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 2331498781 
  ql/src/java/org/apache/hadoop/hive/ql/exec/StatsNoJobTask.java c333c494a1 
  ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java 682b42c202 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java ab495cfae8 
  ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java 36a5eff1e3 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java cd2d091a23 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 70656feea7 
  
ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java 
78e83af4f0 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRTableScan1.java 
768640c544 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 
dbf4b8da01 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/QueryPlanPostProcessor.java 
91c6c0050b 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/MemoryDecider.java 
3a20cfe7ac 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SerializeFilter.java 
dc433fed22 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SkewJoinResolver.java 
2f9783ed18 
  ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java 
0f7ef8b46d 
  ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
6a2ff75c84 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ExplainSemanticAnalyzer.java 
7a0d4a752e 
  ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java 01cb2b31a5 
  ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 
5f2a34ef57 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ProcessAnalyzeTable.java 
9309fbd0e3 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java e40ec5d909 
  ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java a63f709eef 
  
ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkProcessAnalyzeTable.java 
6f21cae8d6 
  ql/src/java/org/apache/hadoop/hive/ql/plan/BasicStatsWork.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/plan/ColumnStatsDesc.java 97f323f4b7 
  ql/src/java/org/apache/hadoop/hive/ql/plan/ColumnStatsWork.java 842fd1a411 
  ql/src/java/org/apache/hadoop/hive/ql/plan/FileSinkDesc.java 550e6f867f 
  ql/src/java/org/apache/hadoop/hive/ql/plan/IStatsGatherDesc.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/plan/LoadFileDesc.java 30d99123fc 
  ql/src/java/org/apache/hadoop/hive/ql/plan/StatsNoJobWork.java 77c04f6c6e 
  ql/src/java/org/apache/hadoop/hi

Re: Review Request 62280: HIVE-16827: Merge stats task and column stats task into a single task

2017-10-31 Thread Zoltan Haindrich


- Zoltan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62280/#review189221
---


On Oct. 3, 2017, 8:06 a.m., Zoltan Haindrich wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62280/
> ---
> 
> (Updated Oct. 3, 2017, 8:06 a.m.)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Bugs: HIVE-16827
> https://issues.apache.org/jira/browse/HIVE-16827
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> this was originally part of HIVE-13567
> 
> 
> Diffs
> ---
> 
>   common/src/java/org/apache/hadoop/hive/common/StatsSetupConst.java 
> 7c27d07024e4d6f21e3b1d24a5efcc1325e64d6e 
>   common/src/java/org/apache/hadoop/hive/common/jsonexplain/Vertex.java 
> b7dc88c93984a37a5df7ec8258c3e1e375cf7878 
>   contrib/src/test/results/clientpositive/serde_typedbytes.q.out 
> 6876ca8775098175155111c25d5dba4db63b3b1b 
>   contrib/src/test/results/clientpositive/serde_typedbytes2.q.out 
> 79cf8fe1e5d159a9fa013dfd616a84918709a0f9 
>   contrib/src/test/results/clientpositive/serde_typedbytes3.q.out 
> fec58ef026074b2b2d0c9a4f91cc219e4e21421c 
>   contrib/src/test/results/clientpositive/serde_typedbytes4.q.out 
> 1131478a7b6694a106d41206042fe6dee99eb8a2 
>   contrib/src/test/results/clientpositive/serde_typedbytes5.q.out 
> 8d3b95ece81e55193d92cbc39960bf378990e256 
>   data/scripts/q_test_init_src.sql 56b44e0a5df4d32a6a9318a760ce2debdad00d50 
>   
> hbase-handler/src/test/results/positive/hbase_single_sourced_multi_insert.q.out
>  e8927e9d64a7978355b387100e4e30608c34eb7b 
>   
> itests/hive-blobstore/src/test/results/clientpositive/insert_into_dynamic_partitions.q.out
>  e55b1c257ef548000d96a2d7a1a14c39dc34bc0b 
>   
> itests/hive-blobstore/src/test/results/clientpositive/insert_into_table.q.out 
> f50f4af817f000f6cc59133d5966899e79d67c3b 
>   
> itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_dynamic_partitions.q.out
>  660cebba5f8a28abcd6b10b62d5982468a92b962 
>   
> itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_table.q.out
>  ba0e83d5623f6cd9d7ada998d47826883cb2aca4 
>   
> itests/hive-blobstore/src/test/results/clientpositive/write_final_output_blobstore.q.out
>  2ababb1eec51f49a4a92367deaa87ee7b2d12797 
>   itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestMTQueries.java 
> ad2baa2e265d2d0ffb94859e5141e4b38f2909b5 
>   
> itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniHS2.java 
> 2edf7493990572ab3a9d6f1a7c4407de91bef39c 
>   itests/src/test/resources/testconfiguration.properties 
> 038487f134f2c24d0b957404de6afb1766a2cb4a 
>   metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java 
> 65551ad169d8151fce297d8a44fd30168a682ad0 
>   metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
> f4af6f497927d61517e263521ef97a27c616e14c 
>   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 
> a147a2590d6de1fe161c7b02f043f179243cf83c 
>   metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
> b5e4bf086e9b2659ebb3b221269ee4a69a0465ae 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/merge/ColumnStatsMergerFactory.java
>  66be52413956b261373b2132c5678a204662c79e 
>   pom.xml 52e53012b9ca91d953baa257e1dd74ba86220455 
>   ql/src/java/org/apache/hadoop/hive/ql/DriverContext.java 
> 583d3d3893298e823e27e82a13b252f799bbef79 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java 
> e3b0d1eaea337d318918c9e9ea07be53bffee189 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 
> bc265eb790ab8f9517ecfdd56b0634e47092ca35 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/StatsNoJobTask.java 
> 9c3a664b9aea2d6e050ffe2d7626127827dbc52a 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java 
> 4db68066d6dbc557180a2fefe63069a924fede65 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java 
> 6193b900e085bbe013f0526ee8c351cdff4e655a 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java 
> e9c69d94d5b41748cde08eaf2636c513c5731929 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 
> ae637279998e7e4983d6c9abf8c0af75b92d30da 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 
> 436a2fe73bc91f59ea83490291fa3668a6442930 
>   
> ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java
>  61f6a7c4ff38447db0ac2610e7308f4e710580ab 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRTableScan1.java 
> 9297a0b87454bf37b1a4c68327407cada6b37232 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 
> 25cca7b4be124f6b6a6ea645ef887f6cf5cfa8d6 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/QueryPlanPostProcessor.java 

Re: Review Request 62280: HIVE-16827: Merge stats task and column stats task into a single task

2017-10-31 Thread Zoltan Haindrich


> On Oct. 10, 2017, 11:49 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsTask.java
> > Lines 160 (patched)
> > 
> >
> > we want to just flip the state flag and keep whatever stats we have.
> 
> Zoltan Haindrich wrote:
> I would like to remove stats which might be misleading; outputs which 
> contain a value for `numRows` while basic_stats is false could cause a 
> headache for readers of the explain output...
> 
> It turned out that removing invalid stats has a positive side effect: 
> until now, distinct_windowing_no_cbo.q estimated 1 row for a table containing 
> 10k rows; now the estimate is 84k, which seems better to me.
> 
> Anyway... there is an issue with the `LOAD` statements; it seems that 
> `hive.stats.autogather` doesn't work properly for them (this will be a followup)
> 
> Ashutosh Chauhan wrote:
> Having incorrect but approximate stats is much more useful for the optimizer 
> than having no stats at all. We keep the state so that we don't use the stats 
> to answer queries, but approximate stats have a role to play.
> 
> The load statement is supposed to set the state to false. Since it doesn't 
> scan data, stats cannot be computed for this statement.

But if we want to keep "incorrect" stats... why not update them? They might be 
only partially correct, but where it is possible, shouldn't the stats system 
update them?


- Zoltan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62280/#review187592
---



Re: Review Request 62280: HIVE-16827: Merge stats task and column stats task into a single task

2017-10-31 Thread Zoltan Haindrich


> On Oct. 25, 2017, 11:10 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java
> > Lines 127 (patched)
> > 
> >
> > Why do we need location? Stats tmp dir is in desc already.

this is a problematic area:

* in case of CTAS, the table-like object which is accessible at task 
generation time doesn't have its location filled out.
* DDLTask fills it out
* the table location is needed for the footer scan to work


> On Oct. 25, 2017, 11:10 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java
> > Lines 1486 (patched)
> > 
> >
> > I don't see this table object being used here. Shall we remove this, 
> > especially since it's making a metastore call?

I deliberately moved it here; the table ref is used to construct StatsWork.
I think it's better to make the metastore calls before starting the plan 
execution.
These metastore calls pre-existed; IIRC this new way avoids 1 metastore call.

> On Oct. 25, 2017, 11:10 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/temp_table_display_colstats_tbllvl.q.out
> > Line 295 (original)
> > 
> >
> > For create table we set basic stats to 0 and state to true; that doesn't 
> > seem to be happening here.

temporary tables are kind of problematic... I wanted to address them later; now 
that I've turned this back on, they have numRows 0 entries, which is incorrect.
I wanted to fully disable this to prevent issues arising from the fact that it 
misbehaves.


- Zoltan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62280/#review189221
---



Re: Review Request 62280: HIVE-16827: Merge stats task and column stats task into a single task

2017-10-25 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62280/#review189221
---




metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java
Lines 1757 (patched)


Better name: getMergableCols()?
Also can be protected.



ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java
Line 107 (original), 75 (patched)


should this be else if? There can't be both BasicStatsNoJob and BasicStats.
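
To illustrate the suggestion, here is a minimal sketch. The class and field names are hypothetical and are not taken from the patched StatsTask; it only assumes, as the comment says, that the two basic-stats variants are mutually exclusive, and it additionally assumes that column stats are handled independently of them.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the work description; not Hive's real StatsWork.
class StatsWorkSketch {
    final boolean basicStatsNoJob; // basic stats gathered without launching a job
    final boolean basicStats;      // basic stats gathered from a finished job
    final boolean columnStats;     // column stats, assumed independent of the above

    StatsWorkSketch(boolean basicStatsNoJob, boolean basicStats, boolean columnStats) {
        this.basicStatsNoJob = basicStatsNoJob;
        this.basicStats = basicStats;
        this.columnStats = columnStats;
    }
}

class StatsDispatchSketch {
    // Returns the names of the processors that would run for the given work.
    static List<String> selectProcessors(StatsWorkSketch work) {
        List<String> processors = new ArrayList<>();
        if (work.basicStatsNoJob) {
            processors.add("BasicStatsNoJob");
        } else if (work.basicStats) {
            // else-if makes it explicit that only one basic-stats variant runs
            processors.add("BasicStats");
        }
        if (work.columnStats) {
            processors.add("ColumnStats");
        }
        return processors;
    }
}
```

With two plain `if` statements a work object that set both flags would silently run both variants; the `else if` form picks one and documents the exclusivity.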



ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java
Line 143 (original), 95 (patched)


If we get here it is an implementation bug, so we should throw an 
exception in this case.
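
A minimal sketch of that pattern follows. The exact condition at this line is not shown in the thread, so the sketch assumes a task that was created without any recognised stats work; the class name, method name, and message are hypothetical.

```java
// Hypothetical helper; it does not mirror the real StatsTask code.
class StatsProcessorChooserSketch {
    // Picks the processor name for the work that is present.
    static String choose(boolean noJobWork, boolean basicWork) {
        if (noJobWork) {
            return "BasicStatsNoJob";
        } else if (basicWork) {
            return "BasicStats";
        }
        // Reaching this point means plan generation produced an invalid task,
        // so fail loudly instead of returning a default value.
        throw new IllegalStateException(
            "Stats task created without any stats work; this is an implementation bug");
    }
}
```

Throwing an unchecked exception here surfaces the bug at the point where it is detected, rather than letting the task finish with missing stats.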



ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java
Lines 127 (patched)


Why do we need location? Stats tmp dir is in desc already.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java
Lines 1486 (patched)


I don't see this table object being used here. Shall we remove this, especially 
since it's making a metastore call?



ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java
Lines 419 (patched)


change it to LOG.trace and remove XXX



ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java
Lines 433 (patched)


remove commented code.



ql/src/java/org/apache/hadoop/hive/ql/plan/LoadFileDesc.java
Line 37 (original), 37 (patched)


remove //



ql/src/java/org/apache/hadoop/hive/ql/plan/LoadFileDesc.java
Line 46 (original), 47 (patched)


remove //



ql/src/java/org/apache/hadoop/hive/ql/plan/StatsNoJobWork.java
Line 38 (original), 35 (patched)


remove //



ql/src/java/org/apache/hadoop/hive/ql/plan/StatsNoJobWork.java
Lines 40 (patched)


remove //



ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsNoJobTask.java
Lines 206 (patched)


Logger already prints threadname. Is there a need for this?



ql/src/test/results/clientpositive/temp_table_display_colstats_tbllvl.q.out
Lines 264 (patched)


Basic stats shall also be true here?



ql/src/test/results/clientpositive/temp_table_display_colstats_tbllvl.q.out
Line 295 (original)


For create table we set basic stats to 0 and state to true; that doesn't seem 
to be happening here.



ql/src/test/results/clientpositive/temp_table_display_colstats_tbllvl.q.out
Line 374 (original), 385 (patched)


Empty table should have Basic stats true.



ql/src/test/results/clientpositive/temp_table_display_colstats_tbllvl.q.out
Lines 501 (patched)


Basic stats missing here?


- Ashutosh Chauhan



Re: Review Request 62280: HIVE-16827: Merge stats task and column stats task into a single task

2017-10-25 Thread Ashutosh Chauhan


> On Oct. 10, 2017, 11:49 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsTask.java
> > Lines 160 (patched)
> > 
> >
> > we want to just flip the state flag and keep whatever stats we have.
> 
> Zoltan Haindrich wrote:
> I would like to remove stats which might be misleading; outputs which 
> contain a value for `numRows` while basic_stats is false could cause a 
> headache for readers of the explain output...
> 
> It turned out that removing invalid stats has a positive side effect: 
> until now, distinct_windowing_no_cbo.q estimated 1 row for a table containing 
> 10k rows; now the estimate is 84k, which seems better to me.
> 
> Anyway... there is an issue with the `LOAD` statements; it seems that 
> `hive.stats.autogather` doesn't work properly for them (this will be a followup)

Having incorrect but approximate stats is much more useful for the optimizer 
than having no stats at all. We keep the state so that we don't use the stats 
to answer queries, but approximate stats have a role to play.

The load statement is supposed to set the state to false. Since it doesn't scan 
data, stats cannot be computed for this statement.
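
The distinction drawn here can be sketched as follows. This is only an illustration of the idea and does not use Hive's API: the parameter map, the `basicStatsAccurate` key, and all method names are hypothetical; only `numRows` is a name that appears in this thread.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of table parameters; not Hive's real stats handling.
class StatsStateSketch {
    static final String STATE_KEY = "basicStatsAccurate"; // assumed key name
    static final String NUM_ROWS = "numRows";

    // Flips the state flag to false but keeps whatever stats values exist.
    static Map<String, String> invalidateKeepingValues(Map<String, String> params) {
        Map<String, String> copy = new HashMap<>(params);
        copy.put(STATE_KEY, "false");
        return copy;
    }

    // Answering a query directly from stats requires an accurate state.
    static boolean canAnswerFromStats(Map<String, String> params) {
        return "true".equals(params.get(STATE_KEY));
    }

    // The optimizer may still use a stale value as an estimate.
    // Returns -1 when no row count is stored at all.
    static long estimateRows(Map<String, String> params) {
        String value = params.get(NUM_ROWS);
        return value == null ? -1L : Long.parseLong(value);
    }
}
```

Under this model a table whose state was flipped to false can no longer answer queries from its stats, yet the stored `numRows` is still available as an approximate estimate.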


- Ashutosh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62280/#review187592
---



Re: Review Request 62280: HIVE-16827: Merge stats task and column stats task into a single task

2017-10-25 Thread Zoltan Haindrich


> On Oct. 10, 2017, 11:49 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java
> > Line 394 (original), 134 (patched)
> > 
> >
> > Any reason to not do this.

the next patch will also address the Type rename


> On Oct. 10, 2017, 11:49 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java
> > Lines 319 (patched)
> > 
> >
> > One of the side goals of merging the basic and column stats tasks was also to 
> > make a single call to the metastore with both basic stats and column 
> > stats. In this design, with tasks abstracted via IStatsProcessor, these 
> > two stats tasks will still make independent metastore calls.
> > We should improve on this. It can be done in a follow-up, since this will 
> > likely also necessitate new thrift calls.

I think we are now closer to making a single metastore call, but it will 
definitely need a new metastore call; this will be a separate change later.
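
To make the difference concrete, here is a minimal sketch of "independent calls" versus "one combined call". Everything in it is hypothetical (the `StatsContributor` interface, the `CountingClient`, and the string-based request are illustrative stand-ins, not IStatsProcessor or any metastore API); it only shows that collecting all updates first lets a single request carry both kinds of stats.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class CombinedStatsCallSketch {
  // Hypothetical stand-in for a stats processor: it adds its updates to a request.
  interface StatsContributor {
    void contribute(List<String> request);
  }

  // Hypothetical stand-in for a metastore client; it only counts round trips.
  static final class CountingClient {
    int calls;
    void updateStats(List<String> request) {
      calls++;
    }
  }

  static List<StatsContributor> basicAndColumnContributors() {
    StatsContributor basic = request -> request.add("basic:numRows");
    StatsContributor column = request -> request.add("column:key");
    return Arrays.asList(basic, column);
  }

  // Current shape: every contributor sends its own request.
  static int runSeparately(List<StatsContributor> contributors) {
    CountingClient client = new CountingClient();
    for (StatsContributor c : contributors) {
      List<String> request = new ArrayList<>();
      c.contribute(request);
      client.updateStats(request);
    }
    return client.calls;
  }

  // Follow-up shape: gather everything first, then send one request.
  static int runCombined(List<StatsContributor> contributors) {
    CountingClient client = new CountingClient();
    List<String> request = new ArrayList<>();
    for (StatsContributor c : contributors) {
      c.contribute(request);
    }
    client.updateStats(request);
    return client.calls;
  }

  public static void main(String[] args) {
    System.out.println(runSeparately(basicAndColumnContributors())); // 2
    System.out.println(runCombined(basicAndColumnContributors()));   // 1
  }
}
```

In the real code the combined request would presumably need a new thrift call, which is why it is left for a follow-up.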


> On Oct. 10, 2017, 11:49 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/plan/StatsWork.java
> > Line 57 (original), 42 (patched)
> > 
> >
> > Instead of containing BasicStatsWork, it might be better to extend 
> > BasicStatsWork; that way some of these fields need not be repeated in 
> > these two classes.

I'm trying to move all kinds of information about the subjects into StatsWork 
so that it can serve as the entity that specifies which stats should be 
gathered.
For now it partly acts as a delegate to BasicStatsWork, because changing the 
call sites of the StatsTasks is a separate matter.
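
A minimal sketch of the delegation idea, using hypothetical, simplified classes (`StatsWorkSketch` and `BasicStatsWorkSketch` are illustrative names, and the `noScan` and `columnStatsRequested` fields are assumptions, not the actual Hive fields). The outer work owns the new information and forwards the existing settings, so callers only need to talk to one object:

```java
import java.util.Objects;

// Hypothetical, simplified stand-ins; these are not the actual Hive classes.
final class StatsWorkSketch {
  private final BasicStatsWorkSketch basicStatsWork; // existing settings stay here for now
  private boolean columnStatsRequested;              // new information lives on the outer work

  StatsWorkSketch(BasicStatsWorkSketch basicStatsWork) {
    this.basicStatsWork = Objects.requireNonNull(basicStatsWork);
  }

  // Delegated accessors: callers only talk to StatsWorkSketch.
  String getTableName() { return basicStatsWork.getTableName(); }
  boolean isNoScan() { return basicStatsWork.isNoScan(); }
  void setNoScan(boolean noScan) { basicStatsWork.setNoScan(noScan); }

  boolean isColumnStatsRequested() { return columnStatsRequested; }
  void setColumnStatsRequested(boolean requested) { columnStatsRequested = requested; }

  public static void main(String[] args) {
    StatsWorkSketch work = new StatsWorkSketch(new BasicStatsWorkSketch("src"));
    work.setNoScan(true);
    work.setColumnStatsRequested(true);
    System.out.println(work.getTableName() + " noScan=" + work.isNoScan()
        + " columnStats=" + work.isColumnStatsRequested());
  }
}

final class BasicStatsWorkSketch {
  private final String tableName;
  private boolean noScan;

  BasicStatsWorkSketch(String tableName) {
    this.tableName = Objects.requireNonNull(tableName);
  }

  String getTableName() { return tableName; }
  boolean isNoScan() { return noScan; }
  void setNoScan(boolean noScan) { this.noScan = noScan; }
}
```

With composition the delegate can later be phased out behind the same accessors, which would be harder if StatsWork extended BasicStatsWork.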


> On Oct. 10, 2017, 11:49 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsNoJobTask.java
> > Lines 223 (patched)
> > 
> >
> > We want to get rid of this config, correct?

In this class I've just started using it; I guess the one that should be 
removed is the "atomic" one.


> On Oct. 10, 2017, 11:49 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsTask.java
> > Lines 160 (patched)
> > 
> >
> > we want to just flip the state flag and keep whatever stats we have.

I would like to remove stats that might be misleading: output that contains a 
value for `numRows` while basic_stats is false could cause a headache for 
readers of the explain output...

It turned out that removing invalid stats has a positive side effect: until now 
distinct_windowing_no_cbo.q estimated 1 row for a table containing 10k rows; 
now the estimate is 84k, which seems better to me.

Anyway... there is an issue with the `LOAD` statements; it seems that 
`hive.stats.autogather` doesn't work properly for them (this will be a follow-up).
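
The two options discussed here can be sketched on a plain parameter map. This is an illustration under assumptions: the flag key `basicStatsAccurate` is hypothetical (the real state is kept inside `COLUMN_STATS_ACCURATE`), and treating `rawDataSize` the same way as `numRows` is my assumption for the example.

```java
import java.util.HashMap;
import java.util.Map;

final class InvalidateBasicStatsSketch {
  // Hypothetical flag key; the real state is tracked inside COLUMN_STATS_ACCURATE.
  static final String BASIC_STATS_FLAG = "basicStatsAccurate";

  // Option A: only flip the flag and keep the possibly stale values.
  static Map<String, String> flipFlagOnly(Map<String, String> params) {
    Map<String, String> out = new HashMap<>(params);
    out.put(BASIC_STATS_FLAG, "false");
    return out;
  }

  // Option B: flip the flag and also drop values that can no longer be trusted.
  static Map<String, String> flipFlagAndDropStats(Map<String, String> params) {
    Map<String, String> out = flipFlagOnly(params);
    out.remove("numRows");
    out.remove("rawDataSize"); // assumed to be scan-derived, like numRows
    return out;
  }

  public static void main(String[] args) {
    Map<String, String> params = new HashMap<>();
    params.put(BASIC_STATS_FLAG, "true");
    params.put("numRows", "1");
    params.put("rawDataSize", "10");
    System.out.println(flipFlagOnly(params).containsKey("numRows"));         // true
    System.out.println(flipFlagAndDropStats(params).containsKey("numRows")); // false
  }
}
```

With option B a consumer that ignores the flag cannot pick up a stale `numRows`, because the value is simply absent.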


> On Oct. 10, 2017, 11:49 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsTask.java
> > Lines 171 (patched)
> > 
> >
> > I think statsAggregator can never be null at this point. Throw 
> > exception?

I can think of a case where `part=1` has stats calculated but `part=2` does 
not, and someone loads data into both partitions of the table.


- Zoltan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62280/#review187592
---


On Oct. 3, 2017, 8:06 a.m., Zoltan Haindrich wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62280/
> ---
> 
> (Updated Oct. 3, 2017, 8:06 a.m.)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Bugs: HIVE-16827
> https://issues.apache.org/jira/browse/HIVE-16827
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> this was originally part of HIVE-13567
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/common/StatsSetupConst.java 
> 7c27d07024e4d6f21e3b1d24a5efcc1325e64d6e 
>   common/src/java/org/apache/hadoop/hive/common/jsonexplain/Vertex.java 
> b7dc88c93984a37a5df7ec8258c3e1e375cf7878 
>   contrib/src/test/results/clientpositive/serde_typedbytes.q.out 
> 6876ca8775098175155111c25d5dba4db63b3b1b 
>   contrib/src/test/results/clientpositive/serde_typedbytes2.q.out 
> 79cf8fe1e5d159a9fa013dfd616a84918709a0f9 
>   contrib/src/test/results/clientpositive/serde_typedbytes3.q.out 
> fec58ef026074b2b2d0c9a4f91cc219e4e21421c 
>   contrib/src/test/resul

Re: Review Request 62280: HIVE-16827: Merge stats task and column stats task into a single task

2017-10-10 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62280/#review187592
---




common/src/java/org/apache/hadoop/hive/common/StatsSetupConst.java
Lines 267 (patched)


nit: whitespace



ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java
Line 394 (original), 134 (patched)


Any reason to not do this.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRTableScan1.java
Line 112 (original), 110 (patched)


can remove this comment.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java
Lines 1535 (patched)


may delete this.



ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java
Lines 1750-1762 (patched)


May delete commented code.



ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java
Lines 319 (patched)


One of the side goals of merging the basic and column stats tasks was also to 
make a single call to the metastore with both basic stats and column stats. 
In this design, with tasks abstracted via IStatsProcessor, these two stats 
tasks will still make independent metastore calls.
We should improve on this. It can be done in a follow-up, since this will 
likely also necessitate new thrift calls.



ql/src/java/org/apache/hadoop/hive/ql/plan/LoadFileDesc.java
Lines 57-63 (patched)


can remove this.



ql/src/java/org/apache/hadoop/hive/ql/plan/StatsNoJobWork.java
Lines 40 (patched)


can make it protected.



ql/src/java/org/apache/hadoop/hive/ql/plan/StatsWork.java
Line 57 (original), 42 (patched)


Instead of containing BasicStatsWork, it might be better to extend 
BasicStatsWork; that way some of these fields need not be repeated in these 
two classes.



ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsNoJobTask.java
Lines 223 (patched)


We want to get rid of this config, correct?



ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsTask.java
Lines 160 (patched)


we want to just flip the state flag and keep whatever stats we have.



ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsTask.java
Lines 171 (patched)


I think statsAggregator can never be null at this point. Throw exception?


- Ashutosh Chauhan



Re: Review Request 62280: HIVE-16827: Merge stats task and column stats task into a single task

2017-10-10 Thread Zoltan Haindrich


> On Oct. 10, 2017, 12:42 a.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/autoColumnStats_3.q.out
> > Line 370 (original), 376 (patched)
> > 
> >
> > Even for the value column, stats aren't accurate anymore. That shouldn't be 
> > there in the map either.

from: 
https://github.com/kgyrtkirk/hive/blob/HIVE-16827/ql/src/test/queries/clientpositive/autoColumnStats_3.q#L47

for this testcase the relevant commands are:
```sql
analyze table nzhang_part14 partition(ds='1', hr='3') compute statistics for columns value;
[...]
set hive.stats.column.autogather=true;
[...]
insert into table nzhang_part14 partition(ds, hr)
select key, value, ds, hr from (
  select * from (select 'k1' as key, cast(null as string) as value, '1' as ds, '2' as hr from src limit 2)a
  union all
  select * from (select 'k2' as key, '' as value, '1' as ds, '3' as hr from src limit 2)b
  union all
  select * from (select 'k3' as key, ' ' as value, '2' as ds, '1' as hr from src limit 2)c
) T;

desc formatted nzhang_part14 partition(ds='1', hr='3');
```
I think keeping the stats for partition 1,3/value is good as it was up-to-date 
when autogathering was enabled - and it was correctly updated during the 
`insert into` statement.

I feel that we are good here.


> On Oct. 10, 2017, 12:42 a.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/autoColumnStats_5.q.out
> > Line 606 (original), 597 (patched)
> > 
> >
> > column stats won't be accurate for any columns.

I feel that it works correctly by retaining the a/b column's stats - and not 
reporting anything for c/d.

It would be possible to kill the stats on alter.


> On Oct. 10, 2017, 12:42 a.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/columnstats_infinity.q.out
> > Line 279 (original), 281 (patched)
> > 
> >
> > stats for all cols should be inaccurate after insert.

`set hive.stats.column.autogather=true` is set prior to the create 
table&insert; so I think we are good here

https://github.com/kgyrtkirk/hive/blob/HIVE-16827/ql/src/test/queries/clientpositive/columnstats_infinity.q#L27


- Zoltan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62280/#review187480
---



Re: Review Request 62280: HIVE-16827: Merge stats task and column stats task into a single task

2017-10-09 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62280/#review187480
---




ql/src/test/results/clientpositive/autoColumnStats_3.q.out
Line 370 (original), 376 (patched)


Even for the value column, stats aren't accurate anymore. That shouldn't be 
there in the map either.



ql/src/test/results/clientpositive/autoColumnStats_5.q.out
Line 606 (original), 597 (patched)


column stats won't be accurate for any columns.



ql/src/test/results/clientpositive/columnstats_infinity.q.out
Line 279 (original), 281 (patched)


stats for all cols should be inaccurate after insert.



ql/src/test/results/clientpositive/temp_table_display_colstats_tbllvl.q.out
Lines 254 (patched)


basic stats should also be true here.



ql/src/test/results/clientpositive/temp_table_display_colstats_tbllvl.q.out
Line 364 (original), 375 (patched)


basic_stats should be true.


- Ashutosh Chauhan



Re: Review Request 62280: HIVE-16827: Merge stats task and column stats task into a single task

2017-10-03 Thread Zoltan Haindrich

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62280/
---

(Updated Oct. 3, 2017, 8:06 a.m.)


Review request for hive and Ashutosh Chauhan.


Changes
---

* address review comments
* refactor some parts of stats gathering
* possibly fix some issues


Bugs: HIVE-16827
https://issues.apache.org/jira/browse/HIVE-16827


Repository: hive-git


Description
---

this was originally part of HIVE-13567


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/common/StatsSetupConst.java 
7c27d07024e4d6f21e3b1d24a5efcc1325e64d6e 
  common/src/java/org/apache/hadoop/hive/common/jsonexplain/Vertex.java 
b7dc88c93984a37a5df7ec8258c3e1e375cf7878 
  contrib/src/test/results/clientpositive/serde_typedbytes.q.out 
6876ca8775098175155111c25d5dba4db63b3b1b 
  contrib/src/test/results/clientpositive/serde_typedbytes2.q.out 
79cf8fe1e5d159a9fa013dfd616a84918709a0f9 
  contrib/src/test/results/clientpositive/serde_typedbytes3.q.out 
fec58ef026074b2b2d0c9a4f91cc219e4e21421c 
  contrib/src/test/results/clientpositive/serde_typedbytes4.q.out 
1131478a7b6694a106d41206042fe6dee99eb8a2 
  contrib/src/test/results/clientpositive/serde_typedbytes5.q.out 
8d3b95ece81e55193d92cbc39960bf378990e256 
  data/scripts/q_test_init_src.sql 56b44e0a5df4d32a6a9318a760ce2debdad00d50 
  
hbase-handler/src/test/results/positive/hbase_single_sourced_multi_insert.q.out 
e8927e9d64a7978355b387100e4e30608c34eb7b 
  
itests/hive-blobstore/src/test/results/clientpositive/insert_into_dynamic_partitions.q.out
 e55b1c257ef548000d96a2d7a1a14c39dc34bc0b 
  itests/hive-blobstore/src/test/results/clientpositive/insert_into_table.q.out 
f50f4af817f000f6cc59133d5966899e79d67c3b 
  
itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_dynamic_partitions.q.out
 660cebba5f8a28abcd6b10b62d5982468a92b962 
  
itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_table.q.out
 ba0e83d5623f6cd9d7ada998d47826883cb2aca4 
  
itests/hive-blobstore/src/test/results/clientpositive/write_final_output_blobstore.q.out
 2ababb1eec51f49a4a92367deaa87ee7b2d12797 
  itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestMTQueries.java 
ad2baa2e265d2d0ffb94859e5141e4b38f2909b5 
  itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniHS2.java 
2edf7493990572ab3a9d6f1a7c4407de91bef39c 
  itests/src/test/resources/testconfiguration.properties 
038487f134f2c24d0b957404de6afb1766a2cb4a 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java 
65551ad169d8151fce297d8a44fd30168a682ad0 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
f4af6f497927d61517e263521ef97a27c616e14c 
  metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 
a147a2590d6de1fe161c7b02f043f179243cf83c 
  metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
b5e4bf086e9b2659ebb3b221269ee4a69a0465ae 
  
metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/merge/ColumnStatsMergerFactory.java
 66be52413956b261373b2132c5678a204662c79e 
  pom.xml 52e53012b9ca91d953baa257e1dd74ba86220455 
  ql/src/java/org/apache/hadoop/hive/ql/DriverContext.java 
583d3d3893298e823e27e82a13b252f799bbef79 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java 
e3b0d1eaea337d318918c9e9ea07be53bffee189 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 
bc265eb790ab8f9517ecfdd56b0634e47092ca35 
  ql/src/java/org/apache/hadoop/hive/ql/exec/StatsNoJobTask.java 
9c3a664b9aea2d6e050ffe2d7626127827dbc52a 
  ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java 
4db68066d6dbc557180a2fefe63069a924fede65 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java 
6193b900e085bbe013f0526ee8c351cdff4e655a 
  ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java 
e9c69d94d5b41748cde08eaf2636c513c5731929 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 
ae637279998e7e4983d6c9abf8c0af75b92d30da 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 
436a2fe73bc91f59ea83490291fa3668a6442930 
  
ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java 
61f6a7c4ff38447db0ac2610e7308f4e710580ab 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRTableScan1.java 
9297a0b87454bf37b1a4c68327407cada6b37232 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 
25cca7b4be124f6b6a6ea645ef887f6cf5cfa8d6 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/QueryPlanPostProcessor.java 
b5bc386dfdf3df4ca6a5261aa6a67d5d8b6020dc 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/MemoryDecider.java 
3a20cfe7ac693340bda97c345d1603d312dbafa3 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SerializeFilter.java 
dc433fed2230caa0afbb270c2e05fa8f356709cf 
  ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java 
06e00d723085463257ed49093d25

Re: Review Request 62280: HIVE-16827: Merge stats task and column stats task into a single task

2017-10-02 Thread Zoltan Haindrich


> On Sept. 20, 2017, 12:19 a.m., Ashutosh Chauhan wrote:
> > metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java
> > Lines 299-304 (patched)
> > 
> >
> > Is this change required? If so, can you please add comments for it?

I've removed it... this "cascade" operation is pretty confusing; I'm not sure 
what this cascade is there for. I think it makes no sense to not update the 
partitions, but if it should work that way, then this snippet is not needed.


- Zoltan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62280/#review185751
---


On Sept. 14, 2017, 8:06 p.m., Zoltan Haindrich wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62280/
> ---
> 
> (Updated Sept. 14, 2017, 8:06 p.m.)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Bugs: HIVE-16827
> https://issues.apache.org/jira/browse/HIVE-16827
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> this was originally part of HIVE-13567
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/common/StatsSetupConst.java 
> 7c27d07024e4d6f21e3b1d24a5efcc1325e64d6e 
>   common/src/java/org/apache/hadoop/hive/common/jsonexplain/Vertex.java 
> b7dc88c93984a37a5df7ec8258c3e1e375cf7878 
>   contrib/src/test/results/clientpositive/serde_typedbytes.q.out 
> 6876ca8775098175155111c25d5dba4db63b3b1b 
>   contrib/src/test/results/clientpositive/serde_typedbytes2.q.out 
> 79cf8fe1e5d159a9fa013dfd616a84918709a0f9 
>   contrib/src/test/results/clientpositive/serde_typedbytes3.q.out 
> fec58ef026074b2b2d0c9a4f91cc219e4e21421c 
>   contrib/src/test/results/clientpositive/serde_typedbytes4.q.out 
> 1131478a7b6694a106d41206042fe6dee99eb8a2 
>   contrib/src/test/results/clientpositive/serde_typedbytes5.q.out 
> 8d3b95ece81e55193d92cbc39960bf378990e256 
>   
> hbase-handler/src/test/results/positive/hbase_single_sourced_multi_insert.q.out
>  68a417d0c15b2be70c38c132ad75cbf15fd3174d 
>   
> itests/hive-blobstore/src/test/results/clientpositive/insert_into_dynamic_partitions.q.out
>  e55b1c257ef548000d96a2d7a1a14c39dc34bc0b 
>   
> itests/hive-blobstore/src/test/results/clientpositive/insert_into_table.q.out 
> f50f4af817f000f6cc59133d5966899e79d67c3b 
>   
> itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_dynamic_partitions.q.out
>  660cebba5f8a28abcd6b10b62d5982468a92b962 
>   
> itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_table.q.out
>  ba0e83d5623f6cd9d7ada998d47826883cb2aca4 
>   
> itests/hive-blobstore/src/test/results/clientpositive/write_final_output_blobstore.q.out
>  2ababb1eec51f49a4a92367deaa87ee7b2d12797 
>   itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestMTQueries.java 
> ad2baa2e265d2d0ffb94859e5141e4b38f2909b5 
>   
> itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniHS2.java 
> 4a9af80fdc88e267b9aef97adcbe9e6aa18196dc 
>   itests/src/test/resources/testconfiguration.properties 
> d472bb3f9ef75ec3d0497c206bd6b5483d078eb1 
>   metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java 
> b279e1d5677d98bf044be2d143ff45ac7a81faf6 
>   metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
> cf33cca24ffc7583f2be62bb79815b5e43b2ff42 
>   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 
> bbe13fd77b5a0cfa7a6818f7a9e23985772acf3d 
>   metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
> 3053dcb50b44fd0d302f5653dff02772b7cc6ad9 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/merge/ColumnStatsMergerFactory.java
>  66be52413956b261373b2132c5678a204662c79e 
>   ql/src/java/org/apache/hadoop/hive/ql/DriverContext.java 
> f43992c85d4695f6278eaa36420ef5ba331f5200 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/BasicStatsTask.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java 
> 2b2c004fea84838a438d1404337ca60fd300664d 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/StatsNoJobTask.java 
> 9c3a664b9aea2d6e050ffe2d7626127827dbc52a 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java 
> c22d69bb19064fe363276478ec89dd08db5a8705 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java 
> fe9b6244df0c3994b8ab521cbb1edd853124e923 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 
> 9f98b69b187bf25ba03494cdbd513bd508644280 
>   
> ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java
>  61f6a7c4ff38447db0ac2610e7308f4e710580ab 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRTableScan1.java 
> 9297a0b87454bf37b1a4c68327407cada6b37232 
>   ql/sr

Re: Review Request 62280: HIVE-16827: Merge stats task and column stats task into a single task

2017-09-26 Thread Zoltan Haindrich


> On Sept. 19, 2017, 11:12 p.m., Ashutosh Chauhan wrote:
> > itests/hive-blobstore/src/test/results/clientpositive/insert_into_dynamic_partitions.q.out
> > Lines 112 (patched)
> > 
> >
> > this is confusing for a temp table. Also, how come column stats are 
> > available?

this is because stats are always filled out at creation time, which is 
correct, because at creation time all-zero stats are accurate.

I've cleared them for now; temp tables should at least have basic stats 
available - HIVE-17533


> On Sept. 19, 2017, 11:12 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/alter_table_update_status_disable_bitvector.q.out
> > Line 98 (original), 102 (patched)
> > 
> >
> > Is this change expected?

at line 80:
```
COLUMN_STATS_ACCURATE   
{\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"key\":\"true\",\"value\":\"true\"}}
```
So... prior to this command basic_stats is up to date.
I think `ALTER TABLE src_stat_int UPDATE STATISTICS for column key SET 
('numDVs'='','lowValue'='333.22','highValue'='22.22');` should not clear 
the `BASIC_STATS` state.
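
A minimal sketch of the behaviour I mean, using a hypothetical `StatsStateSketch` class (not the actual StatsSetupConst code): updating the state of one column touches only that column's entry and leaves the basic stats flag as it was.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of the COLUMN_STATS_ACCURATE state; not the actual Hive code.
final class StatsStateSketch {
  private boolean basicStats;
  private final Map<String, Boolean> columnStats = new TreeMap<>();

  StatsStateSketch(boolean basicStats) {
    this.basicStats = basicStats;
  }

  // Changing a column's state must not touch the basic stats flag.
  void setColumnStatsState(String column, boolean accurate) {
    columnStats.put(column, accurate);
  }

  boolean isBasicStatsAccurate() {
    return basicStats;
  }

  // Renders the state in the same shape as the q.out line quoted above.
  String toJson() {
    StringBuilder sb = new StringBuilder("{\"BASIC_STATS\":\"").append(basicStats).append("\"");
    sb.append(",\"COLUMN_STATS\":{");
    boolean first = true;
    for (Map.Entry<String, Boolean> e : columnStats.entrySet()) {
      if (!first) {
        sb.append(',');
      }
      sb.append('"').append(e.getKey()).append("\":\"").append(e.getValue()).append('"');
      first = false;
    }
    return sb.append("}}").toString();
  }

  public static void main(String[] args) {
    StatsStateSketch state = new StatsStateSketch(true);
    state.setColumnStatsState("value", true);
    state.setColumnStatsState("key", true); // e.g. after UPDATE STATISTICS for column key
    System.out.println(state.toJson());
  }
}
```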


> On Sept. 19, 2017, 11:12 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/autoColumnStats_1.q.out
> > Line 480 (original), 558 (patched)
> > 
> >
> > Is this change expected?

this qfile is not maintained: autoColumnStats_1.q is run only with llap drivers


> On Sept. 19, 2017, 11:12 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/autoColumnStats_2.q.out
> > Lines 131 (patched)
> > 
> >
> > Is this change expected?

the appearance of the stats state is expected; the distinctCount is now correct 
being 309; 205 was wrong


> On Sept. 19, 2017, 11:12 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/autoColumnStats_4.q.out
> > Line 200 (original)
> > 
> >
> > Is this change expected?

this is an acid table, so I think it might be ok for now,
since the basicStats flag was also missing in the earlier output; the 
column stats were not used in that case either.


> On Sept. 19, 2017, 11:12 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/columnstats_infinity.q.out
> > Line 279 (original), 281 (patched)
> > 
> >
> > Is this change expected?

there were no stats collected for 'float' earlier


> On Sept. 19, 2017, 11:12 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/columnstats_partlvl.q.out
> > Line 453 (original), 469 (patched)
> > 
> >
> > Is this change expected?

I've checked that prior to this patch numRows was 0 (incorrectly);
prior to this change, column stats gathering did not collect row-level 
statistics.


> On Sept. 19, 2017, 11:12 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/columnstats_partlvl_dp.q.out
> > Line 252 (original), 264 (patched)
> > 
> >
> > Is this change expected?

numrows was 0 before


> On Sept. 19, 2017, 11:12 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/outer_reference_windowed.q.out
> > Line 132 (original), 138 (patched)
> > 
> >
> > Is this change expected?

numRows was 0 before


- Zoltan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62280/#review185737
---


On Sept. 14, 2017, 8:06 p.m., Zoltan Haindrich wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62280/
> ---
> 
> (Updated Sept. 14, 2017, 8:06 p.m.)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Bugs: HIVE-16827
> https://issues.apache.org/jira/browse/HIVE-16827
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> this was originally part of HIVE-13567
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/common/StatsSetupConst.java 
> 7c27d07024e4d6f21e3b1d24a5efcc1325e64d6e 
>   common/src/java/org/apache/hadoop/hive/common/jsonexplain/Vertex.java 
> b7dc88c93984a37a5df7ec8258c3e1e375cf7878 
>   contrib/src/test/results/clientpositive/serde_typedbytes.q.out 
> 6876ca8775098175155111c25d5dba4db63b3b1b 
>   contrib/src/test/r

Re: Review Request 62280: HIVE-16827: Merge stats task and column stats task into a single task

2017-09-26 Thread Zoltan Haindrich


> On Sept. 20, 2017, 12:19 a.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/BasicStatsTask.java
> > Lines 75 (patched)
> > 
> >
> > Do we need a notion of BasicStatsTask? 
> > Reason for merging two stats task is so that we can get rid of 1. 
> > Instead, we still have 2. Idea was there will be single Work which will 
> > define both stats and single task to execute them.

not really...but it needs some more work to phase it out...
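
A minimal sketch of the "single Work, single task" idea from the comment above. All names here are hypothetical and do not correspond to the classes in the patch; the sketch only shows one work object describing both kinds of stats and one task running the requested steps itself, without launching another task.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; not the StatsWork/StatsTask classes from the patch.
final class MergedStatsSketch {

    // One work object describes both kinds of stats; either part may be absent.
    static final class Work {
        final boolean gatherBasicStats;
        final List<String> columnsToAnalyze;

        Work(boolean gatherBasicStats, List<String> columnsToAnalyze) {
            this.gatherBasicStats = gatherBasicStats;
            this.columnsToAnalyze = new ArrayList<>(columnsToAnalyze);
        }
    }

    // One task executes whatever the work asks for, in a fixed order,
    // and returns the names of the steps it ran.
    static final class Task {
        List<String> execute(Work work) {
            List<String> steps = new ArrayList<>();
            if (work.gatherBasicStats) {
                steps.add("basic");
            }
            for (String column : work.columnsToAnalyze) {
                steps.add("column:" + column);
            }
            return steps;
        }
    }

    public static void main(String[] args) {
        Work work = new Work(true, Arrays.asList("key", "value"));
        System.out.println(new Task().execute(work));
    }
}
```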


> On Sept. 20, 2017, 12:19 a.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java
> > Lines 455 (patched)
> > 
> >
> > Executing one task from another breaks abstraction of Tasks. Any 
> > particular reason for this?

I agree...the next patch will address this problem as well


> On Sept. 20, 2017, 12:19 a.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
> > Line 7333 (original), 7333 (patched)
> > 
> >
> > We can collect stats for non-native tables. Any reason for this?

I can think of the following scenario:
 * external table
 * stats gathering is enabled (stats are marked up to date)
 * data is added by moving files into the table
 * => incorrect results...

maybe this should be handled differently...
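
The scenario above can be sketched as a toy model, assuming a count query is answered from stored stats whenever they are marked up to date. The class is hypothetical and only illustrates how a file moved in from outside leaves the stats stale.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model of an external table; not Hive code.
final class ExternalTableStatsSketch {
    private final List<Integer> fileRowCounts = new ArrayList<>();
    private long statsNumRows;
    private boolean statsUpToDate;

    // A write that goes through the engine updates data and stats together.
    void insertThroughEngine(int rows) {
        fileRowCounts.add(rows);
        statsNumRows += rows;
        statsUpToDate = true;
    }

    // A file moved into the table directory from outside: the engine never
    // sees it, so neither the row count nor the up-to-date flag changes.
    void moveFileIn(int rows) {
        fileRowCounts.add(rows);
    }

    long actualRowCount() {
        long total = 0;
        for (int rows : fileRowCounts) {
            total += rows;
        }
        return total;
    }

    // A count answered from metadata trusts the up-to-date flag.
    long countFromStats() {
        return statsUpToDate ? statsNumRows : actualRowCount();
    }

    public static void main(String[] args) {
        ExternalTableStatsSketch table = new ExternalTableStatsSketch();
        table.insertThroughEngine(10);
        table.moveFileIn(5);
        System.out.println(table.countFromStats() + " vs " + table.actualRowCount());
    }
}
```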


- Zoltan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62280/#review185751
---



Re: Review Request 62280: HIVE-16827: Merge stats task and column stats task into a single task

2017-09-19 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62280/#review185751
---




metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java
Lines 299-304 (patched)


Is this change required? If so, can you please add comments for it?



ql/src/java/org/apache/hadoop/hive/ql/exec/BasicStatsTask.java
Lines 75 (patched)


Do we need a notion of BasicStatsTask? 
Reason for merging two stats task is so that we can get rid of 1. Instead, 
we still have 2. Idea was there will be single Work which will define both 
stats and single task to execute them.



ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java
Lines 455 (patched)


Executing one task from another breaks abstraction of Tasks. Any particular 
reason for this?



ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
Line 7333 (original), 7333 (patched)


We can collect stats for non-native tables. Any reason for this?


- Ashutosh Chauhan


On Sept. 14, 2017, 8:06 p.m., Zoltan Haindrich wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62280/
> ---
> 
> (Updated Sept. 14, 2017, 8:06 p.m.)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Bugs: HIVE-16827
> https://issues.apache.org/jira/browse/HIVE-16827
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> this was originally part of HIVE-13567
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/common/StatsSetupConst.java 
> 7c27d07024e4d6f21e3b1d24a5efcc1325e64d6e 
>   common/src/java/org/apache/hadoop/hive/common/jsonexplain/Vertex.java 
> b7dc88c93984a37a5df7ec8258c3e1e375cf7878 
>   contrib/src/test/results/clientpositive/serde_typedbytes.q.out 
> 6876ca8775098175155111c25d5dba4db63b3b1b 
>   contrib/src/test/results/clientpositive/serde_typedbytes2.q.out 
> 79cf8fe1e5d159a9fa013dfd616a84918709a0f9 
>   contrib/src/test/results/clientpositive/serde_typedbytes3.q.out 
> fec58ef026074b2b2d0c9a4f91cc219e4e21421c 
>   contrib/src/test/results/clientpositive/serde_typedbytes4.q.out 
> 1131478a7b6694a106d41206042fe6dee99eb8a2 
>   contrib/src/test/results/clientpositive/serde_typedbytes5.q.out 
> 8d3b95ece81e55193d92cbc39960bf378990e256 
>   
> hbase-handler/src/test/results/positive/hbase_single_sourced_multi_insert.q.out
>  68a417d0c15b2be70c38c132ad75cbf15fd3174d 
>   
> itests/hive-blobstore/src/test/results/clientpositive/insert_into_dynamic_partitions.q.out
>  e55b1c257ef548000d96a2d7a1a14c39dc34bc0b 
>   
> itests/hive-blobstore/src/test/results/clientpositive/insert_into_table.q.out 
> f50f4af817f000f6cc59133d5966899e79d67c3b 
>   
> itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_dynamic_partitions.q.out
>  660cebba5f8a28abcd6b10b62d5982468a92b962 
>   
> itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_table.q.out
>  ba0e83d5623f6cd9d7ada998d47826883cb2aca4 
>   
> itests/hive-blobstore/src/test/results/clientpositive/write_final_output_blobstore.q.out
>  2ababb1eec51f49a4a92367deaa87ee7b2d12797 
>   itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestMTQueries.java 
> ad2baa2e265d2d0ffb94859e5141e4b38f2909b5 
>   
> itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniHS2.java 
> 4a9af80fdc88e267b9aef97adcbe9e6aa18196dc 
>   itests/src/test/resources/testconfiguration.properties 
> d472bb3f9ef75ec3d0497c206bd6b5483d078eb1 
>   metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java 
> b279e1d5677d98bf044be2d143ff45ac7a81faf6 
>   metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
> cf33cca24ffc7583f2be62bb79815b5e43b2ff42 
>   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 
> bbe13fd77b5a0cfa7a6818f7a9e23985772acf3d 
>   metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
> 3053dcb50b44fd0d302f5653dff02772b7cc6ad9 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/merge/ColumnStatsMergerFactory.java
>  66be52413956b261373b2132c5678a204662c79e 
>   ql/src/java/org/apache/hadoop/hive/ql/DriverContext.java 
> f43992c85d4695f6278eaa36420ef5ba331f5200 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/BasicStatsTask.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java 
> 2b2c004fea84838a438d1404337ca60fd300664d 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/StatsNoJobTask.java 
> 9c3a664b9aea2d6e050ffe2d7626127827dbc52a 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java 
> c

Re: Review Request 62280: HIVE-16827: Merge stats task and column stats task into a single task

2017-09-19 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62280/#review185737
---




itests/hive-blobstore/src/test/results/clientpositive/insert_into_dynamic_partitions.q.out
Lines 112 (patched)


this is confusing for a temp table. Also, how come column stats are 
available?



ql/src/test/queries/clientpositive/bucket_map_join_tez2.q
Lines 1 (patched)


Since this is still off by default, there is no need for this.



ql/src/test/queries/clientpositive/bucket_num_reducers.q
Line 1 (original), 1 (patched)


Since this is still off by default, there is no need for this.



ql/src/test/queries/clientpositive/combine1.q
Lines 10 (patched)


Since this is still off by default, there is no need for this.



ql/src/test/queries/clientpositive/correlationoptimizer5.q
Lines 1 (patched)


Since this is still off by default, there is no need for this.



ql/src/test/queries/clientpositive/encryption_insert_values.q
Lines 2 (patched)


Since this is still off by default, there is no need for this.



ql/src/test/queries/clientpositive/encryption_move_tbl.q
Lines 2 (patched)


Since this is still off by default, there is no need for this.



ql/src/test/results/clientpositive/alter_table_update_status_disable_bitvector.q.out
Line 98 (original), 102 (patched)


Is this change expected?



ql/src/test/results/clientpositive/autoColumnStats_1.q.out
Line 480 (original), 558 (patched)


Is this change expected?



ql/src/test/results/clientpositive/autoColumnStats_2.q.out
Lines 131 (patched)


Is this change expected?



ql/src/test/results/clientpositive/autoColumnStats_4.q.out
Line 200 (original)


Is this change expected?



ql/src/test/results/clientpositive/columnstats_infinity.q.out
Line 279 (original), 281 (patched)


Is this change expected?



ql/src/test/results/clientpositive/columnstats_partlvl.q.out
Line 453 (original), 469 (patched)


Is this change expected?



ql/src/test/results/clientpositive/columnstats_partlvl_dp.q.out
Line 252 (original), 264 (patched)


Is this change expected?



ql/src/test/results/clientpositive/outer_reference_windowed.q.out
Line 132 (original), 138 (patched)


Is this change expected?


- Ashutosh Chauhan



Re: Review Request 62280: HIVE-16827: Merge stats task and column stats task into a single task

2017-09-14 Thread Zoltan Haindrich

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62280/
---

(Updated Sept. 14, 2017, 8:06 p.m.)


Review request for hive and Ashutosh Chauhan.


Changes
---

update q.out; looks ok to me - there are a few extra 0


Bugs: HIVE-16827
https://issues.apache.org/jira/browse/HIVE-16827


Repository: hive-git


Description
---

this was originally part of HIVE-13567


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/common/StatsSetupConst.java 
7c27d07024e4d6f21e3b1d24a5efcc1325e64d6e 
  common/src/java/org/apache/hadoop/hive/common/jsonexplain/Vertex.java 
b7dc88c93984a37a5df7ec8258c3e1e375cf7878 
  contrib/src/test/results/clientpositive/serde_typedbytes.q.out 
6876ca8775098175155111c25d5dba4db63b3b1b 
  contrib/src/test/results/clientpositive/serde_typedbytes2.q.out 
79cf8fe1e5d159a9fa013dfd616a84918709a0f9 
  contrib/src/test/results/clientpositive/serde_typedbytes3.q.out 
fec58ef026074b2b2d0c9a4f91cc219e4e21421c 
  contrib/src/test/results/clientpositive/serde_typedbytes4.q.out 
1131478a7b6694a106d41206042fe6dee99eb8a2 
  contrib/src/test/results/clientpositive/serde_typedbytes5.q.out 
8d3b95ece81e55193d92cbc39960bf378990e256 
  
hbase-handler/src/test/results/positive/hbase_single_sourced_multi_insert.q.out 
68a417d0c15b2be70c38c132ad75cbf15fd3174d 
  
itests/hive-blobstore/src/test/results/clientpositive/insert_into_dynamic_partitions.q.out
 e55b1c257ef548000d96a2d7a1a14c39dc34bc0b 
  itests/hive-blobstore/src/test/results/clientpositive/insert_into_table.q.out 
f50f4af817f000f6cc59133d5966899e79d67c3b 
  
itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_dynamic_partitions.q.out
 660cebba5f8a28abcd6b10b62d5982468a92b962 
  
itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_table.q.out
 ba0e83d5623f6cd9d7ada998d47826883cb2aca4 
  
itests/hive-blobstore/src/test/results/clientpositive/write_final_output_blobstore.q.out
 2ababb1eec51f49a4a92367deaa87ee7b2d12797 
  itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestMTQueries.java 
ad2baa2e265d2d0ffb94859e5141e4b38f2909b5 
  itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniHS2.java 
4a9af80fdc88e267b9aef97adcbe9e6aa18196dc 
  itests/src/test/resources/testconfiguration.properties 
772113acdac100ac87db66eeecbbd9df10f184fb 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java 
28c3cfed6d74a0788fce889adb7de22db42f5c13 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
df01b2578c4fb3ab4f99f7f7b0c921bdb2bc 
  metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 
bbe13fd77b5a0cfa7a6818f7a9e23985772acf3d 
  metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
0a80241b77171421e22acfaf34d71ff20d7c17f5 
  
metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/merge/ColumnStatsMergerFactory.java
 66be52413956b261373b2132c5678a204662c79e 
  ql/src/java/org/apache/hadoop/hive/ql/DriverContext.java 
f43992c85d4695f6278eaa36420ef5ba331f5200 
  ql/src/java/org/apache/hadoop/hive/ql/exec/BasicStatsTask.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java 
2b2c004fea84838a438d1404337ca60fd300664d 
  ql/src/java/org/apache/hadoop/hive/ql/exec/StatsNoJobTask.java 
9c3a664b9aea2d6e050ffe2d7626127827dbc52a 
  ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java 
c22d69bb19064fe363276478ec89dd08db5a8705 
  ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java 
91ac4bf985777599afa392594c5e2e95691caacd 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 
d661f10c407776a9123d074b3cf3dbcb1d5f0508 
  
ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java 
61f6a7c4ff38447db0ac2610e7308f4e710580ab 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRTableScan1.java 
9297a0b87454bf37b1a4c68327407cada6b37232 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 
da153e36d2d0a4e0de1a68e8f26ead963a2317a6 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/MemoryDecider.java 
3a20cfe7ac693340bda97c345d1603d312dbafa3 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SerializeFilter.java 
dc433fed2230caa0afbb270c2e05fa8f356709cf 
  ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
a054abb127d5a67c845647e9d9c4f3c174791750 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ExplainSemanticAnalyzer.java 
7a0d4a752e6dfd02575f368168a4091de29aebf4 
  ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java 
1b0a2f066161da3bf912f24c55aa0a0c4ccf878d 
  ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 
fa79700df71c116f229bb9cd25a4ed61f3a38bb0 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ProcessAnalyzeTable.java 
b6d7ee8a92d5c721995221cf50554f4ea6974edf 
  ql/src/java/org/apache/hadoop/hive/ql/parse/Semant

Review Request 62280: HIVE-16827: Merge stats task and column stats task into a single task

2017-09-13 Thread Zoltan Haindrich

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62280/
---

Review request for hive and Ashutosh Chauhan.


Bugs: HIVE-16827
https://issues.apache.org/jira/browse/HIVE-16827


Repository: hive-git


Description
---

this was originally part of HIVE-13567


Diffs
-

  common/src/java/org/apache/hadoop/hive/common/StatsSetupConst.java 
7c27d07024e4d6f21e3b1d24a5efcc1325e64d6e 
  common/src/java/org/apache/hadoop/hive/common/jsonexplain/Vertex.java 
b7dc88c93984a37a5df7ec8258c3e1e375cf7878 
  itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestMTQueries.java 
ad2baa2e265d2d0ffb94859e5141e4b38f2909b5 
  itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniHS2.java 
4a9af80fdc88e267b9aef97adcbe9e6aa18196dc 
  itests/src/test/resources/testconfiguration.properties 
772113acdac100ac87db66eeecbbd9df10f184fb 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java 
28c3cfed6d74a0788fce889adb7de22db42f5c13 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
df01b2578c4fb3ab4f99f7f7b0c921bdb2bc 
  metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 
bbe13fd77b5a0cfa7a6818f7a9e23985772acf3d 
  metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
0a80241b77171421e22acfaf34d71ff20d7c17f5 
  
metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/merge/ColumnStatsMergerFactory.java
 66be52413956b261373b2132c5678a204662c79e 
  ql/src/java/org/apache/hadoop/hive/ql/DriverContext.java 
f43992c85d4695f6278eaa36420ef5ba331f5200 
  ql/src/java/org/apache/hadoop/hive/ql/exec/BasicStatsTask.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java 
2b2c004fea84838a438d1404337ca60fd300664d 
  ql/src/java/org/apache/hadoop/hive/ql/exec/StatsNoJobTask.java 
9c3a664b9aea2d6e050ffe2d7626127827dbc52a 
  ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java 
c22d69bb19064fe363276478ec89dd08db5a8705 
  ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java 
91ac4bf985777599afa392594c5e2e95691caacd 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 
d661f10c407776a9123d074b3cf3dbcb1d5f0508 
  
ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java 
61f6a7c4ff38447db0ac2610e7308f4e710580ab 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRTableScan1.java 
9297a0b87454bf37b1a4c68327407cada6b37232 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 
da153e36d2d0a4e0de1a68e8f26ead963a2317a6 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/MemoryDecider.java 
3a20cfe7ac693340bda97c345d1603d312dbafa3 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SerializeFilter.java 
dc433fed2230caa0afbb270c2e05fa8f356709cf 
  ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
a054abb127d5a67c845647e9d9c4f3c174791750 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ExplainSemanticAnalyzer.java 
7a0d4a752e6dfd02575f368168a4091de29aebf4 
  ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java 
1b0a2f066161da3bf912f24c55aa0a0c4ccf878d 
  ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 
fa79700df71c116f229bb9cd25a4ed61f3a38bb0 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ProcessAnalyzeTable.java 
b6d7ee8a92d5c721995221cf50554f4ea6974edf 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
bc6e0d586e70fc7bf9b5bac231abaa7a14609069 
  ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java 
08a8f00e06afc756282f42e30929fd31afb5 
  
ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkProcessAnalyzeTable.java 
a2876e1d4f35ce9d0114fbec73cc644d68dade57 
  ql/src/java/org/apache/hadoop/hive/ql/plan/BasicStatsWork.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/plan/ColumnStatsDesc.java 
97f323f4b7e89677fa2037f87df8735fc59d5b21 
  ql/src/java/org/apache/hadoop/hive/ql/plan/ColumnStatsWork.java 
76811b1a93f255dddc154d468be4bead1a254e60 
  ql/src/java/org/apache/hadoop/hive/ql/plan/StatsNoJobWork.java 
77c04f6c6e5959e8ed5d891075912692a2f2ecfe 
  ql/src/java/org/apache/hadoop/hive/ql/plan/StatsWork.java 
a5050c5368d041c61694c0734c03d6577bbf85a8 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands.java 
bff9884aa1b10875d33b9a0b90c164fae144efef 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java 
0e0fca313ea54fb684d197b91b73e28f0e26ae39 
  ql/src/test/org/apache/hadoop/hive/ql/TxnCommandsBaseForTests.java 
PRE-CREATION 
  ql/src/test/queries/clientpositive/autoColumnStats_1.q 
7955b0723356f74169f58c0387e4e268486a04de 
  ql/src/test/queries/clientpositive/autoColumnStats_10.q PRE-CREATION 
  ql/src/test/queries/clientpositive/bucket_map_join_tez2.q 
37989ecc9d14d38d1633e5dd415ae2ceaf8028ed 
  ql/src/test/queries/clientpositive/bucket_num_reducers.q 
06f334e833ee901e39a3d058ef7dcd