Re: Review Request 65342: HIVE-18546

2018-01-29 Thread Jesús Camacho Rodríguez


> On Jan. 29, 2018, 7:23 p.m., Ashutosh Chauhan wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
> > Line 1919 (original), 1894 (patched)
> > 
> >
> > I wonder whether we should check the length of the string before persisting it in 
> > the RDBMS. We are using different datatypes in different DBs. This string 
> > shouldn't exceed the max permissible length. Should we throw an exception saying 
> > "TxnList too long"?

I have fixed this by using a CLOB for the TxnList. The overhead should not be 
huge, since the column is only written when an MV is created or rebuilt, and it is 
only read when the metastore starts.
I also store the creation metadata in a new table.
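
For comparison, the length check suggested in the review comment could look 
roughly like the sketch below. The class name, the method name, and the 
4000-character limit are assumptions made for illustration; none of them come 
from the patch, which uses a CLOB column instead.

```java
// Hypothetical sketch: names and the limit are illustrative, not from the Hive patch.
final class TxnListLengthCheck {
    // Assumed limit; the real maximum depends on the column type in each database.
    static final int MAX_TXN_LIST_LENGTH = 4000;

    /** Returns the string unchanged if it fits within maxLength, otherwise throws. */
    static String requireFits(String txnList, int maxLength) {
        if (txnList != null && txnList.length() > maxLength) {
            throw new IllegalArgumentException(
                "TxnList too long: " + txnList.length() + " > " + maxLength);
        }
        return txnList;
    }

    public static void main(String[] args) {
        // A short list passes through untouched.
        System.out.println(requireFits("1:2:3", MAX_TXN_LIST_LENGTH));
        // An oversized list is rejected before it would reach the database.
        try {
            requireFits(new String(new char[5000]), MAX_TXN_LIST_LENGTH);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

A check like this trades flexibility for an early, readable failure; the CLOB 
approach avoids the limit altogether.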


- Jesús


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65342/#review196454
---



Re: Review Request 65342: HIVE-18546

2018-01-29 Thread Jesús Camacho Rodríguez

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65342/
---

(Updated Jan. 30, 2018, 6:36 a.m.)


Review request for hive and Ashutosh Chauhan.


Bugs: HIVE-18546
https://issues.apache.org/jira/browse/HIVE-18546


Repository: hive-git


Description
---

HIVE-18546


Diffs (updated)
-

  metastore/scripts/upgrade/derby/048-HIVE-14498.derby.sql 
4ffd054530503681de1c9f6d65f8187fc1b7520d 
  metastore/scripts/upgrade/derby/hive-schema-3.0.0.derby.sql 
6a59b0df712c8a9f9be880cec5fd8c8eddda4a7d 
  metastore/scripts/upgrade/derby/hive-txn-schema-3.0.0.derby.sql 
d72b06cb5866edf93dbcbb20268fc899439e5c43 
  metastore/scripts/upgrade/hive/hive-schema-3.0.0.hive.sql 
eb4f0124b5a7829e58d5e9a6a604c201ccea998a 
  metastore/scripts/upgrade/mssql/033-HIVE-14498.mssql.sql 
3a47600bb09e2c20cc12f8759e1287001367604e 
  metastore/scripts/upgrade/mssql/hive-schema-3.0.0.mssql.sql 
c45bb3e323c640223b19831abbf4e806c3019f0b 
  metastore/scripts/upgrade/mysql/048-HIVE-14498.mysql.sql 
986eaf5272eab560fa2f862910aaf74c5332c716 
  metastore/scripts/upgrade/mysql/hive-schema-3.0.0.mysql.sql 
01c995d632d94a8f9cc3f46f94c54290abb3da13 
  metastore/scripts/upgrade/mysql/hive-txn-schema-3.0.0.mysql.sql 
497846f994d431d8717aea36d4ad569892e3c8c3 
  metastore/scripts/upgrade/oracle/048-HIVE-14498.oracle.sql 
0b01e89d92f7f48439024aeb326d675d123f0f8c 
  metastore/scripts/upgrade/oracle/hive-schema-3.0.0.oracle.sql 
e1aee6fb6c84999b17f87f80750582fafeae063f 
  metastore/scripts/upgrade/oracle/hive-txn-schema-3.0.0.oracle.sql 
5411bc47103f901623244bc26c0ace87e10ad2e1 
  metastore/scripts/upgrade/postgres/047-HIVE-14498.postgres.sql 
8d4de8870d93bab49c873cab44e6714b93491744 
  metastore/scripts/upgrade/postgres/hive-schema-3.0.0.postgres.sql 
28cb01684a46aaeea40d7cbe1973d7bc20810988 
  metastore/scripts/upgrade/postgres/hive-txn-schema-3.0.0.postgres.sql 
a81d6eec6d6235706f1225d541f8290971cc6215 
  ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 
51ef39057434c41fbe760c547e3bf231e65e4cc0 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java 
9b0ffe0e91db05ae623531248f12745266789a11 
  ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
d159e4bed1cd4ff04bed1c397318bc2951c02a51 
  ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHive.java 
aa95d2fcdcd0b3cede35537a1d7d041ee738e4a8 
  ql/src/test/results/clientpositive/llap/sysdb.q.out 
5ed427fd2aa6fbb83877031e6692bd8f1994730d 
  standalone-metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h 
42bc9297e72ac8fd77352cb786cfed3abf5af59b 
  standalone-metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp 
8b78230a32d4d4339189c1db4b533ed04ec080af 
  
standalone-metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore_server.skeleton.cpp
 6a2ff6c4c681b2dbaf339b214663212a2e6dab22 
  standalone-metastore/src/gen/thrift/gen-cpp/hive_metastore_types.h 
df646a7d1771892e4404be5c4fba183c0f914510 
  standalone-metastore/src/gen/thrift/gen-cpp/hive_metastore_types.cpp 
27f8c0f2fcb24a90be8a44d68947589004286c28 
  
standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/AbortTxnsRequest.java
 398f8d4e93c6077c110e6469bcd3715fdad5a634 
  
standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/AddDynamicPartitions.java
 2102aa5215598edfe5e5c53d541c4fe02ebc7f09 
  
standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/AddForeignKeyRequest.java
 a2225298e72f708e97324048592c37a308e43514 
  
standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/AddNotNullConstraintRequest.java
 ef23d3025aabb2934f93230ea72c4585dda973e4 
  
standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/AddPartitionsRequest.java
 13a23182488ebda9ab0f7163fd4d6822c04c975f 
  
standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/AddPartitionsResult.java
 49ce6e1a6cc38994662f56536c6dd6bd55e67d47 
  
standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/AddPrimaryKeyRequest.java
 478032a987e7688741fe55b9732f5ec0e8fc209f 
  
standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/AddUniqueConstraintRequest.java
 b58f39f7b045f3dbaf95df9e28190517280bd8c4 
  
standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/AggrStats.java
 54ef01f3ce4448f0b404772a30a5b0d61641e3c2 
  
standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/BasicTxnInfo.java
 f695e5d9bd4f96ff5a3d5b055a05ac00574ce01b 
  
standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ClearFileMetadataRequest.java
 dbda2ab74128a698452f639b6a7b142b47ca351b 
  
standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ClientCapabilities.java
 

Re: Review Request 65276: HIVE-18516

2018-01-29 Thread Deepak Jaiswal


> On Jan. 30, 2018, 3:09 a.m., Jason Dere wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
> > Lines 3821 (patched)
> > 
> >
> > So only non-bucketed ACID tables get the file rename to 00_0 etc? 
> > What happens with bucketed ACID tables and the file name?

Yes. As mentioned in the change summary, the logic for bucketed ACID tables is 
part of the larger work I am doing for bucketed tables in general.


> On Jan. 30, 2018, 3:09 a.m., Jason Dere wrote:
> > ql/src/test/results/clientpositive/llap/load_data_acid_rename.q.out
> > Lines 29 (patched)
> > 
> >
> > Is there a smaller table/results you can test with?

The idea was to actually see if we get results or not.


- Deepak


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65276/#review196499
---


On Jan. 29, 2018, 11:10 p.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65276/
> ---
> 
> (Updated Jan. 29, 2018, 11:10 p.m.)
> 
> 
> Review request for hive, Eugene Koifman and Jason Dere.
> 
> 
> Bugs: HIVE-18516
> https://issues.apache.org/jira/browse/HIVE-18516
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> load data should rename files consistent with insert statements for ACID 
> Tables.
> Includes test change for a missed test.
> 
> 
> Diffs
> -
> 
>   itests/src/test/resources/testconfiguration.properties d86ff58840 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 23983d85b3 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 
> 5868d4dd56 
>   ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHiveCopyFiles.java 
> c6a4a8926b 
>   ql/src/test/queries/clientpositive/load_data_acid_rename.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/smb_mapjoin_7.q 4a6afb0496 
>   ql/src/test/results/clientpositive/beeline/smb_mapjoin_7.q.out 7a6f8c53a5 
>   ql/src/test/results/clientpositive/llap/load_data_acid_rename.q.out 
> PRE-CREATION 
>   ql/src/test/results/clientpositive/smb_mapjoin_7.q.out b71c5b87c1 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_7.q.out ac49c02913 
> 
> 
> Diff: https://reviews.apache.org/r/65276/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>



Re: Review Request 65415: HIVE-18571 stats issues for MM tables

2018-01-29 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65415/#review196500
---




ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsNoJobTask.java
Line 168 (original), 174 (patched)


should still be debug



ql/src/java/org/apache/hadoop/hive/ql/stats/fs/FSStatsAggregator.java
Lines 68 (patched)


and these should be trace


- Sergey Shelukhin





Review Request 65415: HIVE-18571 stats issues for MM tables

2018-01-29 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65415/
---

Review request for hive and Eugene Koifman.


Repository: hive-git


Description
---

f.,v fbghdscd


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 0df30f1ea0 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/table/LoadPartitions.java
 bad7962373 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 23983d85b3 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 
6c73dc54a7 
  ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 
5868d4dd56 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 83dfb47e1c 
  ql/src/java/org/apache/hadoop/hive/ql/plan/BasicStatsWork.java a4e770ce95 
  ql/src/java/org/apache/hadoop/hive/ql/plan/LoadTableDesc.java 1a9c11ec98 
  ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsNoJobTask.java 
946c300750 
  ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsTask.java b48379013d 
  ql/src/java/org/apache/hadoop/hive/ql/stats/Partish.java 78f48b169a 
  ql/src/java/org/apache/hadoop/hive/ql/stats/fs/FSStatsAggregator.java 
d84cf136d5 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java
 89354a2d34 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 ecc464418d 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java
 50f873a013 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
 2599ab103e 


Diff: https://reviews.apache.org/r/65415/diff/1/


Testing
---


Thanks,

Sergey Shelukhin



[jira] [Created] (HIVE-18579) Changes from HIVE-18495 introduced import paths from shaded jars

2018-01-29 Thread Deepesh Khandelwal (JIRA)
Deepesh Khandelwal created HIVE-18579:
-

 Summary: Changes from HIVE-18495 introduced import paths from 
shaded jars
 Key: HIVE-18579
 URL: https://issues.apache.org/jira/browse/HIVE-18579
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 3.0.0
Reporter: Deepesh Khandelwal
Assignee: Zoltan Haindrich


When compiling the latest code after HIVE-18495, I am seeing the following issue:
{noformat}
-
[ERROR] COMPILATION ERROR : 
-
[ERROR] /grid/0/jenkins/workspace/Zuul_HDP_Build_Job/build-support/SOURCES/hive/ql/src/test/org/apache/hive/testutils/HiveTestEnvSetup.java:[29,64] package org.apache.hadoop.hbase.shaded.com.google.common.collect does not exist
[ERROR] /grid/0/jenkins/workspace/Zuul_HDP_Build_Job/build-support/SOURCES/hive/ql/src/test/org/apache/hive/testutils/MiniZooKeeperCluster.java:[43,68] package org.apache.hadoop.hbase.shaded.com.google.common.annotations does not exist
[ERROR] /grid/0/jenkins/workspace/Zuul_HDP_Build_Job/build-support/SOURCES/hive/ql/src/test/org/apache/hive/testutils/MiniZooKeeperCluster.java:[100,4] cannot find symbol
  symbol:   class VisibleForTesting
  location: class org.apache.hive.testutils.MiniZooKeeperCluster
[INFO] 3 errors
{noformat}
It seems that org.apache.hadoop.hbase.shaded.com.google.* is being used; I am 
guessing the idea was to use com.google.*. I am not sure why we didn't see this 
fail in the Apache Hive build system.
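
A minimal sketch of how such imports could be spotted and rewritten is shown 
below. It assumes, as guessed above, that the intended packages are the 
unshaded com.google.* ones; the class and method names are hypothetical and 
not part of Hive or HBase, and only plain (non-static) imports are handled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: flags imports that go through the HBase shaded prefix.
final class ShadedImportCheck {
    static final String SHADED_PACKAGE = "org.apache.hadoop.hbase.shaded.";
    static final String SHADED_IMPORT = "import " + SHADED_PACKAGE;

    /** Returns the trimmed import lines that reference the shaded prefix. */
    static List<String> findShadedImports(List<String> sourceLines) {
        List<String> hits = new ArrayList<>();
        for (String line : sourceLines) {
            String trimmed = line.trim();
            if (trimmed.startsWith(SHADED_IMPORT)) {
                hits.add(trimmed);
            }
        }
        return hits;
    }

    /** Drops the shaded prefix, assuming the unshaded package is the intended one. */
    static String unshade(String importLine) {
        return importLine.replace(SHADED_PACKAGE, "");
    }

    public static void main(String[] args) {
        List<String> source = Arrays.asList(
            "package org.apache.hive.testutils;",
            "import org.apache.hadoop.hbase.shaded.com.google.common.annotations.VisibleForTesting;",
            "import java.util.List;");
        for (String hit : findShadedImports(source)) {
            System.out.println(hit + "  ->  " + unshade(hit));
        }
    }
}
```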



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-18578) Some class has missed the ASF header

2018-01-29 Thread Saijin Huang (JIRA)
Saijin Huang created HIVE-18578:
---

 Summary: Some class has missed the ASF header
 Key: HIVE-18578
 URL: https://issues.apache.org/jira/browse/HIVE-18578
 Project: Hive
  Issue Type: Bug
Reporter: Saijin Huang
Assignee: Saijin Huang
 Fix For: 3.0.0


Some class has missed the ASF header





Re: Review Request 65276: HIVE-18516

2018-01-29 Thread Jason Dere

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65276/#review196499
---




ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
Lines 3821 (patched)


So only non-bucketed ACID tables get the file rename to 00_0 etc? What 
happens with bucketed ACID tables and the file name?



ql/src/test/results/clientpositive/llap/load_data_acid_rename.q.out
Lines 29 (patched)


Is there a smaller table/results you can test with?


- Jason Dere





[jira] [Created] (HIVE-18577) SemanticAnalyzer.validate has some pointless metastore calls

2018-01-29 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-18577:
---

 Summary: SemanticAnalyzer.validate has some pointless metastore 
calls
 Key: HIVE-18577
 URL: https://issues.apache.org/jira/browse/HIVE-18577
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin








[jira] [Created] (HIVE-18576) Support to read nested complex type with Parquet in vectorization mode

2018-01-29 Thread Colin Ma (JIRA)
Colin Ma created HIVE-18576:
---

 Summary: Support to read nested complex type with Parquet in 
vectorization mode
 Key: HIVE-18576
 URL: https://issues.apache.org/jira/browse/HIVE-18576
 Project: Hive
  Issue Type: Sub-task
Reporter: Colin Ma
Assignee: Colin Ma


Nested complex types are commonly used, e.g. Struct, s2 
List>. Currently, nested complex types can't be parsed in vectorization 
mode; this ticket aims to support them.





Re: Review Request 65356: HIVE-18536 IOW + DP is broken for insert-only ACID

2018-01-29 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65356/
---

(Updated Jan. 30, 2018, 1:15 a.m.)


Review request for hive and Eugene Koifman.


Repository: hive-git


Description
---

.


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/common/JavaUtils.java 9f64b3d2e0 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java d7b3e4b2fd 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 23983d85b3 


Diff: https://reviews.apache.org/r/65356/diff/2/

Changes: https://reviews.apache.org/r/65356/diff/1-2/


Testing
---


Thanks,

Sergey Shelukhin



Re: Review Request 65356: HIVE-18536 IOW + DP is broken for insert-only ACID

2018-01-29 Thread Sergey Shelukhin


> On Jan. 29, 2018, 5:13 p.m., Eugene Koifman wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java
> > Line 4077 (original), 4077 (patched)
> > 
> >
> > As far as I can tell, every call to this method passes null for 
> > isBaseDir. Can this be removed?

Nm, it was supposed to be passed in one place where available.


- Sergey


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65356/#review196437
---


On Jan. 26, 2018, 9:03 p.m., Sergey Shelukhin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65356/
> ---
> 
> (Updated Jan. 26, 2018, 9:03 p.m.)
> 
> 
> Review request for hive and Eugene Koifman.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> .
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/common/JavaUtils.java 9f64b3d2e0 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 2e1fd37d4a 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 23983d85b3 
> 
> 
> Diff: https://reviews.apache.org/r/65356/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>



Re: Long running time for recently added tests in standalone-metastore

2018-01-29 Thread Owen O'Malley
+1



Re: Long running time for recently added tests in standalone-metastore

2018-01-29 Thread Alexander Kolbasov
Quite reasonable.
+1



Long running time for recently added tests in standalone-metastore

2018-01-29 Thread Alan Gates
With all of the added tests in standalone-metastore/…/client directory, the
runtime of ‘mvn test’ in standalone-metastore went from 6 minutes to 26 on
my humble laptop.  We do not want to get ourselves back where the rest of
Hive is; currently Hive developers don’t run the unit tests themselves
because the tests take too long.  I believe we should be working to push
the unit test runtime down to about 2 minutes, so people are willing to run
it frequently as part of their development.

I don’t mean that the new tests aren't valuable.  But we need a balance
between test coverage in the unit tests and usability.  So I propose that
we carve off many of the current unit tests (including some not in the
client module, like TestSetUGI…, TestRemote...) in a new profile,
‘nightly’, or ‘checkin’, or something.  Then ‘mvn install’ will finish
quickly while hopefully covering 90% of the areas we need to cover.  We can
ask developers to run the extended set before checkin and configure the
automated tests to do the same.  This way we still cover everything before
committing.

Seem reasonable?

Alan.


Re: Review Request 65413: HIVE-18575 ACID properties usage in jobconf is ambiguous for MM tables

2018-01-29 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65413/
---

(Updated Jan. 29, 2018, 11:12 p.m.)


Review request for hive and Eugene Koifman.


Repository: hive-git


Description
---

.


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 
b7d3e99e1a505f576a06c530080fc72dddcd85ba 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FosterStorageHandler.java
 5ee8aadfa774a85a0bdbcaf78a636ff6593c43e2 
  
hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/TestStreaming.java
 5e12614cfe17030f8fcb56ef8c83b53b8b870c97 
  
hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/mutate/StreamingAssert.java
 c98d22be2e6216e95d9c13f3a26540ca03e7405e 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java
 13059023516edbb58a9129ba9aa49de7e40129e6 
  
llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapRecordReader.java
 d252279be973201227da52d8aecf83b3fcc4656b 
  
llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java
 68bb168bd23b84dd150cdc4da63d73657f1b33bb 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FetchTask.java 
a7dace955d6fb3dabc4c5e77ef68f83617eb48d1 
  ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java 
270b576199c57c109195b85d43e216743a607955 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java 
abd42ec651927503e7c8c2d9a7d3d415cc9c4ac4 
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 
eb75308e8393cadf8e69e0e30b303474b89df03e 
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 
c3b846c4d2fee8691b4952b9f6cf4dd1d8bd632f 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 
ff2cc0455c64ed210d8ff14a9f112cd91b7314be 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcSplit.java 
61565ef0305006a57b7f608e60ddcdf2b6ff474d 
  
ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java
 da200049bcbc8f2fe1d793acc7b84f8b99ae67cc 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcInputFormat.java 
7b157e648646c5a199aaebf04484b81ff1c12478 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 
923372d5b6da42446997051d0758e9aab4881e2e 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java 
190771ea6b1cbf4b669a8919271b25a689af941b 
  ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java 
661446df0b9fbb5cf248d76205e47dbaa113026f 
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 
9152b4e08a7a1044fc7f844d47ae8e180162b78b 
  ql/src/test/org/apache/hadoop/hive/ql/io/TestAcidUtils.java 
26a96a47f1935de8e985d382b40c8aae604a9880 
  ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestInputOutputFormat.java 
92f005d1dc837ea5ba7d8579892b6e7325940120 
  ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcRawRecordMerger.java 
c6a866a1644f087d260f78e280d07867d81cbc0c 
  
ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestVectorizedOrcAcidRowBatchReader.java
 65508f4ddd66140a273c8c447c0ee93f4f139454 


Diff: https://reviews.apache.org/r/65413/diff/2/

Changes: https://reviews.apache.org/r/65413/diff/1-2/


Testing
---


Thanks,

Sergey Shelukhin



Review Request 65413: HIVE-18575 ACID properties usage in jobconf is ambiguous for MM tables

2018-01-29 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65413/
---

Review request for hive and Eugene Koifman.


Repository: hive-git


Description
---

.


Diffs
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 
b7d3e99e1a505f576a06c530080fc72dddcd85ba 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FosterStorageHandler.java
 5ee8aadfa774a85a0bdbcaf78a636ff6593c43e2 
  
hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/TestStreaming.java
 5e12614cfe17030f8fcb56ef8c83b53b8b870c97 
  
hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/mutate/StreamingAssert.java
 c98d22be2e6216e95d9c13f3a26540ca03e7405e 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java
 13059023516edbb58a9129ba9aa49de7e40129e6 
  
llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapRecordReader.java
 d252279be973201227da52d8aecf83b3fcc4656b 
  
llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java
 68bb168bd23b84dd150cdc4da63d73657f1b33bb 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FetchTask.java 
a7dace955d6fb3dabc4c5e77ef68f83617eb48d1 
  ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java 
270b576199c57c109195b85d43e216743a607955 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java 
abd42ec651927503e7c8c2d9a7d3d415cc9c4ac4 
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 
eb75308e8393cadf8e69e0e30b303474b89df03e 
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 
c3b846c4d2fee8691b4952b9f6cf4dd1d8bd632f 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 
ff2cc0455c64ed210d8ff14a9f112cd91b7314be 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcSplit.java 
61565ef0305006a57b7f608e60ddcdf2b6ff474d 
  
ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java
 da200049bcbc8f2fe1d793acc7b84f8b99ae67cc 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcInputFormat.java 
7b157e648646c5a199aaebf04484b81ff1c12478 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 
923372d5b6da42446997051d0758e9aab4881e2e 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java 
190771ea6b1cbf4b669a8919271b25a689af941b 
  ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java 
661446df0b9fbb5cf248d76205e47dbaa113026f 
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 
9152b4e08a7a1044fc7f844d47ae8e180162b78b 
  ql/src/test/org/apache/hadoop/hive/ql/io/TestAcidUtils.java 
26a96a47f1935de8e985d382b40c8aae604a9880 
  ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestInputOutputFormat.java 
92f005d1dc837ea5ba7d8579892b6e7325940120 
  ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcRawRecordMerger.java 
c6a866a1644f087d260f78e280d07867d81cbc0c 
  
ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestVectorizedOrcAcidRowBatchReader.java
 65508f4ddd66140a273c8c447c0ee93f4f139454 


Diff: https://reviews.apache.org/r/65413/diff/1/


Testing
---


Thanks,

Sergey Shelukhin



Re: Review Request 65276: HIVE-18516

2018-01-29 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65276/
---

(Updated Jan. 29, 2018, 11:10 p.m.)


Review request for hive, Eugene Koifman and Jason Dere.


Changes
---

Implemented review comments.
Since ACID works with ORC files which handle compression internally, the 
extensions are completely ignored.
The logic is updated only for ACID non-bucketed tables. The fix for ACID 
bucketed tables will come with a larger fix for bucketed tables in general.


Bugs: HIVE-18516
https://issues.apache.org/jira/browse/HIVE-18516


Repository: hive-git


Description
---

Load data should rename files consistently with insert statements for ACID tables.
Includes test change for a missed test.


Diffs (updated)
-

  itests/src/test/resources/testconfiguration.properties d86ff58840 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 23983d85b3 
  ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 
5868d4dd56 
  ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHiveCopyFiles.java 
c6a4a8926b 
  ql/src/test/queries/clientpositive/load_data_acid_rename.q PRE-CREATION 
  ql/src/test/queries/clientpositive/smb_mapjoin_7.q 4a6afb0496 
  ql/src/test/results/clientpositive/beeline/smb_mapjoin_7.q.out 7a6f8c53a5 
  ql/src/test/results/clientpositive/llap/load_data_acid_rename.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/smb_mapjoin_7.q.out b71c5b87c1 
  ql/src/test/results/clientpositive/spark/smb_mapjoin_7.q.out ac49c02913 


Diff: https://reviews.apache.org/r/65276/diff/2/

Changes: https://reviews.apache.org/r/65276/diff/1-2/


Testing
---


Thanks,

Deepak Jaiswal



[jira] [Created] (HIVE-18575) ACID properties usage in jobconf is ambiguous for MM tables

2018-01-29 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-18575:
---

 Summary: ACID properties usage in jobconf is ambiguous for MM 
tables
 Key: HIVE-18575
 URL: https://issues.apache.org/jira/browse/HIVE-18575
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin


Vectorization checks for ACID tables are triggered for MM tables, where they 
don't apply. Other places seem to set the setting for the transactional case, 
while most of the code seems to assume it implies full ACID.
Overall, many places in the code use the settings directly or set the ACID flag 
without setting the ACID properties.




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Review Request 65304: HIVE-18513 Query results caching

2018-01-29 Thread Jason Dere


> On Jan. 29, 2018, 6:26 p.m., Jesús Camacho Rodríguez wrote:
> > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
> > Lines 3699 (patched)
> > 
> >
> > Is this the permission for each results directory? Does this mean that 
> > results cannot be shared by different users?
> > 
> > Why does this need to be configurable? (I would assume this is not 
> > something that you let the user decide).

If this is HiveServer2 (with doAs=false), the hive user would be the directory 
owner regardless of which user is submitting the query, so I would not expect 
issues with sharing the cache in the HiveServer2 case. One danger with making 
this directory readable by others is the possibility that it allows a user to 
access cached results which the user may not have permission to see, if 
user-level filtering/masking rules are enabled. Different instances of Hive do 
not share caches, so sharing results between different Hive CLI instances is 
not something I am worrying about here.

I was basing the directory creation rules on the scratchdir logic, which does 
allow this to be configurable. But really the setting at the time of cache 
initialization is what matters. If you think this should just be hardcoded to 
700 let me know and I can make the change.


> On Jan. 29, 2018, 6:26 p.m., Jesús Camacho Rodríguez wrote:
> > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
> > Lines 3710 (patched)
> > 
> >
> > What is _entry_ referring to? Does it mean the SQL query string size? 
> > We should extend the description to make it clear.

Entry here refers to the size of the cached results that are saved on the 
filesystem. Will make this more clear.


> On Jan. 29, 2018, 6:26 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java
> > Lines 1829 (patched)
> > 
> >
> > Currently what happens when we get this exception?
> > 
> > It seems to me that mechanism in HIVE-17626 works when execution itself 
> > fails, but in this case, whether the cache entry is still valid or not can 
> > be inferred statically at planning time, hence I am not sure whether it 
> > should be handled the same way? It seems we will have certain overhead that 
> > might not be necessary.
> > 
> > Can't we check the validity of the entry when we are replacing the plan 
> > by the scan on the cached results, e.g., in SemanticAnalyzer?

So the table locks for the query are not acquired until query execution, which 
occurs after query compilation. If we implement automatic invalidation of the 
cache based on updates to the Hive tables, the following could happen:

1. Query A goes through query compilation and finds an entry in the cache that 
can be used to satisfy the query. At this point there have been no updates to 
the table which would invalidate the cache.
2. Query B acquires a write lock and begins making updates to one or more of 
the tables involved in query A
3. Query A attempts to acquire read locks and blocks while query B is running.
4. Query B finishes updating the tables and releases its lock.
5. Query A now acquires the read lock, but at this point the cached result is 
stale.

The options here would be to either attempt to recompile the query (the current 
approach), or to just go ahead and serve the stale results.
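
The sequence above can be sketched as follows. This is only a minimal 
illustration in Java, assuming a per-table version counter that is bumped on 
every committed write; Table, CacheEntry, Outcome and afterLockAcquired are 
hypothetical names and are not taken from the patch.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-ins for illustration; none of these names come from the patch.
final class StaleCacheSketch {
    /** A table whose version is bumped by every committed write (an assumption). */
    static final class Table {
        final AtomicLong version = new AtomicLong();
        void commitWrite() { version.incrementAndGet(); }
    }

    /** A cached result together with the table version it was computed from. */
    static final class CacheEntry {
        final long versionAtCaching;
        CacheEntry(long versionAtCaching) { this.versionAtCaching = versionAtCaching; }
    }

    enum Outcome { SERVE_CACHED, RECOMPILE }

    /**
     * Called once the read lock is held (step 5): the entry that looked valid
     * at compile time (step 1) is checked again against the current version.
     */
    static Outcome afterLockAcquired(Table table, CacheEntry entry) {
        return table.version.get() == entry.versionAtCaching
            ? Outcome.SERVE_CACHED
            : Outcome.RECOMPILE;
    }
}
```

In this sketch, a write committed between compilation and lock acquisition 
changes the version, so the check returns RECOMPILE instead of serving the 
stale entry.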


> On Jan. 29, 2018, 6:26 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/cache/results/QueryResultsCache.java
> > Lines 174 (patched)
> > 
> >
> > What happens if execution fails? Will the results still be cleaned 
> > properly?

This is being called in Driver.closeInProcess()/Driver.close(), so my 
impression was this should be the case.


> On Jan. 29, 2018, 6:26 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/cache/results/QueryResultsCache.java
> > Lines 262 (patched)
> > 
> >
> > Code might have been simpler using Guava cache. This is just a note, 
> > maybe something that can be considered for a follow-up.

Thanks for the pointer, will check it out.


> On Jan. 29, 2018, 6:26 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveRelOpMaterializationValidator.java
> > Lines 72 (patched)
> > 
> >
> > Leaf can be other node too, e.g., _DruidQuery_. Now reusing results is 
> > time-based, but for those types of tables, we do not have guarantees (as 
> > with other external tables). Thus, we should probably return true iff scan 
> > is an instance of HiveTableScan, fals

[jira] [Created] (HIVE-18574) LLAP: Ship netty3 as part of LLAP install tarball

2018-01-29 Thread Gopal V (JIRA)
Gopal V created HIVE-18574:
--

 Summary: LLAP: Ship netty3 as part of LLAP install tarball
 Key: HIVE-18574
 URL: https://issues.apache.org/jira/browse/HIVE-18574
 Project: Hive
  Issue Type: Bug
  Components: llap
Affects Versions: 3.0.0
Reporter: Gopal V


Removing netty from Tez libs causes

{code}
2018-01-29T18:28:49,995 ERROR [main ()] 
org.apache.hadoop.hive.llap.daemon.impl.LlapDaemon: Failed to start LLAP Daemon 
with exception
java.lang.NoClassDefFoundError: org/jboss/netty/channel/group/ChannelGroup
at 
org.apache.hadoop.hive.llap.daemon.impl.LlapDaemon.serviceStart(LlapDaemon.java:410)
 
{code}





[jira] [Created] (HIVE-18573) Use proper Calcite operator instead of UDFs

2018-01-29 Thread slim bouguerra (JIRA)
slim bouguerra created HIVE-18573:
-

 Summary: Use proper Calcite operator instead of UDFs
 Key: HIVE-18573
 URL: https://issues.apache.org/jira/browse/HIVE-18573
 Project: Hive
  Issue Type: Bug
  Components: Hive
Reporter: slim bouguerra


Currently, Hive mostly uses user-defined black-box SQL operators during 
query planning. It would be more beneficial to use proper Calcite operators.

Also, use a single name for the Extract operator instead of a different name 
for every unit. The same applies to the Floor function. This will allow 
unifying the treatment per operator.

 





[jira] [Created] (HIVE-18572) The record readers; InputFormat needs to be fixed for Tez as it generates 1 split

2018-01-29 Thread gurmukh singh (JIRA)
gurmukh singh created HIVE-18572:


 Summary: The record readers; InputFormat needs to be fixed for Tez 
as it generates 1 split
 Key: HIVE-18572
 URL: https://issues.apache.org/jira/browse/HIVE-18572
 Project: Hive
  Issue Type: Bug
Affects Versions: 2.1.0
Reporter: gurmukh singh


The record reader needs to be fixed for Tez, as it generates only 1 split 
because the MRv2 CombineInputFormat broke that rule.

This has been fixed in MR but not in Tez.

I am seeing strange behaviour in Tez: it sees all the data as a single split 
under Hive, whereas MR sees all 79 files. This causes all the data to go to 
a single map.

TEZ Processing
INFO  : Partition trusted.usage{ds=20180126, periode=1200} stats: [numFiles=1, 
numRows=79575067, totalSize=3.164.605.993, rawDataSize=112439569671]
ELAPSED TIME: 1958.99 s

MR Processing
Partition trusted.usage{ds=20180126, periode=1200} stats: [numFiles=79, 
numRows=79575067, totalSize=3172280778, rawDataSize=112418416260]
ELAPSED TIME: 65 s

Log Tez
2018-01-29 16:50:04,825 [INFO] [InputInitializer {Map 1} #0] 
|split.TezMapredSplitsGrouper|: Desired splits: 381 too large.  Desired 
splitLength: 8311476 Min splitLength: 50331648 New desired splits: 381 Final 
desired splits: 381 All splits have localhost: false Total length: 19166265870 
Original splits: 1
2018-01-29 16:50:04,825 [INFO] [InputInitializer {Map 1} #0] 
|split.TezMapredSplitsGrouper|: Using original number of splits: 1 desired 
splits: 381
2018-01-29 16:50:04,826 [INFO] [InputInitializer {Map 1} #0] 
|tez.SplitGrouper|: Original split size is 1 grouped split size is 1, for 
bucket: 1
2018-01-29 16:50:04,827 [INFO] [InputInitializer {Map 1} #0] 
|tez.HiveSplitGenerator|: Number of grouped splits: 1
2018-01-29 16:50:04,846 [INFO] [InputInitializer {Map 1} #0] 
|dag.RootInputInitializerManager|: Succeeded InputInitializer for Input: usage 
on vertex vertex_1517207496169_0085_1_00 [Map 1]
2018-01-29 16:50:04,848 [INFO] [App Shared Pool - #0] |impl.VertexImpl|: Cannot 
init vertex: vertex_1517207496169_0085_1_00 [Map 1] numTasks: -1 
numUnitializedEdges: 0 numInitializedInputs: 1 initWaitsForRootInitializers: 
true
2018-01-29 16:50:04,848 [INFO] [App Shared Pool - #0] |impl.VertexImpl|: Got 
updated RootInputsSpecs: {usage=forAllWorkUnits=true, update=[1]}
2018-01-29 16:50:04,859 [INFO] [App Shared Pool - #0] |impl.VertexImpl|: Vertex 
vertex_1517207496169_0085_1_00 [Map 1] parallelism set to 1

As per discussion with Gopal Vijayaraghavan:

https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java#L494
 that line, right there MRv2 CombineInputFormat broke that rule, so the record 
readers had to be fixed to handle it 
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/HiveContextAwareRecordReader.java#L312





[jira] [Created] (HIVE-18571) stats issue for MM tables

2018-01-29 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-18571:
---

 Summary: stats issue for MM tables
 Key: HIVE-18571
 URL: https://issues.apache.org/jira/browse/HIVE-18571
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin


There are multiple stats aggregation issues with MM tables.
Some simple stats are double counted, and some stats (simple stats) are invalid 
for ACID table dirs altogether. 
I have a patch almost ready; I need to fix some more stuff and clean up.





[jira] [Created] (HIVE-18570) ACID IOW implemented using base may delete too much data

2018-01-29 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-18570:
---

 Summary: ACID IOW implemented using base may delete too much data
 Key: HIVE-18570
 URL: https://issues.apache.org/jira/browse/HIVE-18570
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin


Suppose we have a table with delta_0 insert data.
Txn 1 starts an insert into delta_1.
Txn 2 starts an IOW into base_2.
Txn 2 commits.
Txn 1 commits after txn 2 but its results would be invisible.

If we treat IOW foo like DELETE FROM foo (to reason about it w.r.t. ACID 
semantics), it seems to me this sequence of events is only possible under 
read-uncommitted isolation level (so, 2 deletes rows written by 1).
Under any other isolation level rows written by 1 must survive, or there must 
be some lock based change in sequence or conflict.
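
To make the scenario concrete, here is a minimal sketch in Java of a 
simplified visibility rule: a reader uses the newest base directory plus only 
those deltas whose id is greater than that base. This is an assumption made 
for illustration using the directory names above, not the actual directory 
selection code; IowVisibilitySketch and visibleDirs are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model for illustration only; not the real directory selection logic.
final class IowVisibilitySketch {
    /** Returns the directories a reader would use: the newest base plus later deltas. */
    static List<String> visibleDirs(List<String> dirs) {
        long bestBase = -1;
        for (String dir : dirs) {
            if (dir.startsWith("base_")) {
                bestBase = Math.max(bestBase, idOf(dir));
            }
        }
        List<String> visible = new ArrayList<>();
        if (bestBase >= 0) {
            visible.add("base_" + bestBase);
        }
        for (String dir : dirs) {
            // Deltas at or below the base id are treated as superseded by the base.
            if (dir.startsWith("delta_") && idOf(dir) > bestBase) {
                visible.add(dir);
            }
        }
        return visible;
    }

    /** Parses the numeric suffix of names such as "delta_1" or "base_2". */
    private static long idOf(String dir) {
        return Long.parseLong(dir.substring(dir.indexOf('_') + 1));
    }
}
```

Under this rule, once base_2 exists, delta_1 is hidden no matter when txn 1 
commits, which is the data loss described above.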









Re: Review Request 65342: HIVE-18546

2018-01-29 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65342/#review196454
---




standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
Line 1919 (original), 1894 (patched)


I wonder whether we should check the length of the string before persisting 
it in the RDBMS. We are using different datatypes in different DBs. This string 
shouldn't exceed the max permissible length. Should we throw an exception 
saying "TxnList too long"?
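
A minimal sketch of such a check in Java, for illustration only. The limit 
of 4000 characters is an assumed value, since the real limit depends on the 
column type in each RDBMS; TxnListLengthCheck and checkLength are 
hypothetical names, not part of ObjectStore.

```java
// Hypothetical helper for illustration; not part of ObjectStore.
final class TxnListLengthCheck {
    // Assumed limit; the real value depends on the column type used by each RDBMS.
    static final int MAX_TXN_LIST_LENGTH = 4000;

    /** Returns the string unchanged, or throws if it would not fit in the column. */
    static String checkLength(String txnList) {
        if (txnList != null && txnList.length() > MAX_TXN_LIST_LENGTH) {
            throw new IllegalArgumentException(
                "TxnList too long: " + txnList.length()
                    + " characters, maximum is " + MAX_TXN_LIST_LENGTH);
        }
        return txnList;
    }
}
```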


- Ashutosh Chauhan


On Jan. 25, 2018, 5:49 p.m., Jesús Camacho Rodríguez wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65342/
> ---
> 
> (Updated Jan. 25, 2018, 5:49 p.m.)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Bugs: HIVE-18546
> https://issues.apache.org/jira/browse/HIVE-18546
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-18546
> 
> 
> Diffs
> -
> 
>   metastore/scripts/upgrade/derby/048-HIVE-14498.derby.sql 
> 4ffd054530503681de1c9f6d65f8187fc1b7520d 
>   metastore/scripts/upgrade/derby/hive-schema-3.0.0.derby.sql 
> 6a59b0df712c8a9f9be880cec5fd8c8eddda4a7d 
>   metastore/scripts/upgrade/derby/hive-txn-schema-3.0.0.derby.sql 
> d72b06cb5866edf93dbcbb20268fc899439e5c43 
>   metastore/scripts/upgrade/hive/hive-schema-3.0.0.hive.sql 
> eb4f0124b5a7829e58d5e9a6a604c201ccea998a 
>   metastore/scripts/upgrade/mssql/033-HIVE-14498.mssql.sql 
> 3a47600bb09e2c20cc12f8759e1287001367604e 
>   metastore/scripts/upgrade/mssql/hive-schema-3.0.0.mssql.sql 
> c45bb3e323c640223b19831abbf4e806c3019f0b 
>   metastore/scripts/upgrade/mysql/048-HIVE-14498.mysql.sql 
> 986eaf5272eab560fa2f862910aaf74c5332c716 
>   metastore/scripts/upgrade/mysql/hive-schema-3.0.0.mysql.sql 
> 01c995d632d94a8f9cc3f46f94c54290abb3da13 
>   metastore/scripts/upgrade/mysql/hive-txn-schema-3.0.0.mysql.sql 
> 497846f994d431d8717aea36d4ad569892e3c8c3 
>   metastore/scripts/upgrade/oracle/048-HIVE-14498.oracle.sql 
> 0b01e89d92f7f48439024aeb326d675d123f0f8c 
>   metastore/scripts/upgrade/oracle/hive-schema-3.0.0.oracle.sql 
> e1aee6fb6c84999b17f87f80750582fafeae063f 
>   metastore/scripts/upgrade/oracle/hive-txn-schema-3.0.0.oracle.sql 
> 5411bc47103f901623244bc26c0ace87e10ad2e1 
>   metastore/scripts/upgrade/postgres/047-HIVE-14498.postgres.sql 
> 8d4de8870d93bab49c873cab44e6714b93491744 
>   metastore/scripts/upgrade/postgres/hive-schema-3.0.0.postgres.sql 
> 28cb01684a46aaeea40d7cbe1973d7bc20810988 
>   metastore/scripts/upgrade/postgres/hive-txn-schema-3.0.0.postgres.sql 
> a81d6eec6d6235706f1225d541f8290971cc6215 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 
> 51ef39057434c41fbe760c547e3bf231e65e4cc0 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java 
> 9b0ffe0e91db05ae623531248f12745266789a11 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
> f8126f1604c37f601c9d55b99698d2c0fdf71308 
>   ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHive.java 
> aa95d2fcdcd0b3cede35537a1d7d041ee738e4a8 
>   standalone-metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h 
> 42bc9297e72ac8fd77352cb786cfed3abf5af59b 
>   standalone-metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp 
> 8b78230a32d4d4339189c1db4b533ed04ec080af 
>   
> standalone-metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore_server.skeleton.cpp
>  6a2ff6c4c681b2dbaf339b214663212a2e6dab22 
>   standalone-metastore/src/gen/thrift/gen-cpp/hive_metastore_types.h 
> df646a7d1771892e4404be5c4fba183c0f914510 
>   standalone-metastore/src/gen/thrift/gen-cpp/hive_metastore_types.cpp 
> 27f8c0f2fcb24a90be8a44d68947589004286c28 
>   
> standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/AbortTxnsRequest.java
>  398f8d4e93c6077c110e6469bcd3715fdad5a634 
>   
> standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/AddDynamicPartitions.java
>  2102aa5215598edfe5e5c53d541c4fe02ebc7f09 
>   
> standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/AddForeignKeyRequest.java
>  a2225298e72f708e97324048592c37a308e43514 
>   
> standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/AddNotNullConstraintRequest.java
>  ef23d3025aabb2934f93230ea72c4585dda973e4 
>   
> standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/AddPartitionsRequest.java
>  13a23182488ebda9ab0f7163fd4d6822c04c975f 
>   
> standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/AddPartitionsResult.java
>  49ce6e1a6cc38994662f56536c6dd6bd55e67d47 
>   
> standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/met

Re: Review Request 65304: HIVE-18513 Query results caching

2018-01-29 Thread Jesús Camacho Rodríguez

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65304/#review196440
---




common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
Lines 3699 (patched)


Is this the permission for each results directory? Does this mean that 
results cannot be shared by different users?

Why does this need to be configurable? (I would assume this is not 
something that you let the user decide).



common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
Lines 3710 (patched)


What is _entry_ referring to? Does it mean the SQL query string size? We 
should extend the description to make it clear.



ql/src/java/org/apache/hadoop/hive/ql/Driver.java
Lines 1829 (patched)


Currently what happens when we get this exception?

It seems to me that mechanism in HIVE-17626 works when execution itself 
fails, but in this case, whether the cache entry is still valid or not can be 
inferred statically at planning time, hence I am not sure whether it should be 
handled the same way? It seems we will have certain overhead that might not be 
necessary.

Can't we check the validity of the entry when we are replacing the plan by 
the scan on the cached results, e.g., in SemanticAnalyzer?



ql/src/java/org/apache/hadoop/hive/ql/cache/results/QueryResultsCache.java
Lines 174 (patched)


What happens if execution fails? Will the results still be cleaned properly?



ql/src/java/org/apache/hadoop/hive/ql/cache/results/QueryResultsCache.java
Lines 262 (patched)


Code might have been simpler using Guava cache. This is just a note, maybe 
something that can be considered for a follow-up.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveRelOpMaterializationValidator.java
Lines 72 (patched)


Leaf can be other node too, e.g., _DruidQuery_. Now reusing results is 
time-based, but for those types of tables, we do not have guarantees (as with 
other external tables). Thus, we should probably return true iff scan is an 
instance of HiveTableScan, false otherwise.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveRelOpMaterializationValidator.java
Lines 77 (patched)


_getProjects()_



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveRelOpMaterializationValidator.java
Lines 87 (patched)


It should be _getCondition()_.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveRelOpMaterializationValidator.java
Lines 96 (patched)


I think all the information should be in _getCondition()_.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveRelOpMaterializationValidator.java
Lines 124 (patched)


- There are some operators missing, e.g., semijoin.
- I would create specific visit methods for all operators, then a fall-back 
visit method on RelNode that returns false if it does not recognize the 
operator that we are visiting (as a way to prevent any incorrect assessment, 
e.g., if new operators are added in the future).
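
A rough sketch of what I mean, in Java. The node classes below are 
hypothetical stand-ins written only for this example (they are not the 
Calcite or Hive classes), and the method names are illustrative.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for plan nodes; not the Calcite/Hive classes.
final class ValidatorSketch {
    static abstract class Node {
        final List<Node> inputs;
        Node(Node... inputs) { this.inputs = Arrays.asList(inputs); }
    }
    static final class TableScan extends Node { TableScan() { super(); } }
    static final class ExternalScan extends Node { ExternalScan() { super(); } }
    static final class Filter extends Node { Filter(Node input) { super(input); } }
    static final class Join extends Node { Join(Node left, Node right) { super(left, right); } }
    static final class FutureOperator extends Node { FutureOperator(Node input) { super(input); } }

    /** Dispatches to a specific visit method per operator, else to the fall-back. */
    static boolean isValid(Node node) {
        if (node instanceof TableScan) return visitTableScan((TableScan) node);
        if (node instanceof Filter) return visitFilter((Filter) node);
        if (node instanceof Join) return visitJoin((Join) node);
        return visitOther(node);
    }

    static boolean visitTableScan(TableScan scan) { return true; }
    static boolean visitFilter(Filter filter) { return inputsValid(filter); }
    static boolean visitJoin(Join join) { return inputsValid(join); }

    /** Fall-back: any operator that is not explicitly recognized is rejected. */
    static boolean visitOther(Node node) { return false; }

    private static boolean inputsValid(Node node) {
        for (Node input : node.inputs) {
            if (!isValid(input)) return false;
        }
        return true;
    }
}
```

With this shape, a leaf that is not a plain table scan, or an operator added 
later without a matching visit method, is rejected by default rather than 
silently accepted.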


- Jesús Camacho Rodríguez


On Jan. 25, 2018, 7:34 a.m., Jason Dere wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65304/
> ---
> 
> (Updated Jan. 25, 2018, 7:34 a.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan, Gopal V, and Jesús Camacho 
> Rodríguez.
> 
> 
> Bugs: HIVE-18513
> https://issues.apache.org/jira/browse/HIVE-18513
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> - For queries that result in MR/Tez/Spark jobs on the cluster, save the 
> temporary query results to a cache directory where they can be re-used.
> - Add QueryResultsCache to manage cached results. Currently cache 
> invalidation is time-based, update-based cache invalidation needs to be added 
> later.
> - Driver/SemanticAnalyzer/Calcite planner changes to lookup queries in the 
> cache and use in place of the query plan.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 0c2cf05 
>   common/src/java/org/apache/hadoop/hive/ql/log/PerfLogger.java 2767bca 
>   data/conf/hive-site.xml 01f83d1 
>   data/conf/llap/hive-site.xml cdda875 
>   data/conf/perf-reg/spark/hive-site.xml 497a61f 
>   data/conf/perf-reg/tez/hive-site.xml 012369f 
>   data/conf/rlist/hi

Re: Review Request 65356: HIVE-18536 IOW + DP is broken for insert-only ACID

2018-01-29 Thread Eugene Koifman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65356/#review196437
---




ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java
Line 4077 (original), 4077 (patched)


As far as I can tell, every call to this method passes null for isBaseDir. 
Can this be removed?


- Eugene Koifman


On Jan. 26, 2018, 9:03 p.m., Sergey Shelukhin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65356/
> ---
> 
> (Updated Jan. 26, 2018, 9:03 p.m.)
> 
> 
> Review request for hive and Eugene Koifman.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> .
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/common/JavaUtils.java 9f64b3d2e0 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 2e1fd37d4a 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 23983d85b3 
> 
> 
> Diff: https://reviews.apache.org/r/65356/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>



[jira] [Created] (HIVE-18569) Hive Druid indexing not dealing with decimals in correct way.

2018-01-29 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-18569:
---

 Summary: Hive Druid indexing not dealing with decimals in correct 
way.
 Key: HIVE-18569
 URL: https://issues.apache.org/jira/browse/HIVE-18569
 Project: Hive
  Issue Type: Bug
  Components: Druid integration
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


Currently, a decimal column is indexed as a double in Druid.
This should not happen: either the user has to add an explicit cast, or we 
can add a flag to enable approximation.





Re: Review Request 65380: HIVE-18566: Create tests to cover adding partitions from PartitionSpec

2018-01-29 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65380/#review196431
---



Nice thorough test cases.
Thanks Marta!

Fix it and ship it!


standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitions.java
Lines 1312 (patched)


Agree with Adam, please move it to another test class.


- Peter Vary


On Jan. 29, 2018, 12:27 p.m., Marta Kuczora wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65380/
> ---
> 
> (Updated Jan. 29, 2018, 12:27 p.m.)
> 
> 
> Review request for hive, Peter Vary and Adam Szita.
> 
> 
> Bugs: HIVE-18566
> https://issues.apache.org/jira/browse/HIVE-18566
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The following methods of IMetaStoreClient are covered by this test.
> - int add_partitions_pspec(PartitionSpecProxy)
> 
> The test covers not just the happy paths, but the edge cases as well.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitions.java
>  09be321 
> 
> 
> Diff: https://reviews.apache.org/r/65380/diff/1/
> 
> 
> Testing
> ---
> 
> Run the tests
> 
> 
> Thanks,
> 
> Marta Kuczora
> 
>



Re: Review Request 65380: HIVE-18566: Create tests to cover adding partitions from PartitionSpec

2018-01-29 Thread Adam Szita via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65380/#review196430
---



Nice patch with well covered cases.
Since this is a lot of test code, wouldn't it make sense to move the 
PartitionSpec-related code into a separate class (rather than having a 2k+ LOC 
class)?


standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitions.java
Lines 2220-2221 (patched)


123456 may be a constant to be considered for extraction


- Adam Szita


On Jan. 29, 2018, 12:27 p.m., Marta Kuczora wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65380/
> ---
> 
> (Updated Jan. 29, 2018, 12:27 p.m.)
> 
> 
> Review request for hive, Peter Vary and Adam Szita.
> 
> 
> Bugs: HIVE-18566
> https://issues.apache.org/jira/browse/HIVE-18566
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The following methods of IMetaStoreClient are covered by this test.
> - int add_partitions_pspec(PartitionSpecProxy)
> 
> The test covers not just the happy paths, but the edge cases as well.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitions.java
>  09be321 
> 
> 
> Diff: https://reviews.apache.org/r/65380/diff/1/
> 
> 
> Testing
> ---
> 
> Run the tests
> 
> 
> Thanks,
> 
> Marta Kuczora
> 
>



Re: Review Request 65349: HIVE-18544: Create tests to cover methods for appending Partitions

2018-01-29 Thread Adam Szita via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65349/#review196428
---


Ship it!




Ship It!

- Adam Szita


On Jan. 29, 2018, 12:49 p.m., Marta Kuczora wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65349/
> ---
> 
> (Updated Jan. 29, 2018, 12:49 p.m.)
> 
> 
> Review request for hive, Peter Vary and Adam Szita.
> 
> 
> Bugs: HIVE-18544
> https://issues.apache.org/jira/browse/HIVE-18544
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The following methods of IMetaStoreClient are covered by this test.
> - Partition appendPartition(String, String, List)
> - Partition appendPartition(String, String, String)
> 
> The test covers not just the happy paths, but the edge cases as well.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAppendPartitions.java
>  PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/65349/diff/2/
> 
> 
> Testing
> ---
> 
> Run the tests
> 
> 
> Thanks,
> 
> Marta Kuczora
> 
>



[jira] [Created] (HIVE-18568) Hive doesn't complain about duplicate column unless it is a CTAS query

2018-01-29 Thread Siddhant Saraf (JIRA)
Siddhant Saraf created HIVE-18568:
-

 Summary: Hive doesn't complain about duplicate column unless it is 
a CTAS query
 Key: HIVE-18568
 URL: https://issues.apache.org/jira/browse/HIVE-18568
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Siddhant Saraf


{code:java}
-- demo:
hive> select 1 as number, 2 as number;
OK
number number
1 2
Time taken: 0.091 seconds, Fetched: 1 row(s)

hive> create table test as select 1 as number, 2 as number;
FAILED: SemanticException [Error 10036]: Duplicate column name: number
{code}
I had a 'select' query to which I later prepended 'create table as', at which 
point it threw an error about a duplicate column name. I was surprised by the 
different treatment of duplicate column names. In my opinion this is a bug. Or 
is this by design?





[jira] [Created] (HIVE-18567) ObjectStore.getPartitionNamesNoTxn doesn't handle max param properly

2018-01-29 Thread Adam Szita (JIRA)
Adam Szita created HIVE-18567:
-

 Summary: ObjectStore.getPartitionNamesNoTxn doesn't handle max 
param properly
 Key: HIVE-18567
 URL: https://issues.apache.org/jira/browse/HIVE-18567
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Reporter: Adam Szita
Assignee: Adam Szita


As per [this HMS API test 
case|https://github.com/apache/hive/commit/fa0a8d27d4149cc5cc2dbb49d8eb6b03f46bc279#diff-25c67d898000b53e623a6df9221aad5dR1044]
 listing partition names doesn't check the max param against 
MetaStoreConf.LIMIT_PARTITION_REQUEST (as other methods do via 
checkLimitNumberOfPartitionsByFilter), and it also behaves differently for 
max=0 compared to other methods.

We should bring this into consistency.
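
One way to get there is to route every listing call through a single helper that validates the requested max. The sketch below is hypothetical: the class name, the method, and the chosen semantics (a negative max meaning "all", a negative limit meaning "no configured limit", max=0 returning nothing) are assumptions for illustration, not the ObjectStore code.

```java
final class PartitionLimitCheck {
    /**
     * Returns how many partition names to fetch, or throws if the request
     * exceeds the configured limit.
     * Assumption: max < 0 means "all", limit < 0 means "unlimited".
     */
    static int effectiveMax(int max, int available, int limit) {
        int requested = (max < 0) ? available : Math.min(max, available);
        if (limit >= 0 && requested > limit) {
            throw new IllegalArgumentException(
                "Number of partitions requested (" + requested
                    + ") exceeds the limit (" + limit + ")");
        }
        return requested;
    }
}
```

Whatever semantics are chosen for max=0 and negative values, the point is that one helper decides them for all listing methods.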





Re: Review Request 65349: HIVE-18544: Create tests to cover methods for appending Partitions

2018-01-29 Thread Marta Kuczora via Review Board


> On Jan. 29, 2018, 11:15 a.m., Adam Szita wrote:
> > Thanks for the patch Marta, it looks very thorough! I've added my 2 cents: 
> > small observations regarding helper methods only.

Thanks a lot Adam for the review.


> On Jan. 29, 2018, 11:15 a.m., Adam Szita wrote:
> > standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAppendPartitions.java
> > Lines 489-495 (patched)
> > 
> >
> > When defining table types we should rely on the enum values IMHO: 
> > TableType.EXTERNAL_TABLE.name()

You are right, thanks for pointing this out.


> On Jan. 29, 2018, 11:15 a.m., Adam Szita wrote:
> > standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAppendPartitions.java
> > Lines 536-539 (patched)
> > 
> >
> > Another way of doing this is:
> > 
> > 
> > Arrays.stream(input.split("/")).map(v->v.split("=")[1]).collect(toList())

Thanks for the hint, I fixed this.


- Marta


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65349/#review196418
---


On Jan. 29, 2018, 12:49 p.m., Marta Kuczora wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65349/
> ---
> 
> (Updated Jan. 29, 2018, 12:49 p.m.)
> 
> 
> Review request for hive, Peter Vary and Adam Szita.
> 
> 
> Bugs: HIVE-18544
> https://issues.apache.org/jira/browse/HIVE-18544
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The following methods of IMetaStoreClient are covered by this test.
> - Partition appendPartition(String, String, List)
> - Partition appendPartition(String, String, String)
> 
> The test covers not just the happy paths, but the edge cases as well.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAppendPartitions.java
>  PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/65349/diff/2/
> 
> 
> Testing
> ---
> 
> Run the tests
> 
> 
> Thanks,
> 
> Marta Kuczora
> 
>



Re: Review Request 65349: HIVE-18544: Create tests to cover methods for appending Partitions

2018-01-29 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65349/
---

(Updated Jan. 29, 2018, 12:49 p.m.)


Review request for hive, Peter Vary and Adam Szita.


Changes
---

Address review findings


Bugs: HIVE-18544
https://issues.apache.org/jira/browse/HIVE-18544


Repository: hive-git


Description
---

The following methods of IMetaStoreClient are covered by this test.
- Partition appendPartition(String, String, List)
- Partition appendPartition(String, String, String)

The test covers not just the happy paths, but the edge cases as well.


Diffs (updated)
-

  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAppendPartitions.java
 PRE-CREATION 


Diff: https://reviews.apache.org/r/65349/diff/2/

Changes: https://reviews.apache.org/r/65349/diff/1-2/


Testing
---

Run the tests


Thanks,

Marta Kuczora



Re: Review Request 65349: HIVE-18544: Create tests to cover methods for appending Partitions

2018-01-29 Thread Marta Kuczora via Review Board


> On Jan. 26, 2018, 1:53 p.m., Peter Vary wrote:
> > Ship It!

Thanks a lot Peter for the review.


- Marta


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65349/#review196338
---


On Jan. 29, 2018, 12:49 p.m., Marta Kuczora wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65349/
> ---
> 
> (Updated Jan. 29, 2018, 12:49 p.m.)
> 
> 
> Review request for hive, Peter Vary and Adam Szita.
> 
> 
> Bugs: HIVE-18544
> https://issues.apache.org/jira/browse/HIVE-18544
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The following methods of IMetaStoreClient are covered by this test.
> - Partition appendPartition(String, String, List)
> - Partition appendPartition(String, String, String)
> 
> The test covers not just the happy paths, but the edge cases as well.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAppendPartitions.java
>  PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/65349/diff/2/
> 
> 
> Testing
> ---
> 
> Run the tests
> 
> 
> Thanks,
> 
> Marta Kuczora
> 
>



Review Request 65380: HIVE-18566: Create tests to cover adding partitions from PartitionSpec

2018-01-29 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65380/
---

Review request for hive, Peter Vary and Adam Szita.


Bugs: HIVE-18566
https://issues.apache.org/jira/browse/HIVE-18566


Repository: hive-git


Description
---

The following methods of IMetaStoreClient are covered by this test.
- int add_partitions_pspec(PartitionSpecProxy)

The test covers not just the happy paths, but the edge cases as well.


Diffs
-

  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitions.java
 09be321 


Diff: https://reviews.apache.org/r/65380/diff/1/


Testing
---

Run the tests


Thanks,

Marta Kuczora



[jira] [Created] (HIVE-18566) Create tests to cover adding partitions from PartitionSpec

2018-01-29 Thread Marta Kuczora (JIRA)
Marta Kuczora created HIVE-18566:


 Summary: Create tests to cover adding partitions from PartitionSpec
 Key: HIVE-18566
 URL: https://issues.apache.org/jira/browse/HIVE-18566
 Project: Hive
  Issue Type: Sub-task
Reporter: Marta Kuczora
Assignee: Marta Kuczora








Re: Review Request 65349: HIVE-18544: Create tests to cover methods for appending Partitions

2018-01-29 Thread Adam Szita via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65349/#review196418
---



Thanks for the patch Marta, it looks very thorough! I've added my 2 cents: 
small observations regarding helper methods only.


standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAppendPartitions.java
Lines 489-495 (patched)


When defining table types we should rely on the enum values IMHO: 
TableType.EXTERNAL_TABLE.name()
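
A minimal sketch of the idea, using a local stand-in enum so that it compiles on its own (the stand-in lists only a few constants and is an assumption; the real test would import Hive's TableType instead):

```java
// Stand-in for Hive's TableType enum, declared here only for illustration.
enum TableType { MANAGED_TABLE, EXTERNAL_TABLE, VIRTUAL_VIEW }

final class TableTypeExample {
    static String externalTableType() {
        // name() returns the constant's identifier, so a misspelling is a
        // compile error rather than a silently wrong string literal.
        return TableType.EXTERNAL_TABLE.name();
    }
}
```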



standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAppendPartitions.java
Lines 536-539 (patched)


Another way of doing this is:

Arrays.stream(input.split("/")).map(v->v.split("=")[1]).collect(toList())
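
Spelled out as a self-contained helper, the one-liner above could look like this sketch. The class and method names are made up for illustration, and the input is assumed to be a well-formed partition name of the form key=value/key=value:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class PartitionNameValues {
    /** Extracts the values from a name such as "year=2017/month=march". */
    static List<String> valuesOf(String partitionName) {
        return Arrays.stream(partitionName.split("/"))
                .map(kv -> kv.split("=")[1])   // keep the part after '='
                .collect(Collectors.toList());
    }
}
```

Note that a component without '=' or with an empty value would throw an ArrayIndexOutOfBoundsException, which is probably acceptable in a test helper.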


- Adam Szita


On Jan. 26, 2018, 1:03 p.m., Marta Kuczora wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65349/
> ---
> 
> (Updated Jan. 26, 2018, 1:03 p.m.)
> 
> 
> Review request for hive, Peter Vary and Adam Szita.
> 
> 
> Bugs: HIVE-18544
> https://issues.apache.org/jira/browse/HIVE-18544
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The following methods of IMetaStoreClient are covered by this test.
> - Partition appendPartition(String, String, List)
> - Partition appendPartition(String, String, String)
> 
> The test covers not just the happy paths, but the edge cases as well.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAppendPartitions.java
>  PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/65349/diff/1/
> 
> 
> Testing
> ---
> 
> Run the tests
> 
> 
> Thanks,
> 
> Marta Kuczora
> 
>



Re: Review Request 65353: Create tests to cover getTableMeta method

2018-01-29 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65353/#review196416
---



Thanks for the patch Adam. I was only able to come up with 1 more test case :)


standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestGetTableMeta.java
Lines 144 (patched)


It might be a good idea to add a comment field to some of the tables and 
leave it out for others, then check the results too.


- Peter Vary


On Jan. 26, 2018, 6:12 p.m., Adam Szita wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65353/
> ---
> 
> (Updated Jan. 26, 2018, 6:12 p.m.)
> 
> 
> Review request for hive, Marta Kuczora and Peter Vary.
> 
> 
> Bugs: HIVE-18542
> https://issues.apache.org/jira/browse/HIVE-18542
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Create tests to cover getTableMeta method
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestGetTableMeta.java
>  PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/65353/diff/1/
> 
> 
> Testing
> ---
> 
> Have run the tests themselves
> 
> 
> Thanks,
> 
> Adam Szita
> 
>



[jira] [Created] (HIVE-18565) Set some timeout to all CliDriver tests

2018-01-29 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18565:
---

 Summary: Set some timeout to all CliDriver tests
 Key: HIVE-18565
 URL: https://issues.apache.org/jira/browse/HIVE-18565
 Project: Hive
  Issue Type: Sub-task
  Components: Testing Infrastructure
Reporter: Zoltan Haindrich


If a test case runs into an infinite loop or similar, it should still fail 
on its own instead of causing a timeout at the ptest executor.
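
As a rough illustration of the mechanism, the following sketch runs a task on a worker thread and gives up after a deadline. It uses only java.util.concurrent; the class and method names are hypothetical and this is not how the CliDriver tests are wired.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class TimeoutRunner {
    /** Returns true if the task finished in time, false if it timed out. */
    static boolean finishesWithin(Runnable task, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<?> future = executor.submit(task);
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            return false;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("task failed", e);
        } finally {
            executor.shutdownNow(); // interrupts the worker thread
        }
    }
}
```

In JUnit 4 the same effect is usually obtained with `@Test(timeout = ...)` or the `org.junit.rules.Timeout` rule. In either case a loop that never checks for interruption keeps its thread alive, even though the test itself is reported as failed.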





[jira] [Created] (HIVE-18564) Add a mapper to make plan transformations more easily understandable

2018-01-29 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18564:
---

 Summary: Add a mapper to make plan transformations more easily 
understandable
 Key: HIVE-18564
 URL: https://issues.apache.org/jira/browse/HIVE-18564
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


This started as a small helper class to enable plan-independent mapping 
of runtime operator information. In reality it's a bit different, and it might 
have other kinds of uses of its own.

Goals were:
 * connect plan pieces which are responsible for the same part together; 
currently I'm using it to connect RelNode, AST, Operator, RuntimeStats
 * make it easy to attach new data
 * make it easy to lookup some related information

This concept also seems to be useful when writing tests, because it enables 
the lookup of specific pieces like HiveFilter.
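
To make the goals concrete, here is a minimal sketch of such a mapper. Everything in it (class name, methods, the use of identity-based grouping) is an assumption for illustration and not the proposed implementation: objects linked together form a group, and a lookup returns the group members of a given type.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class PlanPieceMapper {
    // Maps each object to the group (set of linked objects) it belongs to.
    private final Map<Object, Set<Object>> groups = new IdentityHashMap<>();

    /** Declares that all given objects describe the same part of the plan. */
    void link(Object... pieces) {
        Set<Object> merged =
            Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());
        for (Object piece : pieces) {
            Set<Object> existing = groups.get(piece);
            if (existing != null) {
                merged.addAll(existing); // merge groups that are now connected
            }
            merged.add(piece);
        }
        for (Object member : merged) {
            groups.put(member, merged);
        }
    }

    /** Returns every object of the given type linked to the key. */
    <T> List<T> lookup(Object key, Class<T> type) {
        List<T> result = new ArrayList<>();
        Set<Object> group = groups.getOrDefault(key, Collections.<Object>emptySet());
        for (Object member : group) {
            if (type.isInstance(member)) {
                result.add(type.cast(member));
            }
        }
        return result;
    }
}
```

With this shape, attaching new data is another `link` call, and a test could ask for all objects of one class that are connected to a given operator.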





[jira] [Created] (HIVE-18563) "Load data into table" behavior is different between 1.2.1 and 1.2.1000

2018-01-29 Thread Junichi Oda (JIRA)
Junichi Oda created HIVE-18563:
--

 Summary: "Load data into table" behavior is different between 
1.2.1 and 1.2.1000
 Key: HIVE-18563
 URL: https://issues.apache.org/jira/browse/HIVE-18563
 Project: Hive
  Issue Type: Bug
  Components: Hive, HiveServer2
 Environment: * OS : CentOS6
 * JDK : 1.8.0_152(Oracle)
 * HDP : 2.3.2.0 and 2.6.2.0
 * Hive : 1.2.1.2.3.2.0-2950 and 1.2.1000.2.6.2.0-205
Reporter: Junichi Oda


After upgrading HDP from 2.3.2.0 to 2.6.2.0, the "load data into table" 
behavior changed.

Data arrives hourly, and all files have the same name.

{code:java}
/user/user1/logs/mmdd/00/part-r-0.gz
/user/user1/logs/mmdd/01/part-r-0.gz
/user/user1/logs/mmdd/02/part-r-0.gz
/user/user1/logs/mmdd/03/part-r-0.gz
・・・
/user/user1/logs/mmdd/22/part-r-0.gz
/user/user1/logs/mmdd/23/part-r-0.gz
{code}

Before upgrade (HDP 2.3.2.0)

{code:java}
HQL
hive> load data inpath '/user/user1/logs/mmdd/*/*.gz' into table 
sample_db.sample_tbl partition (dt='mmdd');
 
 
Result
/hive/warehouse/sample_db.db/sample_tbl/dt=mmdd
/hive/warehouse/sample_db.db/sample_tbl/dt=mmdd/part-r-0.gz
/hive/warehouse/sample_db.db/sample_tbl/dt=mmdd/part-r-0_copy_1.gz
/hive/warehouse/sample_db.db/sample_tbl/dt=mmdd/part-r-0_copy_10.gz
/hive/warehouse/sample_db.db/sample_tbl/dt=mmdd/part-r-0_copy_11.gz
/hive/warehouse/sample_db.db/sample_tbl/dt=mmdd/part-r-0_copy_12.gz
/hive/warehouse/sample_db.db/sample_tbl/dt=mmdd/part-r-0_copy_13.gz
/hive/warehouse/sample_db.db/sample_tbl/dt=mmdd/part-r-0_copy_14.gz
/hive/warehouse/sample_db.db/sample_tbl/dt=mmdd/part-r-0_copy_15.gz
/hive/warehouse/sample_db.db/sample_tbl/dt=mmdd/part-r-0_copy_16.gz
/hive/warehouse/sample_db.db/sample_tbl/dt=mmdd/part-r-0_copy_17.gz
/hive/warehouse/sample_db.db/sample_tbl/dt=mmdd/part-r-0_copy_18.gz
/hive/warehouse/sample_db.db/sample_tbl/dt=mmdd/part-r-0_copy_19.gz
/hive/warehouse/sample_db.db/sample_tbl/dt=mmdd/part-r-0_copy_2.gz
/hive/warehouse/sample_db.db/sample_tbl/dt=mmdd/part-r-0_copy_20.gz
/hive/warehouse/sample_db.db/sample_tbl/dt=mmdd/part-r-0_copy_21.gz
/hive/warehouse/sample_db.db/sample_tbl/dt=mmdd/part-r-0_copy_22.gz
/hive/warehouse/sample_db.db/sample_tbl/dt=mmdd/part-r-0_copy_23.gz
/hive/warehouse/sample_db.db/sample_tbl/dt=mmdd/part-r-0_copy_3.gz
/hive/warehouse/sample_db.db/sample_tbl/dt=mmdd/part-r-0_copy_4.gz
/hive/warehouse/sample_db.db/sample_tbl/dt=mmdd/part-r-0_copy_5.gz
/hive/warehouse/sample_db.db/sample_tbl/dt=mmdd/part-r-0_copy_6.gz
/hive/warehouse/sample_db.db/sample_tbl/dt=mmdd/part-r-0_copy_7.gz
/hive/warehouse/sample_db.db/sample_tbl/dt=mmdd/part-r-0_copy_8.gz
/hive/warehouse/sample_db.db/sample_tbl/dt=mmdd/part-r-0_copy_9.gz
{code}
All files except part-r-0.gz were renamed to part-r-0_copy_*.gz.
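
The naming pattern in the listing above (a `_copy_N` suffix inserted before the extension when the target name is already taken) can be sketched as follows. This is an illustration of the observed result only; the class and method are hypothetical and do not reproduce Hive's actual code.

```java
import java.util.Set;

final class CopyNaming {
    /**
     * Returns a name that is not yet in 'existing', inserting "_copy_N"
     * before the extension when the plain name is already taken.
     */
    static String uniqueName(Set<String> existing, String name) {
        if (!existing.contains(name)) {
            return name;
        }
        int dot = name.lastIndexOf('.');
        String base = dot < 0 ? name : name.substring(0, dot);
        String ext = dot < 0 ? "" : name.substring(dot);
        for (int i = 1; ; i++) {
            String candidate = base + "_copy_" + i + ext;
            if (!existing.contains(candidate)) {
                return candidate;
            }
        }
    }
}
```

The caller is expected to add each returned name to the set before resolving the next file, so that 24 files with the same source name end up as one plain name plus `_copy_1` through `_copy_23`.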

After upgrade (HDP 2.6.2.0)
{code:java}
HQL
hive> load data inpath '/user/user1/logs/mmdd/*/*.gz' into table 
sample_db.sample_tbl partition (dt='mmdd');
 
Result
/hive/warehouse/sample_db.db/sample_tbl/dt=mmdd
/hive/warehouse/sample_db.db/sample_tbl/dt=mmdd/part-r-0.gz
{code}
There is only part-r-0.gz.

This file has the same content as part-r-0_copy_23.gz from before the upgrade.

When the files are loaded one by one, all of them are loaded, as in the HDP 
2.3.2.0 environment.

Why is the behavior different between 2.3.2.0 and 2.6.2.0?

Thanks in advance

 

https://community.hortonworks.com/questions/158176/load-data-into-table-behavior-is-different-between.html


