[jira] [Created] (HIVE-18795) upgrade accumulo to 1.8.1

2018-02-23 Thread Saijin Huang (JIRA)
Saijin Huang created HIVE-18795:
---

 Summary: upgrade accumulo to 1.8.1
 Key: HIVE-18795
 URL: https://issues.apache.org/jira/browse/HIVE-18795
 Project: Hive
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Saijin Huang
Assignee: Saijin Huang






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-18794) Repl load "with" clause does not pass config to tasks for non-partition tables

2018-02-23 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-18794:
-

 Summary: Repl load "with" clause does not pass config to tasks for 
non-partition tables
 Key: HIVE-18794
 URL: https://issues.apache.org/jira/browse/HIVE-18794
 Project: Hive
  Issue Type: Bug
Reporter: Daniel Dai
Assignee: Daniel Dai
 Attachments: HIVE-18794.1.patch

One scenario was missed in HIVE-18626.





Re: Review Request 65415: HIVE-18571 stats issues for MM tables

2018-02-23 Thread Sergey Shelukhin


> On Feb. 2, 2018, 10 a.m., Zoltan Haindrich wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java
> > Lines 650 (patched)
> > 
> >
> > I might be missing something but I don't see why quickstats should be 
> > calculated differently for transactional tables. Quickstats is num_files 
> > and total bytes on disk, and these things apply to acid tables as well.

For acid tables, files in old delta directories and such do not belong to the 
table data set.


- Sergey


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65415/#review196696
---


On Feb. 24, 2018, 2:09 a.m., Sergey Shelukhin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65415/
> ---
> 
> (Updated Feb. 24, 2018, 2:09 a.m.)
> 
> 
> Review request for hive and Eugene Koifman.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> f.,v fbghdscd
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java b490325091 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/table/LoadPartitions.java
>  0a82225d4a 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 70fcd2c142 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 8b0af3e5c8 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 
> 67d05e65dd 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 
> 7d2de75315 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
> cd6f1ee692 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/BasicStatsWork.java a4e770ce95 
>   ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsNoJobTask.java 
> 946c300750 
>   ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsTask.java 1d7660e8b2 
>   ql/src/java/org/apache/hadoop/hive/ql/stats/Partish.java 05b0474e90 
>   ql/src/java/org/apache/hadoop/hive/ql/stats/fs/FSStatsAggregator.java 
> d84cf136d5 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java
>  89354a2d34 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
>  c6e34a8a22 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java
>  50f873a013 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
>  2599ab103e 
> 
> 
> Diff: https://reviews.apache.org/r/65415/diff/3/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>



Re: Review Request 65415: HIVE-18571 stats issues for MM tables

2018-02-23 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65415/
---

(Updated Feb. 24, 2018, 2:09 a.m.)


Review request for hive and Eugene Koifman.


Repository: hive-git


Description
---

f.,v fbghdscd


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java b490325091 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/table/LoadPartitions.java
 0a82225d4a 
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 70fcd2c142 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 8b0af3e5c8 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 
67d05e65dd 
  ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 
7d2de75315 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java cd6f1ee692 
  ql/src/java/org/apache/hadoop/hive/ql/plan/BasicStatsWork.java a4e770ce95 
  ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsNoJobTask.java 
946c300750 
  ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsTask.java 1d7660e8b2 
  ql/src/java/org/apache/hadoop/hive/ql/stats/Partish.java 05b0474e90 
  ql/src/java/org/apache/hadoop/hive/ql/stats/fs/FSStatsAggregator.java 
d84cf136d5 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java
 89354a2d34 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 c6e34a8a22 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java
 50f873a013 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
 2599ab103e 


Diff: https://reviews.apache.org/r/65415/diff/3/

Changes: https://reviews.apache.org/r/65415/diff/2-3/


Testing
---


Thanks,

Sergey Shelukhin



[jira] [Created] (HIVE-18793) Round udf should support variable as second argument

2018-02-23 Thread Ashutosh Chauhan (JIRA)
Ashutosh Chauhan created HIVE-18793:
---

 Summary: Round udf should support variable as second argument
 Key: HIVE-18793
 URL: https://issues.apache.org/jira/browse/HIVE-18793
 Project: Hive
  Issue Type: Improvement
  Components: UDF
Reporter: Ashutosh Chauhan








[jira] [Created] (HIVE-18792) Allow standard compliant syntax for insert on partitioned tables

2018-02-23 Thread Ashutosh Chauhan (JIRA)
Ashutosh Chauhan created HIVE-18792:
---

 Summary: Allow standard compliant syntax for insert on partitioned 
tables
 Key: HIVE-18792
 URL: https://issues.apache.org/jira/browse/HIVE-18792
 Project: Hive
  Issue Type: Improvement
  Components: SQL
Reporter: Ashutosh Chauhan


The following works:

{code}

create table t1 (a int, b int, c int);

create table t2 (a int, b int, c int) partitioned by (d int);

insert into t1 values (1,2,3);

insert into t1 (c, b, a) values (1,2,3);

insert into t1 (a,b) values (1,2);

{code}

For partitioned tables it should work similarly, but it doesn't. All of the 
following fail:

{code}

insert into t2 values (1,2,3,4);

insert into t2 (a, b, c, d) values (1,2,3,4);

insert into t2 (c,d) values (1,2);

insert into t2 (a,b) values (1,2);

{code}

All of the above should work. Also note that the following works:

{code}

insert into t2 partition(d)  values (1,2,3,4);

insert into t2 partition(d=4)  values (1,2,3);

{code}

 





[jira] [Created] (HIVE-18791) Fix TestJdbcWithMiniHS2#testHttpHeaderSize

2018-02-23 Thread Andrew Sherman (JIRA)
Andrew Sherman created HIVE-18791:
-

 Summary: Fix TestJdbcWithMiniHS2#testHttpHeaderSize
 Key: HIVE-18791
 URL: https://issues.apache.org/jira/browse/HIVE-18791
 Project: Hive
  Issue Type: Bug
Reporter: Andrew Sherman
Assignee: Andrew Sherman


TestJdbcWithMiniHS2#testHttpHeaderSize tests whether the configuration of HTTP 
header sizes works by using a long username. The local scratch directory for 
the session uses the username as part of its path. When this name is longer 
than 255 characters (the limit on most modern file systems), directory 
creation fails. HIVE-18625 made this failure throw an exception, which has 
caused a regression in testHttpHeaderSize.
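
To make the failure mode described above concrete, here is a minimal Java sketch that checks whether any single component of a path exceeds a length limit. The class name ScratchDirNameCheck, the helper hasOverlongComponent and the 255-byte constant are illustrative assumptions, not Hive code; the real limit depends on the file system in use.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Hive: flags path components that are
// longer than an assumed per-component limit.
class ScratchDirNameCheck {
    // Assumption: 255 bytes per path component, as on many common file systems.
    public static final int MAX_COMPONENT_BYTES = 255;

    /** Returns true if any single name element of the path exceeds the assumed limit. */
    public static boolean hasOverlongComponent(Path path) {
        for (Path component : path) {
            int bytes = component.toString().getBytes(StandardCharsets.UTF_8).length;
            if (bytes > MAX_COMPONENT_BYTES) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Build a 300-character stand-in for the long username used by the test.
        StringBuilder longUser = new StringBuilder();
        for (int i = 0; i < 300; i++) {
            longUser.append('u');
        }
        Path scratch = Paths.get("/tmp", "hive", longUser.toString());
        System.out.println("overlong component: " + hasOverlongComponent(scratch));
    }
}
```

A check like this could let a test pick a username that is long enough to exercise the header size setting while still producing a valid directory name; whether that is the right fix for the test is left open here.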





[GitHub] hive pull request #312: HIVE-18777

2018-02-23 Thread thejasmn
GitHub user thejasmn opened a pull request:

https://github.com/apache/hive/pull/312

HIVE-18777



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/thejasmn/hive HIVE-18777-authprovider

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/312.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #312


commit e4627ce304ea44ddeffa6f822247fc5e105d9aba
Author: Thejas M Nair 
Date:   2018-02-23T22:08:08Z

add policy provider interfaces

commit 4e0157d3aecf3f1d94eb790cb1a0f91dfeb3e25a
Author: Thejas M Nair 
Date:   2018-02-23T22:23:51Z

Add ASL header




---


[jira] [Created] (HIVE-18790) test jars are present hive .tar.gz

2018-02-23 Thread Thejas M Nair (JIRA)
Thejas M Nair created HIVE-18790:


 Summary: test jars are present hive .tar.gz
 Key: HIVE-18790
 URL: https://issues.apache.org/jira/browse/HIVE-18790
 Project: Hive
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Thejas M Nair


In the extracted tar.gz from apache master there are test jar files. They 
should be removed.

{code}
ls apache-hive-3.0.0-SNAPSHOT-bin/lib/*test*
apache-hive-3.0.0-SNAPSHOT-bin/lib/hbase-common-2.0.0-alpha4-tests.jar
apache-hive-3.0.0-SNAPSHOT-bin/lib/hbase-hadoop2-compat-2.0.0-alpha4-tests.jar
apache-hive-3.0.0-SNAPSHOT-bin/lib/hive-llap-common-3.0.0-SNAPSHOT-tests.jar
apache-hive-3.0.0-SNAPSHOT-bin/lib/hive-testutils-3.0.0-SNAPSHOT.jar
{code}






Re: Question about design of ObjectStore cache

2018-02-23 Thread Sergey Shelukhin
I think the primary motivation for cache was the cloud use-case where the
default SQL instance is underpowered and you have to set up and pay more
to get a performant one. The same may apply to setting up redis/memcached
on the cloud environment - if you set one up you might as well just get
the faster SQL instance.
On prem any reasonable RDBMS is usually fast enough that caching metadata
may not provide much benefit.
Also, I’m not familiar with DN caching but Hive actually does not use DN
ORM itself to retrieve most of the performance sensitive stuff (like
partitions and stats), because the initial retrieval of many entities is
very inefficient; it issues SQL queries directly. I’m not sure if it would
be easy to integrate with DN caching.

Thejas might have more details on the design.


On 18/2/23, 10:49, "Alexander Kolbasov"  wrote:

>Hello,
>
>I am wondering about the design choices made for ObjectStore cache. Looks
>like Datanucleus has support for L2 caching using various backends,
>including memcache and redis (caching support is pluggable). I am wondering
>why you decided to implement your own caching solution instead. Even if you
>wanted to cache at thrift level, using memcached or redis seems like a
>useful thing to consider. Were there any reasons to avoid these?
>
>- Alex



Review Request 65778: HIVE-18726 Implement DEFAULT constraint

2018-02-23 Thread Vineet Garg

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65778/
---

Review request for hive, Ashutosh Chauhan and Jesús Camacho Rodríguez.


Bugs: HIVE-18726
https://issues.apache.org/jira/browse/HIVE-18726


Repository: hive-git


Description
---

This patch adds DEFAULT constraint


Diffs
-

  
itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/DummyRawStoreFailEvent.java
 a3725c5395 
  itests/src/test/resources/testconfiguration.properties 4a52eb5559 
  itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java 6cd7a136ae 
  metastore/scripts/upgrade/derby/hive-schema-3.0.0.derby.sql a8f227b775 
  metastore/scripts/upgrade/hive/hive-schema-3.0.0.hive.sql 84d523e1d7 
  ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java f99178dbc7 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 32fc257b03 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/DefaultConstraint.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 8b0af3e5c8 
  
ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/JsonMetaDataFormatter.java
 77e5678f80 
  
ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/MetaDataFormatUtils.java
 a5b6a4b0c3 
  
ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/MetaDataFormatter.java
 88d5554e1d 
  
ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/TextMetaDataFormatter.java
 607e111c97 
  ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java 
171825eb74 
  ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
e926b63764 
  ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g 733ec79ce1 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java cd6f1ee692 
  ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java 
14217e3978 
  
ql/src/java/org/apache/hadoop/hive/ql/parse/repl/load/message/AddNotNullConstraintHandler.java
 9c12e7e2af 
  ql/src/java/org/apache/hadoop/hive/ql/plan/AlterTableDesc.java 00c0381107 
  ql/src/java/org/apache/hadoop/hive/ql/plan/CreateTableDesc.java 6228d4c803 
  ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeConstantDesc.java 
73f449fc28 
  ql/src/java/org/apache/hadoop/hive/ql/plan/ImportTableDesc.java fcbac7d840 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFToChar.java 
b98ec68158 
  ql/src/test/queries/clientnegative/alter_external_with_constraint.q  
  ql/src/test/queries/clientnegative/alter_external_with_default_constraint.q 
PRE-CREATION 
  
ql/src/test/queries/clientnegative/alter_tableprops_external_with_constraint.q  
  
ql/src/test/queries/clientnegative/alter_tableprops_external_with_default_constraint.q
 PRE-CREATION 
  ql/src/test/queries/clientnegative/constraint_duplicate_name.q PRE-CREATION 
  ql/src/test/queries/clientnegative/constraint_invalide_name.q PRE-CREATION 
  ql/src/test/queries/clientnegative/constraint_partition_columns.q 
PRE-CREATION 
  ql/src/test/queries/clientnegative/create_external_with_constraint.q  
  ql/src/test/queries/clientnegative/create_external_with_default_constraint.q 
PRE-CREATION 
  ql/src/test/queries/clientnegative/default_constraint_invalid_default_value.q 
PRE-CREATION 
  
ql/src/test/queries/clientnegative/default_constraint_invalid_default_value2.q 
PRE-CREATION 
  
ql/src/test/queries/clientnegative/default_constraint_invalid_default_value_length.q
 PRE-CREATION 
  
ql/src/test/queries/clientnegative/default_constraint_invalid_default_value_type.q
 PRE-CREATION 
  ql/src/test/queries/clientnegative/default_constraint_invalid_type.q 
PRE-CREATION 
  ql/src/test/queries/clientpositive/create_with_constraints.q 7b2594b79f 
  ql/src/test/queries/clientpositive/default_constraint.q PRE-CREATION 
  ql/src/test/results/clientnegative/alter_external_with_constraint.q.out  
  
ql/src/test/results/clientnegative/alter_external_with_notnull_constraint.q.out 
PRE-CREATION 
  
ql/src/test/results/clientnegative/alter_tableprops_external_with_constraint.q.out
  
  
ql/src/test/results/clientnegative/alter_tableprops_external_with_default_constraint.q.out
 PRE-CREATION 
  ql/src/test/results/clientnegative/constraint_duplicate_name.q.out 
PRE-CREATION 
  ql/src/test/results/clientnegative/constraint_invalide_name.q.out 
PRE-CREATION 
  ql/src/test/results/clientnegative/constraint_partition_columns.q.out 
PRE-CREATION 
  ql/src/test/results/clientnegative/create_external_with_constraint.q.out  
  
ql/src/test/results/clientnegative/create_external_with_notnull_constraint.q.out
 PRE-CREATION 
  
ql/src/test/results/clientnegative/default_constraint_invalid_default_value.q.out
 PRE-CREATION 
  
ql/src/test/results/clientnegative/default_constraint_invalid_default_value2.q.out
 PRE-CREATION 
  
ql/src/test/results/clientnegative/default_constraint_invalid_default_value_length.q.out
 PRE-CREATION 
  

[jira] [Created] (HIVE-18789) Disallow embedded element in UDFXPathUtil

2018-02-23 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-18789:
-

 Summary: Disallow embedded element in UDFXPathUtil
 Key: HIVE-18789
 URL: https://issues.apache.org/jira/browse/HIVE-18789
 Project: Hive
  Issue Type: Bug
Reporter: Daniel Dai
Assignee: Daniel Dai








Re: Review Request 65634: HIVE-18264: CachedStore: Store cached partitions/col stats within the table cache

2018-02-23 Thread Daniel Dai

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65634/#review198206
---


Ship it!




Ship It!

- Daniel Dai


On Feb. 23, 2018, 8:14 p.m., Vaibhav Gumashta wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65634/
> ---
> 
> (Updated Feb. 23, 2018, 8:14 p.m.)
> 
> 
> Review request for hive, Daniel Dai and Thejas Nair.
> 
> 
> Bugs: HIVE-18264
> https://issues.apache.org/jira/browse/HIVE-18264
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> https://issues.apache.org/jira/browse/HIVE-18264
> 
> 
> Diffs
> -
> 
>   
> itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/DummyRawStoreFailEvent.java
>  a3725c5395 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
>  7b44df4128 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/RawStore.java
>  f500d63725 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CacheUtils.java
>  f0f650ddcf 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
>  0d132f2074 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/SharedCache.java
>  32ea17495f 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java
>  50f873a013 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/DummyRawStoreControlledCommit.java
>  75ea8c4a77 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java
>  207d842f94 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/cache/TestCachedStore.java
>  ab6feb6f0b 
> 
> 
> Diff: https://reviews.apache.org/r/65634/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Vaibhav Gumashta
> 
>



Re: Review Request 65634: HIVE-18264: CachedStore: Store cached partitions/col stats within the table cache

2018-02-23 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65634/
---

(Updated Feb. 23, 2018, 8:14 p.m.)


Review request for hive, Daniel Dai and Thejas Nair.


Bugs: HIVE-18264
https://issues.apache.org/jira/browse/HIVE-18264


Repository: hive-git


Description
---

https://issues.apache.org/jira/browse/HIVE-18264


Diffs (updated)
-

  
itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/DummyRawStoreFailEvent.java
 a3725c5395 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
 7b44df4128 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/RawStore.java
 f500d63725 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CacheUtils.java
 f0f650ddcf 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
 0d132f2074 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/SharedCache.java
 32ea17495f 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java
 50f873a013 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/DummyRawStoreControlledCommit.java
 75ea8c4a77 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java
 207d842f94 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/cache/TestCachedStore.java
 ab6feb6f0b 


Diff: https://reviews.apache.org/r/65634/diff/2/

Changes: https://reviews.apache.org/r/65634/diff/1-2/


Testing
---


Thanks,

Vaibhav Gumashta



Re: Review Request 65634: HIVE-18264: CachedStore: Store cached partitions/col stats within the table cache

2018-02-23 Thread Vaibhav Gumashta


> On Feb. 21, 2018, 10:07 p.m., Daniel Dai wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
> > Line 311 (original), 215 (patched)
> > 
> >
> > This is not introduced in this patch, but getting the columns for the table 
> > and applying them to the partition will not work with schema evolution. We 
> > should get columns for each individual partition.

I agree, but I'm not sure whether the current stats work with schema evolution. 
Let me take this up in a follow-up JIRA, as this might need a little more thought.


> On Feb. 21, 2018, 10:07 p.m., Daniel Dai wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
> > Line 800 (original), 632 (patched)
> > 
> >
> > I don't remember, but why is this get() and not getUnsafe()? It seems the 
> > same as getAllTables etc. This also applies to getDatabases, alterDatabase, 
> > dropDatabase, getDatabase and createDatabase.

We're using get() here so that this call blocks till the database cache is 
populated. We're letting reads go through the cache while the tables are 
getting populated, but not for databases. Let me know if you think otherwise.
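
The distinction between a blocking get() and a non-blocking getUnsafe() can be sketched roughly as follows. This is an illustration only: PrewarmedCache and markPopulated are hypothetical names, and the actual CachedStore code may be structured quite differently.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical wrapper, not the actual CachedStore class: it hands out a
// cache object either after prewarming has finished (get) or right away
// (getUnsafe).
class PrewarmedCache<T> {
    private final T cache;
    private final CountDownLatch populated = new CountDownLatch(1);

    public PrewarmedCache(T cache) {
        this.cache = cache;
    }

    /** Called by the background prewarm thread once loading has finished. */
    public void markPopulated() {
        populated.countDown();
    }

    /** Blocks the caller until the cache has been marked as populated. */
    public T get() {
        try {
            populated.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for prewarm", e);
        }
        return cache;
    }

    /** Returns immediately; the cache may still be only partially populated. */
    public T getUnsafe() {
        return cache;
    }

    /** True once markPopulated has been called. */
    public boolean isPopulated() {
        return populated.getCount() == 0;
    }
}
```

Under this sketch, database calls would go through get() and wait for the prewarm to finish, while table reads could use getUnsafe() and accept a partially filled cache.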


- Vaibhav


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65634/#review197794
---


On Feb. 13, 2018, 12:08 p.m., Vaibhav Gumashta wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65634/
> ---
> 
> (Updated Feb. 13, 2018, 12:08 p.m.)
> 
> 
> Review request for hive, Daniel Dai and Thejas Nair.
> 
> 
> Bugs: HIVE-18264
> https://issues.apache.org/jira/browse/HIVE-18264
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> https://issues.apache.org/jira/browse/HIVE-18264
> 
> 
> Diffs
> -
> 
>   
> itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/DummyRawStoreFailEvent.java
>  78b26374f2 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
>  d58ed677f3 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/RawStore.java
>  e4e7d4239d 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CacheUtils.java
>  f0f650ddcf 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
>  80aa3bcdb4 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/SharedCache.java
>  32ea17495f 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/DummyRawStoreControlledCommit.java
>  9100c73beb 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java
>  86e72d8d76 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/cache/TestCachedStore.java
>  bd61df654a 
> 
> 
> Diff: https://reviews.apache.org/r/65634/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Vaibhav Gumashta
> 
>



Question about design of ObjectStore cache

2018-02-23 Thread Alexander Kolbasov
Hello,

I am wondering about the design choices made for ObjectStore cache. Looks
like Datanucleus has support for L2 caching using various backends,
including memcache and redis (caching support is pluggable). I am wondering
why you decided to implement your own caching solution instead. Even if you
wanted to cache at thrift level, using memcached or redis seems like a
useful thing to consider. Were there any reasons to avoid these?

- Alex


[jira] [Created] (HIVE-18788) Clean up inputs in JDBC PreparedStatement

2018-02-23 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-18788:
-

 Summary: Clean up inputs in JDBC PreparedStatement
 Key: HIVE-18788
 URL: https://issues.apache.org/jira/browse/HIVE-18788
 Project: Hive
  Issue Type: Bug
Reporter: Daniel Dai
Assignee: Daniel Dai








Re: reverting test-breaking changes

2018-02-23 Thread Sahil Takiar
+1

Does anyone have suggestions about how to efficiently identify which commit
is breaking a test? Is it just git-bisect or is there an easier way? Hive
QA isn't always that helpful: it will say a test has been failing for the
past "x" builds, but that doesn't help much since Hive QA isn't a nightly
build.

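For reference, the search that git-bisect performs is a binary search over the commit history. The Java sketch below shows only that idea; CommitBisect and firstBadIndex are hypothetical names, and the sketch assumes the commits are ordered oldest to newest, the first one passes, the last one fails, and the test stays broken once it breaks.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical illustration of the binary search behind git-bisect.
class CommitBisect {
    /**
     * Returns the index of the first commit for which the test fails.
     * Assumes commits.get(0) passes, the last commit fails, and every
     * commit after the first failing one also fails.
     */
    public static <C> int firstBadIndex(List<C> commits, Predicate<C> testFails) {
        int good = 0;                  // index of a commit known to pass
        int bad = commits.size() - 1;  // index of a commit known to fail
        while (bad - good > 1) {
            int mid = good + (bad - good) / 2;
            if (testFails.test(commits.get(mid))) {
                bad = mid;   // failure was introduced at mid or earlier
            } else {
                good = mid;  // failure was introduced after mid
            }
        }
        return bad;
    }
}
```

With n candidate commits this needs about log2(n) test runs, which is why bisecting stays practical even when Hive QA only reports that a test has failed for the past "x" builds.
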
On Thu, Feb 22, 2018 at 10:31 AM, Vihang Karajgaonkar 
wrote:

> +1
> Commenting on JIRA and giving a 24hr heads-up (excluding weekends) would be
> good.
>
> On Thu, Feb 22, 2018 at 10:19 AM, Alan Gates  wrote:
>
> > +1.
> >
> > Alan.
> >
> > On Thu, Feb 22, 2018 at 8:25 AM, Thejas Nair 
> > wrote:
> >
> > > +1
> > > I agree, this makes sense. The number of failures keeps increasing.
> > > A 24 hour heads up in either case before revert would be good.
> > >
> > >
> > > On Thu, Feb 22, 2018 at 2:45 AM, Peter Vary 
> wrote:
> > >
> > > > > I agree with Zoltan. The continuously breaking tests make it very
> > > > > hard to spot real issues.
> > > > > Any thoughts on doing it automatically?
> > > > >
> > > >
> > > > > > On Feb 22, 2018, at 10:47 AM, Zoltan Haindrich wrote:
> > > > > >
> > > > > > Hello,
> > > > > >
> > > > > > In the last couple of weeks the number of broken tests has started
> > > > > > to go up...and even though I run bisect/etc. from time to time,
> > > > > > sometimes people don't react to my comments/tickets/etc.
> > > > > >
> > > > > > Because keeping this many failing tests makes it easier for a new
> > > > > > one to slip in...I think reverting the patch introducing the test
> > > > > > failures would also help in some cases.
> > > > > >
> > > > > > I think it would help a lot to prevent further test breaks to
> > > > > > revert the patch if any of the following conditions is met:
> > > > > >
> > > > > > C1) if the notification/comment about the fact that the patch did
> > > > > > indeed break a test has gone unanswered for at least 24 hours.
> > > > > >
> > > > > > C2) if the patch has been in for 7 days, but the test failure is
> > > > > > still not addressed (note that in this case there might be a
> > > > > > conversation about fixing it...but enabling other people to work
> > > > > > in a cleaner environment is more important than a single patch -
> > > > > > and if it can't be fixed in 7 days...well, it might not get fixed
> > > > > > in a month).
> > > > > >
> > > > > > I would like to also note that I've seen a few tickets which have
> > > > > > been picked up by people who were not involved in creating the
> > > > > > original change - and although the intention was good, they might
> > > > > > miss the context of the original patch and may "fix" the tests in
> > > > > > the wrong way: accept a q.out which is inappropriate or ignore the
> > > > > > test...
> > > > > >
> > > > > > Would it be ok to implement this from now on? It makes my efforts
> > > > > > practically useless if people are not reacting...
> > > > > >
> > > > > > Note: just to be on the same page - this is only about running a
> > > > > > single test which fails on its own - I feel that flaky tests are
> > > > > > an entirely different topic.
> > > > > >
> > > > > > cheers,
> > > > > >
> > > > > > Zoltan
> > > >
> > > >
> > >
> >
>



-- 
Sahil Takiar
Software Engineer
takiar.sa...@gmail.com | (510) 673-0309


[jira] [Created] (HIVE-18787) TestSparkCliDriver.testCliDriver[subquery_scalar] is consistently failing

2018-02-23 Thread Sahil Takiar (JIRA)
Sahil Takiar created HIVE-18787:
---

 Summary: TestSparkCliDriver.testCliDriver[subquery_scalar] is 
consistently failing
 Key: HIVE-18787
 URL: https://issues.apache.org/jira/browse/HIVE-18787
 Project: Hive
  Issue Type: Test
  Components: Test
Reporter: Sahil Takiar
Assignee: Sahil Takiar


Not sure what caused this to start failing, but it's been failing for a while.





[jira] [Created] (HIVE-18786) NPE in Hive windowing functions

2018-02-23 Thread Michael Bieniosek (JIRA)
Michael Bieniosek created HIVE-18786:


 Summary: NPE in Hive windowing functions
 Key: HIVE-18786
 URL: https://issues.apache.org/jira/browse/HIVE-18786
 Project: Hive
  Issue Type: Bug
Affects Versions: 2.3.2
Reporter: Michael Bieniosek


When I run a Hive query with windowing functions, if there's enough data I get 
an NPE.

For example something like this query might break:

select id, created_date, max(created_date) over (partition by id) 
latest_created_any from ...

The only workaround I've found is to remove the windowing functions entirely.

The stacktrace looks suspiciously similar to HADOOP-2931, but I'm in hive-2.3.2 
which appears to have the bugfix applied.

 

Caused by: java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.exec.persistence.PTFRowContainer.first(PTFRowContainer.java:115)
    at org.apache.hadoop.hive.ql.exec.PTFPartition.iterator(PTFPartition.java:114)
    at org.apache.hadoop.hive.ql.udf.ptf.BasePartitionEvaluator.getPartitionAgg(BasePartitionEvaluator.java:200)
    at org.apache.hadoop.hive.ql.udf.ptf.WindowingTableFunction.evaluateFunctionOnPartition(WindowingTableFunction.java:155)
    at org.apache.hadoop.hive.ql.udf.ptf.WindowingTableFunction.iterator(WindowingTableFunction.java:538)
    at org.apache.hadoop.hive.ql.exec.PTFOperator$PTFInvocation.finishPartition(PTFOperator.java:349)
    at org.apache.hadoop.hive.ql.exec.PTFOperator.process(PTFOperator.java:123)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:356)





Re: Review Request 65716: HIVE-18696: The partition folders might not get cleaned up properly in the HiveMetaStore.add_partitions_core method if an exception occurs

2018-02-23 Thread Sahil Takiar


> On Feb. 21, 2018, 3:32 p.m., Sahil Takiar wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
> > Lines 3149-3150 (patched)
> > 
> >
> > curious what behavior you were seeing, wondering why cancelling or 
> > interrupting the `Future`s doesn't work
> 
> Marta Kuczora wrote:
> The idea was to iterate over the Future tasks and call their get method 
> to wait until they finish. If an exception occurred in one task, I iterated 
> over the tasks again and canceled them with the mayInterruptIfRunning=false 
> flag. According to the Javadoc, if this flag is false, the in-progress tasks 
> are allowed to complete. So the idea behind this was to avoid interrupting a 
> task when it has already started creating the folder, so that the situation 
> where the folder is created but not yet put into the addedPartitions map 
> doesn't happen.
> Then I waited for the already running tasks to complete, and I assumed we 
> would be good at that point because all tasks had either finished or hadn't 
> even started.
> 
> Imagine a flow something like this in code:
> 
> boolean failureOccurred = false;
> try {
>   for (Future partFuture : partFutures) {
>     partFuture.get();
>   }
> } catch (InterruptedException | ExecutionException e) {
>   failureOccurred = true;
> }
> 
> for (Future task : partFutures) {
>   task.cancel(false);
> }
> 
> for (Future partFuture : partFutures) {
>   if (!partFuture.isCancelled()) {
>     partFuture.get();
>   }
> }
> 
> 
> Then I created a test to see if it works as I expected. I tried to create 
> 40 partitions and had a counter which was visible to the tasks and threw an 
> exception when it reached 20. What I noticed is that almost every time I got 
> a ConcurrentModificationException on this line
> 
>   for (Map.Entry e : addedPartitions.entrySet()) {
> 
> So there must be some tasks still writing to the addedPartitions map at that 
> point. By the way, changing the type of the map to ConcurrentMap proved this 
> right, as no exception occurred in this case, but there were leftover folders.
> 
> So I started to debug it, mainly by logging when the call method of a 
> task is called, when a task gets canceled and what the result was when the 
> get method was called. What I found was that there were tasks which were 
> started, their call method was called, so they started to create the folder, 
> but then there was a successful cancel on them. For these tasks the get method 
> simply throws a CancellationException, as it sees the task as no longer 
> running (or the isCancelled method returns true). These tasks actually created 
> the folder, though, and it could happen that they had not finished by the time 
> of the clean up.
> 
> I checked the FutureTask code and the run method checks if the state of 
> the task is NEW and if it is, calls the Callable's call method, but it doesn't 
> change the state at that point. My theory is that if cancel is called on 
> the same task at this point, it will also see that the state is NEW, so it 
> will change it to CANCELLED. So I believe a task can get into a weird state 
> like this.
> Calling cancel with mayInterruptIfRunning=true gave the same result.
> So I didn't find a bulletproof solution with canceling the tasks, but it 
> may be that I missed something and there is a good way to solve this.
> 
> If you have any idea, please share it with me, any idea is welcome. :)
> 
> Sahil Takiar wrote:
> hmm this is odd. did you try checking the return value of the `cancel` 
> method, the javadocs say it returns `true` if the cancel was successful and 
> `false` otherwise; is it returning `true`?
> 
> I can understand that `mayInterruptIfRunning=true` won't help, because it 
> just causes the thread to throw an `InterruptedException` but any code 
> running in that thread can just catch the exception and drop it
> 
> could it be possible the second `partFuture.get()` threw an exception, in 
> which case the `finally` block will be triggered before all threads complete?
> 
> I think the solution you have proposed works fine, but the 
> `Future#cancel` methodology should work too and is slightly cleaner, but if 
> you can't get it to work no worries, don't spend too much time on it
> 
> Marta Kuczora wrote:
> Yep, I checked it and for these "leftover" tasks, the cancel returned 
> true. There were tasks for which the cancel returned false, but those were 
> ok, because the get just waited for them.
> 
> No, I didn't get an exception during the second partFuture.get(). Also when 
> I logged what happens with the tasks, all tasks appeared in this second get 
> loop. For the ones which were canceled successfully (even though they 

[GitHub] hive pull request #290: HIVE-18192: Introduce WriteID per table rather than ...

2018-02-23 Thread sankarh
Github user sankarh closed the pull request at:

https://github.com/apache/hive/pull/290


---


[jira] [Created] (HIVE-18785) Make JSON Serde First-Class Serde

2018-02-23 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created HIVE-18785:
--

 Summary: Make JSON Serde First-Class Serde
 Key: HIVE-18785
 URL: https://issues.apache.org/jira/browse/HIVE-18785
 Project: Hive
  Issue Type: New Feature
  Components: Serializers/Deserializers
Affects Versions: 3.0.0
Reporter: BELUGA BEHR


According to the [Hive SerDe 
Docs|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-RowFormats],
 there are some extra steps involved in getting the JSON SerDe to work:

{quote}
ROW FORMAT SERDE 
'org.apache.hive.hcatalog.data.JsonSerDe' 
STORED AS TEXTFILE

In some distributions, a reference to hive-hcatalog-core.jar is required.
ADD JAR /usr/lib/hive-hcatalog/lib/hive-hcatalog-core.jar;
{quote}

I would like to propose that we move this SerDe into first-class status:

{{STORED AS JSONFILE}}

The user should not have to perform any additional steps to use this SerDe.





Re: Review Request 65716: HIVE-18696: The partition folders might not get cleaned up properly in the HiveMetaStore.add_partitions_core method if an exception occurs

2018-02-23 Thread Marta Kuczora via Review Board


> On Feb. 21, 2018, 3:32 p.m., Sahil Takiar wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
> > Lines 3149-3150 (patched)
> > 
> >
> > curious what behavior you were seeing, wondering why cancelling or 
> > interrupting the `Future`s doesn't work
> 
> Marta Kuczora wrote:
> The idea was to iterate over the Future tasks and call their get method 
> to wait until they finish. If an exception occurred in one task, I iterated 
> over the tasks again and canceled them with the mayInterruptIfRunning=false 
> flag. According to the Javadoc, if this flag is false, in-progress tasks are 
> allowed to complete. The idea behind this was to avoid interrupting a task 
> that had already started creating the folder, so that the situation where the 
> folder is created but not yet put into the addedPartitions map doesn't happen.
> Then I waited for the already running tasks to complete, and I assumed we were 
> good at that point because all tasks had either finished or hadn't even 
> started.
> 
> Imagine a flow something like this in code:
> 
> boolean failureOccurred = false;
> try {
>   for (Future partFuture : partFutures) {
>     partFuture.get();
>   }
> } catch (InterruptedException | ExecutionException e) {
>   failureOccurred = true;
> }
> 
> for (Future task : partFutures) {
>   task.cancel(false);
> }
> 
> for (Future partFuture : partFutures) {
>   if (!partFuture.isCancelled()) {
>     partFuture.get();
>   }
> }
> 
> 
> Then I created a test to see if it works as I expected. I tried to create 
> 40 partitions and had a counter which was visible to the tasks and threw an 
> exception when it reached 20. What I noticed is that almost every time I got 
> a ConcurrentModificationException on this line
> 
>    for (Map.Entry e : addedPartitions.entrySet()) {
> 
> So there must have been some tasks still writing to the addedPartitions map 
> at that point. By the way, changing the type of the map to ConcurrentMap 
> confirmed this, as no exception occurred in that case, but there were leftover 
> folders.
> 
> So I started to debug it, mainly by logging when the call method of a 
> task is called, when a task gets canceled and what the result was when the 
> get method was called. What I found was that there were tasks which were 
> started, their call method was called, so they started to create the folder, 
> but then there was a successful cancel on them. For these tasks the get method 
> simply throws a CancellationException, as it sees the task as no longer 
> running (or the isCancelled method returns true). These tasks actually created 
> the folder, though, and it could happen that they had not finished by the time 
> of the clean up.
> 
> I checked the FutureTask code and the run method checks if the state of 
> the task is NEW and if it is, calls the Callable's call method, but it doesn't 
> change the state at that point. My theory is that if cancel is called on 
> the same task at this point, it will also see that the state is NEW, so it 
> will change it to CANCELLED. So I believe a task can get into a weird state 
> like this.
> Calling cancel with mayInterruptIfRunning=true gave the same result.
> So I didn't find a bulletproof solution with canceling the tasks, but it 
> may be that I missed something and there is a good way to solve this.
> 
> If you have any idea, please share it with me, any idea is welcome. :)
> 
> Sahil Takiar wrote:
> hmm this is odd. did you try checking the return value of the `cancel` 
> method, the javadocs say it returns `true` if the cancel was successful and 
> `false` otherwise; is it returning `true`?
> 
> I can understand that `mayInterruptIfRunning=true` won't help, because it 
> just causes the thread to throw an `InterruptedException` but any code 
> running in that thread can just catch the exception and drop it
> 
> could it be possible the second `partFuture.get()` threw an exception, in 
> which case the `finally` block will be triggered before all threads complete?
> 
> I think the solution you have proposed works fine, but the 
> `Future#cancel` methodology should work too and is slightly cleaner, but if 
> you can't get it to work no worries, don't spend too much time on it

Yep, I checked it and for these "leftover" tasks, the cancel returned true. 
There were tasks for which the cancel returned false, but those were ok, 
because the get just waited for them.

No, I didn't get an exception during the second partFuture.get(). Also, when I 
logged what happens with the tasks, all tasks appeared in this second get loop. 
For the ones which were canceled successfully (even though they were running), 
the log just showed that they were canceled.
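
The behaviour described above can be reproduced in isolation. The following is 
only a minimal sketch and not the code from the patch: the class name 
CancelRaceDemo, the Result holder and the two latches are made up for 
illustration. It assumes a task whose body is already running when 
cancel(false) is called, and it records what the Future reports versus what 
the task body actually does. It also shows one possible way to wait for such 
a task (shutting down the executor and awaiting termination), which is an 
assumption on my side rather than the approach taken in HIVE-18696.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class CancelRaceDemo {

  /** Observations collected by one run of the demo (hypothetical holder). */
  static final class Result {
    boolean cancelReturned;       // return value of cancel(false) on a running task
    boolean isCancelled;          // Future#isCancelled() right after the cancel
    boolean getThrewCancellation; // did get() throw CancellationException?
    boolean finishedAtGet;        // had the task body finished when get() returned?
    boolean finishedAfterAwait;   // had it finished after awaitTermination()?
  }

  static Result run() throws Exception {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    CountDownLatch started = new CountDownLatch(1);
    CountDownLatch release = new CountDownLatch(1);
    AtomicBoolean finished = new AtomicBoolean(false);

    // Stand-in for a partition task: it finishes its work only after release.
    Future<Boolean> task = pool.submit(() -> {
      started.countDown();
      release.await();
      finished.set(true);
      return true;
    });

    Result r = new Result();
    started.await();                       // the task body is now running
    r.cancelReturned = task.cancel(false); // no interrupt is sent to the worker
    r.isCancelled = task.isCancelled();
    try {
      task.get();                          // does not wait for the running body
    } catch (CancellationException e) {
      r.getThrewCancellation = true;
    }
    r.finishedAtGet = finished.get();

    release.countDown();                   // let the task body complete
    pool.shutdown();
    pool.awaitTermination(10, TimeUnit.SECONDS); // waits for the worker thread
    r.finishedAfterAwait = finished.get();
    return r;
  }

  public static void main(String[] args) throws Exception {
    Result r = run();
    System.out.println("cancel(false) returned:       " + r.cancelReturned);
    System.out.println("isCancelled():                " + r.isCancelled);
    System.out.println("get() threw cancellation:     " + r.getThrewCancellation);
    System.out.println("body finished at get():       " + r.finishedAtGet);
    System.out.println("body finished after await:    " + r.finishedAfterAwait);
  }
}
```

In this sketch the Future reports the task as cancelled while its body is 
still running, so neither get() nor isCancelled() can be used to tell when the 
body has really finished. Waiting on the executor itself does observe the 
completion, at the cost of shutting the pool down.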

[jira] [Created] (HIVE-18784) TestJdbcWithMiniKdcSQLAuthBinary runs with HTTP transport mode instead of binary

2018-02-23 Thread Daniel Voros (JIRA)
Daniel Voros created HIVE-18784:
---

 Summary: TestJdbcWithMiniKdcSQLAuthBinary runs with HTTP transport 
mode instead of binary
 Key: HIVE-18784
 URL: https://issues.apache.org/jira/browse/HIVE-18784
 Project: Hive
  Issue Type: Test
Affects Versions: 3.0.0
Reporter: Daniel Voros
Assignee: Daniel Voros


TestJdbcWithMiniKdcSQLAuthHttp should run HTTP and 
TestJdbcWithMiniKdcSQLAuthBinary should run binary, but currently they're both 
using HTTP.


