[jira] [Assigned] (HIVE-8448) Union All might not work due to the type conversion issue

2014-10-13 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang reassigned HIVE-8448:
-

Assignee: Chaoyu Tang

> Union All might not work due to the type conversion issue
> -
>
> Key: HIVE-8448
> URL: https://issues.apache.org/jira/browse/HIVE-8448
> Project: Hive
>  Issue Type: Bug
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
>Priority: Minor
>
> create table t1 (val date);
> insert overwrite table t1 select '2014-10-10' from src limit 1;
> create table t2 (val varchar(10));
> insert overwrite table t2 select '2014-10-10' from src limit 1; 
> ==
> Query:
> select t.val from
> (select val from t1
> union all
> select val from t1
> union all
> select val from t2
> union all
> select val from t1) t;
> ==
> Will throw exception: 
> {code}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Incompatible types for union operator
>   at org.apache.hadoop.hive.ql.exec.UnionOperator.initializeOp(UnionOperator.java:86)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:464)
>   at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:420)
>   at org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:65)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:464)
>   at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:420)
>   at org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:193)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
>   at org.apache.hadoop.hive.ql.exec.MapOperator.initializeOp(MapOperator.java:443)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
>   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:133)
>   ... 22 more
> {code}
> This is because getCommonClassForUnionAll is used at the query parse step, but getCommonClass is used at execution; the two are not used consistently for union. The latter does not support the implicit conversion from date to string, which is the cause of the problem.
> The change might be simple 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-8448) Union All might not work due to the type conversion issue

2014-10-13 Thread Chaoyu Tang (JIRA)
Chaoyu Tang created HIVE-8448:
-

 Summary: Union All might not work due to the type conversion issue
 Key: HIVE-8448
 URL: https://issues.apache.org/jira/browse/HIVE-8448
 Project: Hive
  Issue Type: Bug
Reporter: Chaoyu Tang
Priority: Minor


create table t1 (val date);
insert overwrite table t1 select '2014-10-10' from src limit 1;

create table t2 (val varchar(10));
insert overwrite table t2 select '2014-10-10' from src limit 1; 
==
Query:
select t.val from
(select val from t1
union all
select val from t1
union all
select val from t2
union all
select val from t1) t;
==
Will throw exception: 
{code}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Incompatible types for union operator
  at org.apache.hadoop.hive.ql.exec.UnionOperator.initializeOp(UnionOperator.java:86)
  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:464)
  at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:420)
  at org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:65)
  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:464)
  at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:420)
  at org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:193)
  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
  at org.apache.hadoop.hive.ql.exec.MapOperator.initializeOp(MapOperator.java:443)
  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
  at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:133)
  ... 22 more
{code}

This is because getCommonClassForUnionAll is used at the query parse step, but getCommonClass is used at execution; the two are not used consistently for union. The latter does not support the implicit conversion from date to string, which is the cause of the problem.

The change might be simple 
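The parse/execution inconsistency described above can be sketched in Python. The function names mirror Hive's FunctionRegistry helpers, but the bodies here are illustrative assumptions, not Hive's actual code: a permissive resolver at parse time lets the plan compile, while a stricter resolver at execution time finds no common type and aborts operator initialization.

```python
# Illustrative sketch only: these bodies approximate the behavior described
# in the report, not Hive's real FunctionRegistry implementation.

def get_common_class_for_union_all(a, b):
    """Parse-time resolver: allows implicit date -> string conversion."""
    if a == b:
        return a
    if {a, b} <= {"date", "string", "varchar"}:
        return "string"
    return None

def get_common_class(a, b):
    """Execution-time resolver: stricter, no date -> string conversion."""
    if a == b:
        return a
    if {a, b} <= {"string", "varchar"}:
        return "string"
    return None

# The UNION ALL over t1 (date) and t2 (varchar) compiles fine...
assert get_common_class_for_union_all("date", "varchar") == "string"
# ...but at execution no common type is found, which surfaces as
# "Incompatible types for union operator" in UnionOperator.initializeOp.
assert get_common_class("date", "varchar") is None
```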





[jira] [Updated] (HIVE-8448) Union All might not work due to the type conversion issue

2014-10-13 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-8448:
--
Description: 
create table t1 (val date);
insert overwrite table t1 select '2014-10-10' from src limit 1;

create table t2 (val varchar(10));
insert overwrite table t2 select '2014-10-10' from src limit 1; 
==
Query:
select t.val from
(select val from t1
union all
select val from t1
union all
select val from t2
union all
select val from t1) t;
==
Will throw exception: 
{code}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Incompatible types for union operator
  at org.apache.hadoop.hive.ql.exec.UnionOperator.initializeOp(UnionOperator.java:86)
  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:464)
  at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:420)
  at org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:65)
  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:464)
  at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:420)
  at org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:193)
  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
  at org.apache.hadoop.hive.ql.exec.MapOperator.initializeOp(MapOperator.java:443)
  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
  at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:133)
  ... 22 more
{code}

This is because getCommonClassForUnionAll is used at the query parse step, but getCommonClass is used at execution; the two are not used consistently for union. The latter does not support the implicit conversion from date to string, which is the cause of the problem.

The change to fix this particular union issue might be simple, but I noticed that there are three versions of getCommonClass (getCommonClass, getCommonClassForComparison, getCommonClassForUnionAll) and wonder whether they need refactoring.

  was:
create table t1 (val date);
insert overwrite table t1 select '2014-10-10' from src limit 1;

create table t2 (val varchar(10));
insert overwrite table t2 select '2014-10-10' from src limit 1; 
==
Query:
select t.val from
(select val from t1
union all
select val from t1
union all
select val from t2
union all
select val from t1) t;
==
Will throw exception: 
{code}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Incompatible types for union operator
  at org.apache.hadoop.hive.ql.exec.UnionOperator.initializeOp(UnionOperator.java:86)
  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:464)
  at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:420)
  at org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:65)
  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:464)
  at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:420)
  at org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:193)
  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
  at org.apache.hadoop.hive.ql.exec.MapOperator.initializeOp(MapOperator.java:443)
  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
  at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:133)
  ... 22 more
{code}

This is because getCommonClassForUnionAll is used at the query parse step, but getCommonClass is used at execution; the two are not used consistently for union. The latter does not support the implicit conversion from date to string, which is the cause of the problem.

The change might be simple 


> Union All might not work due to the type conversion issue
> -
>
> Key: HIVE-8448
> URL: https://issues.apache.org/jira/browse/HIVE-8448
> Project: Hive
>  Issue Type: Bug
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
>Priority: Minor
>
> create table t1 (val date);
> insert overwrite table t1 select '2014-10-10' from src limit 1;
> create table t2 (val varchar(10));
> insert overwrite table t2 select '2014-10-10' from src limit 1; 
> ==
> Query:
> select t.val from
> (select val from t1
> union all
> select val from t1
> union all
> select val from t2
> union all
> select val fr

[jira] [Updated] (HIVE-8448) Union All might not work due to the type conversion issue

2014-10-13 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-8448:
--
Description: 
create table t1 (val date);
insert overwrite table t1 select '2014-10-10' from src limit 1;

create table t2 (val varchar(10));
insert overwrite table t2 select '2014-10-10' from src limit 1; 
==
Query:
select t.val from
(select val from t1
union all
select val from t1
union all
select val from t2
union all
select val from t1) t;
==
Will throw exception: 
{code}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Incompatible types for union operator
  at org.apache.hadoop.hive.ql.exec.UnionOperator.initializeOp(UnionOperator.java:86)
  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:464)
  at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:420)
  at org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:65)
  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:464)
  at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:420)
  at org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:193)
  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
  at org.apache.hadoop.hive.ql.exec.MapOperator.initializeOp(MapOperator.java:443)
  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
  at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:133)
  ... 22 more
{code}

This is because getCommonClassForUnionAll is used at the query parse step, but getCommonClass is used at execution; the two are not used consistently for union. The latter does not support the implicit conversion from date to string, which is the cause of the problem.

The change to fix this particular union issue might be simple, but I noticed that there are three versions of getCommonClass (getCommonClass, getCommonClassForComparison, getCommonClassForUnionAll) and wonder whether they need to be cleaned up and refactored.

  was:
create table t1 (val date);
insert overwrite table t1 select '2014-10-10' from src limit 1;

create table t2 (val varchar(10));
insert overwrite table t2 select '2014-10-10' from src limit 1; 
==
Query:
select t.val from
(select val from t1
union all
select val from t1
union all
select val from t2
union all
select val from t1) t;
==
Will throw exception: 
{code}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Incompatible types for union operator
  at org.apache.hadoop.hive.ql.exec.UnionOperator.initializeOp(UnionOperator.java:86)
  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:464)
  at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:420)
  at org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:65)
  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:464)
  at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:420)
  at org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:193)
  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
  at org.apache.hadoop.hive.ql.exec.MapOperator.initializeOp(MapOperator.java:443)
  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
  at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:133)
  ... 22 more
{code}

This is because getCommonClassForUnionAll is used at the query parse step, but getCommonClass is used at execution; the two are not used consistently for union. The latter does not support the implicit conversion from date to string, which is the cause of the problem.

The change to fix this particular union issue might be simple, but I noticed that there are three versions of getCommonClass (getCommonClass, getCommonClassForComparison, getCommonClassForUnionAll) and wonder whether they need refactoring.


> Union All might not work due to the type conversion issue
> -
>
> Key: HIVE-8448
> URL: https://issues.apache.org/jira/browse/HIVE-8448
> Project: Hive
>  Issue Type: Bug
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
>Priority: Minor
>
> create table t1 (val date);
> insert overwrite table t1 select '2014-10-10' from src limit 1;
> create table t2 (val varc

[jira] [Created] (HIVE-6059) Add union type support in LazyBinarySerDe

2013-12-19 Thread Chaoyu Tang (JIRA)
Chaoyu Tang created HIVE-6059:
-

 Summary: Add union type support in LazyBinarySerDe
 Key: HIVE-6059
 URL: https://issues.apache.org/jira/browse/HIVE-6059
 Project: Hive
  Issue Type: New Feature
  Components: File Formats
Affects Versions: 0.12.0
Reporter: Chaoyu Tang


We need support for the union type in LazyBinarySerDe, which is required for any join query with union types in its select values. The reduce values in a join operation are serialized/deserialized using LazyBinarySerDe; without this support we see errors like:
{code}
Caused by: java.lang.NullPointerException
  at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardObjectInspector(ObjectInspectorUtils.java:106)
  at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardObjectInspector(ObjectInspectorUtils.java:156)
  at org.apache.hadoop.hive.ql.exec.JoinUtil.getStandardObjectInspectors(JoinUtil.java:98)
  at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.initializeOp(CommonJoinOperator.java:261)
  at org.apache.hadoop.hive.ql.exec.JoinOperator.initializeOp(JoinOperator.java:61)
  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:360)
  at org.apache.hadoop.hive.ql.exec.ExecReducer.configure(ExecReducer.java:150)
{code}




--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HIVE-2508) Join on union type fails

2013-12-19 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13852944#comment-13852944
 ] 

Chaoyu Tang commented on HIVE-2508:
---

I believe the observed error was due to the lack of union type support in LazyBinarySerDe, which is used to deserialize the reduce values in a join, rather than to joining on a key of union type. So any join query whose select values have a union type (e.g. SELECT * FROM DEST1 JOIN DEST2 ON (DEST1.value = DEST2.value)) should fail with the same NPE.


> Join on union type fails
> 
>
> Key: HIVE-2508
> URL: https://issues.apache.org/jira/browse/HIVE-2508
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Ashutosh Chauhan
>  Labels: uniontype
>
> {code}
> hive> CREATE TABLE DEST1(key UNIONTYPE, value BIGINT) STORED 
> AS TEXTFILE;
> OK
> Time taken: 0.076 seconds
> hive> CREATE TABLE DEST2(key UNIONTYPE, value BIGINT) STORED 
> AS TEXTFILE;
> OK
> Time taken: 0.034 seconds
> hive> SELECT * FROM DEST1 JOIN DEST2 on (DEST1.key = DEST2.key);
> {code}





[jira] [Created] (HIVE-6082) Certain KeeperException should be ignored in ZooKeeperHiveLockManage.unlockPrimitive

2013-12-20 Thread Chaoyu Tang (JIRA)
Chaoyu Tang created HIVE-6082:
-

 Summary: Certain KeeperException should be ignored in 
ZooKeeperHiveLockManage.unlockPrimitive
 Key: HIVE-6082
 URL: https://issues.apache.org/jira/browse/HIVE-6082
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Chaoyu Tang


KeeperException.NoNodeException and NotEmptyException should be ignored when deleting a zLock or its parent in ZooKeeperHiveLockManager.unlockPrimitive. These exceptions can happen when:
1) ZooKeeperHiveLockManager retries deleting a zLock after a failure, but the zLock has already been deleted; or
2) there is a race condition in which another process adds a zLock just before its parent is about to be deleted.
Otherwise, unlock may be retried unnecessarily for numRetriesForUnLock times.
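The proposal amounts to treating those two exceptions as success during unlock. A minimal sketch, with hypothetical Python stand-ins for the ZooKeeper exception classes (the real code is Java in ZooKeeperHiveLockManager):

```python
# Hypothetical stand-ins for KeeperException.NoNodeException and
# KeeperException.NotEmptyException; illustrative only.
class NoNodeError(Exception):
    pass

class NotEmptyError(Exception):
    pass

def unlock_primitive(delete_node, num_retries_for_unlock=3):
    """Delete a zLock, treating 'already gone' / 'child appeared' as done.

    Returns the number of attempts consumed."""
    for attempt in range(1, num_retries_for_unlock + 1):
        try:
            delete_node()
            return attempt      # deleted successfully
        except NoNodeError:
            return attempt      # already deleted elsewhere: nothing to do
        except NotEmptyError:
            return attempt      # another process added a child: stop retrying
        except Exception:
            continue            # transient failure: retry

    return num_retries_for_unlock

def already_deleted():
    raise NoNodeError()

# Without the fix, this case would burn all numRetriesForUnLock attempts;
# with it, unlock finishes on the first try.
assert unlock_primitive(already_deleted) == 1
```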





[jira] [Updated] (HIVE-6082) Certain KeeperException should be ignored in ZooKeeperHiveLockManage.unlockPrimitive

2013-12-20 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-6082:
--

Attachment: Hive-6082.patch

Please review the attached fix.

> Certain KeeperException should be ignored in 
> ZooKeeperHiveLockManage.unlockPrimitive
> 
>
> Key: HIVE-6082
> URL: https://issues.apache.org/jira/browse/HIVE-6082
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.12.0
>Reporter: Chaoyu Tang
> Attachments: Hive-6082.patch
>
>
> KeeperException.NoNodeException and NotEmptyException should be ignored when deleting a zLock or its parent in ZooKeeperHiveLockManager.unlockPrimitive. These exceptions can happen when:
> 1) ZooKeeperHiveLockManager retries deleting a zLock after a failure, but the zLock has already been deleted; or
> 2) there is a race condition in which another process adds a zLock just before its parent is about to be deleted.
> Otherwise, unlock may be retried unnecessarily for numRetriesForUnLock times.





[jira] [Assigned] (HIVE-6082) Certain KeeperException should be ignored in ZooKeeperHiveLockManage.unlockPrimitive

2013-12-20 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang reassigned HIVE-6082:
-

Assignee: Chaoyu Tang

> Certain KeeperException should be ignored in 
> ZooKeeperHiveLockManage.unlockPrimitive
> 
>
> Key: HIVE-6082
> URL: https://issues.apache.org/jira/browse/HIVE-6082
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.12.0
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: Hive-6082.patch
>
>
> KeeperException.NoNodeException and NotEmptyException should be ignored when deleting a zLock or its parent in ZooKeeperHiveLockManager.unlockPrimitive. These exceptions can happen when:
> 1) ZooKeeperHiveLockManager retries deleting a zLock after a failure, but the zLock has already been deleted; or
> 2) there is a race condition in which another process adds a zLock just before its parent is about to be deleted.
> Otherwise, unlock may be retried unnecessarily for numRetriesForUnLock times.





[jira] [Commented] (HIVE-6082) Certain KeeperException should be ignored in ZooKeeperHiveLockManage.unlockPrimitive

2013-12-24 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13856348#comment-13856348
 ] 

Chaoyu Tang commented on HIVE-6082:
---

Could someone commit this patch if it is a proper fix? Thanks.

> Certain KeeperException should be ignored in 
> ZooKeeperHiveLockManage.unlockPrimitive
> 
>
> Key: HIVE-6082
> URL: https://issues.apache.org/jira/browse/HIVE-6082
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.12.0
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-6082.patch, Hive-6082.patch
>
>
> KeeperException.NoNodeException and NotEmptyException should be ignored when deleting a zLock or its parent in ZooKeeperHiveLockManager.unlockPrimitive. These exceptions can happen when:
> 1) ZooKeeperHiveLockManager retries deleting a zLock after a failure, but the zLock has already been deleted; or
> 2) there is a race condition in which another process adds a zLock just before its parent is about to be deleted.
> Otherwise, unlock may be retried unnecessarily for numRetriesForUnLock times.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HIVE-6245) HS2 creates DBs/Tables with wrong ownership when HMS setugi is true

2014-01-20 Thread Chaoyu Tang (JIRA)
Chaoyu Tang created HIVE-6245:
-

 Summary: HS2 creates DBs/Tables with wrong ownership when HMS 
setugi is true
 Key: HIVE-6245
 URL: https://issues.apache.org/jira/browse/HIVE-6245
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.12.0
Reporter: Chaoyu Tang
Assignee: Chaoyu Tang


The case with the following settings is valid but does not work correctly in the current HS2:
==
hive.server2.authentication=NONE (or LDAP)
hive.server2.enable.doAs=true
hive.metastore.sasl.enabled=false
hive.metastore.execute.setugi=true
==
Ideally, HS2 should be able to impersonate the logged-in user (from Beeline or a JDBC application) and create DBs/Tables with the user's ownership.






[jira] [Updated] (HIVE-6245) HS2 creates DBs/Tables with wrong ownership when HMS setugi is true

2014-01-20 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-6245:
--

Attachment: HIVE-6245.patch

Fixes include:
1. Allow an impersonation session to be opened in a non-kerberized HS2.
2. When working with a non-kerberized HMS with hive.metastore.execute.setugi set to true, close the ThreadLocal Hive object so that a new session does not reuse a stale HMS connection.

> HS2 creates DBs/Tables with wrong ownership when HMS setugi is true
> ---
>
> Key: HIVE-6245
> URL: https://issues.apache.org/jira/browse/HIVE-6245
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.12.0
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-6245.patch
>
>
> The case with the following settings is valid but does not work correctly in the current HS2:
> ==
> hive.server2.authentication=NONE (or LDAP)
> hive.server2.enable.doAs=true
> hive.metastore.sasl.enabled=false
> hive.metastore.execute.setugi=true
> ==
> Ideally, HS2 should be able to impersonate the logged-in user (from Beeline or a JDBC application) and create DBs/Tables with the user's ownership.





[jira] [Commented] (HIVE-6245) HS2 creates DBs/Tables with wrong ownership when HMS setugi is true

2014-01-22 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13879282#comment-13879282
 ] 

Chaoyu Tang commented on HIVE-6245:
---

[~thejas] Could you take a look at this JIRA and comment? I noticed that the change in HIVE-4356 (remove duplicate impersonation parameters for hiveserver2) made HS2 doAs (impersonation) work only in a Kerberos environment. Thanks

> HS2 creates DBs/Tables with wrong ownership when HMS setugi is true
> ---
>
> Key: HIVE-6245
> URL: https://issues.apache.org/jira/browse/HIVE-6245
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.12.0
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-6245.patch
>
>
> The case with the following settings is valid but does not work correctly in the current HS2:
> ==
> hive.server2.authentication=NONE (or LDAP)
> hive.server2.enable.doAs=true
> hive.metastore.sasl.enabled=false
> hive.metastore.execute.setugi=true
> ==
> Ideally, HS2 should be able to impersonate the logged-in user (from Beeline or a JDBC application) and create DBs/Tables with the user's ownership.





[jira] [Created] (HIVE-4489) beeline always return the same error message twice

2013-05-03 Thread Chaoyu Tang (JIRA)
Chaoyu Tang created HIVE-4489:
-

 Summary: beeline always return the same error message twice
 Key: HIVE-4489
 URL: https://issues.apache.org/jira/browse/HIVE-4489
 Project: Hive
  Issue Type: Bug
  Components: Clients
Affects Versions: 0.10.0
Reporter: Chaoyu Tang
Priority: Minor
 Fix For: 0.11.0


Beeline always returns the same error message twice. For example, if I try to create a table a2 that already exists, it prints out two identical messages, which is not very user friendly.
{{{
beeline> !connect jdbc:hive2://localhost:1 scott tiger org.apache.hive.jdbc.HiveDriver
Connecting to jdbc:hive2://localhost:1
Connected to: Hive (version 0.10.0)
Driver: Hive (version 0.10.0-cdh4.2.1)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://localhost:1> create table a2 (value int);
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask (state=08S01,code=1)
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask (state=08S01,code=1)
}}}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4489) beeline always return the same error message twice

2013-05-03 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-4489:
--

Description: 
Beeline always returns the same error message twice. For example, if I try to create a table a2 that already exists, it prints out two identical messages, which is not very user friendly.
{code}
beeline> !connect jdbc:hive2://localhost:1 scott tiger org.apache.hive.jdbc.HiveDriver
Connecting to jdbc:hive2://localhost:1
Connected to: Hive (version 0.10.0)
Driver: Hive (version 0.10.0-cdh4.2.1)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://localhost:1> create table a2 (value int);
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask (state=08S01,code=1)
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask (state=08S01,code=1)
{code}

  was:
Beeline always returns the same error message twice. For example, if I try to create a table a2 that already exists, it prints out two identical messages, which is not very user friendly.
{{{
beeline> !connect jdbc:hive2://localhost:1 scott tiger org.apache.hive.jdbc.HiveDriver
Connecting to jdbc:hive2://localhost:1
Connected to: Hive (version 0.10.0)
Driver: Hive (version 0.10.0-cdh4.2.1)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://localhost:1> create table a2 (value int);
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask (state=08S01,code=1)
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask (state=08S01,code=1)
}}}


> beeline always return the same error message twice
> --
>
> Key: HIVE-4489
> URL: https://issues.apache.org/jira/browse/HIVE-4489
> Project: Hive
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: 0.10.0
>Reporter: Chaoyu Tang
>Priority: Minor
>  Labels: newbie
> Fix For: 0.11.0
>
>   Original Estimate: 0h
>  Remaining Estimate: 0h
>
> Beeline always returns the same error message twice. For example, if I try to create a table a2 that already exists, it prints out two identical messages, which is not very user friendly.
> {code}
> beeline> !connect jdbc:hive2://localhost:1 scott tiger org.apache.hive.jdbc.HiveDriver
> Connecting to jdbc:hive2://localhost:1
> Connected to: Hive (version 0.10.0)
> Driver: Hive (version 0.10.0-cdh4.2.1)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> 0: jdbc:hive2://localhost:1> create table a2 (value int);
> Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask (state=08S01,code=1)
> Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask (state=08S01,code=1)
> {code}



[jira] [Updated] (HIVE-4489) beeline always return the same error message twice

2013-05-03 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-4489:
--

Fix Version/s: (was: 0.11.0)

> beeline always return the same error message twice
> --
>
> Key: HIVE-4489
> URL: https://issues.apache.org/jira/browse/HIVE-4489
> Project: Hive
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: 0.10.0
>Reporter: Chaoyu Tang
>Priority: Minor
>  Labels: newbie
>   Original Estimate: 0h
>  Remaining Estimate: 0h
>
> Beeline always returns the same error message twice. For example, if I try to create a table a2 that already exists, it prints out two identical messages, which is not very user friendly.
> {code}
> beeline> !connect jdbc:hive2://localhost:1 scott tiger org.apache.hive.jdbc.HiveDriver
> Connecting to jdbc:hive2://localhost:1
> Connected to: Hive (version 0.10.0)
> Driver: Hive (version 0.10.0-cdh4.2.1)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> 0: jdbc:hive2://localhost:1> create table a2 (value int);
> Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask (state=08S01,code=1)
> Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask (state=08S01,code=1)
> {code}



[jira] [Updated] (HIVE-4489) beeline always return the same error message twice

2013-05-03 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-4489:
--

Attachment: HIVE-4489.patch

Removed the duplicated error logging in the low-level exception catch block so that only the top-level catch block prints the error.
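The pattern behind this patch can be sketched with a Python stand-in (Beeline's actual code is Java): logging in an inner catch block and again at the top level produces the duplicate message, so the inner logging is dropped and the exception simply propagates.

```python
# Illustrative stand-in for the Beeline fix, not the actual Java code.
errors = []

def execute_buggy():
    try:
        raise RuntimeError("Execution Error, return code 1 from DDLTask")
    except RuntimeError as e:
        errors.append(str(e))   # low-level logging (what the patch removes)
        raise                   # ...and the error propagates anyway

def execute_fixed():
    raise RuntimeError("Execution Error, return code 1 from DDLTask")

def run(statement):
    try:
        statement()
    except RuntimeError as e:
        errors.append(str(e))   # top-level logging (the only one kept)

run(execute_buggy)
assert len(errors) == 2         # message reported twice before the fix

errors.clear()
run(execute_fixed)
assert len(errors) == 1         # reported once after the fix
```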

> beeline always return the same error message twice
> --
>
> Key: HIVE-4489
> URL: https://issues.apache.org/jira/browse/HIVE-4489
> Project: Hive
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: 0.10.0
>Reporter: Chaoyu Tang
>Priority: Minor
>  Labels: newbie
> Attachments: HIVE-4489.patch
>
>   Original Estimate: 0h
>  Remaining Estimate: 0h
>
> Beeline always returns the same error message twice. For example, if I try to 
> create a table a2 that already exists, it prints out two identical messages, 
> which is not user friendly.
> {code}
> beeline> !connect jdbc:hive2://localhost:1 scott tiger 
> org.apache.hive.jdbc.HiveDriver
> Connecting to jdbc:hive2://localhost:1
> Connected to: Hive (version 0.10.0)
> Driver: Hive (version 0.10.0-cdh4.2.1)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> 0: jdbc:hive2://localhost:1> create table a2 (value int);
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 1 from org.apache.hadoop.hive.ql.exec.DDLTask (state=08S01,code=1)
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 1 from org.apache.hadoop.hive.ql.exec.DDLTask (state=08S01,code=1)
> {code}



[jira] [Updated] (HIVE-4489) beeline always return the same error message twice

2013-05-03 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-4489:
--

Attachment: (was: HIVE-4489.patch)

> beeline always return the same error message twice
> --
>
> Key: HIVE-4489
> URL: https://issues.apache.org/jira/browse/HIVE-4489
> Project: Hive
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: 0.10.0
>Reporter: Chaoyu Tang
>Priority: Minor
>  Labels: newbie
> Attachments: HIVE-4489.patch
>
>   Original Estimate: 0h
>  Remaining Estimate: 0h
>
> Beeline always returns the same error message twice. For example, if I try to 
> create a table a2 that already exists, it prints out two identical messages, 
> which is not user friendly.
> {code}
> beeline> !connect jdbc:hive2://localhost:1 scott tiger 
> org.apache.hive.jdbc.HiveDriver
> Connecting to jdbc:hive2://localhost:1
> Connected to: Hive (version 0.10.0)
> Driver: Hive (version 0.10.0-cdh4.2.1)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> 0: jdbc:hive2://localhost:1> create table a2 (value int);
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 1 from org.apache.hadoop.hive.ql.exec.DDLTask (state=08S01,code=1)
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 1 from org.apache.hadoop.hive.ql.exec.DDLTask (state=08S01,code=1)
> {code}



[jira] [Updated] (HIVE-4489) beeline always return the same error message twice

2013-05-03 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-4489:
--

Attachment: HIVE-4489.patch

> beeline always return the same error message twice
> --
>
> Key: HIVE-4489
> URL: https://issues.apache.org/jira/browse/HIVE-4489
> Project: Hive
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: 0.10.0
>Reporter: Chaoyu Tang
>Priority: Minor
>  Labels: newbie
> Attachments: HIVE-4489.patch
>
>   Original Estimate: 0h
>  Remaining Estimate: 0h
>
> Beeline always returns the same error message twice. For example, if I try to 
> create a table a2 that already exists, it prints out two identical messages, 
> which is not user friendly.
> {code}
> beeline> !connect jdbc:hive2://localhost:1 scott tiger 
> org.apache.hive.jdbc.HiveDriver
> Connecting to jdbc:hive2://localhost:1
> Connected to: Hive (version 0.10.0)
> Driver: Hive (version 0.10.0-cdh4.2.1)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> 0: jdbc:hive2://localhost:1> create table a2 (value int);
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 1 from org.apache.hadoop.hive.ql.exec.DDLTask (state=08S01,code=1)
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 1 from org.apache.hadoop.hive.ql.exec.DDLTask (state=08S01,code=1)
> {code}



[jira] [Commented] (HIVE-4489) beeline always return the same error message twice

2013-05-10 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13654547#comment-13654547
 ] 

Chaoyu Tang commented on HIVE-4489:
---

Could someone review this and commit it if it is a proper fix? It addresses an 
issue raised by several Beeline users. Thanks.

> beeline always return the same error message twice
> --
>
> Key: HIVE-4489
> URL: https://issues.apache.org/jira/browse/HIVE-4489
> Project: Hive
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: 0.10.0
>Reporter: Chaoyu Tang
>Priority: Minor
>  Labels: newbie
> Attachments: HIVE-4489.patch
>
>   Original Estimate: 0h
>  Remaining Estimate: 0h
>
> Beeline always returns the same error message twice. For example, if I try to 
> create a table a2 that already exists, it prints out two identical messages, 
> which is not user friendly.
> {code}
> beeline> !connect jdbc:hive2://localhost:1 scott tiger 
> org.apache.hive.jdbc.HiveDriver
> Connecting to jdbc:hive2://localhost:1
> Connected to: Hive (version 0.10.0)
> Driver: Hive (version 0.10.0-cdh4.2.1)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> 0: jdbc:hive2://localhost:1> create table a2 (value int);
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 1 from org.apache.hadoop.hive.ql.exec.DDLTask (state=08S01,code=1)
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 1 from org.apache.hadoop.hive.ql.exec.DDLTask (state=08S01,code=1)
> {code}



[jira] [Commented] (HIVE-4487) Hive does not set explicit permissions on hive.exec.scratchdir

2013-05-15 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13659087#comment-13659087
 ] 

Chaoyu Tang commented on HIVE-4487:
---

What about hive.exec.local.scratchdir (/tmp/${user.name} on the local 
filesystem)? If applicable, shouldn't it be 700 as well?

> Hive does not set explicit permissions on hive.exec.scratchdir
> --
>
> Key: HIVE-4487
> URL: https://issues.apache.org/jira/browse/HIVE-4487
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.10.0
>Reporter: Joey Echeverria
>
> The hive.exec.scratchdir defaults to /tmp/hive-$\{user.name\}, but when Hive 
> creates this directory it doesn't set any explicit permission on it. This 
> means if you have the default HDFS umask setting of 022, then these 
> directories end up being world readable. These permissions also get applied 
> to the staging directories and their files, thus leaving inter-stage data 
> world readable.
> This can cause a potential leak of data especially when operating on a 
> Kerberos enabled cluster. Hive should probably default these directories to 
> only be readable by the owner.
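A quick sketch of the permission arithmetic in the report above (the numeric modes are the report's stated assumptions, not values read from Hive or HDFS):

```python
# With the default HDFS umask 022, a directory created without explicit
# permissions ends up group- and world-readable.
DEFAULT_DIR_MODE = 0o777
UMASK = 0o022

effective = DEFAULT_DIR_MODE & ~UMASK
print(oct(effective))  # 0o755 -- group and other can read and list

# The fix the issue proposes: an explicit owner-only mode on the
# scratch directory, regardless of the umask.
OWNER_ONLY = 0o700
print(oct(effective & ~0o077))  # 0o700 once restricted to the owner
```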



[jira] [Commented] (HIVE-4291) Test HiveServer2 crash based on max thrift threads

2013-06-17 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13686222#comment-13686222
 ] 

Chaoyu Tang commented on HIVE-4291:
---

If I understand correctly, the test testHS2StabilityOnLargeConnections expects 
the client to receive a SQLException (line 190) when the max number of threads 
is reached. However, the fix in THRIFT-1869 makes the client wait until the 
next thread is available, so the SQLException should not be thrown in the test.
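The behavioral difference described above can be modeled with a toy bounded worker pool (pure Python, not HS2 or Thrift code): once the worker limit is reached, a late request waits for a free slot instead of failing with an exception, so every client is eventually served.

```python
# Toy model of "wait for a free worker thread" semantics (THRIFT-1869
# behavior), as opposed to rejecting the connection with an error.
import threading
import time

MAX_WORKERS = 2
slots = threading.Semaphore(MAX_WORKERS)  # the worker-thread limit
results = []
lock = threading.Lock()

def handle(client_id):
    slots.acquire()            # blocks instead of raising when full
    try:
        time.sleep(0.05)       # simulated request work
        with lock:
            results.append(client_id)
    finally:
        slots.release()

# Five clients contend for two workers; the extra three simply wait.
threads = [threading.Thread(target=handle, args=(i,)) for i in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(results))  # all five clients eventually served
```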

> Test HiveServer2 crash based on max thrift threads
> --
>
> Key: HIVE-4291
> URL: https://issues.apache.org/jira/browse/HIVE-4291
> Project: Hive
>  Issue Type: Test
>  Components: HiveServer2
>Reporter: Thiruvel Thirumoolan
>Assignee: Thiruvel Thirumoolan
> Attachments: TestHS2ThreadAllocation.java
>
>
> This test case ensures HS2 does not shut down or crash when the Thrift 
> threads have been depleted, an issue fixed in THRIFT-1869. The test should 
> pass post HIVE-4224, and it guards against the crash reappearing due to any 
> changes in Thrift behavior.



[jira] [Created] (HIVE-4766) HS2 login timeout

2013-06-20 Thread Chaoyu Tang (JIRA)
Chaoyu Tang created HIVE-4766:
-

 Summary: HS2 login timeout 
 Key: HIVE-4766
 URL: https://issues.apache.org/jira/browse/HIVE-4766
 Project: Hive
  Issue Type: Bug
Reporter: Chaoyu Tang






[jira] [Updated] (HIVE-4766) Support HS2 client login timeout when the thrift thread max# is reached

2013-06-20 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-4766:
--

  Component/s: HiveServer2
  Description: The HiveServer2 client (Beeline) hangs at login if the 
Thrift max thread count has been reached, because the server crashes due to a 
defect in the currently used Thrift 0.9.0. When Hive is upgraded to a newer 
version of Thrift (say Thrift 1.0), HS2 should support a client login timeout 
instead of hanging.
Affects Version/s: 0.10.0
   Issue Type: Improvement  (was: Bug)
  Summary: Support HS2 client login timeout when the thrift thread 
max# is reached  (was: HS2 login timeout )

> Support HS2 client login timeout when the thrift thread max# is reached
> ---
>
> Key: HIVE-4766
> URL: https://issues.apache.org/jira/browse/HIVE-4766
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 0.10.0
>Reporter: Chaoyu Tang
>
> The HiveServer2 client (Beeline) hangs at login if the Thrift max thread 
> count has been reached, because the server crashes due to a defect in the 
> currently used Thrift 0.9.0. When Hive is upgraded to a newer version of 
> Thrift (say Thrift 1.0), HS2 should support a client login timeout instead 
> of hanging.
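A minimal client-side sketch of the requested login-timeout behavior, assuming nothing about HS2 internals; the host and port below are placeholders, not real endpoints:

```python
# Bound the login wait with a socket timeout instead of hanging forever.
import socket

def login_with_timeout(host, port, timeout_s=5.0):
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout_s)   # fail fast instead of hanging
    try:
        sock.connect((host, port))
        return "connected"
    except (socket.timeout, OSError):
        return "login timed out or refused"
    finally:
        sock.close()

# Port 1 is almost certainly closed locally, so this returns quickly
# rather than blocking the client indefinitely.
print(login_with_timeout("127.0.0.1", 1, timeout_s=0.5))
```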



[jira] [Commented] (HIVE-4487) Hive does not set explicit permissions on hive.exec.scratchdir

2013-06-22 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13691208#comment-13691208
 ] 

Chaoyu Tang commented on HIVE-4487:
---

Please review the changes in https://reviews.apache.org/r/12049/

> Hive does not set explicit permissions on hive.exec.scratchdir
> --
>
> Key: HIVE-4487
> URL: https://issues.apache.org/jira/browse/HIVE-4487
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.10.0
>Reporter: Joey Echeverria
>
> The hive.exec.scratchdir defaults to /tmp/hive-$\{user.name\}, but when Hive 
> creates this directory it doesn't set any explicit permission on it. This 
> means if you have the default HDFS umask setting of 022, then these 
> directories end up being world readable. These permissions also get applied 
> to the staging directories and their files, thus leaving inter-stage data 
> world readable.
> This can cause a potential leak of data especially when operating on a 
> Kerberos enabled cluster. Hive should probably default these directories to 
> only be readable by the owner.



[jira] [Assigned] (HIVE-4487) Hive does not set explicit permissions on hive.exec.scratchdir

2013-06-22 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang reassigned HIVE-4487:
-

Assignee: Chaoyu Tang

> Hive does not set explicit permissions on hive.exec.scratchdir
> --
>
> Key: HIVE-4487
> URL: https://issues.apache.org/jira/browse/HIVE-4487
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.10.0
>Reporter: Joey Echeverria
>Assignee: Chaoyu Tang
>
> The hive.exec.scratchdir defaults to /tmp/hive-$\{user.name\}, but when Hive 
> creates this directory it doesn't set any explicit permission on it. This 
> means if you have the default HDFS umask setting of 022, then these 
> directories end up being world readable. These permissions also get applied 
> to the staging directories and their files, thus leaving inter-stage data 
> world readable.
> This can cause a potential leak of data especially when operating on a 
> Kerberos enabled cluster. Hive should probably default these directories to 
> only be readable by the owner.



[jira] [Updated] (HIVE-4487) Hive does not set explicit permissions on hive.exec.scratchdir

2013-06-22 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-4487:
--

Attachment: HIVE-4487.patch

> Hive does not set explicit permissions on hive.exec.scratchdir
> --
>
> Key: HIVE-4487
> URL: https://issues.apache.org/jira/browse/HIVE-4487
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.10.0
>Reporter: Joey Echeverria
>Assignee: Chaoyu Tang
> Attachments: HIVE-4487.patch
>
>
> The hive.exec.scratchdir defaults to /tmp/hive-$\{user.name\}, but when Hive 
> creates this directory it doesn't set any explicit permission on it. This 
> means if you have the default HDFS umask setting of 022, then these 
> directories end up being world readable. These permissions also get applied 
> to the staging directories and their files, thus leaving inter-stage data 
> world readable.
> This can cause a potential leak of data especially when operating on a 
> Kerberos enabled cluster. Hive should probably default these directories to 
> only be readable by the owner.



[jira] [Updated] (HIVE-3756) "LOAD DATA" does not honor permission inheritence

2013-06-22 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-3756:
--

Attachment: HIVE-3756.patch

> "LOAD DATA" does not honor permission inheritence
> -
>
> Key: HIVE-3756
> URL: https://issues.apache.org/jira/browse/HIVE-3756
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization, Security
>Affects Versions: 0.9.0
>Reporter: Johndee Burks
> Attachments: HIVE-3756.patch
>
>
> When a "LOAD DATA" operation is performed the resulting data in hdfs for the 
> table does not maintain permission inheritance. This remains true even with 
> the "hive.warehouse.subdir.inherit.perms" set to true.
> The issue is easily reproducible by creating a table and loading some data 
> into it. After the load is complete just do a "dfs -ls -R" on the warehouse 
> directory and you will see that the inheritance of permissions worked for the 
> table directory but not for the data. 



[jira] [Commented] (HIVE-3756) "LOAD DATA" does not honor permission inheritence

2013-06-22 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13691245#comment-13691245
 ] 

Chaoyu Tang commented on HIVE-3756:
---

Please review the changes in https://reviews.apache.org/r/12050/

> "LOAD DATA" does not honor permission inheritence
> -
>
> Key: HIVE-3756
> URL: https://issues.apache.org/jira/browse/HIVE-3756
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization, Security
>Affects Versions: 0.9.0
>Reporter: Johndee Burks
>Assignee: Chaoyu Tang
> Attachments: HIVE-3756.patch
>
>
> When a "LOAD DATA" operation is performed the resulting data in hdfs for the 
> table does not maintain permission inheritance. This remains true even with 
> the "hive.warehouse.subdir.inherit.perms" set to true.
> The issue is easily reproducible by creating a table and loading some data 
> into it. After the load is complete just do a "dfs -ls -R" on the warehouse 
> directory and you will see that the inheritance of permissions worked for the 
> table directory but not for the data. 



[jira] [Assigned] (HIVE-3756) "LOAD DATA" does not honor permission inheritence

2013-06-22 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang reassigned HIVE-3756:
-

Assignee: Chaoyu Tang

> "LOAD DATA" does not honor permission inheritence
> -
>
> Key: HIVE-3756
> URL: https://issues.apache.org/jira/browse/HIVE-3756
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization, Security
>Affects Versions: 0.9.0
>Reporter: Johndee Burks
>Assignee: Chaoyu Tang
> Attachments: HIVE-3756.patch
>
>
> When a "LOAD DATA" operation is performed the resulting data in hdfs for the 
> table does not maintain permission inheritance. This remains true even with 
> the "hive.warehouse.subdir.inherit.perms" set to true.
> The issue is easily reproducible by creating a table and loading some data 
> into it. After the load is complete just do a "dfs -ls -R" on the warehouse 
> directory and you will see that the inheritance of permissions worked for the 
> table directory but not for the data. 



[jira] [Commented] (HIVE-3094) new partition files and directories should inherit file permissions from parent partition/table dir

2013-06-22 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13691248#comment-13691248
 ] 

Chaoyu Tang commented on HIVE-3094:
---

see patch in HIVE-3756

> new partition files and directories should inherit file permissions from 
> parent partition/table dir
> ---
>
> Key: HIVE-3094
> URL: https://issues.apache.org/jira/browse/HIVE-3094
> Project: Hive
>  Issue Type: Improvement
>  Components: Authorization, Security
>Affects Versions: 0.9.0
>Reporter: Thejas M Nair
>Assignee: Chaoyu Tang
>
> In HIVE-2936, changes were made so that warehouse table subdirectories 
> inherit the permissions from the parent directory, but this applies only to 
> directories created by the metastore. 
> When directories (in the case of dynamic partitioning) or files are created 
> by the MR job, the default permissions are used: new partition files or 
> directories created from MR jobs don't inherit the permissions. 
> This means that even if the permissions have been granted on table directory 
> for a group, the group will not have permissions on the new partitions. 



[jira] [Assigned] (HIVE-3094) new partition files and directories should inherit file permissions from parent partition/table dir

2013-06-22 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang reassigned HIVE-3094:
-

Assignee: Chaoyu Tang

> new partition files and directories should inherit file permissions from 
> parent partition/table dir
> ---
>
> Key: HIVE-3094
> URL: https://issues.apache.org/jira/browse/HIVE-3094
> Project: Hive
>  Issue Type: Improvement
>  Components: Authorization, Security
>Affects Versions: 0.9.0
>Reporter: Thejas M Nair
>Assignee: Chaoyu Tang
>
> In HIVE-2936, changes were made so that warehouse table subdirectories 
> inherit the permissions from the parent directory, but this applies only to 
> directories created by the metastore. 
> When directories (in the case of dynamic partitioning) or files are created 
> by the MR job, the default permissions are used: new partition files or 
> directories created from MR jobs don't inherit the permissions. 
> This means that even if the permissions have been granted on table directory 
> for a group, the group will not have permissions on the new partitions. 



[jira] [Updated] (HIVE-3756) "LOAD DATA" does not honor permission inheritence

2013-07-02 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-3756:
--

Attachment: HIVE-3756_1.patch

Updated the changes based on review comments.

> "LOAD DATA" does not honor permission inheritence
> -
>
> Key: HIVE-3756
> URL: https://issues.apache.org/jira/browse/HIVE-3756
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization, Security
>Affects Versions: 0.9.0
>Reporter: Johndee Burks
>Assignee: Chaoyu Tang
> Attachments: HIVE-3756_1.patch, HIVE-3756.patch
>
>
> When a "LOAD DATA" operation is performed the resulting data in hdfs for the 
> table does not maintain permission inheritance. This remains true even with 
> the "hive.warehouse.subdir.inherit.perms" set to true.
> The issue is easily reproducible by creating a table and loading some data 
> into it. After the load is complete just do a "dfs -ls -R" on the warehouse 
> directory and you will see that the inheritance of permissions worked for the 
> table directory but not for the data. 



[jira] [Commented] (HIVE-3756) "LOAD DATA" does not honor permission inheritence

2013-07-17 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13711258#comment-13711258
 ] 

Chaoyu Tang commented on HIVE-3756:
---

Yes, IMO the table should preserve its own permission/group B in the 
insert-overwrite case. Here is a use case: a database is created so that a 
group can access it (the mode of /dbdir can be 770), while a certain table in 
this db (/dbdir/tbldir) is restricted to the admin alone (say permission mode 
700). If the admin insert-overwrites data into this table, it would change 
/dbdir/tbldir to 770, unexpectedly breaking the security.
I can change the code to preserve the permission/group of the overwritten 
table. It seems a minor change. 
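The rule argued for above can be sketched with a hypothetical helper (the function and its name are illustrative, not Hive's API): inherit the parent directory's mode for newly created paths, but keep the existing mode when overwriting an existing table directory.

```python
# Hypothetical policy helper: which mode should a written directory get?
def target_mode(parent_mode, existing_mode=None):
    if existing_mode is not None:
        # Insert-overwrite of an existing table dir: preserve its own,
        # possibly tighter, mode rather than widening it to the parent's.
        return existing_mode
    # Brand-new subdirectory: inherit the parent's permissions.
    return parent_mode

# /dbdir is group-accessible (770); /dbdir/tbldir was deliberately
# restricted to the admin (700).
print(oct(target_mode(0o770, existing_mode=0o700)))  # stays 0o700
print(oct(target_mode(0o770)))                       # new dir gets 0o770
```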

> "LOAD DATA" does not honor permission inheritence
> -
>
> Key: HIVE-3756
> URL: https://issues.apache.org/jira/browse/HIVE-3756
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization, Security
>Affects Versions: 0.9.0
>Reporter: Johndee Burks
>Assignee: Chaoyu Tang
> Attachments: HIVE-3756_1.patch, HIVE-3756.patch
>
>
> When a "LOAD DATA" operation is performed the resulting data in hdfs for the 
> table does not maintain permission inheritance. This remains true even with 
> the "hive.warehouse.subdir.inherit.perms" set to true.
> The issue is easily reproducible by creating a table and loading some data 
> into it. After the load is complete just do a "dfs -ls -R" on the warehouse 
> directory and you will see that the inheritance of permissions worked for the 
> table directory but not for the data. 



[jira] [Updated] (HIVE-3756) "LOAD DATA" does not honor permission inheritence

2013-07-19 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-3756:
--

Attachment: HIVE-3756_2.patch

[~sushanth] & [~ashutoshc] Uploaded the patch here and also posted it on 
Review Board for review. 

> "LOAD DATA" does not honor permission inheritence
> -
>
> Key: HIVE-3756
> URL: https://issues.apache.org/jira/browse/HIVE-3756
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization, Security
>Affects Versions: 0.9.0
>Reporter: Johndee Burks
>Assignee: Chaoyu Tang
> Attachments: HIVE-3756_1.patch, HIVE-3756_2.patch, HIVE-3756.patch
>
>
> When a "LOAD DATA" operation is performed the resulting data in hdfs for the 
> table does not maintain permission inheritance. This remains true even with 
> the "hive.warehouse.subdir.inherit.perms" set to true.
> The issue is easily reproducible by creating a table and loading some data 
> into it. After the load is complete just do a "dfs -ls -R" on the warehouse 
> directory and you will see that the inheritance of permissions worked for the 
> table directory but not for the data. 



[jira] [Commented] (HIVE-3756) "LOAD DATA" does not honor permission inheritence

2013-07-24 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13719076#comment-13719076
 ] 

Chaoyu Tang commented on HIVE-3756:
---

[~ashutoshc] & [~sushanth] I uploaded the patch HIVE-3756_2.patch here and also 
posted it on Review Board, requesting a review. The changes incorporate 
preserving the permission/group in the insert-overwrite case, as discussed 
here, along with some review suggestions from Ashutosh. For answers to the 
review questions, please see https://reviews.apache.org/r/12050/.

> "LOAD DATA" does not honor permission inheritence
> -
>
> Key: HIVE-3756
> URL: https://issues.apache.org/jira/browse/HIVE-3756
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization, Security
>Affects Versions: 0.9.0
>Reporter: Johndee Burks
>Assignee: Chaoyu Tang
> Attachments: HIVE-3756_1.patch, HIVE-3756_2.patch, HIVE-3756.patch
>
>
> When a "LOAD DATA" operation is performed the resulting data in hdfs for the 
> table does not maintain permission inheritance. This remains true even with 
> the "hive.warehouse.subdir.inherit.perms" set to true.
> The issue is easily reproducible by creating a table and loading some data 
> into it. After the load is complete just do a "dfs -ls -R" on the warehouse 
> directory and you will see that the inheritance of permissions worked for the 
> table directory but not for the data. 



[jira] [Commented] (HIVE-3756) "LOAD DATA" does not honor permission inheritence

2013-07-26 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13720661#comment-13720661
 ] 

Chaoyu Tang commented on HIVE-3756:
---

[~ashutoshc] & [~sushanth] Thanks for the review.

> "LOAD DATA" does not honor permission inheritence
> -
>
> Key: HIVE-3756
> URL: https://issues.apache.org/jira/browse/HIVE-3756
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization, Security
>Affects Versions: 0.9.0
>Reporter: Johndee Burks
>Assignee: Chaoyu Tang
> Fix For: 0.12.0
>
> Attachments: HIVE-3756_1.patch, HIVE-3756_2.patch, HIVE-3756.patch
>
>
> When a "LOAD DATA" operation is performed the resulting data in hdfs for the 
> table does not maintain permission inheritance. This remains true even with 
> the "hive.warehouse.subdir.inherit.perms" set to true.
> The issue is easily reproducible by creating a table and loading some data 
> into it. After the load is complete just do a "dfs -ls -R" on the warehouse 
> directory and you will see that the inheritance of permissions worked for the 
> table directory but not for the data. 



[jira] [Commented] (HIVE-4223) LazySimpleSerDe will throw IndexOutOfBoundsException in nested structs of hive table

2013-07-29 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723276#comment-13723276
 ] 

Chaoyu Tang commented on HIVE-4223:
---

[~java8964] I was not able to reproduce the reported problem in hive-0.9.0 and 
wonder if it might be related to the data. Here is my test case:
1. create table bcd (col1 array >>>)
 row format delimited fields terminated by '\001' collection items terminated 
by '\002' lines terminated by '\n' stored as textfile;
** should be same as you described
2. load data local inpath '/root/nest_struct.data' overwrite into table bcd;
** see attached nest_struct.data
3. select col1 from bcd;
** got:
[{"col1":"c1v","col2":"c2v","col3":"c3v","col4":"c4v","col5":"c5v","col6":"c6v","col7":"c7v","col8":[{"col1":"c11v","col2":"c22v","col3":"c33v","col4":"c44v","col5":"c55v","col6":"c66v","col7":"c77v","col8":"c88v","col9":"c99v"}]}]


Do you see anything different in your case? Could you please update your case, 
and then I can give it a try.

 

> LazySimpleSerDe will throw IndexOutOfBoundsException in nested structs of 
> hive table
> 
>
> Key: HIVE-4223
> URL: https://issues.apache.org/jira/browse/HIVE-4223
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 0.9.0
> Environment: Hive 0.9.0
>Reporter: Yong Zhang
> Attachments: nest_struct.data
>
>
> The LazySimpleSerDe will throw IndexOutOfBoundsException if the column 
> structure is struct containing array of struct. 
> I have a table with one column defined like this:
> columnA
> array <
> struct<
>col1:primiType,
>col2:primiType,
>col3:primiType,
>col4:primiType,
>col5:primiType,
>col6:primiType,
>col7:primiType,
>col8:array<
> struct<
>   col1:primiType,
>   col2:primiType,
>   col3:primiType,
>   col4:primiType,
>   col5:primiType,
>   col6:primiType,
>   col7:primiType,
>   col8:primiType,
>   col9:primiType
> >
>>
> >
> >
> In this example, the outside struct has 8 columns (including the array), and 
> the inner struct has 9 columns. As long as the outside struct has LESS column 
> count than the inner struct column count, I think we will get the following 
> exception stack trace in LazySimpleSerDe when it tries to serialize a row:
> Caused by: java.lang.IndexOutOfBoundsException: Index: 8, Size: 8
> at java.util.ArrayList.RangeCheck(ArrayList.java:547)
> at java.util.ArrayList.get(ArrayList.java:322)
> at 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:485)
> at 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:443)
> at 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serializeField(LazySimpleSerDe.java:381)
> at 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:365)
> at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:568)
> at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
> at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
> at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
> at 
> org.apache.hadoop.hive.ql.exec.FilterOperator.processOp(FilterOperator.java:132)
> at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
> at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:83)
> at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:531)
> ... 9 more
> I am not very sure about the exact cause of this problem. I believe that the 
> public static void serialize(ByteStream.Output out, Object obj, 
> ObjectInspector objInspector, byte[] separators, int level, Text 
> nullSequence, boolean escaped, byte escapeChar, boolean[] needsEscape) method 
> recursively invokes itself when it encounters a nested structure. But for the 
> nested struct structure, the list reference gets messed up, and size() 
> returns wrong data.
> In the above example case I faced, 
> for these 2 lines:
>   

[jira] [Updated] (HIVE-4223) LazySimpleSerDe will throw IndexOutOfBoundsException in nested structs of hive table

2013-07-29 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-4223:
--

Attachment: nest_struct.data

data file to my test case -- chaoyu

> LazySimpleSerDe will throw IndexOutOfBoundsException in nested structs of 
> hive table
> 
>
> Key: HIVE-4223
> URL: https://issues.apache.org/jira/browse/HIVE-4223
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 0.9.0
> Environment: Hive 0.9.0
>Reporter: Yong Zhang
> Attachments: nest_struct.data
>
>
> The LazySimpleSerDe will throw IndexOutOfBoundsException if the column 
> structure is struct containing array of struct. 
> I have a table with one column defined like this:
> columnA
> array <
> struct<
>col1:primiType,
>col2:primiType,
>col3:primiType,
>col4:primiType,
>col5:primiType,
>col6:primiType,
>col7:primiType,
>col8:array<
> struct<
>   col1:primiType,
>   col2:primiType,
>   col3:primiType,
>   col4:primiType,
>   col5:primiType,
>   col6:primiType,
>   col7:primiType,
>   col8:primiType,
>   col9:primiType
> >
>>
> >
> >
> In this example, the outside struct has 8 columns (including the array), and 
> the inner struct has 9 columns. As long as the outside struct has LESS column 
> count than the inner struct column count, I think we will get the following 
> exception stack trace in LazySimpleSerDe when it tries to serialize a row:
> Caused by: java.lang.IndexOutOfBoundsException: Index: 8, Size: 8
> at java.util.ArrayList.RangeCheck(ArrayList.java:547)
> at java.util.ArrayList.get(ArrayList.java:322)
> at 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:485)
> at 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:443)
> at 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serializeField(LazySimpleSerDe.java:381)
> at 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:365)
> at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:568)
> at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
> at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
> at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
> at 
> org.apache.hadoop.hive.ql.exec.FilterOperator.processOp(FilterOperator.java:132)
> at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
> at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:83)
> at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:531)
> ... 9 more
> I am not very sure about the exact cause of this problem. I believe that the 
> public static void serialize(ByteStream.Output out, Object obj, 
> ObjectInspector objInspector, byte[] separators, int level, Text 
> nullSequence, boolean escaped, byte escapeChar, boolean[] needsEscape) method 
> recursively invokes itself when it encounters a nested structure. But for the 
> nested struct structure, the list reference gets messed up, and size() 
> returns wrong data.
> In the above example case I faced, 
> for these 2 lines:
>   List fields = soi.getAllStructFieldRefs();
>   list = soi.getStructFieldsDataAsList(obj);
> my StructObjectInspector (soi) returns the CORRECT data from its 
> getAllStructFieldRefs() and getStructFieldsDataAsList() methods. For example, 
> for one row of the outer 8-column struct, I have 2 elements in the inner 
> array of struct, and each element has 9 columns (as there are 9 columns in 
> the inner struct). At runtime, after adding more logging to LazySimpleSerDe, 
> I see the following behavior in the logging:
> for the 8 outside columns, loop
> for the 9 inside columns, loop for serialize
> for the 9 inside columns, loop for serialize
> the code breaks here: the outer loop tries to access the 9th element, which 
> does not exist in the outer struct, producing the stack trace shown above.
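The hypothesized failure mode (the outer loop bound picking up the inner struct's field count after a recursive call) can be sketched outside Hive. This is an illustrative reproduction only, not LazySimpleSerDe's actual code, and every name in it is made up:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch of the bug pattern: a recursive serializer that keeps
// the current field list in shared state. Serializing the nested struct
// overwrites that state, so after the recursion returns, the outer loop bound
// becomes the inner struct's field count (9) while the outer row has only 8
// elements, producing an "Index: 8, Size: 8" style IndexOutOfBoundsException.
public class SharedFieldListBug {
    static List<String> fields; // shared state, clobbered by recursion

    @SuppressWarnings("unchecked")
    static void serialize(List<Object> row, List<String> rowFields) {
        fields = rowFields;
        for (int i = 0; i < fields.size(); i++) { // bound re-read from shared state
            Object v = row.get(i);                // throws once i reaches 8
            if (v instanceof List) {
                serialize((List<Object>) v, INNER_FIELDS); // overwrites 'fields'
            }
        }
    }

    static final List<String> INNER_FIELDS = Arrays.asList(
            "col1", "col2", "col3", "col4", "col5", "col6", "col7", "col8", "col9");

    public static void main(String[] args) {
        List<String> outerFields = Arrays.asList(
                "col1", "col2", "col3", "col4", "col5", "col6", "col7", "col8");
        List<Object> inner = new ArrayList<>(Arrays.asList(
                "c11v", "c22v", "c33v", "c44v", "c55v", "c66v", "c77v", "c88v", "c99v"));
        List<Object> outer = new ArrayList<>(Arrays.asList(
                "c1v", "c2v", "c3v", "c4v", "c5v", "c6v", "c7v", inner));
        try {
            serialize(outer, outerFields);
        } catch (IndexOutOfBoundsException e) {
            System.out.println(e); // index 8 on a size-8 list; wording varies by JDK
        }
    }
}
```

Passing the field list down as a parameter instead of stashing it in shared state makes the outer loop bound immune to what the recursion does.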

[jira] [Commented] (HIVE-4223) LazySimpleSerDe will throw IndexOutOfBoundsException in nested structs of hive table

2013-07-29 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723303#comment-13723303
 ] 

Chaoyu Tang commented on HIVE-4223:
---

The previous comment was not in the right format; re-posting:

I was not able to reproduce the reported problem in hive-0.9.0 and wonder if 
it might be related to the data. Here is my test case:
1. create table bcd (col1 array<struct<col1:string,col2:string,col3:string,col4:string,col5:string,col6:string,col7:string,col8:array<struct<col1:string,col2:string,col3:string,col4:string,col5:string,col6:string,col7:string,col8:string,col9:string>>>>)
 row format delimited fields terminated by '\001' collection items terminated 
by '\002' lines terminated by '\n' stored as textfile;
-- same as the case described in this JIRA
2. load data local inpath '/root/nest_struct.data' overwrite into table bcd;
-- see attached nest_struct.data
3. select col1 from bcd;
-- got expected result
{code}
[{"col1":"c1v","col2":"c2v","col3":"c3v","col4":"c4v","col5":"c5v","col6":"c6v","col7":"c7v","col8":[{"col1":"c11v","col2":"c22v","col3":"c33v","col4":"c44v","col5":"c55v","col6":"c66v","col7":"c77v","col8":"c88v","col9":"c99v"}]}]
{code}

> LazySimpleSerDe will throw IndexOutOfBoundsException in nested structs of 
> hive table
> 
>
> Key: HIVE-4223
> URL: https://issues.apache.org/jira/browse/HIVE-4223
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 0.9.0
> Environment: Hive 0.9.0
>Reporter: Yong Zhang
> Attachments: nest_struct.data
>
>

[jira] [Created] (HIVE-6998) Select query can only support maximum 128 distinct expressions

2014-05-01 Thread Chaoyu Tang (JIRA)
Chaoyu Tang created HIVE-6998:
-

 Summary: Select query can only support maximum 128 distinct 
expressions
 Key: HIVE-6998
 URL: https://issues.apache.org/jira/browse/HIVE-6998
 Project: Hive
  Issue Type: Bug
  Components: Query Processor, Serializers/Deserializers
Affects Versions: 0.14.0
Reporter: Chaoyu Tang


A select query can support at most 128 distinct expressions; otherwise an 
ArrayIndexOutOfBoundsException is thrown. For a query like:
select count(distinct c1),  count(distinct c2),  count(distinct c3),  
count(distinct c4),  count(distinct c5),  count(distinct c6), ..., 
count(distinct c128),  count(distinct c129) from tbl_129columns;

you will get an error like:
{code}
java.lang.Exception: java.lang.RuntimeException: Hive Runtime Error while 
closing operators
at 
org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:354)
Caused by: java.lang.RuntimeException: Hive Runtime Error while closing 
operators
at 
org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:260)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
at 
org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:223)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at java.lang.Thread.run(Thread.java:695)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
org.apache.hadoop.hive.ql.metadata.HiveException: 
org.apache.hadoop.hive.ql.metadata.HiveException: 
java.lang.ArrayIndexOutOfBoundsException: -128
at 
org.apache.hadoop.hive.ql.exec.GroupByOperator.closeOp(GroupByOperator.java:1141)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:579)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:591)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:591)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:591)
at 
org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:227)
... 10 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
org.apache.hadoop.hive.ql.metadata.HiveException: 
java.lang.ArrayIndexOutOfBoundsException: -128
at 
org.apache.hadoop.hive.ql.exec.GroupByOperator.flush(GroupByOperator.java:1099)
at 
org.apache.hadoop.hive.ql.exec.GroupByOperator.closeOp(GroupByOperator.java:1138)
... 15 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
java.lang.ArrayIndexOutOfBoundsException: -128
at 
org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:327)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
at 
org.apache.hadoop.hive.ql.exec.GroupByOperator.forward(GroupByOperator.java:1064)
at 
org.apache.hadoop.hive.ql.exec.GroupByOperator.flush(GroupByOperator.java:1082)
... 16 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: -128
at java.util.ArrayList.get(ArrayList.java:324)
at 
org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.serialize(BinarySortableSerDe.java:838)
at 
org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.serialize(BinarySortableSerDe.java:600)
at 
org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.toHiveKey(ReduceSinkOperator.java:401)
at 
org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:320)
... 19 more
{code}
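The -128 index in the ArrayList.get call above is consistent with a column index or tag being carried in a signed Java byte, which wraps exactly at the 129th zero-based value. A minimal illustration of that wraparound (not Hive code):

```java
// Java bytes are signed 8-bit values, so casting the 129th zero-based
// index (128) to byte wraps it to -128, matching the -128 index seen in
// the ArrayList.get call of the stack trace above.
public class ByteWrapDemo {
    public static void main(String[] args) {
        for (int i = 126; i <= 129; i++) {
            System.out.println(i + " -> " + (byte) i);
        }
        // 126 -> 126
        // 127 -> 127
        // 128 -> -128  (wraps; using it as a list index then fails)
        // 129 -> -127
    }
}
```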



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-7445) Improve LOGS for Hive when a query is not able to acquire locks

2014-07-18 Thread Chaoyu Tang (JIRA)
Chaoyu Tang created HIVE-7445:
-

 Summary: Improve LOGS for Hive when a query is not able to acquire 
locks
 Key: HIVE-7445
 URL: https://issues.apache.org/jira/browse/HIVE-7445
 Project: Hive
  Issue Type: Improvement
  Components: Diagnosability, Logging
Affects Versions: 0.13.1
Reporter: Chaoyu Tang
Assignee: Chaoyu Tang
Priority: Minor
 Fix For: 0.14.0


Currently the error thrown when a query cannot acquire a lock is:
Error in acquireLocks... 
FAILED: Error in acquiring locks: Locks on the underlying objects cannot be 
acquired. retry after some time
This error is insufficient if the user wants to understand what is blocking 
them, and insufficient from a diagnosability perspective, because it is 
difficult to know which query is blocking the lock acquisition.






[jira] [Updated] (HIVE-7445) Improve LOGS for Hive when a query is not able to acquire locks

2014-07-18 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-7445:
--

Attachment: HIVE-7445.patch

With this patch, when in debug mode, ZooKeeperHiveLockManager logs all locks 
conflicting with a lock that it fails to acquire for a query. The logging 
looks like:
{code}
14/07/18 09:43:08 ERROR ZooKeeperHiveLockManager: Unable to acquire lock for 
default@sample_07 mode IMPLICIT
14/07/18 09:43:08 DEBUG ZooKeeperHiveLockManager: Requested lock 
default@sample_07:: mode:IMPLICIT; query:insert into table sample_07 select * 
from sample_08
14/07/18 09:43:08 DEBUG ZooKeeperHiveLockManager: Conflicting lock to 
default@sample_07:: mode:IMPLICIT;query:select * from 
sample_07;queryId:root_20140718064141_439583f9-f281-4d01-ba0c-616523685124;clientIp:10.20.92.233
{code}

> Improve LOGS for Hive when a query is not able to acquire locks
> ---
>
> Key: HIVE-7445
> URL: https://issues.apache.org/jira/browse/HIVE-7445
> Project: Hive
>  Issue Type: Improvement
>  Components: Diagnosability, Logging
>Affects Versions: 0.13.1
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
>Priority: Minor
> Fix For: 0.14.0
>
> Attachments: HIVE-7445.patch
>
>





[jira] [Updated] (HIVE-7445) Improve LOGS for Hive when a query is not able to acquire locks

2014-07-18 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-7445:
--

Status: Patch Available  (was: Open)

> Improve LOGS for Hive when a query is not able to acquire locks
> ---
>
> Key: HIVE-7445
> URL: https://issues.apache.org/jira/browse/HIVE-7445
> Project: Hive
>  Issue Type: Improvement
>  Components: Diagnosability, Logging
>Affects Versions: 0.13.1
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
>Priority: Minor
> Fix For: 0.14.0
>
> Attachments: HIVE-7445.patch
>
>





[jira] [Reopened] (HIVE-5456) Queries fail on avro backed table with empty partition

2014-07-18 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang reopened HIVE-5456:
---

  Assignee: Chaoyu Tang

This problem still exists in trunk Hive-0.14.0 as of today (7/17/2014). 
The cause is that the properties used in createDummyFileForEmptyPartition 
contain only those from a partition, and are therefore missing the Avro schema 
url/literal, which might exist only as a table property. 
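The general fix idea, falling back to table-level properties so that keys such as avro.schema.literal survive for an empty partition, can be sketched with plain java.util.Properties. The class and method names here are illustrative, not the actual HIVE-5456 patch:

```java
import java.util.Properties;

public class SchemaFallback {
    // Merge partition properties over table properties so table-only keys
    // (e.g. avro.schema.literal) are still visible when creating the dummy
    // file for an empty partition. Illustrative of the fix idea only.
    static Properties withTableFallback(Properties partProps, Properties tblProps) {
        Properties merged = new Properties();
        merged.putAll(tblProps);   // table-level values as defaults
        merged.putAll(partProps);  // partition-level values take precedence
        return merged;
    }

    public static void main(String[] args) {
        Properties tbl = new Properties();
        tbl.setProperty("avro.schema.literal", "{\"type\":\"record\"}");
        Properties part = new Properties(); // empty partition carries no schema
        Properties merged = withTableFallback(part, tbl);
        // The schema is now present even though the partition lacked it.
        System.out.println(merged.getProperty("avro.schema.literal"));
    }
}
```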

> Queries fail on avro backed table with empty partition 
> ---
>
> Key: HIVE-5456
> URL: https://issues.apache.org/jira/browse/HIVE-5456
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.11.0
>Reporter: Prasad Mujumdar
>Assignee: Chaoyu Tang
>
> The following query fails
> {noformat}
> DROP TABLE IF EXISTS episodes_partitioned;
> CREATE TABLE episodes_partitioned
> PARTITIONED BY (doctor_pt INT)
> ROW FORMAT
> SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
> STORED AS
> INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
> OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
> TBLPROPERTIES ('avro.schema.literal'='{
>   "namespace": "testing.hive.avro.serde",
>   "name": "episodes",
>   "type": "record",
>   "fields": [
> {
>   "name":"title",
>   "type":"string",
>   "doc":"episode title"
> },
> {
>   "name":"air_date",
>   "type":"string",
>   "doc":"initial date"
> },
> {
>   "name":"doctor",
>   "type":"int",
>   "doc":"main actor playing the Doctor in episode"
> }
>   ]
> }');
> ALTER TABLE episodes_partitioned ADD PARTITION (doctor_pt=4);
> ALTER TABLE episodes_partitioned ADD PARTITION (doctor_pt=5);
> SELECT COUNT(*) FROM episodes_partitioned;
> {noformat}
> with the following exception 
> {noformat}
> java.io.IOException: org.apache.hadoop.hive.serde2.avro.AvroSerdeException: 
> Neither avro.schema.literal nor avro.schema.url specified, can't determine 
> table schema
> at 
> org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat.getHiveRecordWriter(AvroContainerOutputFormat.java:61)
> at 
> org.apache.hadoop.hive.ql.exec.Utilities.createEmptyFile(Utilities.java:2869)
> at 
> org.apache.hadoop.hive.ql.exec.Utilities.createDummyFileForEmptyPartition(Utilities.java:2901)
> at 
> org.apache.hadoop.hive.ql.exec.Utilities.getInputPaths(Utilities.java:2825)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:381)
> at 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:136)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:151)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1409)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1187)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1015)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:883)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:348)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:446)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:456)
> at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:737)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
> {noformat}





[jira] [Updated] (HIVE-5456) Queries fail on avro backed table with empty partition

2014-07-18 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-5456:
--

Attachment: HIVE-5456.patch

> Queries fail on avro backed table with empty partition 
> ---
>
> Key: HIVE-5456
> URL: https://issues.apache.org/jira/browse/HIVE-5456
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.11.0
>Reporter: Prasad Mujumdar
>Assignee: Chaoyu Tang
> Attachments: HIVE-5456.patch
>
>





[jira] [Updated] (HIVE-5456) Queries fail on avro backed table with empty partition

2014-07-18 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-5456:
--

Fix Version/s: 0.14.0
Affects Version/s: 0.13.1
   Status: Patch Available  (was: Reopened)

> Queries fail on avro backed table with empty partition 
> ---
>
> Key: HIVE-5456
> URL: https://issues.apache.org/jira/browse/HIVE-5456
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.13.1, 0.11.0
>Reporter: Prasad Mujumdar
>Assignee: Chaoyu Tang
> Fix For: 0.14.0
>
> Attachments: HIVE-5456.patch
>
>





[jira] [Commented] (HIVE-5456) Queries fail on avro backed table with empty partition

2014-07-18 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14066819#comment-14066819
 ] 

Chaoyu Tang commented on HIVE-5456:
---

Looks like the build failed due to compilation errors that did not come from 
this patch.

> Queries fail on avro backed table with empty partition 
> ---
>
> Key: HIVE-5456
> URL: https://issues.apache.org/jira/browse/HIVE-5456
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.11.0, 0.13.1
>Reporter: Prasad Mujumdar
>Assignee: Chaoyu Tang
> Fix For: 0.14.0
>
> Attachments: HIVE-5456.patch
>
>
> The following query fails
> {noformat}
> DROP TABLE IF EXISTS episodes_partitioned;
> CREATE TABLE episodes_partitioned
> PARTITIONED BY (doctor_pt INT)
> ROW FORMAT
> SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
> STORED AS
> INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
> OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
> TBLPROPERTIES ('avro.schema.literal'='{
>   "namespace": "testing.hive.avro.serde",
>   "name": "episodes",
>   "type": "record",
>   "fields": [
> {
>   "name":"title",
>   "type":"string",
>   "doc":"episode title"
> },
> {
>   "name":"air_date",
>   "type":"string",
>   "doc":"initial date"
> },
> {
>   "name":"doctor",
>   "type":"int",
>   "doc":"main actor playing the Doctor in episode"
> }
>   ]
> }');
> ALTER TABLE episodes_partitioned ADD PARTITION (doctor_pt=4);
> ALTER TABLE episodes_partitioned ADD PARTITION (doctor_pt=5);
> SELECT COUNT(*) FROM episodes_partitioned;
> {noformat}
> with following exception 
> {noformat}
> java.io.IOException: org.apache.hadoop.hive.serde2.avro.AvroSerdeException: 
> Neither avro.schema.literal nor avro.schema.url specified, can't determine 
> table schema
> at 
> org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat.getHiveRecordWriter(AvroContainerOutputFormat.java:61)
> at 
> org.apache.hadoop.hive.ql.exec.Utilities.createEmptyFile(Utilities.java:2869)
> at 
> org.apache.hadoop.hive.ql.exec.Utilities.createDummyFileForEmptyPartition(Utilities.java:2901)
> at 
> org.apache.hadoop.hive.ql.exec.Utilities.getInputPaths(Utilities.java:2825)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:381)
> at 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:136)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:151)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1409)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1187)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1015)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:883)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:348)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:446)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:456)
> at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:737)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
> {noformat}
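The exception arises because avro.schema.literal is stored at the table level,
while the dummy file created for an empty partition is written with properties
that lack it. A hedged Python sketch of one plausible fallback (illustrative
names only; not necessarily what the eventual patch does): consult the
table-level SerDe properties when the partition-level ones carry no schema.

```python
def resolve_avro_schema(partition_props, table_props):
    """Return the Avro schema source, preferring partition-level keys."""
    for props in (partition_props, table_props):
        for key in ("avro.schema.literal", "avro.schema.url"):
            if key in props:
                return props[key]
    raise ValueError(
        "Neither avro.schema.literal nor avro.schema.url specified, "
        "can't determine table schema")

# Empty partition: no partition-level schema, but the table defines one.
table_props = {"avro.schema.literal": '{"type": "record", "name": "episodes"}'}
print(resolve_avro_schema({}, table_props))
```

With only partition properties consulted, the lookup falls through to the
ValueError, which mirrors the AvroSerdeException in the stack trace above.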





[jira] [Commented] (HIVE-7445) Improve LOGS for Hive when a query is not able to acquire locks

2014-07-18 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14066825#comment-14066825
 ] 

Chaoyu Tang commented on HIVE-7445:
---

Hi [~szehon], thanks for the review and comments. I will revise the change 
based on your comments and also look into the failed tests. 

> Improve LOGS for Hive when a query is not able to acquire locks
> ---
>
> Key: HIVE-7445
> URL: https://issues.apache.org/jira/browse/HIVE-7445
> Project: Hive
>  Issue Type: Improvement
>  Components: Diagnosability, Logging
>Affects Versions: 0.13.1
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
>Priority: Minor
> Fix For: 0.14.0
>
> Attachments: HIVE-7445.patch
>
>
> Currently the error thrown when you cannot acquire a lock is:
> Error in acquireLocks... 
> FAILED: Error in acquiring locks: Locks on the underlying objects cannot be 
> acquired. retry after some time
> This error is insufficient if the user would like to understand what is 
> blocking them and insufficient from a diagnosability perspective because it 
> is difficult to know what query is blocking the lock acquisition.
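To make the intent concrete, here is a minimal Python sketch (all names
hypothetical, not Hive's actual API) of the kind of diagnostic the improvement
asks for: on a lock-acquisition failure, report which existing locks, held by
whom, and running what query, are blocking the attempt.

```python
def format_lock_failure(obj, blockers):
    """Build an error message that names the conflicting lock holders."""
    lines = ["FAILED: Error in acquiring locks on %s. Conflicting locks:" % obj]
    for b in blockers:
        lines.append("  %(mode)s lock held by %(user)s running: %(query)s" % b)
    return "\n".join(lines)

msg = format_lock_failure(
    "default@t1",
    [{"mode": "EXCLUSIVE", "user": "etl",
      "query": "INSERT OVERWRITE TABLE t1 ..."}])
print(msg)
```

Compared with the bare "retry after some time" message, this points the user
straight at the blocking query instead of leaving them to guess.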





[jira] [Updated] (HIVE-7445) Improve LOGS for Hive when a query is not able to acquire locks

2014-07-22 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-7445:
--

Attachment: HIVE-7445.1.patch

Hi @Szehon, I made the changes based on your comments (see the attached 
HIVE-7445.1.patch) and also posted it on RB: 
https://reviews.apache.org/r/23820/. Please review it and let me know if there 
are any problems. Thanks

> Improve LOGS for Hive when a query is not able to acquire locks
> ---
>
> Key: HIVE-7445
> URL: https://issues.apache.org/jira/browse/HIVE-7445
> Project: Hive
>  Issue Type: Improvement
>  Components: Diagnosability, Logging
>Affects Versions: 0.13.1
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
>Priority: Minor
> Fix For: 0.14.0
>
> Attachments: HIVE-7445.1.patch, HIVE-7445.patch
>
>
> Currently the error thrown when you cannot acquire a lock is:
> Error in acquireLocks... 
> FAILED: Error in acquiring locks: Locks on the underlying objects cannot be 
> acquired. retry after some time
> This error is insufficient if the user would like to understand what is 
> blocking them and insufficient from a diagnosability perspective because it 
> is difficult to know what query is blocking the lock acquisition.





[jira] [Updated] (HIVE-7445) Improve LOGS for Hive when a query is not able to acquire locks

2014-07-22 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-7445:
--

Attachment: HIVE-7445.2.patch

Changed the format based on the review comments and made the indentation two 
spaces.

> Improve LOGS for Hive when a query is not able to acquire locks
> ---
>
> Key: HIVE-7445
> URL: https://issues.apache.org/jira/browse/HIVE-7445
> Project: Hive
>  Issue Type: Improvement
>  Components: Diagnosability, Logging
>Affects Versions: 0.13.1
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
>Priority: Minor
> Fix For: 0.14.0
>
> Attachments: HIVE-7445.1.patch, HIVE-7445.2.patch, HIVE-7445.patch
>
>
> Currently the error thrown when you cannot acquire a lock is:
> Error in acquireLocks... 
> FAILED: Error in acquiring locks: Locks on the underlying objects cannot be 
> acquired. retry after some time
> This error is insufficient if the user would like to understand what is 
> blocking them and insufficient from a diagnosability perspective because it 
> is difficult to know what query is blocking the lock acquisition.





[jira] [Updated] (HIVE-7445) Improve LOGS for Hive when a query is not able to acquire locks

2014-07-23 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-7445:
--

Attachment: HIVE-7445.3.patch

> Improve LOGS for Hive when a query is not able to acquire locks
> ---
>
> Key: HIVE-7445
> URL: https://issues.apache.org/jira/browse/HIVE-7445
> Project: Hive
>  Issue Type: Improvement
>  Components: Diagnosability, Logging
>Affects Versions: 0.13.1
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
>Priority: Minor
> Fix For: 0.14.0
>
> Attachments: HIVE-7445.1.patch, HIVE-7445.2.patch, HIVE-7445.3.patch, 
> HIVE-7445.patch
>
>
> Currently the error thrown when you cannot acquire a lock is:
> Error in acquireLocks... 
> FAILED: Error in acquiring locks: Locks on the underlying objects cannot be 
> acquired. retry after some time
> This error is insufficient if the user would like to understand what is 
> blocking them and insufficient from a diagnosability perspective because it 
> is difficult to know what query is blocking the lock acquisition.





[jira] [Updated] (HIVE-5456) Queries fail on avro backed table with empty partition

2014-07-23 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-5456:
--

Attachment: HIVE-5456.patch

Reattached the patch to trigger the tests.

> Queries fail on avro backed table with empty partition 
> ---
>
> Key: HIVE-5456
> URL: https://issues.apache.org/jira/browse/HIVE-5456
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.11.0, 0.13.1
>Reporter: Prasad Mujumdar
>Assignee: Chaoyu Tang
> Fix For: 0.14.0
>
> Attachments: HIVE-5456.patch, HIVE-5456.patch
>
>
> The following query fails
> {noformat}
> DROP TABLE IF EXISTS episodes_partitioned;
> CREATE TABLE episodes_partitioned
> PARTITIONED BY (doctor_pt INT)
> ROW FORMAT
> SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
> STORED AS
> INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
> OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
> TBLPROPERTIES ('avro.schema.literal'='{
>   "namespace": "testing.hive.avro.serde",
>   "name": "episodes",
>   "type": "record",
>   "fields": [
> {
>   "name":"title",
>   "type":"string",
>   "doc":"episode title"
> },
> {
>   "name":"air_date",
>   "type":"string",
>   "doc":"initial date"
> },
> {
>   "name":"doctor",
>   "type":"int",
>   "doc":"main actor playing the Doctor in episode"
> }
>   ]
> }');
> ALTER TABLE episodes_partitioned ADD PARTITION (doctor_pt=4);
> ALTER TABLE episodes_partitioned ADD PARTITION (doctor_pt=5);
> SELECT COUNT(*) FROM episodes_partitioned;
> {noformat}
> with following exception 
> {noformat}
> java.io.IOException: org.apache.hadoop.hive.serde2.avro.AvroSerdeException: 
> Neither avro.schema.literal nor avro.schema.url specified, can't determine 
> table schema
> at 
> org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat.getHiveRecordWriter(AvroContainerOutputFormat.java:61)
> at 
> org.apache.hadoop.hive.ql.exec.Utilities.createEmptyFile(Utilities.java:2869)
> at 
> org.apache.hadoop.hive.ql.exec.Utilities.createDummyFileForEmptyPartition(Utilities.java:2901)
> at 
> org.apache.hadoop.hive.ql.exec.Utilities.getInputPaths(Utilities.java:2825)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:381)
> at 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:136)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:151)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1409)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1187)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1015)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:883)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:348)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:446)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:456)
> at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:737)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
> {noformat}





[jira] [Created] (HIVE-6792) hive.warehouse.subdir.inherit.perms doesn't work correctly in CTAS

2014-03-30 Thread Chaoyu Tang (JIRA)
Chaoyu Tang created HIVE-6792:
-

 Summary: hive.warehouse.subdir.inherit.perms doesn't work 
correctly in CTAS
 Key: HIVE-6792
 URL: https://issues.apache.org/jira/browse/HIVE-6792
 Project: Hive
  Issue Type: Bug
  Components: Authorization, Security
Affects Versions: 0.14.0
Reporter: Chaoyu Tang
Assignee: Chaoyu Tang


hive.warehouse.subdir.inherit.perms doesn't work correctly in CTAS. When it is 
set to true, a table created with CREATE TABLE ... AS SELECT does not inherit 
its parent directory's group and permission mode. It can be easily 
reproduced:
==
hive> dfs -ls -R /user/hive/warehouse;
drwxrwx--T   - hive   hive0 2014-03-30 17:44 
/user/hive/warehouse/ctas.db
drwxr-xr-x   - hive   hive0 2014-03-30 17:20 
/user/hive/warehouse/ctas_src_tbl
-rw-r--r--   3 hive   hive46059 2014-03-30 17:20 
/user/hive/warehouse/ctas_src_tbl/00_0

hive> create table ctas.test_perm as select * from ctas_src_tbl;

hive> dfs -ls -R /user/hive/warehouse;  
drwxrwx--T   - hive   hive0 2014-03-30 17:46 
/user/hive/warehouse/ctas.db
drwxr-xr-x   - hive   supergroup  0 2014-03-30 17:46 
/user/hive/warehouse/ctas.db/test_perm
-rw-r--r--   3 hive   supergroup  46059 2014-03-30 17:46 
/user/hive/warehouse/ctas.db/test_perm/00_0
drwxr-xr-x   - hive   hive0 2014-03-30 17:20 
/user/hive/warehouse/ctas_src_tbl
-rw-r--r--   3 hive   hive46059 2014-03-30 17:20 
/user/hive/warehouse/ctas_src_tbl/00_0
==
The created table does not inherit the group (hive) and permission mode (770) 
of its database ctas; instead it takes the default HDFS group (supergroup) and 
permission mode (755).
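The intended inheritance can be sketched on a local file system. This is an
illustrative Python sketch only (Hive implements this in Java against HDFS, and
group inheritance would additionally need a chown): a newly created child
directory copies its parent's permission bits, which is what
hive.warehouse.subdir.inherit.perms is meant to do for warehouse
subdirectories.

```python
import os
import stat
import tempfile

def mkdir_inheriting_perms(parent, name):
    """Create a child directory that inherits the parent's permission bits."""
    child = os.path.join(parent, name)
    os.mkdir(child)
    # Copy only the permission bits (not file type bits) from the parent.
    mode = stat.S_IMODE(os.stat(parent).st_mode)
    os.chmod(child, mode)
    return child

# Simulate a database directory with restrictive 770 permissions.
parent = tempfile.mkdtemp()
os.chmod(parent, 0o770)
child = mkdir_inheriting_perms(parent, "test_perm")
print(oct(stat.S_IMODE(os.stat(child).st_mode)))  # 0o770
```

Without the inheritance step, the child would keep whatever the process umask
produced, which is the 755-style default seen in the bug report.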





[jira] [Updated] (HIVE-6792) hive.warehouse.subdir.inherit.perms doesn't work correctly in CTAS

2014-03-30 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-6792:
--

Attachment: HIVE-6792.patch

Please review the attached patch. With it, the newly created table (test_perm) 
and its data file inherit the group and permission mode from their parent 
ctas.db:
==
hive> dfs -ls -R /user/hive/warehouse;  
drwxrwx--T   - hive   hive0 2014-03-30 17:56 
/user/hive/warehouse/ctas.db
drwxrwx--T   - hive   hive0 2014-03-30 17:56 
/user/hive/warehouse/ctas.db/test_perm
-rw-rw   3 hive   hive46059 2014-03-30 17:56 
/user/hive/warehouse/ctas.db/test_perm/00_0
drwxr-xr-x   - hive   hive0 2014-03-30 17:20 
/user/hive/warehouse/ctas_src_tbl
-rw-r--r--   3 hive   hive46059 2014-03-30 17:20 
/user/hive/warehouse/ctas_src_tbl/00_0

> hive.warehouse.subdir.inherit.perms doesn't work correctly in CTAS
> --
>
> Key: HIVE-6792
> URL: https://issues.apache.org/jira/browse/HIVE-6792
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization, Security
>Affects Versions: 0.14.0
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-6792.patch
>
>
> hive.warehouse.subdir.inherit.perms doesn't work correctly in CTAS. When it 
> is set to true, the table created using create table .. as select.. does not 
> inherit its parent directory's group and permission mode. It can be easily 
> reproduced:
> ==
> hive> dfs -ls -R /user/hive/warehouse;
> drwxrwx--T   - hive   hive0 2014-03-30 17:44 
> /user/hive/warehouse/ctas.db
> drwxr-xr-x   - hive   hive0 2014-03-30 17:20 
> /user/hive/warehouse/ctas_src_tbl
> -rw-r--r--   3 hive   hive46059 2014-03-30 17:20 
> /user/hive/warehouse/ctas_src_tbl/00_0
> hive> create table ctas.test_perm as select * from ctas_src_tbl;
> 
> hive> dfs -ls -R /user/hive/warehouse;  
> drwxrwx--T   - hive   hive0 2014-03-30 17:46 
> /user/hive/warehouse/ctas.db
> drwxr-xr-x   - hive   supergroup  0 2014-03-30 17:46 
> /user/hive/warehouse/ctas.db/test_perm
> -rw-r--r--   3 hive   supergroup  46059 2014-03-30 17:46 
> /user/hive/warehouse/ctas.db/test_perm/00_0
> drwxr-xr-x   - hive   hive0 2014-03-30 17:20 
> /user/hive/warehouse/ctas_src_tbl
> -rw-r--r--   3 hive   hive46059 2014-03-30 17:20 
> /user/hive/warehouse/ctas_src_tbl/00_0
> ==
> The created table does not inherit its database ctas's group hive and 
> permission mode 770, instead it takes the default group (supergroup) and 
> permission mode (755) in hdfs





[jira] [Updated] (HIVE-6792) hive.warehouse.subdir.inherit.perms doesn't work correctly in CTAS

2014-03-31 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-6792:
--

Attachment: HIVE-6792-1.patch

Thanks, Szehon, for pointing that out. I changed it to use the member variable 
conf instead; please see the attached HIVE-6792-1.patch.

> hive.warehouse.subdir.inherit.perms doesn't work correctly in CTAS
> --
>
> Key: HIVE-6792
> URL: https://issues.apache.org/jira/browse/HIVE-6792
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization, Security
>Affects Versions: 0.14.0
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-6792-1.patch, HIVE-6792.patch
>
>
> hive.warehouse.subdir.inherit.perms doesn't work correctly in CTAS. When it 
> is set to true, the table created using create table .. as select.. does not 
> inherit its parent directory's group and permission mode. It can be easily 
> reproduced:
> ==
> hive> dfs -ls -R /user/hive/warehouse;
> drwxrwx--T   - hive   hive0 2014-03-30 17:44 
> /user/hive/warehouse/ctas.db
> drwxr-xr-x   - hive   hive0 2014-03-30 17:20 
> /user/hive/warehouse/ctas_src_tbl
> -rw-r--r--   3 hive   hive46059 2014-03-30 17:20 
> /user/hive/warehouse/ctas_src_tbl/00_0
> hive> create table ctas.test_perm as select * from ctas_src_tbl;
> 
> hive> dfs -ls -R /user/hive/warehouse;  
> drwxrwx--T   - hive   hive0 2014-03-30 17:46 
> /user/hive/warehouse/ctas.db
> drwxr-xr-x   - hive   supergroup  0 2014-03-30 17:46 
> /user/hive/warehouse/ctas.db/test_perm
> -rw-r--r--   3 hive   supergroup  46059 2014-03-30 17:46 
> /user/hive/warehouse/ctas.db/test_perm/00_0
> drwxr-xr-x   - hive   hive0 2014-03-30 17:20 
> /user/hive/warehouse/ctas_src_tbl
> -rw-r--r--   3 hive   hive46059 2014-03-30 17:20 
> /user/hive/warehouse/ctas_src_tbl/00_0
> ==
> The created table does not inherit its database ctas's group hive and 
> permission mode 770, instead it takes the default group (supergroup) and 
> permission mode (755) in hdfs





[jira] [Updated] (HIVE-6792) hive.warehouse.subdir.inherit.perms doesn't work correctly in CTAS

2014-03-31 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-6792:
--

Status: Patch Available  (was: Open)

> hive.warehouse.subdir.inherit.perms doesn't work correctly in CTAS
> --
>
> Key: HIVE-6792
> URL: https://issues.apache.org/jira/browse/HIVE-6792
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization, Security
>Affects Versions: 0.14.0
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-6792-1.patch, HIVE-6792.patch
>
>
> hive.warehouse.subdir.inherit.perms doesn't work correctly in CTAS. When it 
> is set to true, the table created using create table .. as select.. does not 
> inherit its parent directory's group and permission mode. It can be easily 
> reproduced:
> ==
> hive> dfs -ls -R /user/hive/warehouse;
> drwxrwx--T   - hive   hive0 2014-03-30 17:44 
> /user/hive/warehouse/ctas.db
> drwxr-xr-x   - hive   hive0 2014-03-30 17:20 
> /user/hive/warehouse/ctas_src_tbl
> -rw-r--r--   3 hive   hive46059 2014-03-30 17:20 
> /user/hive/warehouse/ctas_src_tbl/00_0
> hive> create table ctas.test_perm as select * from ctas_src_tbl;
> 
> hive> dfs -ls -R /user/hive/warehouse;  
> drwxrwx--T   - hive   hive0 2014-03-30 17:46 
> /user/hive/warehouse/ctas.db
> drwxr-xr-x   - hive   supergroup  0 2014-03-30 17:46 
> /user/hive/warehouse/ctas.db/test_perm
> -rw-r--r--   3 hive   supergroup  46059 2014-03-30 17:46 
> /user/hive/warehouse/ctas.db/test_perm/00_0
> drwxr-xr-x   - hive   hive0 2014-03-30 17:20 
> /user/hive/warehouse/ctas_src_tbl
> -rw-r--r--   3 hive   hive46059 2014-03-30 17:20 
> /user/hive/warehouse/ctas_src_tbl/00_0
> ==
> The created table does not inherit its database ctas's group hive and 
> permission mode 770, instead it takes the default group (supergroup) and 
> permission mode (755) in hdfs





[jira] [Commented] (HIVE-6245) HS2 creates DBs/Tables with wrong ownership when HMS setugi is true

2014-04-14 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13969164#comment-13969164
 ] 

Chaoyu Tang commented on HIVE-6245:
---

[~thejas] No worries, as long as it has been addressed in the recent HIVE-6312 
and HIVE-6864. I am going to verify them against my case as well. Thanks!

> HS2 creates DBs/Tables with wrong ownership when HMS setugi is true
> ---
>
> Key: HIVE-6245
> URL: https://issues.apache.org/jira/browse/HIVE-6245
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.12.0
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-6245.patch
>
>
> The case with the following settings is valid but does not work correctly in 
> the current HS2:
> ==
> hive.server2.authentication=NONE (or LDAP)
> hive.server2.enable.doAs= true
> hive.metastore.sasl.enabled=false
> hive.metastore.execute.setugi=true
> ==
> Ideally, HS2 should impersonate the logged-in user (from Beeline or a JDBC 
> application) and create DBs/Tables with that user's ownership.





[jira] [Assigned] (HIVE-7441) Custom partition scheme gets rewritten with hive scheme upon concatenate

2014-08-04 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang reassigned HIVE-7441:
-

Assignee: Chaoyu Tang

It is a defect in Hive; I think the concatenated partition files should stay 
under the original partition location when both locations are on the same file 
system.
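The fix direction above can be sketched in a few lines. This is an illustrative
Python sketch with hypothetical names, not Hive's actual Java merge-task logic:
derive the merge output directory from the partition's registered location when
one exists, instead of always rebuilding the default key=value layout under the
table directory.

```python
import posixpath

def merge_output_dir(partition_location, table_location, part_spec):
    """Pick the output directory for a partition-file merge (concatenate)."""
    # Default hive-style layout: <table>/<key>=<value>/...
    default = posixpath.join(
        table_location, *("%s=%s" % kv for kv in part_spec.items()))
    # Honor a custom registered location so the partition is not "lost":
    # the metastore still points at the old path after the merge.
    return partition_location if partition_location else default

print(merge_output_dir("/j1/part1", "/j1", {"b": "part1"}))  # /j1/part1
print(merge_output_dir(None, "/j1", {"b": "part1"}))         # /j1/b=part1
```

The reported bug corresponds to always taking the `default` branch, which moves
the data to /j1/b=part1 while the metastore keeps pointing at /j1/part1.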

> Custom partition scheme gets rewritten with hive scheme upon concatenate
> 
>
> Key: HIVE-7441
> URL: https://issues.apache.org/jira/browse/HIVE-7441
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.10.0, 0.11.0, 0.12.0
> Environment: CDH4.5 and CDH5.0
>Reporter: Johndee Burks
>Assignee: Chaoyu Tang
>Priority: Minor
>
> Take the given data directories below. Each directory contains a data file in 
> RC (RCFile) format that holds only the single character "1".
> {code}
> /j1/part1
> /j1/part2
> {code}
> Create the table over the directories using the following command:
> {code}
> create table j1 (a int) partitioned by (b string) stored as rcfile location 
> '/j1' ;
> {code}
> I add these directories to a table for example j1 using the following 
> commands:
> {code}
> alter table j1 add partition (b = 'part1') location '/j1/part1';
> alter table j1 add partition (b = 'part2') location '/j1/part2';
> {code}
> I then run the following command on the first partition: 
> {code}
> alter table j1 partition (b = 'part1') concatenate;
> {code}
> Hive changes the partition location on HDFS from
> {code}
> /j1/part1
> {code}
> to 
> {code}
> /j1/b=part1
> {code}
> However it does not update the partition location in the metastore and 
> partition is then lost to the table. It is hard to find this out until you 
> start querying your data and notice there is missing data. The table even 
> still shows the partition when you do "show partitions".





[jira] [Updated] (HIVE-7441) Custom partition scheme gets rewritten with hive scheme upon concatenate

2014-08-04 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-7441:
--

Fix Version/s: 0.14.0
   Status: Patch Available  (was: Open)

> Custom partition scheme gets rewritten with hive scheme upon concatenate
> 
>
> Key: HIVE-7441
> URL: https://issues.apache.org/jira/browse/HIVE-7441
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.12.0, 0.11.0, 0.10.0
> Environment: CDH4.5 and CDH5.0
>Reporter: Johndee Burks
>Assignee: Chaoyu Tang
>Priority: Minor
> Fix For: 0.14.0
>
> Attachments: HIVE-7441.patch
>
>
> Take the given data directories below. Each directory contains a data file in 
> RC (RCFile) format that holds only the single character "1".
> {code}
> /j1/part1
> /j1/part2
> {code}
> Create the table over the directories using the following command:
> {code}
> create table j1 (a int) partitioned by (b string) stored as rcfile location 
> '/j1' ;
> {code}
> I add these directories to a table for example j1 using the following 
> commands:
> {code}
> alter table j1 add partition (b = 'part1') location '/j1/part1';
> alter table j1 add partition (b = 'part2') location '/j1/part2';
> {code}
> I then run the following command on the first partition: 
> {code}
> alter table j1 partition (b = 'part1') concatenate;
> {code}
> Hive changes the partition location on HDFS from
> {code}
> /j1/part1
> {code}
> to 
> {code}
> /j1/b=part1
> {code}
> However it does not update the partition location in the metastore and 
> partition is then lost to the table. It is hard to find this out until you 
> start querying your data and notice there is missing data. The table even 
> still shows the partition when you do "show partitions".





[jira] [Updated] (HIVE-7441) Custom partition scheme gets rewritten with hive scheme upon concatenate

2014-08-04 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-7441:
--

Attachment: HIVE-7441.patch

Please review the attached patch in https://reviews.apache.org/r/24284/, thanks.

> Custom partition scheme gets rewritten with hive scheme upon concatenate
> 
>
> Key: HIVE-7441
> URL: https://issues.apache.org/jira/browse/HIVE-7441
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.10.0, 0.11.0, 0.12.0
> Environment: CDH4.5 and CDH5.0
>Reporter: Johndee Burks
>Assignee: Chaoyu Tang
>Priority: Minor
> Fix For: 0.14.0
>
> Attachments: HIVE-7441.patch
>
>
> Take the given data directories below. Each directory contains a data file in 
> RC (RCFile) format that holds only the single character "1".
> {code}
> /j1/part1
> /j1/part2
> {code}
> Create the table over the directories using the following command:
> {code}
> create table j1 (a int) partitioned by (b string) stored as rcfile location 
> '/j1' ;
> {code}
> I add these directories to a table for example j1 using the following 
> commands:
> {code}
> alter table j1 add partition (b = 'part1') location '/j1/part1';
> alter table j1 add partition (b = 'part2') location '/j1/part2';
> {code}
> I then run the following command on the first partition: 
> {code}
> alter table j1 partition (b = 'part1') concatenate;
> {code}
> Hive changes the partition location on HDFS from
> {code}
> /j1/part1
> {code}
> to 
> {code}
> /j1/b=part1
> {code}
> However it does not update the partition location in the metastore and 
> partition is then lost to the table. It is hard to find this out until you 
> start querying your data and notice there is missing data. The table even 
> still shows the partition when you do "show partitions".





[jira] [Updated] (HIVE-7441) Custom partition scheme gets rewritten with hive scheme upon concatenate

2014-08-05 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-7441:
--

Attachment: HIVE-7441.1.patch

Thanks, Szehon. I made the changes and uploaded the new patch to the review 
board. I also fixed the tests; I think the failures came from the ordering of 
the selected rows.

> Custom partition scheme gets rewritten with hive scheme upon concatenate
> 
>
> Key: HIVE-7441
> URL: https://issues.apache.org/jira/browse/HIVE-7441
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.10.0, 0.11.0, 0.12.0
> Environment: CDH4.5 and CDH5.0
>Reporter: Johndee Burks
>Assignee: Chaoyu Tang
>Priority: Minor
> Fix For: 0.14.0
>
> Attachments: HIVE-7441.1.patch, HIVE-7441.patch
>
>
> Take the given data directories below. Each directory contains a data file in 
> RC (RCFile) format that holds only the single character "1".
> {code}
> /j1/part1
> /j1/part2
> {code}
> Create the table over the directories using the following command:
> {code}
> create table j1 (a int) partitioned by (b string) stored as rcfile location 
> '/j1' ;
> {code}
> I add these directories to a table for example j1 using the following 
> commands:
> {code}
> alter table j1 add partition (b = 'part1') location '/j1/part1';
> alter table j1 add partition (b = 'part2') location '/j1/part2';
> {code}
> I then run the following command on the first partition: 
> {code}
> alter table j1 partition (b = 'part1') concatenate;
> {code}
> Hive changes the partition location on HDFS from
> {code}
> /j1/part1
> {code}
> to 
> {code}
> /j1/b=part1
> {code}
> However it does not update the partition location in the metastore and 
> partition is then lost to the table. It is hard to find this out until you 
> start querying your data and notice there is missing data. The table even 
> still shows the partition when you do "show partitions".





[jira] [Created] (HIVE-7635) Query having same aggregate functions but different case throws IndexOutOfBoundsException

2014-08-06 Thread Chaoyu Tang (JIRA)
Chaoyu Tang created HIVE-7635:
-

 Summary: Query having same aggregate functions but different case 
throws IndexOutOfBoundsException
 Key: HIVE-7635
 URL: https://issues.apache.org/jira/browse/HIVE-7635
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.13.1
Reporter: Chaoyu Tang
Assignee: Chaoyu Tang
 Fix For: 0.14.0


A query that uses the same aggregate function (e.g. count) in different letter 
cases does not work and throws an IndexOutOfBoundsException.
{code}
Query:
SELECT key, COUNT(value) FROM src GROUP BY key HAVING count(value) >= 4
---
Error log:
14/08/06 11:00:45 ERROR ql.Driver: FAILED: IndexOutOfBoundsException Index: 2, 
Size: 2
java.lang.IndexOutOfBoundsException: Index: 2, Size: 2
at java.util.ArrayList.RangeCheck(ArrayList.java:547)
at java.util.ArrayList.get(ArrayList.java:322)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanReduceSinkOperator(SemanticAnalyzer.java:4173)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanMapAggrNoSkew(SemanticAnalyzer.java:5165)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8337)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9178)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9431)
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:207)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:414)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:310)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1023)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1088)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:960)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:950)
at 
org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:265)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:217)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:427)
at 
org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:800)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:694)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:633)
{code}
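A plausible illustration of this class of failure (this is a hypothetical sketch, not Hive's actual SemanticAnalyzer code): if one compilation pass deduplicates aggregation expressions case-insensitively while another indexes them case-sensitively, the two lists disagree in size and an index computed from one overflows the other.

```java
import java.util.*;

public class CaseSensitiveAggDemo {
    // Case-sensitive dedup: "COUNT(value)" and "count(value)" stay distinct.
    static List<String> distinctCaseSensitive(List<String> aggs) {
        return new ArrayList<>(new LinkedHashSet<>(aggs));
    }

    // Case-insensitive dedup: the two spellings collapse to one entry.
    static List<String> distinctCaseInsensitive(List<String> aggs) {
        Set<String> seen = new LinkedHashSet<>();
        List<String> out = new ArrayList<>();
        for (String a : aggs) {
            if (seen.add(a.toLowerCase())) {
                out.add(a);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> aggs = Arrays.asList("COUNT(value)", "count(value)");
        List<String> sensitive = distinctCaseSensitive(aggs);     // size 2
        List<String> insensitive = distinctCaseInsensitive(aggs); // size 1
        System.out.println(sensitive.size() + " vs " + insensitive.size());
        // Indexing the smaller list with a position derived from the larger
        // one reproduces the style of failure seen in the bug report.
        try {
            insensitive.get(sensitive.size() - 1);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("IndexOutOfBoundsException caught");
        }
    }
}
```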



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7635) Query having same aggregate functions but different case throws IndexOutOfBoundsException

2014-08-06 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-7635:
--

Attachment: HIVE-7635.patch

Please review the attached patch. I have also posted it on the review 
board: https://reviews.apache.org/r/24404/. Thanks.

> Query having same aggregate functions but different case throws 
> IndexOutOfBoundsException
> -
>
> Key: HIVE-7635
> URL: https://issues.apache.org/jira/browse/HIVE-7635
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.13.1
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 0.14.0
>
> Attachments: HIVE-7635.patch
>
>
> A query having same aggregate functions (e.g. count) but in different case  
> does not work and throws IndexOutOfBoundsException.
> {code}
> Query:
> SELECT key, COUNT(value) FROM src GROUP BY key HAVING count(value) >= 4
> ---
> Error log:
> 14/08/06 11:00:45 ERROR ql.Driver: FAILED: IndexOutOfBoundsException Index: 
> 2, Size: 2
> java.lang.IndexOutOfBoundsException: Index: 2, Size: 2
>   at java.util.ArrayList.RangeCheck(ArrayList.java:547)
>   at java.util.ArrayList.get(ArrayList.java:322)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanReduceSinkOperator(SemanticAnalyzer.java:4173)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanMapAggrNoSkew(SemanticAnalyzer.java:5165)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8337)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9178)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9431)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:207)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:414)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:310)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1023)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1088)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:960)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:950)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:265)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:217)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:427)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:800)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:694)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:633)
> {code}





[jira] [Updated] (HIVE-7635) Query having same aggregate functions but different case throws IndexOutOfBoundsException

2014-08-06 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-7635:
--

Status: Patch Available  (was: Open)






[jira] [Commented] (HIVE-7635) Query having same aggregate functions but different case throws IndexOutOfBoundsException

2014-08-06 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14088230#comment-14088230
 ] 

Chaoyu Tang commented on HIVE-7635:
---

Sure. Thanks. The link is https://reviews.apache.org/r/24404/






[jira] [Updated] (HIVE-7635) Query having same aggregate functions but different case throws IndexOutOfBoundsException

2014-08-07 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-7635:
--

Attachment: HIVE-7635.1.patch

Fixed the failing test (caused by a missing update to having.q.out for Tez). 
Uploaded the new patch here and also to RB.






[jira] [Commented] (HIVE-7441) Custom partition scheme gets rewritten with hive scheme upon concatenate

2014-08-12 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14095108#comment-14095108
 ] 

Chaoyu Tang commented on HIVE-7441:
---

[~szehon] Thanks for reviewing the patch and committing it. Appreciate it.

> Custom partition scheme gets rewritten with hive scheme upon concatenate
> 
>
> Key: HIVE-7441
> URL: https://issues.apache.org/jira/browse/HIVE-7441
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.10.0, 0.11.0, 0.12.0
> Environment: CDH4.5 and CDH5.0
>Reporter: Johndee Burks
>Assignee: Chaoyu Tang
>Priority: Minor
> Fix For: 0.14.0
>
> Attachments: HIVE-7441.1.patch, HIVE-7441.patch
>
>
> Take the given data directories below. Each directory contains a data file 
> that is in RCFile format and contains only the single character "1".
> {code}
> /j1/part1
> /j1/part2
> {code}
> Create the table over the directories using the following command:
> {code}
> create table j1 (a int) partitioned by (b string) stored as rcfile location 
> '/j1' ;
> {code}
> I add these directories to a table for example j1 using the following 
> commands:
> {code}
> alter table j1 add partition (b = 'part1') location '/j1/part1';
> alter table j1 add partition (b = 'part2') location '/j1/part2';
> {code}
> I then do the following command to the first partition: 
> {code}
> alter table j1 partition (b = 'part1') concatenate;
> {code}
> Hive changes the partition location from on hdfs
> {code}
> /j1/part1
> {code}
> to 
> {code}
> /j1/b=part1
> {code}
> However, it does not update the partition location in the metastore, and the 
> partition is then lost to the table. It is hard to discover this until you 
> start querying your data and notice data is missing. The table even 
> still shows the partition when you run "show partitions".
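The rewritten path follows Hive's default partition-naming convention, which the custom location does not; a tiny sketch of that convention (the helper name is mine, not Hive's):

```java
public class PartitionPathDemo {
    // Hive's default layout places partition data under "col=value"
    // beneath the table location; a custom location such as /j1/part1
    // does not follow this convention, which is why concatenate moved it.
    static String defaultPartitionPath(String tableLocation, String col, String val) {
        return tableLocation + "/" + col + "=" + val;
    }

    public static void main(String[] args) {
        // Custom location /j1/part1 versus the default convention:
        System.out.println(defaultPartitionPath("/j1", "b", "part1")); // /j1/b=part1
    }
}
```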





[jira] [Commented] (HIVE-7635) Query having same aggregate functions but different case throws IndexOutOfBoundsException

2014-08-12 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14095110#comment-14095110
 ] 

Chaoyu Tang commented on HIVE-7635:
---

[~szehon] & [~ashutoshc] Thanks for reviewing and committing the patch.






[jira] [Assigned] (HIVE-1363) 'SHOW TABLE EXTENDED LIKE' command does not strip single/double quotes

2014-09-05 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang reassigned HIVE-1363:
-

Assignee: Chaoyu Tang  (was: Carl Steinbach)

I am running into this issue and am picking up this JIRA to work on.

> 'SHOW TABLE EXTENDED LIKE' command does not strip single/double quotes
> --
>
> Key: HIVE-1363
> URL: https://issues.apache.org/jira/browse/HIVE-1363
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.5.0
>Reporter: Carl Steinbach
>Assignee: Chaoyu Tang
>
> {code}
> hive> SHOW TABLE EXTENDED LIKE pokes;
> OK
> tableName:pokes
> owner:carl
> location:hdfs://localhost/user/hive/warehouse/pokes
> inputformat:org.apache.hadoop.mapred.TextInputFormat
> outputformat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
> columns:struct columns { i32 num}
> partitioned:false
> partitionColumns:
> totalNumberFiles:0
> totalFileSize:0
> maxFileSize:0
> minFileSize:0
> lastAccessTime:0
> lastUpdateTime:1274517075221
> hive> SHOW TABLE EXTENDED LIKE "p*";
> FAILED: Error in metadata: MetaException(message:Got exception: 
> javax.jdo.JDOUserException ')' expected at character 54 in "database.name == 
> dbName && ( tableName.matches("(?i)"p.*""))")
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask
> hive> SHOW TABLE EXTENDED LIKE 'p*';
> OK
> hive> SHOW TABLE EXTENDED LIKE `p*`;
> OK
> tableName:pokes
> owner:carl
> location:hdfs://localhost/user/hive/warehouse/pokes
> inputformat:org.apache.hadoop.mapred.TextInputFormat
> outputformat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
> columns:struct columns { i32 num}
> partitioned:false
> partitionColumns:
> totalNumberFiles:0
> totalFileSize:0
> maxFileSize:0
> minFileSize:0
> lastAccessTime:0
> lastUpdateTime:1274517075221
> {code}
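The failure above suggests the quoted pattern reaches the metastore with its quotes intact; a minimal sketch of the intended behavior, stripping one layer of quotes before the glob is turned into a case-insensitive regex (method names are illustrative, not Hive's actual code):

```java
public class ShowTablePattern {
    // Strip one layer of matching single quotes, double quotes, or backticks.
    static String unquote(String s) {
        if (s.length() >= 2) {
            char first = s.charAt(0), last = s.charAt(s.length() - 1);
            if (first == last && (first == '\'' || first == '"' || first == '`')) {
                return s.substring(1, s.length() - 1);
            }
        }
        return s;
    }

    // Convert the SHOW TABLE glob ('*' wildcard) to a case-insensitive regex,
    // mirroring the "(?i)p.*" pattern seen in the error message above.
    static String toRegex(String pattern) {
        return "(?i)" + unquote(pattern).replace("*", ".*");
    }

    public static void main(String[] args) {
        System.out.println(toRegex("\"p*\""));                // (?i)p.*
        System.out.println("pokes".matches(toRegex("'p*'"))); // true
    }
}
```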



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-1363) 'SHOW TABLE EXTENDED LIKE' command does not strip single/double quotes

2014-09-05 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-1363:
--
Attachment: HIVE-1363.patch

The patch is posted for review: https://reviews.apache.org/r/25412/
Thanks in advance.






[jira] [Updated] (HIVE-1363) 'SHOW TABLE EXTENDED LIKE' command does not strip single/double quotes

2014-09-05 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-1363:
--
Fix Version/s: 0.14.0
Affects Version/s: 0.14.0
   Status: Patch Available  (was: Open)






[jira] [Updated] (HIVE-1363) 'SHOW TABLE EXTENDED LIKE' command does not strip single/double quotes

2014-09-06 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-1363:
--
Attachment: HIVE-1363.1.patch

Thanks [~xuefuz] for the review. I made the changes based on the comments and 
uploaded a new patch here.






[jira] [Updated] (HIVE-1363) 'SHOW TABLE EXTENDED LIKE' command does not strip single/double quotes

2014-09-06 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-1363:
--
Attachment: HIVE-1363.2.patch

The describe_table_json.q test failure is related to the change in this 
patch. Actually, the original expected test output seemed incorrect: for the 
query SHOW TABLE EXTENDED LIKE 'json*', it returned an empty result and its 
JSON output was {"tables":[]}. The expected result should have one entry for 
table jsontable, and the output should look like the following, which is 
masked in its q.out file.
==
{"tables":[{"minFileSize":0,"totalNumberFiles":0,"location":"file:/user/hive/warehouse/apache/jsontable","outputFormat":"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat","lastAccessTime":0,"lastUpdateTime":1410049821000,"columns":[{"name":"key","type":"int"},{"name":"value","type":"string"}],"maxFileSize":0,"partitioned":false,"tableName":"jsontable","owner":"ctang","inputFormat":"org.apache.hadoop.mapred.TextInputFormat","totalFileSize":0}]}
==

Changed describe_table_json.q.out to reflect the expected query output and 
uploaded a new patch.






[jira] [Created] (HIVE-5320) Querying a table with nested struct type over JSON data results in errors

2013-09-19 Thread Chaoyu Tang (JIRA)
Chaoyu Tang created HIVE-5320:
-

 Summary: Querying a table with nested struct type over JSON data 
results in errors
 Key: HIVE-5320
 URL: https://issues.apache.org/jira/browse/HIVE-5320
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 0.9.0
Reporter: Chaoyu Tang


Querying a table with nested_struct datatype like
==
create table nest_struct_tbl (col1 string, col2 array>>>) ROW FORMAT SERDE 
'org.openx.data.jsonserde.JsonSerDe'; 
==
over JSON data causes errors including java.lang.IndexOutOfBoundsException, or 
corrupted data. 
The JsonSerDe used is 
json-serde-1.1.4.jar/json-serde-1.1.4-jar-dependencies.jar.

The cause is that the method
public List getStructFieldsDataAsList(Object o) 
in JsonStructObjectInspector.java 
returns a list referencing a static ArrayList "values".
So the local variable 'list' in the serialize method of Hive's LazySimpleSerDe 
class holds the same reference across its recursive calls, and its element 
values keep being overwritten in the STRUCT case.

Solutions:
1. Fix in JsonSerDe: change the field 'values' in 
java.org.openx.data.jsonserde.objectinspector.JsonStructObjectInspector.java
to instance scope.
Filed a ticket against JsonSerDe 
(https://github.com/rcongiu/Hive-JSON-Serde/issues/31)
2. Ideally, in the serialize method of class LazySimpleSerDe, we should 
defensively save a copy of the list returned from list = 
soi.getStructFieldsDataAsList(obj) when the soi is an instance of 
JsonStructObjectInspector, so that the recursive calls to serialize can work 
properly regardless of the extended SerDe implementation.
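The aliasing bug and the defensive-copy fix can be demonstrated in isolation (a self-contained mimic of the shared static list, not the actual JsonSerDe or LazySimpleSerDe code):

```java
import java.util.*;

public class SharedListBug {
    // Mimics the buggy inspector: every call returns the SAME list instance,
    // mutated in place (as the static "values" field did in JsonSerDe).
    static final List<String> VALUES = new ArrayList<>();

    static List<String> structFieldsData(String... fields) {
        VALUES.clear();
        VALUES.addAll(Arrays.asList(fields));
        return VALUES; // aliased: all callers share one list
    }

    public static void main(String[] args) {
        List<String> outer = structFieldsData("a", "b");
        // A "recursive" call for a nested struct overwrites the outer data:
        List<String> inner = structFieldsData("x");
        System.out.println(outer == inner);  // true: same reference
        System.out.println(outer);           // [x]: outer fields were clobbered

        // Solution 2 above: defensively copy before recursing.
        List<String> safeOuter = new ArrayList<>(structFieldsData("a", "b"));
        structFieldsData("x");
        System.out.println(safeOuter);       // [a, b]: preserved
    }
}
```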

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4223) LazySimpleSerDe will throw IndexOutOfBoundsException in nested structs of hive table

2013-09-19 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13771904#comment-13771904
 ] 

Chaoyu Tang commented on HIVE-4223:
---

I was able to reproduce a similar issue, but with JsonSerDe 1.1.4 
(json-serde-1.1.4.jar/json-serde-1.1.4-jar-dependencies.jar). See HIVE-5320 for 
details.


> LazySimpleSerDe will throw IndexOutOfBoundsException in nested structs of 
> hive table
> 
>
> Key: HIVE-4223
> URL: https://issues.apache.org/jira/browse/HIVE-4223
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 0.9.0
> Environment: Hive 0.9.0
>Reporter: Yong Zhang
> Attachments: nest_struct.data
>
>
> The LazySimpleSerDe will throw IndexOutOfBoundsException if the column 
> structure is struct containing array of struct. 
> I have a table with one column defined like this:
> columnA
> array <
> struct<
>col1:primiType,
>col2:primiType,
>col3:primiType,
>col4:primiType,
>col5:primiType,
>col6:primiType,
>col7:primiType,
>col8:array<
> struct<
>   col1:primiType,
>   col2:primiType,
>   col3:primiType,
>   col4:primiType,
>   col5:primiType,
>   col6:primiType,
>   col7:primiType,
>   col8:primiType,
>   col9:primiType
> >
>>
> >
> >
> In this example, the outside struct has 8 columns (including the array), and 
> the inner struct has 9 columns. As long as the outside struct has a SMALLER 
> column count than the inner struct, I think we will get the following 
> exception as the stack trace in LazySimpleSerDe when it tries to serialize a row:
> Caused by: java.lang.IndexOutOfBoundsException: Index: 8, Size: 8
> at java.util.ArrayList.RangeCheck(ArrayList.java:547)
> at java.util.ArrayList.get(ArrayList.java:322)
> at 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:485)
> at 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:443)
> at 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serializeField(LazySimpleSerDe.java:381)
> at 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:365)
> at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:568)
> at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
> at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
> at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
> at 
> org.apache.hadoop.hive.ql.exec.FilterOperator.processOp(FilterOperator.java:132)
> at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
> at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:83)
> at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:531)
> ... 9 more
> I am not sure about the exact reason for this problem. I believe that 
> public static void serialize(ByteStream.Output out, Object obj, 
> ObjectInspector objInspector, byte[] separators, int level, Text 
> nullSequence, boolean escaped, byte escapeChar, boolean[] needsEscape) 
> recursively invokes itself when facing a nested structure. But for the 
> nested struct structure, the list references get mixed up, and size() 
> returns wrong data.
> In the example case I faced, for these 2 lines:
>   List fields = soi.getAllStructFieldRefs();
>   list = soi.getStructFieldsDataAsList(obj);
> my StructObjectInspector (soi) returns the CORRECT data from 
> getAllStructFieldRefs() and getStructFieldsDataAsList(). For example, for 
> one row, for the outer 8-column struct, I have 2 elements in the inner 
> array of structs, and each element has 9 columns (as there are 9 columns 
> in the inner struct). At runtime, after I added more logging to 
> LazySimpleSerDe, I see the following behavior in the log:
> for the 8 outer columns, loop
>   for the 9 inner columns, loop to serialize
>   for the 9 inner columns, loop to serialize
> code breaks
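The failure mode described above can be sketched in plain Java. This is a hypothetical stand-in (class and method names are illustrative, not Hive's or JsonSerDe's actual code): an object inspector that hands every caller the same static list, so a recursive call for a nested struct silently changes the list the outer loop is still iterating over.

```java
import java.util.ArrayList;
import java.util.List;

public class SharedListDemo {
    // Hypothetical inspector that, like the buggy behavior described above,
    // returns the same static list to every caller instead of a fresh copy.
    static final List<Object> VALUES = new ArrayList<>();

    static List<Object> getStructFieldsDataAsList(Object[] struct) {
        VALUES.clear();
        for (Object field : struct) {
            VALUES.add(field);
        }
        return VALUES; // an alias to the shared list, not a copy
    }

    public static void main(String[] args) {
        // Outer struct with 3 fields.
        List<Object> outer = getStructFieldsDataAsList(new Object[]{"a", "b", "c"});
        // A recursive call for a nested 1-field struct reuses the same list,
        // so the outer loop's list contents and size() silently change.
        List<Object> inner = getStructFieldsDataAsList(new Object[]{"x"});
        System.out.println(outer == inner); // true: same object
        System.out.println(outer.size());   // 1, no longer 3
    }
}
```

Once the outer loop indexes past the shrunken list's new size, it throws the IndexOutOfBoundsException seen in the stack trace.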

[jira] [Updated] (HIVE-5320) Querying a table with nested struct type over JSON data results in errors

2013-09-19 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-5320:
--

Attachment: HIVE-5320.patch

> Querying a table with nested struct type over JSON data results in errors
> -
>
> Key: HIVE-5320
> URL: https://issues.apache.org/jira/browse/HIVE-5320
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 0.9.0
>Reporter: Chaoyu Tang
> Attachments: HIVE-5320.patch
>
>
> Querying a table with nested_struct datatype like
> ==
> create table nest_struct_tbl (col1 string, col2 array a2:array>>>) ROW FORMAT SERDE 
> 'org.openx.data.jsonserde.JsonSerDe'; 
> ==
> over JSON data causes errors, including java.lang.IndexOutOfBoundsException 
> or corrupted data. 
> The JsonSerDe used is 
> json-serde-1.1.4.jar/json-serde-1.1.4-jar-dependencies.jar.
> The cause is that the method
> public List getStructFieldsDataAsList(Object o) 
> in JsonStructObjectInspector.java 
> returns a list that references a static ArrayList "values".
> So the local variable 'list' in the serialize method of Hive's 
> LazySimpleSerDe class receives the same reference across its recursive 
> calls, and its element values keep being overwritten in the STRUCT case.
> Solutions:
> 1. Fix in JsonSerDe: change the field 'values' in 
> java.org.openx.data.jsonserde.objectinspector.JsonStructObjectInspector.java
> to instance scope.
> Filed a ticket with JsonSerDe 
> (https://github.com/rcongiu/Hive-JSON-Serde/issues/31)
> 2. Ideally, in the serialize method of LazySimpleSerDe, we should 
> defensively save a copy of the list returned by list = 
> soi.getStructFieldsDataAsList(obj) when soi is an instance of 
> JsonStructObjectInspector, so that the recursive calls of serialize can 
> work properly regardless of the extended SerDe implementation.
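Solution 2 above can be sketched as follows. The inspector here is a hypothetical stand-in for JsonStructObjectInspector (the shared static list reproduces its buggy behavior); the one-line defensive copy is the change suggested for LazySimpleSerDe's serialize.

```java
import java.util.ArrayList;
import java.util.List;

public class DefensiveCopyDemo {
    // Shared static list, mimicking the buggy inspector described above.
    static final List<Object> VALUES = new ArrayList<>();

    static List<Object> getStructFieldsDataAsList(Object[] struct) {
        VALUES.clear();
        for (Object field : struct) {
            VALUES.add(field);
        }
        return VALUES;
    }

    public static void main(String[] args) {
        // Defensive copy: the nested call below can no longer clobber the
        // outer list, whatever list the object inspector chooses to return.
        List<Object> outer =
            new ArrayList<>(getStructFieldsDataAsList(new Object[]{"a", "b", "c"}));
        getStructFieldsDataAsList(new Object[]{"x"}); // nested call overwrites VALUES
        System.out.println(outer.size()); // still 3
    }
}
```

The copy costs one allocation per struct but makes serialize robust against any SerDe that reuses its backing list.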

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (HIVE-5320) Querying a table with nested struct type over JSON data results in errors

2013-09-19 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang reassigned HIVE-5320:
-

Assignee: Chaoyu Tang

> Querying a table with nested struct type over JSON data results in errors
> -
>
> Key: HIVE-5320
> URL: https://issues.apache.org/jira/browse/HIVE-5320
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 0.9.0
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-5320.patch
>
>
> Querying a table with nested_struct datatype like
> ==
> create table nest_struct_tbl (col1 string, col2 array a2:array>>>) ROW FORMAT SERDE 
> 'org.openx.data.jsonserde.JsonSerDe'; 
> ==
> over JSON data causes errors, including java.lang.IndexOutOfBoundsException 
> or corrupted data. 
> The JsonSerDe used is 
> json-serde-1.1.4.jar/json-serde-1.1.4-jar-dependencies.jar.
> The cause is that the method
> public List getStructFieldsDataAsList(Object o) 
> in JsonStructObjectInspector.java 
> returns a list that references a static ArrayList "values".
> So the local variable 'list' in the serialize method of Hive's 
> LazySimpleSerDe class receives the same reference across its recursive 
> calls, and its element values keep being overwritten in the STRUCT case.
> Solutions:
> 1. Fix in JsonSerDe: change the field 'values' in 
> java.org.openx.data.jsonserde.objectinspector.JsonStructObjectInspector.java
> to instance scope.
> Filed a ticket with JsonSerDe 
> (https://github.com/rcongiu/Hive-JSON-Serde/issues/31)
> 2. Ideally, in the serialize method of LazySimpleSerDe, we should 
> defensively save a copy of the list returned by list = 
> soi.getStructFieldsDataAsList(obj) when soi is an instance of 
> JsonStructObjectInspector, so that the recursive calls of serialize can 
> work properly regardless of the extended SerDe implementation.



[jira] [Commented] (HIVE-5320) Querying a table with nested struct type over JSON data results in errors

2013-09-19 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13771916#comment-13771916
 ] 

Chaoyu Tang commented on HIVE-5320:
---

Please review the attached patch for the fix.

> Querying a table with nested struct type over JSON data results in errors
> -
>
> Key: HIVE-5320
> URL: https://issues.apache.org/jira/browse/HIVE-5320
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 0.9.0
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-5320.patch
>
>
> Querying a table with nested_struct datatype like
> ==
> create table nest_struct_tbl (col1 string, col2 array a2:array>>>) ROW FORMAT SERDE 
> 'org.openx.data.jsonserde.JsonSerDe'; 
> ==
> over JSON data causes errors, including java.lang.IndexOutOfBoundsException 
> or corrupted data. 
> The JsonSerDe used is 
> json-serde-1.1.4.jar/json-serde-1.1.4-jar-dependencies.jar.
> The cause is that the method
> public List getStructFieldsDataAsList(Object o) 
> in JsonStructObjectInspector.java 
> returns a list that references a static ArrayList "values".
> So the local variable 'list' in the serialize method of Hive's 
> LazySimpleSerDe class receives the same reference across its recursive 
> calls, and its element values keep being overwritten in the STRUCT case.
> Solutions:
> 1. Fix in JsonSerDe: change the field 'values' in 
> java.org.openx.data.jsonserde.objectinspector.JsonStructObjectInspector.java
> to instance scope.
> Filed a ticket with JsonSerDe 
> (https://github.com/rcongiu/Hive-JSON-Serde/issues/31)
> 2. Ideally, in the serialize method of LazySimpleSerDe, we should 
> defensively save a copy of the list returned by list = 
> soi.getStructFieldsDataAsList(obj) when soi is an instance of 
> JsonStructObjectInspector, so that the recursive calls of serialize can 
> work properly regardless of the extended SerDe implementation.



[jira] [Commented] (HIVE-4487) Hive does not set explicit permissions on hive.exec.scratchdir

2013-09-19 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13772186#comment-13772186
 ] 

Chaoyu Tang commented on HIVE-4487:
---

[~yhuai] It works in my Eclipse. The log shows that it failed at the line 
outStream = fs.create(resFile) in DDLTask.
Could you debug and check, before this line is executed, what the permission 
and owner of the dir (e.g. /tmp/yhuai/hive_2013-09-19_/, one level up 
-local-1) are? Which Hadoop version are you using?

> Hive does not set explicit permissions on hive.exec.scratchdir
> --
>
> Key: HIVE-4487
> URL: https://issues.apache.org/jira/browse/HIVE-4487
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.10.0
>Reporter: Joey Echeverria
>Assignee: Chaoyu Tang
> Fix For: 0.12.0
>
> Attachments: HIVE-4487.patch
>
>
> The hive.exec.scratchdir defaults to /tmp/hive-$\{user.name\}, but when Hive 
> creates this directory it doesn't set any explicit permission on it. This 
> means if you have the default HDFS umask setting of 022, then these 
> directories end up being world readable. These permissions also get applied 
> to the staging directories and their files, thus leaving inter-stage data 
> world readable.
> This can cause a potential leak of data especially when operating on a 
> Kerberos enabled cluster. Hive should probably default these directories to 
> only be readable by the owner.
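The arithmetic behind the description above can be shown in a few lines. Assuming the standard POSIX/HDFS rule that a directory created without an explicit permission gets the requested mode masked by the umask, the default umask of 022 yields 0755, which leaves group and others with read and execute access:

```java
public class UmaskDemo {
    public static void main(String[] args) {
        int requested = 0777; // default directory mode before the umask applies
        int umask = 0022;     // the default HDFS umask from the description
        int effective = requested & ~umask;
        // 0755: group and others keep r-x, i.e. the directory is world readable.
        System.out.println(Integer.toOctalString(effective)); // 755
    }
}
```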



[jira] [Commented] (HIVE-4487) Hive does not set explicit permissions on hive.exec.scratchdir

2013-09-19 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13772351#comment-13772351
 ] 

Chaoyu Tang commented on HIVE-4487:
---

Synced to the head of trunk and was able to reproduce the issue Yin Huai 
saw. 
The cause is, as Thejas said, in the conversion of the octal permission to a 
short. Changing line 209 in Context.java to:
FsPermission fsPermission = new 
FsPermission(Short.parseShort(scratchDirPermission.trim(), 8)) 
will solve the problem.
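The fix hinges on the radix argument: a configured string like "700" is an octal mode, so parsing it with radix 8 produces the bit pattern 0700 (decimal 448) that FsPermission expects, while a plain decimal parse produces the meaningless value 700. A minimal sketch:

```java
public class OctalParseDemo {
    public static void main(String[] args) {
        String scratchDirPermission = "700"; // configured value, an octal string

        short decimal = Short.parseShort(scratchDirPermission.trim());  // 700: wrong bits
        short octal = Short.parseShort(scratchDirPermission.trim(), 8); // 448 == 0700

        System.out.println(decimal);                      // 700
        System.out.println(Integer.toOctalString(octal)); // 700, i.e. rwx------
    }
}
```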

> Hive does not set explicit permissions on hive.exec.scratchdir
> --
>
> Key: HIVE-4487
> URL: https://issues.apache.org/jira/browse/HIVE-4487
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.10.0
>Reporter: Joey Echeverria
>Assignee: Chaoyu Tang
> Fix For: 0.12.0
>
> Attachments: HIVE-4487.patch
>
>
> The hive.exec.scratchdir defaults to /tmp/hive-$\{user.name\}, but when Hive 
> creates this directory it doesn't set any explicit permission on it. This 
> means if you have the default HDFS umask setting of 022, then these 
> directories end up being world readable. These permissions also get applied 
> to the staging directories and their files, thus leaving inter-stage data 
> world readable.
> This can cause a potential leak of data especially when operating on a 
> Kerberos enabled cluster. Hive should probably default these directories to 
> only be readable by the owner.



[jira] [Commented] (HIVE-4487) Hive does not set explicit permissions on hive.exec.scratchdir

2013-09-19 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13772355#comment-13772355
 ] 

Chaoyu Tang commented on HIVE-4487:
---

Noticed that Mark has already provided a patch with the same changes in 
HIVE-5322. Thanks, [~mwagner]

> Hive does not set explicit permissions on hive.exec.scratchdir
> --
>
> Key: HIVE-4487
> URL: https://issues.apache.org/jira/browse/HIVE-4487
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.10.0
>Reporter: Joey Echeverria
>Assignee: Chaoyu Tang
> Fix For: 0.12.0
>
> Attachments: HIVE-4487.patch
>
>
> The hive.exec.scratchdir defaults to /tmp/hive-$\{user.name\}, but when Hive 
> creates this directory it doesn't set any explicit permission on it. This 
> means if you have the default HDFS umask setting of 022, then these 
> directories end up being world readable. These permissions also get applied 
> to the staging directories and their files, thus leaving inter-stage data 
> world readable.
> This can cause a potential leak of data especially when operating on a 
> Kerberos enabled cluster. Hive should probably default these directories to 
> only be readable by the owner.



[jira] [Commented] (HIVE-5320) Querying a table with nested struct type over JSON data results in errors

2013-09-23 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13774610#comment-13774610
 ] 

Chaoyu Tang commented on HIVE-5320:
---

I agree that this is mainly caused by the improper implementation in 
json-serde, but is there anything Hive can do to better cope with this kind of 
unexpected behavior or prevent it from happening again? For example, could we 
provide clear documentation for the related SerDe APIs, like that in 
ListObjectInspector?

> Querying a table with nested struct type over JSON data results in errors
> -
>
> Key: HIVE-5320
> URL: https://issues.apache.org/jira/browse/HIVE-5320
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 0.9.0
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-5320.patch
>
>
> Querying a table with nested_struct datatype like
> ==
> create table nest_struct_tbl (col1 string, col2 array a2:array>>>) ROW FORMAT SERDE 
> 'org.openx.data.jsonserde.JsonSerDe'; 
> ==
> over JSON data causes errors, including java.lang.IndexOutOfBoundsException 
> or corrupted data. 
> The JsonSerDe used is 
> json-serde-1.1.4.jar/json-serde-1.1.4-jar-dependencies.jar.
> The cause is that the method
> public List getStructFieldsDataAsList(Object o) 
> in JsonStructObjectInspector.java 
> returns a list that references a static ArrayList "values".
> So the local variable 'list' in the serialize method of Hive's 
> LazySimpleSerDe class receives the same reference across its recursive 
> calls, and its element values keep being overwritten in the STRUCT case.
> Solutions:
> 1. Fix in JsonSerDe: change the field 'values' in 
> java.org.openx.data.jsonserde.objectinspector.JsonStructObjectInspector.java
> to instance scope.
> Filed a ticket with JsonSerDe 
> (https://github.com/rcongiu/Hive-JSON-Serde/issues/31)
> 2. Ideally, in the serialize method of LazySimpleSerDe, we should 
> defensively save a copy of the list returned by list = 
> soi.getStructFieldsDataAsList(obj) when soi is an instance of 
> JsonStructObjectInspector, so that the recursive calls of serialize can 
> work properly regardless of the extended SerDe implementation.



[jira] [Updated] (HIVE-5320) Querying a table with nested struct type over JSON data results in errors

2013-09-24 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-5320:
--

Resolution: Won't Fix
Status: Resolved  (was: Patch Available)

> Querying a table with nested struct type over JSON data results in errors
> -
>
> Key: HIVE-5320
> URL: https://issues.apache.org/jira/browse/HIVE-5320
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 0.9.0
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-5320.patch
>
>
> Querying a table with nested_struct datatype like
> ==
> create table nest_struct_tbl (col1 string, col2 array a2:array>>>) ROW FORMAT SERDE 
> 'org.openx.data.jsonserde.JsonSerDe'; 
> ==
> over JSON data causes errors, including java.lang.IndexOutOfBoundsException 
> or corrupted data. 
> The JsonSerDe used is 
> json-serde-1.1.4.jar/json-serde-1.1.4-jar-dependencies.jar.
> The cause is that the method
> public List getStructFieldsDataAsList(Object o) 
> in JsonStructObjectInspector.java 
> returns a list that references a static ArrayList "values".
> So the local variable 'list' in the serialize method of Hive's 
> LazySimpleSerDe class receives the same reference across its recursive 
> calls, and its element values keep being overwritten in the STRUCT case.
> Solutions:
> 1. Fix in JsonSerDe: change the field 'values' in 
> java.org.openx.data.jsonserde.objectinspector.JsonStructObjectInspector.java
> to instance scope.
> Filed a ticket with JsonSerDe 
> (https://github.com/rcongiu/Hive-JSON-Serde/issues/31)
> 2. Ideally, in the serialize method of LazySimpleSerDe, we should 
> defensively save a copy of the list returned by list = 
> soi.getStructFieldsDataAsList(obj) when soi is an instance of 
> JsonStructObjectInspector, so that the recursive calls of serialize can 
> work properly regardless of the extended SerDe implementation.



[jira] [Created] (HIVE-8784) Querying partition does not work with JDO enabled against PostgreSQL

2014-11-07 Thread Chaoyu Tang (JIRA)
Chaoyu Tang created HIVE-8784:
-

 Summary: Querying partition does not work with JDO enabled against 
PostgreSQL
 Key: HIVE-8784
 URL: https://issues.apache.org/jira/browse/HIVE-8784
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.15.0
Reporter: Chaoyu Tang
Assignee: Chaoyu Tang
 Fix For: 0.15.0


Querying a partition in PostgreSQL fails when using JDO (with 
hive.metastore.try.direct.sql=false). Following is an example to reproduce it:
{code}
create table partition_test_multilevel (key string, value string) partitioned 
by (level1 string, level2 string, level3 string);

insert overwrite table partition_test_multilevel partition(level1='', 
level2='111', level3='11') select key, value from srcpart tablesample (11 rows);
insert overwrite table partition_test_multilevel partition(level1='', 
level2='222', level3='11') select key, value from srcpart tablesample (15 rows);
insert overwrite table partition_test_multilevel partition(level1='', 
level2='333', level3='11') select key, value from srcpart tablesample (20 rows);

select level1, level2, level3, count(*) from partition_test_multilevel where 
level2 <= '222' group by level1, level2, level3;
{code}
The query fails with following error:
{code}
  Caused by: org.apache.hadoop.hive.ql.parse.SemanticException: 
MetaException(message:Invocation of method "substring" on "StringExpression" 
requires argument 1 of type "NumericExpression")
at 
org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner.getPartitionsFromServer(PartitionPruner.java:392)
at 
org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner.prune(PartitionPruner.java:215)
at 
org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner.prune(PartitionPruner.java:139)
at 
org.apache.hadoop.hive.ql.parse.ParseContext.getPrunedPartitions(ParseContext.java:619)
at 
org.apache.hadoop.hive.ql.optimizer.pcr.PcrOpProcFactory$FilterPCR.process(PcrOpProcFactory.java:110)
... 21 more
{code}

This is because the JDO pushdown filter generated for a query with an 
inequality/between partition predicate uses the DataNucleus indexOf function, 
which does not work properly with PostgreSQL (see 
http://www.datanucleus.org/servlet/jira/browse/NUCRDBMS-840)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8784) Querying partition does not work with JDO enabled against PostgreSQL

2014-11-07 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-8784:
--
Attachment: HIVE-8784.patch

Hive (see generateJDOFilterOverPartitions in 
org.apache.hadoop.hive.metastore.parser.ExpressionTree) currently uses some DN 
string functions (substring, indexOf) to get the value for a partition key. 
There is a more straightforward way (implemented in this patch) to get it, 
which also avoids the DN indexOf issue with PostgreSQL.

The test cases provided in this patch have been run against various databases, 
including Derby, MySQL, and PostgreSQL.
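The kind of string manipulation the old filter pushed down can be illustrated in plain Java. This is only an illustration under stated assumptions: the real code builds an equivalent JDOQL expression over the stored partition name (e.g. "level1=aaa/level2=222/level3=11"), and PartitionNameDemo/valueOf are hypothetical names, not Hive's.

```java
public class PartitionNameDemo {
    // Extract one partition key's value from a name like
    // "level1=aaa/level2=222/level3=11" using only indexOf/substring,
    // mirroring the string functions the JDO filter relied on.
    static String valueOf(String partitionName, String key) {
        int start = partitionName.indexOf(key + "=") + key.length() + 1;
        int end = partitionName.indexOf('/', start);
        return end < 0 ? partitionName.substring(start)
                       : partitionName.substring(start, end);
    }

    public static void main(String[] args) {
        String name = "level1=aaa/level2=222/level3=11";
        System.out.println(valueOf(name, "level2")); // 222
        System.out.println(valueOf(name, "level3")); // 11
    }
}
```

When DataNucleus translates indexOf for PostgreSQL, the substring call receives the wrong expression type (the NUCRDBMS-840 issue), which is why avoiding these string functions altogether is the more robust route.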

> Querying partition does not work with JDO enabled against PostgreSQL
> 
>
> Key: HIVE-8784
> URL: https://issues.apache.org/jira/browse/HIVE-8784
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.15.0
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 0.15.0
>
> Attachments: HIVE-8784.patch
>
>
> Querying a partition in PostgreSQL fails when using JDO (with 
> hive.metastore.try.direct.sql=false). Following is an example to reproduce it:
> {code}
> create table partition_test_multilevel (key string, value string) partitioned 
> by (level1 string, level2 string, level3 string);
> insert overwrite table partition_test_multilevel partition(level1='', 
> level2='111', level3='11') select key, value from srcpart tablesample (11 
> rows);
> insert overwrite table partition_test_multilevel partition(level1='', 
> level2='222', level3='11') select key, value from srcpart tablesample (15 
> rows);
> insert overwrite table partition_test_multilevel partition(level1='', 
> level2='333', level3='11') select key, value from srcpart tablesample (20 
> rows);
> select level1, level2, level3, count(*) from partition_test_multilevel where 
> level2 <= '222' group by level1, level2, level3;
> {code}
> The query fails with following error:
> {code}
>   Caused by: org.apache.hadoop.hive.ql.parse.SemanticException: 
> MetaException(message:Invocation of method "substring" on "StringExpression" 
> requires argument 1 of type "NumericExpression")
>   at 
> org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner.getPartitionsFromServer(PartitionPruner.java:392)
>   at 
> org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner.prune(PartitionPruner.java:215)
>   at 
> org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner.prune(PartitionPruner.java:139)
>   at 
> org.apache.hadoop.hive.ql.parse.ParseContext.getPrunedPartitions(ParseContext.java:619)
>   at 
> org.apache.hadoop.hive.ql.optimizer.pcr.PcrOpProcFactory$FilterPCR.process(PcrOpProcFactory.java:110)
>   ... 21 more
> {code}
> This is because the JDO pushdown filter generated for a query with an 
> inequality/between partition predicate uses the DataNucleus indexOf 
> function, which does not work properly with PostgreSQL (see 
> http://www.datanucleus.org/servlet/jira/browse/NUCRDBMS-840) 





[jira] [Updated] (HIVE-8784) Querying partition does not work with JDO enabled against PostgreSQL

2014-11-07 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-8784:
--
Status: Patch Available  (was: Open)

> Querying partition does not work with JDO enabled against PostgreSQL
> 
>
> Key: HIVE-8784
> URL: https://issues.apache.org/jira/browse/HIVE-8784
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.15.0
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 0.15.0
>
> Attachments: HIVE-8784.patch
>
>
> Querying a partition in PostgreSQL fails when using JDO (with 
> hive.metastore.try.direct.sql=false). Following is an example to reproduce it:
> {code}
> create table partition_test_multilevel (key string, value string) partitioned 
> by (level1 string, level2 string, level3 string);
> insert overwrite table partition_test_multilevel partition(level1='', 
> level2='111', level3='11') select key, value from srcpart tablesample (11 
> rows);
> insert overwrite table partition_test_multilevel partition(level1='', 
> level2='222', level3='11') select key, value from srcpart tablesample (15 
> rows);
> insert overwrite table partition_test_multilevel partition(level1='', 
> level2='333', level3='11') select key, value from srcpart tablesample (20 
> rows);
> select level1, level2, level3, count(*) from partition_test_multilevel where 
> level2 <= '222' group by level1, level2, level3;
> {code}
> The query fails with following error:
> {code}
>   Caused by: org.apache.hadoop.hive.ql.parse.SemanticException: 
> MetaException(message:Invocation of method "substring" on "StringExpression" 
> requires argument 1 of type "NumericExpression")
>   at 
> org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner.getPartitionsFromServer(PartitionPruner.java:392)
>   at 
> org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner.prune(PartitionPruner.java:215)
>   at 
> org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner.prune(PartitionPruner.java:139)
>   at 
> org.apache.hadoop.hive.ql.parse.ParseContext.getPrunedPartitions(ParseContext.java:619)
>   at 
> org.apache.hadoop.hive.ql.optimizer.pcr.PcrOpProcFactory$FilterPCR.process(PcrOpProcFactory.java:110)
>   ... 21 more
> {code}
> This is because the JDO pushdown filter generated for a query with an 
> inequality/between partition predicate uses the DataNucleus indexOf 
> function, which does not work properly with PostgreSQL (see 
> http://www.datanucleus.org/servlet/jira/browse/NUCRDBMS-840) 





[jira] [Commented] (HIVE-8784) Querying partition does not work with JDO enabled against PostgreSQL

2014-11-07 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14202466#comment-14202466
 ] 

Chaoyu Tang commented on HIVE-8784:
---

Patch is posted on RB for review, see 
https://reviews.apache.org/r/27737/ . Thanks.

> Querying partition does not work with JDO enabled against PostgreSQL
> 
>
> Key: HIVE-8784
> URL: https://issues.apache.org/jira/browse/HIVE-8784
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.15.0
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 0.15.0
>
> Attachments: HIVE-8784.patch
>
>
> Querying a partition in PostgreSQL fails when using JDO (with 
> hive.metastore.try.direct.sql=false). Following is an example to reproduce it:
> {code}
> create table partition_test_multilevel (key string, value string) partitioned 
> by (level1 string, level2 string, level3 string);
> insert overwrite table partition_test_multilevel partition(level1='', 
> level2='111', level3='11') select key, value from srcpart tablesample (11 
> rows);
> insert overwrite table partition_test_multilevel partition(level1='', 
> level2='222', level3='11') select key, value from srcpart tablesample (15 
> rows);
> insert overwrite table partition_test_multilevel partition(level1='', 
> level2='333', level3='11') select key, value from srcpart tablesample (20 
> rows);
> select level1, level2, level3, count(*) from partition_test_multilevel where 
> level2 <= '222' group by level1, level2, level3;
> {code}
> The query fails with following error:
> {code}
>   Caused by: org.apache.hadoop.hive.ql.parse.SemanticException: 
> MetaException(message:Invocation of method "substring" on "StringExpression" 
> requires argument 1 of type "NumericExpression")
>   at 
> org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner.getPartitionsFromServer(PartitionPruner.java:392)
>   at 
> org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner.prune(PartitionPruner.java:215)
>   at 
> org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner.prune(PartitionPruner.java:139)
>   at 
> org.apache.hadoop.hive.ql.parse.ParseContext.getPrunedPartitions(ParseContext.java:619)
>   at 
> org.apache.hadoop.hive.ql.optimizer.pcr.PcrOpProcFactory$FilterPCR.process(PcrOpProcFactory.java:110)
>   ... 21 more
> {code}
> This is because the JDO pushdown filter generated for a query with an 
> inequality/between partition predicate uses the DataNucleus indexOf 
> function, which does not work properly with PostgreSQL (see 
> http://www.datanucleus.org/servlet/jira/browse/NUCRDBMS-840) 





[jira] [Updated] (HIVE-8784) Querying partition does not work with JDO enabled against PostgreSQL

2014-11-10 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-8784:
--
Attachment: HIVE-8784_1.patch

Added more test cases using ORM and direct SQL.

> Querying partition does not work with JDO enabled against PostgreSQL
> 
>
> Key: HIVE-8784
> URL: https://issues.apache.org/jira/browse/HIVE-8784
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.15.0
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 0.15.0
>
> Attachments: HIVE-8784.patch, HIVE-8784_1.patch
>
>
> Querying a partition in PostgreSQL fails when using JDO (with 
> hive.metastore.try.direct.sql=false). Following is an example to reproduce it:
> {code}
> create table partition_test_multilevel (key string, value string) partitioned 
> by (level1 string, level2 string, level3 string);
> insert overwrite table partition_test_multilevel partition(level1='', 
> level2='111', level3='11') select key, value from srcpart tablesample (11 
> rows);
> insert overwrite table partition_test_multilevel partition(level1='', 
> level2='222', level3='11') select key, value from srcpart tablesample (15 
> rows);
> insert overwrite table partition_test_multilevel partition(level1='', 
> level2='333', level3='11') select key, value from srcpart tablesample (20 
> rows);
> select level1, level2, level3, count(*) from partition_test_multilevel where 
> level2 <= '222' group by level1, level2, level3;
> {code}
> The query fails with following error:
> {code}
>   Caused by: org.apache.hadoop.hive.ql.parse.SemanticException: 
> MetaException(message:Invocation of method "substring" on "StringExpression" 
> requires argument 1 of type "NumericExpression")
>   at 
> org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner.getPartitionsFromServer(PartitionPruner.java:392)
>   at 
> org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner.prune(PartitionPruner.java:215)
>   at 
> org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner.prune(PartitionPruner.java:139)
>   at 
> org.apache.hadoop.hive.ql.parse.ParseContext.getPrunedPartitions(ParseContext.java:619)
>   at 
> org.apache.hadoop.hive.ql.optimizer.pcr.PcrOpProcFactory$FilterPCR.process(PcrOpProcFactory.java:110)
>   ... 21 more
> {code}
> This is because the JDO pushdown filter generated for a query with an 
> inequality/between partition predicate uses the DataNucleus indexOf function, 
> which does not work properly with PostgreSQL (see 
> http://www.datanucleus.org/servlet/jira/browse/NUCRDBMS-840).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8784) Querying partition does not work with JDO enabled against PostgreSQL

2014-11-10 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-8784:
--
Attachment: HIVE-8784.1.patch

Rename HIVE-8784_1.patch to HIVE-8784.1.patch to enable pre-commit test.

> Querying partition does not work with JDO enabled against PostgreSQL
> 
>
> Key: HIVE-8784
> URL: https://issues.apache.org/jira/browse/HIVE-8784
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.15.0
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 0.15.0
>
> Attachments: HIVE-8784.1.patch, HIVE-8784.patch, HIVE-8784_1.patch
>
>





[jira] [Created] (HIVE-8839) Support "alter table .. add/replace columns cascade"

2014-11-12 Thread Chaoyu Tang (JIRA)
Chaoyu Tang created HIVE-8839:
-

 Summary: Support "alter table .. add/replace columns cascade"
 Key: HIVE-8839
 URL: https://issues.apache.org/jira/browse/HIVE-8839
 Project: Hive
  Issue Type: Improvement
  Components: SQL
 Environment: 
Reporter: Chaoyu Tang
Assignee: Chaoyu Tang
 Fix For: 0.15.0


We often run into issues like HIVE-6131, which is due to inconsistent column 
descriptors between a table and its partitions after an alter table. 
HIVE-8441/HIVE-7971 provided the flexibility to alter a table at the partition 
level, but in most cases we need to change the table and its partitions at the 
same time. In addition, "alter table" is usually required prior to "alter 
table partition ..", since querying table partition data also goes through the 
table. Instead of doing that in two steps, here we provide a convenient DDL 
like "alter table ... cascade" to cascade table changes to the partitions as 
well. The changes are limited to add/replace columns and changing a column's 
name, datatype, position, and comment.
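
A sketch of the proposed syntax (the table and column names are illustrative;
the exact grammar is defined by the patch):

{code}
-- hypothetical example table
create table t (a int) partitioned by (p string);
-- proposed: apply the new column to the table and all existing partitions
alter table t add columns (b string comment 'new col') cascade;
-- without cascade, existing partitions keep their old column descriptors
alter table t add columns (c string);
{code}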





[jira] [Updated] (HIVE-8839) Support "alter table .. add/replace columns cascade"

2014-11-12 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-8839:
--
Attachment: HIVE-8839.patch

Patch is attached and also available for review at 
https://reviews.apache.org/r/27917/.

> Support "alter table .. add/replace columns cascade"
> 
>
> Key: HIVE-8839
> URL: https://issues.apache.org/jira/browse/HIVE-8839
> Project: Hive
>  Issue Type: Improvement
>  Components: SQL
> Environment: 
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 0.15.0
>
> Attachments: HIVE-8839.patch
>
>





[jira] [Updated] (HIVE-8839) Support "alter table .. add/replace columns cascade"

2014-11-12 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-8839:
--
Status: Patch Available  (was: Open)

> Support "alter table .. add/replace columns cascade"
> 
>
> Key: HIVE-8839
> URL: https://issues.apache.org/jira/browse/HIVE-8839
> Project: Hive
>  Issue Type: Improvement
>  Components: SQL
> Environment: 
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 0.15.0
>
> Attachments: HIVE-8839.patch
>
>





[jira] [Commented] (HIVE-8839) Support "alter table .. add/replace columns cascade"

2014-11-12 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14209199#comment-14209199
 ] 

Chaoyu Tang commented on HIVE-8839:
---

Most failures were from temp table tests; I will look into whether and how 
they are related to this patch.

> Support "alter table .. add/replace columns cascade"
> 
>
> Key: HIVE-8839
> URL: https://issues.apache.org/jira/browse/HIVE-8839
> Project: Hive
>  Issue Type: Improvement
>  Components: SQL
> Environment: 
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 0.15.0
>
> Attachments: HIVE-8839.patch
>
>





[jira] [Updated] (HIVE-8839) Support "alter table .. add/replace columns cascade"

2014-11-13 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-8839:
--
Attachment: HIVE-8784.1.patch

Fix for temp table related test failures

> Support "alter table .. add/replace columns cascade"
> 
>
> Key: HIVE-8839
> URL: https://issues.apache.org/jira/browse/HIVE-8839
> Project: Hive
>  Issue Type: Improvement
>  Components: SQL
> Environment: 
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 0.15.0
>
> Attachments: HIVE-8784.1.patch, HIVE-8839.patch
>
>





[jira] [Updated] (HIVE-8839) Support "alter table .. add/replace columns cascade"

2014-11-13 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-8839:
--
Attachment: (was: HIVE-8784.1.patch)

> Support "alter table .. add/replace columns cascade"
> 
>
> Key: HIVE-8839
> URL: https://issues.apache.org/jira/browse/HIVE-8839
> Project: Hive
>  Issue Type: Improvement
>  Components: SQL
> Environment: 
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 0.15.0
>
> Attachments: HIVE-8839.patch
>
>





[jira] [Updated] (HIVE-8839) Support "alter table .. add/replace columns cascade"

2014-11-13 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-8839:
--
Attachment: HIVE-8839.1.patch

Attached the right patch, with all the Thrift-generated code.

> Support "alter table .. add/replace columns cascade"
> 
>
> Key: HIVE-8839
> URL: https://issues.apache.org/jira/browse/HIVE-8839
> Project: Hive
>  Issue Type: Improvement
>  Components: SQL
> Environment: 
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 0.15.0
>
> Attachments: HIVE-8839.1.patch, HIVE-8839.patch
>
>





[jira] [Commented] (HIVE-8839) Support "alter table .. add/replace columns cascade"

2014-11-13 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14211739#comment-14211739
 ] 

Chaoyu Tang commented on HIVE-8839:
---

Hi Szehon, I just uploaded to RB a patch without the non-Java generated Thrift 
code, and attached the entire patch here. Thanks for the review.

> Support "alter table .. add/replace columns cascade"
> 
>
> Key: HIVE-8839
> URL: https://issues.apache.org/jira/browse/HIVE-8839
> Project: Hive
>  Issue Type: Improvement
>  Components: SQL
> Environment: 
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 0.15.0
>
> Attachments: HIVE-8839.1.patch, HIVE-8839.patch
>
>





[jira] [Updated] (HIVE-8839) Support "alter table .. add/replace columns cascade"

2014-11-13 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-8839:
--
Attachment: (was: HIVE-8839.1.patch)

> Support "alter table .. add/replace columns cascade"
> 
>
> Key: HIVE-8839
> URL: https://issues.apache.org/jira/browse/HIVE-8839
> Project: Hive
>  Issue Type: Improvement
>  Components: SQL
> Environment: 
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 0.15.0
>
> Attachments: HIVE-8839.patch
>
>





[jira] [Updated] (HIVE-8839) Support "alter table .. add/replace columns cascade"

2014-11-13 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-8839:
--
Attachment: HIVE-8839.1.patch

> Support "alter table .. add/replace columns cascade"
> 
>
> Key: HIVE-8839
> URL: https://issues.apache.org/jira/browse/HIVE-8839
> Project: Hive
>  Issue Type: Improvement
>  Components: SQL
> Environment: 
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 0.15.0
>
> Attachments: HIVE-8839.1.patch, HIVE-8839.patch
>
>




