[jira] [Commented] (HIVE-1988) Make the delegation token issued by the MetaStore owned by the right user

2011-04-05 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13015794#comment-13015794
 ] 

jirapos...@reviews.apache.org commented on HIVE-1988:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/528/#review386
---



http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java


HadoopShims.isSecureShimImpl() is not called anywhere else. Shall we remove 
it if not required anymore?



http://svn.apache.org/repos/asf/hive/trunk/shims/src/test/org/apache/hadoop/hive/thrift/TestHadoop20SAuthBridge.java


Do you want to move this into setup(), as it is common to both test cases?



http://svn.apache.org/repos/asf/hive/trunk/shims/src/test/org/apache/hadoop/hive/thrift/TestHadoop20SAuthBridge.java


This code looks duplicated. Can it be refactored by passing the group names 
to a method?


- Amareshwari


On 2011-03-29 10:26:38, Devaraj Das wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/528/
bq.  ---
bq.  
bq.  (Updated 2011-03-29 10:26:38)
bq.  
bq.  
bq.  Review request for hive.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Fixes to some security issues discussed in HIVE-1988
bq.  
bq.  
bq.  This addresses bug HIVE-1988.
bq.  https://issues.apache.org/jira/browse/HIVE-1988
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    http://svn.apache.org/repos/asf/hive/trunk/metastore/if/hive_metastore.thrift 1085623 
bq.    http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 1085623 
bq.    http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 1085623 
bq.    http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 1085623 
bq.    http://svn.apache.org/repos/asf/hive/trunk/shims/src/0.20S/java/org/apache/hadoop/hive/thrift/DelegationTokenSecretManager.java 1085623 
bq.    http://svn.apache.org/repos/asf/hive/trunk/shims/src/0.20S/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge20S.java 1085623 
bq.    http://svn.apache.org/repos/asf/hive/trunk/shims/src/common/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge.java 1085623 
bq.    http://svn.apache.org/repos/asf/hive/trunk/shims/src/test/org/apache/hadoop/hive/thrift/TestHadoop20SAuthBridge.java 1085623 
bq.  
bq.  Diff: https://reviews.apache.org/r/528/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  New unit test added and that passes. All unit tests passed.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Devaraj
bq.  
bq.



> Make the delegation token issued by the MetaStore owned by the right user
> -
>
> Key: HIVE-1988
> URL: https://issues.apache.org/jira/browse/HIVE-1988
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Security, Server Infrastructure
>Affects Versions: 0.7.0
>Reporter: Devaraj Das
>Assignee: Devaraj Das
> Fix For: 0.8.0
>
> Attachments: hive-1988-3.patch, hive-1988.patch
>
>
> The 'owner' of any delegation token issued by the MetaStore is set to the 
> requesting user. When the user requests a delegation token directly during 
> a job submission, this is fine. However, when the token is requested by a 
> service (e.g., Oozie) on behalf of the user, the token's owner is set to the 
> user the service is running as. Later on, when the token is used by a 
> MapReduce task, the MetaStore treats the incoming request as coming from 
> Oozie and performs operations as Oozie. This means any new directories 
> created on HDFS by the MetaStore (e.g., by create_table) will end up with 
> Oozie as the owner.
> Also, the MetaStore doesn't check whether a user asking for a token on behalf 
> of some other user is actually authorized to act on behalf of that other 
> user. We should start using the ProxyUser authorization in the MetaStore 
> (HADOOP-6510's APIs).
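
The ownership rule described in the issue can be pictured with a small, self-contained model. This is only a sketch under stated assumptions: the class TokenOwnerSketch, its methods, and the in-memory proxy table are hypothetical stand-ins and are not taken from the patch or from Hadoop's ProxyUser APIs. It shows the two behaviours the issue asks for: the token owner is the end user rather than the service, and a requester that is not authorized to impersonate that user is rejected.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model; none of these names come from the Hive or Hadoop code base.
class TokenOwnerSketch {
    // Maps a service principal to the users it is allowed to impersonate.
    private final Map<String, Set<String>> allowedProxies = new HashMap<>();

    void allowProxy(String service, String user) {
        allowedProxies.computeIfAbsent(service, k -> new HashSet<>()).add(user);
    }

    // Returns the user that should be recorded as the owner of the token.
    String tokenOwner(String requester, String onBehalfOf) {
        if (onBehalfOf == null || onBehalfOf.equals(requester)) {
            return requester; // the user asked for the token directly
        }
        Set<String> allowed = allowedProxies.get(requester);
        if (allowed == null || !allowed.contains(onBehalfOf)) {
            throw new SecurityException(
                requester + " may not act on behalf of " + onBehalfOf);
        }
        return onBehalfOf; // the owner is the end user, not the service
    }

    public static void main(String[] args) {
        TokenOwnerSketch metastore = new TokenOwnerSketch();
        metastore.allowProxy("oozie", "alice");
        System.out.println(metastore.tokenOwner("alice", null));
        System.out.println(metastore.tokenOwner("oozie", "alice"));
    }
}
```

With this rule, a later request that presents the token is treated as coming from the end user, so directories created on that user's behalf would not end up owned by the service account.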

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-1988) Make the delegation token issued by the MetaStore owned by the right user

2011-04-05 Thread Amareshwari Sriramadasu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13015795#comment-13015795
 ] 

Amareshwari Sriramadasu commented on HIVE-1988:
---

Changes look good overall. I updated the review board with some minor comments. 
You can upload the next patch with generated code.



[jira] [Commented] (HIVE-2032) create database does not honour warehouse.dir in dbproperties

2011-04-05 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13015818#comment-13015818
 ] 

Thiruvel Thirumoolan commented on HIVE-2032:


@Amareshwari

After the alter, new tables will be created under the new location. Old tables 
have the fully qualified location in their metadata, so they should continue to 
work as before.

The reasons I went with alter location are:

1. Allow migration to happen if one would like to reorganize or if quota runs 
out. Not sure how many folks have this situation.
2. One can migrate new tables to another DFS cluster (existing cluster becoming 
full).
3. One can migrate between file systems if they have sufficient use cases.

Do you think these are valid use cases?

> create database does not honour warehouse.dir in dbproperties
> -
>
> Key: HIVE-2032
> URL: https://issues.apache.org/jira/browse/HIVE-2032
> Project: Hive
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: 0.7.0, 0.8.0
>Reporter: Thiruvel Thirumoolan
>Assignee: Thiruvel Thirumoolan
> Fix For: 0.8.0
>
> Attachments: DatabaseLocation.patch
>
>
> # create database db with dbproperties ('hive.metastore.warehouse.dir' = 
> 'loc');
> The above command does not set location of 'db' to 'loc'. It instead creates 
> 'db.db' under the warehouse directory configured in hive-site.xml of CLI. 
> Looks conflicting with HIVE-1820's expectation. If scratch dir is specified 
> here, that is honoured.
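
The behaviour the reporter expects can be sketched as a simple lookup. This is an illustration only: DatabaseLocationSketch and resolveLocation are hypothetical names, and the fallback path layout simply mirrors the 'db.db' directory mentioned in the description. It is not the code in DatabaseLocation.patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the expected lookup order; not the Hive implementation.
class DatabaseLocationSketch {
    static final String WAREHOUSE_KEY = "hive.metastore.warehouse.dir";

    // Prefer the location given in DBPROPERTIES; otherwise fall back to
    // <default warehouse dir>/<db name>.db as described in the issue.
    static String resolveLocation(String dbName,
                                  Map<String, String> dbProperties,
                                  String defaultWarehouseDir) {
        String requested = dbProperties.get(WAREHOUSE_KEY);
        if (requested != null && !requested.isEmpty()) {
            return requested;
        }
        return defaultWarehouseDir + "/" + dbName + ".db";
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put(WAREHOUSE_KEY, "loc");
        System.out.println(resolveLocation("db", props, "/user/hive/warehouse"));
        System.out.println(resolveLocation("db", new HashMap<>(), "/user/hive/warehouse"));
    }
}
```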



[jira] [Commented] (HIVE-2032) create database does not honour warehouse.dir in dbproperties

2011-04-05 Thread Amareshwari Sriramadasu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13015918#comment-13015918
 ] 

Amareshwari Sriramadasu commented on HIVE-2032:
---

The use cases make sense. But drop database would remove tables only from the 
new location. 
Does anyone know why alter database currently allows changing only the DB 
properties? Namit/Ning?

bq.Post altering, new tables will be created under new location. Old tables' 
have the fully qualified location in metadata and they should continue to work 
as before.
Thiruvel, did you get a chance to test this? The changes in your patch do not 
look complete. Changes should be propagated to PersistentManager through 
ObjectStore.



[jira] [Commented] (HIVE-2032) create database does not honour warehouse.dir in dbproperties

2011-04-05 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13015925#comment-13015925
 ] 

Thiruvel Thirumoolan commented on HIVE-2032:


> Use cases make sense. But, drop database would remove tables only from new 
> location.

If I am not wrong, drop db succeeds only if all tables under it are dropped.

> Thiruvel, did you get a chance to test this? Because, your changes in patch 
> does not look complete. Changes should be propagated to PersistentManager 
> through ObjectStore.

Will take a look.




[jira] [Commented] (HIVE-1538) FilterOperator is applied twice with ppd on.

2011-04-05 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13015943#comment-13015943
 ] 

jirapos...@reviews.apache.org commented on HIVE-1538:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/550/
---

Review request for hive, Yongqiang He and namit jain.


Summary
---

Patch updated to trunk with newly added configuration var 
hive.ppd.remove.duplicatefilters


This addresses bug HIVE-1538.
https://issues.apache.org/jira/browse/HIVE-1538


Diffs
-

  trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1088944 
  trunk/contrib/src/test/results/clientpositive/dboutput.q.out 1088944 
  trunk/contrib/src/test/results/clientpositive/serde_typedbytes4.q.out 1088944 
  trunk/hbase-handler/src/test/results/hbase_pushdown.q.out 1088944 
  trunk/hbase-handler/src/test/results/hbase_queries.q.out 1088944 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java 1088944 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/ExprWalkerInfo.java 1088944 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/ExprWalkerProcFactory.java 
1088944 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java 1088944 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/OpWalkerInfo.java 1088944 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/PredicatePushDown.java 
1088944 
  trunk/ql/src/test/queries/clientpositive/ppd1.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_clusterby.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_constant_expr.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_gby.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_gby2.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_gby_join.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_join.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_join2.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_join3.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_multi_insert.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_outer_join1.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_outer_join2.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_outer_join3.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_outer_join4.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_random.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_transform.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_udf_case.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_union.q 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join0.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join11.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join12.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join13.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join14.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join16.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join19.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join20.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join21.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join23.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join27.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join4.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join5.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join6.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join7.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join8.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join9.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/bucket2.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/bucket3.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/bucket4.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/bucket_groupby.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/bucketmapjoin1.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/bucketmapjoin2.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/bucketmapjoin3.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/bucketmapjoin_negative.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/case_sensitivity.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/cast1.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/cluster.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/combine2.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/create_view.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/disable_merge_for_bucketing.q.out 
1088944 
  trunk/ql/sr

[jira] [Updated] (HIVE-1538) FilterOperator is applied twice with ppd on.

2011-04-05 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated HIVE-1538:
--

Attachment: patch-1538-3.txt

Patch updated to trunk

> FilterOperator is applied twice with ppd on.
> 
>
> Key: HIVE-1538
> URL: https://issues.apache.org/jira/browse/HIVE-1538
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Amareshwari Sriramadasu
>Assignee: Amareshwari Sriramadasu
> Attachments: patch-1538-1.txt, patch-1538-2.txt, patch-1538-3.txt, 
> patch-1538.txt
>
>
> With hive.optimize.ppd set to true, FilterOperator is applied twice, and the 
> second operator always seems to filter zero rows.
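
A small model shows why the second operator would filter zero rows when both operators evaluate the same predicate. This is a sketch, not Hive code: DuplicateFilterSketch and its methods are hypothetical, and it assumes the duplicated filter carries exactly the same deterministic predicate as the first one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical model of two identical filter operators chained together.
class DuplicateFilterSketch {
    static <T> List<T> filter(List<T> rows, Predicate<T> predicate) {
        List<T> kept = new ArrayList<>();
        for (T row : rows) {
            if (predicate.test(row)) {
                kept.add(row);
            }
        }
        return kept;
    }

    // Number of rows removed by applying the same predicate a second time.
    static <T> int rowsDroppedBySecondPass(List<T> rows, Predicate<T> predicate) {
        List<T> once = filter(rows, predicate);
        List<T> twice = filter(once, predicate);
        return once.size() - twice.size();
    }

    public static void main(String[] args) {
        List<Integer> rows = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
        System.out.println(rowsDroppedBySecondPass(rows, x -> x > 5));
    }
}
```

Every row that reaches the second pass has already satisfied the predicate, so the second pass only adds evaluation cost.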



[jira] [Commented] (HIVE-2032) create database does not honour warehouse.dir in dbproperties

2011-04-05 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13015951#comment-13015951
 ] 

Thiruvel Thirumoolan commented on HIVE-2032:


HIVE-1537 is supporting LOCATION in create database command
HIVE-675 - first comment is consistent with HIVE-1537
HIVE-1820 is about warehouse location through DBPROPERTIES

My approach on this JIRA so far was based on HIVE-1820.

What is the suggested way to specify location during create database?

@Carl, @Ning, @He Yongqiang - suggestions?



Query Regarding HIVE-1844.

2011-04-05 Thread Mohit
Hello He Yongqiang / all,

I was going through the defect HIVE-1844, but I was not able to reproduce
the scenario using Hive 0.5. I consistently saw some OOMs during the
Copy Task on the server side, but the client did not hang.

In your opinion, what could have made the client hang? In my case, the Hive
client was able to get a proper response from Thrift whenever an OOM occurred
on the server side, for example:

java.sql.SQLException: org.apache.thrift.TApplicationException :
Internal error processing execute

Kindly give me pointers on reproducing it. Do I need to do more
regression testing on it?

Just a thought/observation:

As per the code change, why was the OOM caught so early (and in the
form of Throwable, which will swallow other exceptions as well)?

It would eventually have been caught by
ThriftHive$Processor$execute.process() and appropriate action would have
been taken, so I was wondering how the code change helped prevent the client
hang.
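
The concern about catching Throwable can be illustrated with a minimal sketch. The class and method names below are hypothetical and are not taken from the HIVE-1844 change. The sketch only contrasts a broad handler, which turns every failure into the same return code, with a narrow handler that converts the OOM and lets other failures propagate to the caller.

```java
// Hypothetical sketch; names are illustrative and not from the HIVE-1844 patch.
class BroadCatchSketch {
    // Stand-in for a server-side task that can fail in different ways.
    static void runTask(String failure) {
        if (failure.equals("oom")) {
            throw new OutOfMemoryError("simulated");
        }
        if (failure.equals("bug")) {
            throw new IllegalStateException("simulated bug");
        }
    }

    // Broad handler: every failure, including unrelated bugs, becomes code 1.
    static int executeCatchingThrowable(String failure) {
        try {
            runTask(failure);
            return 0;
        } catch (Throwable t) {
            return 1;
        }
    }

    // Narrow handler: only the OOM is converted; other failures propagate.
    static int executeCatchingOom(String failure) {
        try {
            runTask(failure);
            return 0;
        } catch (OutOfMemoryError e) {
            return 1;
        }
    }

    public static void main(String[] args) {
        System.out.println(executeCatchingThrowable("bug"));
        System.out.println(executeCatchingOom("oom"));
    }
}
```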

 

 

Thanks and Regards,

-Mohit



Review Request: HIVE-1538

2011-04-05 Thread Amareshwari Sriramadasu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/550/
---

Review request for hive, Yongqiang He and namit jain.


Summary
---

Patch updated to trunk with newly added configuration var 
hive.ppd.remove.duplicatefilters


This addresses bug HIVE-1538.
https://issues.apache.org/jira/browse/HIVE-1538


Diffs
-

  trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1088944 
  trunk/contrib/src/test/results/clientpositive/dboutput.q.out 1088944 
  trunk/contrib/src/test/results/clientpositive/serde_typedbytes4.q.out 1088944 
  trunk/hbase-handler/src/test/results/hbase_pushdown.q.out 1088944 
  trunk/hbase-handler/src/test/results/hbase_queries.q.out 1088944 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java 1088944 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/ExprWalkerInfo.java 1088944 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/ExprWalkerProcFactory.java 
1088944 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java 1088944 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/OpWalkerInfo.java 1088944 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/PredicatePushDown.java 
1088944 
  trunk/ql/src/test/queries/clientpositive/ppd1.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_clusterby.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_constant_expr.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_gby.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_gby2.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_gby_join.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_join.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_join2.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_join3.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_multi_insert.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_outer_join1.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_outer_join2.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_outer_join3.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_outer_join4.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_random.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_transform.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_udf_case.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_union.q 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join0.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join11.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join12.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join13.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join14.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join16.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join19.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join20.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join21.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join23.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join27.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join4.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join5.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join6.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join7.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join8.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join9.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/bucket2.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/bucket3.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/bucket4.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/bucket_groupby.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/bucketmapjoin1.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/bucketmapjoin2.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/bucketmapjoin3.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/bucketmapjoin_negative.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/case_sensitivity.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/cast1.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/cluster.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/combine2.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/create_view.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/disable_merge_for_bucketing.q.out 
1088944 
  trunk/ql/src/test/results/clientpositive/filter_join_breaktask.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/groupby_map_ppr.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/groupby_map_ppr_multi_distinct.q.out 
1088944 
  trunk/ql/src/test/results/clientpositive/groupby_p

[jira] [Created] (HIVE-2091) Test scripts need to be made deterministic in their output

2011-04-05 Thread Roman Shaposhnik (JIRA)
Test scripts need to be made deterministic in their output
--

 Key: HIVE-2091
 URL: https://issues.apache.org/jira/browse/HIVE-2091
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Affects Versions: 0.7.0
Reporter: Roman Shaposhnik
Priority: Minor


Currently these 2 query scripts generate non-deterministic output:

The suggestion is to use a GROUP BY statement.



[jira] [Updated] (HIVE-2091) Test scripts rcfile_columnar.q and join_filters.q need to be made deterministic in their output

2011-04-05 Thread Roman Shaposhnik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Shaposhnik updated HIVE-2091:
---

Description: 
Currently these 2 query scripts generate non-deterministic output: 
  * ql/src/test/queries/clientpositive/rcfile_columnar.q
  * ql/src/test/queries/clientpositive/join_filters.q  

The suggestion is to use a GROUP BY statement.

  was:
Currently this 2 query scripts generate non-deterministic output:

The suggestion is to use GROUP BY statement.

Summary: Test scripts rcfile_columnar.q and join_filters.q   need to be 
made deterministic in their output  (was: Test scripts need to be made 
deterministic in their output)

> Test scripts rcfile_columnar.q and join_filters.q   need to be made 
> deterministic in their output
> -
>
> Key: HIVE-2091
> URL: https://issues.apache.org/jira/browse/HIVE-2091
> Project: Hive
>  Issue Type: Bug
>  Components: Testing Infrastructure
>Affects Versions: 0.7.0
>Reporter: Roman Shaposhnik
>Priority: Minor
>
> Currently these 2 query scripts generate non-deterministic output: 
>   * ql/src/test/queries/clientpositive/rcfile_columnar.q
>   * ql/src/test/queries/clientpositive/join_filters.q  
> The suggestion is to use a GROUP BY statement.
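
One way to picture the suggestion is that aggregated output depends only on the set of rows, not on the order in which they arrive. The sketch below is an analogy in plain Java rather than Hive behaviour: DeterministicOutputSketch is a hypothetical name, and the TreeMap stands in for a grouped result emitted in key order, which is an assumption of this sketch and not something the issue states.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical analogy: grouping makes the result independent of row arrival order.
class DeterministicOutputSketch {
    // Counts rows per key; a TreeMap iterates in key order regardless of input order.
    static String groupedCounts(List<String> rows) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String row : rows) {
            counts.merge(row, 1, Integer::sum);
        }
        return counts.toString();
    }

    public static void main(String[] args) {
        System.out.println(groupedCounts(Arrays.asList("b", "a", "b")));
        System.out.println(groupedCounts(Arrays.asList("a", "b", "b")));
    }
}
```

Both calls print the same line, which is the property a test's expected-output file needs.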



[jira] [Updated] (HIVE-2068) Speed up query "select xx,xx from xxx LIMIT xxx" if no filtering or aggregation

2011-04-05 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-2068:
-

Status: Open  (was: Patch Available)

> Speed up query "select xx,xx from xxx LIMIT xxx" if no filtering or 
> aggregation
> ---
>
> Key: HIVE-2068
> URL: https://issues.apache.org/jira/browse/HIVE-2068
> Project: Hive
>  Issue Type: Improvement
>Reporter: Siying Dong
>Assignee: Siying Dong
> Attachments: HIVE-2068.1.patch, HIVE-2068.2.patch, HIVE-2068.3.patch
>
>
> Currently, "select xx,xx from xxx where ...(only partition conditions) LIMIT 
> xxx" will start a MapReduce job with input to be the whole table or 
> partition. The latency can be huge if the table or partition is big. We could 
> reduce number of input files to speed up the queries.
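
The idea of reducing the number of input files can be sketched as follows. This is a guess at the general shape only: LimitInputPruningSketch, pickInputs, and the per-file row estimates are hypothetical and are not taken from the attached patches. The sketch keeps adding files until the estimated row count covers the LIMIT.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; the row estimates are an assumption of this example.
class LimitInputPruningSketch {
    // Picks a prefix of the input files whose estimated rows cover the limit.
    static List<String> pickInputs(List<String> files,
                                   List<Long> estimatedRows,
                                   long limit) {
        List<String> chosen = new ArrayList<>();
        long covered = 0;
        for (int i = 0; i < files.size() && covered < limit; i++) {
            chosen.add(files.get(i));
            covered += estimatedRows.get(i);
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList("part-0", "part-1", "part-2");
        List<Long> rows = Arrays.asList(100L, 100L, 100L);
        System.out.println(pickInputs(files, rows, 150));
    }
}
```

If the estimates turn out to be too optimistic, a real implementation would presumably need a fallback, such as rerunning with more input.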



[jira] [Commented] (HIVE-2068) Speed up query "select xx,xx from xxx LIMIT xxx" if no filtering or aggregation

2011-04-05 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016030#comment-13016030
 ] 

Namit Jain commented on HIVE-2068:
--

comments in review-board



[jira] [Updated] (HIVE-2054) Exception on windows when using the jdbc driver. "IOException: The system cannot find the path specified"

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-2054:
-

   Resolution: Fixed
Fix Version/s: 0.7.1
 Hadoop Flags: [Reviewed]
   Status: Resolved  (was: Patch Available)

Committed to trunk and 0.7.1. Thanks Bennie!

> Exception on windows when using the jdbc driver. "IOException: The system 
> cannot find the path specified"
> -
>
> Key: HIVE-2054
> URL: https://issues.apache.org/jira/browse/HIVE-2054
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 0.8.0
>Reporter: Bennie Schut
>Assignee: Bennie Schut
>Priority: Minor
> Fix For: 0.7.1, 0.8.0
>
> Attachments: HIVE-2054.1.patch.txt, HIVE-2054.2.patch.txt, 
> HIVE-2054.3.patch.txt
>
>
> It seems something recently changed in the JDBC driver which causes this 
> IOException on Windows.
> java.lang.RuntimeException: java.io.IOException: The system cannot find the 
> path specified
>   at 
> org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:237)
>   at 
> org.apache.hadoop.hive.jdbc.HiveConnection.<init>(HiveConnection.java:73)
>   at org.apache.hadoop.hive.jdbc.HiveDriver.connect(HiveDriver.java:110)



[jira] [Commented] (HIVE-2090) Add "DROP DATABASE ... FORCE"

2011-04-05 Thread John Sichi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016059#comment-13016059
 ] 

John Sichi commented on HIVE-2090:
--

I'd say go with default RESTRICT for safety.

> Add "DROP DATABASE ... FORCE"
> -
>
> Key: HIVE-2090
> URL: https://issues.apache.org/jira/browse/HIVE-2090
> Project: Hive
>  Issue Type: New Feature
>Reporter: Siying Dong
>Assignee: Siying Dong
>Priority: Minor
> Attachments: HIVE-2090.1.patch
>
>
> A "DROP DATABASE ... FORCE" would be useful when we use a database for 
> isolation in tests. Being able to force a cleanup of the database would 
> make test cleanup easier.



[jira] [Updated] (HIVE-2091) Test scripts rcfile_columnar.q and join_filters.q need to be made deterministic in their output

2011-04-05 Thread Roman Shaposhnik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Shaposhnik updated HIVE-2091:
---

Status: Patch Available  (was: Open)

Please take a look at the attached patch

> Test scripts rcfile_columnar.q and join_filters.q   need to be made 
> deterministic in their output
> -
>
> Key: HIVE-2091
> URL: https://issues.apache.org/jira/browse/HIVE-2091
> Project: Hive
>  Issue Type: Bug
>  Components: Testing Infrastructure
>Affects Versions: 0.7.0
>Reporter: Roman Shaposhnik
>Priority: Minor
>
> Currently these two query scripts generate non-deterministic output: 
>   * ql/src/test/queries/clientpositive/rcfile_columnar.q
>   * ql/src/test/queries/clientpositive/join_filters.q  
> The suggestion is to use a GROUP BY statement.



[jira] [Updated] (HIVE-2091) Test scripts rcfile_columnar.q and join_filters.q need to be made deterministic in their output

2011-04-05 Thread Roman Shaposhnik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Shaposhnik updated HIVE-2091:
---

Attachment: HIVE-2091.patch

> Test scripts rcfile_columnar.q and join_filters.q   need to be made 
> deterministic in their output
> -
>
> Key: HIVE-2091
> URL: https://issues.apache.org/jira/browse/HIVE-2091
> Project: Hive
>  Issue Type: Bug
>  Components: Testing Infrastructure
>Affects Versions: 0.7.0
>Reporter: Roman Shaposhnik
>Priority: Minor
> Attachments: HIVE-2091.patch
>
>
> Currently these two query scripts generate non-deterministic output: 
>   * ql/src/test/queries/clientpositive/rcfile_columnar.q
>   * ql/src/test/queries/clientpositive/join_filters.q  
> The suggestion is to use a GROUP BY statement.



[jira] [Updated] (HIVE-2091) Test scripts rcfile_columnar.q and join_filters.q need to be made deterministic in their output

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-2091:
-

Description: 
Currently these two query scripts generate non-deterministic output: 
  * ql/src/test/queries/clientpositive/rcfile_columnar.q
  * ql/src/test/queries/clientpositive/join_filters.q  

The suggestion is to use an ORDER BY statement.

  was:
Currently these two query scripts generate non-deterministic output: 
  * ql/src/test/queries/clientpositive/rcfile_columnar.q
  * ql/src/test/queries/clientpositive/join_filters.q  

The suggestion is to use a GROUP BY statement.


> Test scripts rcfile_columnar.q and join_filters.q   need to be made 
> deterministic in their output
> -
>
> Key: HIVE-2091
> URL: https://issues.apache.org/jira/browse/HIVE-2091
> Project: Hive
>  Issue Type: Bug
>  Components: Testing Infrastructure
>Affects Versions: 0.7.0
>Reporter: Roman Shaposhnik
>Priority: Minor
> Attachments: HIVE-2091.patch
>
>
> Currently these two query scripts generate non-deterministic output: 
>   * ql/src/test/queries/clientpositive/rcfile_columnar.q
>   * ql/src/test/queries/clientpositive/join_filters.q  
> The suggestion is to use an ORDER BY statement.
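
As a rough illustration of why an explicit ORDER BY makes golden-file comparisons stable, the sketch below uses Python's built-in sqlite3 module as a stand-in for Hive. The table name, columns, and helper function are made up for the example, and SQLite's behaviour is only an analogy for how Hive returns rows.

```python
import sqlite3


def query_rows(rows_in_load_order, order_by):
    """Load rows into a scratch table and read them back.

    SQLite stands in for Hive here; the table 'src' and its columns
    are hypothetical.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE src (key INTEGER, value TEXT)")
    conn.executemany("INSERT INTO src VALUES (?, ?)", rows_in_load_order)
    sql = "SELECT key, value FROM src"
    if order_by:
        # An explicit total order removes any dependence on load order.
        sql += " ORDER BY key, value"
    rows = conn.execute(sql).fetchall()
    conn.close()
    return rows


load_a = [(2, "b"), (1, "a"), (3, "c")]
load_b = [(1, "a"), (3, "c"), (2, "b")]

# With ORDER BY, both load orders give the same output.
same_when_ordered = query_rows(load_a, True) == query_rows(load_b, True)
```

Without the ORDER BY clause, the row order is whatever the engine happens to produce, so a byte-for-byte diff against a stored expected output can fail even though the result set is the same. GROUP BY alone generally does not promise an output order either.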



Jenkins build is back to normal : Hive-trunk-h0.20 #655

2011-04-05 Thread Apache Hudson Server
See 




Build failed in Jenkins: Hive-0.7.0-h0.20 #66

2011-04-05 Thread Apache Hudson Server
See 

--
[...truncated 27402 lines...]
[junit] Hive history 
file=
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: CREATETABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: CREATETABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: load data local inpath 
'
 into table testhivedrivertable
[junit] PREHOOK: type: LOAD
[junit] Copying data from 

[junit] Loading data to table default.testhivedrivertable
[junit] POSTHOOK: query: load data local inpath 
'
 into table testhivedrivertable
[junit] POSTHOOK: type: LOAD
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: select count(1) as cnt from testhivedrivertable
[junit] PREHOOK: type: QUERY
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/hudson/hive_2011-04-05_12-46-50_400_7485125692539403914/-mr-1
[junit] Total MapReduce jobs = 1
[junit] Launching Job 1 out of 1
[junit] Number of reduce tasks determined at compile time: 1
[junit] In order to change the average load for a reducer (in bytes):
[junit]   set hive.exec.reducers.bytes.per.reducer=
[junit] In order to limit the maximum number of reducers:
[junit]   set hive.exec.reducers.max=
[junit] In order to set a constant number of reducers:
[junit]   set mapred.reduce.tasks=
[junit] Job running in-process (local Hadoop)
[junit] 2011-04-05 12:46:53,444 null map = 100%,  reduce = 100%
[junit] Ended Job = job_local_0001
[junit] POSTHOOK: query: select count(1) as cnt from testhivedrivertable
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/hudson/hive_2011-04-05_12-46-50_400_7485125692539403914/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history 
file=
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: CREATETABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: CREATETABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: load data local inpath 
'
 into table testhivedrivertable
[junit] PREHOOK: type: LOAD
[junit] Copying data from 

[junit] Loading data to table default.testhivedrivertable
[junit] POSTHOOK: query: load data local inpath 
'
 into table testhivedrivertable
[junit] POSTHOOK: type: LOAD
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: select * from testhivedrivertable limit 10
[junit] PREHOOK: type: QUERY
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/hudson/hive_2011-04-05_12-46-55_145_3892655606022687079/-mr-1
[junit] POSTHOOK: query: select * from testhivedrivertable limit 10
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/hudson/hive_2011-04-05_12-46-55_145_3892655606022687079/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedr

[jira] [Resolved] (HIVE-2072) test

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach resolved HIVE-2072.
--

Resolution: Incomplete

> test
> 
>
> Key: HIVE-2072
> URL: https://issues.apache.org/jira/browse/HIVE-2072
> Project: Hive
>  Issue Type: Test
>Reporter: YoungYik
>Priority: Trivial
>




[jira] [Commented] (HIVE-2066) Metastore Schema Scripts

2011-04-05 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016098#comment-13016098
 ] 

Carl Steinbach commented on HIVE-2066:
--

This ticket is being used as a convenient public storage space for versioned 
dumps of the Hive MetaStore database schema.

> Metastore Schema Scripts
> 
>
> Key: HIVE-2066
> URL: https://issues.apache.org/jira/browse/HIVE-2066
> Project: Hive
>  Issue Type: Task
>  Components: Metastore
>Reporter: Carl Steinbach
>Assignee: Carl Steinbach
> Attachments: hive-schema-0.3.0.derby.sql, 
> hive-schema-0.3.0.mysql.sql, hive-schema-0.3.0.postgres.sql, 
> hive-schema-0.4.0.derby.sql, hive-schema-0.4.0.mysql.sql, 
> hive-schema-0.4.0.postgres.sql, hive-schema-0.4.1.derby.sql, 
> hive-schema-0.4.1.mysql.sql, hive-schema-0.4.1.postgres.sql, 
> hive-schema-0.5.0.derby.sql, hive-schema-0.5.0.mysql.sql, 
> hive-schema-0.5.0.postgres.sql, hive-schema-0.6.0.derby.sql, 
> hive-schema-0.6.0.mysql.sql, hive-schema-0.6.0.postgres.sql, 
> hive-schema-0.7.0.derby.sql, hive-schema-0.7.0.mysql.sql, 
> hive-schema-0.7.0.postgres.sql
>
>




[jira] [Resolved] (HIVE-2066) Metastore Schema Scripts

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach resolved HIVE-2066.
--

Resolution: Not A Problem

> Metastore Schema Scripts
> 
>
> Key: HIVE-2066
> URL: https://issues.apache.org/jira/browse/HIVE-2066
> Project: Hive
>  Issue Type: Task
>  Components: Metastore
>Reporter: Carl Steinbach
>Assignee: Carl Steinbach
> Attachments: hive-schema-0.3.0.derby.sql, 
> hive-schema-0.3.0.mysql.sql, hive-schema-0.3.0.postgres.sql, 
> hive-schema-0.4.0.derby.sql, hive-schema-0.4.0.mysql.sql, 
> hive-schema-0.4.0.postgres.sql, hive-schema-0.4.1.derby.sql, 
> hive-schema-0.4.1.mysql.sql, hive-schema-0.4.1.postgres.sql, 
> hive-schema-0.5.0.derby.sql, hive-schema-0.5.0.mysql.sql, 
> hive-schema-0.5.0.postgres.sql, hive-schema-0.6.0.derby.sql, 
> hive-schema-0.6.0.mysql.sql, hive-schema-0.6.0.postgres.sql, 
> hive-schema-0.7.0.derby.sql, hive-schema-0.7.0.mysql.sql, 
> hive-schema-0.7.0.postgres.sql
>
>




[jira] [Resolved] (HIVE-1668) Move HWI out to Github

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach resolved HIVE-1668.
--

Resolution: Not A Problem

> Move HWI out to Github
> --
>
> Key: HIVE-1668
> URL: https://issues.apache.org/jira/browse/HIVE-1668
> Project: Hive
>  Issue Type: Improvement
>  Components: Web UI
>Reporter: Jeff Hammerbacher
>
> I have seen HWI cause a number of build and test errors, and it's now going 
> to cost us some extra work for integration with security. We've worked on 
> hundreds of clusters at Cloudera and I've never seen anyone use HWI. With the 
> Beeswax UI available in Hue, it's unlikely that anyone would prefer to stick 
> with HWI. I think it's time to move it out to Github.



[jira] [Created] (HIVE-2093) inputs and outputs should be populated for create/drop database

2011-04-05 Thread Namit Jain (JIRA)
inputs and outputs should be populated for create/drop database
---

 Key: HIVE-2093
 URL: https://issues.apache.org/jira/browse/HIVE-2093
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain
Assignee: Siying Dong


This is needed for many other things, such as concurrency and authorization, to work.



[jira] [Created] (HIVE-2092) support 'drop database force';

2011-04-05 Thread Namit Jain (JIRA)
support 'drop database  force';
---

 Key: HIVE-2092
 URL: https://issues.apache.org/jira/browse/HIVE-2092
 Project: Hive
  Issue Type: New Feature
Reporter: Namit Jain
Assignee: Siying Dong


Currently, the above command fails if the database is not empty.




[jira] [Resolved] (HIVE-2092) support 'drop database force';

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach resolved HIVE-2092.
--

Resolution: Duplicate

Duplicate of HIVE-2090

> support 'drop database  force';
> ---
>
> Key: HIVE-2092
> URL: https://issues.apache.org/jira/browse/HIVE-2092
> Project: Hive
>  Issue Type: New Feature
>Reporter: Namit Jain
>Assignee: Siying Dong
>
> Currently, the above command fails if the database is not empty.



[jira] [Updated] (HIVE-2090) Add "DROP DATABASE ... FORCE"

2011-04-05 Thread Siying Dong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siying Dong updated HIVE-2090:
--

Attachment: HIVE-2090.2.patch

This is an in-progress patch. It changes the syntax to "CASCADE/RESTRICT" 
instead of "FORCE". We also had some discussion offline and decided to do the 
logic at the object store level, so I need to make some more changes. We'll 
open other issues for fixing concurrency and authorization around dropping and 
creating databases.
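
For readers following the RESTRICT/CASCADE discussion, here is a small sketch of the intended semantics, with RESTRICT as the default as suggested earlier in the thread. It is an in-memory model only: the Catalog class and its methods are hypothetical and are not Hive's ObjectStore API or the code in the attached patches.

```python
class Catalog:
    """Toy in-memory catalog used only to illustrate RESTRICT vs CASCADE."""

    def __init__(self):
        self.databases = {}  # database name -> set of table names

    def create_table(self, db, table):
        self.databases.setdefault(db, set()).add(table)

    def drop_database(self, db, cascade=False):
        tables = self.databases[db]
        if tables and not cascade:
            # RESTRICT (the default): refuse to drop a non-empty database.
            raise ValueError("database %s is not empty" % db)
        # CASCADE: the tables are dropped together with the database.
        del self.databases[db]
```

Defaulting to RESTRICT means that a plain drop of a non-empty database keeps failing as it does today, and data is only removed when the caller asks for CASCADE explicitly.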

> Add "DROP DATABASE ... FORCE"
> -
>
> Key: HIVE-2090
> URL: https://issues.apache.org/jira/browse/HIVE-2090
> Project: Hive
>  Issue Type: New Feature
>Reporter: Siying Dong
>Assignee: Siying Dong
>Priority: Minor
> Attachments: HIVE-2090.1.patch, HIVE-2090.2.patch
>
>
> A "DROP DATABASE ... FORCE" would be useful when we use a database for 
> isolation in tests. Being able to force a cleanup of the database would 
> make test cleanup easier.



[jira] [Created] (HIVE-2094) CREATE and DROP DATABASE doesn't check user permission for doing it

2011-04-05 Thread Siying Dong (JIRA)
CREATE and DROP DATABASE doesn't check user permission for doing it
---

 Key: HIVE-2094
 URL: https://issues.apache.org/jira/browse/HIVE-2094
 Project: Hive
  Issue Type: Bug
Reporter: Siying Dong
Assignee: He Yongqiang


We need to make sure only users with system permission can do it.



[jira] [Updated] (HIVE-1803) Implement bitmap indexing in Hive

2011-04-05 Thread Marquis Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marquis Wang updated HIVE-1803:
---

Status: Patch Available  (was: Open)

> Implement bitmap indexing in Hive
> -
>
> Key: HIVE-1803
> URL: https://issues.apache.org/jira/browse/HIVE-1803
> Project: Hive
>  Issue Type: New Feature
>  Components: Indexing
>Reporter: Marquis Wang
>Assignee: Marquis Wang
> Attachments: HIVE-1803.1.patch, HIVE-1803.10.patch, 
> HIVE-1803.11.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, 
> HIVE-1803.5.patch, HIVE-1803.6.patch, HIVE-1803.7.patch, HIVE-1803.8.patch, 
> HIVE-1803.9.patch, JavaEWAH_20110304.zip, bitmap_index_1.png, 
> bitmap_index_2.png, javaewah.jar, javaewah.jar, unit-tests.patch
>
>
> Implement bitmap index handler to complement compact indexing.



[jira] [Updated] (HIVE-1803) Implement bitmap indexing in Hive

2011-04-05 Thread Marquis Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marquis Wang updated HIVE-1803:
---

Attachment: unit-tests.patch
HIVE-1803.11.patch

New patch that fixes the minor javadocs comments from patch 10.

A unit-tests patch that updates all the unit tests that were affected by the 
virtual column change.

> Implement bitmap indexing in Hive
> -
>
> Key: HIVE-1803
> URL: https://issues.apache.org/jira/browse/HIVE-1803
> Project: Hive
>  Issue Type: New Feature
>  Components: Indexing
>Reporter: Marquis Wang
>Assignee: Marquis Wang
> Attachments: HIVE-1803.1.patch, HIVE-1803.10.patch, 
> HIVE-1803.11.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, 
> HIVE-1803.5.patch, HIVE-1803.6.patch, HIVE-1803.7.patch, HIVE-1803.8.patch, 
> HIVE-1803.9.patch, JavaEWAH_20110304.zip, bitmap_index_1.png, 
> bitmap_index_2.png, javaewah.jar, javaewah.jar, unit-tests.patch
>
>
> Implement bitmap index handler to complement compact indexing.



Re: Review Request: HIVE-1988

2011-04-05 Thread Devaraj Das


> On 2011-04-05 07:52:15, Amareshwari Sriramadasu wrote:
> > http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java,
> >  line 152
> > 
> >
> > HadoopShims.isSecureShimImpl() is not called anywhere else. Shall we 
> > remove it if not required anymore?

I suggest we leave it there. This seems like a useful method, and I am actually 
using it in another patch.


> On 2011-04-05 07:52:15, Amareshwari Sriramadasu wrote:
> > http://svn.apache.org/repos/asf/hive/trunk/shims/src/test/org/apache/hadoop/hive/thrift/TestHadoop20SAuthBridge.java,
> >  lines 144-156
> > 
> >
> > Do you want to move this into setup(), as it is common in both 
> > testcases?

Done


> On 2011-04-05 07:52:15, Amareshwari Sriramadasu wrote:
> > http://svn.apache.org/repos/asf/hive/trunk/shims/src/test/org/apache/hadoop/hive/thrift/TestHadoop20SAuthBridge.java,
> >  lines 192-209
> > 
> >
> > code looks duplicated. Can it be refactored by passing group names to a 
> > method?

Done


- Devaraj


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/528/#review386
---


On 2011-03-29 10:26:38, Devaraj Das wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/528/
> ---
> 
> (Updated 2011-03-29 10:26:38)
> 
> 
> Review request for hive.
> 
> 
> Summary
> ---
> 
> Fixes to some security issues discussed in HIVE-1988
> 
> 
> This addresses bug HIVE-1988.
> https://issues.apache.org/jira/browse/HIVE-1988
> 
> 
> Diffs
> -
> 
>   
> http://svn.apache.org/repos/asf/hive/trunk/metastore/if/hive_metastore.thrift 
> 1085623 
>   
> http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
>  1085623 
>   
> http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
>  1085623 
>   
> http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java
>  1085623 
>   
> http://svn.apache.org/repos/asf/hive/trunk/shims/src/0.20S/java/org/apache/hadoop/hive/thrift/DelegationTokenSecretManager.java
>  1085623 
>   
> http://svn.apache.org/repos/asf/hive/trunk/shims/src/0.20S/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge20S.java
>  1085623 
>   
> http://svn.apache.org/repos/asf/hive/trunk/shims/src/common/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge.java
>  1085623 
>   
> http://svn.apache.org/repos/asf/hive/trunk/shims/src/test/org/apache/hadoop/hive/thrift/TestHadoop20SAuthBridge.java
>  1085623 
> 
> Diff: https://reviews.apache.org/r/528/diff
> 
> 
> Testing
> ---
> 
> A new unit test was added, and it passes. All unit tests passed.
> 
> 
> Thanks,
> 
> Devaraj
> 
>



Re: Review Request: HIVE-1988

2011-04-05 Thread Devaraj Das

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/528/
---

(Updated 2011-04-05 21:24:34.129643)


Review request for hive.


Changes
---

Addressed Amareshwari's comments.


Summary
---

Fixes to some security issues discussed in HIVE-1988


This addresses bug HIVE-1988.
https://issues.apache.org/jira/browse/HIVE-1988


Diffs (updated)
-

  http://svn.apache.org/repos/asf/hive/trunk/metastore/if/hive_metastore.thrift 
1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore_server.skeleton.cpp
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ThriftHiveMetastore.java
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-php/hive_metastore/ThriftHiveMetastore.php
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore-remote
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-rb/thrift_hive_metastore.rb
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/shims/src/0.20S/java/org/apache/hadoop/hive/thrift/DelegationTokenSecretManager.java
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/shims/src/0.20S/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge20S.java
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/shims/src/common/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge.java
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/shims/src/test/org/apache/hadoop/hive/thrift/TestHadoop20SAuthBridge.java
 1089155 

Diff: https://reviews.apache.org/r/528/diff


Testing
---

A new unit test was added, and it passes. All unit tests passed.


Thanks,

Devaraj



[jira] [Commented] (HIVE-1988) Make the delegation token issued by the MetaStore owned by the right user

2011-04-05 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016128#comment-13016128
 ] 

jirapos...@reviews.apache.org commented on HIVE-1988:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/528/
---

(Updated 2011-04-05 21:24:34.129643)


Review request for hive.


Changes
---

Addressed Amareshwari's comments.


Summary
---

Fixes to some security issues discussed in HIVE-1988


This addresses bug HIVE-1988.
https://issues.apache.org/jira/browse/HIVE-1988


Diffs (updated)
-

  http://svn.apache.org/repos/asf/hive/trunk/metastore/if/hive_metastore.thrift 
1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore_server.skeleton.cpp
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ThriftHiveMetastore.java
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-php/hive_metastore/ThriftHiveMetastore.php
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore-remote
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-rb/thrift_hive_metastore.rb
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/shims/src/0.20S/java/org/apache/hadoop/hive/thrift/DelegationTokenSecretManager.java
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/shims/src/0.20S/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge20S.java
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/shims/src/common/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge.java
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/shims/src/test/org/apache/hadoop/hive/thrift/TestHadoop20SAuthBridge.java
 1089155 

Diff: https://reviews.apache.org/r/528/diff


Testing
---

A new unit test was added, and it passes. All unit tests passed.


Thanks,

Devaraj



> Make the delegation token issued by the MetaStore owned by the right user
> -
>
> Key: HIVE-1988
> URL: https://issues.apache.org/jira/browse/HIVE-1988
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Security, Server Infrastructure
>Affects Versions: 0.7.0
>Reporter: Devaraj Das
>Assignee: Devaraj Das
> Fix For: 0.8.0
>
> Attachments: hive-1988-3.patch, hive-1988.patch
>
>
> The 'owner' of any delegation token issued by the MetaStore is set to the 
> requesting user. When a user requests a delegation token directly during 
> job submission, this is fine. However, when the token is requested by a 
> service (e.g., Oozie) on behalf of the user, the token's owner is set to 
> the user the service is running as. Later on, when the token is used by a 
> MapReduce task, the MetaStore treats the incoming request as coming from 
> Oozie and performs operations as Oozie. This means any new directories 
> created on HDFS by the MetaStore (e.g., by create_table) will end up with 
> Oozie as the owner.
> Also, the MetaStore doesn't check whether a user asking for a token on behalf 
> of some other user is actually authorized to act on behalf of that other 
> user. We should start using the ProxyUser authorization in the MetaStore 
> (HADOOP-6510's APIs).
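
The two problems in the description (wrong token owner, and no check that the caller may act for another user) can be sketched as follows. This is a simplified model, not the MetaStore or Hadoop code: the DelegationToken fields, the issue_token function, and the proxy_rules mapping are assumptions for illustration. Hadoop's actual proxy-user settings are, as far as I know, expressed in terms of groups and hosts rather than a per-user allow list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DelegationToken:
    owner: str      # user the token acts as when it is later presented
    real_user: str  # authenticated caller that asked for the token


def issue_token(caller, on_behalf_of, proxy_rules):
    """Issue a token owned by the end user, not by the calling service.

    proxy_rules is a hypothetical mapping: service name -> set of users
    that service may impersonate.
    """
    if on_behalf_of == caller:
        # A user asking for their own token needs no proxy check.
        return DelegationToken(owner=caller, real_user=caller)
    if on_behalf_of not in proxy_rules.get(caller, set()):
        raise PermissionError(
            "%s may not act on behalf of %s" % (caller, on_behalf_of))
    return DelegationToken(owner=on_behalf_of, real_user=caller)
```

With this shape, a token requested by Oozie for a user carries that user as owner, so directories created later with the token would belong to the user rather than to Oozie.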



[jira] [Commented] (HIVE-1988) Make the delegation token issued by the MetaStore owned by the right user

2011-04-05 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016129#comment-13016129
 ] 

jirapos...@reviews.apache.org commented on HIVE-1988:
-



bq.  On 2011-04-05 07:52:15, Amareshwari Sriramadasu wrote:
bq.  > 
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java,
 line 152
bq.  > 
bq.  >
bq.  > HadoopShims.isSecureShimImpl() is not called anywhere else. Shall we 
remove it if not required anymore?

I suggest we leave it there. This seems like a useful method, and I am actually 
using it in another patch.


bq.  On 2011-04-05 07:52:15, Amareshwari Sriramadasu wrote:
bq.  > 
http://svn.apache.org/repos/asf/hive/trunk/shims/src/test/org/apache/hadoop/hive/thrift/TestHadoop20SAuthBridge.java,
 lines 144-156
bq.  > 
bq.  >
bq.  > Do you want to move this into setup(), as it is common in both 
testcases?

Done


bq.  On 2011-04-05 07:52:15, Amareshwari Sriramadasu wrote:
bq.  > 
http://svn.apache.org/repos/asf/hive/trunk/shims/src/test/org/apache/hadoop/hive/thrift/TestHadoop20SAuthBridge.java,
 lines 192-209
bq.  > 
bq.  >
bq.  > code looks duplicated. Can it be refactored by passing group names 
to a method?

Done


- Devaraj


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/528/#review386
---


On 2011-03-29 10:26:38, Devaraj Das wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/528/
bq.  ---
bq.  
bq.  (Updated 2011-03-29 10:26:38)
bq.  
bq.  
bq.  Review request for hive.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Fixes to some security issues discussed in HIVE-1988
bq.  
bq.  
bq.  This addresses bug HIVE-1988.
bq.  https://issues.apache.org/jira/browse/HIVE-1988
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    http://svn.apache.org/repos/asf/hive/trunk/metastore/if/hive_metastore.thrift 1085623 
bq.    http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 1085623 
bq.    http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 1085623 
bq.    http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 1085623 
bq.    http://svn.apache.org/repos/asf/hive/trunk/shims/src/0.20S/java/org/apache/hadoop/hive/thrift/DelegationTokenSecretManager.java 1085623 
bq.    http://svn.apache.org/repos/asf/hive/trunk/shims/src/0.20S/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge20S.java 1085623 
bq.    http://svn.apache.org/repos/asf/hive/trunk/shims/src/common/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge.java 1085623 
bq.    http://svn.apache.org/repos/asf/hive/trunk/shims/src/test/org/apache/hadoop/hive/thrift/TestHadoop20SAuthBridge.java 1085623 
bq.  
bq.  Diff: https://reviews.apache.org/r/528/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  New unit test added and that passes. All unit tests passed.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Devaraj
bq.  
bq.



> Make the delegation token issued by the MetaStore owned by the right user
> -
>
> Key: HIVE-1988
> URL: https://issues.apache.org/jira/browse/HIVE-1988
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Security, Server Infrastructure
>Affects Versions: 0.7.0
>Reporter: Devaraj Das
>Assignee: Devaraj Das
> Fix For: 0.8.0
>
> Attachments: hive-1988-3.patch, hive-1988.patch
>
>
> The 'owner' of any delegation token issued by the MetaStore is set to the 
> requesting user. When a delegation token is requested by the user himself during 
> a job submission, this is fine. However, in the case where the token is 
> requested by services (e.g., Oozie) on behalf of the user, the token's 
> owner is set to the user the service is running as. Later on, when the token 
> is used by a MapReduce task, the MetaStore treats the incoming request as 
> coming from Oozie and does operations as Oozie. This means any new directory 
> creations (e.g., create_table) on HDFS by the MetaStore will end up with 
> Oozie as the owner.
> Also, the MetaStore doesn't check whether a user asking for a token on behalf 
> of

[jira] [Updated] (HIVE-867) Add add UDFs found in mysql

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-867:


Component/s: UDF

> Add add UDFs found in mysql
> ---
>
> Key: HIVE-867
> URL: https://issues.apache.org/jira/browse/HIVE-867
> Project: Hive
>  Issue Type: New Feature
>  Components: UDF
>Reporter: Edward Capriolo
>Assignee: Edward Capriolo
> Attachments: hive-867-1.diff, hive-867-10.diff, hive-867-2.diff, 
> hive-867-3.diff, hive-867-7.diff
>
>
> Some UDF's that mysql has that hive does not. 
> atan
> aes_decrypt
> aes_encrypt
> bit_and
> bit_count
> bit_length
> bit_or
> bit_xor
> char_length
> char
> character_length
> collation
> compress
> crc32
> encode
> encrypt
> format
> greatest
> in
> inet_oton
> inet_ntoa
> match
> md5
> oct
> ord
> pi
> radians
> sha1 _sha
> sign
> sleep
> truncate

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-2095) auto convert map join should not be triggered if the input size is bigger than a configured value.

2011-04-05 Thread He Yongqiang (JIRA)
auto convert map join should not be triggered if the input size is bigger than 
a configured value.
--

 Key: HIVE-2095
 URL: https://issues.apache.org/jira/browse/HIVE-2095
 Project: Hive
  Issue Type: Bug
Reporter: He Yongqiang
Assignee: He Yongqiang


If auto convert join is set to true, it should fall back to common join if the 
input size of each join table is bigger than a configured value.
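
The fallback described above can be sketched as follows. This is only an illustration: the class, the enum and the threshold parameter are hypothetical names, not taken from the Hive code base, and the issue does not say which configuration property would hold the limit. The sketch reads the description literally and falls back to the common join only when every join input is larger than the configured value.

```java
import java.util.Arrays;

/** Hypothetical helper; none of these names come from the Hive code base. */
final class JoinStrategyChooser {
    enum JoinStrategy { MAP_JOIN, COMMON_JOIN }

    /**
     * Picks a join strategy from the input sizes (in bytes).
     * A map join is kept only if at least one input fits under the threshold.
     */
    static JoinStrategy choose(boolean autoConvertJoin, long[] inputSizesBytes, long thresholdBytes) {
        if (!autoConvertJoin || inputSizesBytes.length == 0) {
            return JoinStrategy.COMMON_JOIN;
        }
        long smallest = Arrays.stream(inputSizesBytes).min().getAsLong();
        // Every input is bigger than the threshold exactly when the smallest one is.
        return smallest > thresholdBytes ? JoinStrategy.COMMON_JOIN : JoinStrategy.MAP_JOIN;
    }

    public static void main(String[] args) {
        long threshold = 25L * 1024 * 1024; // assumed limit, chosen only for this demo
        System.out.println(choose(true, new long[] {1_000L, 5_000_000_000L}, threshold));
        System.out.println(choose(true, new long[] {4_000_000_000L, 5_000_000_000L}, threshold));
    }
}
```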



[jira] [Resolved] (HIVE-2061) Create a hive_contrib.jar symlink to hive-contrib-{version}.jar for backward compatibility

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach resolved HIVE-2061.
--

   Resolution: Fixed
Fix Version/s: 0.8.0
 Hadoop Flags: [Reviewed]

Committed to trunk.

> Create a hive_contrib.jar symlink to hive-contrib-{version}.jar for backward 
> compatibility
> --
>
> Key: HIVE-2061
> URL: https://issues.apache.org/jira/browse/HIVE-2061
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Reporter: Ning Zhang
>Assignee: Ning Zhang
>Priority: Minor
> Fix For: 0.8.0
>
> Attachments: HIVE-2061.patch
>
>
> We have seen a use case where the user's script runs 'add jar 
> hive_contrib.jar'. Since Hive has renamed the jar file to 
> hive-contrib-{version}.jar, this introduced a backward incompatibility. If we ask 
> the user to change the script, then when Hive upgrades its version again the user 
> needs to change the script again. Creating a symlink seems to be the best 
> solution. 
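
As a rough illustration of the symlink idea, the sketch below creates hive_contrib.jar as a relative symbolic link next to the versioned jar. The committed patch presumably does this in the build scripts; the class name, the directory argument and the version string here are assumptions made for the example only.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper, not part of the Hive build. */
final class ContribJarSymlink {
    /** Creates (or replaces) libDir/hive_contrib.jar pointing at hive-contrib-{version}.jar. */
    static Path create(Path libDir, String version) throws IOException {
        Path link = libDir.resolve("hive_contrib.jar");
        // A relative target keeps the link valid if the lib directory is moved.
        Path target = Paths.get("hive-contrib-" + version + ".jar");
        Files.deleteIfExists(link); // removes a stale link from a previous version
        return Files.createSymbolicLink(link, target);
    }

    public static void main(String[] args) throws IOException {
        Path libDir = Files.createTempDirectory("hive-lib");
        Files.createFile(libDir.resolve("hive-contrib-0.8.0.jar"));
        Path link = create(libDir, "0.8.0");
        System.out.println(link + " -> " + Files.readSymbolicLink(link));
    }
}
```

With such a link in place, a script that runs 'add jar hive_contrib.jar' keeps working across upgrades, because only the link target changes.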



[jira] [Updated] (HIVE-2061) Create a hive_contrib.jar symlink to hive-contrib-{version}.jar for backward compatibility

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-2061:
-

Component/s: Build Infrastructure

> Create a hive_contrib.jar symlink to hive-contrib-{version}.jar for backward 
> compatibility
> --
>
> Key: HIVE-2061
> URL: https://issues.apache.org/jira/browse/HIVE-2061
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Reporter: Ning Zhang
>Assignee: Ning Zhang
>Priority: Minor
> Fix For: 0.8.0
>
> Attachments: HIVE-2061.patch
>
>
> We have seen a use case where the user's script runs 'add jar 
> hive_contrib.jar'. Since Hive has renamed the jar file to 
> hive-contrib-{version}.jar, this introduced a backward incompatibility. If we ask 
> the user to change the script, then when Hive upgrades its version again the user 
> needs to change the script again. Creating a symlink seems to be the best 
> solution. 



[jira] [Updated] (HIVE-1985) better error message for selecting non-existing columns

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1985:
-

Component/s: Query Processor
 Diagnosability

> better error message for selecting non-existing columns
> ---
>
> Key: HIVE-1985
> URL: https://issues.apache.org/jira/browse/HIVE-1985
> Project: Hive
>  Issue Type: Improvement
>  Components: Diagnosability, Query Processor
>Reporter: He Yongqiang
>
> Should have an error message for a query like :
> select a.key,a,a.value from src a;



[jira] [Updated] (HIVE-2057) eliminate parser warning for "Identifier DOT Identifier"

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-2057:
-

Component/s: Diagnosability

> eliminate parser warning for "Identifier DOT Identifier"
> 
>
> Key: HIVE-2057
> URL: https://issues.apache.org/jira/browse/HIVE-2057
> Project: Hive
>  Issue Type: Improvement
>  Components: Diagnosability, Query Processor
>Reporter: John Sichi
>
> I noticed this warning in recent builds:
> {noformat}
> build-grammar:
>  [echo] Building Grammar 
> /data/users/jsichi/open/hive-trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g
>   
>  [java] ANTLR Parser Generator  Version 3.0.1 (August 13, 2007)  1989-2007
>  [java] warning(200): 
> /data/users/jsichi/open/hive-trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g:1503:5:
>  Decision can match input such as "Identifier DOT Identifier" using multiple 
> alternatives: 1, 2
>  [java] As a result, alternative(s) 2 were disabled for that input
> {noformat}
> This was introduced by HIVE-1517.  Is there a way to get rid of it?



[jira] [Updated] (HIVE-1935) set hive.security.authorization.createtable.owner.grants to null by default

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1935:
-

Component/s: Security

> set hive.security.authorization.createtable.owner.grants to null by default
> ---
>
> Key: HIVE-1935
> URL: https://issues.apache.org/jira/browse/HIVE-1935
> Project: Hive
>  Issue Type: Bug
>  Components: Security
>Reporter: He Yongqiang
>Assignee: He Yongqiang
> Fix For: 0.7.0
>
> Attachments: HIVE-1935.1.patch
>
>
> It seems an empty setting in hive-site.xml does not overwrite hive-default.xml



[jira] [Resolved] (HIVE-1935) set hive.security.authorization.createtable.owner.grants to null by default

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach resolved HIVE-1935.
--

   Resolution: Fixed
Fix Version/s: 0.7.0
 Hadoop Flags: [Reviewed]

> set hive.security.authorization.createtable.owner.grants to null by default
> ---
>
> Key: HIVE-1935
> URL: https://issues.apache.org/jira/browse/HIVE-1935
> Project: Hive
>  Issue Type: Bug
>  Components: Security
>Reporter: He Yongqiang
>Assignee: He Yongqiang
> Fix For: 0.7.0
>
> Attachments: HIVE-1935.1.patch
>
>
> It seems an empty setting in hive-site.xml does not overwrite hive-default.xml



[jira] [Updated] (HIVE-1841) datanucleus.fixedDatastore should be true in hive-default.xml

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1841:
-

Component/s: Metastore

>  datanucleus.fixedDatastore should be true in hive-default.xml
> --
>
> Key: HIVE-1841
> URL: https://issues.apache.org/jira/browse/HIVE-1841
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration, Metastore
>Affects Versions: 0.6.0
>Reporter: Edward Capriolo
>Priority: Minor
> Attachments: HIVE-1841.1.patch.txt
>
>
> Two datanucleus variables:
> {noformat}
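> <property>
>  <name>datanucleus.autoCreateSchema</name>
>  <value>false</value>
> </property>
> <property>
>  <name>datanucleus.fixedDatastore</name>
>  <value>true</value>
> </property>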
> {noformat}
> are dangerous.  We do want the schema to auto-create itself, but we do not 
> want the schema to auto update itself. 
> Someone might accidentally point a trunk at the wrong meta-store and 
> unknowingly update. I believe we should set this to false and possibly trap 
> exceptions stemming from hive wanting to do any update. This way someone has 
> to actively acknowledge the update, either by setting this to true and then starting 
> up hive, or by leaving it false, removing schema-modification privileges for the user that hive 
> uses, and doing the updates by hand. 



[jira] [Updated] (HIVE-1825) Different defaults for hive.metastore.local

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1825:
-

Component/s: Metastore

> Different defaults for hive.metastore.local
> ---
>
> Key: HIVE-1825
> URL: https://issues.apache.org/jira/browse/HIVE-1825
> Project: Hive
>  Issue Type: Bug
>  Components: Configuration, Metastore
>Affects Versions: 0.6.0
>Reporter: Lars Francke
>
> hive-default.xml sets {{hive.metastore.local}} to {{true}}. In the code 
> however there is this:
> {code:title=HiveMetaStoreClient.java}
> boolean localMetaStore = conf.getBoolean("hive.metastore.local", false);
> {code}
> This leads to different behaviour depending on whether hive-default.xml is 
> on the classpath or not, which can lead to some confusion ;-)
> I can supply a patch - should be pretty similar. I just don't  know what the 
> "real" default should be. My guess would be {{true}}.
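
The mismatch can be reproduced in miniature with plain java.util.Properties standing in for the Hadoop configuration object. This is a sketch under that assumption: the class and method below are hypothetical and only mimic a getBoolean(name, default) lookup, where the default coded at the call site wins whenever the XML file that sets the property was not loaded.

```java
import java.util.Properties;

/** Hypothetical stand-in for a configuration lookup; not the Hadoop Configuration class. */
final class MetastoreLocalDefault {
    /** Returns the configured value, or the caller's default when the key is absent. */
    static boolean getBoolean(Properties conf, String name, boolean defaultValue) {
        String raw = conf.getProperty(name);
        return raw == null ? defaultValue : Boolean.parseBoolean(raw.trim());
    }

    public static void main(String[] args) {
        // Case 1: the defaults file was found and sets the property to true.
        Properties withDefaultsFile = new Properties();
        withDefaultsFile.setProperty("hive.metastore.local", "true");
        System.out.println(getBoolean(withDefaultsFile, "hive.metastore.local", false));

        // Case 2: the defaults file is missing from the classpath, so the code default applies.
        Properties withoutDefaultsFile = new Properties();
        System.out.println(getBoolean(withoutDefaultsFile, "hive.metastore.local", false));
    }
}
```

Keeping the default in the code equal to the value in the defaults file removes the dependence on the classpath.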



[jira] [Updated] (HIVE-1875) On job failure log some messages explaining that Hive is retrieving task completion events

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1875:
-

Component/s: Diagnosability

> On job failure log some messages explaining that Hive is retrieving task 
> completion events
> --
>
> Key: HIVE-1875
> URL: https://issues.apache.org/jira/browse/HIVE-1875
> Project: Hive
>  Issue Type: Improvement
>  Components: Diagnosability, Query Processor
>Reporter: Carl Steinbach
>
> If a job fails, Hive currently displays a link to the task with the most 
> failures for easy access to the error logs. However, generating the 
> link may require many RPCs to get all the task completion events, adding a 
> delay of up to 30 minutes. HIVE-1578 added a configuration property that 
> allows the user to disable this behavior.
> This ticket covers adding some logging statements notifying the user that 
> Hive is retrieving this information. This is intended to avoid giving the user 
> the impression that the CLI has simply locked up.



[jira] [Commented] (HIVE-1095) Hive in Maven

2011-04-05 Thread Ning Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016146#comment-13016146
 ] 

Ning Zhang commented on HIVE-1095:
--

I tried the first command:  ant make-maven -Dversion=0.8.0-SNAPSHOT -logfile 
make-maven.log
and it seems to have succeeded. I'll attach make-maven.log. It would be nice if 
someone who has the knowledge could take a look and see if it is correct. 

I haven't run the other command to publish maven yet. I can run that as 
long as I get a +1 from committers who have the knowledge. 

> Hive in Maven
> -
>
> Key: HIVE-1095
> URL: https://issues.apache.org/jira/browse/HIVE-1095
> Project: Hive
>  Issue Type: Task
>  Components: Build Infrastructure
>Affects Versions: 0.6.0
>Reporter: Gerrit Jansen van Vuuren
>Priority: Minor
> Attachments: HIVE-1095-trunk.patch, HIVE-1095.v2.PATCH, 
> HIVE-1095.v3.PATCH, HIVE-1095.v4.PATCH, HIVE-1095.v5.PATCH, 
> hiveReleasedToMaven.tar.gz
>
>
> Getting hive into maven main repositories
> Documentation on how to do this is on:
> http://maven.apache.org/guides/mini/guide-central-repository-upload.html



[jira] [Updated] (HIVE-1095) Hive in Maven

2011-04-05 Thread Ning Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ning Zhang updated HIVE-1095:
-

Attachment: make-maven.log

> Hive in Maven
> -
>
> Key: HIVE-1095
> URL: https://issues.apache.org/jira/browse/HIVE-1095
> Project: Hive
>  Issue Type: Task
>  Components: Build Infrastructure
>Affects Versions: 0.6.0
>Reporter: Gerrit Jansen van Vuuren
>Priority: Minor
> Attachments: HIVE-1095-trunk.patch, HIVE-1095.v2.PATCH, 
> HIVE-1095.v3.PATCH, HIVE-1095.v4.PATCH, HIVE-1095.v5.PATCH, 
> hiveReleasedToMaven.tar.gz, make-maven.log
>
>
> Getting hive into maven main repositories
> Documentation on how to do this is on:
> http://maven.apache.org/guides/mini/guide-central-repository-upload.html



[jira] [Updated] (HIVE-1301) RAND() should be RAND_UNIF(); also, we should create RAND_NORM() and add options

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1301:
-

Component/s: UDF

> RAND() should be RAND_UNIF(); also, we should create RAND_NORM() and add 
> options
> 
>
> Key: HIVE-1301
> URL: https://issues.apache.org/jira/browse/HIVE-1301
> Project: Hive
>  Issue Type: Wish
>  Components: UDF
>Reporter: Adam Kramer
>Assignee: Paul Yang
>
> The generation of pseudorandom data is very useful, but would be even MORE 
> useful if we had a few levers to pull.
> Currently, RAND() generates a random number pulled from a uniform 
> distribution between 0 and 1. It would be great if we could user-specify the 
> min and max because that is a more elegant way to write code: RAND()*200+50 
> will generate the same thing as RAND_UNIF(min=50,max=250) but the latter is a 
> much better way to express this in a readable manner.
> Similarly, it would be useful to have non-uniform random data for statistical 
> purposes. RAND_NORM(mean=0,sd=1) 
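
A minimal sketch of the two proposed functions, written as plain Java methods rather than as Hive UDFs. The names mirror the wish above, but the signatures and the explicit Random argument are assumptions for illustration.

```java
import java.util.Random;

/** Hypothetical sketch of RAND_UNIF and RAND_NORM; not Hive UDF code. */
final class RandSketch {
    /** Uniform value in [min, max): the readable form of RAND() * (max - min) + min. */
    static double randUnif(Random rng, double min, double max) {
        if (!(min <= max)) {
            throw new IllegalArgumentException("min must not exceed max");
        }
        return min + rng.nextDouble() * (max - min);
    }

    /** Normally distributed value with the given mean and standard deviation. */
    static double randNorm(Random rng, double mean, double sd) {
        if (sd < 0) {
            throw new IllegalArgumentException("sd must be non-negative");
        }
        return mean + sd * rng.nextGaussian();
    }

    public static void main(String[] args) {
        Random rng = new Random();
        // Same distribution as RAND()*200+50 from the description.
        System.out.println(randUnif(rng, 50, 250));
        System.out.println(randNorm(rng, 0, 1));
    }
}
```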



[jira] [Updated] (HIVE-1262) Add security/checksum UDFs sha,crc32,md5,aes_encrypt, and aes_decrypt

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1262:
-

Component/s: UDF

> Add security/checksum UDFs sha,crc32,md5,aes_encrypt, and aes_decrypt
> -
>
> Key: HIVE-1262
> URL: https://issues.apache.org/jira/browse/HIVE-1262
> Project: Hive
>  Issue Type: New Feature
>  Components: UDF
>Affects Versions: 0.6.0
>Reporter: Edward Capriolo
>Assignee: Edward Capriolo
> Attachments: hive-1262-1.patch.txt
>
>
> Add security/checksum UDFs sha,crc32,md5,aes_encrypt, and aes_decrypt
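
For orientation, the digest and checksum functions in this list map directly onto the Java standard library. The sketch below shows plain helper methods, not the UDF classes from the attached patch; the class and method names are hypothetical, UTF-8 input encoding is an assumption, and aes_encrypt/aes_decrypt are left out because key handling is not described in the issue.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.zip.CRC32;

/** Hypothetical helpers showing the JDK calls behind md5, sha and crc32. */
final class ChecksumSketch {
    /** Hex-encoded digest of the UTF-8 bytes of the input. */
    static String hexDigest(String algorithm, String input) {
        try {
            MessageDigest md = MessageDigest.getInstance(algorithm);
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 and SHA-1 are required on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
    }

    static String md5(String input) { return hexDigest("MD5", input); }

    static String sha1(String input) { return hexDigest("SHA-1", input); }

    /** CRC-32 of the UTF-8 bytes, as an unsigned 32-bit value in a long. */
    static long crc32(String input) {
        CRC32 crc = new CRC32();
        crc.update(input.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    public static void main(String[] args) {
        System.out.println(md5("abc"));
        System.out.println(sha1("abc"));
        System.out.println(Long.toHexString(crc32("123456789")));
    }
}
```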



[jira] [Updated] (HIVE-1360) Allow UDFs to access constant parameter values at compile time

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1360:
-

Component/s: UDF

> Allow UDFs to access constant parameter values at compile time
> --
>
> Key: HIVE-1360
> URL: https://issues.apache.org/jira/browse/HIVE-1360
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor, UDF
>Affects Versions: 0.5.0
>Reporter: Carl Steinbach
>Assignee: Carl Steinbach
>
> UDFs should be able to access constant parameter values at compile time.



[jira] [Updated] (HIVE-1384) HiveServer should run as the user who submitted the query.

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1384:
-

Component/s: Security

> HiveServer should run as the user who submitted the query.
> --
>
> Key: HIVE-1384
> URL: https://issues.apache.org/jira/browse/HIVE-1384
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore, Security, Server Infrastructure
>Reporter: He Yongqiang
>Assignee: He Yongqiang
>




[jira] [Updated] (HIVE-1343) add an interface in RCFile to support concatenation of two files without (de)compression

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1343:
-

Component/s: Serializers/Deserializers

> add an interface in RCFile to support concatenation of two files without 
> (de)compression
> 
>
> Key: HIVE-1343
> URL: https://issues.apache.org/jira/browse/HIVE-1343
> Project: Hive
>  Issue Type: New Feature
>  Components: Serializers/Deserializers
>Affects Versions: 0.6.0
>Reporter: Ning Zhang
>Assignee: He Yongqiang
> Attachments: HIVE-1343.1.patch
>
>
> If two files are concatenated, we need to read each record in these files and 
> write them back to the destination file. The IO cost is mostly unavoidable 
> due to the lack of append functionality in HDFS. However the CPU cost could 
> be significantly reduced by avoiding compression and decompression of the 
> files.
> The File Format layer should provide an API that implements the block-level 
> concatenation. 



[jira] [Updated] (HIVE-1458) Table aliases are case sensitive.

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1458:
-

Component/s: Query Processor

> Table aliases are case sensitive.
> -
>
> Key: HIVE-1458
> URL: https://issues.apache.org/jira/browse/HIVE-1458
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Jonathan Chang
>Assignee: Paul Yang
>Priority: Minor
>
> This query:
>  SELECT exploded, COUNT(1)
>  FROM (
>    SELECT SPLIT(value, "0") AS value
>    FROM tmp_jonchang_hive_test
>  ) B LATERAL VIEW explode(value) C AS exploded
>  GROUP BY exploded
> produces the error:
> FAILED: Error in semantic analysis: line 7:9 Invalid Table Alias or Column 
> Reference exploded
> However, when B is changed to lowercase, the query works as expected: 
>  SELECT exploded, COUNT(1)
>  FROM (
>    SELECT SPLIT(value, "0") AS value
>    FROM tmp_jonchang_hive_test
>  ) b LATERAL VIEW explode(value) C AS exploded
>  GROUP BY exploded
> Table aliases shouldn't be case sensitive AFAIK.  And even if they are, I 
> can't see the cause for the error.
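
The expectation in this report, that aliases B and b name the same subquery, amounts to normalizing the alias before it is stored or looked up. The sketch below illustrates that idea only; it does not show how Hive's semantic analyzer is actually structured, and the class name is hypothetical.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical alias table that ignores case; not Hive's implementation. */
final class AliasRegistry {
    private final Map<String, String> aliasToSource = new HashMap<>();

    /** Lower-cases with a fixed locale so the result does not depend on the JVM default. */
    private static String normalize(String alias) {
        return alias.toLowerCase(Locale.ROOT);
    }

    void register(String alias, String source) {
        aliasToSource.put(normalize(alias), source);
    }

    /** Returns the registered source, or null if the alias is unknown. */
    String resolve(String alias) {
        return aliasToSource.get(normalize(alias));
    }

    public static void main(String[] args) {
        AliasRegistry registry = new AliasRegistry();
        registry.register("B", "subquery over tmp_jonchang_hive_test");
        System.out.println(registry.resolve("b"));
    }
}
```

The bug pattern this guards against is normalizing on only one side, for example storing the alias as written but looking it up in lower case.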



[jira] [Updated] (HIVE-1480) CREATE TABLE IF NOT EXISTS get incorrect table name

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1480:
-

Component/s: Query Processor

> CREATE TABLE IF NOT EXISTS get incorrect table name
> ---
>
> Key: HIVE-1480
> URL: https://issues.apache.org/jira/browse/HIVE-1480
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Ning Zhang
>Assignee: Ning Zhang
>
> CREATE TABLE IF NOT EXISTS T AS SELECT ... gives the following error after 
> the job succeeded:
> Setting total progress to 100
> 10/07/22 11:26:14 INFO exec.ExecDriver: Ended Job = job_201006221843_688872
> 10/07/22 11:26:14 INFO exec.FileSinkOperator: Moving tmp dir: 
> hdfs://dfstmp.data.facebook.com:9000/tmp/hive-root/hive_2010-07-22_11-20-15_027_2717837693750284928/_tmp.10001
>  to: 
> hdfs://dfstmp.data.facebook.com:9000/tmp/hive-root/hive_2010-07-22_11-20-15_027_2717837693750284928/_tmp.10001.intermediate
> 10/07/22 11:26:14 INFO exec.FileSinkOperator: Moving tmp dir: 
> hdfs://dfstmp.data.facebook.com:9000/tmp/hive-root/hive_2010-07-22_11-20-15_027_2717837693750284928/_tmp.10001.intermediate
>  to: 
> hdfs://dfstmp.data.facebook.com:9000/tmp/hive-root/hive_2010-07-22_11-20-15_027_2717837693750284928/10001
> Moving data to: 
> hdfs://dfstmp.data.facebook.com:9000/user/facebook/warehouse/ericm_budget_email_actua43
> 10/07/22 11:26:15 INFO exec.MoveTask: Moving data to: 
> hdfs://dfstmp.data.facebook.com:9000/user/facebook/warehouse/ericm_budget_email_actua43
>  from 
> hdfs://dfstmp.data.facebook.com:9000/tmp/hive-root/hive_2010-07-22_11-20-15_027_2717837693750284928/10001
> 10/07/22 11:26:15 WARN hdfs.DFSClient: File 
> /user/facebook/warehouse/ericm_budget_email_actua43 is beng deleted only 
> through Trash org.apache.hadoop.fs.FsShell.delete because all deletes must go 
> through Trash.
> 10/07/22 11:26:15 INFO hive.log: DDL: struct ericm_budget_email_actua43 { 
> string acct_id, string first_name, string email, string campaign_name_list}
> 10/07/22 11:26:15 INFO metastore.HiveMetaStore: 0: create_table: db=default 
> tbl=ericm_budget_email_actua43
> 10/07/22 11:26:15 INFO metastore.HiveMetaStore: 0: get_table : db=default 
> tbl=ericm_budget_email_actua43
> 10/07/22 11:26:15 INFO hooks.HookUtils: Host:cdb067.snc1.facebook.com 
> database:audit_silver
> 10/07/22 11:26:15 INFO hooks.HookUtils: Host:cdb067.snc1.facebook.com 
> database:lineage_silver
> 10/07/22 11:26:15 INFO hooks.HookUtils: rows inserted: 1 sql: insert into 
> snc1_command_log set command = ?, command_type = ?, inputs = ?, outputs = ?, 
> queryId = ?, user_info = ?
> OK
> 10/07/22 11:26:15 INFO ql.Driver: OK
> 10/07/22 11:26:16 INFO ql.Context: getStream error: 
> java.io.FileNotFoundException: File does not exist: 
> hdfs://dfstmp.data.facebook.com:9000/tmp/hive-root/hive_2010-07-22_11-20-15_027_2717837693750284928/1
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:457)
> at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:294)
> at org.apache.hadoop.hive.ql.Context.getStream(Context.java:386)
> at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:688)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:146)
> at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:197)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:294)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>  
> Time taken: 361.26 seconds
> 10/07/22 11:26:16 INFO CliDriver: Time taken: 361.26 seconds
> Exit code: 0, 0
> dus: Cannot access /user/facebook/warehouse/IF: No such file or directory.
> tablesize cmd:/mnt/vol/hive/sites/silver.trunk/hadoop/bin/hadoop dfs -dus 
> /user/facebook/warehouse/IF | cut -d$'\t' -f2
>  



[jira] [Updated] (HIVE-1625) Added implementation to HivePreparedStatement, HiveBaseResultSet and HiveQueryResultSet.

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1625:
-

Component/s: JDBC

> Added implementation to HivePreparedStatement, HiveBaseResultSet and 
> HiveQueryResultSet.
> 
>
> Key: HIVE-1625
> URL: https://issues.apache.org/jira/browse/HIVE-1625
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC
>Reporter: Sean Flatley
>Assignee: Sean Flatley
> Attachments: HIVE-1625.patch, changelog.txt, testJdbcDriver.log
>
>
> We implemented several of the HivePreparedStatement set methods, such as 
> setString(int, String) and the means to substitute place holders in the SQL 
> with the values set.  
> HiveQueryResultSet and HiveBaseResultSet were enhanced so that getStatement() 
> could be implemented.
> See attached change log for details.
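
To make the placeholder substitution concrete, here is a naive sketch of the idea. It is not the code from the attached patch: the class name, the parameter map and the backslash escaping rule are assumptions, and it handles only string parameters and single-quoted literals.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified placeholder binding; not the HivePreparedStatement code. */
final class PlaceholderSubstitution {
    /** Replaces each '?' outside single-quoted literals with the quoted value for its 1-based position. */
    static String bind(String sql, Map<Integer, String> params) {
        StringBuilder out = new StringBuilder();
        int position = 0;
        boolean inQuote = false;
        for (int i = 0; i < sql.length(); i++) {
            char c = sql.charAt(i);
            if (c == '\'') {
                inQuote = !inQuote;
            }
            if (c == '?' && !inQuote) {
                position++;
                String value = params.get(position);
                if (value == null) {
                    throw new IllegalArgumentException("No value set for parameter " + position);
                }
                // Assumed escaping rule: backslash-escape backslashes and single quotes.
                String escaped = value.replace("\\", "\\\\").replace("'", "\\'");
                out.append('\'').append(escaped).append('\'');
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<Integer, String> params = new HashMap<>();
        params.put(1, "val_86"); // what setString(1, "val_86") would record
        System.out.println(bind("SELECT key FROM src WHERE value = ?", params));
    }
}
```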



[jira] [Updated] (HIVE-1665) drop operations may cause file leak

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1665:
-

Component/s: Metastore

> drop operations may cause file leak
> ---
>
> Key: HIVE-1665
> URL: https://issues.apache.org/jira/browse/HIVE-1665
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: He Yongqiang
>Assignee: He Yongqiang
> Attachments: hive-1665.1.patch
>
>
> Right now when doing a drop, Hive first drops metadata and then drops the 
> actual files. If the file system is down at that time, the files will remain 
> undeleted. 
> Had an offline discussion about this:
> to fix this, add a new conf "scratch dir" into hive conf. 
> when doing a drop operation:
> 1) move data to scratch directory
> 2) drop metadata
> 3.1) if 2) failed, roll back 1) and report error
> 3.2) if 2) succeeded, drop data from scratch directory
> 4) if 3.2 fails, we are ok because we assume the scratch dir will be emptied 
> manually.
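
The numbered steps above can be sketched as follows. This is a toy model, not the attached patch: the file system is replaced by an in-memory set of paths, and the class, interface and method names are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical model of the proposed drop sequence; not Hive metastore code. */
final class DropWithScratchDir {
    /** Stand-in for the metadata store; drop may throw to signal failure. */
    interface Metadata {
        void drop(String table);
    }

    /** Minimal in-memory stand-in for a file system, used only by this sketch. */
    static final class FakeFs {
        final Set<String> paths = new HashSet<>();

        void move(String from, String to) {
            if (!paths.remove(from)) {
                throw new IllegalStateException("missing: " + from);
            }
            paths.add(to);
        }

        void delete(String path) {
            paths.remove(path);
        }
    }

    static void drop(String table, String dataPath, String scratchDir, FakeFs fs, Metadata metadata) {
        String parked = scratchDir + "/" + table;
        fs.move(dataPath, parked);          // 1) move data to the scratch directory
        try {
            metadata.drop(table);           // 2) drop metadata
        } catch (RuntimeException e) {
            fs.move(parked, dataPath);      // 3.1) roll back the move and report the error
            throw e;
        }
        try {
            fs.delete(parked);              // 3.2) drop data from the scratch directory
        } catch (RuntimeException e) {
            // 4) tolerated: the scratch directory is assumed to be emptied manually
        }
    }

    public static void main(String[] args) {
        FakeFs fs = new FakeFs();
        fs.paths.add("/warehouse/t");
        drop("t", "/warehouse/t", "/scratch", fs, table -> { });
        System.out.println("remaining paths: " + fs.paths);
    }
}
```

The point of the ordering is that a metadata failure leaves both data and metadata in place, while a failed final delete leaves only reclaimable files in the scratch directory.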



[jira] [Updated] (HIVE-1666) retry metadata operation in case of an failure

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1666:
-

Component/s: Metastore

> retry metadata operation in case of an failure
> --
>
> Key: HIVE-1666
> URL: https://issues.apache.org/jira/browse/HIVE-1666
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore, Query Processor
>Reporter: Namit Jain
>Assignee: Paul Yang
>
> If a user is trying to insert into a partition,
> insert overwrite table T partition (p) select ..
> it is possible that the directory gets created, but the metadata creation of 
> T@p fails - 
> currently, we will just throw an error. The final directory has been created.
> It will be useful to at least retry the metadata operation. 
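
A retry of the metadata operation could look roughly like the helper below. It is a sketch only: the class and method names are hypothetical, the attempt count is an assumed parameter, and there is no back-off or distinction between transient and permanent errors, which a real change would probably need.

```java
import java.util.function.Supplier;

/** Hypothetical retry wrapper for a metadata call; not Hive code. */
final class MetadataRetry {
    /** Runs the operation up to maxAttempts times and rethrows the last failure. */
    static <T> T withRetries(int maxAttempts, Supplier<T> operation) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return operation.get();
            } catch (RuntimeException e) {
                last = e; // remember the failure and try again
            }
        }
        throw last;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        String result = withRetries(3, () -> {
            if (++calls[0] < 2) {
                throw new IllegalStateException("transient metastore error");
            }
            return "partition T@p added";
        });
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```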



[jira] [Updated] (HIVE-1667) Store the group of the owner of the table in metastore

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1667:
-

Component/s: Security

> Store the group of the owner of the table in metastore
> --
>
> Key: HIVE-1667
> URL: https://issues.apache.org/jira/browse/HIVE-1667
> Project: Hive
>  Issue Type: New Feature
>  Components: Security
>Reporter: Namit Jain
> Attachments: hive-1667.patch
>
>
> Currently, the group of the owner of the table is not stored in the metastore.
> Secondly, if you create a table, the table's owner group is set to the group 
> for the parent. It is not read from the UGI passed in.



[jira] [Updated] (HIVE-1690) HivePreparedStatement.executeImmediate(String sql) is breaking the exception stack

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1690:
-

Component/s: JDBC

> HivePreparedStatement.executeImmediate(String sql) is breaking the exception 
> stack
> --
>
> Key: HIVE-1690
> URL: https://issues.apache.org/jira/browse/HIVE-1690
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC
>Reporter: Eli Griv
>Priority: Minor
>
> In HivePreparedStatement.executeImmediate(String sql), the exception stack is 
> broken, so it's impossible to know which method threw "Method not 
> supported".
> FIX :
> HivePreparedStatement.java
> L166
> -   throw new SQLException(e.getMessage(), e.getSQLState(), e.getErrorCode());
> +  throw new SQLException(e.getMessage(), e.getSQLState(), e.getErrorCode(), 
> e);
> L168
> -   throw new SQLException(ex.toString(), "08S01");
> +  throw new SQLException(ex.toString(), "08S01", ex);
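
A small sketch of what the proposed change buys, shown outside HivePreparedStatement. The CauseChaining class and its method names are hypothetical; only the two java.sql.SQLException constructors that accept a cause are taken from the fix above.

```java
import java.sql.SQLException;

/** Hypothetical helper showing the cause-preserving constructors. */
final class CauseChaining {
    /** Re-wraps a SQLException and keeps the original as the cause. */
    static SQLException rewrap(SQLException e) {
        return new SQLException(e.getMessage(), e.getSQLState(), e.getErrorCode(), e);
    }

    /** Wraps any other exception with SQLState 08S01 and keeps it as the cause. */
    static SQLException wrapOther(Exception ex) {
        return new SQLException(ex.toString(), "08S01", ex);
    }
}
```

With the cause attached, printStackTrace() on the outer exception includes a "Caused by:" section that points at the method which originally threw.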

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-1071) Making RCFile "concatenatable" to reduce the number of files of the output

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1071:
-

Component/s: Serializers/Deserializers

> Making RCFile "concatenatable" to reduce the number of files of the output
> --
>
> Key: HIVE-1071
> URL: https://issues.apache.org/jira/browse/HIVE-1071
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Reporter: Zheng Shao
>
> Hive automatically determines the number of reducers most of the time.
> Sometimes, we create a lot of small files.
> Hive has an option to "merge" those small files through a map-reduce job.
> Dhruba has an idea that could fix this even faster:
> if we can make RCFile concatenatable, then we can simply tell the namenode to 
> "merge" these files.
> Pros: This approach does not do any I/O so it's faster.
> Cons: We have to zero-fill the files to make sure they can be concatenated 
> (all blocks except the last have to be full HDFS blocks).
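
The zero-fill cost mentioned under "Cons" is simple arithmetic: each file except the last has to be padded up to the next multiple of the HDFS block size. A sketch of that calculation follows; the BlockPadding class and the paddingBytes method are hypothetical, and the block size is whatever the cluster is configured with.

```java
/** Hypothetical helper: computes zero-fill needed to reach a block boundary. */
final class BlockPadding {
    /** Bytes of padding needed to round fileLength up to a whole number of blocks. */
    static long paddingBytes(long fileLength, long blockSize) {
        if (fileLength < 0 || blockSize <= 0) {
            throw new IllegalArgumentException("fileLength must be >= 0 and blockSize > 0");
        }
        long remainder = fileLength % blockSize;
        return remainder == 0 ? 0 : blockSize - remainder;
    }
}
```

For many small files the padding can dominate the real data, which is the trade-off against the I/O saved by not rewriting the files.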

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-1189) Add package-info.java to Hive

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1189:
-

Component/s: Diagnosability

> Add package-info.java to Hive
> -
>
> Key: HIVE-1189
> URL: https://issues.apache.org/jira/browse/HIVE-1189
> Project: Hive
>  Issue Type: New Feature
>  Components: Build Infrastructure, Diagnosability
>Affects Versions: 0.6.0
>Reporter: Zheng Shao
>Assignee: Zheng Shao
> Attachments: HIVE-1189.1.patch
>
>
> Hadoop automatically generates build/src/org/apache/hadoop/package-info.java 
> with information like this:
> {code}
> /*
>  * Generated by src/saveVersion.sh
>  */
> @HadoopVersionAnnotation(version="0.20.2-dev", revision="826568",
>  user="zshao", date="Sun Oct 18 17:46:56 PDT 2009", 
> url="http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20")
> package org.apache.hadoop;
> {code}
> Hive should do the same thing so that we can easily know the version of the 
> code at runtime.
> This will help us identify whether we are still running the same version of 
> Hive, if we serialize the plan and later continue the execution (See 
> HIVE-1100).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-1189) Add package-info.java to Hive

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1189:
-

Component/s: Build Infrastructure

> Add package-info.java to Hive
> -
>
> Key: HIVE-1189
> URL: https://issues.apache.org/jira/browse/HIVE-1189
> Project: Hive
>  Issue Type: New Feature
>  Components: Build Infrastructure, Diagnosability
>Affects Versions: 0.6.0
>Reporter: Zheng Shao
>Assignee: Zheng Shao
> Attachments: HIVE-1189.1.patch
>
>
> Hadoop automatically generates build/src/org/apache/hadoop/package-info.java 
> with information like this:
> {code}
> /*
>  * Generated by src/saveVersion.sh
>  */
> @HadoopVersionAnnotation(version="0.20.2-dev", revision="826568",
>  user="zshao", date="Sun Oct 18 17:46:56 PDT 2009", 
> url="http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20")
> package org.apache.hadoop;
> {code}
> Hive should do the same thing so that we can easily know the version of the 
> code at runtime.
> This will help us identify whether we are still running the same version of 
> Hive, if we serialize the plan and later continue the execution (See 
> HIVE-1100).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-613) Hive server fetch row incorrect NULL representation

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-613:


Component/s: Server Infrastructure

> Hive server fetch row incorrect NULL representation
> ---
>
> Key: HIVE-613
> URL: https://issues.apache.org/jira/browse/HIVE-613
> Project: Hive
>  Issue Type: Bug
>  Components: Server Infrastructure
>Reporter: Eric Hwang
>Priority: Minor
>
> The Hive server fetch function does not correctly serialize null fields in 
> the returned rows. Regardless of the actual null format representation within 
> the table, the Hive server fetch function will always return null fields as 
> "NULL", creating a potential conflict with the actual string "NULL".

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-627) Optimizer should only access RowSchema (and not RowResolver)

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-627:


Component/s: Query Processor

> Optimizer should only access RowSchema (and not RowResolver)
> 
>
> Key: HIVE-627
> URL: https://issues.apache.org/jira/browse/HIVE-627
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Zheng Shao
>
> The column pruner is accessing RowResolver a lot of times, for things like 
> reverseLookup, and get(alias, column).
> These are not necessary - we should not need to translate an internal name to 
> (alias, column) and then translate back. We should be able to use internal 
> name from one operator to the other, using RowSchema.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-649) [UDF] now() for getting current time

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-649:


Component/s: UDF

> [UDF] now() for getting current time
> 
>
> Key: HIVE-649
> URL: https://issues.apache.org/jira/browse/HIVE-649
> Project: Hive
>  Issue Type: New Feature
>  Components: UDF
>Reporter: Min Zhou
> Attachments: HIVE-649.patch
>
>
> http://dev.mysql.com/doc/refman/5.1/en/date-and-time-functions.html#function_now

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-660) Fix UDFLike for multi-line inputs

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-660:


Component/s: UDF

> Fix UDFLike for multi-line inputs
> -
>
> Key: HIVE-660
> URL: https://issues.apache.org/jira/browse/HIVE-660
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Zheng Shao
>Assignee: Zheng Shao
> Attachments: HIVE-660.1.patch
>
>
> We should use DOTALL option in UDFLike, because '%' and '_' should also match 
> to the newline.
> See 
> http://java.sun.com/j2se/1.4.2/docs/api/java/util/regex/Pattern.html#DOTALL
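
A sketch of why the flag matters, assuming a simplified LIKE-to-regex translation ('%' becomes ".*", '_' becomes ".", everything else is quoted; escape characters are ignored). The LikeDotAll class and its methods are hypothetical and are not the UDFLike code; only java.util.regex.Pattern and its DOTALL flag are real API.

```java
import java.util.regex.Pattern;

/** Hypothetical, simplified LIKE matcher used to show the effect of DOTALL. */
final class LikeDotAll {
    /** Translates a LIKE pattern into a regex; '%' -> ".*", '_' -> ".". */
    static String likeToRegex(String like) {
        StringBuilder sb = new StringBuilder();
        for (char c : like.toCharArray()) {
            if (c == '%') {
                sb.append(".*");
            } else if (c == '_') {
                sb.append('.');
            } else {
                sb.append(Pattern.quote(String.valueOf(c))); // literal character
            }
        }
        return sb.toString();
    }

    /** Matches the whole input against the LIKE pattern, with or without DOTALL. */
    static boolean like(String input, String likePattern, boolean dotAll) {
        int flags = dotAll ? Pattern.DOTALL : 0;
        return Pattern.compile(likeToRegex(likePattern), flags).matcher(input).matches();
    }
}
```

Without DOTALL, "." does not match a line terminator, so like("a\nb", "a%b", false) is false; with the flag set the same call returns true.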

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-664) optimize UDF split

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-664:


Component/s: (was: Query Processor)
 UDF
 Labels: optimization  (was: )

> optimize UDF split
> --
>
> Key: HIVE-664
> URL: https://issues.apache.org/jira/browse/HIVE-664
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Namit Jain
>  Labels: optimization
>
> Min Zhou added a comment - 21/Jul/09 07:34 AM
> It's very useful for us.
> Some comments:
>    1. Can you implement it directly with Text? Avoiding string decoding and 
> encoding would be faster. Of course that trick may lead to another problem, 
> as String.split uses a regular expression for splitting.
>    2. getDisplayString() always returns a string in lowercase.
> Namit Jain added a comment - 21/Jul/09 09:22 AM
> Committed. Thanks Emil
> Emil Ibrishimov added a comment - 21/Jul/09 10:48 AM
> There are some easy (compromise) ways to optimize split:
> 1. Check if the regex argument actually contains some "regex specific 
> characters" and if it doesn't, do a straightforward split without converting 
> to strings.
> 2. Assume some default value for the second argument (for example - 
> split(str) to be equivalent to split(str, ' ')) and optimize for this value
> 3. Have two separate split functions - one that does regex and one that 
> splits around plain text.
> I think that 1 is a good choice and can be done rather quickly.
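
Option 1 from the comment above can be sketched as follows. This is an illustration on java.lang.String, so it does not avoid the Text decoding discussed earlier; it only shows the "no regex characters, so split literally" check. The FastSplit class, its method names, and the list of metacharacters are assumptions, not the UDF's code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical split with a literal fast path for separators without regex syntax. */
final class FastSplit {
    // Characters treated as regex syntax for the purpose of this check.
    private static final String REGEX_META = "\\^$.|?*+()[]{}";

    /** True if the separator is non-empty and contains no regex metacharacters. */
    static boolean isPlainText(String regex) {
        for (int i = 0; i < regex.length(); i++) {
            if (REGEX_META.indexOf(regex.charAt(i)) >= 0) {
                return false;
            }
        }
        return !regex.isEmpty();
    }

    /** Splits literally when possible, otherwise falls back to String.split. */
    static List<String> split(String input, String separator) {
        List<String> parts = new ArrayList<>();
        if (!isPlainText(separator)) {
            for (String s : input.split(separator, -1)) {
                parts.add(s);
            }
            return parts;
        }
        int start = 0;
        int idx;
        while ((idx = input.indexOf(separator, start)) >= 0) {
            parts.add(input.substring(start, idx));
            start = idx + separator.length();
        }
        parts.add(input.substring(start)); // remainder after the last separator
        return parts;
    }
}
```

Both paths keep empty fields, so a caller sees the same result whichever branch is taken.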

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-538) make hive_jdbc.jar self-containing

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-538:


Component/s: (was: Clients)
 JDBC

> make hive_jdbc.jar self-containing
> --
>
> Key: HIVE-538
> URL: https://issues.apache.org/jira/browse/HIVE-538
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC
>Affects Versions: 0.3.0, 0.4.0, 0.6.0
>Reporter: Raghotham Murthy
>
> Currently, most jars in hive/build/dist/lib and the hadoop-*-core.jar are 
> required in the classpath to run jdbc applications on hive. We need to do 
> at least the following to get rid of most unnecessary dependencies:
> 1. get rid of dynamic serde and use a standard serialization format, maybe 
> tab separated, json or avro
> 2. don't use hadoop configuration parameters
> 3. repackage thrift and fb303 classes into hive_jdbc.jar

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HIVE-663) column aliases should be supported

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach resolved HIVE-663.
-

  Resolution: Duplicate
Hadoop Flags: [Reviewed]

This was fixed a long time ago in some other ticket.

> column aliases should be supported
> --
>
> Key: HIVE-663
> URL: https://issues.apache.org/jira/browse/HIVE-663
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Namit Jain
>
> select key as x from src where x < 10;
> should work

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-535) Memory-efficient hash-based Aggregation

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-535:


Component/s: Query Processor
 Labels: optimization  (was: )

> Memory-efficient hash-based Aggregation
> ---
>
> Key: HIVE-535
> URL: https://issues.apache.org/jira/browse/HIVE-535
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 0.4.0
>Reporter: Zheng Shao
>  Labels: optimization
>
> Currently there is a lot of memory overhead in the hash-based aggregation in 
> GroupByOperator.
> The net result is that GroupByOperator won't be able to store many entries in 
> its HashTable, flushes frequently, and won't be able to achieve a very good 
> partial aggregation result.
> Here are some initial thoughts (some of them are from Joydeep a long time ago):
> A1. Serialize the key of the HashTable. This will eliminate the 16-byte 
> per-object overhead of Java in keys (depending on how many objects there are 
> in the key, the saving can be substantial).
> A2. Use more memory-efficient hash tables - java.util.HashMap has about 64 
> bytes of overhead per entry.
> A3. Use primitive arrays to store aggregation results. Basically, the UDAF 
> should manage the array of aggregation results, so UDAFCount should manage a 
> long[], UDAFAvg should manage a double[] and a long[]. The external code 
> should pass an index to iterate/merge/terminate an aggregation result. This 
> will eliminate the 16-byte per-object overhead of Java.
> More ideas are welcome.
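
Idea A3 could look roughly like the sketch below: an average aggregator that keeps its state in a double[] and a long[] and is addressed by slot index instead of by one object per group. The ArrayBackedAvg class and its method signatures are hypothetical and do not match any existing UDAF interface.

```java
import java.util.Arrays;

/** Hypothetical average aggregator backed by primitive arrays, one slot per group. */
final class ArrayBackedAvg {
    private double[] sums = new double[16];
    private long[] counts = new long[16];
    private int size = 0;

    /** Allocates a new aggregation slot and returns its index. */
    int newSlot() {
        if (size == sums.length) {
            // Grow both arrays together; new elements start at zero.
            sums = Arrays.copyOf(sums, size * 2);
            counts = Arrays.copyOf(counts, size * 2);
        }
        return size++;
    }

    /** Adds one input value to the slot. */
    void iterate(int slot, double value) {
        sums[slot] += value;
        counts[slot]++;
    }

    /** Folds a partial aggregation (sum and count) into the slot. */
    void merge(int slot, double partialSum, long partialCount) {
        sums[slot] += partialSum;
        counts[slot] += partialCount;
    }

    /** Returns the average for the slot, or null if it saw no rows. */
    Double terminate(int slot) {
        return counts[slot] == 0 ? null : sums[slot] / counts[slot];
    }
}
```

The hash table would then map each key to an int slot, so the per-group cost is 16 bytes of array space rather than a separate heap object per aggregation buffer.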

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-508) Better error message for UDF parameter handling

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-508:


Component/s: UDF
 Diagnosability

> Better error message for UDF parameter handling
> ---
>
> Key: HIVE-508
> URL: https://issues.apache.org/jira/browse/HIVE-508
> Project: Hive
>  Issue Type: Bug
>  Components: Diagnosability, UDF
>Reporter: Zheng Shao
>
> {code}
> CREATE TABLE x (a map);
> SELECT round(a) FROM x;
> {code}
> This will show an error message:
> FAILED: Unknown exception : 
> org.apache.hadoop.hive.serde2.typeinfo.MapTypeInfo cannot be cast to 
> org.apache.hadoop.hive.serde2.typeinfo.PrimitiveTypeInfo
> We need a better error message like:
> FAILED: Unable to pass a (type: map) to function round.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-475) Lines exceeding mapred.linerecordreader.maxlength should cause exceptions

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-475:


Component/s: Diagnosability

> Lines exceeding mapred.linerecordreader.maxlength should cause exceptions
> -
>
> Key: HIVE-475
> URL: https://issues.apache.org/jira/browse/HIVE-475
> Project: Hive
>  Issue Type: Improvement
>  Components: Diagnosability, Serializers/Deserializers
>Reporter: S. Alex Smith
>
> Currently, rows of data that exceed mapred.linerecordreader.maxlength vanish 
> silently.  Instead, an option should be added to indicate what to do under 
> this circumstance (vanish the entire line, truncate after max length, or fail 
> the job), but the default behavior should be job failure.
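
The three behaviours proposed above could be modelled as a small policy enum. This is a sketch only: the LongLinePolicy class, the Action names, and the apply method are hypothetical, and it measures length in Java chars, whereas the real setting may count bytes.

```java
/** Hypothetical policy for lines longer than the configured maximum. */
final class LongLinePolicy {
    enum Action { SKIP, TRUNCATE, FAIL }

    /** Returns the line to emit, or null if the line is dropped. */
    static String apply(String line, int maxLength, Action action) {
        if (line.length() <= maxLength) {
            return line; // within the limit, nothing to do
        }
        switch (action) {
            case SKIP:
                return null;
            case TRUNCATE:
                return line.substring(0, maxLength);
            case FAIL:
            default:
                throw new IllegalStateException(
                    "line of length " + line.length() + " exceeds " + maxLength);
        }
    }
}
```

Making FAIL the default, as the description asks, means data loss is never silent unless the user opts in to SKIP or TRUNCATE.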

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-489) compiler should check for validity of output paths before job submission

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-489:


Component/s: Diagnosability

> compiler should check for validity of output paths before job submission
> 
>
> Key: HIVE-489
> URL: https://issues.apache.org/jira/browse/HIVE-489
> Project: Hive
>  Issue Type: Bug
>  Components: Diagnosability, Query Processor
>Reporter: Joydeep Sen Sarma
>
> couple of hours after a job has run - finding that the move operation failed 
> (because the output directory did not exist and move doesn't make parent 
> directories automatically). No Trash - output is gone. Hive should have 
> barfed on this in the compile phase.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-71) log details on rows that cause hive exceptions

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-71?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-71:
---

Component/s: Serializers/Deserializers
 Query Processor
 Diagnosability

> log details on rows that cause hive exceptions
> --
>
> Key: HIVE-71
> URL: https://issues.apache.org/jira/browse/HIVE-71
> Project: Hive
>  Issue Type: Bug
>  Components: Diagnosability, Logging, Query Processor, 
> Serializers/Deserializers
>Reporter: Joydeep Sen Sarma
>
> users are logging all rows in order to find out the row that's causing 
> exceptions. instead we should just log as much information as possible on the 
> row that causes exception in hive stack

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-1362) column level statistics

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1362:
-

Component/s: Statistics

> column level statistics
> ---
>
> Key: HIVE-1362
> URL: https://issues.apache.org/jira/browse/HIVE-1362
> Project: Hive
>  Issue Type: Sub-task
>  Components: Statistics
>Reporter: Ning Zhang
>


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-1361) table/partition level statistics

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1361:
-

Component/s: Statistics

> table/partition level statistics
> 
>
> Key: HIVE-1361
> URL: https://issues.apache.org/jira/browse/HIVE-1361
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Processor, Statistics
>Reporter: Ning Zhang
> Fix For: 0.7.0
>
> Attachments: HIVE-1361.2.patch, HIVE-1361.2_java_only.patch, 
> HIVE-1361.3.patch, HIVE-1361.4.java_only.patch, HIVE-1361.4.patch, 
> HIVE-1361.5.java_only.patch, HIVE-1361.5.patch, HIVE-1361.java_only.patch, 
> HIVE-1361.patch, stats0.patch
>
>
> As a first step, we gather table-level stats for non-partitioned tables and 
> partition-level stats for partitioned tables. Future work could extend the 
> table-level stats to partitioned tables as well. 
> There are 3 major milestones in this subtask: 
>  1) extend the insert statement to gather table/partition level stats 
> on-the-fly.
>  2) extend metastore API to support storing and retrieving stats for a 
> particular table/partition. 
>  3) add an ANALYZE TABLE [PARTITION] statement in Hive QL to gather stats for 
> existing tables/partitions. 
> The proposed stats are:
> Partition-level stats: 
>   - number of rows
>   - total size in bytes
>   - number of files
>   - max, min, average row sizes
>   - max, min, average file sizes
> Table-level stats in addition to partition level stats:
>   - number of partitions

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-1940) Query Optimization Using Column Metadata and Histograms

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1940:
-

Component/s: Statistics

> Query Optimization Using Column Metadata and Histograms
> ---
>
> Key: HIVE-1940
> URL: https://issues.apache.org/jira/browse/HIVE-1940
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore, Query Processor, Statistics
>Reporter: Anja Gruenheid
> Attachments: HiveMetaStore.pdf
>
>
> The current basis for cost-based query optimization in Hive is information 
> gathered on tables and partitions. To make further improvements in query 
> optimization possible, the next step is to develop and implement 
> possibilities to gather information on columns as discussed in issue HIVE-33. 
> After that, an implementation of histograms is a possible option to use and 
> collect run-time statistics. Next to the actual implementation of these 
> features, it is also necessary to develop a consistent storage model for the 
> MetaStore.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-109) 'location' clause for table creation should only be allowed for external tables

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-109:


Component/s: Metastore

> 'location' clause for table creation should only be allowed for external 
> tables
> ---
>
> Key: HIVE-109
> URL: https://issues.apache.org/jira/browse/HIVE-109
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Query Processor
>Reporter: Joydeep Sen Sarma
>
> currently - the code does not by and large distinguish between external and 
> internal tables. one clear distinction though is that storage for external 
> tables is managed outside hive. this leads to consequences like HIVE-86 - so 
> that hive does not mess around with tables whose storage is managed 
> externally. however - currently - we allow users to specify location for 
> internal tables - which is confusing and could lead to data being deleted in 
> external folders. we should not allow this.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-55) restrict table and column names to be alphanumeric and _ characters

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-55?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-55:
---

Component/s: Metastore

> restrict table and column names to be alphanumeric and _ characters
> ---
>
> Key: HIVE-55
> URL: https://issues.apache.org/jira/browse/HIVE-55
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Query Processor
>Reporter: Prasad Chakka
>
> currently the DDL will restrict to alpha-numeric and _ chars but not if the 
> tables were created or altered using metastore clients directly. this JIRA 
> aims to fix that.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-80) Add testcases for concurrent query execution

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-80?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-80:
---

Component/s: Server Infrastructure
 Labels: concurrency  (was: )

> Add testcases for concurrent query execution
> 
>
> Key: HIVE-80
> URL: https://issues.apache.org/jira/browse/HIVE-80
> Project: Hive
>  Issue Type: Test
>  Components: Query Processor, Server Infrastructure
>Reporter: Raghotham Murthy
>Assignee: Arvind Prabhakar
>Priority: Critical
>  Labels: concurrency
> Attachments: hive_input_format_race-2.patch
>
>
> Can use one driver object per query.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-149) Aggregate functions MIN and MAX should support all types

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-149:


Component/s: (was: Query Processor)
 UDF

> Aggregate functions MIN and MAX should support all types
> 
>
> Key: HIVE-149
> URL: https://issues.apache.org/jira/browse/HIVE-149
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: YihueyChyi
>Assignee: David Phillips
>Priority: Critical
>


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HIVE-156) Allow "!=" in place of "<>"

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach resolved HIVE-156.
-

Resolution: Duplicate

Fixed in HIVE-899.

> Allow "!=" in place of "<>"
> ---
>
> Key: HIVE-156
> URL: https://issues.apache.org/jira/browse/HIVE-156
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: S. Alex Smith
>Priority: Trivial
>
> I'm used to using "!=" for inequality.  It would be nice if Hive supported 
> this as an alternative to "<>".

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HIVE-191) Update methods in Hive class to specify database name

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach resolved HIVE-191.
-

Resolution: Duplicate

Fixed in HIVE-675.

> Update methods in Hive class to specify database name
> -
>
> Key: HIVE-191
> URL: https://issues.apache.org/jira/browse/HIVE-191
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Johan Oskarsson
>Priority: Minor
>
> In the query processor module there is a Hive class used to access various 
> Metastore data. Unfortunately most of those methods only work on the default 
> database. We should update them to work on other databases as well by adding 
> a database name parameter. See HIVE-182 for more background information.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-293) report deserialize exceptions from serde's via exceptions

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-293:


Component/s: Diagnosability

> report deserialize exceptions from serde's via exceptions
> -
>
> Key: HIVE-293
> URL: https://issues.apache.org/jira/browse/HIVE-293
> Project: Hive
>  Issue Type: Bug
>  Components: Diagnosability, Serializers/Deserializers
>Reporter: Joydeep Sen Sarma
>
> lazyserde and dynamicserde should report exceptions on number (and other) 
> parsing errors so higher layers can take the correct action

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HIVE-301) Ability to store row counts (and other stats) in metastore and obtain them via queries

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach resolved HIVE-301.
-

Resolution: Duplicate

I think this was covered by the recent stats work.

> Ability to store row counts (and other stats) in metastore and obtain them 
> via queries
> --
>
> Key: HIVE-301
> URL: https://issues.apache.org/jira/browse/HIVE-301
> Project: Hive
>  Issue Type: New Feature
>Reporter: Joydeep Sen Sarma
>
> now that we have insertion row counts being bubbled out of the execution path 
> - it would be good to stash them away in the metastore. It would also be good 
> to have them be viewable by some simple command (like the mysql status 
> commands - but perhaps we have something we could re-use already).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-305) Port Hadoop streaming's counters/status reporters to Hive Transforms

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-305:


Component/s: Query Processor

> Port Hadoop streaming's counters/status reporters to Hive Transforms
> 
>
> Key: HIVE-305
> URL: https://issues.apache.org/jira/browse/HIVE-305
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Venky Iyer
>
> https://issues.apache.org/jira/browse/HADOOP-1328
> " Introduced a way for a streaming process to update global counters and 
> status using stderr stream to emit information. Use 
> "reporter:counter:<group>,<counter>,<amount>" to update a counter. Use 
> "reporter:status:<message>" to update status. "
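
For reference, Hadoop streaming expects these lines on stderr in the form reporter:counter:<group>,<counter>,<amount> and reporter:status:<message>. The sketch below shows a transform process emitting them; the StreamingReporter class and its method names are hypothetical, and whether Hive transforms would parse the same lines is exactly what this issue asks for.

```java
/** Hypothetical helper that formats Hadoop-streaming style reporter lines. */
final class StreamingReporter {
    /** Builds a counter update line: reporter:counter:<group>,<counter>,<amount>. */
    static String counterLine(String group, String counter, long amount) {
        return "reporter:counter:" + group + "," + counter + "," + amount;
    }

    /** Builds a status update line: reporter:status:<message>. */
    static String statusLine(String message) {
        return "reporter:status:" + message;
    }

    /** Emits a counter update on stderr, where the framework reads it. */
    static void incrementCounter(String group, String counter, long amount) {
        System.err.println(counterLine(group, counter, amount));
    }

    /** Emits a status update on stderr. */
    static void setStatus(String message) {
        System.err.println(statusLine(message));
    }
}
```

Because the protocol is just text on stderr, a script in any language can use it, which is what makes it a natural fit for transform scripts.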

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-345) Extend Date UDFs to support time zone and full specs as in MySQL

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-345:


Component/s: (was: Query Processor)
 UDF

> Extend Date UDFs to support time zone and full specs as in MySQL
> 
>
> Key: HIVE-345
> URL: https://issues.apache.org/jira/browse/HIVE-345
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Affects Versions: 0.3.0
>Reporter: Zheng Shao
>
> Most of the Date UDFs in Hive are now based on String instead of Date objects, 
> and they have limited functionality compared with MySQL.
> http://dev.mysql.com/doc/refman/5.1/en/date-and-time-functions.html#function_from-unixtime
> http://dev.mysql.com/doc/refman/5.1/en/date-and-time-functions.html#function_date-add
> http://dev.mysql.com/doc/refman/5.1/en/date-and-time-functions.html#function_date-sub
> http://dev.mysql.com/doc/refman/5.1/en/date-and-time-functions.html#function_datediff
> We should make it fully compliant with what MySQL offers.
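
To make the request concrete, the sketch below shows in Python what date functions built on real date objects (rather than strings) might look like. The function names mirror the MySQL functions linked above, but the signatures are simplified assumptions for illustration; they are not Hive's or MySQL's actual definitions, and MySQL's INTERVAL syntax is reduced here to a plain number of days.

```python
from datetime import date, datetime, timedelta, timezone


def from_unixtime(seconds: int, tz: timezone = timezone.utc) -> datetime:
    """Convert seconds since the epoch to a timezone-aware datetime."""
    return datetime.fromtimestamp(seconds, tz)


def date_add(d: date, days: int) -> date:
    """Return the date `days` days after `d`."""
    return d + timedelta(days=days)


def date_sub(d: date, days: int) -> date:
    """Return the date `days` days before `d`."""
    return d - timedelta(days=days)


def datediff(end: date, start: date) -> int:
    """Return the number of days from `start` to `end`."""
    return (end - start).days
```

Because the values are date objects, month and year boundaries are handled by the library rather than by string manipulation, and the time zone is an explicit argument.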



[jira] [Updated] (HIVE-361) Support seeks in some Hive File Formats

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-361:


Component/s: Serializers/Deserializers

> Support seeks in some Hive File Formats
> ---
>
> Key: HIVE-361
> URL: https://issues.apache.org/jira/browse/HIVE-361
> Project: Hive
>  Issue Type: New Feature
>  Components: Serializers/Deserializers
>Reporter: Zheng Shao
>
> Seek support can be useful for a few applications:
> 1. Filter out a set of records quickly when the data is sorted on the 
> filtering key;
> 2. Create a random sample out of a File.
> This might not be a short-term goal, but let's keep the discussions here so 
> they do not get lost.
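
The first use case can be sketched as a binary search that uses seeks to skip records below a filtering key. This Python example assumes a hypothetical file layout of fixed-width records, each holding one big-endian 64-bit key, sorted ascending; it is not any actual Hive file format, and the function names are invented for the illustration.

```python
import io
import struct

RECORD = struct.Struct(">q")  # assumed layout: one big-endian 64-bit key per record


def first_at_least(f, key: int) -> int:
    """Return the index of the first record whose key is >= `key`."""
    f.seek(0, io.SEEK_END)
    lo, hi = 0, f.tell() // RECORD.size
    while lo < hi:
        mid = (lo + hi) // 2
        f.seek(mid * RECORD.size)  # jump straight to the record, no scan
        (value,) = RECORD.unpack(f.read(RECORD.size))
        if value < key:
            lo = mid + 1
        else:
            hi = mid
    return lo


def read_from(f, index: int):
    """Yield keys from record `index` to the end of the file."""
    f.seek(index * RECORD.size)
    while chunk := f.read(RECORD.size):
        yield RECORD.unpack(chunk)[0]
```

With seek support, a filter such as key >= k reads only O(log n) records before reaching the first match. The second use case, random sampling, would seek to randomly chosen record offsets in the same way.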



[jira] [Updated] (HIVE-357) Add order-sensitive and order-insensitive hashing aggregation functions (UDAF)

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-357:


Component/s: UDF

> Add order-sensitive and order-insensitive hashing aggregation functions (UDAF)
> --
>
> Key: HIVE-357
> URL: https://issues.apache.org/jira/browse/HIVE-357
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Processor, UDF
>Reporter: Zheng Shao
>Assignee: Zheng Shao
>
> In order to test whether a new version of Hive produces exactly the same 
> result as an older version, we usually want to run a bunch of big queries as 
> well as small queries.
> It's hard to compare the result of big queries, but if we have a hashing 
> aggregation function, we can just aggregate the result of big queries and 
> compare the single number.
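
A minimal Python sketch of the two kinds of hashing aggregate follows. The per-row hash (truncated SHA-256 of the row's repr) and the combining steps are assumptions chosen for illustration, not the functions proposed for Hive: the order-insensitive variant sums row hashes modulo 2**64, and the order-sensitive variant folds each row hash into a running state.

```python
import hashlib

MASK = (1 << 64) - 1          # keep results in 64 bits
MULTIPLIER = 1099511628211    # odd constant used to mix the running state


def row_hash(row: tuple) -> int:
    """Hash one row to a 64-bit integer that is stable across runs."""
    digest = hashlib.sha256(repr(row).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def hash_unordered(rows) -> int:
    """Order-insensitive aggregate: addition modulo 2**64 is commutative."""
    total = 0
    for row in rows:
        total = (total + row_hash(row)) & MASK
    return total


def hash_ordered(rows) -> int:
    """Order-sensitive aggregate: each step mixes the state with the next row."""
    state = 0
    for row in rows:
        state = (state * MULTIPLIER + row_hash(row)) & MASK
    return state
```

Comparing hash_unordered over the output of two Hive versions checks that the same rows were produced; hash_ordered additionally checks that they came out in the same order, which matters for queries with ORDER BY.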



[jira] [Updated] (HIVE-364) Hive Operators should calculate the value of common expressions just once

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-364:


Component/s: (was: Serializers/Deserializers)
 Query Processor

> Hive Operators should calculate the value of common expressions just once
> -
>
> Key: HIVE-364
> URL: https://issues.apache.org/jira/browse/HIVE-364
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Zheng Shao
>
> Currently, if we have "t.a + t.b" in 2 different expressions in the select 
> clause / where clause, we are computing it twice.
> We should cache the value of the expression evaluation result to save CPU 
> time.
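
The caching idea can be sketched in Python as a per-row memo table keyed by the expression itself. Expressions are modeled here as nested tuples such as ("+", ("col", "a"), ("col", "b")); this representation and the RowEvaluator class are hypothetical and unrelated to Hive's actual operator or exprNode classes.

```python
import operator

OPS = {"+": operator.add, "*": operator.mul, ">": operator.gt}


class RowEvaluator:
    """Evaluates expression trees for one row at a time, caching each subtree."""

    def __init__(self):
        self.row = {}
        self.cache = {}
        self.computed = []  # every expression actually computed, for inspection

    def start_row(self, row: dict) -> None:
        """Switch to a new row and drop values cached for the previous one."""
        self.row = row
        self.cache = {}

    def evaluate(self, expr: tuple):
        """Return the value of `expr`, computing each distinct subtree once per row."""
        if expr in self.cache:
            return self.cache[expr]
        kind = expr[0]
        if kind == "col":
            value = self.row[expr[1]]
        elif kind == "const":
            value = expr[1]
        else:
            value = OPS[kind](self.evaluate(expr[1]), self.evaluate(expr[2]))
        self.computed.append(expr)
        self.cache[expr] = value
        return value
```

If "t.a + t.b" appears in both the select list and the where clause, the second evaluate call for that subtree is a dictionary lookup rather than a recomputation. The cache must be cleared per row, which start_row does.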



[jira] [Updated] (HIVE-436) MIN and MAX should inherit type

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-436:


Component/s: UDF

> MIN and MAX should inherit type
> ---
>
> Key: HIVE-436
> URL: https://issues.apache.org/jira/browse/HIVE-436
> Project: Hive
>  Issue Type: Wish
>  Components: UDF
>Reporter: Adam Kramer
>
> MIN and MAX functions currently return the DOUBLE type...but really, they 
> should return the same type as the column they operate on.
> In some cases like SUM, it's possible that the result would overflow making 
> DOUBLE more useful as it can drop digits and swap to scientific notation, but 
> MIN and MAX by definition cannot have this problem because the answers are 
> always represented in the column they are run across.
> Easy workaround: CAST all of my MINs and MAXes. It's just a wish.
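
Beyond convenience, returning DOUBLE can also lose precision for large integer columns. The Python sketch below illustrates this with IEEE 754 doubles, which represent integers exactly only up to 2**53; the two helper functions are hypothetical stand-ins for the current and the wished-for behavior, not Hive code.

```python
def max_as_double(values):
    """Mimic an aggregate that converts every value to a double first."""
    return max(float(v) for v in values)


def max_same_type(values):
    """Return the maximum in the column's own type."""
    return max(values)


# 2**53 + 1 fits in a 64-bit integer but has no exact double representation.
big = 2**53 + 1
column = [1, big, 42]
```

Here max_same_type(column) returns 9007199254740993 exactly, while int(max_as_double(column)) is 9007199254740992, a value that does not occur in the column.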



[jira] [Updated] (HIVE-441) Convert field, index, AND, OR operators to GenericUDF

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-441:


Component/s: UDF

> Convert field, index, AND, OR operators to GenericUDF
> -
>
> Key: HIVE-441
> URL: https://issues.apache.org/jira/browse/HIVE-441
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Zheng Shao
>Assignee: Zheng Shao
>
> Once the GenericUDF framework is in, we should convert exprNodeFieldDesc, 
> exprNodeIndexDesc to GenericUDF to simplify the code.
> We should also convert AND and OR to GenericUDF in order to take advantage of 
> short-circuit evaluation.
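
The benefit of short-circuit evaluation can be sketched in Python by passing operands as zero-argument callables, so that an operator decides whether to evaluate each one. The function names are invented for this illustration and this is not the GenericUDF API; SQL's three-valued logic for NULL is deliberately ignored to keep the example small.

```python
def and_eager(operands) -> bool:
    """Evaluate every operand first, then combine (no short-circuit)."""
    values = [operand() for operand in operands]
    return all(values)


def and_short_circuit(operands) -> bool:
    """Stop at the first false operand; later operands are never evaluated."""
    for operand in operands:
        if not operand():
            return False
    return True


def or_short_circuit(operands) -> bool:
    """Stop at the first true operand; later operands are never evaluated."""
    for operand in operands:
        if operand():
            return True
    return False
```

An operator that receives already-computed values behaves like and_eager; one that receives deferred operands can skip an expensive right-hand side whenever the left-hand side already decides the result.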



Build failed in Jenkins: Hive-0.7.0-h0.20 #67

2011-04-05 Thread Apache Hudson Server
See 

Changes:

[cws] HIVE-2054. Exception on windows when using the jdbc driver. 'IOException: 
The system cannot find the path specified' (Bennie Schut via cws)

--
[...truncated 27327 lines...]
[junit] Hive history 
file=
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: CREATETABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: CREATETABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: load data local inpath 
'
 into table testhivedrivertable
[junit] PREHOOK: type: LOAD
[junit] Copying data from 

[junit] Loading data to table default.testhivedrivertable
[junit] POSTHOOK: query: load data local inpath 
'
 into table testhivedrivertable
[junit] POSTHOOK: type: LOAD
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: select count(1) as cnt from testhivedrivertable
[junit] PREHOOK: type: QUERY
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/hudson/hive_2011-04-05_17-01-26_771_2233068310906990849/-mr-1
[junit] Total MapReduce jobs = 1
[junit] Launching Job 1 out of 1
[junit] Number of reduce tasks determined at compile time: 1
[junit] In order to change the average load for a reducer (in bytes):
[junit]   set hive.exec.reducers.bytes.per.reducer=
[junit] In order to limit the maximum number of reducers:
[junit]   set hive.exec.reducers.max=
[junit] In order to set a constant number of reducers:
[junit]   set mapred.reduce.tasks=
[junit] Job running in-process (local Hadoop)
[junit] 2011-04-05 17:01:29,841 null map = 100%,  reduce = 100%
[junit] Ended Job = job_local_0001
[junit] POSTHOOK: query: select count(1) as cnt from testhivedrivertable
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/hudson/hive_2011-04-05_17-01-26_771_2233068310906990849/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history 
file=
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: CREATETABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: CREATETABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: load data local inpath 
'
 into table testhivedrivertable
[junit] PREHOOK: type: LOAD
[junit] Copying data from 

[junit] Loading data to table default.testhivedrivertable
[junit] POSTHOOK: query: load data local inpath 
'
 into table testhivedrivertable
[junit] POSTHOOK: type: LOAD
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: select * from testhivedrivertable limit 10
[junit] PREHOOK: type: QUERY
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/hudson/hive_2011-04-05_17-01-31_397_1104374875420903108/-mr-1
[junit] POSTHOOK: query: select * from testhivedrivertable limit 10
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[ju

[jira] [Created] (HIVE-2096) throw an error if the input is larger than a threshold for index input format

2011-04-05 Thread Namit Jain (JIRA)
throw an error if the input is larger than a threshold for index input format


 Key: HIVE-2096
 URL: https://issues.apache.org/jira/browse/HIVE-2096
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain
Assignee: He Yongqiang


This can hang forever.



Build failed in Jenkins: Hive-trunk-h0.20 #656

2011-04-05 Thread Apache Hudson Server
See 

Changes:

[cws] HIVE-2054. Exception on windows when using the jdbc driver. 'IOException: 
The system cannot find the path specified' (Bennie Schut via cws)

--
[...truncated 29856 lines...]
[junit] OK
[junit] PREHOOK: query: select count(1) as cnt from testhivedrivertable
[junit] PREHOOK: type: QUERY
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/hudson/hive_2011-04-05_17-07-36_023_6138469711846423175/-mr-1
[junit] Total MapReduce jobs = 1
[junit] Launching Job 1 out of 1
[junit] Number of reduce tasks determined at compile time: 1
[junit] In order to change the average load for a reducer (in bytes):
[junit]   set hive.exec.reducers.bytes.per.reducer=
[junit] In order to limit the maximum number of reducers:
[junit]   set hive.exec.reducers.max=
[junit] In order to set a constant number of reducers:
[junit]   set mapred.reduce.tasks=
[junit] Job running in-process (local Hadoop)
[junit] Hadoop job information for null: number of mappers: 0; number of 
reducers: 0
[junit] 2011-04-05 17:07:39,095 null map = 100%,  reduce = 100%
[junit] Ended Job = job_local_0001
[junit] POSTHOOK: query: select count(1) as cnt from testhivedrivertable
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/hudson/hive_2011-04-05_17-07-36_023_6138469711846423175/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history 
file=
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: CREATETABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: CREATETABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: load data local inpath 
'
 into table testhivedrivertable
[junit] PREHOOK: type: LOAD
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] Copying data from 

[junit] Loading data to table default.testhivedrivertable
[junit] POSTHOOK: query: load data local inpath 
'
 into table testhivedrivertable
[junit] POSTHOOK: type: LOAD
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: select * from testhivedrivertable limit 10
[junit] PREHOOK: type: QUERY
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/hudson/hive_2011-04-05_17-07-40_622_2683345849257742374/-mr-1
[junit] POSTHOOK: query: select * from testhivedrivertable limit 10
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/hudson/hive_2011-04-05_17-07-40_622_2683345849257742374/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history 
file=
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: CREATETABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: CREATETABLE
[juni

[jira] [Commented] (HIVE-2090) Add "DROP DATABASE ... FORCE"

2011-04-05 Thread He Yongqiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016200#comment-13016200
 ] 

He Yongqiang commented on HIVE-2090:


Can you add an authorization check for drop database in this JIRA?

> Add "DROP DATABASE ... FORCE"
> -
>
> Key: HIVE-2090
> URL: https://issues.apache.org/jira/browse/HIVE-2090
> Project: Hive
>  Issue Type: New Feature
>Reporter: Siying Dong
>Assignee: Siying Dong
>Priority: Minor
> Attachments: HIVE-2090.1.patch, HIVE-2090.2.patch
>
>
> A "DROP DATABASE ... FORCE" would be useful when we use a database for 
> isolation while running tests. Being able to force-drop the database 
> makes test cleanup easier.



[jira] [Updated] (HIVE-2090) Add "DROP DATABASE ... FORCE"

2011-04-05 Thread Siying Dong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siying Dong updated HIVE-2090:
--

Attachment: HIVE-2090.3.patch

> Add "DROP DATABASE ... FORCE"
> -
>
> Key: HIVE-2090
> URL: https://issues.apache.org/jira/browse/HIVE-2090
> Project: Hive
>  Issue Type: New Feature
>Reporter: Siying Dong
>Assignee: Siying Dong
>Priority: Minor
> Attachments: HIVE-2090.1.patch, HIVE-2090.2.patch, HIVE-2090.3.patch
>
>
> A "DROP DATABASE ... FORCE" would be useful when we use a database for 
> isolation while running tests. Being able to force-drop the database 
> makes test cleanup easier.


