[jira] [Created] (HIVE-4232) JDBC2 HiveConnection has odd defaults

2013-03-26 Thread Chris Drome (JIRA)
Chris Drome created HIVE-4232:
-

 Summary: JDBC2 HiveConnection has odd defaults
 Key: HIVE-4232
 URL: https://issues.apache.org/jira/browse/HIVE-4232
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, JDBC
Affects Versions: 0.11.0
Reporter: Chris Drome


HiveConnection defaults to using a plain SASL transport if auth is not set. To 
get a raw transport, auth must be set to noSasl; furthermore, the value noSasl 
is matched case-sensitively. The code tries to infer Kerberos or plain 
authentication based on the presence of the principal parameter. There is no 
provision for specifying the QOP level.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4232) JDBC2 HiveConnection has odd defaults

2013-03-26 Thread Chris Drome (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Drome updated HIVE-4232:
--

Assignee: Chris Drome




[jira] [Commented] (HIVE-4232) JDBC2 HiveConnection has odd defaults

2013-03-26 Thread Chris Drome (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13613592#comment-13613592
 ] 

Chris Drome commented on HIVE-4232:
---

The patch proposes that if auth is not specified or auth=none the transport 
will default to TSocket. If auth=kerberos|plain then the Kerberos SASL 
transport or the Plain SASL transport will be used. If auth=kerberos then 
principal must also be specified. We propose controlling the QOP with another 
parameter qop=auth|auth-int|auth-conf. The patch also takes care of 
case-sensitive value comparisons.

I feel that these changes result in a more reasonable set of defaults and don't 
rely upon inferring Kerberos or plain based on the existence of other 
parameters.
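
A minimal sketch of the selection rules proposed above, written in Python only 
to illustrate the decision logic (the driver itself is Java). The function 
name, the returned labels, the default of qop=auth when qop is omitted, and 
applying qop to the plain SASL transport are all assumptions made for this 
example; they are not taken from the attached patch.

```python
from typing import Dict, Optional, Tuple

VALID_AUTH = {"none", "plain", "kerberos"}
VALID_QOP = {"auth", "auth-int", "auth-conf"}


def choose_transport(params: Dict[str, str]) -> Tuple[str, Optional[str]]:
    """Return (transport label, qop) for a dict of JDBC URL parameters.

    Hypothetical helper: the labels are illustrative, not real class names.
    """
    # Values are compared case-insensitively, as the comment proposes.
    auth = params.get("auth", "none").strip().lower()
    if auth not in VALID_AUTH:
        raise ValueError(f"unsupported auth value: {auth!r}")

    if auth == "none":
        # auth missing or auth=none: raw socket, no SASL wrapping.
        return ("TSocket", None)

    # Assumption: fall back to qop=auth when qop is not given.
    qop = params.get("qop", "auth").strip().lower()
    if qop not in VALID_QOP:
        raise ValueError(f"unsupported qop value: {qop!r}")

    if auth == "kerberos":
        # Kerberos is chosen explicitly, never inferred from principal,
        # but principal is still required.
        if not params.get("principal"):
            raise ValueError("auth=kerberos requires principal")
        return ("KerberosSasl", qop)

    return ("PlainSasl", qop)
```

With these rules an empty parameter set yields the raw transport, and 
auth=kerberos without principal fails fast instead of silently falling back 
to plain authentication.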

I will rebase HIVE-4225 if these proposed changes are accepted.




[jira] [Updated] (HIVE-4232) JDBC2 HiveConnection has odd defaults

2013-03-26 Thread Chris Drome (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Drome updated HIVE-4232:
--

Attachment: HIVE-4232.patch

Uploaded patch which implements the proposed changes.




[jira] [Commented] (HIVE-4225) HiveServer2 does not support SASL QOP

2013-03-26 Thread Chris Drome (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13613595#comment-13613595
 ] 

Chris Drome commented on HIVE-4225:
---

I will rebase this patch if the changes in HIVE-4232 are acceptable.

> HiveServer2 does not support SASL QOP
> -
>
> Key: HIVE-4225
> URL: https://issues.apache.org/jira/browse/HIVE-4225
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Shims
>Affects Versions: 0.11.0
>Reporter: Chris Drome
>Assignee: Chris Drome
> Fix For: 0.11.0
>
> Attachments: HIVE-4225.patch
>
>
> HiveServer2 implements Kerberos authentication through the SASL framework, 
> but does not support setting the QOP.



[jira] [Updated] (HIVE-4042) ignore mapjoin hint

2013-03-26 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-4042:
---

   Resolution: Fixed
Fix Version/s: 0.11.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Namit!

> ignore mapjoin hint
> ---
>
> Key: HIVE-4042
> URL: https://issues.apache.org/jira/browse/HIVE-4042
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Fix For: 0.11.0
>
> Attachments: hive.4042.10.patch, hive.4042.11.patch, 
> hive.4042.12.patch, hive.4042.1.patch, hive.4042.2.patch, hive.4042.3.patch, 
> hive.4042.4.patch, hive.4042.5.patch, hive.4042.6.patch, hive.4042.7.patch, 
> hive.4042.8.patch, hive.4042.9.patch
>
>
> After HIVE-3784, in a production environment, it can become difficult to
> deploy since a lot of production queries can break.



[jira] [Updated] (HIVE-4232) JDBC2 HiveConnection has odd defaults

2013-03-26 Thread Chris Drome (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Drome updated HIVE-4232:
--

Fix Version/s: 0.11.0
   Status: Patch Available  (was: Open)




[jira] [Commented] (HIVE-3381) Result of outer join is not valid

2013-03-26 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13613602#comment-13613602
 ] 

Ashutosh Chauhan commented on HIVE-3381:


[~navis] Can you refresh the patch? It is not applying cleanly. I will test and 
commit it then. Also, it would be good to include Vikram's tests while you are 
refreshing the patch.

> Result of outer join is not valid
> -
>
> Key: HIVE-3381
> URL: https://issues.apache.org/jira/browse/HIVE-3381
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.10.0
>Reporter: Navis
>Assignee: Navis
>Priority: Critical
> Attachments: HIVE-3381.D5565.3.patch, HIVE-3381.D5565.4.patch, 
> HIVE-3381.D5565.5.patch, HIVE-3381.D5565.6.patch, mapjoin_testOuter.q
>
>
> Outer joins, especially full outer joins or outer joins with a filter in the 
> ON clause, do not return proper results. For example, the query in test 
> join_1to1.q
> {code}
> SELECT * FROM join_1to1_1 a full outer join join_1to1_2 b on a.key1 = b.key1 
> and a.value = 66 and b.value = 66 ORDER BY a.key1 ASC, a.key2 ASC, a.value 
> ASC, b.key1 ASC, b.key2 ASC, b.value ASC;
> {code}
> results
> {code}
> NULL  NULL   NULL  NULL  NULL   66
> NULL  NULL   NULL  NULL  10050  66
> NULL  NULL   NULL  10    10010  66
> NULL  NULL   NULL  30    10030  88
> NULL  NULL   NULL  35    10035  88
> NULL  NULL   NULL  40    10040  88
> NULL  NULL   NULL  40    10040  88
> NULL  NULL   NULL  50    10050  88
> NULL  NULL   NULL  50    10050  88
> NULL  NULL   NULL  50    10050  88
> NULL  NULL   NULL  70    10040  88
> NULL  NULL   NULL  70    10040  88
> NULL  NULL   NULL  70    10040  88
> NULL  NULL   NULL  70    10040  88
> NULL  NULL   66    NULL  NULL   NULL
> NULL  10050  66    NULL  NULL   NULL
> 5     10005  66    5     10005  66
> 15    10015  66    NULL  NULL   NULL
> 20    10020  66    20    10020  66
> 25    10025  88    NULL  NULL   NULL
> 30    10030  66    NULL  NULL   NULL
> 35    10035  88    NULL  NULL   NULL
> 40    10040  66    NULL  NULL   NULL
> 40    10040  66    40    10040  66
> 40    10040  88    NULL  NULL   NULL
> 40    10040  88    NULL  NULL   NULL
> 50    10050  66    NULL  NULL   NULL
> 50    10050  66    50    10050  66
> 50    10050  66    50    10050  66
> 50    10050  88    NULL  NULL   NULL
> 50    10050  88    NULL  NULL   NULL
> 50    10050  88    NULL  NULL   NULL
> 50    10050  88    NULL  NULL   NULL
> 50    10050  88    NULL  NULL   NULL
> 50    10050  88    NULL  NULL   NULL
> 60    10040  66    60    10040  66
> 60    10040  66    60    10040  66
> 60    10040  66    60    10040  66
> 60    10040  66    60    10040  66
> 70    10040  66    NULL  NULL   NULL
> 70    10040  66    NULL  NULL   NULL
> 70    10040  66    NULL  NULL   NULL
> 70    10040  66    NULL  NULL   NULL
> 80    10040  88    NULL  NULL   NULL
> 80    10040  88    NULL  NULL   NULL
> 80    10040  88    NULL  NULL   NULL
> 80    10040  88    NULL  NULL   NULL
> {code} 
> but this does not look right. The result should be 
> {code}
> NULL  NULL   NULL  NULL  NULL   66
> NULL  NULL   NULL  NULL  10050  66
> NULL  NULL   NULL  10    10010  66
> NULL  NULL   NULL  25    10025  66
> NULL  NULL   NULL  30    10030  88
> NULL  NULL   NULL  35    10035  88
> NULL  NULL   NULL  40    10040  88
> NULL  NULL   NULL  50    10050  88
> NULL  NULL   NULL  70    10040  88
> NULL  NULL   NULL  70    10040  88
> NULL  NULL   NULL  80    10040  66
> NULL  NULL   NULL  80    10040  66
> NULL  NULL   66    NULL  NULL   NULL
> NULL  10050  66    NULL  NULL   NULL
> 5     10005  66    5     10005  66
> 15    10015  66    NULL  NULL   NULL
> 20    10020  66    20    10020  66
> 25    10025  88    NULL  NULL   NULL
> 30    10030  66    NULL  NULL   NULL
> 35    10035  88    NULL  NULL   NULL
> 40    10040  66    40    10040  66
> 40    10040  88    NULL  NULL   NULL
> 50    10050  66    50    10050  66
> 50    10050  66    50    10050  66
> 50    10050  88    NULL  NULL   NULL
> 50    10050  88    NULL  NULL   NULL
> 60    10040  66    60    10040  66
> 60    10040  66    60    10040  66
> 60    10040  66    60    10040  66
> 60    10040  66    60    10040  66
> 70    10040  66    NULL  NULL   NULL
> 70    10040  66    NULL  NULL   NULL
> 80    10040  88    NULL  NULL   NULL
> 80    10040  88    NULL  NULL
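
For reference, a small Python sketch of the standard full outer join semantics 
with filters in the ON clause, as used by the query in this issue. It is a 
naive nested-loop model for reasoning about expected output, not Hive's join 
operator; the row layout (key1, key2, value) and all function names are 
assumptions made for the example.

```python
from typing import List, Optional, Tuple

Row = Tuple[Optional[int], Optional[int], Optional[int]]  # (key1, key2, value)
NULLS: Row = (None, None, None)


def on_clause(a: Row, b: Row) -> bool:
    # a.key1 = b.key1 AND a.value = 66 AND b.value = 66.
    # A NULL key never compares equal to anything, including another NULL.
    return a[0] is not None and a[0] == b[0] and a[2] == 66 and b[2] == 66


def full_outer_join(left: List[Row], right: List[Row]) -> List[Tuple[Row, Row]]:
    out: List[Tuple[Row, Row]] = []
    matched_right = [False] * len(right)
    for a in left:
        found = False
        for j, b in enumerate(right):
            if on_clause(a, b):
                out.append((a, b))
                matched_right[j] = True
                found = True
        if not found:
            # ON-clause filters never drop a row: an unmatched left row
            # is emitted exactly once, padded with NULLs.
            out.append((a, NULLS))
    for j, b in enumerate(right):
        if not matched_right[j]:
            # Same rule for unmatched right rows.
            out.append((NULLS, b))
    return out
```

Under these semantics every input row appears in the output at least once, and 
a row that matches nothing appears exactly once.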

[jira] [Commented] (HIVE-3381) Result of outer join is not valid

2013-03-26 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13613603#comment-13613603
 ] 

Phabricator commented on HIVE-3381:
---

ashutoshc has accepted the revision "HIVE-3381 [jira] Result of outer join is 
not valid".

  +1 Running tests

REVISION DETAIL
  https://reviews.facebook.net/D5565

BRANCH
  DPAL-1739

ARCANIST PROJECT
  hive

To: JIRA, ashutoshc, navis
Cc: njain



[jira] [Updated] (HIVE-3562) Some limit can be pushed down to map stage

2013-03-26 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-3562:
--

Attachment: HIVE-3562.D5967.4.patch

navis updated the revision "HIVE-3562 [jira] Some limit can be pushed down to 
map stage".

  1. Used Heap for ORDER BY and Map for GROUP BY
  2. Added tests for spill/break
  3. Changed to use percentage for memory threshold

Reviewers: tarball, JIRA

REVISION DETAIL
  https://reviews.facebook.net/D5967

CHANGE SINCE LAST DIFF
  https://reviews.facebook.net/D5967?vs=24861&id=30483#toc

AFFECTED FILES
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
  conf/hive-default.xml.template
  ql/build.xml
  ql/ivy.xml
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExtractOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/ForwardOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/SelectOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/TopNHash.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/TopNHashForGBY.java
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveKey.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/LimitPushdownOptimizer.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/ReduceSinkDesc.java
  ql/src/test/queries/clientpositive/limit_pushdown.q
  ql/src/test/results/clientpositive/limit_pushdown.q.out

To: JIRA, tarball, navis
Cc: njain


> Some limit can be pushed down to map stage
> --
>
> Key: HIVE-3562
> URL: https://issues.apache.org/jira/browse/HIVE-3562
> Project: Hive
>  Issue Type: Bug
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-3562.D5967.1.patch, HIVE-3562.D5967.2.patch, 
> HIVE-3562.D5967.3.patch, HIVE-3562.D5967.4.patch
>
>
> Queries with a limit clause (with a reasonable number), for example
> {noformat}
> select * from src order by key limit 10;
> {noformat}
> produce the operator tree 
> TS-SEL-RS-EXT-LIMIT-FS
> But LIMIT can be partially evaluated in RS, reducing the amount of data 
> shuffled:
> TS-SEL-RS(TOP-N)-EXT-LIMIT-FS
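
A rough Python sketch of the idea for the ORDER BY case: each map task keeps 
only its own top N keys in a heap before the shuffle, and the reducer applies 
the final LIMIT. The function names and the list-of-lists input are 
hypothetical, and this does not model the memory threshold or spill handling 
mentioned in the revision notes.

```python
import heapq
from typing import Iterable, List


def map_side_top_n(keys: Iterable[str], n: int) -> List[str]:
    """Keep only the n smallest keys seen by one mapper (ascending ORDER BY)."""
    return heapq.nsmallest(n, keys)


def order_by_limit(mapper_inputs: List[List[str]], n: int) -> List[str]:
    """Simulate TS-SEL-RS(TOP-N)-EXT-LIMIT-FS for `order by key limit n`."""
    shuffled: List[str] = []
    for keys in mapper_inputs:
        # At most n rows leave each mapper instead of the whole input.
        shuffled.extend(map_side_top_n(keys, n))
    # The reducer still sorts and applies the final LIMIT.
    return sorted(shuffled)[:n]
```

The result is the same as sorting all rows and taking the first N, because any 
row in the global top N must also be in the top N of the mapper that saw it.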



[jira] [Commented] (HIVE-4007) Create abstract classes for serializer and deserializer

2013-03-26 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13613612#comment-13613612
 ] 

Ashutosh Chauhan commented on HIVE-4007:


Cool. +1 running tests.

> Create abstract classes for serializer and deserializer
> ---
>
> Key: HIVE-4007
> URL: https://issues.apache.org/jira/browse/HIVE-4007
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.4007.1.patch, hive.4007.2.patch, hive.4007.3.patch, 
> hive.4007.4.patch
>
>
> Currently, it is very difficult to change the Serializer/Deserializer
> interface, since all the SerDes directly implement the interface.
> Instead, we should have abstract classes for implementing these interfaces.
> In case of an interface change, only the abstract class and the relevant 
> serde need to change.
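
A small sketch of the pattern described above, in Python for brevity (the real 
SerDe interfaces are Java). All class and method names here are hypothetical: 
the point is only that a method added to the interface later can get a default 
in the abstract base, so existing implementations keep working unchanged.

```python
from abc import ABC, abstractmethod
from typing import List


class Deserializer(ABC):
    """Stands in for the interface that every serde must satisfy."""

    @abstractmethod
    def deserialize(self, blob: bytes) -> List[str]:
        ...

    @abstractmethod
    def get_stats(self) -> dict:
        """Hypothetical method added to the interface at a later date."""


class AbstractDeserializer(Deserializer):
    """Base class that absorbs interface changes with default behaviour."""

    def get_stats(self) -> dict:
        return {}


class CsvDeserializer(AbstractDeserializer):
    """Concrete serde: written before get_stats existed, still valid."""

    def deserialize(self, blob: bytes) -> List[str]:
        return blob.decode("utf-8").split(",")
```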



[jira] [Updated] (HIVE-3980) Cleanup after HIVE-3403

2013-03-26 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3980:
---

   Resolution: Fixed
Fix Version/s: 0.11.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Namit!

> Cleanup after HIVE-3403
> ---
>
> Key: HIVE-3980
> URL: https://issues.apache.org/jira/browse/HIVE-3980
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Fix For: 0.11.0
>
> Attachments: hive.3980.1.patch, hive.3980.2.patch, hive.3980.3.patch, 
> hive.3980.4.patch
>
>
> There have been a lot of comments on HIVE-3403, which involve changing 
> variable names/function names/adding more comments/general cleanup etc.
> Since HIVE-3403 involves a lot of refactoring, it was fairly difficult to
> address the comments there, since refreshing becomes impossible. This jira
> is to track those cleanups.



[jira] [Commented] (HIVE-4197) Bring windowing support inline with SQL Standard

2013-03-26 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13613618#comment-13613618
 ] 

Ashutosh Chauhan commented on HIVE-4197:


https://reviews.facebook.net/D9717 Review request on behalf of [~rhbutani]

> Bring windowing support inline with SQL Standard
> 
>
> Key: HIVE-4197
> URL: https://issues.apache.org/jira/browse/HIVE-4197
> Project: Hive
>  Issue Type: Bug
>  Components: PTF-Windowing
>Reporter: Harish Butani
> Attachments: WindowingSpecification.pdf
>
>
> The current behavior differs from the Standard in several significant places.
> Please review the attached doc; there are still a few open issues. Once we 
> agree on the behavior, we can proceed with fixing the implementation.



[jira] [Updated] (HIVE-3381) Result of outer join is not valid

2013-03-26 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-3381:
--

Attachment: HIVE-3381.D5565.7.patch

navis updated the revision "HIVE-3381 [jira] Result of outer join is not valid".

  Rebased to trunk (HIVE-3980) & added test (Thanks Vikram)

Reviewers: ashutoshc, JIRA

REVISION DETAIL
  https://reviews.facebook.net/D5565

CHANGE SINCE LAST DIFF
  https://reviews.facebook.net/D5565?vs=30351&id=30495#toc

BRANCH
  DPAL-1739

ARCANIST PROJECT
  hive

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/JoinOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/JoinUtil.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinObjectValue.java
  ql/src/test/queries/clientpositive/mapjoin_test_outer.q
  ql/src/test/results/clientpositive/auto_join21.q.out
  ql/src/test/results/clientpositive/auto_join29.q.out
  ql/src/test/results/clientpositive/auto_join7.q.out
  ql/src/test/results/clientpositive/auto_join_filters.q.out
  ql/src/test/results/clientpositive/join21.q.out
  ql/src/test/results/clientpositive/join7.q.out
  ql/src/test/results/clientpositive/join_1to1.q.out
  ql/src/test/results/clientpositive/join_filters.q.out
  ql/src/test/results/clientpositive/join_filters_overlap.q.out
  ql/src/test/results/clientpositive/mapjoin1.q.out
  ql/src/test/results/clientpositive/mapjoin_test_outer.q.out

To: JIRA, ashutoshc, navis
Cc: njain



[jira] [Created] (HIVE-4233) The TGT gotten from class 'CLIService' should be renewed on time?

2013-03-26 Thread dong (JIRA)
dong created HIVE-4233:
--

 Summary: The TGT gotten from class 'CLIService'  should be renewed 
on time? 
 Key: HIVE-4233
 URL: https://issues.apache.org/jira/browse/HIVE-4233
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.10.0
 Environment: CentOS release 6.3 (Final)

jdk1.6.0_31

HiveServer2  0.10.0-cdh4.2.0

Kerberos Security 
Reporter: dong
Priority: Critical


When HiveServer2 has been running for more than 7 days and I use the beeline 
shell to connect to it, all operations fail.

The HiveServer2 log shows that this is caused by a Kerberos authentication 
failure; the exception stack trace is:

2013-03-26 11:55:20,932 ERROR hive.ql.metadata.Hive: 
java.lang.RuntimeException: Unable to instantiate 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient
at 
org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1084)
at 
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:51)
at 
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:61)
at 
org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2140)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2151)
at 
org.apache.hadoop.hive.ql.metadata.Hive.getDelegationToken(Hive.java:2275)
at 
org.apache.hive.service.cli.CLIService.getDelegationTokenFromMetaStore(CLIService.java:358)
at 
org.apache.hive.service.cli.thrift.ThriftCLIService.OpenSession(ThriftCLIService.java:127)
at 
org.apache.hive.service.cli.thrift.TCLIService$Processor$OpenSession.getResult(TCLIService.java:1073)
at 
org.apache.hive.service.cli.thrift.TCLIService$Processor$OpenSession.getResult(TCLIService.java:1058)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at 
org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge20S.java:565)
at 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.GeneratedConstructorAccessor52.newInstance(Unknown 
Source)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at 
org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1082)
... 16 more
Caused by: java.lang.IllegalStateException: This ticket is no longer valid
at 
javax.security.auth.kerberos.KerberosTicket.toString(KerberosTicket.java:601)
at java.lang.String.valueOf(String.java:2826)
at java.lang.StringBuilder.append(StringBuilder.java:115)
at sun.security.jgss.krb5.SubjectComber.findAux(SubjectComber.java:120)
at sun.security.jgss.krb5.SubjectComber.find(SubjectComber.java:41)
at sun.security.jgss.krb5.Krb5Util.getTicket(Krb5Util.java:130)
at 
sun.security.jgss.krb5.Krb5InitCredential$1.run(Krb5InitCredential.java:328)
at java.security.AccessController.doPrivileged(Native Method)
at 
sun.security.jgss.krb5.Krb5InitCredential.getTgt(Krb5InitCredential.java:325)
at 
sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:128)
at 
sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:106)
at 
sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:172)
at 
sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:209)
at 
sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:195)
at 
sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:162)
at 
com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:175)
at 
org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94)
at 
org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
at 
org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
at 
org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
at 
org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
at java.security.AccessController.doPrivileged(Native Method)
at j
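
One way to think about the renewal question raised in this report is a 
periodic check that triggers a re-login well before the ticket expires. The 
sketch below is plain Python with hypothetical names (Ticket, needs_relogin) 
and an assumed 80% threshold; it is not code from HiveServer2. On the Java 
side, Hadoop's UserGroupInformation offers checkTGTAndReloginFromKeytab() for 
keytab-based logins, though whether calling it is the right fix here is left 
open by the report.

```python
from dataclasses import dataclass


@dataclass
class Ticket:
    start: float  # epoch seconds when the ticket was issued
    end: float    # epoch seconds when the ticket expires


def needs_relogin(ticket: Ticket, now: float, threshold: float = 0.8) -> bool:
    """Return True once `threshold` of the ticket lifetime has elapsed.

    Assumption: re-logging in at 80% of the lifetime leaves enough margin;
    the right value depends on the deployment.
    """
    lifetime = ticket.end - ticket.start
    return now >= ticket.start + threshold * lifetime
```

A long-running server would call such a check before opening a new metastore 
connection, instead of relying on the ticket obtained at startup.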

[jira] [Updated] (HIVE-3963) Allow Hive to connect to RDBMS

2013-03-26 Thread Maxime LANCIAUX (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maxime LANCIAUX updated HIVE-3963:
--

Description: 
I am thinking about something like :

SELECT jdbcload('driver','url','user','password','sql') FROM dual; 
(https://issues.apache.org/jira/browse/HIVE-1558)

There is already a JIRA https://issues.apache.org/jira/browse/HIVE-1555 for 
JDBCStorageHandler

  was:
I am thinking about something like :

SELECT jdbcload('driver','url','user','password','sql') FROM dual;

There is already a JIRA https://issues.apache.org/jira/browse/HIVE-1555 for 
JDBCStorageHandler


> Allow Hive to connect to RDBMS
> --
>
> Key: HIVE-3963
> URL: https://issues.apache.org/jira/browse/HIVE-3963
> Project: Hive
>  Issue Type: New Feature
>  Components: Import/Export, JDBC, SQL, StorageHandler
>Affects Versions: 0.10.0, 0.9.1, 0.11.0
>Reporter: Maxime LANCIAUX
> Fix For: 0.10.1
>
> Attachments: patchfile
>
>
> I am thinking about something like :
> SELECT jdbcload('driver','url','user','password','sql') FROM dual; 
> (https://issues.apache.org/jira/browse/HIVE-1558)
> There is already a JIRA https://issues.apache.org/jira/browse/HIVE-1555 for 
> JDBCStorageHandler

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-1558) introducing the "dual" table

2013-03-26 Thread Maxime LANCIAUX (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13614159#comment-13614159
 ] 

Maxime LANCIAUX commented on HIVE-1558:
---

What do you think about modifying Hive.g and adding a token for DUAL?

> introducing the "dual" table
> 
>
> Key: HIVE-1558
> URL: https://issues.apache.org/jira/browse/HIVE-1558
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Ning Zhang
>Assignee: Marcin Kurczych
>
> The "dual" table in MySQL and Oracle is very convenient in testing UDFs or 
> constructing rows without reading any other tables. 
> If dual is the only data source we could leverage the local mode execution. 



Build failed in Jenkins: Hive-0.9.1-SNAPSHOT-h0.21 #332

2013-03-26 Thread Apache Jenkins Server
See 

--
[...truncated 5078 lines...]
A ql/src/gen/thrift/gen-py/queryplan/constants.py
A ql/src/gen/thrift/gen-py/queryplan/__init__.py
A ql/src/gen/thrift/gen-cpp
A ql/src/gen/thrift/gen-cpp/queryplan_constants.h
A ql/src/gen/thrift/gen-cpp/queryplan_types.cpp
A ql/src/gen/thrift/gen-cpp/queryplan_types.h
A ql/src/gen/thrift/gen-cpp/queryplan_constants.cpp
A ql/src/gen/thrift/gen-rb
A ql/src/gen/thrift/gen-rb/queryplan_types.rb
A ql/src/gen/thrift/gen-rb/queryplan_constants.rb
A ql/src/gen/thrift/gen-javabean
A ql/src/gen/thrift/gen-javabean/org
A ql/src/gen/thrift/gen-javabean/org/apache
A ql/src/gen/thrift/gen-javabean/org/apache/hadoop
A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive
A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql
A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan
A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api
A 
ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/QueryPlan.java
A 
ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/Adjacency.java
A 
ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/Graph.java
A 
ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/Task.java
A 
ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/AdjacencyType.java
A 
ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/Stage.java
A 
ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/TaskType.java
A 
ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/Query.java
A 
ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/StageType.java
A 
ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/NodeType.java
A 
ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/Operator.java
A 
ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/OperatorType.java
A ql/src/gen/thrift/gen-php
A ql/src/gen/thrift/gen-php/queryplan
A ql/src/gen/thrift/gen-php/queryplan/queryplan_types.php
A ql/src/gen-javabean
A ql/src/gen-javabean/org
A ql/src/gen-javabean/org/apache
A ql/src/gen-javabean/org/apache/hadoop
A ql/src/gen-javabean/org/apache/hadoop/hive
A ql/src/gen-javabean/org/apache/hadoop/hive/ql
A ql/src/gen-javabean/org/apache/hadoop/hive/ql/plan
A ql/src/gen-javabean/org/apache/hadoop/hive/ql/plan/api
A ql/src/gen-php
A ql/build.xml
A ql/if
A ql/if/queryplan.thrift
A pdk
A pdk/ivy.xml
A pdk/scripts
A pdk/scripts/class-registration.xsl
A pdk/scripts/build-plugin.xml
A pdk/scripts/README
A pdk/src
A pdk/src/java
A pdk/src/java/org
A pdk/src/java/org/apache
A pdk/src/java/org/apache/hive
A pdk/src/java/org/apache/hive/pdk
A pdk/src/java/org/apache/hive/pdk/FunctionExtractor.java
A pdk/src/java/org/apache/hive/pdk/HivePdkUnitTest.java
A pdk/src/java/org/apache/hive/pdk/HivePdkUnitTests.java
A pdk/src/java/org/apache/hive/pdk/PluginTest.java
A pdk/test-plugin
A pdk/test-plugin/test
A pdk/test-plugin/test/cleanup.sql
A pdk/test-plugin/test/onerow.txt
A pdk/test-plugin/test/setup.sql
A pdk/test-plugin/src
A pdk/test-plugin/src/org
A pdk/test-plugin/src/org/apache
A pdk/test-plugin/src/org/apache/hive
A pdk/test-plugin/src/org/apache/hive/pdktest
A pdk/test-plugin/src/org/apache/hive/pdktest/Rot13.java
A pdk/test-plugin/build.xml
A pdk/build.xml
A build-offline.xml
 U.
At revision 1461198
no change for http://svn.apache.org/repos/asf/hive/branches/branch-0.9 since 
the previous build
[hive] $ /home/hudson/tools/ant/apache-ant-1.8.1/bin/ant 
-Dversion=0.9.1-SNAPSHOT very-clean tar binary
Buildfile: 
/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/build.xml

ivy-init-dirs:
 [echo] Project: hive
[mkdir] Created dir: 
/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/build/ivy
[mkdir] Created dir: 
/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/build/ivy/lib
[mkdir] Created dir: 
/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/build/ivy/report
[mkdir] Created dir: 
/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/build/ivy/maven

ivy-download:
 [echo] Project: hive
  [get] Getting: 
http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.1.0/ivy-2.1.0.jar
  [get] To: 
/x1/jenkins/jenkins-slave/workspace/Hive

[jira] [Commented] (HIVE-4179) NonBlockingOpDeDup does not merge SEL operators correctly

2013-03-26 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13614279#comment-13614279
 ] 

Gunther Hagleitner commented on HIVE-4179:
--

[~navis] I think you'd be the best person to take a look. Can you spare a 
moment?

> NonBlockingOpDeDup does not merge SEL operators correctly
> -
>
> Key: HIVE-4179
> URL: https://issues.apache.org/jira/browse/HIVE-4179
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
> Attachments: HIVE-4179.1.patch, HIVE-4179.2.patch
>
>
> The input columns list for SEL operations isn't merged properly in the 
> optimization. The best way to see this is running union_remove_22.q with 
> -Dhadoop.mr.rev=23. The plan shows lost UDFs and a broken lineage for one 
> column.
> Note: union_remove tests do not run on hadoop 1 or 0.20.



[jira] [Updated] (HIVE-3381) Result of outer join is not valid

2013-03-26 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3381:
---

   Resolution: Fixed
Fix Version/s: 0.11.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Navis!

> Result of outer join is not valid
> -
>
> Key: HIVE-3381
> URL: https://issues.apache.org/jira/browse/HIVE-3381
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.10.0
>Reporter: Navis
>Assignee: Navis
>Priority: Critical
> Fix For: 0.11.0
>
> Attachments: HIVE-3381.D5565.3.patch, HIVE-3381.D5565.4.patch, 
> HIVE-3381.D5565.5.patch, HIVE-3381.D5565.6.patch, HIVE-3381.D5565.7.patch, 
> mapjoin_testOuter.q
>
>
> Outer joins, especially full outer joins or outer joins with a filter in the 
> ON clause, do not return correct results. For example, the query in test join_1to1.q
> {code}
> SELECT * FROM join_1to1_1 a full outer join join_1to1_2 b on a.key1 = b.key1 
> and a.value = 66 and b.value = 66 ORDER BY a.key1 ASC, a.key2 ASC, a.value 
> ASC, b.key1 ASC, b.key2 ASC, b.value ASC;
> {code}
> results
> {code}
> NULL  NULL   NULL  NULL  NULL   66
> NULL  NULL   NULL  NULL  10050  66
> NULL  NULL   NULL  10    10010  66
> NULL  NULL   NULL  30    10030  88
> NULL  NULL   NULL  35    10035  88
> NULL  NULL   NULL  40    10040  88
> NULL  NULL   NULL  40    10040  88
> NULL  NULL   NULL  50    10050  88
> NULL  NULL   NULL  50    10050  88
> NULL  NULL   NULL  50    10050  88
> NULL  NULL   NULL  70    10040  88
> NULL  NULL   NULL  70    10040  88
> NULL  NULL   NULL  70    10040  88
> NULL  NULL   NULL  70    10040  88
> NULL  NULL   66    NULL  NULL   NULL
> NULL  10050  66    NULL  NULL   NULL
> 5     10005  66    5     10005  66
> 15    10015  66    NULL  NULL   NULL
> 20    10020  66    20    10020  66
> 25    10025  88    NULL  NULL   NULL
> 30    10030  66    NULL  NULL   NULL
> 35    10035  88    NULL  NULL   NULL
> 40    10040  66    NULL  NULL   NULL
> 40    10040  66    40    10040  66
> 40    10040  88    NULL  NULL   NULL
> 40    10040  88    NULL  NULL   NULL
> 50    10050  66    NULL  NULL   NULL
> 50    10050  66    50    10050  66
> 50    10050  66    50    10050  66
> 50    10050  88    NULL  NULL   NULL
> 50    10050  88    NULL  NULL   NULL
> 50    10050  88    NULL  NULL   NULL
> 50    10050  88    NULL  NULL   NULL
> 50    10050  88    NULL  NULL   NULL
> 50    10050  88    NULL  NULL   NULL
> 60    10040  66    60    10040  66
> 60    10040  66    60    10040  66
> 60    10040  66    60    10040  66
> 60    10040  66    60    10040  66
> 70    10040  66    NULL  NULL   NULL
> 70    10040  66    NULL  NULL   NULL
> 70    10040  66    NULL  NULL   NULL
> 70    10040  66    NULL  NULL   NULL
> 80    10040  88    NULL  NULL   NULL
> 80    10040  88    NULL  NULL   NULL
> 80    10040  88    NULL  NULL   NULL
> 80    10040  88    NULL  NULL   NULL
> {code} 
> but that does not look right. The result should be 
> {code}
> NULL  NULL   NULL  NULL  NULL   66
> NULL  NULL   NULL  NULL  10050  66
> NULL  NULL   NULL  10    10010  66
> NULL  NULL   NULL  25    10025  66
> NULL  NULL   NULL  30    10030  88
> NULL  NULL   NULL  35    10035  88
> NULL  NULL   NULL  40    10040  88
> NULL  NULL   NULL  50    10050  88
> NULL  NULL   NULL  70    10040  88
> NULL  NULL   NULL  70    10040  88
> NULL  NULL   NULL  80    10040  66
> NULL  NULL   NULL  80    10040  66
> NULL  NULL   66    NULL  NULL   NULL
> NULL  10050  66    NULL  NULL   NULL
> 5     10005  66    5     10005  66
> 15    10015  66    NULL  NULL   NULL
> 20    10020  66    20    10020  66
> 25    10025  88    NULL  NULL   NULL
> 30    10030  66    NULL  NULL   NULL
> 35    10035  88    NULL  NULL   NULL
> 40    10040  66    40    10040  66
> 40    10040  88    NULL  NULL   NULL
> 50    10050  66    50    10050  66
> 50    10050  66    50    10050  66
> 50    10050  88    NULL  NULL   NULL
> 50    10050  88    NULL  NULL   NULL
> 60    10040  66    60    10040  66
> 60    10040  66    60    10040  66
> 60    10040  66    60    10040  66
> 60    10040  66    60    10040  66
> 70    10040  66    NULL  NULL   NULL
> 70    10040  66    NULL  NULL   NULL
> 80    10040  88    NULL  NULL   NULL
> 80    10040  88    NULL  NULL   NULL
> {code}
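
For reference, the join semantics at issue can be sketched in a few lines of Python. In a full outer join, predicates in the ON clause only decide which pairs of rows match; every row from either side should still appear, padded with NULL when it has no match. This is a minimal nested-loop sketch, not Hive's implementation, and the sample rows are made up for illustration (they are not the join_1to1 test data).

```python
def full_outer_join(left, right, on):
    """Nested-loop full outer join; `on` decides whether a pair of rows matches."""
    result = []
    matched_right = set()
    for l in left:
        found = False
        for i, r in enumerate(right):
            if on(l, r):
                result.append(l + r)
                matched_right.add(i)
                found = True
        if not found:
            # An unmatched left row is kept, padded with NULLs (None).
            result.append(l + (None, None, None))
    for i, r in enumerate(right):
        if i not in matched_right:
            # An unmatched right row is kept as well.
            result.append((None, None, None) + r)
    return result


# Hypothetical rows of (key1, key2, value); not the real test data.
a = [(5, 10005, 66), (25, 10025, 88), (40, 10040, 66)]
b = [(5, 10005, 66), (25, 10025, 66), (40, 10040, 88)]


def on(l, r):
    # a.key1 = b.key1 and a.value = 66 and b.value = 66
    return l[0] == r[0] and l[2] == 66 and r[2] == 66


rows = full_outer_join(a, b, on)
for row in rows:
    print(row)
```

With these inputs each source row appears exactly once: the value filters in the ON clause stop keys 25 and 40 from pairing up, but they do not drop or duplicate either side's rows.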


[jira] [Updated] (HIVE-4007) Create abstract classes for serializer and deserializer

2013-03-26 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-4007:
---

   Resolution: Fixed
Fix Version/s: 0.11.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Namit!

> Create abstract classes for serializer and deserializer
> ---
>
> Key: HIVE-4007
> URL: https://issues.apache.org/jira/browse/HIVE-4007
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Reporter: Namit Jain
>Assignee: Namit Jain
> Fix For: 0.11.0
>
> Attachments: hive.4007.1.patch, hive.4007.2.patch, hive.4007.3.patch, 
> hive.4007.4.patch
>
>
> Currently, it is very difficult to change the Serializer/Deserializer
> interface, since all the SerDes directly implement the interface.
> Instead, we should have abstract classes for implementing these interfaces.
> In case of an interface change, only the abstract class and the relevant 
> SerDe need to change.
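
The pattern described here can be sketched as follows. The sketch is in Python for brevity, while Hive's SerDes are Java, and every class and method name in it is hypothetical rather than Hive's actual API. Implementations extend an abstract base class instead of implementing the interface directly, so a method added to the interface later only needs a default in the base class.

```python
from abc import ABC, abstractmethod


class Deserializer(ABC):
    """Stand-in for the interface every SerDe must satisfy (hypothetical)."""

    @abstractmethod
    def deserialize(self, blob):
        ...

    @abstractmethod
    def get_stats(self):
        # Imagine this method was added to the interface in a later release.
        ...


class AbstractDeserializer(Deserializer):
    """Base class that absorbs interface changes by supplying defaults."""

    def get_stats(self):
        return {}


class CsvDeserializer(AbstractDeserializer):
    """A concrete SerDe only implements what is specific to its format."""

    def deserialize(self, blob):
        return blob.split(",")


serde = CsvDeserializer()
print(serde.deserialize("a,b,c"))
print(serde.get_stats())
```

Had `CsvDeserializer` implemented `Deserializer` directly, adding `get_stats` to the interface would have broken it, along with every other implementation.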



[jira] [Updated] (HIVE-3958) support partial scan for analyze command - RCFile

2013-03-26 Thread Gang Tim Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gang Tim Liu updated HIVE-3958:
---

Attachment: HIVE-3958.patch.5

> support partial scan for analyze command - RCFile
> -
>
> Key: HIVE-3958
> URL: https://issues.apache.org/jira/browse/HIVE-3958
> Project: Hive
>  Issue Type: Improvement
>Reporter: Gang Tim Liu
>Assignee: Gang Tim Liu
> Attachments: HIVE-3958.patch.1, HIVE-3958.patch.2, HIVE-3958.patch.3, 
> HIVE-3958.patch.4, HIVE-3958.patch.5
>
>
> The analyze command allows us to collect statistics on existing 
> tables/partitions. It works well but can be slow since it scans all files.
> There are 2 ways to speed it up:
> 1. Collect stats without a file scan. This may not collect all stats, but it is 
> good and fast enough for the use case. HIVE-3917 addresses it.
> 2. Collect stats via a partial file scan. This doesn't scan the full content of 
> the files, only the part needed to get file metadata. Some examples are 
> https://cwiki.apache.org/Hive/rcfilecat.html for RCFile, ORC (HIVE-3874), 
> and HFile of HBase.
> This JIRA targets #2, specifically the RCFile format.



[jira] [Work stopped] (HIVE-3958) support partial scan for analyze command - RCFile

2013-03-26 Thread Gang Tim Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-3958 stopped by Gang Tim Liu.

> support partial scan for analyze command - RCFile
> -
>
> Key: HIVE-3958
> URL: https://issues.apache.org/jira/browse/HIVE-3958
> Project: Hive
>  Issue Type: Improvement
>Reporter: Gang Tim Liu
>Assignee: Gang Tim Liu
> Attachments: HIVE-3958.patch.1, HIVE-3958.patch.2, HIVE-3958.patch.3, 
> HIVE-3958.patch.4, HIVE-3958.patch.5
>
>
> The analyze command allows us to collect statistics on existing 
> tables/partitions. It works well but can be slow since it scans all files.
> There are 2 ways to speed it up:
> 1. Collect stats without a file scan. This may not collect all stats, but it is 
> good and fast enough for the use case. HIVE-3917 addresses it.
> 2. Collect stats via a partial file scan. This doesn't scan the full content of 
> the files, only the part needed to get file metadata. Some examples are 
> https://cwiki.apache.org/Hive/rcfilecat.html for RCFile, ORC (HIVE-3874), 
> and HFile of HBase.
> This JIRA targets #2, specifically the RCFile format.



[jira] [Work started] (HIVE-3958) support partial scan for analyze command - RCFile

2013-03-26 Thread Gang Tim Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-3958 started by Gang Tim Liu.

> support partial scan for analyze command - RCFile
> -
>
> Key: HIVE-3958
> URL: https://issues.apache.org/jira/browse/HIVE-3958
> Project: Hive
>  Issue Type: Improvement
>Reporter: Gang Tim Liu
>Assignee: Gang Tim Liu
> Attachments: HIVE-3958.patch.1, HIVE-3958.patch.2, HIVE-3958.patch.3, 
> HIVE-3958.patch.4, HIVE-3958.patch.5
>
>
> The analyze command allows us to collect statistics on existing 
> tables/partitions. It works well but can be slow since it scans all files.
> There are 2 ways to speed it up:
> 1. Collect stats without a file scan. This may not collect all stats, but it is 
> good and fast enough for the use case. HIVE-3917 addresses it.
> 2. Collect stats via a partial file scan. This doesn't scan the full content of 
> the files, only the part needed to get file metadata. Some examples are 
> https://cwiki.apache.org/Hive/rcfilecat.html for RCFile, ORC (HIVE-3874), 
> and HFile of HBase.
> This JIRA targets #2, specifically the RCFile format.



[jira] [Updated] (HIVE-3958) support partial scan for analyze command - RCFile

2013-03-26 Thread Gang Tim Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gang Tim Liu updated HIVE-3958:
---

Status: Patch Available  (was: In Progress)

Another diff is ready. thanks

> support partial scan for analyze command - RCFile
> -
>
> Key: HIVE-3958
> URL: https://issues.apache.org/jira/browse/HIVE-3958
> Project: Hive
>  Issue Type: Improvement
>Reporter: Gang Tim Liu
>Assignee: Gang Tim Liu
> Attachments: HIVE-3958.patch.1, HIVE-3958.patch.2, HIVE-3958.patch.3, 
> HIVE-3958.patch.4, HIVE-3958.patch.5
>
>
> The analyze command allows us to collect statistics on existing 
> tables/partitions. It works well but can be slow since it scans all files.
> There are 2 ways to speed it up:
> 1. Collect stats without a file scan. This may not collect all stats, but it is 
> good and fast enough for the use case. HIVE-3917 addresses it.
> 2. Collect stats via a partial file scan. This doesn't scan the full content of 
> the files, only the part needed to get file metadata. Some examples are 
> https://cwiki.apache.org/Hive/rcfilecat.html for RCFile, ORC (HIVE-3874), 
> and HFile of HBase.
> This JIRA targets #2, specifically the RCFile format.



[jira] [Work started] (HIVE-3958) support partial scan for analyze command - RCFile

2013-03-26 Thread Gang Tim Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-3958 started by Gang Tim Liu.

> support partial scan for analyze command - RCFile
> -
>
> Key: HIVE-3958
> URL: https://issues.apache.org/jira/browse/HIVE-3958
> Project: Hive
>  Issue Type: Improvement
>Reporter: Gang Tim Liu
>Assignee: Gang Tim Liu
> Attachments: HIVE-3958.patch.1, HIVE-3958.patch.2, HIVE-3958.patch.3, 
> HIVE-3958.patch.4, HIVE-3958.patch.5
>
>
> The analyze command allows us to collect statistics on existing 
> tables/partitions. It works well but can be slow since it scans all files.
> There are 2 ways to speed it up:
> 1. Collect stats without a file scan. This may not collect all stats, but it is 
> good and fast enough for the use case. HIVE-3917 addresses it.
> 2. Collect stats via a partial file scan. This doesn't scan the full content of 
> the files, only the part needed to get file metadata. Some examples are 
> https://cwiki.apache.org/Hive/rcfilecat.html for RCFile, ORC (HIVE-3874), 
> and HFile of HBase.
> This JIRA targets #2, specifically the RCFile format.



[jira] [Created] (HIVE-4234) Add the partitioned column information to the TableScanDesc.

2013-03-26 Thread sachin (JIRA)
sachin created HIVE-4234:


 Summary: Add the partitioned column information to the 
TableScanDesc.
 Key: HIVE-4234
 URL: https://issues.apache.org/jira/browse/HIVE-4234
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: sachin
Assignee: sachin
Priority: Minor
 Fix For: 0.10.1


This information will be useful for row processing by various operator hooks.



[jira] [Updated] (HIVE-4234) Add the partitioned column information to the TableScanDesc.

2013-03-26 Thread sachin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sachin updated HIVE-4234:
-

Fix Version/s: (was: 0.10.1)

> Add the partitioned column information to the TableScanDesc.
> 
>
> Key: HIVE-4234
> URL: https://issues.apache.org/jira/browse/HIVE-4234
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: sachin
>Assignee: sachin
>Priority: Minor
>
> This information will be useful for row processing by various operator hooks.



[jira] [Updated] (HIVE-4095) Add exchange partition in Hive

2013-03-26 Thread Dheeraj Kumar Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dheeraj Kumar Singh updated HIVE-4095:
--

Assignee: Dheeraj Kumar Singh  (was: Rui Jian)

> Add exchange partition in Hive
> --
>
> Key: HIVE-4095
> URL: https://issues.apache.org/jira/browse/HIVE-4095
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Dheeraj Kumar Singh
>
> It would be very useful to support exchange partition in Hive, something similar
> to http://www.orafaq.com/node/2570 in Oracle.



[jira] [Commented] (HIVE-4228) Bump up hadoop2 version in trunk

2013-03-26 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13614548#comment-13614548
 ] 

Thiruvel Thirumoolan commented on HIVE-4228:


Patch on Phabricator - https://reviews.facebook.net/D9723

> Bump up hadoop2 version in trunk
> 
>
> Key: HIVE-4228
> URL: https://issues.apache.org/jira/browse/HIVE-4228
> Project: Hive
>  Issue Type: Improvement
>  Components: Build Infrastructure
>Affects Versions: 0.11.0
>Reporter: Thiruvel Thirumoolan
>Assignee: Thiruvel Thirumoolan
> Fix For: 0.11.0
>
> Attachments: HIVE-4228.patch
>
>
> Hive builds with hadoop 2.0.0-alpha now. Bumping up to hadoop-2.0.3-alpha. 
> Have raised JIRAs with hive10-hadoop23.6 unit tests. Most of them should fix 
> any new failures due to this bump. [I am guessing this should also help 
> HCatalog].



[jira] [Updated] (HIVE-4228) Bump up hadoop2 version in trunk

2013-03-26 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HIVE-4228:
---

Status: Patch Available  (was: Open)

> Bump up hadoop2 version in trunk
> 
>
> Key: HIVE-4228
> URL: https://issues.apache.org/jira/browse/HIVE-4228
> Project: Hive
>  Issue Type: Improvement
>  Components: Build Infrastructure
>Affects Versions: 0.11.0
>Reporter: Thiruvel Thirumoolan
>Assignee: Thiruvel Thirumoolan
> Fix For: 0.11.0
>
> Attachments: HIVE-4228.patch
>
>
> Hive builds with hadoop 2.0.0-alpha now. Bumping up to hadoop-2.0.3-alpha. 
> Have raised JIRAs with hive10-hadoop23.6 unit tests. Most of them should fix 
> any new failures due to this bump. [I am guessing this should also help 
> HCatalog].



[jira] [Updated] (HIVE-3958) support partial scan for analyze command - RCFile

2013-03-26 Thread Gang Tim Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gang Tim Liu updated HIVE-3958:
---

Attachment: HIVE-3958.patch.6

> support partial scan for analyze command - RCFile
> -
>
> Key: HIVE-3958
> URL: https://issues.apache.org/jira/browse/HIVE-3958
> Project: Hive
>  Issue Type: Improvement
>Reporter: Gang Tim Liu
>Assignee: Gang Tim Liu
> Attachments: HIVE-3958.patch.1, HIVE-3958.patch.2, HIVE-3958.patch.3, 
> HIVE-3958.patch.4, HIVE-3958.patch.5, HIVE-3958.patch.6
>
>
> The analyze command allows us to collect statistics on existing 
> tables/partitions. It works well but can be slow since it scans all files.
> There are 2 ways to speed it up:
> 1. Collect stats without a file scan. This may not collect all stats, but it is 
> good and fast enough for the use case. HIVE-3917 addresses it.
> 2. Collect stats via a partial file scan. This doesn't scan the full content of 
> the files, only the part needed to get file metadata. Some examples are 
> https://cwiki.apache.org/Hive/rcfilecat.html for RCFile, ORC (HIVE-3874), 
> and HFile of HBase.
> This JIRA targets #2, specifically the RCFile format.



Re: Merging HCatalog into Hive

2013-03-26 Thread Alan Gates
There's an issue with the permissions here.  In the authorization file you 
granted permission to hcatalog committers on a directory /hive/hcatalog.  But 
in Hive you created /hive/trunk/hcatalog, which none of the hcatalog committers 
can access.  In the authorization file you'll need to change hive-hcatalog to 
have authorization /hive/trunk/hcatalog.  

There is also a scalability issue.  Every time Hive branches you'll have to add 
a line for that branch as well.  Also, this will prohibit any dev branches for 
hcatalog users, or access to any dev branches done in Hive.  I suspect you'll 
find it much easier to give the hive-hcatalog group access to /hive and then 
use community mores to enforce that no hcat committers commit outside the hcat 
directory.
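
The mismatch can be illustrated with a small model of prefix-based path authorization. This is only a sketch of the behavior described above, not the actual ASF authorization file format; the rule tables, the `can_write` helper, and the branch name `branch-0.11` are hypothetical.

```python
def can_write(rules, group, path):
    """True if `group` holds a rule on `path` or on one of its parent directories."""
    return any(
        path == prefix or path.startswith(prefix.rstrip("/") + "/")
        for prefix in rules.get(group, [])
    )


# Three hypothetical rule tables for the hive-hcatalog group.
as_granted = {"hive-hcatalog": ["/hive/hcatalog"]}        # what was configured
per_branch = {"hive-hcatalog": ["/hive/trunk/hcatalog"]}  # fixed for trunk only
whole_tree = {"hive-hcatalog": ["/hive"]}                 # the suggested alternative

trunk_file = "/hive/trunk/hcatalog/build.xml"
branch_file = "/hive/branches/branch-0.11/hcatalog/build.xml"

print(can_write(as_granted, "hive-hcatalog", trunk_file))   # False: wrong directory
print(can_write(per_branch, "hive-hcatalog", trunk_file))   # True
print(can_write(per_branch, "hive-hcatalog", branch_file))  # False: needs a rule per branch
print(can_write(whole_tree, "hive-hcatalog", branch_file))  # True
```

A per-directory rule has to be repeated for every branch, whereas a single rule on /hive covers all of them and leaves enforcement to convention.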

Alan.

On Mar 15, 2013, at 5:26 PM, Carl Steinbach wrote:

> Hi Alan,
> 
> I committed HIVE-4145, created an HCatalog component on JIRA, and
> updated the asf-authorization-template to give the HCatalog committers
> karma on the hcatalog subdirectory. At this point I think everything should
> be ready to go. Let me know if you run into any problems.
> 
> Thanks.
> 
> Carl
> 
> On Wed, Mar 13, 2013 at 11:56 AM, Alan Gates  wrote:
> Proposed changes look good to me.  And you don't need an infra ticket to 
> grant karma.  Since you're Hive VP you can do it.  See 
> http://www.apache.org/dev/pmc.html#SVNaccess
> 
> Alan.
> 
> On Mar 10, 2013, at 9:29 PM, Carl Steinbach wrote:
> 
> > Hi Alan,
> >
> > I submitted a patch that creates the hcatalog directory and makes some 
> > other necessary
> > changes here:
> >
> > https://issues.apache.org/jira/browse/HIVE-4145
> >
> > Once this is committed I will contact ASFINFRA and ask them to grant the 
> > HCatalog
> > committers karma.
> >
> > Thanks.
> >
> > Carl
> >
> > On Sat, Mar 9, 2013 at 12:54 PM, Alan Gates  wrote:
> > Alright, I've gotten some feedback from Brock around the JIRA stuff and 
> > Carl in a live conversation expressed his desire to move hcat into the Hive 
> > namespace sooner rather than later.  So the proposal is that we'd move the 
> > code to org.apache.hive.hcatalog, though we would create shell classes and 
> > interfaces in org.apache.hcatalog for all public classes and interfaces so 
> > that it will be backward compatible.  I'm fine with doing this now.
> >
> > So, let's get started.  Carl, could you create an hcatalog directory under 
> > trunk/hive and grant the listed hcat committers karma on it?  Then I'll get 
> > started on moving the actual code.
> >
> > Alan.
> >
> > On Feb 24, 2013, at 12:22 PM, Brock Noland wrote:
> >
> > > > Looks good from my perspective and I am glad to see this moving forward.
> > >
> > > Regarding #4 (JIRA)
> > >
> > > "I don't know if there's a way to upload existing JIRAs into Hive's JIRA,
> > > but I think it would be better to leave them where they are."
> > >
> > > > JIRA has a bulk move feature, but I am curious as to why we would leave them
> > > under the old project? There might be good reason to orphan them, but my
> > > first thought is that it would be nice to have them under the HIVE project
> > > simply for search purposes.
> > >
> > > Brock
> > >
> > >
> > >
> > >
> > > On Fri, Feb 22, 2013 at 7:12 PM, Alan Gates  wrote:
> > >
> > >> Alright, our vote has passed, it's time to get on with merging HCatalog
> > >> into Hive.  Here's the things I can think of we need to deal with.  
> > >> Please
> > >> add additional issues I've missed:
> > >>
> > >> 1) Moving the code
> > >> 2) Dealing with domain names in the code
> > >> 3) The mailing lists
> > >> 4) The JIRA
> > >> 5) The website
> > >> 6) Committer rights
> > >> 7) Make a proposal for how HCat is released going forward
> > >> 8) Publish an FAQ
> > >>
> > >> Proposals for how we handle these:
> > >> Below I propose an approach for how to handle each of these.  Feedback
> > >> welcome.
> > >>
> > >> 1) Moving the code
> > >> I propose that HCat move into a subdirectory of Hive.  This fits nicely
> > >> into Hive's structure since it already has metastore, ql, etc.  We'd just
> > >> add 'hcatalog' as a new directory.  This directory would contain hcatalog
> > >> as it is today.  It does not follow Hive's standard build model so we'd
> > >> need to do some work to make it so that building Hive also builds HCat, 
> > >> but
> > >> this should be minimal.
> > >>
> > >> 2) Dealing with domain names
> > >> HCat code currently is under org.apache.hcatalog.  Do we want to change
> > >> it?  In time we probably should change it to match the rest of Hive
> > >> (org.apache.hadoop.hive.hcatalog).  We need to do this in a backward
> > >> compatible way.  I propose we leave it as is for now and if we decide to 
> > >> in
> > >> the future we can move the actual code to org.apache.hadoop.hive.hcatalog
> > >> and create shell classes under org.apache.hcatalog.
> > >>
> > >> 3) The mailing lists
> > >> Given that our goal is to merge the projects and not create a subproject
> > >> we should merge the mailing lists

Re: Merging HCatalog into Hive

2013-03-26 Thread Carl Steinbach
Hi Alan,

I agree that it will probably be too painful to enforce the rules with SVN,
so I went ahead and gave all of the HCatalog committers RW access to /hive.
Please follow the rules. If I receive any complaints about this I'll revert
back to the old scheme.

Thanks.

Carl

On Tue, Mar 26, 2013 at 2:34 PM, Alan Gates  wrote:

> There's an issue with the permissions here.  In the authorization file you
> granted permission to hcatalog committers on a directory /hive/hcatalog.
>  But in Hive you created /hive/trunk/hcatalog, which none of the hcatalog
> committers can access.  In the authorization file you'll need to change
> hive-hcatalog to have authorization /hive/trunk/hcatalog.
>
> There is also a scalability issue.  Every time Hive branches you'll have
> to add a line for that branch as well.  Also, this will prohibit any dev
> branches for hcatalog users, or access to any dev branches done in Hive.  I
> suspect you'll find it much easier to give the hive-hcatalog group access
> to /hive and then use community mores to enforce that no hcat committers
> commit outside the hcat directory.
>
> Alan.
>
> On Mar 15, 2013, at 5:26 PM, Carl Steinbach wrote:
>
> > Hi Alan,
> >
> > I committed HIVE-4145, created an HCatalog component on JIRA, and
> > updated the asf-authorization-template to give the HCatalog committers
> > karma on the hcatalog subdirectory. At this point I think everything
> should
> > be ready to go. Let me know if you run into any problems.
> >
> > Thanks.
> >
> > Carl
> >
> > On Wed, Mar 13, 2013 at 11:56 AM, Alan Gates 
> wrote:
> > Proposed changes look good to me.  And you don't need an infra ticket to
> grant karma.  Since you're Hive VP you can do it.  See
> http://www.apache.org/dev/pmc.html#SVNaccess
> >
> > Alan.
> >
> > On Mar 10, 2013, at 9:29 PM, Carl Steinbach wrote:
> >
> > > Hi Alan,
> > >
> > > I submitted a patch that creates the hcatalog directory and makes some
> other necessary
> > > changes here:
> > >
> > > https://issues.apache.org/jira/browse/HIVE-4145
> > >
> > > Once this is committed I will contact ASFINFRA and ask them to grant
> the HCatalog
> > > committers karma.
> > >
> > > Thanks.
> > >
> > > Carl
> > >
> > > On Sat, Mar 9, 2013 at 12:54 PM, Alan Gates 
> wrote:
> > > Alright, I've gotten some feedback from Brock around the JIRA stuff
> and Carl in a live conversation expressed his desire to move hcat into the
> Hive namespace sooner rather than later.  So the proposal is that we'd move
> the code to org.apache.hive.hcatalog, though we would create shell classes
> and interfaces in org.apache.hcatalog for all public classes and interfaces
> so that it will be backward compatible.  I'm fine with doing this now.
> > >
> > > So, let's get started.  Carl, could you create an hcatalog directory
> under trunk/hive and grant the listed hcat committers karma on it?  Then
> I'll get started on moving the actual code.
> > >
> > > Alan.
> > >
> > > On Feb 24, 2013, at 12:22 PM, Brock Noland wrote:
> > >
> > > > > Looks good from my perspective and I am glad to see this moving forward.
> > > >
> > > > Regarding #4 (JIRA)
> > > >
> > > > "I don't know if there's a way to upload existing JIRAs into Hive's
> JIRA,
> > > > but I think it would be better to leave them where they are."
> > > >
> > > > JIRA has a bulk move feature, but I am curious as why we would leave
> them
> > > > under the old project? There might be good reason to orphan them,
> but my
> > > > first thought is that it would be nice to have them under the HIVE
> project
> > > > simply for search purposes.
> > > >
> > > > Brock
> > > >
> > > >
> > > >
> > > >
> > > > On Fri, Feb 22, 2013 at 7:12 PM, Alan Gates 
> wrote:
> > > >
> > > >> Alright, our vote has passed, it's time to get on with merging
> HCatalog
> > > >> into Hive.  Here's the things I can think of we need to deal with.
>  Please
> > > >> add additional issues I've missed:
> > > >>
> > > >> 1) Moving the code
> > > >> 2) Dealing with domain names in the code
> > > >> 3) The mailing lists
> > > >> 4) The JIRA
> > > >> 5) The website
> > > >> 6) Committer rights
> > > >> 7) Make a proposal for how HCat is released going forward
> > > >> 8) Publish an FAQ
> > > >>
> > > >> Proposals for how we handle these:
> > > >> Below I propose an approach for how to handle each of these.
>  Feedback
> > > >> welcome.
> > > >>
> > > >> 1) Moving the code
> > > >> I propose that HCat move into a subdirectory of Hive.  This fits
> nicely
> > > >> into Hive's structure since it already has metastore, ql, etc.
>  We'd just
> > > >> add 'hcatalog' as a new directory.  This directory would contain
> hcatalog
> > > >> as it is today.  It does not follow Hive's standard build model so
> we'd
> > > >> need to do some work to make it so that building Hive also builds
> HCat, but
> > > >> this should be minimal.
> > > >>
> > > >> 2) Dealing with domain names
> > > >> HCat code currently is under org.apache.hcatalog.  Do we want to
> change
> > > >> it?  In time we probab

Re: Merging HCatalog into Hive

2013-03-26 Thread Alan Gates
Cool, it works now.  Thanks for the fast response.

Alan.

On Mar 26, 2013, at 2:58 PM, Carl Steinbach wrote:

> Hi Alan,
> 
> I agree that it will probably be too painful to enforce the rules with SVN, 
> so I went ahead and gave all of the HCatalog committers RW access to /hive. 
> Please follow the rules. If I receive any complaints about this I'll revert 
> back to the old scheme.
> 
> Thanks.
> 
> Carl
> 
> On Tue, Mar 26, 2013 at 2:34 PM, Alan Gates wrote:
> There's an issue with the permissions here.  In the authorization file you 
> granted permission to hcatalog committers on a directory /hive/hcatalog.  But 
> in Hive you created /hive/trunk/hcatalog, which none of the hcatalog 
> committers can access.  In the authorization file you'll need to change 
> hive-hcatalog to have authorization /hive/trunk/hcatalog.
> 
> There is also a scalability issue.  Every time Hive branches you'll have to 
> add a line for that branch as well.  Also, this will prohibit any dev 
> branches for hcatalog users, or access to any dev branches done in Hive.  I 
> suspect you'll find it much easier to give the hive-hcatalog group access to 
> /hive and then use community mores to enforce that no hcat committers commit 
> outside the hcat directory.
> 
> Alan.
> 
> On Mar 15, 2013, at 5:26 PM, Carl Steinbach wrote:
> 
> > Hi Alan,
> >
> > I committed HIVE-4145, created an HCatalog component on JIRA, and
> > updated the asf-authorization-template to give the HCatalog committers
> > karma on the hcatalog subdirectory. At this point I think everything should
> > be ready to go. Let me know if you run into any problems.
> >
> > Thanks.
> >
> > Carl
> >
> > On Wed, Mar 13, 2013 at 11:56 AM, Alan Gates wrote:
> > Proposed changes look good to me.  And you don't need an infra ticket to 
> > grant karma.  Since you're Hive VP you can do it.  See 
> > http://www.apache.org/dev/pmc.html#SVNaccess
> >
> > Alan.
> >
> > On Mar 10, 2013, at 9:29 PM, Carl Steinbach wrote:
> >
> > > Hi Alan,
> > >
> > > I submitted a patch that creates the hcatalog directory and makes some 
> > > other necessary
> > > changes here:
> > >
> > > https://issues.apache.org/jira/browse/HIVE-4145
> > >
> > > Once this is committed I will contact ASFINFRA and ask them to grant the 
> > > HCatalog
> > > committers karma.
> > >
> > > Thanks.
> > >
> > > Carl
> > >
> > > On Sat, Mar 9, 2013 at 12:54 PM, Alan Gates wrote:
> > > Alright, I've gotten some feedback from Brock around the JIRA stuff and 
> > > Carl in a live conversation expressed his desire to move hcat into the 
> > > Hive namespace sooner rather than later.  So the proposal is that we'd 
> > > move the code to org.apache.hive.hcatalog, though we would create shell 
> > > classes and interfaces in org.apache.hcatalog for all public classes and 
> > > interfaces so that it will be backward compatible.  I'm fine with doing 
> > > this now.
> > >
> > > So, let's get started.  Carl, could you create an hcatalog directory 
> > > under trunk/hive and grant the listed hcat committers karma on it?  Then 
> > > I'll get started on moving the actual code.
> > >
> > > Alan.
> > >
> > > On Feb 24, 2013, at 12:22 PM, Brock Noland wrote:
> > >
> > > > > Looks good from my perspective and I am glad to see this moving forward.
> > > >
> > > > Regarding #4 (JIRA)
> > > >
> > > > "I don't know if there's a way to upload existing JIRAs into Hive's 
> > > > JIRA,
> > > > but I think it would be better to leave them where they are."
> > > >
> > > > JIRA has a bulk move feature, but I am curious as why we would leave 
> > > > them
> > > > under the old project? There might be good reason to orphan them, but my
> > > > first thought is that it would be nice to have them under the HIVE 
> > > > project
> > > > simply for search purposes.
> > > >
> > > > Brock
> > > >
> > > >
> > > >
> > > >
> > > > On Fri, Feb 22, 2013 at 7:12 PM, Alan Gates  
> > > > wrote:
> > > >
> > > >> Alright, our vote has passed, it's time to get on with merging HCatalog
> > > >> into Hive.  Here's the things I can think of we need to deal with.  
> > > >> Please
> > > >> add additional issues I've missed:
> > > >>
> > > >> 1) Moving the code
> > > >> 2) Dealing with domain names in the code
> > > >> 3) The mailing lists
> > > >> 4) The JIRA
> > > >> 5) The website
> > > >> 6) Committer rights
> > > >> 7) Make a proposal for how HCat is released going forward
> > > >> 8) Publish an FAQ
> > > >>
> > > >> Proposals for how we handle these:
> > > >> Below I propose an approach for how to handle each of these.  Feedback
> > > >> welcome.
> > > >>
> > > >> 1) Moving the code
> > > >> I propose that HCat move into a subdirectory of Hive.  This fits nicely
> > > >> into Hive's structure since it already has metastore, ql, etc.  We'd 
> > > >> just
> > > >> add 'hcatalog' as a new directory.  This directory would contain 
> > > >> hcatalog
> > > >> as it is today.  It does not follow Hive's standard build model so we'd
> > > >> need to do some work to make

[jira] [Created] (HIVE-4235) CREATE TABLE IF NOT EXISTS uses inefficient way to check if table exists

2013-03-26 Thread Gang Tim Liu (JIRA)
Gang Tim Liu created HIVE-4235:
--

 Summary: CREATE TABLE IF NOT EXISTS uses inefficient way to check 
if table exists
 Key: HIVE-4235
 URL: https://issues.apache.org/jira/browse/HIVE-4235
 Project: Hive
  Issue Type: Bug
  Components: JDBC, Query Processor, SQL
Reporter: Gang Tim Liu
Assignee: Gang Tim Liu


CREATE TABLE IF NOT EXISTS uses an inefficient way to check whether the table exists.

It uses Hive.java's getTablesByPattern(...) to check whether the table exists. This 
involves a regular expression and eventually a database join, which is very inefficient. It 
may increase database lock time and hurt db performance if many such 
commands hit the database.

The suggested approach is to use getTable(...), since we already know the table name.
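
To illustrate the difference between the two kinds of check, here is a minimal Java sketch. It is not Hive's code: TableLookupSketch, addTable, existsByPattern and existsByName are hypothetical names, and a HashMap stands in for the metastore. The pattern-style check has to test every table name against a regular expression, while the direct check is a single keyed lookup.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical in-memory stand-in for a metastore; this is NOT Hive's API.
class TableLookupSketch {
    // Table name -> storage location (illustrative data only).
    private final Map<String, String> tables = new HashMap<>();

    void addTable(String name, String location) {
        tables.put(name, location);
    }

    // Pattern-style existence check: compiles a regular expression and
    // tests it against every known table name.
    boolean existsByPattern(String tableName) {
        Pattern pattern = Pattern.compile(Pattern.quote(tableName));
        for (String candidate : tables.keySet()) {
            if (pattern.matcher(candidate).matches()) {
                return true;
            }
        }
        return false;
    }

    // Direct existence check: one keyed lookup, no scan and no regex.
    boolean existsByName(String tableName) {
        return tables.containsKey(tableName);
    }

    public static void main(String[] args) {
        TableLookupSketch catalog = new TableLookupSketch();
        catalog.addTable("src", "/warehouse/src");
        System.out.println(catalog.existsByPattern("src"));
        System.out.println(catalog.existsByName("src"));
        System.out.println(catalog.existsByName("missing"));
    }
}
```

Both checks give the same answer for an exact table name; only the amount of work differs, which is the point of the suggestion above.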



Re: Moving code to Hive NOW

2013-03-26 Thread Alan Gates
I've moved the code.  I'll be moving a lot of other code around over the next 
few days as I do what we discussed in 
https://issues.apache.org/jira/browse/HIVE-4198 so don't rebase your patches 
just yet.

Alan.

On Mar 26, 2013, at 3:14 PM, Alan Gates wrote:

> I am going to move the HCatalog code to Hive in the next few minutes.  Please 
> don't check anything into HCatalog until this is done.  All patches will be 
> invalidated by this move.  I'll send an all clear when this is done.
> 
> Alan.



[jira] [Updated] (HIVE-4235) CREATE TABLE IF NOT EXISTS uses inefficient way to check if table exists

2013-03-26 Thread Gang Tim Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gang Tim Liu updated HIVE-4235:
---

Description: 
CREATE TABLE IF NOT EXISTS uses an inefficient way to check whether the table exists.

It uses Hive.java's getTablesByPattern(...) to check whether the table exists. This 
involves a regular expression and eventually a database join, which is very inefficient. It 
can increase database lock time and hurt db performance if many such 
commands hit the database.

The suggested approach is to use getTable(...), since we already know the table name.

  was:
CREATE TABLE IF NOT EXISTS uses inefficient way to check if table exists.

It uses Hive.java's getTablesByPattern(...) to check if table exists. It 
involves regular expression and eventually database join. Very inefficient. May 
cause database lock time increases and hurt db performance if a lot of such 
commands hit database.

The suggested approach is to use getTable(...) since we know tablename already


> CREATE TABLE IF NOT EXISTS uses inefficient way to check if table exists
> 
>
> Key: HIVE-4235
> URL: https://issues.apache.org/jira/browse/HIVE-4235
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC, Query Processor, SQL
>Reporter: Gang Tim Liu
>Assignee: Gang Tim Liu
>
> CREATE TABLE IF NOT EXISTS uses an inefficient way to check whether the table exists.
> It uses Hive.java's getTablesByPattern(...) to check whether the table exists. This 
> involves a regular expression and eventually a database join, which is very inefficient. It 
> can increase database lock time and hurt db performance if many such 
> commands hit the database.
> The suggested approach is to use getTable(...), since we already know the table name.



[jira] [Work started] (HIVE-4235) CREATE TABLE IF NOT EXISTS uses inefficient way to check if table exists

2013-03-26 Thread Gang Tim Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-4235 started by Gang Tim Liu.

> CREATE TABLE IF NOT EXISTS uses inefficient way to check if table exists
> 
>
> Key: HIVE-4235
> URL: https://issues.apache.org/jira/browse/HIVE-4235
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC, Query Processor, SQL
>Reporter: Gang Tim Liu
>Assignee: Gang Tim Liu
>
> CREATE TABLE IF NOT EXISTS uses an inefficient way to check whether the table exists.
> It uses Hive.java's getTablesByPattern(...) to check whether the table exists. This 
> involves a regular expression and eventually a database join, which is very inefficient. It 
> can increase database lock time and hurt db performance if many such 
> commands hit the database.
> The suggested approach is to use getTable(...), since we already know the table name.



[jira] [Updated] (HIVE-4179) NonBlockingOpDeDup does not merge SEL operators correctly

2013-03-26 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-4179:
-

Fix Version/s: 0.11.0

> NonBlockingOpDeDup does not merge SEL operators correctly
> -
>
> Key: HIVE-4179
> URL: https://issues.apache.org/jira/browse/HIVE-4179
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
> Fix For: 0.11.0
>
> Attachments: HIVE-4179.1.patch, HIVE-4179.2.patch
>
>
> The input columns list for SEL operations isn't merged properly in the 
> optimization. The best way to see this is running union_remove_22.q with 
> -Dhadoop.mr.rev=23. The plan shows lost UDFs and a broken lineage for one 
> column.
> Note: union_remove tests do not run on hadoop 1 or 0.20.



[jira] [Updated] (HIVE-4179) NonBlockingOpDeDup does not merge SEL operators correctly

2013-03-26 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-4179:
-

Priority: Critical  (was: Major)

> NonBlockingOpDeDup does not merge SEL operators correctly
> -
>
> Key: HIVE-4179
> URL: https://issues.apache.org/jira/browse/HIVE-4179
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
>Priority: Critical
> Fix For: 0.11.0
>
> Attachments: HIVE-4179.1.patch, HIVE-4179.2.patch
>
>
> The input columns list for SEL operations isn't merged properly in the 
> optimization. The best way to see this is running union_remove_22.q with 
> -Dhadoop.mr.rev=23. The plan shows lost UDFs and a broken lineage for one 
> column.
> Note: union_remove tests do not run on hadoop 1 or 0.20.



[jira] [Commented] (HIVE-4235) CREATE TABLE IF NOT EXISTS uses inefficient way to check if table exists

2013-03-26 Thread Gang Tim Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13614649#comment-13614649
 ] 

Gang Tim Liu commented on HIVE-4235:


https://reviews.facebook.net/D9729

> CREATE TABLE IF NOT EXISTS uses inefficient way to check if table exists
> 
>
> Key: HIVE-4235
> URL: https://issues.apache.org/jira/browse/HIVE-4235
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC, Query Processor, SQL
>Reporter: Gang Tim Liu
>Assignee: Gang Tim Liu
> Attachments: HIVE-4235.patch.1
>
>
> CREATE TABLE IF NOT EXISTS uses an inefficient way to check whether the table exists.
> It uses Hive.java's getTablesByPattern(...) to check whether the table exists. This 
> involves a regular expression and eventually a database join, which is very inefficient. It 
> can increase database lock time and hurt db performance if many such 
> commands hit the database.
> The suggested approach is to use getTable(...), since we already know the table name.



[jira] [Updated] (HIVE-4235) CREATE TABLE IF NOT EXISTS uses inefficient way to check if table exists

2013-03-26 Thread Gang Tim Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gang Tim Liu updated HIVE-4235:
---

Attachment: HIVE-4235.patch.1

> CREATE TABLE IF NOT EXISTS uses inefficient way to check if table exists
> 
>
> Key: HIVE-4235
> URL: https://issues.apache.org/jira/browse/HIVE-4235
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC, Query Processor, SQL
>Reporter: Gang Tim Liu
>Assignee: Gang Tim Liu
> Attachments: HIVE-4235.patch.1
>
>
> CREATE TABLE IF NOT EXISTS uses an inefficient way to check whether the table exists.
> It uses Hive.java's getTablesByPattern(...) to check whether the table exists. This 
> involves a regular expression and eventually a database join, which is very inefficient. It 
> can increase database lock time and hurt db performance if many such 
> commands hit the database.
> The suggested approach is to use getTable(...), since we already know the table name.



[jira] [Updated] (HIVE-4235) CREATE TABLE IF NOT EXISTS uses inefficient way to check if table exists

2013-03-26 Thread Gang Tim Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gang Tim Liu updated HIVE-4235:
---

Status: Patch Available  (was: In Progress)

diff ready

> CREATE TABLE IF NOT EXISTS uses inefficient way to check if table exists
> 
>
> Key: HIVE-4235
> URL: https://issues.apache.org/jira/browse/HIVE-4235
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC, Query Processor, SQL
>Reporter: Gang Tim Liu
>Assignee: Gang Tim Liu
> Attachments: HIVE-4235.patch.1
>
>
> CREATE TABLE IF NOT EXISTS uses an inefficient way to check whether the table exists.
> It uses Hive.java's getTablesByPattern(...) to check whether the table exists. This 
> involves a regular expression and eventually a database join, which is very inefficient. It 
> can increase database lock time and hurt db performance if many such 
> commands hit the database.
> The suggested approach is to use getTable(...), since we already know the table name.



[jira] [Commented] (HIVE-4114) hive-metastore.jar depends on jdo2-api:jar:2.3-ec, which is missing in maven central

2013-03-26 Thread Konstantin Boudnik (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13614667#comment-13614667
 ] 

Konstantin Boudnik commented on HIVE-4114:
--

Can we wrap it into a pom file and deploy to, perhaps, apache maven? 

> hive-metastore.jar depends on jdo2-api:jar:2.3-ec, which is missing in maven 
> central
> 
>
> Key: HIVE-4114
> URL: https://issues.apache.org/jira/browse/HIVE-4114
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Reporter: Gopal V
>Priority: Trivial
>
> Adding hive-exec-0.10.0 to an independent pom.xml results in the following 
> error
> {code}
> Failed to retrieve javax.jdo:jdo2-api-2.3-ec
> Caused by: Could not find artifact javax.jdo:jdo2-api:jar:2.3-ec in central 
> (http://repo1.maven.org/maven2)
> ...
> Path to dependency: 
>   1) org.notmysock.hive:plan-viewer:jar:1.0-SNAPSHOT
>   2) org.apache.hive:hive-exec:jar:0.10.0
>   3) org.apache.hive:hive-metastore:jar:0.10.0
>   4) javax.jdo:jdo2-api:jar:2.3-ec
> {code}
> As best I can tell, in the Hive build ant+ivy pulls this file from 
> the datanucleus repo
> http://www.datanucleus.org/downloads/maven2/javax/jdo/jdo2-api/2.3-ec/
> For completeness' sake, the dependency needs to be published to maven central.



[jira] [Commented] (HIVE-3381) Result of outer join is not valid

2013-03-26 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13614737#comment-13614737
 ] 

Navis commented on HIVE-3381:
-

Finally! Thanks to all.

> Result of outer join is not valid
> -
>
> Key: HIVE-3381
> URL: https://issues.apache.org/jira/browse/HIVE-3381
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.10.0
>Reporter: Navis
>Assignee: Navis
>Priority: Critical
> Fix For: 0.11.0
>
> Attachments: HIVE-3381.D5565.3.patch, HIVE-3381.D5565.4.patch, 
> HIVE-3381.D5565.5.patch, HIVE-3381.D5565.6.patch, HIVE-3381.D5565.7.patch, 
> mapjoin_testOuter.q
>
>
> Outer joins, especially full outer joins or outer joins with a filter in the ON 
> clause, do not show proper results. For example, the query in test join_1to1.q
> {code}
> SELECT * FROM join_1to1_1 a full outer join join_1to1_2 b on a.key1 = b.key1 
> and a.value = 66 and b.value = 66 ORDER BY a.key1 ASC, a.key2 ASC, a.value 
> ASC, b.key1 ASC, b.key2 ASC, b.value ASC;
> {code}
> results
> {code}
> NULL    NULL    NULL    NULL    NULL    66
> NULL    NULL    NULL    NULL    10050   66
> NULL    NULL    NULL    10      10010   66
> NULL    NULL    NULL    30      10030   88
> NULL    NULL    NULL    35      10035   88
> NULL    NULL    NULL    40      10040   88
> NULL    NULL    NULL    40      10040   88
> NULL    NULL    NULL    50      10050   88
> NULL    NULL    NULL    50      10050   88
> NULL    NULL    NULL    50      10050   88
> NULL    NULL    NULL    70      10040   88
> NULL    NULL    NULL    70      10040   88
> NULL    NULL    NULL    70      10040   88
> NULL    NULL    NULL    70      10040   88
> NULL    NULL    66      NULL    NULL    NULL
> NULL    10050   66      NULL    NULL    NULL
> 5       10005   66      5       10005   66
> 15      10015   66      NULL    NULL    NULL
> 20      10020   66      20      10020   66
> 25      10025   88      NULL    NULL    NULL
> 30      10030   66      NULL    NULL    NULL
> 35      10035   88      NULL    NULL    NULL
> 40      10040   66      NULL    NULL    NULL
> 40      10040   66      40      10040   66
> 40      10040   88      NULL    NULL    NULL
> 40      10040   88      NULL    NULL    NULL
> 50      10050   66      NULL    NULL    NULL
> 50      10050   66      50      10050   66
> 50      10050   66      50      10050   66
> 50      10050   88      NULL    NULL    NULL
> 50      10050   88      NULL    NULL    NULL
> 50      10050   88      NULL    NULL    NULL
> 50      10050   88      NULL    NULL    NULL
> 50      10050   88      NULL    NULL    NULL
> 50      10050   88      NULL    NULL    NULL
> 60      10040   66      60      10040   66
> 60      10040   66      60      10040   66
> 60      10040   66      60      10040   66
> 60      10040   66      60      10040   66
> 70      10040   66      NULL    NULL    NULL
> 70      10040   66      NULL    NULL    NULL
> 70      10040   66      NULL    NULL    NULL
> 70      10040   66      NULL    NULL    NULL
> 80      10040   88      NULL    NULL    NULL
> 80      10040   88      NULL    NULL    NULL
> 80      10040   88      NULL    NULL    NULL
> 80      10040   88      NULL    NULL    NULL
> {code} 
> but this does not seem right. The result should be 
> {code}
> NULL    NULL    NULL    NULL    NULL    66
> NULL    NULL    NULL    NULL    10050   66
> NULL    NULL    NULL    10      10010   66
> NULL    NULL    NULL    25      10025   66
> NULL    NULL    NULL    30      10030   88
> NULL    NULL    NULL    35      10035   88
> NULL    NULL    NULL    40      10040   88
> NULL    NULL    NULL    50      10050   88
> NULL    NULL    NULL    70      10040   88
> NULL    NULL    NULL    70      10040   88
> NULL    NULL    NULL    80      10040   66
> NULL    NULL    NULL    80      10040   66
> NULL    NULL    66      NULL    NULL    NULL
> NULL    10050   66      NULL    NULL    NULL
> 5       10005   66      5       10005   66
> 15      10015   66      NULL    NULL    NULL
> 20      10020   66      20      10020   66
> 25      10025   88      NULL    NULL    NULL
> 30      10030   66      NULL    NULL    NULL
> 35      10035   88      NULL    NULL    NULL
> 40      10040   66      40      10040   66
> 40      10040   88      NULL    NULL    NULL
> 50      10050   66      50      10050   66
> 50      10050   66      50      10050   66
> 50      10050   88      NULL    NULL    NULL
> 50      10050   88      NULL    NULL    NULL
> 60      10040   66      60      10040   66
> 60      10040   66      60      10040   66
> 60      10040   66      60      10040   66
> 60      10040   66      60      10040   66
> 70      10040   66      NULL    NULL    NULL
> 70      10040   66      NULL    NULL    NULL
> 80      10040   88      NULL    NULL    NULL
> 80      10040   88      NULL    NULL    NULL
> {code}
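
For reference, the semantics that the expected result relies on can be sketched in Java as a naive nested-loop full outer join. This is only an illustration of standard SQL behaviour, not Hive's join operator: FullOuterJoinSketch, fullOuterJoin and fmt are hypothetical names, rows are simplified to {key, value} pairs, and NULL join keys are left out. The point is that a filter in the ON clause only decides which rows pair up; every row that pairs with nothing is still emitted exactly once, padded with NULLs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiPredicate;

// Hypothetical sketch of FULL OUTER JOIN semantics; this is NOT Hive's join code.
class FullOuterJoinSketch {
    // Each row is {key, value}. The ON condition is passed in as a predicate.
    static List<String> fullOuterJoin(List<int[]> left, List<int[]> right,
                                      BiPredicate<int[], int[]> on) {
        List<String> out = new ArrayList<>();
        boolean[] rightMatched = new boolean[right.size()];
        for (int[] l : left) {
            boolean matched = false;
            for (int i = 0; i < right.size(); i++) {
                if (on.test(l, right.get(i))) {
                    out.add(fmt(l) + " | " + fmt(right.get(i)));
                    matched = true;
                    rightMatched[i] = true;
                }
            }
            // A left row with no partner is emitted once, NULL-padded on the right.
            if (!matched) {
                out.add(fmt(l) + " | NULL NULL");
            }
        }
        // Right rows that never paired up are emitted once, NULL-padded on the left.
        for (int i = 0; i < right.size(); i++) {
            if (!rightMatched[i]) {
                out.add("NULL NULL | " + fmt(right.get(i)));
            }
        }
        return out;
    }

    static String fmt(int[] row) {
        return row[0] + " " + row[1];
    }

    public static void main(String[] args) {
        List<int[]> a = Arrays.asList(new int[]{5, 66}, new int[]{15, 66}, new int[]{40, 88});
        List<int[]> b = Arrays.asList(new int[]{5, 66}, new int[]{40, 66}, new int[]{40, 88});
        // Same shape as the ON clause above: equal keys and both values equal to 66.
        for (String row : fullOuterJoin(a, b, (l, r) -> l[0] == r[0] && l[1] == 66 && r[1] == 66)) {
            System.out.println(row);
        }
    }
}
```

With this definition the output always contains every input row at least once, and a row that fails the ON filter appears exactly once.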


[jira] [Commented] (HIVE-4235) CREATE TABLE IF NOT EXISTS uses inefficient way to check if table exists

2013-03-26 Thread Kevin Wilfong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13614742#comment-13614742
 ] 

Kevin Wilfong commented on HIVE-4235:
-

+1

> CREATE TABLE IF NOT EXISTS uses inefficient way to check if table exists
> 
>
> Key: HIVE-4235
> URL: https://issues.apache.org/jira/browse/HIVE-4235
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC, Query Processor, SQL
>Reporter: Gang Tim Liu
>Assignee: Gang Tim Liu
> Attachments: HIVE-4235.patch.1
>
>
> CREATE TABLE IF NOT EXISTS uses an inefficient way to check whether the table exists.
> It uses Hive.java's getTablesByPattern(...) to check whether the table exists. This 
> involves a regular expression and eventually a database join, which is very inefficient. It 
> can increase database lock time and hurt db performance if many such 
> commands hit the database.
> The suggested approach is to use getTable(...), since we already know the table name.



[jira] [Commented] (HIVE-4179) NonBlockingOpDeDup does not merge SEL operators correctly

2013-03-26 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13614786#comment-13614786
 ] 

Navis commented on HIVE-4179:
-

I took a look at this. The root cause is in UnionProcessor, which does not 
copy colExprMapping of the parent SEL operator. After applying that fix, I've confirmed 
that the result is valid.

[~hagleitn] The patch you've provided is valid, but the missing colExprMap info 
can cause problems at any time. So I would prefer to revise it as suggested above. 
Could you do that?

> NonBlockingOpDeDup does not merge SEL operators correctly
> -
>
> Key: HIVE-4179
> URL: https://issues.apache.org/jira/browse/HIVE-4179
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
>Priority: Critical
> Fix For: 0.11.0
>
> Attachments: HIVE-4179.1.patch, HIVE-4179.2.patch
>
>
> The input columns list for SEL operations isn't merged properly in the 
> optimization. The best way to see this is running union_remove_22.q with 
> -Dhadoop.mr.rev=23. The plan shows lost UDFs and a broken lineage for one 
> column.
> Note: union_remove tests do not run on hadoop 1 or 0.20.



[jira] [Updated] (HIVE-4209) Cache evaluation result of deterministic expression and reuse it

2013-03-26 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-4209:
--

Attachment: HIVE-4209.D9585.2.patch

navis updated the revision "HIVE-4209 [jira] Cache evaluation result of 
deterministic expression and reuse it".

  Fix NPE, running test

Reviewers: JIRA

REVISION DETAIL
  https://reviews.facebook.net/D9585

CHANGE SINCE LAST DIFF
  https://reviews.facebook.net/D9585?vs=30201&id=30531#toc

AFFECTED FILES
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeColumnEvaluator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeConstantEvaluator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeEvaluator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeEvaluatorFactory.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeEvaluatorRef.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeNullEvaluator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeFieldEvaluator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeGenericFuncEvaluator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/FilterOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/JoinUtil.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/SelectOperator.java

To: JIRA, navis


> Cache evaluation result of deterministic expression and reuse it
> 
>
> Key: HIVE-4209
> URL: https://issues.apache.org/jira/browse/HIVE-4209
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-4209.D9585.1.patch, HIVE-4209.D9585.2.patch
>
>
> For example, 
> {noformat}
> select key from src where key + 1 > 100 AND key + 1 < 200 limit 3;
> {noformat}
> key + 1 need not be evaluated twice.
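
The idea can be sketched in Java as follows. This is not the attached patch: CachedExprSketch, CachingEvaluator and filter are hypothetical names, and the cache here is keyed on the input value, which is only safe because the expression is assumed to be deterministic. The evaluator remembers the last input and result, so the two predicates over key + 1 cost one real evaluation per row.

```java
import java.util.function.IntUnaryOperator;

// Hypothetical sketch of caching a deterministic expression; NOT the HIVE-4209 patch.
class CachedExprSketch {
    static final class CachingEvaluator {
        private final IntUnaryOperator expr;
        private boolean cached;
        private int cachedInput;
        private int cachedResult;
        int evaluations; // counts real (non-cached) evaluations

        CachingEvaluator(IntUnaryOperator expr) {
            this.expr = expr;
        }

        // Returns the remembered result when called again with the same input.
        int evaluate(int key) {
            if (cached && cachedInput == key) {
                return cachedResult;
            }
            evaluations++;
            cachedResult = expr.applyAsInt(key);
            cachedInput = key;
            cached = true;
            return cachedResult;
        }
    }

    // Mirrors the predicate: key + 1 > 100 AND key + 1 < 200.
    static boolean filter(CachingEvaluator keyPlusOne, int key) {
        return keyPlusOne.evaluate(key) > 100 && keyPlusOne.evaluate(key) < 200;
    }

    public static void main(String[] args) {
        CachingEvaluator keyPlusOne = new CachingEvaluator(k -> k + 1);
        System.out.println(filter(keyPlusOne, 150));
        System.out.println(keyPlusOne.evaluations);
    }
}
```

A non-deterministic expression (for example one based on a random number) must not be wrapped this way, since a cached value would change the query result.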



[jira] [Updated] (HIVE-4233) The TGT gotten from class 'CLIService' should be renewed on time

2013-03-26 Thread dong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dong updated HIVE-4233:
---

Summary: The TGT gotten from class 'CLIService'  should be renewed on time  
(was: The TGT gotten from class 'CLIService'  should be renewed on time? )

> The TGT gotten from class 'CLIService'  should be renewed on time
> -
>
> Key: HIVE-4233
> URL: https://issues.apache.org/jira/browse/HIVE-4233
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.10.0
> Environment: CentOS release 6.3 (Final)
> jdk1.6.0_31
> HiveServer2  0.10.0-cdh4.2.0
> Kerberos Security 
>Reporter: dong
>Priority: Critical
>
> After HiveServer2 has been running for more than 7 days, all operations fail when I 
> use the beeline shell to connect to it.
> The HiveServer2 log shows that this is caused by a Kerberos authentication failure. The 
> exception stack trace is:
> 2013-03-26 11:55:20,932 ERROR hive.ql.metadata.Hive: 
> java.lang.RuntimeException: Unable to instantiate 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient
> at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1084)
> at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:51)
> at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:61)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2140)
> at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2151)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getDelegationToken(Hive.java:2275)
> at 
> org.apache.hive.service.cli.CLIService.getDelegationTokenFromMetaStore(CLIService.java:358)
> at 
> org.apache.hive.service.cli.thrift.ThriftCLIService.OpenSession(ThriftCLIService.java:127)
> at 
> org.apache.hive.service.cli.thrift.TCLIService$Processor$OpenSession.getResult(TCLIService.java:1073)
> at 
> org.apache.hive.service.cli.thrift.TCLIService$Processor$OpenSession.getResult(TCLIService.java:1058)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge20S.java:565)
> at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.GeneratedConstructorAccessor52.newInstance(Unknown 
> Source)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1082)
> ... 16 more
> Caused by: java.lang.IllegalStateException: This ticket is no longer valid
> at 
> javax.security.auth.kerberos.KerberosTicket.toString(KerberosTicket.java:601)
> at java.lang.String.valueOf(String.java:2826)
> at java.lang.StringBuilder.append(StringBuilder.java:115)
> at 
> sun.security.jgss.krb5.SubjectComber.findAux(SubjectComber.java:120)
> at sun.security.jgss.krb5.SubjectComber.find(SubjectComber.java:41)
> at sun.security.jgss.krb5.Krb5Util.getTicket(Krb5Util.java:130)
> at 
> sun.security.jgss.krb5.Krb5InitCredential$1.run(Krb5InitCredential.java:328)
> at java.security.AccessController.doPrivileged(Native Method)
> at 
> sun.security.jgss.krb5.Krb5InitCredential.getTgt(Krb5InitCredential.java:325)
> at 
> sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:128)
> at 
> sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:106)
> at 
> sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:172)
> at 
> sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:209)
> at 
> sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:195)
> at 
> sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:162)
> at 
> com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:175)
> at 
> org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTra
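
The report above amounts to a long-running server holding a ticket past its validity. As a minimal sketch of the kind of check a periodic renewal task could make, the Python below decides when a ticket is due for renewal. It is not HiveServer2 code: the 0.8 renewal fraction, the 7-day lifetime and the function name are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta

# Assumption: renew once 80% of the ticket lifetime has elapsed.
RENEW_FRACTION = 0.8

def needs_renewal(start, end, now, fraction=RENEW_FRACTION):
    """Return True once `now` is past `fraction` of the ticket's lifetime."""
    lifetime = end - start
    return now >= start + lifetime * fraction

# Hypothetical ticket valid for 7 days.
ticket_start = datetime(2013, 3, 19, 12, 0, 0)
ticket_end = ticket_start + timedelta(days=7)

early = needs_renewal(ticket_start, ticket_end, ticket_start + timedelta(days=5))
late = needs_renewal(ticket_start, ticket_end, ticket_start + timedelta(days=6))
expired = needs_renewal(ticket_start, ticket_end, ticket_end + timedelta(hours=1))
```

A server would run such a check on a timer and log in again from its keytab when it returns true, rather than waiting for a client request to fail.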

Where to put hcatalog branches and site code

2013-03-26 Thread Alan Gates
Right after I moved the hcat code to hive/trunk/hcatalog Owen pointed out that 
the problem with this is now everyone who checks out Hive pulls _all_ of the 
hcat code.  This isn't what we want.

The site code I propose we integrate with Hive's site code.  I'll put up a 
patch for this shortly.

The branches we could either move into Hive's branches directory (and move them 
to hcatalog-branch-0.x) or we could create a /hive/hcatalog-historical and put 
them there.  I'm fine with either.  Thoughts?

Alan.

[jira] [Updated] (HIVE-4171) Current database in metastore.Hive is not consistent with SessionState

2013-03-26 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-4171:
--

Attachment: HIVE-4171.D9399.2.patch

navis updated the revision "HIVE-4171 [jira] Current database in metastore.Hive 
is not consistent with SessionState".

  Should change context loader, too

Reviewers: JIRA

REVISION DETAIL
  https://reviews.facebook.net/D9399

CHANGE SINCE LAST DIFF
  https://reviews.facebook.net/D9399?vs=29805&id=30543#toc

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java
  ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java

To: JIRA, navis
Cc: prasadm


> Current database in metastore.Hive is not consistent with SessionState
> --
>
> Key: HIVE-4171
> URL: https://issues.apache.org/jira/browse/HIVE-4171
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Reporter: Navis
>Assignee: Navis
>  Labels: HiveServer2
> Attachments: HIVE-4171.D9399.1.patch, HIVE-4171.D9399.2.patch
>
>
> metastore.Hive is a thread-local instance, so its state can differ from that 
> of SessionState. Currently the only state held in metastore.Hive is the name 
> of the database in use.
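
The mismatch described in this issue can be reproduced with any thread-local object. The Python below is a sketch only: `SessionState` here is a hypothetical stand-in, `_thread_state` plays the role of the thread-local metastore.Hive instance, and `sync_from_session` is an assumed fix for illustration, not the change made in the patch.

```python
import threading

class SessionState:
    """Hypothetical per-session state that is shared across threads."""
    def __init__(self):
        self.current_database = "default"

# Stand-in for the thread-local metastore.Hive instance.
_thread_state = threading.local()

def thread_local_database():
    # Each thread starts with "default" until it sets its own value.
    return getattr(_thread_state, "current_database", "default")

def use_database(session, name):
    # Updates the session and the calling thread's local copy only.
    session.current_database = name
    _thread_state.current_database = name

def sync_from_session(session):
    # Assumed remedy: copy the session's database into this thread's state.
    _thread_state.current_database = session.current_database

session = SessionState()
use_database(session, "sales")  # executed on the main thread

seen = {}

def worker():
    # A different thread serving the same session sees a stale value.
    seen["before_sync"] = thread_local_database()
    sync_from_session(session)
    seen["after_sync"] = thread_local_database()

t = threading.Thread(target=worker)
t.start()
t.join()
```

Before the sync the worker thread still reports "default" although the session has moved to "sales".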



[jira] [Updated] (HIVE-3464) Merging join tree may reorder joins which could be invalid

2013-03-26 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-3464:
--

Attachment: HIVE-3464.D5409.4.patch

navis updated the revision "HIVE-3464 [jira] Merging join tree may reorder 
joins which could be invalid".

  Rebased to trunk

Reviewers: JIRA

REVISION DETAIL
  https://reviews.facebook.net/D5409

CHANGE SINCE LAST DIFF
  https://reviews.facebook.net/D5409?vs=23079&id=30549#toc

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
  ql/src/test/queries/clientpositive/mergejoins_mixed.q
  ql/src/test/results/clientpositive/join_filters_overlap.q.out
  ql/src/test/results/clientpositive/mergejoins_mixed.q.out

To: JIRA, navis
Cc: njain


> Merging join tree may reorder joins which could be invalid
> --
>
> Key: HIVE-3464
> URL: https://issues.apache.org/jira/browse/HIVE-3464
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.10.0
>Reporter: Navis
>Assignee: Navis
> Attachments: HIVE-3464.D5409.2.patch, HIVE-3464.D5409.3.patch, 
> HIVE-3464.D5409.4.patch
>
>
> Currently, Hive merges the join tree from right to left regardless of join 
> types, which may reorder the joins. For example,
> select * from a join a b on a.key=b.key join a c on b.key=c.key join a d on 
> a.key=d.key; 
> Hive tries to merge the join tree in a-d=b-d, a-d=a-b, b-c=a-b order, and 
> a-d=a-b and b-c=a-b will be merged. The final join tree is "a-(bdc)".
> With this, the ab-d join will be executed prior to ab-c. But if the join 
> types of -c and -d differ, this is not valid.
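
Why reordering can be invalid when join types differ is easier to see on a tiny data set. The Python below is a sketch under assumptions: the tables, the helper functions and the choice of a right outer join for -c with an inner join for -d are hypothetical, picked as one combination where the order matters, and are not necessarily the case exercised by the patch.

```python
def inner_join(left, right, lcol, rcol):
    """Inner join: keep only pairs whose join keys are equal and not NULL."""
    return [{**l, **r} for l in left for r in right
            if l[lcol] is not None and l[lcol] == r[rcol]]

def right_outer_join(left, right, lcol, rcol):
    """Right outer join: every right row survives, padded with None if unmatched."""
    left_cols = sorted({c for l in left for c in l})
    out = []
    for r in right:
        matches = [l for l in left if l[lcol] is not None and l[lcol] == r[rcol]]
        if matches:
            out.extend({**l, **r} for l in matches)
        else:
            out.append({**dict.fromkeys(left_cols), **r})
    return out

# Hypothetical single-column tables.
a = [{"a.key": 1}]
b = [{"b.key": 1}]
c = [{"c.key": 1}, {"c.key": 2}]
d = [{"d.key": 1}]

ab = inner_join(a, b, "a.key", "b.key")

# Order as written: (ab RIGHT OUTER JOIN c) JOIN d
written = inner_join(right_outer_join(ab, c, "b.key", "c.key"), d, "a.key", "d.key")

# Reordered: (ab JOIN d) RIGHT OUTER JOIN c
reordered = right_outer_join(inner_join(ab, d, "a.key", "d.key"), c, "b.key", "c.key")
```

In the written order the unmatched row of c is dropped by the later inner join; in the reordered plan it survives, so the two plans return different row counts.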



[jira] [Commented] (HIVE-784) Support uncorrelated subqueries in the WHERE clause

2013-03-26 Thread Sun Rui (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13614982#comment-13614982
 ] 

Sun Rui commented on HIVE-784:
--

Guido, you may try project-panthera-ase@github, which is an open source effort 
for HIVE-3472. It can support correlated/un-correlated subqueries and selection 
from multiple tables without any join operator now. Instead of directly 
modifying the HiveQL grammar and query processor to support subqueries, the 
project incorporates a PL/SQL parser for SQL input and performs a semantically 
equivalent transformation on the PL/SQL parser output for execution by the 
query processor.

> Support uncorrelated subqueries in the WHERE clause
> ---
>
> Key: HIVE-784
> URL: https://issues.apache.org/jira/browse/HIVE-784
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Ning Zhang
>Assignee: Ning Zhang
>
> Hive currently supports only views in the FROM clause. Some Facebook use 
> cases suggest that Hive should also support subqueries such as those 
> connected by IN/EXISTS in the WHERE clause. 
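
What makes the uncorrelated case simpler is that the subquery does not depend on the outer row, so it can be evaluated once and then used as a filter. The Python below is a minimal sketch of that idea, assuming hypothetical table contents and a hypothetical `where_in` helper. It does not describe how Hive or the project mentioned in the comment implements subqueries.

```python
def where_in(outer_rows, column, subquery):
    """Filter outer rows by membership in the result of an uncorrelated subquery."""
    # The subquery takes no outer row as input, so a single evaluation suffices.
    allowed = set(subquery())
    return [row for row in outer_rows if row[column] in allowed]

# Hypothetical tables.
orders = [
    {"id": 1, "customer": "ann"},
    {"id": 2, "customer": "bob"},
    {"id": 3, "customer": "cat"},
]
vip_customers = [{"name": "ann"}, {"name": "cat"}]

def vip_names():
    # Plays the role of: SELECT name FROM vip_customers
    return [row["name"] for row in vip_customers]

# Plays the role of:
#   SELECT * FROM orders WHERE customer IN (SELECT name FROM vip_customers)
vip_orders = where_in(orders, "customer", vip_names)
```

A correlated subquery would need the outer row as an argument and could not be reduced to a single evaluation in this way.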
