[jira] [Commented] (HIVE-3652) Join optimization for star schema

2013-01-02 Thread Amareshwari Sriramadasu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542723#comment-13542723
 ] 

Amareshwari Sriramadasu commented on HIVE-3652:
---

[~vikram.dixit] I'm not working on it right now, and may not get time in the 
next month either. Please feel free to work on it if interested.

> Join optimization for star schema
> -
>
> Key: HIVE-3652
> URL: https://issues.apache.org/jira/browse/HIVE-3652
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Amareshwari Sriramadasu
>Assignee: Amareshwari Sriramadasu
>
> Currently, if we join one fact table with multiple dimension tables, it 
> results in one MapReduce job per join with a dimension table, because the 
> join is on a different key for each dimension. 
> Usually all the dimension tables are small and can fit into memory, so a 
> map-side join can be used to join them with the fact table.
> In this issue I want to look at optimizing such a query to generate a single 
> MapReduce job, so that the mapper loads the dimension tables into memory and 
> joins with the fact table on the different keys as well.
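The single-job plan described above can be sketched as follows. This is an illustrative Python model of a map-side star join, not Hive's implementation; the table and column names (`fact`, `customers`, `products`, `cust_id`, `prod_id`) are hypothetical.

```python
# Each mapper loads every (small) dimension table into an in-memory hash map
# keyed on that dimension's join column, then streams the fact table once,
# probing each map in turn -- one pass instead of one MR job per dimension.

def star_join(fact_rows, dimensions):
    """dimensions: list of (fact_key_column, {dim_key: dim_row}) pairs."""
    for row in fact_rows:
        joined = dict(row)
        for key_col, dim_index in dimensions:
            match = dim_index.get(row[key_col])
            if match is None:
                joined = None  # inner-join semantics: drop unmatched fact rows
                break
            joined.update(match)
        if joined is not None:
            yield joined

fact = [{"sale_id": 1, "cust_id": 10, "prod_id": 20},
        {"sale_id": 2, "cust_id": 11, "prod_id": 21}]
customers = {10: {"cust_name": "a"}, 11: {"cust_name": "b"}}
products = {20: {"prod_name": "x"}, 21: {"prod_name": "y"}}

result = list(star_join(fact, [("cust_id", customers), ("prod_id", products)]))
```

Each dimension is probed on its own key, which is why the joins can all happen in one mapper even though the join keys differ.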

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (HIVE-3652) Join optimization for star schema

2013-01-02 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu reassigned HIVE-3652:
-

Assignee: (was: Amareshwari Sriramadasu)

> Join optimization for star schema
> -
>
> Key: HIVE-3652
> URL: https://issues.apache.org/jira/browse/HIVE-3652
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Amareshwari Sriramadasu
>
> Currently, if we join one fact table with multiple dimension tables, it 
> results in one MapReduce job per join with a dimension table, because the 
> join is on a different key for each dimension. 
> Usually all the dimension tables are small and can fit into memory, so a 
> map-side join can be used to join them with the fact table.
> In this issue I want to look at optimizing such a query to generate a single 
> MapReduce job, so that the mapper loads the dimension tables into memory and 
> joins with the fact table on the different keys as well.



[jira] [Commented] (HIVE-3562) Some limit can be pushed down to map stage

2013-01-02 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542690#comment-13542690
 ] 

Phabricator commented on HIVE-3562:
---

tarball has commented on the revision "HIVE-3562 [jira] Some limit can be 
pushed down to map stage".

  Looks good to me. +1

REVISION DETAIL
  https://reviews.facebook.net/D5967

To: JIRA, tarball, navis
Cc: njain


> Some limit can be pushed down to map stage
> --
>
> Key: HIVE-3562
> URL: https://issues.apache.org/jira/browse/HIVE-3562
> Project: Hive
>  Issue Type: Bug
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-3562.D5967.1.patch, HIVE-3562.D5967.2.patch, 
> HIVE-3562.D5967.3.patch
>
>
> Queries with a limit clause (with a reasonable number), for example
> {noformat}
> select * from src order by key limit 10;
> {noformat}
> produce the operator tree
> TS-SEL-RS-EXT-LIMIT-FS
> but the LIMIT can be partially computed in the RS, reducing the size of the
> shuffle:
> TS-SEL-RS(TOP-N)-EXT-LIMIT-FS
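The RS(TOP-N) idea can be sketched as a bounded heap in the map-side reduce sink: instead of shuffling every row and applying LIMIT on the reducer, each mapper forwards at most N rows. A minimal sketch, not Hive's actual `ReduceSinkOperator` code; the function and parameter names are illustrative.

```python
import heapq

def reduce_sink_top_n(rows, key, n):
    """Keep only the n smallest rows by `key`, as an ascending
    ORDER BY ... LIMIT n would; everything else is never shuffled."""
    heap = []  # heap of (negated key, row): the worst kept row sits at heap[0]
    for row in rows:
        item = (-key(row), row)
        if len(heap) < n:
            heapq.heappush(heap, item)
        elif item > heap[0]:  # this row's key beats the worst kept key
            heapq.heapreplace(heap, item)
    return sorted(row for _, row in heap)

rows = [5, 1, 4, 2, 8, 3]
top3 = reduce_sink_top_n(rows, key=lambda r: r, n=3)
```

Only N rows per mapper reach the shuffle, so the final LIMIT on the reduce side sees far less data.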



[jira] [Commented] (HIVE-3562) Some limit can be pushed down to map stage

2013-01-02 Thread Sivaramakrishnan Narayanan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542689#comment-13542689
 ] 

Sivaramakrishnan Narayanan commented on HIVE-3562:
--

Looks good to me. +1

> Some limit can be pushed down to map stage
> --
>
> Key: HIVE-3562
> URL: https://issues.apache.org/jira/browse/HIVE-3562
> Project: Hive
>  Issue Type: Bug
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-3562.D5967.1.patch, HIVE-3562.D5967.2.patch, 
> HIVE-3562.D5967.3.patch
>
>
> Queries with a limit clause (with a reasonable number), for example
> {noformat}
> select * from src order by key limit 10;
> {noformat}
> produce the operator tree
> TS-SEL-RS-EXT-LIMIT-FS
> but the LIMIT can be partially computed in the RS, reducing the size of the
> shuffle:
> TS-SEL-RS(TOP-N)-EXT-LIMIT-FS



[jira] [Commented] (HIVE-446) Implement TRUNCATE

2013-01-02 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542688#comment-13542688
 ] 

Phabricator commented on HIVE-446:
--

mgrover has commented on the revision "HIVE-446 [jira] Implement TRUNCATE".

  I am happy to take care of all rmr changes in HIVE-3701

REVISION DETAIL
  https://reviews.facebook.net/D7371

To: JIRA, navis
Cc: njain, mgrover


> Implement TRUNCATE
> --
>
> Key: HIVE-446
> URL: https://issues.apache.org/jira/browse/HIVE-446
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Prasad Chakka
>Assignee: Navis
> Fix For: 0.11.0
>
> Attachments: HIVE-446.D7371.1.patch, HIVE-446.D7371.2.patch, 
> HIVE-446.D7371.3.patch, HIVE-446.D7371.4.patch
>
>
> Truncate the data but leave the table and metadata intact.



[jira] [Updated] (HIVE-3852) Multi-groupby optimization fails when same distinct column is used twice or more

2013-01-02 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-3852:
--

Attachment: HIVE-3852.D7737.1.patch

navis requested code review of "HIVE-3852 [jira] Multi-groupby optimization 
fails when same distinct column is used twice or more".
Reviewers: JIRA

  DPAL-1951 Multi-groupby optimization fails when same distinct column is used 
twice or more

  FROM INPUT
  INSERT OVERWRITE TABLE dest1
  SELECT INPUT.key, sum(distinct substr(INPUT.value,5)), count(distinct 
substr(INPUT.value,5)) GROUP BY INPUT.key
  INSERT OVERWRITE TABLE dest2
  SELECT INPUT.key, sum(distinct substr(INPUT.value,5)), avg(distinct 
substr(INPUT.value,5)) GROUP BY INPUT.key;

  fails with exception FAILED: IndexOutOfBoundsException Index: 0,Size: 0

TEST PLAN
  EMPTY

REVISION DETAIL
  https://reviews.facebook.net/D7737

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
  ql/src/test/queries/clientpositive/groupby10.q
  ql/src/test/results/clientpositive/groupby10.q.out

MANAGE HERALD DIFFERENTIAL RULES
  https://reviews.facebook.net/herald/view/differential/

WHY DID I GET THIS EMAIL?
  https://reviews.facebook.net/herald/transcript/18621/

To: JIRA, navis


> Multi-groupby optimization fails when same distinct column is used twice or 
> more
> 
>
> Key: HIVE-3852
> URL: https://issues.apache.org/jira/browse/HIVE-3852
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-3852.D7737.1.patch
>
>
> {code}
> FROM INPUT
> INSERT OVERWRITE TABLE dest1 
> SELECT INPUT.key, sum(distinct substr(INPUT.value,5)), count(distinct 
> substr(INPUT.value,5)) GROUP BY INPUT.key
> INSERT OVERWRITE TABLE dest2 
> SELECT INPUT.key, sum(distinct substr(INPUT.value,5)), avg(distinct 
> substr(INPUT.value,5)) GROUP BY INPUT.key;
> {code}
> fails with exception FAILED: IndexOutOfBoundsException Index: 0,Size: 0
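The failing query computes several aggregates over the *same* distinct expression. The intended behavior can be sketched as a single pass that materializes the distinct set per group once and derives every aggregate from it; this is an illustrative model, not the actual SemanticAnalyzer fix.

```python
from collections import defaultdict

def multi_distinct_aggregates(rows, key, expr):
    """One pass: collect the distinct expr values per group, then compute
    sum/count/avg from the shared set, as the multi-groupby query intends."""
    distinct = defaultdict(set)
    for row in rows:
        distinct[key(row)].add(expr(row))
    return {k: {"sum": sum(v), "count": len(v), "avg": sum(v) / len(v)}
            for k, v in distinct.items()}

rows = [("a", 1), ("a", 1), ("a", 3), ("b", 2)]
out = multi_distinct_aggregates(rows, key=lambda r: r[0], expr=lambda r: r[1])
```

Since sum, count, and avg all read the same distinct set, repeating the distinct column should not change the plan's column bookkeeping, which is where the reported IndexOutOfBoundsException arises.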



[jira] [Updated] (HIVE-3852) Multi-groupby optimization fails when same distinct column is used twice or more

2013-01-02 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-3852:


Status: Patch Available  (was: Open)

> Multi-groupby optimization fails when same distinct column is used twice or 
> more
> 
>
> Key: HIVE-3852
> URL: https://issues.apache.org/jira/browse/HIVE-3852
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-3852.D7737.1.patch
>
>
> {code}
> FROM INPUT
> INSERT OVERWRITE TABLE dest1 
> SELECT INPUT.key, sum(distinct substr(INPUT.value,5)), count(distinct 
> substr(INPUT.value,5)) GROUP BY INPUT.key
> INSERT OVERWRITE TABLE dest2 
> SELECT INPUT.key, sum(distinct substr(INPUT.value,5)), avg(distinct 
> substr(INPUT.value,5)) GROUP BY INPUT.key;
> {code}
> fails with exception FAILED: IndexOutOfBoundsException Index: 0,Size: 0



[jira] [Created] (HIVE-3852) Multi-groupby optimization fails when same distinct column is used twice or more

2013-01-02 Thread Navis (JIRA)
Navis created HIVE-3852:
---

 Summary: Multi-groupby optimization fails when same distinct 
column is used twice or more
 Key: HIVE-3852
 URL: https://issues.apache.org/jira/browse/HIVE-3852
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Trivial


{code}
FROM INPUT
INSERT OVERWRITE TABLE dest1 
SELECT INPUT.key, sum(distinct substr(INPUT.value,5)), count(distinct 
substr(INPUT.value,5)) GROUP BY INPUT.key
INSERT OVERWRITE TABLE dest2 
SELECT INPUT.key, sum(distinct substr(INPUT.value,5)), avg(distinct 
substr(INPUT.value,5)) GROUP BY INPUT.key;
{code}

fails with exception FAILED: IndexOutOfBoundsException Index: 0,Size: 0



[jira] [Updated] (HIVE-933) Infer bucketing/sorting properties

2013-01-02 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-933:
---

Attachment: HIVE-933.7.patch.txt

> Infer bucketing/sorting properties
> --
>
> Key: HIVE-933
> URL: https://issues.apache.org/jira/browse/HIVE-933
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Kevin Wilfong
> Attachments: HIVE-933.1.patch.txt, HIVE-933.2.patch.txt, 
> HIVE-933.3.patch.txt, HIVE-933.4.patch.txt, HIVE-933.5.patch.txt, 
> HIVE-933.6.patch.txt, HIVE-933.7.patch.txt
>
>
> This is a long-term plan, and may require major changes.
> From the query, we can figure out the sorting/bucketing properties, and 
> change the metadata of the destination at that time.
> However, this means that different partitions may have different metadata. 
> Currently, the query plan is the same for all the 
> partitions of the table - we can do the following:
> 1. In the first cut, have a simple approach where we take the union of all 
> metadata and create the most defensive plan.
> 2. Enhance mapredWork() to include partition-specific operator trees.
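One reading of step 1's "most defensive plan" is to keep only the inferred properties that every partition satisfies, so the single shared plan is safe for all of them. A hypothetical sketch under that assumption; the property names are illustrative, not Hive metadata keys.

```python
def defensive_properties(partition_props):
    """Intersect the inferred bucketing/sorting property sets of all
    partitions: a plan relying only on common properties is valid everywhere."""
    common = set(partition_props[0])
    for props in partition_props[1:]:
        common &= set(props)
    return common

parts = [{"sorted_by_key", "bucketed_by_key"},  # older partition
         {"sorted_by_key"}]                     # newer partition
common = defensive_properties(parts)
```

Step 2, partition-specific operator trees, would remove the need for this intersection by letting each partition use its own metadata.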



Hive-trunk-h0.21 - Build # 1891 - Fixed

2013-01-02 Thread Apache Jenkins Server
Changes for Build #1886

Changes for Build #1887

Changes for Build #1888

Changes for Build #1889

Changes for Build #1890
[namit] HIVE-446 Implement TRUNCATE
(Navis via namit)


Changes for Build #1891



All tests passed

The Apache Jenkins build system has built Hive-trunk-h0.21 (build #1891)

Status: Fixed

Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/1891/ to 
view the results.

Re: write access to the Hive wiki?

2013-01-02 Thread Namit Jain
What is your id?


On 1/3/13 1:18 AM, "Sean Busbey"  wrote:

>Hi All!
>
>Could I have write access to the Hive wiki?
>
>I'd like to fix some documentation errors. The most immediate is the Avro
>SerDe page, which contains incorrect table creation statements.
>
>-- 
>Sean



[jira] [Updated] (HIVE-3562) Some limit can be pushed down to map stage

2013-01-02 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-3562:
--

Attachment: HIVE-3562.D5967.3.patch

navis updated the revision "HIVE-3562 [jira] Some limit can be pushed down to 
map stage".
Reviewers: JIRA, tarball

  Addressed comments
  Prevent multi-GBY single-RS case


REVISION DETAIL
  https://reviews.facebook.net/D5967

AFFECTED FILES
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
  conf/hive-default.xml.template
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExtractOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/ForwardOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/SelectOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveKey.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/LimitPushdownOptimizer.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/ReduceSinkDesc.java
  ql/src/test/queries/clientpositive/limit_pushdown.q
  ql/src/test/results/clientpositive/limit_pushdown.q.out

To: JIRA, tarball, navis
Cc: njain


> Some limit can be pushed down to map stage
> --
>
> Key: HIVE-3562
> URL: https://issues.apache.org/jira/browse/HIVE-3562
> Project: Hive
>  Issue Type: Bug
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-3562.D5967.1.patch, HIVE-3562.D5967.2.patch, 
> HIVE-3562.D5967.3.patch
>
>
> Queries with a limit clause (with a reasonable number), for example
> {noformat}
> select * from src order by key limit 10;
> {noformat}
> produce the operator tree
> TS-SEL-RS-EXT-LIMIT-FS
> but the LIMIT can be partially computed in the RS, reducing the size of the
> shuffle:
> TS-SEL-RS(TOP-N)-EXT-LIMIT-FS



[jira] [Updated] (HIVE-3562) Some limit can be pushed down to map stage

2013-01-02 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-3562:


Status: Patch Available  (was: Open)

> Some limit can be pushed down to map stage
> --
>
> Key: HIVE-3562
> URL: https://issues.apache.org/jira/browse/HIVE-3562
> Project: Hive
>  Issue Type: Bug
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-3562.D5967.1.patch, HIVE-3562.D5967.2.patch, 
> HIVE-3562.D5967.3.patch
>
>
> Queries with a limit clause (with a reasonable number), for example
> {noformat}
> select * from src order by key limit 10;
> {noformat}
> produce the operator tree
> TS-SEL-RS-EXT-LIMIT-FS
> but the LIMIT can be partially computed in the RS, reducing the size of the
> shuffle:
> TS-SEL-RS(TOP-N)-EXT-LIMIT-FS



[jira] [Updated] (HIVE-3428) Fix log4j configuration errors when running hive on hadoop23

2013-01-02 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-3428:
-

Attachment: HIVE-3428_SHIM_EVENT_COUNTER.patch

Proposed shim for EventCounter class. Does not contain NullAppender or 
HADOOP_CONF variable.

> Fix log4j configuration errors when running hive on hadoop23
> 
>
> Key: HIVE-3428
> URL: https://issues.apache.org/jira/browse/HIVE-3428
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.10.0
>Reporter: Zhenxiao Luo
>Assignee: Zhenxiao Luo
> Attachments: HIVE-3428.1.patch.txt, HIVE-3428.2.patch.txt, 
> HIVE-3428.3.patch.txt, HIVE-3428.4.patch.txt, HIVE-3428.5.patch.txt, 
> HIVE-3428.6.patch.txt, HIVE-3428_SHIM_EVENT_COUNTER.patch
>
>
> There are log4j configuration errors when running Hive on hadoop23, some of 
> which may fail test cases, since the following log4j error messages can be 
> printed to the console, or to the output file, which differs from the expected output:
> [junit] < log4j:ERROR Could not find value for key log4j.appender.NullAppender
> [junit] < log4j:ERROR Could not instantiate appender named "NullAppender".
> [junit] < 12/09/04 11:34:42 WARN conf.HiveConf: hive-site.xml not found on 
> CLASSPATH



[jira] [Commented] (HIVE-3428) Fix log4j configuration errors when running hive on hadoop23

2013-01-02 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542594#comment-13542594
 ] 

Gunther Hagleitner commented on HIVE-3428:
--

The patch doesn't seem to provide configuration for hadoop.mr.rev=20S (the 
Hadoop 1 line). More importantly, if the only difference between the files is 
Hadoop's deprecated EventCounter, it seems better to create a shim class for 
that. That way there's only one conf file, and Hive will pick the right file 
from the classpath at runtime.

> Fix log4j configuration errors when running hive on hadoop23
> 
>
> Key: HIVE-3428
> URL: https://issues.apache.org/jira/browse/HIVE-3428
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.10.0
>Reporter: Zhenxiao Luo
>Assignee: Zhenxiao Luo
> Attachments: HIVE-3428.1.patch.txt, HIVE-3428.2.patch.txt, 
> HIVE-3428.3.patch.txt, HIVE-3428.4.patch.txt, HIVE-3428.5.patch.txt, 
> HIVE-3428.6.patch.txt
>
>
> There are log4j configuration errors when running Hive on hadoop23, some of 
> which may fail test cases, since the following log4j error messages can be 
> printed to the console, or to the output file, which differs from the expected output:
> [junit] < log4j:ERROR Could not find value for key log4j.appender.NullAppender
> [junit] < log4j:ERROR Could not instantiate appender named "NullAppender".
> [junit] < 12/09/04 11:34:42 WARN conf.HiveConf: hive-site.xml not found on 
> CLASSPATH



Re: write access to the Hive wiki?

2013-01-02 Thread Carl Steinbach
Hi Sean,

I added you to the Hive wiki ACL.

Thanks.

Carl

On Wed, Jan 2, 2013 at 11:48 AM, Sean Busbey  wrote:

> Hi All!
>
> Could I have write access to the Hive wiki?
>
> I'd like to fix some documentation errors. The most immediate is the Avro
> SerDe page, which contains incorrect table creation statements.
>
> --
> Sean
>


[jira] [Commented] (HIVE-3699) Multiple insert overwrite into multiple tables query stores same results in all tables

2013-01-02 Thread Shanzhong Zhu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542568#comment-13542568
 ] 

Shanzhong Zhu commented on HIVE-3699:
-

Any updates on this item?

We are also observing a similar issue.

In the following query, the result of test17 was supposed to be empty. But 
test17 seems to have the same results as test18.

FROM (
SELECT info.product, info.sid, info.id, t.persona, info.service
  FROM info_table info JOIN main_tbl t ON info.service=t.service
  WHERE (info.id BETWEEN 17 AND 18) AND t.dt='2012-11-20' AND t.m='XXX1' 
AND t.g = 'XXX2' AND t.s = 'XXX3' ) u
INSERT OVERWRITE TABLE test18 PARTITION (dt='2012-11-20', service)
SELECT u.product, u.sid, u.id, u.persona, u.service
WHERE u.id=18
INSERT OVERWRITE TABLE test17 PARTITION (dt='2012-11-20', service)
SELECT u.product, u.sid, u.id, u.persona, u.service
WHERE u.id=17;
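The expected semantics of such a multi-insert can be sketched as one scan with each INSERT branch applying its *own* predicate to the shared rows; the bug described in this thread behaves as if both branches shared one result set. Illustrative Python, with hypothetical names; not Hive's plan representation.

```python
def multi_insert(rows, branches):
    """branches: list of (predicate, sink_list) pairs, all filled in one pass
    over the source -- each sink sees only rows matching its own WHERE clause."""
    for row in rows:
        for predicate, sink in branches:
            if predicate(row):
                sink.append(row)

rows = [{"id": 17, "v": "a"}, {"id": 18, "v": "b"}, {"id": 18, "v": "c"}]
test17, test18 = [], []
multi_insert(rows, [(lambda r: r["id"] == 17, test17),
                    (lambda r: r["id"] == 18, test18)])
```

With correct behavior the two sinks differ whenever the predicates select different rows, which is exactly what the reporter observed failing.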

> Multiple insert overwrite into multiple tables query stores same results in 
> all tables
> --
>
> Key: HIVE-3699
> URL: https://issues.apache.org/jira/browse/HIVE-3699
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.9.0
> Environment: Cloudera 4.1 on Amazon Linux (rebranded Centos 6): 
> hive-0.9.0+150-1.cdh4.1.1.p0.4.el6.noarch
>Reporter: Alexandre Fouché
>
> (Note: This might be related to HIVE-2750)
> I am doing a query with multiple INSERT OVERWRITEs to multiple tables in order 
> to scan the dataset only once, and I end up with all these tables having 
> the same content! It seems the GROUP BY query that returns results is 
> overwriting all the temp tables.
> Oddly enough, if I add further GROUP BY queries into additional temp tables, 
> grouped by a different field, then all temp tables, even the ones that would 
> otherwise have had wrong content, are correctly populated.
> This is the misbehaving query:
> FROM nikon
> INSERT OVERWRITE TABLE e1
> SELECT qs_cs_s_aid AS Emplacements, COUNT(*) AS Impressions
> WHERE qs_cs_s_cat='PRINT' GROUP BY qs_cs_s_aid
> INSERT OVERWRITE TABLE e2
> SELECT qs_cs_s_aid AS Emplacements, COUNT(*) AS Vues
> WHERE qs_cs_s_cat='VIEW' GROUP BY qs_cs_s_aid
> ;
> It launches only one MR job, and here are the results. Why does table 'e1' 
> contain results from table 'e2'?! Table 'e1' should have been empty (see the 
> individual SELECTs further below)
> hive> SELECT * from e1;
> OK
> NULL2
> 1627575 25
> 1627576 70
> 1690950 22
> 1690952 42
> 1696705 199
> 1696706 66
> 1696730 229
> 1696759 85
> 1696893 218
> Time taken: 0.229 seconds
> hive> SELECT * from e2;
> OK
> NULL2
> 1627575 25
> 1627576 70
> 1690950 22
> 1690952 42
> 1696705 199
> 1696706 66
> 1696730 229
> 1696759 85
> 1696893 218
> Time taken: 0.11 seconds
> Here are the results of the individual queries (only the second query 
> returns a result set):
> hive> SELECT qs_cs_s_aid AS Emplacements, COUNT(*) AS Impressions FROM 
> nikon
> WHERE qs_cs_s_cat='PRINT' GROUP BY qs_cs_s_aid;
> (...)
> OK
>   <- There are no results, this is normal
> Time taken: 41.471 seconds
> hive> SELECT qs_cs_s_aid AS Emplacements, COUNT(*) AS Vues FROM nikon
> WHERE qs_cs_s_cat='VIEW' GROUP BY qs_cs_s_aid;
> (...)
> OK
> NULL  2
> 1627575 25
> 1627576 70
> 1690950 22
> 1690952 42
> 1696705 199
> 1696706 66
> 1696730 229
> 1696759 85
> 1696893 218
> Time taken: 39.607 seconds
> 



[jira] [Assigned] (HIVE-3585) Integrate Trevni as another columnar oriented file format

2013-01-02 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach reassigned HIVE-3585:


Assignee: Mark Wagner  (was: Jakob Homan)

> Integrate Trevni as another columnar oriented file format
> -
>
> Key: HIVE-3585
> URL: https://issues.apache.org/jira/browse/HIVE-3585
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 0.10.0
>Reporter: alex gemini
>Assignee: Mark Wagner
>Priority: Minor
>
> Add the new Avro module Trevni as another columnar format. A new columnar 
> format needs a columnar SerDe; fastutil seems a good choice. The Shark project 
> uses the fastutil library as its columnar SerDe library, but it seems too 
> large (almost 15 MB) for just a few primitive array collections.



[jira] [Commented] (HIVE-3528) Avro SerDe doesn't handle serializing Nullable types that require access to a Schema

2013-01-02 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542533#comment-13542533
 ] 

Sean Busbey commented on HIVE-3528:
---

Updated [review board #7431|https://reviews.apache.org/r/7431/] with a 
clientpositive .q test and the subsumed enum handling (HIVE-3538).

> Avro SerDe doesn't handle serializing Nullable types that require access to a 
> Schema
> 
>
> Key: HIVE-3528
> URL: https://issues.apache.org/jira/browse/HIVE-3528
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Sean Busbey
>  Labels: avro
> Attachments: HIVE-3528.1.patch.txt
>
>
> Deserialization properly handles hiding Nullable Avro types, including 
> complex types like record, map, array, etc. However, when Serialization 
> attempts to write out these types, it erroneously uses the UNION 
> schema that contains NULL and the other type.
> This results in schema mismatch errors for Record, Array, Enum, Fixed, and 
> Bytes.
> Here's a [review board of unit tests that express the 
> problem|https://reviews.apache.org/r/7431/], as well as one supporting the 
> case that the problem occurs only when the schema is needed.
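The core of the described fix can be sketched as: when a column's schema is a union of "null" and exactly one other branch, serialization of a non-null value should use the non-null branch schema, not the union. Schemas are modeled as plain strings/dicts here for illustration; this is not the AvroSerializer API.

```python
def resolve_nullable(schema):
    """Return the non-null branch of a [null, X] union (Avro unions are
    modeled as lists); any other schema is returned unchanged."""
    if isinstance(schema, list) and len(schema) == 2 and "null" in schema:
        return next(s for s in schema if s != "null")
    return schema

# A nullable record column: serializing a non-null value with the union
# schema (rather than the record branch) is what caused the mismatch errors.
record_schema = ["null", {"type": "record", "name": "r"}]
```

For example, `resolve_nullable(["null", "string"])` yields `"string"`, which is the schema the underlying Avro writer expects for a concrete value.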



[jira] [Resolved] (HIVE-3538) Avro SerDe can't handle Nullable Enums

2013-01-02 Thread Sean Busbey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey resolved HIVE-3538.
---

Resolution: Duplicate

Subsumed by HIVE-3528

> Avro SerDe can't handle Nullable Enums
> --
>
> Key: HIVE-3538
> URL: https://issues.apache.org/jira/browse/HIVE-3538
> Project: Hive
>  Issue Type: Bug
>Reporter: Sean Busbey
> Attachments: HIVE-3538.tests.txt
>
>
> If a field has a schema that unions NULL with an enum, Avro fails to resolve 
> the union because the Avro SerDe doesn't restore "enumness".
> Since the enum datum is a String, Avro internals check the union for a string 
> schema, which is not present.



Re: Review Request: HIVE-3528

2013-01-02 Thread Sean Busbey

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7431/
---

(Updated Jan. 2, 2013, 11:11 p.m.)


Review request for hive and Jakob Homan.


Changes
---

Added a clientpositive test for all nullable types. Subsumed HIVE-3538, with 
changes in anticipation of AVRO-997's stricter handling of enums.


Description (updated)
---

Changes AvroSerDe to properly give the non-null schema to serialization 
routines when using Nullable complex types. Properly restores the enum-ness of 
Avro Enums prior to serialization.


Diffs (updated)
-

  /trunk/data/files/csv.txt PRE-CREATION 
  /trunk/ql/src/test/queries/clientpositive/avro_nullable_fields.q PRE-CREATION 
  /trunk/ql/src/test/results/clientpositive/avro_nullable_fields.q.out 
PRE-CREATION 
  /trunk/serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerializer.java 
1426606 
  
/trunk/serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroDeserializer.java
 1426606 
  
/trunk/serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroObjectInspectorGenerator.java
 1426606 
  
/trunk/serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroSerializer.java
 1426606 

Diff: https://reviews.apache.org/r/7431/diff/


Testing (updated)
---

Adds tests covering each of the Avro types for which serialization needs to 
use a user-provided schema, both as top-level fields and as nested members of 
a complex type.

Adds a client positive test that reads in a CSV table with NULLs, copies that 
data into an Avro backed table, then reads the data out of the table.


Thanks,

Sean Busbey



[jira] [Created] (HIVE-3851) Add isFinalMapRed from MapredWork to EXPLAIN EXTENDED output

2013-01-02 Thread Kevin Wilfong (JIRA)
Kevin Wilfong created HIVE-3851:
---

 Summary: Add isFinalMapRed from MapredWork to EXPLAIN EXTENDED 
output
 Key: HIVE-3851
 URL: https://issues.apache.org/jira/browse/HIVE-3851
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
Priority: Minor


A flag indicating that a MapReduce job produces final output (ignoring 
moves/merges) will be added as part of HIVE-933. It would be good to include 
this in the output of EXPLAIN EXTENDED.



[jira] [Updated] (HIVE-3850) hour() function returns 12 hour clock value when using timestamp datatype

2013-01-02 Thread Pieterjan Vriends (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pieterjan Vriends updated HIVE-3850:


Summary: hour() function returns 12 hour clock value when using timestamp 
datatype  (was: hour() function returns 12 hour clock value when using 
timestamp)

> hour() function returns 12 hour clock value when using timestamp datatype
> -
>
> Key: HIVE-3850
> URL: https://issues.apache.org/jira/browse/HIVE-3850
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Affects Versions: 0.9.0
>Reporter: Pieterjan Vriends
>
> Apparently UDFHour.java has two evaluate() functions: one that accepts a 
> Text object as a parameter and one that takes a TimestampWritable object. 
> The first returns the value of Calendar.HOUR_OF_DAY and the second the value 
> of Calendar.HOUR. In the documentation I couldn't find any information on the 
> overloads of the evaluate function. I spent quite some time finding out why 
> my statement didn't return a 24-hour clock value.
> Shouldn't both functions return the same?
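The discrepancy in plain terms: Java's Calendar.HOUR is a 12-hour clock value while Calendar.HOUR_OF_DAY is 24-hour, so the two overloads disagree for any afternoon timestamp. A small Python sketch of the two behaviors, not the Hive UDF code itself.

```python
from datetime import datetime

ts = datetime(2013, 1, 2, 14, 30)   # 2:30 PM

hour_of_day = ts.hour               # 24-hour clock, like Calendar.HOUR_OF_DAY
hour_12 = ts.hour % 12              # 12-hour clock, like Calendar.HOUR
```

Both overloads should presumably return the 24-hour value, matching hour() on a string-typed column.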



Build failed in Jenkins: Hive-0.10.0-SNAPSHOT-h0.20.1 #22

2013-01-02 Thread Apache Jenkins Server
See 

--
[...truncated 41919 lines...]
[junit] Hadoop job information for null: number of mappers: 0; number of 
reducers: 0
[junit] 2013-01-02 14:17:24,131 null map = 100%,  reduce = 100%
[junit] Ended Job = job_local_0001
[junit] Execution completed successfully
[junit] Mapred Local Task Succeeded . Convert the Join into MapJoin
[junit] POSTHOOK: query: select count(1) as cnt from testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/build/service/localscratchdir/hive_2013-01-02_14-17-21_059_3270679034972611333/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history 
file=/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/build/service/tmp/hive_job_log_jenkins_201301021417_745834635.txt
[junit] Copying file: 
file:/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/data/files/kv1.txt
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: load data local inpath 
'/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] Copying data from 
file:/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/data/files/kv1.txt
[junit] Loading data to table default.testhivedrivertable
[junit] Table default.testhivedrivertable stats: [num_partitions: 0, 
num_files: 1, num_rows: 0, total_size: 5812, raw_data_size: 0]
[junit] POSTHOOK: query: load data local inpath 
'/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: select * from testhivedrivertable limit 10
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/build/service/localscratchdir/hive_2013-01-02_14-17-25_499_8521399548346007948/-mr-1
[junit] POSTHOOK: query: select * from testhivedrivertable limit 10
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/build/service/localscratchdir/hive_2013-01-02_14-17-25_499_8521399548346007948/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history 
file=/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/build/service/tmp/hive_job_log_jenkins_201301021417_1668767561.txt
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[j

[jira] [Commented] (HIVE-3585) Integrate Trevni as another columnar oriented file format

2013-01-02 Thread Mark Wagner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542432#comment-13542432
 ] 

Mark Wagner commented on HIVE-3585:
---

I'm taking this over for Jakob. Please add me as a contributor so that I can 
assign this ticket to myself.

> Integrate Trevni as another columnar oriented file format
> -
>
> Key: HIVE-3585
> URL: https://issues.apache.org/jira/browse/HIVE-3585
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 0.10.0
>Reporter: alex gemini
>Assignee: Jakob Homan
>Priority: Minor
>
> Add the new Avro module Trevni as another columnar format. A new columnar 
> format needs a columnar SerDe; fastutil seems a good choice. The Shark 
> project uses the fastutil library as its columnar SerDe library, but it 
> seems too large (almost 15 MB) for just a few primitive array collections.



[jira] [Commented] (HIVE-3585) Integrate Trevni as another columnar oriented file format

2013-01-02 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542442#comment-13542442
 ] 

Jakob Homan commented on HIVE-3585:
---

Also, He, I'm assuming your -1 is not intended to be a veto? I don't believe it 
would hold up technically.  Trevni is essentially a variation on Avro.  Not 
letting people read their Trevni-encoded data in Hive just because there's 
already another columnar format doesn't seem like a good way forward.

> Integrate Trevni as another columnar oriented file format
> -
>
> Key: HIVE-3585
> URL: https://issues.apache.org/jira/browse/HIVE-3585
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 0.10.0
>Reporter: alex gemini
>Assignee: Jakob Homan
>Priority: Minor
>
> Add the new Avro module Trevni as another columnar format. A new columnar 
> format needs a columnar SerDe; fastutil seems a good choice. The Shark 
> project uses the fastutil library as its columnar SerDe library, but it 
> seems too large (almost 15 MB) for just a few primitive array collections.



[jira] [Updated] (HIVE-3803) explain dependency should show the dependencies hierarchically in presence of views

2013-01-02 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-3803:


Status: Open  (was: Patch Available)

> explain dependency should show the dependencies hierarchically in presence of 
> views
> ---
>
> Key: HIVE-3803
> URL: https://issues.apache.org/jira/browse/HIVE-3803
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3803.1.patch, hive.3803.2.patch, hive.3803.3.patch, 
> hive.3803.4.patch, hive.3803.5.patch
>
>
> It should also include tables whose partitions are being accessed



[jira] [Commented] (HIVE-3803) explain dependency should show the dependencies hierarchically in presence of views

2013-01-02 Thread Kevin Wilfong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542383#comment-13542383
 ] 

Kevin Wilfong commented on HIVE-3803:
-

Can you file the patch with the test updates?

> explain dependency should show the dependencies hierarchically in presence of 
> views
> ---
>
> Key: HIVE-3803
> URL: https://issues.apache.org/jira/browse/HIVE-3803
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3803.1.patch, hive.3803.2.patch, hive.3803.3.patch, 
> hive.3803.4.patch, hive.3803.5.patch
>
>
> It should also include tables whose partitions are being accessed



[jira] [Commented] (HIVE-3552) HIVE-3552 performant manner for performing cubes/rollups/grouping sets for a high number of grouping set keys

2013-01-02 Thread Kevin Wilfong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542382#comment-13542382
 ] 

Kevin Wilfong commented on HIVE-3552:
-

The patch needs to be updated; it's not applying cleanly.

Some minor style comments on Phabricator.

> HIVE-3552 performant manner for performing cubes/rollups/grouping sets for a 
> high number of grouping set keys
> -
>
> Key: HIVE-3552
> URL: https://issues.apache.org/jira/browse/HIVE-3552
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3552.10.patch, hive.3552.11.patch, 
> hive.3552.1.patch, hive.3552.2.patch, hive.3552.3.patch, hive.3552.4.patch, 
> hive.3552.5.patch, hive.3552.6.patch, hive.3552.7.patch, hive.3552.8.patch, 
> hive.3552.9.patch
>
>
> This is a follow-up to HIVE-3433.
> I had an offline discussion with Sambavi - she pointed out a scenario where 
> the implementation in HIVE-3433 will not scale. Assume the user is performing 
> a cube on many columns, say 8. Each row would then generate 256 rows for the 
> hash table, which may kill the current group-by implementation.
> A better implementation would add an additional MR job: in the first MR job, 
> perform the group by as if there were no cube; in a second MR job, perform 
> the cube. The assumption is that the group by will have decreased the output 
> data significantly, and the rows will arrive in the order of the grouping 
> keys, which gives a higher probability of hitting the hash table.
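[Editorial note] The blow-up referred to in the description is just the subset count: a cube over n grouping columns emits one aggregation row per subset of those columns (including the empty grand-total set), i.e. 2^n rows per input row. A quick sketch of the arithmetic:

```java
public class CubeBlowup {
    // Number of grouping sets a CUBE over n columns produces: every
    // subset of the n columns, including the empty (grand-total) set.
    static long groupingSets(int n) {
        return 1L << n;
    }

    public static void main(String[] args) {
        System.out.println(groupingSets(8)); // 256, the figure in the description
    }
}
```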



[jira] [Updated] (HIVE-3552) HIVE-3552 performant manner for performing cubes/rollups/grouping sets for a high number of grouping set keys

2013-01-02 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-3552:


Status: Open  (was: Patch Available)

> HIVE-3552 performant manner for performing cubes/rollups/grouping sets for a 
> high number of grouping set keys
> -
>
> Key: HIVE-3552
> URL: https://issues.apache.org/jira/browse/HIVE-3552
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3552.10.patch, hive.3552.11.patch, 
> hive.3552.1.patch, hive.3552.2.patch, hive.3552.3.patch, hive.3552.4.patch, 
> hive.3552.5.patch, hive.3552.6.patch, hive.3552.7.patch, hive.3552.8.patch, 
> hive.3552.9.patch
>
>
> This is a follow-up to HIVE-3433.
> I had an offline discussion with Sambavi - she pointed out a scenario where 
> the implementation in HIVE-3433 will not scale. Assume the user is performing 
> a cube on many columns, say 8. Each row would then generate 256 rows for the 
> hash table, which may kill the current group-by implementation.
> A better implementation would add an additional MR job: in the first MR job, 
> perform the group by as if there were no cube; in a second MR job, perform 
> the cube. The assumption is that the group by will have decreased the output 
> data significantly, and the rows will arrive in the order of the grouping 
> keys, which gives a higher probability of hitting the hash table.



[jira] [Commented] (HIVE-3652) Join optimization for star schema

2013-01-02 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542361#comment-13542361
 ] 

Vikram Dixit K commented on HIVE-3652:
--

[~amareshwari] I am quite interested in this jira and was wondering what phase 
you are in with respect to design/implementation. I would like to collaborate 
with you on this if possible. Please let me know.

Thanks
Vikram.

> Join optimization for star schema
> -
>
> Key: HIVE-3652
> URL: https://issues.apache.org/jira/browse/HIVE-3652
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Amareshwari Sriramadasu
>Assignee: Amareshwari Sriramadasu
>
> Currently, if we join one fact table with multiple dimension tables, it 
> results in one mapreduce job per join with a dimension table, because the 
> join is on different keys for each dimension. 
> Usually all the dimension tables are small and fit into memory, so a 
> map-side join can be used to join them with the fact table.
> In this issue I want to look at optimizing such a query to generate a single 
> mapreduce job, so that the mapper loads the dimension tables into memory and 
> joins each with the fact table on its own key.
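[Editorial note] The map-side join the description relies on amounts to broadcasting each small dimension table as an in-memory hash map and probing one map per dimension for every fact row. The sketch below uses plain Java collections with made-up table shapes, not Hive's actual MapJoinOperator, just to show the single-pass, multi-key probe:

```java
import java.util.HashMap;
import java.util.Map;

public class StarJoinSketch {
    // Hypothetical fact row: (dateKey, productKey, amount)
    record Fact(int dateKey, int productKey, long amount) {}

    // One in-memory hash table per dimension, each keyed on a different
    // column of the fact row; a single mapper could hold all of them.
    static String joinOne(Fact f,
                          Map<Integer, String> dateDim,
                          Map<Integer, String> productDim) {
        String d = dateDim.get(f.dateKey());       // probe dimension 1
        String p = productDim.get(f.productKey()); // probe dimension 2
        return d + "," + p + "," + f.amount();
    }

    public static void main(String[] args) {
        Map<Integer, String> dateDim = new HashMap<>();
        dateDim.put(1, "2013-01-02");
        Map<Integer, String> productDim = new HashMap<>();
        productDim.put(7, "widget");

        System.out.println(joinOne(new Fact(1, 7, 100L), dateDim, productDim));
        // 2013-01-02,widget,100
    }
}
```

Because each probe uses its own key column, all dimension joins happen in one pass over the fact table, which is the single-MR-job shape the issue asks for.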



write access to the Hive wiki?

2013-01-02 Thread Sean Busbey
Hi All!

Could I have write access to the Hive wiki?

I'd like to fix some documentation errors. The most immediate is the Avro
SerDe page, which contains incorrect table creation statements.

-- 
Sean


[jira] [Commented] (HIVE-3803) explain dependency should show the dependencies hierarchically in presence of views

2013-01-02 Thread Kevin Wilfong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542296#comment-13542296
 ] 

Kevin Wilfong commented on HIVE-3803:
-

One really minor comment on the diff, otherwise looks good.

> explain dependency should show the dependencies hierarchically in presence of 
> views
> ---
>
> Key: HIVE-3803
> URL: https://issues.apache.org/jira/browse/HIVE-3803
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3803.1.patch, hive.3803.2.patch, hive.3803.3.patch, 
> hive.3803.4.patch, hive.3803.5.patch
>
>
> It should also include tables whose partitions are being accessed



[jira] [Commented] (HIVE-3272) RetryingRawStore will perform partial transaction on retry

2013-01-02 Thread Kevin Wilfong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542282#comment-13542282
 ] 

Kevin Wilfong commented on HIVE-3272:
-

Yes, this is a totally separate issue from HIVE-3826.  HIVE-3826 will happen 
even when the RetryingRawStore tries only once (never retries).

> RetryingRawStore will perform partial transaction on retry
> --
>
> Key: HIVE-3272
> URL: https://issues.apache.org/jira/browse/HIVE-3272
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.10.0
>Reporter: Kevin Wilfong
>Priority: Critical
>
> By the time the RetryingRawStore retries a command, the transaction 
> encompassing it has already been rolled back. This means it will perform 
> the remainder of the raw-store commands outside of a transaction (unless 
> another transaction encapsulates it, which is definitely not always the 
> case) and then fail when it tries to commit, as there is no transaction 
> open.
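[Editorial note] One common shape of fix for this class of bug is to make the retry wrapper restart the whole transactional unit, including opening a fresh transaction, rather than re-issuing only the failed command. This is a minimal generic sketch under that assumption, not the actual RetryingRawStore API:

```java
import java.util.function.Supplier;

public class RetrySketch {
    // Hypothetical transaction handle, standing in for the metastore's.
    interface Tx { void begin(); void commit(); void rollback(); }

    // Retry the *entire* unit of work: begin/work/commit all sit inside
    // the loop, so a retry never runs commands outside a transaction.
    static <T> T runWithRetry(Tx tx, Supplier<T> work, int attempts) {
        RuntimeException last = null;
        for (int i = 0; i < attempts; i++) {
            tx.begin();
            try {
                T result = work.get();
                tx.commit();
                return result;
            } catch (RuntimeException e) {
                tx.rollback();
                last = e;
            }
        }
        throw last;
    }
}
```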



[jira] [Updated] (HIVE-3812) TestCase TestJdbcDriver fails with IBM Java 6

2013-01-02 Thread Renata Ghisloti Duarte de Souza (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Renata Ghisloti Duarte de Souza updated HIVE-3812:
--

Fix Version/s: 0.8.1
   0.10.0
   Status: Patch Available  (was: Open)

> TestCase TestJdbcDriver fails with IBM Java 6
> -
>
> Key: HIVE-3812
> URL: https://issues.apache.org/jira/browse/HIVE-3812
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC, Tests
>Affects Versions: 0.9.0, 0.8.1, 0.8.0, 0.10.0
> Environment: Apache Ant 1.7.1
> IBM JDK 6
>Reporter: Renata Ghisloti Duarte de Souza
>Priority: Minor
> Fix For: 0.10.0, 0.8.1
>
> Attachments: HIVE-3812.1_0.8.1.patch.txt, HIVE-3812.1_trunk.patch.txt
>
>
> When running testcase TestJdbcDriver with IBM Java 6, it fails with the 
> following error:
> junit.framework.ComparisonFailure: 
> expected:[[{}, 1], [{[c=d, a=b]}, 2]] but was:[[{}, 1], [{[a=b, c=d]}, 2]];
>   at junit.framework.Assert.assertEquals(Assert.java:85)
>   at junit.framework.Assert.assertEquals(Assert.java:91)
>   at 
> org.apache.hadoop.hive.jdbc.TestJdbcDriver.testDataTypes(TestJdbcDriver.java:380)
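[Editorial note] The failure above is an artifact of comparing a map's string rendering: HashMap iteration order is unspecified, and the IBM JDK happens to order {a=b, c=d} differently from other JVMs. A hypothetical order-insensitive check (not the actual TestJdbcDriver fix) would canonicalize before comparing:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class OrderInsensitiveCompare {
    // Render a map with keys in sorted order so the string is the same
    // on every JVM, regardless of HashMap iteration order.
    static String canonical(Map<String, String> m) {
        return new TreeMap<>(m).toString();
    }

    public static void main(String[] args) {
        Map<String, String> m = new HashMap<>();
        m.put("c", "d");
        m.put("a", "b");
        System.out.println(canonical(m)); // {a=b, c=d} on any JVM
    }
}
```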



[jira] [Commented] (HIVE-3488) Issue trying to use the thick client (embedded) from windows.

2013-01-02 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1354#comment-1354
 ] 

Rémy DUBOIS commented on HIVE-3488:
---

Reminder.

> Issue trying to use the thick client (embedded) from windows.
> -
>
> Key: HIVE-3488
> URL: https://issues.apache.org/jira/browse/HIVE-3488
> Project: Hive
>  Issue Type: Bug
>  Components: Windows
>Affects Versions: 0.8.1
>Reporter: Rémy DUBOIS
>Priority: Critical
>
> I'm trying to execute a very simple SELECT query against my remote hive 
> server.
> If I'm doing a SELECT * from table, everything works well. If I'm trying to 
> execute a SELECT name from table, this error appears:
> {code:java}
> Job Submission failed with exception 'java.io.IOException(cannot find dir = 
> /user/hive/warehouse/test/city=paris/out.csv in pathToPartitionInfo: 
> [hdfs://cdh-four:8020/user/hive/warehouse/test/city=paris])'
> 12/09/19 17:18:44 ERROR exec.Task: Job Submission failed with exception 
> 'java.io.IOException(cannot find dir = 
> /user/hive/warehouse/test/city=paris/out.csv in pathToPartitionInfo: 
> [hdfs://cdh-four:8020/user/hive/warehouse/test/city=paris])'
> java.io.IOException: cannot find dir = 
> /user/hive/warehouse/test/city=paris/out.csv in pathToPartitionInfo: 
> [hdfs://cdh-four:8020/user/hive/warehouse/test/city=paris]
>   at 
> org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getPartitionDescFromPathRecursively(HiveFileFormatUtils.java:290)
>   at 
> org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getPartitionDescFromPathRecursively(HiveFileFormatUtils.java:257)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveInputFormat$CombineHiveInputSplit.(CombineHiveInputFormat.java:104)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:407)
>   at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:989)
>   at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:981)
>   at org.apache.hadoop.mapred.JobClient.access$500(JobClient.java:170)
>   at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:891)
>   at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:844)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Unknown Source)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
>   at 
> org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:844)
>   at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:818)
>   at 
> org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:452)
>   at 
> org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:136)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:133)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1332)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1123)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:931)
>   at 
> org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:191)
>   at 
> org.apache.hadoop.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:187)
> {code}
> Indeed, this "dir" (/user/hive/warehouse/test/city=paris/out.csv) can't be 
> found since it deals with my data file, and not a directory.
> Could you please help me?



Build failed in Jenkins: Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false #248

2013-01-02 Thread Apache Jenkins Server
See 


--
[...truncated 5669 lines...]
[ivy:resolve] downloading 
http://repo1.maven.org/maven2/org/apache/ftpserver/ftplet-api/1.0.0/ftplet-api-1.0.0.jar
 ...
[ivy:resolve]  (22kB)
[ivy:resolve] .. (0kB)
[ivy:resolve]   [SUCCESSFUL ] 
org.apache.ftpserver#ftplet-api;1.0.0!ftplet-api.jar(bundle) (14ms)
[ivy:resolve] downloading 
http://repo1.maven.org/maven2/org/apache/mina/mina-core/2.0.0-M5/mina-core-2.0.0-M5.jar
 ...
[ivy:resolve] 
...
 (622kB)
[ivy:resolve] .. (0kB)
[ivy:resolve]   [SUCCESSFUL ] 
org.apache.mina#mina-core;2.0.0-M5!mina-core.jar(bundle) (122ms)
[ivy:resolve] downloading 
http://repo1.maven.org/maven2/org/apache/ftpserver/ftpserver-core/1.0.0/ftpserver-core-1.0.0.jar
 ...
[ivy:resolve] . (264kB)
[ivy:resolve] .. (0kB)
[ivy:resolve]   [SUCCESSFUL ] 
org.apache.ftpserver#ftpserver-core;1.0.0!ftpserver-core.jar(bundle) (19ms)
[ivy:resolve] downloading 
http://repo1.maven.org/maven2/org/apache/ftpserver/ftpserver-deprecated/1.0.0-M2/ftpserver-deprecated-1.0.0-M2.jar
 ...
[ivy:resolve] .. (31kB)
[ivy:resolve] .. (0kB)
[ivy:resolve]   [SUCCESSFUL ] 
org.apache.ftpserver#ftpserver-deprecated;1.0.0-M2!ftpserver-deprecated.jar 
(13ms)
[ivy:resolve] downloading 
http://repo1.maven.org/maven2/org/slf4j/slf4j-api/1.5.2/slf4j-api-1.5.2.jar ...
[ivy:resolve] . (16kB)
[ivy:resolve] .. (0kB)
[ivy:resolve]   [SUCCESSFUL ] org.slf4j#slf4j-api;1.5.2!slf4j-api.jar (11ms)

ivy-retrieve-hadoop-shim:
 [echo] Project: shims
[javac] Compiling 16 source files to 

[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] Note: 

 uses unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.
 [echo] Building shims 0.20S

build_shims:
 [echo] Project: shims
 [echo] Compiling 

 against hadoop 1.0.0 
(

ivy-init-settings:
 [echo] Project: shims

ivy-resolve-hadoop-shim:
 [echo] Project: shims
[ivy:resolve] :: loading settings :: file = 

[ivy:resolve] downloading 
http://repo1.maven.org/maven2/org/apache/hadoop/hadoop-core/1.0.0/hadoop-core-1.0.0.jar
 ...
[ivy:resolve] 
.
 (3652kB)
[ivy:resolve] .. (0kB)
[ivy:resolve]   [SUCCESSFUL ] 
org.apache.hadoop#hadoop-core;1.0.0!hadoop-core.jar (100ms)
[ivy:resolve] downloading 
http://repo1.maven.org/maven2/org/apache/hadoop/hadoop-tools/1.0.0/hadoop-tools-1.0.0.jar
 ...
[ivy:resolve] .. (281kB)
[ivy:resolve] .. (0kB)
[ivy:resolve]   [SUCCESSFUL ] 
org.apache.hadoop#hadoop-tools;1.0.0!hadoop-tools.jar (31ms)
[ivy:resolve] downloading 
http://repo1.maven.org/maven2/org/apache/hadoop/hadoop-test/1.0.0/hadoop-test-1.0.0.jar
 ...
[ivy:resolve] 

 (2471kB)
[ivy:resolve] .. (0kB)
[ivy:resolve]   [SUCCESSFUL ] 
org.apache.hadoop#hadoop-test;1.0.0!hadoop-test.jar (63ms)
[ivy:resolve] downloading 
http://repo1.maven.org/maven2/commons-codec/commons-codec/1.4/commons-codec-1.4.

Hive-trunk-h0.21 - Build # 1890 - Failure

2013-01-02 Thread Apache Jenkins Server
Changes for Build #1886

Changes for Build #1887

Changes for Build #1888

Changes for Build #1889

Changes for Build #1890
[namit] HIVE-446 Implement TRUNCATE
(Navis via namit)




1 tests failed.
REGRESSION:  
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_stats_aggregator_error_1

Error Message:
Forked Java VM exited abnormally. Please note the time in the report does not 
reflect the time until the VM exit.

Stack Trace:
junit.framework.AssertionFailedError: Forked Java VM exited abnormally. Please 
note the time in the report does not reflect the time until the VM exit.
at 
net.sf.antcontrib.logic.ForTask.doSequentialIteration(ForTask.java:259)
at net.sf.antcontrib.logic.ForTask.doToken(ForTask.java:268)
at net.sf.antcontrib.logic.ForTask.doTheTasks(ForTask.java:324)
at net.sf.antcontrib.logic.ForTask.execute(ForTask.java:244)




The Apache Jenkins build system has built Hive-trunk-h0.21 (build #1890)

Status: Failure

Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/1890/ to 
view the results.

[jira] [Commented] (HIVE-446) Implement TRUNCATE

2013-01-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542139#comment-13542139
 ] 

Hudson commented on HIVE-446:
-

Integrated in Hive-trunk-h0.21 #1890 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/1890/])
HIVE-446 Implement TRUNCATE
(Navis via namit) (Revision 1427681)

 Result = FAILURE
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1427681
Files : 
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzerFactory.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/DDLWork.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/HiveOperation.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/TruncateTableDesc.java
* /hive/trunk/ql/src/test/queries/clientnegative/truncate_table_failure1.q
* /hive/trunk/ql/src/test/queries/clientnegative/truncate_table_failure2.q
* /hive/trunk/ql/src/test/queries/clientnegative/truncate_table_failure3.q
* /hive/trunk/ql/src/test/queries/clientnegative/truncate_table_failure4.q
* /hive/trunk/ql/src/test/queries/clientpositive/truncate_table.q
* /hive/trunk/ql/src/test/results/clientnegative/truncate_table_failure1.q.out
* /hive/trunk/ql/src/test/results/clientnegative/truncate_table_failure2.q.out
* /hive/trunk/ql/src/test/results/clientnegative/truncate_table_failure3.q.out
* /hive/trunk/ql/src/test/results/clientnegative/truncate_table_failure4.q.out
* /hive/trunk/ql/src/test/results/clientpositive/truncate_table.q.out


> Implement TRUNCATE
> --
>
> Key: HIVE-446
> URL: https://issues.apache.org/jira/browse/HIVE-446
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Prasad Chakka
>Assignee: Navis
> Fix For: 0.11.0
>
> Attachments: HIVE-446.D7371.1.patch, HIVE-446.D7371.2.patch, 
> HIVE-446.D7371.3.patch, HIVE-446.D7371.4.patch
>
>
> truncate the data but leave the table and metadata intact.



[jira] [Commented] (HIVE-2439) Upgrade antlr version to 3.4

2013-01-02 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542136#comment-13542136
 ] 

Thiruvel Thirumoolan commented on HIVE-2439:


[~namit]

This is intended to simplify life for users of Hive, HCatalog and Pig. Since 
HCat/Pig use antlr 3.4, anyone using all of the components has to work around 
the version conflict in unfriendly and complicated ways. Thomas Weise raised 
this issue before: http://markmail.org/thread/xltnc5ak2saurdbu. A web search 
for 'hive pig antlr' also brings up workarounds like using jarjar.

While upgrading antlr, I also fixed problems in Hive.g that didn't surface 
with 3.0.1.

[~ashutoshc] Feel free to add if I have missed anything.

> Upgrade antlr version to 3.4
> 
>
> Key: HIVE-2439
> URL: https://issues.apache.org/jira/browse/HIVE-2439
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.8.0
>Reporter: Ashutosh Chauhan
>Assignee: Thiruvel Thirumoolan
> Fix For: 0.10.0, 0.9.1, 0.11.0
>
> Attachments: HIVE-2439_branch9_2.patch, HIVE-2439_branch9_3.patch, 
> HIVE-2439_branch9.patch, hive-2439_incomplete.patch, HIVE-2439_trunk.patch
>
>
> Upgrade antlr version to 3.4



[jira] [Updated] (HIVE-3850) hour() function returns 12 hour clock value when using timestamp

2013-01-02 Thread Pieterjan Vriends (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pieterjan Vriends updated HIVE-3850:


Description: 
Apparently UDFHour.java has two evaluate() functions: one accepts a Text 
object as a parameter and one takes a TimestampWritable. The first returns 
the value of Calendar.HOUR_OF_DAY and the second the value of Calendar.HOUR. 
I couldn't find any documentation on this overload of the evaluate function, 
and I spent quite some time finding out why my statement didn't return a 
24-hour clock value.

Shouldn't both functions return the same?


  was:
Apparently UDFHour.java has two evaluate() functions: one that accepts a Text 
object as a parameter and one that takes a TimeStampWritable object. The first 
returns the value of Calendar.HOUR_OF_DAY, while the second returns 
Calendar.HOUR. I couldn't find any documentation on the overloads of the 
evaluate() function, and I spent quite some time finding out why my statement 
didn't return a 24-hour clock value.



> hour() function returns 12 hour clock value when using timestamp
> 
>
> Key: HIVE-3850
> URL: https://issues.apache.org/jira/browse/HIVE-3850
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Affects Versions: 0.9.0
>Reporter: Pieterjan Vriends
>
> Apparently UDFHour.java has two evaluate() functions: one that accepts a Text 
> object as a parameter and one that takes a TimeStampWritable object. The 
> first returns the value of Calendar.HOUR_OF_DAY, while the second returns 
> Calendar.HOUR. I couldn't find any documentation on the overloads of the 
> evaluate() function, and I spent quite some time finding out why my statement 
> didn't return a 24-hour clock value.
> Shouldn't both functions return the same value?



[jira] [Updated] (HIVE-3850) hour() function returns 12 hour clock value when using timestamp

2013-01-02 Thread Pieterjan Vriends (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pieterjan Vriends updated HIVE-3850:


Affects Version/s: 0.9.0

> hour() function returns 12 hour clock value when using timestamp
> 
>
> Key: HIVE-3850
> URL: https://issues.apache.org/jira/browse/HIVE-3850
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Affects Versions: 0.9.0
>Reporter: Pieterjan Vriends
>
> Apparently UDFHour.java has two evaluate() functions: one that accepts a Text 
> object as a parameter and one that takes a TimeStampWritable object. The 
> first returns the value of Calendar.HOUR_OF_DAY, while the second returns 
> Calendar.HOUR. I couldn't find any documentation on the overloads of the 
> evaluate() function, and I spent quite some time finding out why my statement 
> didn't return a 24-hour clock value.



[jira] [Updated] (HIVE-3850) hour() function returns 12 hour clock value when using timestamp

2013-01-02 Thread Pieterjan Vriends (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pieterjan Vriends updated HIVE-3850:


Summary: hour() function returns 12 hour clock value when using timestamp  
(was: hour() function returns 12 hour clock value when using timestamp 
datatype.)

> hour() function returns 12 hour clock value when using timestamp
> 
>
> Key: HIVE-3850
> URL: https://issues.apache.org/jira/browse/HIVE-3850
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Pieterjan Vriends
>Priority: Minor
>
> Apparently UDFHour.java has two evaluate() functions: one that accepts a Text 
> object as a parameter and one that takes a TimeStampWritable object. The 
> first returns the value of Calendar.HOUR_OF_DAY, while the second returns 
> Calendar.HOUR. I couldn't find any documentation on the overloads of the 
> evaluate() function, and I spent quite some time finding out why my statement 
> didn't return a 24-hour clock value.



[jira] [Updated] (HIVE-3850) hour() function returns 12 hour clock value when using timestamp

2013-01-02 Thread Pieterjan Vriends (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pieterjan Vriends updated HIVE-3850:


Priority: Major  (was: Minor)

> hour() function returns 12 hour clock value when using timestamp
> 
>
> Key: HIVE-3850
> URL: https://issues.apache.org/jira/browse/HIVE-3850
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Pieterjan Vriends
>
> Apparently UDFHour.java has two evaluate() functions: one that accepts a Text 
> object as a parameter and one that takes a TimeStampWritable object. The 
> first returns the value of Calendar.HOUR_OF_DAY, while the second returns 
> Calendar.HOUR. I couldn't find any documentation on the overloads of the 
> evaluate() function, and I spent quite some time finding out why my statement 
> didn't return a 24-hour clock value.



[jira] [Created] (HIVE-3850) hour() function returns 12 hour clock value when using timestamp datatype.

2013-01-02 Thread Pieterjan Vriends (JIRA)
Pieterjan Vriends created HIVE-3850:
---

 Summary: hour() function returns 12 hour clock value when using 
timestamp datatype.
 Key: HIVE-3850
 URL: https://issues.apache.org/jira/browse/HIVE-3850
 Project: Hive
  Issue Type: Improvement
  Components: UDF
Reporter: Pieterjan Vriends
Priority: Minor


Apparently UDFHour.java has two evaluate() functions: one that accepts a Text 
object as a parameter and one that takes a TimeStampWritable object. The first 
returns the value of Calendar.HOUR_OF_DAY, while the second returns 
Calendar.HOUR. I couldn't find any documentation on the overloads of the 
evaluate() function, and I spent quite some time finding out why my statement 
didn't return a 24-hour clock value.




[jira] [Updated] (HIVE-3781) not all meta events call metastore event listener

2013-01-02 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-3781:


Status: Patch Available  (was: Open)

> not all meta events call metastore event listener
> -
>
> Key: HIVE-3781
> URL: https://issues.apache.org/jira/browse/HIVE-3781
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.9.0
>Reporter: Sudhanshu Arora
>Assignee: Navis
> Attachments: hive.3781.3.patch, hive.3781.4.patch, 
> HIVE-3781.D7731.1.patch, HIVE-3781.D7731.2.patch, HIVE-3781.D7731.3.patch
>
>
> An event listener must be called for any DDL activity. For example, 
> create_index and drop_index today do not call the metastore event listener.
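The bug class here is the listener pattern with some operations bypassing the notification path. A minimal plain-Java sketch of the intended fix (hypothetical names, not the actual Hive MetaStoreEventListener API): route every DDL method through a single notify helper so no operation can silently skip the listeners.

```java
import java.util.ArrayList;
import java.util.List;

public class ListenerDemo {
    interface MetaListener { void onEvent(String event); }

    static class Store {
        private final List<MetaListener> listeners = new ArrayList<>();

        void register(MetaListener l) { listeners.add(l); }

        // Single choke point: every DDL method notifies listeners here,
        // so index operations cannot be forgotten.
        private void fire(String event) {
            for (MetaListener l : listeners) l.onEvent(event);
        }

        void createTable(String name) { fire("CREATE_TABLE " + name); }
        void createIndex(String name) { fire("CREATE_INDEX " + name); }
        void dropIndex(String name)   { fire("DROP_INDEX " + name); }
    }

    public static void main(String[] args) {
        Store store = new Store();
        List<String> seen = new ArrayList<>();
        store.register(seen::add);
        store.createTable("t1");
        store.createIndex("t1_idx");
        store.dropIndex("t1_idx");
        System.out.println(seen); // all three DDL events reach the listener
    }
}
```

The attached patch appears to follow this shape, judging by the new AddIndexEvent/DropIndexEvent classes in its affected-files list.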



[jira] [Updated] (HIVE-3781) not all meta events call metastore event listener

2013-01-02 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-3781:
--

Attachment: HIVE-3781.D7731.3.patch

navis updated the revision "HIVE-3781 [jira] not all meta events call metastore 
event listener".
Reviewers: JIRA

  Rebased to trunk and confirmed TestMetaStoreEventListener succeeded


REVISION DETAIL
  https://reviews.facebook.net/D7731

AFFECTED FILES
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
  metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreEventListener.java
  metastore/src/java/org/apache/hadoop/hive/metastore/events/AddIndexEvent.java
  metastore/src/java/org/apache/hadoop/hive/metastore/events/PreEventContext.java
  metastore/src/java/org/apache/hadoop/hive/metastore/events/AlterIndexEvent.java
  metastore/src/java/org/apache/hadoop/hive/metastore/events/DropIndexEvent.java
  metastore/src/java/org/apache/hadoop/hive/metastore/events/PreAddIndexEvent.java
  metastore/src/java/org/apache/hadoop/hive/metastore/events/PreAlterIndexEvent.java
  metastore/src/java/org/apache/hadoop/hive/metastore/events/PreDropIndexEvent.java
  metastore/src/test/org/apache/hadoop/hive/metastore/DummyListener.java
  metastore/src/test/org/apache/hadoop/hive/metastore/TestMetaStoreEventListener.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
  ql/src/test/org/apache/hadoop/hive/ql/QTestUtil.java
  ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHive.java

To: JIRA, navis


> not all meta events call metastore event listener
> -
>
> Key: HIVE-3781
> URL: https://issues.apache.org/jira/browse/HIVE-3781
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.9.0
>Reporter: Sudhanshu Arora
>Assignee: Navis
> Attachments: hive.3781.3.patch, hive.3781.4.patch, 
> HIVE-3781.D7731.1.patch, HIVE-3781.D7731.2.patch, HIVE-3781.D7731.3.patch
>
>
> An event listener must be called for any DDL activity. For example, 
> create_index and drop_index today do not call the metastore event listener.



[jira] [Updated] (HIVE-3781) not all meta events call metastore event listener

2013-01-02 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3781:
-

Status: Open  (was: Patch Available)

TestMetaStoreEventListener is failing

> not all meta events call metastore event listener
> -
>
> Key: HIVE-3781
> URL: https://issues.apache.org/jira/browse/HIVE-3781
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.9.0
>Reporter: Sudhanshu Arora
>Assignee: Navis
> Attachments: hive.3781.3.patch, hive.3781.4.patch, 
> HIVE-3781.D7731.1.patch, HIVE-3781.D7731.2.patch
>
>
> An event listener must be called for any DDL activity. For example, 
> create_index and drop_index today do not call the metastore event listener.



[jira] [Commented] (HIVE-2693) Add DECIMAL data type

2013-01-02 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542080#comment-13542080
 ] 

Namit Jain commented on HIVE-2693:
--

Looks mostly good.

Most of the comments are minor - the only major ones are around lack of testing.


> Add DECIMAL data type
> -
>
> Key: HIVE-2693
> URL: https://issues.apache.org/jira/browse/HIVE-2693
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Processor, Types
>Affects Versions: 0.10.0
>Reporter: Carl Steinbach
>Assignee: Prasad Mujumdar
> Attachments: 2693_7.patch, 2693_8.patch, 2693_fix_all_tests1.patch, 
> HIVE-2693-10.patch, HIVE-2693-11.patch, HIVE-2693-12-SortableSerDe.patch, 
> HIVE-2693-13.patch, HIVE-2693-14.patch, HIVE-2693-15.patch, 
> HIVE-2693-16.patch, HIVE-2693-1.patch.txt, HIVE-2693-all.patch, 
> HIVE-2693.D7683.1.patch, HIVE-2693-fix.patch, HIVE-2693.patch, 
> HIVE-2693-take3.patch, HIVE-2693-take4.patch
>
>
> Add support for the DECIMAL data type. HIVE-2272 (TIMESTAMP) provides a nice 
> template for how to do this.
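For context on why an exact DECIMAL type matters, the classic precision problem with binary floating point can be shown with plain java.math.BigDecimal (illustrative only, not Hive's implementation):

```java
import java.math.BigDecimal;

public class DecimalDemo {
    public static void main(String[] args) {
        // Binary doubles cannot represent 0.1 exactly, so sums drift:
        double d = 0.1 + 0.2;
        System.out.println(d); // prints 0.30000000000000004

        // BigDecimal keeps exact decimal digits, as a DECIMAL type should:
        BigDecimal b = new BigDecimal("0.1").add(new BigDecimal("0.2"));
        System.out.println(b); // prints 0.3
    }
}
```

This is the kind of rounding error that makes DOUBLE unsuitable for monetary columns and motivates a first-class DECIMAL type.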



[jira] [Updated] (HIVE-2693) Add DECIMAL data type

2013-01-02 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-2693:
-

Status: Open  (was: Patch Available)

Comments on Phabricator.

> Add DECIMAL data type
> -
>
> Key: HIVE-2693
> URL: https://issues.apache.org/jira/browse/HIVE-2693
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Processor, Types
>Affects Versions: 0.10.0
>Reporter: Carl Steinbach
>Assignee: Prasad Mujumdar
> Attachments: 2693_7.patch, 2693_8.patch, 2693_fix_all_tests1.patch, 
> HIVE-2693-10.patch, HIVE-2693-11.patch, HIVE-2693-12-SortableSerDe.patch, 
> HIVE-2693-13.patch, HIVE-2693-14.patch, HIVE-2693-15.patch, 
> HIVE-2693-16.patch, HIVE-2693-1.patch.txt, HIVE-2693-all.patch, 
> HIVE-2693.D7683.1.patch, HIVE-2693-fix.patch, HIVE-2693.patch, 
> HIVE-2693-take3.patch, HIVE-2693-take4.patch
>
>
> Add support for the DECIMAL data type. HIVE-2272 (TIMESTAMP) provides a nice 
> template for how to do this.



[jira] [Updated] (HIVE-3781) not all meta events call metastore event listener

2013-01-02 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3781:
-

Attachment: hive.3781.4.patch

> not all meta events call metastore event listener
> -
>
> Key: HIVE-3781
> URL: https://issues.apache.org/jira/browse/HIVE-3781
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.9.0
>Reporter: Sudhanshu Arora
>Assignee: Navis
> Attachments: hive.3781.3.patch, hive.3781.4.patch, 
> HIVE-3781.D7731.1.patch, HIVE-3781.D7731.2.patch
>
>
> An event listener must be called for any DDL activity. For example, 
> create_index and drop_index today do not call the metastore event listener.
