[jira] [Commented] (HIVE-5700) enforce single date format for partition column storage

2013-11-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13817106#comment-13817106
 ] 

Hive QA commented on HIVE-5700:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12612776/HIVE-5700.01.patch

{color:green}SUCCESS:{color} +1 4594 tests passed

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/205/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/205/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12612776

 enforce single date format for partition column storage
 ---

 Key: HIVE-5700
 URL: https://issues.apache.org/jira/browse/HIVE-5700
 Project: Hive
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-5700.01.patch, HIVE-5700.patch


 Inspired by HIVE-5286.
 Partition columns for dates should be stored either as an integer or in a 
 fixed representation, e.g. yyyy-mm-dd. The external representation can remain 
 varied, as it is now.
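
A minimal sketch of the idea, not the attached patch: varied external date strings are normalized to one fixed yyyy-mm-dd form before being stored. The class name PartitionDateNormalizer and the lenient input pattern are assumptions made for illustration only.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of Hive: canonicalizes a date partition value.
class PartitionDateNormalizer {
    // Lenient input pattern (an assumption): one- or two-digit month and day.
    private static final DateTimeFormatter INPUT = DateTimeFormatter.ofPattern("uuuu-M-d");

    // Returns the single stored form yyyy-mm-dd.
    // Throws java.time.format.DateTimeParseException if the text is not a date.
    static String normalize(String external) {
        LocalDate date = LocalDate.parse(external.trim(), INPUT);
        return date.format(DateTimeFormatter.ISO_LOCAL_DATE);
    }

    public static void main(String[] args) {
        System.out.println(normalize("2013-1-5"));
        System.out.println(normalize("2013-11-08"));
    }
}
```

Storing only the normalized string means two spellings of the same day cannot end up as two different partitions.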



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HIVE-5581) Implement vectorized year/month/day... etc. for string arguments

2013-11-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13817130#comment-13817130
 ] 

Hive QA commented on HIVE-5581:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12612780/HIVE-5581.3.patch

{color:green}SUCCESS:{color} +1 4603 tests passed

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/206/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/206/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12612780

 Implement vectorized year/month/day... etc. for string arguments
 

 Key: HIVE-5581
 URL: https://issues.apache.org/jira/browse/HIVE-5581
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: 0.13.0
Reporter: Eric Hanson
Assignee: Teddy Choi
 Attachments: HIVE-5581.1.patch.txt, HIVE-5581.2.patch, 
 HIVE-5581.3.patch


 Functions year(), month(), day(), weekofyear(), hour(), minute(), second() 
 need to be implemented for string arguments in vectorized mode. 
 They already work for timestamp arguments.
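
As a rough illustration of what a batch-at-a-time year() over strings involves, the sketch below uses plain arrays in place of Hive's column vectors. The class and method names are hypothetical, and it assumes that an unparseable string should yield NULL rather than a value.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;

// Illustrative only: plain arrays stand in for Hive's column vectors.
class StringYearBatchSketch {
    // Writes the year of in[i] to out[i]; sets isNull[i] when in[i] is null
    // or is not a valid yyyy-mm-dd date, so no stale value is reported.
    static void yearOfStrings(String[] in, long[] out, boolean[] isNull) {
        for (int i = 0; i < in.length; i++) {
            out[i] = 0;
            isNull[i] = true;
            if (in[i] == null) {
                continue;
            }
            try {
                out[i] = LocalDate.parse(in[i]).getYear();
                isNull[i] = false;
            } catch (DateTimeParseException e) {
                // leave the row marked NULL
            }
        }
    }

    public static void main(String[] args) {
        String[] in = {"2013-11-08", "not a date"};
        long[] out = new long[in.length];
        boolean[] isNull = new boolean[in.length];
        yearOfStrings(in, out, isNull);
        System.out.println(out[0] + " " + isNull[1]);
    }
}
```

Resetting the output slot and the NULL flag for every row is what keeps a bad string from leaking a value left over from an earlier batch.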





[jira] [Commented] (HIVE-3777) add a property in the partition to figure out if stats are accurate

2013-11-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13817188#comment-13817188
 ] 

Hive QA commented on HIVE-3777:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12612797/HIVE-3777.5.patch

{color:green}SUCCESS:{color} +1 4595 tests passed

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/208/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/208/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12612797

 add a property in the partition to figure out if stats are accurate
 ---

 Key: HIVE-3777
 URL: https://issues.apache.org/jira/browse/HIVE-3777
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.13.0
Reporter: Namit Jain
Assignee: Ashutosh Chauhan
 Attachments: HIVE-3777.2.patch, HIVE-3777.2.patch, HIVE-3777.3.patch, 
 HIVE-3777.4.patch, HIVE-3777.5.patch, HIVE-3777.patch


 Currently, the stats task tries to update the statistics of the table/partition
 being updated after the table/partition is loaded. If updating these stats
 fails (for any reason), the operation either succeeds (writing inaccurate
 stats) or fails, depending on whether hive.stats.reliable is set to true.
 This can be bad for applications that do not always care about reliable stats,
 since the query may have taken a long time to execute, only to fail at the end.
 Another property should be added to the partition: areStatsAccurate. If 
 hive.stats.reliable is set to false and stats could not be computed correctly,
 the operation would still succeed and update the stats, but set
 areStatsAccurate to false.
 If the application cares about accurate stats, they can be obtained in the 
 background.
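
The proposed behavior can be sketched as follows. This is not the committed patch: a plain map stands in for the partition's parameters, and the "numRows" key and the method name are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

// Sketch only: a plain map stands in for the partition's parameters.
class StatsAccuracySketch {
    static final String ARE_STATS_ACCURATE = "areStatsAccurate";
    static final String NUM_ROWS = "numRows"; // hypothetical stats key

    // statsReliable plays the role of hive.stats.reliable.
    static void publishStats(Map<String, String> params, boolean statsReliable,
                             LongSupplier rowCounter) {
        try {
            params.put(NUM_ROWS, Long.toString(rowCounter.getAsLong()));
            params.put(ARE_STATS_ACCURATE, "true");
        } catch (RuntimeException e) {
            if (statsReliable) {
                throw e; // reliable stats requested: fail the whole operation
            }
            // Otherwise the operation succeeds, but the stats are flagged.
            params.put(ARE_STATS_ACCURATE, "false");
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        publishStats(params, false, () -> 42L);
        System.out.println(params);
    }
}
```

A reader of the partition can then check the flag and decide whether to recompute stats in the background.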





[jira] [Updated] (HIVE-5565) Limit Hive decimal type maximum precision and scale to 38

2013-11-08 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-5565:
--

Fix Version/s: (was: 0.13.0)
   Status: Open  (was: Patch Available)

 Limit Hive decimal type maximum precision and scale to 38
 -

 Key: HIVE-5565
 URL: https://issues.apache.org/jira/browse/HIVE-5565
 Project: Hive
  Issue Type: Task
  Components: Types
Affects Versions: 0.13.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Attachments: HIVE-5565.patch


 With HIVE-3976, the maximum precision is set to 65 and the maximum scale to 
 30. After discussion with several folks in the community, it was determined 
 that 38 as a maximum for both precision and scale is probably sufficient, in 
 addition to the potential performance boost that might become possible for 
 some implementations.
 This task is to make such a change. The change is expected to be trivial, but 
 it may impact many test cases. The reason for a separate JIRA is that the patch 
 in HIVE-3976 is already in good shape. Rather than destabilizing a bigger 
 patch, a dedicated patch will facilitate both reviews.
 The wiki document will be updated shortly.
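
For illustration, a check of the proposed limits might look like the sketch below. The class name and the lower bounds (precision at least 1, scale at least 0) are assumptions; only the maximum of 38 comes from the description above.

```java
// Hypothetical validator for decimal(precision, scale) type parameters.
class DecimalLimitsSketch {
    static final int MAX_PRECISION = 38;
    static final int MAX_SCALE = 38;

    // Throws IllegalArgumentException when either parameter is out of range.
    static void validate(int precision, int scale) {
        if (precision < 1 || precision > MAX_PRECISION) {
            throw new IllegalArgumentException(
                "precision must be between 1 and " + MAX_PRECISION + ": " + precision);
        }
        if (scale < 0 || scale > MAX_SCALE) {
            throw new IllegalArgumentException(
                "scale must be between 0 and " + MAX_SCALE + ": " + scale);
        }
    }

    public static void main(String[] args) {
        validate(38, 10); // accepted
        try {
            validate(65, 30); // the old maximums are now rejected
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```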





[jira] [Commented] (HIVE-3819) Creating a table on Hive without Hadoop daemons running returns a misleading error

2013-11-08 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13817337#comment-13817337
 ] 

Xuefu Zhang commented on HIVE-3819:
---

[~mgrover] were you able to reproduce? If not, I think we can close this one. 
Thanks.

 Creating a table on Hive without Hadoop daemons running returns a misleading 
 error
 --

 Key: HIVE-3819
 URL: https://issues.apache.org/jira/browse/HIVE-3819
 Project: Hive
  Issue Type: Bug
  Components: CLI, Metastore
Reporter: Mark Grover
Assignee: Xuefu Zhang

 I was running Hive without the underlying Hadoop daemons running. 
 Hadoop was configured to run in pseudo-distributed mode. However, when I 
 tried to create a Hive table, I got this rather misleading error:
 {code}
 FAILED: Error in metadata: java.lang.RuntimeException: Unable to instantiate 
 org.apache.hadoop.hive.metastore.HiveMetaStoreClient
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.DDLTask
 {code}
 We should look into making this error message less misleading (more about 
 hadoop daemons not running instead of metastore client not being 
 instantiable).
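
One possible shape for a less misleading message is sketched below. It assumes, without confirmation from this issue, that a java.net.ConnectException sits somewhere in the cause chain when the daemons are down; the class name and message wording are hypothetical.

```java
import java.net.ConnectException;

// Hypothetical helper: turns a nested failure into a more direct hint.
class MetastoreErrorHintSketch {
    // Walks the cause chain; if a connection failure is found (an assumption
    // about the root cause), says so instead of echoing the outer message.
    static String describe(Throwable error) {
        for (Throwable t = error; t != null; t = t.getCause()) {
            if (t instanceof ConnectException) {
                return "Could not connect to Hadoop (" + t.getMessage()
                    + "). Are the Hadoop daemons running?";
            }
        }
        return String.valueOf(error.getMessage());
    }

    public static void main(String[] args) {
        Throwable nested = new RuntimeException(
            "Unable to instantiate metastore client",
            new ConnectException("Connection refused"));
        System.out.println(describe(nested));
    }
}
```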





[jira] [Commented] (HIVE-3819) Creating a table on Hive without Hadoop daemons running returns a misleading error

2013-11-08 Thread Mark Grover (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13817348#comment-13817348
 ] 

Mark Grover commented on HIVE-3819:
---

Sorry, Xuefu, I haven't had the time. If you can't reproduce this, please go 
ahead and mark this as "Can't reproduce".

Thanks for checking!



[jira] [Updated] (HIVE-3777) add a property in the partition to figure out if stats are accurate

2013-11-08 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3777:
---

   Resolution: Fixed
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Thejas, for the review!

 add a property in the partition to figure out if stats are accurate
 ---

 Key: HIVE-3777
 URL: https://issues.apache.org/jira/browse/HIVE-3777
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.13.0
Reporter: Namit Jain
Assignee: Ashutosh Chauhan
 Fix For: 0.13.0

 Attachments: HIVE-3777.2.patch, HIVE-3777.2.patch, HIVE-3777.3.patch, 
 HIVE-3777.4.patch, HIVE-3777.5.patch, HIVE-3777.patch


 Currently, the stats task tries to update the statistics of the table/partition
 being updated after the table/partition is loaded. If updating these stats
 fails (for any reason), the operation either succeeds (writing inaccurate
 stats) or fails, depending on whether hive.stats.reliable is set to true.
 This can be bad for applications that do not always care about reliable stats,
 since the query may have taken a long time to execute, only to fail at the end.
 Another property should be added to the partition: areStatsAccurate. If 
 hive.stats.reliable is set to false and stats could not be computed correctly,
 the operation would still succeed and update the stats, but set
 areStatsAccurate to false.
 If the application cares about accurate stats, they can be obtained in the 
 background.





[jira] [Resolved] (HIVE-5642) Exception in UDFs with large number of arguments.

2013-11-08 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan resolved HIVE-5642.


   Resolution: Fixed
Fix Version/s: 0.13.0

Fixed via HIVE-5604

 Exception in UDFs with large number of arguments.
 -

 Key: HIVE-5642
 URL: https://issues.apache.org/jira/browse/HIVE-5642
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Fix For: 0.13.0


 Such UDFs will mostly be custom UDFs, but if they are not supported in vector 
 mode, we should fall back to non-vector mode.
 {code}
 Caused by: java.lang.ArrayIndexOutOfBoundsException: 3
   at org.apache.hadoop.hive.ql.exec.vector.VectorExpressionDescriptor$Builder.setArgumentType(VectorExpressionDescriptor.java:147)
   at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressionForUdf(VectorizationContext.java:431)
   at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUdfVectorExpression(VectorizationContext.java:545)
   at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:248)
   at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.createVectorExpression(VectorizationContext.java:460)
 {code}
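
The fallback idea can be sketched as a bounds check made before the fixed-size argument array is filled. This is not the HIVE-5604 fix: the class, the method, and the limit of three slots (inferred only from the index 3 in the trace) are assumptions.

```java
import java.util.Optional;

// Hypothetical stand-in for a descriptor builder with a fixed number of slots.
class VectorUdfFallbackSketch {
    static final int MAX_VECTOR_ARGS = 3; // assumed from the index in the trace

    // Returns the argument slots for vectorized execution, or empty to signal
    // that the caller should fall back to non-vector mode.
    static Optional<String[]> describeArgs(String[] argTypes) {
        if (argTypes.length > MAX_VECTOR_ARGS) {
            return Optional.empty();
        }
        String[] slots = new String[MAX_VECTOR_ARGS];
        for (int i = 0; i < argTypes.length; i++) {
            slots[i] = argTypes[i];
        }
        return Optional.of(slots);
    }

    public static void main(String[] args) {
        String[] many = {"long", "long", "double", "string"};
        System.out.println(describeArgs(many).isPresent()
            ? "vectorized" : "fall back to non-vector mode");
    }
}
```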





[jira] [Updated] (HIVE-5779) Subquery in where clause with distinct fails with mapjoin turned on with serialization error.

2013-11-08 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-5779:
---

Status: Open  (was: Patch Available)

 Subquery in where clause with distinct fails with mapjoin turned on with 
 serialization error.
 -

 Key: HIVE-5779
 URL: https://issues.apache.org/jira/browse/HIVE-5779
 Project: Hive
  Issue Type: Bug
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-5779.2.patch, HIVE-5779.patch








[jira] [Updated] (HIVE-5779) Subquery in where clause with distinct fails with mapjoin turned on with serialization error.

2013-11-08 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-5779:
---

Status: Patch Available  (was: Open)



[jira] [Updated] (HIVE-5779) Subquery in where clause with distinct fails with mapjoin turned on with serialization error.

2013-11-08 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-5779:
---

Attachment: HIVE-5779.2.patch

Re-upload for Hive QA to pick up.



[jira] [Updated] (HIVE-5691) Intermediate columns are incorrectly initialized for partitioned tables.

2013-11-08 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-5691:
---

Status: Open  (was: Patch Available)

 Intermediate columns are incorrectly initialized for partitioned tables.
 

 Key: HIVE-5691
 URL: https://issues.apache.org/jira/browse/HIVE-5691
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-5691.1.patch, HIVE-5691.2.patch, HIVE-5691.3.patch


 Intermediate columns are incorrectly initialized for partitioned tables. The 
 same tablescan operator can be used for multiple partitions. The vectorizer 
 doesn't initialize them for all partition paths.





[jira] [Updated] (HIVE-5691) Intermediate columns are incorrectly initialized for partitioned tables.

2013-11-08 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-5691:
---

Attachment: HIVE-5691.4.patch

Another attempt for Hive QA.

 Intermediate columns are incorrectly initialized for partitioned tables.
 

 Key: HIVE-5691
 URL: https://issues.apache.org/jira/browse/HIVE-5691
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-5691.1.patch, HIVE-5691.2.patch, HIVE-5691.3.patch, 
 HIVE-5691.4.patch


 Intermediate columns are incorrectly initialized for partitioned tables. The 
 same tablescan operator can be used for multiple partitions. The vectorizer 
 doesn't initialize them for all partition paths.





[jira] [Updated] (HIVE-5691) Intermediate columns are incorrectly initialized for partitioned tables.

2013-11-08 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-5691:
---

Status: Patch Available  (was: Open)



[jira] [Updated] (HIVE-5657) TopN produces incorrect results with count(distinct)

2013-11-08 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-5657:
---

Status: Open  (was: Patch Available)

 TopN produces incorrect results with count(distinct)
 

 Key: HIVE-5657
 URL: https://issues.apache.org/jira/browse/HIVE-5657
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Priority: Critical
 Attachments: D13797.1.patch, D13797.2.patch, HIVE-5657.02.patch, 
 HIVE-5657.1.patch.txt, example.patch


 Attached patch illustrates the problem.
 limit_pushdown test has various other cases of aggregations and distincts, 
 incl. count-distinct, that work correctly (that said, src dataset is bad for 
 testing these things because every count, for example, produces one record 
 only), so something must be special about this.
 I am not very familiar with the distinct code and these nuances; if someone 
 knows a quick fix, feel free to take this; otherwise I will probably start 
 looking next week. 





[jira] [Commented] (HIVE-5657) TopN produces incorrect results with count(distinct)

2013-11-08 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13817450#comment-13817450
 ] 

Ashutosh Chauhan commented on HIVE-5657:


+1 
Can you create a follow-up jira for removing the unnecessary if(firstRow) from 
processOp()? It seems the work in that if block can be done in initializeOp().
Also, you need to re-upload your patch, since it seems Hive QA hasn't picked it 
up yet.
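
The suggested follow-up amounts to moving one-time setup out of the per-row path. The sketch below uses a hypothetical operator class; only the method names initializeOp() and processOp() come from the comment above, and it assumes the setup does not depend on the contents of the first row.

```java
// Hypothetical operator; not Hive's actual operator class.
class FirstRowInitSketch {
    private StringBuilder buffer; // one-time state
    private int rowsSeen;

    // One-time setup happens here, before any row arrives.
    void initializeOp() {
        buffer = new StringBuilder();
        rowsSeen = 0;
    }

    // Per-row work only; no if (firstRow) branch is needed.
    void processOp(String row) {
        buffer.append(row).append('\n');
        rowsSeen++;
    }

    int rowsSeen() {
        return rowsSeen;
    }

    public static void main(String[] args) {
        FirstRowInitSketch op = new FirstRowInitSketch();
        op.initializeOp();
        op.processOp("a");
        op.processOp("b");
        System.out.println(op.rowsSeen());
    }
}
```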



[jira] [Updated] (HIVE-4388) HBase tests fail against Hadoop 2

2013-11-08 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-4388:
---

Attachment: HIVE-4388.16.patch

v16 should fix those failures due to missing deps.

 HBase tests fail against Hadoop 2
 -

 Key: HIVE-4388
 URL: https://issues.apache.org/jira/browse/HIVE-4388
 Project: Hive
  Issue Type: Bug
  Components: HBase Handler
Reporter: Gunther Hagleitner
Assignee: Brock Noland
 Attachments: HIVE-4388-wip.txt, HIVE-4388.10.patch, 
 HIVE-4388.11.patch, HIVE-4388.12.patch, HIVE-4388.13.patch, 
 HIVE-4388.14.patch, HIVE-4388.15.patch, HIVE-4388.15.patch, 
 HIVE-4388.16.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, 
 HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, 
 HIVE-4388.patch


 Currently we're building by default against 0.92. When you run against hadoop 
 2 (-Dhadoop.mr.rev=23) builds fail because of: HBASE-5963.
 HIVE-3861 upgrades the version of hbase used. This will get you past the 
 problem in HBASE-5963 (which was fixed in 0.94.1) but fails with: HBASE-6396.





[jira] [Updated] (HIVE-5767) in SemanticAnalyzer#doPhase1, handling for TOK_UNION falls thru into TOK_INSERT

2013-11-08 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-5767:
---

Status: Open  (was: Patch Available)

+1 
Can you reupload the patch so that Hive QA gets to run on it?

 in SemanticAnalyzer#doPhase1, handling for TOK_UNION falls thru into 
 TOK_INSERT
 ---

 Key: HIVE-5767
 URL: https://issues.apache.org/jira/browse/HIVE-5767
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Priority: Trivial
 Attachments: HIVE-5767.patch


 I don't think it's intended. INSERT path consists of a big if statement which 
 prevents most of the code from executing for union case.
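
For readers unfamiliar with the pattern, the sketch below shows how a missing break lets one case run into the next. The token constants and method names are hypothetical and are not taken from SemanticAnalyzer.

```java
import java.util.ArrayList;
import java.util.List;

class SwitchFallThroughSketch {
    static final int TOK_UNION = 1;  // hypothetical token ids
    static final int TOK_INSERT = 2;

    static List<String> withFallThrough(int token) {
        List<String> steps = new ArrayList<>();
        switch (token) {
            case TOK_UNION:
                steps.add("union");
                // no break: execution continues into the TOK_INSERT case
            case TOK_INSERT:
                steps.add("insert");
                break;
            default:
                break;
        }
        return steps;
    }

    static List<String> withBreak(int token) {
        List<String> steps = new ArrayList<>();
        switch (token) {
            case TOK_UNION:
                steps.add("union");
                break; // the union case now ends here
            case TOK_INSERT:
                steps.add("insert");
                break;
            default:
                break;
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(withFallThrough(TOK_UNION));
        System.out.println(withBreak(TOK_UNION));
    }
}
```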





[jira] [Updated] (HIVE-5686) partition column type validation doesn't quite work for dates

2013-11-08 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-5686:
---

Status: Open  (was: Patch Available)

You need to re-upload the patch for Hive QA to kick in.

 partition column type validation doesn't quite work for dates
 -

 Key: HIVE-5686
 URL: https://issues.apache.org/jira/browse/HIVE-5686
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-5686.patch


 Another interesting issue...
 {noformat}
 hive> create table z(c string) partitioned by (i date,j date);
 OK
 Time taken: 0.099 seconds
 hive> alter table z add partition (i='2012-01-01', j='foo');
 FAILED: SemanticException [Error 10248]: Cannot add partition column j of type string as it cannot be converted to type date
 hive> alter table z add partition (i='2012-01-01', j=date 'foo');
 OK
 Time taken: 0.119 seconds
 {noformat}
 The fake date is caught in normal queries:
 {noformat}
 hive> select * from z where j == date 'foo';
 FAILED: SemanticException Unable to convert date literal string to date value.
 {noformat}





[jira] [Commented] (HIVE-5626) enable metastore direct SQL for drop/similar queries

2013-11-08 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13817477#comment-13817477
 ] 

Ashutosh Chauhan commented on HIVE-5626:


Patch looks good. Thanks for refactoring. How did you test this one, by looking 
at logs? Is it possible to add junit tests for this, as we have for the other 
direct-SQL cases?

 enable metastore direct SQL for drop/similar queries
 

 Key: HIVE-5626
 URL: https://issues.apache.org/jira/browse/HIVE-5626
 Project: Hive
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Priority: Minor
 Attachments: HIVE-5626.01.patch, HIVE-5626.02.patch, HIVE-5626.patch


 Metastore direct SQL is currently disabled for any queries running inside an 
 external transaction (i.e. all modification queries, like dropping stuff).
 This was done to keep direct SQL strictly a performance optimization when 
 using Postgres, which unlike other RDBMS-es fails the tx on any syntax error; 
 so, if direct SQL is broken there's no way to fall back. So, it is disabled 
 for these cases.
 It is not as important because drop commands are rare, but we might want to 
 address that, either by some config setting or by making it work on 
 non-Postgres DBs.
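
The current behavior described above can be sketched as follows. The class and parameter names are hypothetical, and the two suppliers merely stand in for the direct SQL path and the regular metastore path.

```java
import java.util.function.Supplier;

// Hypothetical sketch of "try direct SQL, fall back to the regular path".
class DirectSqlFallbackSketch {
    static <T> T run(boolean inTransaction, Supplier<T> directSql, Supplier<T> regularPath) {
        if (inTransaction) {
            // On Postgres a failed statement aborts the surrounding transaction,
            // so a fallback would not be possible; skip direct SQL entirely.
            return regularPath.get();
        }
        try {
            return directSql.get();
        } catch (RuntimeException e) {
            // Outside a transaction a broken direct SQL query is harmless.
            return regularPath.get();
        }
    }

    public static void main(String[] args) {
        String result = DirectSqlFallbackSketch.<String>run(
            false,
            () -> { throw new IllegalStateException("direct SQL is broken"); },
            () -> "result from regular path");
        System.out.println(result);
    }
}
```

Enabling direct SQL for drops would mean relaxing the first branch, for example behind a config setting or only for non-Postgres databases, as the description suggests.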





[jira] [Updated] (HIVE-5626) enable metastore direct SQL for drop/similar queries

2013-11-08 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-5626:
---

Status: Open  (was: Patch Available)



precommit test

2013-11-08 Thread Xuefu Zhang
Hi,

It seems that pre-commit tests are not running. I wonder if anyone knows
why?

Thanks,
Xuefu


[jira] [Commented] (HIVE-5581) Implement vectorized year/month/day... etc. for string arguments

2013-11-08 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13817499#comment-13817499
 ] 

Eric Hanson commented on HIVE-5581:
---

I think there is still a possibility in a pathological case that you could 
return a value when you should return NULL. See my comment in the code review. 
It's almost there. Thanks Teddy.



Re: Review Request 14486: HIVE-5441: Async query execution doesn't return resultset status

2013-11-08 Thread Prasad Mujumdar


 On Nov. 8, 2013, 12:22 a.m., Thejas Nair wrote:
  ql/src/java/org/apache/hadoop/hive/ql/Driver.java, line 979
  https://reviews.apache.org/r/14486/diff/4/?file=380248#file380248line979
 
  OK, I see what you mean. I was looking at just the commented line, and 
  didn't look at the full view-diff page. Yes, the releaseLocks won't get 
  called. That looks like a problem.
  
  Thanks to Vaibhav for pointing it out to me.
 
 Brock Noland wrote:
 OK, thanks for responding. I'll open a jira.
 
 Brock Noland wrote:
 https://issues.apache.org/jira/browse/HIVE-5781

I guess the locks are acquired a bit later, just before the execution. We can 
actually get rid of that releaseLocks() call in case of a compiler error.
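
To illustrate the point, here is a small sketch that assumes, as guessed above, that locks are taken only after compilation succeeds. It is not Driver.java: the class, its methods, and the use of ReentrantLock are all stand-ins.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a driver that compiles first and locks later.
class CompileThenLockSketch {
    private final ReentrantLock lock = new ReentrantLock();

    String run(String sql) {
        // May throw; no lock is held yet, so there is nothing to release.
        String plan = compile(sql);
        lock.lock(); // acquired just before execution
        try {
            return "executed " + plan;
        } finally {
            lock.unlock(); // released on success and on execution failure
        }
    }

    boolean holdsLock() {
        return lock.isLocked();
    }

    static String compile(String sql) {
        if (sql == null || sql.trim().isEmpty()) {
            throw new IllegalArgumentException("nothing to compile");
        }
        return "plan(" + sql.trim() + ")";
    }

    public static void main(String[] args) {
        System.out.println(new CompileThenLockSketch().run("select 1"));
    }
}
```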


- Prasad


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14486/#review28467
---


On Nov. 6, 2013, 11:50 p.m., Prasad Mujumdar wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/14486/
 ---
 
 (Updated Nov. 6, 2013, 11:50 p.m.)
 
 
 Review request for hive.
 
 
 Bugs: HIVE-5441
 https://issues.apache.org/jira/browse/HIVE-5441
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Separate out the query compilation and execute that part synchronously.
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/Driver.java c09ffde 
   service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java 
 4ee1b74 
   service/src/test/org/apache/hive/service/cli/CLIServiceTest.java cd9d99a 
 
 Diff: https://reviews.apache.org/r/14486/diff/
 
 
 Testing
 ---
 
 Added test cases
 
 
 Thanks,
 
 Prasad Mujumdar
 




Re: precommit test

2013-11-08 Thread Brock Noland
AFAIK they are running just fine.  Last night that was not the case because
of a price spike in EC2 spot instances (going to be improved via
https://issues.apache.org/jira/browse/HIVE-5782).

Long story short, they are queued right now and we can eliminate the
queueing once https://issues.apache.org/jira/browse/HADOOP-9765 gets in.
Because of a limitation of the precommit system (fixed by HADOOP-9765) we
have a Rube Goldberg contraption. Right now, the jobs are queued here:
https://builds.apache.org/job/PreCommit-HIVE-Build/

That job simply posts the patches over to:
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/


On Fri, Nov 8, 2013 at 11:30 AM, Xuefu Zhang xzh...@cloudera.com wrote:

 Hi,

 It seems that pre-commit tests are not running. I wonder if anyone knows
 why?

 Thanks,
 Xuefu




-- 
Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org


[jira] [Commented] (HIVE-4574) XMLEncoder thread safety issues in openjdk7 causes HiveServer2 to be stuck

2013-11-08 Thread Ian Robertson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13817506#comment-13817506
 ] 

Ian Robertson commented on HIVE-4574:
-

For those tracking this issue, the underlying issue is now in the openjdk bug 
tracker: https://bugs.openjdk.java.net/browse/JDK-8028054 . It's currently 
scheduled for JDK8; not clear whether it will also be backported to a release 
of 7.

 XMLEncoder thread safety issues in openjdk7 causes HiveServer2 to be stuck
 --

 Key: HIVE-4574
 URL: https://issues.apache.org/jira/browse/HIVE-4574
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.11.0, 0.12.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-4574.1.patch


 In OpenJDK 7, a call to XMLEncoder.writeObject leads to calls to 
 java.beans.MethodFinder.findMethod(). The MethodFinder class is not thread safe 
 because it uses a static WeakHashMap that can be accessed from multiple 
 threads. See:
 http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/7-b147/com/sun/beans/finder/MethodFinder.java#46
 Concurrent access to HashMap implementations that are not thread safe can 
 sometimes result in infinite loops and other problems. If JDK 7 is in use, it 
 makes sense to synchronize calls to XMLEncoder.writeObject.
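
A minimal sketch of the workaround suggested above, assuming every serialization call can be routed through one helper. The class name SafeXmlWriter and the lock object are hypothetical and are not taken from the attached patch; java.beans.XMLEncoder is the standard JDK class.

```java
import java.beans.XMLEncoder;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not from the Hive patch): funnels every
// XMLEncoder.writeObject call through one process-wide lock.
class SafeXmlWriter {
    // A single static lock, because the unsafe state inside the JDK is static too.
    private static final Object ENCODER_LOCK = new Object();

    static String toXml(Object bean) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        synchronized (ENCODER_LOCK) {
            XMLEncoder encoder = new XMLEncoder(out);
            try {
                encoder.writeObject(bean);
            } finally {
                // close() flushes the stream and writes the closing tag.
                encoder.close();
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(toXml("hello"));
    }
}
```

The lock only helps if all callers in the process go through the same helper; code that creates its own XMLEncoder elsewhere would still race.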





[jira] [Commented] (HIVE-5700) enforce single date format for partition column storage

2013-11-08 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817507#comment-13817507
 ] 

Ashutosh Chauhan commented on HIVE-5700:


Can you create an RB (Review Board) entry for this?

 enforce single date format for partition column storage
 ---

 Key: HIVE-5700
 URL: https://issues.apache.org/jira/browse/HIVE-5700
 Project: Hive
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-5700.01.patch, HIVE-5700.patch


 inspired by HIVE-5286.
 Partition columns for dates should be stored either as an integer or in a fixed 
 representation, e.g. yyyy-mm-dd. The external representation can remain varied, as 
 it is now.





Re: precommit test

2013-11-08 Thread Xuefu Zhang
Thanks for the info, Brock.

Yesterday I submitted a couple of patches, expecting to get the result this
morning. However, they didn't run. I had to manually submit the request
this morning. Right now they are waiting in queue.

--Xuefu


On Fri, Nov 8, 2013 at 9:48 AM, Brock Noland br...@cloudera.com wrote:

 AFAIK they are running just fine.  Last night that was not the case because
 of a price spike in EC2 spot instances (going to be improved via
 https://issues.apache.org/jira/browse/HIVE-5782).

 Long story short, they are queued right now and we can eliminate the
 queueing once https://issues.apache.org/jira/browse/HADOOP-9765 gets in.
 Because of a limitation of the precommit system (fixed by HADOOP-9765) we
 have a rube-goldberg contraption. Right now, the jobs are queued here:
 https://builds.apache.org/job/PreCommit-HIVE-Build/

 That job, simply posts the patches over to:
 http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/


 On Fri, Nov 8, 2013 at 11:30 AM, Xuefu Zhang xzh...@cloudera.com wrote:

  Hi,
 
  It seems that pre-commit tests are not running. I wonder if anyone knows
  why?
 
  Thanks,
  Xuefu
 



 --
 Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org



Review Request 15359: HIVE-5700 enforce single date format for partition column storage

2013-11-08 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15359/
---

Review request for hive and Ashutosh Chauhan.


Repository: hive-git


Description
---

see JIRA


Diffs
-

  metastore/scripts/upgrade/mysql/upgrade-0.12.0-to-0.13.0.mysql.sql 04e4a87 
  metastore/scripts/upgrade/oracle/upgrade-0.12.0-to-0.13.0.oracle.sql 8847d3e 
  metastore/scripts/upgrade/postgres/upgrade-0.12.0-to-0.13.0.postgres.sql 
01cbe76 
  ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java 46d1fac 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 5305537 

Diff: https://reviews.apache.org/r/15359/diff/


Testing
---

tests; manual verification of mysql and psql scripts (Oracle tbd)


Thanks,

Sergey Shelukhin



[jira] [Commented] (HIVE-4388) HBase tests fail against Hadoop 2

2013-11-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817558#comment-13817558
 ] 

Hive QA commented on HIVE-4388:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12612848/HIVE-4388.16.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 4597 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver_cascade_dbdrop_hadoop20
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/209/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/209/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests failed with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12612848

 HBase tests fail against Hadoop 2
 -

 Key: HIVE-4388
 URL: https://issues.apache.org/jira/browse/HIVE-4388
 Project: Hive
  Issue Type: Bug
  Components: HBase Handler
Reporter: Gunther Hagleitner
Assignee: Brock Noland
 Attachments: HIVE-4388-wip.txt, HIVE-4388.10.patch, 
 HIVE-4388.11.patch, HIVE-4388.12.patch, HIVE-4388.13.patch, 
 HIVE-4388.14.patch, HIVE-4388.15.patch, HIVE-4388.15.patch, 
 HIVE-4388.16.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, 
 HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, 
 HIVE-4388.patch


 Currently we're building against HBase 0.92 by default. When you run against Hadoop 
 2 (-Dhadoop.mr.rev=23), builds fail because of HBASE-5963.
 HIVE-3861 upgrades the version of HBase used. This gets you past the 
 problem in HBASE-5963 (which was fixed in 0.94.1), but the build then fails with HBASE-6396.





[jira] [Commented] (HIVE-5700) enforce single date format for partition column storage

2013-11-08 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817559#comment-13817559
 ] 

Sergey Shelukhin commented on HIVE-5700:


https://reviews.apache.org/r/15359/

 enforce single date format for partition column storage
 ---

 Key: HIVE-5700
 URL: https://issues.apache.org/jira/browse/HIVE-5700
 Project: Hive
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-5700.01.patch, HIVE-5700.patch


 inspired by HIVE-5286.
 Partition columns for dates should be stored either as an integer or in a fixed 
 representation, e.g. yyyy-mm-dd. The external representation can remain varied, as 
 it is now.





[jira] [Commented] (HIVE-5626) enable metastore direct SQL for drop/similar queries

2013-11-08 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817564#comment-13817564
 ] 

Sergey Shelukhin commented on HIVE-5626:


All the .q tests are run with the verifying object store, which runs both SQL and 
JDO and compares the results. What do you mean by a separate test?

 enable metastore direct SQL for drop/similar queries
 

 Key: HIVE-5626
 URL: https://issues.apache.org/jira/browse/HIVE-5626
 Project: Hive
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Priority: Minor
 Attachments: HIVE-5626.01.patch, HIVE-5626.02.patch, HIVE-5626.patch


 Metastore direct SQL is currently disabled for any queries running inside an 
 external transaction (i.e. all modification queries, like dropping stuff).
 This was done to keep direct SQL strictly a performance optimization when 
 using Postgres, which, unlike other RDBMSes, fails the transaction on any syntax 
 error; if direct SQL is broken, there is no way to fall back, so it is disabled 
 for these cases.
 This is not as important because drop commands are rare, but we might want to 
 address it, either with a config setting or by making it work on 
 non-Postgres DBs.





Subscribe to List

2013-11-08 Thread Chinna Rao Lalam



[jira] [Created] (HIVE-5783) Native Parquet Support in Hive

2013-11-08 Thread Justin Coffey (JIRA)
Justin Coffey created HIVE-5783:
---

 Summary: Native Parquet Support in Hive
 Key: HIVE-5783
 URL: https://issues.apache.org/jira/browse/HIVE-5783
 Project: Hive
  Issue Type: New Feature
Reporter: Justin Coffey
Priority: Minor


Problem Statement:

Hive would be easier to use if it had native Parquet support. Our organization, 
Criteo, uses Hive extensively. Therefore we built the Parquet Hive integration 
and would like to now contribute that integration to Hive.

About Parquet:

Parquet is a columnar storage format for Hadoop and integrates with many Hadoop 
ecosystem tools such as Thrift, Avro, Hadoop MapReduce, Cascading, Pig, Drill, 
Crunch, and Hive. Pig, Crunch, and Drill all contain native Parquet integration.

Changes Details:

Parquet was built with dependency management in mind and therefore only a 
single Parquet jar will be added as a dependency.






[jira] [Commented] (HIVE-5657) TopN produces incorrect results with count(distinct)

2013-11-08 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817574#comment-13817574
 ] 

Sergey Shelukhin commented on HIVE-5657:


btw please don't commit, let me address my own comments on rb

 TopN produces incorrect results with count(distinct)
 

 Key: HIVE-5657
 URL: https://issues.apache.org/jira/browse/HIVE-5657
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Priority: Critical
 Attachments: D13797.1.patch, D13797.2.patch, HIVE-5657.02.patch, 
 HIVE-5657.1.patch.txt, example.patch


 Attached patch illustrates the problem.
 limit_pushdown test has various other cases of aggregations and distincts, 
 incl. count-distinct, that work correctly (that said, src dataset is bad for 
 testing these things because every count, for example, produces one record 
 only), so something must be special about this.
 I am not very familiar with distinct- code and these nuances; if someone 
 knows a quick fix feel free to take this, otherwise I will probably start 
 looking next week. 





[jira] [Commented] (HIVE-5657) TopN produces incorrect results with count(distinct)

2013-11-08 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817572#comment-13817572
 ] 

Sergey Shelukhin commented on HIVE-5657:


seems like too small an item to create JIRA for... also are you sure it can 
indeed be moved? see my TODO comment. As a matter of priorities I'd like to not 
spend time making sure ;)

 TopN produces incorrect results with count(distinct)
 

 Key: HIVE-5657
 URL: https://issues.apache.org/jira/browse/HIVE-5657
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Priority: Critical
 Attachments: D13797.1.patch, D13797.2.patch, HIVE-5657.02.patch, 
 HIVE-5657.1.patch.txt, example.patch


 Attached patch illustrates the problem.
 limit_pushdown test has various other cases of aggregations and distincts, 
 incl. count-distinct, that work correctly (that said, src dataset is bad for 
 testing these things because every count, for example, produces one record 
 only), so something must be special about this.
 I am not very familiar with distinct- code and these nuances; if someone 
 knows a quick fix feel free to take this, otherwise I will probably start 
 looking next week. 





[jira] [Updated] (HIVE-5686) partition column type validation doesn't quite work for dates

2013-11-08 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-5686:
---

Status: Patch Available  (was: Open)

 partition column type validation doesn't quite work for dates
 -

 Key: HIVE-5686
 URL: https://issues.apache.org/jira/browse/HIVE-5686
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-5686.01.patch, HIVE-5686.patch


 Another interesting issue...
 {noformat}
 hive> create table z(c string) partitioned by (i date,j date);
 OK
 Time taken: 0.099 seconds
 hive> alter table z add partition (i='2012-01-01', j='foo');
 FAILED: SemanticException [Error 10248]: Cannot add partition column j of 
 type string as it cannot be converted to type date
 hive> alter table z add partition (i='2012-01-01', j=date 'foo');
 OK
 Time taken: 0.119 seconds
 {noformat}
 The fake date is caught in normal queries:
 {noformat}
 hive> select * from z where j == date 'foo';
 FAILED: SemanticException Unable to convert date literal string to date value.
 {noformat}





[jira] [Updated] (HIVE-5767) in SemanticAnalyzer#doPhase1, handling for TOK_UNION falls thru into TOK_INSERT

2013-11-08 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-5767:
---

Attachment: HIVE-5767.01.patch

same patch

 in SemanticAnalyzer#doPhase1, handling for TOK_UNION falls thru into 
 TOK_INSERT
 ---

 Key: HIVE-5767
 URL: https://issues.apache.org/jira/browse/HIVE-5767
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Priority: Trivial
 Attachments: HIVE-5767.01.patch, HIVE-5767.patch


 I don't think this is intended. The INSERT path consists of a big if statement that 
 prevents most of the code from executing in the union case.
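
To illustrate the kind of bug described, here is a sketch of switch fall-through in Java. The token constants and handler bodies are hypothetical stand-ins, not the actual SemanticAnalyzer#doPhase1 code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a token switch; not the actual Hive code.
class FallThroughDemo {
    static final int TOK_UNION = 1;
    static final int TOK_INSERT = 2;

    // Missing break: TOK_UNION handling continues into the TOK_INSERT case.
    static List<String> buggy(int token) {
        List<String> steps = new ArrayList<String>();
        switch (token) {
        case TOK_UNION:
            steps.add("union");
            // no break here, so execution falls through
        case TOK_INSERT:
            steps.add("insert");
            break;
        default:
            break;
        }
        return steps;
    }

    // With the break in place, each token runs only its own handler.
    static List<String> fixed(int token) {
        List<String> steps = new ArrayList<String>();
        switch (token) {
        case TOK_UNION:
            steps.add("union");
            break;
        case TOK_INSERT:
            steps.add("insert");
            break;
        default:
            break;
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println("buggy: " + buggy(TOK_UNION));
        System.out.println("fixed: " + fixed(TOK_UNION));
    }
}
```

In the buggy variant the union token also runs the insert handler, which matches the symptom in the description: the fall-through is mostly harmless only because a large if statement in the insert path happens to skip the union case.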





[jira] [Commented] (HIVE-5700) enforce single date format for partition column storage

2013-11-08 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817579#comment-13817579
 ] 

Ashutosh Chauhan commented on HIVE-5700:


You also need to add an upgrade script for Derby. It would also be good to test 
your Oracle script, and to include a negative test case that rejects a date like 
2013-1-1 as a partition column value.
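
One way such a negative test could be backed is a format check that does not depend on how lenient the running JDK's date parsing is. The sketch below is an assumption, not what the patch does; StrictDateCheck and isCanonical are hypothetical names.

```java
import java.util.regex.Pattern;

// Hypothetical helper (not from the patch): accepts only zero-padded
// yyyy-mm-dd strings, regardless of JDK parsing leniency.
class StrictDateCheck {
    private static final Pattern CANONICAL = Pattern.compile("\\d{4}-\\d{2}-\\d{2}");

    static boolean isCanonical(String s) {
        if (s == null || !CANONICAL.matcher(s).matches()) {
            return false; // rejects "2013-1-1" and "foo" before any parsing
        }
        try {
            // Round trip: a string that parses to a different calendar date
            // (or fails to parse) is not in canonical form.
            return java.sql.Date.valueOf(s).toString().equals(s);
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isCanonical("2013-01-01"));
        System.out.println(isCanonical("2013-1-1"));
    }
}
```

Because the regular expression runs first, the result for 2013-1-1 is the same whether the JDK's Date.valueOf accepts or rejects non-padded input.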

 enforce single date format for partition column storage
 ---

 Key: HIVE-5700
 URL: https://issues.apache.org/jira/browse/HIVE-5700
 Project: Hive
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-5700.01.patch, HIVE-5700.patch


 inspired by HIVE-5286.
 Partition columns for dates should be stored either as an integer or in a fixed 
 representation, e.g. yyyy-mm-dd. The external representation can remain varied, as 
 it is now.





[jira] [Updated] (HIVE-5686) partition column type validation doesn't quite work for dates

2013-11-08 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-5686:
---

Attachment: HIVE-5686.01.patch

same patch

 partition column type validation doesn't quite work for dates
 -

 Key: HIVE-5686
 URL: https://issues.apache.org/jira/browse/HIVE-5686
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-5686.01.patch, HIVE-5686.patch


 Another interesting issue...
 {noformat}
 hive> create table z(c string) partitioned by (i date,j date);
 OK
 Time taken: 0.099 seconds
 hive> alter table z add partition (i='2012-01-01', j='foo');
 FAILED: SemanticException [Error 10248]: Cannot add partition column j of 
 type string as it cannot be converted to type date
 hive> alter table z add partition (i='2012-01-01', j=date 'foo');
 OK
 Time taken: 0.119 seconds
 {noformat}
 The fake date is caught in normal queries:
 {noformat}
 hive> select * from z where j == date 'foo';
 FAILED: SemanticException Unable to convert date literal string to date value.
 {noformat}





[jira] [Updated] (HIVE-5767) in SemanticAnalyzer#doPhase1, handling for TOK_UNION falls thru into TOK_INSERT

2013-11-08 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-5767:
---

Status: Patch Available  (was: Open)

 in SemanticAnalyzer#doPhase1, handling for TOK_UNION falls thru into 
 TOK_INSERT
 ---

 Key: HIVE-5767
 URL: https://issues.apache.org/jira/browse/HIVE-5767
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Priority: Trivial
 Attachments: HIVE-5767.01.patch, HIVE-5767.patch


 I don't think this is intended. The INSERT path consists of a big if statement that 
 prevents most of the code from executing in the union case.





Scheduling the next Hive Contributors Meeting

2013-11-08 Thread Carl Steinbach
We're long overdue for a Hive Contributors Meeting. Thejas has offered to
host the next meeting at Hortonworks on November 19th from 4-6pm. We will
have a Google Hangout or Webex setup for people who wish to attend
remotely. If you want to attend but can't because of a scheduling conflict
please let us know. If enough people fall into this category we will try to
reschedule.

Thanks.

Carl


[jira] [Commented] (HIVE-5286) Negative test date_literal1.q fails on java7 because the syntax is valid

2013-11-08 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817591#comment-13817591
 ] 

Xuefu Zhang commented on HIVE-5286:
---

+1, patch looks good to me.

 Negative test date_literal1.q fails on java7 because the syntax is valid
 

 Key: HIVE-5286
 URL: https://issues.apache.org/jira/browse/HIVE-5286
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Brock Noland
Assignee: Szehon Ho
 Attachments: HIVE-5286.patch


 {noformat}
 [brock@bigboy java-date]$ cat Test.java 
 import java.sql.Date;
 public class Test {
   public static void main(String[] args) throws Exception {
     System.out.println(Date.valueOf("2001-1-1"));
   }
 }
 [brock@bigboy java-date]$ exec-via-java6 java -cp . Test
 Exception in thread "main" java.lang.IllegalArgumentException
   at java.sql.Date.valueOf(Date.java:138)
   at Test.main(Test.java:4)
 [brock@bigboy java-date]$ exec-via-java7 java -cp . Test
 2001-01-01
 {noformat}





[jira] [Updated] (HIVE-5657) TopN produces incorrect results with count(distinct)

2013-11-08 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-5657:
---

Attachment: HIVE-5657.03.patch

trivial changes compared to 02

 TopN produces incorrect results with count(distinct)
 

 Key: HIVE-5657
 URL: https://issues.apache.org/jira/browse/HIVE-5657
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Priority: Critical
 Attachments: D13797.1.patch, D13797.2.patch, HIVE-5657.02.patch, 
 HIVE-5657.03.patch, HIVE-5657.1.patch.txt, example.patch


 Attached patch illustrates the problem.
 limit_pushdown test has various other cases of aggregations and distincts, 
 incl. count-distinct, that work correctly (that said, src dataset is bad for 
 testing these things because every count, for example, produces one record 
 only), so something must be special about this.
 I am not very familiar with distinct- code and these nuances; if someone 
 knows a quick fix feel free to take this, otherwise I will probably start 
 looking next week. 





[jira] [Updated] (HIVE-5657) TopN produces incorrect results with count(distinct)

2013-11-08 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-5657:
---

Status: Patch Available  (was: Open)

 TopN produces incorrect results with count(distinct)
 

 Key: HIVE-5657
 URL: https://issues.apache.org/jira/browse/HIVE-5657
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Priority: Critical
 Attachments: D13797.1.patch, D13797.2.patch, HIVE-5657.02.patch, 
 HIVE-5657.03.patch, HIVE-5657.1.patch.txt, example.patch


 Attached patch illustrates the problem.
 limit_pushdown test has various other cases of aggregations and distincts, 
 incl. count-distinct, that work correctly (that said, src dataset is bad for 
 testing these things because every count, for example, produces one record 
 only), so something must be special about this.
 I am not very familiar with distinct- code and these nuances; if someone 
 knows a quick fix feel free to take this, otherwise I will probably start 
 looking next week. 





[jira] [Commented] (HIVE-5700) enforce single date format for partition column storage

2013-11-08 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817604#comment-13817604
 ] 

Sergey Shelukhin commented on HIVE-5700:


I was hoping to avoid writing a Derby upgrade script... it's not a DB structure 
change. Do people really need to upgrade Derby? Hmm.

As for negative tests, the problem is that date validation on JDK 6 is going to 
kick in first and reject the date literal before this code executes... the only 
way to get past it is to run on JDK 7, for example. Let me think about a more 
isolated test.

 enforce single date format for partition column storage
 ---

 Key: HIVE-5700
 URL: https://issues.apache.org/jira/browse/HIVE-5700
 Project: Hive
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-5700.01.patch, HIVE-5700.patch


 inspired by HIVE-5286.
 Partition columns for dates should be stored either as an integer or in a fixed 
 representation, e.g. yyyy-mm-dd. The external representation can remain varied, as 
 it is now.





[jira] [Commented] (HIVE-5783) Native Parquet Support in Hive

2013-11-08 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817609#comment-13817609
 ] 

Carl Steinbach commented on HIVE-5783:
--

[~jcoffey] I added you to the list of Hive contributors on JIRA. Feel free to 
assign this ticket to yourself. Thanks.

 Native Parquet Support in Hive
 --

 Key: HIVE-5783
 URL: https://issues.apache.org/jira/browse/HIVE-5783
 Project: Hive
  Issue Type: New Feature
Reporter: Justin Coffey
Priority: Minor

 Problem Statement:
 Hive would be easier to use if it had native Parquet support. Our 
 organization, Criteo, uses Hive extensively. Therefore we built the Parquet 
 Hive integration and would like to now contribute that integration to Hive.
 About Parquet:
 Parquet is a columnar storage format for Hadoop and integrates with many 
 Hadoop ecosystem tools such as Thrift, Avro, Hadoop MapReduce, Cascading, 
 Pig, Drill, Crunch, and Hive. Pig, Crunch, and Drill all contain native 
 Parquet integration.
 Changes Details:
 Parquet was built with dependency management in mind and therefore only a 
 single Parquet jar will be added as a dependency.





Re: Scheduling the next Hive Contributors Meeting

2013-11-08 Thread Brock Noland
Hi,

Thanks Carl and Thejas! I would be attending remotely so the webex or
google hangout would be very much appreciated. Please let me know if there
is anything I can do to help enable either a webex or hangout!

The Apache Sentry (incubating)[1] community which depends on Hive would be
interested in briefly describing the project to the Hive community and
discuss how we can work together to move both projects forward!  As a side
note, there have been lively discussions on the integration of other
incubating projects therefore I'd just like to share that the changes
Sentry is interested in are very small in scope and unlikely to cause
disruption to the Hive community.

Cheers!
Brock

[1] http://incubator.apache.org/projects/sentry.html


On Fri, Nov 8, 2013 at 1:08 PM, Carl Steinbach c...@apache.org wrote:

 We're long overdue for a Hive Contributors Meeting. Thejas has offered to
 host the next meeting at Hortonworks on November 19th from 4-6pm. We will
 have a Google Hangout or Webex setup for people who wish to attend
 remotely. If you want to attend but can't because of a scheduling conflict
 please let us know. If enough people fall into this category we will try to
 reschedule.

 Thanks.

 Carl



[jira] [Created] (HIVE-5784) Group By Operator doesn't carry forward table aliases in its RowResolver

2013-11-08 Thread Harish Butani (JIRA)
Harish Butani created HIVE-5784:
---

 Summary: Group By Operator doesn't carry forward table aliases in 
its RowResolver
 Key: HIVE-5784
 URL: https://issues.apache.org/jira/browse/HIVE-5784
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Harish Butani
Assignee: Harish Butani


The following queries fail:
{code}
select b.key, count(*) from src b group by key
select key, count(*) from src b group by b.key
{code}
with a SemanticException; the select expression b.key (key in the 2nd query) 
is not resolved by the GBy RowResolver.

This is because the GBy RowResolver only supports resolving based on an 
AST.toStringTree match. The underlying issue is that a RowResolver doesn't allow 
multiple mappings to the same ColumnInfo.
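
As a rough illustration of what "multiple mappings to the same ColumnInfo" could look like, the sketch below registers one column under both its qualified and unqualified name. It is a simplified, hypothetical model: AliasResolverSketch and Column are invented for this example and do not reflect Hive's RowResolver API or the attached patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified resolver; not Hive's RowResolver.
class AliasResolverSketch {
    // Stand-in for a column descriptor.
    static final class Column {
        final String internalName;
        Column(String internalName) {
            this.internalName = internalName;
        }
    }

    private final Map<String, Column> byName = new HashMap<String, Column>();

    // Registers the same Column under "col" and, if an alias is given, "alias.col".
    void put(String tableAlias, String columnName, Column column) {
        byName.put(columnName.toLowerCase(), column);
        if (tableAlias != null) {
            byName.put((tableAlias + "." + columnName).toLowerCase(), column);
        }
    }

    // Returns the Column for a reference such as "key" or "b.key", or null.
    Column resolve(String reference) {
        return byName.get(reference.toLowerCase());
    }

    public static void main(String[] args) {
        AliasResolverSketch resolver = new AliasResolverSketch();
        resolver.put("b", "key", new Column("_col0"));
        System.out.println(resolver.resolve("b.key") == resolver.resolve("key"));
    }
}
```

This sketch ignores ambiguity: with two tables that both expose a column named key, the unqualified entry would be overwritten, so a real resolver would need to detect and reject that case.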





Review Request 15361: HIVE-5784: Group By Operator doesn't carry forward table aliases in its RowResolver

2013-11-08 Thread Harish Butani

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15361/
---

Review request for hive and Ashutosh Chauhan.


Bugs: hive-5784
https://issues.apache.org/jira/browse/hive-5784


Repository: hive-git


Description
---

The following queries fail:
select b.key, count(*) from src b group by key
select key, count(*) from src b group by b.key
with a SemanticException; the select expression b.key (key in the 2nd query) 
is not resolved by the GBy RowResolver.
This is because the GBy RowResolver only supports resolving based on an 
AST.toStringTree match. The underlying issue is that a RowResolver doesn't allow 
multiple mappings to the same ColumnInfo.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/parse/RowResolver.java 908546e 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 5305537 
  ql/src/test/queries/clientpositive/groupby_resolution.q PRE-CREATION 
  ql/src/test/results/clientpositive/groupby_resolution.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/15361/diff/


Testing
---

added test groupby_resolution.q


Thanks,

Harish Butani



Re: Scheduling the next Hive Contributors Meeting

2013-11-08 Thread Nitin Pawar
I am not a contributor, but I have been following what Hive has been doing for
the last couple of years.
I work out of India and would love to just sit back and listen to all the
new upcoming things (if that's allowed) :)


On Sat, Nov 9, 2013 at 1:08 AM, Brock Noland br...@cloudera.com wrote:

 Hi,

 Thanks Carl and Thejas! I would be attending remotely so the webex or
 google hangout would be very much appreciated. Please let me know if there
 is anything I can do to help enable either a webex or hangout!

 The Apache Sentry (incubating)[1] community which depends on Hive would be
 interested in briefly describing the project to the Hive community and
 discuss how we can work together to move both projects forward!  As a side
 note, there have been lively discussions on the integration of other
 incubating projects therefore I'd just like to share that the changes
 Sentry is interested in are very small in scope and unlikely to cause
 disruption to the Hive community.

 Cheers!
 Brock

 [1] http://incubator.apache.org/projects/sentry.html


 On Fri, Nov 8, 2013 at 1:08 PM, Carl Steinbach c...@apache.org wrote:

  We're long overdue for a Hive Contributors Meeting. Thejas has offered to
  host the next meeting at Hortonworks on November 19th from 4-6pm. We will
  have a Google Hangout or Webex setup for people who wish to attend
  remotely. If you want to attend but can't because of a scheduling
 conflict
  please let us know. If enough people fall into this category we will try
 to
  reschedule.
 
  Thanks.
 
  Carl
 




-- 
Nitin Pawar


[jira] [Commented] (HIVE-5356) Move arithmatic UDFs to generic UDF implementations

2013-11-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817616#comment-13817616
 ] 

Hive QA commented on HIVE-5356:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12612668/HIVE-5356.4.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 4636 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_udf
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_pmod
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_invalid_arithmetic_type
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/210/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/210/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests failed with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12612668

 Move arithmatic UDFs to generic UDF implementations
 ---

 Key: HIVE-5356
 URL: https://issues.apache.org/jira/browse/HIVE-5356
 Project: Hive
  Issue Type: Task
  Components: UDF
Affects Versions: 0.11.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Fix For: 0.13.0

 Attachments: HIVE-5356.1.patch, HIVE-5356.2.patch, HIVE-5356.3.patch, 
 HIVE-5356.4.patch, HIVE-5356.patch


 Currently, all of the arithmetic operators, such as add/sub/mult/div, are 
 implemented as old-style UDFs and java reflection is used to determine the 
 return type TypeInfos/ObjectInspectors, based on the return type of the 
 evaluate() method chosen for the expression. This works fine for types that 
 don't have type params.
 Hive decimal type participates in these operations just like int or double. 
 Different from double or int, however, decimal has precision and scale, which 
 cannot be determined by just looking at the return type (decimal) of the UDF 
 evaluate() method, even though the operands have certain precision/scale. 
 With the default of decimal without precision/scale, (10, 0) will be 
 the type params. This is certainly not desirable.
 To solve this problem, all of the arithmetic operators would need to be 
 implemented as GenericUDFs, which allow returning ObjectInspector during the 
 initialize() method. The object inspectors returned can carry type params, 
 from which the exact return type can be determined.
 It's worth mentioning that, for user UDF implemented in non-generic way, if 
 the return type of the chosen evaluate() method is decimal, the return type 
 actually has (10,0) as precision/scale, which might not be desirable. This 
 needs to be documented.
 This JIRA will cover minus, plus, divide, multiply, mod, and pmod, to limit 
 the scope of review. The remaining ones will be covered under HIVE-5706.
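
 To make the type-param problem concrete, here is a minimal Python sketch. It
 contrasts what reflection on a return type can report with a result type
 derived from the operands. The names DecimalType, reflected_return_type and
 add_result_type are hypothetical, and the addition rule and the precision cap
 are common SQL-style conventions assumed for illustration, not necessarily what
 the patch implements.

```python
from dataclasses import dataclass

MAX_PRECISION = 38  # assumed cap, for illustration only


@dataclass(frozen=True)
class DecimalType:
    precision: int
    scale: int


def reflected_return_type() -> DecimalType:
    # A non-generic UDF only exposes "decimal" as its return type, so the
    # default type params (10, 0) are all that can be inferred.
    return DecimalType(10, 0)


def add_result_type(a: DecimalType, b: DecimalType) -> DecimalType:
    # Assumed rule for a + b: keep the larger scale, keep the larger count
    # of integer digits, and add one digit for a possible carry.
    scale = max(a.scale, b.scale)
    int_digits = max(a.precision - a.scale, b.precision - b.scale)
    precision = min(int_digits + scale + 1, MAX_PRECISION)
    return DecimalType(precision, scale)
```

 Under these assumptions, adding decimal(17,5) and decimal(10,2) yields a type
 computed from the operands rather than the fixed (10, 0) default.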





[jira] [Updated] (HIVE-5784) Group By Operator doesn't carry forward table aliases in its RowResolver

2013-11-08 Thread Harish Butani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harish Butani updated HIVE-5784:


Attachment: HIVE-5784.1.patch

 Group By Operator doesn't carry forward table aliases in its RowResolver
 

 Key: HIVE-5784
 URL: https://issues.apache.org/jira/browse/HIVE-5784
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Harish Butani
Assignee: Harish Butani
 Attachments: HIVE-5784.1.patch


 The following queries fail:
 {code}
 select b.key, count(*) from src b group by key
 select key, count(*) from src b group by b.key
 {code}
 with a SemanticException; the select expression b.key (key in the 2nd query)
 is not resolved by the GBy RowResolver.
 This is because the GBy RowResolver only supports resolving based on an
 AST.toStringTree match. The underlying issue is that a RowResolver doesn't
 allow multiple mappings to the same ColumnInfo.
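
 The idea of letting several names map to one column entry can be sketched in
 Python as follows. AliasAwareResolver and its methods are hypothetical names
 for illustration only; this is not Hive's RowResolver and not the attached
 patch.

```python
class AliasAwareResolver:
    """Toy resolver in which several (alias, column) keys share one entry."""

    def __init__(self):
        self._by_key = {}

    def put(self, table_alias, column, info):
        # Register the qualified and the unqualified spelling, so that
        # `b.key` and `key` resolve to the same object. For the unqualified
        # name the first registration wins; a real resolver would have to
        # report ambiguity instead.
        self._by_key[(table_alias, column)] = info
        self._by_key.setdefault((None, column), info)

    def resolve(self, expr):
        # "b.key" -> alias "b", column "key"; "key" -> alias None.
        alias, _, column = expr.rpartition(".")
        return self._by_key.get((alias or None, column))
```

 With such a mapping, both spellings used in the two failing queries would
 resolve to the same column entry.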





[jira] [Commented] (HIVE-5784) Group By Operator doesn't carry forward table aliases in its RowResolver

2013-11-08 Thread Harish Butani (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817617#comment-13817617
 ] 

Harish Butani commented on HIVE-5784:
-

Review request: https://reviews.apache.org/r/15361/



[jira] [Updated] (HIVE-5784) Group By Operator doesn't carry forward table aliases in its RowResolver

2013-11-08 Thread Harish Butani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harish Butani updated HIVE-5784:


Status: Patch Available  (was: Open)



[jira] [Updated] (HIVE-5557) Push down qualifying Where clause predicates as join conditions

2013-11-08 Thread Harish Butani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harish Butani updated HIVE-5557:


Status: Open  (was: Patch Available)

 Push down qualifying Where clause predicates as join conditions
 ---

 Key: HIVE-5557
 URL: https://issues.apache.org/jira/browse/HIVE-5557
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Reporter: Harish Butani
Assignee: Harish Butani
 Attachments: HIVE-5557.1.patch, HIVE-5557.2.patch, HIVE-5557.3.patch, 
 HIVE-5557.4.patch


 See details in HIVE-





[jira] [Commented] (HIVE-5557) Push down qualifying Where clause predicates as join conditions

2013-11-08 Thread Harish Butani (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817620#comment-13817620
 ] 

Harish Butani commented on HIVE-5557:
-

Re-submit for Hive QA to pick up.



Re: Scheduling the next Hive Contributors Meeting

2013-11-08 Thread Brock Noland
Hi,

On Fri, Nov 8, 2013 at 1:43 PM, Nitin Pawar nitinpawar...@gmail.com wrote:

 I am not a contributor but a spectator of what Hive has been doing for the
 last couple of years.
 I work out of India and would love to just sit back and listen to all the
 new upcoming things (if that's allowed) :)


Not only allowed, but encouraged!  Great to have your interest!




 On Sat, Nov 9, 2013 at 1:08 AM, Brock Noland br...@cloudera.com wrote:

  Hi,
 
  Thanks Carl and Thejas! I would be attending remotely so the webex or
  google hangout would be very much appreciated. Please let me know if there
  is anything I can do to help enable either a webex or hangout!
 
  The Apache Sentry (incubating)[1] community which depends on Hive would be
  interested in briefly describing the project to the Hive community and
  discuss how we can work together to move both projects forward!  As a side
  note, there have been lively discussions on the integration of other
  incubating projects therefore I'd just like to share that the changes
  Sentry is interested in are very small in scope and unlikely to cause
  disruption to the Hive community.
 
  Cheers!
  Brock
 
  [1] http://incubator.apache.org/projects/sentry.html
 
 
  On Fri, Nov 8, 2013 at 1:08 PM, Carl Steinbach c...@apache.org wrote:
 
   We're long overdue for a Hive Contributors Meeting. Thejas has offered to
   host the next meeting at Hortonworks on November 19th from 4-6pm. We will
   have a Google Hangout or Webex setup for people who wish to attend
   remotely. If you want to attend but can't because of a scheduling conflict
   please let us know. If enough people fall into this category we will try to
   reschedule.
  
   Thanks.
  
   Carl
  
 



 --
 Nitin Pawar




-- 
Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org


[jira] [Updated] (HIVE-5557) Push down qualifying Where clause predicates as join conditions

2013-11-08 Thread Harish Butani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harish Butani updated HIVE-5557:


Status: Patch Available  (was: Open)



[jira] [Created] (HIVE-5785) Hive Metadata Thinks Table Partitions Are All Strings

2013-11-08 Thread Brad Ruderman (JIRA)
Brad Ruderman created HIVE-5785:
---

 Summary: Hive Metadata Thinks Table Partitions Are All Strings
 Key: HIVE-5785
 URL: https://issues.apache.org/jira/browse/HIVE-5785
 Project: Hive
  Issue Type: Bug
  Components: Database/Schema
Reporter: Brad Ruderman
Priority: Minor


hive (bruderman)> CREATE TABLE test (a int, b int) partitioned by (dt int);
OK
Time taken: 0.101 seconds
hive (bruderman)> desc test;
OK
col_name    data_type   comment
a   int
b   int
dt  int
Time taken: 0.093 seconds
hive (bruderman)> CREATE VIEW v_test AS SELECT * FROM test;
OK
a   b   dt
Time taken: 0.042 seconds
hive (bruderman)> desc v_test;
OK
col_name    data_type   comment
a   int
b   int
dt  string
Time taken: 0.098 seconds
hive (bruderman)>
--

When I have a table which is partitioned by an int/bigint, and I go to import
that table into Tableau, Tableau detects the partition column as a string,
so I cannot use it for incremental refreshes. I thought it was a Tableau bug;
however, when I create a view (select * from table) and then describe the view,
I see that the partition column is a string, so I think the issue is within
Hive.

The issue also shows up when interfacing through HiveServer1/HiveServer2:

(hs2)➜  pyhs2 git:(master) python test.py
None
[{'comment': None, 'columnName': 'a', 'type': 'INT_TYPE'}, {'comment': None, 
'columnName': 'b', 'type': 'INT_TYPE'}, {'comment': None, 'columnName': 'dt', 
'type': 'STRING_TYPE'}]

where the dt column is detected as a string.

The workaround is to create a view:

CREATE VIEW v_test AS 
SELECT t.*, CAST(t.dt as INT) dt_part
FROM test t

and use that. This issue extends beyond Tableau and affects anything using
HiveServer1/2.

Thanks!





[jira] [Commented] (HIVE-5784) Group By Operator doesn't carry forward table aliases in its RowResolver

2013-11-08 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817630#comment-13817630
 ] 

Xuefu Zhang commented on HIVE-5784:
---

[~rhbutani] This seems to be a dupe of HIVE-3107, which keeps the previous
discussions. There is no point in keeping two. Feel free to close this one and
take that one if you're going to work on this now.



[jira] [Commented] (HIVE-5562) Provide stripe level column statistics in ORC

2013-11-08 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817637#comment-13817637
 ] 

Gunther Hagleitner commented on HIVE-5562:
--

Committed to trunk. Thanks [~prasanth_j] and [~owen.omalley]!

 Provide stripe level column statistics in ORC
 -

 Key: HIVE-5562
 URL: https://issues.apache.org/jira/browse/HIVE-5562
 Project: Hive
  Issue Type: New Feature
  Components: File Formats
Affects Versions: 0.13.0
Reporter: Prasanth J
Assignee: Prasanth J
  Labels: orcfile
 Fix For: 0.13.0

 Attachments: HIVE-5562.1.patch.txt, HIVE-5562.2.patch.txt


 ORC maintains two levels of column statistics: index statistics (for every
 rowgroup) and file level column statistics for the entire file. It is useful
 to have stripe level column statistics, which will be intermediate between
 index and file statistics. The reason to maintain stripe level statistics is
 that the current input split computation logic is based on stripe boundaries.
 So if stripe level statistics are available and a stripe doesn't satisfy a
 predicate condition, then that entire stripe (and thus the split) can be
 eliminated from split computation.
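
 The elimination step can be sketched in Python as below. StripeStats and
 stripes_to_read are hypothetical names, and the sketch assumes each stripe
 carries min/max statistics for a single integer column; it is an illustration
 of the idea, not ORC's reader code.

```python
from dataclasses import dataclass


@dataclass
class StripeStats:
    offset: int      # where the stripe starts in the file
    min_value: int   # smallest column value seen in the stripe
    max_value: int   # largest column value seen in the stripe


def stripes_to_read(stripes, lower, upper):
    """Keep only stripes whose [min, max] range can overlap the
    predicate lower <= col <= upper; the rest are skipped entirely."""
    return [s for s in stripes
            if s.max_value >= lower and s.min_value <= upper]
```

 A stripe whose value range lies completely outside the predicate range never
 needs a split of its own.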





[jira] [Updated] (HIVE-5562) Provide stripe level column statistics in ORC

2013-11-08 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-5562:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)



[jira] [Updated] (HIVE-5745) TestHiveLogging is failing (at least on mac)

2013-11-08 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-5745:
-

Status: Patch Available  (was: Open)

 TestHiveLogging is failing (at least on mac)
 

 Key: HIVE-5745
 URL: https://issues.apache.org/jira/browse/HIVE-5745
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-5745.1.patch


 The path for the log file on my mac contains two slashes. That causes mvn
 install to fail.
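
 One way a check can tolerate such a path is sketched below in Python. It
 assumes the failure comes from comparing path strings literally, which the
 issue text does not state; same_log_path is a hypothetical helper and not
 part of the attached patch.

```python
import posixpath


def same_log_path(expected, actual):
    # normpath collapses repeated separators inside a path, so
    # "/tmp/user//hive.log" and "/tmp/user/hive.log" compare equal.
    return posixpath.normpath(expected) == posixpath.normpath(actual)
```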





[jira] [Updated] (HIVE-4388) HBase tests fail against Hadoop 2

2013-11-08 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-4388:
---

Attachment: HIVE-4388.17.patch

v17 should fix that last failure :)

 HBase tests fail against Hadoop 2
 -

 Key: HIVE-4388
 URL: https://issues.apache.org/jira/browse/HIVE-4388
 Project: Hive
  Issue Type: Bug
  Components: HBase Handler
Reporter: Gunther Hagleitner
Assignee: Brock Noland
 Attachments: HIVE-4388-wip.txt, HIVE-4388.10.patch, 
 HIVE-4388.11.patch, HIVE-4388.12.patch, HIVE-4388.13.patch, 
 HIVE-4388.14.patch, HIVE-4388.15.patch, HIVE-4388.15.patch, 
 HIVE-4388.16.patch, HIVE-4388.17.patch, HIVE-4388.patch, HIVE-4388.patch, 
 HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, 
 HIVE-4388.patch, HIVE-4388.patch


 Currently we're building by default against 0.92. When you run against hadoop 
 2 (-Dhadoop.mr.rev=23) builds fail because of: HBASE-5963.
 HIVE-3861 upgrades the version of hbase used. This will get you past the 
 problem in HBASE-5963 (which was fixed in 0.94.1) but fails with: HBASE-6396.





Re: Scheduling the next Hive Contributors Meeting

2013-11-08 Thread Gunther Hagleitner
Looking forward to it!

I would like to do a status update and quick demo of the Tez integration
work (HIVE-4660), if there is time and interest.

Thanks,
Gunther.







[jira] [Commented] (HIVE-5741) Hcatalog needs to be added to the binary tar

2013-11-08 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817653#comment-13817653
 ] 

Brock Noland commented on HIVE-5741:


On this item I think we should just keep the project structure as-is.

 Hcatalog needs to be added to the binary tar
 

 Key: HIVE-5741
 URL: https://issues.apache.org/jira/browse/HIVE-5741
 Project: Hive
  Issue Type: Sub-task
Reporter: Brock Noland







[jira] [Commented] (HIVE-5714) Separate reactor root or aggregator from parent pom

2013-11-08 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817652#comment-13817652
 ] 

Brock Noland commented on HIVE-5714:


[~abayer] would you have a couple minutes to share your best practices?

As I understand it the root pom should *only* do aggregation while the parent 
pom should do everything that is inherited by the modules. Is that correct?

 Separate reactor root or aggregator from parent pom
 ---

 Key: HIVE-5714
 URL: https://issues.apache.org/jira/browse/HIVE-5714
 Project: Hive
  Issue Type: Sub-task
Reporter: Brock Noland

 It's a best practice to have a separate reactor pom from parent pom. More 
 details in FLUME-2199.





[jira] [Commented] (HIVE-5564) Need to accomodate table decimal columns that were defined prior to HIVE-3976

2013-11-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817651#comment-13817651
 ] 

Hive QA commented on HIVE-5564:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12612688/HIVE-5564.4.patch

{color:green}SUCCESS:{color} +1 4595 tests passed

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/211/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/211/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12612688

 Need to accomodate table decimal columns that were defined prior to HIVE-3976
 -

 Key: HIVE-5564
 URL: https://issues.apache.org/jira/browse/HIVE-5564
 Project: Hive
  Issue Type: Task
  Components: Types
Affects Versions: 0.13.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Fix For: 0.13.0

 Attachments: HIVE-5564.1.patch, HIVE-5564.2.patch, HIVE-5564.3.patch, 
 HIVE-5564.4.patch, HIVE-5564.patch


 With HIVE-3976, decimal columns are stored with precision/scale, such as
 decimal(17,5), as the type name. However, such columns defined in Hive prior
 to HIVE-3976 have plain decimal as the type name. Those columns need to
 continue to work with a precision/scale of (10,0), per the functional doc.
 With the patch in HIVE-3976, we may get the following error message in such a
 case:
 {code}
 0: jdbc:hive2://localhost:1 desc dec;
 Error: Error while processing statement: FAILED: RuntimeException Decimal 
 type is specified without length: decimal:int (state=42000,code=4)
 {code}
 This issue will be addressed in this JIRA as a follow-up task.
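
 The backward-compatible behaviour described above can be sketched in Python
 as follows. parse_decimal_type is a hypothetical helper, not Hive's type
 parser; the only facts taken from the description are the two type-name forms
 and the (10,0) default.

```python
import re

DEFAULT_PRECISION, DEFAULT_SCALE = 10, 0  # default stated in the description

# Matches "decimal" or "decimal(p,s)".
_DECIMAL = re.compile(r"^decimal(?:\(\s*(\d+)\s*,\s*(\d+)\s*\))?$")


def parse_decimal_type(type_name):
    m = _DECIMAL.match(type_name.strip().lower())
    if m is None:
        raise ValueError(f"not a decimal type: {type_name!r}")
    if m.group(1) is None:
        # Legacy column: bare "decimal" falls back to the default instead
        # of failing with "specified without length".
        return DEFAULT_PRECISION, DEFAULT_SCALE
    return int(m.group(1)), int(m.group(2))
```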





[jira] [Commented] (HIVE-5601) NPE in ORC's PPD when using select * from table with where predicate

2013-11-08 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817654#comment-13817654
 ] 

Gunther Hagleitner commented on HIVE-5601:
--

Committed the patch to trunk. Haven't updated the 0.12 branch yet.

 NPE in ORC's PPD  when using select * from table with where predicate
 -

 Key: HIVE-5601
 URL: https://issues.apache.org/jira/browse/HIVE-5601
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Prasanth J
Assignee: Prasanth J
Priority: Critical
  Labels: ORC
 Attachments: HIVE-5601.4-branch-0.12.patch.txt, 
 HIVE-5601.5.patch.txt, HIVE-5601.branch-0.12.2.patch.txt, 
 HIVE-5601.branch-0.12.3.patch.txt, HIVE-5601.branch-0.12.4.patch.txt, 
 HIVE-5601.branch-12.1.patch.txt, HIVE-5601.trunk.1.patch.txt, 
 HIVE-5601.trunk.2.patch.txt, HIVE-5601.trunk.3.patch.txt, 
 HIVE-5601.trunk.4.patch.txt, HIVE-5601.trunk.5.patch.txt


 ORCInputFormat has a method findIncludedColumns() which returns a boolean
 array of included columns. In case of the following query
 {code}select * from qlog_orc where id1000 limit 10;{code}
 where all columns are selected, findIncludedColumns() returns null. This
 will result in an NPE when PPD is enabled. Following is the stack trace:
 {code}Caused by: java.lang.NullPointerException
   at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.planReadPartialDataStreams(RecordReaderImpl.java:2387)
   at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.readPartialDataStreams(RecordReaderImpl.java:2543)
   at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.readStripe(RecordReaderImpl.java:2200)
   at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.advanceStripe(RecordReaderImpl.java:2573)
   at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.advanceToNextRow(RecordReaderImpl.java:2615)
   at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.init(RecordReaderImpl.java:132)
   at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.rows(ReaderImpl.java:348)
   at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.init(OrcInputFormat.java:99)
   at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:241)
   at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:237)
   ... 8 more{code}
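
 A null-safe lookup for the included-columns array could look like the Python
 sketch below. is_column_included is a hypothetical helper; treating a missing
 array as "all columns included" is an assumption based on the select * case
 described above, and this is not the attached patch.

```python
def is_column_included(included, column_index):
    # `included is None` is taken to mean "every column was selected",
    # so callers never index into a missing array.
    if included is None:
        return True
    return bool(included[column_index])
```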





Re: Review Request 15151: Better error reporting by async threads in HiveServer2

2013-11-08 Thread Prasad Mujumdar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15151/#review28572
---


@Vaibhav, thanks for taking the issue forward and putting up a new patch!

I do have a high level comment on the approach. The 'status' returned by an HS2
RPC is supposed to be the status of that particular API's execution. In this
case, however, we are overloading the 'status' field to return the status of a
different call (i.e. ExecuteStatement()).
For example, if you call GetStatus() with a non-existing operation id then you
would get an error status. This error is for the failure of GetStatus() itself.
On the other hand, if you call GetStatus() for an async query that failed, then
you will also get the error status. However, this error is not for the current
GetStatus() operation, but for the last ExecuteStatement() operation.
The current implementation of the JDBC driver (or CLIClient in general) will
work with this, but perhaps it's not a clean way to implement it. Have you
considered adding a new field in the GetStatus response to return the error
status of the actual execute operation?
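
To illustrate the separation being suggested, here is a minimal Python sketch
of a response that keeps the two outcomes apart. Status, GetStatusResponse,
get_status and the field names are hypothetical and are not the HS2 Thrift
definitions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Status:
    ok: bool
    message: str = ""


@dataclass
class GetStatusResponse:
    status: Status                            # outcome of GetStatus() itself
    operation_state: Optional[str] = None     # state of the polled operation
    operation_error: Optional[Status] = None  # error of the earlier execute call


def get_status(operations, op_id):
    # `operations` maps an operation id to a (state, error message) pair.
    if op_id not in operations:
        # The status call itself failed: nothing is known about any query.
        return GetStatusResponse(Status(False, "unknown operation id"))
    state, error = operations[op_id]
    op_error = Status(False, error) if error else None
    return GetStatusResponse(Status(True), state, op_error)
```

A client can then tell "the status call failed" apart from "the status call
succeeded and reports that the query failed".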



- Prasad Mujumdar


On Nov. 1, 2013, 12:54 a.m., Vaibhav Gumashta wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/15151/
 ---
 
 (Updated Nov. 1, 2013, 12:54 a.m.)
 
 
 Review request for hive, Prasad Mujumdar and Thejas Nair.
 
 
 Bugs: HIVE-5230
 https://issues.apache.org/jira/browse/HIVE-5230
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 [HIVE-4617|https://issues.apache.org/jira/browse/HIVE-4617] provides support
 for async execution in HS2. Currently, when a background thread gets an error,
 the client can only poll for the operation state, and the error with its
 stacktrace is logged. However, it would be useful to provide a richer error
 response, like the Thrift API does with TStatus (which is constructed while
 building a Thrift response object).
 
 
 Diffs
 -
 
   service/src/java/org/apache/hive/service/cli/CLIService.java 1a7f338
   service/src/java/org/apache/hive/service/cli/CLIServiceClient.java 14ef54f
   service/src/java/org/apache/hive/service/cli/EmbeddedCLIServiceClient.java 9dca874
   service/src/java/org/apache/hive/service/cli/ICLIService.java f647ce6
   service/src/java/org/apache/hive/service/cli/OperationStatus.java PRE-CREATION
   service/src/java/org/apache/hive/service/cli/operation/Operation.java 6f4b8dc
   service/src/java/org/apache/hive/service/cli/operation/OperationManager.java bcdb67f
   service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java f6adf92
   service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 9df110e
   service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIServiceClient.java 9bb2a0f
   service/src/test/org/apache/hive/service/cli/CLIServiceTest.java d6caed1
   service/src/test/org/apache/hive/service/cli/thrift/ThriftCLIServiceTest.java ff7166d
 
 Diff: https://reviews.apache.org/r/15151/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Vaibhav Gumashta
 




[jira] [Commented] (HIVE-5754) NullPointerException when alter partition table and table does not exist

2013-11-08 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817678#comment-13817678
 ] 

Xuefu Zhang commented on HIVE-5754:
---

This seems to be a bug. However, did you try to reproduce with the latest trunk?

 NullPointerException when alter partition table and table does not exist
 

 Key: HIVE-5754
 URL: https://issues.apache.org/jira/browse/HIVE-5754
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.10.0
Reporter: Alexis Deltour

 I have a problem with my Oozie Hive action which cleans my Hive table. When
 my table doesn't exist and I alter a partition of the table, I get different
 messages with 2 versions of Hive:
 On CDH3 hive 0.7.1 :
 hive> ALTER TABLE mytable DROP IF EXISTS PARTITION (mypart='10');
 FAILED: Error in semantic analysis: Table not found mytable  
 -- Oozie action OK.
 On CDH4 hive 0.10.0 :
 hive> ALTER TABLE mytable DROP IF EXISTS PARTITION (mypart='10');
 FAILED: NullPointerException null  
 -- Oozie action in error.
 Is this a bug or a configuration problem?





[jira] [Commented] (HIVE-4388) HBase tests fail against Hadoop 2

2013-11-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817684#comment-13817684
 ] 

Hive QA commented on HIVE-4388:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12612876/HIVE-4388.17.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 4598 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucket_num_reducers
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/212/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/212/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests failed with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12612876



[jira] [Commented] (HIVE-4388) HBase tests fail against Hadoop 2

2013-11-08 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817694#comment-13817694
 ] 

Brock Noland commented on HIVE-4388:


Sweet, that test is flaky and not related.  [~hagleitn] or [~ashutoshc] should 
we get this one in?

 HBase tests fail against Hadoop 2
 -

 Key: HIVE-4388
 URL: https://issues.apache.org/jira/browse/HIVE-4388
 Project: Hive
  Issue Type: Bug
  Components: HBase Handler
Reporter: Gunther Hagleitner
Assignee: Brock Noland
 Attachments: HIVE-4388-wip.txt, HIVE-4388.10.patch, 
 HIVE-4388.11.patch, HIVE-4388.12.patch, HIVE-4388.13.patch, 
 HIVE-4388.14.patch, HIVE-4388.15.patch, HIVE-4388.15.patch, 
 HIVE-4388.16.patch, HIVE-4388.17.patch, HIVE-4388.patch, HIVE-4388.patch, 
 HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, 
 HIVE-4388.patch, HIVE-4388.patch


 Currently we're building by default against 0.92. When you run against hadoop 
 2 (-Dhadoop.mr.rev=23) builds fail because of: HBASE-5963.
 HIVE-3861 upgrades the version of hbase used. This will get you past the 
 problem in HBASE-5963 (which was fixed in 0.94.1) but fails with: HBASE-6396.





[jira] [Updated] (HIVE-4388) Upgrade HBase to 0.96

2013-11-08 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-4388:
---

Summary: Upgrade HBase to 0.96  (was: HBase tests fail against Hadoop 2)

 Upgrade HBase to 0.96
 -

 Key: HIVE-4388
 URL: https://issues.apache.org/jira/browse/HIVE-4388
 Project: Hive
  Issue Type: Bug
  Components: HBase Handler
Reporter: Gunther Hagleitner
Assignee: Brock Noland
 Attachments: HIVE-4388-wip.txt, HIVE-4388.10.patch, 
 HIVE-4388.11.patch, HIVE-4388.12.patch, HIVE-4388.13.patch, 
 HIVE-4388.14.patch, HIVE-4388.15.patch, HIVE-4388.15.patch, 
 HIVE-4388.16.patch, HIVE-4388.17.patch, HIVE-4388.patch, HIVE-4388.patch, 
 HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, 
 HIVE-4388.patch, HIVE-4388.patch


 Currently we're building by default against 0.92. When you run against hadoop 
 2 (-Dhadoop.mr.rev=23) builds fail because of: HBASE-5963.
 HIVE-3861 upgrades the version of hbase used. This will get you past the 
 problem in HBASE-5963 (which was fixed in 0.94.1) but fails with: HBASE-6396.





[jira] [Commented] (HIVE-5632) Eliminate splits based on SARGs using stripe statistics in ORC

2013-11-08 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817706#comment-13817706
 ] 

Gunther Hagleitner commented on HIVE-5632:
--

I've committed the test data file (orc_split_elim.orc) to trunk (in 
data/files/orc_split_elim.orc). That doesn't affect anything in the build, but 
now the pre-commit tests should be able to run.

 Eliminate splits based on SARGs using stripe statistics in ORC
 --

 Key: HIVE-5632
 URL: https://issues.apache.org/jira/browse/HIVE-5632
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.13.0
Reporter: Prasanth J
Assignee: Prasanth J
  Labels: orcfile
 Attachments: HIVE-5632.1.patch.txt, HIVE-5632.2.patch.txt, 
 HIVE-5632.3.patch.txt, orc_split_elim.orc


 HIVE-5562 provides stripe level statistics in ORC. Stripe level statistics 
 combined with predicate pushdown in ORC (HIVE-4246) can be used to eliminate 
 the stripes (and thereby splits) that don't satisfy the predicate condition. 
 This can greatly reduce unnecessary reads.
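
As a rough illustration of the idea described above (not the ORC reader's actual code), the sketch below keeps only the stripes whose min/max statistics can overlap a range predicate. The names `StripeStats` and `stripes_to_read` are hypothetical, and the predicate is assumed to be a simple closed range on a single integer column.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StripeStats:
    """Hypothetical per-stripe statistics for a single integer column."""
    stripe_id: int
    min_value: int
    max_value: int


def stripes_to_read(stripes: List[StripeStats], lo: int, hi: int) -> List[int]:
    """Return ids of stripes that may contain rows with lo <= value <= hi.

    A stripe is skipped only when its [min, max] range cannot overlap the
    predicate range, so a stripe holding a matching row is never dropped.
    """
    return [s.stripe_id for s in stripes
            if s.max_value >= lo and s.min_value <= hi]


# Three stripes with disjoint value ranges (made-up numbers).
stats = [
    StripeStats(0, 1, 100),
    StripeStats(1, 101, 200),
    StripeStats(2, 201, 300),
]
selected = stripes_to_read(stats, 150, 180)
print(selected)
```

Stripes that are filtered out this way never need to be turned into splits, which is where the saving in reads would come from.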





[jira] [Updated] (HIVE-5784) Group By Operator doesn't carry forward table aliases in its RowResolver

2013-11-08 Thread Harish Butani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harish Butani updated HIVE-5784:


Status: Open  (was: Patch Available)

 Group By Operator doesn't carry forward table aliases in its RowResolver
 

 Key: HIVE-5784
 URL: https://issues.apache.org/jira/browse/HIVE-5784
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Harish Butani
Assignee: Harish Butani
 Attachments: HIVE-5784.1.patch


 The following queries fail:
 {code}
 select b.key, count(*) from src b group by key
 select key, count(*) from src b group by b.key
 {code}
 with a SemanticException; the select expression b.key (key in the 2nd query) 
 is not resolved by the GBy RowResolver.
 This is because the GBy RowResolver only supports resolving based on an 
 AST.toStringTree match. The underlying issue is that a RowResolver doesn't 
 allow multiple mappings to the same ColumnInfo.
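
A toy model of the problem described above, assuming nothing about Hive's real RowResolver beyond what the description says: a resolver keyed on the exact expression string cannot find `b.key` when only `key` was registered, while one that registers both spellings for the same column can. Both classes and the `_col0` column name are hypothetical.

```python
from typing import Dict, Optional


class StringMatchResolver:
    """Resolves a select expression only by exact string match (toy model)."""

    def __init__(self) -> None:
        self._columns: Dict[str, str] = {}

    def put(self, expr: str, column: str) -> None:
        self._columns[expr] = column

    def get(self, expr: str) -> Optional[str]:
        return self._columns.get(expr)


class AliasAwareResolver(StringMatchResolver):
    """Registers both the qualified and unqualified spelling of a column."""

    def __init__(self, table_alias: str) -> None:
        super().__init__()
        self._alias = table_alias

    def put(self, expr: str, column: str) -> None:
        prefix = self._alias + "."
        bare = expr[len(prefix):] if expr.startswith(prefix) else expr
        # Two keys now map to the same column.
        super().put(bare, column)
        super().put(prefix + bare, column)


strict = StringMatchResolver()
strict.put("key", "_col0")        # registered from "group by key"
print(strict.get("b.key"))        # lookup for "select b.key"

aware = AliasAwareResolver("b")
aware.put("key", "_col0")
print(aware.get("b.key"))
```

The second resolver only works because it allows several names to point at one column, which is the capability the description says is missing.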





[jira] [Resolved] (HIVE-5784) Group By Operator doesn't carry forward table aliases in its RowResolver

2013-11-08 Thread Harish Butani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harish Butani resolved HIVE-5784.
-

Resolution: Duplicate

duplicate of HIVE-3107

 Group By Operator doesn't carry forward table aliases in its RowResolver
 

 Key: HIVE-5784
 URL: https://issues.apache.org/jira/browse/HIVE-5784
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Harish Butani
Assignee: Harish Butani
 Attachments: HIVE-5784.1.patch


 The following queries fail:
 {code}
 select b.key, count(*) from src b group by key
 select key, count(*) from src b group by b.key
 {code}
 with a SemanticException; the select expression b.key (key in the 2nd query) 
 is not resolved by the GBy RowResolver.
 This is because the GBy RowResolver only supports resolving based on an 
 AST.toStringTree match. The underlying issue is that a RowResolver doesn't 
 allow multiple mappings to the same ColumnInfo.





[jira] [Updated] (HIVE-5632) Eliminate splits based on SARGs using stripe statistics in ORC

2013-11-08 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-5632:
-

Status: Patch Available  (was: Open)

 Eliminate splits based on SARGs using stripe statistics in ORC
 --

 Key: HIVE-5632
 URL: https://issues.apache.org/jira/browse/HIVE-5632
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.13.0
Reporter: Prasanth J
Assignee: Prasanth J
  Labels: orcfile
 Attachments: HIVE-5632.1.patch.txt, HIVE-5632.2.patch.txt, 
 HIVE-5632.3.patch.txt, HIVE-5632.4.patch, orc_split_elim.orc


 HIVE-5562 provides stripe level statistics in ORC. Stripe level statistics 
 combined with predicate pushdown in ORC (HIVE-4246) can be used to eliminate 
 the stripes (and thereby splits) that don't satisfy the predicate condition. 
 This can greatly reduce unnecessary reads.





[jira] [Commented] (HIVE-5784) Group By Operator doesn't carry forward table aliases in its RowResolver

2013-11-08 Thread Harish Butani (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817708#comment-13817708
 ] 

Harish Butani commented on HIVE-5784:
-

[~xuefuz] OK, thanks for pointing this out. Can you review the patch?

 Group By Operator doesn't carry forward table aliases in its RowResolver
 

 Key: HIVE-5784
 URL: https://issues.apache.org/jira/browse/HIVE-5784
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Harish Butani
Assignee: Harish Butani
 Attachments: HIVE-5784.1.patch


 The following queries fail:
 {code}
 select b.key, count(*) from src b group by key
 select key, count(*) from src b group by b.key
 {code}
 with a SemanticException; the select expression b.key (key in the 2nd query) 
 is not resolved by the GBy RowResolver.
 This is because the GBy RowResolver only supports resolving based on an 
 AST.toStringTree match. The underlying issue is that a RowResolver doesn't 
 allow multiple mappings to the same ColumnInfo.





[jira] [Updated] (HIVE-5632) Eliminate splits based on SARGs using stripe statistics in ORC

2013-11-08 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-5632:
-

Attachment: HIVE-5632.4.patch

Re-uploading .3 as .4 to kick off pre-commit.

 Eliminate splits based on SARGs using stripe statistics in ORC
 --

 Key: HIVE-5632
 URL: https://issues.apache.org/jira/browse/HIVE-5632
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.13.0
Reporter: Prasanth J
Assignee: Prasanth J
  Labels: orcfile
 Attachments: HIVE-5632.1.patch.txt, HIVE-5632.2.patch.txt, 
 HIVE-5632.3.patch.txt, HIVE-5632.4.patch, orc_split_elim.orc


 HIVE-5562 provides stripe level statistics in ORC. Stripe level statistics 
 combined with predicate pushdown in ORC (HIVE-4246) can be used to eliminate 
 the stripes (and thereby splits) that don't satisfy the predicate condition. 
 This can greatly reduce unnecessary reads.





[jira] [Updated] (HIVE-3107) Improve semantic analyzer to better handle column name references in group by/sort by clauses

2013-11-08 Thread Harish Butani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harish Butani updated HIVE-3107:


Attachment: HIVE-3107.1.patch

 Improve semantic analyzer to better handle column name references in group 
 by/sort by clauses
 -

 Key: HIVE-3107
 URL: https://issues.apache.org/jira/browse/HIVE-3107
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.9.0
Reporter: Richard Ding
Assignee: Xuefu Zhang
 Attachments: HIVE-3107.1.patch


 This is related to HIVE-1922.
 Following queries all fail with various SemanticExceptions:
 {code}
 explain select t.c from t group by c;
 explain select t.c from t group by c sort by t.c; 
 explain select t.c as c0 from t group by c0;
 explain select t.c from t group by t.c sort by t.c; 
 {code}
 It is true that one could always find a version of any of the above queries 
 that works. But one has to find it by trial and error, and that doesn't work 
 well with machine-generated SQL queries.





[jira] [Updated] (HIVE-5632) Eliminate splits based on SARGs using stripe statistics in ORC

2013-11-08 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-5632:
-

Status: Open  (was: Patch Available)

 Eliminate splits based on SARGs using stripe statistics in ORC
 --

 Key: HIVE-5632
 URL: https://issues.apache.org/jira/browse/HIVE-5632
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.13.0
Reporter: Prasanth J
Assignee: Prasanth J
  Labels: orcfile
 Attachments: HIVE-5632.1.patch.txt, HIVE-5632.2.patch.txt, 
 HIVE-5632.3.patch.txt, HIVE-5632.4.patch, orc_split_elim.orc


 HIVE-5562 provides stripe level statistics in ORC. Stripe level statistics 
 combined with predicate pushdown in ORC (HIVE-4246) can be used to eliminate 
 the stripes (and thereby splits) that don't satisfy the predicate condition. 
 This can greatly reduce unnecessary reads.





[jira] [Commented] (HIVE-3107) Improve semantic analyzer to better handle column name references in group by/sort by clauses

2013-11-08 Thread Harish Butani (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817711#comment-13817711
 ] 

Harish Butani commented on HIVE-3107:
-

Review request: https://reviews.apache.org/r/15361/

 Improve semantic analyzer to better handle column name references in group 
 by/sort by clauses
 -

 Key: HIVE-3107
 URL: https://issues.apache.org/jira/browse/HIVE-3107
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.9.0
Reporter: Richard Ding
Assignee: Xuefu Zhang
 Attachments: HIVE-3107.1.patch


 This is related to HIVE-1922.
 Following queries all fail with various SemanticExceptions:
 {code}
 explain select t.c from t group by c;
 explain select t.c from t group by c sort by t.c; 
 explain select t.c as c0 from t group by c0;
 explain select t.c from t group by t.c sort by t.c; 
 {code}
 It is true that one could always find a version of any of the above queries 
 that works. But one has to find it by trial and error, and that doesn't work 
 well with machine-generated SQL queries.





[jira] [Updated] (HIVE-3107) Improve semantic analyzer to better handle column name references in group by/sort by clauses

2013-11-08 Thread Harish Butani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harish Butani updated HIVE-3107:


Status: Patch Available  (was: Reopened)

 Improve semantic analyzer to better handle column name references in group 
 by/sort by clauses
 -

 Key: HIVE-3107
 URL: https://issues.apache.org/jira/browse/HIVE-3107
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.9.0
Reporter: Richard Ding
Assignee: Harish Butani
 Attachments: HIVE-3107.1.patch


 This is related to HIVE-1922.
 Following queries all fail with various SemanticExceptions:
 {code}
 explain select t.c from t group by c;
 explain select t.c from t group by c sort by t.c; 
 explain select t.c as c0 from t group by c0;
 explain select t.c from t group by t.c sort by t.c; 
 {code}
 It is true that one could always find a version of any of the above queries 
 that works. But one has to find it by trial and error, and that doesn't work 
 well with machine-generated SQL queries.





[jira] [Assigned] (HIVE-3107) Improve semantic analyzer to better handle column name references in group by/sort by clauses

2013-11-08 Thread Harish Butani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harish Butani reassigned HIVE-3107:
---

Assignee: Harish Butani  (was: Xuefu Zhang)

 Improve semantic analyzer to better handle column name references in group 
 by/sort by clauses
 -

 Key: HIVE-3107
 URL: https://issues.apache.org/jira/browse/HIVE-3107
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.9.0
Reporter: Richard Ding
Assignee: Harish Butani
 Attachments: HIVE-3107.1.patch


 This is related to HIVE-1922.
 Following queries all fail with various SemanticExceptions:
 {code}
 explain select t.c from t group by c;
 explain select t.c from t group by c sort by t.c; 
 explain select t.c as c0 from t group by c0;
 explain select t.c from t group by t.c sort by t.c; 
 {code}
 It is true that one could always find a version of any of the above queries 
 that works. But one has to find it by trial and error, and that doesn't work 
 well with machine-generated SQL queries.





[jira] [Commented] (HIVE-5779) Subquery in where clause with distinct fails with mapjoin turned on with serialization error.

2013-11-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817735#comment-13817735
 ] 

Hive QA commented on HIVE-5779:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12612843/HIVE-5779.2.patch

{color:green}SUCCESS:{color} +1 4597 tests passed

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/213/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/213/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12612843

 Subquery in where clause with distinct fails with mapjoin turned on with 
 serialization error.
 -

 Key: HIVE-5779
 URL: https://issues.apache.org/jira/browse/HIVE-5779
 Project: Hive
  Issue Type: Bug
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-5779.2.patch, HIVE-5779.patch








[jira] [Created] (HIVE-5786) Remove HadoopShims methods that were needed for pre-Hadoop 0.20

2013-11-08 Thread Jason Dere (JIRA)
Jason Dere created HIVE-5786:


 Summary: Remove HadoopShims methods that were needed for 
pre-Hadoop 0.20
 Key: HIVE-5786
 URL: https://issues.apache.org/jira/browse/HIVE-5786
 Project: Hive
  Issue Type: Bug
  Components: Shims
Reporter: Jason Dere
Assignee: Jason Dere


There are several methods in HadoopShims that can be removed since we are only 
supporting 0.20+.





[jira] [Commented] (HIVE-5786) Remove HadoopShims methods that were needed for pre-Hadoop 0.20

2013-11-08 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817738#comment-13817738
 ] 

Jason Dere commented on HIVE-5786:
--

Looks like the following shims methods can be removed from HadoopShims:

usesJobShell
isJobPreparing
fileSystemDeleteOnExit
inputFormatValidateInput
setTmpFiles
getAccessTime
compareText
setFloatConf
getTaskJobIDs


 Remove HadoopShims methods that were needed for pre-Hadoop 0.20
 ---

 Key: HIVE-5786
 URL: https://issues.apache.org/jira/browse/HIVE-5786
 Project: Hive
  Issue Type: Bug
  Components: Shims
Reporter: Jason Dere
Assignee: Jason Dere

 There are several methods in HadoopShims that can be removed since we are 
 only supporting 0.20+.





[jira] [Updated] (HIVE-5786) Remove HadoopShims methods that were needed for pre-Hadoop 0.20

2013-11-08 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-5786:
-

Attachment: HIVE-5786.1.patch

patch v1.

 Remove HadoopShims methods that were needed for pre-Hadoop 0.20
 ---

 Key: HIVE-5786
 URL: https://issues.apache.org/jira/browse/HIVE-5786
 Project: Hive
  Issue Type: Bug
  Components: Shims
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-5786.1.patch


 There are several methods in HadoopShims that can be removed since we are 
 only supporting 0.20+.





[jira] [Commented] (HIVE-5107) Change hive's build to maven

2013-11-08 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817753#comment-13817753
 ] 

Vaibhav Gumashta commented on HIVE-5107:


Hi [~brocknoland]! Thanks for the awesome effort. I have one question regarding 
the organization of tests. Some of the tests have been moved to the itests 
folder whereas some live in the original package. Is there a good reason for 
having that structure? For example, some of the unit test files for the service 
package live in service/src/test/org/apache/hive/service, while some of them 
have moved to itests/hive-unit/src/test/java/org/apache/hive/service.

 Change hive's build to maven
 

 Key: HIVE-5107
 URL: https://issues.apache.org/jira/browse/HIVE-5107
 Project: Hive
  Issue Type: Task
Reporter: Edward Capriolo
Assignee: Edward Capriolo

 I can not cope with hive's build infrastructure any more. I have started 
 working on porting the project to maven. When I have some solid progress I 
 will github the entire thing for review. Then we can talk about switching the 
 project somehow.





[jira] [Commented] (HIVE-5107) Change hive's build to maven

2013-11-08 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817767#comment-13817767
 ] 

Brock Noland commented on HIVE-5107:


itests are any tests that have cyclical dependencies or require that the 
packages be built. Typically only integration tests have those requirements, 
thus I have named it itests.

 Change hive's build to maven
 

 Key: HIVE-5107
 URL: https://issues.apache.org/jira/browse/HIVE-5107
 Project: Hive
  Issue Type: Task
Reporter: Edward Capriolo
Assignee: Edward Capriolo

 I can not cope with hive's build infrastructure any more. I have started 
 working on porting the project to maven. When I have some solid progress I 
 will github the entire thing for review. Then we can talk about switching the 
 project somehow.





Review Request 15373: HIVE-5786 Remove HadoopShims methods that were needed for pre-Hadoop 0.20

2013-11-08 Thread Jason Dere

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15373/
---

Review request for hive.


Bugs: HIVE-5786
https://issues.apache.org/jira/browse/HIVE-5786


Repository: hive-git


Description
---

Remove some of the shims methods which were made obsolete after dropping 
pre-hadoop 0.20 support.


Diffs
-

  cli/src/java/org/apache/hadoop/hive/cli/CliDriver.java 4fcca8c 
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 4f32390 
  contrib/src/java/org/apache/hadoop/hive/contrib/fileformat/base64/Base64TextInputFormat.java 5909188 
  contrib/src/java/org/apache/hadoop/hive/contrib/udaf/example/UDAFExampleMax.java abb66c4 
  contrib/src/java/org/apache/hadoop/hive/contrib/udaf/example/UDAFExampleMin.java 6f389d8 
  itests/util/src/main/java/org/apache/hadoop/hive/ql/udf/UDAFTestMax.java eda2aa4 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 2ac22b7 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java e69aaa6 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/HadoopJobExecHelper.java 0a2f976 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/JobDebugger.java 7b77944 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java 99ec216 
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java f7086a3 
  ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileMergeMapper.java ab884c5 
  ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/truncate/ColumnTruncateMapper.java f0678ef 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/JsonMetaDataFormatter.java a85a19d 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/TextMetaDataFormatter.java 0f48674 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPEqualOrGreaterThan.java cf39215 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPEqualOrLessThan.java 3eba13b 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPGreaterThan.java d6654a1 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPLessThan.java b1e03b4 
  ql/src/test/org/apache/hadoop/hive/ql/io/TestSymlinkTextInputFormat.java 1d92d40 
  serde/src/java/org/apache/hadoop/hive/serde2/io/HiveCharWritable.java e68c63a 
  serde/src/java/org/apache/hadoop/hive/serde2/io/HiveVarcharWritable.java 005832b 
  serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorUtils.java ba8342d 
  shims/0.20/src/main/java/org/apache/hadoop/hive/shims/Hadoop20Shims.java 17f4a94 
  shims/common-secure/src/main/java/org/apache/hadoop/hive/shims/HadoopShimsSecure.java fd0d526 
  shims/common/src/main/java/org/apache/hadoop/hive/shims/HadoopShims.java 62ff878 

Diff: https://reviews.apache.org/r/15373/diff/


Testing
---


Thanks,

Jason Dere



[jira] [Commented] (HIVE-4022) Structs and struct fields cannot be NULL in INSERT statements

2013-11-08 Thread Adrian Hains (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817769#comment-13817769
 ] 

Adrian Hains commented on HIVE-4022:


I found a workaround for this restriction. I needed to add some struct columns 
to a table t1 by copying the data to a new table t2 with the updated schema. 
Inserting directly into t2 by selecting from t1 with null literals failed as 
described in this JIRA ticket. To work around this, I created an additional 
table t2copy that has the same schema as t2. Then I did an insert into t2 
selecting from t1 left outer join t2copy, referencing 
t2copy.newStructColumn so that a table-sourced null value is passed to t2. 
This worked. It may be that t2copy having the same struct definition is 
unnecessary, and a simple empty table with a bogus struct column definition 
would have worked just as well.
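
The mechanism behind this workaround, a left outer join against an empty table yielding a typed NULL for every source row, can be sketched with Python's built-in sqlite3 module. This is only an analogy and has not been run against Hive: SQLite has no STRUCT type, so the TEXT column `new_col` stands in for the struct column, and the table and column names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Source table with data, plus an empty helper table that only supplies
# a column of the desired type.
cur.execute("CREATE TABLE t1 (id INTEGER, x INTEGER)")
cur.execute("CREATE TABLE t2copy (id INTEGER, new_col TEXT)")  # stays empty
cur.executemany("INSERT INTO t1 VALUES (?, ?)", [(1, 10), (2, 20)])

# No row of t2copy ever matches, so new_col is NULL for every t1 row,
# but the NULL comes from a table column rather than from a literal.
rows = cur.execute(
    "SELECT t1.id, t1.x, t2copy.new_col "
    "FROM t1 LEFT OUTER JOIN t2copy ON t1.id = t2copy.id "
    "ORDER BY t1.id"
).fetchall()
print(rows)
conn.close()
```

In Hive the same query shape would be the SELECT feeding the INSERT into the new table, with the joined column carrying the struct type that a bare null literal lacks.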

 Structs and struct fields cannot be NULL in INSERT statements
 -

 Key: HIVE-4022
 URL: https://issues.apache.org/jira/browse/HIVE-4022
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Reporter: Michael Malak

 Originally thought to be Avro-specific, and first noted with respect to 
 HIVE-3528 Avro SerDe doesn't handle serializing Nullable types that require 
 access to a Schema, it turns out even native Hive tables cannot store NULL 
 in a STRUCT field or for the entire STRUCT itself, at least when the NULL is 
 specified directly in the INSERT statement.
 Again, this affects both Avro-backed tables and native Hive tables.
 ***For native Hive tables:
 The following:
 echo 1,2 >twovalues.csv
 hive
 CREATE TABLE tc (x INT, y INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
 LOAD DATA LOCAL INPATH 'twovalues.csv' INTO TABLE tc;
 CREATE TABLE oc (z STRUCT<a: int, b: int>);
 INSERT INTO TABLE oc SELECT null FROM tc;
 produces the error
 FAILED: SemanticException [Error 10044]: Line 1:18 Cannot insert into target 
 table because column number/types are different 'oc': Cannot convert column 0 
 from void to struct<a:int,b:int>.
 The following:
 INSERT INTO TABLE oc SELECT named_struct('a', null, 'b', null) FROM tc;
 produces the error:
 FAILED: SemanticException [Error 10044]: Line 1:18 Cannot insert into target 
 table because column number/types are different 'oc': Cannot convert column 0 
 from struct<a:void,b:void> to struct<a:int,b:int>.
 ***For Avro:
 In HIVE-3528, there is in fact a null-struct test case in line 14 of
 https://github.com/apache/hive/blob/15cc604bf10f4c2502cb88fb8bb3dcd45647cf2c/data/files/csv.txt
 The test script at
 https://github.com/apache/hive/blob/12d6f3e7d21f94e8b8490b7c6d291c9f4cac8a4f/ql/src/test/queries/clientpositive/avro_nullable_fields.q
 does indeed work.  But in that test, the query gets all of its data from a 
 test table verbatim:
 INSERT OVERWRITE TABLE as_avro SELECT * FROM test_serializer;
 If instead we stick in a hard-coded null for the struct directly into the 
 query, it fails:
 INSERT OVERWRITE TABLE as_avro SELECT string1, int1, tinyint1, smallint1, 
 bigint1, boolean1, float1, double1, list1, map1, null, enum1, nullableint, 
 bytes1, fixed1 FROM test_serializer;
 with the following error:
 FAILED: SemanticException [Error 10044]: Line 1:23 Cannot insert into target 
 table because column number/types are different 'as_avro': Cannot convert 
 column 10 from void to struct<sint:int,sboolean:boolean,sstring:string>.
 Note, though, that substituting a hard-coded null for string1 (and restoring 
 struct1 into the query) does work:
 INSERT OVERWRITE TABLE as_avro SELECT null, int1, tinyint1, smallint1, 
 bigint1, boolean1, float1, double1, list1, map1, struct1, enum1, nullableint, 
 bytes1, fixed1 FROM test_serializer;





[jira] [Commented] (HIVE-5786) Remove HadoopShims methods that were needed for pre-Hadoop 0.20

2013-11-08 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817770#comment-13817770
 ] 

Jason Dere commented on HIVE-5786:
--

RB at https://reviews.apache.org/r/15373/

 Remove HadoopShims methods that were needed for pre-Hadoop 0.20
 ---

 Key: HIVE-5786
 URL: https://issues.apache.org/jira/browse/HIVE-5786
 Project: Hive
  Issue Type: Bug
  Components: Shims
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-5786.1.patch


 There are several methods in HadoopShims that can be removed since we are 
 only supporting 0.20+.





[jira] [Updated] (HIVE-5786) Remove HadoopShims methods that were needed for pre-Hadoop 0.20

2013-11-08 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-5786:
-

Status: Patch Available  (was: Open)

 Remove HadoopShims methods that were needed for pre-Hadoop 0.20
 ---

 Key: HIVE-5786
 URL: https://issues.apache.org/jira/browse/HIVE-5786
 Project: Hive
  Issue Type: Bug
  Components: Shims
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-5786.1.patch


 There are several methods in HadoopShims that can be removed since we are 
 only supporting 0.20+.





[jira] [Commented] (HIVE-5632) Eliminate splits based on SARGs using stripe statistics in ORC

2013-11-08 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817785#comment-13817785
 ] 

Gunther Hagleitner commented on HIVE-5632:
--

Looked at the revised patch. LGTM +1. [~prasanth_j], can you open the 
follow-up JIRA we discussed and link it to this one?

 Eliminate splits based on SARGs using stripe statistics in ORC
 --

 Key: HIVE-5632
 URL: https://issues.apache.org/jira/browse/HIVE-5632
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.13.0
Reporter: Prasanth J
Assignee: Prasanth J
  Labels: orcfile
 Attachments: HIVE-5632.1.patch.txt, HIVE-5632.2.patch.txt, 
 HIVE-5632.3.patch.txt, HIVE-5632.4.patch, orc_split_elim.orc


 HIVE-5562 provides stripe level statistics in ORC. Stripe level statistics 
 combined with predicate pushdown in ORC (HIVE-4246) can be used to eliminate 
 the stripes (and thereby splits) that don't satisfy the predicate condition. 
 This can greatly reduce unnecessary reads.





[jira] [Commented] (HIVE-5107) Change hive's build to maven

2013-11-08 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817783#comment-13817783
 ] 

Vaibhav Gumashta commented on HIVE-5107:


I'm not well-versed in maven, but wouldn't it be cleaner to move all the tests 
to itests? I think it might become confusing when adding new tests if the tests 
for a package are split into different locations.

 Change hive's build to maven
 

 Key: HIVE-5107
 URL: https://issues.apache.org/jira/browse/HIVE-5107
 Project: Hive
  Issue Type: Task
Reporter: Edward Capriolo
Assignee: Edward Capriolo

 I can not cope with hive's build infrastructure any more. I have started 
 working on porting the project to maven. When I have some solid progress I 
 will github the entire thing for review. Then we can talk about switching the 
 project somehow.





[jira] [Updated] (HIVE-5683) JDBC support for char

2013-11-08 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-5683:
-

Attachment: HIVE-5683.2.patch

Rebased the patch on trunk (patch v2).

 JDBC support for char
 -

 Key: HIVE-5683
 URL: https://issues.apache.org/jira/browse/HIVE-5683
 Project: Hive
  Issue Type: Bug
  Components: JDBC, Types
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-5683.1.patch, HIVE-5683.2.patch


 Support char type in JDBC, including char length in result set metadata.
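
 For context, a client would normally read a column's char length through the 
 standard java.sql.ResultSetMetaData interface. The sketch below uses only 
 standard JDBC calls. CharColumnInfo and describeCharColumns are hypothetical 
 names for this example, and it is an assumption that the Hive driver reports 
 char columns as Types.CHAR with the length returned by getPrecision.

```java
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.sql.Types;
import java.util.ArrayList;
import java.util.List;

public class CharColumnInfo {
    /** Lists "name: CHAR(n)" for every column the metadata reports as java.sql.Types.CHAR. */
    public static List<String> describeCharColumns(ResultSetMetaData md) throws SQLException {
        List<String> out = new ArrayList<>();
        // JDBC column indexes are 1-based.
        for (int i = 1; i <= md.getColumnCount(); i++) {
            if (md.getColumnType(i) == Types.CHAR) {
                // For character data, getPrecision is defined as the length in characters.
                out.add(md.getColumnName(i) + ": CHAR(" + md.getPrecision(i) + ")");
            }
        }
        return out;
    }
}
```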





Review Request 15375: HIVE-5683 JDBC support for char

2013-11-08 Thread Jason Dere

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15375/
---

Review request for hive and Thejas Nair.


Bugs: HIVE-5683
https://issues.apache.org/jira/browse/HIVE-5683


Repository: hive-git


Description
---

thrift/jdbc changes for char.


Diffs
-

  data/files/datatypes.txt 10daa1b 
  itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcDriver2.java a270cc6 
  jdbc/src/java/org/apache/hive/jdbc/HiveBaseResultSet.java b693e93 
  jdbc/src/java/org/apache/hive/jdbc/HiveQueryResultSet.java 25faf48 
  jdbc/src/java/org/apache/hive/jdbc/HiveResultSetMetaData.java 79e8c8c 
  jdbc/src/java/org/apache/hive/jdbc/JdbcColumn.java d612cf6 
  jdbc/src/java/org/apache/hive/jdbc/Utils.java 45de290 
  service/if/TCLIService.thrift 1f49445 
  service/src/gen/thrift/gen-cpp/TCLIService_constants.h 7471811 
  service/src/gen/thrift/gen-cpp/TCLIService_constants.cpp d085b30 
  service/src/gen/thrift/gen-cpp/TCLIService_types.h 490b393 
  service/src/gen/thrift/gen-cpp/TCLIService_types.cpp a3fd46c 
  service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TCLIServiceConstants.java 7b4c576 
  service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TOpenSessionResp.java 5d353f7 
  service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TProtocolVersion.java 15f2973 
  service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TTypeId.java be70a3a 
  service/src/gen/thrift/gen-py/TCLIService/constants.py 589ce88 
  service/src/gen/thrift/gen-py/TCLIService/ttypes.py b286b05 
  service/src/gen/thrift/gen-rb/t_c_l_i_service_constants.rb 8c341c8 
  service/src/gen/thrift/gen-rb/t_c_l_i_service_types.rb c608364 
  service/src/java/org/apache/hive/service/cli/ColumnValue.java 62e221b 
  service/src/java/org/apache/hive/service/cli/Type.java f414fca 
  service/src/java/org/apache/hive/service/cli/TypeQualifiers.java 66a4b12 

Diff: https://reviews.apache.org/r/15375/diff/


Testing
---


Thanks,

Jason Dere


