[jira] [Updated] (HIVE-4577) hive CLI can't handle hadoop dfs command with space and quotes.
[ https://issues.apache.org/jira/browse/HIVE-4577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bing Li updated HIVE-4577:
--------------------------
Affects Version/s: 0.10.0

hive CLI can't handle hadoop dfs command with space and quotes.

Key: HIVE-4577
URL: https://issues.apache.org/jira/browse/HIVE-4577
Project: Hive
Issue Type: Bug
Components: CLI
Affects Versions: 0.9.0, 0.10.0
Reporter: Bing Li
Assignee: Bing Li
Attachments: HIVE-4577.1.patch, HIVE-4577.2.patch

By design, Hive supports hadoop dfs commands in the hive shell, e.g.

hive> dfs -mkdir /user/biadmin/mydir;

but it behaves differently from hadoop when the path contains spaces or quotes:

hive> dfs -mkdir hello;
drwxr-xr-x - biadmin supergroup 0 2013-04-23 09:40 /user/biadmin/hello

hive> dfs -mkdir 'world';
drwxr-xr-x - biadmin supergroup 0 2013-04-23 09:43 /user/biadmin/'world'

hive> dfs -mkdir 'bei jing';
drwxr-xr-x - biadmin supergroup 0 2013-04-23 09:44 /user/biadmin/bei
drwxr-xr-x - biadmin supergroup 0 2013-04-23 09:44 /user/biadmin/jing

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
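The attached patches are not reproduced in this digest, but the behavior above is what you get when a command line is split on whitespace without looking at quotes. As an illustration only (not the HIVE-4577 patch; the class and method names here are made up), a minimal quote-aware splitter would keep 'bei jing' as one argument and strip the surrounding quotes instead of passing them to HDFS:

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Illustrative sketch, not the actual HIVE-4577 fix: a quote-aware
 * tokenizer for dfs command lines. 'bei jing' stays a single argument,
 * and the quote characters themselves are not forwarded to HDFS.
 */
public class DfsCommandSplitter {
    public static List<String> split(String cmd) {
        List<String> tokens = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        char quote = 0;            // 0 = not inside a quoted region
        boolean inToken = false;
        for (char c : cmd.toCharArray()) {
            if (quote != 0) {                    // inside quotes: copy until the matching quote
                if (c == quote) quote = 0;
                else cur.append(c);
            } else if (c == '\'' || c == '"') {  // open quote; do not copy it
                quote = c;
                inToken = true;
            } else if (Character.isWhitespace(c)) {
                if (inToken) { tokens.add(cur.toString()); cur.setLength(0); inToken = false; }
            } else {
                cur.append(c);
                inToken = true;
            }
        }
        if (inToken) tokens.add(cur.toString());
        return tokens;
    }
}
```

With this splitting, `dfs -mkdir 'bei jing';` would yield the two tokens `-mkdir` and `bei jing`, matching what `hadoop fs` does with the same quoting.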
[jira] [Created] (HIVE-4588) Support session level hooks for HiveServer2
Prasad Mujumdar created HIVE-4588:
-------------------------------------
Summary: Support session level hooks for HiveServer2
Key: HIVE-4588
URL: https://issues.apache.org/jira/browse/HIVE-4588
Project: Hive
Issue Type: Improvement
Components: HiveServer2
Affects Versions: 0.11.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
Fix For: 0.12.0

Support session level hooks for HiveServer2. The configured hooks will be executed at the beginning of each new session. This is useful for auditing connections, possibly tuning session level properties, etc.
Review Request: HIVE-4588: Support session level hooks for HiveServer2
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/11326/
---

Review request for hive.

Description
-----------

Support session level hooks for HiveServer2
- New config parameter to define the hook
- New hook context interface to pass the session user and config to the hook implementation
- Session manager executes the configured hooks when a new session starts

This addresses bug HIVE-4588.
https://issues.apache.org/jira/browse/HIVE-4588

Diffs
-----

common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 335af45
service/src/java/org/apache/hive/service/cli/session/HiveSessionHook.java PRE-CREATION
service/src/java/org/apache/hive/service/cli/session/HiveSessionHookContext.java PRE-CREATION
service/src/java/org/apache/hive/service/cli/session/HiveSessionHookContextImpl.java PRE-CREATION
service/src/java/org/apache/hive/service/cli/session/SessionManager.java 3bb6807
service/src/test/org/apache/hive/service/cli/session/TestSessionHooks.java PRE-CREATION

Diff: https://reviews.apache.org/r/11326/diff/

Testing
-------

Added new test for session hooks

Thanks,
Prasad Mujumdar
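The review describes a hook interface plus a context carrying the session user and config. The real interfaces live in the patch (HiveSessionHook, HiveSessionHookContext); the self-contained sketch below only models that shape, so the signatures and the sample hook are assumptions, not code from the patch:

```java
import java.util.Map;

// Sketch of the hook shape described in the review request. The actual
// HiveSessionHook / HiveSessionHookContext interfaces are defined in the
// patch; these minimal stand-ins are assumptions for illustration.
interface SessionHookContext {
    String getSessionUser();
    Map<String, String> getSessionConf();
}

interface SessionHook {
    void run(SessionHookContext ctx) throws Exception;  // invoked once per new session
}

/** Example hook: audit the connecting user and tune a session property. */
class AuditAndTuneHook implements SessionHook {
    final StringBuilder auditLog = new StringBuilder();

    @Override
    public void run(SessionHookContext ctx) {
        // Auditing use case from the issue: record who opened the session.
        auditLog.append("session opened by ").append(ctx.getSessionUser()).append('\n');
        // Tuning use case: adjust a per-session property.
        ctx.getSessionConf().put("hive.exec.parallel", "true");
    }
}
```

The session manager would then iterate over the configured hook class names and invoke run() for each when a new session starts.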
[jira] [Updated] (HIVE-4588) Support session level hooks for HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-4588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasad Mujumdar updated HIVE-4588:
----------------------------------
Attachment: HIVE-4588-1.patch
[jira] [Updated] (HIVE-4588) Support session level hooks for HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-4588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasad Mujumdar updated HIVE-4588:
----------------------------------
Status: Patch Available (was: Open)

Review request on https://reviews.apache.org/r/11326/
[jira] [Created] (HIVE-4589) Hive Load command failed when inpath contains space or any restricted characters
Bing Li created HIVE-4589:
-----------------------------
Summary: Hive Load command failed when inpath contains space or any restricted characters
Key: HIVE-4589
URL: https://issues.apache.org/jira/browse/HIVE-4589
Project: Hive
Issue Type: Bug
Components: CLI
Affects Versions: 0.9.0, 0.10.0
Reporter: Bing Li
Assignee: Bing Li

0) Create a simple text file with some strings; see the attached uk.cities.
1) Create a directory in Hadoop whose name contains a space:
hadoop fs -mkdir '/testdir/bri tain/'
hadoop fs -copyFromLocal /tmp/uk.cities '/testdir/bri tain/uk.cities'
2) create table partspace (city string) partitioned by (country string) row format delimited fields terminated by '$' stored as textfile;
3) load data inpath '/testdir/bri tain/uk.cities' into table partspace partition (country='britain');

Then I got a message like:
Load failed with message Wrong file format. Please check the file's format.
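A plausible trigger for this class of failure (the actual root cause is in the patch, which is not shown here) is that a path containing a raw space misparses when treated as a URI. As an illustration only, Java's multi-argument URI constructor percent-encodes such characters so the full path survives:

```java
import java.net.URI;
import java.net.URISyntaxException;

/**
 * Illustration only, not the HIVE-4589 patch: a path with a space must be
 * percent-encoded before being handled as a URI, otherwise everything
 * after the space can be dropped or misparsed.
 */
public class PathEncodingDemo {
    public static String encodeHdfsPath(String scheme, String authority, String path) {
        try {
            // The component-wise URI constructor quotes characters that are
            // illegal in the path, such as the space in 'bri tain'.
            return new URI(scheme, authority, path, null, null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```

For example, `/testdir/bri tain/uk.cities` becomes `/testdir/bri%20tain/uk.cities` in the resulting URI string. (The authority `nn:8020` in the test is a made-up namenode address.)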
[jira] [Updated] (HIVE-4450) Extend Vector Aggregates to support GROUP BY
[ https://issues.apache.org/jira/browse/HIVE-4450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Remus Rusanu updated HIVE-4450:
-------------------------------
Attachment: HIVE-4450-p1.patch.txt

Added missing file.

Extend Vector Aggregates to support GROUP BY

Key: HIVE-4450
URL: https://issues.apache.org/jira/browse/HIVE-4450
Project: Hive
Issue Type: Sub-task
Components: Query Processor
Affects Versions: vectorization-branch
Reporter: Remus Rusanu
Assignee: Remus Rusanu
Labels: features
Fix For: vectorization-branch
Attachments: HIVE-4450-p1.patch.txt, HIVE-4450-p1.patch.txt, HIVE-4450-p1.patch.txt

Extend the VectorGroupByOperator and the VectorUDAF aggregates to support group by.
[jira] [Updated] (HIVE-4589) Hive Load command failed when inpath contains space or any restricted characters
[ https://issues.apache.org/jira/browse/HIVE-4589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bing Li updated HIVE-4589:
--------------------------
Status: Patch Available (was: In Progress)
[jira] [Work started] (HIVE-4589) Hive Load command failed when inpath contains space or any restricted characters
[ https://issues.apache.org/jira/browse/HIVE-4589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on HIVE-4589 started by Bing Li.
[jira] [Updated] (HIVE-4589) Hive Load command failed when inpath contains space or any restricted characters
[ https://issues.apache.org/jira/browse/HIVE-4589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bing Li updated HIVE-4589:
--------------------------
Status: Open (was: Patch Available)
[jira] [Updated] (HIVE-4589) Hive Load command failed when inpath contains space or any restricted characters
[ https://issues.apache.org/jira/browse/HIVE-4589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bing Li updated HIVE-4589:
--------------------------
Attachment: HIVE-4589.patch
[jira] [Commented] (HIVE-4570) More information to user on GetOperationStatus in Hive Server2 when query is still executing
[ https://issues.apache.org/jira/browse/HIVE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13663978#comment-13663978 ]

Jaideep Dhok commented on HIVE-4570:
------------------------------------
Please give your feedback on the changes below.

- The current API, GetOperationState, is not enough since it returns only a state enum. Instead of changing it, we can add a new API, GetOperationProgress(), which will return both OperationState and OperationProgress.
- The Driver maintains a list of running and runnable tasks, although that info is not exposed outside; it's kept locally in the driver's execute method. We can add Driver.getTaskProgressList() to return task progress reports on all tasks (both running and runnable).

Proposed changes:

{noformat}
// new method in TCLIService.thrift
OperationProgress GetOperationProgress(OperationHandle)

// new types - OperationProgress, TaskProgress and MapRedTaskProgress

1. OperationProgress:
class OperationProgress {
  OperationState getOperationState();
  List<TaskProgress> getTaskProgress();
}

2. class TaskProgress {
  public float getProgress() { return 0; }
  public String getTaskID() { return "N/A"; }
}

3. class MapRedTaskProgress extends TaskProgress {
  public float mapProgress();
  public float reduceProgress();
  public String getTaskID() { return runningJob.getID().toString(); }
}

4. New method in Task:
public TaskProgress getTaskProgress() {
  return new TaskProgress(); // defaults to 0 progress
}

5. Override getTaskProgress in MapRedTask to return a MapRedTaskProgress
{noformat}

More information to user on GetOperationStatus in Hive Server2 when query is still executing

Key: HIVE-4570
URL: https://issues.apache.org/jira/browse/HIVE-4570
Project: Hive
Issue Type: Improvement
Components: HiveServer2
Affects Versions: 0.11.0
Reporter: Amareshwari Sriramadasu
Assignee: Jaideep Dhok

Currently in Hive Server2, when the query is still executing, only the status is set as STILL_EXECUTING. This issue is to give more information to the user, such as progress and running job handles, if possible.
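The proposal above can be modeled as plain Java to make the shapes concrete. This is a sketch of the proposed types only (the class names come from the comment; the bodies, the constructor arguments, and the map/reduce rollup are illustrative assumptions, not Hive source):

```java
import java.util.List;

// Sketch of the types proposed in the HIVE-4570 comment; bodies are
// illustrative assumptions, not code from Hive.
enum OperationState { INITIALIZED, RUNNING, FINISHED, ERROR }

class TaskProgress {
    public float getProgress() { return 0f; }      // default: no progress info
    public String getTaskID()  { return "N/A"; }
}

class MapRedTaskProgress extends TaskProgress {
    private final String jobId;
    private final float map, reduce;
    MapRedTaskProgress(String jobId, float map, float reduce) {
        this.jobId = jobId; this.map = map; this.reduce = reduce;
    }
    public float mapProgress()    { return map; }
    public float reduceProgress() { return reduce; }
    @Override public String getTaskID()   { return jobId; }
    @Override public float getProgress()  { return (map + reduce) / 2f; } // illustrative rollup
}

class OperationProgress {
    private final OperationState state;
    private final List<TaskProgress> tasks;
    OperationProgress(OperationState state, List<TaskProgress> tasks) {
        this.state = state; this.tasks = tasks;
    }
    public OperationState getOperationState() { return state; }
    public List<TaskProgress> getTaskProgress() { return tasks; }
}
```

A GetOperationProgress() call would then return one OperationProgress whose task list mixes plain TaskProgress entries (runnable tasks) and MapRedTaskProgress entries (tasks with running MapReduce jobs).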
[jira] [Updated] (HIVE-4589) Hive Load command failed when inpath contains space or any restricted characters
[ https://issues.apache.org/jira/browse/HIVE-4589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bing Li updated HIVE-4589:
--------------------------
Attachment: (was: HIVE-4589.patch)
[jira] [Updated] (HIVE-4589) Hive Load command failed when inpath contains space or any restricted characters
[ https://issues.apache.org/jira/browse/HIVE-4589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bing Li updated HIVE-4589:
--------------------------
Attachment: HIVE-4589.patch
[jira] [Updated] (HIVE-4589) Hive Load command failed when inpath contains space or any restricted characters
[ https://issues.apache.org/jira/browse/HIVE-4589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bing Li updated HIVE-4589:
--------------------------
Attachment: (was: HIVE-4589.patch)
[jira] [Updated] (HIVE-4589) Hive Load command failed when inpath contains space or any restricted characters
[ https://issues.apache.org/jira/browse/HIVE-4589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bing Li updated HIVE-4589:
--------------------------
Attachment: HIVE-4589.patch
[jira] [Commented] (HIVE-4589) Hive Load command failed when inpath contains space or any restricted characters
[ https://issues.apache.org/jira/browse/HIVE-4589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13664032#comment-13664032 ]

Bing Li commented on HIVE-4589:
-------------------------------
In order to run this test case (-Dtestcase=TestCliDriver -Dqfile=load_fs3.q), you should apply the patch for HIVE-4577 first.
[jira] [Updated] (HIVE-4589) Hive Load command failed when inpath contains space or any restricted characters
[ https://issues.apache.org/jira/browse/HIVE-4589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bing Li updated HIVE-4589:
--------------------------
Status: Patch Available (was: Open)

I added a new test case for this defect. In order to run it (-Dtestcase=TestCliDriver -Dqfile=load_fs3.q), you should apply the patch for HIVE-4577 first.
[jira] [Updated] (HIVE-4569) GetQueryPlan api in Hive Server2
[ https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jaideep Dhok updated HIVE-4569:
-------------------------------
Affects Version/s: (was: 0.11.0)
Status: Patch Available (was: In Progress)

Added a new thrift API, GetQueryPlan, to return the query plan of a SQL query.

GetQueryPlan api in Hive Server2

Key: HIVE-4569
URL: https://issues.apache.org/jira/browse/HIVE-4569
Project: Hive
Issue Type: Bug
Components: HiveServer2
Reporter: Amareshwari Sriramadasu
Assignee: Jaideep Dhok

It would be nice to have GetQueryPlan as a thrift api. I do not see a GetQueryPlan api available in HiveServer2, though the wiki https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Thrift+API contains it; not sure why it was not added.
[jira] [Commented] (HIVE-4589) Hive Load command failed when inpath contains space or any restricted characters
[ https://issues.apache.org/jira/browse/HIVE-4589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13664095#comment-13664095 ]

Xuefu Zhang commented on HIVE-4589:
-----------------------------------
It looks like a dupe of HIVE-4554.
[jira] [Updated] (HIVE-4472) OR, NOT Filter logic can lose an array, and always takes time O(VectorizedRowBatch.DEFAULT_SIZE)
[ https://issues.apache.org/jira/browse/HIVE-4472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jitendra Nath Pandey updated HIVE-4472:
---------------------------------------
Attachment: HIVE-4472.5.patch

Same patch as the previous one, except that the fix to TestConstantVectorExpression is removed because that is taken care of by HIVE-4553.

OR, NOT Filter logic can lose an array, and always takes time O(VectorizedRowBatch.DEFAULT_SIZE)

Key: HIVE-4472
URL: https://issues.apache.org/jira/browse/HIVE-4472
Project: Hive
Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Jitendra Nath Pandey
Attachments: HIVE-4472.1.patch, HIVE-4472.2.patch, HIVE-4472.3.patch, HIVE-4472.4.patch, HIVE-4472.5.patch

The issue is in the files FilterExprOrExpr.java and FilterNotExpr.java. I posted a review for you at https://reviews.apache.org/r/10752/. I think there is a bug related to sharing of an array of integers. Also, one algorithm step always takes O(DEFAULT_BATCH_SIZE) time. If n < DEFAULT_BATCH_SIZE then this is a performance issue.
[jira] [Commented] (HIVE-4579) Create a SARG interface for RecordReaders
[ https://issues.apache.org/jira/browse/HIVE-4579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13664293#comment-13664293 ]

Ashit Gosalia commented on HIVE-4579:
-------------------------------------
This is a broad enough interface. You may also want to consider supporting IN and BETWEEN clauses, because particular RecordReaders may implement these special forms efficiently. Looking at the TPC-H and TPC-DS queries, top-level ANDs also seem to be a common case.

Create a SARG interface for RecordReaders

Key: HIVE-4579
URL: https://issues.apache.org/jira/browse/HIVE-4579
Project: Hive
Issue Type: Improvement
Reporter: Owen O'Malley
Assignee: Owen O'Malley

I think we should create a SARG (http://en.wikipedia.org/wiki/Sargable) interface for RecordReaders. For a first pass, I'll create an API that uses the value stored in hive.io.filter.expr.serialized. The desire is to define a simpler interface than the direct AST expression provided by hive.io.filter.expr.serialized, so that the code to evaluate expressions can be generalized instead of being put inside a particular RecordReader.
[jira] [Updated] (HIVE-4450) Extend Vector Aggregates to support GROUP BY
[ https://issues.apache.org/jira/browse/HIVE-4450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Remus Rusanu updated HIVE-4450:
-------------------------------
Attachment: HIVE-4450-p1.patch.txt

I applied this one to a second enlistment and confirmed that it compiles.
[jira] [Commented] (HIVE-4579) Create a SARG interface for RecordReaders
[ https://issues.apache.org/jira/browse/HIVE-4579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13664321#comment-13664321 ]

Eric Hanson commented on HIVE-4579:
-----------------------------------
Consider adding Column IN (list-of-constants) as a SIMPLE_COND. This is really commonly used.
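The comments on this issue suggest leaves for IN and BETWEEN plus a top-level AND. As a sketch of what such a SARG could look like (the class names, enum values, and row-based evaluation below are illustrative assumptions, not the interface that was eventually committed):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Illustrative SARG sketch: comparison, IN and BETWEEN leaves combined
// with a top-level AND, evaluated here against in-memory rows. A real
// RecordReader would instead test leaves against min/max statistics.
class SargLeaf {
    enum Op { EQUALS, LESS_THAN, IN, BETWEEN }
    final String column; final Op op; final List<Object> literals;
    SargLeaf(String column, Op op, Object... literals) {
        this.column = column; this.op = op; this.literals = Arrays.asList(literals);
    }
    @SuppressWarnings("unchecked")
    boolean test(Map<String, ?> row) {
        Comparable<Object> v = (Comparable<Object>) row.get(column);
        switch (op) {
            case EQUALS:    return v.compareTo(literals.get(0)) == 0;
            case LESS_THAN: return v.compareTo(literals.get(0)) < 0;
            case IN:        return literals.contains(v);
            case BETWEEN:   return v.compareTo(literals.get(0)) >= 0
                                && v.compareTo(literals.get(1)) <= 0;
            default:        return true;
        }
    }
}

class Sarg {
    final List<SargLeaf> conjuncts;   // top-level AND: the common case in TPC-H/DS
    Sarg(SargLeaf... leaves) { this.conjuncts = Arrays.asList(leaves); }
    boolean test(Map<String, ?> row) {
        for (SargLeaf l : conjuncts) if (!l.test(row)) return false;
        return true;
    }
}
```

Keeping IN and BETWEEN as first-class leaves, rather than expanding them into nested ORs and ANDs, is exactly what lets a RecordReader recognize and implement them efficiently.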
[jira] [Commented] (HIVE-4548) Speed up vectorized LIKE filter for special cases abc%, %abc and %abc%
[ https://issues.apache.org/jira/browse/HIVE-4548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13664336#comment-13664336 ] Eric Hanson commented on HIVE-4548: --- Can you confirm if there is a problem or not? E.g. is it possible for a % character to show up as the first or second character of a 2-character sequence in a String that represents a character beyond the standard set of 0x to 0x. If it is indeed a problem, then we should fix it here and open another JIRA to report a bug in the original UDFLike. Speed up vectorized LIKE filter for special cases abc%, %abc and %abc% -- Key: HIVE-4548 URL: https://issues.apache.org/jira/browse/HIVE-4548 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Eric Hanson Assignee: Teddy Choi Priority: Minor Fix For: vectorization-branch Attachments: HIVE-4548.1-with-benchmark.patch.txt, HIVE-4548.1-without-benchmark.patch.txt, HIVE-4548.2-with-benchmark.patch.txt, HIVE-4548.2-without-benchmark.patch.txt Speed up vectorized LIKE filter evaluation for abc%, %abc, and %abc% pattern special cases (here, abc is just a place holder for some fixed string). Problem: The current vectorized LIKE implementation always calls the standard LIKE function code in UDFLike.java. But this is pretty expensive. It calls multiple functions and allocates at least one new object per call. Probably 80% of uses of LIKE are for the simple patterns abc%, %abc, and %abc%. These can be implemented much more efficiently. Start by speeding up the case for Column LIKE abc% The goal would be to minimize expense in the inner loop. Don't use new() in the inner loop, and write a static function that checks the prefix of the string matches the like pattern as efficiently as possible, operating directly on the byte array holding UTF-8-encoded string data, and avoiding unnecessary additional function calls and if/else logic. Call that in the inner loop. 
If feasible, consider using a template-driven approach, with an instance of the template expanded for each of the three cases. Start doing the abc% (prefix match) by hand, then consider templatizing for the other two cases. The code is in the vectorization branch of the main hive repo. Start by checking in the constructor for FilterStringColLikeStringScalar.java if the pattern is one of the simple special cases. If so, record that, and have the evaluate() method call a special-case function for each case, i.e. the general case, and each of the 3 special cases. All the dynamic decision-making would be done once per vector, not once per element. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
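The inner-loop prefix check described above can be sketched as a small static byte comparison; this is a hypothetical illustration, not the actual FilterStringColLikeStringScalar code, and the class and method names here are invented for the example:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of the abc% (prefix match) special case: compare the
// pattern's UTF-8 bytes directly against the start of the value's byte range,
// with no per-row object allocation.
public class LikePrefixMatch {
    // Returns true if value[start..start+length) begins with prefix.
    public static boolean startsWith(byte[] value, int start, int length,
                                     byte[] prefix) {
        if (length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (value[start + i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] data = "abcdef".getBytes(StandardCharsets.UTF_8);
        byte[] prefix = "abc".getBytes(StandardCharsets.UTF_8);
        System.out.println(startsWith(data, 0, data.length, prefix)); // true
    }
}
```

Because UTF-8 is a byte-oriented encoding, this comparison works for multi-byte characters in the fixed prefix as well, which is why the description suggests operating directly on the encoded bytes.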
[jira] [Assigned] (HIVE-3159) Update AvroSerde to determine schema of new tables
[ https://issues.apache.org/jira/browse/HIVE-3159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Homan reassigned HIVE-3159: - Assignee: (was: Jakob Homan) Update AvroSerde to determine schema of new tables -- Key: HIVE-3159 URL: https://issues.apache.org/jira/browse/HIVE-3159 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Reporter: Jakob Homan Currently when writing tables to Avro one must manually provide an Avro schema that matches what is being delivered by Hive. It'd be better to have the serde infer this schema by converting the table's TypeInfo into an appropriate AvroSchema. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2670) A cluster test utility for Hive
[ https://issues.apache.org/jira/browse/HIVE-2670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-2670: - Attachment: HIVE-2670_5.patch An up to date version of the patch. I did change this to work with the HCat e2e tests that are already (now) in Hive rather than create a whole new e2e directory directly off of hive/trunk. I've run all of the tests and confirmed that they pass. A cluster test utility for Hive --- Key: HIVE-2670 URL: https://issues.apache.org/jira/browse/HIVE-2670 Project: Hive Issue Type: New Feature Components: Testing Infrastructure Reporter: Alan Gates Assignee: Johnny Zhang Attachments: harness.tar, HIVE-2670_5.patch, hive_cluster_test_2.patch, hive_cluster_test_3.patch, hive_cluster_test_4.patch, hive_cluster_test.patch Hive has an extensive set of unit tests, but it does not have an infrastructure for testing in a cluster environment. Pig and HCatalog have been using a test harness for cluster testing for some time. We have written Hive drivers and tests to run in this harness. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4581) HCat e2e tests broken by changes to Hive's describe table formatting
[ https://issues.apache.org/jira/browse/HIVE-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13664388#comment-13664388 ] Sushanth Sowmyan commented on HIVE-4581: Changes look good to me. +1. HCat e2e tests broken by changes to Hive's describe table formatting Key: HIVE-4581 URL: https://issues.apache.org/jira/browse/HIVE-4581 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.11.0 Reporter: Alan Gates Assignee: Alan Gates Fix For: 0.12.0 Attachments: HIVE-4581.patch In Hive 0.11 the default formatting for describe table changed. A number of the HCat e2e tests do describe table and apply regular expressions to the output to make sure the table looks correct. These formatting changes broke those tests. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2670) A cluster test utility for Hive
[ https://issues.apache.org/jira/browse/HIVE-2670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13664390#comment-13664390 ] Johnny Zhang commented on HIVE-2670: Alan, Thanks for the update! A cluster test utility for Hive --- Key: HIVE-2670 URL: https://issues.apache.org/jira/browse/HIVE-2670 Project: Hive Issue Type: New Feature Components: Testing Infrastructure Reporter: Alan Gates Assignee: Johnny Zhang Attachments: harness.tar, HIVE-2670_5.patch, hive_cluster_test_2.patch, hive_cluster_test_3.patch, hive_cluster_test_4.patch, hive_cluster_test.patch Hive has an extensive set of unit tests, but it does not have an infrastructure for testing in a cluster environment. Pig and HCatalog have been using a test harness for cluster testing for some time. We have written Hive drivers and tests to run in this harness. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4590) HCatalog documentation example is wrong
Eugene Koifman created HIVE-4590: Summary: HCatalog documentation example is wrong Key: HIVE-4590 URL: https://issues.apache.org/jira/browse/HIVE-4590 Project: Hive Issue Type: Bug Components: Documentation Affects Versions: 0.10.0 Reporter: Eugene Koifman Priority: Minor http://hive.apache.org/docs/hcat_r0.5.0/inputoutput.html#Read+Example reads: "The following very simple MapReduce program reads data from one table which it assumes to have an integer in the second column, and counts how many different values it sees. That is, it does the equivalent of select col1, count(*) from $table group by col1;." The description of the query is wrong. It actually counts how many instances of each distinct value it finds. For example, if the values of col1 are {1,1,1,3,3,5} it will produce (1,3), (3,2), (5,1). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
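The distinction the reporter draws can be shown with a small standalone sketch (this is illustrative code, not the actual HCatalog example program):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// The documented example program counts occurrences of each distinct value
// (the equivalent of SELECT col1, COUNT(*) ... GROUP BY col1), rather than
// counting how many different values it sees.
public class GroupByCount {
    public static Map<Integer, Integer> countPerValue(int[] col1) {
        Map<Integer, Integer> counts = new LinkedHashMap<>();
        for (int v : col1) {
            counts.merge(v, 1, Integer::sum);  // increment count for value v
        }
        return counts;
    }

    public static void main(String[] args) {
        // {1,1,1,3,3,5} -> {1=3, 3=2, 5=1}
        System.out.println(countPerValue(new int[]{1, 1, 1, 3, 3, 5}));
    }
}
```

"How many different values it sees" would instead be a single number (here, 3), which is what the documentation incorrectly describes.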
[jira] [Created] (HIVE-4591) Making changes to webhcat-site.xml have no effect
Eugene Koifman created HIVE-4591: Summary: Making changes to webhcat-site.xml have no effect Key: HIVE-4591 URL: https://issues.apache.org/jira/browse/HIVE-4591 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.11.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Looks like WebHCat configuration is read as follows: Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, jar:file:/Users/ekoifman/dev/hive/build/dist/hcatalog/share/webhcat/svr/webhcat-0.12.0-SNAPSHOT.jar!/webhcat-default.xml creating /Users/ekoifman/dev/hive/build/dist/hcatalog/etc/webhcat/webhcat-site.xml and setting templeton.exec.timeout has no effect as can be seen in ExecServiceImpl Probably the webhcat_server.sh script is missing something -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
webhcat component in hive jira
Can a hive jira admin please create a webhcat component in the hive project in jira? (webhcat - http://hive.apache.org/docs/hcat_r0.5.0/rest.html) Thanks, Thejas
[jira] [Commented] (HIVE-4578) Changes to Pig's test harness broke HCat e2e tests
[ https://issues.apache.org/jira/browse/HIVE-4578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13664425#comment-13664425 ] Hudson commented on HIVE-4578: -- Integrated in Hive-trunk-h0.21 #2113 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2113/]) HIVE-4578 Changes to Pig's test harness broke HCat e2e tests (gates) (Revision 1484969) Result = FAILURE gates : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1484969 Files : * /hive/trunk/hcatalog/src/test/e2e/hcatalog/build.xml * /hive/trunk/hcatalog/src/test/e2e/hcatalog/resource * /hive/trunk/hcatalog/src/test/e2e/hcatalog/resource/default.res * /hive/trunk/hcatalog/src/test/e2e/hcatalog/resource/windows.res Changes to Pig's test harness broke HCat e2e tests -- Key: HIVE-4578 URL: https://issues.apache.org/jira/browse/HIVE-4578 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.12.0 Reporter: Alan Gates Assignee: Alan Gates Fix For: 0.12.0 Attachments: HIVE-4578_2.patch, HIVE-4578.patch HCatalog externs the test harness from Pig. Pig recently made some changes to the test harness to work better across Unix and Windows. These changes require new OS specific files. HCatalog will also need these files in order to work with the test harness. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Hive-trunk-h0.21 - Build # 2113 - Still Failing
Changes for Build #2088 [gates] HIVE-4465 webhcat e2e tests succeed regardless of exitvalue Changes for Build #2089 [cws] HIVE-3957. Add pseudo-BNF grammar for RCFile to Javadoc (Mark Grover via cws) [cws] HIVE-4497. beeline module tests don't get run by default (Thejas Nair via cws) [gangtimliu] HIVE-4474: Column access not tracked properly for partitioned tables. Samuel Yuan via Gang Tim Liu [hashutosh] HIVE-4455 : HCatalog build directories get included in tar file produced by ant tar (Alan Gates via Ashutosh Chauhan) Changes for Build #2090 Changes for Build #2091 [hashutosh] HIVE-4392 : Illogical InvalidObjectException throwed when use mulit aggregate functions with star columns (Navis via Ashutosh Chauhan) [hashutosh] HIVE-4421 : Improve memory usage by ORC dictionaries (Owen Omalley via Ashutosh Chauhan) [mithun] HCATALOG-627 - Adding thread-safety to NotificationListener. (amalakar via mithun) Changes for Build #2092 [hashutosh] HIVE-4466 : Fix continue.on.failure in unit tests to -well- continue on failure in unit tests (Gunther Hagleitner via Ashutosh Chauhan) [hashutosh] HIVE-4471 : Build fails with hcatalog checkstyle error (Gunther Hagleitner via Ashutosh Chauhan) Changes for Build #2093 [omalley] HIVE-4494 ORC map columns get class cast exception in some contexts (omalley) [omalley] HIVE-4500 Ensure that HiveServer 2 closes log files. (Alan Gates via omalley) Changes for Build #2094 [navis] HIVE-4209 Cache evaluation result of deterministic expression and reuse it (Navis via namit) Changes for Build #2095 Changes for Build #2096 Changes for Build #2097 [cws] HIVE-4530. Enforce minmum ant version required in build script (Arup Malakar via cws) [omalley] Preparing RELEASE_NOTES for Hive 0.11.0rc2. Changes for Build #2098 [omalley] Update release notes for 0.11.0rc2 [omalley] HIVE-4527 Fix eclipse project template (Carl Steinbach via omalley) [omalley] HIVE-4505 Hive can't load transforms with remote scripts. 
(Prasad Majumdar and Gunther Hagleitner via omalley) [omalley] HIVE-4498 TestBeeLineWithArgs.testPositiveScriptFile fails (Thejas Nair via omalley) Changes for Build #2099 Changes for Build #2100 Changes for Build #2101 Changes for Build #2102 Changes for Build #2103 [daijy] PIG-2955: Fix bunch of Pig e2e tests on Windows Changes for Build #2104 [daijy] PIG-3069: Native Windows Compatibility for Pig E2E Tests and Harness Changes for Build #2105 [omalley] HIVE-4550 local_mapred_error_cache fails on some hadoop versions (Gunther Hagleitner via omalley) [omalley] HIVE-4440 SMB Operator spills to disk like it's 1999 (Gunther Hagleitner via omalley) Changes for Build #2106 Changes for Build #2107 [omalley] HIVE-4486 FetchOperator slows down SMB map joins by 50% when there are many partitions (Gopal V via omalley) Changes for Build #2108 Changes for Build #2109 Changes for Build #2110 Changes for Build #2111 [omalley] HIVE-4475 Switch RCFile default to LazyBinaryColumnarSerDe. (Guther Hagleitner via omalley) [omalley] HIVE-4521 Auto join conversion fails in certain cases (Gunther Hagleitner via omalley) Changes for Build #2112 Changes for Build #2113 [gates] HIVE-4578 Changes to Pig's test harness broke HCat e2e tests (gates) All tests passed The Apache Jenkins build system has built Hive-trunk-h0.21 (build #2113) Status: Still Failing Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/2113/ to view the results.
[jira] [Assigned] (HIVE-4590) HCatalog documentation example is wrong
[ https://issues.apache.org/jira/browse/HIVE-4590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz reassigned HIVE-4590: Assignee: Lefty Leverenz HCatalog documentation example is wrong --- Key: HIVE-4590 URL: https://issues.apache.org/jira/browse/HIVE-4590 Project: Hive Issue Type: Bug Components: Documentation Affects Versions: 0.10.0 Reporter: Eugene Koifman Assignee: Lefty Leverenz Priority: Minor http://hive.apache.org/docs/hcat_r0.5.0/inputoutput.html#Read+Example reads: "The following very simple MapReduce program reads data from one table which it assumes to have an integer in the second column, and counts how many different values it sees. That is, it does the equivalent of select col1, count(*) from $table group by col1;." The description of the query is wrong. It actually counts how many instances of each distinct value it finds. For example, if the values of col1 are {1,1,1,3,3,5} it will produce (1,3), (3,2), (5,1). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4578) Changes to Pig's test harness broke HCat e2e tests
[ https://issues.apache.org/jira/browse/HIVE-4578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13664449#comment-13664449 ] Hudson commented on HIVE-4578: -- Integrated in Hive-trunk-hadoop2 #206 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/206/]) HIVE-4578 Changes to Pig's test harness broke HCat e2e tests (gates) (Revision 1484969) Result = ABORTED gates : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1484969 Files : * /hive/trunk/hcatalog/src/test/e2e/hcatalog/build.xml * /hive/trunk/hcatalog/src/test/e2e/hcatalog/resource * /hive/trunk/hcatalog/src/test/e2e/hcatalog/resource/default.res * /hive/trunk/hcatalog/src/test/e2e/hcatalog/resource/windows.res Changes to Pig's test harness broke HCat e2e tests -- Key: HIVE-4578 URL: https://issues.apache.org/jira/browse/HIVE-4578 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.12.0 Reporter: Alan Gates Assignee: Alan Gates Fix For: 0.12.0 Attachments: HIVE-4578_2.patch, HIVE-4578.patch HCatalog externs the test harness from Pig. Pig recently made some changes to the test harness to work better across Unix and Windows. These changes require new OS specific files. HCatalog will also need these files in order to work with the test harness. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4590) HCatalog documentation example is wrong
[ https://issues.apache.org/jira/browse/HIVE-4590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-4590: - Component/s: HCatalog HCatalog documentation example is wrong --- Key: HIVE-4590 URL: https://issues.apache.org/jira/browse/HIVE-4590 Project: Hive Issue Type: Bug Components: Documentation, HCatalog Affects Versions: 0.10.0 Reporter: Eugene Koifman Assignee: Lefty Leverenz Priority: Minor http://hive.apache.org/docs/hcat_r0.5.0/inputoutput.html#Read+Example reads: "The following very simple MapReduce program reads data from one table which it assumes to have an integer in the second column, and counts how many different values it sees. That is, it does the equivalent of select col1, count(*) from $table group by col1;." The description of the query is wrong. It actually counts how many instances of each distinct value it finds. For example, if the values of col1 are {1,1,1,3,3,5} it will produce (1,3), (3,2), (5,1). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4450) Extend Vector Aggregates to support GROUP BY
[ https://issues.apache.org/jira/browse/HIVE-4450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HIVE-4450: Resolution: Fixed Status: Resolved (was: Patch Available) I just committed this to vectorization branch. Thanks, Remus! Extend Vector Aggregates to support GROUP BY Key: HIVE-4450 URL: https://issues.apache.org/jira/browse/HIVE-4450 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: vectorization-branch Reporter: Remus Rusanu Assignee: Remus Rusanu Labels: features Fix For: vectorization-branch Attachments: HIVE-4450-p1.patch.txt, HIVE-4450-p1.patch.txt, HIVE-4450-p1.patch.txt, HIVE-4450-p1.patch.txt Extend the VectorGroupByOperator and the VectorUDAF aggregates to support group by. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4537) select * fails on orc table when vectorization is enabled
[ https://issues.apache.org/jira/browse/HIVE-4537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HIVE-4537: Resolution: Fixed Fix Version/s: 0.12.0 Status: Resolved (was: Patch Available) I just committed this to the vectorization branch. Thanks, Sarvesh! select * fails on orc table when vectorization is enabled -- Key: HIVE-4537 URL: https://issues.apache.org/jira/browse/HIVE-4537 Project: Hive Issue Type: Sub-task Components: Query Processor Reporter: Tony Murphy Assignee: Sarvesh Sakalanaga Fix For: 0.12.0 Attachments: Hive-4537.0.patch, Hive-4537.1.patch hive select * from intdataorc; OK Failed with exception java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating cint0 Time taken: 0.213 seconds -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4537) select * fails on orc table when vectorization is enabled
[ https://issues.apache.org/jira/browse/HIVE-4537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HIVE-4537: Description: hive select * from intdataorc; OK Failed with exception java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating cint0 Time taken: 0.213 seconds was: hive select * from intdataorc; OK Failed with exception java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating cint0 Time taken: 0.213 seconds Fix Version/s: (was: 0.12.0) vectorization-branch select * fails on orc table when vectorization is enabled -- Key: HIVE-4537 URL: https://issues.apache.org/jira/browse/HIVE-4537 Project: Hive Issue Type: Sub-task Components: Query Processor Reporter: Tony Murphy Assignee: Sarvesh Sakalanaga Fix For: vectorization-branch Attachments: Hive-4537.0.patch, Hive-4537.1.patch hive select * from intdataorc; OK Failed with exception java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating cint0 Time taken: 0.213 seconds -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4553) Column Column, and Column Scalar vectorized execution tests
[ https://issues.apache.org/jira/browse/HIVE-4553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HIVE-4553: Resolution: Fixed Status: Resolved (was: Patch Available) I just committed this to the vectorization branch. Thanks, Tony! Column Column, and Column Scalar vectorized execution tests --- Key: HIVE-4553 URL: https://issues.apache.org/jira/browse/HIVE-4553 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Tony Murphy Assignee: Tony Murphy Fix For: vectorization-branch Attachments: HIVE-4553 (2).patch, HIVE-4553 (3).patch, HIVE-4553.4.patch, HIVE-4553.5.patch, HIVE-4553.patch review board review: https://reviews.apache.org/r/11133/ This patch adds Column Column, and Column Scalar vectorized execution tests. These tests are generated in parallel with the vectorized expressions. The tests focus is on validating the column vector and the vectorized row batch metadata regarding nulls, repeating, and selection. Overview of Changes: CodeGen.java: + joinPath, getCamelCaseType, readFile and writeFile made static for use in TestCodeGen.java. + filter types now specify null as their output type rather than doesn't matter to make detection for test generation easier. + support for test generation added. TestCodeGen.java Templates: TestClass.txt TestColumnColumnFilterVectorExpressionEvaluation.txt, TestColumnColumnOperationVectorExpressionEvaluation.txt, TestColumnScalarFilterVectorExpressionEvaluation.txt, TestColumnScalarOperationVectorExpressionEvaluation.txt +This class is mutable and maintains a hashmap of TestSuiteClassName to test cases. The tests cases are added over the course of vectorized expressions class generation, with test classes being outputted at the end. For each column vector (inputs and/or outputs) a matrix of pairwise covering Booleans is used to generate test cases across nulls and repeating dimensions. 
Based on the input column vector(s)' nulls and repeating states, the state of the output column vector (if there is one) is validated, along with the null vector. For filter operations, the selection vector is validated against the generated data. Each template corresponds to a class representing a test suite. VectorizedRowGroupUtil.java + added methods generateLongColumnVector and generateDoubleColumnVector for generating the respective column vectors with optional nulls and/or repeating values. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4472) OR, NOT Filter logic can lose an array, and always takes time O(VectorizedRowBatch.DEFAULT_SIZE)
[ https://issues.apache.org/jira/browse/HIVE-4472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HIVE-4472: Resolution: Fixed Fix Version/s: vectorization-branch Status: Resolved (was: Patch Available) I just committed this to the vectorization branch. Thanks, Jitendra! OR, NOT Filter logic can lose an array, and always takes time O(VectorizedRowBatch.DEFAULT_SIZE) Key: HIVE-4472 URL: https://issues.apache.org/jira/browse/HIVE-4472 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Jitendra Nath Pandey Fix For: vectorization-branch Attachments: HIVE-4472.1.patch, HIVE-4472.2.patch, HIVE-4472.3.patch, HIVE-4472.4.patch, HIVE-4472.5.patch The issue is in file FilterExprOrExpr.java and FilterNotExpr.java. I posted a review for you at https://reviews.apache.org/r/10752/ I think there is a bug related to sharing of an array of integers. Also, one algorithm step takes O(DEFAULT_BATCH_SIZE) time always. If nDEFAULT_BATCH_SIZE then this is a performance issue. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4534) IsNotNull and NotCol incorrectly handle nulls.
[ https://issues.apache.org/jira/browse/HIVE-4534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HIVE-4534: Resolution: Fixed Fix Version/s: vectorization-branch Status: Resolved (was: Patch Available) I just committed this to the vectorization branch. Thanks, Jitendra! IsNotNull and NotCol incorrectly handle nulls. -- Key: HIVE-4534 URL: https://issues.apache.org/jira/browse/HIVE-4534 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Jitendra Nath Pandey Fix For: vectorization-branch Attachments: HIVE-4534.1.patch, HIVE-4534.2.patch See file IsNotNull.java in package org.apache.hadoop.hive.ql.exec.vector.expressions It never looks at the noNulls flag on the input vector, but accesses the isNull[] array anyway. This can yield incorrect results. isRepeating and noNulls are not set in the output, which can also cause wrong results. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
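The bug described above is about honoring the noNulls and isRepeating flags before touching the isNull array. A simplified illustration, using an invented stand-in class (the field names follow the vectorization branch's column-vector convention, but this is not the actual Hive code):

```java
// Simplified stand-in for a vectorized column: when noNulls is true the
// isNull array must be ignored (it may contain stale values), and when
// isRepeating is true only entry 0 is meaningful.
public class IsNotNullSketch {
    public static class MockColumnVector {
        public boolean noNulls;
        public boolean isRepeating;
        public boolean[] isNull;
    }

    // Writes 1 into out[i] when input row i is non-null, 0 otherwise,
    // consulting isNull only when noNulls is false.
    public static void evaluate(MockColumnVector in, long[] out, int n) {
        if (in.noNulls) {
            for (int i = 0; i < n; i++) out[i] = 1;
        } else if (in.isRepeating) {
            long v = in.isNull[0] ? 0 : 1;
            for (int i = 0; i < n; i++) out[i] = v;
        } else {
            for (int i = 0; i < n; i++) out[i] = in.isNull[i] ? 0 : 1;
        }
    }
}
```

Reading isNull without first checking noNulls, as the original IsNotNull.java did, can return stale values and therefore wrong results.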
Review Request: HIVE-4568 Beeline needs to support resolving variables
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/11334/ --- Review request for hive. Description --- 1. Added command variable substition 2. Added test case This addresses bug HIVE-4568. https://issues.apache.org/jira/browse/HIVE-4568 Diffs - beeline/src/java/org/apache/hive/beeline/BeeLine.java aeb1e8b beeline/src/java/org/apache/hive/beeline/TestBeeLineVarSubstitution.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/parse/VariableSubstitution.java f292944 Diff: https://reviews.apache.org/r/11334/diff/ Testing --- Thanks, Xuefu Zhang
Review Request: Review Request for HIVE-4554 Failed to create a table from existing file if file path has spaces
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/11335/ --- Review request for hive. Description --- Patch includes fix and new test case. This addresses bug HIVE-4554. https://issues.apache.org/jira/browse/HIVE-4554 Diffs - ql/src/java/org/apache/hadoop/hive/ql/parse/EximUtil.java 3031d1c ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java bd8d252 ql/src/test/queries/clientpositive/load_file_with_space_in_the_name.q PRE-CREATION ql/src/test/results/clientpositive/load_file_with_space_in_the_name.q.out PRE-CREATION Diff: https://reviews.apache.org/r/11335/diff/ Testing --- Thanks, Xuefu Zhang
[jira] [Updated] (HIVE-4579) Create a SARG interface for RecordReaders
[ https://issues.apache.org/jira/browse/HIVE-4579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HIVE-4579: Attachment: pushdown.pdf Here's a quick write-up of the intent of the interface. Create a SARG interface for RecordReaders - Key: HIVE-4579 URL: https://issues.apache.org/jira/browse/HIVE-4579 Project: Hive Issue Type: Improvement Reporter: Owen O'Malley Assignee: Owen O'Malley Attachments: pushdown.pdf I think we should create a SARG (http://en.wikipedia.org/wiki/Sargable) interface for RecordReaders. For a first pass, I'll create an API that uses the value stored in hive.io.filter.expr.serialized. The desire is to define a simpler interface than the direct AST expression that is provided by hive.io.filter.expr.serialized, so that the code to evaluate expressions can be generalized instead of being put inside a particular RecordReader. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4579) Create a SARG interface for RecordReaders
[ https://issues.apache.org/jira/browse/HIVE-4579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HIVE-4579: Attachment: (was: pushdown.pdf) Create a SARG interface for RecordReaders - Key: HIVE-4579 URL: https://issues.apache.org/jira/browse/HIVE-4579 Project: Hive Issue Type: Improvement Reporter: Owen O'Malley Assignee: Owen O'Malley Attachments: pushdown.pdf I think we should create a SARG (http://en.wikipedia.org/wiki/Sargable) interface for RecordReaders. For a first pass, I'll create an API that uses the value stored in hive.io.filter.expr.serialized. The desire is to define a simpler interface than the direct AST expression that is provided by hive.io.filter.expr.serialized, so that the code to evaluate expressions can be generalized instead of being put inside a particular RecordReader. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4579) Create a SARG interface for RecordReaders
[ https://issues.apache.org/jira/browse/HIVE-4579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HIVE-4579: Attachment: pushdown.pdf fixed a typo Create a SARG interface for RecordReaders - Key: HIVE-4579 URL: https://issues.apache.org/jira/browse/HIVE-4579 Project: Hive Issue Type: Improvement Reporter: Owen O'Malley Assignee: Owen O'Malley Attachments: pushdown.pdf I think we should create a SARG (http://en.wikipedia.org/wiki/Sargable) interface for RecordReaders. For a first pass, I'll create an API that uses the value stored in hive.io.filter.expr.serialized. The desire is to define a simpler interface than the direct AST expression that is provided by hive.io.filter.expr.serialized, so that the code to evaluate expressions can be generalized instead of being put inside a particular RecordReader. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
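A SARG in the sense described above reduces a filter to simple (column, operator, literal) leaves that a RecordReader can evaluate against per-stripe statistics. The following is a hypothetical minimal sketch of the idea, with invented names; the actual proposed API is in the attached pushdown.pdf:

```java
// Hypothetical minimal SARG-style predicate: a simple column/operator/literal
// leaf that a RecordReader can use to skip row groups via min/max statistics.
// All names here are illustrative, not Hive's actual API.
public class SargSketch {
    public enum Op { EQUALS, LESS_THAN, GREATER_THAN }

    public static final class Leaf {
        public final String column;
        public final Op op;
        public final long literal;
        public Leaf(String column, Op op, long literal) {
            this.column = column;
            this.op = op;
            this.literal = literal;
        }
    }

    // A reader may skip a row group when the leaf provably matches no row,
    // given the group's min/max statistics for the leaf's column.
    public static boolean canSkip(Leaf leaf, long min, long max) {
        switch (leaf.op) {
            case LESS_THAN:    return min >= leaf.literal;
            case GREATER_THAN: return max <= leaf.literal;
            case EQUALS:       return leaf.literal < min || leaf.literal > max;
            default:           return false;
        }
    }
}
```

The point of such an interface is exactly what the description argues: evaluation logic like canSkip can live in shared code, with each RecordReader supplying only its column statistics.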
[jira] [Commented] (HIVE-4548) Speed up vectorized LIKE filter for special cases abc%, %abc and %abc%
[ https://issues.apache.org/jira/browse/HIVE-4548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13664572#comment-13664572 ] Eric Hanson commented on HIVE-4548: --- It appears that all the specific characters you are checking for in parseSimplePattern (%, _, \) cannot be the first or last character of a surrogate pair. So I think the code is safe. Please think this through and add some unit tests that process multi-byte UTF-8 characters of 3 bytes or more (which will force encoding as surrogate pairs inside a String). See http://en.wikipedia.org/wiki/UTF-16/UCS-2#Code_points_U.2B1_to_U.2B10 for a discussion of surrogate pairs. See http://en.wikipedia.org/wiki/List_of_Unicode_characters for a list of Unicode characters. % is 0x0025, _ is 0x005F, and \ is 0x005C. Surrogate pairs all have lead surrogates in the range 0xD800..0xDBFF and trail surrogates in the range 0xDC00..0xDFFF. Speed up vectorized LIKE filter for special cases abc%, %abc and %abc% -- Key: HIVE-4548 URL: https://issues.apache.org/jira/browse/HIVE-4548 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Eric Hanson Assignee: Teddy Choi Priority: Minor Fix For: vectorization-branch Attachments: HIVE-4548.1-with-benchmark.patch.txt, HIVE-4548.1-without-benchmark.patch.txt, HIVE-4548.2-with-benchmark.patch.txt, HIVE-4548.2-without-benchmark.patch.txt Speed up vectorized LIKE filter evaluation for abc%, %abc, and %abc% pattern special cases (here, abc is just a placeholder for some fixed string). Problem: The current vectorized LIKE implementation always calls the standard LIKE function code in UDFLike.java. But this is pretty expensive: it calls multiple functions and allocates at least one new object per call. Probably 80% of uses of LIKE are for the simple patterns abc%, %abc, and %abc%. These can be implemented much more efficiently. 
Start by speeding up the case for Column LIKE abc%. The goal would be to minimize expense in the inner loop. Don't use new() in the inner loop, and write a static function that checks that the prefix of the string matches the LIKE pattern as efficiently as possible, operating directly on the byte array holding UTF-8-encoded string data, and avoiding unnecessary additional function calls and if/else logic. Call that in the inner loop. If feasible, consider using a template-driven approach, with an instance of the template expanded for each of the three cases. Start doing the abc% (prefix match) by hand, then consider templatizing for the other two cases. The code is in the vectorization branch of the main hive repo. Start by checking in the constructor for FilterStringColLikeStringScalar.java if the pattern is one of the simple special cases. If so, record that, and have the evaluate() method call a special-case function for each case, i.e. the general case and each of the 3 special cases. All the dynamic decision-making would be done once per vector, not once per element. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
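The allocation-free inner-loop prefix check described above can be sketched as a small static function. The class and method names below are hypothetical stand-ins, not the code in the attached patches:

```java
// Hypothetical sketch of the byte-level prefix check for Column LIKE 'abc%';
// LikePrefixMatch and prefixMatch are illustrative names, not from the patch.
class LikePrefixMatch {
    // True if the UTF-8 bytes value[start..start+len) begin with pattern.
    // No object allocation and no extra function calls, so it is cheap
    // enough to invoke once per row in the vectorized inner loop.
    static boolean prefixMatch(byte[] value, int start, int len, byte[] pattern) {
        if (len < pattern.length) {
            return false;
        }
        for (int i = 0; i < pattern.length; i++) {
            if (value[start + i] != pattern[i]) {
                return false;
            }
        }
        return true;
    }
}
```

A byte-wise comparison is safe here even for multi-byte data, because UTF-8 never uses ASCII byte values inside a multi-byte sequence (continuation and lead bytes are all >= 0x80).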
[jira] [Updated] (HIVE-4581) HCat e2e tests broken by changes to Hive's describe table formatting
[ https://issues.apache.org/jira/browse/HIVE-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-4581: - Resolution: Fixed Status: Resolved (was: Patch Available) HCat e2e tests broken by changes to Hive's describe table formatting Key: HIVE-4581 URL: https://issues.apache.org/jira/browse/HIVE-4581 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.11.0 Reporter: Alan Gates Assignee: Alan Gates Fix For: 0.12.0 Attachments: HIVE-4581.patch In Hive 0.11 the default formatting for describe table changed. A number of the HCat e2e tests do describe table and apply regular expressions to the output to make sure the table looks correct. These formatting changes broke those tests. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2
[ https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13664636#comment-13664636 ] Carl Steinbach commented on HIVE-4569: -- bq. I do not see GetQueryPlan api available in HiveServer2, though the wiki https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Thrift+API mentions it; not sure why it was not added. It was not added because it became clear during implementation of HiveServer2 that it was a bad idea to extend (i.e. depend on) any of the existing legacy Hive Thrift APIs. We were also narrowly focused on supporting JDBC/ODBC, and neither of these APIs provides explicit support for retrieving the execution plan. @Jaideep: I think it would be a good idea to post some notes about how you plan to modify the HS2 Thrift API and get feedback before spending time doing the implementation work. GetQueryPlan api in Hive Server2 Key: HIVE-4569 URL: https://issues.apache.org/jira/browse/HIVE-4569 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Amareshwari Sriramadasu Assignee: Jaideep Dhok It would be nice to have GetQueryPlan as a thrift api. I do not see GetQueryPlan api available in HiveServer2, though the wiki https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Thrift+API mentions it; not sure why it was not added. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Build failed in Jenkins: Hive-0.9.1-SNAPSHOT-h0.21 #380
See https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/380/ -- [...truncated 35402 lines of junit PREHOOK/POSTHOOK output...]
[jira] [Created] (HIVE-4592) ColumnArithmeticColumn.txt template never sets output isNull to true; can give wrong results
Eric Hanson created HIVE-4592: - Summary: ColumnArithmeticColumn.txt template never sets output isNull to true; can give wrong results Key: HIVE-4592 URL: https://issues.apache.org/jira/browse/HIVE-4592 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Eric Hanson Assignee: Eric Hanson Fix For: vectorization-branch ColumnArithmeticColumn.txt should set the output column's noNulls flag to true if neither input column has nulls, but it does not do that. This can lead to wrong results if noNulls was set to false in a previous use of the batch. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
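A minimal sketch of the flag handling the report asks for, using simplified stand-in classes (ColVec below is not the real vectorized column vector API): the output's noNulls flag must be recomputed from the inputs on every call, because batch objects are reused between invocations.

```java
// Simplified stand-in for a vectorized long column; illustrative only.
class ColVec {
    boolean noNulls = true;   // when true, isNull[] is ignored
    boolean[] isNull;
    long[] vector;
    ColVec(int n) { isNull = new boolean[n]; vector = new long[n]; }
}

class AddColumns {
    // Column + column addition that resets the output's noNulls flag rather
    // than inheriting a stale value from a previous use of the batch.
    static void add(ColVec a, ColVec b, ColVec out, int n) {
        out.noNulls = a.noNulls && b.noNulls; // the step HIVE-4592 says is missing
        for (int i = 0; i < n; i++) {
            out.isNull[i] = (!a.noNulls && a.isNull[i]) || (!b.noNulls && b.isNull[i]);
            out.vector[i] = a.vector[i] + b.vector[i];
        }
    }
}
```

If the first line of add() is omitted, a batch whose output vector carried noNulls=false from an earlier query fragment keeps reporting nulls even when both inputs are null-free, which is exactly the wrong-results scenario described.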
Jenkins build is back to normal : Hive-0.10.0-SNAPSHOT-h0.20.1 #153
See https://builds.apache.org/job/Hive-0.10.0-SNAPSHOT-h0.20.1/153/
[jira] [Commented] (HIVE-4592) ColumnArithmeticColumn.txt template never sets output isNull to true; can give wrong results
[ https://issues.apache.org/jira/browse/HIVE-4592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13664761#comment-13664761 ] Eric Hanson commented on HIVE-4592: --- Found some other issues in null propagation as well. ColumnArithmeticColumn.txt template never sets output isNull to true; can give wrong results Key: HIVE-4592 URL: https://issues.apache.org/jira/browse/HIVE-4592 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Eric Hanson Assignee: Eric Hanson Fix For: vectorization-branch ColumnArithmeticColumn.txt should set the output column's noNulls flag to true if neither input column has nulls, but it does not do that. This can lead to wrong results if noNulls was set to false in a previous use of the batch. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Work started] (HIVE-4592) ColumnArithmeticColumn.txt template never sets output isNull to true; can give wrong results
[ https://issues.apache.org/jira/browse/HIVE-4592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-4592 started by Eric Hanson. ColumnArithmeticColumn.txt template never sets output isNull to true; can give wrong results Key: HIVE-4592 URL: https://issues.apache.org/jira/browse/HIVE-4592 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Eric Hanson Assignee: Eric Hanson Fix For: vectorization-branch ColumnArithmeticColumn.txt should set the output column's noNulls flag to true if neither input column has nulls, but it does not do that. This can lead to wrong results if noNulls was set to false in a previous use of the batch. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4593) ErrorMsg has several messages that reuse the same error code
Eugene Koifman created HIVE-4593: Summary: ErrorMsg has several messages that reuse the same error code Key: HIVE-4593 URL: https://issues.apache.org/jira/browse/HIVE-4593 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Eugene Koifman All of these errorCode values are associated with more than one message. 10043 10227 10228 10229 10230 10231 This is not right. This affects ErrorMsg.getErrorMsg(int errorCode) as well as Templeton logic. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4593) ErrorMsg has several messages that reuse the same error code
[ https://issues.apache.org/jira/browse/HIVE-4593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-4593: - Description: All of these errorCode values are associated with more than one message. 10043 10227 10228 10229 10230 10231 This is not right. This affects ErrorMsg.getErrorMsg(int errorCode) as well as Templeton logic. Should probably add a JUnit test to check this. was: All of these errorCode values are associated with more than one message. 10043 10227 10228 10229 10230 10231 This is not right. This affects ErrorMsg.getErrorMsg(int errorCode) as well as Templeton logic. ErrorMsg has several messages that reuse the same error code Key: HIVE-4593 URL: https://issues.apache.org/jira/browse/HIVE-4593 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Eugene Koifman All of these errorCode values are associated with more than one message. 10043 10227 10228 10229 10230 10231 This is not right. This affects ErrorMsg.getErrorMsg(int errorCode) as well as Templeton logic. Should probably add a JUnit test to check this. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
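The JUnit-style uniqueness check the description suggests can be sketched like this. The helper below runs over a stand-in array of codes rather than Hive's real ErrorMsg enum, so the names are hypothetical:

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative duplicate-code detector; ErrorCodeCheck is a hypothetical
// helper, not part of Hive. A real test would iterate ErrorMsg.values()
// and collect the code from each enum constant instead.
class ErrorCodeCheck {
    // Returns the first error code that appears more than once, or -1 if
    // every code is unique.
    static int firstDuplicate(int[] codes) {
        Set<Integer> seen = new HashSet<>();
        for (int c : codes) {
            if (!seen.add(c)) {
                return c;
            }
        }
        return -1;
    }
}
```

Such a test would have caught the reuse of 10043, 10227..10231 listed above before it reached ErrorMsg.getErrorMsg(int errorCode) and the Templeton logic.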
[jira] [Commented] (HIVE-4257) java.sql.SQLNonTransientConnectionException on JDBCStatsAggregator
[ https://issues.apache.org/jira/browse/HIVE-4257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13664794#comment-13664794 ] Navis commented on HIVE-4257: - running test java.sql.SQLNonTransientConnectionException on JDBCStatsAggregator -- Key: HIVE-4257 URL: https://issues.apache.org/jira/browse/HIVE-4257 Project: Hive Issue Type: Bug Components: Statistics Affects Versions: 0.11.0 Reporter: Teddy Choi Assignee: Teddy Choi Priority: Minor Attachments: HIVE-4257.1.patch.txt java.sql.SQLNonTransientConnectionException occurs on JDBCStatsAggregator after executing dozens of Hive queries periodically, which inserts thousands of rows. It may have a relation with DERBY-5098. To avoid this error, Hive should use a more recent version of Derby(10.6.2.3, 10.7.1.4, 10.8.2.2, 10.9.1.0 or later). Hive 0.11.0-SNAPSHOT uses Derby 10.4.2.0. {noformat} 2013-03-24 15:54:30,487 ERROR jdbc.JDBCStatsAggregator (JDBCStatsAggregator.java:aggregateStats(168)) - Error during publishing aggregation. java.sql.SQLNonTransientConnectionException: No current connection. 2013-03-24 15:54:30,487 ERROR jdbc.JDBCStatsAggregator (JDBCStatsAggregator.java:aggregateStats(168)) - Error during publishing aggregation. java.sql.SQLNonTransientConnectionException: No current connection. 2013-03-24 15:54:30,487 ERROR jdbc.JDBCStatsAggregator (JDBCStatsAggregator.java:cleanUp(249)) - Error during publishing aggregation. java.sql.SQLNonTransientConnectionException: No current connection. {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4194) JDBC2: HiveDriver should not throw RuntimeException when passed an invalid URL
[ https://issues.apache.org/jira/browse/HIVE-4194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13664795#comment-13664795 ] Navis commented on HIVE-4194: - running test JDBC2: HiveDriver should not throw RuntimeException when passed an invalid URL -- Key: HIVE-4194 URL: https://issues.apache.org/jira/browse/HIVE-4194 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Affects Versions: 0.11.0 Reporter: Richard Ding Assignee: Richard Ding Attachments: HIVE-4194.patch As per the JDBC 3.0 Spec (section 9.2): If the Driver implementation understands the URL, it will return a Connection object; otherwise it returns null. Currently the HiveConnection constructor will throw IllegalArgumentException if the url string doesn't start with jdbc:hive2. This exception should be caught by HiveDriver.connect, which should return null. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
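The contract from the spec quote above can be sketched as follows. DriverSketch is a simplified stand-in, not the real HiveDriver, and a plain Object stands in for java.sql.Connection to keep the sketch self-contained:

```java
// Hypothetical sketch of the JDBC-spec behaviour the issue requests:
// an unrecognized URL yields null from connect(), never an exception.
class DriverSketch {
    static final String PREFIX = "jdbc:hive2://";

    static boolean acceptsURL(String url) {
        return url != null && url.startsWith(PREFIX);
    }

    static Object connect(String url) {
        if (!acceptsURL(url)) {
            return null; // JDBC 3.0 spec, section 9.2
        }
        return new Object(); // stand-in for building a real connection
    }
}
```

The point is that the URL check happens before any constructor that might throw, so a DriverManager iterating over registered drivers simply moves on to the next one.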
[jira] [Commented] (HIVE-4220) TimestampWritable.toString throws array index exception sometimes
[ https://issues.apache.org/jira/browse/HIVE-4220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13664801#comment-13664801 ] Navis commented on HIVE-4220: - [~mikhail] The default value of max-worker of HiveServer is Integer.MAX and I've thought it could make too many formatters in some (erroneous) situation. But admittedly, it's safer and cleaner. +1 and running test. TimestampWritable.toString throws array index exception sometimes - Key: HIVE-4220 URL: https://issues.apache.org/jira/browse/HIVE-4220 Project: Hive Issue Type: Bug Reporter: Navis Assignee: Navis Attachments: HIVE-4220.D9669.1.patch {noformat} org.apache.hive.service.cli.HiveSQLException: java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 45 at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:215) at org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:170) at org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:288) at org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:348) at org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1553) at org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1538) at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Caused by: java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 45 at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:194) at 
org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1449) at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:193) ... 11 more Caused by: java.lang.ArrayIndexOutOfBoundsException: 45 at sun.util.calendar.BaseCalendar.getCalendarDateFromFixedDate(BaseCalendar.java:436) at java.util.GregorianCalendar.computeFields(GregorianCalendar.java:2081) at java.util.GregorianCalendar.computeFields(GregorianCalendar.java:1996) at java.util.Calendar.setTimeInMillis(Calendar.java:1110) at java.util.Calendar.setTime(Calendar.java:1076) at java.text.SimpleDateFormat.format(SimpleDateFormat.java:875) at java.text.SimpleDateFormat.format(SimpleDateFormat.java:868) at java.text.DateFormat.format(DateFormat.java:316) at org.apache.hadoop.hive.serde2.io.TimestampWritable.toString(TimestampWritable.java:327) at org.apache.hadoop.hive.serde2.lazy.LazyTimestamp.writeUTF8(LazyTimestamp.java:95) at org.apache.hadoop.hive.serde2.lazy.LazyUtils.writePrimitiveUTF8(LazyUtils.java:234) at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:427) at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serializeField(LazySimpleSerDe.java:381) at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:365) at org.apache.hadoop.hive.ql.exec.ListSinkOperator.processOp(ListSinkOperator.java:96) at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:487) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:821) at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84) at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:487) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:821) at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:90) at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:487) at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:474) at 
org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:468) at org.apache.hadoop.hive.ql.exec.FetchTask.fetchAndPush(FetchTask.java:222) at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:188) ... 13 more {noformat} data formatter in TimestampWritable is declared static and shared but it's not thread-safe. -- This message is automatically generated by JIRA. If you think it was sent
[jira] [Commented] (HIVE-4516) Fix concurrency bug in serde/src/java/org/apache/hadoop/hive/serde2/io/TimestampWritable.java
[ https://issues.apache.org/jira/browse/HIVE-4516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13664803#comment-13664803 ] Navis commented on HIVE-4516: - +1, running test Fix concurrency bug in serde/src/java/org/apache/hadoop/hive/serde2/io/TimestampWritable.java - Key: HIVE-4516 URL: https://issues.apache.org/jira/browse/HIVE-4516 Project: Hive Issue Type: Bug Reporter: Jon Hartlaub Attachments: TimestampWritable.java.patch A patch for concurrent use of TimestampWritable which occurs in a multithreaded scenario (as found in AmpLab Shark). A static SimpleDateFormat (not ThreadSafe) is used by TimestampWritable in CTAS DDL statements where it manifests as data corruption when used in a concurrent environment. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
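One common fix for the shared static SimpleDateFormat problem described in HIVE-4220 and HIVE-4516 is a per-thread formatter. The sketch below is a general illustration of that pattern under hypothetical names, not the patch attached here:

```java
import java.text.SimpleDateFormat;
import java.util.Date;

// SimpleDateFormat is not thread-safe, so sharing one static instance across
// worker threads corrupts output. ThreadLocal gives each thread its own
// formatter at the cost of one instance per live thread.
class SafeTimestampFormat {
    private static final ThreadLocal<SimpleDateFormat> FMT =
        ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy-MM-dd HH:mm:ss"));

    static String format(Date d) {
        return FMT.get().format(d); // never shared between threads
    }
}
```

The alternatives are creating a formatter per call (the allocation the vectorization work tries to avoid) or synchronizing on a single instance (a contention point under the thread-pool sizes mentioned in the comment above).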
[jira] [Commented] (HIVE-4592) ColumnArithmeticColumn.txt template never sets output isNull to true; can give wrong results
[ https://issues.apache.org/jira/browse/HIVE-4592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13664817#comment-13664817 ] Jitendra Nath Pandey commented on HIVE-4592: The same issue exists in many other templates. I think we should fix them too in the same jira. Also, most of these templates assume that noNulls=false and isRepeating=true means all values are null. ColumnArithmeticColumn.txt template never sets output isNull to true; can give wrong results Key: HIVE-4592 URL: https://issues.apache.org/jira/browse/HIVE-4592 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Eric Hanson Assignee: Eric Hanson Fix For: vectorization-branch ColumnArithmeticColumn.txt should set the output column's noNulls flag to true if neither input column has nulls, but it does not do that. This can lead to wrong results if noNulls was set to false in a previous use of the batch. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4220) TimestampWritable.toString throws array index exception sometimes
[ https://issues.apache.org/jira/browse/HIVE-4220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-4220: Resolution: Duplicate Status: Resolved (was: Patch Available) TimestampWritable.toString throws array index exception sometimes - Key: HIVE-4220 URL: https://issues.apache.org/jira/browse/HIVE-4220 Project: Hive Issue Type: Bug Reporter: Navis Assignee: Navis Attachments: HIVE-4220.D9669.1.patch {noformat} org.apache.hive.service.cli.HiveSQLException: java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 45 at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:215) at org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:170) at org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:288) at org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:348) at org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1553) at org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1538) at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Caused by: java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 45 at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:194) at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1449) at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:193) ... 
11 more Caused by: java.lang.ArrayIndexOutOfBoundsException: 45 at sun.util.calendar.BaseCalendar.getCalendarDateFromFixedDate(BaseCalendar.java:436) at java.util.GregorianCalendar.computeFields(GregorianCalendar.java:2081) at java.util.GregorianCalendar.computeFields(GregorianCalendar.java:1996) at java.util.Calendar.setTimeInMillis(Calendar.java:1110) at java.util.Calendar.setTime(Calendar.java:1076) at java.text.SimpleDateFormat.format(SimpleDateFormat.java:875) at java.text.SimpleDateFormat.format(SimpleDateFormat.java:868) at java.text.DateFormat.format(DateFormat.java:316) at org.apache.hadoop.hive.serde2.io.TimestampWritable.toString(TimestampWritable.java:327) at org.apache.hadoop.hive.serde2.lazy.LazyTimestamp.writeUTF8(LazyTimestamp.java:95) at org.apache.hadoop.hive.serde2.lazy.LazyUtils.writePrimitiveUTF8(LazyUtils.java:234) at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:427) at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serializeField(LazySimpleSerDe.java:381) at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:365) at org.apache.hadoop.hive.ql.exec.ListSinkOperator.processOp(ListSinkOperator.java:96) at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:487) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:821) at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84) at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:487) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:821) at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:90) at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:487) at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:474) at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:468) at org.apache.hadoop.hive.ql.exec.FetchTask.fetchAndPush(FetchTask.java:222) at 
org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:188) ... 13 more {noformat} data formatter in TimestampWritable is declared static and shared but it's not thread-safe. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4569) GetQueryPlan api in Hive Server2
[ https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jaideep Dhok updated HIVE-4569: --- Attachment: git-4569.patch Attaching patch; somehow it got skipped earlier. GetQueryPlan api in Hive Server2 Key: HIVE-4569 URL: https://issues.apache.org/jira/browse/HIVE-4569 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Amareshwari Sriramadasu Assignee: Jaideep Dhok Attachments: git-4569.patch It would be nice to have GetQueryPlan as a thrift api. I do not see GetQueryPlan api available in HiveServer2, though the wiki https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Thrift+API mentions it; not sure why it was not added. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2
[ https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13664836#comment-13664836 ] Jaideep Dhok commented on HIVE-4569: @Carl: This change will not affect JDBC/ODBC clients. Currently clients using Thrift have no way to get the query plan, which is why we wanted to add this. Here are the changes proposed: # Add GetQueryPlan with the same arguments as ExecuteStatement - {code}TGetQueryPlanResp GetQueryPlan(1:TExecuteStatementReq req);{code} # Run a SQLOperation for the request, calling Driver.compile with the statement, and return the plan object. Throw HiveSQLException with the return code of compile if it fails. # New response type for the above call - {code} struct TGetQueryPlanResp { 1: required TStatus status // Queryplan 2: required queryplan.Query plan } {code} We'll have to include queryplan.thrift in TCLIService.thrift for the return type. GetQueryPlan api in Hive Server2 Key: HIVE-4569 URL: https://issues.apache.org/jira/browse/HIVE-4569 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Amareshwari Sriramadasu Assignee: Jaideep Dhok Attachments: git-4569.patch It would be nice to have GetQueryPlan as a thrift api. I do not see GetQueryPlan api available in HiveServer2, though the wiki https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Thrift+API mentions it; not sure why it was not added. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4594) UDF should use setLenient(false) when using SimpleDateFormat for parsing given datestring
Minwoo Kim created HIVE-4594: - Summary: UDF should use setLenient(false) when using SimpleDateFormat for parsing given datestring Key: HIVE-4594 URL: https://issues.apache.org/jira/browse/HIVE-4594 Project: Hive Issue Type: Bug Components: UDF Reporter: Minwoo Kim Priority: Minor If a UDF has a date format of MM/DD/ and the function is supplied the String 9/5/05, the date should *not* be allowed. In all cases, parsing must be non-lenient; the given string must strictly adhere to the parsing format. For example, {code} select hour('2013-05-111 10:10:1') from src limit 1; {code} returns 10. The result returned is not what is expected; it should be null. SimpleDateFormat is lenient by default, so UDFs should use setLenient(false) when using SimpleDateFormat for parsing a given date string, except for very specific or intended cases. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4594) UDF should use setLenient(false) when using SimpleDateFormat for parsing given datestring
[ https://issues.apache.org/jira/browse/HIVE-4594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Minwoo Kim updated HIVE-4594: - Description: If a UDF has a date format of MM/DD/ and the function is supplied the String 9/5/05, the date should *not* be allowed. In most cases, parsing must be non-lenient; the given string must strictly adhere to the parsing format. For example, {code} select hour('2013-05-111 10:10:1') from src limit 1; {code} returns 10. This is not the expected result; it should be null. SimpleDateFormat is lenient by default, so UDFs should use setLenient(false) when using SimpleDateFormat to parse a given date string, except in very specific or intended cases.
was: If a UDF has a date format of MM/DD/ and the function is supplied the String 9/5/05, the date should *not* be allowed. In all cases, parsing must be non-lenient; the given string must strictly adhere to the parsing format. For example, {code} select hour('2013-05-111 10:10:1') from src limit 1; {code} returns 10. This is not the expected result; it should be null. SimpleDateFormat is lenient by default, so UDFs should use setLenient(false) when using SimpleDateFormat to parse a given date string, except in very specific or intended cases.
UDF should use setLenient(false) when using SimpleDateFormat for parsing given datestring - Key: HIVE-4594 URL: https://issues.apache.org/jira/browse/HIVE-4594 Project: Hive Issue Type: Bug Components: UDF Reporter: Minwoo Kim Priority: Minor
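The lenient-vs-strict behavior described in the issue can be sketched as follows. This is a minimal standalone example, not Hive's actual UDF code; the class and method names are illustrative. It shows that with default (lenient) parsing, SimpleDateFormat silently rolls an invalid date like '2013-05-111' forward to a real date (which is why hour() returns 10), while setLenient(false) makes parse() throw, letting a UDF return null instead.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

public class StrictDateParse {
    // Parse with lenient disabled; return null on invalid input,
    // mirroring the null result the issue expects from hour().
    public static Date parseStrict(String pattern, String value) {
        SimpleDateFormat fmt = new SimpleDateFormat(pattern);
        fmt.setLenient(false); // reject impossible fields like day-of-month 111
        try {
            return fmt.parse(value);
        } catch (ParseException e) {
            return null;
        }
    }

    public static void main(String[] args) throws ParseException {
        // Lenient (the default) rolls '2013-05-111' forward to a valid date
        // instead of rejecting it.
        SimpleDateFormat lenient = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        System.out.println("lenient: " + lenient.parse("2013-05-111 10:10:1"));

        // Strict parsing rejects the same string.
        System.out.println("strict: "
            + parseStrict("yyyy-MM-dd HH:mm:ss", "2013-05-111 10:10:1"));
    }
}
```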
[jira] [Updated] (HIVE-4569) GetQueryPlan api in Hive Server2
[ https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-4569: - Status: Open (was: Patch Available) Please post a review request on Review Board or Phabricator. GetQueryPlan api in Hive Server2 Key: HIVE-4569 URL: https://issues.apache.org/jira/browse/HIVE-4569 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Amareshwari Sriramadasu Assignee: Jaideep Dhok Attachments: git-4569.patch It would be nice to have GetQueryPlan as a thrift api. I do not see a GetQueryPlan api available in HiveServer2, though the wiki https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Thrift+API lists it; not sure why it was not added.
[jira] [Commented] (HIVE-4570) More information to user on GetOperationStatus in Hive Server2 when query is still executing
[ https://issues.apache.org/jira/browse/HIVE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13664858#comment-13664858 ] Amareshwari Sriramadasu commented on HIVE-4570: --- bq. The current API GetOperationState is not enough since it returns only a state enum. Instead of changing that, we can add a new API GetOperationProgress() which will return both OperationState and OperationProgress. Sounds good. +1. For the default implementation of getProgress(), you can return 1 if the task is successful and 0 otherwise. More information to user on GetOperationStatus in Hive Server2 when query is still executing Key: HIVE-4570 URL: https://issues.apache.org/jira/browse/HIVE-4570 Project: Hive Issue Type: Improvement Components: HiveServer2 Affects Versions: 0.11.0 Reporter: Amareshwari Sriramadasu Assignee: Jaideep Dhok Currently in Hive Server2, when the query is still executing, only the status is set as STILL_EXECUTING. This issue is to give more information to the user, such as progress and running job handles, if possible.
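The coarse default getProgress() suggested in the comment (1 when finished, 0 otherwise) can be sketched like this. This is a hypothetical standalone sketch, not HiveServer2 code; the enum here is a simplified stand-in for the real OperationState.

```java
public class OperationProgressSketch {
    // Simplified stand-in for HiveServer2's operation states (hypothetical).
    enum OperationState { INITIALIZED, RUNNING, FINISHED, ERROR }

    private final OperationState state;

    OperationProgressSketch(OperationState state) { this.state = state; }

    // Default progress: all-or-nothing until an operation can report
    // fine-grained progress (e.g. from its underlying MapReduce jobs).
    public double getProgress() {
        return state == OperationState.FINISHED ? 1.0 : 0.0;
    }

    public static void main(String[] args) {
        System.out.println(new OperationProgressSketch(OperationState.RUNNING).getProgress());  // 0.0
        System.out.println(new OperationProgressSketch(OperationState.FINISHED).getProgress()); // 1.0
    }
}
```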
[jira] [Updated] (HIVE-4569) GetQueryPlan api in Hive Server2
[ https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-4569: -- Attachment: HIVE-4569.D10887.1.patch
jaideepdhok requested code review of HIVE-4569 [jira] GetQueryPlan api in Hive Server2.
Reviewers: JIRA
HIVE-4569 It would be nice to have GetQueryPlan as a thrift api. I do not see a GetQueryPlan api available in HiveServer2, though the wiki https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Thrift+API lists it; not sure why it was not added.
TEST PLAN Added unit test CLIServiceTest.testGetQueryPlan
REVISION DETAIL https://reviews.facebook.net/D10887
AFFECTED FILES
service/if/TCLIService.thrift
service/src/gen/thrift/gen-cpp/TCLIService.cpp
service/src/gen/thrift/gen-cpp/TCLIService.h
service/src/gen/thrift/gen-cpp/TCLIService_server.skeleton.cpp
service/src/gen/thrift/gen-cpp/TCLIService_types.cpp
service/src/gen/thrift/gen-cpp/TCLIService_types.h
service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TCLIService.java
service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TGetQueryPlanResp.java
service/src/gen/thrift/gen-php/TCLIService.php
service/src/gen/thrift/gen-py/TCLIService/TCLIService-remote
service/src/gen/thrift/gen-py/TCLIService/TCLIService.py
service/src/gen/thrift/gen-py/TCLIService/ttypes.py
service/src/gen/thrift/gen-rb/t_c_l_i_service.rb
service/src/gen/thrift/gen-rb/t_c_l_i_service_types.rb
service/src/java/org/apache/hive/service/cli/CLIService.java
service/src/java/org/apache/hive/service/cli/CLIServiceClient.java
service/src/java/org/apache/hive/service/cli/EmbeddedCLIServiceClient.java
service/src/java/org/apache/hive/service/cli/ICLIService.java
service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java
service/src/java/org/apache/hive/service/cli/session/HiveSession.java
service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java
service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java
service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIServiceClient.java
service/src/test/org/apache/hive/service/cli/CLIServiceTest.java
To: JIRA, jaideepdhok
GetQueryPlan api in Hive Server2 Key: HIVE-4569 URL: https://issues.apache.org/jira/browse/HIVE-4569 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Amareshwari Sriramadasu Assignee: Jaideep Dhok Attachments: git-4569.patch, HIVE-4569.D10887.1.patch It would be nice to have GetQueryPlan as a thrift api. I do not see a GetQueryPlan api available in HiveServer2, though the wiki https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Thrift+API lists it; not sure why it was not added.
[jira] [Updated] (HIVE-4569) GetQueryPlan api in Hive Server2
[ https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jaideep Dhok updated HIVE-4569: --- Status: Patch Available (was: Open) GetQueryPlan api in Hive Server2 Key: HIVE-4569 URL: https://issues.apache.org/jira/browse/HIVE-4569 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Amareshwari Sriramadasu Assignee: Jaideep Dhok Attachments: git-4569.patch, HIVE-4569.D10887.1.patch It would be nice to have GetQueryPlan as a thrift api. I do not see a GetQueryPlan api available in HiveServer2, though the wiki https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Thrift+API lists it; not sure why it was not added.
[jira] [Updated] (HIVE-4531) [WebHCat] Collecting task logs to hdfs
[ https://issues.apache.org/jira/browse/HIVE-4531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4531: - Attachment: samplestatusdirwithlist.tar.gz HIVE-4531-5.patch Adding a list file to the logs. Attached samplestatusdirwithlist.tar.gz for a sample status directory. Here is a sample list file (list.txt):
job: job_201305221327_0068(name=PigLatin:73.pig,status=SUCCEEDED)
attempt:attempt_201305221327_0068_m_00_0(type=map,status=completed,starttime=22-May-2013 17:10:26,endtime=22-May-2013 17:10:32)
attempt:attempt_201305221327_0068_m_02_0(type=setup,status=completed,starttime=22-May-2013 17:10:17,endtime=22-May-2013 17:10:26)
attempt:attempt_201305221327_0068_m_01_0(type=cleanup,status=completed,starttime=22-May-2013 17:10:32,endtime=22-May-2013 17:10:38)
job: job_201305221327_0069(name=PigLatin:73.pig,status=SUCCEEDED)
attempt:attempt_201305221327_0069_m_00_0(type=map,status=completed,starttime=22-May-2013 17:10:53,endtime=22-May-2013 17:10:59)
attempt:attempt_201305221327_0069_r_00_0(type=reduce,status=completed,starttime=22-May-2013 17:10:59,endtime=22-May-2013 17:11:11)
attempt:attempt_201305221327_0069_m_02_0(type=setup,status=completed,starttime=22-May-2013 17:10:44,endtime=22-May-2013 17:10:53)
attempt:attempt_201305221327_0069_m_01_0(type=cleanup,status=completed,starttime=22-May-2013 17:11:11,endtime=22-May-2013 17:11:17)
job: job_201305221327_0070(name=PigLatin:73.pig,status=SUCCEEDED)
attempt:attempt_201305221327_0070_m_00_0(type=map,status=completed,starttime=22-May-2013 17:11:32,endtime=22-May-2013 17:11:38)
attempt:attempt_201305221327_0070_r_00_0(type=reduce,status=completed,starttime=22-May-2013 17:11:38,endtime=22-May-2013 17:11:50)
attempt:attempt_201305221327_0070_m_02_0(type=setup,status=completed,starttime=22-May-2013 17:11:23,endtime=22-May-2013 17:11:32)
attempt:attempt_201305221327_0070_m_01_0(type=cleanup,status=completed,starttime=22-May-2013 17:11:50,endtime=22-May-2013 17:11:56)
job: job_201305221327_0071(name=PigLatin:73.pig,status=FAILED)
attempt:attempt_201305221327_0071_m_00_0(type=map,status=completed,starttime=22-May-2013 17:12:11,endtime=22-May-2013 17:12:17)
attempt:attempt_201305221327_0071_m_01_0(type=map,status=completed,starttime=22-May-2013 17:12:17,endtime=22-May-2013 17:12:23)
attempt:attempt_201305221327_0071_m_03_0(type=setup,status=completed,starttime=22-May-2013 17:12:02,endtime=22-May-2013 17:12:11)
attempt:attempt_201305221327_0071_m_02_0(type=cleanup,status=completed,starttime=22-May-2013 17:13:11,endtime=22-May-2013 17:13:17)
attempt:attempt_201305221327_0071_r_00_0(type=reduce,status=failed,starttime=22-May-2013 17:12:17,endtime=22-May-2013 17:12:29)
attempt:attempt_201305221327_0071_r_00_1(type=reduce,status=failed,starttime=22-May-2013 17:12:35,endtime=22-May-2013 17:12:33)
attempt:attempt_201305221327_0071_r_00_2(type=reduce,status=failed,starttime=22-May-2013 17:12:47,endtime=22-May-2013 17:12:43)
attempt:attempt_201305221327_0071_r_00_3(type=reduce,status=failed,starttime=22-May-2013 17:12:59,endtime=22-May-2013 17:12:46)
[WebHCat] Collecting task logs to hdfs -- Key: HIVE-4531 URL: https://issues.apache.org/jira/browse/HIVE-4531 Project: Hive Issue Type: New Feature Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4531-1.patch, HIVE-4531-2.patch, HIVE-4531-3.patch, HIVE-4531-4.patch, HIVE-4531-5.patch, samplestatusdirwithlist.tar.gz It would be nice to collect task logs after the job finishes. This is similar to what Amazon EMR does.
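A consumer of the list.txt format shown above could extract the attempt fields with a small parser like this. This is a hypothetical sketch based only on the sample lines in the attachment description; the field names (type, status, starttime, endtime) come from that sample, and the class and method names are illustrative, not part of the patch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AttemptLineParser {
    // One "attempt:" line of list.txt, e.g.
    // attempt:attempt_..._m_00_0(type=map,status=completed,starttime=...,endtime=...)
    private static final Pattern ATTEMPT = Pattern.compile(
        "attempt:(\\S+?)\\(type=([^,]+),status=([^,]+),starttime=([^,]+),endtime=([^)]+)\\)");

    // Returns {id, type, status, starttime, endtime}, or null if the line
    // is not an attempt line (e.g. a "job:" line).
    public static String[] parse(String line) {
        Matcher m = ATTEMPT.matcher(line.trim());
        if (!m.matches()) return null;
        return new String[] { m.group(1), m.group(2), m.group(3), m.group(4), m.group(5) };
    }

    public static void main(String[] args) {
        String line = "attempt:attempt_201305221327_0068_m_00_0(type=map,status=completed,"
            + "starttime=22-May-2013 17:10:26,endtime=22-May-2013 17:10:32)";
        String[] f = parse(line);
        System.out.println(f[1] + " " + f[2]); // map completed
    }
}
```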