Regarding the error in Hive query

2012-01-03 Thread Bhavesh Shah
Hello,
I am trying to run a join query in Hive, but I am getting an error.

My query is:
insert overwrite table t1 select subset.* from subset s JOIN testencounter
t on (t.patient_mrn =s.patient_mrn and t.encounter_date<=s.encounter_date)
where s.cad=0;
The error I get is:
FAILED: Error in semantic analysis: Line 1:114 Both left and right aliases
encountered in JOIN Encounter_Date

What does this error mean? I do not understand it. Please suggest a solution.


-- 
Thanks and Regards,
Bhavesh Shah


[jira] [Commented] (HIVE-2503) HiveServer should provide per session configuration

2012-01-03 Thread Navis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13179279#comment-13179279
 ] 

Navis commented on HIVE-2503:
-

@Carl: Yes, ready.

> HiveServer should provide per session configuration
> ---
>
> Key: HIVE-2503
> URL: https://issues.apache.org/jira/browse/HIVE-2503
> Project: Hive
>  Issue Type: Bug
>  Components: CLI, Server Infrastructure
>Affects Versions: 0.9.0
>Reporter: Navis
>Assignee: Navis
> Fix For: 0.9.0
>
> Attachments: HIVE-2503.1.patch.txt
>
>
> Currently ThriftHiveProcessorFactory returns the same HiveConf instance to 
> HiveServerHandler, making it impossible to use per-session configuration. 
> Simply wrapping 'conf' -> 'new HiveConf(conf)' seemed to solve this problem.
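
The effect described above can be sketched in a few lines. This is only an illustration under stated assumptions: DemoConf is a hypothetical stand-in written for this example, not Hive's HiveConf, and sharedFor/copyFor are made-up helper names. It shows why handing every session the same instance leaks settings, and why a copy constructor avoids that.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a configuration object; this is NOT Hive's HiveConf.
final class DemoConf {
  private final Map<String, String> values;

  DemoConf() {
    this.values = new HashMap<>();
  }

  // Copy constructor: the new instance gets its own map, so later changes
  // to one instance are not visible through the other.
  DemoConf(DemoConf other) {
    this.values = new HashMap<>(other.values);
  }

  void set(String key, String value) {
    values.put(key, value);
  }

  String get(String key) {
    return values.get(key);
  }
}

final class PerSessionConfDemo {
  // Shared-instance behaviour: every session receives the same object.
  static DemoConf sharedFor(DemoConf base) {
    return base;
  }

  // Per-session behaviour: every session receives a private copy.
  static DemoConf copyFor(DemoConf base) {
    return new DemoConf(base);
  }
}
```

With copyFor, a value set by one session stays invisible to the others and to the base object; with sharedFor, the same write is seen by everyone.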

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-2573) Create per-session function registry

2012-01-03 Thread Navis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13179277#comment-13179277
 ] 

Navis commented on HIVE-2573:
-

There are no test cases yet. I think HIVE-2503 should be resolved first.

> Create per-session function registry 
> -
>
> Key: HIVE-2573
> URL: https://issues.apache.org/jira/browse/HIVE-2573
> Project: Hive
>  Issue Type: Improvement
>  Components: Server Infrastructure
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-2573.1.patch.txt, HIVE-2573.2.patch.txt, 
> HIVE-2573.3.patch.txt
>
>
> Currently the function registry is a shared resource and could be overridden 
> by other users when using HiveServer. If a per-session function registry is 
> provided, this situation could be prevented.
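
One way to picture a per-session registry is a session-local map layered over the shared one. The sketch below is an assumption made for illustration only: the class name, the name-to-class-name mapping, and the fallback rule are hypothetical and are not taken from the attached patches.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: a session-local overlay on top of a shared registry.
final class SessionFunctionRegistry {
  // Server-wide registrations (function name in lower case -> class name).
  private final Map<String, String> shared;
  // Registrations visible only to this session.
  private final Map<String, String> local = new HashMap<>();

  SessionFunctionRegistry(Map<String, String> shared) {
    this.shared = shared;
  }

  // Registers a function for this session only; the shared map is not touched.
  void register(String name, String className) {
    local.put(name.toLowerCase(Locale.ROOT), className);
  }

  // Session-local entries win; otherwise fall back to the shared registry.
  String lookup(String name) {
    String key = name.toLowerCase(Locale.ROOT);
    String own = local.get(key);
    return own != null ? own : shared.get(key);
  }
}
```

Because register writes only to the session-local map, one user redefining a function cannot change what another session resolves.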





[jira] [Commented] (HIVE-2503) HiveServer should provide per session configuration

2012-01-03 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13179275#comment-13179275
 ] 

jirapos...@reviews.apache.org commented on HIVE-2503:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2820/
---

(Updated 2012-01-04 05:08:24.994913)


Review request for hive and Carl Steinbach.


Changes
---

Added missing ASF header.


Summary
---

We use multiple HiveClient instances connected to a single HiveServer. After 
configuring the connections, we found that the environment variables differed 
from what we expected. The current Hive server uses a single HiveConf instance, 
and this appears to cause the problem.


This addresses bug HIVE-2503.
https://issues.apache.org/jira/browse/HIVE-2503


Diffs (updated)
-

  service/src/java/org/apache/hadoop/hive/service/HiveServer.java 854cc99 
  service/src/test/org/apache/hadoop/hive/service/TestHiveServerSessions.java 
PRE-CREATION 

Diff: https://reviews.apache.org/r/2820/diff


Testing
---

If we could use 'standAloneServer' in TestHiveServer.class, writing the test 
case would be very simple. But it appears to be 'false', which makes the test 
more complex.


Thanks,

Navis



> HiveServer should provide per session configuration
> ---
>
> Key: HIVE-2503
> URL: https://issues.apache.org/jira/browse/HIVE-2503
> Project: Hive
>  Issue Type: Bug
>  Components: CLI, Server Infrastructure
>Affects Versions: 0.9.0
>Reporter: Navis
>Assignee: Navis
> Fix For: 0.9.0
>
> Attachments: HIVE-2503.1.patch.txt
>
>
> Currently ThriftHiveProcessorFactory returns the same HiveConf instance to 
> HiveServerHandler, making it impossible to use per-session configuration. 
> Simply wrapping 'conf' -> 'new HiveConf(conf)' seemed to solve this problem.





Re: Review Request: HIVE-2503: create per-session HiveConf instance

2012-01-03 Thread Navis Ryu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2820/
---

(Updated 2012-01-04 05:08:24.994913)


Review request for hive and Carl Steinbach.


Changes
---

Added missing ASF header.


Summary
---

We use multiple HiveClient instances connected to a single HiveServer. After 
configuring the connections, we found that the environment variables differed 
from what we expected. The current Hive server uses a single HiveConf instance, 
and this appears to cause the problem.


This addresses bug HIVE-2503.
https://issues.apache.org/jira/browse/HIVE-2503


Diffs (updated)
-

  service/src/java/org/apache/hadoop/hive/service/HiveServer.java 854cc99 
  service/src/test/org/apache/hadoop/hive/service/TestHiveServerSessions.java 
PRE-CREATION 

Diff: https://reviews.apache.org/r/2820/diff


Testing
---

If we could use 'standAloneServer' in TestHiveServer.class, writing the test 
case would be very simple. But it appears to be 'false', which makes the test 
more complex.


Thanks,

Navis



[jira] [Commented] (HIVE-2478) Support dry run option in hive

2012-01-03 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13179239#comment-13179239
 ] 

Phabricator commented on HIVE-2478:
---

khorgath has commented on the revision "HIVE-2478 [jira] Support dry run option 
in hive".

INLINE COMMENTS
  ql/src/test/queries/clientnegative/dryrun_bad_fetch_serde.q:1 Hmm.. ok, this 
is funny - if I try adding comments (using double hyphen) to the .q script, I 
get parse failures from the testcase runner.  And a number of other .q tests 
don't have comments either, so I don't see what format I can use to comment.

  How do I add comments here?

  -- comment  : does not work
  # comment : does not work
  // comment : does not work
  /* comment */ : does not work

REVISION DETAIL
  https://reviews.facebook.net/D927


> Support dry run option in hive
> --
>
> Key: HIVE-2478
> URL: https://issues.apache.org/jira/browse/HIVE-2478
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Affects Versions: 0.9.0
>Reporter: kalyan ram
>Assignee: Sushanth Sowmyan
>Priority: Minor
> Attachments: HIVE-2478-1.patch, HIVE-2478-2.patch, HIVE-2478-3.patch, 
> HIVE-2478.D927.1.patch
>
>
> Hive currently doesn't support a dry run option. For some complex queries we 
> just want to verify the query syntax initially before running it. A dry run 
> option where just the parsing is done without actual execution is a good 
> option.





[jira] [Commented] (HIVE-2279) Implement sort(array) UDF

2012-01-03 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13179237#comment-13179237
 ] 

Phabricator commented on HIVE-2279:
---

cwsteinbach has requested changes to the revision "HIVE-2279 [jira] Implement 
sort(array) UDF".

INLINE COMMENTS
  ql/src/test/queries/clientpositive/udf_sort_array.q:8 Please add EXPLAIN 
queries.
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSortArray.java:82 
Fix indentation.
  ql/src/test/results/clientpositive/udf_sort_array.q.out:13 
"sort_array(sort_array(...))"?
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSortArray.java:45 
"Sorts the input array in ascending order according to the natural ordering of 
the array elements."
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSortArray.java:55 
Please add a negative testcase that exercises these code paths.
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSortArray.java:60 
What happens if I try to sort an array of arrays, or array of structs?
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSortArray.java:67 
The return type should be another ARRAY, not a string.
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSortArray.java:85 
This method needs to return another ARRAY, not a string containing the 
concatenated, sorted input elements. Take a look at the array() UDF for hints 
on how to return an array from a UDF.
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSortArray.java:97 
Indentation.
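
The request that the function return another ARRAY rather than a concatenated string can be illustrated outside Hive. The sketch below is plain Java under stated assumptions: SortArraySketch and sortArray are hypothetical names, and nothing here uses Hive's GenericUDF or ObjectInspector API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of "sort and return an array", not the patch under review.
final class SortArraySketch {
  // Returns a new list in ascending natural order; the input is left untouched.
  static <T extends Comparable<? super T>> List<T> sortArray(List<T> input) {
    List<T> sorted = new ArrayList<>(input);
    Collections.sort(sorted);
    return sorted;
  }
}
```

In this sketch the Comparable bound rejects element types without a natural ordering at compile time. A UDF has to make the equivalent check at runtime, which is what the requested negative test cases would exercise.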

REVISION DETAIL
  https://reviews.facebook.net/D1107


> Implement sort(array) UDF
> -
>
> Key: HIVE-2279
> URL: https://issues.apache.org/jira/browse/HIVE-2279
> Project: Hive
>  Issue Type: New Feature
>  Components: UDF
>Reporter: Carl Steinbach
>Assignee: Zhenxiao Luo
> Attachments: HIVE-2279.D1059.1.patch, HIVE-2279.D1101.1.patch, 
> HIVE-2279.D1107.1.patch
>
>






[jira] [Updated] (HIVE-2279) Implement sort(array) UDF

2012-01-03 Thread Carl Steinbach (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-2279:
-

Status: Open  (was: Patch Available)

> Implement sort(array) UDF
> -
>
> Key: HIVE-2279
> URL: https://issues.apache.org/jira/browse/HIVE-2279
> Project: Hive
>  Issue Type: New Feature
>  Components: UDF
>Reporter: Carl Steinbach
>Assignee: Zhenxiao Luo
> Attachments: HIVE-2279.D1059.1.patch, HIVE-2279.D1101.1.patch, 
> HIVE-2279.D1107.1.patch
>
>






[jira] [Updated] (HIVE-2203) Extend concat_ws() UDF to support arrays of strings

2012-01-03 Thread Carl Steinbach (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-2203:
-

Status: Open  (was: Patch Available)

> Extend concat_ws() UDF to support arrays of strings
> ---
>
> Key: HIVE-2203
> URL: https://issues.apache.org/jira/browse/HIVE-2203
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Carl Steinbach
>Assignee: Zhenxiao Luo
>Priority: Minor
> Attachments: HIVE-2203.D1065.1.patch, HIVE-2203.D1071.1.patch
>
>
> concat_ws() should support the following type of input parameters:
> concat_ws(string separator, array)





[jira] [Commented] (HIVE-2203) Extend concat_ws() UDF to support arrays of strings

2012-01-03 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13179233#comment-13179233
 ] 

Phabricator commented on HIVE-2203:
---

cwsteinbach has requested changes to the revision "HIVE-2203 [jira] Extend 
concat_ws() UDF to support arrays of strings".

INLINE COMMENTS
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFConcatWS.java:36 
I think this should be changed to:

  CONCAT_WS(sep, [string | array(string)]+)
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFConcatWS.java:69 
Please add a negative testcase that exercises this validation code. Other 
examples of this are available here: ql/src/test/queries/clientnegative/udf_*.q

  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFConcatWS.java:62 
Can you rewrite this using switch statements? I generally find that easier to 
read.
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFConcatWS.java:99 
Indentation is wrong here. Please use 2 character indents.
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFConcatWS.java:103 
Leave a space after semicolons in a for statement.
  ql/src/test/queries/clientpositive/udf_concat_ws.q:20 Please add an EXPLAIN 
query.
  ql/src/test/queries/clientpositive/udf_concat_ws.q:28 Maybe replace this 
query with one call to concat_ws(NULL, ) in the previous query?
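
The proposed signature CONCAT_WS(sep, [string | array(string)]+) can be sketched in plain Java. This is a hypothetical illustration, not the GenericUDFConcatWS code: the class and method names are made up, and the NULL handling (a NULL separator yields NULL, NULL arguments and elements are skipped) is an assumption rather than confirmed Hive behaviour.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of concat_ws over a mix of strings and string arrays.
final class ConcatWsSketch {
  static String concatWs(String sep, Object... args) {
    if (sep == null) {
      return null; // assumption: a NULL separator gives a NULL result
    }
    List<String> parts = new ArrayList<>();
    for (Object arg : args) {
      if (arg == null) {
        continue; // assumption: NULL arguments are skipped
      }
      if (arg instanceof String) {
        parts.add((String) arg);
      } else if (arg instanceof List) {
        for (Object element : (List<?>) arg) {
          if (element == null) {
            continue; // assumption: NULL elements are skipped
          }
          if (!(element instanceof String)) {
            throw new IllegalArgumentException("array elements must be strings");
          }
          parts.add((String) element);
        }
      } else {
        // Validation path of the kind a negative test case would exercise.
        throw new IllegalArgumentException("expected string or array of strings");
      }
    }
    return String.join(sep, parts);
  }
}
```

For example, concatWs("-", "a", Arrays.asList("b", "c"), "d") flattens the array in place and joins all four values with the separator.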

REVISION DETAIL
  https://reviews.facebook.net/D1071


> Extend concat_ws() UDF to support arrays of strings
> ---
>
> Key: HIVE-2203
> URL: https://issues.apache.org/jira/browse/HIVE-2203
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Carl Steinbach
>Assignee: Zhenxiao Luo
>Priority: Minor
> Attachments: HIVE-2203.D1065.1.patch, HIVE-2203.D1071.1.patch
>
>
> concat_ws() should support the following type of input parameters:
> concat_ws(string separator, array)





[jira] [Commented] (HIVE-2279) Implement sort(array) UDF

2012-01-03 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13179229#comment-13179229
 ] 

Phabricator commented on HIVE-2279:
---

zhenxiao has commented on the revision "HIVE-2279 [jira] Implement sort(array) 
UDF".

  Incomplete patch.  D1107 is the correct patch for review.

REVISION DETAIL
  https://reviews.facebook.net/D1101


> Implement sort(array) UDF
> -
>
> Key: HIVE-2279
> URL: https://issues.apache.org/jira/browse/HIVE-2279
> Project: Hive
>  Issue Type: New Feature
>  Components: UDF
>Reporter: Carl Steinbach
>Assignee: Zhenxiao Luo
> Attachments: HIVE-2279.D1059.1.patch, HIVE-2279.D1101.1.patch, 
> HIVE-2279.D1107.1.patch
>
>






[jira] [Updated] (HIVE-2279) Implement sort(array) UDF

2012-01-03 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-2279:
--

Attachment: HIVE-2279.D1107.1.patch

zhenxiao requested code review of "HIVE-2279 [jira] Implement sort(array) UDF".
Reviewers: JIRA

  HIVE-2279: Implement sort(array) UDF

TEST PLAN
  EMPTY

REVISION DETAIL
  https://reviews.facebook.net/D1107

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSortArray.java
  ql/src/test/queries/clientpositive/udf_sort_array.q
  ql/src/test/results/clientpositive/show_functions.q.out
  ql/src/test/results/clientpositive/udf_sort_array.q.out

MANAGE HERALD DIFFERENTIAL RULES
  https://reviews.facebook.net/herald/view/differential/

WHY DID I GET THIS EMAIL?
  https://reviews.facebook.net/herald/transcript/2325/

Tip: use the X-Herald-Rules header to filter Herald messages in your client.


> Implement sort(array) UDF
> -
>
> Key: HIVE-2279
> URL: https://issues.apache.org/jira/browse/HIVE-2279
> Project: Hive
>  Issue Type: New Feature
>  Components: UDF
>Reporter: Carl Steinbach
>Assignee: Zhenxiao Luo
> Attachments: HIVE-2279.D1059.1.patch, HIVE-2279.D1101.1.patch, 
> HIVE-2279.D1107.1.patch
>
>






[jira] [Updated] (HIVE-2279) Implement sort(array) UDF

2012-01-03 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-2279:
--

Attachment: HIVE-2279.D1101.1.patch

zhenxiao requested code review of "HIVE-2279 [jira] Implement sort(array) UDF".
Reviewers: JIRA

  HIVE-2279: resolve Trailing Whitespace

TEST PLAN
  EMPTY

REVISION DETAIL
  https://reviews.facebook.net/D1101

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSortArray.java



> Implement sort(array) UDF
> -
>
> Key: HIVE-2279
> URL: https://issues.apache.org/jira/browse/HIVE-2279
> Project: Hive
>  Issue Type: New Feature
>  Components: UDF
>Reporter: Carl Steinbach
>Assignee: Zhenxiao Luo
> Attachments: HIVE-2279.D1059.1.patch, HIVE-2279.D1101.1.patch
>
>






[jira] [Commented] (HIVE-2503) HiveServer should provide per session configuration

2012-01-03 Thread Carl Steinbach (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13179213#comment-13179213
 ] 

Carl Steinbach commented on HIVE-2503:
--

@Navis: Is this ticket ready for a final review?

> HiveServer should provide per session configuration
> ---
>
> Key: HIVE-2503
> URL: https://issues.apache.org/jira/browse/HIVE-2503
> Project: Hive
>  Issue Type: Bug
>  Components: CLI, Server Infrastructure
>Affects Versions: 0.9.0
>Reporter: Navis
>Assignee: Navis
> Fix For: 0.9.0
>
> Attachments: HIVE-2503.1.patch.txt
>
>
> Currently ThriftHiveProcessorFactory returns the same HiveConf instance to 
> HiveServerHandler, making it impossible to use per-session configuration. 
> Simply wrapping 'conf' -> 'new HiveConf(conf)' seemed to solve this problem.





[jira] [Updated] (HIVE-2498) Group by operator doesnt estimate size of Timestamp & Binary data correctly

2012-01-03 Thread Carl Steinbach (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-2498:
-

Status: Open  (was: Patch Available)

@Ashutosh: Please rebase the patch and submit a phabricator review request. 
Thanks.

> Group by operator doesnt estimate size of Timestamp & Binary data correctly
> ---
>
> Key: HIVE-2498
> URL: https://issues.apache.org/jira/browse/HIVE-2498
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.9.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: hive-2498.patch
>
>
> It currently falls through to the default case and returns a constant value, 
> whereas we can do better by getting the actual size at runtime.





[jira] [Commented] (HIVE-2573) Create per-session function registry

2012-01-03 Thread Carl Steinbach (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13179212#comment-13179212
 ] 

Carl Steinbach commented on HIVE-2573:
--

@Navis: Is this ready for a final review?

> Create per-session function registry 
> -
>
> Key: HIVE-2573
> URL: https://issues.apache.org/jira/browse/HIVE-2573
> Project: Hive
>  Issue Type: Improvement
>  Components: Server Infrastructure
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-2573.1.patch.txt, HIVE-2573.2.patch.txt, 
> HIVE-2573.3.patch.txt
>
>
> Currently the function registry is a shared resource and could be overridden 
> by other users when using HiveServer. If a per-session function registry is 
> provided, this situation could be prevented.





[jira] [Created] (HIVE-2691) Specify location of log4j configuration files via configuration properties

2012-01-03 Thread Carl Steinbach (Created) (JIRA)
Specify location of log4j configuration files via configuration properties
--

 Key: HIVE-2691
 URL: https://issues.apache.org/jira/browse/HIVE-2691
 Project: Hive
  Issue Type: New Feature
  Components: Configuration, Logging
Reporter: Carl Steinbach
Assignee: Zhenxiao Luo


Oozie needs to be able to override the default location of the log4j 
configuration files from the Hive command line, e.g.:

{noformat}
hive -hiveconf hive.log4j.file=/home/carl/hive-log4j.properties -hiveconf hive.log4j.exec.file=/home/carl/hive-exec-log4j.properties
{noformat}
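
The lookup this request implies is "use the configured location if one was given, otherwise keep the default". A minimal sketch follows, assuming the overrides arrive as a java.util.Properties object; the class name, the resolve helper and the default file name passed to it are hypothetical and do not describe how Hive implements the feature.

```java
import java.util.Properties;

// Hypothetical sketch of resolving a log4j configuration location.
final class Log4jLocationSketch {
  // Returns the override for 'key' if it is set and non-blank,
  // otherwise the supplied default location.
  static String resolve(Properties overrides, String key, String defaultLocation) {
    String value = overrides.getProperty(key);
    if (value == null || value.trim().isEmpty()) {
      return defaultLocation;
    }
    return value.trim();
  }
}
```

A caller would pass the properties collected from the command line and the property name from the request, for example hive.log4j.file.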






[jira] [Commented] (HIVE-2690) a bug in 'alter table concatenate' that causes filenames getting double url encoded

2012-01-03 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13179191#comment-13179191
 ] 

Phabricator commented on HIVE-2690:
---

njain has accepted the revision "HIVE-2690 [jira] a bug in 'alter table 
concatenate' that causes filenames getting double url encoded".

REVISION DETAIL
  https://reviews.facebook.net/D1095


> a bug in 'alter table concatenate' that causes filenames getting double url 
> encoded
> ---
>
> Key: HIVE-2690
> URL: https://issues.apache.org/jira/browse/HIVE-2690
> Project: Hive
>  Issue Type: Bug
>Reporter: He Yongqiang
>Assignee: He Yongqiang
> Attachments: HIVE-2690.1.patch, HIVE-2690.D1095.1.patch
>
>
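
The issue text gives no details, so the following is only a generic illustration of what "double URL encoded" means, using the standard java.net.URLEncoder and URLDecoder classes. The file name "ds=2012-01-03" is a made-up example, and the sketch does not reproduce the code path in DDLSemanticAnalyzer.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Generic illustration of double URL encoding; not Hive code.
final class DoubleEncodingSketch {
  static String encode(String s) {
    try {
      return URLEncoder.encode(s, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      throw new IllegalStateException("UTF-8 is always supported", e);
    }
  }

  static String decode(String s) {
    try {
      return URLDecoder.decode(s, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      throw new IllegalStateException("UTF-8 is always supported", e);
    }
  }
}
```

Encoding "ds=2012-01-03" once turns "=" into "%3D". Encoding that result again turns the "%" itself into "%25", so a single decode no longer returns the original name.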






[jira] [Resolved] (HIVE-1530) Include hive-default.xml and hive-log4j.properties in hive-common JAR

2012-01-03 Thread Carl Steinbach (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach resolved HIVE-1530.
--

Resolution: Invalid

Resolving as INVALID since hive-default.xml no longer exists.

> Include hive-default.xml and hive-log4j.properties in hive-common JAR
> -
>
> Key: HIVE-1530
> URL: https://issues.apache.org/jira/browse/HIVE-1530
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Carl Steinbach
>Assignee: Carl Steinbach
> Attachments: HIVE-1530.1.patch.txt
>
>
> hive-common-*.jar should include hive-default.xml and hive-log4j.properties,
> and similarly hive-exec-*.jar should include hive-exec-log4j.properties. The
> hive-default.xml file that currently sits in the conf/ directory should be 
> removed.
> Motivations for this change:
> * We explicitly tell users that they should never modify hive-default.xml yet 
> give them the opportunity to do so by placing the file in the conf dir.
> * Many users are familiar with the Hadoop configuration mechanism that does 
> not require *-default.xml files to be present in the HADOOP_CONF_DIR, and 
> assume that the same is true for HIVE_CONF_DIR.





[jira] [Commented] (HIVE-2690) a bug in 'alter table concatenate' that causes filenames getting double url encoded

2012-01-03 Thread He Yongqiang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13179142#comment-13179142
 ] 

He Yongqiang commented on HIVE-2690:


https://reviews.facebook.net/D1095

> a bug in 'alter table concatenate' that causes filenames getting double url 
> encoded
> ---
>
> Key: HIVE-2690
> URL: https://issues.apache.org/jira/browse/HIVE-2690
> Project: Hive
>  Issue Type: Bug
>Reporter: He Yongqiang
>Assignee: He Yongqiang
> Attachments: HIVE-2690.1.patch, HIVE-2690.D1095.1.patch
>
>






[jira] [Updated] (HIVE-2690) a bug in 'alter table concatenate' that causes filenames getting double url encoded

2012-01-03 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-2690:
--

Attachment: HIVE-2690.D1095.1.patch

heyongqiang requested code review of "HIVE-2690 [jira] a bug in 'alter table 
concatenate' that causes filenames getting double url encoded".
Reviewers: JIRA



TEST PLAN
  testcase

REVISION DETAIL
  https://reviews.facebook.net/D1095

AFFECTED FILES
  ql/src/test/results/clientpositive/alter_merge_2.q.out
  ql/src/test/queries/clientpositive/alter_merge_2.q
  ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java



> a bug in 'alter table concatenate' that causes filenames getting double url 
> encoded
> ---
>
> Key: HIVE-2690
> URL: https://issues.apache.org/jira/browse/HIVE-2690
> Project: Hive
>  Issue Type: Bug
>Reporter: He Yongqiang
>Assignee: He Yongqiang
> Attachments: HIVE-2690.1.patch, HIVE-2690.D1095.1.patch
>
>






[jira] [Updated] (HIVE-2690) a bug in 'alter table concatenate' that causes filenames getting double url encoded

2012-01-03 Thread He Yongqiang (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

He Yongqiang updated HIVE-2690:
---

Status: Patch Available  (was: Open)

> a bug in 'alter table concatenate' that causes filenames getting double url 
> encoded
> ---
>
> Key: HIVE-2690
> URL: https://issues.apache.org/jira/browse/HIVE-2690
> Project: Hive
>  Issue Type: Bug
>Reporter: He Yongqiang
>Assignee: He Yongqiang
> Attachments: HIVE-2690.1.patch, HIVE-2690.D1095.1.patch
>
>






[jira] [Updated] (HIVE-2690) a bug in 'alter table concatenate' that causes filenames getting double url encoded

2012-01-03 Thread He Yongqiang (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

He Yongqiang updated HIVE-2690:
---

Attachment: HIVE-2690.1.patch

> a bug in 'alter table concatenate' that causes filenames getting double url 
> encoded
> ---
>
> Key: HIVE-2690
> URL: https://issues.apache.org/jira/browse/HIVE-2690
> Project: Hive
>  Issue Type: Bug
>Reporter: He Yongqiang
>Assignee: He Yongqiang
> Attachments: HIVE-2690.1.patch
>
>






Hive-trunk-h0.21 - Build # 1182 - Still Failing

2012-01-03 Thread Apache Jenkins Server
Changes for Build #1144
[jvs] HIVE-1040 [jira] use sed rather than diff for masking out noise in diff-based tests
(Marek Sapota via John Sichi)

Summary:
Replace diff -I with regex masking in Java

The current diff -I approach has two problems:  (1) it does not allow resolution
finer than line-level, so it's impossible to mask out pattern occurrences within
a line, and (2) it produces unmasked files, so if you run diff on the command
line to compare the result .q.out with the checked-in file, you see the noise.

My suggestion is to first run sed to replace noise patterns with an
unlikely-to-occur string like ZYZZYZVA, and then diff the pre-masked files
without using any -I.

This would require a one-time hit to update all existing .q.out files so that
they would contain the pre-masked results.
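
The "mask first, then compare" idea can be sketched with java.util.regex. This is a hedged illustration, not the committed change: the class name and the two noise patterns (file paths and long numeric timestamps) are hypothetical, and only the placeholder string ZYZZYZVA comes from the summary above.

```java
import java.util.regex.Pattern;

// Hypothetical sketch of regex-based masking before comparing test output.
final class MaskSketch {
  static final String MASK = "ZYZZYZVA";

  // Assumed examples of "noise": local file URIs and 10 to 13 digit timestamps.
  private static final Pattern[] NOISE = {
    Pattern.compile("file:/[^\\s]+"),
    Pattern.compile("\\b\\d{10,13}\\b")
  };

  // Replaces every noise occurrence, including those in the middle of a line.
  static String mask(String text) {
    String result = text;
    for (Pattern pattern : NOISE) {
      result = pattern.matcher(result).replaceAll(MASK);
    }
    return result;
  }

  // Two outputs match if they are identical once the noise is masked.
  static boolean sameAfterMasking(String expected, String actual) {
    return mask(expected).equals(mask(actual));
  }
}
```

Because the masking works on substrings, it addresses the line-level limitation of diff -I, and the masked text is what would be stored in the .q.out files.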

Test Plan: EMPTY

Reviewers: JIRA, jsichi

Reviewed By: jsichi

CC: jsichi

Differential Revision: 597


Changes for Build #1145

Changes for Build #1146
[namit] HIVE-2640 Add alterPartition to AlterHandler interface
(Kevin Wilfong via namit)


Changes for Build #1147
[namit] HIVE-2617 Insert overwrite table db.tname fails if partition already exists
(Chinna Rao Lalam via namit)


Changes for Build #1148
[heyongqiang] HIVE-2651 [jira] The variable hive.exec.mode.local.auto.tasks.max
should be changed
(Namit Jain via Yongqiang He)

Summary:
HIVE-2651

It should be called hive.exec.mode.local.auto.input.files.max instead.
The number of input files is checked currently.

Test Plan: EMPTY

Reviewers: JIRA, heyongqiang

Reviewed By: heyongqiang

CC: heyongqiang

Differential Revision: 861

[cws] HIVE-727. Hive Server getSchema() returns wrong schema for 'Explain' 
queries (Prasad Mujumdar via cws)

[namit] HIVE-2611 Make index table output of create index command if
index is table based (Kevin Wilfong via namit)


Changes for Build #1150
[jvs] HIVE-2657 [jira] builtins JAR is not being published to Maven repo &
hive-cli POM does not depend on it either
(Carl Steinbach via John Sichi)

Summary: Make hive-cli and hive-ql depend on hive-builtins

Test Plan: EMPTY

Reviewers: JIRA, jsichi

Reviewed By: jsichi

CC: jsichi

Differential Revision: 897

[namit] HIVE-2654 "hive.querylog.location" requires parent directory to be
exist or else folder creation fails (Chinna Rao Lalam via namit)


Changes for Build #1151
[hashutosh] HIVE-1892 : show functions also returns internal operators 
(Priyadarshini via Ashutosh Chauhan)


Changes for Build #1152

Changes for Build #1153
[namit] HIVE-2660 Need better exception handling in RCFile tolerate corruptions
mode (Ramkumar Vadali via namit)


Changes for Build #1154
[cws] HIVE-2631. Make Hive work with Hadoop 1.0.0 (Ashutosh Chauhan via cws)


Changes for Build #1155
[cws] HIVE-BUILD. Update RELEASE_NOTES.txt with 0.8.0 release information (cws)


Changes for Build #1156

Changes for Build #1157

Changes for Build #1158
[namit] HIVE-2602 add support for insert partition overwrite(...) if not
  exists (Chinna Rao Lalam via namit)


Changes for Build #1159

Changes for Build #1160
[cws] HIVE-2005. Implement BETWEEN operator (Navis via cws)


Changes for Build #1161
[jvs] HIVE-2433. add DOAP file for Hive


Changes for Build #1162

Changes for Build #1163

Changes for Build #1164
[heyongqiang] HIVE-2666 [jira] StackOverflowError when using custom UDF in map join
(Kevin Wilfong via Yongqiang He)

Summary:
Resource files are now added to the class path as soon as they are added via the
CLI.  This fixes the stack overflow error mentioned in the JIRA by ensuring a
consistent class loader between serializers and deserializers for the same
query.

Note that serdes which contain a static block to register themselves are now
registered twice, once when adding the file to the class loader, and once when
an instance of the class is created.  Previously, registering a serde twice
resulted in an exception; to avoid this, I have downgraded it to a warning.
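
As a rough illustration of the "downgrade to a warning" behaviour, here is a minimal Python sketch. The registry dictionary and register_serde function are hypothetical stand-ins, not Hive's actual API, and whether Hive keeps the first registration or overwrites it is an assumption here (the sketch keeps the first).

```python
import logging

logger = logging.getLogger("serde_registry")

# Hypothetical registry: serde name -> class.
SERDE_REGISTRY = {}


def register_serde(name, cls):
    """Register a serde; a repeat registration logs a warning, not an error."""
    if name in SERDE_REGISTRY:
        # Assumption: the first registration wins and the repeat is ignored.
        logger.warning("serde %s is already registered; ignoring", name)
        return False
    SERDE_REGISTRY[name] = cls
    return True
```

With this shape, a class whose static block registers it once at load time and once at instantiation no longer aborts the query on the second call.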

When a custom UDF is used as part of a join which is converted to a map join,
the XMLEncoder enters an infinite loop when serializing the map reduce task for
the second time, as part of sending it to be executed.  This results in a stack
overflow error.

Test Plan:
I ran the unit tests to verify nothing was broken.

I ran several queries which used custom UDFs and involved a join which was
converted to a map join.  I verified these completed successfully and consistently.

Reviewers: JIRA, heyongqiang

Reviewed By: heyongqiang

CC: heyongqiang, kevinwilfong

Differential Revision: 957

[namit] HIVE-2642 fix Hive-2566 and make union optimization more aggressive
(Yongqiang He via namit)


Changes for Build #1166

Changes for Build #1167

Changes for Build #1168
[heyongqiang] HIVE-2600: Enable/Add type-specific compression for rcfile 
(Krishna Kumar via He Yongqiang)


Changes for Build #1169

Changes for Build #1170
[cws] HIVE-1877. Add java_method() as a synonym for the reflect() UDF (Zhenxiao 
Luo via cws)


Changes for Buil

[jira] [Updated] (HIVE-2690) a bug in 'alter table concatenate' that causes filenames getting double url encoded

2012-01-03 Thread He Yongqiang (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

He Yongqiang updated HIVE-2690:
---

Attachment: (was: HIVE-2690.1.patch)

> a bug in 'alter table concatenate' that causes filenames getting double url 
> encoded
> ---
>
> Key: HIVE-2690
> URL: https://issues.apache.org/jira/browse/HIVE-2690
> Project: Hive
> Issue Type: Bug
> Reporter: He Yongqiang
> Assignee: He Yongqiang
>






[jira] [Commented] (HIVE-2621) Allow multiple group bys with the same input data and spray keys to be run on the same reducer.

2012-01-03 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13179136#comment-13179136
 ] 

Hudson commented on HIVE-2621:
--

Integrated in Hive-trunk-h0.21 #1182 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/1182/])
HIVE-2621:Allow multiple group bys with the same input data and spray keys 
to be run on the same reducer. (Kevin via He Yongqiang)

heyongqiang : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1226903
Files : 
* /hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDesc.java
* /hive/trunk/ql/src/test/queries/clientpositive/groupby10.q
* /hive/trunk/ql/src/test/queries/clientpositive/groupby7_map.q
* /hive/trunk/ql/src/test/queries/clientpositive/groupby7_map_multi_single_reducer.q
* /hive/trunk/ql/src/test/queries/clientpositive/groupby7_noskew.q
* /hive/trunk/ql/src/test/queries/clientpositive/groupby7_noskew_multi_single_reducer.q
* /hive/trunk/ql/src/test/queries/clientpositive/groupby8.q
* /hive/trunk/ql/src/test/queries/clientpositive/groupby9.q
* /hive/trunk/ql/src/test/queries/clientpositive/groupby_complex_types_multi_single_reducer.q
* /hive/trunk/ql/src/test/queries/clientpositive/groupby_multi_single_reducer.q
* /hive/trunk/ql/src/test/queries/clientpositive/multigroupby_singlemr.q
* /hive/trunk/ql/src/test/results/clientpositive/groupby10.q.out
* /hive/trunk/ql/src/test/results/clientpositive/groupby7_map_multi_single_reducer.q.out
* /hive/trunk/ql/src/test/results/clientpositive/groupby7_noskew_multi_single_reducer.q.out
* /hive/trunk/ql/src/test/results/clientpositive/groupby8.q.out
* /hive/trunk/ql/src/test/results/clientpositive/groupby9.q.out
* /hive/trunk/ql/src/test/results/clientpositive/groupby_complex_types_multi_single_reducer.q.out
* /hive/trunk/ql/src/test/results/clientpositive/groupby_multi_single_reducer.q.out
* /hive/trunk/ql/src/test/results/clientpositive/multi_insert.q.out
* /hive/trunk/ql/src/test/results/clientpositive/multigroupby_singlemr.q.out
* /hive/trunk/ql/src/test/results/clientpositive/parallel.q.out


> Allow multiple group bys with the same input data and spray keys to be run on 
> the same reducer.
> ---
>
> Key: HIVE-2621
> URL: https://issues.apache.org/jira/browse/HIVE-2621
> Project: Hive
> Issue Type: New Feature
> Reporter: Kevin Wilfong
> Assignee: Kevin Wilfong
> Attachments: HIVE-2621.1.patch.txt, HIVE-2621.D567.1.patch, 
> HIVE-2621.D567.2.patch, HIVE-2621.D567.3.patch, HIVE-2621.D567.4.patch
>
>
> Currently, when a user runs a query, such as a multi-insert, where each 
> insertion subclause consists of a simple query followed by a group by, the 
> group bys for each clause are run on a separate reducer.  This requires 
> writing the data for each group by clause to an intermediate file, and then 
> reading it back.  This uses a significant amount of the total CPU consumed by 
> the query for an otherwise simple query.
> If the subclauses are grouped by their distinct expressions and group by 
> keys, with all of the group by expressions for a group of subclauses run on a 
> single reducer, this would reduce the amount of reading/writing to 
> intermediate files for some queries.
> To do this, for each group of subclauses, in the mapper we would execute
> the filters for each subclause 'or'd together (provided each subclause has a
> filter) followed by a reduce sink.  In the reducer, the child operators would
> be each subclause's filter followed by the group by and any subsequent
> operations.
> Note that this would require turning off map aggregation, so we would need to 
> make using this type of plan configurable.
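
The plan described in the issue can be simulated in a few lines. The sketch below is a toy model in Python, not Hive's operator tree: the row dictionaries, the subclause records, and the run_shared_reducer name are all hypothetical, and the aggregate is fixed to a simple count.

```python
from collections import defaultdict


def run_shared_reducer(rows, subclauses):
    """Simulate one group of subclauses that share the same group-by key."""
    key_col = subclauses[0]["key"]
    assert all(s["key"] == key_col for s in subclauses)

    # Map side: keep a row if ANY subclause filter accepts it ('or'd filters),
    # then spray it by the shared key (the reduce sink).
    shuffled = defaultdict(list)
    for row in rows:
        if any(s["filter"](row) for s in subclauses):
            shuffled[row[key_col]].append(row)

    # Reduce side: each subclause re-applies its own filter, then aggregates.
    results = {s["name"]: {} for s in subclauses}
    for key, group in shuffled.items():
        for s in subclauses:
            count = sum(1 for row in group if s["filter"](row))
            if count:
                results[s["name"]][key] = count
    return results
```

Each row crosses the shuffle once for the whole group of subclauses rather than once per subclause, which is where the saving in intermediate reads and writes would come from. No map-side partial aggregation appears in the sketch, matching the note that map aggregation has to be turned off for this plan.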





[jira] [Updated] (HIVE-2690) a bug in 'alter table concatenate' that causes filenames getting double url encoded

2012-01-03 Thread He Yongqiang (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

He Yongqiang updated HIVE-2690:
---

Attachment: HIVE-2690.1.patch

> a bug in 'alter table concatenate' that causes filenames getting double url 
> encoded
> ---
>
> Key: HIVE-2690
> URL: https://issues.apache.org/jira/browse/HIVE-2690
> Project: Hive
> Issue Type: Bug
> Reporter: He Yongqiang
> Assignee: He Yongqiang
>






[jira] [Created] (HIVE-2690) a bug in 'alter table concatenate' that causes filenames getting double url encoded

2012-01-03 Thread He Yongqiang (Created) (JIRA)
a bug in 'alter table concatenate' that causes filenames getting double url 
encoded
---

 Key: HIVE-2690
 URL: https://issues.apache.org/jira/browse/HIVE-2690
 Project: Hive
  Issue Type: Bug
Reporter: He Yongqiang
Assignee: He Yongqiang
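
The issue has no description yet, so the following is only a generic illustration of what "double url encoded" means, written in Python with the standard urllib.parse module. The file name is made up, and Hive's own escaping rules may differ from percent-encoding as done here.

```python
from urllib.parse import quote, unquote

# Hypothetical file or partition name containing characters that get escaped.
name = "ds=2012-01-03 00:00"

once = quote(name, safe="")   # 'ds%3D2012-01-03%2000%3A00'
twice = quote(once, safe="")  # every '%' from the first pass becomes '%25'

# A single decode of the double-encoded name does not recover the original.
decoded_once = unquote(twice)
```

Here decoded_once equals the singly encoded string rather than the original name, which is the usual symptom of this class of bug: a reader that decodes once ends up looking for a file that does not exist.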








Hive-trunk-h0.21 - Build # 1181 - Failure

2012-01-03 Thread Apache Jenkins Server
Changes for Build #1144
[jvs] HIVE-1040 [jira] use sed rather than diff for masking out noise in
diff-based tests
(Marek Sapota via John Sichi)

Summary:
Replace diff -I with regex masking in Java

The current diff -I approach has two problems:  (1) it does not allow resolution
finer than line-level, so it's impossible to mask out pattern occurrences within
a line, and (2) it produces unmasked files, so if you run diff on the command
line to compare the result .q.out with the checked-in file, you see the noise.

My suggestion is to first run sed to replace noise patterns with an
unlikely-to-occur string like ZYZZYZVA, and then diff the pre-masked files
without using any -I.

This would require a one-time hit to update all existing .q.out files so that
they would contain the pre-masked results.

Test Plan: EMPTY

Reviewers: JIRA, jsichi

Reviewed By: jsichi

CC: jsichi

Differential Revision: 597


Changes for Build #1145

Changes for Build #1146
[namit] HIVE-2640 Add alterPartition to AlterHandler interface
(Kevin Wilfong via namit)


Changes for Build #1147
[namit] HIVE-2617 Insert overwrite table db.tname fails if partition already exists
(Chinna Rao Lalam via namit)


Changes for Build #1148
[heyongqiang] HIVE-2651 [jira] The variable hive.exec.mode.local.auto.tasks.max
should be changed
(Namit Jain via Yongqiang He)

Summary:
HIVE-2651

It should be called hive.exec.mode.local.auto.input.files.max instead.
The number of input files is checked currently.

Test Plan: EMPTY

Reviewers: JIRA, heyongqiang

Reviewed By: heyongqiang

CC: heyongqiang

Differential Revision: 861

[cws] HIVE-727. Hive Server getSchema() returns wrong schema for 'Explain' 
queries (Prasad Mujumdar via cws)

[namit] HIVE-2611 Make index table output of create index command if
index is table based (Kevin Wilfong via namit)


Changes for Build #1150
[jvs] HIVE-2657 [jira] builtins JAR is not being published to Maven repo &
hive-cli POM does not depend on it either
(Carl Steinbach via John Sichi)

Summary: Make hive-cli and hive-ql depend on hive-builtins

Test Plan: EMPTY

Reviewers: JIRA, jsichi

Reviewed By: jsichi

CC: jsichi

Differential Revision: 897

[namit] HIVE-2654 "hive.querylog.location" requires parent directory to be
exist or else folder creation fails (Chinna Rao Lalam via namit)


Changes for Build #1151
[hashutosh] HIVE-1892 : show functions also returns internal operators 
(Priyadarshini via Ashutosh Chauhan)


Changes for Build #1152

Changes for Build #1153
[namit] HIVE-2660 Need better exception handling in RCFile tolerate corruptions
mode (Ramkumar Vadali via namit)


Changes for Build #1154
[cws] HIVE-2631. Make Hive work with Hadoop 1.0.0 (Ashutosh Chauhan via cws)


Changes for Build #1155
[cws] HIVE-BUILD. Update RELEASE_NOTES.txt with 0.8.0 release information (cws)


Changes for Build #1156

Changes for Build #1157

Changes for Build #1158
[namit] HIVE-2602 add support for insert partition overwrite(...) if not
  exists (Chinna Rao Lalam via namit)


Changes for Build #1159

Changes for Build #1160
[cws] HIVE-2005. Implement BETWEEN operator (Navis via cws)


Changes for Build #1161
[jvs] HIVE-2433. add DOAP file for Hive


Changes for Build #1162

Changes for Build #1163

Changes for Build #1164
[heyongqiang] HIVE-2666 [jira] StackOverflowError when using custom UDF in map join
(Kevin Wilfong via Yongqiang He)

Summary:
Resource files are now added to the class path as soon as they are added via the
CLI.  This fixes the stack overflow error mentioned in the JIRA by ensuring a
consistent class loader between serializers and deserializers for the same
query.

Note that serdes which contain a static block to register themselves are now
registered twice, once when adding the file to the class loader, and once when
an instance of the class is created.  Previously, registering a serde twice
resulted in an exception; to avoid this, I have downgraded it to a warning.

When a custom UDF is used as part of a join which is converted to a map join,
the XMLEncoder enters an infinite loop when serializing the map reduce task for
the second time, as part of sending it to be executed.  This results in a stack
overflow error.

Test Plan:
I ran the unit tests to verify nothing was broken.

I ran several queries which used custom UDFs and involved a join which was
converted to a map join.  I verified these completed successfully and consistently.

Reviewers: JIRA, heyongqiang

Reviewed By: heyongqiang

CC: heyongqiang, kevinwilfong

Differential Revision: 957

[namit] HIVE-2642 fix Hive-2566 and make union optimization more aggressive
(Yongqiang He via namit)


Changes for Build #1166

Changes for Build #1167

Changes for Build #1168
[heyongqiang] HIVE-2600: Enable/Add type-specific compression for rcfile 
(Krishna Kumar via He Yongqiang)


Changes for Build #1169

Changes for Build #1170
[cws] HIVE-1877. Add java_method() as a synonym for the reflect() UDF (Zhenxiao 
Luo via cws)


Changes for Buil

[jira] [Commented] (HIVE-2478) Support dry run option in hive

2012-01-03 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13178966#comment-13178966
 ] 

Phabricator commented on HIVE-2478:
---

khorgath has commented on the revision "HIVE-2478 [jira] Support dry run option 
in hive".

REVISION DETAIL
  https://reviews.facebook.net/D927


> Support dry run option in hive
> --
>
> Key: HIVE-2478
> URL: https://issues.apache.org/jira/browse/HIVE-2478
> Project: Hive
> Issue Type: Improvement
> Components: Configuration
> Affects Versions: 0.9.0
> Reporter: kalyan ram
> Assignee: Sushanth Sowmyan
> Priority: Minor
> Attachments: HIVE-2478-1.patch, HIVE-2478-2.patch, HIVE-2478-3.patch, 
> HIVE-2478.D927.1.patch
>
>
> Hive currently doesn't support a dry run option. For some complex queries we
> just want to verify the query syntax initially before running it. A dry run
> option where just the parsing is done without actual execution is a good
> option.





[jira] [Commented] (HIVE-2478) Support dry run option in hive

2012-01-03 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13178965#comment-13178965
 ] 

Phabricator commented on HIVE-2478:
---

khorgath has commented on the revision "HIVE-2478 [jira] Support dry run option 
in hive".

INLINE COMMENTS
  ql/src/java/org/apache/hadoop/hive/ql/Driver.java:135-140
    PLAN is not explicitly used anywhere in code, because we need to only check
    3 times to establish the 4 states.

    The first two checks happen in the compile part:
    First check is for PARSE, if it's PARSE, it's weeded out and if not, it
    continues.
    Second check is for ANALYZE, if it's that, it's weeded out and if not, it
    continues.

    The third check is after the compile, and thus, done for whether it's
    not-OFF (because both the PARSE and ANALYZE returns return here too, and we
    weed out all the non-OFF cases there).

    I think I agree that an empty string would serve better than OFF, looking
    to change that.

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java:558
    Ok, fixing along with trying to make OFF an empty string in the next patch.

  ql/src/java/org/apache/hadoop/hive/ql/Driver.java:423
    Good spot, changing.

  ql/src/java/org/apache/hadoop/hive/ql/Driver.java:452
    Yup. Without this, it does, because it tries to read .getFields() on the
    schema. Adding comment.

  ql/src/java/org/apache/hadoop/hive/ql/Driver.java:479-488
    Oh, ok - I wasn't entirely sure of that, given the LOG.info proclaiming
    semantic analysis was complete before the validation. Would the plan
    validation not count as part of the plan phase?

  ql/src/java/org/apache/hadoop/hive/ql/Driver.java:944
    Disagree. PARSE, ANALYZE and PLAN all come up to here, and need to be
    rejected. !OFF is the correct check here.

  ql/src/test/queries/clientnegative/dryrun_bad_fetch_serde.q:1
    Ok, doing so.
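
The "three checks for four states" logic in the first comment can be sketched as below. This is a toy model in Python, not the Driver.java code under review: compile_query and run_query are hypothetical names, the stages are just strings, and OFF is kept as a literal mode even though the comments propose replacing it with an empty string.

```python
def compile_query(query, dry_run):
    """Model of the compile part: two early returns for PARSE and ANALYZE."""
    stages = ["parse"]
    if dry_run == "PARSE":    # first check
        return stages
    stages.append("analyze")
    if dry_run == "ANALYZE":  # second check
        return stages
    stages.append("plan")     # PLAN and OFF both get this far
    return stages


def run_query(query, dry_run="OFF"):
    """Model of the caller: the third check rejects every non-OFF mode."""
    stages = compile_query(query, dry_run)
    if dry_run != "OFF":      # third check: PARSE, ANALYZE and PLAN all stop
        return stages
    stages.append("execute")
    return stages
```

PLAN never needs its own comparison: it is simply the mode that passes both compile-time checks and is then stopped by the not-OFF check, which is why only three checks distinguish the four states.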

REVISION DETAIL
  https://reviews.facebook.net/D927


> Support dry run option in hive
> --
>
> Key: HIVE-2478
> URL: https://issues.apache.org/jira/browse/HIVE-2478
> Project: Hive
> Issue Type: Improvement
> Components: Configuration
> Affects Versions: 0.9.0
> Reporter: kalyan ram
> Assignee: Sushanth Sowmyan
> Priority: Minor
> Attachments: HIVE-2478-1.patch, HIVE-2478-2.patch, HIVE-2478-3.patch, 
> HIVE-2478.D927.1.patch
>
>
> Hive currently doesn't support a dry run option. For some complex queries we
> just want to verify the query syntax initially before running it. A dry run
> option where just the parsing is done without actual execution is a good
> option.





[jira] [Updated] (HIVE-2621) Allow multiple group bys with the same input data and spray keys to be run on the same reducer.

2012-01-03 Thread He Yongqiang (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

He Yongqiang updated HIVE-2621:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

committed, thanks Kevin!

> Allow multiple group bys with the same input data and spray keys to be run on 
> the same reducer.
> ---
>
> Key: HIVE-2621
> URL: https://issues.apache.org/jira/browse/HIVE-2621
> Project: Hive
> Issue Type: New Feature
> Reporter: Kevin Wilfong
> Assignee: Kevin Wilfong
> Attachments: HIVE-2621.1.patch.txt, HIVE-2621.D567.1.patch, 
> HIVE-2621.D567.2.patch, HIVE-2621.D567.3.patch, HIVE-2621.D567.4.patch
>
>
> Currently, when a user runs a query, such as a multi-insert, where each 
> insertion subclause consists of a simple query followed by a group by, the 
> group bys for each clause are run on a separate reducer.  This requires 
> writing the data for each group by clause to an intermediate file, and then 
> reading it back.  This uses a significant amount of the total CPU consumed by 
> the query for an otherwise simple query.
> If the subclauses are grouped by their distinct expressions and group by 
> keys, with all of the group by expressions for a group of subclauses run on a 
> single reducer, this would reduce the amount of reading/writing to 
> intermediate files for some queries.
> To do this, for each group of subclauses, in the mapper we would execute
> the filters for each subclause 'or'd together (provided each subclause has a
> filter) followed by a reduce sink.  In the reducer, the child operators would
> be each subclause's filter followed by the group by and any subsequent
> operations.
> Note that this would require turning off map aggregation, so we would need to 
> make using this type of plan configurable.
