[jira] [Commented] (HIVE-11080) Modify VectorizedRowBatch.toString() to not depend on VectorExpressionWriter

2015-06-24 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600762#comment-14600762
 ] 

Prasanth Jayachandran commented on HIVE-11080:
--

I am not yet sure if toString() is used anywhere else. I don't see it being 
used in VectorFSOp. [~mmccline] Can you take a look at these changes?

> Modify VectorizedRowBatch.toString() to not depend on VectorExpressionWriter
> 
>
> Key: HIVE-11080
> URL: https://issues.apache.org/jira/browse/HIVE-11080
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Attachments: HIVE-11080.patch
>
>
> Currently the VectorizedRowBatch.toString method uses the 
> VectorExpressionWriter to convert the row batch to a string.
> Since the string is only used for printing error messages, I'd propose making 
> the toString use the types of the vector batch instead of the object 
> inspector.
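To illustrate the proposed direction, here is a minimal, hypothetical sketch of a type-driven toString that looks only at the concrete ColumnVector subclasses; the class, method, and handled types are assumptions for illustration, not the attached patch.

{code}
import org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.ColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.DoubleColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.LongColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch;

// Hedged sketch: format one row of a batch purely from the column vector
// types, with no ObjectInspector or VectorExpressionWriter involved.
final class BatchDebugString {
  static String rowToString(VectorizedRowBatch batch, int r) {
    int row = batch.selectedInUse ? batch.selected[r] : r;
    StringBuilder sb = new StringBuilder("[");
    for (int c = 0; c < batch.numCols; c++) {
      ColumnVector cv = batch.cols[c];
      int i = cv.isRepeating ? 0 : row;
      if (c > 0) sb.append(", ");
      if (!cv.noNulls && cv.isNull[i]) {
        sb.append("null");
      } else if (cv instanceof LongColumnVector) {
        sb.append(((LongColumnVector) cv).vector[i]);
      } else if (cv instanceof DoubleColumnVector) {
        sb.append(((DoubleColumnVector) cv).vector[i]);
      } else if (cv instanceof BytesColumnVector) {
        BytesColumnVector b = (BytesColumnVector) cv;
        sb.append(new String(b.vector[i], b.start[i], b.length[i]));
      } else {
        // unhandled vector type: fall back to the class name
        sb.append(cv.getClass().getSimpleName());
      }
    }
    return sb.append("]").toString();
  }
}
{code}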



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10553) Remove hardcoded Parquet references from SearchArgumentImpl

2015-06-24 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600759#comment-14600759
 ] 

Prasanth Jayachandran commented on HIVE-10553:
--

New changes look good to me, +1. The latest test failures are not related to
this patch; they are caused by my recent commit and revert of HIVE-11043.


> Remove hardcoded Parquet references from SearchArgumentImpl
> ---
>
> Key: HIVE-10553
> URL: https://issues.apache.org/jira/browse/HIVE-10553
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Gopal V
>Assignee: Owen O'Malley
> Attachments: HIVE-10553.patch, HIVE-10553.patch, HIVE-10553.patch
>
>
> SARGs currently depend on Parquet code, which causes a tight coupling between 
> Parquet releases and storage-api versions.
> Move Parquet code out to its own RecordReader, similar to ORC's SargApplier 
> implementation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10796) Remove dependencies on NumericHistogram and NumDistinctValueEstimator from JavaDataModel

2015-06-24 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600749#comment-14600749
 ] 

Prasanth Jayachandran commented on HIVE-10796:
--

LGTM, +1

> Remove dependencies on NumericHistogram and NumDistinctValueEstimator from 
> JavaDataModel
> 
>
> Key: HIVE-10796
> URL: https://issues.apache.org/jira/browse/HIVE-10796
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Attachments: HIVE-10796.patch
>
>
> The JavaDataModel class is used in a lot of places, and the calculations 
> specific to NumericHistogram and NumDistinctValueEstimator are better done in 
> those classes themselves.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6791) Support variable substitution for Beeline shell command

2015-06-24 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-6791:
---
Attachment: HIVE-6791-beeline-cli.3.patch

> Support variable substitution for Beeline shell command
> -
>
> Key: HIVE-6791
> URL: https://issues.apache.org/jira/browse/HIVE-6791
> Project: Hive
>  Issue Type: New Feature
>  Components: CLI, Clients
>Affects Versions: 0.14.0
>Reporter: Xuefu Zhang
>Assignee: Ferdinand Xu
> Attachments: HIVE-6791-beeline-cli.2.patch, 
> HIVE-6791-beeline-cli.3.patch, HIVE-6791-beeline-cli.patch
>
>
> A follow-up task from HIVE-6694. Similar to HIVE-6570.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10794) Remove the dependence from ErrorMsg to HiveUtils

2015-06-24 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600745#comment-14600745
 ] 

Prasanth Jayachandran commented on HIVE-10794:
--

LGTM, +1

> Remove the dependence from ErrorMsg to HiveUtils
> 
>
> Key: HIVE-10794
> URL: https://issues.apache.org/jira/browse/HIVE-10794
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Attachments: HIVE-10794.patch
>
>
> HiveUtils has a large set of dependencies, and ErrorMsg only needs the 
> newline constant. Breaking the dependence will significantly reduce 
> ErrorMsg's dependency set.
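A hedged illustration of the kind of change described; the constant name and the stripped-down enum body are assumptions, not the contents of the patch.

{code}
// Sketch only: instead of reaching into HiveUtils just for a newline,
// ErrorMsg defines the constant locally and drops the HiveUtils import
// (and, with it, HiveUtils' transitive dependency set).
enum ErrorMsg {
  ; // error constants elided for the sketch

  // was (illustratively): HiveUtils.LINE_SEP
  static final String LINE_SEP = System.getProperty("line.separator");
}
{code}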



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10795) Remove use of PerfLogger from Orc

2015-06-24 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600743#comment-14600743
 ] 

Prasanth Jayachandran commented on HIVE-10795:
--

LGTM, +1

> Remove use of PerfLogger from Orc
> -
>
> Key: HIVE-10795
> URL: https://issues.apache.org/jira/browse/HIVE-10795
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Attachments: HIVE-10795.patch, HIVE-10795.patch, HIVE-10795.patch
>
>
> PerfLogger is yet another class with a huge dependency set that Orc doesn't 
> need.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11086) Remove use of ErrorMsg in Orc's RunLengthIntegerReaderV2

2015-06-24 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600726#comment-14600726
 ] 

Prasanth Jayachandran commented on HIVE-11086:
--

LGTM, +1

> Remove use of ErrorMsg in Orc's RunLengthIntegerReaderV2
> 
>
> Key: HIVE-11086
> URL: https://issues.apache.org/jira/browse/HIVE-11086
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Attachments: HIVE-11086.patch
>
>
> ORC's RLE v2 reader uses a string literal from ErrorMsg, which forces a large 
> dependency set onto the RLE v2 reader. Inlining the string literal directly 
> doesn't change the behavior and removes the unwanted linkage.
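A rough sketch of what inlining the literal looks like; the method, the encoding check, and the message text are illustrative assumptions rather than the code in the patch.

{code}
import java.io.IOException;

final class RleV2Check {
  // Sketch: the reader keeps its own string literal instead of pulling the
  // message from ErrorMsg, so it no longer links against ErrorMsg at all.
  static void checkEncoding(int encodingOrdinal) throws IOException {
    if (encodingOrdinal < 0 || encodingOrdinal > 3) {
      // before (illustrative): throw new IOException(ErrorMsg.SOME_ORC_ERROR.getMsg());
      throw new IOException("Unknown RLEv2 encoding: " + encodingOrdinal);
    }
  }
}
{code}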



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11104) Select operator doesn't propagate constants appearing in expressions

2015-06-24 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600725#comment-14600725
 ] 

Prasanth Jayachandran commented on HIVE-11104:
--

1) Can the getSignature() call return null? I have seen null checks in other 
places.
2) Can you also add the explain output of the query to the q file?

> Select operator doesn't propagate constants appearing in expressions
> 
>
> Key: HIVE-11104
> URL: https://issues.apache.org/jira/browse/HIVE-11104
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.2.1
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-11104.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11097) HiveInputFormat uses String.startsWith to compare splitPath and PathToAliases

2015-06-24 Thread Wan Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600719#comment-14600719
 ] 

Wan Chang commented on HIVE-11097:
--

Hi [~jvs], could you help review this?

> HiveInputFormat uses String.startsWith to compare splitPath and PathToAliases
> -
>
> Key: HIVE-11097
> URL: https://issues.apache.org/jira/browse/HIVE-11097
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 0.13.0, 0.14.0, 0.13.1, 1.0.0, 1.2.0
> Environment: Hive 0.13.1, Hive 2.0.0, hadoop 2.4.1
>Reporter: Wan Chang
>Priority: Critical
> Attachments: HIVE-11097.1.patch
>
>
> Say we have a SQL query such as:
> {code}
> create table if not exists test_orc_src (a int, b int, c int) stored as orc;
> create table if not exists test_orc_src2 (a int, b int, d int) stored as orc;
> insert overwrite table test_orc_src select 1,2,3 from src limit 1;
> insert overwrite table test_orc_src2 select 1,2,4 from src limit 1;
> set hive.auto.convert.join = false;
> set hive.execution.engine=mr;
> select
>   tb.c
> from test.test_orc_src tb
> join (select * from test.test_orc_src2) tm
> on tb.a = tm.a
> where tb.b = 2
> {code}
> The correct result is 3, but the query produced no result.
> I found that in HiveInputFormat.pushProjectionsAndFilters:
> {code}
> match = splitPath.startsWith(key) || splitPathWithNoSchema.startsWith(key);
> {code}
> It uses startsWith to match the alias key against the split path, so tm will 
> match two aliases in this case.
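To illustrate the prefix-matching pitfall, here is a small, hypothetical sketch of a boundary-aware comparison; the helper name and paths are assumptions, not the attached fix.

{code}
final class PathAliasMatch {
  // Returns true only when aliasKey matches splitPath up to a path-component
  // boundary, so ".../test_orc_src" no longer matches ".../test_orc_src2/...".
  static boolean pathMatches(String splitPath, String aliasKey) {
    if (!splitPath.startsWith(aliasKey)) {
      return false;
    }
    return splitPath.length() == aliasKey.length()
        || splitPath.charAt(aliasKey.length()) == '/';
  }

  public static void main(String[] args) {
    // prefix match alone would wrongly accept the first case
    System.out.println(pathMatches("/warehouse/test_orc_src2/000000_0", "/warehouse/test_orc_src")); // false
    System.out.println(pathMatches("/warehouse/test_orc_src/000000_0", "/warehouse/test_orc_src"));  // true
  }
}
{code}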



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10468) Create scripts to do metastore upgrade tests on jenkins for Oracle DB.

2015-06-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600716#comment-14600716
 ] 

Hive QA commented on HIVE-10468:




{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12741769/HIVE-10468.3.patch

{color:green}SUCCESS:{color} +1 9021 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4375/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4375/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4375/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12741769 - PreCommit-HIVE-TRUNK-Build

> Create scripts to do metastore upgrade tests on jenkins for Oracle DB.
> --
>
> Key: HIVE-10468
> URL: https://issues.apache.org/jira/browse/HIVE-10468
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 1.1.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-10468.1.patch, HIVE-10468.2.patch, 
> HIVE-10468.3.patch, HIVE-10468.4.patch, HIVE-10468.patch
>
>
> This JIRA is to isolate the work specific to Oracle DB in HIVE-10239. Because 
> of the absence of 64-bit Debian packages for oracle-xe, the apt-get install 
> fails on the AWS systems.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10468) Create scripts to do metastore upgrade tests on jenkins for Oracle DB.

2015-06-24 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-10468:
-
Attachment: HIVE-10468.4.patch

> Create scripts to do metastore upgrade tests on jenkins for Oracle DB.
> --
>
> Key: HIVE-10468
> URL: https://issues.apache.org/jira/browse/HIVE-10468
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 1.1.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-10468.1.patch, HIVE-10468.2.patch, 
> HIVE-10468.3.patch, HIVE-10468.4.patch, HIVE-10468.patch
>
>
> This JIRA is to isolate the work specific to Oracle DB in HIVE-10239. Because 
> of the absence of 64-bit Debian packages for oracle-xe, the apt-get install 
> fails on the AWS systems.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11051) Hive 1.2.0 MapJoin w/Tez - LazyBinaryArray cannot be cast to [Ljava.lang.Object;

2015-06-24 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600644#comment-14600644
 ] 

Gopal V commented on HIVE-11051:


Also needs a back-port to branch-1, once the pre-commit tests pass on trunk.

> Hive 1.2.0  MapJoin w/Tez - LazyBinaryArray cannot be cast to 
> [Ljava.lang.Object;
> -
>
> Key: HIVE-11051
> URL: https://issues.apache.org/jira/browse/HIVE-11051
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers, Tez
>Affects Versions: 1.2.0, 2.0.0
>Reporter: Greg Senia
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-11051.01.patch, HIVE-11051.02.patch, 
> problem_table_joins.tar.gz
>
>
> I tried to apply HIVE-10729, which did not solve the issue.
> The following exception is thrown on a Tez MapJoin with Hive 1.2.0 and Tez 
> 0.5.4/0.5.3:
> {code}
> Status: Running (Executing on YARN cluster with App id 
> application_1434641270368_1038)
> 
> VERTICES  STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  
> KILLED
> 
> Map 1 ..   SUCCEEDED  3  300   0  
>  0
> Map 2 ... FAILED  3  102   7  
>  0
> 
> VERTICES: 01/02  [=>>-] 66%   ELAPSED TIME: 7.39 s
>  
> 
> Status: Failed
> Vertex failed, vertexName=Map 2, vertexId=vertex_1434641270368_1038_2_01, 
> diagnostics=[Task failed, taskId=task_1434641270368_1038_2_01_02, 
> diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running 
> task:java.lang.RuntimeException: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row 
> {"cnctevn_id":"002245282386","svcrqst_id":"003627217285","svcrqst_crt_dts":"2015-04-23
>  11:54:39.238357","subject_seq_no":1,"plan_component":"HMOM1 
> ","cust_segment":"RM 
> ","cnctyp_cd":"001","cnctmd_cd":"D02","cnctevs_cd":"007","svcrtyp_cd":"335","svrstyp_cd":"088","cmpltyp_cd":"
>  ","catsrsn_cd":"","apealvl_cd":" 
> ","cnstnty_cd":"001","svcrqst_asrqst_ind":"Y","svcrqst_rtnorig_in":"N","svcrqst_vwasof_dt":"null","sum_reason_cd":"98","sum_reason":"Exclude","crsr_master_claim_index":null,"svcrqct_cds":["
>"],"svcrqst_lupdt":"2015-04-23 
> 22:14:01.288132","crsr_lupdt":null,"cntevsds_lupdt":"2015-04-23 
> 11:54:40.740061","ignore_me":1,"notes":null}
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row 
> {"cnctevn_id":"002245282386","svcrqst_id":"003627217285","svcrqst_crt_dts":"2015-04-23
>  11:54:39.238357","subject_seq_no":1,"plan_component":"HMOM1 
> ","cust_segment":"RM 
> ","cnctyp_cd":"001","cnctmd_cd":"D02","cnctevs_cd":"007","svcrtyp_cd":"335","svrstyp_cd":"088","cmpltyp_cd":"
>  ","catsrsn_cd":"","apealvl_cd":" 
> ","cnstnty_cd":"001","svcrqst_asrqst_ind":"Y","svcrqst_rtnorig_in":"N","svcrqst_vwasof_dt":"null","sum_reason_cd":"98","sum_reason":"Exclude","crsr_master_claim_index":null,"svcrqct_cds":["
>"],"svcrqst_lupdt":"2015-04-23 
> 22:14:01.288132","crsr_lupdt":null,

[jira] [Updated] (HIVE-11051) Hive 1.2.0 MapJoin w/Tez - LazyBinaryArray cannot be cast to [Ljava.lang.Object;

2015-06-24 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-11051:
---
Affects Version/s: 2.0.0

> Hive 1.2.0  MapJoin w/Tez - LazyBinaryArray cannot be cast to 
> [Ljava.lang.Object;
> -
>
> Key: HIVE-11051
> URL: https://issues.apache.org/jira/browse/HIVE-11051
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers, Tez
>Affects Versions: 1.2.0, 2.0.0
>Reporter: Greg Senia
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-11051.01.patch, HIVE-11051.02.patch, 
> problem_table_joins.tar.gz
>
>
> I tried to apply HIVE-10729, which did not solve the issue.
> The following exception is thrown on a Tez MapJoin with Hive 1.2.0 and Tez 
> 0.5.4/0.5.3:
> {code}
> Status: Running (Executing on YARN cluster with App id 
> application_1434641270368_1038)
> 
> VERTICES  STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  
> KILLED
> 
> Map 1 ..   SUCCEEDED  3  300   0  
>  0
> Map 2 ... FAILED  3  102   7  
>  0
> 
> VERTICES: 01/02  [=>>-] 66%   ELAPSED TIME: 7.39 s
>  
> 
> Status: Failed
> Vertex failed, vertexName=Map 2, vertexId=vertex_1434641270368_1038_2_01, 
> diagnostics=[Task failed, taskId=task_1434641270368_1038_2_01_02, 
> diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running 
> task:java.lang.RuntimeException: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row 
> {"cnctevn_id":"002245282386","svcrqst_id":"003627217285","svcrqst_crt_dts":"2015-04-23
>  11:54:39.238357","subject_seq_no":1,"plan_component":"HMOM1 
> ","cust_segment":"RM 
> ","cnctyp_cd":"001","cnctmd_cd":"D02","cnctevs_cd":"007","svcrtyp_cd":"335","svrstyp_cd":"088","cmpltyp_cd":"
>  ","catsrsn_cd":"","apealvl_cd":" 
> ","cnstnty_cd":"001","svcrqst_asrqst_ind":"Y","svcrqst_rtnorig_in":"N","svcrqst_vwasof_dt":"null","sum_reason_cd":"98","sum_reason":"Exclude","crsr_master_claim_index":null,"svcrqct_cds":["
>"],"svcrqst_lupdt":"2015-04-23 
> 22:14:01.288132","crsr_lupdt":null,"cntevsds_lupdt":"2015-04-23 
> 11:54:40.740061","ignore_me":1,"notes":null}
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row 
> {"cnctevn_id":"002245282386","svcrqst_id":"003627217285","svcrqst_crt_dts":"2015-04-23
>  11:54:39.238357","subject_seq_no":1,"plan_component":"HMOM1 
> ","cust_segment":"RM 
> ","cnctyp_cd":"001","cnctmd_cd":"D02","cnctevs_cd":"007","svcrtyp_cd":"335","svrstyp_cd":"088","cmpltyp_cd":"
>  ","catsrsn_cd":"","apealvl_cd":" 
> ","cnstnty_cd":"001","svcrqst_asrqst_ind":"Y","svcrqst_rtnorig_in":"N","svcrqst_vwasof_dt":"null","sum_reason_cd":"98","sum_reason":"Exclude","crsr_master_claim_index":null,"svcrqct_cds":["
>"],"svcrqst_lupdt":"2015-04-23 
> 22:14:01.288132","crsr_lupdt":null,"cntevsds_lupdt":"2015-04-23 
> 11:54:40.740061","ignore_me":1,"notes":null}
> at 
> org.apache.ha

[jira] [Commented] (HIVE-11051) Hive 1.2.0 MapJoin w/Tez - LazyBinaryArray cannot be cast to [Ljava.lang.Object;

2015-06-24 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600640#comment-14600640
 ] 

Gopal V commented on HIVE-11051:


[~mmccline]: LGTM - +1.

> Hive 1.2.0  MapJoin w/Tez - LazyBinaryArray cannot be cast to 
> [Ljava.lang.Object;
> -
>
> Key: HIVE-11051
> URL: https://issues.apache.org/jira/browse/HIVE-11051
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers, Tez
>Affects Versions: 1.2.0
>Reporter: Greg Senia
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-11051.01.patch, HIVE-11051.02.patch, 
> problem_table_joins.tar.gz
>
>
> I tried to apply HIVE-10729, which did not solve the issue.
> The following exception is thrown on a Tez MapJoin with Hive 1.2.0 and Tez 
> 0.5.4/0.5.3:
> {code}
> Status: Running (Executing on YARN cluster with App id 
> application_1434641270368_1038)
> 
> VERTICES  STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  
> KILLED
> 
> Map 1 ..   SUCCEEDED  3  300   0  
>  0
> Map 2 ... FAILED  3  102   7  
>  0
> 
> VERTICES: 01/02  [=>>-] 66%   ELAPSED TIME: 7.39 s
>  
> 
> Status: Failed
> Vertex failed, vertexName=Map 2, vertexId=vertex_1434641270368_1038_2_01, 
> diagnostics=[Task failed, taskId=task_1434641270368_1038_2_01_02, 
> diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running 
> task:java.lang.RuntimeException: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row 
> {"cnctevn_id":"002245282386","svcrqst_id":"003627217285","svcrqst_crt_dts":"2015-04-23
>  11:54:39.238357","subject_seq_no":1,"plan_component":"HMOM1 
> ","cust_segment":"RM 
> ","cnctyp_cd":"001","cnctmd_cd":"D02","cnctevs_cd":"007","svcrtyp_cd":"335","svrstyp_cd":"088","cmpltyp_cd":"
>  ","catsrsn_cd":"","apealvl_cd":" 
> ","cnstnty_cd":"001","svcrqst_asrqst_ind":"Y","svcrqst_rtnorig_in":"N","svcrqst_vwasof_dt":"null","sum_reason_cd":"98","sum_reason":"Exclude","crsr_master_claim_index":null,"svcrqct_cds":["
>"],"svcrqst_lupdt":"2015-04-23 
> 22:14:01.288132","crsr_lupdt":null,"cntevsds_lupdt":"2015-04-23 
> 11:54:40.740061","ignore_me":1,"notes":null}
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row 
> {"cnctevn_id":"002245282386","svcrqst_id":"003627217285","svcrqst_crt_dts":"2015-04-23
>  11:54:39.238357","subject_seq_no":1,"plan_component":"HMOM1 
> ","cust_segment":"RM 
> ","cnctyp_cd":"001","cnctmd_cd":"D02","cnctevs_cd":"007","svcrtyp_cd":"335","svrstyp_cd":"088","cmpltyp_cd":"
>  ","catsrsn_cd":"","apealvl_cd":" 
> ","cnstnty_cd":"001","svcrqst_asrqst_ind":"Y","svcrqst_rtnorig_in":"N","svcrqst_vwasof_dt":"null","sum_reason_cd":"98","sum_reason":"Exclude","crsr_master_claim_index":null,"svcrqct_cds":["
>"],"svcrqst_lupdt":"2015-04-23 
> 22:14:01.288132","crsr_lupdt":null,"cntevsds_lupdt":"2015-04-23 
> 11:54:40.740061","ignore_me"

[jira] [Commented] (HIVE-11090) ordering issues with windows unit test runs

2015-06-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600637#comment-14600637
 ] 

Hive QA commented on HIVE-11090:




{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12741699/HIVE-11090.02.patch

{color:green}SUCCESS:{color} +1 9024 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4374/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4374/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4374/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12741699 - PreCommit-HIVE-TRUNK-Build

> ordering issues with windows unit test runs
> ---
>
> Key: HIVE-11090
> URL: https://issues.apache.org/jira/browse/HIVE-11090
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Matt McCline
>Assignee: Matt McCline
> Attachments: HIVE-11090.01.patch, HIVE-11090.02.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11090) ordering issues with windows unit test runs

2015-06-24 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600633#comment-14600633
 ] 

Gunther Hagleitner commented on HIVE-11090:
---

+1 assuming tests pass.

> ordering issues with windows unit test runs
> ---
>
> Key: HIVE-11090
> URL: https://issues.apache.org/jira/browse/HIVE-11090
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Matt McCline
>Assignee: Matt McCline
> Attachments: HIVE-11090.01.patch, HIVE-11090.02.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11099) Add support for running negative q-tests [Spark Branch]

2015-06-24 Thread Mohit Sabharwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohit Sabharwal updated HIVE-11099:
---
Attachment: HIVE-11099-1-spark.patch

> Add support for running negative q-tests [Spark Branch]
> ---
>
> Key: HIVE-11099
> URL: https://issues.apache.org/jira/browse/HIVE-11099
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Attachments: HIVE-11099-1-spark.patch, HIVE-11099-spark.patch
>
>
> Add support for the TestSparkNegativeCliDriver and 
> TestMiniSparkOnYarnNegativeCliDriver negative q-test drivers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10468) Create scripts to do metastore upgrade tests on jenkins for Oracle DB.

2015-06-24 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-10468:
-
Attachment: HIVE-10468.3.patch

Shuffled some things around to make more room in /tmp, which is on the same 
partition as /usr/lib. Hopefully, this will get past the disk space shortage 
issue.

> Create scripts to do metastore upgrade tests on jenkins for Oracle DB.
> --
>
> Key: HIVE-10468
> URL: https://issues.apache.org/jira/browse/HIVE-10468
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 1.1.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-10468.1.patch, HIVE-10468.2.patch, 
> HIVE-10468.3.patch, HIVE-10468.patch
>
>
> This JIRA is to isolate the work specific to Oracle DB in HIVE-10239. Because 
> of the absence of 64-bit Debian packages for oracle-xe, the apt-get install 
> fails on the AWS systems.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11051) Hive 1.2.0 MapJoin w/Tez - LazyBinaryArray cannot be cast to [Ljava.lang.Object;

2015-06-24 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-11051:

Attachment: HIVE-11051.02.patch

> Hive 1.2.0  MapJoin w/Tez - LazyBinaryArray cannot be cast to 
> [Ljava.lang.Object;
> -
>
> Key: HIVE-11051
> URL: https://issues.apache.org/jira/browse/HIVE-11051
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers, Tez
>Affects Versions: 1.2.0
>Reporter: Greg Senia
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-11051.01.patch, HIVE-11051.02.patch, 
> problem_table_joins.tar.gz
>
>
> I tried to apply HIVE-10729, which did not solve the issue.
> The following exception is thrown on a Tez MapJoin with Hive 1.2.0 and Tez 
> 0.5.4/0.5.3:
> {code}
> Status: Running (Executing on YARN cluster with App id 
> application_1434641270368_1038)
> 
> VERTICES  STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  
> KILLED
> 
> Map 1 ..   SUCCEEDED  3  300   0  
>  0
> Map 2 ... FAILED  3  102   7  
>  0
> 
> VERTICES: 01/02  [=>>-] 66%   ELAPSED TIME: 7.39 s
>  
> 
> Status: Failed
> Vertex failed, vertexName=Map 2, vertexId=vertex_1434641270368_1038_2_01, 
> diagnostics=[Task failed, taskId=task_1434641270368_1038_2_01_02, 
> diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running 
> task:java.lang.RuntimeException: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row 
> {"cnctevn_id":"002245282386","svcrqst_id":"003627217285","svcrqst_crt_dts":"2015-04-23
>  11:54:39.238357","subject_seq_no":1,"plan_component":"HMOM1 
> ","cust_segment":"RM 
> ","cnctyp_cd":"001","cnctmd_cd":"D02","cnctevs_cd":"007","svcrtyp_cd":"335","svrstyp_cd":"088","cmpltyp_cd":"
>  ","catsrsn_cd":"","apealvl_cd":" 
> ","cnstnty_cd":"001","svcrqst_asrqst_ind":"Y","svcrqst_rtnorig_in":"N","svcrqst_vwasof_dt":"null","sum_reason_cd":"98","sum_reason":"Exclude","crsr_master_claim_index":null,"svcrqct_cds":["
>"],"svcrqst_lupdt":"2015-04-23 
> 22:14:01.288132","crsr_lupdt":null,"cntevsds_lupdt":"2015-04-23 
> 11:54:40.740061","ignore_me":1,"notes":null}
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row 
> {"cnctevn_id":"002245282386","svcrqst_id":"003627217285","svcrqst_crt_dts":"2015-04-23
>  11:54:39.238357","subject_seq_no":1,"plan_component":"HMOM1 
> ","cust_segment":"RM 
> ","cnctyp_cd":"001","cnctmd_cd":"D02","cnctevs_cd":"007","svcrtyp_cd":"335","svrstyp_cd":"088","cmpltyp_cd":"
>  ","catsrsn_cd":"","apealvl_cd":" 
> ","cnstnty_cd":"001","svcrqst_asrqst_ind":"Y","svcrqst_rtnorig_in":"N","svcrqst_vwasof_dt":"null","sum_reason_cd":"98","sum_reason":"Exclude","crsr_master_claim_index":null,"svcrqct_cds":["
>"],"svcrqst_lupdt":"2015-04-23 
> 22:14:01.288132","crsr_lupdt":null,"cntevsds_lupdt":"2015-04-23 
> 11:54:40.740061","ignore_me":1,"notes":null}
> at 
> org

[jira] [Commented] (HIVE-11051) Hive 1.2.0 MapJoin w/Tez - LazyBinaryArray cannot be cast to [Ljava.lang.Object;

2015-06-24 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600589#comment-14600589
 ] 

Matt McCline commented on HIVE-11051:
-

[~wzheng] [~gopalv] Thank you for your reviews. Patch #2 contains the changes addressing them.

> Hive 1.2.0  MapJoin w/Tez - LazyBinaryArray cannot be cast to 
> [Ljava.lang.Object;
> -
>
> Key: HIVE-11051
> URL: https://issues.apache.org/jira/browse/HIVE-11051
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers, Tez
>Affects Versions: 1.2.0
>Reporter: Greg Senia
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-11051.01.patch, problem_table_joins.tar.gz
>
>
> I tried to apply HIVE-10729, which did not solve the issue.
> The following exception is thrown on a Tez MapJoin with Hive 1.2.0 and Tez 
> 0.5.4/0.5.3:
> {code}
> Status: Running (Executing on YARN cluster with App id 
> application_1434641270368_1038)
> 
> VERTICES  STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  
> KILLED
> 
> Map 1 ..   SUCCEEDED  3  300   0  
>  0
> Map 2 ... FAILED  3  102   7  
>  0
> 
> VERTICES: 01/02  [=>>-] 66%   ELAPSED TIME: 7.39 s
>  
> 
> Status: Failed
> Vertex failed, vertexName=Map 2, vertexId=vertex_1434641270368_1038_2_01, 
> diagnostics=[Task failed, taskId=task_1434641270368_1038_2_01_02, 
> diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running 
> task:java.lang.RuntimeException: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row 
> {"cnctevn_id":"002245282386","svcrqst_id":"003627217285","svcrqst_crt_dts":"2015-04-23
>  11:54:39.238357","subject_seq_no":1,"plan_component":"HMOM1 
> ","cust_segment":"RM 
> ","cnctyp_cd":"001","cnctmd_cd":"D02","cnctevs_cd":"007","svcrtyp_cd":"335","svrstyp_cd":"088","cmpltyp_cd":"
>  ","catsrsn_cd":"","apealvl_cd":" 
> ","cnstnty_cd":"001","svcrqst_asrqst_ind":"Y","svcrqst_rtnorig_in":"N","svcrqst_vwasof_dt":"null","sum_reason_cd":"98","sum_reason":"Exclude","crsr_master_claim_index":null,"svcrqct_cds":["
>"],"svcrqst_lupdt":"2015-04-23 
> 22:14:01.288132","crsr_lupdt":null,"cntevsds_lupdt":"2015-04-23 
> 11:54:40.740061","ignore_me":1,"notes":null}
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row 
> {"cnctevn_id":"002245282386","svcrqst_id":"003627217285","svcrqst_crt_dts":"2015-04-23
>  11:54:39.238357","subject_seq_no":1,"plan_component":"HMOM1 
> ","cust_segment":"RM 
> ","cnctyp_cd":"001","cnctmd_cd":"D02","cnctevs_cd":"007","svcrtyp_cd":"335","svrstyp_cd":"088","cmpltyp_cd":"
>  ","catsrsn_cd":"","apealvl_cd":" 
> ","cnstnty_cd":"001","svcrqst_asrqst_ind":"Y","svcrqst_rtnorig_in":"N","svcrqst_vwasof_dt":"null","sum_reason_cd":"98","sum_reason":"Exclude","crsr_master_claim_index":null,"svcrqct_cds":["
>"],"svcrqst_lupdt":"2015-04-23 
> 22:14:01.288132","crsr_lupdt":null,"cntevsds_lupdt":

[jira] [Commented] (HIVE-11099) Add support for running negative q-tests [Spark Branch]

2015-06-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600581#comment-14600581
 ] 

Hive QA commented on HIVE-11099:




{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12741760/HIVE-11099-spark.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/906/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/906/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-906/

Messages:
{noformat}
 This message was trimmed, see log for full details 
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ 
hive-it-minikdc ---
[INFO] No sources to compile
[INFO] 
[INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ 
hive-it-minikdc ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 1 resource
[INFO] Copying 3 resources
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ hive-it-minikdc ---
[INFO] Executing tasks

main:
[mkdir] Created dir: 
/data/hive-ptest/working/apache-git-source-source/itests/hive-minikdc/target/tmp
[mkdir] Created dir: 
/data/hive-ptest/working/apache-git-source-source/itests/hive-minikdc/target/warehouse
[mkdir] Created dir: 
/data/hive-ptest/working/apache-git-source-source/itests/hive-minikdc/target/tmp/conf
 [copy] Copying 11 files to 
/data/hive-ptest/working/apache-git-source-source/itests/hive-minikdc/target/tmp/conf
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ 
hive-it-minikdc ---
[INFO] Compiling 9 source files to 
/data/hive-ptest/working/apache-git-source-source/itests/hive-minikdc/target/test-classes
[WARNING] 
/data/hive-ptest/working/apache-git-source-source/itests/hive-minikdc/src/test/java/org/apache/hive/minikdc/TestHs2HooksWithMiniKdc.java:
 
/data/hive-ptest/working/apache-git-source-source/itests/hive-minikdc/src/test/java/org/apache/hive/minikdc/TestHs2HooksWithMiniKdc.java
 uses or overrides a deprecated API.
[WARNING] 
/data/hive-ptest/working/apache-git-source-source/itests/hive-minikdc/src/test/java/org/apache/hive/minikdc/TestHs2HooksWithMiniKdc.java:
 Recompile with -Xlint:deprecation for details.
[INFO] 
[INFO] --- maven-surefire-plugin:2.16:test (default-test) @ hive-it-minikdc ---
[INFO] Tests are skipped.
[INFO] 
[INFO] --- maven-jar-plugin:2.2:jar (default-jar) @ hive-it-minikdc ---
[INFO] Building jar: 
/data/hive-ptest/working/apache-git-source-source/itests/hive-minikdc/target/hive-it-minikdc-2.0.0-SNAPSHOT.jar
[INFO] 
[INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ 
hive-it-minikdc ---
[INFO] 
[INFO] --- maven-install-plugin:2.4:install (default-install) @ hive-it-minikdc 
---
[INFO] Installing 
/data/hive-ptest/working/apache-git-source-source/itests/hive-minikdc/target/hive-it-minikdc-2.0.0-SNAPSHOT.jar
 to 
/data/hive-ptest/working/maven/org/apache/hive/hive-it-minikdc/2.0.0-SNAPSHOT/hive-it-minikdc-2.0.0-SNAPSHOT.jar
[INFO] Installing 
/data/hive-ptest/working/apache-git-source-source/itests/hive-minikdc/pom.xml 
to 
/data/hive-ptest/working/maven/org/apache/hive/hive-it-minikdc/2.0.0-SNAPSHOT/hive-it-minikdc-2.0.0-SNAPSHOT.pom
[INFO] 
[INFO] 
[INFO] Building Hive Integration - QFile Spark Tests 2.0.0-SNAPSHOT
[INFO] 
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive-it-qfile-spark 
---
[INFO] Deleting 
/data/hive-ptest/working/apache-git-source-source/itests/qtest-spark/target
[INFO] Deleting 
/data/hive-ptest/working/apache-git-source-source/itests/qtest-spark (includes 
= [datanucleus.log, derby.log], excludes = [])
[INFO] 
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @ 
hive-it-qfile-spark ---
[INFO] 
[INFO] --- properties-maven-plugin:1.0-alpha-2:read-project-properties 
(default) @ hive-it-qfile-spark ---
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (download-spark) @ hive-it-qfile-spark 
---
[INFO] Executing tasks

main:
 [exec] + /bin/pwd
 [exec] /data/hive-ptest/working/apache-git-source-source/itests/qtest-spark
 [exec] + BASE_DIR=./target
 [exec] + HIVE_ROOT=./target/../../../
 [exec] + DOWNLOAD_DIR=./../thirdparty
 [exec] + mkdir -p ./../thirdparty
 [exec] + download 
http://d3jw87u4immizc.cloudfront.net/spark-tarball/spark-1.4.0-bin-hadoop2-without-hive.tgz
 spark
 [exec] + 
url=http://d3jw87u4immizc.cloudfront.net/spark-tarball/spark-1.4.0-bin-hadoop2-

[jira] [Commented] (HIVE-11051) Hive 1.2.0 MapJoin w/Tez - LazyBinaryArray cannot be cast to [Ljava.lang.Object;

2015-06-24 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600570#comment-14600570
 ] 

Gopal V commented on HIVE-11051:


[~mmccline]: the patch LGTM; one minor comment to add to Wei's:

Make these fields final, since they cannot be updated during a single use of the 
join operator.

{code}
+private boolean needsComplexObjectFixup;
+private ArrayList complexObjectArrayBuffer;
{code}

The array lists will still be mutable, but they will never be null if 
initialized once.
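A hedged illustration of the suggestion; the enclosing class, constructor, and generic type are assumptions, since the quoted diff only shows the two field declarations.

{code}
import java.util.ArrayList;

class ComplexObjectState {
  // Final fields are assigned exactly once in the constructor, so they can
  // never become null later; the ArrayList contents themselves stay mutable.
  private final boolean needsComplexObjectFixup;
  private final ArrayList<Object> complexObjectArrayBuffer;

  ComplexObjectState(boolean needsComplexObjectFixup, int fieldCount) {
    this.needsComplexObjectFixup = needsComplexObjectFixup;
    this.complexObjectArrayBuffer = new ArrayList<Object>(fieldCount);
  }
}
{code}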

> Hive 1.2.0  MapJoin w/Tez - LazyBinaryArray cannot be cast to 
> [Ljava.lang.Object;
> -
>
> Key: HIVE-11051
> URL: https://issues.apache.org/jira/browse/HIVE-11051
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers, Tez
>Affects Versions: 1.2.0
>Reporter: Greg Senia
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-11051.01.patch, problem_table_joins.tar.gz
>
>
> I tried to apply HIVE-10729, which did not solve the issue.
> The following exception is thrown on a Tez MapJoin with Hive 1.2.0 and Tez 
> 0.5.4/0.5.3:
> {code}
> Status: Running (Executing on YARN cluster with App id 
> application_1434641270368_1038)
> 
> VERTICES  STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  
> KILLED
> 
> Map 1 ..   SUCCEEDED  3  300   0  
>  0
> Map 2 ... FAILED  3  102   7  
>  0
> 
> VERTICES: 01/02  [=>>-] 66%   ELAPSED TIME: 7.39 s
>  
> 
> Status: Failed
> Vertex failed, vertexName=Map 2, vertexId=vertex_1434641270368_1038_2_01, 
> diagnostics=[Task failed, taskId=task_1434641270368_1038_2_01_02, 
> diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running 
> task:java.lang.RuntimeException: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row 
> {"cnctevn_id":"002245282386","svcrqst_id":"003627217285","svcrqst_crt_dts":"2015-04-23
>  11:54:39.238357","subject_seq_no":1,"plan_component":"HMOM1 
> ","cust_segment":"RM 
> ","cnctyp_cd":"001","cnctmd_cd":"D02","cnctevs_cd":"007","svcrtyp_cd":"335","svrstyp_cd":"088","cmpltyp_cd":"
>  ","catsrsn_cd":"","apealvl_cd":" 
> ","cnstnty_cd":"001","svcrqst_asrqst_ind":"Y","svcrqst_rtnorig_in":"N","svcrqst_vwasof_dt":"null","sum_reason_cd":"98","sum_reason":"Exclude","crsr_master_claim_index":null,"svcrqct_cds":["
>"],"svcrqst_lupdt":"2015-04-23 
> 22:14:01.288132","crsr_lupdt":null,"cntevsds_lupdt":"2015-04-23 
> 11:54:40.740061","ignore_me":1,"notes":null}
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row 
> {"cnctevn_id":"002245282386","svcrqst_id":"003627217285","svcrqst_crt_dts":"2015-04-23
>  11:54:39.238357","subject_seq_no":1,"plan_component":"HMOM1 
> ","cust_segment":"RM 
> ","cnctyp_cd":"001","cnctmd_cd":"D02","cnctevs_cd":"007","svcrtyp_cd":"335","svrstyp_cd":"088","cmpltyp_cd":"
>  ","catsrsn_cd":"","apealvl_cd":" 
> ","cnstnty_cd":"001"

[jira] [Updated] (HIVE-11099) Add support for running negative q-tests [Spark Branch]

2015-06-24 Thread Mohit Sabharwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohit Sabharwal updated HIVE-11099:
---
Attachment: HIVE-11099-spark.patch

> Add support for running negative q-tests [Spark Branch]
> ---
>
> Key: HIVE-11099
> URL: https://issues.apache.org/jira/browse/HIVE-11099
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Attachments: HIVE-11099-spark.patch
>
>
> Add support for the TestSparkNegativeCliDriver and 
> TestMiniSparkOnYarnNegativeCliDriver negative q-test drivers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11099) Add support for running negative q-tests [Spark Branch]

2015-06-24 Thread Mohit Sabharwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohit Sabharwal updated HIVE-11099:
---
Attachment: (was: HIVE-11099.spark.patch)

> Add support for running negative q-tests [Spark Branch]
> ---
>
> Key: HIVE-11099
> URL: https://issues.apache.org/jira/browse/HIVE-11099
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
>
> Add support for the TestSparkNegativeCliDriver and 
> TestMiniSparkOnYarnNegativeCliDriver negative q-test drivers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10468) Create scripts to do metastore upgrade tests on jenkins for Oracle DB.

2015-06-24 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-10468:
-
Attachment: HIVE-10468.2.patch

> Create scripts to do metastore upgrade tests on jenkins for Oracle DB.
> --
>
> Key: HIVE-10468
> URL: https://issues.apache.org/jira/browse/HIVE-10468
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 1.1.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-10468.1.patch, HIVE-10468.2.patch, HIVE-10468.patch
>
>
> This JIRA is to isolate the work specific to Oracle DB in HIVE-10239. Because 
> of the absence of 64-bit Debian packages for oracle-xe, the apt-get install 
> fails on the AWS systems.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10754) new Job() is deprecated. Replaced all with Job.getInstance() for Hcatalog

2015-06-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600555#comment-14600555
 ] 

Hive QA commented on HIVE-10754:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12741677/HIVE-10754.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9019 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join28
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4372/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4372/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4372/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12741677 - PreCommit-HIVE-TRUNK-Build

> new Job() is deprecated. Replaced all with Job.getInstance() for Hcatalog
> -
>
> Key: HIVE-10754
> URL: https://issues.apache.org/jira/browse/HIVE-10754
> Project: Hive
>  Issue Type: Sub-task
>  Components: HCatalog
>Affects Versions: 1.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10754.patch
>
>
> Replace all the deprecated new Job() with Job.getInstance() in HCatalog.
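A minimal before/after sketch of the mechanical change described above; the wrapper class is illustrative, since the real call sites are spread across HCatalog.

{code}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class JobCreationExample {
  public static Job createJob(Configuration conf) throws IOException {
    // deprecated constructor being replaced throughout HCatalog:
    //   Job job = new Job(conf);
    return Job.getInstance(conf);  // non-deprecated factory method
  }
}
{code}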



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11051) Hive 1.2.0 MapJoin w/Tez - LazyBinaryArray cannot be cast to [Ljava.lang.Object;

2015-06-24 Thread Wei Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600535#comment-14600535
 ] 

Wei Zheng commented on HIVE-11051:
--

[~mmccline] +1. The patch looks good.

Some trivial comments:
1. There are two unused imports in HybridHashTableContainer that can be removed.
2. fixupComplexObjects may deserve a better name like getComplexFieldsAsList, to 
be consistent with getFieldsAsList. It's your call.
3. There's one line exceeding 80 chars in the unpack method of the two container 
classes.
4. This comment may no longer be needed (it's been there for a while): // TODO: 
should we unset bytes after that?

> Hive 1.2.0  MapJoin w/Tez - LazyBinaryArray cannot be cast to 
> [Ljava.lang.Object;
> -
>
> Key: HIVE-11051
> URL: https://issues.apache.org/jira/browse/HIVE-11051
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers, Tez
>Affects Versions: 1.2.0
>Reporter: Greg Senia
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-11051.01.patch, problem_table_joins.tar.gz
>
>
> I tried to apply HIVE-10729, which did not solve the issue.
> The following exception is thrown on a Tez MapJoin with Hive 1.2.0 and Tez 
> 0.5.4/0.5.3:
> {code}
> Status: Running (Executing on YARN cluster with App id 
> application_1434641270368_1038)
> 
> VERTICES  STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  
> KILLED
> 
> Map 1 ..   SUCCEEDED  3  300   0  
>  0
> Map 2 ... FAILED  3  102   7  
>  0
> 
> VERTICES: 01/02  [=>>-] 66%   ELAPSED TIME: 7.39 s
>  
> 
> Status: Failed
> Vertex failed, vertexName=Map 2, vertexId=vertex_1434641270368_1038_2_01, 
> diagnostics=[Task failed, taskId=task_1434641270368_1038_2_01_02, 
> diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running 
> task:java.lang.RuntimeException: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row 
> {"cnctevn_id":"002245282386","svcrqst_id":"003627217285","svcrqst_crt_dts":"2015-04-23
>  11:54:39.238357","subject_seq_no":1,"plan_component":"HMOM1 
> ","cust_segment":"RM 
> ","cnctyp_cd":"001","cnctmd_cd":"D02","cnctevs_cd":"007","svcrtyp_cd":"335","svrstyp_cd":"088","cmpltyp_cd":"
>  ","catsrsn_cd":"","apealvl_cd":" 
> ","cnstnty_cd":"001","svcrqst_asrqst_ind":"Y","svcrqst_rtnorig_in":"N","svcrqst_vwasof_dt":"null","sum_reason_cd":"98","sum_reason":"Exclude","crsr_master_claim_index":null,"svcrqct_cds":["
>"],"svcrqst_lupdt":"2015-04-23 
> 22:14:01.288132","crsr_lupdt":null,"cntevsds_lupdt":"2015-04-23 
> 11:54:40.740061","ignore_me":1,"notes":null}
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row 
> {"cnctevn_id":"002245282386","svcrqst_id":"003627217285","svcrqst_crt_dts":"2015-04-23
>  11:54:39.238357","subject_seq_no":1,"plan_component":"HMOM1 
> ","cust_segment":"RM 
> ","cnctyp_cd":"001","cnctmd_cd":"D02","cn

[jira] [Updated] (HIVE-10468) Create scripts to do metastore upgrade tests on jenkins for Oracle DB.

2015-06-24 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-10468:
-
Attachment: HIVE-10468.1.patch

> Create scripts to do metastore upgrade tests on jenkins for Oracle DB.
> --
>
> Key: HIVE-10468
> URL: https://issues.apache.org/jira/browse/HIVE-10468
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 1.1.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-10468.1.patch, HIVE-10468.patch
>
>
> This JIRA is to isolate the work specific to Oracle DB in HIVE-10239. Because 
> of the absence of 64-bit Debian packages for oracle-xe, the apt-get install 
> fails on the AWS systems.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10328) Enable new return path for cbo

2015-06-24 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-10328:

Attachment: HIVE-10328.5.patch

> Enable new return path for cbo
> --
>
> Key: HIVE-10328
> URL: https://issues.apache.org/jira/browse/HIVE-10328
> Project: Hive
>  Issue Type: Task
>  Components: CBO
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-10328.1.patch, HIVE-10328.2.patch, 
> HIVE-10328.3.patch, HIVE-10328.4.patch, HIVE-10328.4.patch, 
> HIVE-10328.5.patch, HIVE-10328.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10233) Hive on tez: memory manager for grace hash join

2015-06-24 Thread Wei Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600475#comment-14600475
 ] 

Wei Zheng commented on HIVE-10233:
--

[~vikram.dixit] Patch 14 looks good.

> Hive on tez: memory manager for grace hash join
> ---
>
> Key: HIVE-10233
> URL: https://issues.apache.org/jira/browse/HIVE-10233
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: llap, 2.0.0
>Reporter: Vikram Dixit K
>Assignee: Gunther Hagleitner
> Attachments: HIVE-10233-WIP-2.patch, HIVE-10233-WIP-3.patch, 
> HIVE-10233-WIP-4.patch, HIVE-10233-WIP-5.patch, HIVE-10233-WIP-6.patch, 
> HIVE-10233-WIP-7.patch, HIVE-10233-WIP-8.patch, HIVE-10233.08.patch, 
> HIVE-10233.09.patch, HIVE-10233.10.patch, HIVE-10233.11.patch, 
> HIVE-10233.12.patch, HIVE-10233.13.patch, HIVE-10233.14.patch
>
>
> We need a memory manager in llap/tez to manage the usage of memory across 
> threads. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10533) CBO (Calcite Return Path): Join to MultiJoin support for outer joins

2015-06-24 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600463#comment-14600463
 ] 

Ashutosh Chauhan commented on HIVE-10533:
-

+1

> CBO (Calcite Return Path): Join to MultiJoin support for outer joins
> 
>
> Key: HIVE-10533
> URL: https://issues.apache.org/jira/browse/HIVE-10533
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-10533.01.patch, HIVE-10533.02.patch, 
> HIVE-10533.02.patch, HIVE-10533.03.patch, HIVE-10533.04.patch, 
> HIVE-10533.05.patch, HIVE-10533.patch
>
>
> CBO return path: auto_join7.q can be used to reproduce the problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-6791) Support variable substition for Beeline shell command

2015-06-24 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600458#comment-14600458
 ] 

Ferdinand Xu commented on HIVE-6791:


Hi [~spena], it seems the precommit was not triggered. Could you take a look at 
it? Thank you!

> Support variable substition for Beeline shell command
> -
>
> Key: HIVE-6791
> URL: https://issues.apache.org/jira/browse/HIVE-6791
> Project: Hive
>  Issue Type: New Feature
>  Components: CLI, Clients
>Affects Versions: 0.14.0
>Reporter: Xuefu Zhang
>Assignee: Ferdinand Xu
> Attachments: HIVE-6791-beeline-cli.2.patch, 
> HIVE-6791-beeline-cli.patch
>
>
> A follow-up task from HIVE-6694. Similar to HIVE-6570.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10533) CBO (Calcite Return Path): Join to MultiJoin support for outer joins

2015-06-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600444#comment-14600444
 ] 

Hive QA commented on HIVE-10533:




{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12741662/HIVE-10533.05.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9007 tests executed
*Failed tests:*
{noformat}
TestCliDriver-smb_mapjoin_15.q-groupby_grouping_id2.q-exim_07_all_part_over_nonoverlap.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join28
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4371/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4371/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4371/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12741662 - PreCommit-HIVE-TRUNK-Build

> CBO (Calcite Return Path): Join to MultiJoin support for outer joins
> 
>
> Key: HIVE-10533
> URL: https://issues.apache.org/jira/browse/HIVE-10533
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-10533.01.patch, HIVE-10533.02.patch, 
> HIVE-10533.02.patch, HIVE-10533.03.patch, HIVE-10533.04.patch, 
> HIVE-10533.05.patch, HIVE-10533.patch
>
>
> CBO return path: auto_join7.q can be used to reproduce the problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11096) Bump the parquet version to 1.7.0

2015-06-24 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600440#comment-14600440
 ] 

Ferdinand Xu commented on HIVE-11096:
-

Thanks [~spena], LGTM +1

> Bump the parquet version to 1.7.0
> -
>
> Key: HIVE-11096
> URL: https://issues.apache.org/jira/browse/HIVE-11096
> Project: Hive
>  Issue Type: Task
>Affects Versions: 1.2.0
>Reporter: Sergio Peña
>Assignee: Ferdinand Xu
>Priority: Minor
> Attachments: HIVE-11096.1.patch
>
>
> Parquet has moved officially as an Apache project since parquet 1.7.0.
> This new version does not have any bugfixes nor improvements from its last 
> 1.6.0 version, but all imports were changed to be org.apache.parquet, and the 
> pom.xml must use org.apache.parquet instead of com.twitter.
> This ticket should address those import and pom changes only.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11104) Select operator doesn't propagate constants appearing in expressions

2015-06-24 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-11104:

Attachment: HIVE-11104.patch

> Select operator doesn't propagate constants appearing in expressions
> 
>
> Key: HIVE-11104
> URL: https://issues.apache.org/jira/browse/HIVE-11104
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.2.1
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-11104.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11084) Issue in Parquet Hive Table

2015-06-24 Thread Chanchal Kumar Ghosh (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600407#comment-14600407
 ] 

Chanchal Kumar Ghosh commented on HIVE-11084:
-

But the show create table command shows ROW FORMAT DELIMITED.

> Issue in Parquet Hive Table
> ---
>
> Key: HIVE-11084
> URL: https://issues.apache.org/jira/browse/HIVE-11084
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 0.9.0
> Environment: GNU/Linux
>Reporter: Chanchal Kumar Ghosh
>Assignee: Sergio Peña
>
> {code}
> hive> CREATE TABLE intable_p (
> >   sr_no int,
> >   name string,
> >   emp_id int
> > ) PARTITIONED BY (
> >   a string,
> >   b string,
> >   c string
> > ) ROW FORMAT DELIMITED
> >   FIELDS TERMINATED BY '\t'
> >   LINES TERMINATED BY '\n'
> > STORED AS PARQUET;
> hive> insert overwrite table intable_p partition (a='a', b='b', c='c') select 
> * from intable;
> Total jobs = 3
> Launching Job 1 out of 3
> Number of reduce tasks is set to 0 since there's no reduce operator
> 
> MapReduce Jobs Launched:
> Stage-Stage-1: Map: 1   Cumulative CPU: 2.59 sec   HDFS Read: 247 HDFS Write: 
> 410 SUCCESS
> Total MapReduce CPU Time Spent: 2 seconds 590 msec
> OK
> Time taken: 30.382 seconds
> hive> show create table intable_p;
> OK
> CREATE  TABLE `intable_p`(
>   `sr_no` int,
>   `name` string,
>   `emp_id` int)
> PARTITIONED BY (
>   `a` string,
>   `b` string,
>   `c` string)
> ROW FORMAT DELIMITED
>   FIELDS TERMINATED BY '\t'
>   LINES TERMINATED BY '\n'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
> LOCATION
>   'hdfs://nameservice1/hive/db/intable_p'
> TBLPROPERTIES (
>   'transient_lastDdlTime'='1435080569')
> Time taken: 0.212 seconds, Fetched: 19 row(s)
> hive> CREATE  TABLE `intable_p2`(
> >   `sr_no` int,
> >   `name` string,
> >   `emp_id` int)
> > PARTITIONED BY (
> >   `a` string,
> >   `b` string,
> >   `c` string)
> > ROW FORMAT DELIMITED
> >   FIELDS TERMINATED BY '\t'
> >   LINES TERMINATED BY '\n'
> > STORED AS INPUTFORMAT
> >   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
> > OUTPUTFORMAT
> >   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat';
> OK
> Time taken: 0.179 seconds
> hive> insert overwrite table intable_p2 partition (a='a', b='b', c='c') 
> select * from intable;
> Total jobs = 3
> Launching Job 1 out of 3
> Number of reduce tasks is set to 0 since there's no reduce operator
> ...
> Hadoop job information for Stage-1: number of mappers: 1; number of reducers: > 0
> 2015-06-23 17:34:40,471 Stage-1 map = 0%,  reduce = 0%
> 2015-06-23 17:35:10,753 Stage-1 map = 100%,  reduce = 0%
> Ended Job = job_1433246369760_7947 with errors
> Error during job, obtaining debugging information...
> Examining task ID: task_ (and more) from job job_
> Task with the most failures(4):
> -
> Task ID:
>   task_
> URL:
>   
> -
> Diagnostic Messages for this Task:
> Error: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row {"sr_no":1,"name":"ABC","emp_id":1001}
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:198)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row {"sr_no":1,"name":"ABC","emp_id":1001}
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:549)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:180)
> ... 8 more
> Caused by: {color:red}java.lang.ClassCastException: org.apache.hadoop.io.Text 
> cannot be cast to org.apache.hadoop.io.ArrayWritable{color}
> at 
> org.apache.hadoop.hive.ql.io.parquet.write.ParquetRecordWriterWrapper.write(ParquetRecordWriterWrapper.java:105)
> at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:628)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796

[jira] [Commented] (HIVE-10233) Hive on tez: memory manager for grace hash join

2015-06-24 Thread Mostafa Mokhtar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600409#comment-14600409
 ] 

Mostafa Mokhtar commented on HIVE-10233:


[~vikram.dixit]
Yes, the issue is unrelated.

> Hive on tez: memory manager for grace hash join
> ---
>
> Key: HIVE-10233
> URL: https://issues.apache.org/jira/browse/HIVE-10233
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: llap, 2.0.0
>Reporter: Vikram Dixit K
>Assignee: Gunther Hagleitner
> Attachments: HIVE-10233-WIP-2.patch, HIVE-10233-WIP-3.patch, 
> HIVE-10233-WIP-4.patch, HIVE-10233-WIP-5.patch, HIVE-10233-WIP-6.patch, 
> HIVE-10233-WIP-7.patch, HIVE-10233-WIP-8.patch, HIVE-10233.08.patch, 
> HIVE-10233.09.patch, HIVE-10233.10.patch, HIVE-10233.11.patch, 
> HIVE-10233.12.patch, HIVE-10233.13.patch, HIVE-10233.14.patch
>
>
> We need a memory manager in llap/tez to manage the usage of memory across 
> threads. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11099) Add support for running negative q-tests [Spark Branch]

2015-06-24 Thread Mohit Sabharwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohit Sabharwal updated HIVE-11099:
---
Attachment: (was: HIVE-11099.patch)

> Add support for running negative q-tests [Spark Branch]
> ---
>
> Key: HIVE-11099
> URL: https://issues.apache.org/jira/browse/HIVE-11099
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Attachments: HIVE-11099.spark.patch
>
>
> Add support for TestSparkNegativeCliDriver 
> TestMiniSparkOnYarnNegativeCliDriver to negative q-tests



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10233) Hive on tez: memory manager for grace hash join

2015-06-24 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600393#comment-14600393
 ] 

Vikram Dixit K commented on HIVE-10233:
---

Those don't look like the memory manager's errors. I think this is a different 
issue.

> Hive on tez: memory manager for grace hash join
> ---
>
> Key: HIVE-10233
> URL: https://issues.apache.org/jira/browse/HIVE-10233
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: llap, 2.0.0
>Reporter: Vikram Dixit K
>Assignee: Gunther Hagleitner
> Attachments: HIVE-10233-WIP-2.patch, HIVE-10233-WIP-3.patch, 
> HIVE-10233-WIP-4.patch, HIVE-10233-WIP-5.patch, HIVE-10233-WIP-6.patch, 
> HIVE-10233-WIP-7.patch, HIVE-10233-WIP-8.patch, HIVE-10233.08.patch, 
> HIVE-10233.09.patch, HIVE-10233.10.patch, HIVE-10233.11.patch, 
> HIVE-10233.12.patch, HIVE-10233.13.patch, HIVE-10233.14.patch
>
>
> We need a memory manager in llap/tez to manage the usage of memory across 
> threads. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11099) Add support for running negative q-tests [Spark Branch]

2015-06-24 Thread Mohit Sabharwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohit Sabharwal updated HIVE-11099:
---
Attachment: HIVE-11099.spark.patch

> Add support for running negative q-tests [Spark Branch]
> ---
>
> Key: HIVE-11099
> URL: https://issues.apache.org/jira/browse/HIVE-11099
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Attachments: HIVE-11099.spark.patch
>
>
> Add support for TestSparkNegativeCliDriver 
> TestMiniSparkOnYarnNegativeCliDriver to negative q-tests



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11043) ORC split strategies should adapt based on number of files

2015-06-24 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-11043:
---
Attachment: HIVE-11043.3.patch

Work around HIVE-11102

> ORC split strategies should adapt based on number of files
> --
>
> Key: HIVE-11043
> URL: https://issues.apache.org/jira/browse/HIVE-11043
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Gopal V
> Fix For: 2.0.0
>
> Attachments: HIVE-11043.1.patch, HIVE-11043.2.patch, 
> HIVE-11043.3.patch
>
>
> ORC split strategies added in HIVE-10114 chose strategies based on average 
> file size. It would be beneficial to choose a different strategy based on 
> number of files as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11080) Modify VectorizedRowBatch.toString() to not depend on VectorExpressionWriter

2015-06-24 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-11080:
-
Attachment: HIVE-11080.patch

This changes VectorizedRowBatch.toString() to use the physical types of the 
column vectors instead of the Hive types.
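
For illustration only, a minimal sketch of that idea. The helper class and 
method names below are made up and this is not the attached patch; it assumes 
the standard org.apache.hadoop.hive.ql.exec.vector classes.

{code}
import org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.ColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.DoubleColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.LongColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch;

// Hypothetical helper: render each cell from its physical column vector type
// (long/double/bytes) instead of going through an ObjectInspector.
public final class BatchDebugString {

  static String cellToString(ColumnVector cv, int row) {
    int i = cv.isRepeating ? 0 : row;
    if (!cv.noNulls && cv.isNull[i]) {
      return "null";
    }
    if (cv instanceof LongColumnVector) {
      return Long.toString(((LongColumnVector) cv).vector[i]);
    }
    if (cv instanceof DoubleColumnVector) {
      return Double.toString(((DoubleColumnVector) cv).vector[i]);
    }
    if (cv instanceof BytesColumnVector) {
      BytesColumnVector b = (BytesColumnVector) cv;
      return new String(b.vector[i], b.start[i], b.length[i]);
    }
    // Fall back to the physical type name for vectors not handled above.
    return cv.getClass().getSimpleName();
  }

  public static String toDebugString(VectorizedRowBatch batch) {
    StringBuilder sb = new StringBuilder();
    for (int j = 0; j < batch.size; j++) {
      int row = batch.selectedInUse ? batch.selected[j] : j;
      sb.append('[');
      for (int c = 0; c < batch.numCols; c++) {
        if (c > 0) {
          sb.append(", ");
        }
        sb.append(cellToString(batch.cols[c], row));
      }
      sb.append("]\n");
    }
    return sb.toString();
  }
}
{code}

Because every cell is rendered from its ColumnVector subclass, the error-message 
path no longer needs a VectorExpressionWriter or an object inspector.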

> Modify VectorizedRowBatch.toString() to not depend on VectorExpressionWriter
> 
>
> Key: HIVE-11080
> URL: https://issues.apache.org/jira/browse/HIVE-11080
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Attachments: HIVE-11080.patch
>
>
> Currently the VectorizedRowBatch.toString method uses the 
> VectorExpressionWriter to convert the row batch to a string.
> Since the string is only used for printing error messages, I'd propose making 
> the toString use the types of the vector batch instead of the object 
> inspector.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11097) HiveInputFormat uses String.startsWith to compare splitPath and PathToAliases

2015-06-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600338#comment-14600338
 ] 

Hive QA commented on HIVE-11097:




{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12741659/HIVE-11097.1.patch

{color:green}SUCCESS:{color} +1 9019 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4370/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4370/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4370/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12741659 - PreCommit-HIVE-TRUNK-Build

> HiveInputFormat uses String.startsWith to compare splitPath and PathToAliases
> -
>
> Key: HIVE-11097
> URL: https://issues.apache.org/jira/browse/HIVE-11097
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 0.13.0, 0.14.0, 0.13.1, 1.0.0, 1.2.0
> Environment: Hive 0.13.1, Hive 2.0.0, hadoop 2.4.1
>Reporter: Wan Chang
>Priority: Critical
> Attachments: HIVE-11097.1.patch
>
>
> Say we have a sql as
> {code}
> create table if not exists test_orc_src (a int, b int, c int) stored as orc;
> create table if not exists test_orc_src2 (a int, b int, d int) stored as orc;
> insert overwrite table test_orc_src select 1,2,3 from src limit 1;
> insert overwrite table test_orc_src2 select 1,2,4 from src limit 1;
> set hive.auto.convert.join = false;
> set hive.execution.engine=mr;
> select
>   tb.c
> from test.test_orc_src tb
> join (select * from test.test_orc_src2) tm
> on tb.a = tm.a
> where tb.b = 2
> {code}
> The correct result is 3 but it produced no result.
> I find that in HiveInputFormat.pushProjectionsAndFilters
> {code}
> match = splitPath.startsWith(key) || splitPathWithNoSchema.startsWith(key);
> {code}
> It uses startsWith to match aliases against the split path, so tm will match 
> two aliases in this case.
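
For illustration only (not the attached HIVE-11097.1.patch): a prefix check 
that only matches on a path-component boundary avoids the alias mix-up 
described above. The class and method names here are hypothetical.

{code}
// Hypothetical sketch: prefix matching that only succeeds on a path boundary,
// so the split path /db/test_orc_src2/000000_0 does not match the alias path
// /db/test_orc_src the way a bare String.startsWith would.
public final class PathPrefixMatch {

  static boolean matchesOnBoundary(String splitPath, String aliasPath) {
    if (!splitPath.startsWith(aliasPath)) {
      return false;
    }
    // Either an exact match, or the next character starts a new path component.
    return splitPath.length() == aliasPath.length()
        || splitPath.charAt(aliasPath.length()) == '/';
  }

  public static void main(String[] args) {
    System.out.println(matchesOnBoundary("/db/test_orc_src2/000000_0", "/db/test_orc_src")); // false
    System.out.println(matchesOnBoundary("/db/test_orc_src/000000_0", "/db/test_orc_src"));  // true
  }
}
{code}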



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10233) Hive on tez: memory manager for grace hash join

2015-06-24 Thread Mostafa Mokhtar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600332#comment-14600332
 ] 

Mostafa Mokhtar commented on HIVE-10233:


[~hagleitn] [~vikram.dixit]
On the latest build I am hitting OOM for several queries 
{code}
hive> explain select count(*) from store_sales, customer c1, customer_address 
ca1, customer_demographics cd1 , customer c2, customer_address ca2, 
customer_demographics cd2 where ss_customer_sk = c1.c_customer_sk and 
ss_addr_sk = ca1.ca_address_sk and ss_cdemo_sk = cd1.cd_demo_sk;
{code}

Exception 
{code}
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at org.apache.commons.codec.binary.Base64.resizeBuffer(Base64.java:376)
at org.apache.commons.codec.binary.Base64.encode(Base64.java:461)
at org.apache.commons.codec.binary.Base64.encode(Base64.java:937)
at org.apache.commons.codec.binary.Base64.encodeBase64(Base64.java:818)
at org.apache.commons.codec.binary.Base64.encodeBase64(Base64.java:785)
at org.apache.commons.codec.binary.Base64.encodeBase64(Base64.java:767)
at org.apache.commons.codec.binary.Base64.encodeBase64(Base64.java:642)
at 
org.apache.hadoop.hive.ql.exec.Utilities.serializeExpression(Utilities.java:799)
at 
org.apache.hadoop.hive.ql.plan.TableScanDesc.setFilterExpr(TableScanDesc.java:153)
at 
org.apache.hadoop.hive.ql.optimizer.ConstantPropagateProcFactory$ConstantPropagateTableScanProc.process(ConstantPropagateProcFactory.java:1208)
at 
org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
at 
org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:95)
at 
org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:79)
at 
org.apache.hadoop.hive.ql.optimizer.ConstantPropagate$ConstantPropagateWalker.walk(ConstantPropagate.java:150)
at 
org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:110)
at 
org.apache.hadoop.hive.ql.optimizer.ConstantPropagate.transform(ConstantPropagate.java:120)
at 
org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:196)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10169)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9993)
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
at 
org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1124)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1172)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1061)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1051)
at 
org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
at 
org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
{code}

> Hive on tez: memory manager for grace hash join
> ---
>
> Key: HIVE-10233
> URL: https://issues.apache.org/jira/browse/HIVE-10233
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: llap, 2.0.0
>Reporter: Vikram Dixit K
>Assignee: Gunther Hagleitner
> Attachments: HIVE-10233-WIP-2.patch, HIVE-10233-WIP-3.patch, 
> HIVE-10233-WIP-4.patch, HIVE-10233-WIP-5.patch, HIVE-10233-WIP-6.patch, 
> HIVE-10233-WIP-7.patch, HIVE-10233-WIP-8.patch, HIVE-10233.08.patch, 
> HIVE-10233.09.patch, HIVE-10233.10.patch, HIVE-10233.11.patch, 
> HIVE-10233.12.patch, HIVE-10233.13.patch, HIVE-10233.14.patch
>
>
> We need a memory manager in llap/tez to manage the usage of memory across 
> threads. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11051) Hive 1.2.0 MapJoin w/Tez - LazyBinaryArray cannot be cast to [Ljava.lang.Object;

2015-06-24 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600325#comment-14600325
 ] 

Matt McCline commented on HIVE-11051:
-

[~wzheng] Can you give this a non-binding +1?  Thanks

> Hive 1.2.0  MapJoin w/Tez - LazyBinaryArray cannot be cast to 
> [Ljava.lang.Object;
> -
>
> Key: HIVE-11051
> URL: https://issues.apache.org/jira/browse/HIVE-11051
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers, Tez
>Affects Versions: 1.2.0
>Reporter: Greg Senia
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-11051.01.patch, problem_table_joins.tar.gz
>
>
> I tried to apply: HIVE-10729 which did not solve the issue.
> The following exception is thrown on a Tez MapJoin with Hive 1.2.0 and Tez 
> 0.5.4/0.5.3
> {code}
> Status: Running (Executing on YARN cluster with App id 
> application_1434641270368_1038)
> 
> VERTICES  STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  
> KILLED
> 
> Map 1 ..   SUCCEEDED  3  300   0  
>  0
> Map 2 ... FAILED  3  102   7  
>  0
> 
> VERTICES: 01/02  [=>>-] 66%   ELAPSED TIME: 7.39 s
>  
> 
> Status: Failed
> Vertex failed, vertexName=Map 2, vertexId=vertex_1434641270368_1038_2_01, 
> diagnostics=[Task failed, taskId=task_1434641270368_1038_2_01_02, 
> diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running 
> task:java.lang.RuntimeException: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row 
> {"cnctevn_id":"002245282386","svcrqst_id":"003627217285","svcrqst_crt_dts":"2015-04-23
>  11:54:39.238357","subject_seq_no":1,"plan_component":"HMOM1 
> ","cust_segment":"RM 
> ","cnctyp_cd":"001","cnctmd_cd":"D02","cnctevs_cd":"007","svcrtyp_cd":"335","svrstyp_cd":"088","cmpltyp_cd":"
>  ","catsrsn_cd":"","apealvl_cd":" 
> ","cnstnty_cd":"001","svcrqst_asrqst_ind":"Y","svcrqst_rtnorig_in":"N","svcrqst_vwasof_dt":"null","sum_reason_cd":"98","sum_reason":"Exclude","crsr_master_claim_index":null,"svcrqct_cds":["
>"],"svcrqst_lupdt":"2015-04-23 
> 22:14:01.288132","crsr_lupdt":null,"cntevsds_lupdt":"2015-04-23 
> 11:54:40.740061","ignore_me":1,"notes":null}
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row 
> {"cnctevn_id":"002245282386","svcrqst_id":"003627217285","svcrqst_crt_dts":"2015-04-23
>  11:54:39.238357","subject_seq_no":1,"plan_component":"HMOM1 
> ","cust_segment":"RM 
> ","cnctyp_cd":"001","cnctmd_cd":"D02","cnctevs_cd":"007","svcrtyp_cd":"335","svrstyp_cd":"088","cmpltyp_cd":"
>  ","catsrsn_cd":"","apealvl_cd":" 
> ","cnstnty_cd":"001","svcrqst_asrqst_ind":"Y","svcrqst_rtnorig_in":"N","svcrqst_vwasof_dt":"null","sum_reason_cd":"98","sum_reason":"Exclude","crsr_master_claim_index":null,"svcrqct_cds":["
>"],"svcrqst_lupdt":"2015-04-23 
> 22:14:01.288132","crsr_lupdt":null,"cntevsds_lupdt":"2015-04-23 
> 11:54:40.740

[jira] [Commented] (HIVE-11043) ORC split strategies should adapt based on number of files

2015-06-24 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600320#comment-14600320
 ] 

Gopal V commented on HIVE-11043:


Filed HIVE-11102 to track the issue - this error is not related to this patch, 
but has been exposed by this patch (i.e. picking the ETL strategy instead of BI).
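
To make the BI/ETL choice concrete, here is a purely hypothetical selection 
rule; the parameter names and the exact conditions are invented for 
illustration and are not what HIVE-11043.3.patch implements.

{code}
// Hypothetical sketch of a chooser that looks at file count as well as
// average file size. BI emits one cheap split per file without reading ORC
// footers; ETL reads footers and splits on stripe boundaries.
enum SplitStrategyKind { BI, ETL }

final class SplitStrategyChooser {

  static SplitStrategyKind choose(long totalFileBytes, int numFiles,
                                  long maxSplitSize, int minDesiredSplits) {
    long avgFileSize = numFiles == 0 ? 0 : totalFileBytes / numFiles;
    // Large files: stripe-level splits are worth the footer reads.
    // Very few files: footer reads are cheap and stripe splits restore parallelism.
    if (avgFileSize > maxSplitSize || numFiles < minDesiredSplits) {
      return SplitStrategyKind.ETL;
    }
    // Many small files: skip the footer reads and emit one split per file.
    return SplitStrategyKind.BI;
  }
}
{code}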

> ORC split strategies should adapt based on number of files
> --
>
> Key: HIVE-11043
> URL: https://issues.apache.org/jira/browse/HIVE-11043
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Gopal V
> Fix For: 2.0.0
>
> Attachments: HIVE-11043.1.patch, HIVE-11043.2.patch
>
>
> ORC split strategies added in HIVE-10114 chose strategies based on average 
> file size. It would be beneficial to choose a different strategy based on 
> number of files as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10233) Hive on tez: memory manager for grace hash join

2015-06-24 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600314#comment-14600314
 ] 

Vikram Dixit K commented on HIVE-10233:
---

The change looks good from the planning side. Wei, can you take a look from the 
execution side (grace hash join), please?

Thanks
Vikram.

> Hive on tez: memory manager for grace hash join
> ---
>
> Key: HIVE-10233
> URL: https://issues.apache.org/jira/browse/HIVE-10233
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: llap, 2.0.0
>Reporter: Vikram Dixit K
>Assignee: Gunther Hagleitner
> Attachments: HIVE-10233-WIP-2.patch, HIVE-10233-WIP-3.patch, 
> HIVE-10233-WIP-4.patch, HIVE-10233-WIP-5.patch, HIVE-10233-WIP-6.patch, 
> HIVE-10233-WIP-7.patch, HIVE-10233-WIP-8.patch, HIVE-10233.08.patch, 
> HIVE-10233.09.patch, HIVE-10233.10.patch, HIVE-10233.11.patch, 
> HIVE-10233.12.patch, HIVE-10233.13.patch, HIVE-10233.14.patch
>
>
> We need a memory manager in llap/tez to manage the usage of memory across 
> threads. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11090) ordering issues with windows unit test runs

2015-06-24 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600271#comment-14600271
 ] 

Matt McCline commented on HIVE-11090:
-

Verified that the query results match when the Q files are run with 
vectorization off for Spark, Tez, and MR, except for one query result line in 
vectorization_short_regress.q. It looks like an old decimal precision issue:

{code}
1785c1797
< 1969-12-31 16:00:04.063   04XP4DrTCblC788515601.0 79.553  
-1452617198 15601   -407009.58195572987 -15858  -511684.9   
-15601.0158740.1750002  -6432.15344526  -79.553 NULL
-15601.0-2.43391201E8
---
> 1969-12-31 16:00:04.063   04XP4DrTCblC788515601.0 79.553  
> -1452617198 15601   -407009.58195572987 -15858  -511684.9   
> -15601.0158740.1750002  -6432.0 -79.553 NULL-15601.0  
>   -2.43391201E8
1886a1899
{code}

(see newly created HIVE-11101)

> ordering issues with windows unit test runs
> ---
>
> Key: HIVE-11090
> URL: https://issues.apache.org/jira/browse/HIVE-11090
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Matt McCline
>Assignee: Matt McCline
> Attachments: HIVE-11090.01.patch, HIVE-11090.02.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11051) Hive 1.2.0 MapJoin w/Tez - LazyBinaryArray cannot be cast to [Ljava.lang.Object;

2015-06-24 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600270#comment-14600270
 ] 

Matt McCline commented on HIVE-11051:
-

(Oops, added this comment to the wrong JIRA.)

> Hive 1.2.0  MapJoin w/Tez - LazyBinaryArray cannot be cast to 
> [Ljava.lang.Object;
> -
>
> Key: HIVE-11051
> URL: https://issues.apache.org/jira/browse/HIVE-11051
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers, Tez
>Affects Versions: 1.2.0
>Reporter: Greg Senia
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-11051.01.patch, problem_table_joins.tar.gz
>
>
> I tried to apply: HIVE-10729 which did not solve the issue.
> The following exception is thrown on a Tez MapJoin with Hive 1.2.0 and Tez 
> 0.5.4/0.5.3
> {code}
> Status: Running (Executing on YARN cluster with App id 
> application_1434641270368_1038)
> 
> VERTICES  STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  
> KILLED
> 
> Map 1 ..   SUCCEEDED  3  300   0  
>  0
> Map 2 ... FAILED  3  102   7  
>  0
> 
> VERTICES: 01/02  [=>>-] 66%   ELAPSED TIME: 7.39 s
>  
> 
> Status: Failed
> Vertex failed, vertexName=Map 2, vertexId=vertex_1434641270368_1038_2_01, 
> diagnostics=[Task failed, taskId=task_1434641270368_1038_2_01_02, 
> diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running 
> task:java.lang.RuntimeException: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row 
> {"cnctevn_id":"002245282386","svcrqst_id":"003627217285","svcrqst_crt_dts":"2015-04-23
>  11:54:39.238357","subject_seq_no":1,"plan_component":"HMOM1 
> ","cust_segment":"RM 
> ","cnctyp_cd":"001","cnctmd_cd":"D02","cnctevs_cd":"007","svcrtyp_cd":"335","svrstyp_cd":"088","cmpltyp_cd":"
>  ","catsrsn_cd":"","apealvl_cd":" 
> ","cnstnty_cd":"001","svcrqst_asrqst_ind":"Y","svcrqst_rtnorig_in":"N","svcrqst_vwasof_dt":"null","sum_reason_cd":"98","sum_reason":"Exclude","crsr_master_claim_index":null,"svcrqct_cds":["
>"],"svcrqst_lupdt":"2015-04-23 
> 22:14:01.288132","crsr_lupdt":null,"cntevsds_lupdt":"2015-04-23 
> 11:54:40.740061","ignore_me":1,"notes":null}
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row 
> {"cnctevn_id":"002245282386","svcrqst_id":"003627217285","svcrqst_crt_dts":"2015-04-23
>  11:54:39.238357","subject_seq_no":1,"plan_component":"HMOM1 
> ","cust_segment":"RM 
> ","cnctyp_cd":"001","cnctmd_cd":"D02","cnctevs_cd":"007","svcrtyp_cd":"335","svrstyp_cd":"088","cmpltyp_cd":"
>  ","catsrsn_cd":"","apealvl_cd":" 
> ","cnstnty_cd":"001","svcrqst_asrqst_ind":"Y","svcrqst_rtnorig_in":"N","svcrqst_vwasof_dt":"null","sum_reason_cd":"98","sum_reason":"Exclude","crsr_master_claim_index":null,"svcrqct_cds":["
>"],"svcrqst_lupdt":"2015-04-23 
> 22:14:01.288132","crsr_lupdt":null,"cntevsds_lupdt":"2015-04-23 
> 11:54:40.740061","ignore_me":1,"

[jira] [Updated] (HIVE-11077) Add support in parser and wire up to txn manager

2015-06-24 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-11077:
--
Attachment: HIVE-11077.3.patch

Patch 3 includes HIVE-11030 since it's required but not checked in yet.

> Add support in parser and wire up to txn manager
> 
>
> Key: HIVE-11077
> URL: https://issues.apache.org/jira/browse/HIVE-11077
> Project: Hive
>  Issue Type: Sub-task
>  Components: SQL, Transactions
>Affects Versions: 1.2.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-11077.3.patch, HIVE-11077.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10468) Create scripts to do metastore upgrade tests on jenkins for Oracle DB.

2015-06-24 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-10468:
-
Attachment: HIVE-10468.patch

Turns out the debian package for Oracle-XE (Express Edition) in the Oracle 
debian repo has a dependency on libc6 v2.3.2+. There is no such version of this 
library according to http://www.eglibc.org/home, so the installation fails. To 
work around this issue, I have to delete this dependency and re-create the 
debian package locally in this script.



> Create scripts to do metastore upgrade tests on jenkins for Oracle DB.
> --
>
> Key: HIVE-10468
> URL: https://issues.apache.org/jira/browse/HIVE-10468
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 1.1.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-10468.patch
>
>
> This JIRA is to isolate the work specific to Oracle DB in HIVE-10239. Because 
> of absence of 64 bit debian packages for oracle-xe, the apt-get install fails 
> on the AWS systems.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11051) Hive 1.2.0 MapJoin w/Tez - LazyBinaryArray cannot be cast to [Ljava.lang.Object;

2015-06-24 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600236#comment-14600236
 ] 

Matt McCline commented on HIVE-11051:
-

Verified that the query results match when the Q files are run with 
vectorization off for Spark, Tez, and MR, except for one query result line in 
vectorization_short_regress.q. It looks like an old decimal precision issue:

{code}
1785c1797
< 1969-12-31 16:00:04.063   04XP4DrTCblC788515601.0 79.553  
-1452617198 15601   -407009.58195572987 -15858  -511684.9   
-15601.0158740.1750002  -6432.15344526  -79.553 NULL
-15601.0-2.43391201E8
---
> 1969-12-31 16:00:04.063   04XP4DrTCblC788515601.0 79.553  
> -1452617198 15601   -407009.58195572987 -15858  -511684.9   
> -15601.0158740.1750002  -6432.0 -79.553 NULL-15601.0  
>   -2.43391201E8
1886a1899
{code}

(see newly created HIVE-11101)

> Hive 1.2.0  MapJoin w/Tez - LazyBinaryArray cannot be cast to 
> [Ljava.lang.Object;
> -
>
> Key: HIVE-11051
> URL: https://issues.apache.org/jira/browse/HIVE-11051
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers, Tez
>Affects Versions: 1.2.0
>Reporter: Greg Senia
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-11051.01.patch, problem_table_joins.tar.gz
>
>
> I tried to apply: HIVE-10729 which did not solve the issue.
> The following exception is thrown on a Tez MapJoin with Hive 1.2.0 and Tez 
> 0.5.4/0.5.3
> {code}
> Status: Running (Executing on YARN cluster with App id 
> application_1434641270368_1038)
> 
> VERTICES  STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  
> KILLED
> 
> Map 1 ..   SUCCEEDED  3  300   0  
>  0
> Map 2 ... FAILED  3  102   7  
>  0
> 
> VERTICES: 01/02  [=>>-] 66%   ELAPSED TIME: 7.39 s
>  
> 
> Status: Failed
> Vertex failed, vertexName=Map 2, vertexId=vertex_1434641270368_1038_2_01, 
> diagnostics=[Task failed, taskId=task_1434641270368_1038_2_01_02, 
> diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running 
> task:java.lang.RuntimeException: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row 
> {"cnctevn_id":"002245282386","svcrqst_id":"003627217285","svcrqst_crt_dts":"2015-04-23
>  11:54:39.238357","subject_seq_no":1,"plan_component":"HMOM1 
> ","cust_segment":"RM 
> ","cnctyp_cd":"001","cnctmd_cd":"D02","cnctevs_cd":"007","svcrtyp_cd":"335","svrstyp_cd":"088","cmpltyp_cd":"
>  ","catsrsn_cd":"","apealvl_cd":" 
> ","cnstnty_cd":"001","svcrqst_asrqst_ind":"Y","svcrqst_rtnorig_in":"N","svcrqst_vwasof_dt":"null","sum_reason_cd":"98","sum_reason":"Exclude","crsr_master_claim_index":null,"svcrqct_cds":["
>"],"svcrqst_lupdt":"2015-04-23 
> 22:14:01.288132","crsr_lupdt":null,"cntevsds_lupdt":"2015-04-23 
> 11:54:40.740061","ignore_me":1,"notes":null}
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> Caused by: java.lang.RuntimeException: 
> 

[jira] [Resolved] (HIVE-8644) ORC rle v2 writer should round when converting from floats to integers

2015-06-24 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HIVE-8644.
-
Resolution: Won't Fix

I suspect changing this isn't worth it at this point.

> ORC rle v2 writer should round when converting from floats to integers
> --
>
> Key: HIVE-8644
> URL: https://issues.apache.org/jira/browse/HIVE-8644
> Project: Hive
>  Issue Type: Improvement
>  Components: File Formats
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
>Priority: Minor
> Attachments: HIVE-8644.patch
>
>
> The ORC rle v2 writer would do better to round the floating point numbers 
> when converting to integers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9557) create UDF to measure strings similarity using Cosine Similarity algo

2015-06-24 Thread Alexander Pivovarov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600195#comment-14600195
 ] 

Alexander Pivovarov commented on HIVE-9557:
---

Rename clientnegative/udf_cosine_similarity.q to 
clientnegative/udf_cosine_similarity_error_1.q, then:
{code}
# build hive
mvn clean install -Phadoop-2,dist -DskipTests
# build itests
cd itests
mvn clean install -Phadoop-2 -DskipTests
# build qtest
cd qtest
mvn clean install -Phadoop-2 -DskipTests
# run q test. it will overwrite q.out file
mvn test -Dtest=TestCliDriver -Dqfile=udf_cosine_similarity.q,show_functions.q 
-Dtest.output.overwrite=true -Phadoop-2
# run negative q file test
mvn test -Dtest=TestNegativeCliDriver -Dqfile=udf_cosine_similarity_error_1.q 
-Dtest.output.overwrite=true -Phadoop-2
{code}

> create UDF to measure strings similarity using Cosine Similarity algo
> -
>
> Key: HIVE-9557
> URL: https://issues.apache.org/jira/browse/HIVE-9557
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Nishant Kelkar
>  Labels: CosineSimilarity, SimilarityMetric, UDF
> Attachments: udf_cosine_similarity-v01.patch
>
>
> algo description http://en.wikipedia.org/wiki/Cosine_similarity
> {code}
> --one word different, total 2 words
> str_sim_cosine('Test String1', 'Test String2') = (2 - 1) / 2 = 0.5f
> {code}
> reference implementation:
> https://github.com/Simmetrics/simmetrics/blob/master/src/uk/ac/shef/wit/simmetrics/similaritymetrics/CosineSimilarity.java
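
For illustration only, a set-based sketch that reproduces the 0.5 figure from 
the example above, assuming whitespace tokenization into token sets; the class 
name is hypothetical and this is not the attached udf_cosine_similarity-v01.patch.

{code}
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: cosine similarity over the sets of whitespace-separated
// tokens, i.e. the shared-token count divided by sqrt(|A| * |B|).
public final class CosineSimilaritySketch {

  static float cosine(String a, String b) {
    Set<String> left = new HashSet<>(Arrays.asList(a.split("\\s+")));
    Set<String> right = new HashSet<>(Arrays.asList(b.split("\\s+")));
    Set<String> common = new HashSet<>(left);
    common.retainAll(right);
    return (float) (common.size() / Math.sqrt((double) left.size() * right.size()));
  }

  public static void main(String[] args) {
    // {Test, String1} vs {Test, String2}: 1 shared token / sqrt(2 * 2) = 0.5
    System.out.println(cosine("Test String1", "Test String2"));
  }
}
{code}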



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11086) Remove use of ErrorMsg in Orc's RunLengthIntegerReaderV2

2015-06-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600188#comment-14600188
 ] 

Hive QA commented on HIVE-11086:




{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12741649/HIVE-11086.patch

{color:green}SUCCESS:{color} +1 9015 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4369/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4369/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4369/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12741649 - PreCommit-HIVE-TRUNK-Build

> Remove use of ErrorMsg in Orc's RunLengthIntegerReaderV2
> 
>
> Key: HIVE-11086
> URL: https://issues.apache.org/jira/browse/HIVE-11086
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Attachments: HIVE-11086.patch
>
>
> ORC's rle v2 reader uses a string literal from ErrorMsg, which forces a large 
> dependency on the rle v2 reader. Pulling the string literal in directly 
> doesn't change the behavior and fixes the linkage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11090) ordering issues with windows unit test runs

2015-06-24 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-11090:

Attachment: HIVE-11090.02.patch

> ordering issues with windows unit test runs
> ---
>
> Key: HIVE-11090
> URL: https://issues.apache.org/jira/browse/HIVE-11090
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Matt McCline
>Assignee: Matt McCline
> Attachments: HIVE-11090.01.patch, HIVE-11090.02.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11099) Add support for running negative q-tests [Spark Branch]

2015-06-24 Thread Mohit Sabharwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohit Sabharwal updated HIVE-11099:
---
Attachment: HIVE-11099.patch

> Add support for running negative q-tests [Spark Branch]
> ---
>
> Key: HIVE-11099
> URL: https://issues.apache.org/jira/browse/HIVE-11099
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Attachments: HIVE-11099.patch
>
>
> Add support for TestSparkNegativeCliDriver 
> TestMiniSparkOnYarnNegativeCliDriver to negative q-tests



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9248) Vectorization : Tez Reduce vertex not getting vectorized when GROUP BY is Hash mode

2015-06-24 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600152#comment-14600152
 ] 

Jason Dere commented on HIVE-9248:
--

Ran all of the failed tests locally; they pass for me.

> Vectorization : Tez Reduce vertex not getting vectorized when GROUP BY is 
> Hash mode
> ---
>
> Key: HIVE-9248
> URL: https://issues.apache.org/jira/browse/HIVE-9248
> Project: Hive
>  Issue Type: Bug
>  Components: Tez, Vectorization
>Affects Versions: 0.14.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-9248.01.patch, HIVE-9248.02.patch, 
> HIVE-9248.03.patch, HIVE-9248.04.patch, HIVE-9248.05.patch, HIVE-9248.06.patch
>
>
> Under Tez and Vectorization, ReduceWork not getting vectorized unless it 
> GROUP BY operator is MergePartial.  Add valid cases where GROUP BY is Hash 
> (and presumably there are downstream reducers that will do MergePartial).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9557) create UDF to measure strings similarity using Cosine Similarity algo

2015-06-24 Thread Nishant Kelkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600145#comment-14600145
 ] 

Nishant Kelkar commented on HIVE-9557:
--

[~apivovarov], I had a question: When I prepare a 
clientpositives/udf_cosine_similarity.q and a 
clientnegative/udf_cosine_similarity.q, how do I run these? Also, how do I 
create the q.out file?



> create UDF to measure strings similarity using Cosine Similarity algo
> -
>
> Key: HIVE-9557
> URL: https://issues.apache.org/jira/browse/HIVE-9557
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Nishant Kelkar
>  Labels: CosineSimilarity, SimilarityMetric, UDF
> Attachments: udf_cosine_similarity-v01.patch
>
>
> algo description http://en.wikipedia.org/wiki/Cosine_similarity
> {code}
> --one word different, total 2 words
> str_sim_cosine('Test String1', 'Test String2') = (2 - 1) / 2 = 0.5f
> {code}
> reference implementation:
> https://github.com/Simmetrics/simmetrics/blob/master/src/uk/ac/shef/wit/simmetrics/similaritymetrics/CosineSimilarity.java



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10895) ObjectStore does not close Query objects in some calls, causing a potential leak in some metastore db resources

2015-06-24 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600141#comment-14600141
 ] 

Aihua Xu commented on HIVE-10895:
-

Both test failures seem unrelated. The TestPigHBaseStorageHandler test passed locally.

> ObjectStore does not close Query objects in some calls, causing a potential 
> leak in some metastore db resources
> ---
>
> Key: HIVE-10895
> URL: https://issues.apache.org/jira/browse/HIVE-10895
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 0.13
>Reporter: Takahiko Saito
>Assignee: Aihua Xu
> Attachments: HIVE-10895.1.patch, HIVE-10895.2.patch
>
>
> During testing, we've noticed Oracle db running out of cursors. Might be 
> related to this.
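
The pattern being restored is easy to show in isolation. A minimal sketch, 
assuming the JDO API and the metastore's MTable model class; the helper name 
and the query are made up and this is not the attached patch.

{code}
import java.util.Collection;
import javax.jdo.PersistenceManager;
import javax.jdo.Query;

// Hypothetical sketch: always close the JDO Query (and the db cursor behind it)
// in a finally block, even when the method returns early or throws.
final class QueryCloseExample {

  static int countTables(PersistenceManager pm) {
    Query query = pm.newQuery(
        "select from org.apache.hadoop.hive.metastore.model.MTable");
    try {
      Collection<?> result = (Collection<?>) query.execute();
      return result.size();
    } finally {
      query.closeAll();  // releases the underlying result set / cursor
    }
  }
}
{code}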



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11084) Issue in Parquet Hive Table

2015-06-24 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-11084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600109#comment-14600109
 ] 

Sergio Peña commented on HIVE-11084:


I found the problem with your statement. You need to pass the ROW FORMAT SERDE 
parameter because you are using a custom input/output format. This CREATE TABLE 
should work:

{noformat}
hive> CREATE  TABLE `intable_p2`(
>   `sr_no` int,
>   `name` string,
>   `emp_id` int)
> PARTITIONED BY (
>   `a` string,
>   `b` string,
>   `c` string)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat';
{noformat}

Using STORED AS PARQUET lets Hive pick the default serde/inputformat/outputformat 
classes, but when you spell out custom input/output format classes you also need 
to specify the serde class explicitly.

> Issue in Parquet Hive Table
> ---
>
> Key: HIVE-11084
> URL: https://issues.apache.org/jira/browse/HIVE-11084
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 0.9.0
> Environment: GNU/Linux
>Reporter: Chanchal Kumar Ghosh
>
> {code}
> hive> CREATE TABLE intable_p (
> >   sr_no int,
> >   name string,
> >   emp_id int
> > ) PARTITIONED BY (
> >   a string,
> >   b string,
> >   c string
> > ) ROW FORMAT DELIMITED
> >   FIELDS TERMINATED BY '\t'
> >   LINES TERMINATED BY '\n'
> > STORED AS PARQUET;
> hive> insert overwrite table intable_p partition (a='a', b='b', c='c') select 
> * from intable;
> Total jobs = 3
> Launching Job 1 out of 3
> Number of reduce tasks is set to 0 since there's no reduce operator
> 
> MapReduce Jobs Launched:
> Stage-Stage-1: Map: 1   Cumulative CPU: 2.59 sec   HDFS Read: 247 HDFS Write: 
> 410 SUCCESS
> Total MapReduce CPU Time Spent: 2 seconds 590 msec
> OK
> Time taken: 30.382 seconds
> hive> show create table intable_p;
> OK
> CREATE  TABLE `intable_p`(
>   `sr_no` int,
>   `name` string,
>   `emp_id` int)
> PARTITIONED BY (
>   `a` string,
>   `b` string,
>   `c` string)
> ROW FORMAT DELIMITED
>   FIELDS TERMINATED BY '\t'
>   LINES TERMINATED BY '\n'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
> LOCATION
>   'hdfs://nameservice1/hive/db/intable_p'
> TBLPROPERTIES (
>   'transient_lastDdlTime'='1435080569')
> Time taken: 0.212 seconds, Fetched: 19 row(s)
> hive> CREATE  TABLE `intable_p2`(
> >   `sr_no` int,
> >   `name` string,
> >   `emp_id` int)
> > PARTITIONED BY (
> >   `a` string,
> >   `b` string,
> >   `c` string)
> > ROW FORMAT DELIMITED
> >   FIELDS TERMINATED BY '\t'
> >   LINES TERMINATED BY '\n'
> > STORED AS INPUTFORMAT
> >   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
> > OUTPUTFORMAT
> >   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat';
> OK
> Time taken: 0.179 seconds
> hive> insert overwrite table intable_p2 partition (a='a', b='b', c='c') 
> select * from intable;
> Total jobs = 3
> Launching Job 1 out of 3
> Number of reduce tasks is set to 0 since there's no reduce operator
> ...
> Hadoop job information for Stage-1: number of mappers: 1; number of reducers: > 0
> 2015-06-23 17:34:40,471 Stage-1 map = 0%,  reduce = 0%
> 2015-06-23 17:35:10,753 Stage-1 map = 100%,  reduce = 0%
> Ended Job = job_1433246369760_7947 with errors
> Error during job, obtaining debugging information...
> Examining task ID: task_ (and more) from job job_
> Task with the most failures(4):
> -
> Task ID:
>   task_
> URL:
>   
> -
> Diagnostic Messages for this Task:
> Error: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row {"sr_no":1,"name":"ABC","emp_id":1001}
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:198)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while pr

[jira] [Assigned] (HIVE-11084) Issue in Parquet Hive Table

2015-06-24 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-11084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña reassigned HIVE-11084:
--

Assignee: Sergio Peña

> Issue in Parquet Hive Table
> ---
>
> Key: HIVE-11084
> URL: https://issues.apache.org/jira/browse/HIVE-11084
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 0.9.0
> Environment: GNU/Linux
>Reporter: Chanchal Kumar Ghosh
>Assignee: Sergio Peña
>
> {code}
> hive> CREATE TABLE intable_p (
> >   sr_no int,
> >   name string,
> >   emp_id int
> > ) PARTITIONED BY (
> >   a string,
> >   b string,
> >   c string
> > ) ROW FORMAT DELIMITED
> >   FIELDS TERMINATED BY '\t'
> >   LINES TERMINATED BY '\n'
> > STORED AS PARQUET;
> hive> insert overwrite table intable_p partition (a='a', b='b', c='c') select 
> * from intable;
> Total jobs = 3
> Launching Job 1 out of 3
> Number of reduce tasks is set to 0 since there's no reduce operator
> 
> MapReduce Jobs Launched:
> Stage-Stage-1: Map: 1   Cumulative CPU: 2.59 sec   HDFS Read: 247 HDFS Write: 
> 410 SUCCESS
> Total MapReduce CPU Time Spent: 2 seconds 590 msec
> OK
> Time taken: 30.382 seconds
> hive> show create table intable_p;
> OK
> CREATE  TABLE `intable_p`(
>   `sr_no` int,
>   `name` string,
>   `emp_id` int)
> PARTITIONED BY (
>   `a` string,
>   `b` string,
>   `c` string)
> ROW FORMAT DELIMITED
>   FIELDS TERMINATED BY '\t'
>   LINES TERMINATED BY '\n'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
> LOCATION
>   'hdfs://nameservice1/hive/db/intable_p'
> TBLPROPERTIES (
>   'transient_lastDdlTime'='1435080569')
> Time taken: 0.212 seconds, Fetched: 19 row(s)
> hive> CREATE  TABLE `intable_p2`(
> >   `sr_no` int,
> >   `name` string,
> >   `emp_id` int)
> > PARTITIONED BY (
> >   `a` string,
> >   `b` string,
> >   `c` string)
> > ROW FORMAT DELIMITED
> >   FIELDS TERMINATED BY '\t'
> >   LINES TERMINATED BY '\n'
> > STORED AS INPUTFORMAT
> >   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
> > OUTPUTFORMAT
> >   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat';
> OK
> Time taken: 0.179 seconds
> hive> insert overwrite table intable_p2 partition (a='a', b='b', c='c') 
> select * from intable;
> Total jobs = 3
> Launching Job 1 out of 3
> Number of reduce tasks is set to 0 since there's no reduce operator
> ...
> Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
> 2015-06-23 17:34:40,471 Stage-1 map = 0%,  reduce = 0%
> 2015-06-23 17:35:10,753 Stage-1 map = 100%,  reduce = 0%
> Ended Job = job_1433246369760_7947 with errors
> Error during job, obtaining debugging information...
> Examining task ID: task_ (and more) from job job_
> Task with the most failures(4):
> -
> Task ID:
>   task_
> URL:
>   
> -
> Diagnostic Messages for this Task:
> Error: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row {"sr_no":1,"name":"ABC","emp_id":1001}
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:198)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row {"sr_no":1,"name":"ABC","emp_id":1001}
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:549)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:180)
> ... 8 more
> Caused by: {color:red}java.lang.ClassCastException: org.apache.hadoop.io.Text 
> cannot be cast to org.apache.hadoop.io.ArrayWritable{color}
> at 
> org.apache.hadoop.hive.ql.io.parquet.write.ParquetRecordWriterWrapper.write(ParquetRecordWriterWrapper.java:105)
> at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:628)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
> at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:87)
> at o

[jira] [Commented] (HIVE-11090) ordering issues with windows unit test runs

2015-06-24 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600072#comment-14600072
 ] 

Laljo John Pullokkaran commented on HIVE-11090:
---

[~mmccline] Can't you add an ORDER BY followed by the LIMIT? That should produce
stable results.
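
A minimal sketch of that suggestion, run through the HiveServer2 JDBC driver (the connection
URL and the src table below are placeholders, not taken from the failing tests): ordering on
every projected column before applying the LIMIT makes the limited result set deterministic,
so the q.out files no longer depend on which rows happen to arrive first.

{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class StableLimitExample {
  public static void main(String[] args) throws Exception {
    try (Connection conn =
             DriverManager.getConnection("jdbc:hive2://localhost:10000/default");
         Statement stmt = conn.createStatement();
         // Without the ORDER BY, the ten returned rows can differ between runs and
         // platforms; ordering on all projected columns pins them down.
         ResultSet rs = stmt.executeQuery(
             "SELECT key, value FROM src ORDER BY key, value LIMIT 10")) {
      while (rs.next()) {
        System.out.println(rs.getString(1) + "\t" + rs.getString(2));
      }
    }
  }
}
{code}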

> ordering issues with windows unit test runs
> ---
>
> Key: HIVE-11090
> URL: https://issues.apache.org/jira/browse/HIVE-11090
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Matt McCline
>Assignee: Matt McCline
> Attachments: HIVE-11090.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11055) HPL/SQL - Implementing Procedural SQL in Hive (PL/HQL Contribution)

2015-06-24 Thread Dmitry Tolpeko (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600053#comment-14600053
 ] 

Dmitry Tolpeko commented on HIVE-11055:
---

Sure, I will also update the patch to include shell scripts to run the tool.
Also, I believe the patch currently just adds files, so hplsql.jar will not be
built; I will add a pom.xml.

> HPL/SQL - Implementing Procedural SQL in Hive (PL/HQL Contribution)
> ---
>
> Key: HIVE-11055
> URL: https://issues.apache.org/jira/browse/HIVE-11055
> Project: Hive
>  Issue Type: Improvement
>Reporter: Dmitry Tolpeko
>Assignee: Dmitry Tolpeko
> Attachments: HIVE-11055.1.patch
>
>
> There is PL/HQL tool (www.plhql.org) that implements procedural SQL for Hive 
> (actually any SQL-on-Hadoop implementation and any JDBC source).
> Alan Gates offered to contribute it to Hive under HPL/SQL name 
> (org.apache.hive.hplsql package). This JIRA is to create a patch to 
> contribute  the PL/HQL code. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11051) Hive 1.2.0 MapJoin w/Tez - LazyBinaryArray cannot be cast to [Ljava.lang.Object;

2015-06-24 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600050#comment-14600050
 ] 

Gunther Hagleitner commented on HIVE-11051:
---

[~t3rmin4t0r] do you want to take a look?

> Hive 1.2.0  MapJoin w/Tez - LazyBinaryArray cannot be cast to 
> [Ljava.lang.Object;
> -
>
> Key: HIVE-11051
> URL: https://issues.apache.org/jira/browse/HIVE-11051
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers, Tez
>Affects Versions: 1.2.0
>Reporter: Greg Senia
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-11051.01.patch, problem_table_joins.tar.gz
>
>
> I tried to apply: HIVE-10729 which did not solve the issue.
> The following exception is thrown on a Tez MapJoin with Hive 1.2.0 and Tez 
> 0.5.4/0.5.3
> {code}
> Status: Running (Executing on YARN cluster with App id 
> application_1434641270368_1038)
> 
> VERTICES  STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  
> KILLED
> 
> Map 1 ..   SUCCEEDED  3  300   0  
>  0
> Map 2 ... FAILED  3  102   7  
>  0
> 
> VERTICES: 01/02  [=>>-] 66%   ELAPSED TIME: 7.39 s
>  
> 
> Status: Failed
> Vertex failed, vertexName=Map 2, vertexId=vertex_1434641270368_1038_2_01, 
> diagnostics=[Task failed, taskId=task_1434641270368_1038_2_01_02, 
> diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running 
> task:java.lang.RuntimeException: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row 
> {"cnctevn_id":"002245282386","svcrqst_id":"003627217285","svcrqst_crt_dts":"2015-04-23
>  11:54:39.238357","subject_seq_no":1,"plan_component":"HMOM1 
> ","cust_segment":"RM 
> ","cnctyp_cd":"001","cnctmd_cd":"D02","cnctevs_cd":"007","svcrtyp_cd":"335","svrstyp_cd":"088","cmpltyp_cd":"
>  ","catsrsn_cd":"","apealvl_cd":" 
> ","cnstnty_cd":"001","svcrqst_asrqst_ind":"Y","svcrqst_rtnorig_in":"N","svcrqst_vwasof_dt":"null","sum_reason_cd":"98","sum_reason":"Exclude","crsr_master_claim_index":null,"svcrqct_cds":["
>"],"svcrqst_lupdt":"2015-04-23 
> 22:14:01.288132","crsr_lupdt":null,"cntevsds_lupdt":"2015-04-23 
> 11:54:40.740061","ignore_me":1,"notes":null}
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row 
> {"cnctevn_id":"002245282386","svcrqst_id":"003627217285","svcrqst_crt_dts":"2015-04-23
>  11:54:39.238357","subject_seq_no":1,"plan_component":"HMOM1 
> ","cust_segment":"RM 
> ","cnctyp_cd":"001","cnctmd_cd":"D02","cnctevs_cd":"007","svcrtyp_cd":"335","svrstyp_cd":"088","cmpltyp_cd":"
>  ","catsrsn_cd":"","apealvl_cd":" 
> ","cnstnty_cd":"001","svcrqst_asrqst_ind":"Y","svcrqst_rtnorig_in":"N","svcrqst_vwasof_dt":"null","sum_reason_cd":"98","sum_reason":"Exclude","crsr_master_claim_index":null,"svcrqct_cds":["
>"],"svcrqst_lupdt":"2015-04-23 
> 22:14:01.288132","crsr_lupdt":null,"cntevsds_lupdt":"2015-04-23 
> 11:54:40.740

[jira] [Commented] (HIVE-11077) Add support in parser and wire up to txn manager

2015-06-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600045#comment-14600045
 ] 

Hive QA commented on HIVE-11077:




{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12741642/HIVE-11077.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4368/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4368/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4368/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4368/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at eb278d3 Revert "HIVE-11043: ORC split strategies should adapt 
based on number of files (Gopal V reviewed by Prasanth Jayachandran)"
+ git clean -f -d
+ git checkout master
Already on 'master'
+ git reset --hard origin/master
HEAD is now at eb278d3 Revert "HIVE-11043: ORC split strategies should adapt 
based on number of files (Gopal V reviewed by Prasanth Jayachandran)"
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12741642 - PreCommit-HIVE-TRUNK-Build

> Add support in parser and wire up to txn manager
> 
>
> Key: HIVE-11077
> URL: https://issues.apache.org/jira/browse/HIVE-11077
> Project: Hive
>  Issue Type: Sub-task
>  Components: SQL, Transactions
>Affects Versions: 1.2.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-11077.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11096) Bump the parquet version to 1.7.0

2015-06-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600040#comment-14600040
 ] 

Hive QA commented on HIVE-11096:




{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12741638/HIVE-11096.1.patch

{color:green}SUCCESS:{color} +1 9015 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4367/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4367/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4367/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12741638 - PreCommit-HIVE-TRUNK-Build

> Bump the parquet version to 1.7.0
> -
>
> Key: HIVE-11096
> URL: https://issues.apache.org/jira/browse/HIVE-11096
> Project: Hive
>  Issue Type: Task
>Affects Versions: 1.2.0
>Reporter: Sergio Peña
>Assignee: Ferdinand Xu
>Priority: Minor
> Attachments: HIVE-11096.1.patch
>
>
> Parquet officially became an Apache project as of Parquet 1.7.0.
> This new version does not add any bugfixes or improvements over the previous
> 1.6.0 release, but all imports were changed to org.apache.parquet, and the
> pom.xml must use org.apache.parquet instead of com.twitter.
> This ticket should address only those import and pom changes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11084) Issue in Parquet Hive Table

2015-06-24 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-11084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1466#comment-1466
 ] 

Sergio Peña commented on HIVE-11084:


This bug happens with other table formats as well, such as ORC and Avro. The
problem seems to occur only when tables are created with the INPUTFORMAT and
OUTPUTFORMAT keywords.

> Issue in Parquet Hive Table
> ---
>
> Key: HIVE-11084
> URL: https://issues.apache.org/jira/browse/HIVE-11084
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 0.9.0
> Environment: GNU/Linux
>Reporter: Chanchal Kumar Ghosh
>
> {code}
> hive> CREATE TABLE intable_p (
> >   sr_no int,
> >   name string,
> >   emp_id int
> > ) PARTITIONED BY (
> >   a string,
> >   b string,
> >   c string
> > ) ROW FORMAT DELIMITED
> >   FIELDS TERMINATED BY '\t'
> >   LINES TERMINATED BY '\n'
> > STORED AS PARQUET;
> hive> insert overwrite table intable_p partition (a='a', b='b', c='c') select 
> * from intable;
> Total jobs = 3
> Launching Job 1 out of 3
> Number of reduce tasks is set to 0 since there's no reduce operator
> 
> MapReduce Jobs Launched:
> Stage-Stage-1: Map: 1   Cumulative CPU: 2.59 sec   HDFS Read: 247 HDFS Write: 
> 410 SUCCESS
> Total MapReduce CPU Time Spent: 2 seconds 590 msec
> OK
> Time taken: 30.382 seconds
> hive> show create table intable_p;
> OK
> CREATE  TABLE `intable_p`(
>   `sr_no` int,
>   `name` string,
>   `emp_id` int)
> PARTITIONED BY (
>   `a` string,
>   `b` string,
>   `c` string)
> ROW FORMAT DELIMITED
>   FIELDS TERMINATED BY '\t'
>   LINES TERMINATED BY '\n'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
> LOCATION
>   'hdfs://nameservice1/hive/db/intable_p'
> TBLPROPERTIES (
>   'transient_lastDdlTime'='1435080569')
> Time taken: 0.212 seconds, Fetched: 19 row(s)
> hive> CREATE  TABLE `intable_p2`(
> >   `sr_no` int,
> >   `name` string,
> >   `emp_id` int)
> > PARTITIONED BY (
> >   `a` string,
> >   `b` string,
> >   `c` string)
> > ROW FORMAT DELIMITED
> >   FIELDS TERMINATED BY '\t'
> >   LINES TERMINATED BY '\n'
> > STORED AS INPUTFORMAT
> >   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
> > OUTPUTFORMAT
> >   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat';
> OK
> Time taken: 0.179 seconds
> hive> insert overwrite table intable_p2 partition (a='a', b='b', c='c') 
> select * from intable;
> Total jobs = 3
> Launching Job 1 out of 3
> Number of reduce tasks is set to 0 since there's no reduce operator
> ...
> Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
> 2015-06-23 17:34:40,471 Stage-1 map = 0%,  reduce = 0%
> 2015-06-23 17:35:10,753 Stage-1 map = 100%,  reduce = 0%
> Ended Job = job_1433246369760_7947 with errors
> Error during job, obtaining debugging information...
> Examining task ID: task_ (and more) from job job_
> Task with the most failures(4):
> -
> Task ID:
>   task_
> URL:
>   
> -
> Diagnostic Messages for this Task:
> Error: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row {"sr_no":1,"name":"ABC","emp_id":1001}
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:198)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row {"sr_no":1,"name":"ABC","emp_id":1001}
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:549)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:180)
> ... 8 more
> Caused by: {color:red}java.lang.ClassCastException: org.apache.hadoop.io.Text 
> cannot be cast to org.apache.hadoop.io.ArrayWritable{color}
> at 
> org.apache.hadoop.hive.ql.io.parquet.write.ParquetRecordWriterWrapper.write(ParquetRecordWriterWrapper.java:105)
> at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:628)
> at org.apache.hadoop.hive.q

[jira] [Updated] (HIVE-10754) new Job() is deprecated. Replaced all with Job.getInstance() for Hcatalog

2015-06-24 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-10754:

Attachment: (was: HIVE-10754.patch)

> new Job() is deprecated. Replaced all with Job.getInstance() for Hcatalog
> -
>
> Key: HIVE-10754
> URL: https://issues.apache.org/jira/browse/HIVE-10754
> Project: Hive
>  Issue Type: Sub-task
>  Components: HCatalog
>Affects Versions: 1.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10754.patch
>
>
> Replace all the deprecated new Job() with Job.getInstance() in HCatalog.
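
For reference, a small sketch of the change this sub-task makes; the class and job names
below are illustrative, not taken from the HCatalog code.

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class JobCreationExample {
  public static Job createJob(Configuration conf) throws Exception {
    // Deprecated pattern being removed: the Job(Configuration) constructor.
    //   Job job = new Job(conf);

    // Replacement: the static factory on the mapreduce Job API.
    Job job = Job.getInstance(conf, "example-job");
    job.setJarByClass(JobCreationExample.class);
    return job;
  }
}
{code}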



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10754) new Job() is deprecated. Replaced all with Job.getInstance() for Hcatalog

2015-06-24 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-10754:

Attachment: HIVE-10754.patch

> new Job() is deprecated. Replaced all with Job.getInstance() for Hcatalog
> -
>
> Key: HIVE-10754
> URL: https://issues.apache.org/jira/browse/HIVE-10754
> Project: Hive
>  Issue Type: Sub-task
>  Components: HCatalog
>Affects Versions: 1.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10754.patch
>
>
> Replace all the deprecated new Job() with Job.getInstance() in HCatalog.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11051) Hive 1.2.0 MapJoin w/Tez - LazyBinaryArray cannot be cast to [Ljava.lang.Object;

2015-06-24 Thread Greg Senia (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14599913#comment-14599913
 ] 

Greg Senia commented on HIVE-11051:
---

Fix looks good. Tested in our environment; testing one final use case today.

> Hive 1.2.0  MapJoin w/Tez - LazyBinaryArray cannot be cast to 
> [Ljava.lang.Object;
> -
>
> Key: HIVE-11051
> URL: https://issues.apache.org/jira/browse/HIVE-11051
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers, Tez
>Affects Versions: 1.2.0
>Reporter: Greg Senia
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-11051.01.patch, problem_table_joins.tar.gz
>
>
> I tried to apply: HIVE-10729 which did not solve the issue.
> The following exception is thrown on a Tez MapJoin with Hive 1.2.0 and Tez 
> 0.5.4/0.5.3
> {code}
> Status: Running (Executing on YARN cluster with App id 
> application_1434641270368_1038)
> 
> VERTICES  STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  
> KILLED
> 
> Map 1 ..   SUCCEEDED  3  300   0  
>  0
> Map 2 ... FAILED  3  102   7  
>  0
> 
> VERTICES: 01/02  [=>>-] 66%   ELAPSED TIME: 7.39 s
>  
> 
> Status: Failed
> Vertex failed, vertexName=Map 2, vertexId=vertex_1434641270368_1038_2_01, 
> diagnostics=[Task failed, taskId=task_1434641270368_1038_2_01_02, 
> diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running 
> task:java.lang.RuntimeException: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row 
> {"cnctevn_id":"002245282386","svcrqst_id":"003627217285","svcrqst_crt_dts":"2015-04-23
>  11:54:39.238357","subject_seq_no":1,"plan_component":"HMOM1 
> ","cust_segment":"RM 
> ","cnctyp_cd":"001","cnctmd_cd":"D02","cnctevs_cd":"007","svcrtyp_cd":"335","svrstyp_cd":"088","cmpltyp_cd":"
>  ","catsrsn_cd":"","apealvl_cd":" 
> ","cnstnty_cd":"001","svcrqst_asrqst_ind":"Y","svcrqst_rtnorig_in":"N","svcrqst_vwasof_dt":"null","sum_reason_cd":"98","sum_reason":"Exclude","crsr_master_claim_index":null,"svcrqct_cds":["
>"],"svcrqst_lupdt":"2015-04-23 
> 22:14:01.288132","crsr_lupdt":null,"cntevsds_lupdt":"2015-04-23 
> 11:54:40.740061","ignore_me":1,"notes":null}
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row 
> {"cnctevn_id":"002245282386","svcrqst_id":"003627217285","svcrqst_crt_dts":"2015-04-23
>  11:54:39.238357","subject_seq_no":1,"plan_component":"HMOM1 
> ","cust_segment":"RM 
> ","cnctyp_cd":"001","cnctmd_cd":"D02","cnctevs_cd":"007","svcrtyp_cd":"335","svrstyp_cd":"088","cmpltyp_cd":"
>  ","catsrsn_cd":"","apealvl_cd":" 
> ","cnstnty_cd":"001","svcrqst_asrqst_ind":"Y","svcrqst_rtnorig_in":"N","svcrqst_vwasof_dt":"null","sum_reason_cd":"98","sum_reason":"Exclude","crsr_master_claim_index":null,"svcrqct_cds":["
>"],"svcrqst_lupdt":"2015-04-23 
> 22:14:01.288132","crsr_lupdt":null,"cntevsds_lupdt":"2015-04-

[jira] [Commented] (HIVE-10895) ObjectStore does not close Query objects in some calls, causing a potential leak in some metastore db resources

2015-06-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14599897#comment-14599897
 ] 

Hive QA commented on HIVE-10895:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12741627/HIVE-10895.2.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9017 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join28
org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4366/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4366/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4366/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12741627 - PreCommit-HIVE-TRUNK-Build

> ObjectStore does not close Query objects in some calls, causing a potential 
> leak in some metastore db resources
> ---
>
> Key: HIVE-10895
> URL: https://issues.apache.org/jira/browse/HIVE-10895
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 0.13
>Reporter: Takahiko Saito
>Assignee: Aihua Xu
> Attachments: HIVE-10895.1.patch, HIVE-10895.2.patch
>
>
> During testing, we've noticed Oracle db running out of cursors. Might be 
> related to this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11055) HPL/SQL - Implementing Procedural SQL in Hive (PL/HQL Contribution)

2015-06-24 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14599860#comment-14599860
 ] 

Alan Gates commented on HIVE-11055:
---

We'll also need to run RAT on these before we check them in, to make sure the
license headers are OK, etc. I can do this, but it will be a couple of days
before I get to it.

> HPL/SQL - Implementing Procedural SQL in Hive (PL/HQL Contribution)
> ---
>
> Key: HIVE-11055
> URL: https://issues.apache.org/jira/browse/HIVE-11055
> Project: Hive
>  Issue Type: Improvement
>Reporter: Dmitry Tolpeko
>Assignee: Dmitry Tolpeko
> Attachments: HIVE-11055.1.patch
>
>
> There is PL/HQL tool (www.plhql.org) that implements procedural SQL for Hive 
> (actually any SQL-on-Hadoop implementation and any JDBC source).
> Alan Gates offered to contribute it to Hive under HPL/SQL name 
> (org.apache.hive.hplsql package). This JIRA is to create a patch to 
> contribute  the PL/HQL code. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11079) Fix qfile tests that fail on Windows due to CR/character escape differences

2015-06-24 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14599849#comment-14599849
 ] 

Jason Dere commented on HIVE-11079:
---

The test failure does not appear to be related: it passes for me locally on both
Mac and Linux with the patch, and the patch contains only q-file changes.

> Fix qfile tests that fail on Windows due to CR/character escape differences
> ---
>
> Key: HIVE-11079
> URL: https://issues.apache.org/jira/browse/HIVE-11079
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-11079.1.patch, HIVE-11079.2.patch, 
> HIVE-11079.3.patch, HIVE-11079.4.patch, HIVE-11079.5.patch
>
>
> A few qfile tests are failing on Windows due to a couple of windows-specific 
> issues:
> - The table comment for the test includes a CR character, which is different 
> on Windows compared to Unix.
> - The partition path in the test includes a space character. Unlike Unix, on 
> Windows space characters in Hive paths are escaped.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10533) CBO (Calcite Return Path): Join to MultiJoin support for outer joins

2015-06-24 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-10533:
---
Attachment: HIVE-10533.05.patch

[~ashutoshc], I rebased the patch and addressed the test failures.

> CBO (Calcite Return Path): Join to MultiJoin support for outer joins
> 
>
> Key: HIVE-10533
> URL: https://issues.apache.org/jira/browse/HIVE-10533
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-10533.01.patch, HIVE-10533.02.patch, 
> HIVE-10533.02.patch, HIVE-10533.03.patch, HIVE-10533.04.patch, 
> HIVE-10533.05.patch, HIVE-10533.patch
>
>
> CBO return path: auto_join7.q can be used to reproduce the problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-1903) Can't join HBase tables if one's name is the beginning of the other

2015-06-24 Thread Wan Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wan Chang updated HIVE-1903:

Attachment: (was: HIVE-11097.1.patch)

> Can't join HBase tables if one's name is the beginning of the other
> ---
>
> Key: HIVE-1903
> URL: https://issues.apache.org/jira/browse/HIVE-1903
> Project: Hive
>  Issue Type: Bug
>  Components: HBase Handler
>Reporter: Jean-Daniel Cryans
>Assignee: John Sichi
> Fix For: 0.7.0
>
> Attachments: HIVE-1903.1.patch
>
>
> I tried joining two tables, let's call them "table" and "table_a", but I'm 
> seeing an array of errors such as this:
> {noformat}
> java.lang.IndexOutOfBoundsException: Index: 3, Size: 3
>   at java.util.ArrayList.RangeCheck(ArrayList.java:547)
>   at java.util.ArrayList.get(ArrayList.java:322)
>   at 
> org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat.getRecordReader(HiveHBaseTableInputFormat.java:118)
>   at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:231)
> {noformat}
> The reason is that HiveInputFormat.pushProjectionsAndFilters matches the
> aliases with startsWith, so in my case the mappers for "table_a" were getting
> the columns from "table" as well as their own (and since it had fewer columns,
> they were trying to read one index too far in the array).
> I don't know if just changing it to "equals" will fix it; my guess is it
> won't, since it may break RCFiles.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-1903) Can't join HBase tables if one's name is the beginning of the other

2015-06-24 Thread Wan Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wan Chang updated HIVE-1903:

Attachment: HIVE-11097.1.patch

> Can't join HBase tables if one's name is the beginning of the other
> ---
>
> Key: HIVE-1903
> URL: https://issues.apache.org/jira/browse/HIVE-1903
> Project: Hive
>  Issue Type: Bug
>  Components: HBase Handler
>Reporter: Jean-Daniel Cryans
>Assignee: John Sichi
> Fix For: 0.7.0
>
> Attachments: HIVE-11097.1.patch, HIVE-1903.1.patch
>
>
> I tried joining two tables, let's call them "table" and "table_a", but I'm 
> seeing an array of errors such as this:
> {noformat}
> java.lang.IndexOutOfBoundsException: Index: 3, Size: 3
>   at java.util.ArrayList.RangeCheck(ArrayList.java:547)
>   at java.util.ArrayList.get(ArrayList.java:322)
>   at 
> org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat.getRecordReader(HiveHBaseTableInputFormat.java:118)
>   at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:231)
> {noformat}
> The reason is that HiveInputFormat.pushProjectionsAndFilters matches the
> aliases with startsWith, so in my case the mappers for "table_a" were getting
> the columns from "table" as well as their own (and since it had fewer columns,
> they were trying to read one index too far in the array).
> I don't know if just changing it to "equals" will fix it; my guess is it
> won't, since it may break RCFiles.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11097) HiveInputFormat uses String.startsWith to compare splitPath and PathToAliases

2015-06-24 Thread Wan Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wan Chang updated HIVE-11097:
-
Attachment: HIVE-11097.1.patch

Attach patch file

> HiveInputFormat uses String.startsWith to compare splitPath and PathToAliases
> -
>
> Key: HIVE-11097
> URL: https://issues.apache.org/jira/browse/HIVE-11097
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 0.13.0, 0.14.0, 0.13.1, 1.0.0, 1.2.0
> Environment: Hive 0.13.1, Hive 2.0.0, hadoop 2.4.1
>Reporter: Wan Chang
>Priority: Critical
> Attachments: HIVE-11097.1.patch
>
>
> Say we have a sql as
> {code}
> create table if not exists test_orc_src (a int, b int, c int) stored as orc;
> create table if not exists test_orc_src2 (a int, b int, d int) stored as orc;
> insert overwrite table test_orc_src select 1,2,3 from src limit 1;
> insert overwrite table test_orc_src2 select 1,2,4 from src limit 1;
> set hive.auto.convert.join = false;
> set hive.execution.engine=mr;
> select
>   tb.c
> from test.test_orc_src tb
> join (select * from test.test_orc_src2) tm
> on tb.a = tm.a
> where tb.b = 2
> {code}
> The correct result is 3 but it produced no result.
> I found that in HiveInputFormat.pushProjectionsAndFilters
> {code}
> match = splitPath.startsWith(key) || splitPathWithNoSchema.startsWith(key);
> {code}
> it uses startsWith to match the aliases against the split path, so tm will
> match two aliases in this case.
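
To make the over-matching concrete, here is a small standalone sketch; the paths are made up
and the isPathPrefix helper is only one possible tightening, not the committed fix.

{code}
public class StartsWithAliasExample {
  public static void main(String[] args) {
    String splitPath    = "hdfs://nn/warehouse/test_orc_src2/000000_0";
    String aliasForSrc  = "hdfs://nn/warehouse/test_orc_src";   // key registered for the first table
    String aliasForSrc2 = "hdfs://nn/warehouse/test_orc_src2";  // key registered for the second table

    // startsWith matches BOTH keys, so the split for test_orc_src2 also picks up
    // the projections and filters registered for test_orc_src.
    System.out.println(splitPath.startsWith(aliasForSrc));    // true  (unwanted match)
    System.out.println(splitPath.startsWith(aliasForSrc2));   // true

    // A stricter comparison that only accepts true path prefixes:
    System.out.println(isPathPrefix(splitPath, aliasForSrc));   // false
    System.out.println(isPathPrefix(splitPath, aliasForSrc2));  // true
  }

  // Accepts the prefix only when it matches exactly or is followed by a path separator.
  static boolean isPathPrefix(String path, String prefix) {
    return path.equals(prefix)
        || (path.startsWith(prefix) && path.charAt(prefix.length()) == '/');
  }
}
{code}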



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10233) Hive on tez: memory manager for grace hash join

2015-06-24 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14599787#comment-14599787
 ] 

Gunther Hagleitner commented on HIVE-10233:
---

Test failures are unrelated.

> Hive on tez: memory manager for grace hash join
> ---
>
> Key: HIVE-10233
> URL: https://issues.apache.org/jira/browse/HIVE-10233
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: llap, 2.0.0
>Reporter: Vikram Dixit K
>Assignee: Gunther Hagleitner
> Attachments: HIVE-10233-WIP-2.patch, HIVE-10233-WIP-3.patch, 
> HIVE-10233-WIP-4.patch, HIVE-10233-WIP-5.patch, HIVE-10233-WIP-6.patch, 
> HIVE-10233-WIP-7.patch, HIVE-10233-WIP-8.patch, HIVE-10233.08.patch, 
> HIVE-10233.09.patch, HIVE-10233.10.patch, HIVE-10233.11.patch, 
> HIVE-10233.12.patch, HIVE-10233.13.patch, HIVE-10233.14.patch
>
>
> We need a memory manager in llap/tez to manage the usage of memory across 
> threads. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11090) ordering issues with windows unit test runs

2015-06-24 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14599754#comment-14599754
 ] 

Gunther Hagleitner commented on HIVE-11090:
---

   * Cutting off the limit in all these cases makes the results another
10mb larger. Is that really necessary? I understand it's painful to use a limit
and still produce stable results, but does adding the required ORDER BY instead
change the test?
   * In vectorization_9 you took out a bunch of UDFs and simplified one of
the queries. Why was that necessary?
   * Finally, it seems you need to update the Spark driver tests as well

> ordering issues with windows unit test runs
> ---
>
> Key: HIVE-11090
> URL: https://issues.apache.org/jira/browse/HIVE-11090
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Matt McCline
>Assignee: Matt McCline
> Attachments: HIVE-11090.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11086) Remove use of ErrorMsg in Orc's RunLengthIntegerReaderV2

2015-06-24 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-11086:
-
Attachment: HIVE-11086.patch

Since ORC wasn't using the error code, I just moved the string back into ORC 
for now.
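
A hedged before/after sketch of what that move looks like; the constant name and message text
below are placeholders, not the actual strings in the patch.

{code}
public class InlineErrorMessageSketch {
  // Before (hypothetical): referencing ErrorMsg pulls the ql error-message machinery
  // into the RLE v2 reader's dependency graph.
  //   throw new java.io.IOException(ErrorMsg.SOME_ORC_ERROR.getMsg());

  // After: the message lives in the reader itself, so only the string remains.
  private static final String BAD_RLE_STREAM_MSG =
      "Unexpected RLE v2 encoding or corrupt stream";

  static void failOnCorruptStream() throws java.io.IOException {
    throw new java.io.IOException(BAD_RLE_STREAM_MSG);
  }
}
{code}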

> Remove use of ErrorMsg in Orc's RunLengthIntegerReaderV2
> 
>
> Key: HIVE-11086
> URL: https://issues.apache.org/jira/browse/HIVE-11086
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Attachments: HIVE-11086.patch
>
>
> ORC's RLE v2 reader uses a string literal from ErrorMsg, which forces a large
> dependency onto the RLE v2 reader. Pulling the string literal in directly
> doesn't change the behavior and fixes the linkage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9557) create UDF to measure strings similarity using Cosine Similarity algo

2015-06-24 Thread Nishant Kelkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14599747#comment-14599747
 ] 

Nishant Kelkar commented on HIVE-9557:
--

Thanks for the pointers! I'll modify the patch per your instructions and 
reupload.

Thanks for working with me through my first patch! :)

> create UDF to measure strings similarity using Cosine Similarity algo
> -
>
> Key: HIVE-9557
> URL: https://issues.apache.org/jira/browse/HIVE-9557
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Nishant Kelkar
>  Labels: CosineSimilarity, SimilarityMetric, UDF
> Attachments: udf_cosine_similarity-v01.patch
>
>
> algo description http://en.wikipedia.org/wiki/Cosine_similarity
> {code}
> --one word different, total 2 words
> str_sim_cosine('Test String1', 'Test String2') = (2 - 1) / 2 = 0.5f
> {code}
> reference implementation:
> https://github.com/Simmetrics/simmetrics/blob/master/src/uk/ac/shef/wit/simmetrics/similaritymetrics/CosineSimilarity.java
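
For readers unfamiliar with the metric, the sketch below computes cosine similarity over
word-count vectors; it illustrates the algorithm from the linked description only (the
whitespace tokenisation is an assumption) and is not the attached UDF code.

{code}
import java.util.HashMap;
import java.util.Map;

public class CosineSimilarityExample {
  public static double cosineSimilarity(String a, String b) {
    Map<String, Integer> va = termCounts(a);
    Map<String, Integer> vb = termCounts(b);
    double dot = 0.0, normA = 0.0, normB = 0.0;
    for (Map.Entry<String, Integer> e : va.entrySet()) {
      Integer other = vb.get(e.getKey());
      if (other != null) {
        dot += e.getValue() * other;      // only shared terms contribute to the dot product
      }
      normA += e.getValue() * e.getValue();
    }
    for (int count : vb.values()) {
      normB += count * count;
    }
    if (normA == 0.0 || normB == 0.0) {
      return 0.0;                          // treat an empty string as similarity 0
    }
    return dot / (Math.sqrt(normA) * Math.sqrt(normB));
  }

  // Word-count vector over a simple whitespace tokenisation.
  private static Map<String, Integer> termCounts(String s) {
    Map<String, Integer> counts = new HashMap<>();
    for (String token : s.toLowerCase().split("\\s+")) {
      if (!token.isEmpty()) {
        counts.merge(token, 1, Integer::sum);
      }
    }
    return counts;
  }

  public static void main(String[] args) {
    // 'Test String1' and 'Test String2' share only the term "test",
    // giving 1 / (sqrt(2) * sqrt(2)) = 0.5, matching the expected value above.
    System.out.println(cosineSimilarity("Test String1", "Test String2"));
  }
}
{code}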



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9557) create UDF to measure strings similarity using Cosine Similarity algo

2015-06-24 Thread Alexander Pivovarov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14599737#comment-14599737
 ] 

Alexander Pivovarov commented on HIVE-9557:
---

Hi Nishant,

Thank you for the patch. Can you look at the following recommendations/issues?
- usually the patch name should look like HIVE-9557.1.patch
- can you attach an RB link to the JIRA, so we can leave comments on particular
code lines?
- you have to provide integration tests for the function (q file and q.out file)
- the function should be registered
- you can probably look at HIVE-9556 as an example
- Hive code uses 2 spaces for indentation
- "a*b" should be written with spaces: "a * b"

> create UDF to measure strings similarity using Cosine Similarity algo
> -
>
> Key: HIVE-9557
> URL: https://issues.apache.org/jira/browse/HIVE-9557
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Nishant Kelkar
>  Labels: CosineSimilarity, SimilarityMetric, UDF
> Attachments: udf_cosine_similarity-v01.patch
>
>
> algo description http://en.wikipedia.org/wiki/Cosine_similarity
> {code}
> --one word different, total 2 words
> str_sim_cosine('Test String1', 'Test String2') = (2 - 1) / 2 = 0.5f
> {code}
> reference implementation:
> https://github.com/Simmetrics/simmetrics/blob/master/src/uk/ac/shef/wit/simmetrics/similaritymetrics/CosineSimilarity.java



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11055) HPL/SQL - Implementing Procedural SQL in Hive (PL/HQL Contribution)

2015-06-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14599720#comment-14599720
 ] 

Hive QA commented on HIVE-11055:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12741609/HIVE-11055.1.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9015 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join28
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4365/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4365/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4365/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12741609 - PreCommit-HIVE-TRUNK-Build

> HPL/SQL - Implementing Procedural SQL in Hive (PL/HQL Contribution)
> ---
>
> Key: HIVE-11055
> URL: https://issues.apache.org/jira/browse/HIVE-11055
> Project: Hive
>  Issue Type: Improvement
>Reporter: Dmitry Tolpeko
>Assignee: Dmitry Tolpeko
> Attachments: HIVE-11055.1.patch
>
>
> There is PL/HQL tool (www.plhql.org) that implements procedural SQL for Hive 
> (actually any SQL-on-Hadoop implementation and any JDBC source).
> Alan Gates offered to contribute it to Hive under HPL/SQL name 
> (org.apache.hive.hplsql package). This JIRA is to create a patch to 
> contribute  the PL/HQL code. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11077) Add support in parser and wire up to txn manager

2015-06-24 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-11077:
--
Attachment: HIVE-11077.patch

> Add support in parser and wire up to txn manager
> 
>
> Key: HIVE-11077
> URL: https://issues.apache.org/jira/browse/HIVE-11077
> Project: Hive
>  Issue Type: Sub-task
>  Components: SQL, Transactions
>Affects Versions: 1.2.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-11077.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10975) Parquet: Bump the parquet version up to 1.8.0

2015-06-24 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-10975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14599651#comment-14599651
 ] 

Sergio Peña commented on HIVE-10975:


[~Ferd] I took this patch and bumped it to Parquet 1.7.0 on HIVE-11096. That
issue is still owned by you, even though I uploaded the patch, because you
already did the work here.

Let's use this ticket to bump to 1.8.0 once that version is officially
released by Parquet.

> Parquet: Bump the parquet version up to 1.8.0
> -
>
> Key: HIVE-10975
> URL: https://issues.apache.org/jira/browse/HIVE-10975
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
>Priority: Minor
> Attachments: HIVE-10975-parquet.patch, HIVE-10975.1-parquet.patch
>
>
> There are lots of changes since parquet's graduation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11096) Bump the parquet version to 1.7.0

2015-06-24 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14599643#comment-14599643
 ] 

Sergio Peña commented on HIVE-11096:


[~Ferd] Could you review this patch? It is basically the same as HIVE-10975,
but with the 1.7.0 version, and it will be merged to master.

> Bump the parquet version to 1.7.0
> -
>
> Key: HIVE-11096
> URL: https://issues.apache.org/jira/browse/HIVE-11096
> Project: Hive
>  Issue Type: Task
>Affects Versions: 1.2.0
>Reporter: Sergio Peña
>Assignee: Ferdinand Xu
>Priority: Minor
> Attachments: HIVE-11096.1.patch
>
>
> Parquet officially became an Apache project as of Parquet 1.7.0.
> This new version does not add any bugfixes or improvements over the previous
> 1.6.0 release, but all imports were changed to org.apache.parquet, and the
> pom.xml must use org.apache.parquet instead of com.twitter.
> This ticket should address only those import and pom changes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11096) Bump the parquet version to 1.7.0

2015-06-24 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-11096:
---
Attachment: HIVE-11096.1.patch

[~Ferd] already submitted a patch and did this work on HIVE-10975, but using
1.8.0rc2-SNAPSHOT instead. Parquet 1.8.0 has not been officially released, and
we need the new org.apache.parquet imports to keep working on new Parquet data
types and other enhancements.

I took [~Ferd]'s patch and changed 1.8.0rc2-SNAPSHOT to 1.7.0.

> Bump the parquet version to 1.7.0
> -
>
> Key: HIVE-11096
> URL: https://issues.apache.org/jira/browse/HIVE-11096
> Project: Hive
>  Issue Type: Task
>Reporter: Sergio Peña
>Assignee: Ferdinand Xu
>Priority: Minor
> Attachments: HIVE-11096.1.patch
>
>
> Parquet officially became an Apache project as of Parquet 1.7.0.
> This new version does not add any bugfixes or improvements over the previous
> 1.6.0 release, but all imports were changed to org.apache.parquet, and the
> pom.xml must use org.apache.parquet instead of com.twitter.
> This ticket should address only those import and pom changes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11055) HPL/SQL - Implementing Procedural SQL in Hive (PL/HQL Contribution)

2015-06-24 Thread Dmitry Tolpeko (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14599601#comment-14599601
 ] 

Dmitry Tolpeko commented on HIVE-11055:
---

I will do this shortly. Thanks, again. 

> HPL/SQL - Implementing Procedural SQL in Hive (PL/HQL Contribution)
> ---
>
> Key: HIVE-11055
> URL: https://issues.apache.org/jira/browse/HIVE-11055
> Project: Hive
>  Issue Type: Improvement
>Reporter: Dmitry Tolpeko
>Assignee: Dmitry Tolpeko
> Attachments: HIVE-11055.1.patch
>
>
> There is PL/HQL tool (www.plhql.org) that implements procedural SQL for Hive 
> (actually any SQL-on-Hadoop implementation and any JDBC source).
> Alan Gates offered to contribute it to Hive under HPL/SQL name 
> (org.apache.hive.hplsql package). This JIRA is to create a patch to 
> contribute  the PL/HQL code. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11055) HPL/SQL - Implementing Procedural SQL in Hive (PL/HQL Contribution)

2015-06-24 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14599592#comment-14599592
 ] 

Xuefu Zhang commented on HIVE-11055:


What about the /bin directory, where the various scripts live? You can provide
one for Linux and one for Windows, as is the case for the other scripts.

> HPL/SQL - Implementing Procedural SQL in Hive (PL/HQL Contribution)
> ---
>
> Key: HIVE-11055
> URL: https://issues.apache.org/jira/browse/HIVE-11055
> Project: Hive
>  Issue Type: Improvement
>Reporter: Dmitry Tolpeko
>Assignee: Dmitry Tolpeko
> Attachments: HIVE-11055.1.patch
>
>
> There is PL/HQL tool (www.plhql.org) that implements procedural SQL for Hive 
> (actually any SQL-on-Hadoop implementation and any JDBC source).
> Alan Gates offered to contribute it to Hive under HPL/SQL name 
> (org.apache.hive.hplsql package). This JIRA is to create a patch to 
> contribute  the PL/HQL code. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11094) Beeline redirecting all output to ErrorStream

2015-06-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14599559#comment-14599559
 ] 

Hive QA commented on HIVE-11094:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12741603/HIVE-11094.patch

{color:red}ERROR:{color} -1 due to 16 failed/errored test(s), 9015 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join20
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join23
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_16
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_6
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucket_map_join_spark2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_cbo_subq_in
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby1
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby4
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_multi_insert_common_distinct
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join_reorder
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_mapjoin_addjar
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_timestamp_null
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union6
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_14
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vector_cast_constant
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchEmptyCommit
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4364/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4364/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4364/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 16 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12741603 - PreCommit-HIVE-TRUNK-Build

> Beeline redirecting all output to ErrorStream
> -
>
> Key: HIVE-11094
> URL: https://issues.apache.org/jira/browse/HIVE-11094
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-11094.patch
>
>
> Beeline is sending all output to ErrorStream, instead of using OutputStream 
> for info or debug information.
> The problem can be reproduced by running:
> {noformat}
> ./bin/beeline -u jdbc:hive2:// -e "show databases" > exec.out
> {noformat}
> It will still print the output to the terminal. The reason seems to be
> that the normal output is also sent through the error stream.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10895) ObjectStore does not close Query objects in some calls, causing a potential leak in some metastore db resources

2015-06-24 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-10895:

Attachment: HIVE-10895.2.patch

> ObjectStore does not close Query objects in some calls, causing a potential 
> leak in some metastore db resources
> ---
>
> Key: HIVE-10895
> URL: https://issues.apache.org/jira/browse/HIVE-10895
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 0.13
>Reporter: Takahiko Saito
>Assignee: Aihua Xu
> Attachments: HIVE-10895.1.patch, HIVE-10895.2.patch
>
>
> During testing, we've noticed Oracle db running out of cursors. Might be 
> related to this.
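
For context, a minimal sketch of the JDO pattern that avoids leaking Query resources (and the underlying database cursors): copy the results out and call closeAll() in a finally block. The helper below is illustrative, not the exact code in the patch.
{noformat}
// Sketch: always release JDO Query resources, even when execute() throws.
import java.util.ArrayList;
import java.util.List;
import javax.jdo.PersistenceManager;
import javax.jdo.Query;

public class QueryCleanup {
  @SuppressWarnings("unchecked")
  static List<Object> runQuery(PersistenceManager pm, String jdoql) {
    Query query = pm.newQuery(jdoql);
    try {
      List<Object> results = (List<Object>) query.execute();
      return new ArrayList<Object>(results);  // detach results before closing the query
    } finally {
      query.closeAll();  // closes result sets and frees the underlying cursor
    }
  }
}
{noformat}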



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10895) ObjectStore does not close Query objects in some calls, causing a potential leak in some metastore db resources

2015-06-24 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-10895:

Attachment: (was: HIVE-10895.2.patch)

> ObjectStore does not close Query objects in some calls, causing a potential 
> leak in some metastore db resources
> ---
>
> Key: HIVE-10895
> URL: https://issues.apache.org/jira/browse/HIVE-10895
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 0.13
>Reporter: Takahiko Saito
>Assignee: Aihua Xu
> Attachments: HIVE-10895.1.patch, HIVE-10895.2.patch
>
>
> During testing, we've noticed Oracle db running out of cursors. Might be 
> related to this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11055) HPL/SQL - Implementing Procedural SQL in Hive (PL/HQL Contribution)

2015-06-24 Thread Dmitry Tolpeko (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14599516#comment-14599516
 ] 

Dmitry Tolpeko commented on HIVE-11055:
---

Thanks, Xuefu. Since for now it will be a separate tool, where can we put the shell 
wrappers that invoke the tool? Currently I have the scripts hplsql (for Linux) and 
hplsql.bat (for Windows), which call

java -cp  org.apache.hive.hplsql.Hplsql "$@"

so the user can simply call the tool as: 

hplsql -f script.sql

> HPL/SQL - Implementing Procedural SQL in Hive (PL/HQL Contribution)
> ---
>
> Key: HIVE-11055
> URL: https://issues.apache.org/jira/browse/HIVE-11055
> Project: Hive
>  Issue Type: Improvement
>Reporter: Dmitry Tolpeko
>Assignee: Dmitry Tolpeko
> Attachments: HIVE-11055.1.patch
>
>
> There is PL/HQL tool (www.plhql.org) that implements procedural SQL for Hive 
> (actually any SQL-on-Hadoop implementation and any JDBC source).
> Alan Gates offered to contribute it to Hive under HPL/SQL name 
> (org.apache.hive.hplsql package). This JIRA is to create a patch to 
> contribute  the PL/HQL code. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10803) document jdbc url format properly

2015-06-24 Thread Gabor Liptak (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14599501#comment-14599501
 ] 

Gabor Liptak commented on HIVE-10803:
-

[~thejas] Did you have a chance to review? Thanks

> document jdbc url format properly
> -
>
> Key: HIVE-10803
> URL: https://issues.apache.org/jira/browse/HIVE-10803
> Project: Hive
>  Issue Type: Bug
>  Components: Documentation, HiveServer2
>Reporter: Thejas M Nair
>
> This is the format of the HS2 connection string; it needs to be documented in the wiki 
> doc (taken from jdbc.Utils.java):
>  
> jdbc:hive2://<host1>:<port1>,<host2>:<port2>/dbName;sess_var_list?hive_conf_list#hive_var_list
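
For example (host names, ports, and the session/conf/var entries below are placeholders, and the Hive JDBC driver is assumed to be on the classpath), a connection using a URL in that format looks like:
{noformat}
// Placeholder example of the documented URL shape:
//   jdbc:hive2://<host1>:<port1>,<host2>:<port2>/dbName;sess_var_list?hive_conf_list#hive_var_list
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class Hs2UrlExample {
  public static void main(String[] args) throws Exception {
    String url = "jdbc:hive2://host1:10000,host2:10000/default"
        + ";transportMode=binary"       // sess_var_list
        + "?hive.exec.parallel=true"    // hive_conf_list
        + "#myvar=1";                   // hive_var_list
    try (Connection conn = DriverManager.getConnection(url, "user", "");
         Statement stmt = conn.createStatement();
         ResultSet rs = stmt.executeQuery("show databases")) {
      while (rs.next()) {
        System.out.println(rs.getString(1));
      }
    }
  }
}
{noformat}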



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10895) ObjectStore does not close Query objects in some calls, causing a potential leak in some metastore db resources

2015-06-24 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14599456#comment-14599456
 ] 

Aihua Xu commented on HIVE-10895:
-

None of the failed tests are related to the patch. 

> ObjectStore does not close Query objects in some calls, causing a potential 
> leak in some metastore db resources
> ---
>
> Key: HIVE-10895
> URL: https://issues.apache.org/jira/browse/HIVE-10895
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 0.13
>Reporter: Takahiko Saito
>Assignee: Aihua Xu
> Attachments: HIVE-10895.1.patch, HIVE-10895.2.patch
>
>
> During testing, we've noticed Oracle db running out of cursors. Might be 
> related to this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11095) SerDeUtils another bug ,when Text is reused

2015-06-24 Thread xiaowei wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14599430#comment-14599430
 ] 

xiaowei wang commented on HIVE-11095:
-

SerDeUtils invokes a problematic method of Text, getBytes(). 
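
To make the failure mode concrete, here is a small standalone sketch (assuming hadoop-common on the classpath; the class name is made up for illustration). Text.getBytes() returns the whole backing array of a reused Text object, so only the first getLength() bytes should be decoded:
{noformat}
// Demonstrates the Text-reuse pitfall: the backing array can be longer than
// the current value, so decoding all of getBytes() leaks the previous row's tail.
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.io.Text;

public class TextReuse {
  public static void main(String[] args) {
    Text t = new Text();
    t.set("a much longer previous row value".getBytes(StandardCharsets.UTF_8));
    t.set("short".getBytes(StandardCharsets.UTF_8));   // reused: backing array keeps the old tail

    String bad  = new String(t.getBytes(), StandardCharsets.UTF_8);                   // buggy
    String good = new String(t.getBytes(), 0, t.getLength(), StandardCharsets.UTF_8); // correct

    System.out.println("[" + bad + "] vs [" + good + "]");
  }
}
{noformat}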

> SerDeUtils  another bug ,when Text is reused
> 
>
> Key: HIVE-11095
> URL: https://issues.apache.org/jira/browse/HIVE-11095
> Project: Hive
>  Issue Type: Bug
>  Components: API, CLI
>Affects Versions: 0.14.0, 1.0.0, 1.2.0
> Environment: Hadoop 2.3.0-cdh5.0.0
> Hive 0.14
>Reporter: xiaowei wang
>Assignee: xiaowei wang
>Priority: Critical
> Fix For: 1.2.0
>
> Attachments: HIVE-11095.1.patch.txt
>
>
> The method transformTextFromUTF8 has a bug: 
> it invokes a problematic method of Text, getBytes().
> When I query data from an LZO table, the length of the current row in the results 
> is always larger than that of the previous row, and sometimes the current row 
> contains the contents of the previous row. For example, when I execute the SQL 
> "select * from web_searchhub where logdate=2015061003", the result is shown below. 
> Notice that the second row's content contains the first row's content.
> INFO [03:00:05.589] HttpFrontServer::FrontSH 
> msgRecv:Remote=/10.13.193.68:42098,session=3151,thread=254 2015061003
> INFO [03:00:05.594] <18941e66-9962-44ad-81bc-3519f47ba274> 
> session=901,thread=223ession=3151,thread=254 2015061003
> The content of origin lzo file content see below ,just 2 rows.
> INFO [03:00:05.635]  
> session=3148,thread=285
> INFO [03:00:05.635] HttpFrontServer::FrontSH 
> msgRecv:Remote=/10.13.193.68:42095,session=3148,thread=285
> I think this error is caused by the Text reuse, and I have found the solution.
> Additionally, the table create SQL is: 
> CREATE EXTERNAL TABLE `web_searchhub`(
> `line` string)
> PARTITIONED BY (
> `logdate` string)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '\\U'
> WITH SERDEPROPERTIES (
> 'serialization.encoding'='GBK')
> STORED AS INPUTFORMAT "com.hadoop.mapred.DeprecatedLzoTextInputFormat"
> OUTPUTFORMAT "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat";
> LOCATION
> 'viewfs://nsX/user/hive/warehouse/raw.db/web/web_searchhub' ;



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11095) SerDeUtils another bug ,when Text is reused

2015-06-24 Thread xiaowei wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaowei wang updated HIVE-11095:

Attachment: HIVE-11095.1.patch.txt

> SerDeUtils  another bug ,when Text is reused
> 
>
> Key: HIVE-11095
> URL: https://issues.apache.org/jira/browse/HIVE-11095
> Project: Hive
>  Issue Type: Bug
>  Components: API, CLI
>Affects Versions: 0.14.0, 1.0.0, 1.2.0
> Environment: Hadoop 2.3.0-cdh5.0.0
> Hive 0.14
>Reporter: xiaowei wang
>Assignee: xiaowei wang
>Priority: Critical
> Fix For: 1.2.0
>
> Attachments: HIVE-11095.1.patch.txt
>
>
> The method transformTextFromUTF8 has a bug: 
> it invokes a problematic method of Text, getBytes().
> When I query data from an LZO table, the length of the current row in the results 
> is always larger than that of the previous row, and sometimes the current row 
> contains the contents of the previous row. For example, when I execute the SQL 
> "select * from web_searchhub where logdate=2015061003", the result is shown below. 
> Notice that the second row's content contains the first row's content.
> INFO [03:00:05.589] HttpFrontServer::FrontSH 
> msgRecv:Remote=/10.13.193.68:42098,session=3151,thread=254 2015061003
> INFO [03:00:05.594] <18941e66-9962-44ad-81bc-3519f47ba274> 
> session=901,thread=223ession=3151,thread=254 2015061003
> The content of origin lzo file content see below ,just 2 rows.
> INFO [03:00:05.635]  
> session=3148,thread=285
> INFO [03:00:05.635] HttpFrontServer::FrontSH 
> msgRecv:Remote=/10.13.193.68:42095,session=3148,thread=285
> I think this error is caused by the Text reuse, and I have found the solution.
> Additionally, the table create SQL is: 
> CREATE EXTERNAL TABLE `web_searchhub`(
> `line` string)
> PARTITIONED BY (
> `logdate` string)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '\\U'
> WITH SERDEPROPERTIES (
> 'serialization.encoding'='GBK')
> STORED AS INPUTFORMAT "com.hadoop.mapred.DeprecatedLzoTextInputFormat"
> OUTPUTFORMAT "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat";
> LOCATION
> 'viewfs://nsX/user/hive/warehouse/raw.db/web/web_searchhub' ;



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11095) SerDeUtils another bug ,when Text is reused

2015-06-24 Thread xiaowei wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaowei wang updated HIVE-11095:

Description: 
The method transformTextFromUTF8 has a bug: 
it invokes a problematic method of Text, getBytes().
When I query data from an LZO table, the length of the current row in the results 
is always larger than that of the previous row, and sometimes the current row 
contains the contents of the previous row. For example, when I execute the SQL 
"select * from web_searchhub where logdate=2015061003", the result is shown below. 
Notice that the second row's content contains the first row's content.
INFO [03:00:05.589] HttpFrontServer::FrontSH 
msgRecv:Remote=/10.13.193.68:42098,session=3151,thread=254 2015061003
INFO [03:00:05.594] <18941e66-9962-44ad-81bc-3519f47ba274> 
session=901,thread=223ession=3151,thread=254 2015061003
The content of origin lzo file content see below ,just 2 rows.
INFO [03:00:05.635]  
session=3148,thread=285
INFO [03:00:05.635] HttpFrontServer::FrontSH 
msgRecv:Remote=/10.13.193.68:42095,session=3148,thread=285
I think this error is caused by the Text reuse, and I have found the solution.
Additionally, the table create SQL is: 
CREATE EXTERNAL TABLE `web_searchhub`(
`line` string)
PARTITIONED BY (
`logdate` string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\\U'
WITH SERDEPROPERTIES (
'serialization.encoding'='GBK')
STORED AS INPUTFORMAT "com.hadoop.mapred.DeprecatedLzoTextInputFormat"
OUTPUTFORMAT "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat";
LOCATION
'viewfs://nsX/user/hive/warehouse/raw.db/web/web_searchhub' ;

  was:
The method transformTextFromUTF8 has a bug. 
When I query data from an LZO table, the length of the current row in the results 
is always larger than that of the previous row, and sometimes the current row 
contains the contents of the previous row. For example, when I execute the SQL 
"select * from web_searchhub where logdate=2015061003", the result is shown below. 
Notice that the second row's content contains the first row's content.
INFO [03:00:05.589] HttpFrontServer::FrontSH 
msgRecv:Remote=/10.13.193.68:42098,session=3151,thread=254 2015061003
INFO [03:00:05.594] <18941e66-9962-44ad-81bc-3519f47ba274> 
session=901,thread=223ession=3151,thread=254 2015061003
The content of origin lzo file content see below ,just 2 rows.
INFO [03:00:05.635]  
session=3148,thread=285
INFO [03:00:05.635] HttpFrontServer::FrontSH 
msgRecv:Remote=/10.13.193.68:42095,session=3148,thread=285
I think this error is caused by the Text reuse, and I have found the solution.
Additionally, the table create SQL is: 
CREATE EXTERNAL TABLE `web_searchhub`(
`line` string)
PARTITIONED BY (
`logdate` string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\\U'
WITH SERDEPROPERTIES (
'serialization.encoding'='GBK')
STORED AS INPUTFORMAT "com.hadoop.mapred.DeprecatedLzoTextInputFormat"
OUTPUTFORMAT "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat";
LOCATION
'viewfs://nsX/user/hive/warehouse/raw.db/web/web_searchhub' ;


> SerDeUtils  another bug ,when Text is reused
> 
>
> Key: HIVE-11095
> URL: https://issues.apache.org/jira/browse/HIVE-11095
> Project: Hive
>  Issue Type: Bug
>  Components: API, CLI
>Affects Versions: 0.14.0, 1.0.0, 1.2.0
> Environment: Hadoop 2.3.0-cdh5.0.0
> Hive 0.14
>Reporter: xiaowei wang
>Assignee: xiaowei wang
>Priority: Critical
> Fix For: 1.2.0
>
>
> The method transformTextFromUTF8 has a bug: 
> it invokes a problematic method of Text, getBytes().
> When I query data from an LZO table, the length of the current row in the results 
> is always larger than that of the previous row, and sometimes the current row 
> contains the contents of the previous row. For example, when I execute the SQL 
> "select * from web_searchhub where logdate=2015061003", the result is shown below. 
> Notice that the second row's content contains the first row's content.
> INFO [03:00:05.589] HttpFrontServer::FrontSH 
> msgRecv:Remote=/10.13.193.68:42098,session=3151,thread=254 2015061003
> INFO [03:00:05.594] <18941e66-9962-44ad-81bc-3519f47ba274> 
> session=901,thread=223ession=3151,thread=254 2015061003
> The content of origin lzo file content see below ,just 2 rows.
> INFO [03:00:05.635]  
> session=3148,thread=285
> INFO [03:00:05.635] HttpFrontServer::FrontSH 
> msgRecv:Remote=/10.13.193.68:42095,session=3148,thread=285
> I think this error is caused by the Text reuse, and I have found the solution.
> Additionally, the table create SQL is: 
> CREATE EXTERNAL TABLE `web_searchhub`(
> `line` string)
> PARTITIONED BY (
> `logdate` string)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '\\U'
> WITH SERDEPROPERTIES (
> 'serialization.encoding'='GBK')
> STORED AS INPUTFORMAT "com.hadoop.mapred.DeprecatedLzoTextInputFormat"
> OUTPUTFORMAT "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat";
> LOCATION
> 'viewfs://nsX/user/hive/warehouse/raw.db/web/web_searchhub' ;

[jira] [Updated] (HIVE-10983) SerDeUtils bug ,when Text is reused

2015-06-24 Thread xiaowei wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaowei wang updated HIVE-10983:

Description: 
The method transformTextToUTF8 has a bug!
It invokes a problematic method of Text, getBytes().
When I query data from an LZO table, the length of the current row in the results 
is always larger than that of the previous row, and sometimes the current row 
contains the contents of the previous row. For example, when I execute the SQL 
"select * from web_searchhub where logdate=2015061003", the result is shown below. 
Notice that the second row's content contains the first row's content.

INFO [03:00:05.589] HttpFrontServer::FrontSH 
msgRecv:Remote=/10.13.193.68:42098,session=3151,thread=254 2015061003
INFO [03:00:05.594] <18941e66-9962-44ad-81bc-3519f47ba274> 
session=901,thread=223ession=3151,thread=254 2015061003

The content  of origin lzo file content see below ,just 2 rows.

INFO [03:00:05.635]  
session=3148,thread=285
INFO [03:00:05.635] HttpFrontServer::FrontSH 
msgRecv:Remote=/10.13.193.68:42095,session=3148,thread=285


I think this error is caused by the Text reuse, and I have found the solution.

Additionally, the table create SQL is: 
CREATE EXTERNAL TABLE `web_searchhub`(
  `line` string)
PARTITIONED BY (
  `logdate` string)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\\U'
WITH SERDEPROPERTIES (
  'serialization.encoding'='GBK')
STORED AS INPUTFORMAT  "com.hadoop.mapred.DeprecatedLzoTextInputFormat"
  OUTPUTFORMAT 
"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat";

LOCATION
  'viewfs://nsX/user/hive/warehouse/raw.db/web/web_searchhub' ;


  was:
The method transformTextToUTF8 has a bug!
When I query data from an LZO table, the length of the current row in the results 
is always larger than that of the previous row, and sometimes the current row 
contains the contents of the previous row. For example, when I execute the SQL 
"select * from web_searchhub where logdate=2015061003", the result is shown below. 
Notice that the second row's content contains the first row's content.

INFO [03:00:05.589] HttpFrontServer::FrontSH 
msgRecv:Remote=/10.13.193.68:42098,session=3151,thread=254 2015061003
INFO [03:00:05.594] <18941e66-9962-44ad-81bc-3519f47ba274> 
session=901,thread=223ession=3151,thread=254 2015061003

The content  of origin lzo file content see below ,just 2 rows.

INFO [03:00:05.635]  
session=3148,thread=285
INFO [03:00:05.635] HttpFrontServer::FrontSH 
msgRecv:Remote=/10.13.193.68:42095,session=3148,thread=285


I think this error is caused by the Text reuse, and I have found the solution.

Additionally, the table create SQL is: 
CREATE EXTERNAL TABLE `web_searchhub`(
  `line` string)
PARTITIONED BY (
  `logdate` string)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\\U'
WITH SERDEPROPERTIES (
  'serialization.encoding'='GBK')
STORED AS INPUTFORMAT  "com.hadoop.mapred.DeprecatedLzoTextInputFormat"
  OUTPUTFORMAT 
"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat";

LOCATION
  'viewfs://nsX/user/hive/warehouse/raw.db/web/web_searchhub' ;



> SerDeUtils bug  ,when Text is reused 
> -
>
> Key: HIVE-10983
> URL: https://issues.apache.org/jira/browse/HIVE-10983
> Project: Hive
>  Issue Type: Bug
>  Components: API, CLI
>Affects Versions: 0.14.0, 1.0.0, 1.2.0
> Environment: Hadoop 2.3.0-cdh5.0.0
> Hive 0.14
>Reporter: xiaowei wang
>Assignee: xiaowei wang
>Priority: Critical
>  Labels: patch
> Fix For: 0.14.1, 1.2.0
>
> Attachments: HIVE-10983.1.patch.txt, HIVE-10983.2.patch.txt
>
>
> The method transformTextToUTF8 has a bug!
> It invokes a problematic method of Text, getBytes().
> When I query data from an LZO table, the length of the current row in the results 
> is always larger than that of the previous row, and sometimes the current row 
> contains the contents of the previous row. For example, when I execute the SQL 
> "select * from web_searchhub where logdate=2015061003", the result is shown below. 
> Notice that the second row's content contains the first row's content.
> INFO [03:00:05.589] HttpFrontServer::FrontSH 
> msgRecv:Remote=/10.13.193.68:42098,session=3151,thread=254 2015061003
> INFO [03:00:05.594] <18941e66-9962-44ad-81bc-3519f47ba274> 
> session=901,thread=223ession=3151,thread=254 2015061003
> The content  of origin lzo file content see below ,just 2 rows.
> INFO [03:00:05.635]  
> session=3148,thread=285
> INFO [03:00:05.635] HttpFrontServer::FrontSH 
> msgRecv:Remote=/10.13.193.68:42095,session=3148,thread=285
> I think this error is caused by the Text reuse, and I have found the solution.
> Additionally, the table create SQL is: 
> CREATE EXTERNAL TABLE `web_searchhub`(
>   `line` string)
> PARTITIONED BY (
>   `logdate` string)
> ROW FORMAT DELIMITED
>   FIELDS TERMINATED BY '\\U
