date:20140925


[ 
https://issues.apache.org/jira/browse/HIVE-8246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147451#comment-14147451
 ] 

Hive QA commented on HIVE-8246:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12671036/HIVE-8246.1.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 6346 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.ql.parse.TestParse.testParse_union
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/972/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/972/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-972/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12671036

 HiveServer2 in http-kerberos mode is restrictive on client usernames
 

 Key: HIVE-8246
 URL: https://issues.apache.org/jira/browse/HIVE-8246
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, JDBC
Affects Versions: 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 0.14.0

 Attachments: HIVE-8246.1.patch


 Unable to use client usernames of the format:
 {code}
 username/host@REALM
 username@FOREIGN_REALM
 {code}
 The following works fine:
 {code}
 username@REALM 
 {code}
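As an illustration of the three formats listed above: a Kerberos principal generally has the shape primary[/instance][@REALM]. The sketch below splits a principal string into those parts. The class name PrincipalName and its fields are hypothetical; they are not taken from HiveServer2 or from the attached patch, and escaped '/' or '@' characters are not handled.

```java
/** Hypothetical helper (not Hive code): splits primary[/instance][@REALM]. */
final class PrincipalName {
    final String primary;   // e.g. "username"
    final String instance;  // e.g. "host", or null when absent
    final String realm;     // e.g. "REALM", or null when absent

    private PrincipalName(String primary, String instance, String realm) {
        this.primary = primary;
        this.instance = instance;
        this.realm = realm;
    }

    static PrincipalName parse(String principal) {
        if (principal == null || principal.isEmpty()) {
            throw new IllegalArgumentException("empty principal");
        }
        String rest = principal;
        String realm = null;
        // The realm, when present, follows the last '@'.
        int at = rest.lastIndexOf('@');
        if (at >= 0) {
            realm = rest.substring(at + 1);
            rest = rest.substring(0, at);
        }
        String instance = null;
        // The instance, when present, follows the first '/'.
        int slash = rest.indexOf('/');
        if (slash >= 0) {
            instance = rest.substring(slash + 1);
            rest = rest.substring(0, slash);
        }
        if (rest.isEmpty()) {
            throw new IllegalArgumentException("missing primary in: " + principal);
        }
        return new PrincipalName(rest, instance, realm);
    }
}
```

With this split, username/host@REALM and username@FOREIGN_REALM both yield the primary "username"; only the instance and realm differ.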



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-8246) HiveServer2 in http-kerberos mode is restrictive on client usernames

2014-09-25 Thread Vaibhav Gumashta (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-8246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147452#comment-14147452
 ] 

Vaibhav Gumashta commented on HIVE-8246:


[~thejas] The failed test is flaky; it has also failed on other issues 
(HIVE-6148, HIVE-7156 and others).

 HiveServer2 in http-kerberos mode is restrictive on client usernames
 

 Key: HIVE-8246
 URL: https://issues.apache.org/jira/browse/HIVE-8246
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, JDBC
Affects Versions: 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 0.14.0

 Attachments: HIVE-8246.1.patch


 Unable to use client usernames of the format:
 {code}
 username/host@REALM
 username@FOREIGN_REALM
 {code}
 The following works fine:
 {code}
 username@REALM 
 {code}




Re: Review Request 25320: HIVE-7971: Support alter table change/replace/add columns for existing partitions

2014-09-25 Thread Gunther Hagleitner



 On Sept. 12, 2014, 6:58 p.m., Gunther Hagleitner wrote:
  ql/src/test/queries/clientpositive/alter_partition_change_col.q, line 7
  https://reviews.apache.org/r/25320/diff/1/?file=676166#file676166line7
 
  it'd be good to test:
  
  - dynamic partition case (no value for one partition specs, multiple 
  partition specs). does this work? If it does, what happens if there are 
  some partitions that cannot be changed? (some partitions already have a 
  column x others don't)
  - tables with multiple partitions
  - reordering columns
  - null/default partition
  - negative cases (name clash, column doesn't exist, etc)
 
 Jason Dere wrote:
 - Dynamic partition spec does not work for alter table statement, looks 
 like the syntax only supports specifying one partition at a time.
 - I've updated alter_partition_change_col.q to have multiple partitions
 - Reordering the columns in a partition does not look very effective 
 since it appears that the partition column names are ignored when reading 
 data from the table. The 1st column from the partition is treated as if it is 
 the 1st column of the table, and so on.
 - Are null/default partitions possible? Tried to test this but I didn't 
 get it working
 - Adding negative test cases
 
 Gunther Hagleitner wrote:
 - Reordering: Sounds like this is broken then? You're changing the first 
 column in the partition, if that's the 5th column in the table hive should 
 map, no? Is hive broken or is this patch not updating this?
 - Are you saying you couldn't write an alter table statement that accepts 
 the default value?

- what's the behavior of the dynamic partition pruning case? it just fails on 
the parser level? did you turn on dynamic pruning?


- Gunther


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25320/#review53197
---


On Sept. 15, 2014, 6:21 p.m., Jason Dere wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/25320/
 ---
 
 (Updated Sept. 15, 2014, 6:21 p.m.)
 
 
 Review request for hive and Gunther Hagleitner.
 
 
 Bugs: HIVE-7971
 https://issues.apache.org/jira/browse/HIVE-7971
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Allow change/replace/add column to work on partitions
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 020943f 
   ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
 05cde3e 
   ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g 25cd3a5 
   ql/src/java/org/apache/hadoop/hive/ql/plan/AlterTableDesc.java 8517319 
   ql/src/test/queries/clientnegative/alter_partition_change_col_dup_col.q 
 PRE-CREATION 
   ql/src/test/queries/clientnegative/alter_partition_change_col_nonexist.q 
 PRE-CREATION 
   ql/src/test/queries/clientpositive/alter_partition_change_col.q 
 PRE-CREATION 
   ql/src/test/results/clientnegative/alter_partition_change_col_dup_col.q.out 
 PRE-CREATION 
   
 ql/src/test/results/clientnegative/alter_partition_change_col_nonexist.q.out 
 PRE-CREATION 
   ql/src/test/results/clientpositive/alter_partition_change_col.q.out 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/25320/diff/
 
 
 Testing
 ---
 
 New qfile test added
 
 
 Thanks,
 
 Jason Dere

[jira] [Commented] (HIVE-7647) Beeline does not honor --headerInterval and --color when executing with -e

2014-09-25 Thread Lefty Leverenz (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-7647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147457#comment-14147457
 ] 

Lefty Leverenz commented on HIVE-7647:
--

Nicely done, [~ngangam].  Why didn't you mention the fix for --color too?

 Beeline does not honor --headerInterval and --color when executing with -e
 

 Key: HIVE-7647
 URL: https://issues.apache.org/jira/browse/HIVE-7647
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.14.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam
Priority: Minor
  Labels: TODOC14
 Fix For: 0.14.0

 Attachments: HIVE-7647.1.patch, HIVE-7647.2.patch


 --showHeader is being honored
 [root@localhost ~]# beeline --showHeader=false -u 
 'jdbc:hive2://localhost:1/default' -n hive -d 
 org.apache.hive.jdbc.HiveDriver -e select * from sample_07 limit 10;
 Connecting to jdbc:hive2://localhost:1/default
 Connected to: Apache Hive (version 0.12.0-cdh5.0.1)
 Driver: Hive JDBC (version 0.12.0-cdh5.0.1)
 Transaction isolation: TRANSACTION_REPEATABLE_READ
 -hiveconf (No such file or directory)
 +--+--++-+
 | 00-  | All Occupations  | 135185230  | 42270   |
 | 11-  | Management occupations   | 6152650| 100310  |
 | 11-1011  | Chief executives | 301930 | 160440  |
 | 11-1021  | General and operations managers  | 1697690| 107970  |
 | 11-1031  | Legislators  | 64650  | 37980   |
 | 11-2011  | Advertising and promotions managers  | 36100  | 94720   |
 | 11-2021  | Marketing managers   | 166790 | 118160  |
 | 11-2022  | Sales managers   | 333910 | 110390  |
 | 11-2031  | Public relations managers| 51730  | 101220  |
 | 11-3011  | Administrative services managers | 246930 | 79500   |
 +--+--++-+
 10 rows selected (0.838 seconds)
 Beeline version 0.12.0-cdh5.1.0 by Apache Hive
 Closing: org.apache.hive.jdbc.HiveConnection
 --outputFormat is being honored.
 [root@localhost ~]# beeline --outputFormat=csv -u 
 'jdbc:hive2://localhost:1/default' -n hive -d 
 org.apache.hive.jdbc.HiveDriver -e select * from sample_07 limit 10;
 Connecting to jdbc:hive2://localhost:1/default
 Connected to: Apache Hive (version 0.12.0-cdh5.0.1)
 Driver: Hive JDBC (version 0.12.0-cdh5.0.1)
 Transaction isolation: TRANSACTION_REPEATABLE_READ
 'code','description','total_emp','salary'
 '00-','All Occupations','135185230','42270'
 '11-','Management occupations','6152650','100310'
 '11-1011','Chief executives','301930','160440'
 '11-1021','General and operations managers','1697690','107970'
 '11-1031','Legislators','64650','37980'
 '11-2011','Advertising and promotions managers','36100','94720'
 '11-2021','Marketing managers','166790','118160'
 '11-2022','Sales managers','333910','110390'
 '11-2031','Public relations managers','51730','101220'
 '11-3011','Administrative services managers','246930','79500'
 10 rows selected (0.664 seconds)
 Beeline version 0.12.0-cdh5.1.0 by Apache Hive
 Closing: org.apache.hive.jdbc.HiveConnection
 Both --color and --headerInterval are being honored when executing with the -f 
 option (which reads the query from a file rather than the command line). The 
 color is not visible here, but it appears in the terminal.
 [root@localhost ~]# beeline --showheader=true --color=true --headerInterval=5 
 -u 'jdbc:hive2://localhost:1/default' -n hive -d 
 org.apache.hive.jdbc.HiveDriver -f /tmp/tmp.sql  
 Connecting to jdbc:hive2://localhost:1/default
 Connected to: Apache Hive (version 0.12.0-cdh5.0.1)
 Driver: Hive JDBC (version 0.12.0-cdh5.0.1)
 Transaction isolation: TRANSACTION_REPEATABLE_READ
 Beeline version 0.12.0-cdh5.1.0 by Apache Hive
 0: jdbc:hive2://localhost select * from sample_07 limit 8;
 +--+--++-+
 |   code   | description  | total_emp  | salary  |
 +--+--++-+
 | 00-  | All Occupations  | 135185230  | 42270   |
 | 11-  | Management occupations   | 6152650| 100310  |
 | 11-1011  | Chief executives | 301930 | 160440  |
 | 11-1021  | General and operations managers  | 1697690| 107970  |
 | 11-1031  | Legislators  | 64650  | 37980   |
 +--+--++-+
 |   code   | description  | total_emp  | salary  |

[jira] [Commented] (HIVE-7971) Support alter table change/replace/add columns for existing partitions

[
https://issues.apache.org/jira/browse/HIVE-7971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147461#comment-14147461
]

Gunther Hagleitner commented on HIVE-7971:
--

The patch looks good, but I added some follow up questions on rb. Maybe better
here:

- dynamic partition: You're saying that's not supported, is this at the parser
level? did you enable the flag?
- if that's not there, it will be hard to update all partitions (your decimal
use case), but that could be another jira/time

- default partition: You're saying this doesn't work. Without it you can't
completely update the table though, right? Is this easy to add?

- reordering has no effect: hm, this should be the serde that reconciles the
difference between table and schema right? If that's broken altogether then
probably just file another jira. I just want to make sure it's not just this
patch not updating all fields. Are there any serdes that work?
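To make the reordering question concrete, here is a minimal sketch contrasting positional mapping (the behaviour reported on the review board, where the 1st column of the partition is treated as the 1st column of the table) with name-based mapping. ColumnMapping and both methods are hypothetical illustrations, not Hive serde code, and whether any serde actually reconciles by name is exactly the open question.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; not part of Hive. */
final class ColumnMapping {
    private ColumnMapping() {}

    /** Positional: the i-th table column receives the i-th stored value. */
    static Map<String, String> byPosition(List<String> tableCols, List<String> rowValues) {
        Map<String, String> out = new LinkedHashMap<>();
        for (int i = 0; i < tableCols.size(); i++) {
            out.put(tableCols.get(i), i < rowValues.size() ? rowValues.get(i) : null);
        }
        return out;
    }

    /** By name: each table column is looked up in the partition's own column list. */
    static Map<String, String> byName(List<String> tableCols,
                                      List<String> partitionCols,
                                      List<String> rowValues) {
        Map<String, String> out = new LinkedHashMap<>();
        for (String col : tableCols) {
            int idx = partitionCols.indexOf(col);
            out.put(col, idx >= 0 && idx < rowValues.size() ? rowValues.get(idx) : null);
        }
        return out;
    }
}
```

If the table declares columns (a, b) and a partition stores them as (b, a), positional mapping hands b's stored value to a, which would explain why reordering columns in the partition metadata appears to have no effect.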

Support alter table change/replace/add columns for existing partitions
--

Key: HIVE-7971
URL: https://issues.apache.org/jira/browse/HIVE-7971
Project: Hive
Issue Type: Bug
Reporter: Jason Dere
Assignee: Jason Dere
Attachments: HIVE-7971.1.patch, HIVE-7971.2.patch

ALTER TABLE CHANGE COLUMN is allowed for tables, but not for partitions. Same
for add/replace columns.
Allowing this for partitions can be useful in some cases. For example, one
user has tables with Hive 0.12 Decimal columns, which do not specify
precision/scale. To be able to properly read the decimal values from the
existing partitions, the column types in the partitions need to be changed to
decimal types with precision/scale.


[jira] [Commented] (HIVE-8201) Remove hardwiring to HiveInputFormat in acid qfile tests


[ 
https://issues.apache.org/jira/browse/HIVE-8201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147466#comment-14147466
 ] 

Gunther Hagleitner commented on HIVE-8201:
--

Lol. This is your jobconf. This is your jobconf on acid.

 Remove hardwiring to HiveInputFormat in acid qfile tests
 

 Key: HIVE-8201
 URL: https://issues.apache.org/jira/browse/HIVE-8201
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0
Reporter: Alan Gates
Assignee: Alan Gates
 Attachments: HIVE-8201.2.patch, HIVE-8201.patch


 Now that HIVE-7812 is checked in we should remove the hardwiring to 
 HiveInputFormat for the qfile tests.




[jira] [Commented] (HIVE-8188) ExprNodeGenericFuncEvaluator::_evaluate() loads class annotations in a tight loop


[ 
https://issues.apache.org/jira/browse/HIVE-8188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147481#comment-14147481
 ] 

Gunther Hagleitner commented on HIVE-8188:
--

on commit, you might want to add some comments saying that annotation lookups 
are expensive and that isDeterministic and isEstimable cannot change while 
running through the op pipeline.
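A minimal sketch of the idea behind that comment, assuming the flag can be resolved once when the evaluator is set up: the annotation is read a single time and cached in a field, so the per-row path never touches reflection. The Deterministic annotation, LengthUdf and CachingEvaluator are hypothetical stand-ins and do not reproduce ExprNodeGenericFuncEvaluator or the attached patch.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

/** Hypothetical stand-in for a UDF annotation carrying a "deterministic" flag. */
@Retention(RetentionPolicy.RUNTIME)
@interface Deterministic {
    boolean value() default true;
}

/** Hypothetical UDF used only for this example. */
@Deterministic(true)
final class LengthUdf {
    int apply(String s) {
        return s == null ? 0 : s.length();
    }
}

final class CachingEvaluator {
    private final LengthUdf udf;
    // Annotation lookups are expensive, and the flag cannot change while
    // rows flow through the operator pipeline, so it is resolved once here.
    private final boolean deterministic;
    int annotationLookups = 0; // demo counter, shows how often reflection ran

    CachingEvaluator(LengthUdf udf) {
        this.udf = udf;
        Deterministic d = udf.getClass().getAnnotation(Deterministic.class);
        annotationLookups++;
        this.deterministic = d != null && d.value();
    }

    boolean isDeterministic() {
        return deterministic;
    }

    /** Per-row path: no reflection, only the cached field and the UDF call. */
    int evaluate(String row) {
        return udf.apply(row);
    }
}
```

However many rows pass through evaluate, the annotation is read once in the constructor.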

 ExprNodeGenericFuncEvaluator::_evaluate() loads class annotations in a tight 
 loop
 -

 Key: HIVE-8188
 URL: https://issues.apache.org/jira/browse/HIVE-8188
 Project: Hive
  Issue Type: Bug
  Components: UDF
Affects Versions: 0.14.0
Reporter: Gopal V
Assignee: Gopal V
  Labels: Performance
 Attachments: HIVE-8188.1.patch, HIVE-8188.2.patch, 
 udf-deterministic.png


 When running a near-constant UDF, most of the CPU is burnt within the VM 
 trying to read the class annotations for every row.
 !udf-deterministic.png!




[jira] [Updated] (HIVE-8245) Collect table read entities at same time as view read entities


 [ 
https://issues.apache.org/jira/browse/HIVE-8245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-8245:
---
Status: Patch Available  (was: Open)

 Collect table read entities at same time as view read entities 
 ---

 Key: HIVE-8245
 URL: https://issues.apache.org/jira/browse/HIVE-8245
 Project: Hive
  Issue Type: Improvement
  Components: CBO
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-8245.1.patch, HIVE-8245.patch







[jira] [Commented] (HIVE-8082) generateErrorMessage doesn't handle null ast properly


[ 
https://issues.apache.org/jira/browse/HIVE-8082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147483#comment-14147483
 ] 

Gunther Hagleitner commented on HIVE-8082:
--

+1

 generateErrorMessage doesn't handle null ast properly
 -

 Key: HIVE-8082
 URL: https://issues.apache.org/jira/browse/HIVE-8082
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.13.1
 Environment: anything
Reporter: Darren Yin
Priority: Minor
  Labels: newbie, patch
 Attachments: HIVE-8082.1.patch


 In SemanticAnalyzer.genUnionPlan, tabref can be null, and if one of the 
 throw new SemanticException lines is then reached, generateErrorMessage will 
 fail with a NullPointerException when ast.getLine() is called. The fix 
 is just to add a check for if (ast == null).
 Example stack trace:
 {noformat}
 2014-09-12 14:02:30 14/09/12 21:02:30 ERROR ql.Driver: FAILED: NullPointerException null
 2014-09-12 14:02:30 java.lang.NullPointerException
 2014-09-12 14:02:30   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.generateErrorMessage(SemanticAnalyzer.java:484)
 2014-09-12 14:02:30   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genUnionPlan(SemanticAnalyzer.java:7411)
 2014-09-12 14:02:30   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:7970)
 2014-09-12 14:02:30   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:7985)
 2014-09-12 14:02:30   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:8693)
 2014-09-12 14:02:30   at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:278)
 2014-09-12 14:02:30   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:458)
 2014-09-12 14:02:30   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:407)
 2014-09-12 14:02:30   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:339)
 2014-09-12 14:02:30   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:969)
 2014-09-12 14:02:30   at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:261)
 2014-09-12 14:02:30   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:218)
 2014-09-12 14:02:30   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:421)
 2014-09-12 14:02:30   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:356)
 2014-09-12 14:02:30   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714)
 2014-09-12 14:02:30   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:622)
 2014-09-12 14:02:30   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 2014-09-12 14:02:30   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 2014-09-12 14:02:30   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 2014-09-12 14:02:30   at java.lang.reflect.Method.invoke(Method.java:597)
 2014-09-12 14:02:30   at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
 {noformat}
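The null check described above can be sketched as follows. This is an assumption-laden illustration rather than the contents of HIVE-8082.1.patch: AstNode is a hypothetical stand-in for the parser's tree node, ErrorMessages is a hypothetical holder class, and the message wording is invented for the example.

```java
/** Hypothetical stand-in for a parse-tree node; only the accessors used below. */
final class AstNode {
    private final int line;
    private final int column;
    private final String text;

    AstNode(int line, int column, String text) {
        this.line = line;
        this.column = column;
        this.text = text;
    }

    int getLine() { return line; }
    int getCharPositionInLine() { return column; }
    String getText() { return text; }
}

/** Hypothetical holder class; not SemanticAnalyzer itself. */
final class ErrorMessages {
    private ErrorMessages() {}

    static String generateErrorMessage(AstNode ast, String message) {
        StringBuilder sb = new StringBuilder();
        if (ast == null) {
            // No node to take a position from: report the bare message
            // instead of dereferencing null and throwing an NPE.
            sb.append(message).append(". Position unknown (null AST)");
            return sb.toString();
        }
        sb.append(ast.getLine()).append(':').append(ast.getCharPositionInLine())
          .append(' ').append(message)
          .append(". Error encountered near token '").append(ast.getText()).append('\'');
        return sb.toString();
    }
}
```

With the guard in place, a caller that passes a null node still gets the original SemanticException text instead of a NullPointerException that hides it.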




[jira] [Commented] (HIVE-8188) ExprNodeGenericFuncEvaluator::_evaluate() loads class annotations in a tight loop


[ 
https://issues.apache.org/jira/browse/HIVE-8188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147477#comment-14147477
 ] 

Gunther Hagleitner commented on HIVE-8188:
--

+1

 ExprNodeGenericFuncEvaluator::_evaluate() loads class annotations in a tight 
 loop
 -

 Key: HIVE-8188
 URL: https://issues.apache.org/jira/browse/HIVE-8188
 Project: Hive
  Issue Type: Bug
  Components: UDF
Affects Versions: 0.14.0
Reporter: Gopal V
Assignee: Gopal V
  Labels: Performance
 Attachments: HIVE-8188.1.patch, HIVE-8188.2.patch, 
 udf-deterministic.png


 When running a near-constant UDF, most of the CPU is burnt within the VM 
 trying to read the class annotations for every row.
 !udf-deterministic.png!




[jira] [Commented] (HIVE-7389) Reduce number of metastore calls in MoveTask (when loading dynamic partitions)


[ 
https://issues.apache.org/jira/browse/HIVE-7389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147500#comment-14147500
 ] 

Gunther Hagleitner commented on HIVE-7389:
--

[~rajesh.balamohan] is this ready to go?

 Reduce number of metastore calls in MoveTask (when loading dynamic partitions)
 --

 Key: HIVE-7389
 URL: https://issues.apache.org/jira/browse/HIVE-7389
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0
Reporter: Rajesh Balamohan
Assignee: Rajesh Balamohan
  Labels: performance
 Attachments: HIVE-7389.1.patch, local_vm_testcase.txt


 When the number of dynamic partitions to be loaded is high, the time taken 
 for 'MoveTask' is greater than that of the actual job in some scenarios. It 
 would be possible to reduce overall runtime by reducing the number of calls 
 made to the metastore from the MoveTask operation.
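One generic way to cut round trips is to send partitions in batches rather than one call per partition. The sketch below only counts calls to show the effect. CountingMetastore, PartitionLoader and their methods are hypothetical, and nothing here describes what HIVE-7389.1.patch actually changes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical metastore client that only records how many calls it received. */
final class CountingMetastore {
    int calls = 0;
    final List<String> partitions = new ArrayList<>();

    void addPartition(String spec) {
        calls++;
        partitions.add(spec);
    }

    void addPartitions(List<String> specs) {
        calls++;
        partitions.addAll(specs);
    }
}

/** Hypothetical loader; not MoveTask. */
final class PartitionLoader {
    private PartitionLoader() {}

    /** One round trip per partition. */
    static void loadOneByOne(CountingMetastore ms, List<String> specs) {
        for (String spec : specs) {
            ms.addPartition(spec);
        }
    }

    /** One round trip per batch of at most batchSize partitions. */
    static void loadBatched(CountingMetastore ms, List<String> specs, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        for (int i = 0; i < specs.size(); i += batchSize) {
            int end = Math.min(i + batchSize, specs.size());
            ms.addPartitions(specs.subList(i, end));
        }
    }
}
```

For 250 partitions and a batch size of 100, the batched path makes 3 calls where the one-by-one path makes 250.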




Re: Review Request 25997: Collect table read entities at same time as view read entities

2014-09-25 Thread Ashutosh Chauhan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25997/
---

(Updated Sept. 25, 2014, 6:41 a.m.)


Review request for hive and Thejas Nair.


Changes
---

Updated patch with golden files update.


Bugs: HIVE-8245
https://issues.apache.org/jira/browse/HIVE-8245


Repository: hive-git


Description
---

Collect table read entities at same time as view read entities 


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 2f36f04 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java c4dacf9 
  ql/src/test/results/clientnegative/limit_partition_stats.q.out 5a5fe1f 
  ql/src/test/results/clientpositive/alter_merge_stats_orc.q.out eae45b2 
  ql/src/test/results/clientpositive/authorization_explain.q.out e5e605b 
  ql/src/test/results/clientpositive/explain_dependency.q.out cb98d54 
  ql/src/test/results/clientpositive/limit0.q.out d047374 
  ql/src/test/results/clientpositive/limit_pushdown.q.out a5a0090 
  ql/src/test/results/clientpositive/metadata_only_queries.q.out e273570 
  ql/src/test/results/clientpositive/metadata_only_queries_with_filters.q.out 
664e065 
  ql/src/test/results/clientpositive/orc_analyze.q.out 07e46e9 
  ql/src/test/results/clientpositive/orc_merge5.q.out 2ac3342 
  ql/src/test/results/clientpositive/orc_merge6.q.out 05deb57 
  ql/src/test/results/clientpositive/orc_merge7.q.out d342736 
  ql/src/test/results/clientpositive/orc_merge_incompat1.q.out e6ef838 
  ql/src/test/results/clientpositive/orc_merge_incompat2.q.out e28d8b3 
  ql/src/test/results/clientpositive/ql_rewrite_gbtoidx.q.out 0e7c4af 
  ql/src/test/results/clientpositive/query_properties.q.out 47f8d8c 
  ql/src/test/results/clientpositive/stats_only_null.q.out c4728c9 
  ql/src/test/results/clientpositive/tez/alter_merge_stats_orc.q.out eae45b2 
  ql/src/test/results/clientpositive/tez/limit_pushdown.q.out 23df5ec 
  ql/src/test/results/clientpositive/tez/metadata_only_queries.q.out 7942ce7 
  ql/src/test/results/clientpositive/tez/orc_analyze.q.out 07e46e9 
  ql/src/test/results/clientpositive/tez/orc_merge5.q.out b40a37d 
  ql/src/test/results/clientpositive/tez/orc_merge6.q.out 0441fa4 
  ql/src/test/results/clientpositive/tez/orc_merge7.q.out c6809a1 
  ql/src/test/results/clientpositive/tez/orc_merge_incompat1.q.out 90f7f24 
  ql/src/test/results/clientpositive/tez/orc_merge_incompat2.q.out 30c6ab8 
  ql/src/test/results/clientpositive/vectorization_limit.q.out d67d559 

Diff: https://reviews.apache.org/r/25997/diff/


Testing
---

Unit tests


Thanks,

Ashutosh Chauhan

[jira] [Updated] (HIVE-7389) Reduce number of metastore calls in MoveTask (when loading dynamic partitions)


 [ 
https://issues.apache.org/jira/browse/HIVE-7389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-7389:
-
Status: Patch Available  (was: Open)

 Reduce number of metastore calls in MoveTask (when loading dynamic partitions)
 --

 Key: HIVE-7389
 URL: https://issues.apache.org/jira/browse/HIVE-7389
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0
Reporter: Rajesh Balamohan
Assignee: Rajesh Balamohan
  Labels: performance
 Attachments: HIVE-7389.1.patch, local_vm_testcase.txt


 When the number of dynamic partitions to be loaded is high, the time taken 
 for 'MoveTask' is greater than that of the actual job in some scenarios. It 
 would be possible to reduce overall runtime by reducing the number of calls 
 made to the metastore from the MoveTask operation.




[jira] [Updated] (HIVE-8245) Collect table read entities at same time as view read entities


 [ 
https://issues.apache.org/jira/browse/HIVE-8245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-8245:
---
Status: Open  (was: Patch Available)

 Collect table read entities at same time as view read entities 
 ---

 Key: HIVE-8245
 URL: https://issues.apache.org/jira/browse/HIVE-8245
 Project: Hive
  Issue Type: Improvement
  Components: CBO
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-8245.1.patch, HIVE-8245.patch







[jira] [Updated] (HIVE-8162) hive.optimize.sort.dynamic.partition causes RuntimeException for inserting into dynamic partitioned table when map function is used in the subquery

2014-09-25 Thread Prasanth J (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-8162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J updated HIVE-8162:
-
Attachment: HIVE-8162.2.patch

 hive.optimize.sort.dynamic.partition causes RuntimeException for inserting 
 into dynamic partitioned table when map function is used in the subquery 
 

 Key: HIVE-8162
 URL: https://issues.apache.org/jira/browse/HIVE-8162
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0, 0.14.0
Reporter: Na Yang
Assignee: Prasanth J
 Attachments: 47rows.txt, HIVE-8162.1.patch, HIVE-8162.2.patch


 Exception:
 Diagnostic Messages for this Task:
 java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error: Unable to deserialize reduce input key from x1x129x51x83x14x1x128x0x0x2x1x1x1x120x95x112x114x111x100x117x99x116x95x105x100x0x1x0x0x255 with properties {columns=reducesinkkey0,reducesinkkey1,reducesinkkey2, serialization.lib=org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe, serialization.sort.order=+++, columns.types=int,map<string,string>,int}
   at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:283)
   at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:518)
   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:462)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:282)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1122)
   at org.apache.hadoop.mapred.Child.main(Child.java:271)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error: Unable to deserialize reduce input key from x1x129x51x83x14x1x128x0x0x2x1x1x1x120x95x112x114x111x100x117x99x116x95x105x100x0x1x0x0x255 with properties {columns=reducesinkkey0,reducesinkkey1,reducesinkkey2, serialization.lib=org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe, serialization.sort.order=+++, columns.types=int,map<string,string>,int}
   at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:222)
   ... 7 more
 Caused by: org.apache.hadoop.hive.serde2.SerDeException: java.io.EOFException
   at org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.deserialize(BinarySortableSerDe.java:189)
   at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:220)
   ... 7 more
 Caused by: java.io.EOFException
   at org.apache.hadoop.hive.serde2.binarysortable.InputByteBuffer.read(InputByteBuffer.java:54)
   at org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.deserializeInt(BinarySortableSerDe.java:533)
   at org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.deserialize(BinarySortableSerDe.java:236)
   at org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.deserialize(BinarySortableSerDe.java:185)
   ... 8 more
 Step to reproduce the exception:
 -
 CREATE TABLE associateddata(creative_id int,creative_group_id int,placement_id
 int,sm_campaign_id int,browser_id string, trans_type_p string,trans_time_p
 string,group_name string,event_name string,order_id string,revenue
 float,currency string, trans_type_ci string,trans_time_ci string,f16
 map<string,string>,campaign_id int,user_agent_cat string,geo_country
 string,geo_city string,geo_state string,geo_zip string,geo_dma string,geo_area
 string,geo_isp string,site_id int,section_id int,f16_ci map<string,string>)
 PARTITIONED BY(day_id int, hour_id int) ROW FORMAT DELIMITED FIELDS TERMINATED
 BY '\t';
 LOAD DATA LOCAL INPATH '/tmp/47rows.txt' INTO TABLE associateddata
 PARTITION(day_id=20140814,hour_id=2014081417);
 set hive.exec.dynamic.partition=true;
 set hive.exec.dynamic.partition.mode=nonstrict; 
 CREATE  EXTERNAL TABLE IF NOT EXISTS agg_pv_associateddata_c (
  vt_tran_qty int COMMENT 'The count of view
 thru transactions'
 , pair_value_txt  string  COMMENT 'F16 name values
 pairs'
 )
 PARTITIONED BY (day_id int)
 ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
 STORED AS TEXTFILE
 LOCATION '/user/prodman/agg_pv_associateddata_c';
 INSERT INTO TABLE agg_pv_associateddata_c PARTITION (day_id)
 select 2 as vt_tran_qty, pair_value_txt, day_id
  from (select map( 'x_product_id',coalesce(F16['x_product_id'],'') ) as 
 pair_value_txt , day_id , hour_id 
 from associateddata where hour_id = 2014081417 and sm_campaign_id in

Re: Review Request 25997: Collect table read entities at same time as view read entities

2014-09-25 Thread Thejas Nair


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25997/#review54523
---

Ship it!



- Thejas Nair


On Sept. 25, 2014, 6:41 a.m., Ashutosh Chauhan wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/25997/
 ---
 
 (Updated Sept. 25, 2014, 6:41 a.m.)
 
 
 Review request for hive and Thejas Nair.
 
 
 Bugs: HIVE-8245
 https://issues.apache.org/jira/browse/HIVE-8245
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Collect table read entities at same time as view read entities 
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 2f36f04 
   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java c4dacf9 
   ql/src/test/results/clientnegative/limit_partition_stats.q.out 5a5fe1f 
   ql/src/test/results/clientpositive/alter_merge_stats_orc.q.out eae45b2 
   ql/src/test/results/clientpositive/authorization_explain.q.out e5e605b 
   ql/src/test/results/clientpositive/explain_dependency.q.out cb98d54 
   ql/src/test/results/clientpositive/limit0.q.out d047374 
   ql/src/test/results/clientpositive/limit_pushdown.q.out a5a0090 
   ql/src/test/results/clientpositive/metadata_only_queries.q.out e273570 
   ql/src/test/results/clientpositive/metadata_only_queries_with_filters.q.out 
 664e065 
   ql/src/test/results/clientpositive/orc_analyze.q.out 07e46e9 
   ql/src/test/results/clientpositive/orc_merge5.q.out 2ac3342 
   ql/src/test/results/clientpositive/orc_merge6.q.out 05deb57 
   ql/src/test/results/clientpositive/orc_merge7.q.out d342736 
   ql/src/test/results/clientpositive/orc_merge_incompat1.q.out e6ef838 
   ql/src/test/results/clientpositive/orc_merge_incompat2.q.out e28d8b3 
   ql/src/test/results/clientpositive/ql_rewrite_gbtoidx.q.out 0e7c4af 
   ql/src/test/results/clientpositive/query_properties.q.out 47f8d8c 
   ql/src/test/results/clientpositive/stats_only_null.q.out c4728c9 
   ql/src/test/results/clientpositive/tez/alter_merge_stats_orc.q.out eae45b2 
   ql/src/test/results/clientpositive/tez/limit_pushdown.q.out 23df5ec 
   ql/src/test/results/clientpositive/tez/metadata_only_queries.q.out 7942ce7 
   ql/src/test/results/clientpositive/tez/orc_analyze.q.out 07e46e9 
   ql/src/test/results/clientpositive/tez/orc_merge5.q.out b40a37d 
   ql/src/test/results/clientpositive/tez/orc_merge6.q.out 0441fa4 
   ql/src/test/results/clientpositive/tez/orc_merge7.q.out c6809a1 
   ql/src/test/results/clientpositive/tez/orc_merge_incompat1.q.out 90f7f24 
   ql/src/test/results/clientpositive/tez/orc_merge_incompat2.q.out 30c6ab8 
   ql/src/test/results/clientpositive/vectorization_limit.q.out d67d559 
 
 Diff: https://reviews.apache.org/r/25997/diff/
 
 
 Testing
 ---
 
 Unit tests
 
 
 Thanks,
 
 Ashutosh Chauhan

[jira] [Commented] (HIVE-8245) Collect table read entities at same time as view read entities

2014-09-25 Thread Thejas M Nair (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-8245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14147506#comment-14147506
 ] 

Thejas M Nair commented on HIVE-8245:
-

+1 pending tests


 Collect table read entities at same time as view read entities 
 ---

 Key: HIVE-8245
 URL: https://issues.apache.org/jira/browse/HIVE-8245
 Project: Hive
  Issue Type: Improvement
  Components: CBO
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-8245.1.patch, HIVE-8245.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HIVE-8245) Collect table read entities at same time as view read entities


 [ 
https://issues.apache.org/jira/browse/HIVE-8245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-8245:
---
Attachment: HIVE-8245.1.patch

 Collect table read entities at same time as view read entities 
 ---

 Key: HIVE-8245
 URL: https://issues.apache.org/jira/browse/HIVE-8245
 Project: Hive
  Issue Type: Improvement
  Components: CBO
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-8245.1.patch, HIVE-8245.patch







[jira] [Commented] (HIVE-8111) CBO trunk merge: duplicated casts for arithmetic expressions in Hive and CBO


[ 
https://issues.apache.org/jira/browse/HIVE-8111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14147515#comment-14147515
 ] 

Hive QA commented on HIVE-8111:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12671057/HIVE-8111.03.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 6345 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.ql.parse.TestParse.testParse_union
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/973/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/973/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-973/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12671057

 CBO trunk merge: duplicated casts for arithmetic expressions in Hive and CBO
 

 Key: HIVE-8111
 URL: https://issues.apache.org/jira/browse/HIVE-8111
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-8111.01.patch, HIVE-8111.02.patch, 
 HIVE-8111.03.patch, HIVE-8111.patch


 Original test failure: looks like column type changes to different decimals 
 in most cases. In one case it causes the integer part to be too big to fit, 
 so the result becomes null it seems.
 What happens is that CBO adds casts to arithmetic expressions to make them 
 type compatible; these casts become part of new AST, and then Hive adds casts 
 on top of these casts. This (the first part) also causes lots of out file 
 changes. It's not clear how to best fix it so far, in addition to incorrect 
 decimal width and sometimes nulls when width is larger than allowed in Hive.
 Option one - don't add those for numeric ops - cannot be done if numeric op 
 is a part of compare, for which CBO needs correct types.
 Option two - unwrap casts when determining type in Hive - hard or impossible 
 to tell apart CBO-added casts and user casts. 
 Option three - don't change types in Hive if CBO has run - seems hacky and 
 hard to ensure it's applied everywhere.
 Option four - map all expressions precisely between two trees and remove 
 casts again after optimization, will be pretty difficult.
 Option five - somehow mark those casts. Not sure about how yet.
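
The null result mentioned in the description can be illustrated with a minimal Java sketch. It assumes, without this being taken from the Hive source, that a decimal(precision, scale) value is rejected when its integer part needs more than precision - scale digits. The class DecimalFitSketch and the helper fit are hypothetical names used only for illustration.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

final class DecimalFitSketch {
    // Hypothetical helper: rounds `value` to `scale` fractional digits and
    // returns null when the integer part needs more than (precision - scale) digits.
    static BigDecimal fit(BigDecimal value, int precision, int scale) {
        BigDecimal rounded = value.setScale(scale, RoundingMode.HALF_UP);
        int integerDigits = rounded.precision() - rounded.scale();
        return integerDigits > precision - scale ? null : rounded;
    }

    public static void main(String[] args) {
        BigDecimal v = new BigDecimal("12345.678");
        // Wide enough type: the value survives, rounded to two fractional digits.
        System.out.println(fit(v, 10, 2));
        // Too narrow: five integer digits do not fit into decimal(5,2), so the result is null.
        System.out.println(fit(v, 5, 2));
    }
}
```

If a second cast narrows the type that an earlier cast chose, a value that fit before may stop fitting, which would match the "integer part too big to fit" symptom.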




[jira] [Updated] (HIVE-8072) TesParse_union is failing on trunk


 [ 
https://issues.apache.org/jira/browse/HIVE-8072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-8072:
-
Status: Open  (was: Patch Available)

 TesParse_union is failing on trunk
 --

 Key: HIVE-8072
 URL: https://issues.apache.org/jira/browse/HIVE-8072
 Project: Hive
  Issue Type: Task
  Components: Tests
Affects Versions: 0.14.0
Reporter: Ashutosh Chauhan
 Attachments: HIVE-8072.1.patch.txt, HIVE-8072.2.patch, HIVE-8072.patch


 Needs golden file update




[jira] [Updated] (HIVE-8072) TesParse_union is failing on trunk


 [ 
https://issues.apache.org/jira/browse/HIVE-8072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-8072:
-
Attachment: HIVE-8072.2.patch

 TesParse_union is failing on trunk
 --

 Key: HIVE-8072
 URL: https://issues.apache.org/jira/browse/HIVE-8072
 Project: Hive
  Issue Type: Task
  Components: Tests
Affects Versions: 0.14.0
Reporter: Ashutosh Chauhan
 Attachments: HIVE-8072.1.patch.txt, HIVE-8072.2.patch, HIVE-8072.patch


 Needs golden file update




[jira] [Updated] (HIVE-8072) TesParse_union is failing on trunk


 [ 
https://issues.apache.org/jira/browse/HIVE-8072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-8072:
-
Status: Patch Available  (was: Open)

 TesParse_union is failing on trunk
 --

 Key: HIVE-8072
 URL: https://issues.apache.org/jira/browse/HIVE-8072
 Project: Hive
  Issue Type: Task
  Components: Tests
Affects Versions: 0.14.0
Reporter: Ashutosh Chauhan
 Attachments: HIVE-8072.1.patch.txt, HIVE-8072.2.patch, HIVE-8072.patch


 Needs golden file update




[jira] [Commented] (HIVE-8072) TesParse_union is failing on trunk


[ 
https://issues.apache.org/jira/browse/HIVE-8072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14147531#comment-14147531
 ] 

Gunther Hagleitner commented on HIVE-8072:
--

Patch LGTM. It seems I caused this by introducing the interners. I can't 
reproduce any of the failures, and the build logs are gone by now. I've also 
slightly changed the patch in .2. [~navis], since you're using a new interner 
per query, I think you can use a strong interner. Also, I believe 
aliasToPartition.values() can return null when querying empty tables.
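
As a rough illustration of the strong-interner suggestion, here is a plain-JDK sketch. StrongInternerSketch is a hypothetical stand-in, not Hive's or Guava's implementation. A strong interner holds hard references to every pooled value, which is acceptable when the interner itself is thrown away with the query. The orEmpty helper is likewise hypothetical and only shows the kind of null guard the comment hints at.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

final class StrongInternerSketch<T> {
    // Hard references: pooled values live exactly as long as this interner does.
    private final Map<T, T> pool = new HashMap<>();

    // Returns the canonical instance equal to `value`, pooling it on first sight.
    T intern(T value) {
        T existing = pool.putIfAbsent(value, value);
        return existing != null ? existing : value;
    }

    // Hypothetical guard for a collection reference that may be null.
    static <E> Collection<E> orEmpty(Collection<E> items) {
        return items != null ? items : Collections.<E>emptyList();
    }

    public static void main(String[] args) {
        StrongInternerSketch<String> perQuery = new StrongInternerSketch<>();
        String a = new String("ds=2014-09-25");
        String b = new String("ds=2014-09-25");
        // Both calls resolve to the first instance that was pooled.
        System.out.println(perQuery.intern(a) == perQuery.intern(b));
        // Iterating a possibly-null collection without a NullPointerException.
        System.out.println(orEmpty(null).size());
    }
}
```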

 TesParse_union is failing on trunk
 --

 Key: HIVE-8072
 URL: https://issues.apache.org/jira/browse/HIVE-8072
 Project: Hive
  Issue Type: Task
  Components: Tests
Affects Versions: 0.14.0
Reporter: Ashutosh Chauhan
 Attachments: HIVE-8072.1.patch.txt, HIVE-8072.2.patch, HIVE-8072.patch


 Needs golden file update




[jira] [Commented] (HIVE-8204) Dynamic partition pruning fails with IndexOutOfBoundsException


[ 
https://issues.apache.org/jira/browse/HIVE-8204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14147534#comment-14147534
 ] 

Gunther Hagleitner commented on HIVE-8204:
--

[~prasanth_j] can i get a +1 on the patch? Even if it's not reproducible it'd 
be good to get these tests added.

 Dynamic partition pruning fails with IndexOutOfBoundsException
 --

 Key: HIVE-8204
 URL: https://issues.apache.org/jira/browse/HIVE-8204
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0
Reporter: Prasanth J
Assignee: Gunther Hagleitner
 Attachments: HIVE-8204.1.patch


 Dynamic partition pruning fails with IndexOutOfBounds exception when 
 dimension table is partitioned and fact table is not.
 Steps to reproduce:
 1) Partition date_dim table from tpcds on d_date_sk
 2) Fact table is store_sales which is not partitioned
 3) Run the following
 {code}
 set hive.stats.fetch.column.stats=ture;
 set hive.tez.dynamic.partition.pruning=true;
 explain select d_date 
 from store_sales, date_dim 
 where 
 store_sales.ss_sold_date_sk = date_dim.d_date_sk and 
 date_dim.d_year = 1998;
 {code}
 The stack trace is:
 {code}
 2014-09-19 19:06:16,254 ERROR ql.Driver (SessionState.java:printError(825)) - FAILED: IndexOutOfBoundsException Index: 0, Size: 0
 java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
   at java.util.ArrayList.rangeCheck(ArrayList.java:635)
   at java.util.ArrayList.get(ArrayList.java:411)
   at org.apache.hadoop.hive.ql.optimizer.RemoveDynamicPruningBySize.process(RemoveDynamicPruningBySize.java:61)
   at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
   at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94)
   at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78)
   at org.apache.hadoop.hive.ql.lib.ForwardWalker.walk(ForwardWalker.java:61)
   at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109)
   at org.apache.hadoop.hive.ql.parse.TezCompiler.runStatsDependentOptimizations(TezCompiler.java:277)
   at org.apache.hadoop.hive.ql.parse.TezCompiler.optimizeOperatorPlan(TezCompiler.java:120)
   at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:97)
   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9781)
   at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
   at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
   at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:407)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:303)
   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1060)
   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1130)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:997)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:987)
   at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:246)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:198)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:408)
   at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:781)
   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
 {code}
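
The top frames of the trace show ArrayList.get(0) being called on an empty list. The report does not say which list is empty inside RemoveDynamicPruningBySize, and the following is not the attached patch. It is only a minimal sketch of that failure mode and of one defensive pattern, using a hypothetical helper named first.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

final class FirstElementSketch {
    // Hypothetical helper: the first element, or empty when there is none.
    static <T> Optional<T> first(List<T> items) {
        if (items == null || items.isEmpty()) {
            return Optional.empty();
        }
        return Optional.ofNullable(items.get(0));
    }

    public static void main(String[] args) {
        List<String> items = new ArrayList<>();
        try {
            items.get(0); // unguarded access to an empty list, as in the trace
        } catch (IndexOutOfBoundsException e) {
            System.out.println("unguarded access failed: " + e.getMessage());
        }
        // The guarded variant lets the caller decide what "nothing there" means.
        System.out.println(first(items).isPresent());
        items.add("example");
        System.out.println(first(items).orElse("none"));
    }
}
```

Whether skipping the optimization or failing with a clearer message is the right reaction to an empty list is a design decision the sketch leaves open.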




[jira] [Updated] (HIVE-8204) Dynamic partition pruning fails with IndexOutOfBoundsException


 [ 
https://issues.apache.org/jira/browse/HIVE-8204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-8204:
-
Status: Patch Available  (was: Open)

 Dynamic partition pruning fails with IndexOutOfBoundsException
 --

 Key: HIVE-8204
 URL: https://issues.apache.org/jira/browse/HIVE-8204
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0
Reporter: Prasanth J
Assignee: Gunther Hagleitner
 Attachments: HIVE-8204.1.patch



[jira] [Commented] (HIVE-7802) Update language manual for insert, update, and delete

2014-09-25 Thread Lefty Leverenz (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-7802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14147546#comment-14147546
 ] 

Lefty Leverenz commented on HIVE-7802:
--

[~alangates], the last example for INSERT...VALUES specifies a partition column 
without a value:

{{INSERT INTO TABLE pageviews PARTITION (datestamp) VALUES}}

but the syntax doesn't seem to allow that:

{{INSERT INTO TABLE tablename \[PARTITION (partcol1=val1, partcol2=val2 ...)\] 
VALUES ...}}

Also, Fred and Barney had better start taking Mickey Mouse courses.

https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-InsertingintotablesfromSQL

 Update language manual for insert, update, and delete
 -

 Key: HIVE-7802
 URL: https://issues.apache.org/jira/browse/HIVE-7802
 Project: Hive
  Issue Type: Sub-task
  Components: Documentation
Reporter: Alan Gates
Assignee: Alan Gates
  Labels: TODOC14

 With the addition of ACID compliant insert, insert...values, update, and 
 delete we need to update the Hive language manual to cover the new features.




Re: Review Request 25575: HIVE-7615: Beeline should have an option for user to see the query progress

2014-09-25 Thread Dong Chen


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25575/
---

(Updated Sept. 25, 2014, 8:43 a.m.)


Review request for hive.


Changes
---

Update patch V4 to address error logging comments.


Repository: hive-git


Description
---

When executing a query in Beeline, the user should have an option to see the 
progress through the output. Beeline could use the API introduced in HIVE-4629 
to get and display the logs to the client.


Diffs (updated)
-

  beeline/pom.xml 45fa02b 
  beeline/src/java/org/apache/hive/beeline/Commands.java a92d69f 
  itests/hive-unit/src/test/java/org/apache/hive/beeline/TestBeeLineWithArgs.java 1e66542 
  itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcDriver2.java daf8e9e 
  jdbc/src/java/org/apache/hive/jdbc/ClosedOrCancelledStatementException.java PRE-CREATION 
  jdbc/src/java/org/apache/hive/jdbc/HiveQueryResultSet.java 86bc580 
  jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java 2cbf58c 

Diff: https://reviews.apache.org/r/25575/diff/


Testing
---

UT passed.


Thanks,

Dong Chen

[jira] [Updated] (HIVE-7615) Beeline should have an option for user to see the query progress

2014-09-25 Thread Dong Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-7615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dong Chen updated HIVE-7615:

Attachment: HIVE-7615.4.patch

Thanks Thejas. 
Patch V4 is updated with changes of error logging. 

 Beeline should have an option for user to see the query progress
 

 Key: HIVE-7615
 URL: https://issues.apache.org/jira/browse/HIVE-7615
 Project: Hive
  Issue Type: Improvement
  Components: CLI
Reporter: Dong Chen
Assignee: Dong Chen
 Fix For: 0.14.0

 Attachments: HIVE-7615.1.patch, HIVE-7615.2.patch, HIVE-7615.3.patch, 
 HIVE-7615.4.patch, HIVE-7615.patch, complete_logs, simple_logs


 When executing a query in Beeline, the user should have an option to see the 
 progress through the output.
 Beeline could use the API introduced in HIVE-4629 to get and display the logs 
 to the client.




[jira] [Commented] (HIVE-8204) Dynamic partition pruning fails with IndexOutOfBoundsException

2014-09-25 Thread Prasanth J (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-8204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14147551#comment-14147551
 ] 

Prasanth J commented on HIVE-8204:
--

LGTM +1

 Dynamic partition pruning fails with IndexOutOfBoundsException
 --

 Key: HIVE-8204
 URL: https://issues.apache.org/jira/browse/HIVE-8204
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0
Reporter: Prasanth J
Assignee: Gunther Hagleitner
 Attachments: HIVE-8204.1.patch



[jira] [Commented] (HIVE-7802) Update language manual for insert, update, and delete

2014-09-25 Thread Lefty Leverenz (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-7802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14147558#comment-14147558
 ] 

Lefty Leverenz commented on HIVE-7802:
--

+1 pending approval of my edits.

 Update language manual for insert, update, and delete
 -

 Key: HIVE-7802
 URL: https://issues.apache.org/jira/browse/HIVE-7802
 Project: Hive
  Issue Type: Sub-task
  Components: Documentation
Reporter: Alan Gates
Assignee: Alan Gates
  Labels: TODOC14

 With the addition of ACID compliant insert, insert...values, update, and 
 delete we need to update the Hive language manual to cover the new features.




[jira] [Commented] (HIVE-8248) TestHCatLoader.testReadDataPrimitiveTypes() occasionally fails


[ 
https://issues.apache.org/jira/browse/HIVE-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14147562#comment-14147562
 ] 

Hive QA commented on HIVE-8248:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12671062/HIVE-8248.1.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 6345 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.ql.parse.TestParse.testParse_union
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/974/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/974/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-974/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12671062

 TestHCatLoader.testReadDataPrimitiveTypes() occasionally fails
 --

 Key: HIVE-8248
 URL: https://issues.apache.org/jira/browse/HIVE-8248
 Project: Hive
  Issue Type: Bug
  Components: HCatalog, Tests
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-8248.1.patch


 This occasionally shows up in the test failures.
 It looks like testConvertBooleanToInt() sets 
 HCatConstants.HCAT_DATA_CONVERT_BOOLEAN_TO_INTEGER=true, and this sets the 
 static configuration in HCatContext.INSTANCE. If testConvertBooleanToInt() 
 runs first, then this setting sticks around and converts the boolean values 
 to int in testReadDataPrimitiveTypes().
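
The order dependence described here can be sketched in plain Java. Everything below is hypothetical: SHARED_CONF stands in for a process-wide static configuration such as the one the description attributes to HCatContext.INSTANCE, and the key name is made up. The sketch only shows why a flag set by one test stays visible to the next one unless it is reset.

```java
import java.util.HashMap;
import java.util.Map;

final class StaticConfigLeakSketch {
    // Hypothetical stand-in for a process-wide singleton configuration.
    static final Map<String, String> SHARED_CONF = new HashMap<>();
    // Made-up key; it is not the real HCatalog constant.
    static final String BOOLEAN_TO_INT = "convert.boolean.to.integer";

    // Reads a boolean the way a loader might: as 0/1 when the flag is set.
    static Object read(boolean value) {
        boolean asInt = Boolean.parseBoolean(SHARED_CONF.get(BOOLEAN_TO_INT));
        return asInt ? (Object) Integer.valueOf(value ? 1 : 0) : (Object) Boolean.valueOf(value);
    }

    // What a per-test teardown could do so the flag does not outlive the test.
    static void resetSharedConf() {
        SHARED_CONF.clear();
    }

    public static void main(String[] args) {
        SHARED_CONF.put(BOOLEAN_TO_INT, "true"); // "test 1" sets the flag
        System.out.println(read(true));          // "test 2" now gets an Integer
        resetSharedConf();                       // teardown restores the default
        System.out.println(read(true));          // a Boolean again
    }
}
```

Resetting shared state in a teardown step, or avoiding static state in the first place, makes the outcome independent of test ordering.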




[jira] [Updated] (HIVE-8106) Enable vectorization for spark [spark branch]


 [ 
https://issues.apache.org/jira/browse/HIVE-8106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinna Rao Lalam updated HIVE-8106:
---
Attachment: HIVE-8106.2-spark.patch

 Enable vectorization for spark [spark branch]
 -

 Key: HIVE-8106
 URL: https://issues.apache.org/jira/browse/HIVE-8106
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Attachments: HIVE-8106-spark.patch, HIVE-8106.1-spark.patch, 
 HIVE-8106.2-spark.patch


 Enable the vectorization optimization on spark




[jira] [Updated] (HIVE-8106) Enable vectorization for spark [spark branch]


 [ 
https://issues.apache.org/jira/browse/HIVE-8106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinna Rao Lalam updated HIVE-8106:
---
Status: Patch Available  (was: Open)

Updated the patch.

 Enable vectorization for spark [spark branch]
 -

 Key: HIVE-8106
 URL: https://issues.apache.org/jira/browse/HIVE-8106
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Attachments: HIVE-8106-spark.patch, HIVE-8106.1-spark.patch, 
 HIVE-8106.2-spark.patch


 Enable the vectorization optimization on spark




[jira] [Commented] (HIVE-8223) CBO Trunk Merge: partition_wise_fileformat2 select result depends on ordering


[ 
https://issues.apache.org/jira/browse/HIVE-8223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14147612#comment-14147612
 ] 

Hive QA commented on HIVE-8223:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12671064/HIVE-8223.02.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6346 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority
org.apache.hadoop.hive.ql.parse.TestParse.testParse_union
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/975/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/975/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-975/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12671064

 CBO Trunk Merge: partition_wise_fileformat2 select result depends on ordering
 -

 Key: HIVE-8223
 URL: https://issues.apache.org/jira/browse/HIVE-8223
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-8223.01.patch, HIVE-8223.02.patch, HIVE-8223.patch







[jira] [Created] (HIVE-8251) An error occurred when trying to close the Operator running your custom script.

2014-09-25 Thread someshwar kale (JIRA)

someshwar kale created HIVE-8251:


 Summary: An error occurred when trying to close the Operator 
running your custom script.
 Key: HIVE-8251
 URL: https://issues.apache.org/jira/browse/HIVE-8251
 Project: Hive
  Issue Type: Bug
  Components: Contrib
Affects Versions: 0.12.0
 Environment: MapR distribution
Reporter: someshwar kale


We are trying to plug a custom map/reduce script into Hive, but we get the 
error below:

java.lang.RuntimeException: Hive Runtime Error while closing operators
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:240)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:418)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:349)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
    at org.apache.hadoop.mapred.Child.main(Child.java:264)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: [Error 20003]: An error occurred when trying to close the Operator running your custom script.
    at org.apache.hadoop.hive.ql.exec.ScriptOperator.close(ScriptOperator.java:514)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:207)
    ... 8 more


FAILED: Execution Error, return code 20003 from 
org.apache.hadoop.hive.ql.exec.mr.MapRedTask. An error occurred when trying to 
close the Operator running your custom script.





[jira] [Created] (HIVE-8252) Generic cryptographic codec and key management framework

2014-09-25 Thread Xiaomeng Huang (JIRA)

Xiaomeng Huang created HIVE-8252:


 Summary: Generic cryptographic codec and key management framework
 Key: HIVE-8252
 URL: https://issues.apache.org/jira/browse/HIVE-8252
 Project: Hive
  Issue Type: Sub-task
Reporter: Xiaomeng Huang
Assignee: Xiaomeng Huang







[jira] [Updated] (HIVE-8049) Transparent column level encryption using kms

2014-09-25 Thread Xiaomeng Huang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-8049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaomeng Huang updated HIVE-8049:
-
Summary: Transparent column level encryption using kms  (was: Transparent 
column level encryption using key management)

 Transparent column level encryption using kms
 -

 Key: HIVE-8049
 URL: https://issues.apache.org/jira/browse/HIVE-8049
 Project: Hive
  Issue Type: Sub-task
Reporter: Xiaomeng Huang
Assignee: Xiaomeng Huang
 Attachments: HIVE-8049.001.patch


 This patch implements transparent column-level encryption. Users don't need to 
 set anything when they query tables.
 # Set up KMS and configure kms-acls.xml (e.g. user1 and root have permission to get 
 the key)
 {code}
  <property>
    <name>hadoop.kms.acl.GET</name>
    <value>user1 root</value>
    <description>
      ACL for get-key-version and get-current-key operations.
    </description>
  </property>
 {code}
 # set hive-site.xml 
 {code}
  <property>
    <name>hadoop.security.kms.uri</name>
    <value>http://localhost:16000/kms</value>
  </property>
 {code}
 # create an encrypted table
 {code}
 -- region-aes-column.q
 drop table region_aes_column;
 create table region_aes_column (r_regionkey int, r_name string) ROW FORMAT 
 SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
   WITH SERDEPROPERTIES ('column.encode.columns'='r_name', 
 'column.encode.classname'='org.apache.hadoop.hive.serde2.aes.AESRewriter')
   STORED AS TEXTFILE TBLPROPERTIES("hive.encrypt.keynames"="hive.k1");
 insert overwrite table region_aes_column
 select
   r_regionkey, r_name
 from region;
 {code}
 # Query the table as different users. Decryption is transparent: users with 
 access to the key don't need to set anything.
 {code}
 [root@huang1 hive_data]# hive
 hive> select * from region_aes_column;
 OK
 0 AFRICA
 1 AMERICA
 2 ASIA
 3 EUROPE
 4 MIDDLE EAST
 Time taken: 0.9 seconds, Fetched: 5 row(s)
 [root@huang1 hive_data]# su user1
 [user1@huang1 hive_data]$ hive
 hive> select * from region_aes_column;
 OK
 0 AFRICA
 1 AMERICA
 2 ASIA
 3 EUROPE
 4 MIDDLE EAST
 Time taken: 0.899 seconds, Fetched: 5 row(s)
 [root@huang1 hive_data]# su user2
 [user2@huang1 hive_data]$ hive
 hive> select * from region_aes_column;
 OK
 0 RcQycWVD
 1 Rc8lam9Bxg==
 2 RdEpeQ==
 3 Qdcyd3ZH
 4 ScskfGpHp8KIIuY=
 Time taken: 0.749 seconds, Fetched: 5 row(s)
 {code}
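
One thing the output above lets a reader check is that the values returned to user2 are Base64 text whose decoded length equals the length of the corresponding plaintext. The sketch below only verifies that observation with the standard java.util.Base64 decoder; the class and method names are hypothetical, and nothing here reflects how AESRewriter is actually implemented.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, not part of the patch: compares plaintext length
// with the decoded length of the Base64 value shown to user2.
final class CipherLengths {
    private CipherLengths() {}

    // Number of raw bytes encoded by a Base64 string.
    static int decodedLength(String base64) {
        return Base64.getDecoder().decode(base64).length;
    }

    // True when the decoded value has exactly as many bytes as the plaintext.
    static boolean sameLength(String plaintext, String base64) {
        return plaintext.getBytes(StandardCharsets.UTF_8).length == decodedLength(base64);
    }
}
```

For example, CipherLengths.sameLength("AFRICA", "RcQycWVD") compares the first row of the root output with the first row of the user2 output.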




[jira] [Commented] (HIVE-8106) Enable vectorization for spark [spark branch]


[ 
https://issues.apache.org/jira/browse/HIVE-8106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14147665#comment-14147665
 ] 

Hive QA commented on HIVE-8106:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12671191/HIVE-8106.2-spark.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 6506 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_fs_default_name2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vector_cast_constant
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/156/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/156/console
Test logs: 
http://ec2-54-176-176-199.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-156/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12671191

 Enable vectorization for spark [spark branch]
 -

 Key: HIVE-8106
 URL: https://issues.apache.org/jira/browse/HIVE-8106
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Attachments: HIVE-8106-spark.patch, HIVE-8106.1-spark.patch, 
 HIVE-8106.2-spark.patch


 Enable the vectorization optimization on spark




[jira] [Commented] (HIVE-8199) CBO Trunk Merge: quote2 test fails due to incorrect literal translation


[ 
https://issues.apache.org/jira/browse/HIVE-8199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14147679#comment-14147679
 ] 

Hive QA commented on HIVE-8199:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12671071/HIVE-8199.02.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 6345 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.ql.parse.TestParse.testParse_union
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/976/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/976/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-976/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12671071

 CBO Trunk Merge: quote2 test fails due to incorrect literal translation
 ---

 Key: HIVE-8199
 URL: https://issues.apache.org/jira/browse/HIVE-8199
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-8199.01.patch, HIVE-8199.02.patch, HIVE-8199.patch


 Quoting of quotes and slashes is lost in translation back from CBO to AST, it 
 seems




[jira] [Updated] (HIVE-8106) Enable vectorization for spark [spark branch]


 [ 
https://issues.apache.org/jira/browse/HIVE-8106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinna Rao Lalam updated HIVE-8106:
---
Status: Open  (was: Patch Available)

{code:xml}
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vector_cast_constant
{code}

This failed because the output order changed. I tried adding ORDER BY to the 
query, but that fails because of HIVE-8180. 
This test will be enabled as part of HIVE-8180.

 Enable vectorization for spark [spark branch]
 -

 Key: HIVE-8106
 URL: https://issues.apache.org/jira/browse/HIVE-8106
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Attachments: HIVE-8106-spark.patch, HIVE-8106.1-spark.patch, 
 HIVE-8106.2-spark.patch


 Enable the vectorization optimization on spark




[jira] [Commented] (HIVE-8180) Update SparkReduceRecordHandler for processing the vectors [spark branch]


[ 
https://issues.apache.org/jira/browse/HIVE-8180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14147707#comment-14147707
 ] 

Chinna Rao Lalam commented on HIVE-8180:


Enable the vector_cast_constant.q test as part of this issue.

 Update SparkReduceRecordHandler for processing the vectors [spark branch]
 -

 Key: HIVE-8180
 URL: https://issues.apache.org/jira/browse/HIVE-8180
 Project: Hive
  Issue Type: Bug
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam

 Update SparkReduceRecordHandler for processing the vectors.




[jira] [Updated] (HIVE-8106) Enable vectorization for spark [spark branch]


 [ 
https://issues.apache.org/jira/browse/HIVE-8106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinna Rao Lalam updated HIVE-8106:
---
Attachment: HIVE-8106.3-spark.patch

Updated the patch with the failing test disabled.

 Enable vectorization for spark [spark branch]
 -

 Key: HIVE-8106
 URL: https://issues.apache.org/jira/browse/HIVE-8106
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Attachments: HIVE-8106-spark.patch, HIVE-8106.1-spark.patch, 
 HIVE-8106.2-spark.patch, HIVE-8106.3-spark.patch


 Enable the vectorization optimization on spark




[jira] [Updated] (HIVE-8106) Enable vectorization for spark [spark branch]


 [ 
https://issues.apache.org/jira/browse/HIVE-8106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinna Rao Lalam updated HIVE-8106:
---
Status: Patch Available  (was: Open)

 Enable vectorization for spark [spark branch]
 -

 Key: HIVE-8106
 URL: https://issues.apache.org/jira/browse/HIVE-8106
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Attachments: HIVE-8106-spark.patch, HIVE-8106.1-spark.patch, 
 HIVE-8106.2-spark.patch, HIVE-8106.3-spark.patch


 Enable the vectorization optimization on spark




[jira] [Commented] (HIVE-7382) Create a MiniSparkCluster and set up a testing framework [Spark Branch]

2014-09-25 Thread Rui Li (JIRA)

[
https://issues.apache.org/jira/browse/HIVE-7382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14147714#comment-14147714
]

Rui Li commented on HIVE-7382:
--

Hi [~xuefuz],

I hit the same problem as Szehon mentioned.

After some digging, I think this is because in local-cluster mode Spark
launches separate JVMs for the executor backends, so it needs to run some scripts to
determine the proper class path (and probably other things). Please refer to
{{CommandUtils.buildCommandSeq}}, which is called when {{ExecutorRunner}} tries
to launch the executor backend.
Therefore local-cluster mode requires an installation of Spark, with spark.home
or spark.test.home properly set. I think this is all right if
local-cluster is used only for Spark unit tests. But it shouldn't be used for
user applications, because it's not really local: it requires an
installation of Spark.

To verify my guess, I ran some Hive queries (not tests) on Spark without setting
spark.home. They ran well in standalone and local modes, but hit the same error
in local-cluster mode.
To make it work, I had to export SPARK_HOME properly. (Please note that setting
spark.home or spark.testing + spark.test.home in SparkConf doesn't help.)
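
As an illustration of the constraint described above, a test harness could fail fast when a local-cluster master is requested without SPARK_HOME in the environment. This is a minimal sketch under assumptions: SparkHomeCheck and its methods are hypothetical names, not Hive or Spark APIs, and the check only looks at the "local-cluster" prefix of the master URL.

```java
import java.util.Map;

// Hypothetical guard, not part of Hive or Spark.
final class SparkHomeCheck {
    private SparkHomeCheck() {}

    // local-cluster masters look like "local-cluster[2,2,1024]".
    static boolean isLocalCluster(String master) {
        return master != null && master.startsWith("local-cluster");
    }

    // Throws if local-cluster mode is requested but SPARK_HOME is not exported.
    static void requireSparkHome(String master, Map<String, String> env) {
        if (!isLocalCluster(master)) {
            return; // local and standalone modes ran fine without it
        }
        String home = env.get("SPARK_HOME");
        if (home == null || home.isEmpty()) {
            throw new IllegalStateException(
                "local-cluster mode needs SPARK_HOME to point at a Spark installation");
        }
    }
}
```

A caller would pass the real environment, e.g. SparkHomeCheck.requireSparkHome(master, System.getenv()), before creating the Spark context.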

What's your opinion?

Create a MiniSparkCluster and set up a testing framework [Spark Branch]
---

Key: HIVE-7382
URL: https://issues.apache.org/jira/browse/HIVE-7382
Project: Hive
Issue Type: Sub-task
Components: Spark
Reporter: Xuefu Zhang
Assignee: Rui Li
Labels: Spark-M1

To automatically test Hive functionality over the Spark execution engine, we need
to create a test framework that can execute Hive queries with Spark as the
backend. For that, we should create a MiniSparkCluster, similar to
other execution engines.
Spark has a way to create a local cluster with a few processes on the local
machine, where each process is a worker node. It's fairly close to a real Spark
cluster. Our mini cluster can be based on that.
For more info, please refer to the design doc on wiki.


[jira] [Commented] (HIVE-7723) Explain plan for complex query with lots of partitions is slow due to in-efficient collection used to find a matching ReadEntity


[ 
https://issues.apache.org/jira/browse/HIVE-7723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14147745#comment-14147745
 ] 

Hive QA commented on HIVE-7723:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12671070/HIVE-7723.6.patch

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 6346 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_view_as_select
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_explain
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_escape1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_escape2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_explain_dependency
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_explain_dependency2
org.apache.hadoop.hive.ql.parse.TestParse.testParse_union
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/977/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/977/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-977/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12671070

 Explain plan for complex query with lots of partitions is slow due to 
 in-efficient collection used to find a matching ReadEntity
 

 Key: HIVE-7723
 URL: https://issues.apache.org/jira/browse/HIVE-7723
 Project: Hive
  Issue Type: Bug
  Components: CLI, Physical Optimizer
Affects Versions: 0.13.1
Reporter: Mostafa Mokhtar
Assignee: Mostafa Mokhtar
 Fix For: 0.14.0

 Attachments: HIVE-7723.1.patch, HIVE-7723.2.patch, HIVE-7723.3.patch, 
 HIVE-7723.4.patch, HIVE-7723.5.patch, HIVE-7723.6.patch


 Explain on TPC-DS query 64 took 11 seconds. When the CLI was profiled, it 
 showed that ReadEntity.equals was taking ~40% of the CPU.
 ReadEntity.equals is called from the snippet below.
 The set is iterated over again and again to get the actual match; a HashMap 
 is a better option for this case, as Set doesn't have a get method.
 Also, for ReadEntity, equals is case-insensitive while hash is case-sensitive, 
 which is undesired behavior.
 {code}
 public static ReadEntity addInput(Set<ReadEntity> inputs, ReadEntity newInput) {
   // If the input is already present, make sure the new parent is added to the input.
   if (inputs.contains(newInput)) {
     for (ReadEntity input : inputs) {
       if (input.equals(newInput)) {
         if ((newInput.getParents() != null) && (!newInput.getParents().isEmpty())) {
           input.getParents().addAll(newInput.getParents());
           input.setDirect(input.isDirect() || newInput.isDirect());
         }
         return input;
       }
     }
     assert false;
   } else {
     inputs.add(newInput);
     return newInput;
   }
   // make compile happy
   return null;
 }
 {code}
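
The HashMap alternative suggested above could look roughly like the following. This is a sketch under assumptions, not the attached patch: Entity is a hypothetical stand-in for ReadEntity, and the map is keyed on a lower-cased name so that lookup agrees with a case-insensitive notion of equality.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for ReadEntity; only the fields the sketch needs.
final class Entity {
    final String name;
    final List<Entity> parents = new ArrayList<>();
    boolean direct;

    Entity(String name, boolean direct) {
        this.name = name;
        this.direct = direct;
    }

    // Lower-cased key, so hashing and equality treat case the same way.
    String key() {
        return name.toLowerCase(Locale.ROOT);
    }
}

final class Inputs {
    private Inputs() {}

    // One hash lookup replaces contains() plus a linear scan of the set.
    static Entity addInput(Map<String, Entity> inputs, Entity newInput) {
        Entity existing = inputs.get(newInput.key());
        if (existing == null) {
            inputs.put(newInput.key(), newInput);
            return newInput;
        }
        // Input already present: merge the new parents into it, as in the original.
        if (!newInput.parents.isEmpty()) {
            existing.parents.addAll(newInput.parents);
            existing.direct = existing.direct || newInput.direct;
        }
        return existing;
    }
}
```

With a HashMap<String, Entity> as the container, each call costs one expected constant-time lookup instead of an iteration over all inputs, which is where the profile above showed the time going.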
 This is the query used : 
 {code}
 select cs1.product_name ,cs1.store_name ,cs1.store_zip ,cs1.b_street_number 
 ,cs1.b_streen_name ,cs1.b_city
  ,cs1.b_zip ,cs1.c_street_number ,cs1.c_street_name ,cs1.c_city 
 ,cs1.c_zip ,cs1.syear ,cs1.cnt
  ,cs1.s1 ,cs1.s2 ,cs1.s3
  ,cs2.s1 ,cs2.s2 ,cs2.s3 ,cs2.syear ,cs2.cnt
 from
 (select i_product_name as product_name ,i_item_sk as item_sk ,s_store_name as 
 store_name
  ,s_zip as store_zip ,ad1.ca_street_number as b_street_number 
 ,ad1.ca_street_name as b_streen_name
  ,ad1.ca_city as b_city ,ad1.ca_zip as b_zip ,ad2.ca_street_number as 
 c_street_number
  ,ad2.ca_street_name as c_street_name ,ad2.ca_city as c_city ,ad2.ca_zip 
 as c_zip
  ,d1.d_year as syear ,d2.d_year as fsyear ,d3.d_year as s2year ,count(*) 
 as cnt
  ,sum(ss_wholesale_cost) as s1 ,sum(ss_list_price) as s2 
 ,sum(ss_coupon_amt) as s3
   FROM   store_sales
 JOIN store_returns ON store_sales.ss_item_sk = 
 store_returns.sr_item_sk and store_sales.ss_ticket_number = 
 store_returns.sr_ticket_number
 JOIN customer ON store_sales.ss_customer_sk = customer.c_customer_sk
 JOIN date_dim d1 ON store_sales.ss_sold_date_sk = d1.d_date_sk
 JOIN date_dim d2 ON customer.c_first_sales_date_sk = d2.d_date_sk 
 JOIN date_dim d3 ON customer.c_first_shipto_date_sk = d3.d_date_sk
 JOIN store ON

[jira] [Created] (HIVE-8253) Compaction throwing java.lang.NullPointerException at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.heartbeat(DbTxnManager.java:244)

2014-09-25 Thread Supriya Sahay (JIRA)

Supriya Sahay created HIVE-8253:
---

 Summary: Compaction throwing java.lang.NullPointerException at 
org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.heartbeat(DbTxnManager.java:244)
 Key: HIVE-8253
 URL: https://issues.apache.org/jira/browse/HIVE-8253
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Supriya Sahay


While trying to INSERT OVERWRITE into a bucketed table using transactions, I 
get the error below:
java.lang.NullPointerException
at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.heartbeat(DbTxnManager.java:244)
at org.apache.hadoop.hive.ql.exec.Heartbeater.heartbeat(Heartbeater.java:79)
at org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:242)
at org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:547)
at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:426)
at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:136)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1508)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1275)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1093)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:916)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:906)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:423)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:793)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Ended Job = job_1411574868628_0015 with exception 'java.lang.NullPointerException(null)'

This is what I was doing:
hive> CREATE EXTERNAL TABLE BUCKET_EMP (ID INT, NAME STRING, VAR STRING)
 PARTITIONED BY (COUNTRY STRING)
 CLUSTERED BY(VAR) INTO 3 BUCKETS
 ROW FORMAT DELIMITED
 FIELDS TERMINATED BY '\t'
 LINES TERMINATED BY '\n'
 STORED AS ORC
 LOCATION '/tmp/bucket_emp';
hive> SELECT * FROM BUCKET_EMP;
OK
7   G   x   AUS
3   C   1   AUS
8   H   y   IND
10  J   y   UK
2   B   y   UK
6   F   2   UK
4   D   2   UK
9   I   x   US
1   A   x   US
5   E   1   US

hive> SET hive.exec.dynamic.partition = true;
hive> SET hive.exec.dynamic.partition.mode = nonstrict;
hive> SET hive.txn.manager = org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
hive> SET hive.compactor.initiator.on = true;
hive> SET hive.compactor.worker.threads = 3;
hive> SET hive.compactor.check.interval = 300;
hive> SET hive.compactor.delta.num.threshold = 1;

hive> INSERT OVERWRITE TABLE BUCKET_EMP
 PARTITION(COUNTRY)
 SELECT ID, NAME,
 CASE WHEN VAR = '1' THEN 'X' WHEN VAR = '2' THEN 'Y' END AS VAR, COUNTRY
 FROM EMP;






[jira] [Created] (HIVE-8254) Compaction throwing java.lang.NullPointerException at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.heartbeat(DbTxnManager.java:244)

2014-09-25 Thread Supriya Sahay (JIRA)

Supriya Sahay created HIVE-8254:
---

 Summary: Compaction throwing java.lang.NullPointerException at 
org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.heartbeat(DbTxnManager.java:244)
 Key: HIVE-8254
 URL: https://issues.apache.org/jira/browse/HIVE-8254
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Supriya Sahay


While trying to INSERT OVERWRITE into a bucketed table using transactions, I 
get the error below:
java.lang.NullPointerException
at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.heartbeat(DbTxnManager.java:244)
at org.apache.hadoop.hive.ql.exec.Heartbeater.heartbeat(Heartbeater.java:79)
at org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:242)
at org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:547)
at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:426)
at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:136)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1508)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1275)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1093)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:916)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:906)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:423)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:793)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Ended Job = job_1411574868628_0015 with exception 'java.lang.NullPointerException(null)'

This is what I was doing:
hive> CREATE EXTERNAL TABLE BUCKET_EMP (ID INT, NAME STRING, VAR STRING)
 PARTITIONED BY (COUNTRY STRING)
 CLUSTERED BY(VAR) INTO 3 BUCKETS
 ROW FORMAT DELIMITED
 FIELDS TERMINATED BY '\t'
 LINES TERMINATED BY '\n'
 STORED AS ORC
 LOCATION '/tmp/bucket_emp';
hive> SELECT * FROM BUCKET_EMP;
OK
7   G   x   AUS
3   C   1   AUS
8   H   y   IND
10  J   y   UK
2   B   y   UK
6   F   2   UK
4   D   2   UK
9   I   x   US
1   A   x   US
5   E   1   US

hive> SET hive.exec.dynamic.partition = true;
hive> SET hive.exec.dynamic.partition.mode = nonstrict;
hive> SET hive.txn.manager = org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
hive> SET hive.compactor.initiator.on = true;
hive> SET hive.compactor.worker.threads = 3;
hive> SET hive.compactor.check.interval = 300;
hive> SET hive.compactor.delta.num.threshold = 1;

hive> INSERT OVERWRITE TABLE BUCKET_EMP
 PARTITION(COUNTRY)
 SELECT ID, NAME,
 CASE WHEN VAR = '1' THEN 'X' WHEN VAR = '2' THEN 'Y' END AS VAR, COUNTRY
 FROM EMP;






[jira] [Updated] (HIVE-8254) Transaction throwing java.lang.NullPointerException at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.heartbeat(DbTxnManager.java:244)

2014-09-25 Thread Supriya Sahay (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-8254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Supriya Sahay updated HIVE-8254:

Summary: Transaction throwing java.lang.NullPointerException at 
org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.heartbeat(DbTxnManager.java:244) 
 (was: Compaction throwing java.lang.NullPointerException at 
org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.heartbeat(DbTxnManager.java:244))

 Transaction throwing java.lang.NullPointerException at 
 org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.heartbeat(DbTxnManager.java:244)
 --

 Key: HIVE-8254
 URL: https://issues.apache.org/jira/browse/HIVE-8254
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Supriya Sahay

 While trying to INSERT OVERWRITE into a bucketed table using transactions, I 
 get the error below:
 java.lang.NullPointerException
 at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.heartbeat(DbTxnManager.java:244)
 at org.apache.hadoop.hive.ql.exec.Heartbeater.heartbeat(Heartbeater.java:79)
 at org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:242)
 at org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:547)
 at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:426)
 at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:136)
 at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
 at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
 at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1508)
 at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1275)
 at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1093)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:916)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:906)
 at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:423)
 at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:793)
 at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686)
 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
 Ended Job = job_1411574868628_0015 with exception 'java.lang.NullPointerException(null)'
 This is what I was doing:
 hive> CREATE EXTERNAL TABLE BUCKET_EMP (ID INT, NAME STRING, VAR STRING)
  PARTITIONED BY (COUNTRY STRING)
  CLUSTERED BY(VAR) INTO 3 BUCKETS
  ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\t'
  LINES TERMINATED BY '\n'
  STORED AS ORC
  LOCATION '/tmp/bucket_emp';
 hive> SELECT * FROM BUCKET_EMP;
 OK
 7   G   x   AUS
 3   C   1   AUS
 8   H   y   IND
 10  J   y   UK
 2   B   y   UK
 6   F   2   UK
 4   D   2   UK
 9   I   x   US
 1   A   x   US
 5   E   1   US
 hive> SET hive.exec.dynamic.partition = true;
 hive> SET hive.exec.dynamic.partition.mode = nonstrict;
 hive> SET hive.txn.manager = org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
 hive> SET hive.compactor.initiator.on = true;
 hive> SET hive.compactor.worker.threads = 3;
 hive> SET hive.compactor.check.interval = 300;
 hive> SET hive.compactor.delta.num.threshold = 1;
 hive> INSERT OVERWRITE TABLE BUCKET_EMP
  PARTITION(COUNTRY)
  SELECT ID, NAME,
  CASE WHEN VAR = '1' THEN 'X' WHEN VAR = '2' THEN 'Y' END AS VAR, COUNTRY
  FROM EMP;




[jira] [Commented] (HIVE-6148) Support arbitrary structs stored in HBase

2014-09-25 Thread Swarnim Kulkarni (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-6148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14147754#comment-14147754
 ] 

Swarnim Kulkarni commented on HIVE-6148:


The above test failure is unrelated to the change.

 Support arbitrary structs stored in HBase
 -

 Key: HIVE-6148
 URL: https://issues.apache.org/jira/browse/HIVE-6148
 Project: Hive
  Issue Type: Improvement
  Components: HBase Handler
Affects Versions: 0.12.0
Reporter: Swarnim Kulkarni
Assignee: Swarnim Kulkarni
 Attachments: HIVE-6148.1.patch.txt, HIVE-6148.2.patch.txt


 We should add support to be able to query arbitrary structs stored in HBase.




[jira] [Commented] (HIVE-8106) Enable vectorization for spark [spark branch]


[ 
https://issues.apache.org/jira/browse/HIVE-8106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14147786#comment-14147786
 ] 

Hive QA commented on HIVE-8106:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12671212/HIVE-8106.3-spark.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6505 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_fs_default_name2
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/157/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/157/console
Test logs: 
http://ec2-54-176-176-199.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-157/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12671212

 Enable vectorization for spark [spark branch]
 -

 Key: HIVE-8106
 URL: https://issues.apache.org/jira/browse/HIVE-8106
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Attachments: HIVE-8106-spark.patch, HIVE-8106.1-spark.patch, 
 HIVE-8106.2-spark.patch, HIVE-8106.3-spark.patch


 Enable the vectorization optimization on spark




[jira] [Commented] (HIVE-8106) Enable vectorization for spark [spark branch]


[ 
https://issues.apache.org/jira/browse/HIVE-8106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14147799#comment-14147799
 ] 

Xuefu Zhang commented on HIVE-8106:
---

+1

 Enable vectorization for spark [spark branch]
 -

 Key: HIVE-8106
 URL: https://issues.apache.org/jira/browse/HIVE-8106
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Attachments: HIVE-8106-spark.patch, HIVE-8106.1-spark.patch, 
 HIVE-8106.2-spark.patch, HIVE-8106.3-spark.patch


 Enable the vectorization optimization on spark




[jira] [Updated] (HIVE-8106) Enable vectorization for spark [spark branch]


 [ 
https://issues.apache.org/jira/browse/HIVE-8106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-8106:
--
   Resolution: Fixed
Fix Version/s: spark-branch
   Status: Resolved  (was: Patch Available)

Patch committed to Spark branch. Thanks to Chinna for the contribution.

 Enable vectorization for spark [spark branch]
 -

 Key: HIVE-8106
 URL: https://issues.apache.org/jira/browse/HIVE-8106
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Fix For: spark-branch

 Attachments: HIVE-8106-spark.patch, HIVE-8106.1-spark.patch, 
 HIVE-8106.2-spark.patch, HIVE-8106.3-spark.patch


 Enable the vectorization optimization on spark




Re: Patches to release branches

2014-09-25 Thread Alan Gates

So combining Mithun's proposal with the input from Sergey and Gopal, I 
propose:


1) When a contributor provides a patch for a high priority bug (data 
corruption, wrong results, crashes) he or she should also provide a 
patch against the branch of the latest feature release.  For example, 
once Hive 0.14 is released that will mean providing patches for trunk 
and the 0.14 branch.  I believe the test infrastructure already supports 
running the tests against alternate branches (is that correct Brock?) so 
the patches can be tested against both trunk and the release branch.
2) The release manager of the feature release (e.g. Hive 0.14) will be 
responsible for maintaining the branch with these patch fixes.  It is 
his or her call whether a given bug merits inclusion on the branch.  If 
a contributor provides a patch for trunk which in the release manager's 
opinion should also be on the branch, then the release manager can ask 
the contributor to also provide a patch for the branch.  Since whoever 
manages the feature release may not want to or be able to continue 
managing the branch post release, these release manager duties are 
transferable.  But the transfer should be clear and announced on the dev 
list.
3) In order to make these patch fixes available to Hive users, we should 
strive to have frequent maintenance releases.  The frequency will depend 
on the number of bug fixes going into the branch, but 6-8 weeks seems like a 
good goal.


Hive 0.14 could be the test run of this process to see what works and 
what doesn't.  Seem reasonable?


Alan.




Mithun Radhakrishnan mailto:mithun.radhakrish...@yahoo.com.INVALID
September 15, 2014 at 11:16
Hey, Gopal.
Thank you, that makes sense. I'll concede that delaying the initial 
commit till a patch is available for the recent-most release-branch 
won't always be viable. While I'd expect it to be easier to patch the 
release-branch early than late, if we (the community) would prefer a 
cloned JIRA in a separate queue, of course I'll go along. Anything to 
make the release-branch usable out of the box, without further patching.
Forgive my ignorance of the relevant protocol... Would this be a 
change in release/patch process? Does this need codifying? I'm not 
sure if this needs voting on, or even who might call a vote on this.

Mithun

On Thursday, September 11, 2014 3:15 PM, Gopal V gop...@apache.org 
wrote:




This is a very sensible proposal.

As a start, I think we need to have people open backport JIRAs, for such
issues - even if a direct merge might be hard to do with the same patch.

Immediately cherry-picking the same patch should be done if it applies
with very little modifications - but reworking the patch for an older
release is a significant overhead for the initial commit.

At the very least, we need to get past the unknowns that currently
surround the last point release against the bugs already fixed in trunk.

Once we have a backport queue, I'm sure the RMs in charge of the branch
can moderate the community on the complexity and risk factors involved.

Cheers,
Gopal





[jira] [Commented] (HIVE-8171) Tez and Vectorized Reduce doesn't create scratch columns


[ 
https://issues.apache.org/jira/browse/HIVE-8171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14147844#comment-14147844
 ] 

Hive QA commented on HIVE-8171:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12671082/HIVE-8171.03.patch

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 6348 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_timestamp_funcs
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_timestamp_funcs
org.apache.hive.hcatalog.pig.TestOrcHCatLoader.testReadDataPrimitiveTypes
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/978/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/978/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-978/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12671082

 Tez and Vectorized Reduce doesn't create scratch columns
 

 Key: HIVE-8171
 URL: https://issues.apache.org/jira/browse/HIVE-8171
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8171.01.patch, HIVE-8171.02.patch, 
 HIVE-8171.03.patch


 This query fails with an ArrayIndexOutOfBoundsException in the reducer.
 {code}
 create table varchar_3 (
   field varchar(25)
 ) stored as orc;
 insert into table varchar_3 select cint from alltypesorc limit 10;
 {code}




[jira] [Updated] (HIVE-8201) Remove hardwiring to HiveInputFormat in acid qfile tests


 [ 
https://issues.apache.org/jira/browse/HIVE-8201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-8201:
-
   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

Patch committed.  Thanks Owen for the review.

 Remove hardwiring to HiveInputFormat in acid qfile tests
 

 Key: HIVE-8201
 URL: https://issues.apache.org/jira/browse/HIVE-8201
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.14.0

 Attachments: HIVE-8201.2.patch, HIVE-8201.patch


 Now that HIVE-7812 is checked in we should remove the hardwiring to 
 HiveInputFormat for the qfile tests.




[jira] [Commented] (HIVE-8240) VectorColumnAssignFactory throws Incompatible Bytes vector column and primitive category VARCHAR


[ 
https://issues.apache.org/jira/browse/HIVE-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14147847#comment-14147847
 ] 

Hive QA commented on HIVE-8240:
---



{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12671094/HIVE-8240.02.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/979/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/979/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-979/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.6.0_34/bin:/usr/local/apache-maven-3.0.5/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.6.0_34/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.6.0_34/bin:/usr/local/apache-maven-3.0.5/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.6.0_34/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-979/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ svn = \s\v\n ]]
+ [[ -n '' ]]
+ [[ -d apache-svn-trunk-source ]]
+ [[ ! -d apache-svn-trunk-source/.svn ]]
+ [[ ! -d apache-svn-trunk-source ]]
+ cd apache-svn-trunk-source
+ svn revert -R .
Reverted 'itests/src/test/resources/testconfiguration.properties'
Reverted 'ql/src/test/results/clientpositive/vector_char_simple.q.out'
Reverted 'ql/src/test/results/clientpositive/vector_varchar_simple.q.out'
Reverted 'ql/src/test/results/clientpositive/tez/vector_char_simple.q.out'
Reverted 'ql/src/test/results/clientpositive/tez/vector_varchar_simple.q.out'
Reverted 'ql/src/test/queries/clientpositive/vector_char_simple.q'
Reverted 'ql/src/test/queries/clientpositive/vector_varchar_simple.q'
Reverted 
'ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java'
Reverted 
'ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedRowBatchCtx.java'
Reverted 
'ql/src/java/org/apache/hadoop/hive/ql/exec/tez/ReduceRecordProcessor.java'
Reverted 
'ql/src/java/org/apache/hadoop/hive/ql/exec/tez/ReduceRecordSource.java'
++ awk '{print $2}'
++ egrep -v '^X|^Performing status on external'
++ svn status --no-ignore
+ rm -rf target datanucleus.log ant/target shims/target shims/0.20/target 
shims/0.20S/target shims/0.23/target shims/aggregator/target 
shims/common/target shims/common-secure/target packaging/target 
hbase-handler/target testutils/target jdbc/target metastore/target 
itests/target itests/hcatalog-unit/target itests/test-serde/target 
itests/qtest/target itests/hive-unit-hadoop2/target itests/hive-minikdc/target 
itests/hive-unit/target itests/custom-serde/target itests/util/target 
hcatalog/target hcatalog/core/target hcatalog/streaming/target 
hcatalog/server-extensions/target hcatalog/hcatalog-pig-adapter/target 
hcatalog/webhcat/svr/target hcatalog/webhcat/java-client/target 
accumulo-handler/target hwi/target common/target common/src/gen service/target 
contrib/target serde/target beeline/target odbc/target cli/target 
ql/dependency-reduced-pom.xml ql/target
+ svn update

Fetching external item into 'hcatalog/src/test/e2e/harness'
External at revision 1627557.

At revision 1627557.
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}
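
The last lines of the log show the patch being rejected at strip levels p0, p1, and p2. As a rough illustration of that kind of check, the sketch below tries each level with `patch --dry-run` and reports the first one that applies. It is not the contents of smart-apply-patch.sh; the function names `dry_run_ok` and `find_strip_level` are hypothetical, and the only assumption about the environment is that GNU `patch` is on the PATH.

```python
import subprocess


def dry_run_ok(patch_file, level, cwd="."):
    """Return True if `patch --dry-run -p<level>` accepts the patch.

    Nothing is modified on disk because of --dry-run; -f keeps patch
    from prompting when a hunk does not match.
    """
    result = subprocess.run(
        ["patch", "--dry-run", "-f", "-p%d" % level, "-i", patch_file],
        cwd=cwd,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def find_strip_level(patch_file, levels=(0, 1, 2), check=dry_run_ok):
    """Return the first strip level that applies cleanly, or None."""
    for level in levels:
        if check(patch_file, level):
            return level
    return None
```

A return value of None corresponds to the situation in the log above: the patch no longer matches the current trunk revision at any of the tried levels and needs to be rebased.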

This message is automatically generated.

ATTACHMENT ID: 12671094

 VectorColumnAssignFactory throws Incompatible Bytes vector column and 
 primitive category VARCHAR
 --

 Key: HIVE-8240
 URL:

[jira] [Updated] (HIVE-8182) beeline fails when executing multiple-line queries with trailing spaces


 [ 
https://issues.apache.org/jira/browse/HIVE-8182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-8182:
--
Status: Open  (was: Patch Available)

 beeline fails when executing multiple-line queries with trailing spaces
 ---

 Key: HIVE-8182
 URL: https://issues.apache.org/jira/browse/HIVE-8182
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1, 0.12.0
Reporter: Yongzhi Chen
Assignee: Sergio Peña
 Fix For: 0.14.0

 Attachments: HIVE-8181.1.patch


 As the title indicates, when executing a multi-line query with trailing spaces, 
 Beeline reports a syntax error: 
 Error: Error while compiling statement: FAILED: ParseException line 1:76 
 extraneous input ';' expecting EOF near 'EOF' (state=42000,code=4)
 If the same query is put on a single line, Beeline executes it successfully.
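
The report does not say where in Beeline the trailing spaces cause trouble. One way to picture the failure mode is a client that only recognizes the end of a statement when a line ends exactly with ';'. The sketch below is hypothetical Python, not Beeline's code; `split_statements` is an illustrative name. It strips trailing whitespace from each line before looking for the terminator, so a statement typed across several lines with trailing spaces is still recognized as one statement.

```python
def split_statements(lines):
    """Group input lines into statements terminated by ';'.

    Trailing whitespace is stripped from every line first, so a line
    such as "limit 10;   " still ends a statement. Hypothetical helper,
    not Beeline's implementation.
    """
    statements = []
    current = []
    for raw in lines:
        line = raw.rstrip()          # drop trailing spaces, tabs, newlines
        if not line:
            continue                 # skip blank lines
        if line.endswith(";"):
            current.append(line[:-1].rstrip())   # remove the terminator
            statements.append(" ".join(part for part in current if part))
            current = []
        else:
            current.append(line)
    if current:                      # unterminated trailing input
        statements.append(" ".join(current))
    return statements


queries = split_statements(["select key  ", "from src  ", "limit 10;   "])
```

The sketch deliberately ignores semicolons inside string literals; a real client would need a proper tokenizer for that case.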




[jira] [Updated] (HIVE-8182) beeline fails when executing multiple-line queries with trailing spaces


 [ 
https://issues.apache.org/jira/browse/HIVE-8182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-8182:
--
Attachment: HIVE-8182.1.patch

 beeline fails when executing multiple-line queries with trailing spaces
 ---

 Key: HIVE-8182
 URL: https://issues.apache.org/jira/browse/HIVE-8182
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0, 0.13.1
Reporter: Yongzhi Chen
Assignee: Sergio Peña
 Fix For: 0.14.0

 Attachments: HIVE-8181.1.patch, HIVE-8182.1.patch


 As the title indicates, when executing a multi-line query with trailing spaces, 
 Beeline reports a syntax error: 
 Error: Error while compiling statement: FAILED: ParseException line 1:76 
 extraneous input ';' expecting EOF near 'EOF' (state=42000,code=4)
 If the same query is put on a single line, Beeline executes it successfully.




[jira] [Updated] (HIVE-8182) beeline fails when executing multiple-line queries with trailing spaces


 [ 
https://issues.apache.org/jira/browse/HIVE-8182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-8182:
--
Release Note:   (was: Re-submit patch to run tests non-related with this 
fix.)
  Status: Patch Available  (was: Open)

Re-submitting the patch to re-run tests that are not related to this fix.

 beeline fails when executing multiple-line queries with trailing spaces
 ---

 Key: HIVE-8182
 URL: https://issues.apache.org/jira/browse/HIVE-8182
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1, 0.12.0
Reporter: Yongzhi Chen
Assignee: Sergio Peña
 Fix For: 0.14.0

 Attachments: HIVE-8181.1.patch, HIVE-8182.1.patch


 As the title indicates, when executing a multi-line query with trailing spaces, 
 Beeline reports a syntax error: 
 Error: Error while compiling statement: FAILED: ParseException line 1:76 
 extraneous input ';' expecting EOF near 'EOF' (state=42000,code=4)
 If the same query is put on a single line, Beeline executes it successfully.




[jira] [Updated] (HIVE-6683) Beeline does not accept comments at end of line


 [ 
https://issues.apache.org/jira/browse/HIVE-6683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-6683:
--
Attachment: HIVE-6683.1.patch

 Beeline does not accept comments at end of line
 ---

 Key: HIVE-6683
 URL: https://issues.apache.org/jira/browse/HIVE-6683
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.10.0
Reporter: Jeremy Beard
Assignee: Sergio Peña
 Fix For: 0.14.0

 Attachments: HIVE-6683.1.patch, HIVE-6683.1.patch


 Beeline fails to read queries where lines have comments at the end. This 
 works in the embedded Hive CLI.
 Example:
 SELECT
 1 -- this is a comment about this value
 FROM
 table;
 Error: Error while processing statement: FAILED: ParseException line 1:36 
 mismatched input 'EOF' expecting FROM near '1' in from clause 
 (state=42000,code=4)
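
The error is reported on line 1, which would fit a client that joins the input lines into a single line before parsing, so that the `--` comment swallows the rest of the statement. That reading is an assumption; the report does not confirm it. Under that assumption, a minimal sketch of a workaround is to strip end-of-line comments from each line before joining. The code below is hypothetical Python, not Beeline's implementation, and `strip_line_comment` and `join_query` are illustrative names.

```python
def strip_line_comment(line):
    """Remove a trailing '--' comment unless it is inside a quoted string."""
    quote = None                     # currently open quote character, if any
    i = 0
    while i < len(line):
        ch = line[i]
        if quote:
            if ch == "\\":
                i += 1               # skip the escaped character
            elif ch == quote:
                quote = None         # closing quote
        elif ch in ("'", '"'):
            quote = ch               # opening quote
        elif line.startswith("--", i):
            return line[:i].rstrip() # comment starts here; drop the rest
        i += 1
    return line.rstrip()


def join_query(lines):
    """Strip comments line by line, then join into a single-line statement."""
    cleaned = (strip_line_comment(l) for l in lines)
    return " ".join(part.strip() for part in cleaned if part.strip())


query = join_query([
    "SELECT",
    "1 -- this is a comment about this value",
    "FROM",
    "table;",
])
```

With the example from the report, `query` holds the statement without the comment. Backtick-quoted identifiers are not handled in this sketch.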




[jira] [Updated] (HIVE-6683) Beeline does not accept comments at end of line


 [ 
https://issues.apache.org/jira/browse/HIVE-6683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-6683:
--
Status: Open  (was: Patch Available)

 Beeline does not accept comments at end of line
 ---

 Key: HIVE-6683
 URL: https://issues.apache.org/jira/browse/HIVE-6683
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.10.0
Reporter: Jeremy Beard
Assignee: Sergio Peña
 Fix For: 0.14.0

 Attachments: HIVE-6683.1.patch, HIVE-6683.1.patch


 Beeline fails to read queries where lines have comments at the end. This 
 works in the embedded Hive CLI.
 Example:
 SELECT
 1 -- this is a comment about this value
 FROM
 table;
 Error: Error while processing statement: FAILED: ParseException line 1:36 
 mismatched input 'EOF' expecting FROM near '1' in from clause 
 (state=42000,code=4)




[jira] [Updated] (HIVE-6683) Beeline does not accept comments at end of line


 [ 
https://issues.apache.org/jira/browse/HIVE-6683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-6683:
--
Release Note:   (was: Re-submit patch to run tests non-related with this 
fix.)
  Status: Patch Available  (was: Open)

Re-submitting the patch to re-run tests that are not related to this fix.

 Beeline does not accept comments at end of line
 ---

 Key: HIVE-6683
 URL: https://issues.apache.org/jira/browse/HIVE-6683
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.10.0
Reporter: Jeremy Beard
Assignee: Sergio Peña
 Fix For: 0.14.0

 Attachments: HIVE-6683.1.patch, HIVE-6683.1.patch


 Beeline fails to read queries where lines have comments at the end. This 
 works in the embedded Hive CLI.
 Example:
 SELECT
 1 -- this is a comment about this value
 FROM
 table;
 Error: Error while processing statement: FAILED: ParseException line 1:36 
 mismatched input 'EOF' expecting FROM near '1' in from clause 
 (state=42000,code=4)




Getting started with contributing to Hive

2014-09-25 Thread Navneet Gupta

Hi,

I have been following developments in the Hadoop ecosystem for quite
some time and have decided to contribute to open source. I wanted to
start by contributing to Hive, as it is used at work for reporting purposes.

*JIRA Username : navneet4735*

Please add me as a Hive contributor. I am currently referring to this wiki
page to get started:
https://cwiki.apache.org/confluence/display/Hive/HowToContribute#HowToContribute-GettingtheSourceCode

Please let me know of any other good resources for getting started.

-- 
Regards,
Navneet

[jira] [Updated] (HIVE-8111) CBO trunk merge: duplicated casts for arithmetic expressions in Hive and CBO

[
https://issues.apache.org/jira/browse/HIVE-8111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Ashutosh Chauhan updated HIVE-8111:
---
Resolution: Fixed
Status: Resolved (was: Patch Available)

Committed to trunk. Thanks, Sergey.
[~vikram.dixit] It will be good to have this bug fix in 0.14 branch as well.

CBO trunk merge: duplicated casts for arithmetic expressions in Hive and CBO

Key: HIVE-8111
URL: https://issues.apache.org/jira/browse/HIVE-8111
Project: Hive
Issue Type: Sub-task
Components: CBO
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Attachments: HIVE-8111.01.patch, HIVE-8111.02.patch,
HIVE-8111.03.patch, HIVE-8111.patch

Original test failure: it looks like the column type changes to a different
decimal type in most cases. In one case this seems to make the integer part
too big to fit, so the result becomes null.
What happens is that CBO adds casts to arithmetic expressions to make them
type compatible; these casts become part of the new AST, and then Hive adds casts
on top of these casts. This (the first part) also causes lots of out file
changes. It's not clear yet how best to fix it, in addition to the incorrect
decimal width and the occasional nulls when the width is larger than allowed in Hive.
Option one - don't add those casts for numeric ops - cannot be done if the numeric op
is part of a comparison, for which CBO needs correct types.
Option two - unwrap casts when determining the type in Hive - it is hard or impossible
to tell apart CBO-added casts and user casts.
Option three - don't change types in Hive if CBO has run - seems hacky and
hard to ensure it's applied everywhere.
Option four - map all expressions precisely between the two trees and remove
the casts again after optimization - will be pretty difficult.
Option five - somehow mark those casts. Not sure how yet.


[jira] [Updated] (HIVE-8111) CBO trunk merge: duplicated casts for arithmetic expressions in Hive and CBO

[
https://issues.apache.org/jira/browse/HIVE-8111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Ashutosh Chauhan updated HIVE-8111:
---
Fix Version/s: 0.15.0

CBO trunk merge: duplicated casts for arithmetic expressions in Hive and CBO

Key: HIVE-8111
URL: https://issues.apache.org/jira/browse/HIVE-8111
Project: Hive
Issue Type: Sub-task
Components: CBO
Affects Versions: 0.14.0
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Fix For: 0.15.0

Attachments: HIVE-8111.01.patch, HIVE-8111.02.patch,
HIVE-8111.03.patch, HIVE-8111.patch


[jira] [Updated] (HIVE-8111) CBO trunk merge: duplicated casts for arithmetic expressions in Hive and CBO

[
https://issues.apache.org/jira/browse/HIVE-8111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Ashutosh Chauhan updated HIVE-8111:
---
Affects Version/s: 0.14.0

CBO trunk merge: duplicated casts for arithmetic expressions in Hive and CBO

Attachments: HIVE-8111.01.patch, HIVE-8111.02.patch,
HIVE-8111.03.patch, HIVE-8111.patch


[jira] [Updated] (HIVE-8199) CBO Trunk Merge: quote2 test fails due to incorrect literal translation


 [ 
https://issues.apache.org/jira/browse/HIVE-8199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-8199:
---
   Resolution: Fixed
Fix Version/s: 0.15.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Sergey!
[~vikram.dixit] It will be good to have this bug fix in 0.14 as well.

 CBO Trunk Merge: quote2 test fails due to incorrect literal translation
 ---

 Key: HIVE-8199
 URL: https://issues.apache.org/jira/browse/HIVE-8199
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Affects Versions: 0.14.0
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: 0.15.0

 Attachments: HIVE-8199.01.patch, HIVE-8199.02.patch, HIVE-8199.patch


 Quoting of quotes and slashes is lost in translation back from CBO to AST, it 
 seems




[jira] [Updated] (HIVE-8199) CBO Trunk Merge: quote2 test fails due to incorrect literal translation


 [ 
https://issues.apache.org/jira/browse/HIVE-8199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-8199:
---
Affects Version/s: 0.14.0

 CBO Trunk Merge: quote2 test fails due to incorrect literal translation
 ---

 Key: HIVE-8199
 URL: https://issues.apache.org/jira/browse/HIVE-8199
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Affects Versions: 0.14.0
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: 0.15.0

 Attachments: HIVE-8199.01.patch, HIVE-8199.02.patch, HIVE-8199.patch


 Quoting of quotes and slashes is lost in translation back from CBO to AST, it 
 seems




[jira] [Updated] (HIVE-8223) CBO Trunk Merge: partition_wise_fileformat2 select result depends on ordering


 [ 
https://issues.apache.org/jira/browse/HIVE-8223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-8223:
---
   Resolution: Fixed
Fix Version/s: 0.15.0
   Status: Resolved  (was: Patch Available)

Thanks Sergey! Committed to trunk. 
[~vikram.dixit] It will be good to have this bug fix in 0.14 as well.

 CBO Trunk Merge: partition_wise_fileformat2 select result depends on ordering
 -

 Key: HIVE-8223
 URL: https://issues.apache.org/jira/browse/HIVE-8223
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: 0.15.0

 Attachments: HIVE-8223.01.patch, HIVE-8223.02.patch, HIVE-8223.patch







[jira] [Updated] (HIVE-8223) CBO Trunk Merge: partition_wise_fileformat2 select result depends on ordering


 [ 
https://issues.apache.org/jira/browse/HIVE-8223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-8223:
---
Affects Version/s: 0.14.0

 CBO Trunk Merge: partition_wise_fileformat2 select result depends on ordering
 -

 Key: HIVE-8223
 URL: https://issues.apache.org/jira/browse/HIVE-8223
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Affects Versions: 0.14.0
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: 0.15.0

 Attachments: HIVE-8223.01.patch, HIVE-8223.02.patch, HIVE-8223.patch







[jira] [Commented] (HIVE-7802) Update language manual for insert, update, and delete


[ 
https://issues.apache.org/jira/browse/HIVE-7802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14147911#comment-14147911
 ] 

Alan Gates commented on HIVE-7802:
--

Edits look good.  I changed the syntax for the insert...values to reflect the 
dynamic partitioning case as you noted.  Thanks for the review.

And yes, neither Fred nor Barney ever struck me as the academic type.

 Update language manual for insert, update, and delete
 -

 Key: HIVE-7802
 URL: https://issues.apache.org/jira/browse/HIVE-7802
 Project: Hive
  Issue Type: Sub-task
  Components: Documentation
Reporter: Alan Gates
Assignee: Alan Gates
  Labels: TODOC14

 With the addition of ACID compliant insert, insert...values, update, and 
 delete we need to update the Hive language manual to cover the new features.




[jira] [Updated] (HIVE-7802) Update language manual for insert, update, and delete


 [ 
https://issues.apache.org/jira/browse/HIVE-7802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-7802:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Documentation has been added to the Language Manual wiki.

 Update language manual for insert, update, and delete
 -

 Key: HIVE-7802
 URL: https://issues.apache.org/jira/browse/HIVE-7802
 Project: Hive
  Issue Type: Sub-task
  Components: Documentation
Reporter: Alan Gates
Assignee: Alan Gates
  Labels: TODOC14

 With the addition of ACID compliant insert, insert...values, update, and 
 delete we need to update the Hive language manual to cover the new features.




[jira] [Resolved] (HIVE-5317) Implement insert, update, and delete in Hive with full ACID support


 [ 
https://issues.apache.org/jira/browse/HIVE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates resolved HIVE-5317.
--
   Resolution: Fixed
Fix Version/s: 0.14.0

All the sub-tasks have been completed.

 Implement insert, update, and delete in Hive with full ACID support
 ---

 Key: HIVE-5317
 URL: https://issues.apache.org/jira/browse/HIVE-5317
 Project: Hive
  Issue Type: New Feature
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Fix For: 0.14.0

 Attachments: InsertUpdatesinHive.pdf


 Many customers want to be able to insert, update and delete rows from Hive 
 tables with full ACID support. The use cases are varied, but the forms of the 
 queries that should be supported are:
 * INSERT INTO tbl SELECT …
 * INSERT INTO tbl VALUES ...
 * UPDATE tbl SET … WHERE …
 * DELETE FROM tbl WHERE …
 * MERGE INTO tbl USING src ON … WHEN MATCHED THEN ... WHEN NOT MATCHED THEN 
 ...
 * SET TRANSACTION LEVEL …
 * BEGIN/END TRANSACTION
 Use Cases
 * Once an hour, a set of inserts and updates (up to 500k rows) for various 
 dimension tables (e.g. customer, inventory, stores) needs to be processed. The 
 dimension tables have primary keys and are typically bucketed and sorted on 
 those keys.
 * Once a day, a small set (up to 100k rows) of records needs to be deleted for 
 regulatory compliance.
 * Once an hour, a log of transactions is exported from an RDBMS and the fact 
 tables need to be updated (up to 1m rows) to reflect the new data. The 
 transactions are a combination of inserts, updates, and deletes. The table is 
 partitioned and bucketed.




[jira] [Updated] (HIVE-8171) Tez and Vectorized Reduce doesn't create scratch columns

2014-09-25 Thread Matt McCline (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-8171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8171:
---
Status: In Progress  (was: Patch Available)

 Tez and Vectorized Reduce doesn't create scratch columns
 

 Key: HIVE-8171
 URL: https://issues.apache.org/jira/browse/HIVE-8171
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8171.01.patch, HIVE-8171.02.patch, 
 HIVE-8171.03.patch


 This query fails with an ArrayIndexOutOfBoundsException in the reducer.
 {code}
 create table varchar_3 (
   field varchar(25)
 ) stored as orc;
 insert into table varchar_3 select cint from alltypesorc limit 10;
 {code}




[jira] [Updated] (HIVE-8171) Tez and Vectorized Reduce doesn't create scratch columns

2014-09-25 Thread Matt McCline (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-8171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8171:
---
Status: Patch Available  (was: In Progress)

 Tez and Vectorized Reduce doesn't create scratch columns
 

 Key: HIVE-8171
 URL: https://issues.apache.org/jira/browse/HIVE-8171
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8171.01.patch, HIVE-8171.02.patch, 
 HIVE-8171.03.patch, HIVE-8171.04.patch


 This query fails with an ArrayIndexOutOfBoundsException in the reducer.
 {code}
 create table varchar_3 (
   field varchar(25)
 ) stored as orc;
 insert into table varchar_3 select cint from alltypesorc limit 10;
 {code}




[jira] [Updated] (HIVE-8171) Tez and Vectorized Reduce doesn't create scratch columns

2014-09-25 Thread Matt McCline (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-8171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8171:
---
Attachment: HIVE-8171.04.patch

 Tez and Vectorized Reduce doesn't create scratch columns
 

 Key: HIVE-8171
 URL: https://issues.apache.org/jira/browse/HIVE-8171
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8171.01.patch, HIVE-8171.02.patch, 
 HIVE-8171.03.patch, HIVE-8171.04.patch


 This query fails with an ArrayIndexOutOfBoundsException in the reducer.
 {code}
 create table varchar_3 (
   field varchar(25)
 ) stored as orc;
 insert into table varchar_3 select cint from alltypesorc limit 10;
 {code}




[jira] [Created] (HIVE-8255) Need additional properties to control size of mapper and reducer containers independently for tez tasks.

2014-09-25 Thread David Kjerrumgaard (JIRA)

David Kjerrumgaard created HIVE-8255:


 Summary: Need additional properties to control size of mapper and 
reducer containers independently for tez tasks.
 Key: HIVE-8255
 URL: https://issues.apache.org/jira/browse/HIVE-8255
 Project: Hive
  Issue Type: Improvement
  Components: Tez
Affects Versions: 0.13.1
Reporter: David Kjerrumgaard
Priority: Minor






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HIVE-8203) ACID operations result in NPE when run through HS2


 [ 
https://issues.apache.org/jira/browse/HIVE-8203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-8203:
-
Status: Open  (was: Patch Available)

Rebased the patch after commit of HIVE-8201.  I don't think any of the test 
failures are related as I can't reproduce them on either Mac or Linux.  But I 
want to get a second run to make sure.

 ACID operations result in NPE when run through HS2
 --

 Key: HIVE-8203
 URL: https://issues.apache.org/jira/browse/HIVE-8203
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 0.14.0
Reporter: Alan Gates
Assignee: Alan Gates
Priority: Critical
 Fix For: 0.14.0

 Attachments: HIVE-8203.patch


 When accessing Hive via HS2, any operation requiring the DbTxnManager results 
 in an NPE.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HIVE-8255) Need additional properties to control size of mapper and reducer containers independently for tez tasks.

2014-09-25 Thread David Kjerrumgaard (JIRA)

[
https://issues.apache.org/jira/browse/HIVE-8255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

David Kjerrumgaard updated HIVE-8255:
-
Description:
There were 2 memory-related configuration properties added in HIVE-6037,
hive.tez.container.size and hive.tez.java.opts.

These settings apply to BOTH the mapper and reducer tasks in a Tez job.
Usually, the amount of memory required for your map tasks varies greatly from
the memory requirements for your reducers. Therefore, it would be better if the
hive.tez.container.size and hive.tez.java.opts properties were replaced with
map- and reduce-specific ones, e.g.

hive.tez.map.container.size and hive.tez.map.java.opts
hive.tez.reduce.container.size and hive.tez.reduce.java.opts

Need additional properties to control size of mapper and reducer containers
independently for tez tasks.

Key: HIVE-8255
URL: https://issues.apache.org/jira/browse/HIVE-8255
Project: Hive
Issue Type: Improvement
Components: Tez
Affects Versions: 0.13.1
Reporter: David Kjerrumgaard
Priority: Minor

There were 2 memory-related configuration properties added in HIVE-6037,
hive.tez.container.size and hive.tez.java.opts.
These settings apply to BOTH the mapper and reducer tasks in a Tez job.
Usually, the amount of memory required for your map tasks varies greatly from
the memory requirements for your reducers. Therefore, it would be better if
the hive.tez.container.size and hive.tez.java.opts properties were replaced
with map- and reduce-specific ones, e.g.
hive.tez.map.container.size and hive.tez.map.java.opts
hive.tez.reduce.container.size and hive.tez.reduce.java.opts
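
As a rough illustration of the request, the sketch below shows how a per-stage lookup might behave. The hive.tez.map.* and hive.tez.reduce.* names are the ones proposed in this issue and are not existing Hive properties; the fallback to the shared property, the `resolve` helper, and the sample values are assumptions made for this example only.

```python
# Hypothetical sketch: resolve a Tez memory setting per task kind.
# The per-stage property names are the ones proposed in this issue; they are
# not existing Hive properties. Falling back to the shared property is an
# assumption of this sketch, not something the issue specifies.

def resolve(conf, kind, suffix):
    """Return the value for e.g. kind='map', suffix='container.size'."""
    specific = "hive.tez.%s.%s" % (kind, suffix)
    shared = "hive.tez.%s" % suffix
    if specific in conf:
        return conf[specific]
    return conf.get(shared)


conf = {
    "hive.tez.container.size": "4096",         # shared setting (illustrative value)
    "hive.tez.reduce.container.size": "8192",  # proposed per-stage override
}

map_size = resolve(conf, "map", "container.size")        # falls back to shared
reduce_size = resolve(conf, "reduce", "container.size")  # uses the override
```

With separate properties, reducers could be given larger containers without also inflating every map task.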

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HIVE-8203) ACID operations result in NPE when run through HS2


 [ 
https://issues.apache.org/jira/browse/HIVE-8203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-8203:
-
Status: Patch Available  (was: Open)

 ACID operations result in NPE when run through HS2
 --

 Key: HIVE-8203
 URL: https://issues.apache.org/jira/browse/HIVE-8203
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 0.14.0
Reporter: Alan Gates
Assignee: Alan Gates
Priority: Critical
 Fix For: 0.14.0

 Attachments: HIVE-8203.2.patch, HIVE-8203.patch


 When accessing Hive via HS2, any operation requiring the DbTxnManager results 
 in an NPE.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HIVE-8203) ACID operations result in NPE when run through HS2


 [ 
https://issues.apache.org/jira/browse/HIVE-8203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-8203:
-
Attachment: HIVE-8203.2.patch

 ACID operations result in NPE when run through HS2
 --

 Key: HIVE-8203
 URL: https://issues.apache.org/jira/browse/HIVE-8203
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 0.14.0
Reporter: Alan Gates
Assignee: Alan Gates
Priority: Critical
 Fix For: 0.14.0

 Attachments: HIVE-8203.2.patch, HIVE-8203.patch


 When accessing Hive via HS2, any operation requiring the DbTxnManager results 
 in an NPE.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HIVE-8228) CBO: fix couple of issues with partition pruning


 [ 
https://issues.apache.org/jira/browse/HIVE-8228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-8228:
---
Affects Version/s: 0.14.0

 CBO: fix couple of issues with partition pruning
 

 Key: HIVE-8228
 URL: https://issues.apache.org/jira/browse/HIVE-8228
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Affects Versions: 0.14.0
Reporter: Harish Butani
Assignee: Harish Butani
 Fix For: 0.15.0

 Attachments: HIVE-8228.1.patch


 - Pruner doesn't handle non-deterministic UDFs correctly
 - The plan generated after CBO has a Project between TScan and Filter, which
 prevents partition pruning from triggering in Hive post-CBO.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HIVE-8191) Update and delete on tables with non Acid output formats gives runtime error


 [ 
https://issues.apache.org/jira/browse/HIVE-8191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-8191:
-
Status: Open  (was: Patch Available)

These test failures are caused by HIVE-8203.  I'll wait until that is committed 
and then rebase this patch.

 Update and delete on tables with non Acid output formats gives runtime error
 

 Key: HIVE-8191
 URL: https://issues.apache.org/jira/browse/HIVE-8191
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.14.0
Reporter: Alan Gates
Assignee: Alan Gates
Priority: Critical
 Attachments: HIVE-8191.2.patch, HIVE-8191.patch


 {code}
 create table not_an_acid_table(a int, b varchar(128));
 insert into table not_an_acid_table select cint, cast(cstring1 as 
 varchar(128)) from alltypesorc where cint is not null order by cint limit 10;
 delete from not_an_acid_table where b = '0ruyd6Y50JpdGRf6HqD';
 {code}
 This generates a runtime error.  It should get a compile error instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HIVE-8228) CBO: fix couple of issues with partition pruning


 [ 
https://issues.apache.org/jira/browse/HIVE-8228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-8228:
---
   Resolution: Fixed
Fix Version/s: 0.15.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Harish!
[~vikram.dixit] It will be good to have this bug fix in 0.14

 CBO: fix couple of issues with partition pruning
 

 Key: HIVE-8228
 URL: https://issues.apache.org/jira/browse/HIVE-8228
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Affects Versions: 0.14.0
Reporter: Harish Butani
Assignee: Harish Butani
 Fix For: 0.15.0

 Attachments: HIVE-8228.1.patch


 - Pruner doesn't handle non-deterministic UDFs correctly
 - The plan generated after CBO has a Project between TScan and Filter, which
 prevents partition pruning from triggering in Hive post-CBO.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-8114) Type resolution for udf arguments of Decimal Type results in error


[ 
https://issues.apache.org/jira/browse/HIVE-8114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14147959#comment-14147959
 ] 

Hive QA commented on HIVE-8114:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12671101/HIVE-8114.1.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 6347 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.ql.parse.TestParse.testParse_union
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/980/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/980/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-980/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12671101

 Type resolution for udf arguments of Decimal Type results in error
 --

 Key: HIVE-8114
 URL: https://issues.apache.org/jira/browse/HIVE-8114
 Project: Hive
  Issue Type: Bug
  Components: Query Processor, Types
Affects Versions: 0.13.0, 0.13.1
Reporter: Ashutosh Chauhan
Assignee: Jason Dere
 Attachments: HIVE-8114.1.patch


 {code}
 select log (2, 10.5BD) from src;
 {code}
 results in exception.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-8201) Remove hardwiring to HiveInputFormat in acid qfile tests


[ 
https://issues.apache.org/jira/browse/HIVE-8201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14147960#comment-14147960
 ] 

Alan Gates commented on HIVE-8201:
--

Patch committed to branch-0.14 as well.

 Remove hardwiring to HiveInputFormat in acid qfile tests
 

 Key: HIVE-8201
 URL: https://issues.apache.org/jira/browse/HIVE-8201
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.14.0

 Attachments: HIVE-8201.2.patch, HIVE-8201.patch


 Now that HIVE-7812 is checked in we should remove the hardwiring to 
 HiveInputFormat for the qfile tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-7735) Implement Char, Varchar in ParquetSerDe

2014-09-25 Thread Pratik Khadloya (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-7735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14147962#comment-14147962
 ] 

Pratik Khadloya commented on HIVE-7735:
---

Oh ok, thanks [~mohitsabharwal], that clears up the confusion I had.
Is this JIRA still resolved?

 Implement Char, Varchar in ParquetSerDe
 ---

 Key: HIVE-7735
 URL: https://issues.apache.org/jira/browse/HIVE-7735
 Project: Hive
  Issue Type: Sub-task
  Components: Serializers/Deserializers
Reporter: Mohit Sabharwal
Assignee: Mohit Sabharwal
  Labels: Parquet, TODOC14
 Fix For: 0.14.0

 Attachments: HIVE-7735.1.patch, HIVE-7735.1.patch, HIVE-7735.2.patch, 
 HIVE-7735.2.patch, HIVE-7735.3.patch, HIVE-7735.patch


 This JIRA is to implement CHAR and VARCHAR support in Parquet SerDe.
 Both are represented in Parquet as PrimitiveType binary and OriginalType UTF8.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HIVE-8114) Type resolution for udf arguments of Decimal Type results in error


 [ 
https://issues.apache.org/jira/browse/HIVE-8114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-8114:
---
   Resolution: Fixed
Fix Version/s: 0.15.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Jason!
[~vikram.dixit] It will be good to have this bug fix in 0.14 too

 Type resolution for udf arguments of Decimal Type results in error
 --

 Key: HIVE-8114
 URL: https://issues.apache.org/jira/browse/HIVE-8114
 Project: Hive
  Issue Type: Bug
  Components: Query Processor, Types
Affects Versions: 0.13.0, 0.13.1
Reporter: Ashutosh Chauhan
Assignee: Jason Dere
 Fix For: 0.15.0

 Attachments: HIVE-8114.1.patch


 {code}
 select log (2, 10.5BD) from src;
 {code}
 results in exception.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HIVE-8256) Add SORT_QUERY_RESULTS for test that doesn't guarantee order #2


 [ 
https://issues.apache.org/jira/browse/HIVE-8256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao updated HIVE-8256:
---
Priority: Minor  (was: Major)

 Add SORT_QUERY_RESULTS for test that doesn't guarantee order #2
 ---

 Key: HIVE-8256
 URL: https://issues.apache.org/jira/browse/HIVE-8256
 Project: Hive
  Issue Type: Test
Reporter: Chao
Assignee: Chao
Priority: Minor

 Following HIVE-8035, we need to further add {{SORT_QUERY_RESULTS}} to a few
 more tests that don't guarantee output order.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Created] (HIVE-8256) Add SORT_QUERY_RESULTS for test that doesn't guarantee order #2

Chao created HIVE-8256:
--

 Summary: Add SORT_QUERY_RESULTS for test that doesn't guarantee 
order #2
 Key: HIVE-8256
 URL: https://issues.apache.org/jira/browse/HIVE-8256
 Project: Hive
  Issue Type: Test
Reporter: Chao
Assignee: Chao


Following HIVE-8035, we need to further add {{SORT_QUERY_RESULTS}} to a few
more tests that don't guarantee output order.
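
The idea behind the directive can be sketched as follows: when a query has no ORDER BY, rows may come back in any order, so the rows are sorted before being compared with the expected output. This is a simplified Python illustration, not the actual Hive test harness; the function name and the sample rows are assumptions.

```python
# Simplified illustration (not the Hive test harness) of comparing query
# output while ignoring row order. Names and sample rows are hypothetical.

def same_rows_ignoring_order(actual_lines, expected_lines):
    """Compare two result sets after sorting their rows."""
    return sorted(actual_lines) == sorted(expected_lines)


expected = ["1\ta", "2\tb", "3\tc"]
one_engine = ["1\ta", "2\tb", "3\tc"]
other_engine = ["3\tc", "1\ta", "2\tb"]  # same rows, different order

order_sensitive_match = (other_engine == expected)
order_insensitive_match = same_rows_ignoring_order(other_engine, expected)
```

A plain comparison would flag the second result as a failure even though the rows are identical, which is the kind of spurious diff the directive is meant to avoid.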



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-8162) hive.optimize.sort.dynamic.partition causes RuntimeException for inserting into dynamic partitioned table when map function is used in the subquery

2014-09-25 Thread Vikram Dixit K (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-8162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14148030#comment-14148030
 ] 

Vikram Dixit K commented on HIVE-8162:
--

+1 LGTM.

 hive.optimize.sort.dynamic.partition causes RuntimeException for inserting 
 into dynamic partitioned table when map function is used in the subquery 
 

 Key: HIVE-8162
 URL: https://issues.apache.org/jira/browse/HIVE-8162
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0, 0.14.0
Reporter: Na Yang
Assignee: Prasanth J
 Attachments: 47rows.txt, HIVE-8162.1.patch, HIVE-8162.2.patch


 Exception:
 Diagnostic Messages for this Task:
 java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
 Hive Runtime Error: Unable to deserialize reduce input key from 
 x1x129x51x83x14x1x128x0x0x2x1x1x1x120x95x112x114x111x100x117x99x116x95x105x100x0x1x0x0x255
  with properties {columns=reducesinkkey0,reducesinkkey1,reducesinkkey2, 
 serialization.lib=org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe,
  serialization.sort.order=+++, columns.types=int,map<string,string>,int}
   at 
 org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:283)
   at 
 org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:518)
   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:462)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:282)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1122)
   at org.apache.hadoop.mapred.Child.main(Child.java:271)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
 Error: Unable to deserialize reduce input key from 
 x1x129x51x83x14x1x128x0x0x2x1x1x1x120x95x112x114x111x100x117x99x116x95x105x100x0x1x0x0x255
  with properties {columns=reducesinkkey0,reducesinkkey1,reducesinkkey2, 
 serialization.lib=org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe,
  serialization.sort.order=+++, columns.types=int,map<string,string>,int}
   at 
 org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:222)
   ... 7 more
 Caused by: org.apache.hadoop.hive.serde2.SerDeException: java.io.EOFException
   at 
 org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.deserialize(BinarySortableSerDe.java:189)
   at 
 org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:220)
   ... 7 more
 Caused by: java.io.EOFException
   at 
 org.apache.hadoop.hive.serde2.binarysortable.InputByteBuffer.read(InputByteBuffer.java:54)
   at 
 org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.deserializeInt(BinarySortableSerDe.java:533)
   at 
 org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.deserialize(BinarySortableSerDe.java:236)
   at 
 org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.deserialize(BinarySortableSerDe.java:185)
   ... 8 more
 Step to reproduce the exception:
 -
 CREATE TABLE associateddata(creative_id int,creative_group_id int,placement_id
 int,sm_campaign_id int,browser_id string, trans_type_p string,trans_time_p
 string,group_name string,event_name string,order_id string,revenue
 float,currency string, trans_type_ci string,trans_time_ci string,f16
 map<string,string>,campaign_id int,user_agent_cat string,geo_country
 string,geo_city string,geo_state string,geo_zip string,geo_dma string,geo_area
 string,geo_isp string,site_id int,section_id int,f16_ci map<string,string>)
 PARTITIONED BY(day_id int, hour_id int) ROW FORMAT DELIMITED FIELDS TERMINATED
 BY '\t';
 LOAD DATA LOCAL INPATH '/tmp/47rows.txt' INTO TABLE associateddata
 PARTITION(day_id=20140814,hour_id=2014081417);
 set hive.exec.dynamic.partition=true;
 set hive.exec.dynamic.partition.mode=nonstrict; 
 CREATE  EXTERNAL TABLE IF NOT EXISTS agg_pv_associateddata_c (
  vt_tran_qty int COMMENT 'The count of view
 thru transactions'
 , pair_value_txt  string  COMMENT 'F16 name values
 pairs'
 )
 PARTITIONED BY (day_id int)
 ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
 STORED AS TEXTFILE
 LOCATION '/user/prodman/agg_pv_associateddata_c';
 INSERT INTO TABLE agg_pv_associateddata_c PARTITION (day_id)
 select 2 as vt_tran_qty, pair_value_txt, day_id
  from (select map( 'x_product_id',coalesce(F16['x_product_id'],'') ) as 
 pair_value_txt , day_id , hour_id 
 from associateddata where hour_id = 2014081417 and sm_campaign_id in

[jira] [Commented] (HIVE-7382) Create a MiniSparkCluster and set up a testing framework [Spark Branch]


[ 
https://issues.apache.org/jira/browse/HIVE-7382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14148053#comment-14148053
 ] 

Xuefu Zhang commented on HIVE-7382:
---

Hi [~lirui] Thank you for your detailed analysis. Based on your finding, I 
think we can forgo local-cluster and request an equivalent of Hadoop MR mini 
cluster from Spark community instead. I'll create a Spark ticket for this.

 Create a MiniSparkCluster and set up a testing framework [Spark Branch]
 ---

 Key: HIVE-7382
 URL: https://issues.apache.org/jira/browse/HIVE-7382
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Xuefu Zhang
Assignee: Rui Li
  Labels: Spark-M1

 To automatically test Hive functionality over Spark execution engine, we need 
 to create a test framework that can execute Hive queries with Spark as the 
 backend. For that, we should create a MiniSparkCluster, similar to
 other execution engines.
 Spark has a way to create a local cluster with a few processes on the local
 machine, where each process is a worker node. It's fairly close to a real Spark
 cluster. Our mini cluster can be based on that.
 For more info, please refer to the design doc on wiki.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Re: Patches to release branches

2014-09-25 Thread Thejas Nair

This sounds reasonable to me. We need to get maintenance releases out, not
just commit the fixes to the maintenance release branch.


On Thu, Sep 25, 2014 at 7:06 AM, Alan Gates ga...@hortonworks.com wrote:

 So combining Mithun's proposal with the input from Sergey and Gopal, I
 propose:

 1) When a contributor provides a patch for a high priority bug (data
 corruption, wrong results, crashes) he or she should also provide a patch
 against the branch of the latest feature release.  For example, once Hive
 0.14 is released that will mean providing patches for trunk and the 0.14
 branch.  I believe the test infrastructure already supports running the
 tests against alternate branches (is that correct Brock?) so the patches
 can be tested against both trunk and the release branch.
 2) The release manager of the feature release (e.g. Hive 0.14) will be
 responsible for maintaining the branch with these patch fixes.  It is his
 or her call whether a given bug merits inclusion on the branch.  If a
 contributor provides a patch for trunk which in the release manager's
 opinion should also be on the branch, then the release manager can ask the
 contributor to also provide a patch for the branch.  Since whoever manages
 the feature release may not want to or be able to continue managing the
 branch post release, these release manager duties are transferable.  But
 the transfer should be clear and announced on the dev list.
 3) In order to make these patch fixes available to Hive users we should
 strive to have frequent maintenance releases.  The frequency will depend on
 the number of bug fixes going into branch, but 6-8 weeks seems like a good
 goal.

 Hive 0.14 could be the test run of this process to see what works and what
 doesn't.  Seem reasonable?

 Alan.



   Mithun Radhakrishnan mithun.radhakrish...@yahoo.com.INVALID
  September 15, 2014 at 11:16
 Hey, Gopal.
 Thank you, that makes sense. I'll concede that delaying the initial commit
 till a patch is available for the recent-most release-branch won't always
 be viable. While I'd expect it to be easier to patch the release-branch
 early than late, if we (the community) would prefer a cloned JIRA in a
 separate queue, of course I'll go along. Anything to make the
 release-branch usable out of the box, without further patching.
 Forgive my ignorance of the relevant protocol... Would this be a change in
 release/patch process? Does this need codifying? I'm not sure if this
 needs voting on, or even who might call a vote on this.
 Mithun

 On Thursday, September 11, 2014 3:15 PM, Gopal V gop...@apache.org
 gop...@apache.org wrote:



 This is a very sensible proposal.

 As a start, I think we need to have people open backport JIRAs, for such
 issues - even if a direct merge might be hard to do with the same patch.

 Immediately cherry-picking the same patch should be done if it applies
 with very little modifications - but reworking the patch for an older
 release is a significant overhead for the initial commit.

 At the very least, we need to get past the unknowns that currently
 surround the last point release against the bugs already fixed in trunk.

 Once we have a backport queue, I'm sure the RMs in charge of the branch
 can moderate the community on the complexity and risk factors involved.

 Cheers,
 Gopal




Re: Review Request 25906: HIVE-7856 : Enable parallelism in Reduce Side Join [Spark Branch]

2014-09-25 Thread Szehon Ho


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25906/
---

(Updated Sept. 25, 2014, 6:14 p.m.)


Review request for hive.


Changes
---

Update more of the golden files.

Now a join reports edge-type PARTITION-LEVEL SORT to distinguish from 
total-order SORT shuffle.


Bugs: HIVE-7856
https://issues.apache.org/jira/browse/HIVE-7856


Repository: hive-git


Description
---

This work is to consume the new API provided by SPARK-2978 called 
'repartitionAndSortWithinPartitions'.

We now need to distinguish the old sort-by, which is a total-order sort, from
this one, which does a partition-level sort, so a new SparkEdge type was added
for it. This API is called only when the sort is partition-level, which will
be the case, of course, for reduce-side join.
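
The difference between the two sort modes can be illustrated without Spark. The following is a plain-Python sketch of the semantics only; it does not call the Spark API, and the helper names, the modulo partitioner, and the sample data are assumptions for illustration.

```python
# Plain-Python illustration (not Spark code) of total-order sort versus
# partition-level sort. Helper names and the partitioner are hypothetical.

def total_order_sort(pairs):
    """All records end up globally ordered by key."""
    return sorted(pairs, key=lambda kv: kv[0])


def partition_level_sort(pairs, num_partitions):
    """Route each record by key, then sort only within each partition."""
    partitions = [[] for _ in range(num_partitions)]
    for key, value in pairs:
        partitions[key % num_partitions].append((key, value))
    return [sorted(p, key=lambda kv: kv[0]) for p in partitions]


rows = [(5, "e"), (2, "b"), (4, "d"), (1, "a"), (3, "c")]
globally_sorted = total_order_sort(rows)
per_partition = partition_level_sort(rows, 2)
```

Equal keys always land in the same partition and arrive sorted, which is what a reduce-side join needs; no ordering is implied across partitions, so several reducers can work in parallel.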


Diffs (updated)
-

  itests/src/test/resources/testconfiguration.properties 637fbc1 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SortByShuffler.java 446e3cc 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java 
7ab2ca0 
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java ed06a57 
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkWork.java 4f889db 
  ql/src/java/org/apache/hadoop/hive/ql/plan/SparkEdgeProperty.java bdfef87 
  ql/src/test/queries/clientpositive/parallel_join0.q PRE-CREATION 
  ql/src/test/queries/clientpositive/parallel_join1.q PRE-CREATION 
  ql/src/test/results/clientpositive/spark/char_join1.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/spark/column_access_stats.q.out 56b763e 
  ql/src/test/results/clientpositive/spark/groupby_position.q.out bef99c9 
  ql/src/test/results/clientpositive/spark/groupby_sort_1_23.q.out 5b5495d 
  ql/src/test/results/clientpositive/spark/groupby_sort_skew_1_23.q.out 318fbbc 
  ql/src/test/results/clientpositive/spark/innerjoin.q.out acac2b9 
  ql/src/test/results/clientpositive/spark/join0.q.out 913f57a 
  ql/src/test/results/clientpositive/spark/join1.q.out 9db644b 
  ql/src/test/results/clientpositive/spark/join10.q.out 5122c56 
  ql/src/test/results/clientpositive/spark/join11.q.out f4a080f 
  ql/src/test/results/clientpositive/spark/join12.q.out 1b5992f 
  ql/src/test/results/clientpositive/spark/join13.q.out c64bdb3 
  ql/src/test/results/clientpositive/spark/join14.q.out 9dcc6c8 
  ql/src/test/results/clientpositive/spark/join15.q.out ca7b5c5 
  ql/src/test/results/clientpositive/spark/join16.q.out 3a57bf5 
  ql/src/test/results/clientpositive/spark/join17.q.out 7c6d9ff 
  ql/src/test/results/clientpositive/spark/join18.q.out 3278dde 
  ql/src/test/results/clientpositive/spark/join19.q.out 87606fd 
  ql/src/test/results/clientpositive/spark/join2.q.out 0c3880b 
  ql/src/test/results/clientpositive/spark/join20.q.out 56b4bed 
  ql/src/test/results/clientpositive/spark/join21.q.out 0e08bf8 
  ql/src/test/results/clientpositive/spark/join22.q.out 1c8ab7c 
  ql/src/test/results/clientpositive/spark/join23.q.out ecd8371 
  ql/src/test/results/clientpositive/spark/join25.q.out 71df358 
  ql/src/test/results/clientpositive/spark/join26.q.out 06246a4 
  ql/src/test/results/clientpositive/spark/join27.q.out 8cbe599 
  ql/src/test/results/clientpositive/spark/join3.q.out 2f47a21 
  ql/src/test/results/clientpositive/spark/join4.q.out 48ea655 
  ql/src/test/results/clientpositive/spark/join5.q.out d1130fe 
  ql/src/test/results/clientpositive/spark/join6.q.out bfbe240 
  ql/src/test/results/clientpositive/spark/join7.q.out 1f5a4cc 
  ql/src/test/results/clientpositive/spark/join8.q.out 70782cc 
  ql/src/test/results/clientpositive/spark/join9.q.out f0c4172 
  ql/src/test/results/clientpositive/spark/join_nullsafe.q.out 48d5d76 
  ql/src/test/results/clientpositive/spark/limit_pushdown.q.out d088c8a 
  ql/src/test/results/clientpositive/spark/optimize_nullscan.q.out 33c470e 
  ql/src/test/results/clientpositive/spark/parallel_join0.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/spark/parallel_join1.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/spark/ppd_multi_insert.q.out 76b531e 
  ql/src/test/results/clientpositive/spark/sample8.q.out 365468b 
  ql/src/test/results/clientpositive/spark/subquery_multiinsert.q.out d38c554 

Diff: https://reviews.apache.org/r/25906/diff/


Testing
---

Added a few tests that force reducers > 1; manually verified the results.


Thanks,

Szehon Ho

[jira] [Updated] (HIVE-7856) Enable parallelism in Reduce Side Join [Spark Branch]

2014-09-25 Thread Szehon Ho (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-7856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-7856:

Attachment: HIVE-7856.3-spark.patch

Update more golden files.  Now all reduce-side joins are reported with spark 
edge type 'partition-level sort', to distinguish from total-order sorting.

 Enable parallelism in Reduce Side Join [Spark Branch]
 -

 Key: HIVE-7856
 URL: https://issues.apache.org/jira/browse/HIVE-7856
 Project: Hive
  Issue Type: New Feature
  Components: Spark
Affects Versions: spark-branch
Reporter: Szehon Ho
Assignee: Szehon Ho
 Attachments: HIVE-7856-spark.patch, HIVE-7856.1-spark.patch, 
 HIVE-7856.2-spark.patch, HIVE-7856.3-spark.patch


 This is dependent on new transformation to be provided by SPARK-2978, see 
 parent JIRA for details.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HIVE-8256) Add SORT_QUERY_RESULTS for test that doesn't guarantee order #2


 [ 
https://issues.apache.org/jira/browse/HIVE-8256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao updated HIVE-8256:
---
Attachment: HIVE-8256.1-spark.patch

This patch adds {{SORT_QUERY_RESULTS}} to:

{noformat}
groupby7.q
groupby_complex_types.q
table_access_keys_stats.q
{noformat}

 Add SORT_QUERY_RESULTS for test that doesn't guarantee order #2
 ---

 Key: HIVE-8256
 URL: https://issues.apache.org/jira/browse/HIVE-8256
 Project: Hive
  Issue Type: Test
Reporter: Chao
Assignee: Chao
Priority: Minor
 Attachments: HIVE-8256.1-spark.patch


 Following HIVE-8035, we need to further add {{SORT_QUERY_RESULTS}} to a few
 more tests that don't guarantee output order.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HIVE-8256) Add SORT_QUERY_RESULTS for test that doesn't guarantee order #2


 [ 
https://issues.apache.org/jira/browse/HIVE-8256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao updated HIVE-8256:
---
Status: Patch Available  (was: Open)

 Add SORT_QUERY_RESULTS for test that doesn't guarantee order #2
 ---

 Key: HIVE-8256
 URL: https://issues.apache.org/jira/browse/HIVE-8256
 Project: Hive
  Issue Type: Test
Reporter: Chao
Assignee: Chao
Priority: Minor
 Attachments: HIVE-8256.1-spark.patch


 Following HIVE-8035, we need to further add {{SORT_QUERY_RESULTS}} to a few
 more tests that don't guarantee output order.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-7382) Create a MiniSparkCluster and set up a testing framework [Spark Branch]


[ 
https://issues.apache.org/jira/browse/HIVE-7382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14148085#comment-14148085
 ] 

Xuefu Zhang commented on HIVE-7382:
---

SPARK-3691 is created. We will continue the work when that SPARK JIRA is fixed.

 Create a MiniSparkCluster and set up a testing framework [Spark Branch]
 ---

 Key: HIVE-7382
 URL: https://issues.apache.org/jira/browse/HIVE-7382
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Xuefu Zhang
Assignee: Rui Li
  Labels: Spark-M1

 To automatically test Hive functionality over Spark execution engine, we need 
 to create a test framework that can execute Hive queries with Spark as the 
 backend. For that, we should create a MiniSparkCluster, similar to
 other execution engines.
 Spark has a way to create a local cluster with a few processes on the local
 machine, where each process is a worker node. It's fairly close to a real Spark
 cluster. Our mini cluster can be based on that.
 For more info, please refer to the design doc on wiki.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-8021) CBO: support CTAS and insert ... select