[jira] [Updated] (HIVE-7513) Add ROW__ID VirtualColumn

2014-08-04 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7513:
-

Attachment: HIVE-7513.4.patch

Fixed a couple of tests that failed due to API changes and are not related to 
column renumbering.

> Add ROW__ID VirtualColumn
> -
>
> Key: HIVE-7513
> URL: https://issues.apache.org/jira/browse/HIVE-7513
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Processor
>Affects Versions: 0.13.1
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-7513.3.patch, HIVE-7513.4.patch
>
>
> In order to support Update/Delete we need to read rowId from AcidInputFormat 
> and pass that along through the operator pipeline (built from the WHERE 
> clause of the SQL Statement) so that it can be written to the delta file by 
> the update/delete (sink) operators.
> The parser will add this column to the projection list to make sure it's 
> passed along.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7513) Add ROW__ID VirtualColumn

2014-08-04 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085432#comment-14085432
 ] 

Eugene Koifman commented on HIVE-7513:
--

A bunch of failures here are because columns got renamed in the plan.  The 
internal column names such as _col5 became _col6.  Most of the failed test cases 
are in the join queries, because adding VirtualColumns to the LHS table renumbers 
the RHS columns.

> Add ROW__ID VirtualColumn
> -
>
> Key: HIVE-7513
> URL: https://issues.apache.org/jira/browse/HIVE-7513
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Processor
>Affects Versions: 0.13.1
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-7513.3.patch
>
>
> In order to support Update/Delete we need to read rowId from AcidInputFormat 
> and pass that along through the operator pipeline (built from the WHERE 
> clause of the SQL Statement) so that it can be written to the delta file by 
> the update/delete (sink) operators.
> The parser will add this column to the projection list to make sure it's 
> passed along.





[jira] [Commented] (HIVE-7513) Add ROW__ID VirtualColumn

2014-08-04 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085187#comment-14085187
 ] 

Eugene Koifman commented on HIVE-7513:
--

https://reviews.apache.org/r/24254/

> Add ROW__ID VirtualColumn
> -
>
> Key: HIVE-7513
> URL: https://issues.apache.org/jira/browse/HIVE-7513
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Processor
>Affects Versions: 0.13.1
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-7513.3.patch
>
>
> In order to support Update/Delete we need to read rowId from AcidInputFormat 
> and pass that along through the operator pipeline (built from the WHERE 
> clause of the SQL Statement) so that it can be written to the delta file by 
> the update/delete (sink) operators.
> The parser will add this column to the projection list to make sure it's 
> passed along.





[jira] [Updated] (HIVE-7513) Add ROW__ID VirtualColumn

2014-08-03 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7513:
-

Status: Patch Available  (was: Open)

> Add ROW__ID VirtualColumn
> -
>
> Key: HIVE-7513
> URL: https://issues.apache.org/jira/browse/HIVE-7513
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Processor
>Affects Versions: 0.13.1
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-7513.3.patch
>
>
> In order to support Update/Delete we need to read rowId from AcidInputFormat 
> and pass that along through the operator pipeline (built from the WHERE 
> clause of the SQL Statement) so that it can be written to the delta file by 
> the update/delete (sink) operators.
> The parser will add this column to the projection list to make sure it's 
> passed along.





[jira] [Updated] (HIVE-7513) Add ROW__ID VirtualColumn

2014-08-03 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7513:
-

Attachment: HIVE-7513.3.patch

HIVE-7513.3.patch has a first stab at this.

> Add ROW__ID VirtualColumn
> -
>
> Key: HIVE-7513
> URL: https://issues.apache.org/jira/browse/HIVE-7513
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Processor
>Affects Versions: 0.13.1
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-7513.3.patch
>
>
> In order to support Update/Delete we need to read rowId from AcidInputFormat 
> and pass that along through the operator pipeline (built from the WHERE 
> clause of the SQL Statement) so that it can be written to the delta file by 
> the update/delete (sink) operators.
> The parser will add this column to the projection list to make sure it's 
> passed along.





[jira] [Commented] (HIVE-6226) It should be possible to get hadoop, hive, and pig version being used by WebHCat

2014-08-03 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084101#comment-14084101
 ] 

Eugene Koifman commented on HIVE-6226:
--

Thanks [~leftylev]

> It should be possible to get hadoop, hive, and pig version being used by 
> WebHCat
> 
>
> Key: HIVE-6226
> URL: https://issues.apache.org/jira/browse/HIVE-6226
> Project: Hive
>  Issue Type: New Feature
>  Components: WebHCat
>Reporter: Alan Gates
>Assignee: Alan Gates
> Fix For: 0.13.0
>
> Attachments: HIVE-6226.2.patch, HIVE-6226.patch
>
>
> Calling /version on WebHCat tells the caller the protocol verison, but there 
> is no way to determine the versions of software being run by the applications 
> that WebHCat spawns.  
> I propose to add an end-point: /version/\{module\} where module could be pig, 
> hive, or hadoop.  The response will then be:
> {code}
> {
>   "module" : _module_name_,
>   "version" : _version_string_
> }
> {code}





[jira] [Created] (HIVE-7517) RecordIdentifier overrides equals() but not hashCode()

2014-07-25 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-7517:


 Summary: RecordIdentifier overrides equals() but not hashCode()
 Key: HIVE-7517
 URL: https://issues.apache.org/jira/browse/HIVE-7517
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.13.1
Reporter: Eugene Koifman
Assignee: Eugene Koifman
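The contract violation named in the summary can be sketched as follows; the class and field names (writeId, bucket, rowId) are illustrative stand-ins, not Hive's actual RecordIdentifier API:

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hedged sketch of the equals()/hashCode() contract. Field names are
// hypothetical, chosen only to resemble an ACID row identifier.
class RowKey {
    final long writeId;
    final int bucket;
    final long rowId;

    RowKey(long writeId, int bucket, long rowId) {
        this.writeId = writeId;
        this.bucket = bucket;
        this.rowId = rowId;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof RowKey)) return false;
        RowKey other = (RowKey) o;
        return writeId == other.writeId && bucket == other.bucket && rowId == other.rowId;
    }

    // Overriding equals() without this breaks hash-based collections: two
    // equal keys can land in different buckets, so HashSet.contains() and
    // HashMap.get() silently fail.
    @Override
    public int hashCode() {
        return Objects.hash(writeId, bucket, rowId);
    }
}

public class RowKeyDemo {
    public static void main(String[] args) {
        Set<RowKey> seen = new HashSet<>();
        seen.add(new RowKey(1L, 0, 42L));
        System.out.println(seen.contains(new RowKey(1L, 0, 42L))); // prints "true"
    }
}
```

With only equals() overridden, the contains() call above would rely on Object's identity hash and could return false for a logically equal key.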








[jira] [Created] (HIVE-7513) Add ROW__ID VirtualColumn

2014-07-24 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-7513:


 Summary: Add ROW__ID VirtualColumn
 Key: HIVE-7513
 URL: https://issues.apache.org/jira/browse/HIVE-7513
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: 0.13.1
Reporter: Eugene Koifman
Assignee: Eugene Koifman


In order to support Update/Delete we need to read rowId from AcidInputFormat 
and pass that along through the operator pipeline (built from the WHERE clause 
of the SQL Statement) so that it can be written to the delta file by the 
update/delete (sink) operators.

The parser will add this column to the projection list to make sure it's passed 
along.
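Conceptually, the rewrite described above might look like the following HiveQL sketch; this is an illustration of the idea, not actual Hive planner output:

```sql
-- User issues:  DELETE FROM t WHERE msg = 'stale';
-- Hypothetical internal form: ROW__ID is added to the projection so the
-- operator pipeline built from the WHERE clause carries it to the delete
-- sink, which writes it into the delta file.
SELECT ROW__ID
FROM t
WHERE msg = 'stale';
```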





[jira] [Commented] (HIVE-7483) hive insert overwrite table select from self dead lock

2014-07-22 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071344#comment-14071344
 ] 

Eugene Koifman commented on HIVE-7483:
--

DbTxnManager will use DbLockManager, so I think hive.zookeeper.quorum is not 
relevant here.

> hive insert overwrite table select from self dead lock
> --
>
> Key: HIVE-7483
> URL: https://issues.apache.org/jira/browse/HIVE-7483
> Project: Hive
>  Issue Type: Bug
>  Components: Locking
>Affects Versions: 0.13.1
>Reporter: Xiaoyu Wang
>
> CREATE TABLE test(
>   id int, 
>   msg string)
> PARTITIONED BY ( 
>   continent string, 
>   country string)
> CLUSTERED BY (id) 
> INTO 10 BUCKETS
> STORED AS ORC;
> alter table test add partition(continent='Asia',country='India');
> in hive-site.xml:
> hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
> hive.support.concurrency=true;
> hive.zookeeper.quorum=zk1,zk2,zk3;
> in hive shell:
> set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
> insert into test table some records first.
> then execute sql:
> insert overwrite table test partition(continent='Asia',country='India') 
> select id,msg from test;
> the log stop at :
> INFO log.PerfLogger:  from=org.apache.hadoop.hive.ql.Driver>
> i think it has dead lock when insert overwrite table from it self.





[jira] [Commented] (HIVE-4590) HCatalog documentation example is wrong

2014-07-20 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068228#comment-14068228
 ] 

Eugene Koifman commented on HIVE-4590:
--

[~leftylev]
1. The MR program calls "value.get(1)" in reduce(), which means "col1" is the 
2nd column.  Presumably the 1st (0th) column could have been "UserName".
2. You are correct on both.

> HCatalog documentation example is wrong
> ---
>
> Key: HIVE-4590
> URL: https://issues.apache.org/jira/browse/HIVE-4590
> Project: Hive
>  Issue Type: Bug
>  Components: Documentation, HCatalog
>Affects Versions: 0.10.0
>Reporter: Eugene Koifman
>Assignee: Lefty Leverenz
>Priority: Minor
>
> http://hive.apache.org/docs/hcat_r0.5.0/inputoutput.html#Read+Example
> reads
> The following very simple MapReduce program reads data from one table which 
> it assumes to have an integer in the second column, and counts how many 
> different values it sees. That is, it does the equivalent of "select col1, 
> count(*) from $table group by col1;".
> The description of the query is wrong.  It actually counts how many instances 
> of each distinct value it finds.  For example, if the values of col1 are 
> {1,1,1,3,3,3,5} it will produce
> 1, 3
> 3, 3
> 5, 1
>  





[jira] [Commented] (HIVE-7423) produce hive-exec-core.jar from ql module

2014-07-15 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14063148#comment-14063148
 ] 

Eugene Koifman commented on HIVE-7423:
--

The test failures are not related: two of them have been failing for a while 
now, and TestHCatLoader doesn't even use hive-exec.

> produce hive-exec-core.jar from ql module
> -
>
> Key: HIVE-7423
> URL: https://issues.apache.org/jira/browse/HIVE-7423
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Affects Versions: 0.13.1
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-7423.patch
>
>
> currently ql module produces hive-exec-$version.jar which is an uber jar.  
> It's also useful to have a thin jar, let's call it 
> hive-exec-$version-core.jar, that only has classes from ql.





[jira] [Commented] (HIVE-7423) produce hive-exec-core.jar from ql module

2014-07-15 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14063128#comment-14063128
 ] 

Eugene Koifman commented on HIVE-7423:
--

I'm far from a Maven expert, but using a classifier lets other projects refer to 
this jar as a dependency.  If we name it by other means, can they still do 
that?
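For context, a downstream project would pull in a classified artifact roughly like this (a hypothetical pom.xml fragment, assuming the thin jar is published with the "core" classifier):

```xml
<dependency>
  <groupId>org.apache.hive</groupId>
  <artifactId>hive-exec</artifactId>
  <version>0.14.0-SNAPSHOT</version>
  <!-- selects hive-exec-0.14.0-SNAPSHOT-core.jar instead of the uber jar -->
  <classifier>core</classifier>
</dependency>
```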

> produce hive-exec-core.jar from ql module
> -
>
> Key: HIVE-7423
> URL: https://issues.apache.org/jira/browse/HIVE-7423
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Affects Versions: 0.13.1
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-7423.patch
>
>
> currently ql module produces hive-exec-$version.jar which is an uber jar.  
> It's also useful to have a thin jar, let's call it 
> hive-exec-$version-core.jar, that only has classes from ql.





[jira] [Commented] (HIVE-7423) produce hive-exec-core.jar from ql module

2014-07-15 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14063102#comment-14063102
 ] 

Eugene Koifman commented on HIVE-7423:
--

I don't think Maven supports that.  The classifier ("core") goes at the end.
For example, hive-exec-0.14.0-SNAPSHOT-tests.jar.


> produce hive-exec-core.jar from ql module
> -
>
> Key: HIVE-7423
> URL: https://issues.apache.org/jira/browse/HIVE-7423
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Affects Versions: 0.13.1
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-7423.patch
>
>
> currently ql module produces hive-exec-$version.jar which is an uber jar.  
> It's also useful to have a thin jar, let's call it 
> hive-exec-$version-core.jar, that only has classes from ql.





[jira] [Updated] (HIVE-7423) produce hive-exec-core.jar from ql module

2014-07-15 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7423:
-

Status: Patch Available  (was: Open)

> produce hive-exec-core.jar from ql module
> -
>
> Key: HIVE-7423
> URL: https://issues.apache.org/jira/browse/HIVE-7423
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Affects Versions: 0.13.1
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-7423.patch
>
>
> currently ql module produces hive-exec-$version.jar which is an uber jar.  
> It's also useful to have a thin jar, let's call it 
> hive-exec-$version-core.jar, that only has classes from ql.





[jira] [Updated] (HIVE-7423) produce hive-exec-core.jar from ql module

2014-07-15 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7423:
-

Attachment: HIVE-7423.patch

> produce hive-exec-core.jar from ql module
> -
>
> Key: HIVE-7423
> URL: https://issues.apache.org/jira/browse/HIVE-7423
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Affects Versions: 0.13.1
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-7423.patch
>
>
> currently ql module produces hive-exec-$version.jar which is an uber jar.  
> It's also useful to have a thin jar, let's call it 
> hive-exec-$version-core.jar, that only has classes from ql.





[jira] [Updated] (HIVE-7423) produce hive-exec-core.jar from ql module

2014-07-15 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7423:
-

Component/s: Build Infrastructure

> produce hive-exec-core.jar from ql module
> -
>
> Key: HIVE-7423
> URL: https://issues.apache.org/jira/browse/HIVE-7423
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Affects Versions: 0.13.1
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>
> currently ql module produces hive-exec-$version.jar which is an uber jar.  
> It's also useful to have a thin jar, let's call it 
> hive-exec-$version-core.jar, that only has classes from ql.





[jira] [Updated] (HIVE-7423) produce hive-exec-core.jar from ql module

2014-07-15 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7423:
-

Affects Version/s: 0.13.1

> produce hive-exec-core.jar from ql module
> -
>
> Key: HIVE-7423
> URL: https://issues.apache.org/jira/browse/HIVE-7423
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Affects Versions: 0.13.1
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>
> currently ql module produces hive-exec-$version.jar which is an uber jar.  
> It's also useful to have a thin jar, let's call it 
> hive-exec-$version-core.jar, that only has classes from ql.





[jira] [Created] (HIVE-7423) produce hive-exec-core.jar from ql module

2014-07-15 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-7423:


 Summary: produce hive-exec-core.jar from ql module
 Key: HIVE-7423
 URL: https://issues.apache.org/jira/browse/HIVE-7423
 Project: Hive
  Issue Type: Bug
Reporter: Eugene Koifman
Assignee: Eugene Koifman


currently ql module produces hive-exec-$version.jar which is an uber jar.  It's 
also useful to have a thin jar, let's call it hive-exec-$version-core.jar, that 
only has classes from ql.





[jira] [Commented] (HIVE-7376) add minimizeJar to jdbc/pom.xml

2014-07-12 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14059916#comment-14059916
 ] 

Eugene Koifman commented on HIVE-7376:
--

My understanding is that, by default, the uber jar pulls in whole-jar 
dependencies even if only some of the classes in a jar are needed.  This 
option makes it include only the classes from any given jar that are needed 
(transitively) by classes in the module.

> add minimizeJar to jdbc/pom.xml
> ---
>
> Key: HIVE-7376
> URL: https://issues.apache.org/jira/browse/HIVE-7376
> Project: Hive
>  Issue Type: Bug
>Reporter: Eugene Koifman
> Attachments: HIVE-7376.1.patch.txt
>
>
> adding {code}<minimizeJar>true</minimizeJar>{code} to maven-shade-plugin 
> reduces the uber jar (hive-jdbc-0.14.0-SNAPSHOT-standalone.jar) from 51MB to 
> 27MB.  Is there any reason not to add it?
> https://maven.apache.org/plugins/maven-shade-plugin/shade-mojo.html#minimizeJar





[jira] [Commented] (HIVE-538) make hive_jdbc.jar self-containing

2014-07-11 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14058873#comment-14058873
 ] 

Eugene Koifman commented on HIVE-538:
-

Yes, "where is it published to"?  It seems like one would have to build Hive to 
get it.

> make hive_jdbc.jar self-containing
> --
>
> Key: HIVE-538
> URL: https://issues.apache.org/jira/browse/HIVE-538
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC
>Affects Versions: 0.3.0, 0.4.0, 0.6.0, 0.13.0
>Reporter: Raghotham Murthy
>Assignee: Nick White
> Fix For: 0.14.0
>
> Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-538.D2553.1.patch, 
> ASF.LICENSE.NOT.GRANTED--HIVE-538.D2553.2.patch, HIVE-538.patch
>
>
> Currently, most jars in hive/build/dist/lib and the hadoop-*-core.jar are 
> required in the classpath to run jdbc applications on hive. We need to do 
> at least the following to get rid of most unnecessary dependencies:
> 1. get rid of dynamic serde and use a standard serialization format, maybe 
> tab separated, json or avro
> 2. don't use hadoop configuration parameters
> 3. repackage thrift and fb303 classes into hive_jdbc.jar





[jira] [Commented] (HIVE-538) make hive_jdbc.jar self-containing

2014-07-10 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14058185#comment-14058185
 ] 

Eugene Koifman commented on HIVE-538:
-

The current build system produces two JDBC jars:
hive-jdbc-0.14.0-SNAPSHOT-standalone.jar - the 51MB uber jar
hive-jdbc-0.14.0-SNAPSHOT.jar - the 135K jar

The pom file hive-jdbc-0.14.0-SNAPSHOT.pom (which I will attach) does not 
mention hive-jdbc-0.14.0-SNAPSHOT-standalone.jar at all.  The standalone jar is 
not part of the Hive tar bundle either.  How is the end user supposed to access 
this standalone jar?

> make hive_jdbc.jar self-containing
> --
>
> Key: HIVE-538
> URL: https://issues.apache.org/jira/browse/HIVE-538
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC
>Affects Versions: 0.3.0, 0.4.0, 0.6.0, 0.13.0
>Reporter: Raghotham Murthy
>Assignee: Nick White
> Fix For: 0.14.0
>
> Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-538.D2553.1.patch, 
> ASF.LICENSE.NOT.GRANTED--HIVE-538.D2553.2.patch, HIVE-538.patch
>
>
> Currently, most jars in hive/build/dist/lib and the hadoop-*-core.jar are 
> required in the classpath to run jdbc applications on hive. We need to do 
> at least the following to get rid of most unnecessary dependencies:
> 1. get rid of dynamic serde and use a standard serialization format, maybe 
> tab separated, json or avro
> 2. don't use hadoop configuration parameters
> 3. repackage thrift and fb303 classes into hive_jdbc.jar





[jira] [Updated] (HIVE-7376) add minimizeJar to jdbc/pom.xml

2014-07-09 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7376:
-

Description: 
adding {code}<minimizeJar>true</minimizeJar>{code} to maven-shade-plugin 
reduces the uber jar (hive-jdbc-0.14.0-SNAPSHOT-standalone.jar) from 51MB to 
27MB.  Is there any reason not to add it?

https://maven.apache.org/plugins/maven-shade-plugin/shade-mojo.html#minimizeJar

  was:adding {code}<minimizeJar>true</minimizeJar>{code} to maven-shade-plugin 
reduces the uber jar (hive-jdbc-0.14.0-SNAPSHOT-standalone.jar) from 51MB to 
27MB.  Is there any reason not to add it?


> add minimizeJar to jdbc/pom.xml
> ---
>
> Key: HIVE-7376
> URL: https://issues.apache.org/jira/browse/HIVE-7376
> Project: Hive
>  Issue Type: Bug
>Reporter: Eugene Koifman
>
> adding {code}<minimizeJar>true</minimizeJar>{code} to maven-shade-plugin 
> reduces the uber jar (hive-jdbc-0.14.0-SNAPSHOT-standalone.jar) from 51MB to 
> 27MB.  Is there any reason not to add it?
> https://maven.apache.org/plugins/maven-shade-plugin/shade-mojo.html#minimizeJar





[jira] [Updated] (HIVE-7376) add minimizeJar to jdbc/pom.xml

2014-07-09 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7376:
-

Description: adding {code}<minimizeJar>true</minimizeJar>{code} to 
maven-shade-plugin reduces the uber jar 
(hive-jdbc-0.14.0-SNAPSHOT-standalone.jar) from 51MB to 27MB.  Is there any 
reason not to add it?  (was: adding {code}<minimizeJar>true</minimizeJar>{code} 
to maven-shade-plugin reduces the uber jar from 51MB to 27MB.  Is there any 
reason not to add it?)

> add minimizeJar to jdbc/pom.xml
> ---
>
> Key: HIVE-7376
> URL: https://issues.apache.org/jira/browse/HIVE-7376
> Project: Hive
>  Issue Type: Bug
>Reporter: Eugene Koifman
>
> adding {code}<minimizeJar>true</minimizeJar>{code} to maven-shade-plugin 
> reduces the uber jar (hive-jdbc-0.14.0-SNAPSHOT-standalone.jar) from 51MB to 
> 27MB.  Is there any reason not to add it?





[jira] [Created] (HIVE-7376) add minimizeJar to jdbc/pom.xml

2014-07-09 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-7376:


 Summary: add minimizeJar to jdbc/pom.xml
 Key: HIVE-7376
 URL: https://issues.apache.org/jira/browse/HIVE-7376
 Project: Hive
  Issue Type: Bug
Reporter: Eugene Koifman


adding {code}<minimizeJar>true</minimizeJar>{code} to maven-shade-plugin 
reduces the uber jar from 51MB to 27MB.  Is there any reason not to add it?
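For reference, the flag lives in the maven-shade-plugin configuration; a minimal sketch of the relevant jdbc/pom.xml fragment, with placement assumed from the shade-mojo documentation:

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <configuration>
    <!-- keep only classes reachable from this module's own classes -->
    <minimizeJar>true</minimizeJar>
  </configuration>
</plugin>
```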





[jira] [Commented] (HIVE-7288) Enable support for -libjars and -archives in WebHcat for Streaming MapReduce jobs

2014-07-09 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056487#comment-14056487
 ] 

Eugene Koifman commented on HIVE-7288:
--

[~shanyu] I left some comments on RB.

> Enable support for -libjars and -archives in WebHcat for Streaming MapReduce 
> jobs
> -
>
> Key: HIVE-7288
> URL: https://issues.apache.org/jira/browse/HIVE-7288
> Project: Hive
>  Issue Type: New Feature
>  Components: WebHCat
>Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.13.1
> Environment: HDInsight deploying HDP 2.1;  Also HDP 2.1 on Windows 
>Reporter: Azim Uddin
>Assignee: shanyu zhao
> Attachments: HIVE-7288.1.patch, hive-7288.patch
>
>
> Issue:
> ==
> Due to lack of parameters (or support for) equivalent of '-libjars' and 
> '-archives' in WebHcat REST API, we cannot use an external Java Jars or 
> Archive files with a Streaming MapReduce job, when the job is submitted via 
> WebHcat/templeton. 
> I am citing a few use cases here, but there can be plenty of scenarios like 
> this-
> #1 
> (for -archives):In order to use R with a hadoop distribution like HDInsight 
> or HDP on Windows, we could package the R directory up in a zip file and 
> rename it to r.jar and put it into HDFS or WASB. We can then do 
> something like this from hadoop command line (ignore the wasb syntax, same 
> command can be run with hdfs) - 
> hadoop jar %HADOOP_HOME%\lib\hadoop-streaming.jar -archives 
> wasb:///example/jars/r.jar -files 
> "wasb:///example/apps/mapper.r,wasb:///example/apps/reducer.r" -mapper 
> "./r.jar/bin/Rscript.exe mapper.r" -reducer "./r.jar/bin/Rscript.exe 
> reducer.r" -input /example/data/gutenberg -output /probe/r/wordcount
> This works from hadoop command line, but due to lack of support for 
> '-archives' parameter in WebHcat, we can't submit the same Streaming MR job 
> via WebHcat.
> #2 (for -libjars):
> Consider a scenario where a user would like to use a custom inputFormat with 
> a Streaming MapReduce job and wrote his own custom InputFormat JAR. From a 
> hadoop command line we can do something like this - 
> hadoop jar /path/to/hadoop-streaming.jar \
> -libjars /path/to/custom-formats.jar \
> -D map.output.key.field.separator=, \
> -D mapred.text.key.partitioner.options=-k1,1 \
> -input my_data/ \
> -output my_output/ \
> -outputformat test.example.outputformat.DateFieldMultipleOutputFormat 
> \
> -mapper my_mapper.py \
> -reducer my_reducer.py \
> But due to lack of support for '-libjars' parameter for streaming MapReduce 
> job in WebHcat, we can't submit the above streaming MR job (that uses a 
> custom Java JAR) via WebHcat.
> Impact:
> 
> We think, being able to submit jobs remotely is a vital feature for hadoop to 
> be enterprise-ready and WebHcat plays an important role there. Streaming 
> MapReduce job is also very important for interoperability. So, it would be 
> very useful to keep WebHcat on par with hadoop command line in terms of 
> streaming MR job submission capability.
> Ask:
> 
> Enable parameter support for 'libjars' and 'archives' in WebHcat for Hadoop 
> streaming jobs in WebHcat.





[jira] [Commented] (HIVE-7342) support hiveserver2,metastore specific config files

2014-07-08 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14055535#comment-14055535
 ] 

Eugene Koifman commented on HIVE-7342:
--

Does this have an effect on HCatCLI, HCat, and WebHCat?

> support hiveserver2,metastore specific config files
> ---
>
> Key: HIVE-7342
> URL: https://issues.apache.org/jira/browse/HIVE-7342
> Project: Hive
>  Issue Type: Bug
>  Components: Configuration, HiveServer2, Metastore
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Attachments: HIVE-7342.1.patch
>
>
> There is currently a single configuration file for all components in hive. 
> ie, components such as hive cli, hiveserver2 and metastore all read from the 
> same hive-site.xml. 
> It will be useful to have a server specific hive-site.xml, so that you can 
> have some different configuration value set for a server. For example, you 
> might want to enabled authorization checks for hiveserver2, while disabling 
> the checks for hive cli. The workaround today is to add any component 
> specific configuration as a commandline (-hiveconf) argument.
> Using server-specific config files (eg hiveserver2-site.xml, 
> metastore-site.xml) that override the entries in hive-site.xml will make the 
> configuration much easier to manage.
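As a concrete illustration of the proposed override model (the property chosen here is only an example):

```xml
<!-- hiveserver2-site.xml: entries here would override hive-site.xml
     for HiveServer2 only; hive cli would keep the hive-site.xml value -->
<configuration>
  <property>
    <name>hive.security.authorization.enabled</name>
    <value>true</value>
  </property>
</configuration>
```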





[jira] [Commented] (HIVE-5510) [WebHCat] GET job/queue return wrong job information

2014-07-07 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14053877#comment-14053877
 ] 

Eugene Koifman commented on HIVE-5510:
--

[~leftylev] The 1st example (under "JSON Output (fields)") seems to show the 
behavior from before the bug fix - isn't it likely to confuse users?  Should the 
example show the 'correct' output?
[~daijy] Does that make sense to you?

> [WebHCat] GET job/queue return wrong job information
> 
>
> Key: HIVE-5510
> URL: https://issues.apache.org/jira/browse/HIVE-5510
> Project: Hive
>  Issue Type: Bug
>  Components: WebHCat
>Affects Versions: 0.12.0
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.13.0
>
> Attachments: HIVE-5510-1.patch, HIVE-5510-2.patch, HIVE-5510-3.patch, 
> HIVE-5510-4.patch, test_harnesss_1381798977
>
>
> GET job/queue of a TempletonController job return weird information. It is a 
> mix of child job and itself. It should only pull the information of the 
> controller job itself.





[jira] [Commented] (HIVE-7282) HCatLoader fail to load Orc map with null key

2014-07-02 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14050878#comment-14050878
 ] 

Eugene Koifman commented on HIVE-7282:
--

I agree that a null key in a map is a bad idea.  Since we still have to deal with 
data which has already been written with null keys, could we add some table 
property that lets the user say "if the data contains a map with a null key, 
replace null with 'my_value' on read"?  (Perhaps the same property can be used to 
change a null key to 'my_value' on write to support existing writers, but this 
of course won't work for all cases.)  This way the null key can be disallowed.
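The read-side substitution suggested above can be sketched as follows; the helper name and the placeholder value are hypothetical, not an actual HCatalog or table-property API:

```java
import java.util.HashMap;
import java.util.Map;

// Hedged sketch: when materializing a map read from storage, substitute a
// configured placeholder for a null key so downstream consumers that reject
// null keys (e.g. Pig maps) can still load the data.
public class NullMapKeyDemo {
    static <V> Map<String, V> replaceNullKey(Map<String, V> in, String placeholder) {
        Map<String, V> out = new HashMap<>();
        for (Map.Entry<String, V> e : in.entrySet()) {
            out.put(e.getKey() == null ? placeholder : e.getKey(), e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Integer> m = new HashMap<>();
        m.put(null, 1);   // java.util.HashMap tolerates a null key; readers may not
        m.put("a", 2);
        System.out.println(replaceNullKey(m, "__NULL_KEY__"));
    }
}
```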

> HCatLoader fail to load Orc map with null key
> -
>
> Key: HIVE-7282
> URL: https://issues.apache.org/jira/browse/HIVE-7282
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.14.0
>
> Attachments: HIVE-7282-1.patch, HIVE-7282-2.patch
>
>
> Here is the stack:
> Get exception:
> AttemptID:attempt_1403634189382_0011_m_00_0 Info:Error: 
> org.apache.pig.backend.executionengine.ExecException: ERROR 6018: Error 
> converting read value to tuple
> at org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:76)
> at org.apache.hive.hcatalog.pig.HCatLoader.getNext(HCatLoader.java:58)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211)
> at 
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:533)
> at 
> org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
> at 
> org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hive.hcatalog.pig.PigHCatUtil.transformToPigMap(PigHCatUtil.java:469)
> at 
> org.apache.hive.hcatalog.pig.PigHCatUtil.extractPigObject(PigHCatUtil.java:404)
> at 
> org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:456)
> at 
> org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:374)
> at org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:64)
> ... 13 more





[jira] [Updated] (HIVE-6207) Integrate HCatalog with locking

2014-06-30 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-6207:
-

Attachment: HIVE-6207.patch

HIVE-6207.patch - preliminary patch which includes 
https://issues.apache.org/jira/secure/attachment/12651359/HIVE-7249.patch
https://issues.apache.org/jira/secure/attachment/12652273/HIVE-7256.patch
https://issues.apache.org/jira/secure/attachment/12653274/HIVE-7256.addendum.patch

> Integrate HCatalog with locking
> ---
>
> Key: HIVE-6207
> URL: https://issues.apache.org/jira/browse/HIVE-6207
> Project: Hive
>  Issue Type: Sub-task
>  Components: HCatalog
>Affects Versions: 0.13.0
>Reporter: Alan Gates
>Assignee: Eugene Koifman
> Fix For: 0.14.0
>
> Attachments: ACIDHCatalogDesign.pdf, HIVE-6207.patch
>
>
> HCatalog currently ignores any locks created by Hive users.  It should 
> respect the locks Hive creates as well as create locks itself when locking is 
> configured.





[jira] [Updated] (HIVE-6207) Integrate HCatalog with locking

2014-06-30 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-6207:
-

Attachment: (was: HIVE-6207.4.patch)

> Integrate HCatalog with locking
> ---
>
> Key: HIVE-6207
> URL: https://issues.apache.org/jira/browse/HIVE-6207
> Project: Hive
>  Issue Type: Sub-task
>  Components: HCatalog
>Affects Versions: 0.13.0
>Reporter: Alan Gates
>Assignee: Eugene Koifman
> Fix For: 0.14.0
>
> Attachments: ACIDHCatalogDesign.pdf
>
>
> HCatalog currently ignores any locks created by Hive users.  It should 
> respect the locks Hive creates as well as create locks itself when locking is 
> configured.





[jira] [Updated] (HIVE-7256) HiveTxnManager should be stateless

2014-06-30 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7256:
-

Attachment: HIVE-7256.addendum.patch

HIVE-7256.addendum.patch is in addition to HIVE-7256.patch.  It contains a fix 
to TxnManagerFactory so that both types of managers can coexist, and a fix to 
HcatDbTxnManager.reconstructTxnInfo() to make it work for read-only 
transactions (as designed in HIVE-7256.patch).  It also adds some more 
detailed logging.
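To illustrate the stateless requirement: any identifier one manager instance 
hands out must be usable by a different instance, which forces all lock state 
into a shared store keyed by the lock id.  A minimal sketch with illustrative 
types only (not the actual HiveTxnManager API):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Shared "metastore-side" lock table; the managers themselves hold no state.
class LockStore {
    private final AtomicLong ids = new AtomicLong();
    private final Map<Long, String> locks = new ConcurrentHashMap<>();

    long acquire(String resource) {
        long id = ids.incrementAndGet();
        locks.put(id, resource);
        return id;
    }

    boolean release(long id) {
        return locks.remove(id) != null;
    }
}

// A stateless manager: because the only state is in the shared store, a lock
// acquired through one instance can be released through a different instance,
// e.g. after a job retry re-creates the manager.
class StatelessTxnManager {
    private final LockStore store;
    StatelessTxnManager(LockStore store) { this.store = store; }
    long lock(String resource) { return store.acquire(resource); }
    boolean unlock(long id) { return store.release(id); }
}
```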

> HiveTxnManager should be stateless
> --
>
> Key: HIVE-7256
> URL: https://issues.apache.org/jira/browse/HIVE-7256
> Project: Hive
>  Issue Type: Bug
>  Components: Locking
>Affects Versions: 0.13.1
>Reporter: Eugene Koifman
>Assignee: Alan Gates
> Attachments: HIVE-7256.addendum.patch, HIVE-7256.patch
>
>
> In order to integrate HCat with Hive ACID, we should be able to create an 
> instance of HiveTxnManager and use it to acquire locks, and then release 
> those locks from a different instance of HiveTxnManager.
> One use case where this shows up is when a job using HCat is retried, since 
> calls to TxnManager are made from the job's OutputCommitter.
> Another is HCatReader/Writer.  For example, TestReaderWriter calls 
> setupJob() from one instance of OutputCommitterContainer and commitJob() 
> from another instance.  The 2nd case is perhaps better solved by ensuring 
> there is only 1 instance of OutputCommitterContainer.





[jira] [Commented] (HIVE-7249) HiveTxnManager.closeTxnManger() throws if called after commitTxn()

2014-06-30 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14048358#comment-14048358
 ] 

Eugene Koifman commented on HIVE-7249:
--

BTW, I did test with this patch.  I don't see the issue any more.

> HiveTxnManager.closeTxnManger() throws if called after commitTxn()
> --
>
> Key: HIVE-7249
> URL: https://issues.apache.org/jira/browse/HIVE-7249
> Project: Hive
>  Issue Type: Bug
>  Components: Locking
>Affects Versions: 0.13.1
>Reporter: Eugene Koifman
>Assignee: Alan Gates
> Attachments: HIVE-7249.patch
>
>
>  I openTxn() and acquireLocks() for a query that looks like "INSERT INTO T 
> PARTITION(p) SELECT * FROM T".
> Then I call commitTxn().  Then I call closeTxnManger() and get an exception 
> saying the lock was not found (the only lock in this txn).  So it seems the 
> TxnMgr doesn't know that commit released the locks.
> Here is the stack trace and some log output which may be useful:
> {noformat}
> 2014-06-17 15:54:40,771 DEBUG mapreduce.TransactionContext 
> (TransactionContext.java:onCommitJob(128)) - 
> onCommitJob(job_local557130041_0001). this=46719652
> 2014-06-17 15:54:40,771 DEBUG lockmgr.DbTxnManager 
> (DbTxnManager.java:commitTxn(205)) - Committing txn 1
> 2014-06-17 15:54:40,771 DEBUG txn.TxnHandler (TxnHandler.java:getDbTime(872)) 
> - Going to execute query 
> 2014-06-17 15:54:40,772 DEBUG txn.TxnHandler 
> (TxnHandler.java:heartbeatTxn(1423)) - Going to execute query <select 
> txn_state from TXNS where txn_id = 1 for update>
> 2014-06-17 15:54:40,773 DEBUG txn.TxnHandler 
> (TxnHandler.java:heartbeatTxn(1438)) - Going to execute update <update TXNS 
> set txn_last_heartbeat = 1403045680772 where txn_id = 1>
> 2014-06-17 15:54:40,778 DEBUG txn.TxnHandler 
> (TxnHandler.java:heartbeatTxn(1440)) - Going to commit
> 2014-06-17 15:54:40,779 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(344)) 
> - Going to execute insert  id, tc_database, tc_table, tc_partition from TXN_COMPONENTS where tc_txnid = 
> 1>
> 2014-06-17 15:54:40,784 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(352)) 
> - Going to execute update 
> 2014-06-17 15:54:40,788 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(356)) 
> - Going to execute update 
> 2014-06-17 15:54:40,791 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(359)) 
> - Going to execute update 
> 2014-06-17 15:54:40,794 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(361)) 
> - Going to commit
> 2014-06-17 15:54:40,795 WARN  mapreduce.TransactionContext 
> (TransactionContext.java:cleanup(317)) - 
> cleanupJob(JobID=job_local557130041_0001)this=46719652
> 2014-06-17 15:54:40,795 DEBUG lockmgr.DbLockManager 
> (DbLockManager.java:unlock(109)) - Unlocking id:1
> 2014-06-17 15:54:40,796 DEBUG txn.TxnHandler (TxnHandler.java:getDbTime(872)) 
> - Going to execute query 
> 2014-06-17 15:54:40,796 DEBUG txn.TxnHandler 
> (TxnHandler.java:heartbeatLock(1402)) - Going to execute update <update 
> HIVE_LOCKS set hl_last_heartbeat = 1403045680796 where hl_lock_ext_id = 1>
> 2014-06-17 15:54:40,800 DEBUG txn.TxnHandler 
> (TxnHandler.java:heartbeatLock(1405)) - Going to rollback
> 2014-06-17 15:54:40,804 ERROR metastore.RetryingHMSHandler 
> (RetryingHMSHandler.java:invoke(143)) - NoSuchLockException(message:No such 
> lock: 1)
> at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.heartbeatLock(TxnHandler.java:1407)
> at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.unlock(TxnHandler.java:477)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.unlock(HiveMetaStore.java:4817)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
> at com.sun.proxy.$Proxy14.unlock(Unknown Source)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.unlock(HiveMetaStoreClient.java:1598)
> at 
> org.apache.hadoop.hive.ql.lockmgr.DbLockManager.unlock(DbLockManager.java:110)
> at 
> org.apache.hadoop.hive.ql.lockmgr.DbLockManager.close(DbLockManager.java:162)
> at 
> org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.destruct(DbTxnManager.java:300)
> at 
> org.apache.hadoop.hive.ql.lockmgr.HiveTxnManagerImpl.closeTxnManager(HiveTxnManagerImpl.java:39)
> at 
> org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.closeTxnManager(DbTxnManager.java:43)
> at 
> org.apache.hive.hcatalog.mapreduce.TransactionContext.cleanup(TransactionContext.java:327)
> at 
> org.apache

[jira] [Commented] (HIVE-7282) HCatLoader fail to load Orc map with null key

2014-06-27 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14046273#comment-14046273
 ] 

Eugene Koifman commented on HIVE-7282:
--

Also, can HIVE-5020 now be closed as a duplicate?

> HCatLoader fail to load Orc map with null key
> -
>
> Key: HIVE-7282
> URL: https://issues.apache.org/jira/browse/HIVE-7282
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.14.0
>
> Attachments: HIVE-7282-1.patch, HIVE-7282-2.patch
>
>
> Here is the stack:
> Get exception:
> AttemptID:attempt_1403634189382_0011_m_00_0 Info:Error: 
> org.apache.pig.backend.executionengine.ExecException: ERROR 6018: Error 
> converting read value to tuple
> at org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:76)
> at org.apache.hive.hcatalog.pig.HCatLoader.getNext(HCatLoader.java:58)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211)
> at 
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:533)
> at 
> org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
> at 
> org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hive.hcatalog.pig.PigHCatUtil.transformToPigMap(PigHCatUtil.java:469)
> at 
> org.apache.hive.hcatalog.pig.PigHCatUtil.extractPigObject(PigHCatUtil.java:404)
> at 
> org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:456)
> at 
> org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:374)
> at org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:64)
> ... 13 more





[jira] [Commented] (HIVE-7288) Enable support for -libjars and -archives in WebHcat for Streaming MapReduce jobs

2014-06-27 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14046261#comment-14046261
 ] 

Eugene Koifman commented on HIVE-7288:
--

[~shanyu] Please add tests for this feature.

> Enable support for -libjars and -archives in WebHcat for Streaming MapReduce 
> jobs
> -
>
> Key: HIVE-7288
> URL: https://issues.apache.org/jira/browse/HIVE-7288
> Project: Hive
>  Issue Type: New Feature
>  Components: WebHCat
>Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.13.1
> Environment: HDInsight deploying HDP 2.1;  Also HDP 2.1 on Windows 
>Reporter: Azim Uddin
>Assignee: shanyu zhao
> Attachments: hive-7288.patch
>
>
> Issue:
> ==
> Due to lack of parameters (or support for) equivalent of '-libjars' and 
> '-archives' in WebHcat REST API, we cannot use an external Java Jars or 
> Archive files with a Streaming MapReduce job, when the job is submitted via 
> WebHcat/templeton. 
> I am citing a few use cases here, but there can be plenty of scenarios like 
> this-
> #1 (for -archives): In order to use R with a hadoop distribution like 
> HDInsight or HDP on Windows, we could package the R directory up in a zip 
> file and 
> or HDP on Windows, we could package the R directory up in a zip file and 
> rename it to r.jar and put it into HDFS or WASB. We can then do 
> something like this from hadoop command line (ignore the wasb syntax, same 
> command can be run with hdfs) - 
> hadoop jar %HADOOP_HOME%\lib\hadoop-streaming.jar -archives 
> wasb:///example/jars/r.jar -files 
> "wasb:///example/apps/mapper.r,wasb:///example/apps/reducer.r" -mapper 
> "./r.jar/bin/Rscript.exe mapper.r" -reducer "./r.jar/bin/Rscript.exe 
> reducer.r" -input /example/data/gutenberg -output /probe/r/wordcount
> This works from hadoop command line, but due to lack of support for 
> '-archives' parameter in WebHcat, we can't submit the same Streaming MR job 
> via WebHcat.
> #2 (for -libjars):
> Consider a scenario where a user would like to use a custom inputFormat with 
> a Streaming MapReduce job and wrote his own custom InputFormat JAR. From a 
> hadoop command line we can do something like this - 
> hadoop jar /path/to/hadoop-streaming.jar \
> -libjars /path/to/custom-formats.jar \
> -D map.output.key.field.separator=, \
> -D mapred.text.key.partitioner.options=-k1,1 \
> -input my_data/ \
> -output my_output/ \
> -outputformat test.example.outputformat.DateFieldMultipleOutputFormat 
> \
> -mapper my_mapper.py \
> -reducer my_reducer.py \
> But due to lack of support for '-libjars' parameter for streaming MapReduce 
> job in WebHcat, we can't submit the above streaming MR job (that uses a 
> custom Java JAR) via WebHcat.
> Impact:
> 
> We think, being able to submit jobs remotely is a vital feature for hadoop to 
> be enterprise-ready and WebHcat plays an important role there. Streaming 
> MapReduce job is also very important for interoperability. So, it would be 
> very useful to keep WebHcat on par with hadoop command line in terms of 
> streaming MR job submission capability.
> Ask:
> 
> Enable parameter support for 'libjars' and 'archives' in WebHcat for Hadoop 
> streaming jobs in WebHcat.





[jira] [Commented] (HIVE-7282) HCatLoader fail to load Orc map with null key

2014-06-27 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14046250#comment-14046250
 ] 

Eugene Koifman commented on HIVE-7282:
--

Would it not make more sense to add the new test to 
TestHCatLoaderComplexSchema, so that it's run with both ORC and RCFile?

> HCatLoader fail to load Orc map with null key
> -
>
> Key: HIVE-7282
> URL: https://issues.apache.org/jira/browse/HIVE-7282
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.14.0
>
> Attachments: HIVE-7282-1.patch, HIVE-7282-2.patch
>
>
> Here is the stack:
> Get exception:
> AttemptID:attempt_1403634189382_0011_m_00_0 Info:Error: 
> org.apache.pig.backend.executionengine.ExecException: ERROR 6018: Error 
> converting read value to tuple
> at org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:76)
> at org.apache.hive.hcatalog.pig.HCatLoader.getNext(HCatLoader.java:58)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211)
> at 
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:533)
> at 
> org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
> at 
> org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hive.hcatalog.pig.PigHCatUtil.transformToPigMap(PigHCatUtil.java:469)
> at 
> org.apache.hive.hcatalog.pig.PigHCatUtil.extractPigObject(PigHCatUtil.java:404)
> at 
> org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:456)
> at 
> org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:374)
> at org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:64)
> ... 13 more





[jira] [Commented] (HIVE-7090) Support session-level temporary tables in Hive

2014-06-23 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14041378#comment-14041378
 ] 

Eugene Koifman commented on HIVE-7090:
--

In that case it may make sense to generate unique names for artifacts that may 
be left over.  The initial description in this ticket mentions 3rd-party tools 
that will use this feature - I imagine they will generate the same temp table 
name each time, which may cause weird failures after a crash.

> Support session-level temporary tables in Hive
> --
>
> Key: HIVE-7090
> URL: https://issues.apache.org/jira/browse/HIVE-7090
> Project: Hive
>  Issue Type: Bug
>  Components: SQL
>Reporter: Gunther Hagleitner
>Assignee: Harish Butani
> Attachments: HIVE-7090.1.patch, HIVE-7090.2.patch
>
>
> It's common to see sql scripts that create some temporary table as an 
> intermediate result, run some additional queries against it and then clean up 
> at the end.
> We should support temporary tables properly, meaning automatically manage the 
> life cycle and make sure the visibility is restricted to the creating 
> connection/session. Without these it's common to see left over tables in 
> meta-store or weird errors with clashing tmp table names.
> Proposed syntax:
> CREATE TEMPORARY TABLE 
> CTAS, CTL, INSERT INTO, should all be supported as usual.
> Knowing that a user wants a temp table can enable us to further optimize 
> access to it. E.g.: temp tables should be kept in memory where possible, 
> compactions and merging table files aren't required, ...





[jira] [Updated] (HIVE-6207) Integrate HCatalog with locking

2014-06-23 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-6207:
-

Attachment: ACIDHCatalogDesign.pdf

> Integrate HCatalog with locking
> ---
>
> Key: HIVE-6207
> URL: https://issues.apache.org/jira/browse/HIVE-6207
> Project: Hive
>  Issue Type: Sub-task
>  Components: HCatalog
>Affects Versions: 0.13.0
>Reporter: Alan Gates
>Assignee: Eugene Koifman
> Fix For: 0.14.0
>
> Attachments: ACIDHCatalogDesign.pdf, HIVE-6207.4.patch
>
>
> HCatalog currently ignores any locks created by Hive users.  It should 
> respect the locks Hive creates as well as create locks itself when locking is 
> configured.





[jira] [Commented] (HIVE-7090) Support session-level temporary tables in Hive

2014-06-23 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14041250#comment-14041250
 ] 

Eugene Koifman commented on HIVE-7090:
--

If the client fails, how does the temp table get cleaned up?

> Support session-level temporary tables in Hive
> --
>
> Key: HIVE-7090
> URL: https://issues.apache.org/jira/browse/HIVE-7090
> Project: Hive
>  Issue Type: Bug
>  Components: SQL
>Reporter: Gunther Hagleitner
>Assignee: Harish Butani
> Attachments: HIVE-7090.1.patch, HIVE-7090.2.patch
>
>
> It's common to see sql scripts that create some temporary table as an 
> intermediate result, run some additional queries against it and then clean up 
> at the end.
> We should support temporary tables properly, meaning automatically manage the 
> life cycle and make sure the visibility is restricted to the creating 
> connection/session. Without these it's common to see left over tables in 
> meta-store or weird errors with clashing tmp table names.
> Proposed syntax:
> CREATE TEMPORARY TABLE 
> CTAS, CTL, INSERT INTO, should all be supported as usual.
> Knowing that a user wants a temp table can enable us to further optimize 
> access to it. E.g.: temp tables should be kept in memory where possible, 
> compactions and merging table files aren't required, ...





[jira] [Commented] (HIVE-7249) HiveTxnManager.closeTxnManger() throws if called after commitTxn()

2014-06-23 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14041158#comment-14041158
 ] 

Eugene Koifman commented on HIVE-7249:
--

Yes, I did turn on DbTxnManager, but since we are creating an HCat-specific 
API, let me retest once that is ready.

> HiveTxnManager.closeTxnManger() throws if called after commitTxn()
> --
>
> Key: HIVE-7249
> URL: https://issues.apache.org/jira/browse/HIVE-7249
> Project: Hive
>  Issue Type: Bug
>  Components: Locking
>Affects Versions: 0.13.1
>Reporter: Eugene Koifman
>Assignee: Alan Gates
> Attachments: HIVE-7249.patch
>
>
>  I openTxn() and acquireLocks() for a query that looks like "INSERT INTO T 
> PARTITION(p) SELECT * FROM T".
> Then I call commitTxn().  Then I call closeTxnManger() and get an exception 
> saying the lock was not found (the only lock in this txn).  So it seems the 
> TxnMgr doesn't know that commit released the locks.
> Here is the stack trace and some log output which may be useful:
> {noformat}
> 2014-06-17 15:54:40,771 DEBUG mapreduce.TransactionContext 
> (TransactionContext.java:onCommitJob(128)) - 
> onCommitJob(job_local557130041_0001). this=46719652
> 2014-06-17 15:54:40,771 DEBUG lockmgr.DbTxnManager 
> (DbTxnManager.java:commitTxn(205)) - Committing txn 1
> 2014-06-17 15:54:40,771 DEBUG txn.TxnHandler (TxnHandler.java:getDbTime(872)) 
> - Going to execute query 
> 2014-06-17 15:54:40,772 DEBUG txn.TxnHandler 
> (TxnHandler.java:heartbeatTxn(1423)) - Going to execute query <select 
> txn_state from TXNS where txn_id = 1 for update>
> 2014-06-17 15:54:40,773 DEBUG txn.TxnHandler 
> (TxnHandler.java:heartbeatTxn(1438)) - Going to execute update <update TXNS 
> set txn_last_heartbeat = 1403045680772 where txn_id = 1>
> 2014-06-17 15:54:40,778 DEBUG txn.TxnHandler 
> (TxnHandler.java:heartbeatTxn(1440)) - Going to commit
> 2014-06-17 15:54:40,779 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(344)) 
> - Going to execute insert  id, tc_database, tc_table, tc_partition from TXN_COMPONENTS where tc_txnid = 
> 1>
> 2014-06-17 15:54:40,784 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(352)) 
> - Going to execute update 
> 2014-06-17 15:54:40,788 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(356)) 
> - Going to execute update 
> 2014-06-17 15:54:40,791 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(359)) 
> - Going to execute update 
> 2014-06-17 15:54:40,794 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(361)) 
> - Going to commit
> 2014-06-17 15:54:40,795 WARN  mapreduce.TransactionContext 
> (TransactionContext.java:cleanup(317)) - 
> cleanupJob(JobID=job_local557130041_0001)this=46719652
> 2014-06-17 15:54:40,795 DEBUG lockmgr.DbLockManager 
> (DbLockManager.java:unlock(109)) - Unlocking id:1
> 2014-06-17 15:54:40,796 DEBUG txn.TxnHandler (TxnHandler.java:getDbTime(872)) 
> - Going to execute query 
> 2014-06-17 15:54:40,796 DEBUG txn.TxnHandler 
> (TxnHandler.java:heartbeatLock(1402)) - Going to execute update <update 
> HIVE_LOCKS set hl_last_heartbeat = 1403045680796 where hl_lock_ext_id = 1>
> 2014-06-17 15:54:40,800 DEBUG txn.TxnHandler 
> (TxnHandler.java:heartbeatLock(1405)) - Going to rollback
> 2014-06-17 15:54:40,804 ERROR metastore.RetryingHMSHandler 
> (RetryingHMSHandler.java:invoke(143)) - NoSuchLockException(message:No such 
> lock: 1)
> at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.heartbeatLock(TxnHandler.java:1407)
> at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.unlock(TxnHandler.java:477)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.unlock(HiveMetaStore.java:4817)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
> at com.sun.proxy.$Proxy14.unlock(Unknown Source)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.unlock(HiveMetaStoreClient.java:1598)
> at 
> org.apache.hadoop.hive.ql.lockmgr.DbLockManager.unlock(DbLockManager.java:110)
> at 
> org.apache.hadoop.hive.ql.lockmgr.DbLockManager.close(DbLockManager.java:162)
> at 
> org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.destruct(DbTxnManager.java:300)
> at 
> org.apache.hadoop.hive.ql.lockmgr.HiveTxnManagerImpl.closeTxnManager(HiveTxnManagerImpl.java:39)
> at 
> org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.closeTxnManager(DbTxnManager.java:43)
> at 
> org.apache.hive.hcatalog.mapreduce.TransactionContext.cleanup(Tran

[jira] [Updated] (HIVE-6207) Integrate HCatalog with locking

2014-06-19 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-6207:
-

Attachment: HIVE-6207.4.patch

preliminary patch

> Integrate HCatalog with locking
> ---
>
> Key: HIVE-6207
> URL: https://issues.apache.org/jira/browse/HIVE-6207
> Project: Hive
>  Issue Type: Sub-task
>  Components: HCatalog
>Affects Versions: 0.13.0
>Reporter: Alan Gates
>Assignee: Eugene Koifman
> Fix For: 0.14.0
>
> Attachments: HIVE-6207.4.patch
>
>
> HCatalog currently ignores any locks created by Hive users.  It should 
> respect the locks Hive creates as well as create locks itself when locking is 
> configured.





[jira] [Commented] (HIVE-7249) HiveTxnManager.closeTxnManger() throws if called after commitTxn()

2014-06-19 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038119#comment-14038119
 ] 

Eugene Koifman commented on HIVE-7249:
--

Here is the thread dump, though there doesn't appear to be anything 
interesting in it:
{noformat}
Picked up JAVA_TOOL_OPTIONS:  -Djava.awt.headless=true 
-Dapple.awt.UIElement=true
57554 
87066 
/Users/ekoifman/dev/hive/hcatalog/core/target/surefire/surefirebooter3727332902234772866.jar
87243 sun.tools.jps.Jps
87056 org.codehaus.plexus.classworlds.launcher.Launcher
ekoifman:hcatalog ekoifman$ jstack 87066
Picked up JAVA_TOOL_OPTIONS:  -Djava.awt.headless=true 
-Dapple.awt.UIElement=true
2014-06-19 16:38:27
Full thread dump Java HotSpot(TM) 64-Bit Server VM (20.51-b01-457 mixed mode):

"Attach Listener" daemon prio=9 tid=7ffded8c7800 nid=0x10c84 waiting on 
condition []
   java.lang.Thread.State: RUNNABLE

"BoneCP-pool-watch-thread" daemon prio=5 tid=7ffde9e89000 nid=0x10defb000 
waiting on condition [10defa000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <7b8e93d10> (a 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
at 
java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:322)
at com.jolbox.bonecp.PoolWatchThread.run(PoolWatchThread.java:75)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at java.lang.Thread.run(Thread.java:680)

"BoneCP-keep-alive-scheduler" daemon prio=5 tid=7ffde9e88000 nid=0x10ddf8000 waiting on condition [10ddf7000]
   java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <7b8fde4d8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:196)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2025)
at java.util.concurrent.DelayQueue.take(DelayQueue.java:164)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:609)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:602)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:957)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:917)
at java.lang.Thread.run(Thread.java:680)

"com.google.common.base.internal.Finalizer" daemon prio=5 tid=7ffde9e9a000 nid=0x10dcf5000 in Object.wait() [10dcf4000]
   java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <7b906a3a8> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118)
- locked <7b906a3a8> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134)
at com.google.common.base.internal.Finalizer.run(Finalizer.java:127)

"BoneCP-pool-watch-thread" daemon prio=5 tid=7ffde91c6800 nid=0x10d068000 waiting on condition [10d067000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <7b870b118> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
at java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:322)
at com.jolbox.bonecp.PoolWatchThread.run(PoolWatchThread.java:75)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at java.lang.Thread.run(Thread.java:680)

"BoneCP-keep-alive-scheduler" daemon prio=5 tid=7ffdec031800 nid=0x10cf65000 waiting on condition [10cf64000]
   java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <7b86fd7c0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:196)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueu

[jira] [Commented] (HIVE-7249) HiveTxnManager.closeTxnManger() throws if called after commitTxn()

2014-06-19 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038092#comment-14038092
 ] 

Eugene Koifman commented on HIVE-7249:
--

[~alangates] org.apache.hive.hcatalog.fileformats.TestOrcDynamicPartitioned 
gets wedged with this patch

> HiveTxnManager.closeTxnManger() throws if called after commitTxn()
> --
>
> Key: HIVE-7249
> URL: https://issues.apache.org/jira/browse/HIVE-7249
> Project: Hive
>  Issue Type: Bug
>  Components: Locking
>Affects Versions: 0.13.1
>Reporter: Eugene Koifman
>Assignee: Alan Gates
> Attachments: HIVE-7249.patch
>
>
>  I openTxn() and acquireLocks() for a query that looks like "INSERT INTO T 
> PARTITION(p) SELECT * FROM T".
> Then I call commitTxn().  Then I call closeTxnManger() I get an exception 
> saying lock not found (the only lock in this txn).  So it seems TxnMgr 
> doesn't know that commit released the locks.
> Here is the stack trace and some log output which may be useful:
> {noformat}
> 2014-06-17 15:54:40,771 DEBUG mapreduce.TransactionContext 
> (TransactionContext.java:onCommitJob(128)) - 
> onCommitJob(job_local557130041_0001). this=46719652
> 2014-06-17 15:54:40,771 DEBUG lockmgr.DbTxnManager 
> (DbTxnManager.java:commitTxn(205)) - Committing txn 1
> 2014-06-17 15:54:40,771 DEBUG txn.TxnHandler (TxnHandler.java:getDbTime(872)) 
> - Going to execute query 
> 2014-06-17 15:54:40,772 DEBUG txn.TxnHandler 
> (TxnHandler.java:heartbeatTxn(1423)) - Going to execute query <select txn_state from TXNS where txn_id = 1 for update>
> 2014-06-17 15:54:40,773 DEBUG txn.TxnHandler 
> (TxnHandler.java:heartbeatTxn(1438)) - Going to execute update <update TXNS set txn_last_heartbeat = 1403045680772 where txn_id = 1>
> 2014-06-17 15:54:40,778 DEBUG txn.TxnHandler 
> (TxnHandler.java:heartbeatTxn(1440)) - Going to commit
> 2014-06-17 15:54:40,779 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(344)) 
> - Going to execute insert <insert into COMPLETED_TXN_COMPONENTS select tc_txnid, tc_database, tc_table, tc_partition from TXN_COMPONENTS where tc_txnid = 1>
> 2014-06-17 15:54:40,784 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(352)) 
> - Going to execute update 
> 2014-06-17 15:54:40,788 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(356)) 
> - Going to execute update 
> 2014-06-17 15:54:40,791 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(359)) 
> - Going to execute update 
> 2014-06-17 15:54:40,794 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(361)) 
> - Going to commit
> 2014-06-17 15:54:40,795 WARN  mapreduce.TransactionContext 
> (TransactionContext.java:cleanup(317)) - 
> cleanupJob(JobID=job_local557130041_0001)this=46719652
> 2014-06-17 15:54:40,795 DEBUG lockmgr.DbLockManager 
> (DbLockManager.java:unlock(109)) - Unlocking id:1
> 2014-06-17 15:54:40,796 DEBUG txn.TxnHandler (TxnHandler.java:getDbTime(872)) 
> - Going to execute query 
> 2014-06-17 15:54:40,796 DEBUG txn.TxnHandler 
> (TxnHandler.java:heartbeatLock(1402)) - Going to execute update <update HIVE_LOCKS set hl_last_heartbeat = 1403045680796 where hl_lock_ext_id = 1>
> 2014-06-17 15:54:40,800 DEBUG txn.TxnHandler 
> (TxnHandler.java:heartbeatLock(1405)) - Going to rollback
> 2014-06-17 15:54:40,804 ERROR metastore.RetryingHMSHandler 
> (RetryingHMSHandler.java:invoke(143)) - NoSuchLockException(message:No such 
> lock: 1)
> at org.apache.hadoop.hive.metastore.txn.TxnHandler.heartbeatLock(TxnHandler.java:1407)
> at org.apache.hadoop.hive.metastore.txn.TxnHandler.unlock(TxnHandler.java:477)
> at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.unlock(HiveMetaStore.java:4817)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
> at com.sun.proxy.$Proxy14.unlock(Unknown Source)
> at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.unlock(HiveMetaStoreClient.java:1598)
> at org.apache.hadoop.hive.ql.lockmgr.DbLockManager.unlock(DbLockManager.java:110)
> at org.apache.hadoop.hive.ql.lockmgr.DbLockManager.close(DbLockManager.java:162)
> at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.destruct(DbTxnManager.java:300)
> at org.apache.hadoop.hive.ql.lockmgr.HiveTxnManagerImpl.closeTxnManager(HiveTxnManagerImpl.java:39)
> at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.closeTxnManager(DbTxnManager.java:43)
> at org.apache.hive.hcatalog.mapreduce.TransactionContext.cleanup(TransactionConte

[jira] [Updated] (HIVE-7256) HiveTxnManager should be stateless

2014-06-19 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7256:
-

Assignee: Alan Gates  (was: Eugene Koifman)

> HiveTxnManager should be stateless
> --
>
> Key: HIVE-7256
> URL: https://issues.apache.org/jira/browse/HIVE-7256
> Project: Hive
>  Issue Type: Bug
>  Components: Locking
>Affects Versions: 0.13.1
>Reporter: Eugene Koifman
>Assignee: Alan Gates
>
> In order to integrate HCat with Hive ACID, we should be able to create an 
> instance of HiveTxnManager and use it to acquire locks, and release locks 
> from a different instance of HiveTxnManager.
> One use case where this shows up is when a job using HCat is retried, since 
> calls to TxnManager are made from the job's OutputCommitter.
> Another is HCatReader/Writer.  For example, TestReaderWriter calls 
> setupJob()  from one instance of OutputCommitterContainer and commitJob() 
> from another instance.  The 2nd case is perhaps better solved by ensuring 
> there is only 1 instance of OutputCommitterContainer.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-7256) HiveTxnManager should be stateless

2014-06-18 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-7256:


 Summary: HiveTxnManager should be stateless
 Key: HIVE-7256
 URL: https://issues.apache.org/jira/browse/HIVE-7256
 Project: Hive
  Issue Type: Bug
  Components: Locking
Affects Versions: 0.13.1
Reporter: Eugene Koifman
Assignee: Eugene Koifman


In order to integrate HCat with Hive ACID, we should be able to create an 
instance of HiveTxnManager and use it to acquire locks, and release locks from 
a different instance of HiveTxnManager.

One use case where this shows up is when a job using HCat is retried, since 
calls to TxnManager are made from the job's OutputCommitter.

Another is HCatReader/Writer.  For example, TestReaderWriter calls setupJob() 
 from one instance of OutputCommitterContainer and commitJob() from another 
instance.  The 2nd case is perhaps better solved by ensuring there is only 1 
instance of OutputCommitterContainer.
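The statelessness requirement above can be reduced to a small sketch (illustrative Python, not Hive's actual API; all names here are made up): if lock state lives in a shared store keyed by lock id, rather than inside the manager instance, a second manager instance can release locks acquired by the first.

```python
class SharedLockStore:
    """Stands in for the metastore DB tables that back the lock manager."""
    def __init__(self):
        self.locks = {}        # lock_id -> resource name
        self.next_id = 0

    def acquire(self, resource):
        self.next_id += 1
        self.locks[self.next_id] = resource
        return self.next_id

    def release(self, lock_id):
        if lock_id not in self.locks:
            raise KeyError("No such lock: %d" % lock_id)
        del self.locks[lock_id]


class TxnManager:
    """Stateless: holds only a reference to the shared store, no lock list."""
    def __init__(self, store):
        self.store = store

    def acquire_locks(self, resource):
        return self.store.acquire(resource)

    def release_locks(self, lock_id):
        self.store.release(lock_id)


store = SharedLockStore()
mgr1 = TxnManager(store)                 # e.g. created during setupJob()
lock_id = mgr1.acquire_locks("default.t")

mgr2 = TxnManager(store)                 # e.g. a different instance in commitJob()
mgr2.release_locks(lock_id)              # works: no per-instance state is needed
print(len(store.locks))                  # -> 0
```

Because the only durable state is in the store, a retried job (or a different OutputCommitterContainer instance) can hand the lock id to a fresh manager and still release correctly.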







[jira] [Updated] (HIVE-7249) HiveTxnManager.closeTxnManger() throws if called after commitTxn()

2014-06-17 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7249:
-

Description: 
 I openTxn() and acquireLocks() for a query that looks like "INSERT INTO T 
PARTITION(p) SELECT * FROM T".
Then I call commitTxn().  Then I call closeTxnManger() I get an exception 
saying lock not found (the only lock in this txn).  So it seems TxnMgr doesn't 
know that commit released the locks.

Here is the stack trace and some log output which may be useful:
{noformat}
2014-06-17 15:54:40,771 DEBUG mapreduce.TransactionContext 
(TransactionContext.java:onCommitJob(128)) - 
onCommitJob(job_local557130041_0001). this=46719652
2014-06-17 15:54:40,771 DEBUG lockmgr.DbTxnManager 
(DbTxnManager.java:commitTxn(205)) - Committing txn 1
2014-06-17 15:54:40,771 DEBUG txn.TxnHandler (TxnHandler.java:getDbTime(872)) - 
Going to execute query 
2014-06-17 15:54:40,772 DEBUG txn.TxnHandler 
(TxnHandler.java:heartbeatTxn(1423)) - Going to execute query 
2014-06-17 15:54:40,773 DEBUG txn.TxnHandler 
(TxnHandler.java:heartbeatTxn(1438)) - Going to execute update 
2014-06-17 15:54:40,778 DEBUG txn.TxnHandler 
(TxnHandler.java:heartbeatTxn(1440)) - Going to commit
2014-06-17 15:54:40,779 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(344)) - 
Going to execute insert 
2014-06-17 15:54:40,784 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(352)) - 
Going to execute update 
2014-06-17 15:54:40,788 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(356)) - 
Going to execute update 
2014-06-17 15:54:40,791 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(359)) - 
Going to execute update 
2014-06-17 15:54:40,794 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(361)) - 
Going to commit
2014-06-17 15:54:40,795 WARN  mapreduce.TransactionContext 
(TransactionContext.java:cleanup(317)) - 
cleanupJob(JobID=job_local557130041_0001)this=46719652
2014-06-17 15:54:40,795 DEBUG lockmgr.DbLockManager 
(DbLockManager.java:unlock(109)) - Unlocking id:1
2014-06-17 15:54:40,796 DEBUG txn.TxnHandler (TxnHandler.java:getDbTime(872)) - 
Going to execute query 
2014-06-17 15:54:40,796 DEBUG txn.TxnHandler 
(TxnHandler.java:heartbeatLock(1402)) - Going to execute update 
2014-06-17 15:54:40,800 DEBUG txn.TxnHandler 
(TxnHandler.java:heartbeatLock(1405)) - Going to rollback
2014-06-17 15:54:40,804 ERROR metastore.RetryingHMSHandler 
(RetryingHMSHandler.java:invoke(143)) - NoSuchLockException(message:No such 
lock: 1)
at org.apache.hadoop.hive.metastore.txn.TxnHandler.heartbeatLock(TxnHandler.java:1407)
at org.apache.hadoop.hive.metastore.txn.TxnHandler.unlock(TxnHandler.java:477)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.unlock(HiveMetaStore.java:4817)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
at com.sun.proxy.$Proxy14.unlock(Unknown Source)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.unlock(HiveMetaStoreClient.java:1598)
at org.apache.hadoop.hive.ql.lockmgr.DbLockManager.unlock(DbLockManager.java:110)
at org.apache.hadoop.hive.ql.lockmgr.DbLockManager.close(DbLockManager.java:162)
at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.destruct(DbTxnManager.java:300)
at org.apache.hadoop.hive.ql.lockmgr.HiveTxnManagerImpl.closeTxnManager(HiveTxnManagerImpl.java:39)
at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.closeTxnManager(DbTxnManager.java:43)
at org.apache.hive.hcatalog.mapreduce.TransactionContext.cleanup(TransactionContext.java:327)
at org.apache.hive.hcatalog.mapreduce.TransactionContext.onCommitJob(TransactionContext.java:142)
at org.apache.hive.hcatalog.mapreduce.OutputCommitterContainer.commitJob(OutputCommitterContainer.java:61)
at org.apache.hive.hcatalog.mapreduce.FileOutputCommitterContainer.commitJob(FileOutputCommitterContainer.java:251)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:537)

2014-06-17 15:54:40,804 ERROR lockmgr.DbLockManager 
(DbLockManager.java:unlock(114)) - Metastore could find no record of lock 1
2014-06-17 15:54:40,810 INFO  mapreduce.FileOutputCommitterContainer 
(FileOutputCommitterContainer.java:cancelDelegationTokens(976)) - Cancelling 
delegation token for the job.
{noformat}

  was:
 I openTxn() and acquireLocks() for a query that looks like "INSERT INTO T 
PARTITION(p) SELECT * FROM T".
Then I call commitTxn().  Then I call closeTxnManger() I get an exception 
saying lock not found (the only lock in this txn).  So it seems TxnMgr doesn't 
know t

[jira] [Created] (HIVE-7249) HiveTxnManager.closeTxnManger() throws if called after commitTxn()

2014-06-17 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-7249:


 Summary: HiveTxnManager.closeTxnManger() throws if called after 
commitTxn()
 Key: HIVE-7249
 URL: https://issues.apache.org/jira/browse/HIVE-7249
 Project: Hive
  Issue Type: Bug
  Components: Locking
Affects Versions: 0.13.1
Reporter: Eugene Koifman
Assignee: Alan Gates


 I openTxn() and acquireLocks() for a query that looks like "INSERT INTO T 
PARTITION(p) SELECT * FROM T".
Then I call commitTxn().  Then I call closeTxnManger() I get an exception 
saying lock not found (the only lock in this txn).  So it seems TxnMgr doesn't 
know that commit released the locks.

Here is the stack trace and some log output which may be useful:
2014-06-17 15:54:40,771 DEBUG mapreduce.TransactionContext 
(TransactionContext.java:onCommitJob(128)) - 
onCommitJob(job_local557130041_0001). this=46719652
2014-06-17 15:54:40,771 DEBUG lockmgr.DbTxnManager 
(DbTxnManager.java:commitTxn(205)) - Committing txn 1
2014-06-17 15:54:40,771 DEBUG txn.TxnHandler (TxnHandler.java:getDbTime(872)) - 
Going to execute query 
2014-06-17 15:54:40,772 DEBUG txn.TxnHandler 
(TxnHandler.java:heartbeatTxn(1423)) - Going to execute query 
2014-06-17 15:54:40,773 DEBUG txn.TxnHandler 
(TxnHandler.java:heartbeatTxn(1438)) - Going to execute update 
2014-06-17 15:54:40,778 DEBUG txn.TxnHandler 
(TxnHandler.java:heartbeatTxn(1440)) - Going to commit
2014-06-17 15:54:40,779 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(344)) - 
Going to execute insert 
2014-06-17 15:54:40,784 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(352)) - 
Going to execute update 
2014-06-17 15:54:40,788 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(356)) - 
Going to execute update 
2014-06-17 15:54:40,791 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(359)) - 
Going to execute update 
2014-06-17 15:54:40,794 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(361)) - 
Going to commit
2014-06-17 15:54:40,795 WARN  mapreduce.TransactionContext 
(TransactionContext.java:cleanup(317)) - 
cleanupJob(JobID=job_local557130041_0001)this=46719652
2014-06-17 15:54:40,795 DEBUG lockmgr.DbLockManager 
(DbLockManager.java:unlock(109)) - Unlocking id:1
2014-06-17 15:54:40,796 DEBUG txn.TxnHandler (TxnHandler.java:getDbTime(872)) - 
Going to execute query 
2014-06-17 15:54:40,796 DEBUG txn.TxnHandler 
(TxnHandler.java:heartbeatLock(1402)) - Going to execute update 
2014-06-17 15:54:40,800 DEBUG txn.TxnHandler 
(TxnHandler.java:heartbeatLock(1405)) - Going to rollback
2014-06-17 15:54:40,804 ERROR metastore.RetryingHMSHandler 
(RetryingHMSHandler.java:invoke(143)) - NoSuchLockException(message:No such 
lock: 1)
at org.apache.hadoop.hive.metastore.txn.TxnHandler.heartbeatLock(TxnHandler.java:1407)
at org.apache.hadoop.hive.metastore.txn.TxnHandler.unlock(TxnHandler.java:477)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.unlock(HiveMetaStore.java:4817)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
at com.sun.proxy.$Proxy14.unlock(Unknown Source)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.unlock(HiveMetaStoreClient.java:1598)
at org.apache.hadoop.hive.ql.lockmgr.DbLockManager.unlock(DbLockManager.java:110)
at org.apache.hadoop.hive.ql.lockmgr.DbLockManager.close(DbLockManager.java:162)
at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.destruct(DbTxnManager.java:300)
at org.apache.hadoop.hive.ql.lockmgr.HiveTxnManagerImpl.closeTxnManager(HiveTxnManagerImpl.java:39)
at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.closeTxnManager(DbTxnManager.java:43)
at org.apache.hive.hcatalog.mapreduce.TransactionContext.cleanup(TransactionContext.java:327)
at org.apache.hive.hcatalog.mapreduce.TransactionContext.onCommitJob(TransactionContext.java:142)
at org.apache.hive.hcatalog.mapreduce.OutputCommitterContainer.commitJob(OutputCommitterContainer.java:61)
at org.apache.hive.hcatalog.mapreduce.FileOutputCommitterContainer.commitJob(FileOutputCommitterContainer.java:251)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:537)

2014-06-17 15:54:40,804 ERROR lockmgr.DbLockManager 
(DbLockManager.java:unlock(114)) - Metastore could find no record of lock 1
2014-06-17 15:54:40,810 INFO  mapreduce.FileOutputCommitterContainer 
(FileOutputCommitterContainer.java:cancelDelegationTokens(976)) - Cancelling 
delegation token for the job.
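The failure mode described above can be reduced to a few lines (an illustrative Python sketch, not the real DbTxnManager; the class and method names are made up): commitTxn() releases the locks on the server side, but the manager's local lock list is not cleared, so the cleanup done by closeTxnManger() tries to unlock ids the server no longer knows about.

```python
class MetastoreSim:
    """Stands in for the metastore lock tables."""
    def __init__(self):
        self.locks = set()

    def lock(self, lock_id):
        self.locks.add(lock_id)

    def unlock(self, lock_id):
        if lock_id not in self.locks:
            raise RuntimeError("No such lock: %d" % lock_id)
        self.locks.remove(lock_id)

    def commit_txn(self, lock_ids):
        # Committing a txn releases its locks server-side.
        for lid in lock_ids:
            self.locks.discard(lid)


class BuggyTxnManager:
    def __init__(self, store):
        self.store = store
        self.held = []               # local record of acquired lock ids

    def acquire(self, lock_id):
        self.store.lock(lock_id)
        self.held.append(lock_id)

    def commit(self):
        self.store.commit_txn(self.held)
        # BUG: self.held is not cleared here.

    def close(self):
        for lid in self.held:        # unlocks ids that commit already released
            self.store.unlock(lid)


mgr = BuggyTxnManager(MetastoreSim())
mgr.acquire(1)
mgr.commit()
try:
    mgr.close()
except RuntimeError as e:
    print(e)                         # -> No such lock: 1
```

Clearing the local lock list inside commit() (or having close() tolerate already-released locks) makes the close-after-commit sequence safe.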





[jira] [Commented] (HIVE-7159) For inner joins push a 'is not null predicate' to the join sources for every non nullSafe join condition

2014-06-17 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14034620#comment-14034620
 ] 

Eugene Koifman commented on HIVE-7159:
--

FWIW, you can do the same on the inner side of outer joins:
R left outer join S on R.r=S.s is the same as
R LOJ (select * from S where s is not null) as S on R.r=S.s,
and symmetrically for ROJ.

> For inner joins push a 'is not null predicate' to the join sources for every 
> non nullSafe join condition
> 
>
> Key: HIVE-7159
> URL: https://issues.apache.org/jira/browse/HIVE-7159
> Project: Hive
>  Issue Type: Bug
>Reporter: Harish Butani
>Assignee: Harish Butani
> Attachments: HIVE-7159.1.patch, HIVE-7159.2.patch, HIVE-7159.3.patch, 
> HIVE-7159.4.patch, HIVE-7159.5.patch, HIVE-7159.6.patch, HIVE-7159.7.patch, 
> HIVE-7159.8.patch
>
>
> A join B on A.x = B.y
> can be transformed to
> (A where x is not null) join (B where y is not null) on A.x = B.y
> Apart from avoiding shuffling null keyed rows it also avoids issues with 
> reduce-side skew when there are a lot of null values in the data.
> Thanks to [~gopalv] for the analysis and coming up with the solution.
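The equivalence behind this rewrite is easy to check on toy data (an illustrative Python sketch of SQL inner-join semantics with a non-null-safe equality, where NULL never matches anything):

```python
def inner_join(a, b):
    """SQL-style inner join on scalar keys: NULL (None) never equals anything."""
    return [(x, y) for x in a for y in b
            if x is not None and y is not None and x == y]

A = [1, 2, None, 3, None]
B = [2, 3, None, 4]

plain    = inner_join(A, B)
filtered = inner_join([x for x in A if x is not None],
                      [y for y in B if y is not None])

print(plain == filtered)   # -> True: pre-filtering nulls changes nothing
print(plain)               # -> [(2, 2), (3, 3)]
```

Because a null key can never satisfy the join condition, pushing `is not null` below the join preserves the result while keeping the null-keyed rows out of the shuffle, which is where the skew win comes from.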





[jira] [Commented] (HIVE-7190) WebHCat launcher task failure can cause two concurent user jobs to run

2014-06-13 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14030933#comment-14030933
 ] 

Eugene Koifman commented on HIVE-7190:
--

+1

> WebHCat launcher task failure can cause two concurent user jobs to run
> --
>
> Key: HIVE-7190
> URL: https://issues.apache.org/jira/browse/HIVE-7190
> Project: Hive
>  Issue Type: Bug
>  Components: WebHCat
>Affects Versions: 0.13.0
>Reporter: Ivan Mitic
> Attachments: HIVE-7190.2.patch, HIVE-7190.3.patch, HIVE-7190.patch
>
>
> Templeton uses launcher jobs to launch the actual user jobs. Launcher jobs 
> are 1-map jobs (a single task jobs) which kick off the actual user job and 
> monitor it until it finishes. Given that the launcher is a task, like any 
> other MR task, it has a retry policy in case it fails (due to a task crash, 
> tasktracker/nodemanager crash, machine level outage, etc.). Further, when 
> launcher task is retried, it will again launch the same user job, *however* 
> the previous attempt user job is already running. What this means is that we 
> can have two identical user jobs running in parallel. 
> In case of MRv2, there will be an MRAppMaster and the launcher task, which 
> are subject to failure. In case any of the two fails, another instance of a 
> user job will be launched again in parallel. 
> Above situation is already a bug.
> Now going further to RM HA, what RM does on failover/restart is that it kills 
> all containers, and it restarts all applications. This means that if our 
> customer had 10 jobs on the cluster (this is 10 launcher jobs and 10 user 
> jobs), on RM failover, all 20 jobs will be restarted, and launcher jobs will 
> queue user jobs again. There are two issues with this design:
> 1. There is a *possible* chance of job output corruption (it would be 
> useful to analyze this scenario more and confirm this statement).
> 2. Cluster resources are spent on jobs redundantly
> To address the issue at least on Yarn (Hadoop 2.0) clusters, webhcat should 
> do the same thing Oozie does in this scenario, and that is to tag all its 
> child jobs with an id, and kill those jobs on task restart before they are 
> kicked off again.
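The tag-and-kill approach described in the last paragraph can be sketched like this (illustrative Python; the real fix goes through the YARN client API, and every name here is made up):

```python
class Cluster:
    """Stands in for the resource manager's view of running jobs."""
    def __init__(self):
        self.running = []            # list of (job_id, tag)

    def submit(self, job_id, tag):
        self.running.append((job_id, tag))

    def kill_by_tag(self, tag):
        killed = [j for j, t in self.running if t == tag]
        self.running = [(j, t) for j, t in self.running if t != tag]
        return killed


def run_launcher_attempt(cluster, launcher_id, attempt):
    # Every child job is tagged with the launcher's id; a retried attempt
    # first kills any children left over from the previous attempt.
    if attempt > 0:
        cluster.kill_by_tag(launcher_id)
    cluster.submit("user_job_attempt_%d" % attempt, tag=launcher_id)


c = Cluster()
run_launcher_attempt(c, "launcher_42", attempt=0)
run_launcher_attempt(c, "launcher_42", attempt=1)   # launcher task retried
print(c.running)   # -> [('user_job_attempt_1', 'launcher_42')]: no duplicates
```

Killing by tag before relaunching is what prevents two identical user jobs from running in parallel after a launcher-task retry or an RM restart.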





[jira] [Commented] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup

2014-06-12 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14029851#comment-14029851
 ] 

Eugene Koifman commented on HIVE-7065:
--

none of the failed tests are related to WebHCat

> Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
> -
>
> Key: HIVE-7065
> URL: https://issues.apache.org/jira/browse/HIVE-7065
> Project: Hive
>  Issue Type: Bug
>  Components: Tez, WebHCat
>Affects Versions: 0.13.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 0.14.0
>
> Attachments: HIVE-7065.1.patch, HIVE-7065.2.patch, HIVE-7065.patch
>
>
> WebHCat config has templeton.hive.properties to specify Hive config 
> properties that need to be passed to Hive client on node executing a job 
> submitted through WebHCat (hive query, for example).
> this should include "hive.execution.engine"
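For reference, templeton.hive.properties is a comma-separated list of Hive properties in webhcat-site.xml; a sketch of a value that forwards the engine setting is below (the metastore host and the rest of the property list are illustrative placeholders, not a recommended configuration):

```xml
<property>
  <name>templeton.hive.properties</name>
  <value>hive.metastore.uris=thrift://metastore-host:9083,hive.execution.engine=tez</value>
</property>
```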





[jira] [Updated] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup

2014-06-11 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7065:
-

Attachment: HIVE-7065.2.patch

> Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
> -
>
> Key: HIVE-7065
> URL: https://issues.apache.org/jira/browse/HIVE-7065
> Project: Hive
>  Issue Type: Bug
>  Components: Tez, WebHCat
>Affects Versions: 0.13.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 0.14.0
>
> Attachments: HIVE-7065.1.patch, HIVE-7065.2.patch, HIVE-7065.patch
>
>
> WebHCat config has templeton.hive.properties to specify Hive config 
> properties that need to be passed to Hive client on node executing a job 
> submitted through WebHCat (hive query, for example).
> this should include "hive.execution.engine"





[jira] [Updated] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup

2014-06-11 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7065:
-

Status: Patch Available  (was: Open)

> Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
> -
>
> Key: HIVE-7065
> URL: https://issues.apache.org/jira/browse/HIVE-7065
> Project: Hive
>  Issue Type: Bug
>  Components: Tez, WebHCat
>Affects Versions: 0.13.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 0.14.0
>
> Attachments: HIVE-7065.1.patch, HIVE-7065.2.patch, HIVE-7065.patch
>
>
> WebHCat config has templeton.hive.properties to specify Hive config 
> properties that need to be passed to Hive client on node executing a job 
> submitted through WebHCat (hive query, for example).
> this should include "hive.execution.engine"





[jira] [Updated] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup

2014-06-11 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7065:
-

Attachment: (was: HIVE-7065.2.patch)

> Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
> -
>
> Key: HIVE-7065
> URL: https://issues.apache.org/jira/browse/HIVE-7065
> Project: Hive
>  Issue Type: Bug
>  Components: Tez, WebHCat
>Affects Versions: 0.13.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 0.14.0
>
> Attachments: HIVE-7065.1.patch, HIVE-7065.2.patch, HIVE-7065.patch
>
>
> WebHCat config has templeton.hive.properties to specify Hive config 
> properties that need to be passed to Hive client on node executing a job 
> submitted through WebHCat (hive query, for example).
> this should include "hive.execution.engine"





[jira] [Updated] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup

2014-06-11 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7065:
-

Status: Open  (was: Patch Available)

> Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
> -
>
> Key: HIVE-7065
> URL: https://issues.apache.org/jira/browse/HIVE-7065
> Project: Hive
>  Issue Type: Bug
>  Components: Tez, WebHCat
>Affects Versions: 0.13.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 0.14.0
>
> Attachments: HIVE-7065.1.patch, HIVE-7065.2.patch, HIVE-7065.patch
>
>
> WebHCat config has templeton.hive.properties to specify Hive config 
> properties that need to be passed to Hive client on node executing a job 
> submitted through WebHCat (hive query, for example).
> this should include "hive.execution.engine"





[jira] [Updated] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup

2014-06-11 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7065:
-

Attachment: HIVE-7065.2.patch

HIVE-7065.2.patch is an ADDITIONAL patch to fix the regression.

> Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
> -
>
> Key: HIVE-7065
> URL: https://issues.apache.org/jira/browse/HIVE-7065
> Project: Hive
>  Issue Type: Bug
>  Components: Tez, WebHCat
>Affects Versions: 0.13.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 0.14.0
>
> Attachments: HIVE-7065.1.patch, HIVE-7065.2.patch, HIVE-7065.patch
>
>
> WebHCat config has templeton.hive.properties to specify Hive config 
> properties that need to be passed to Hive client on node executing a job 
> submitted through WebHCat (hive query, for example).
> this should include "hive.execution.engine"





[jira] [Updated] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup

2014-06-11 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7065:
-

Status: Patch Available  (was: Reopened)

> Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
> -
>
> Key: HIVE-7065
> URL: https://issues.apache.org/jira/browse/HIVE-7065
> Project: Hive
>  Issue Type: Bug
>  Components: Tez, WebHCat
>Affects Versions: 0.13.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 0.14.0
>
> Attachments: HIVE-7065.1.patch, HIVE-7065.2.patch, HIVE-7065.patch
>
>
> WebHCat config has templeton.hive.properties to specify Hive config 
> properties that need to be passed to Hive client on node executing a job 
> submitted through WebHCat (hive query, for example).
> this should include "hive.execution.engine"





[jira] [Commented] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup

2014-06-11 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14028317#comment-14028317
 ] 

Eugene Koifman commented on HIVE-7065:
--

I'm looking at it now.  Will make changes in this ticket.

> Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
> -
>
> Key: HIVE-7065
> URL: https://issues.apache.org/jira/browse/HIVE-7065
> Project: Hive
>  Issue Type: Bug
>  Components: Tez, WebHCat
>Affects Versions: 0.13.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 0.14.0
>
> Attachments: HIVE-7065.1.patch, HIVE-7065.patch
>
>
> WebHCat config has templeton.hive.properties to specify Hive config 
> properties that need to be passed to Hive client on node executing a job 
> submitted through WebHCat (hive query, for example).
> this should include "hive.execution.engine"





[jira] [Commented] (HIVE-6564) WebHCat E2E tests that launch MR jobs fail on check job completion timeout

2014-06-10 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14026829#comment-14026829
 ] 

Eugene Koifman commented on HIVE-6564:
--

+1

> WebHCat E2E tests that launch MR jobs fail on check job completion timeout
> --
>
> Key: HIVE-6564
> URL: https://issues.apache.org/jira/browse/HIVE-6564
> Project: Hive
>  Issue Type: Bug
>  Components: Tests, WebHCat
>Affects Versions: 0.13.0
>Reporter: Deepesh Khandelwal
>Assignee: Deepesh Khandelwal
> Attachments: HIVE-6564.2.patch, HIVE-6564.patch
>
>
> WebHCat E2E tests that fire off an MR job are not correctly being detected as 
> complete so those tests are timing out.
> The problem happens because the JSON module available through CPAN 
> returns 1 or 0 instead of true or false.
> NO PRECOMMIT TESTS





[jira] [Created] (HIVE-7202) DbTxnManager deadlocks in hcatalog.cli.TestSematicAnalysis.testAlterTblFFpart()

2014-06-09 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-7202:


 Summary: DbTxnManager deadlocks in 
hcatalog.cli.TestSematicAnalysis.testAlterTblFFpart()
 Key: HIVE-7202
 URL: https://issues.apache.org/jira/browse/HIVE-7202
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1
Reporter: Eugene Koifman
Assignee: Alan Gates


select * from HIVE_LOCKS produces

{noformat}
6|1|0|default|junit_sem_analysis|NULL|w|r|1402354627716|NULL|unknown|ekoifman.local
6|2|0|default|junit_sem_analysis|b=2010-10-10|w|e|1402354627716|NULL|unknown|ekoifman.local

2 rows selected
{noformat}

easiest way to repro this is to add
hiveConf.setBoolVar(HiveConf.ConfVars.HIVE_SUPPORT_CONCURRENCY, true);
hiveConf.setVar(HiveConf.ConfVars.HIVE_TXN_MANAGER, 
"org.apache.hadoop.hive.ql.lockmgr.DbTxnManager");

in HCatBaseTest.setUpHiveConf()





[jira] [Commented] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup

2014-06-09 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025627#comment-14025627
 ] 

Eugene Koifman commented on HIVE-7065:
--

What's strange is that that is the test that was added for this ticket.

> Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
> -
>
> Key: HIVE-7065
> URL: https://issues.apache.org/jira/browse/HIVE-7065
> Project: Hive
>  Issue Type: Bug
>  Components: Tez, WebHCat
>Affects Versions: 0.13.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 0.14.0
>
> Attachments: HIVE-7065.1.patch, HIVE-7065.patch
>
>
> WebHCat config has templeton.hive.properties to specify Hive config 
> properties that need to be passed to Hive client on node executing a job 
> submitted through WebHCat (hive query, for example).
> this should include "hive.execution.engine"





[jira] [Commented] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup

2014-06-09 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025613#comment-14025613
 ] 

Eugene Koifman commented on HIVE-7065:
--

java.lang.IllegalArgumentException: Illegal escaped string 
hive.some.fake.path=C:\foo\bar.txt\ unescaped \ at 22
  at org.apache.hadoop.util.StringUtils.unEscapeString(StringUtils.java:565)
  at org.apache.hadoop.util.StringUtils.unEscapeString(StringUtils.java:547)
  at org.apache.hadoop.util.StringUtils.unEscapeString(StringUtils.java:533)
  at 
org.apache.hive.hcatalog.templeton.tool.TestTempletonUtils.testPropertiesParsing(TestTempletonUtils.java:308)

> Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
> -
>
> Key: HIVE-7065
> URL: https://issues.apache.org/jira/browse/HIVE-7065
> Project: Hive
>  Issue Type: Bug
>  Components: Tez, WebHCat
>Affects Versions: 0.13.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 0.14.0
>
> Attachments: HIVE-7065.1.patch, HIVE-7065.patch
>
>
> WebHCat config has templeton.hive.properties to specify Hive config 
> properties that need to be passed to Hive client on node executing a job 
> submitted through WebHCat (hive query, for example).
> this should include "hive.execution.engine"





[jira] [Commented] (HIVE-6226) It should be possible to get hadoop, hive, and pig version being used by WebHCat

2014-06-09 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025556#comment-14025556
 ] 

Eugene Koifman commented on HIVE-6226:
--

[~leftylev]
Here is an example:
http://localhost:50111/templeton/v1/version/hive?user.name=ekoifman

which returns:

{"module":"hive","version":"0.14.0-SNAPSHOT"}


http://localhost:50111/templeton/v1/version/hadoop?user.name=ekoifman
returns:
{"module":"hadoop","version":"2.4.1-SNAPSHOT"}


http://localhost:50111/templeton/v1/version/pig?user.name=ekoifman and 
http://localhost:50111/templeton/v1/version/sqoop?user.name=ekoifman are both 
there as well, but both currently return
{"error":"Pig version request not yet implemented"}
Since these last two are not really implemented, I'm not sure they should be 
documented.



> It should be possible to get hadoop, hive, and pig version being used by 
> WebHCat
> 
>
> Key: HIVE-6226
> URL: https://issues.apache.org/jira/browse/HIVE-6226
> Project: Hive
>  Issue Type: New Feature
>  Components: WebHCat
>Reporter: Alan Gates
>Assignee: Alan Gates
> Fix For: 0.13.0
>
> Attachments: HIVE-6226.2.patch, HIVE-6226.patch
>
>
> Calling /version on WebHCat tells the caller the protocol verison, but there 
> is no way to determine the versions of software being run by the applications 
> that WebHCat spawns.  
> I propose to add an end-point: /version/\{module\} where module could be pig, 
> hive, or hadoop.  The response will then be:
> {code}
> {
>   "module" : _module_name_,
>   "version" : _version_string_
> }
> {code}





[jira] [Commented] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup

2014-06-09 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025480#comment-14025480
 ] 

Eugene Koifman commented on HIVE-7065:
--

[~leftylev] Is there a way to have this table in the wiki autogenerated from 
webhcat-default.xml?  That would ensure a single source of truth. 
Tez shipped in 0.13, so yes, I think hive.execution.engine can be mentioned 
for 0.13.

> Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
> -
>
> Key: HIVE-7065
> URL: https://issues.apache.org/jira/browse/HIVE-7065
> Project: Hive
>  Issue Type: Bug
>  Components: Tez, WebHCat
>Affects Versions: 0.13.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 0.14.0
>
> Attachments: HIVE-7065.1.patch, HIVE-7065.patch
>
>
> WebHCat config has templeton.hive.properties to specify Hive config 
> properties that need to be passed to Hive client on node executing a job 
> submitted through WebHCat (hive query, for example).
> this should include "hive.execution.engine"





[jira] [Commented] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup

2014-06-09 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025469#comment-14025469
 ] 

Eugene Koifman commented on HIVE-7065:
--

good point, this should have pre-commit tests


> Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
> -
>
> Key: HIVE-7065
> URL: https://issues.apache.org/jira/browse/HIVE-7065
> Project: Hive
>  Issue Type: Bug
>  Components: Tez, WebHCat
>Affects Versions: 0.13.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 0.14.0
>
> Attachments: HIVE-7065.1.patch, HIVE-7065.patch
>
>
> WebHCat config has templeton.hive.properties to specify Hive config 
> properties that need to be passed to Hive client on node executing a job 
> submitted through WebHCat (hive query, for example).
> this should include "hive.execution.engine"





[jira] [Updated] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup

2014-06-09 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7065:
-

Description: 
WebHCat config has templeton.hive.properties to specify Hive config properties 
that need to be passed to Hive client on node executing a job submitted through 
WebHCat (hive query, for example).

this should include "hive.execution.engine"
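As a rough illustration of the intended shape (the metastore values below are placeholders, not from the patch), a webhcat-site.xml entry that forwards hive.execution.engine via the comma-separated templeton.hive.properties list might look like:

```xml
<!-- webhcat-site.xml sketch: include hive.execution.engine among the
     comma-separated key=value pairs passed to the Hive client.
     The metastore values below are illustrative placeholders. -->
<property>
  <name>templeton.hive.properties</name>
  <value>hive.metastore.uris=thrift://localhost:9083,hive.execution.engine=tez</value>
</property>
```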


  was:
WebHCat config has templeton.hive.properties to specify Hive config properties 
that need to be passed to Hive client on node executing a job submitted through 
WebHCat (hive query, for example).

this should include "hive.execution.engine"

NO PRECOMMIT TESTS


> Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
> -
>
> Key: HIVE-7065
> URL: https://issues.apache.org/jira/browse/HIVE-7065
> Project: Hive
>  Issue Type: Bug
>  Components: Tez, WebHCat
>Affects Versions: 0.13.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 0.14.0
>
> Attachments: HIVE-7065.1.patch, HIVE-7065.patch
>
>
> WebHCat config has templeton.hive.properties to specify Hive config 
> properties that need to be passed to Hive client on node executing a job 
> submitted through WebHCat (hive query, for example).
> this should include "hive.execution.engine"





[jira] [Commented] (HIVE-7110) TestHCatPartitionPublish test failure: No FileSystem or scheme: pfile

2014-06-07 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14020982#comment-14020982
 ] 

Eugene Koifman commented on HIVE-7110:
--

I see the same test failure on current trunk (OS X 10.8.5)

> TestHCatPartitionPublish test failure: No FileSystem or scheme: pfile
> -
>
> Key: HIVE-7110
> URL: https://issues.apache.org/jira/browse/HIVE-7110
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Reporter: David Chen
>Assignee: David Chen
> Attachments: HIVE-7110.1.patch, HIVE-7110.2.patch, HIVE-7110.3.patch, 
> HIVE-7110.4.patch
>
>
> I got the following TestHCatPartitionPublish test failure when running all 
> unit tests against Hadoop 1. This also appears when testing against Hadoop 2.
> {code}
>  Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 26.06 sec 
> <<< FAILURE! - in org.apache.hive.hcatalog.mapreduce.TestHCatPartitionPublish
> testPartitionPublish(org.apache.hive.hcatalog.mapreduce.TestHCatPartitionPublish)
>   Time elapsed: 1.361 sec  <<< ERROR!
> org.apache.hive.hcatalog.common.HCatException: 
> org.apache.hive.hcatalog.common.HCatException : 2001 : Error setting output 
> information. Cause : java.io.IOException: No FileSystem for scheme: pfile
> at 
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1443)
> at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:67)
> at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1464)
> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:263)
> at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
> at 
> org.apache.hive.hcatalog.mapreduce.HCatOutputFormat.setOutput(HCatOutputFormat.java:212)
> at 
> org.apache.hive.hcatalog.mapreduce.HCatOutputFormat.setOutput(HCatOutputFormat.java:70)
> at 
> org.apache.hive.hcatalog.mapreduce.TestHCatPartitionPublish.runMRCreateFail(TestHCatPartitionPublish.java:191)
> at 
> org.apache.hive.hcatalog.mapreduce.TestHCatPartitionPublish.testPartitionPublish(TestHCatPartitionPublish.java:155)
> {code}





[jira] [Commented] (HIVE-7155) WebHCat controller job exceeds container memory limit

2014-06-06 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14020636#comment-14020636
 ] 

Eugene Koifman commented on HIVE-7155:
--

+1

> WebHCat controller job exceeds container memory limit
> -
>
> Key: HIVE-7155
> URL: https://issues.apache.org/jira/browse/HIVE-7155
> Project: Hive
>  Issue Type: Bug
>  Components: WebHCat
>Affects Versions: 0.13.0
>Reporter: shanyu zhao
>Assignee: shanyu zhao
> Attachments: HIVE-7155.1.patch, HIVE-7155.patch
>
>
> Submitting a Hive query on a large table via WebHCat results in failure 
> because the WebHCat controller job is killed by Yarn when it exceeds the 
> memory limit (set by mapreduce.map.memory.mb, which defaults to 1GB):
> {code}
>  INSERT OVERWRITE TABLE Temp_InjusticeEvents_2014_03_01_00_00 SELECT * from 
> Stage_InjusticeEvents where LogTimestamp > '2014-03-01 00:00:00' and 
> LogTimestamp <= '2014-03-01 01:00:00';
> {code}
> We could increase mapreduce.map.memory.mb to solve this problem, but that 
> would change the setting system-wide.
> We need to provide a WebHCat configuration setting to override 
> mapreduce.map.memory.mb when submitting the controller job.
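One possible shape for such a WebHCat-scoped override, sketched with a hypothetical property name (the actual key would be defined by the fix):

```xml
<!-- webhcat-site.xml sketch: a WebHCat-only override for the controller
     job's map-task memory. The property name below is hypothetical. -->
<property>
  <name>templeton.controller.map.memory.mb</name>
  <value>2048</value>
</property>
```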





[jira] [Updated] (HIVE-7190) WebHCat launcher task failure can cause two concurent user jobs to run

2014-06-06 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7190:
-

Affects Version/s: 0.13.0

> WebHCat launcher task failure can cause two concurent user jobs to run
> --
>
> Key: HIVE-7190
> URL: https://issues.apache.org/jira/browse/HIVE-7190
> Project: Hive
>  Issue Type: Bug
>  Components: WebHCat
>Affects Versions: 0.13.0
>Reporter: Ivan Mitic
> Attachments: HIVE-7190.patch
>
>
> Templeton uses launcher jobs to launch the actual user jobs. Launcher jobs 
> are 1-map jobs (a single task jobs) which kick off the actual user job and 
> monitor it until it finishes. Given that the launcher is a task, like any 
> other MR task, it has a retry policy in case it fails (due to a task crash, 
> tasktracker/nodemanager crash, machine level outage, etc.). Further, when 
> launcher task is retried, it will again launch the same user job, *however* 
> the previous attempt user job is already running. What this means is that we 
> can have two identical user jobs running in parallel. 
> In case of MRv2, there will be an MRAppMaster and the launcher task, which 
> are subject to failure. In case any of the two fails, another instance of a 
> user job will be launched again in parallel. 
> Above situation is already a bug.
> Now going further to RM HA, what RM does on failover/restart is that it kills 
> all containers, and it restarts all applications. This means that if our 
> customer had 10 jobs on the cluster (this is 10 launcher jobs and 10 user 
> jobs), on RM failover, all 20 jobs will be restarted, and launcher jobs will 
> queue user jobs again. There are two issues with this design:
> 1. There are *possible* chances for corruption of job outputs (it would be 
> useful to analyze this scenario more and confirm this statement).
> 2. Cluster resources are spent on jobs redundantly
> To address the issue at least on Yarn (Hadoop 2.0) clusters, webhcat should 
> do the same thing Oozie does in this scenario, and that is to tag all its 
> child jobs with an id, and kill those jobs on task restart before they are 
> kicked off again.





[jira] [Commented] (HIVE-7187) Reconcile jetty versions in hive

2014-06-06 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14020170#comment-14020170
 ] 

Eugene Koifman commented on HIVE-7187:
--

also, the current release of Jetty is 9.x.


> Reconcile jetty versions in hive
> 
>
> Key: HIVE-7187
> URL: https://issues.apache.org/jira/browse/HIVE-7187
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Web UI, WebHCat
>Reporter: Vaibhav Gumashta
>
> Hive root pom has 3 parameters for specifying jetty dependency versions:
> {code}
> 6.1.26
> 7.6.0.v20120127
> 7.6.0.v20120127
> {code}
> 1st one is used by HWI, 2nd by WebHCat and 3rd by HiveServer2 (in http mode). 
> We should probably use the same jetty version for all hive components. 





[jira] [Commented] (HIVE-7155) WebHCat controller job exceeds container memory limit

2014-06-05 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14018909#comment-14018909
 ] 

Eugene Koifman commented on HIVE-7155:
--

[~shanyu] I can't comment on RB.  Did you perhaps not "publish" it?

> WebHCat controller job exceeds container memory limit
> -
>
> Key: HIVE-7155
> URL: https://issues.apache.org/jira/browse/HIVE-7155
> Project: Hive
>  Issue Type: Bug
>  Components: WebHCat
>Affects Versions: 0.13.0
>Reporter: shanyu zhao
>Assignee: shanyu zhao
> Attachments: HIVE-7155.1.patch, HIVE-7155.patch
>
>
> Submitting a Hive query on a large table via WebHCat results in failure 
> because the WebHCat controller job is killed by Yarn when it exceeds the 
> memory limit (set by mapreduce.map.memory.mb, which defaults to 1GB):
> {code}
>  INSERT OVERWRITE TABLE Temp_InjusticeEvents_2014_03_01_00_00 SELECT * from 
> Stage_InjusticeEvents where LogTimestamp > '2014-03-01 00:00:00' and 
> LogTimestamp <= '2014-03-01 01:00:00';
> {code}
> We could increase mapreduce.map.memory.mb to solve this problem, but that 
> would change the setting system-wide.
> We need to provide a WebHCat configuration setting to override 
> mapreduce.map.memory.mb when submitting the controller job.





[jira] [Commented] (HIVE-6316) Document support for new types in HCat

2014-06-04 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14018154#comment-14018154
 ] 

Eugene Koifman commented on HIVE-6316:
--

[~leftylev],
w.r.t. HCatLoader/Storer - support for the tinyint and smallint Hive types was 
there prior to 0.13 (though perhaps it was not documented).

Altogether, support for 5 new types was added (in both HCatLoader/Storer and 
HCatInput/OutputFormat): date, timestamp, char, varchar, decimal.

If you look at 1st column of 
https://issues.apache.org/jira/secure/attachment/12626251/HCat-Pig%20Type%20Mapping%20Hive%200.13.pdf,
 it lists /.  The Java class/primitive is what 
the user can expect in HCatRecord produced by using HCatInputFormat and what 
they should use in HCatRecord to write it with HCatOutputFormat.  The only 
omission in the PDF doc is that DATE maps to java.sql.Date.

Thus in 
https://cwiki.apache.org/confluence/display/Hive/HCatalog+InputOutput#HCatalogInputOutput-HCatRecord,
 these 5 types should be added to the table. (DECIMAL is already there, but was 
not supported until 0.13 and it maps to HiveDecimal Java class)

The range of values for primitive types is what is dictated by Java, and for 
Object types, users could look at the JavaDoc for corresponding Java classes.

> Document support for new types in HCat
> --
>
> Key: HIVE-6316
> URL: https://issues.apache.org/jira/browse/HIVE-6316
> Project: Hive
>  Issue Type: Sub-task
>  Components: Documentation, HCatalog
>Affects Versions: 0.13.0
>Reporter: Eugene Koifman
>Assignee: Lefty Leverenz
>
> HIVE-5814 added support for new types in HCat.  The PDF file in that bug 
> explains exactly how these map to Pig types.  This should be added to the 
> Wiki somewhere (probably here 
> https://cwiki.apache.org/confluence/display/Hive/HCatalog+LoadStore).
> In particular it should be highlighted that copying data from Hive TIMESTAMP 
> to Pig DATETIME, any 'nanos' in the timestamp will be lost.  Also, HCatStorer 
> now takes new parameter which is described in the PDF doc.





[jira] [Updated] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup

2014-06-02 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7065:
-

Attachment: HIVE-7065.1.patch

update based on feedback from @Thejas

> Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
> -
>
> Key: HIVE-7065
> URL: https://issues.apache.org/jira/browse/HIVE-7065
> Project: Hive
>  Issue Type: Bug
>  Components: Tez, WebHCat
>Affects Versions: 0.13.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-7065.1.patch, HIVE-7065.patch
>
>
> WebHCat config has templeton.hive.properties to specify Hive config 
> properties that need to be passed to Hive client on node executing a job 
> submitted through WebHCat (hive query, for example).
> this should include "hive.execution.engine"
> NO PRECOMMIT TESTS





[jira] [Updated] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup

2014-06-02 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7065:
-

Status: Patch Available  (was: Open)

> Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
> -
>
> Key: HIVE-7065
> URL: https://issues.apache.org/jira/browse/HIVE-7065
> Project: Hive
>  Issue Type: Bug
>  Components: Tez, WebHCat
>Affects Versions: 0.13.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-7065.1.patch, HIVE-7065.patch
>
>
> WebHCat config has templeton.hive.properties to specify Hive config 
> properties that need to be passed to Hive client on node executing a job 
> submitted through WebHCat (hive query, for example).
> this should include "hive.execution.engine"
> NO PRECOMMIT TESTS





[jira] [Updated] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup

2014-06-02 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7065:
-

Description: 
WebHCat config has templeton.hive.properties to specify Hive config properties 
that need to be passed to Hive client on node executing a job submitted through 
WebHCat (hive query, for example).

this should include "hive.execution.engine"

NO PRECOMMIT TESTS

  was:
WebHCat config has templeton.hive.properties to specify Hive config properties 
that need to be passed to Hive client on node executing a job submitted through 
WebHCat (hive query, for example).

this should include "hive.execution.engine"


> Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
> -
>
> Key: HIVE-7065
> URL: https://issues.apache.org/jira/browse/HIVE-7065
> Project: Hive
>  Issue Type: Bug
>  Components: Tez, WebHCat
>Affects Versions: 0.13.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-7065.patch
>
>
> WebHCat config has templeton.hive.properties to specify Hive config 
> properties that need to be passed to Hive client on node executing a job 
> submitted through WebHCat (hive query, for example).
> this should include "hive.execution.engine"
> NO PRECOMMIT TESTS





[jira] [Updated] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup

2014-05-30 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7065:
-

Status: Open  (was: Patch Available)

> Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
> -
>
> Key: HIVE-7065
> URL: https://issues.apache.org/jira/browse/HIVE-7065
> Project: Hive
>  Issue Type: Bug
>  Components: Tez, WebHCat
>Affects Versions: 0.13.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-7065.patch
>
>
> WebHCat config has templeton.hive.properties to specify Hive config 
> properties that need to be passed to Hive client on node executing a job 
> submitted through WebHCat (hive query, for example).
> this should include "hive.execution.engine"





[jira] [Created] (HIVE-7152) OutputJobInfo.setPosOfPartCols() Comparator bug

2014-05-30 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-7152:


 Summary: OutputJobInfo.setPosOfPartCols() Comparator bug
 Key: HIVE-7152
 URL: https://issues.apache.org/jira/browse/HIVE-7152
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.13.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman


This method compares Integer objects using '==', which compares object 
identity rather than value.  This may break for wide tables that have more 
than 127 columns, since the JVM only caches boxed Integers in [-128, 127].

http://stackoverflow.com/questions/2602636/why-cant-the-compiler-jvm-just-make-autoboxing-just-work
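A minimal sketch of the pitfall (class and variable names are illustrative, not from the Hive code):

```java
// Shows why '==' on boxed Integers is unreliable: Integer.valueOf (used by
// autoboxing) caches instances only for values in [-128, 127], so identity
// comparison happens to work for small values and fails beyond the cache.
public class BoxedIntegerCompare {
    static boolean sameByIdentity(Integer x, Integer y) {
        return x == y; // compares references, not values
    }

    public static void main(String[] args) {
        Integer small1 = 100, small2 = 100; // both resolve to the cached instance
        Integer big1 = 200, big2 = 200;     // distinct objects outside the cache
        System.out.println(sameByIdentity(small1, small2)); // true
        System.out.println(sameByIdentity(big1, big2));     // false
        System.out.println(big1.equals(big2));              // true: value comparison
    }
}
```

The fix is to compare with equals() (or compare int primitives) instead of '=='.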






[jira] [Commented] (HIVE-6316) Document support for new types in HCat

2014-05-27 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010341#comment-14010341
 ] 

Eugene Koifman commented on HIVE-6316:
--

[~leftylev],
Null and Throw are the only possible values.  The description of HIVE-5814 has 
a usage example:
{noformat}
HCatStorer('','', '-onOutOfRangeValue Throw')
{noformat}

hcat.pig.store.onoutofrangevalue does NOT need to be documented; it's internal. 
It only applies when using HCat from Pig, where the user is expected to use 
'onOutOfRangeValue' in HCatStorer.  It is not really related to Data Promotion 
Behavior.

The HCatInputFormat and HCatOutputFormat sections need the same update to the 
"type mapping" tables as HCatLoader/HCatStorer.  I think it would be easier to 
create a link from all 4 current tables to a single page containing the whole 
table in 
https://issues.apache.org/jira/secure/attachment/12626251/HCat-Pig%20Type%20Mapping%20Hive%200.13.pdf
 exactly.  The headers in the table actually indicate a mapping of the Hive 
type/value system to the Pig type/value system.

Logically speaking, there is no such thing as an HCatalog type/value system.  
HCatalog connects Hive tables to Pig/MapReduce.  Pig has its own type/value 
system; MR does not as such and is expected to use (in HCatRecord) the same 
classes as Hive uses internally.

so the data type mapping is really Hive<->Pig (HCatLoader/Storer) and Hive<->MR 
(HCatInput/OutputFormat) which is why it's all summarized in a single table in 
my document.

> Document support for new types in HCat
> --
>
> Key: HIVE-6316
> URL: https://issues.apache.org/jira/browse/HIVE-6316
> Project: Hive
>  Issue Type: Sub-task
>  Components: Documentation, HCatalog
>Affects Versions: 0.13.0
>Reporter: Eugene Koifman
>Assignee: Lefty Leverenz
>
> HIVE-5814 added support for new types in HCat.  The PDF file in that bug 
> explains exactly how these map to Pig types.  This should be added to the 
> Wiki somewhere (probably here 
> https://cwiki.apache.org/confluence/display/Hive/HCatalog+LoadStore).
> In particular it should be highlighted that copying data from Hive TIMESTAMP 
> to Pig DATETIME, any 'nanos' in the timestamp will be lost.  Also, HCatStorer 
> now takes new parameter which is described in the PDF doc.





[jira] [Commented] (HIVE-6316) Document support for new types in HCat

2014-05-23 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14007255#comment-14007255
 ] 

Eugene Koifman commented on HIVE-6316:
--

no, the PDF there should be sufficient

> Document support for new types in HCat
> --
>
> Key: HIVE-6316
> URL: https://issues.apache.org/jira/browse/HIVE-6316
> Project: Hive
>  Issue Type: Sub-task
>  Components: Documentation, HCatalog
>Affects Versions: 0.13.0
>Reporter: Eugene Koifman
>Assignee: Lefty Leverenz
>
> HIVE-5814 added support for new types in HCat.  The PDF file in that bug 
> explains exactly how these map to Pig types.  This should be added to the 
> Wiki somewhere (probably here 
> https://cwiki.apache.org/confluence/display/Hive/HCatalog+LoadStore).
> In particular it should be highlighted that copying data from Hive TIMESTAMP 
> to Pig DATETIME, any 'nanos' in the timestamp will be lost.  Also, HCatStorer 
> now takes new parameter which is described in the PDF doc.





[jira] [Commented] (HIVE-6316) Document support for new types in HCat

2014-05-22 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14006387#comment-14006387
 ] 

Eugene Koifman commented on HIVE-6316:
--

ping

Does anyone have cycles to take a look at this?

https://cwiki.apache.org/confluence/display/Hive/HCatalog+LoadStore and 
https://cwiki.apache.org/confluence/display/Hive/HCatalog+InputOutput are both 
pretty badly out of date at this point

> Document support for new types in HCat
> --
>
> Key: HIVE-6316
> URL: https://issues.apache.org/jira/browse/HIVE-6316
> Project: Hive
>  Issue Type: Sub-task
>  Components: Documentation, HCatalog
>Affects Versions: 0.13.0
>Reporter: Eugene Koifman
>Assignee: Lefty Leverenz
>
> HIVE-5814 added support for new types in HCat.  The PDF file in that bug 
> explains exactly how these map to Pig types.  This should be added to the 
> Wiki somewhere (probably here 
> https://cwiki.apache.org/confluence/display/Hive/HCatalog+LoadStore).
> In particular it should be highlighted that copying data from Hive TIMESTAMP 
> to Pig DATETIME, any 'nanos' in the timestamp will be lost.  Also, HCatStorer 
> now takes new parameter which is described in the PDF doc.





[jira] [Assigned] (HIVE-6207) Integrate HCatalog with locking

2014-05-20 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman reassigned HIVE-6207:


Assignee: Eugene Koifman  (was: Alan Gates)

> Integrate HCatalog with locking
> ---
>
> Key: HIVE-6207
> URL: https://issues.apache.org/jira/browse/HIVE-6207
> Project: Hive
>  Issue Type: Sub-task
>  Components: HCatalog
>Affects Versions: 0.13.0
>Reporter: Alan Gates
>Assignee: Eugene Koifman
> Fix For: 0.14.0
>
>
> HCatalog currently ignores any locks created by Hive users.  It should 
> respect the locks Hive creates as well as create locks itself when locking is 
> configured.





[jira] [Commented] (HIVE-7084) TestWebHCatE2e is failing on trunk

2014-05-20 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14003856#comment-14003856
 ] 

Eugene Koifman commented on HIVE-7084:
--

this works before HIVE-7000 and breaks immediately after

> TestWebHCatE2e is failing on trunk
> --
>
> Key: HIVE-7084
> URL: https://issues.apache.org/jira/browse/HIVE-7084
> Project: Hive
>  Issue Type: Test
>  Components: WebHCat
>Affects Versions: 0.14.0
>Reporter: Ashutosh Chauhan
>Assignee: Eugene Koifman
>
> I am able to repro it consistently on fresh checkout.





[jira] [Updated] (HIVE-7084) TestWebHCatE2e is failing on trunk

2014-05-20 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7084:
-

Assignee: Harish Butani  (was: Eugene Koifman)

> TestWebHCatE2e is failing on trunk
> --
>
> Key: HIVE-7084
> URL: https://issues.apache.org/jira/browse/HIVE-7084
> Project: Hive
>  Issue Type: Test
>  Components: WebHCat
>Affects Versions: 0.14.0
>Reporter: Ashutosh Chauhan
>Assignee: Harish Butani
>
> I am able to repro it consistently on fresh checkout.





[jira] [Commented] (HIVE-7084) TestWebHCatE2e is failing on trunk

2014-05-20 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14003832#comment-14003832
 ] 

Eugene Koifman commented on HIVE-7084:
--

The root cause here is 
{noformat}
Caused by: java.lang.RuntimeException: The resource 'resourcedoc.xml' does not 
exist.
at 
com.sun.jersey.api.wadl.config.WadlGeneratorLoader.setProperty(WadlGeneratorLoader.java:203)
at 
com.sun.jersey.api.wadl.config.WadlGeneratorLoader.loadWadlGenerator(WadlGeneratorLoader.java:139)
at 
com.sun.jersey.api.wadl.config.WadlGeneratorLoader.loadWadlGeneratorDescriptions(WadlGeneratorLoader.java:114)
at 
com.sun.jersey.api.wadl.config.WadlGeneratorConfig.createWadlGenerator(WadlGeneratorConfig.java:182)
... 49 more
{noformat}

hcatalog/webhcat/svr/pom.xml uses maven-javadoc-plugin, which generates 
resourcedoc.xml; that file ends up in 
hcatalog/webhcat/svr/target/classes/resourcedoc.xml in the 0.13.1 build but is 
missing from the trunk build tree.

> TestWebHCatE2e is failing on trunk
> --
>
> Key: HIVE-7084
> URL: https://issues.apache.org/jira/browse/HIVE-7084
> Project: Hive
>  Issue Type: Test
>  Components: WebHCat
>Affects Versions: 0.14.0
>Reporter: Ashutosh Chauhan
>Assignee: Eugene Koifman
>
> I am able to repro it consistently on fresh checkout.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (HIVE-7084) TestWebHCatE2e is failing on trunk

2014-05-20 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman reassigned HIVE-7084:


Assignee: Eugene Koifman

> TestWebHCatE2e is failing on trunk
> --
>
> Key: HIVE-7084
> URL: https://issues.apache.org/jira/browse/HIVE-7084
> Project: Hive
>  Issue Type: Test
>  Components: WebHCat
>Affects Versions: 0.14.0
>Reporter: Ashutosh Chauhan
>Assignee: Eugene Koifman
>
> I am able to repro it consistently on fresh checkout.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6651) broken link in WebHCat doc: Job Information — GET queue/:jobid

2014-05-20 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14003721#comment-14003721
 ] 

Eugene Koifman commented on HIVE-6651:
--

+1

> broken link in WebHCat doc: Job Information — GET queue/:jobid
> --
>
> Key: HIVE-6651
> URL: https://issues.apache.org/jira/browse/HIVE-6651
> Project: Hive
>  Issue Type: Bug
>  Components: Documentation, WebHCat
>Reporter: Eugene Koifman
>Assignee: Lefty Leverenz
>Priority: Minor
>
> https://cwiki.apache.org/confluence/display/Hive/WebHCat+Reference+JobInfo#WebHCatReferenceJobInfo-Results
> the link in the table to "Class JobProfile" is broken



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-5814) Add DATE, TIMESTAMP, DECIMAL, CHAR, VARCHAR types support in HCat

2014-05-19 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14002163#comment-14002163
 ] 

Eugene Koifman commented on HIVE-5814:
--

org.apache.hcatalog.pig.HCatLoader has been deprecated since Hive 0.12.
In fact, every class in org.apache.hcatalog has been deprecated.  All new 
features are added in org.apache.hive.hcatalog, which contains all the 
classes/methods from org.apache.hcatalog plus the new APIs.



> Add DATE, TIMESTAMP, DECIMAL, CHAR, VARCHAR types support in HCat
> -
>
> Key: HIVE-5814
> URL: https://issues.apache.org/jira/browse/HIVE-5814
> Project: Hive
>  Issue Type: New Feature
>  Components: HCatalog
>Affects Versions: 0.12.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 0.13.0
>
> Attachments: HCat-Pig Type Mapping Hive 0.13.pdf, HIVE-5814.2.patch, 
> HIVE-5814.3.patch, HIVE-5814.4.patch, HIVE-5814.5.patch
>
>
> Hive 0.12 added support for new data types.  Pig 0.12 added some as well.  
> HCat should handle these too.  Also note that CHAR was added recently.
> Also allow the user to specify a parameter in Pig, like HCatStorer('','', 
> '-onOutOfRangeValue Throw'), to control what happens when Pig's value is out 
> of range for the target Hive column.  Valid values for the option are Throw 
> and Null.  Throw makes the runtime raise an exception; Null, which is the 
> default, means NULL is written to the target column and a message to that 
> effect is emitted to the log.  Only one message per column/data type is sent 
> to the log.
> See the attached HCat-Pig Type Mapping Hive 0.13.pdf for the exact mappings.
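The Throw/Null behavior described above can be sketched as follows. This is an illustrative Python model of the option's semantics, not HCat's actual implementation; the function name and signature are hypothetical.

```python
def store_value(value, lo, hi, on_out_of_range="Null", log=None):
    """Apply the -onOutOfRangeValue policy to a single value.

    Returns the value if it fits the target column's range [lo, hi];
    otherwise applies the configured policy (Throw or Null).
    """
    if lo <= value <= hi:
        return value
    if on_out_of_range == "Throw":
        # Throw: make the runtime raise an exception.
        raise ValueError("value %r out of range [%r, %r]" % (value, lo, hi))
    # Null (the default): write NULL to the target column and log a message.
    if log is not None:
        log.append("out-of-range value replaced with NULL")
    return None
```

In the real storer the range would come from the Hive column type (e.g. TINYINT is [-128, 127]); here it is passed in explicitly to keep the sketch self-contained.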



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6549) remove templeton.jar from webhcat-default.xml, remove hcatalog/bin/hive-config.sh

2014-05-16 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14000446#comment-14000446
 ] 

Eugene Koifman commented on HIVE-6549:
--

no, this is definitely used.  I guess the proper version number is being set in 
the release branch but not on trunk

> remove templeton.jar from webhcat-default.xml, remove 
> hcatalog/bin/hive-config.sh
> -
>
> Key: HIVE-6549
> URL: https://issues.apache.org/jira/browse/HIVE-6549
> Project: Hive
>  Issue Type: Bug
>  Components: WebHCat
>Affects Versions: 0.12.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Minor
> Fix For: 0.14.0
>
> Attachments: HIVE-6549.2.patch, HIVE-6549.patch
>
>
> this property is no longer used
> also removed corresponding AppConfig.TEMPLETON_JAR_NAME
> hcatalog/bin/hive-config.sh is not used
> NO PRECOMMIT TESTS



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7075) JsonSerde raises NullPointerException when object key is not lower case

2014-05-16 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7075:
-

Component/s: HCatalog

> JsonSerde raises NullPointerException when object key is not lower case
> ---
>
> Key: HIVE-7075
> URL: https://issues.apache.org/jira/browse/HIVE-7075
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Serializers/Deserializers
>Affects Versions: 0.12.0
>Reporter: Yibing Shi
>
> We have noticed that the JsonSerde produces a NullPointerException if a JSON 
> object has a key that is not lower case. For example, assume we have the 
> file "one.json": 
> { "empId" : 123, "name" : "John" } 
> { "empId" : 456, "name" : "Jane" } 
> hive> CREATE TABLE emps (empId INT, name STRING) 
> ROW FORMAT SERDE "org.apache.hive.hcatalog.data.JsonSerDe"; 
> hive> LOAD DATA LOCAL INPATH 'one.json' INTO TABLE emps; 
> hive> SELECT * FROM emps; 
> Failed with exception java.io.IOException:java.lang.NullPointerException 
>  
> Note that it works if the keys are lower case. Assume we have the file 
> 'two.json': 
> { "empid" : 123, "name" : "John" } 
> { "empid" : 456, "name" : "Jane" } 
> hive> DROP TABLE emps; 
> hive> CREATE TABLE emps (empId INT, name STRING) 
> ROW FORMAT SERDE "org.apache.hive.hcatalog.data.JsonSerDe"; 
> hive> LOAD DATA LOCAL INPATH 'two.json' INTO TABLE emps;
> hive> SELECT * FROM emps; 
> OK 
> 123   John 
> 456   Jane
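A plausible reading of the failure: Hive lower-cases column names, so a serde that looks up JSON keys verbatim finds nothing for "empId". The idea behind the lower-case workaround can be sketched in Python; this is an illustrative model, not JsonSerDe's actual code.

```python
import json

def read_row(line, columns):
    """Map one JSON record onto the table's (lower-cased) column names.

    Normalizing keys to lower case on read makes "empId" in the data
    match the column "empid" in the table schema.
    """
    obj = json.loads(line)
    normalized = {k.lower(): v for k, v in obj.items()}
    # Missing keys become None (NULL) rather than raising.
    return [normalized.get(col) for col in columns]
```

With this normalization, both `one.json` (mixed-case keys) and `two.json` (lower-case keys) would produce the same rows.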



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6768) remove hcatalog/webhcat/svr/src/main/config/override-container-log4j.properties

2014-05-16 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13993793#comment-13993793
 ] 

Eugene Koifman commented on HIVE-6768:
--

[~thejas],[~hashutosh] the attached patch reverts all the changes from 
HIVE-5511 that were needed to handle the 'special' override-container-log4j, 
just as the bug description says.  HIVE-5511 also included some refactoring, 
which should not be reverted.

> remove 
> hcatalog/webhcat/svr/src/main/config/override-container-log4j.properties
> ---
>
> Key: HIVE-6768
> URL: https://issues.apache.org/jira/browse/HIVE-6768
> Project: Hive
>  Issue Type: Bug
>  Components: WebHCat
>Affects Versions: 0.13.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-6768.patch
>
>
> now that MAPREDUCE-5806 is fixed we can remove 
> override-container-log4j.properties and all the logic around it, which was 
> introduced in HIVE-5511 to work around MAPREDUCE-5806
> NO PRECOMMIT TESTS



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup

2014-05-15 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-7065:


 Summary: Hive jobs in webhcat run in default mr mode even in Hive 
on Tez setup
 Key: HIVE-7065
 URL: https://issues.apache.org/jira/browse/HIVE-7065
 Project: Hive
  Issue Type: Bug
  Components: Tez, WebHCat
Affects Versions: 0.13.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman


WebHCat config has templeton.hive.properties to specify Hive config properties 
that need to be passed to the Hive client on the node executing a job submitted 
through WebHCat (a Hive query, for example).

This should include "hive.execution.engine"
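As a rough illustration, a comma-separated templeton.hive.properties value could be expanded into --hiveconf arguments for the Hive client like this. The helper is hypothetical, not WebHCat's real parsing, which may also need to handle escaped commas inside property values.

```python
def hive_conf_args(templeton_hive_properties):
    """Turn 'k1=v1,k2=v2' into ['--hiveconf', 'k1=v1', '--hiveconf', 'k2=v2']."""
    args = []
    for prop in templeton_hive_properties.split(","):
        prop = prop.strip()
        if prop:
            args += ["--hiveconf", prop]
    return args
```

Including hive.execution.engine=tez in the property list would then make the launched Hive client run on Tez instead of the default mr engine.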



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7056) TestPig_11 fails with Pig 12.1 and earlier

2014-05-15 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7056:
-

Description: 
On trunk, the pig script (http://svn.apache.org/repos/asf/pig/trunk/bin/pig) 
looks for "*hcatalog-core-*.jar" etc.  In Pig 12.1 it looks for 
"hcatalog-core-*.jar", which doesn't work with Hive 0.13.
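The mismatch can be checked with a quick glob test, assuming the Hive 0.13 jar is named along the lines of hive-hcatalog-core-0.13.0.jar (the exact file name is an assumption here):

```python
from fnmatch import fnmatch

# Assumed Hive 0.13 jar name: the hcatalog jars gained a prefix,
# so a pattern without a leading wildcard no longer matches.
jar = "hive-hcatalog-core-0.13.0.jar"

old_pattern = "hcatalog-core-*.jar"    # what Pig 12.1 and earlier look for
new_pattern = "*hcatalog-core-*.jar"   # what the trunk pig script looks for

old_matches = fnmatch(jar, old_pattern)  # False: jar name has a prefix
new_matches = fnmatch(jar, new_pattern)  # True: leading * absorbs the prefix
```

This is why the older Pig script's `ls` probes in the log below come back empty, and the HCatStorer class can never be resolved.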

The TestPig_11 job fails with
{noformat}
2014-05-13 17:47:10,760 [main] ERROR org.apache.pig.PigServer - exception 
during parsing: Error during parsing. Could not resolve 
org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., 
org.apache.pig.builtin., org.apache.pig.impl.builtin.]
Failed to parse: Pig script failed to parse: 
 pig script failed to validate: 
org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not 
resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., 
org.apache.pig.builtin., org.apache.pig.impl.builtin.]
at 
org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:196)
at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1678)
at org.apache.pig.PigServer$Graph.access$000(PigServer.java:1411)
at org.apache.pig.PigServer.parseAndBuild(PigServer.java:344)
at org.apache.pig.PigServer.executeBatch(PigServer.java:369)
at org.apache.pig.PigServer.executeBatch(PigServer.java:355)
at 
org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:140)
at 
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:202)
at 
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:173)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
at org.apache.pig.Main.run(Main.java:478)
at org.apache.pig.Main.main(Main.java:156)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: 
 pig script failed to validate: 
org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not 
resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., 
org.apache.pig.builtin., org.apache.pig.impl.builtin.]
at 
org.apache.pig.parser.LogicalPlanBuilder.validateFuncSpec(LogicalPlanBuilder.java:1299)
at 
org.apache.pig.parser.LogicalPlanBuilder.buildFuncSpec(LogicalPlanBuilder.java:1284)
at 
org.apache.pig.parser.LogicalPlanGenerator.func_clause(LogicalPlanGenerator.java:5158)
at 
org.apache.pig.parser.LogicalPlanGenerator.store_clause(LogicalPlanGenerator.java:7756)
at 
org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1669)
at 
org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:1102)
at 
org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:560)
at 
org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:421)
at 
org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:188)
... 16 more
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 1070: 
Could not resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, 
java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.]
at org.apache.pig.impl.PigContext.resolveClassName(PigContext.java:653)
at 
org.apache.pig.parser.LogicalPlanBuilder.validateFuncSpec(LogicalPlanBuilder.java:1296)
... 24 more
{noformat}

The key to this is 
{noformat}
ls: 
/private/tmp/hadoop-ekoifman/nm-local-dir/usercache/ekoifman/appcache/application_1400018007772_0045/container_1400018007772_0045_01_02/apache-hive-0.14.0-SNAPSHOT-bin.tar.gz/apache-hive-0.14.0-SNAPSHOT-bin/lib/slf4j-api-*.jar:
 No such file or directory
ls: 
/private/tmp/hadoop-ekoifman/nm-local-dir/usercache/ekoifman/appcache/application_1400018007772_0045/container_1400018007772_0045_01_02/apache-hive-0.14.0-SNAPSHOT-bin.tar.gz/apache-hive-0.14.0-SNAPSHOT-bin/hcatalog/share/hcatalog/hcatalog-core-*.jar:
 No such file or directory
ls: 
/private/tmp/hadoop-ekoifman/nm-local-dir/usercache/ekoifman/appcache/application_1400018007772_0045/container_1400018007772_0045_01_02/apache-hive-0.14.0-SNAPSHOT-bin.tar.gz/apache-hive-0.14.0-SNAPSHOT-bin/hcatalog/share/hcatalog/hcatalog-*.jar:
 No such file or directory
ls: 
/private/tmp/hadoop-ekoifman/nm-local-dir/usercache/ekoifman/appcache/application_1400018007772_0045/container_1400018007772_0045_01_02/apache-hive-0.14.0-SNAPSHOT-bin.tar.gz/apache-hive-0.14.0-SNAPSHOT-bin/hcatalog/lib/hbase-storage-handler-*.jar:
 No such file or directory
{noformat}

[jira] [Updated] (HIVE-6316) Document support for new types in HCat

2014-05-15 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-6316:
-

Assignee: Lefty Leverenz

> Document support for new types in HCat
> --
>
> Key: HIVE-6316
> URL: https://issues.apache.org/jira/browse/HIVE-6316
> Project: Hive
>  Issue Type: Sub-task
>  Components: Documentation, HCatalog
>Affects Versions: 0.13.0
>Reporter: Eugene Koifman
>Assignee: Lefty Leverenz
>
> HIVE-5814 added support for new types in HCat.  The PDF file in that bug 
> explains exactly how these map to Pig types.  This should be added to the 
> Wiki somewhere (probably here 
> https://cwiki.apache.org/confluence/display/Hive/HCatalog+LoadStore).
> In particular it should be highlighted that copying data from Hive TIMESTAMP 
> to Pig DATETIME, any 'nanos' in the timestamp will be lost.  Also, HCatStorer 
> now takes new parameter which is described in the PDF doc.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7056) TestPig_11 fails with Pig 12.1 and earlier

2014-05-15 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7056:
-

Description: 
On trunk, the pig script (http://svn.apache.org/repos/asf/pig/trunk/bin/pig) 
looks for "\*hcatalog-core-*.jar" etc.  In Pig 12.1 it looks for 
"hcatalog-core-*.jar", which doesn't work with Hive 0.13.

The TestPig_11 job fails with
{noformat}
2014-05-13 17:47:10,760 [main] ERROR org.apache.pig.PigServer - exception 
during parsing: Error during parsing. Could not resolve 
org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., 
org.apache.pig.builtin., org.apache.pig.impl.builtin.]
Failed to parse: Pig script failed to parse: 
 pig script failed to validate: 
org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not 
resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., 
org.apache.pig.builtin., org.apache.pig.impl.builtin.]
at 
org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:196)
at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1678)
at org.apache.pig.PigServer$Graph.access$000(PigServer.java:1411)
at org.apache.pig.PigServer.parseAndBuild(PigServer.java:344)
at org.apache.pig.PigServer.executeBatch(PigServer.java:369)
at org.apache.pig.PigServer.executeBatch(PigServer.java:355)
at 
org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:140)
at 
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:202)
at 
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:173)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
at org.apache.pig.Main.run(Main.java:478)
at org.apache.pig.Main.main(Main.java:156)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: 
 pig script failed to validate: 
org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not 
resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., 
org.apache.pig.builtin., org.apache.pig.impl.builtin.]
at 
org.apache.pig.parser.LogicalPlanBuilder.validateFuncSpec(LogicalPlanBuilder.java:1299)
at 
org.apache.pig.parser.LogicalPlanBuilder.buildFuncSpec(LogicalPlanBuilder.java:1284)
at 
org.apache.pig.parser.LogicalPlanGenerator.func_clause(LogicalPlanGenerator.java:5158)
at 
org.apache.pig.parser.LogicalPlanGenerator.store_clause(LogicalPlanGenerator.java:7756)
at 
org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1669)
at 
org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:1102)
at 
org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:560)
at 
org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:421)
at 
org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:188)
... 16 more
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 1070: 
Could not resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, 
java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.]
at org.apache.pig.impl.PigContext.resolveClassName(PigContext.java:653)
at 
org.apache.pig.parser.LogicalPlanBuilder.validateFuncSpec(LogicalPlanBuilder.java:1296)
... 24 more
{noformat}

The key to this is 
{noformat}
ls: 
/private/tmp/hadoop-ekoifman/nm-local-dir/usercache/ekoifman/appcache/application_1400018007772_0045/container_1400018007772_0045_01_02/apache-hive-0.14.0-SNAPSHOT-bin.tar.gz/apache-hive-0.14.0-SNAPSHOT-bin/lib/slf4j-api-*.jar:
 No such file or directory
ls: 
/private/tmp/hadoop-ekoifman/nm-local-dir/usercache/ekoifman/appcache/application_1400018007772_0045/container_1400018007772_0045_01_02/apache-hive-0.14.0-SNAPSHOT-bin.tar.gz/apache-hive-0.14.0-SNAPSHOT-bin/hcatalog/share/hcatalog/hcatalog-core-*.jar:
 No such file or directory
ls: 
/private/tmp/hadoop-ekoifman/nm-local-dir/usercache/ekoifman/appcache/application_1400018007772_0045/container_1400018007772_0045_01_02/apache-hive-0.14.0-SNAPSHOT-bin.tar.gz/apache-hive-0.14.0-SNAPSHOT-bin/hcatalog/share/hcatalog/hcatalog-*.jar:
 No such file or directory
ls: 
/private/tmp/hadoop-ekoifman/nm-local-dir/usercache/ekoifman/appcache/application_1400018007772_0045/container_1400018007772_0045_01_02/apache-hive-0.14.0-SNAPSHOT-bin.tar.gz/apache-hive-0.14.0-SNAPSHOT-bin/hcatalog/lib/hbase-storage-handler-*.jar:
 No such file or directory
{noformat}

[jira] [Updated] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup

2014-05-15 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7065:
-

Attachment: HIVE-7065.patch

> Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
> -
>
> Key: HIVE-7065
> URL: https://issues.apache.org/jira/browse/HIVE-7065
> Project: Hive
>  Issue Type: Bug
>  Components: Tez, WebHCat
>Affects Versions: 0.13.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-7065.patch
>
>
> WebHCat config has templeton.hive.properties to specify Hive config 
> properties that need to be passed to Hive client on node executing a job 
> submitted through WebHCat (hive query, for example).
> this should include "hive.execution.engine"



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7056) TestPig_11 fails with Pig 12.1 and earlier

2014-05-15 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7056:
-

Description: 
On trunk, the pig script (http://svn.apache.org/repos/asf/pig/trunk/bin/pig) 
looks for "\*hcatalog-core-\*.jar" etc.  In Pig 12.1 it looks for 
"hcatalog-core-\*.jar", which doesn't work with Hive 0.13.

The TestPig_11 job fails with
{noformat}
2014-05-13 17:47:10,760 [main] ERROR org.apache.pig.PigServer - exception 
during parsing: Error during parsing. Could not resolve 
org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., 
org.apache.pig.builtin., org.apache.pig.impl.builtin.]
Failed to parse: Pig script failed to parse: 
 pig script failed to validate: 
org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not 
resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., 
org.apache.pig.builtin., org.apache.pig.impl.builtin.]
at 
org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:196)
at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1678)
at org.apache.pig.PigServer$Graph.access$000(PigServer.java:1411)
at org.apache.pig.PigServer.parseAndBuild(PigServer.java:344)
at org.apache.pig.PigServer.executeBatch(PigServer.java:369)
at org.apache.pig.PigServer.executeBatch(PigServer.java:355)
at 
org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:140)
at 
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:202)
at 
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:173)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
at org.apache.pig.Main.run(Main.java:478)
at org.apache.pig.Main.main(Main.java:156)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: 
 pig script failed to validate: 
org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not 
resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., 
org.apache.pig.builtin., org.apache.pig.impl.builtin.]
at 
org.apache.pig.parser.LogicalPlanBuilder.validateFuncSpec(LogicalPlanBuilder.java:1299)
at 
org.apache.pig.parser.LogicalPlanBuilder.buildFuncSpec(LogicalPlanBuilder.java:1284)
at 
org.apache.pig.parser.LogicalPlanGenerator.func_clause(LogicalPlanGenerator.java:5158)
at 
org.apache.pig.parser.LogicalPlanGenerator.store_clause(LogicalPlanGenerator.java:7756)
at 
org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1669)
at 
org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:1102)
at 
org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:560)
at 
org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:421)
at 
org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:188)
... 16 more
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 1070: 
Could not resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, 
java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.]
at org.apache.pig.impl.PigContext.resolveClassName(PigContext.java:653)
at 
org.apache.pig.parser.LogicalPlanBuilder.validateFuncSpec(LogicalPlanBuilder.java:1296)
... 24 more
{noformat}

The key to this is 
{noformat}
ls: 
/private/tmp/hadoop-ekoifman/nm-local-dir/usercache/ekoifman/appcache/application_1400018007772_0045/container_1400018007772_0045_01_02/apache-hive-0.14.0-SNAPSHOT-bin.tar.gz/apache-hive-0.14.0-SNAPSHOT-bin/lib/slf4j-api-*.jar:
 No such file or directory
ls: 
/private/tmp/hadoop-ekoifman/nm-local-dir/usercache/ekoifman/appcache/application_1400018007772_0045/container_1400018007772_0045_01_02/apache-hive-0.14.0-SNAPSHOT-bin.tar.gz/apache-hive-0.14.0-SNAPSHOT-bin/hcatalog/share/hcatalog/hcatalog-core-*.jar:
 No such file or directory
ls: 
/private/tmp/hadoop-ekoifman/nm-local-dir/usercache/ekoifman/appcache/application_1400018007772_0045/container_1400018007772_0045_01_02/apache-hive-0.14.0-SNAPSHOT-bin.tar.gz/apache-hive-0.14.0-SNAPSHOT-bin/hcatalog/share/hcatalog/hcatalog-*.jar:
 No such file or directory
ls: 
/private/tmp/hadoop-ekoifman/nm-local-dir/usercache/ekoifman/appcache/application_1400018007772_0045/container_1400018007772_0045_01_02/apache-hive-0.14.0-SNAPSHOT-bin.tar.gz/apache-hive-0.14.0-SNAPSHOT-bin/hcatalog/lib/hbase-storage-handler-*.jar:
 No such file or directory
{noformat}

[jira] [Updated] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup

2014-05-15 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7065:
-

Status: Patch Available  (was: Open)

> Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
> -
>
> Key: HIVE-7065
> URL: https://issues.apache.org/jira/browse/HIVE-7065
> Project: Hive
>  Issue Type: Bug
>  Components: Tez, WebHCat
>Affects Versions: 0.13.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-7065.patch
>
>
> WebHCat config has templeton.hive.properties to specify Hive config 
> properties that need to be passed to Hive client on node executing a job 
> submitted through WebHCat (hive query, for example).
> this should include "hive.execution.engine"



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7035) Templeton returns 500 for user errors - when job cannot be found

2014-05-15 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7035:
-

Attachment: HIVE-7035.patch

> Templeton returns 500 for user errors - when job cannot be found
> 
>
> Key: HIVE-7035
> URL: https://issues.apache.org/jira/browse/HIVE-7035
> Project: Hive
>  Issue Type: Bug
>  Components: WebHCat
>Affects Versions: 0.13.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-7035.patch
>
>
> curl -i 
> 'http://localhost:50111/templeton/v1/jobs/job_139949638_00011?user.name=ekoifman'
>  should return an HTTP 4xx status code when no such job exists; it currently 
> returns 500.
> {noformat}
> {"error":"org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: 
> Application with id 'application_201304291205_0015' doesn't exist in 
> RM.\r\n\tat org.apache.hadoop.yarn.server.resourcemanager
> .ClientRMService.getApplicationReport(ClientRMService.java:247)\r\n\tat 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocol
> PBServiceImpl.java:120)\r\n\tat 
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:241)\r\n\tat
>  org.apache.hado
> op.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)\r\n\tat
>  org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)\r\n\tat 
> org.apache.hadoop.ipc.Server$Handler$1.run(Serve
> r.java:2053)\r\n\tat 
> org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)\r\n\tat 
> java.security.AccessController.doPrivileged(Native Method)\r\n\tat 
> javax.security.auth.Subject.doAs(Subject.ja
> va:415)\r\n\tat 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)\r\n\tat
>  org.apache.hadoop.ipc.Server$Handler.run(Server.java:2047)\r\n"}
> {noformat}
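The intent of the fix can be sketched as a status-code mapping: treat a job that cannot be found as a client error (404) rather than letting the exception surface as a 500. The exception name comes from the log above; the helper itself is purely illustrative, not Templeton's actual code.

```python
def status_for(error_message):
    """Pick an HTTP status for a job-lookup failure (illustrative sketch)."""
    if "ApplicationNotFoundException" in error_message:
        # The job id the caller asked about does not exist: a client error.
        return 404
    # Anything else is a genuine server-side failure.
    return 500
```

Mapping on the concrete exception type (rather than on the message string, as this sketch does for simplicity) would be the more robust design.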



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7035) Templeton returns 500 for user errors - when job cannot be found

2014-05-14 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7035:
-

Status: Patch Available  (was: Open)

> Templeton returns 500 for user errors - when job cannot be found
> 
>
> Key: HIVE-7035
> URL: https://issues.apache.org/jira/browse/HIVE-7035
> Project: Hive
>  Issue Type: Bug
>  Components: WebHCat
>Affects Versions: 0.13.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-7035.patch
>
>
> curl -i 
> 'http://localhost:50111/templeton/v1/jobs/job_139949638_00011?user.name=ekoifman'
>  should return an HTTP 4xx status code when no such job exists; it currently 
> returns 500.
> {noformat}
> {"error":"org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: 
> Application with id 'application_201304291205_0015' doesn't exist in 
> RM.\r\n\tat org.apache.hadoop.yarn.server.resourcemanager
> .ClientRMService.getApplicationReport(ClientRMService.java:247)\r\n\tat 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocol
> PBServiceImpl.java:120)\r\n\tat 
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:241)\r\n\tat
>  org.apache.hado
> op.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)\r\n\tat
>  org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)\r\n\tat 
> org.apache.hadoop.ipc.Server$Handler$1.run(Serve
> r.java:2053)\r\n\tat 
> org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)\r\n\tat 
> java.security.AccessController.doPrivileged(Native Method)\r\n\tat 
> javax.security.auth.Subject.doAs(Subject.ja
> va:415)\r\n\tat 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)\r\n\tat
>  org.apache.hadoop.ipc.Server$Handler.run(Server.java:2047)\r\n"}
> {noformat}
> NO PRECOMMIT TESTS



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7056) WebHCat TestPig_11 fails with Pig 12.1 and earlier on Hive 0.13

2014-05-14 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7056:
-

Summary: WebHCat TestPig_11 fails with Pig 12.1 and earlier on Hive 0.13  
(was: TestPig_11 fails with Pig 12.1 and earlier)

> WebHCat TestPig_11 fails with Pig 12.1 and earlier on Hive 0.13
> ---
>
> Key: HIVE-7056
> URL: https://issues.apache.org/jira/browse/HIVE-7056
> Project: Hive
>  Issue Type: Bug
>  Components: WebHCat
>Affects Versions: 0.13.0
>Reporter: Eugene Koifman
>
> On trunk, the pig script (http://svn.apache.org/repos/asf/pig/trunk/bin/pig) 
> looks for "\*hcatalog-core-\*.jar" etc.  In Pig 12.1 it looks for 
> "hcatalog-core-\*.jar", which doesn't work with Hive 0.13.
> The TestPig_11 job fails with
> {noformat}
> 2014-05-13 17:47:10,760 [main] ERROR org.apache.pig.PigServer - exception 
> during parsing: Error during parsing. Could not resolve 
> org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., 
> org.apache.pig.builtin., org.apache.pig.impl.builtin.]
> Failed to parse: Pig script failed to parse: 
>  pig script failed to validate: 
> org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not 
> resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., 
> org.apache.pig.builtin., org.apache.pig.impl.builtin.]
>   at 
> org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:196)
>   at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1678)
>   at org.apache.pig.PigServer$Graph.access$000(PigServer.java:1411)
>   at org.apache.pig.PigServer.parseAndBuild(PigServer.java:344)
>   at org.apache.pig.PigServer.executeBatch(PigServer.java:369)
>   at org.apache.pig.PigServer.executeBatch(PigServer.java:355)
>   at 
> org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:140)
>   at 
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:202)
>   at 
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:173)
>   at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
>   at org.apache.pig.Main.run(Main.java:478)
>   at org.apache.pig.Main.main(Main.java:156)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> Caused by: 
>  pig script failed to validate: 
> org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not 
> resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., 
> org.apache.pig.builtin., org.apache.pig.impl.builtin.]
>   at 
> org.apache.pig.parser.LogicalPlanBuilder.validateFuncSpec(LogicalPlanBuilder.java:1299)
>   at 
> org.apache.pig.parser.LogicalPlanBuilder.buildFuncSpec(LogicalPlanBuilder.java:1284)
>   at 
> org.apache.pig.parser.LogicalPlanGenerator.func_clause(LogicalPlanGenerator.java:5158)
>   at 
> org.apache.pig.parser.LogicalPlanGenerator.store_clause(LogicalPlanGenerator.java:7756)
>   at 
> org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1669)
>   at 
> org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:1102)
>   at 
> org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:560)
>   at 
> org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:421)
>   at 
> org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:188)
>   ... 16 more
> Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 1070: 
> Could not resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, 
> java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.]
>   at org.apache.pig.impl.PigContext.resolveClassName(PigContext.java:653)
>   at 
> org.apache.pig.parser.LogicalPlanBuilder.validateFuncSpec(LogicalPlanBuilder.java:1296)
>   ... 24 more
> {noformat}
> the key to this is 
> {noformat}
> ls: 
> /private/tmp/hadoop-ekoifman/nm-local-dir/usercache/ekoifman/appcache/application_1400018007772_0045/container_1400018007772_0045_01_02/apache-hive-0.14.0-SNAPSHOT-bin.tar.gz/apache-hive-0.14.0-SNAPSHOT-bin/lib/slf4j-api-*.jar:
>  No such file or directory
> ls: 
> /private/tmp/hadoop-ekoifman/nm-local-dir/usercache/ekoifman/appcache/application_1400018007772_0045/container_1400018007772_0045_01_02/apache-hive-0.14.0-SNAPSHOT-bin.tar.gz/apache-hive-0.14.0-SNAPSHOT-bin/hcatalog/share/hcatalog/hcatalog-core-*.ja

[jira] [Updated] (HIVE-6549) remove templeton.jar from webhcat-default.xml, remove hcatalog/bin/hive-config.sh

2014-05-14 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-6549:
-

Attachment: HIVE-6549.2.patch

addressed [~thejas]'s comments

> remove templeton.jar from webhcat-default.xml, remove 
> hcatalog/bin/hive-config.sh
> -
>
> Key: HIVE-6549
> URL: https://issues.apache.org/jira/browse/HIVE-6549
> Project: Hive
>  Issue Type: Bug
>  Components: WebHCat
>Affects Versions: 0.12.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Minor
> Attachments: HIVE-6549.2.patch, HIVE-6549.patch
>
>
> this property is no longer used
> also removed corresponding AppConfig.TEMPLETON_JAR_NAME
> hcatalog/bin/hive-config.sh is not used
> NO PRECOMMIT TESTS



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-7056) TestPig_11 fails with Pig 12.1 and earlier

2014-05-14 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-7056:


 Summary: TestPig_11 fails with Pig 12.1 and earlier
 Key: HIVE-7056
 URL: https://issues.apache.org/jira/browse/HIVE-7056
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Reporter: Eugene Koifman


on trunk, the pig script (http://svn.apache.org/repos/asf/pig/trunk/bin/pig) is 
looking for "*hcatalog-core-*.jar" etc.  In Pig 12.1 it's looking for 
"hcatalog-core-*.jar", which doesn't work with Hive 0.13.

The TestPig_11 job fails with
{noformat}
2014-05-13 17:47:10,760 [main] ERROR org.apache.pig.PigServer - exception 
during parsing: Error during parsing. Could not resolve 
org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., 
org.apache.pig.builtin., org.apache.pig.impl.builtin.]
Failed to parse: Pig script failed to parse: 
 pig script failed to validate: 
org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not 
resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., 
org.apache.pig.builtin., org.apache.pig.impl.builtin.]
at 
org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:196)
at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1678)
at org.apache.pig.PigServer$Graph.access$000(PigServer.java:1411)
at org.apache.pig.PigServer.parseAndBuild(PigServer.java:344)
at org.apache.pig.PigServer.executeBatch(PigServer.java:369)
at org.apache.pig.PigServer.executeBatch(PigServer.java:355)
at 
org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:140)
at 
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:202)
at 
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:173)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
at org.apache.pig.Main.run(Main.java:478)
at org.apache.pig.Main.main(Main.java:156)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: 
 pig script failed to validate: 
org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not 
resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., 
org.apache.pig.builtin., org.apache.pig.impl.builtin.]
at 
org.apache.pig.parser.LogicalPlanBuilder.validateFuncSpec(LogicalPlanBuilder.java:1299)
at 
org.apache.pig.parser.LogicalPlanBuilder.buildFuncSpec(LogicalPlanBuilder.java:1284)
at 
org.apache.pig.parser.LogicalPlanGenerator.func_clause(LogicalPlanGenerator.java:5158)
at 
org.apache.pig.parser.LogicalPlanGenerator.store_clause(LogicalPlanGenerator.java:7756)
at 
org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1669)
at 
org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:1102)
at 
org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:560)
at 
org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:421)
at 
org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:188)
... 16 more
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 1070: 
Could not resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, 
java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.]
at org.apache.pig.impl.PigContext.resolveClassName(PigContext.java:653)
at 
org.apache.pig.parser.LogicalPlanBuilder.validateFuncSpec(LogicalPlanBuilder.java:1296)
... 24 more
{noformat}

the key to this is 
{noformat}
ls: 
/private/tmp/hadoop-ekoifman/nm-local-dir/usercache/ekoifman/appcache/application_1400018007772_0045/container_1400018007772_0045_01_02/apache-hive-0.14.0-SNAPSHOT-bin.tar.gz/apache-hive-0.14.0-SNAPSHOT-bin/lib/slf4j-api-*.jar:
 No such file or directory
ls: 
/private/tmp/hadoop-ekoifman/nm-local-dir/usercache/ekoifman/appcache/application_1400018007772_0045/container_1400018007772_0045_01_02/apache-hive-0.14.0-SNAPSHOT-bin.tar.gz/apache-hive-0.14.0-SNAPSHOT-bin/hcatalog/share/hcatalog/hcatalog-core-*.jar:
 No such file or directory
ls: 
/private/tmp/hadoop-ekoifman/nm-local-dir/usercache/ekoifman/appcache/application_1400018007772_0045/container_1400018007772_0045_01_02/apache-hive-0.14.0-SNAPSHOT-bin.tar.gz/apache-hive-0.14.0-SNAPSHOT-bin/hcatalog/share/hcatalog/hcatalog-*.jar:
 No such file or directory
ls: 
/private/tmp/hadoop-ekoifman/nm-local-dir/usercache/ekoifman/appcache/application_1400018007772_0045/container_1400018007772_0045_01_000
