[jira] [Commented] (HIVE-4997) HCatalog doesn't allow multiple input tables

2014-08-16 Thread Zhichun Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099716#comment-14099716
 ] 

Zhichun Wu commented on HIVE-4997:
--

[~dintskirveli]:

Your approach attaches each InputInfo to its InputSplit in 
HCatDelegatingInputFormat#getSplits and generates the InputJobInfo in 
HCatDelegatingInputFormat#createRecordReader from the attached InputInfo. 
Because every map has to query the Hive metastore service to generate its 
InputJobInfo, I think this may put a heavy load on the metastore service when 
the number of maps is large. Also, on a secure Hadoop cluster, each map has to 
acquire a delegation token in order to access the metastore service. The 
current patch hasn't taken this into consideration.

Instead, I think we can generate each InputJobInfo at the time a table is 
added, then serialize the array of InputJobInfo objects and attach it to the 
job conf. getSplits and createRecordReader can then fetch each InputJobInfo 
from the job conf, which avoids querying the metastore service in the map 
phase. I've changed the usage for adding multiple input tables as below:
{code}
 HCatMultipleInputs.init(job);
 HCatMultipleInputs.addInput(test_table1, "default", null, SequenceMapper.class);
 HCatMultipleInputs.addInput(test_table2, null, "part='1'", TextMapper1.class);
 HCatMultipleInputs.addInput(test_table2, null, "part='2'", TextMapper2.class);
 HCatMultipleInputs.build();
{code}
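
As a rough illustration of the idea above (build every descriptor on the client, serialize the list once, and read it back in the tasks), here is a minimal Java sketch. It is not code from HIVE-4997.4.patch: TableInput is a hypothetical stand-in for InputJobInfo, the key "sketch.hcat.multi.inputs" is invented, and a plain Map stands in for the Hadoop job conf so that the example runs without Hadoop on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Base64;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch only: a plain Map stands in for the Hadoop job configuration. */
class MultiInputConfSketch {

    /** Hypothetical per-table descriptor, standing in for HCatalog's InputJobInfo. */
    static final class TableInput implements Serializable {
        private static final long serialVersionUID = 1L;
        final String database;
        final String table;
        final String filter;      // partition filter, may be null
        final String mapperClass; // name of the mapper to run for this input

        TableInput(String database, String table, String filter, String mapperClass) {
            this.database = database;
            this.table = table;
            this.filter = filter;
            this.mapperClass = mapperClass;
        }
    }

    /** Hypothetical configuration key; not a real HCatalog property. */
    static final String KEY = "sketch.hcat.multi.inputs";

    /** Client side: serialize all descriptors once and store them in the conf. */
    static void store(Map<String, String> conf, List<TableInput> inputs) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(new ArrayList<>(inputs));
        }
        conf.put(KEY, Base64.getEncoder().encodeToString(bytes.toByteArray()));
    }

    /** Task side: read the descriptors back without contacting any service. */
    @SuppressWarnings("unchecked")
    static List<TableInput> load(Map<String, String> conf)
            throws IOException, ClassNotFoundException {
        byte[] raw = Base64.getDecoder().decode(conf.get(KEY));
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return (List<TableInput>) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> conf = new HashMap<>();
        List<TableInput> inputs = new ArrayList<>();
        inputs.add(new TableInput("default", "test_table1", null, "SequenceMapper"));
        inputs.add(new TableInput("default", "test_table2", "part='1'", "TextMapper1"));
        store(conf, inputs);
        for (TableInput t : load(conf)) {
            System.out.println(t.database + "." + t.table
                    + " filter=" + t.filter + " mapper=" + t.mapperClass);
        }
    }
}
```

In this sketch the cost that would otherwise be paid once per map is paid once per added table at job-setup time, and the tasks only decode a string that is already in their configuration.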

I've uploaded HIVE-4997.4.patch, which is based on HIVE-4997.3.patch. It works 
on our secure Hadoop 2.2.0 cluster. I'm uploading it only to demonstrate the 
idea; I haven't put much thought into the code quality or the design of this 
new feature.

 

> HCatalog doesn't allow multiple input tables
> 
>
> Key: HIVE-4997
> URL: https://issues.apache.org/jira/browse/HIVE-4997
> Project: Hive
> Issue Type: Improvement
> Components: HCatalog
> Affects Versions: 0.13.0
> Reporter: Daniel Intskirveli
> Fix For: 0.14.0
>
> Attachments: HIVE-4997.2.patch, HIVE-4997.3.patch, HIVE-4997.4.patch
>
>
> HCatInputFormat does not allow reading from multiple hive tables in the same 
> MapReduce job. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-4997) HCatalog doesn't allow multiple input tables

2014-08-16 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099701#comment-14099701
 ] 

Hive QA commented on HIVE-4997:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12662294/HIVE-4997.4.patch

{color:green}SUCCESS:{color} +1 5818 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/362/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/362/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-362/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12662294



[jira] [Commented] (HIVE-4997) HCatalog doesn't allow multiple input tables

2014-08-12 Thread Zhichun Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14094026#comment-14094026
 ] 

Zhichun Wu commented on HIVE-4997:
--

HIVE-4997.3.patch uses 
{code}
JobContext ctx = new JobContext(conf, jobContext.getJobID());
{code}
at line 57 of HCatDelegatingInputFormat, which is not compatible with 
hadoop-2, where JobContext is an interface and cannot be instantiated 
directly. Changing it as below would be fine:
{code}
JobContext ctx = ShimLoader.getHadoopShims().getHCatShim().createJobContext(conf,
    jobContext.getJobID());
{code}
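
To show why going through a shim helps here, the following is a minimal, self-contained Java sketch of the pattern. None of these types are Hadoop or Hive classes: JobContextLike, ContextShim, V1Shim, V2Shim and shimFor are hypothetical names used only for illustration. The point is that the caller asks a version-specific factory for the context instead of calling a constructor that exists in only one Hadoop line.

```java
/** Sketch only: every type here is a hypothetical stand-in, not a Hadoop or Hive class. */
class JobContextShimSketch {

    /** What callers program against, regardless of the runtime version. */
    interface JobContextLike {
        String jobId();
    }

    /** The shim: a factory that hides how a context is constructed. */
    interface ContextShim {
        JobContextLike createJobContext(String jobId);
    }

    /** Stand-in for a runtime where the context is a directly constructible class. */
    static final class V1Shim implements ContextShim {
        @Override
        public JobContextLike createJobContext(String jobId) {
            return () -> "v1:" + jobId;
        }
    }

    /** Stand-in for a runtime where the context is an interface with a separate impl. */
    static final class V2Shim implements ContextShim {
        @Override
        public JobContextLike createJobContext(String jobId) {
            return () -> "v2:" + jobId;
        }
    }

    /** Picks a shim from a version string, loosely mirroring what a shim loader does. */
    static ContextShim shimFor(String version) {
        return version.startsWith("1.") ? new V1Shim() : new V2Shim();
    }

    public static void main(String[] args) {
        // The calling code is identical for both versions.
        for (String version : new String[] {"1.2.1", "2.2.0"}) {
            JobContextLike ctx = shimFor(version).createJobContext("job_0001");
            System.out.println(version + " -> " + ctx.jobId());
        }
    }
}
```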

> HCatalog doesn't allow multiple input tables
> 
>
> Key: HIVE-4997
> URL: https://issues.apache.org/jira/browse/HIVE-4997
> Project: Hive
> Issue Type: Improvement
> Components: HCatalog
> Affects Versions: 0.13.0
> Reporter: Daniel Intskirveli
> Fix For: 0.14.0
>
> Attachments: HIVE-4997.2.patch, HIVE-4997.3.patch
>
>
> HCatInputFormat does not allow reading from multiple hive tables in the same 
> MapReduce job. 





[jira] [Commented] (HIVE-4997) HCatalog doesn't allow multiple input tables

2014-05-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13994198#comment-13994198
 ] 

Hive QA commented on HIVE-4997:
---



{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12610346/HIVE-4997.3.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/162/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/162/console

Messages:
{noformat}
 This message was trimmed, see log for full details 
[mkdir] Created dir: 
/data/hive-ptest/working/apache-svn-trunk-source/contrib/target/warehouse
[mkdir] Created dir: 
/data/hive-ptest/working/apache-svn-trunk-source/contrib/target/tmp/conf
 [copy] Copying 5 files to 
/data/hive-ptest/working/apache-svn-trunk-source/contrib/target/tmp/conf
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ 
hive-contrib ---
[INFO] Compiling 2 source files to 
/data/hive-ptest/working/apache-svn-trunk-source/contrib/target/test-classes
[WARNING] Note: 
/data/hive-ptest/working/apache-svn-trunk-source/contrib/src/test/org/apache/hadoop/hive/contrib/serde2/TestRegexSerDe.java
 uses or overrides a deprecated API.
[WARNING] Note: Recompile with -Xlint:deprecation for details.
[INFO] 
[INFO] --- maven-surefire-plugin:2.16:test (default-test) @ hive-contrib ---
[INFO] Tests are skipped.
[INFO] 
[INFO] --- maven-jar-plugin:2.2:jar (default-jar) @ hive-contrib ---
[INFO] Building jar: 
/data/hive-ptest/working/apache-svn-trunk-source/contrib/target/hive-contrib-0.14.0-SNAPSHOT.jar
[INFO] 
[INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ 
hive-contrib ---
[INFO] 
[INFO] --- maven-install-plugin:2.4:install (default-install) @ hive-contrib ---
[INFO] Installing 
/data/hive-ptest/working/apache-svn-trunk-source/contrib/target/hive-contrib-0.14.0-SNAPSHOT.jar
 to 
/data/hive-ptest/working/maven/org/apache/hive/hive-contrib/0.14.0-SNAPSHOT/hive-contrib-0.14.0-SNAPSHOT.jar
[INFO] Installing 
/data/hive-ptest/working/apache-svn-trunk-source/contrib/pom.xml to 
/data/hive-ptest/working/maven/org/apache/hive/hive-contrib/0.14.0-SNAPSHOT/hive-contrib-0.14.0-SNAPSHOT.pom
[INFO] 
[INFO] 
[INFO] Building Hive HBase Handler 0.14.0-SNAPSHOT
[INFO] 
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive-hbase-handler ---
[INFO] Deleting /data/hive-ptest/working/apache-svn-trunk-source/hbase-handler 
(includes = [datanucleus.log, derby.log], excludes = [])
[INFO] 
[INFO] --- maven-remote-resources-plugin:1.5:process (default) @ 
hive-hbase-handler ---
[INFO] 
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ 
hive-hbase-handler ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory 
/data/hive-ptest/working/apache-svn-trunk-source/hbase-handler/src/main/resources
[INFO] Copying 3 resources
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (define-classpath) @ hive-hbase-handler 
---
[INFO] Executing tasks

main:
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ 
hive-hbase-handler ---
[INFO] Compiling 19 source files to 
/data/hive-ptest/working/apache-svn-trunk-source/hbase-handler/target/classes
[WARNING] Note: Some input files use or override a deprecated API.
[WARNING] Note: Recompile with -Xlint:deprecation for details.
[INFO] 
[INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ 
hive-hbase-handler ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory 
/data/hive-ptest/working/apache-svn-trunk-source/hbase-handler/src/test/resources
[INFO] Copying 3 resources
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ hive-hbase-handler 
---
[INFO] Executing tasks

main:
[mkdir] Created dir: 
/data/hive-ptest/working/apache-svn-trunk-source/hbase-handler/target/tmp
[mkdir] Created dir: 
/data/hive-ptest/working/apache-svn-trunk-source/hbase-handler/target/warehouse
[mkdir] Created dir: 
/data/hive-ptest/working/apache-svn-trunk-source/hbase-handler/target/tmp/conf
 [copy] Copying 5 files to 
/data/hive-ptest/working/apache-svn-trunk-source/hbase-handler/target/tmp/conf
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ 
hive-hbase-handler ---
[INFO] Compiling 4 source files to 
/data/hive-ptest/working/apache-svn-trunk-source/hbase-handler/target/test-classes
[WARNING] Note: Some input files use or overr

[jira] [Commented] (HIVE-4997) HCatalog doesn't allow multiple input tables

2013-10-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13806136#comment-13806136
 ] 

Hive QA commented on HIVE-4997:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12610346/HIVE-4997.3.patch

{color:green}SUCCESS:{color} +1 4456 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1248/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1248/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

> HCatalog doesn't allow multiple input tables
> 
>
> Key: HIVE-4997
> URL: https://issues.apache.org/jira/browse/HIVE-4997
> Project: Hive
> Issue Type: Improvement
> Components: HCatalog
> Affects Versions: 0.13.0
> Reporter: Daniel Intskirveli
> Fix For: 0.13.0
>
> Attachments: HIVE-4997.2.patch, HIVE-4997.3.patch
>
>
> HCatInputFormat does not allow reading from multiple hive tables in the same 
> MapReduce job. 



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HIVE-4997) HCatalog doesn't allow multiple input tables

2013-10-25 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13805753#comment-13805753
 ] 

Sushanth Sowmyan commented on HIVE-4997:


[~dintskirveli], as I mentioned in my previous comment, could you please attach 
a comment or design doc of sorts outlining the need, goal, implementation and 
potential issues (if any)? Since this is an interface change for HCat as a 
whole, we'd like to discuss whether this is the right thing to do.



[jira] [Commented] (HIVE-4997) HCatalog doesn't allow multiple input tables

2013-10-02 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13784275#comment-13784275
 ] 

Sushanth Sowmyan commented on HIVE-4997:


Rashad, I'm afraid it's a little late for feature improvements to ship with 
0.12, which is in a lockdown mode for bugfixes only.

There are two things that need to happen before this patch gets included in 
0.13: 

a) The patch needs to be regenerated to change all references to the 
org.apache.hcatalog package to the org.apache.hive.hcatalog package. (The 
former package is deprecated, will be maintained for only one more release 
before removal, and is considered frozen as of 0.11.)

b) If possible, since this is a pretty big feature (in that it adds 
functionality to the user-facing API), attach a design document with this patch 
outlining the goal, implementation and potential issues (if any), and have it 
reviewed by another committer to see if that's okay. I'd suggest [~alangates] 
or [~toffer], since they've looked at MultiOutputFormat earlier and can make 
sure that it is consistent in design. After that, they can review the patch and 
commit.
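
For point (a), the mechanical part of the migration can be sketched as a simple text rewrite. This is only an illustration, not a tool from the Hive project: the class HCatPackageRenameSketch and its migrate method are hypothetical, and a real regeneration of the patch would also need to move the files into the new package directories.

```java
import java.util.regex.Pattern;

/** Sketch only: rewrites references to the deprecated HCatalog package name. */
class HCatPackageRenameSketch {

    // Matches the old package prefix as a whole; text that already uses
    // org.apache.hive.hcatalog does not contain this sequence and is left alone.
    private static final Pattern OLD_PACKAGE =
            Pattern.compile("\\borg\\.apache\\.hcatalog\\b");

    /** Returns the source text with old package references replaced. */
    static String migrate(String source) {
        return OLD_PACKAGE.matcher(source).replaceAll("org.apache.hive.hcatalog");
    }

    public static void main(String[] args) {
        String before = "import org.apache.hcatalog.mapreduce.HCatInputFormat;";
        System.out.println(migrate(before));
    }
}
```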



> HCatalog doesn't allow multiple input tables
> 
>
> Key: HIVE-4997
> URL: https://issues.apache.org/jira/browse/HIVE-4997
> Project: Hive
> Issue Type: Improvement
> Components: HCatalog
> Affects Versions: 0.12.0
> Reporter: Daniel Intskirveli
> Priority: Minor
> Fix For: 0.12.0
>
> Attachments: HIVE-4997.1.patch
>
>
> HCatInputFormat does not allow reading from multiple hive tables in the same 
> MapReduce job. 



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HIVE-4997) HCatalog doesn't allow multiple input tables

2013-10-02 Thread Rashad Tatum (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13784073#comment-13784073
 ] 

Rashad Tatum commented on HIVE-4997:


From what I see, Daniel added 4 files using the patch. Is there anything 
blocking this from being included in Hive 0.12.0? Is there anything I can do 
to help?



[jira] [Commented] (HIVE-4997) HCatalog doesn't allow multiple input tables

2013-08-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13730111#comment-13730111
 ] 

Hive QA commented on HIVE-4997:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12596206/HIVE-4997.1.patch

{color:green}SUCCESS:{color} +1 2760 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/312/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/312/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira