[ https://issues.apache.org/jira/browse/PIG-4233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172212#comment-14172212 ]

Praveen Rachabattuni commented on PIG-4233:
-------------------------------------------

[~rohini] So, the idea would be to submit the Pig dependency jars along with the 
Pig snapshot jar, leaving the Spark jars to the cluster. This can be done either 
by adding all jars from the lib directory to the classpath or by using the 
legacy jar. Let me know if I am missing something.
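
A minimal sketch of the first option, assuming the dependency jars live under 
$PIG_HOME/lib and that Pig's PIG_CLASSPATH variable is honored in Spark mode 
(an assumption, not something verified in this issue):

    # Hypothetical: collect every jar under $PIG_HOME/lib into a
    # colon-separated list and expose it through PIG_CLASSPATH.
    PIG_CLASSPATH=""
    for jar in "$PIG_HOME"/lib/*.jar; do
        PIG_CLASSPATH="${PIG_CLASSPATH:+$PIG_CLASSPATH:}$jar"
    done
    export PIG_CLASSPATH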

I have created a new JIRA (PIG-4236) to avoid packaging the Spark jars along 
with the Pig dependencies.

> Package pig along with dependencies into a fat jar during job submission to a 
> Spark cluster
> ----------------------------------------------------------------------------------------
>
>                 Key: PIG-4233
>                 URL: https://issues.apache.org/jira/browse/PIG-4233
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: Praveen Rachabattuni
>            Assignee: Praveen Rachabattuni
>         Attachments: PIG-4233.patch
>
>
> Currently, we have a fat jar in the legacy directory that contains Pig along 
> with its dependencies. 
> We would need to modify build.xml to include the Spark dependency jars in the 
> legacy fat jar.
> Running a job on a Spark cluster:
> 1. export SPARK_HOME=/path/to/spark
> 2. export 
> SPARK_PIG_JAR=$PIG_HOME/legacy/pig-0.14.0-SNAPSHOT-withouthadoop-h1.jar
> 3. export SPARK_MASTER=spark://localhost:7077
> 4. export HADOOP_HOME=/path/to/hadoop
> 5. Launch the job using ./bin/pig -x spark
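
The steps above, consolidated into one shell session for convenience; the paths 
are placeholders and the jar name is taken verbatim from step 2:

    # Sketch of steps 1-5 above (replace the /path/to/... placeholders).
    export SPARK_HOME=/path/to/spark
    export SPARK_PIG_JAR=$PIG_HOME/legacy/pig-0.14.0-SNAPSHOT-withouthadoop-h1.jar
    export SPARK_MASTER=spark://localhost:7077
    export HADOOP_HOME=/path/to/hadoop
    ./bin/pig -x spark    # launch the job in Spark execution mode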



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
