[jira] [Updated] (PIG-5246) Modify bin/pig about SPARK_HOME, SPARK_ASSEMBLY_JAR after upgrading spark to 2

2017-07-24 Thread liyunzhang_intel (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liyunzhang_intel updated PIG-5246:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Modify bin/pig about SPARK_HOME, SPARK_ASSEMBLY_JAR after upgrading spark to 2
> --
>
> Key: PIG-5246
> URL: https://issues.apache.org/jira/browse/PIG-5246
> Project: Pig
>  Issue Type: Bug
>Reporter: liyunzhang_intel
>Assignee: liyunzhang_intel
> Attachments: HBase9498.patch, PIG-5246.1.patch, PIG-5246_2.patch, 
> PIG-5246.3.patch, PIG-5246_4.patch, PIG-5246.patch
>
>
> in bin/pig.
> we copy assembly jar to pig's classpath in spark1.6.
> {code}
> # For spark mode:
> # Please specify SPARK_HOME first so that we can locate 
> $SPARK_HOME/lib/spark-assembly*.jar,
> # we will add spark-assembly*.jar to the classpath.
> if [ "$isSparkMode"  == "true" ]; then
> if [ -z "$SPARK_HOME" ]; then
>echo "Error: SPARK_HOME is not set!"
>exit 1
> fi
> # Please specify SPARK_JAR which is the hdfs path of spark-assembly*.jar 
> to allow YARN to cache spark-assembly*.jar on nodes so that it doesn't need 
> to be distributed each time an application runs.
> if [ -z "$SPARK_JAR" ]; then
>echo "Error: SPARK_JAR is not set, SPARK_JAR stands for the hdfs 
> location of spark-assembly*.jar. This allows YARN to cache 
> spark-assembly*.jar on nodes so that it doesn't need to be distributed each 
> time an application runs."
>exit 1
> fi
> if [ -n "$SPARK_HOME" ]; then
> echo "Using Spark Home: " ${SPARK_HOME}
> SPARK_ASSEMBLY_JAR=`ls ${SPARK_HOME}/lib/spark-assembly*`
> CLASSPATH=${CLASSPATH}:$SPARK_ASSEMBLY_JAR
> fi
> fi
> {code}
> after upgrade to spark2.0, we may modify it



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (PIG-5246) Modify bin/pig about SPARK_HOME, SPARK_ASSEMBLY_JAR after upgrading spark to 2

2017-07-14 Thread liyunzhang_intel (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liyunzhang_intel updated PIG-5246:
--
Attachment: PIG-5246_4.patch

[~nkollar]:
changes in PIG-5246_4.patch:
{code}
CLASSPATH=${CLASSPATH}:${SPARK_HOME}/lib/spark-assembly*
{code}

to 
{code}
   SPARK_ASSEMBLY_JAR=`ls ${SPARK_HOME}/lib/spark-assembly*`
   CLASSPATH=${CLASSPATH}:$SPARK_ASSEMBLY_JAR

{code}

can not use wildcard to locate the spark_assembly_jar.
After all unit tests pass on my local jenkins.  will close PIG-5157

> Modify bin/pig about SPARK_HOME, SPARK_ASSEMBLY_JAR after upgrading spark to 2
> --
>
> Key: PIG-5246
> URL: https://issues.apache.org/jira/browse/PIG-5246
> Project: Pig
>  Issue Type: Bug
>Reporter: liyunzhang_intel
>Assignee: liyunzhang_intel
> Attachments: HBase9498.patch, PIG-5246.1.patch, PIG-5246_2.patch, 
> PIG-5246.3.patch, PIG-5246_4.patch, PIG-5246.patch
>
>
> in bin/pig.
> we copy assembly jar to pig's classpath in spark1.6.
> {code}
> # For spark mode:
> # Please specify SPARK_HOME first so that we can locate 
> $SPARK_HOME/lib/spark-assembly*.jar,
> # we will add spark-assembly*.jar to the classpath.
> if [ "$isSparkMode"  == "true" ]; then
> if [ -z "$SPARK_HOME" ]; then
>echo "Error: SPARK_HOME is not set!"
>exit 1
> fi
> # Please specify SPARK_JAR which is the hdfs path of spark-assembly*.jar 
> to allow YARN to cache spark-assembly*.jar on nodes so that it doesn't need 
> to be distributed each time an application runs.
> if [ -z "$SPARK_JAR" ]; then
>echo "Error: SPARK_JAR is not set, SPARK_JAR stands for the hdfs 
> location of spark-assembly*.jar. This allows YARN to cache 
> spark-assembly*.jar on nodes so that it doesn't need to be distributed each 
> time an application runs."
>exit 1
> fi
> if [ -n "$SPARK_HOME" ]; then
> echo "Using Spark Home: " ${SPARK_HOME}
> SPARK_ASSEMBLY_JAR=`ls ${SPARK_HOME}/lib/spark-assembly*`
> CLASSPATH=${CLASSPATH}:$SPARK_ASSEMBLY_JAR
> fi
> fi
> {code}
> after upgrade to spark2.0, we may modify it



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (PIG-5246) Modify bin/pig about SPARK_HOME, SPARK_ASSEMBLY_JAR after upgrading spark to 2

2017-06-21 Thread liyunzhang_intel (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liyunzhang_intel updated PIG-5246:
--
Attachment: PIG-5246.3.patch

[~nkollar],[~szita],[~rohini],[~jeffzhang]: update PIG-5246.3.patch
changes
1. use spark-tags*.jar to verify whether current spark is spark1 or spark2
2. small fix like
{code}
SPARK_ASSEMBLY_JAR=`ls ${SPARK_HOME}/lib/spark-assembly*`
 CLASSPATH=${CLASSPATH}:$SPARK_ASSEMBLY_JAR

{code}

to
{code}
CLASSPATH=${CLASSPATH}:${SPARK_HOME}/lib/spark-assembly*
{code}

> Modify bin/pig about SPARK_HOME, SPARK_ASSEMBLY_JAR after upgrading spark to 2
> --
>
> Key: PIG-5246
> URL: https://issues.apache.org/jira/browse/PIG-5246
> Project: Pig
>  Issue Type: Bug
>Reporter: liyunzhang_intel
>Assignee: liyunzhang_intel
> Attachments: HBase9498.patch, PIG-5246.1.patch, PIG-5246_2.patch, 
> PIG-5246.3.patch, PIG-5246.patch
>
>
> in bin/pig.
> we copy assembly jar to pig's classpath in spark1.6.
> {code}
> # For spark mode:
> # Please specify SPARK_HOME first so that we can locate 
> $SPARK_HOME/lib/spark-assembly*.jar,
> # we will add spark-assembly*.jar to the classpath.
> if [ "$isSparkMode"  == "true" ]; then
> if [ -z "$SPARK_HOME" ]; then
>echo "Error: SPARK_HOME is not set!"
>exit 1
> fi
> # Please specify SPARK_JAR which is the hdfs path of spark-assembly*.jar 
> to allow YARN to cache spark-assembly*.jar on nodes so that it doesn't need 
> to be distributed each time an application runs.
> if [ -z "$SPARK_JAR" ]; then
>echo "Error: SPARK_JAR is not set, SPARK_JAR stands for the hdfs 
> location of spark-assembly*.jar. This allows YARN to cache 
> spark-assembly*.jar on nodes so that it doesn't need to be distributed each 
> time an application runs."
>exit 1
> fi
> if [ -n "$SPARK_HOME" ]; then
> echo "Using Spark Home: " ${SPARK_HOME}
> SPARK_ASSEMBLY_JAR=`ls ${SPARK_HOME}/lib/spark-assembly*`
> CLASSPATH=${CLASSPATH}:$SPARK_ASSEMBLY_JAR
> fi
> fi
> {code}
> after upgrade to spark2.0, we may modify it



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (PIG-5246) Modify bin/pig about SPARK_HOME, SPARK_ASSEMBLY_JAR after upgrading spark to 2

2017-06-13 Thread liyunzhang_intel (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liyunzhang_intel updated PIG-5246:
--
Attachment: PIG-5246_2.patch

[~zjffdu]:
in PIG-5246_2.patch
use following way to judge spark version
{code}
+$SPARK_HOME/bin/spark-submit --version >/tmp/spark.version 2>&1
+isSpark1=`grep "version 1" /tmp/spark.version|wc -l`
+if [ "$isSpark1" -eq 0 ];then 
+  sparkversion="2"
 fi
{code}

redirect the output of "spark-submit --version" to /tmp/spark.version(later 
will remove this file), Is there any better way to judge the version? 

> Modify bin/pig about SPARK_HOME, SPARK_ASSEMBLY_JAR after upgrading spark to 2
> --
>
> Key: PIG-5246
> URL: https://issues.apache.org/jira/browse/PIG-5246
> Project: Pig
>  Issue Type: Bug
>Reporter: liyunzhang_intel
>Assignee: liyunzhang_intel
> Attachments: HBase9498.patch, PIG-5246.1.patch, PIG-5246_2.patch, 
> PIG-5246.patch
>
>
> in bin/pig.
> we copy assembly jar to pig's classpath in spark1.6.
> {code}
> # For spark mode:
> # Please specify SPARK_HOME first so that we can locate 
> $SPARK_HOME/lib/spark-assembly*.jar,
> # we will add spark-assembly*.jar to the classpath.
> if [ "$isSparkMode"  == "true" ]; then
> if [ -z "$SPARK_HOME" ]; then
>echo "Error: SPARK_HOME is not set!"
>exit 1
> fi
> # Please specify SPARK_JAR which is the hdfs path of spark-assembly*.jar 
> to allow YARN to cache spark-assembly*.jar on nodes so that it doesn't need 
> to be distributed each time an application runs.
> if [ -z "$SPARK_JAR" ]; then
>echo "Error: SPARK_JAR is not set, SPARK_JAR stands for the hdfs 
> location of spark-assembly*.jar. This allows YARN to cache 
> spark-assembly*.jar on nodes so that it doesn't need to be distributed each 
> time an application runs."
>exit 1
> fi
> if [ -n "$SPARK_HOME" ]; then
> echo "Using Spark Home: " ${SPARK_HOME}
> SPARK_ASSEMBLY_JAR=`ls ${SPARK_HOME}/lib/spark-assembly*`
> CLASSPATH=${CLASSPATH}:$SPARK_ASSEMBLY_JAR
> fi
> fi
> {code}
> after upgrade to spark2.0, we may modify it



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (PIG-5246) Modify bin/pig about SPARK_HOME, SPARK_ASSEMBLY_JAR after upgrading spark to 2

2017-06-02 Thread Rohini Palaniswamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohini Palaniswamy updated PIG-5246:

Attachment: HBase9498.patch

> Modify bin/pig about SPARK_HOME, SPARK_ASSEMBLY_JAR after upgrading spark to 2
> --
>
> Key: PIG-5246
> URL: https://issues.apache.org/jira/browse/PIG-5246
> Project: Pig
>  Issue Type: Bug
>Reporter: liyunzhang_intel
>Assignee: liyunzhang_intel
> Attachments: HBase9498.patch, PIG-5246.1.patch, PIG-5246.patch
>
>
> in bin/pig.
> we copy assembly jar to pig's classpath in spark1.6.
> {code}
> # For spark mode:
> # Please specify SPARK_HOME first so that we can locate 
> $SPARK_HOME/lib/spark-assembly*.jar,
> # we will add spark-assembly*.jar to the classpath.
> if [ "$isSparkMode"  == "true" ]; then
> if [ -z "$SPARK_HOME" ]; then
>echo "Error: SPARK_HOME is not set!"
>exit 1
> fi
> # Please specify SPARK_JAR which is the hdfs path of spark-assembly*.jar 
> to allow YARN to cache spark-assembly*.jar on nodes so that it doesn't need 
> to be distributed each time an application runs.
> if [ -z "$SPARK_JAR" ]; then
>echo "Error: SPARK_JAR is not set, SPARK_JAR stands for the hdfs 
> location of spark-assembly*.jar. This allows YARN to cache 
> spark-assembly*.jar on nodes so that it doesn't need to be distributed each 
> time an application runs."
>exit 1
> fi
> if [ -n "$SPARK_HOME" ]; then
> echo "Using Spark Home: " ${SPARK_HOME}
> SPARK_ASSEMBLY_JAR=`ls ${SPARK_HOME}/lib/spark-assembly*`
> CLASSPATH=${CLASSPATH}:$SPARK_ASSEMBLY_JAR
> fi
> fi
> {code}
> after upgrade to spark2.0, we may modify it



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (PIG-5246) Modify bin/pig about SPARK_HOME, SPARK_ASSEMBLY_JAR after upgrading spark to 2

2017-06-01 Thread liyunzhang_intel (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liyunzhang_intel updated PIG-5246:
--
Attachment: PIG-5246.1.patch

[~nkollar]: in PIG-5246.1.patch, modify {{spark16}} to {{spark1}} and 
{{spark21}} to {{spark2}}

> Modify bin/pig about SPARK_HOME, SPARK_ASSEMBLY_JAR after upgrading spark to 2
> --
>
> Key: PIG-5246
> URL: https://issues.apache.org/jira/browse/PIG-5246
> Project: Pig
>  Issue Type: Bug
>Reporter: liyunzhang_intel
>Assignee: liyunzhang_intel
> Attachments: PIG-5246.1.patch, PIG-5246.patch
>
>
> in bin/pig.
> we copy assembly jar to pig's classpath in spark1.6.
> {code}
> # For spark mode:
> # Please specify SPARK_HOME first so that we can locate 
> $SPARK_HOME/lib/spark-assembly*.jar,
> # we will add spark-assembly*.jar to the classpath.
> if [ "$isSparkMode"  == "true" ]; then
> if [ -z "$SPARK_HOME" ]; then
>echo "Error: SPARK_HOME is not set!"
>exit 1
> fi
> # Please specify SPARK_JAR which is the hdfs path of spark-assembly*.jar 
> to allow YARN to cache spark-assembly*.jar on nodes so that it doesn't need 
> to be distributed each time an application runs.
> if [ -z "$SPARK_JAR" ]; then
>echo "Error: SPARK_JAR is not set, SPARK_JAR stands for the hdfs 
> location of spark-assembly*.jar. This allows YARN to cache 
> spark-assembly*.jar on nodes so that it doesn't need to be distributed each 
> time an application runs."
>exit 1
> fi
> if [ -n "$SPARK_HOME" ]; then
> echo "Using Spark Home: " ${SPARK_HOME}
> SPARK_ASSEMBLY_JAR=`ls ${SPARK_HOME}/lib/spark-assembly*`
> CLASSPATH=${CLASSPATH}:$SPARK_ASSEMBLY_JAR
> fi
> fi
> {code}
> after upgrade to spark2.0, we may modify it



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (PIG-5246) Modify bin/pig about SPARK_HOME, SPARK_ASSEMBLY_JAR after upgrading spark to 2

2017-05-31 Thread liyunzhang_intel (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liyunzhang_intel updated PIG-5246:
--
Status: Patch Available  (was: Open)

> Modify bin/pig about SPARK_HOME, SPARK_ASSEMBLY_JAR after upgrading spark to 2
> --
>
> Key: PIG-5246
> URL: https://issues.apache.org/jira/browse/PIG-5246
> Project: Pig
>  Issue Type: Bug
>Reporter: liyunzhang_intel
>Assignee: liyunzhang_intel
> Attachments: PIG-5246.patch
>
>
> in bin/pig.
> we copy assembly jar to pig's classpath in spark1.6.
> {code}
> # For spark mode:
> # Please specify SPARK_HOME first so that we can locate 
> $SPARK_HOME/lib/spark-assembly*.jar,
> # we will add spark-assembly*.jar to the classpath.
> if [ "$isSparkMode"  == "true" ]; then
> if [ -z "$SPARK_HOME" ]; then
>echo "Error: SPARK_HOME is not set!"
>exit 1
> fi
> # Please specify SPARK_JAR which is the hdfs path of spark-assembly*.jar 
> to allow YARN to cache spark-assembly*.jar on nodes so that it doesn't need 
> to be distributed each time an application runs.
> if [ -z "$SPARK_JAR" ]; then
>echo "Error: SPARK_JAR is not set, SPARK_JAR stands for the hdfs 
> location of spark-assembly*.jar. This allows YARN to cache 
> spark-assembly*.jar on nodes so that it doesn't need to be distributed each 
> time an application runs."
>exit 1
> fi
> if [ -n "$SPARK_HOME" ]; then
> echo "Using Spark Home: " ${SPARK_HOME}
> SPARK_ASSEMBLY_JAR=`ls ${SPARK_HOME}/lib/spark-assembly*`
> CLASSPATH=${CLASSPATH}:$SPARK_ASSEMBLY_JAR
> fi
> fi
> {code}
> after upgrade to spark2.0, we may modify it



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (PIG-5246) Modify bin/pig about SPARK_HOME, SPARK_ASSEMBLY_JAR after upgrading spark to 2

2017-05-31 Thread liyunzhang_intel (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liyunzhang_intel updated PIG-5246:
--
Attachment: PIG-5246.patch

[~nkollar], [~szita]: help review 
in spark2, spark-assembly*.jar does not exist, so we need append all jars under 
$SPARK_HOME/jars/ to the pig classpath.
{code}
+if [ "$sparkversion" == "21" ]; then
+  if [ -n "$SPARK_HOME" ]; then
+ echo "Using Spark Home: " ${SPARK_HOME}
+  for f in $SPARK_HOME/jars/*.jar; do
+   CLASSPATH=${CLASSPATH}:$f
+  done
+  fi
+ fi
{code}

the way to use 

1. build pig with spark21
{noformat}
   ant clean -v  -Dsparkversion=21   -Dhadoopversion=2 jar
{noformat}
2. run pig with spark21
{noformat}
  /pig -x $mode -sparkversion 21 -log4jconf $PIG_HOME/conf/log4j.properties 
-logfile $PIG_HOME/logs/pig.log  $PIG_HOME/bin/testJoin.pig
{noformat}
  

> Modify bin/pig about SPARK_HOME, SPARK_ASSEMBLY_JAR after upgrading spark to 2
> --
>
> Key: PIG-5246
> URL: https://issues.apache.org/jira/browse/PIG-5246
> Project: Pig
>  Issue Type: Bug
>Reporter: liyunzhang_intel
>Assignee: liyunzhang_intel
> Attachments: PIG-5246.patch
>
>
> in bin/pig.
> we copy assembly jar to pig's classpath in spark1.6.
> {code}
> # For spark mode:
> # Please specify SPARK_HOME first so that we can locate 
> $SPARK_HOME/lib/spark-assembly*.jar,
> # we will add spark-assembly*.jar to the classpath.
> if [ "$isSparkMode"  == "true" ]; then
> if [ -z "$SPARK_HOME" ]; then
>echo "Error: SPARK_HOME is not set!"
>exit 1
> fi
> # Please specify SPARK_JAR which is the hdfs path of spark-assembly*.jar 
> to allow YARN to cache spark-assembly*.jar on nodes so that it doesn't need 
> to be distributed each time an application runs.
> if [ -z "$SPARK_JAR" ]; then
>echo "Error: SPARK_JAR is not set, SPARK_JAR stands for the hdfs 
> location of spark-assembly*.jar. This allows YARN to cache 
> spark-assembly*.jar on nodes so that it doesn't need to be distributed each 
> time an application runs."
>exit 1
> fi
> if [ -n "$SPARK_HOME" ]; then
> echo "Using Spark Home: " ${SPARK_HOME}
> SPARK_ASSEMBLY_JAR=`ls ${SPARK_HOME}/lib/spark-assembly*`
> CLASSPATH=${CLASSPATH}:$SPARK_ASSEMBLY_JAR
> fi
> fi
> {code}
> after upgrade to spark2.0, we may modify it



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)