[
https://issues.apache.org/jira/browse/HIVE-8836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224124#comment-14224124
]
Chengxiang Li commented on HIVE-8836:
-------------------------------------
Hi, [~szehon] and [~brocknoland]
I'm not 100% sure the spark assembly jar would be published to a public maven
repository, but I found a spark assembly
[here|http://mvnrepository.com/artifact/org.apache.spark/spark-assembly_2.10/1.1.0];
maybe [~vanzin] knows more about this. There is no
org.apache.spark:spark-assembly_2.10:jar:1.2.0-SNAPSHOT in any public maven
repository yet, as it is still in SNAPSHOT status, but we can publish it to
http://ec2-50-18-79-139.us-west-1.compute.amazonaws.com/data as we have
done for spark core. In my local test, I built spark and published it to my
local maven repository.
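For reference, installing a locally built assembly jar into the local maven repository could be sketched roughly as below. This is a hypothetical command-line fragment (the jar filename and hadoop classifier are assumptions, not the exact artifacts from my build), and it needs a local Spark build to have produced the jar first:

```shell
# Install a locally built spark assembly into ~/.m2 so Hive tests can
# resolve it; coordinates match the SNAPSHOT mentioned above, the jar
# path is illustrative.
mvn install:install-file \
  -DgroupId=org.apache.spark \
  -DartifactId=spark-assembly_2.10 \
  -Dversion=1.2.0-SNAPSHOT \
  -Dpackaging=jar \
  -Dfile=assembly/target/scala-2.10/spark-assembly-1.2.0-SNAPSHOT-hadoop2.4.0.jar
```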
{quote}
Also another question, as we were trying to set spark.home, which looks for
bin/spark-submit, which then pulled in scripts like compute-classpath.sh,
load-spark-env.sh, spark-class, and finally spark-assembly itself. I see you
are using another way (spark.test.home, spark.testing), how does that avoid
looking for these artifacts to start the spark process?
{quote}
First, bin/spark-submit is optional for the Remote Spark Context.
Then, local-cluster Spark only needs compute-classpath.sh to launch executors;
that script adds the Spark-related jars to the classpath (Hive unit tests
should only need spark-assembly). spark.test.home and spark.testing are used
to point the Spark home at a dummy Spark installation; see
org.apache.spark.deploy.worker.Worker (line 101) for why. I created a dummy
Spark installation with an empty compute-classpath.sh, since the script is
required to exist, and added the spark assembly to the executor classpath
through spark.executor.extraClassPath.
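A minimal sketch of that dummy installation might look like this; the paths, the properties file name, and the assembly jar location are illustrative assumptions, not the exact values used in the patch:

```shell
# Create a dummy Spark home: the worker only checks that
# bin/compute-classpath.sh exists, so an empty executable script suffices.
mkdir -p /tmp/dummy-spark/bin
touch /tmp/dummy-spark/bin/compute-classpath.sh
chmod +x /tmp/dummy-spark/bin/compute-classpath.sh

# Illustrative test properties: point Spark at the dummy home and put the
# real assembly jar on the executor classpath directly.
cat > /tmp/dummy-spark/spark-test.properties <<'EOF'
spark.test.home=/tmp/dummy-spark
spark.testing=true
spark.executor.extraClassPath=/path/to/spark-assembly_2.10-1.1.0.jar
EOF
```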
> Enable automatic tests with remote spark client.[Spark Branch]
> --------------------------------------------------------------
>
> Key: HIVE-8836
> URL: https://issues.apache.org/jira/browse/HIVE-8836
> Project: Hive
> Issue Type: Sub-task
> Components: Spark
> Reporter: Chengxiang Li
> Assignee: Rui Li
> Labels: Spark-M3
> Attachments: HIVE-8836-brock-1.patch, HIVE-8836-brock-2.patch,
> HIVE-8836-brock-3.patch, HIVE-8836.1-spark.patch, HIVE-8836.2-spark.patch
>
>
> In a real production environment, the remote spark client will mostly be
> used to submit Spark jobs for Hive, so we should enable automatic tests with
> the remote spark client to make sure Hive features work with it.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)