Merge branch 'tp32'
Project: http://git-wip-us.apache.org/repos/asf/tinkerpop/repo
Commit: http://git-wip-us.apache.org/repos/asf/tinkerpop/commit/92a2640b
Tree: http://git-wip-us.apache.org/repos/asf/tinkerpop/tree/92a2640b
Diff: http://git-wip-us.apache.org/repos/asf/tinkerpop/diff/92a2640b

Branch: refs/heads/master
Commit: 92a2640b6ae5191144e847696f933b4aa98e99a1
Parents: f5687ee a86097d
Author: Daniel Kuppitz <daniel_kupp...@hotmail.com>
Authored: Tue Dec 12 14:08:31 2017 -0700
Committer: Daniel Kuppitz <daniel_kupp...@hotmail.com>
Committed: Tue Dec 12 14:08:31 2017 -0700

----------------------------------------------------------------------
 docs/src/recipes/olap-spark-yarn.asciidoc | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/92a2640b/docs/src/recipes/olap-spark-yarn.asciidoc
----------------------------------------------------------------------
diff --cc docs/src/recipes/olap-spark-yarn.asciidoc
index 85bfe18,634adeb..429d282
--- a/docs/src/recipes/olap-spark-yarn.asciidoc
+++ b/docs/src/recipes/olap-spark-yarn.asciidoc
@@@ -94,14 -94,14 +94,15 @@@ $ . bin/spark-yarn.s
  ----
  hadoop = System.getenv('HADOOP_HOME')
  hadoopConfDir = System.getenv('HADOOP_CONF_DIR')
- archivePath = "/tmp/spark-gremlin.zip"
- ['bash', '-c', "rm $archivePath 2>/dev/null; cd ext/spark-gremlin/lib && zip $archivePath *.jar"].execute()
+ archive = 'spark-gremlin.zip'
+ archivePath = "/tmp/$archive"
+ ['bash', '-c', "rm -f $archivePath; cd ext/spark-gremlin/lib && zip $archivePath *.jar"].execute().waitFor()
  conf = new PropertiesConfiguration('conf/hadoop/hadoop-gryo.properties')
 -conf.setProperty('spark.master', 'yarn-client')
 -conf.setProperty('spark.yarn.dist.archives', "$archivePath")
 -conf.setProperty('spark.yarn.appMasterEnv.CLASSPATH', "./$archive/*:$hadoopConfDir")
 -conf.setProperty('spark.executor.extraClassPath', "./$archive/*:$hadoopConfDir")
 +conf.setProperty('spark.master', 'yarn')
 +conf.setProperty('spark.submit.deployMode', 'client')
 +conf.setProperty('spark.yarn.archive', "$archivePath")
 +conf.setProperty('spark.yarn.appMasterEnv.CLASSPATH', "./__spark_libs__/*:$hadoopConfDir")
 +conf.setProperty('spark.executor.extraClassPath', "./__spark_libs__/*:$hadoopConfDir")
  conf.setProperty('spark.driver.extraLibraryPath', "$hadoop/lib/native:$hadoop/lib/native/Linux-amd64-64")
  conf.setProperty('spark.executor.extraLibraryPath', "$hadoop/lib/native:$hadoop/lib/native/Linux-amd64-64")
  conf.setProperty('gremlin.spark.persistContext', 'true')
@@@ -121,14 -121,13 +122,14 @@@ Explanatio
  ~~~~~~~~~~~
  
  This recipe does not require running the `bin/hadoop/init-tp-spark.sh` script described in the
- http://tinkerpop.apache.org/docs/x.y.z/reference/#sparkgraphcomputer[reference documentation] and thus is also
+ link:http://tinkerpop.apache.org/docs/x.y.z/reference/#sparkgraphcomputer[reference documentation] and thus is also
  valid for cluster users without access permissions to do so.
 -Rather, it exploits the `spark.yarn.dist.archives` property, which points to an archive with jars on the local file
 +
 +Rather, it exploits the `spark.yarn.archive` property, which points to an archive with jars on the local file
  system and is loaded into the various YARN containers. As a result the `spark-gremlin.zip` archive becomes available
 -as the directory named `spark-gremlin.zip` in the YARN containers. The `spark.executor.extraClassPath` and
 -`spark.yarn.appMasterEnv.CLASSPATH` properties point to the files inside this archive.
 -This is why they contain the `./spark-gremlin.zip/*` item. Just because a Spark executor got the archive with
 +as the directory named `+__spark_libs__+` in the YARN containers. The `spark.executor.extraClassPath` and
 +`spark.yarn.appMasterEnv.CLASSPATH` properties point to the jars inside this directory.
 +This is why they contain the `+./__spark_lib__/*+` item. Just because a Spark executor got the archive with
  jars loaded into its container, does not mean it knows how to access them.
  
  Also the `HADOOP_GREMLIN_LIBS` mechanism is not used because it can not work for Spark on YARN as implemented (jars
@@@ -152,7 -151,7 +153,7 @@@ as long as you do not use the `spark-su
  runtime dependencies listed in the `Gremlin-Plugin-Dependencies` section of the manifest file in the `spark-gremlin`
  jar.
  
 -You may not like the idea that the Hadoop and Spark jars from the TinkerPop distribution differ from the versions in
 +You may not like the idea that the Hadoop and Spark jars from the Tinkerpop distribution differ from the versions in
  your cluster. If so, just build TinkerPop from source with the corresponding dependencies changed in the various `pom.xml`
 -files (e.g. `spark-core_2.10-1.6.1-some-vendor.jar` instead of `spark-core_2.10-1.6.1.jar`). Of course, TinkerPop will
 +files (e.g. `spark-core_2.11-2.2.0-some-vendor.jar` instead of `spark-core_2.11-2.2.0.jar`). Of course, TinkerPop will
- only build for exactly matching or slightly differing artifact versions.
+ only build for exactly matching or slightly differing artifact versions.
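
For context beyond the diff: the `conf` properties assembled in the recipe would typically be handed to `GraphFactory` so that traversals run through `SparkGraphComputer` on YARN. Below is a minimal sketch of that usage, not part of the commit, assuming a Gremlin Console session with the spark-gremlin plugin active and `conf` built as in the new version of the recipe; the traversal itself is only an illustration.

----
// Sketch only: reuses the PropertiesConfiguration `conf` built above.
graph = GraphFactory.open(conf)                          // opens the graph defined by conf/hadoop/hadoop-gryo.properties
g = graph.traversal().withComputer(SparkGraphComputer)   // route OLAP traversals through Spark on YARN
g.V().count()                                            // illustrative traversal; any OLAP traversal works here
----

Since `gremlin.spark.persistContext` is set to `true`, the Spark context should be reused across traversals, so only the first traversal pays the YARN application startup cost.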