Hi Stephen,
I recently posted recipes on the gremlin and janusgraph users lists to
configure the binary distributions to work with a spark-yarn cluster. I
think it would be useful to include the TinkerPop recipe in the Apache
TinkerPop repo itself, in the following way:
- add the spark-yarn dependency to spark-gremlin
- add the recipe to the docs so that it is actually run in the
existing documentation environment at build time
In this way:
- the recipe would be less clumsy for users to follow (no external deps)
- the recipe would be maintained and still work after version upgrades
I do not have to remind you that many users have had problems with
spark-yarn and that the ability to run OLAP queries on an existing
cluster is one of the attractive features of TinkerPop.
This brings me to the question: do you see potential obstacles in
accepting a PR along these lines? I will probably wait for some time
before actually doing this, though, to have more opportunity to "eat my
own dogfood" and see whether further changes are required.
Cheers, HadoopMarc