Thanks Donald. Re: adding a line to ensure a JAR is loaded first: is this what you are referring to (the `file:` line near the bottom, just above the final echo, originally highlighted in red)?

# Add hadoop conf dir if given -- otherwise FileSystem.*, etc fail ! Note, this
# assumes that there is either a HADOOP_CONF_DIR or YARN_CONF_DIR which hosts
# the configuration files.
if [ -n "$HADOOP_CONF_DIR" ]; then
  CLASSPATH="$CLASSPATH:$HADOOP_CONF_DIR"
fi
if [ -n "$YARN_CONF_DIR" ]; then
  CLASSPATH="$CLASSPATH:$YARN_CONF_DIR"
fi
if [ -n "$HBASE_CONF_DIR" ]; then
  CLASSPATH="$CLASSPATH:$HBASE_CONF_DIR"
fi
if [ -n "$ES_CONF_DIR" ]; then
  CLASSPATH="$CLASSPATH:$ES_CONF_DIR"
fi
if [ -n "$POSTGRES_JDBC_DRIVER" ]; then
  CLASSPATH="$CLASSPATH:$POSTGRES_JDBC_DRIVER"
  ASSEMBLY_JARS="$ASSEMBLY_JARS,$POSTGRES_JDBC_DRIVER"
fi
if [ -n "$MYSQL_JDBC_DRIVER" ]; then
  CLASSPATH="$CLASSPATH:$MYSQL_JDBC_DRIVER"
  ASSEMBLY_JARS="$ASSEMBLY_JARS,$MYSQL_JDBC_DRIVER"
fi

file:/app/PredictionIO-dist/lib/spark/aws-java-sdk.jar

echo "$CLASSPATH"

*Shane Johnson | LIFT IQ*
*Founder | CEO*
*www.liftiq.com <http://www.liftiq.com/>* or *sh...@liftiq.com <sh...@liftiq.com>*
mobile: (801) 360-3350
LinkedIn <https://www.linkedin.com/in/shanewjohnson/> | Twitter <https://twitter.com/SWaldenJ> | Facebook <https://www.facebook.com/shane.johnson.71653>

On Tue, Mar 6, 2018 at 6:58 PM, Donald Szeto <don...@apache.org> wrote:

> Even easier: skip cloning, and just edit the shell script directly in the
> binary distribution. Hope that works.
>
> Regards,
> Donald
>
> On Tue, Mar 6, 2018 at 5:41 PM Shane Johnson <sh...@liftiq.com> wrote:
>
>> Thanks Mars and Donald. I think this gets me to next steps:
>>
>>    - Clone PredictionIO 0.12 and adjust the bin/compute-classpath.sh to
>>      have aws-java-sdk-1.7.4 loaded first.
>>    - Create custom binary distribution of PredictionIO 0.12.
>>    - Add config var to point to custom binary distribution.
>>
>> This is very helpful. Thank you!
>>
>> On Tue, Mar 6, 2018 at 4:52 PM, Mars Hall <mars.h...@salesforce.com> wrote:
>>
>>> On Tue, Mar 6, 2018 at 11:39 PM, Shane Johnson <sh...@liftiq.com> wrote:
>>>
>>>> Do you know the version of hadoop-aws.jar and aws-java-sdk.jar that you
>>>> are using?
>>>>
>>>> I do not know what version is being used. Is this something that I can
>>>> specify or control? I am using the PredictionIO buildpack
>>>> https://github.com/heroku/predictionio-buildpack. I am *not*
>>>> specifying these in my build.sbt currently.
>>>
>>> The versions are specified in the "bin/common/setup-runtime" script of
>>> the buildpack:
>>> https://github.com/heroku/predictionio-buildpack/blob/master/bin/common/setup-runtime#L50
>>>
>>> Currently they are:
>>>
>>>    - hadoop-aws-2.7.3
>>>    - aws-java-sdk-1.7.4
>>>
>>> If you fork the buildpack, those download URLs (currently hosted in my
>>> S3 bucket) can be changed.
>>>
>>>> You are also right that you can modify the class path in
>>>> bin/compute-classpath.sh as a short term fix. The current order is
>>>> following the output of your target system's `ls`, so the order is not
>>>> guaranteed like you speak. Right before the last line (echo $CLASSPATH),
>>>> you can add a line to make sure that the JAR you want to be loaded first is
>>>> at the very beginning.
>>>>
>>>> I believe this would need to be something that is edited on master
>>>> (https://github.com/apache/predictionio) as the buildpack leverages it
>>>> vs. a cloned version of the code that I can edit, am I thinking through
>>>> that correctly?
>>>> I may need to circle back with Mars to see if there are any
>>>> other options to get this to work with Heroku. Is this something that can
>>>> be committed to master? The aws-java-sdk.jar is the one that we need
>>>> to load first.
>>>
>>> If you build your own custom binary distribution of PredictionIO 0.12
>>> and upload it to a publicly accessible URL, then the buildpack can be
>>> configured to use it by setting config var `PREDICTIONIO_DIST_URL` which is
>>> used by the same setup script as above:
>>> https://github.com/heroku/predictionio-buildpack/blob/master/bin/common/setup-runtime#L82
>>>
>>> This is how I tested my pre-release builds on Heroku.
>>>
>>> --
>>> *Mars Hall*
>>> 415-818-7039
>>> Customer Facing Architect
>>> Salesforce Platform / Heroku
>>> San Francisco, California
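
For reference, a minimal sketch of the kind of line Donald describes, placed right before the final echo. It assumes the classpath takes plain filesystem paths separated by colons (as `java -cp` does) rather than `file:` URLs, and that the JAR really lives at the path quoted in the message. The helper name `prepend_to_classpath` is hypothetical and not part of PredictionIO; in bin/compute-classpath.sh itself the change would likely reduce to a single assignment.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of PredictionIO): put one JAR at the
# front of a colon-separated classpath string.
prepend_to_classpath() {
  local jar="$1" classpath="$2"
  if [ -z "$classpath" ]; then
    # Nothing to prepend to, so the JAR is the whole classpath.
    printf '%s\n' "$jar"
  else
    printf '%s\n' "$jar:$classpath"
  fi
}

# Assumed location: the path from the message, without the "file:" prefix.
AWS_SDK_JAR="/app/PredictionIO-dist/lib/spark/aws-java-sdk.jar"

# Keep whatever the earlier part of the script has already built up.
CLASSPATH="${CLASSPATH:-}"
CLASSPATH="$(prepend_to_classpath "$AWS_SDK_JAR" "$CLASSPATH")"

echo "$CLASSPATH"
```

If the same JAR is also added later in the list by the rest of the script, the duplicate entry should be harmless, since the JVM searches classpath entries in order.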