Just out of curiosity: did you manually apply this patch and see whether
it actually resolves the issue? It seems that it was merged at some point,
but reverted because it caused some stability issues.

Sincerely,

DB Tsai
-------------------------------------------------------
My Blog: https://www.dbtsai.com
LinkedIn: https://www.linkedin.com/in/dbtsai


On Sat, Dec 13, 2014 at 7:11 AM,  <spark.dubovsky.ja...@seznam.cz> wrote:
> So, to answer my own question: it is a bug, and there is an unmerged PR
> for it already.
>
> https://issues.apache.org/jira/browse/SPARK-2624
> https://github.com/apache/spark/pull/3238
>
> Jakub
>
> ---------- Original message ----------
> From: spark.dubovsky.ja...@seznam.cz
> To: spark.dubovsky.ja...@seznam.cz
> Date: 12. 12. 2014 15:26:35
>
>
> Subject: Re: Including data nucleus tools
>
>
> Hi,
>
>   I had time to try it again. I submitted my app by the same command with
> these additional options:
>
>   --jars
> lib/datanucleus-api-jdo-3.2.6.jar,lib/datanucleus-core-3.2.10.jar,lib/datanucleus-rdbms-3.2.9.jar
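>
>   For reference, combined with the original command from my earlier
> message below, the full invocation would look roughly like this (same
> paths and arguments assumed):
>
>   ./bin/spark-submit --num-executors 200 --master yarn-cluster \
>     --jars lib/datanucleus-api-jdo-3.2.6.jar,lib/datanucleus-core-3.2.10.jar,lib/datanucleus-rdbms-3.2.9.jar \
>     --class org.apache.spark.mllib.CreateGuidDomainDictionary \
>     ../spark/root-0.1.jar ${args}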
>
>   Now the app successfully creates a hive context. So my question remains:
> are the "classpath entries" shown in the Spark UI the same classpath as the
> one mentioned in the submit script message?
>
> "Spark assembly has been built with Hive, including Datanucleus jars on
> classpath"
>
>   If so, then why does the script fail to actually include the datanucleus
> jars on the classpath? I found no bug about this in jira. Or is there a way
> that particular yarn/os settings on our cluster could override this?
>
>   Thanks in advance
>
>   Jakub
>
> ---------- Original message ----------
> From: spark.dubovsky.ja...@seznam.cz
> To: Michael Armbrust <mich...@databricks.com>
> Date: 7. 12. 2014 3:02:33
> Subject: Re: Including data nucleus tools
>
>
> Next try: I copied the whole dist directory created by the make-distribution
> script to the cluster, not just the assembly jar. Then I used
>
> ./bin/spark-submit --num-executors 200 --master yarn-cluster --class
> org.apache.spark.mllib.CreateGuidDomainDictionary ../spark/root-0.1.jar
> ${args}
>
>  ...to run the app again. The startup scripts printed this message:
>
> "Spark assembly has been built with Hive, including Datanucleus jars on
> classpath"
>
>   ...so I thought I was finally there. But the job started and failed with
> the same ClassNotFound exception as before. Is the "classpath" from the
> script message just the driver's classpath? Or is it the same classpath that
> is affected by the --jars option? I tried to find this out from the scripts,
> but I was not able to find where the --jars option is processed.
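>
>   One quick way to see what spark-submit does with --jars (assuming
> --verbose behaves the same on this build) is to rerun with the --verbose
> flag, which prints the parsed arguments, including the jars list:
>
>   ./bin/spark-submit --verbose --num-executors 200 --master yarn-cluster \
>     --class org.apache.spark.mllib.CreateGuidDomainDictionary \
>     ../spark/root-0.1.jar ${args}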
>
>   thanks
>
> ---------- Original message ----------
> From: Michael Armbrust <mich...@databricks.com>
> To: spark.dubovsky.ja...@seznam.cz
> Date: 6. 12. 2014 20:39:13
> Subject: Re: Including data nucleus tools
>
>
> On Sat, Dec 6, 2014 at 5:53 AM, <spark.dubovsky.ja...@seznam.cz> wrote:
>
> Bonus question: should the class
> org.datanucleus.api.jdo.JDOPersistenceManagerFactory be part of the
> assembly? Because it is not in the jar now.
>
>
> No, these jars cannot be put into the assembly, because they have extra
> metadata files that live in the same location (so if you put them all in an
> assembly they overwrite each other). This metadata is used in discovery.
> Instead, they must be manually put on the classpath in their original form
> (usually using --jars).
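>
> To see the collision (a sketch, assuming the jar versions mentioned
> earlier in this thread), list the metadata entries that each DataNucleus
> jar carries at the same root path:
>
>   # each jar ships its own plugin.xml / MANIFEST.MF at the same location;
>   # a fat assembly can keep only one of each, clobbering the rest
>   for j in lib/datanucleus-*.jar; do
>     echo "== $j"; unzip -l "$j" | grep -E "plugin\.xml|MANIFEST\.MF"
>   done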
