Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/3238#issuecomment-65141491
  
    I thought about making this a generic "add all the jars in this directory 
to the dist cache and to the app's classpath" option. That would make sense for 
regular application dependencies, since all the jars would be added to the app's 
classpath (similar to --jars). But the datanucleus jars are "special": they have 
to be on the main classpath so that Spark's own classes pick them up.
    
    I think making this generic is a little dangerous, since we shouldn't be 
encouraging people to add things to Spark's classpath; instead we should 
encourage options like "userClassPathFirst" once we're comfortable that it 
works properly.
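    For reference, a minimal sketch of the kind of settings "userClassPathFirst" 
refers to, assuming the spark.{driver,executor}.userClassPathFirst property 
names (the exact names depend on the Spark version, so treat these as 
assumptions and check the configuration docs for your release):

    ```properties
    # spark-defaults.conf (property names are assumptions; verify against your
    # Spark version's configuration documentation)
    spark.driver.userClassPathFirst    true
    spark.executor.userClassPathFirst  true
    ```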
    
    Another thing about the datanucleus jars is that most people shouldn't 
actually need them: Hive's preferred way to connect to the metastore is through 
the metastore server, which doesn't require these jars. But I don't know how 
Spark SQL is generally deployed these days, so maybe that doesn't apply here.
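    For context, pointing at a remote metastore server (rather than embedding 
the metastore, which is what pulls in the datanucleus jars) is done through 
hive-site.xml. A minimal sketch, where the host and port are placeholders:

    ```xml
    <!-- hive-site.xml: point Hive / Spark SQL at a remote metastore server.
         metastore-host:9083 is a placeholder; 9083 is the conventional
         metastore Thrift port. -->
    <configuration>
      <property>
        <name>hive.metastore.uris</name>
        <value>thrift://metastore-host:9083</value>
      </property>
    </configuration>
    ```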
    
    In light of all this I'd rather keep this change as specific to these jars 
as possible, along with the comment Tom made about investigating whether it's 
possible to add them to the uber jar somehow. That would be the best solution.

