Re: RFC: packaging Spark without assemblies
> On 25 Sep 2015, at 19:11, Marcelo Vanzin wrote:
>
> - People who ship the assembly with their application. As Matei
> suggested (and I agree), that is kinda weird. But currently that is
> the easiest way to embed Spark and get, for example, the YARN backend
> working. There are ways around that but they are tricky. The code
> changes I propose would make that much easier to do without the need
> for an assembly.

Not weird if you are bypassing bin/spark.

> - People who somehow depend on the layout of the Spark distribution.
> Meaning they expect a "lib/" directory with an assembly in there
> matching a specific file name pattern. Although I kinda consider that
> to be an invalid use case (as in "you're doing it wrong").

Well, the spark-submit and spark-example shells do something close to
this, though primarily as error checking against >1 artifact and
classpath confusion.

-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org
Re: RFC: packaging Spark without assemblies
On Wed, Sep 23, 2015 at 4:43 PM, Patrick Wendell wrote:
> For me a key step in moving away would be to fully audit/understand
> all compatibility implications of removing it. If other people are
> supportive of this plan I can offer to help spend some time thinking
> about any potential corner cases, etc.

Thanks Patrick (and all the others) who commented on the document. For
BC, I think there are two main cases:

- People who ship the assembly with their application. As Matei
suggested (and I agree), that is kinda weird. But currently that is
the easiest way to embed Spark and get, for example, the YARN backend
working. There are ways around that but they are tricky. The code
changes I propose would make that much easier to do without the need
for an assembly.

- People who somehow depend on the layout of the Spark distribution.
Meaning they expect a "lib/" directory with an assembly in there
matching a specific file name pattern. Although I kinda consider that
to be an invalid use case (as in "you're doing it wrong").

One potential way to avoid it is to do the work to make the assemblies
unnecessary, but not get rid of them, at least at first. Maybe a build
profile or an argument in make-distribution.sh to enable or disable
them as desired.

--
Marcelo
RFC: packaging Spark without assemblies
Hey all,

This is something that we've discussed several times internally, but
never really had much time to look into; but as time passes by, it's
increasingly becoming an issue for us and I'd like to throw some ideas
around about how to fix it.

So, without further ado:
https://github.com/vanzin/spark/pull/2/files

(You can comment there or click "View" to read the formatted document.
I thought that would be easier than sharing on Google Drive or Box or
something.)

It would be great to get people's feedback, especially if there are
strong reasons for the assemblies that I'm not aware of.

--
Marcelo
Re: RFC: packaging Spark without assemblies
I think it would be a big improvement to get rid of it. It's not how
jars are supposed to be packaged and it has caused problems in many
different contexts over the years.

For me a key step in moving away would be to fully audit/understand
all compatibility implications of removing it. If other people are
supportive of this plan I can offer to help spend some time thinking
about any potential corner cases, etc.

- Patrick

On Wed, Sep 23, 2015 at 3:13 PM, Marcelo Vanzin wrote:
> Hey all,
>
> This is something that we've discussed several times internally, but
> never really had much time to look into; but as time passes by, it's
> increasingly becoming an issue for us and I'd like to throw some ideas
> around about how to fix it.
>
> So, without further ado:
> https://github.com/vanzin/spark/pull/2/files
>
> (You can comment there or click "View" to read the formatted document.
> I thought that would be easier than sharing on Google Drive or Box or
> something.)
>
> It would be great to get people's feedback, especially if there are
> strong reasons for the assemblies that I'm not aware of.
>
> --
> Marcelo