Hi,

I fully agree with that. Actually, I'm working on a PR to add "client" and "exploded" profiles to the Maven build.

The client profile creates a spark-client-assembly jar, much more lightweight than the spark-assembly. In our case, we construct jobs that don't require the whole Spark server side. As it stands, the minimal size of the generated jar is about 120MB, which is painful at spark-submit submission time. That's why I started to remove unnecessary dependencies from spark-assembly.
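To see how heavy a given assembly jar actually is, a quick check along these lines can help. It is only a sketch: the function name jar_stats is made up for illustration, and the 65535 threshold assumes the limit on the number of classes is the entry limit of the classic ZIP format (without ZIP64 extensions), which may not be the only constraint in play.

```python
import zipfile

# Entry limit of the ZIP format without ZIP64 extensions.
# Assumption: this is the limit that matters for the jar in question.
ZIP_CLASSIC_MAX_ENTRIES = 65535


def jar_stats(jar_path):
    """Summarize a jar file (a jar is a ZIP archive).

    Returns the number of entries, the number of .class files,
    and the total uncompressed and compressed sizes in bytes.
    """
    with zipfile.ZipFile(jar_path) as jar:
        infos = jar.infolist()
    class_count = sum(1 for info in infos if info.filename.endswith(".class"))
    return {
        "entries": len(infos),
        "classes": class_count,
        "uncompressed_bytes": sum(info.file_size for info in infos),
        "compressed_bytes": sum(info.compress_size for info in infos),
        "exceeds_classic_zip_limit": len(infos) > ZIP_CLASSIC_MAX_ENTRIES,
    }


if __name__ == "__main__":
    import sys

    # Usage: python jar_stats.py path/to/some-assembly.jar [more jars...]
    for path in sys.argv[1:]:
        print(path, jar_stats(path))
```

Running it on the full assembly and on a trimmed one gives a simple before/after comparison of size and class count.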

On the other hand, I'm also working on the "exploded" mode: instead of using a fat monolithic spark-assembly jar file, the exploded mode allows users to view/change the dependencies.

For the client profile, I already have something ready; I will propose the PR very soon (by the end of this week, hopefully). For the exploded profile, I need more time.

My $0.02

Regards
JB

On 11/11/2015 12:53 AM, Reynold Xin wrote:

On Tue, Nov 10, 2015 at 3:35 PM, Nicholas Chammas
<nicholas.cham...@gmail.com> wrote:


    > 3. Assembly-free distribution of Spark: don’t require building an
    > enormous assembly jar in order to run Spark.

    Could you elaborate a bit on this? I'm not sure what an
    assembly-free distribution means.


Right now we ship Spark using a single assembly jar, which causes a few
different problems:

- the total number of classes is limited on some configurations

- dependency swapping is harder


The proposal is to just avoid a single fat jar.



--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com
