We package our job jars with Maven assemblies, using the 'provided'
scope to exclude the Hadoop jars, and use IntelliJ for local development and
testing.  We've found it easiest to do all local debugging through JUnit
tests, since provided jars are on the test classpath (if you don't want a
test to run during regular unit testing, you can @Ignore the
class).
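
For illustration, a dependency declared this way looks roughly like the fragment below. This is only a sketch: the hadoop-client artifact and the hadoop.version property are assumptions picked for the example, not a copy of our actual pom.

```xml
<!-- Illustrative pom.xml fragment; the artifact and the version property are assumptions. -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client</artifactId>
  <version>${hadoop.version}</version>
  <!-- 'provided' keeps the jar on the compile and test classpaths
       but out of the runtime classpath -->
  <scope>provided</scope>
</dependency>
```

Because an assembly dependency set defaults to the runtime scope, a dependency marked this way should stay out of the packaged jar while remaining visible to JUnit tests.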

Not super elegant, but it works, and it encourages people to test via
actual tests rather than manual scripts.
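
On the parenthetical question in the quoted message about using 'provided' only when packaging: one possible approach, which I have not verified against the quickstart archetype, is to declare the dependency as 'provided' by default and redeclare it with compile scope in a profile that only the IDE activates. The sketch below assumes that IntelliJ sets an idea.version property when it imports a Maven project; the profile id, the flink-java artifact and the flink.version property are likewise illustrative.

```xml
<!-- Sketch only: profile id, artifact and properties are assumptions. -->
<dependencies>
  <dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-java</artifactId>
    <version>${flink.version}</version>
    <!-- Default for command-line builds: not packaged into the fat jar -->
    <scope>provided</scope>
  </dependency>
</dependencies>

<profiles>
  <profile>
    <id>ide-compile-scope</id>
    <activation>
      <!-- Active only when the idea.version property is defined -->
      <property>
        <name>idea.version</name>
      </property>
    </activation>
    <dependencies>
      <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-java</artifactId>
        <version>${flink.version}</version>
        <!-- Inside the IDE: put the dependency on the run classpath -->
        <scope>compile</scope>
      </dependency>
    </dependencies>
  </profile>
</profiles>
```

With something like this, a plain "mvn package" would build the fat jar without the Flink jars, while a job started from the IDE would still see them. Whether the profile's declaration overrides the default one cleanly is worth checking in your Maven version.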

On Tue, Feb 24, 2015 at 4:53 AM, Robert Metzger <rmetz...@apache.org> wrote:

> Hi,
>
> I'm a committer at the Apache Flink project (a system for distributed data
> processing).
> We provide our users with a quickstart Maven archetype to bootstrap new Flink
> jobs.
>
> For the generated Flink job's Maven project, I would like to build a
> fat jar that contains all the dependencies the user added to the project.
> However, I don't want to include the Flink dependencies in the fat jar.
> The purpose of the fat jar is to submit it to the cluster to execute the
> user's job. So it should contain the user code and all the user's dependencies,
> BUT NOT the Flink dependencies, because we can assume they are available
> on the running cluster.
>
> A fat-jar with Flink's dependencies is 60MB+, which can be annoying when
> uploading the jars to a cluster.
>
>
> I'm using the shade plugin to do that.
>
> So my first idea was to exclude everything in the "org.apache.flink"
> groupId from the fat jar.
> However, this is not possible because
> - we can only expect some artifacts to be available at runtime (Flink ships
> the core jars with the binary builds; "extensions" have to be loaded by the
> user)
> - if users put code in their archetype project into the "org.apache.flink"
> namespace, we would exclude user code.
>
> So what I'm looking for is a way to tell the shade (or maven assembly)
> plugin to exclude a list of artifacts and their transitive dependencies.
>
>
> In case someone asks: I cannot use the 'provided' scope for the
> Flink dependencies because users can also start (and debug) their Flink
> jobs locally. Setting the dependencies to 'provided' would tell IDEs like
> IntelliJ that the dependencies are not required, and the job would fail in
> IntelliJ. (If there is a way to set the dependencies to 'provided' only
> during the 'package' phase, let me know.)
>
> I hope somebody here has a solution for us.
>
> Regards,
> Robert
>
