As a little update, the pattern for excluding those files in
sbt-assembly is the following:

assemblyMergeStrategy in assembly := {
  // Discard the signature files that break manifest verification
  case PathList(ps @ _*) if ps.last.endsWith(".DSA") ||
                            ps.last.endsWith(".SF") ||
                            ps.last.endsWith(".RSA") =>
    MergeStrategy.discard
  // Other MergeStrategies: fall back to the default for everything else
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}
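
To double-check that no signature files survive the assembly, here is a
little sketch that scans the fat jar (the jar path is just an example
for my project layout):

import java.util.zip.ZipFile
import scala.collection.JavaConverters._

// List any leftover signature entries in the assembled jar
val jar = new ZipFile("target/scala-2.11/myjob-assembly-1.0.jar")
jar.entries.asScala
  .map(_.getName)
  .filter(n => n.endsWith(".SF") || n.endsWith(".DSA") || n.endsWith(".RSA"))
  .foreach(println)
jar.close()

An empty output means the merge strategy did its job.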

2017-09-25 11:48 GMT+02:00 Federico D'Ambrosio <federico.dambro...@smartlab.ws>:

> Hi Urs,
>
> Thank you very much for your advice; I will look into excluding those
> files directly during the assembly.
>
> 2017-09-25 10:58 GMT+02:00 Urs Schoenenberger <urs.schoenenberger@tngtech.com>:
>
>> Hi Federico,
>>
>> oh, I remember running into this problem some time ago. If I recall
>> correctly, this is not a Flink issue, but an issue with technically
>> incorrect jars from dependencies, which prevent verification of the
>> manifest. I was using the maven-shade plugin back then and configured an
>> exclusion for these file types. I assume that sbt/sbt-assembly has a
>> similar option; this should be more stable than manually stripping the
>> jar.
>> Alternatively, you could try to find out which dependency puts the
>> .SF/etc. files there and exclude that dependency altogether. It might be
>> a transitive dependency that comes with Hadoop anyway, or simply one
>> that you don't need.
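>>
>> For illustration, a rough sketch of what such an exclusion could look
>> like in build.sbt, assuming (this would need to be verified, the
>> organization below is only a placeholder) that the signed jar comes in
>> transitively via the Hive artifact:
>>
>>     "org.apache.hive.hcatalog" % "hive-hcatalog-streaming" % "1.2.1000.2.6.1.0-129" excludeAll(
>>       // placeholder: put the actual owner of the signed jar here
>>       ExclusionRule(organization = "org.bouncycastle")
>>     )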
>>
>> Best,
>> Urs
>>
>> On 25.09.2017 10:09, Federico D'Ambrosio wrote:
>> > Hi Urs,
>> >
>> > Yes, the main class is set, just as you said.
>> >
>> > Still, I might have managed to get it working: during the assembly
>> > some .SF, .DSA and .RSA files are put inside the META-INF folder of
>> > the jar, possibly coming from some of the new dependencies in the
>> > deps tree. Apparently, this caused this weird issue. Using an
>> > appropriate pattern for discarding the files during the assembly, or
>> > removing them via zip -d, should be enough (I sure hope so, since
>> > this is one of the worst issues I've come across).
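>> >
>> > For reference, the zip route would be something along these lines
>> > (the jar name is just an example):
>> >
>> >     zip -d myjob-assembly-1.0.jar "META-INF/*.SF" "META-INF/*.DSA" "META-INF/*.RSA"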
>> >
>> >
>> > Federico D'Ambrosio
>> >
>> > On 25 Sep 2017, 9:51 AM, "Urs Schoenenberger" <urs.schoenenber...@tngtech.com>
>> > wrote:
>> >
>> >> Hi Federico,
>> >>
>> >> just guessing, but are you explicitly setting the Main-Class manifest
>> >> attribute for the jar that you are building?
>> >>
>> >> Should be something like
>> >>
>> >> mainClass in (Compile, packageBin) :=
>> >> Some("org.yourorg.YourFlinkJobMainClass")
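>> >>
>> >> A quick way to double-check that the attribute actually ends up in
>> >> the jar manifest (the jar path is just an example):
>> >>
>> >>     unzip -p target/scala-2.11/yourjob.jar META-INF/MANIFEST.MF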
>> >>
>> >> Best,
>> >> Urs
>> >>
>> >>
>> >> On 23.09.2017 17:53, Federico D'Ambrosio wrote:
>> >>> Hello everyone,
>> >>>
>> >>> I'd like to submit to you this weird issue I'm having, hoping you
>> >>> could help me.
>> >>> Premise: I'm using sbt 0.13.6 for building, Scala 2.11.8, and Flink
>> >>> 1.3.2 compiled from sources against Hadoop 2.7.3.2.6.1.0-129 (HDP 2.6).
>> >>> I'm trying to implement a sink for Hive, so I added the following
>> >>> dependency in my build.sbt:
>> >>>
>> >>> "org.apache.hive.hcatalog" % "hive-hcatalog-streaming" %
>> >>> "1.2.1000.2.6.1.0-129"
>> >>>
>> >>> in order to use hive streaming capabilities.
>> >>>
>> >>> After adding this dependency, without even using it, if I try to
>> >>> flink run the job I get:
>> >>>
>> >>> org.apache.flink.client.program.ProgramInvocationException: The
>> >>> program's entry point class 'package.MainObj' was not found in the
>> >>> jar file.
>> >>>
>> >>> If I remove the dependency, everything goes back to normal.
>> >>> What is weird is that if I try to use sbt run to run the job, *it
>> >>> does find the main class* and it obviously crashes because of the
>> >>> missing Flink core dependencies (AbstractStateBackend missing and
>> >>> whatnot).
>> >>>
>> >>> Here are the complete dependencies of the project:
>> >>>
>> >>> "org.apache.flink" %% "flink-scala" % flinkVersion % "provided",
>> >>> "org.apache.flink" %% "flink-streaming-scala" % flinkVersion %
>> >> "provided",
>> >>> "org.apache.flink" %% "flink-connector-kafka-0.10" % flinkVersion,
>> >>> "org.apache.flink" %% "flink-cep-scala" % flinkVersion,
>> >>> "org.apache.hive.hcatalog" % "hive-hcatalog-streaming" %
>> >>> "1.2.1000.2.6.1.0-129",
>> >>> "org.joda" % "joda-convert" % "1.8.3",
>> >>> "com.typesafe.play" %% "play-json" % "2.6.2",
>> >>> "org.mongodb.mongo-hadoop" % "mongo-hadoop-core" % "2.0.2",
>> >>> "org.scalactic" %% "scalactic" % "3.0.1",
>> >>> "org.scalatest" %% "scalatest" % "3.0.1" % "test",
>> >>> "de.javakaffee" % "kryo-serializers" % "0.42"
>> >>>
>> >>> Could it be a dependency conflict between the mongo-hadoop and Hive
>> >>> Hadoop versions (2.7.1 and 2.7.3.2.6.1.0-129 respectively, even
>> >>> though there is no issue between mongo-hadoop and Flink)? I'm even
>> >>> starting to think that Flink cannot handle big jars that well when
>> >>> it comes to classpath loading (before the new dependency the jar
>> >>> was 44M; afterwards it became 115M).
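>> >>>
>> >>> (To check where the conflicting Hadoop versions come from, the
>> >>> sbt-dependency-graph plugin can print the resolved tree; the
>> >>> coordinates below are from memory, so please double-check them:
>> >>>
>> >>>     // project/plugins.sbt
>> >>>     addSbtPlugin("net.virtual-void" % "sbt-dependency-graph" % "0.8.2")
>> >>>
>> >>> and then run sbt dependencyTree.)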
>> >>>
>> >>> Any help would be really appreciated,
>> >>> Kind regards,
>> >>> Federico
>> >>
>> >> --
>> >> Urs Schönenberger - urs.schoenenber...@tngtech.com
>> >>
>> >> TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
>> >> Geschäftsführer: Henrik Klagges, Dr. Robert Dahlke, Gerhard Müller
>> >> Sitz: Unterföhring * Amtsgericht München * HRB 135082
>>
>> --
>> Urs Schönenberger - urs.schoenenber...@tngtech.com
>> TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
>> Geschäftsführer: Henrik Klagges, Dr. Robert Dahlke, Gerhard Müller
>> Sitz: Unterföhring * Amtsgericht München * HRB 135082
>>