One thing I've found while working on this is that we may want to support
adding packages with exclusions.

Launching the child class with --packages set to kafka_2.10 simply fails,
because it pulls in conflicting libraries as transitive dependencies. I'm not
sure how to represent exclusions on the command line, but technically Aether
seems to support them.

SLF4J: Detected both log4j-over-slf4j.jar AND slf4j-log4j12.jar on the class path, preempting StackOverflowError.
SLF4J: See also http://www.slf4j.org/codes.html#log4jDelegationLoop for more details.

Both jars are transitive dependencies of org.apache.kafka:kafka_2.10:0.9.0
(also 0.8.2.1). So if we would like to add the Kafka library at the submission
step, exclusions need to be supported.
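
To make the idea concrete, here is a rough sketch (in Python, since storm.py
would be the first place to see the option) of how exclusions could be
attached to each coordinate. The `^` separator, the `Artifact` type and the
`parse_packages` function are placeholders made up for illustration, not an
agreed syntax; the actual exclusion would still have to be applied by Aether
on the Java side.

```python
from typing import List, NamedTuple, Tuple


class Artifact(NamedTuple):
    """One Maven coordinate plus the (group, artifact) pairs to exclude."""
    group: str
    name: str
    version: str
    exclusions: Tuple[Tuple[str, str], ...]


def parse_packages(value: str, exclusion_sep: str = "^") -> List[Artifact]:
    """Parse 'g:a:v^exG:exA^exG2:exA2,g2:a2:v2' into Artifact tuples.

    The separator is a hypothetical choice for this sketch only.
    """
    artifacts = []
    for item in (s.strip() for s in value.split(",")):
        if not item:
            continue  # tolerate stray commas
        coordinate, *excluded = item.split(exclusion_sep)
        parts = coordinate.split(":")
        if len(parts) != 3:
            raise ValueError("expected group:artifact:version, got %r" % coordinate)
        exclusions = []
        for ex in excluded:
            ex_parts = ex.split(":")
            if len(ex_parts) != 2:
                raise ValueError("expected group:artifact exclusion, got %r" % ex)
            exclusions.append((ex_parts[0], ex_parts[1]))
        artifacts.append(Artifact(parts[0], parts[1], parts[2], tuple(exclusions)))
    return artifacts
```

With this, the Kafka case above could be written as
`org.apache.kafka:kafka_2.10:0.9.0^org.slf4j:slf4j-log4j12`, which drops one
of the two conflicting jars from the resolved set.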

On Thu, Aug 4, 2016 at 12:03 AM, Jungtaek Lim <[email protected]> wrote:

> FYI: This proposal has been filed as STORM-2016
> <https://issues.apache.org/jira/browse/STORM-2016> and I've been working
> on it.
>
> I'd like to explain the details of the topology submitter, as I wasn't clear
> on that.
>
> I've been experimenting with several ways of topology submission, but they
> all have pros and cons.
>
> 1. Introduce a Submitter class which resolves dependencies and uploads them
> to the blobstore, loads the topology code and dependencies into a custom
> mutable classloader, and finally runs the child class' main method by
> reflection. This is what SparkSubmit does, though that is more complicated
> because it supports various options.
>
> pros.
> - No need to handle communication between processes. That class bootstraps
> and handles everything.
> cons.
> - We would have to pass the custom classloader to all usages of
> Class.forName in order to prevent any ClassNotFoundExceptions.
> - Spark uses checkstyle to check usage of Class.forName, but we don't
> apply that, so we could miss some.
>
> 2. Introduce a Helper class which resolves (and fetches) transitive
> dependencies, uploads them to the blobstore, and returns a map of
> (blob key, file) pairs. storm.py reads the Helper class' response, adds the
> files to the classpath, and runs the child class' main.
>
> pros.
> - We don't need the classloader hack (?).
> - If we make the Helper class a separate module, we can even place that
> module outside of lib and avoid adding the Aether libraries to the lib
> directory.
> cons.
> - It's annoying and error-prone to get and parse the Helper's output from
> stdout.
> - storm.py also needs to run two classes, but that's not a big deal since we
> already do that (confvalue and ClientJarTransformerRunner).
> - It's not easy to remove dependencies from the blobstore if topology
> submission from the child class fails.
>
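
Inline note on the stdout parsing concern in option 2: a minimal sketch of
what the storm.py side could look like, assuming (and this is only an
assumption for illustration) that the Helper prints a single JSON object
mapping blob key to local file path as the last non-empty line of stdout.
The function names and the output format are hypothetical, not what the
current patch does.

```python
import json
import os
from typing import Dict, List


def parse_helper_output(stdout: str) -> Dict[str, str]:
    """Extract the {blob key: local file} map from the Helper's stdout.

    Only the last non-empty line is parsed, so log noise printed earlier
    (for example SLF4J warnings) is ignored.
    """
    lines = [ln for ln in stdout.splitlines() if ln.strip()]
    if not lines:
        raise ValueError("Helper produced no output")
    try:
        mapping = json.loads(lines[-1])
    except json.JSONDecodeError as e:
        raise ValueError("last line of Helper output is not JSON") from e
    if not isinstance(mapping, dict) or not all(
        isinstance(k, str) and isinstance(v, str) for k, v in mapping.items()
    ):
        raise ValueError("expected a JSON object of string to string")
    return mapping


def build_classpath(base: List[str], mapping: Dict[str, str]) -> str:
    """Append the resolved jar files (sorted for determinism) to the classpath."""
    return os.pathsep.join(list(base) + sorted(mapping.values()))
```

Reserving the last line for machine-readable output keeps the parsing simple,
but it stays fragile if anything else writes to stdout after the Helper's
result, which is the error-prone part mentioned above.
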
> 3. Let the Helper class just resolve transitive dependencies and return the
> file list. storm.py reads the Helper class' response, adds the files to the
> classpath, and runs the child class' main. StormSubmitter uploads them to
> the blobstore.
>
> pros.
> - Same as 2.
> - Easy to remove dependencies from the blobstore if submission fails.
> - The Helper class no longer depends on storm-core, so it's easier to place
> the module outside of lib.
> cons.
> - StormSubmitter has to handle dependencies when submitting the topology.
>
> I've succeeded with 2, and will try 3 to see whether it helps.
>
> Any other suggestions or opinions on the existing options are much
> appreciated!
>
> Thanks,
> Jungtaek Lim (HeartSaVioR)
>
> On Wed, Aug 3, 2016 at 8:01 AM, Jungtaek Lim <[email protected]> wrote:
>
>> Hi Priyank,
>>
>> First of all, this feature is similar (close) to what Spark provides.
>>
>> https://spark.apache.org/docs/2.0.0/submitting-applications.html#advanced-dependency-management
>>
>> If you have additional jars which are not packed into the uber topology jar,
>> you can use the --jars option to include them without repackaging the
>> topology jar.
>>
>> And I think I was not clear about the submitter. I'm still trying to design
>> that point in detail: resolving dependencies needs the Eclipse Aether
>> libraries, so I'm thinking about how to avoid adding that dependency to
>> storm-core. But it seems not that easy or clear. I'll update once I'm clear
>> on this.
>>
>> Thanks,
>> Jungtaek Lim (HeartSaVioR)
>>
>> On Wed, Aug 3, 2016 at 7:43 AM, Priyank Shah <[email protected]> wrote:
>>
>>> Hi Jungtaek,
>>>
>>> For adding jars and maven at submission, you have used the word
>>> Submitter. Is Submitter the person running storm jar command or is
>>> Submitter the java code that actually submits it to Nimbus?
>>> Also, I did not quite understand the --jars option. If you could please
>>> elaborate a little on that, that would be great.
>>>
>>> Thanks
>>> Priyank
>>>
>>>
>>>
>>>
>>>
>>>
>>> On 8/2/16, 7:05 AM, "Jungtaek Lim" <[email protected]> wrote:
>>>
>>> >Ah, Satish, you got the point. I meant the copied versions of the files in
>>> >the supervisor, but those themselves can be isolated.
>>> >I didn't think about removing blobs, and it doesn't seem easy to do.
>>> >
>>> >Jungtaek Lim (HeartSaVioR)
>>> >
>>> >
>>> >On Tue, Aug 2, 2016 at 7:35 PM, Satish Duggana <[email protected]> wrote:
>>> >
>>> >> Hi Jungtaek,
>>> >> With the current proposal, are we removing blob store files referred by a
>>> >> topology when it is killed?
>>> >>
>>> >> Thanks,
>>> >> Satish.
>>> >>
>>> >> On Tue, Aug 2, 2016 at 3:50 PM, Jungtaek Lim <[email protected]> wrote:
>>> >>
>>> >> > Hi Satish,
>>> >> >
>>> >> > Thanks for reviewing and sharing your idea.
>>> >> >
>>> >> > Yes, this is shared dependencies vs. isolated dependencies.
>>> >> > If we name the dependency file to contain the group name, artifact name,
>>> >> > and version, it can be shared.
>>> >> > One downside of this approach is storage space, since we don't know when
>>> >> > it's safe to delete without additional care, but I'm curious whether the
>>> >> > disk would really fill up due to dependency blob jar files in a normal
>>> >> > situation.
>>> >> > So I think we're OK to do this, but I would like to see others' opinions.
>>> >> >
>>> >> > Btw, I'm designing the details based on the proposal. I will update this
>>> >> > thread if there are things not covered by the initial design.
>>> >> >
>>> >> > Thanks,
>>> >> > Jungtaek Lim (HeartSaVioR)
>>> >> >
>>> >> > On Tue, Aug 2, 2016 at 6:58 PM, Satish Duggana <[email protected]> wrote:
>>> >> >
>>> >> > > Hi Jungtaek,
>>> >> > > The proposal looks good to me. Good that we are not going with the
>>> >> > > other alternative using a mutable classloader etc.
>>> >> > >
>>> >> > > It is good to have the config mentioned in the proposal for adding those
>>> >> > > jars before or after storm core/libs. There is a property,
>>> >> > > Config.TOPOLOGY_CLASSPATH_BEGINNING, whose value is used as the initial
>>> >> > > classpath, and that should continue to work as expected even with the
>>> >> > > new configuration.
>>> >> > >
>>> >> > > One enhancement which we may want to add to the existing proposal:
>>> >> > > when --packages is used, the storm submitter can upload those
>>> >> > > dependencies to the blob store with a defined naming convention, so that
>>> >> > > the same set of packages is not uploaded again and can be reused by
>>> >> > > other topologies if they use the same package.
>>> >> > >
>>> >> > > Thanks,
>>> >> > > Satish.
>>> >> > >
>>> >> > >
>>> >> > > On Tue, Aug 2, 2016 at 7:25 AM, Jungtaek Lim <[email protected]> wrote:
>>> >> > >
>>> >> > > > Hi dev,
>>> >> > > >
>>> >> > > > This is the proposal review thread for submitting a topology with added
>>> >> > > > jars and maven artifacts. It also follows up on the discussion thread
>>> >> > > > "[DISCUSSION] Policy of resolving dependencies for non storm-core
>>> >> > > > modules". [1]
>>> >> > > >
>>> >> > > > I've written a design doc which also describes the motivation for this.
>>> >> > > >
>>> >> > > > https://cwiki.apache.org/confluence/display/STORM/A.+Design+doc%3A+adding+jars+and+maven+artifacts+at+submission
>>> >> > > >
>>> >> > > > Please review this and comment on "this thread" instead of the wiki
>>> >> > > > page so that all devs are notified of updates.
>>> >> > > >
>>> >> > > > Thanks,
>>> >> > > > Jungtaek Lim (HeartSaVioR)
>>> >> > > >
>>> >> > > > [1]
>>> >> > > > http://mail-archives.apache.org/mod_mbox/storm-dev/201607.mbox/%3CCAF5108jByyJLTKrV_P4fS=dj8rsr_o5oubzqbviscggsc1c...@mail.gmail.com%3E
>>> >> > > >
>>> >> > >
>>> >> >
>>> >>
>>>
>>
