In the current state of Spark Streaming, creating separate Java processes, each with its own StreamingContext, is probably the best approach to dynamically adding and removing input sources. All of these processes should be able to use a YARN cluster for resource allocation.
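As a rough sketch of what I mean, each user's pipeline would be its own small driver program with exactly one StreamingContext. The socket source, the word-count steps, the 10-second batch interval, and the argument names below are just placeholders for whatever the user configures through your web interface:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// One JVM process per tenant, one StreamingContext per process.
object TenantStreamingJob {
  def main(args: Array[String]): Unit = {
    val Array(tenantId, host, port) = args

    val conf = new SparkConf().setAppName(s"streaming-tenant-$tenantId")
    val ssc = new StreamingContext(conf, Seconds(10))

    // The pipeline is fixed for the lifetime of this context; to change
    // sources or processing steps, stop this process and launch a new one.
    val lines = ssc.socketTextStream(host, port.toInt)
    lines.flatMap(_.split("\\s+"))
         .map(word => (word, 1L))
         .reduceByKey(_ + _)
         .print()

    ssc.start()
    ssc.awaitTermination()
  }
}

Each such driver could then be launched as its own YARN application, e.g. something along the lines of "spark-submit --master yarn-cluster --class TenantStreamingJob tenant-job.jar alice stream-host 9999" (class and jar names made up here), so YARN takes care of allocating resources per process.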
On Wed, Sep 3, 2014 at 6:30 PM, Tobias Pfeiffer <t...@preferred.jp> wrote:
> Hi,
>
> I am not sure if "multi-tenancy" is the right word, but I am thinking
> about a Spark application where multiple users can, say, log into some web
> interface and specify a data processing pipeline with a streaming source,
> processing steps, and output.
>
> Now, as far as I know, there can be only one StreamingContext per JVM, and
> I also cannot add sources or processing steps once it has been started. Are
> there any ideas/suggestions for how to achieve dynamic adding and
> removing of input sources and processing pipelines? Do I need a separate
> 'java' process per user?
> Also, can I realize such a thing when using YARN for dynamic allocation?
>
> Thanks
> Tobias