[ https://issues.apache.org/jira/browse/SPARK-5037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon updated SPARK-5037:
--------------------------------
Labels: bulk-closed (was: )
> support dynamic loading of input DStreams in pyspark streaming
> --------------------------------------------------------------
>
> Key: SPARK-5037
> URL: https://issues.apache.org/jira/browse/SPARK-5037
> Project: Spark
> Issue Type: New Feature
> Components: DStreams, PySpark
> Affects Versions: 1.2.0
> Reporter: Jascha Swisher
> Priority: Major
> Labels: bulk-closed
>
> The Scala and Java streaming APIs support "external" InputDStreams (e.g. the
> ZeroMQReceiver example) through a number of mechanisms, for instance by
> overriding ActorReceiver or by subclassing Receiver directly. The PySpark
> streaming API does not currently allow similar flexibility; it is limited at
> the moment to file-backed text and binary streams and socket text streams.
> It would be great to open up the PySpark streaming API to other stream
> sources, bringing it closer to parity with the JVM APIs.
> One way of doing this could be to support dynamically loading InputDStream
> implementations through reflection at the JVM level, analogous to what is
> currently done for Hadoop InputFormats in the Hadoop methods of the regular
> PySpark context.py.
> I'll submit a PR momentarily with my shot at this. Comments and alternative
> approaches are more than welcome.
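
The "load an implementation by name" idea in the description can be illustrated with a minimal sketch. It is not the code from the PR and does not touch Spark or the JVM: it only mirrors the pattern in pure Python with importlib, where a caller passes a fully qualified class name as a string and the class is resolved and instantiated at run time. The helper names load_class and create_stream_source are hypothetical, and collections.OrderedDict stands in for a real stream source class.

```python
import importlib


def load_class(qualified_name):
    """Resolve a dotted 'package.module.ClassName' string to a class object."""
    module_name, _, class_name = qualified_name.rpartition(".")
    if not module_name:
        raise ValueError("expected a fully qualified name, got %r" % qualified_name)
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


def create_stream_source(qualified_name, **conf):
    """Instantiate a class chosen at run time from its name (hypothetical helper)."""
    cls = load_class(qualified_name)
    return cls(**conf)


# Stand-in usage: OrderedDict plays the role of a pluggable source class,
# and the keyword arguments play the role of its configuration.
source = create_stream_source("collections.OrderedDict", host="localhost", port=5555)
```

On the JVM side the corresponding step would be a reflective class lookup by name; the point of the pattern is that the Python API only needs to carry a class name and its configuration, not a binding for every possible source.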