[jira] [Updated] (SPARK-22936) providing HttpStreamSource and HttpStreamSink

bluejoe (JIRA) Mon, 01 Jan 2018 22:02:43 -0800

     [ 
https://issues.apache.org/jira/browse/SPARK-22936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


bluejoe updated SPARK-22936:
----------------------------
    Description: 
Hi, in my project I built spark-http-stream, which is now available at 
https://github.com/bluejoe2008/spark-http-stream. I would like to know whether 
it is useful to others and whether it could be integrated into Spark.

spark-http-stream transfers Spark structured streams over the HTTP protocol. 
Unlike TCP streams, Kafka streams, and HDFS file streams, HTTP streams often 
flow across distributed data centers on the Web. This makes it helpful for 
building global data processing pipelines across data centers (scientific 
research institutes, for example) that each own separate data sets.

The following code shows how to load messages from an HttpStreamSource:

{{val lines = spark.readStream.format(classOf[HttpStreamSourceProvider].getName)
        .option("httpServletUrl", "http://localhost:8080/xxxx")
        .option("topic", "topic-1")
        .option("includesTimestamp", "true")
        .load()}}


> providing HttpStreamSource and HttpStreamSink
> ---------------------------------------------
>
>                 Key: SPARK-22936
>                 URL: https://issues.apache.org/jira/browse/SPARK-22936
>             Project: Spark
>          Issue Type: New Feature
>          Components: Structured Streaming
>    Affects Versions: 2.1.0
>            Reporter: bluejoe


