[ https://issues.apache.org/jira/browse/SPARK-22936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16309682#comment-16309682 ]
Sean Owen commented on SPARK-22936:
-----------------------------------

The typical place to list these is https://spark-packages.org/

> providing HttpStreamSource and HttpStreamSink
> ---------------------------------------------
>
>                 Key: SPARK-22936
>                 URL: https://issues.apache.org/jira/browse/SPARK-22936
>             Project: Spark
>          Issue Type: New Feature
>          Components: Structured Streaming
>    Affects Versions: 2.1.0
>            Reporter: bluejoe
>
> Hi, in my project I completed spark-http-stream, which is now available at
> https://github.com/bluejoe2008/spark-http-stream. I am wondering whether it
> would be useful to others and whether it could be integrated into Spark.
> spark-http-stream transfers Spark structured streams over the HTTP protocol.
> Unlike TCP streams, Kafka streams and HDFS file streams, HTTP streams often
> flow across distributed big data centers on the Web. This feature is very
> helpful for building global data processing pipelines across different data
> centers (scientific research institutes, for example) that own separate data
> sets.
> The following code shows how to load messages from an HttpStreamSource:
> ```
> // HttpStreamSourceProvider is provided by spark-http-stream
> val lines = spark.readStream.format(classOf[HttpStreamSourceProvider].getName)
>   .option("httpServletUrl", "http://localhost:8080/xxxx")
>   .option("topic", "topic-1")
>   .option("includesTimestamp", "true")
>   .load()
> ```

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)