Clemens Wolff created BAHIR-117:
-----------------------------------
Summary: Expand filtering options for TwitterInputDStream
Key: BAHIR-117
URL: https://issues.apache.org/jira/browse/BAHIR-117
Project: Bahir
Issue Type: Improvement
Components: Spark Streaming Connectors
Reporter: Clemens Wolff
Priority: Minor
Currently, the TwitterInputDStream only supports filtering by keywords [1],
which corresponds to the "track" parameter in the Twitter API [2]. The Twitter
API supports several other ways to receive a filtered stream (e.g. getting
Tweets from a particular location [3]). It would be very useful to expose these
additional filtering options in this library.
Proposal: add a new public method to TwitterUtils which follows the same
interface as createStream [4] but takes a FilterQuery [5] object as an
argument. This gives our users full filtering flexibility.
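
To make the proposal concrete, here is a rough sketch of what such a method
and a caller's query might look like. It is only an illustration under stated
assumptions: the name createFilteredStream and the FilteredTwitterStreams
trait are hypothetical and do not exist in TwitterUtils, the remaining
parameters simply mirror the existing createStream, and the FilterQuery calls
assume the varargs setters of twitter4j 4.x. The keywords and bounding box
are made-up sample values.

```scala
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.StreamingContext
import org.apache.spark.streaming.dstream.ReceiverInputDStream
import twitter4j.{FilterQuery, Status}
import twitter4j.auth.Authorization

// Hypothetical shape of the proposed method (not part of Bahir today).
// Only the `query` parameter differs from the existing createStream.
trait FilteredTwitterStreams {
  def createFilteredStream(
      ssc: StreamingContext,
      twitterAuth: Option[Authorization],
      query: FilterQuery,
      storageLevel: StorageLevel = StorageLevel.MEMORY_AND_DISK_SER_2
  ): ReceiverInputDStream[Status]
}

object FilterQueryExample {
  // Sample bounding box: (longitude, latitude) pairs, south-west corner first,
  // as expected by the "locations" request parameter.
  val boundingBox: Array[Array[Double]] =
    Array(Array(-122.75, 36.8), Array(-121.75, 37.8))

  // Builds a query that combines keyword and location filters.
  // Assumes twitter4j 4.x, where track and locations accept varargs.
  def buildQuery(): FilterQuery = {
    val query = new FilterQuery()
    query.track("spark", "bahir")      // same role as today's keyword filters
    query.locations(boundingBox: _*)   // filter not reachable via createStream
    query
  }
}
```

A caller would then pass FilterQueryExample.buildQuery() where a keyword
sequence is passed to createStream today.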
I'm currently working on Project Fortis, a social data analysis platform for
the United Nations [6]. The extra filtering options would be very useful for my
project, so I'm happy to implement this and create a pull request.
[1] https://github.com/apache/bahir/blob/fd4c35fc9f7ebb57464d231cf5d66e7bc4096a1b/streaming-twitter/src/main/scala/org/apache/spark/streaming/twitter/TwitterInputDStream.scala#L44
[2] https://dev.twitter.com/streaming/overview/request-parameters#track
[3] https://dev.twitter.com/streaming/overview/request-parameters#locations
[4] https://github.com/apache/bahir/blob/fd4c35fc9f7ebb57464d231cf5d66e7bc4096a1b/streaming-twitter/src/main/scala/org/apache/spark/streaming/twitter/TwitterUtils.scala#L39
[5] http://twitter4j.org/javadoc/twitter4j/FilterQuery.html
[6] https://fortis-web.azurewebsites.net/#/site/ocha/