I’ll re-run that notebook and share all of the raw historical data as a zip file that the ASF deployment of Kibble could incorporate to aid development of the front-end.

A request: Could we get this JSON output as a single document per repost/like? That is to say, every time jane doe does a retweet etc of one of our tweets, that should be one document with the various data fields. This would allow for some interesting mappings instead of just bar charts :)

It’s not currently possible using the REST API to get a list of everyone who liked a tweet, but it is possible for retweets. I created STREAMS-550 to enable that capability. Also created STREAMS-551 to get all the needed pieces into a container.

I'll have to ponder how we're going to present this, and which charts would be most informative here. There is a lot of potential here.

Much more than just bar charts for sure.

Steve

On Dec 3, 2017 at 6:17 AM, Daniel Gruno <[email protected]> wrote:

On 12/03/2017 01:02 PM, Daniel Gruno wrote:

On 12/02/2017 10:41 PM, Steve Blackmon wrote:

Sorry about that! Here’s a link to the notebook that doesn’t require registration.

https://www.zepl.com/viewer/notebooks/bm90ZTovL3N0ZXZlYmxhY2ttb24vYXBhY2hlLXplcHBlbGluLWRhc2hib2FyZC84YjQ5YmY3MWIxYTU0ZTE2YjlkMDQyMTliMzNlMjQzYS9ub3RlLmpzb24

In this notebook we used the %spark interpreter to collect the data, but most of the work is done in Scala in the driver process. The Streams code base is Java and does not depend on Spark or other frameworks external to the jar file.

The easiest integration I can think of, given the Python/Java language gap, would use Docker: Streams could prepare a Docker container packaged with all the necessary code, and Kibble installations could use it to run ad-hoc or scheduled data processes. The data collected could be written as newline-delimited JSON on container-mounted volumes, or directly to an Elasticsearch index.
Docker’s not really necessary though; if the system where Kibble is running has a JRE configured and a local Streams distribution, that could work too.

Right, but probably the easiest entry point for people just "wanting to get things done" :). I could also imagine us setting up a remote service that could handle this via an HTTP API as an alternate solution, akin to how you would use a GitHub API. That is to say, we'd have a VM that you could query, and it'd have all the Java in place for speedy access to these sorts of things.

Either or both would work for me, and if Streams is willing to sort out the actual data gathering, we could have this put into ES quickly and get started on using the data gathered. I'll have to ponder how we're going to present this, and which charts would be most informative here. There is a lot of potential here.

If Streams can provide us with a "run this" sort of container that can spit out JSON, that would be awesome. While writing to ES directly might be easier, there's the use case where ES is not local to the system (Kibble is intended to support both local ES and remote-via-JSON-API systems), so JSON output might be the best for now.

With regards,
Daniel.

A request: Could we get this JSON output as a single document per repost/like? That is to say, every time jane doe does a retweet etc of one of our tweets, that should be one document with the various data fields. This would allow for some interesting mappings instead of just bar charts :)

Steve

On Dec 2, 2017 at 2:10 PM, Daniel Gruno <[email protected]> wrote:

On 12/02/2017 09:07 PM, Steve Blackmon wrote:

Hi Kibble Team,

I've been checking out the code and the demo site this weekend. I'm interested in joining the team and integrating some of the data sources maintained in http://streams.apache.org

Specifically, activity streams from the social media presences of projects and contributors (who opt in), as well as statistics derived from them, could make a nice addition to Kibble.
Here's an example: analysis of Twitter accounts of Apache projects using Streams and Zeppelin:

https://www.zepl.com/UvGWgAZb7/spaces/Sb9ElZuDD/8b49bf71b1a54e16b9d04219b33e243a

Cheers,
Steve Blackmon
[email protected]

Hi Steve,

I like the idea, but I am unable to see the link you shared; it shows a 404 for me :(. Having said that, looking into the social media space is definitely something worth doing!

With regards,
Daniel.
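
For illustration only, here is a minimal Python sketch of the "one document per retweet" idea raised in the thread, written out either as newline-delimited JSON or as an Elasticsearch bulk request body. Nothing here is the actual Streams output format: the field names (`tweet_id`, `account`, `actor`, `actor_followers`, `timestamp`), the input dictionaries, and the index name are all assumptions made up for the example.

```python
import json
from typing import Iterable, Iterator, List


def retweet_documents(tweet: dict, retweeters: Iterable[dict]) -> Iterator[dict]:
    """Yield one flat document per retweet of `tweet`.

    The input and output field names are hypothetical; a real Streams
    container may emit a different schema.
    """
    for user in retweeters:
        yield {
            "type": "retweet",
            "tweet_id": tweet["id"],
            "account": tweet["account"],
            "actor": user["screen_name"],
            "actor_followers": user.get("followers_count", 0),
            "timestamp": user["retweeted_at"],
        }


def to_ndjson(docs: Iterable[dict]) -> str:
    """Serialize documents as newline-delimited JSON, one document per line."""
    return "".join(json.dumps(doc, sort_keys=True) + "\n" for doc in docs)


def to_bulk_body(docs: Iterable[dict], index: str) -> str:
    """Build an Elasticsearch bulk request body: an action line, then the source.

    Assumption: recent Elasticsearch versions accept an action line with only
    `_index`; older versions may also require a `_type`.
    """
    lines: List[str] = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc, sort_keys=True))
    return "\n".join(lines) + "\n" if lines else ""


if __name__ == "__main__":
    tweet = {"id": "1", "account": "TheASF"}  # made-up sample data
    users = [{"screen_name": "jane_doe", "retweeted_at": "2017-12-03T06:17:00Z"}]
    print(to_ndjson(retweet_documents(tweet, users)), end="")
```

Keeping each retweet as its own flat document means the same records can be dropped on a mounted volume as a file or posted to a local or remote ES without reshaping them.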
