Great points, Steven.

What's always attracted me to Apache Streams is its descriptiveness (via JSON 
Schema documents) rather than prescriptiveness. Granary's approach is 
(currently? ;) ) more prescriptive:

https://github.com/snarfed/granary/blob/master/granary/twitter.py

vs.

https://github.com/apache/streams/tree/STREAMS-26/streams-contrib/streams-provider-twitter

...which is mostly (though not entirely) a collection of .json and .conf files, 
with a handful of .java files needed (afaict) for last-mile integration with 
one's tool.


The future I dream about is one where I can pick my tool for my own 
idiosyncratic language, operating system, or license reasons, yet every tool 
works off shared, descriptive "knowledge" documents.
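
To make the distinction concrete, here's a rough sketch of what I mean. It is 
not Streams' or Granary's actual code: the mapping format, the provider name, 
and the field names are all made up for illustration. The point is only that 
the provider-specific knowledge lives in a JSON document, and the converter 
itself stays generic:

```python
import json

# Hypothetical "knowledge" document: a declarative mapping from one provider's
# field paths to ActivityStreams-style field paths. The format and the names
# here are invented for illustration only.
MAPPING_DOC = json.loads("""
{
  "provider": "example-microblog",
  "verb": "post",
  "fields": {
    "actor.displayName": "user.name",
    "object.content": "text",
    "published": "created_at"
  }
}
""")


def get_path(record, dotted_path):
    """Follow a dotted path such as 'user.name' through nested dicts."""
    value = record
    for key in dotted_path.split("."):
        value = value[key]
    return value


def set_path(record, dotted_path, value):
    """Set a dotted path such as 'actor.displayName', creating dicts as needed."""
    *parents, leaf = dotted_path.split(".")
    for key in parents:
        record = record.setdefault(key, {})
    record[leaf] = value


def to_activity(raw, mapping):
    """Generic converter: everything provider-specific comes from `mapping`."""
    activity = {"verb": mapping["verb"]}
    for target, source in mapping["fields"].items():
        set_path(activity, target, get_path(raw, source))
    return activity


raw_post = {
    "user": {"name": "Ada"},
    "text": "hello",
    "created_at": "2016-10-20T13:26:38Z",
}
activity = to_activity(raw_post, MAPPING_DOC)
print(json.dumps(activity, indent=2, sort_keys=True))
```

A tool in any other language could read the same mapping document and apply 
it with its own small generic converter, which is the re-use I'm hoping for.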


Otherwise, we're all pulling separately, and end up with snowflake systems to 
process snowflake APIs. However, I also know it's unlikely everyone will come 
"under one roof" to work on things. My hope, though, is that the output of this 
group (and Granary and Sockethub and...) will be re-usable by as wide an 
audience as possible--hence the value of description over prescription (at 
least in my book ;) ).


Granted, if I'm barking up the wrong tree (again), I'm happy to wander off...


Is anything in the above sane? ;)


Cheers!

Benjamin

--

http://bigbluehat.com/

http://linkedin.com/in/benjaminyoung

________________________________
From: sblackmon <sblack...@apache.org>
Sent: Thursday, October 20, 2016 1:26:38 PM
To: dev@streams.incubator.apache.org
Cc: Matt Franklin; Benjamin Young
Subject: Re: Granary & SocketHub

On October 18, 2016 at 6:09:49 PM, Matt Franklin 
(m.ben.frank...@gmail.com) wrote:
On Tue, Oct 18, 2016 at 10:39 AM Benjamin Young <byo...@bigbluehat.com>
wrote:

> (resending from the correct account...likely the other got spammed...)
>
> Granary is a project with similar ideas and intents as Apache Streams
> (which also needs AS2 support ;) ):
> https://github.com/snarfed/granary
>

Ryan from Granary is on the list, I think.  Hey Ryan!  Cool stuff, too bad it's 
Python :)

> In fact Apache Streams gets a mention in their "Related Work" section:
> https://github.com/snarfed/granary#related-work
>
> Also mentioned in the Granary related work section is SocketHub:
> https://github.com/sockethub/sockethub
>

Cool stuff, too bad it's LGPL :)

> Its aims are similar, but it's reaching way beyond Web-based social APIs
> and "back" to including things like IRC, Email, etc.


Non-SNS data sources are important for sure. I've posted some work on my 
personal GitHub using the Streams framework to parse MBOX files - 
https://github.com/steveblackmon/streams-apache - and to collect quantified 
self data - https://github.com/steveblackmon/humanapi-streams
IRC is interesting as well.


> What's significant about both these projects (and others they link to) are
> the stories they're telling developers--which we can crib from as we think
> about the Streams "pitch." They also have relatively minimal setup
> docs--which Streams is also heading toward (go Steve!).
>

Agreed, this is key.


The existence of other open-source projects with similar themes suggests we're 
onto an important problem.  We should pay attention to these projects and what 
is working for them WRT user growth, community growth, tech media coverage, 
etc...


>
> Again, my key objective is to understand the Apache Streams vision alongside
> projects like these and within the wider space of consolidating social
> data. What market does it serve? Is it "personal" (as these projects seem
> to be)? Or commercial? Or developer-only (library/framework for wiring up
> your own idiosyncratic stuff)?
>

I think the overall objective of streams remains very similar to what it
started as: A way to easily and flexibly ingest multiple different sources
of 'activity' data in a normalized ActivityStreams format. For me
personally, my interest is in ingesting this data at scale and with as
little internally-maintained code as possible.

While most of the development so far has been geared toward enabling back-end / 
commercial-scale data collection and management, I think the future should be 
more about enabling individuals and businesses to transcend data silos using 
computing resources and code entirely under their own control. This might mean 
supporting regular users with a full-featured SaaS application in addition to 
continued work on data interoperability.


>
> Thanks for reading, pondering, and helping me help. :)
>
> Cheers!
> Benjamin
>
> --
> http://bigbluehat.com/
> http://linkedin.com/in/benjaminyoung
>
>
