Doug -
So the point is to attempt "early detection" of an outbreak of
something based on what people are tweeting?
( "influenza", "flu", "cold", "fever", "H1N1", "H3N2",
"sneezing", "aching", "ache", "achy", "congested" )
It certainly sounds like there might be some utility to it, but
I'm wondering what kinds of reasoning went into this? Is it
based on any models of who tweets or what they are likely to
tweet about?
Was it more of a demonstration or team-building exercise, or does
someone expect to actually put it to use?
So, the data was pre-archived, but I presume a more useful
version would work from more real-time data and probably would
have a sliding time (exponential moving average?) window?
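To make the sliding-window idea concrete, here is a minimal sketch, assuming the keyword hits have already been binned into per-interval counts (say, one count per hour or day). The function names, the smoothing factor alpha and the spike ratio are illustrative assumptions on my part, not anything taken from the hackathon code.

```python
def exponential_moving_average(counts, alpha=0.3):
    """Return the exponentially weighted moving average of a count series.

    alpha is the weight given to the newest interval (0 < alpha <= 1);
    the first interval seeds the average.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = []
    ema = None
    for count in counts:
        ema = count if ema is None else alpha * count + (1 - alpha) * ema
        smoothed.append(ema)
    return smoothed


def flag_spikes(counts, alpha=0.3, ratio=2.0):
    """Return indices of intervals whose count exceeds ratio times the
    moving average of the intervals that came before them."""
    flagged = []
    ema = None
    for i, count in enumerate(counts):
        # Compare against the baseline built from earlier intervals only.
        if ema is not None and ema > 0 and count > ratio * ema:
            flagged.append(i)
        ema = count if ema is None else alpha * count + (1 - alpha) * ema
    return flagged
```

With a real-time feed, the same update could be applied one interval at a time as new counts arrive, since only the previous average needs to be kept.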
Do you know about Norm Packard's (of Eudaimonic, Prediction,
ProtoLife <http://en.wikipedia.org/wiki/Norman_Packard> fame)
latest venture called LuckySort <http://luckysort.com/>? Its R
interface is called TopicWatchr
<http://luckysort.com/products/r-package> and seems to be doing
something roughly similar (but without specific geolocation?).
Their examples suggest that they are aiming this at the
Investment sector.
Our own Mick Thompson (well, SFX if not FRIAM) was working on
related things before the startup Collecta went dark... I'm not
sure if he's still in this game (or on this list?). I used
Collecta when it was alive... it aggregated Twitter as well as
some subset of blogs and maybe newsfeeds? For example, stuck in
northbound traffic on I-25 near La Cienega one time, I was able
to discover within seconds of stopping my vehicle that 3 people
also stuck in traffic had mentioned that they too were stuck.
One of them was close enough to the front of the line to see that
a fuel truck had been involved in an accident, so the authorities
weren't inclined to let anyone past it until the HazMat or Fire
folks had determined there was no risk. On the other hand a CB
Radio and/or a Police Scanner (oldschool) would have told me all
that and more in time to take the La Cienega exit and frontage on
into town with only a minor delay.
One of my projects is funded by NIH, and it sponsored (read:
paid for) a group of 15 of us software developer types from 10
different organizations across the country who are working on
the project to get together last week in Las Vegas, NV to
conduct a two-day hackathon. We split into three groups, and my
group produced some rough, ugly, but working Python and R code.
The Python code conducts keyword searches on archived 1% Twitter
API data, filtered to search only those tweets that have
valid geolocation data. The short piece of R code calls a Google
map API and plots the data on a Google map in a browser,
allowing the user to click on the geolocated map points to view
the originator's tweet text.
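For anyone following along, a filter of this kind might look roughly like the sketch below. It is not the hackathon code. It assumes the archive is stored as one JSON tweet object per line, with the GeoJSON-style "coordinates" field (longitude first, then latitude) and the "text" field used by the Twitter streaming API; the keyword list is the one quoted earlier in this thread, and the function name is made up for illustration.

```python
import json
import re

# Keyword list as quoted earlier in this thread.
KEYWORDS = (
    "influenza", "flu", "cold", "fever", "H1N1", "H3N2",
    "sneezing", "aching", "ache", "achy", "congested",
)

# Whole-word, case-insensitive match, so "flu" does not hit "influence".
_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def geolocated_matches(lines):
    """Yield {"lat", "lon", "text"} for each geolocated tweet whose text
    contains at least one keyword. `lines` is any iterable of JSON strings,
    for example an open archive file."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            tweet = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or corrupt records
        if not isinstance(tweet, dict):
            continue
        coords = tweet.get("coordinates")
        text = tweet.get("text")
        if not isinstance(coords, dict) or not isinstance(text, str):
            continue  # no geolocation, or not a tweet (e.g. a delete notice)
        point = coords.get("coordinates")
        if not isinstance(point, list) or len(point) != 2:
            continue
        if _PATTERN.search(text):
            lon, lat = point  # GeoJSON order: longitude, latitude
            yield {"lat": lat, "lon": lon, "text": text}
```

The resulting records carry everything a map layer needs: a position for the marker and the tweet text to show on click.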
Our next step will be to replace the R code with Python for
calling the Google map API.
Here, it's ugly, but it's free. Don't say I never gave you
anything.
--Doug
--
Doug Roberts
d...@parrot-farm.net
http://parrot-farm.net/Second-Cousins
505-455-7333 - Office
505-672-8213 - Mobile
============================================================
FRIAM Applied Complexity Group listserv
Meets Fridays 9a-11:30 at cafe at St. John's College
to unsubscribe http://redfish.com/mailman/listinfo/friam_redfish.com