2010/1/21 Olivier Grisel <[email protected]>:
> 2010/1/20 Ian Holsman <[email protected]>:
>> On 1/20/10 2:35 AM, Jason Rutherglen wrote:
>>>
>>> We've got Newsgroup classification. I'm kind of interested in
>>> creating a Twitter classification system, or at least playing
>>> around with it. Also, as a large, growing, and relevant data
>>> set, Twitter seems to fit well with Hadoop-based machine
>>> learning algorithms... Just throwing it out into the wild!
>>>
>>>
>>
>> Hi Jason.
>> I think the biggest issues here are twofold.
>>
>> 1. access to the data, although I'm sure the ASF could work something out
>> here
>
> Firehose (the live, complete Twitter stream) is going to be open to the
> public this year. In the meantime, it is possible to
> gain access to a sample stream and to perform ad hoc search queries on
> specific terms or user profiles.

BTW, I just stumbled upon the following project, which dumps a Twitter
status stream directly to HDFS:

  http://github.com/ieure/Twidoop

-- 
Olivier
http://twitter.com/ogrisel - http://code.oliviergrisel.name
