Hi,


----- Original Message ----
> From: Ian Holsman <[EMAIL PROTECTED]>
> To: [email protected]
> Sent: Friday, October 10, 2008 11:18:19 PM
> Subject: Re: What are the business cases for collaborative filtering?
> 
> Otis Gospodnetic wrote:
> >
> >   Building a data-collection mechanism, storage mechanism, and figuring out
> > how to feed the data to Taste, do so quickly, frequently enough, etc.
> >
> > Otis
> >
> >  
> hmm. sounds like a good subproject.
> currently we are using a custom piece of code hooked up via the apache
> logwriter to feed the data into HDFS and then run stuff.
> 
> but it would be good to have something that does it in real time too

Heh, it sounds like we are going through similar steps.  I first wrote a simple
"beacon servlet" for tracking purposes.  Then I opted for a simpler (and more
static) setup: a pixel tracker, web server (nginx) logging, and a log parser that
is supposed to process that log and store it to _____ (not sure where yet, didn't
get there), and from there get it to Taste.  This, of course, means more
batch-oriented processing.  Going with the beacon servlet approach could
*presumably* get closer to real-time recommendations.
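
A minimal sketch of the log-parser step in the pixel-tracker pipeline, under
stated assumptions rather than a description of any code actually in use: the
pixel path (/pixel.gif), the query parameter names (u for user, i for item),
and the use of hit counts as the preference value are all hypothetical.  The
output is one "userID,itemID,value" line per user/item pair, the kind of
comma-separated file a file-based Taste data model is expected to read.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Matches the request part of an nginx access log line, e.g.
#   ... "GET /pixel.gif?u=42&i=1001 HTTP/1.1" 200 43 ...
REQUEST_RE = re.compile(r'"GET (?P<url>\S+) HTTP/[\d.]+"')


def parse_pixel_hit(line, path="/pixel.gif"):
    """Return (user_id, item_id) for a tracking-pixel hit, or None.

    The path and the 'u'/'i' parameter names are assumptions for this sketch.
    """
    match = REQUEST_RE.search(line)
    if match is None:
        return None
    url = urlsplit(match.group("url"))
    if url.path != path:
        return None  # some other request, not the tracking pixel
    params = parse_qs(url.query)
    try:
        return int(params["u"][0]), int(params["i"][0])
    except (KeyError, ValueError):
        return None  # missing or non-numeric IDs are skipped


def to_preference_lines(log_lines):
    """Aggregate pixel hits into 'userID,itemID,count' lines."""
    counts = {}
    for line in log_lines:
        hit = parse_pixel_hit(line)
        if hit is not None:
            counts[hit] = counts.get(hit, 0) + 1
    return [f"{u},{i},{c}" for (u, i), c in sorted(counts.items())]
```

Run periodically over the rotated access log, this is inherently batch
oriented: recommendations can only be as fresh as the last parsed log file.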

Ian, can you elaborate on the "feed data into HDFS" part?  You simply store it 
in HDFS?  Why HDFS?  Why not some other FS, or why not an RDBMS?  What happens to 
your data after you store it in HDFS?


Thanks,
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
