Lukas, One last thing, be sure to log only when a user clicks on a result and in Hadoop document_id will be a key in the map phase.
Lucene related steps are the same. Best, Peter W. On Aug 14, 2007, at 1:28 PM, Peter W. wrote:
When users perform a search, log the unique document_id, IP address and result position for the next step. Use Hadoop to simplifiy your logs by mapping the id and emitting IP's as intermediate values. Have reduce collect unique document_id[IP addresses].
--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]