Hello

I would like to store data retrieved hourly from RSS feeds in a database
or in Lucene so that the text can be easily indexed for word frequencies.

I need to get the text from the title and description elements of RSS items.

Ideally, for each hourly retrieval from a given feed, I would add a row
to a table in a dataset made up of the following columns:

feed_url, title_element_text, description_element_text, polling_date_time

From this, I can look up any element in a feed and calculate keyword
frequencies over whatever length of time is required.
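
To make the idea concrete, here is a rough sketch of the calculation I have in mind, using plain Java collections rather than a database or Lucene. The class and method names (FeedWordFrequencies, FeedRow, wordFrequencies) and the example URLs are made up for illustration, the fields simply mirror the four columns above, and the tokenisation is deliberately naive (lower-case, split on anything that is not a letter or digit).

```java
import java.time.Instant;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class FeedWordFrequencies {

    /** One stored row; the fields mirror the four columns listed above (hypothetical type). */
    static final class FeedRow {
        final String feedUrl;
        final String titleText;
        final String descriptionText;
        final Instant pollingDateTime;

        FeedRow(String feedUrl, String titleText, String descriptionText, Instant pollingDateTime) {
            this.feedUrl = feedUrl;
            this.titleText = titleText;
            this.descriptionText = descriptionText;
            this.pollingDateTime = pollingDateTime;
        }
    }

    /** Counts words in the title and description of one feed for polls in [from, to). */
    static Map<String, Integer> wordFrequencies(List<FeedRow> rows, String feedUrl,
                                                Instant from, Instant to) {
        Map<String, Integer> counts = new HashMap<>();
        for (FeedRow row : rows) {
            boolean inWindow = !row.pollingDateTime.isBefore(from)
                    && row.pollingDateTime.isBefore(to);
            if (!row.feedUrl.equals(feedUrl) || !inWindow) {
                continue; // different feed, or polled outside the requested period
            }
            String text = (row.titleText + " " + row.descriptionText).toLowerCase(Locale.ROOT);
            // Naive tokenisation: split on any run of characters that are not letters or digits.
            for (String word : text.split("[^\\p{L}\\p{N}]+")) {
                if (!word.isEmpty()) {
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample rows standing in for two hourly polls of one feed.
        List<FeedRow> rows = Arrays.asList(
            new FeedRow("http://example.com/rss", "Lucene release",
                    "New Lucene version released", Instant.parse("2010-01-01T10:00:00Z")),
            new FeedRow("http://example.com/rss", "Lucene tips",
                    "Indexing tips", Instant.parse("2010-01-01T11:00:00Z")),
            new FeedRow("http://example.com/other", "Lucene elsewhere",
                    "Ignored", Instant.parse("2010-01-01T10:00:00Z")));

        Map<String, Integer> counts = wordFrequencies(rows, "http://example.com/rss",
                Instant.parse("2010-01-01T10:00:00Z"), Instant.parse("2010-01-01T12:00:00Z"));
        System.out.println(counts);
    }
}
```

Narrowing or widening the from/to arguments is what I mean by calculating frequencies over a chosen length of time; a real version would presumably also need stop words and a better tokeniser.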

This can be done with a database table, using hashmaps to calculate word
frequencies. But can I do this in Lucene at this level of granularity at
all? If so, would each feed form a Lucene document, or would each 'row'
from the database table form one?

Can anyone advise?

Thanks

Martin O'Shea.