Hi,

I'd like to use Solr to index some webserver logs, in order to allow easy
ad-hoc querying and analysis. Each Solr Document will represent a single
request to the webserver, with fields for time, request URL, referring URL,
and so on.

I'm also planning to fetch the page source of each referring URL, and add
that as an indexed field in the Solr document. The aim is to allow queries
like "find hits to /xyz.html where the referring page contains the word
'foobar'".
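
To make the plan concrete, here is a rough sketch of the indexing step as I
picture it. The field names (request_url, referrer_url, referrer_text) and the
fetch_page helper are placeholders I made up for illustration, not anything
defined by Solr, and the documents are plain dicts rather than calls to a real
Solr client. Each referring page is fetched only once, but its text is still
attached to every document that shares the referrer.

```python
from typing import Callable, Dict, Iterable, List


def build_docs(hits: Iterable[dict], fetch_page: Callable[[str], str]) -> List[dict]:
    """Turn parsed log hits into Solr-style documents (plain dicts).

    `hits` and `fetch_page` are hypothetical: each hit is assumed to carry
    "time", "url" and "referrer" keys, and fetch_page(url) is assumed to
    return the page source as text.
    """
    page_cache: Dict[str, str] = {}  # referrer URL -> page source, fetched once
    docs = []
    for i, hit in enumerate(hits):
        referrer = hit["referrer"]
        if referrer not in page_cache:
            page_cache[referrer] = fetch_page(referrer)
        docs.append({
            "id": str(i),
            "time": hit["time"],
            "request_url": hit["url"],
            "referrer_url": referrer,
            # The same text is repeated in every document that shares this
            # referrer; the intent is for it to be indexed but not stored.
            "referrer_text": page_cache[referrer],
        })
    return docs
```

With field names like these, the example query above would look something
like: request_url:"/xyz.html" AND referrer_text:foobar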

Since hundreds or even thousands of hits may all come from the same
referring page, would this approach be horribly inefficient? (Note that the
page source won't be stored in each Document, just indexed.) Would doing this
dramatically increase the index size?

If so, is there a more elegant way to do what I want?

Many thanks,
Phil


