What about using a separate collection for each host and an alias when you need to query several of them together? I believe that is also the usual solution for date-based rolling collections.
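As a rough, untested sketch of what I mean: the snippet below only builds the request URLs, it does not call Solr. CREATEALIAS with its `name` and `collections` parameters comes from the Collections API; the base URL, the `logs_<type>_<host>` naming scheme, and the `timestamp` sort field are assumptions of mine for illustration, not something from your setup.

```python
from urllib.parse import urlencode

# Assumed base URL of a local SolrCloud node; adjust for your cluster.
SOLR_BASE = "http://localhost:8983/solr"


def collection_name(log_type, host):
    """Hypothetical naming scheme: one collection per (log type, host)."""
    return f"logs_{log_type}_{host}"


def create_alias_url(alias, collections, base=SOLR_BASE):
    """Build a Collections API CREATEALIAS request for the given collections."""
    params = {
        "action": "CREATEALIAS",
        "name": alias,
        "collections": ",".join(collections),
    }
    # Keep commas unescaped so the collection list stays readable.
    return f"{base}/admin/collections?{urlencode(params, safe=',')}"


def select_url(target, query, sort=None, base=SOLR_BASE):
    """Build a /select request against a collection or an alias."""
    params = {"q": query}
    if sort:
        params["sort"] = sort
    return f"{base}/{target}/select?{urlencode(params)}"


if __name__ == "__main__":
    hosts = ["hostA", "hostB"]
    type_a = [collection_name("type_a", h) for h in hosts]
    # One alias spanning type_a logs from both hosts.
    print(create_alias_url("logs_type_a", type_a))
    # Single host: only that collection is searched.
    print(select_url(type_a[0], "*:*"))
    # Both hosts together, newest first ("timestamp" is an assumed field).
    print(select_url("logs_type_a", "*:*", sort="timestamp desc"))
```

A query for one host then only touches that host's collection, while the alias covers the "host A and B together" case without a join.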
https://cwiki.apache.org/confluence/display/solr/Collections+API

Regards,
   Alex.
----
Newsletter and resources for Solr beginners and intermediates:
http://www.solr-start.com/

On 7 August 2016 at 21:47, archit mehta <pathashal...@gmail.com> wrote:
> Hi,
>
> I want to index log files, say type_a.log and type_b.log, and these
> files come from various hosts. In the normal case I would index them as
> <some_field: "", from_host: "", log_type: "">. [Index everything as a
> single flat index]
>
> Now the issue is, let's say each file has 1 billion rows; with 2 hosts,
> each with 2 files, the index will have 4 billion rows. So when someone
> wants all details for a particular host, the search space will still be
> 4 billion rows. As Lucene is flat, it indexes this way only. So is there
> a way I can create some metadata, like the type of log and the hostname,
> and create an index based on that so searches become faster (when we need
> details from a particular host or log file only)? Or is this a bad idea,
> since a join query (give data from hosts A and B and show it in
> descending order) will be very slow?
>
> Hope you got my question.
>
> Regards,
> Archit

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org