Thanks Alexandre,

The list of IDs stays constant for a longer time. I will take a look at
the join topic.
Maybe another solution would be to create a whole new collection (or set
of documents) from scratch that contains only the documents matching the
IDs, and to execute the queries on this collection. Building it would take
some time, but it may be worth it because the queries would benefit.
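
For reference, here is a minimal sketch of how the filter query from my
original mail could be assembled on the client side. It assumes the source
IDs are integers and joins them with an explicit OR so the result does not
depend on the default query operator. The function names are made up for
illustration, and nothing here talks to a real Solr instance; it only
builds the request parameters.

```python
from urllib.parse import urlencode


def build_source_id_filter(source_ids, field="source_id"):
    """Return an fq value such as 'source_id:(1 OR 2 OR 3)'."""
    # Deduplicate and sort so the same set of IDs always produces the same
    # string; an identical fq string should be able to hit the filter cache.
    unique_ids = sorted({int(i) for i in source_ids})
    if not unique_ids:
        raise ValueError("at least one source id is required")
    return "{}:({})".format(field, " OR ".join(str(i) for i in unique_ids))


def build_select_params(source_ids, query="*:*", rows=10):
    """Return a URL-encoded parameter string for a /select request."""
    params = {
        "q": query,
        "fq": build_source_id_filter(source_ids),
        "rows": rows,
    }
    return urlencode(params)
```

Since the list of IDs stays the same for a while, sending exactly the same
fq string each time seems like the cheapest thing to try before building a
separate collection.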

Daniel

On Thu, Jul 26, 2012 at 7:43 PM, Alexandre Rafalovitch
<arafa...@gmail.com> wrote:

> You can't update the original documents except by reindexing them, so
> no easy group assignment option.
>
> If you create this 'collection' once but query it multiple times, you
> may be able to use SOLR4 join with IDs being stored separately and
> joined on. Still not great because the performance is an issue when
> mapping on IDs:
> http://www.lucidimagination.com/blog/2012/06/20/solr-and-joins/ .
>
> If the list is some sort of combination of smaller lists - you could
> probably precompute (at index time) those fragments and do compound
> query over them.
>
> But if you have to query every time and the list is different every
> time, that could be complicated.
>
> Regards,
>    Alex.
>
> Personal blog: http://blog.outerthoughts.com/
> LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
> - Time is the quality of nature that keeps events from happening all
> at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
> book)
>
>
> On Thu, Jul 26, 2012 at 12:01 PM, Daniel Brügge
> <daniel.brue...@googlemail.com> wrote:
> > Hi,
> >
> > I am facing the following issue:
> >
> > I have a couple of million documents, which have a field called
> > "source_id". My problem is that I want to retrieve all the documents
> > which have a source_id in a specific set of values. This set can be
> > pretty big, for example a list of 200 to 2000 source IDs.
> >
> > I was thinking that a filter query could be used, like
> > fq=source_id:(1 2 3 4 5 6 .....)
> > but this reminds me of SQL's WHERE IN (...), which was always a bit slow
> > for a huge number of values.
> >
> > Another solution that came to my mind was to assign all the documents I
> > want to retrieve a new kind of "filter id", so that all the documents
> > which I want to analyse get a new ID. But I would need to update all the
> > millions of documents for this and assign them a new ID. This could take
> > some time.
> >
> > Can you think of a nicer way to solve this issue?
> >
> > Regards & greetings
> >
> > Daniel
>
