tl;dr: retrieving 10,000 docs is a bad idea. Look into docValues for
storing security info

I suspect that you'll be better served by keeping the permissions
up-to-date in solr and invalidating the caches rather than trying to return
10,000 docs.  On average, you'll be attempting to read up to 800MB of data
per query (400GB * 10000/5060000), and that will be accessed randomly.
 As Toke said earlier, on a disc this will just be a bad idea.  If
you must persist in querying like this, then I'd second the SSDs - a pair
in RAID 1 should give you good read performance, adequate write and
redundancy. You might be able to get that query down to something in the
region of 5-10 seconds at a guess.  That assumes you're not actually
returning the entire document in the response: an 800MB network response,
even on GbE, would take 6-10s on the wire alone (800MB at a theoretical
125MB/s is 6.4s), let alone the serialisation overhead.
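
For what it's worth, the back-of-envelope numbers above can be reproduced with a few lines of Python. This is only a sketch: the index size and doc count are the approximate figures from this thread, and the GbE rate is the theoretical line rate with no protocol overhead, so the transfer time is a lower bound.

```python
# Rough figures from this thread; all of them are approximations.
INDEX_BYTES = 400e9            # ~400GB index
TOTAL_DOCS = 5_060_000         # docs in the index
ROWS = 10_000                  # docs requested per query
GBE_BYTES_PER_SEC = 1e9 / 8    # theoretical GbE line rate (assumption: no overhead)

def bytes_per_query(index_bytes, total_docs, rows):
    """Average data touched when fetching `rows` stored documents."""
    return index_bytes / total_docs * rows

def transfer_seconds(n_bytes, bytes_per_sec):
    """Time to push n_bytes over a link at the given rate."""
    return n_bytes / bytes_per_sec

per_query = bytes_per_query(INDEX_BYTES, TOTAL_DOCS, ROWS)
wire_seconds = transfer_seconds(per_query, GBE_BYTES_PER_SEC)

print(f"{per_query / 1e6:.0f} MB read per query")
print(f"{wire_seconds:.1f} s on the wire at line rate")
```

Real throughput will be below line rate, so the actual transfer would take longer than this prints.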

You might try looking into storing your security information in docValues
fields - set docValues="true" on the field in schema.xml (needs Solr 4.2 or
later).  That ought to give better performance when reading that field and
may circumvent your concerns over cache invalidation, although I haven't
played with them yet, so don't quote me on that.

Can you be more specific about the security model? What is being stored in
the DB? How does that get applied to the document? Can you translate that
into a query that solr could understand?  Is it too complex, or are you
really just worried about cache invalidation?

Would it be acceptable to have the security info in Solr, but lagging the
DB somewhat, and then select a smaller result set and post-filter in your
business layer?
i.e. instead of running just q=foo:bar&rows=10000 and then filtering, you
run a query such as q=foo:bar&fq=security_group:(2 3 19)&rows=150 and then
filter against your DB as a final double-check before presenting to your
user.  This would mean that they would immediately be prevented from seeing
something that they're no longer allowed to, but may have to wait for the
next update to see something they've just been allowed to.
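
A rough sketch of that two-stage approach in Python, in case it helps. The function names and the is_allowed callback are made up for illustration (the callback stands in for whatever lookup you do against your DB), and the docs are assumed to be dicts with an "id" key. Only the q/fq/rows parameters come from the example above.

```python
from urllib.parse import urlencode

def build_params(query, group_ids, rows=150):
    """Build the Solr query string, restricting results by security group.

    security_group is the example field name used above; group_ids are the
    groups the current user belongs to, as last indexed in Solr.
    """
    fq = "security_group:(" + " ".join(str(g) for g in group_ids) + ")"
    return urlencode({"q": query, "fq": fq, "rows": rows})

def post_filter(docs, is_allowed):
    """Final double-check against the DB (hypothetical is_allowed callback).

    Drops any doc whose permission was revoked since the index was updated.
    """
    return [d for d in docs if is_allowed(d["id"])]

# Example: the filter query from above, URL-encoded.
print(build_params("foo:bar", [2, 3, 19]))

# Example: doc 7 was revoked in the DB after the last index update.
docs = [{"id": 5}, {"id": 7}, {"id": 9}]
print(post_filter(docs, lambda doc_id: doc_id != 7))
```

The post-filter only ever removes documents, which is why revocations take effect immediately while new grants wait for the next index update.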

Regards,
  Duncan.


On 16 April 2013 15:02, Montu v Boda <montu.b...@highqsolutions.com> wrote:

> Hi
>
> problem is that the permission is frequently update in our system so that we
> have to update the index in the same manner other wise it will give wrong
> result.
>
> in that case i think the cache will get effect and the performance may be
> reduced.
>
>
> Thanks & Regards
> Montu v Boda
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/first-time-with-new-keyword-solr-take-to-much-time-to-give-the-result-tp4056254p4056322.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>



-- 
Don't let your mind wander -- it's too little to be let out alone.
