Hi Erick,

Thanks for looking into this and for the tips you've sent.
I am leaning towards a custom query component at the moment; the primary reason would be the ability to squeeze the amount of data that is sent over to Solr. A single round trip within the same datacenter costs around 0.5 ms [1], and if the query doesn't fit into a single Ethernet packet, this number effectively doubles/triples/etc.

Regarding cached filters - I was actually thinking the opposite: caching ACL queries (filter queries) would be beneficial, as those tend to be the same across multiple search requests.

[1] http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en//people/jeff/stanford-295-talk.pdf , slide 13

m.

On Tue, Apr 24, 2012 at 4:43 PM, Erick Erickson <erickerick...@gmail.com> wrote:
> In general, query parsing is such a small fraction of the total time that,
> almost no matter how complex, it's not worth worrying about. To see
> this, attach &debugQuery=on to your query and look at the timings
> in the "prepare" and "process" portions of the response. I'd be
> very sure that it was a problem before spending any time trying to make
> the transmission of the data across the wire more efficient; my first
> reaction is that this is premature optimization.
>
> Second, you could do this on the server side with a custom query
> component if you chose. You can freely modify the query
> over there, and it may make sense in your situation.
>
> Third, consider "no-cache filters", which were developed for
> expensive filter queries, ACLs being one of them. See:
> https://issues.apache.org/jira/browse/SOLR-2429
>
> Fourth, I'd ask if there's a way to reduce the size of the FQ
> clause. Is this on a particular user basis or a groups basis?
> If you can get this down to a few groups, that would help. Although
> there's often some outlier who is a member of thousands of
> groups :(.
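(For what it's worth, the non-cached ACL filter from SOLR-2429 can be sketched roughly like this - the `acl` field name, the group ids, and the `cost=100` value are all made-up assumptions, and the URL-encoding step is just to illustrate what actually goes over the wire:)

```java
// Sketch of building the fq parameter for a non-cached ACL filter.
// {!cache=false cost=...} local params tell Solr not to cache this filter
// and to run it after cheaper filters (per SOLR-2429). Field name "acl",
// group ids, and cost value are hypothetical.
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Arrays;
import java.util.List;

public class AclFilter {

    /** Join group ids into a single non-cached ACL filter query. */
    static String buildAclFq(List<String> groups) {
        String clause = String.join(" OR ", groups);
        return "{!cache=false cost=100}acl:(" + clause + ")";
    }

    /** URL-encode the fq as it would appear in a GET to /select. */
    static String asParam(String fq) throws UnsupportedEncodingException {
        return "fq=" + URLEncoder.encode(fq, "UTF-8");
    }

    public static void main(String[] args) throws Exception {
        String fq = buildAclFq(Arrays.asList("g1", "g2"));
        System.out.println(fq);
        System.out.println(asParam(fq));
    }
}
```

This also makes the packet-size concern concrete: the encoded fq grows linearly with the number of groups, which is why Erick's fourth point (shrinking the clause to a few groups) matters.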
>
> Best
> Erick
>
>
> 2012/4/24 Mindaugas Žakšauskas <min...@gmail.com>:
>> On Tue, Apr 24, 2012 at 3:27 PM, Benson Margulies <bimargul...@gmail.com> wrote:
>>> I'm about to try out a contribution for serializing queries in
>>> Javascript using Jackson. I've previously done this by serializing my
>>> own data structure and putting the JSON into a custom query parameter.
>>
>> Thanks for your reply. I appreciate your effort, but I'm not sure I
>> fully understand the gain.
>>
>> Having the data in JSON would still require it to be converted into a Lucene
>> Query at the end, which takes space and CPU effort, right? Or are you
>> saying that having the query serialized into a structured data blob (JSON
>> in this case) makes it somehow easier to convert into a Lucene Query?
>>
>> I only thought about Java serialization because:
>> - it's rather close to the in-object format
>> - the mechanism is rather stable and is an established standard in Java/JVM
>> - Lucene Queries seem to implement java.io.Serializable (I haven't done
>> a thorough check, but it looks good on the surface)
>> - other conversions (e.g. using XStream) are either slow or require
>> custom annotations; I personally don't see how Lucene/Solr would
>> include them in their core classes.
>>
>> Anyway, it would still be interesting to hear if anyone could
>> elaborate on query parsing complexity.
>>
>> m.
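(The Java-serialization path being discussed is just the standard ObjectOutputStream round trip. A minimal sketch, with a stand-in `Payload` class since Lucene isn't on the classpath here - a Lucene 3.x Query, which implements Serializable, would take its place - and note the byte count, since payload size on the wire is the whole concern:)

```java
// Round-trip any Serializable through a byte array, as Java serialization
// of a query object would do. Payload is a hypothetical stand-in for a
// Lucene Query; its field names are made up.
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerDemo {

    static byte[] toBytes(Serializable s) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(s);
        }
        return bos.toByteArray();
    }

    static Object fromBytes(byte[] b) throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(b))) {
            return ois.readObject();
        }
    }

    /** Stand-in for a serializable query object. */
    static class Payload implements Serializable {
        private static final long serialVersionUID = 1L;
        final String field;
        final String term;
        Payload(String field, String term) { this.field = field; this.term = term; }
    }

    public static void main(String[] args) throws Exception {
        byte[] wire = toBytes(new Payload("acl", "g1"));
        Payload back = (Payload) fromBytes(wire);
        System.out.println(wire.length + " bytes on the wire; term=" + back.term);
    }
}
```

One caveat worth flagging: Java serialization carries class-descriptor overhead per stream, so for small queries the serialized form may actually be larger than a plain query string - worth measuring before assuming it shrinks the packet.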