On Sun, Nov 30, 2008 at 2:04 AM, Naomi Dushay <[EMAIL PROTECTED]> wrote:
> The terms component approach, if i understand it correctly, will be
> problematic. I need to present not only the next X call numbers in
> sequence, but other fields in those documents (e.g. title, author).

You can still use the method Hoss suggested of doing 2 requests to
satisfy this type of search:

>> But as Yonik said: the new TermsComponent may actually be a better option
>> for you -- doing two requests for every page (the first to get the N Terms
>> in your id field starting with your input, the second to do a query for
>> docs matching any of those N ids) might actually be faster even though
>> there won't likely even be any cache hits.

So TermsComponent gets the next 10 IDs, then you do a standard query
with those 10 IDs.

-Yonik

> I assume the Terms Component approach will only give me the next X call number
> values, not the documents.
>
> It sounds like Glen Newton's suggestion of mapping the call numbers to a
> float number is the most likely solution.
>
> I know it sounds ridiculous to do all this for a "call number browse" but
> our faculty have explicitly asked for this. For humanities scholars
> especially, they know the call numbers that are of interest to them, and
> they browse the stacks that way (ML 1500s are opera, V35 is Verdi ...).
> They are using the research methods that have been successful for their
> entire careers. Plus, library materials are going to off-site, high-density
> storage, so the only way for them to browse all materials, regardless of
> location, via call number is online. I doubt they'll find this feature as
> useful as they expect, but it behooves us to give the users what they ask
> for.
>
> So yeah, our user needs are perhaps a little outside of your expectations.
> :-)
>
> - Naomi
>
>
> On Nov 29, 2008, at 2:58 PM, Chris Hostetter wrote:
>
>>
>> : The results are correct. But the response time sucks.
>> :
>> : Reading the docs about caches, I thought I could populate the query result
>> : cache with an autowarming query and the response time would be okay. But that
>> : hasn't worked. (See excerpts from my solrConfig file below.)
>> :
>> : A repeated query is very fast, implying caching happens for a particular
>> : starting point ("42" above).
>> :
>> : Is there a way to populate the cache with the ENTIRE sorted list of values for
>> : the field, so any arbitrary starting point will get results from the cache,
>> : rather than grabbing all results from (x) to the end, then sorting all these
>> : results, then returning the first 10?
>>
>> there's two "caches" that come into play for something like this...
>>
>> the first cache is a low level Lucene cache called the "FieldCache" that
>> is completely hidden from you (and for the most part: from Solr).
>> anytime you sort on a field, it gets built, and is reused for all sorts on
>> that field. My original concern was that it wasn't getting warmed on
>> "newSearcher" (because you have to be explicit about that).
>>
>> the second cache is the queryResultCache which caches a "window" of an
>> ordered list of documents based on a query, and a sort. you can see this
>> cache in your Solr stats, and yes: these two requests result in different
>> cache keys for the queryResultCache...
>>
>> q=yourField:[42+TO+*]&sort=yourField+asc&rows=10
>> q=yourField:[52+TO+*]&sort=yourField+asc&rows=10
>>
>> ...BUT! ... the two queries below will result in the same cache key, and
>> the second will be a cache hit, provided a sufficient value for
>> the "queryResultWindowSize" ...
>>
>> q=yourField:[42+TO+*]&sort=yourField+asc&rows=10
>> q=yourField:[42+TO+*]&sort=yourField+asc&rows=10&start=10
>>
>> so perhaps the key to your problem is to just make sure that once the user
>> gives you an id to start with, you "scroll" by increasing the start param
>> (not altering the id) ...
>> the first query might be "slow" but every query
>> after that should be a cache hit (depending on your page size, and how far
>> you expect people to scroll, you should consider increasing
>> queryResultWindowSize)
>>
>> But as Yonik said: the new TermsComponent may actually be a better option
>> for you -- doing two requests for every page (the first to get the N Terms
>> in your id field starting with your input, the second to do a query for
>> docs matching any of those N ids) might actually be faster even though
>> there won't likely even be any cache hits.
>>
>>
>> My opinion: Your use case sounds like a waste of effort. I can't imagine
>> anyone using a library catalog system ever wanting to look up a call number,
>> and then scroll through all possible books with similar call numbers -- it
>> seems much more likely that i'd want to look at other books with similar
>> authors, or keywords, or tags ... all things that are actually *easier* to
>> do with Solr. (but then again: i don't work in a library. i trust that
>> you know something i don't about what your users want.)
>>
>>
>> -Hoss
>>
>
> Naomi Dushay
> [EMAIL PROTECTED]
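
For illustration, here is a rough sketch of the two-request approach
described above. It only builds the two request URLs; it does not talk
to Solr. The base URL, the "/terms" handler path, and the field names
(callnum_sort, title, author) are assumptions made up for this example,
so adjust them to your own solrconfig and schema. The terms.* parameter
names are the ones documented for the TermsComponent, but check which of
them your Solr version supports (terms.sort in particular).

```python
from urllib.parse import urlencode

# Hypothetical values for illustration only; not taken from the thread.
SOLR_BASE = "http://localhost:8983/solr"
CALLNUM_FIELD = "callnum_sort"


def terms_request_url(start_value, limit=10):
    """Request 1: ask the TermsComponent for the next `limit` terms of the
    call number field, in index order, starting at `start_value`."""
    params = [
        ("terms", "true"),
        ("terms.fl", CALLNUM_FIELD),
        ("terms.lower", start_value),   # inclusive lower bound by default
        ("terms.limit", str(limit)),
        ("terms.sort", "index"),        # assumes a version with terms.sort
    ]
    # "/terms" is an assumed handler path with the TermsComponent registered.
    return SOLR_BASE + "/terms?" + urlencode(params)


def escape_term(term):
    """Quote a term so spaces and punctuation survive the query parser."""
    return '"' + term.replace("\\", "\\\\").replace('"', '\\"') + '"'


def docs_request_url(terms, fields=("title", "author")):
    """Request 2: a standard query for the documents matching any of the
    terms returned by request 1, sorted by call number."""
    clauses = " OR ".join(escape_term(t) for t in terms)
    params = [
        ("q", f"{CALLNUM_FIELD}:({clauses})"),
        ("sort", f"{CALLNUM_FIELD} asc"),
        # Assumes one document per term; raise rows if values can repeat.
        ("rows", str(len(terms))),
        ("fl", ",".join((CALLNUM_FIELD,) + tuple(fields))),
    ]
    return SOLR_BASE + "/select?" + urlencode(params)
```

In use, you would fetch terms_request_url("ML 1500"), pull the term
values out of the response, and pass them to docs_request_url(...) to
get the title and author for each call number on the page.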