On Fri, Jul 26, 2013 at 11:18 PM, Shawn Heisey <s...@elyograg.org> wrote:
> On 7/26/2013 11:50 PM, Joe Zhang wrote:
> > ==> Essentially we are doing pagination here, right? If performance is
> > not the concern, given that the index is dynamic, does the order of
> > entries remain stable over time?
>
> Yes, it's pagination. Just like the other method that I've described in
> detail, you'd have to avoid updating the index while you were getting
> information. Unless you can come up with a sort parameter that's
> guaranteed to make sure that new documents are at the end, any changes
> to the index during the retrieval process will make it impossible to
> retrieve every document.
>

==> What I can guarantee is that there is no deletion, but I guess this is
not equivalent to "newly added docs are at the end", right?

==> I believe you are right about performance. The retrieved set becomes
larger and larger.

> >> ==> This approach seems to require that the id field is numerical,
> >> right?
> > I have a text-based id that is unique.
>
> StrField types work perfectly with range queries. As long as it's not a
> tokenized field, TextField works properly with range queries too.
> KeywordTokenizer is OK, as long as you don't use filters that create
> additional tokens. Some examples that create additional tokens are
> WordDelimiterFilter and EdgeNgramFilter.
>

==> So a "url" field would work fine?

> > ==> I'm not sure I understand the "q={XXX TO *}" part --> wouldn't the
> > query be matched against the default search field, which could be
> > "content", for example? How would that do the job?
>
> You are correct, I was too hasty in constructing the query. That should
> be:
> q=id:{XXX TO *}&rows=NNNNNN&sort=id asc
>
> You could speed things up if you don't need to see all stored fields in
> the response by using the fl parameter to only return the fields that
> you need.
>
> Responding to your additional message about an autoincrement field -
> that would only be possible if you are importing from a data source that
> supports autoincrement, like MySQL. Solr itself has no support for
> autoincrement.
>
> Thanks,
> Shawn
>
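
For reference, here is a rough Python sketch of the paging loop discussed above: sort by id ascending, then ask for ids strictly greater than the last one seen. It is only an illustration under some assumptions. The names build_page_params, walk_all and fetch are made up for this example; fetch stands for whatever code actually sends the parameters to Solr and returns the list of documents. Wrapping the lower bound in double quotes is my own assumption, so that a URL-style id with ':' and '/' in it is not read as query syntax. As noted above, the walk is only complete if the index is not updated while it runs.

```python
from urllib.parse import urlencode


def build_page_params(last_id, rows, id_field="id", fields=None):
    """Build the query parameters for one page of an id-ordered walk."""
    if last_id is None:
        # First page: match everything.
        q = "*:*"
    else:
        # Exclusive lower bound: only ids strictly greater than last_id.
        # Assumption: the bound is double-quoted, with backslashes and
        # quotes escaped, so special characters are taken literally.
        escaped = last_id.replace("\\", "\\\\").replace('"', '\\"')
        q = '%s:{"%s" TO *}' % (id_field, escaped)
    params = {"q": q, "rows": rows, "sort": "%s asc" % id_field}
    if fields:
        # fl limits the returned fields; the id field must stay in the
        # list because the next page is computed from it.
        wanted = [id_field] + [f for f in fields if f != id_field]
        params["fl"] = ",".join(wanted)
    return params


def walk_all(fetch, rows=1000, id_field="id", fields=None):
    """Yield every document, one page at a time.

    fetch is a hypothetical callable: it takes the parameter dict and
    returns a list of document dicts, already sorted by id_field.
    """
    last_id = None
    while True:
        docs = fetch(build_page_params(last_id, rows, id_field, fields))
        if not docs:
            break
        for doc in docs:
            yield doc
        # Resume after the last id of this page.
        last_id = docs[-1][id_field]


def to_query_string(params):
    """URL-encode the parameters for use in a request URL."""
    return urlencode(params)
```

Unlike the single request with rows=NNNNNN, this keeps each response at a fixed size, at the cost of one request per page.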