"It supports it like 2.9, but not using the Hits API. As described above, to
show results 991 to 1000 request the top-1000 results and display the last
10 :-)"

Bear with me, as I am a little confused, so let me throw some stuff down here
and think out loud...
So, I basically have to request the top 100, then do another request for the
next 100, etc., which seems like it would start all over from scratch each time
and be a bit of a performance hit, correct? I would think the optimal way would
be for search to return an object which maintains a cursor into the index tree
until I close it, so I can keep asking for the next 100. It sounds like this new
API doesn't do that? And maybe the old one didn't either, but from the client
perspective, I thought the Hits object might actually just maintain that
pointer.

NOTE: I am not doing anything close to search. Just basic column indexing like
an RDBMS would do for us, except we don't have an RDBMS. Our old RDBMS system
has scaled up to the point of being too costly (3 terabytes). We are now scaling
out with NoSQL and trying to replace the RDBMS before the costs start to be more
than the customers pay us.

BIG NOTE: I think back to Hibernate here, where if you use select * from xx
where yyy with setMaxResults and setFirstResult(index), it gets slower and
slower as you page further in, BUT if you instead use ScrollableResults, it
maintains a cursor and the speed NEVER gets slower as you page into the results.
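
To make the difference concrete, here is a minimal sketch in plain Java (no Hibernate or Lucene involved). It assumes the simplest possible model: offset paging restarts the scan for every page and walks past all earlier rows, while cursor paging keeps one scan open. The class and method names are made up for illustration, and a real database or index may well do better than this model.

```java
import java.util.Iterator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class PagingCost {

    /** Offset paging: every page starts a fresh scan and walks past all earlier rows. */
    public static long rowsTouchedWithOffset(List<Integer> rows, int pageSize) {
        if (pageSize <= 0) throw new IllegalArgumentException("pageSize must be > 0");
        long touched = 0;
        for (int offset = 0; offset < rows.size(); offset += pageSize) {
            Iterator<Integer> scan = rows.iterator(); // restart from the first row
            int end = Math.min(offset + pageSize, rows.size());
            for (int i = 0; i < end; i++) {
                scan.next();
                touched++;
            }
        }
        return touched;
    }

    /** Cursor paging: one scan stays open and each page resumes where the last one stopped. */
    public static long rowsTouchedWithCursor(List<Integer> rows, int pageSize) {
        if (pageSize <= 0) throw new IllegalArgumentException("pageSize must be > 0");
        long touched = 0;
        Iterator<Integer> cursor = rows.iterator();
        while (cursor.hasNext()) {
            for (int i = 0; i < pageSize && cursor.hasNext(); i++) {
                cursor.next();
                touched++;
            }
        }
        return touched;
    }

    public static void main(String[] args) {
        List<Integer> rows = IntStream.range(0, 1000).boxed().collect(Collectors.toList());
        System.out.println("offset paging touched " + rowsTouchedWithOffset(rows, 100) + " rows");
        System.out.println("cursor paging touched " + rowsTouchedWithCursor(rows, 100) + " rows");
    }
}
```

In this model the work for offset paging grows with the square of the number of pages, while the cursor version touches each row once.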

Maybe I am using the wrong library, but there are a lot of NoSQL users of HBase
starting to use Solr from what I understand. Should I be using a different
indexing library perhaps?

Thanks,
Dean


-----Original Message-----
From: Uwe Schindler [mailto:u...@thetaphi.de] 
Sent: Sunday, June 19, 2011 12:16 PM
To: java-user@lucene.apache.org
Subject: RE: looks like no allowing of paging without counting entire result 
set?

> I am wondering how the old Hits object worked that was deprecated and
> removed....that looks like I could stop asking it for more results and it
> would work better not counting all activities that matched in my 10 mil or
> 100 mil result set and just returning the first 100, second 100 and then I
> can cut off which would be way more performant.

Hits did exactly what you described before. It got as many results as needed
to show the nth page. So when showing the page for results 20 to 30, it
fetches at least 30 results.

In general Full Text Search engines are only scoring the top results. This
is e.g. one reason why Google limits the maximum page you can go to.

> Should I just use 2.9 instead?  But then 3.x doesn't seem to support this?

It supports it like 2.9, but not using the Hits API. As described above, to
show results 991 to 1000 request the top-1000 results and display the last
10 :-)
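
A minimal sketch of that idea in plain Java, assuming nothing about Lucene's actual classes: `Hit` and `page` are hypothetical names used only for illustration. A bounded min-heap keeps the best offset + pageSize hits while every match is visited once, and the requested page is then the tail of that top-N list.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

public class TopNPaging {

    /** Hypothetical scored hit; this is not a Lucene class. */
    public static final class Hit {
        public final int docId;
        public final float score;

        public Hit(int docId, float score) {
            this.docId = docId;
            this.score = score;
        }
    }

    /**
     * Returns the hits ranked [offset, offset + pageSize) by descending score.
     * Only the best (offset + pageSize) hits are held in memory at any time.
     */
    public static List<Hit> page(Iterable<Hit> allMatches, int offset, int pageSize) {
        if (offset < 0 || pageSize <= 0) {
            return Collections.emptyList();
        }
        int topN = offset + pageSize;
        // Min-heap on score: the weakest hit of the current top-N sits at the head.
        PriorityQueue<Hit> heap =
                new PriorityQueue<>((a, b) -> Float.compare(a.score, b.score));
        for (Hit h : allMatches) {
            if (heap.size() < topN) {
                heap.add(h);
            } else if (h.score > heap.peek().score) {
                heap.poll(); // evict the weakest hit to make room for a better one
                heap.add(h);
            }
        }
        // Sort the surviving top-N best-first, then keep only the requested page.
        List<Hit> top = new ArrayList<>(heap);
        top.sort((a, b) -> Float.compare(b.score, a.score));
        if (offset >= top.size()) {
            return Collections.emptyList();
        }
        return new ArrayList<>(top.subList(offset, Math.min(topN, top.size())));
    }

    public static void main(String[] args) {
        List<Hit> matches = new ArrayList<>();
        for (int i = 0; i < 5000; i++) {
            matches.add(new Hit(i, (i * 37 % 5000) / 5000f)); // made-up scores
        }
        // Results 991 to 1000: collect the top 1000, keep the last 10.
        List<Hit> lastTen = page(matches, 990, 10);
        for (Hit h : lastTen) {
            System.out.println(h.docId + " " + h.score);
        }
    }
}
```

Every matching document is still visited once per request, but memory and sorting cost depend only on how deep the requested page is, not on the size of the full result set.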

Uwe


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
