Re: How to order search results by Field value?

2004-03-26 Thread Erik Hatcher
On Mar 26, 2004, at 2:20 AM, Morus Walter wrote:
> Erik Hatcher writes:
>> Why not do the unique sequential number replacement at index time
>> rather than query time?
>
> How would you do that? This requires knowing the IDs that will be added
> in the future.
> Let's say you start with strings 'a' and 'b'. Later you add a document
> with 'aa'. How do you know that you should make 'a' 1 and 'b' 3 to be
> prepared for 'aa'?

Good point.  I haven't thought through this scenario well enough yet.

	Erik

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


Re: How to order search results by Field value?

2004-03-26 Thread Robert Koberg
Erik Hatcher wrote:

> On Mar 26, 2004, at 2:20 AM, Morus Walter wrote:
>
>> Erik Hatcher writes:
>>
>>> Why not do the unique sequential number replacement at index time
>>> rather than query time?
>>
>> How would you do that? This requires knowing the IDs that will be added
>> in the future.
>> Let's say you start with strings 'a' and 'b'. Later you add a document
>> with 'aa'. How do you know that you should make 'a' 1 and 'b' 3 to be
>> prepared for 'aa'?
>
> Good point.  I haven't thought through this scenario well enough yet.

Hi,

You could form your results into XML and do a simple XSL transformation 
to get what you want.

<results hits="3" searchStr="boo" searchField="contents">
  <result id="a123" label="bbb"/>
  <result id="c124" label="aaa"/>
  <result id="b124" label="ccc"/>
</results>

<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">
  <!-- send sort by param from server, else use default -->
  <xsl:param name="sortBy" select="'label'"/>

  <xsl:template match="results">
    <div class="searchInfo">
      <xsl:text>Your search for </xsl:text>
      <xsl:value-of select="@searchStr"/>
      <xsl:text> in </xsl:text>
      <xsl:value-of select="@searchField"/>
      <xsl:text> returned </xsl:text>
      <xsl:value-of select="@hits"/>
      <xsl:text> results.</xsl:text>
    </div>
    <div class="results">
      <xsl:choose>
        <xsl:when test="$sortBy='id'">
          <xsl:apply-templates select="result">
            <xsl:sort select="@id"/>
          </xsl:apply-templates>
        </xsl:when>
        <xsl:otherwise>
          <xsl:apply-templates select="result">
            <xsl:sort select="@label"/>
          </xsl:apply-templates>
        </xsl:otherwise>
      </xsl:choose>
    </div>
  </xsl:template>

  <xsl:template match="result">
    <div class="result">
      <span class="resultId">
        <xsl:value-of select="@id"/>
      </span>
      <span class="resultLabel">
        <xsl:value-of select="@label"/>
      </span>
    </div>
  </xsl:template>

</xsl:stylesheet>

best,
-Rob

> 	Erik




Re: How to order search results by Field value?

2004-03-26 Thread Eric Jain
> You could form your results into XML and do a simple XSL
> transformation to get what you want.

Cool! What's the name of the XSLT processor you are using that can sort
1M records in memory?





Re: How to order search results by Field value?

2004-03-26 Thread Robert Koberg
Robert Koberg wrote:

> Eric Jain wrote:
>
>>> You could form your results into XML and do a simple XSL
>>> transformation to get what you want.
>>
>> Cool! What's the name of the XSLT processor you are using that can sort
>> 1M records in memory?
>
> jd.xslt is known to handle extremely large sources:
>
> http://www.aztecrider.com/xslt/
>
> Alternatively, and much better performing, you could use STX, see:
>
> http://www.xml.com/pub/a/2003/02/26/stx.html
>
> and:
>
> http://stx.sourceforge.net/documents/


Hmmm... let me take that back before you spend time on STX. I don't 
think STX can do a sort. I will check. jd.xslt is probably the best bet, 
but development was stopped in 2003.

Do your search queries really result in 1MB of results? Is that a huge 
number of hits or are your fields extremely large?


best,
-Rob




Re: How to order search results by Field value?

2004-03-26 Thread Eric Jain
> Do your search queries really result in 1MB of results? Is that a huge
> number of hits or are your fields extremely large?

The fields are not more than a dozen characters long, but there can
easily be several hundred thousand hits. Of course usually only the top
N hits need to be displayed, but somehow you need to determine what
those are...
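
As an aside, one common way to find the top N without fully sorting several hundred thousand hits is a bounded priority queue. The following is only a minimal sketch in plain Java, not code from Lucene; the class and method names (`TopN`, `smallest`) are made up for illustration, and each hit is represented simply by its sort-field string.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

final class TopN {
    /** Returns the n smallest values in ascending order, keeping at most n candidates in memory. */
    static List<String> smallest(Iterable<String> values, int n) {
        if (n <= 0) {
            return new ArrayList<>();
        }
        // Max-heap: the head is the largest of the n best values seen so far.
        PriorityQueue<String> heap = new PriorityQueue<>(n, Collections.reverseOrder());
        for (String v : values) {
            if (heap.size() < n) {
                heap.add(v);
            } else if (v.compareTo(heap.peek()) < 0) {
                heap.poll();   // evict the current worst candidate
                heap.add(v);
            }
        }
        List<String> result = new ArrayList<>(heap);
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) {
        // prints [aaa, aab]
        System.out.println(smallest(Arrays.asList("bbb", "aaa", "ccc", "aab"), 2));
    }
}
```

This still has to look at the sort value of every hit, so it reduces the sorting work but not the cost of fetching the values.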

Currently what I am doing is this: For each field by which one may need
to sort, an array of all document IDs is kept, presorted by that field.
This array is built at startup, and rebuilt whenever the index is modified.
When a query is run, the results are stored into a bitset (position =
docid). For each docid in the presorted array I then check if the
corresponding bit in the bitset is set, and if yes, I add the docid to a
list of final results (or directly call some code). This procedure works
and is fast, but rather awkward (wasted memory, complicated
updates...) - hope to replace it one day :-)
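
The procedure described above can be sketched roughly as follows. This is an illustration only, not the actual application code: the names `PresortedOrder` and `sortedHits` and the `limit` parameter are assumptions added for the example.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

final class PresortedOrder {
    /**
     * docIdsInSortOrder: all document IDs, already ordered by the sort field.
     * hits: bit i is set when document i matched the query.
     * Returns up to limit matching document IDs in sort order.
     */
    static List<Integer> sortedHits(int[] docIdsInSortOrder, BitSet hits, int limit) {
        List<Integer> result = new ArrayList<>();
        for (int docId : docIdsInSortOrder) {
            if (result.size() >= limit) {
                break;             // top N found, no need to scan further
            }
            if (hits.get(docId)) {
                result.add(docId);
            }
        }
        return result;
    }
}
```

The cost is one int per document per sortable field for the presorted array, plus one bit per document for each query's result set.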





Re: How to order search results by Field value?

2004-03-25 Thread Eric Jain
> You can see the resolution in the latest CVS ;-)

Just to clarify things: Does the current solution require all fields
that can be used for sorting to be loaded and kept in memory? (I guess
you can answer this question faster than I can figure it out by myself
:-)





Re: How to order search results by Field value?

2004-03-25 Thread Doug Cutting
Eric Jain wrote:
> Just to clarify things: Does the current solution require all fields
> that can be used for sorting to be loaded and kept in memory? (I guess
> you can answer this question faster than I can figure it out by myself
> :-)

Field values are loaded into memory.  But values are kept in an array of 
the appropriate type, so the memory used is not large, and loading is 
done via TermDocs, so it's very fast.  But, if you have a million 
documents, a sorter will use 4MB (and cache it for subsequent searches too).

Doug



Re: How to order search results by Field value?

2004-03-25 Thread Doug Cutting
Eric Jain wrote:
> That's reasonable. What I didn't quite understand yet: If I sort on a
> string field, will Lucene need to keep all values in memory all the
> time, or only during startup?

It will cache one instance of each unique value.  So if you have a 
million documents and sort results by a string field that has only 100 
different values, it will create a 1M-element array of pointers to 100 
unique strings.  But if the field has 1M different values, then the 
array will hold 1M unique strings.
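
To make the "array of pointers to unique strings" idea concrete, here is a small sketch in plain Java. It is not the Lucene cache code; `SharedValues` and `share` are hypothetical names, and the input array stands in for the per-document field values.

```java
import java.util.HashMap;
import java.util.Map;

final class SharedValues {
    /** Returns a per-document array in which equal values share one String instance. */
    static String[] share(String[] valuePerDoc) {
        Map<String, String> unique = new HashMap<>();
        String[] shared = new String[valuePerDoc.length];
        for (int doc = 0; doc < valuePerDoc.length; doc++) {
            String v = valuePerDoc[doc];
            String canonical = unique.get(v);
            if (canonical == null) {
                unique.put(v, v);      // first time this value is seen
                canonical = v;
            }
            shared[doc] = canonical;   // one reference per document
        }
        return shared;
    }
}
```

With only 100 distinct values, the array holds 1M references but only 100 string objects stay reachable; with 1M distinct values nothing is shared and every string stays in memory.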

Doug



Re: How to order search results by Field value?

2004-03-25 Thread Eric Jain
> But if the field has 1M different values, then
> the array will hold 1M unique strings.

The strings according to which I need to sort are unique identifiers :-(

I will need to have a look at the code, but I assume that in principle
it should be possible to replace the strings with sequential integers
once the sorting is done?





Re: How to order search results by Field value?

2004-03-25 Thread Doug Cutting
Eric Jain wrote:
> I will need to have a look at the code, but I assume that in principle
> it should be possible to replace the strings with sequential integers
> once the sorting is done?

I don't understand the question.

Doug



Re: How to order search results by Field value?

2004-03-25 Thread Eric Jain
>> I will need to have a look at the code, but I assume that in
>> principle it should be possible to replace the strings with
>> sequential integers once the sorting is done?
>
> I don't understand the question.

I need to sort by a field containing 1M distinct strings. While I can't
afford to waste much memory for the entire duration of the application,
loading all strings into memory temporarily is possible. A solution may
therefore be to load all strings into memory, sort them, and then
replace them with sequentially numbered integers. The question is: Could
this approach work, or did I overlook anything?
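
A minimal sketch of that idea in plain Java, assuming natural `String` ordering and that all values fit in memory while the array is built. `RankArray` and `ranks` are hypothetical names, not part of Lucene.

```java
import java.util.Arrays;

final class RankArray {
    /** Replaces each document's string with its position among the sorted values. */
    static int[] ranks(String[] valuePerDoc) {
        String[] sorted = valuePerDoc.clone();
        Arrays.sort(sorted);   // temporary: all strings are in memory here
        int[] rank = new int[valuePerDoc.length];
        for (int doc = 0; doc < valuePerDoc.length; doc++) {
            // With distinct values this is the exact position in sort order.
            rank[doc] = Arrays.binarySearch(sorted, valuePerDoc[doc]);
        }
        return rank;           // only one int per document is kept afterwards
    }
}
```

Comparing two documents then means comparing two ints, and the strings can be garbage collected once the array has been built.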





Re: How to order search results by Field value?

2004-03-25 Thread Erik Hatcher
Why not do the unique sequential number replacement at index time 
rather than query time?

	Erik

On Mar 25, 2004, at 6:26 PM, Eric Jain wrote:

>>> I will need to have a look at the code, but I assume that in
>>> principle it should be possible to replace the strings with
>>> sequential integers once the sorting is done?
>>
>> I don't understand the question.
>
> I need to: Sort by a field containing 1M distinct strings. While I can't
> afford to waste much memory for the entire duration of the application,
> loading all strings into memory temporarily is possible. A solution may
> therefore be to load all strings into memory, sort them, and then
> replace them with sequentially numbered integers. The question is: Could
> this approach work, or did I overlook anything?



Re: How to order search results by Field value?

2004-03-25 Thread Morus Walter
Erik Hatcher writes:
> Why not do the unique sequential number replacement at index time 
> rather than query time?

How would you do that? This requires knowing the IDs that will be added
in the future.
Let's say you start with strings 'a' and 'b'. Later you add a document
with 'aa'. How do you know that you should make 'a' 1 and 'b' 3 to be
prepared for 'aa'?

To me, Eric's suggestion makes sense.
One possible problem, however: you have to sort all values, whereas
keeping the strings means that you sort only the hits.
And you should be aware that you have to rebuild the array each time the
index changes.
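
The 'a', 'b', 'aa' example can be reproduced with a few lines of Java. This is only an illustration with made-up names (`RankShift`, `rank`), and it counts from 0 rather than 1.

```java
import java.util.SortedSet;
import java.util.TreeSet;

final class RankShift {
    /** Rank of a value = number of known values that sort before it. */
    static int rank(SortedSet<String> known, String value) {
        return known.headSet(value).size();
    }

    public static void main(String[] args) {
        SortedSet<String> known = new TreeSet<>();
        known.add("a");
        known.add("b");
        System.out.println(rank(known, "b"));   // prints 1
        known.add("aa");                        // sorts between "a" and "b"
        System.out.println(rank(known, "b"));   // prints 2
    }
}
```

The number assigned to 'b' changes as soon as 'aa' arrives, which is why numbers fixed at index time cannot stay valid and the array has to be rebuilt.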

Morus




How to order search results by Field value?

2004-03-24 Thread Chad Small
Was there any conclusion to message:
 
http://issues.apache.org/eyebrowse/[EMAIL PROTECTED]msgNo=6762
 
Regarding Ordering by a Field?  I have a similar need and didn't see the resolution 
in that thread.  Is there a current patch against 1.3-final that I could look at?  
 
My other option, I guess, is just to code a comparator on a collection built from 
the Hits.
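
A rough sketch of that fallback in plain Java, for reference. `HitRecord` is a hypothetical stand-in for whatever gets copied out of each hit (here a document ID and a `label` field); no Lucene classes are used, so copying the values out of the Hits is left out.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class HitSorter {
    /** Stand-in for one search hit: a document ID plus the field value to sort by. */
    static final class HitRecord {
        final int docId;
        final String label;

        HitRecord(int docId, String label) {
            this.docId = docId;
            this.label = label;
        }
    }

    /** Returns a copy of the hits ordered by label; the input list is left untouched. */
    static List<HitRecord> sortByLabel(List<HitRecord> hits) {
        List<HitRecord> copy = new ArrayList<>(hits);
        copy.sort(Comparator.comparing((HitRecord h) -> h.label));
        return copy;
    }
}
```

This needs every hit's field value in memory at once, so it is only practical for modest result sets.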
 
thanks,
chad.


Re: How to order search results by Field value?

2004-03-24 Thread Joachim Schreiber
Chad,


> Was there any conclusion to message:
>
> http://issues.apache.org/eyebrowse/[EMAIL PROTECTED]msgNo=6762
>
> Regarding Ordering by a Field?  I have a similar need and didn't see the
> resolution in that thread.  Is there a current patch against 1.3-final
> that I could look at?

You can see the resolution in the latest CVS ;-)

yo

> My other option, I guess, is just to code a comparator on a collection
> built from the Hits.
>
> thanks,
> chad.



