On 2 Nov 2005, at 08:10, Richard Jones wrote:
If I've listened to Radiohead (id 1) 10 times, Coldplay (id 2) 5 times and Beck (id 3) 2 times, the field would look like this: "1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 3 3"

I use this index for quickly finding "top fans" of an artist or combination of
artists, comparing people's music taste and other things on the fly.

The issue is that I already have the term vector (radiohead=10, coldplay=5, beck=2) handy as a hashtable, and I've found myself building up a string of numbers separated by spaces as shown above, then feeding this into Lucene (I store the term vector of the field in Lucene). Is there a way I could pass a term vector directly to Lucene to cut out the ugly "turn it into a string and let Lucene parse it" step? Basically, I want to provide the term vector for a field when inserting a new document, rather than let Lucene build it by analyzing a string.
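
For reference, the string-building step described above might look roughly like the sketch below. It is only an illustration: the class name TermCountString and the method expand are made up here, and the map is assumed to go from artist id to play count.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class TermCountString {
    // Expands {artistId -> playCount} into "id id id ...",
    // repeating each id once per play, separated by single spaces.
    static String expand(Map<Integer, Integer> counts) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Integer, Integer> e : counts.entrySet()) {
            for (int i = 0; i < e.getValue(); i++) {
                if (sb.length() > 0) {
                    sb.append(' ');
                }
                sb.append(e.getKey());
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so ids come out as 1, 2, 3.
        Map<Integer, Integer> counts = new LinkedHashMap<>();
        counts.put(1, 10); // Radiohead
        counts.put(2, 5);  // Coldplay
        counts.put(3, 2);  // Beck
        System.out.println(expand(counts));
    }
}
```

The resulting string is what would then be handed to the analyzer, which only has to split on whitespace to recover the per-artist frequencies.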

This does feel like a rather perverted use of Lucene, I suppose. It's faster
and less hassle than other methods I've tried to date, though.

last.fm using Lucene, sweet! It has caught on with quite a number of friends, so I tried it just yesterday. My first query, for music like "Michael Hedges", turned up nothing, so I was bummed, but it is a very cool service.

Rather than building a string to index in this manner, perhaps try adding each integer as an individual Field with the same name, with the term vector enabled, and using something like the WhitespaceAnalyzer. To be honest, though, I'm not sure without digging deeper whether adding same-named fields in this manner messes with the term vector capabilities.
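
A rough sketch of that idea, with the caveat above still standing: the loop below emits one same-named field per play instead of concatenating a string. FieldSink is a hypothetical stand-in for a Lucene Document, so nothing here calls Lucene itself, and whether the term vector comes out with the right frequencies is exactly the untested part.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PerTermFields {
    // Hypothetical stand-in for a Lucene Document: receives one
    // (name, value) pair for each field that would be added.
    interface FieldSink {
        void addField(String name, String value);
    }

    // Adds the same-named field once per play, so an id with count 10
    // is added 10 times rather than written 10 times into one string.
    static void addCounts(String fieldName, Map<Integer, Integer> counts, FieldSink sink) {
        for (Map.Entry<Integer, Integer> e : counts.entrySet()) {
            String value = String.valueOf(e.getKey());
            for (int i = 0; i < e.getValue(); i++) {
                sink.addField(fieldName, value);
            }
        }
    }

    public static void main(String[] args) {
        Map<Integer, Integer> counts = new LinkedHashMap<>();
        counts.put(1, 10); // Radiohead
        counts.put(2, 5);  // Coldplay
        counts.put(3, 2);  // Beck

        // Collect the values here; in real use the sink would wrap a
        // Document and create a Field with term vectors enabled.
        List<String> values = new ArrayList<>();
        addCounts("artists", counts, (name, value) -> values.add(value));
        System.out.println(values.size() + " fields added");
    }
}
```

This avoids the string concatenation, but each value still goes through the normal indexing path, so it sidesteps the parsing step rather than supplying a term vector directly.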

    Erik


