On 6/5/05 4:14 AM, jbv wrote:
Hi list,
I'm trying to build the fastest possible algorithm for the
following task:
I have a variable containing reference words; each "word"
in this variable is considered as an item.
I also have a list of "sentences" (each sentence being a
list of "words" separated by spaces), 1 per line. Each
sentence can contain 1 or more words from the reference,
as well as other words.
I need to determine, for each sentence, which words from
the reference it contains and in which order, and output
a list of their offsets in the reference.
For example:
reference variable: W1,W2,W3,W4
sentence 1: Wx W2 Wy Wz W3
output: 2,3
And last but not least, I need to keep only sentences that
contain more than 1 word from the reference.
My stab at it:
function calcwords tReference,tSentence
  -- tReference should be the comma-delimited word list, e.g. "w1,w2,w3"
  -- tSentence is the user's space-delimited entry, e.g. "wx w2 wy wz w3"
  put tReference into tRef -- so we can manipulate a copy
  replace comma with comma & cr in tRef
  split tRef by cr and comma -- array keyed by reference word
  replace space with comma & cr in tSentence
  split tSentence by cr and comma -- array keyed by sentence word
  intersect tSentence with tRef -- it now has the right keys
  set the wholeMatches to true -- otherwise "w1" would also match "w10"
  put empty into tOutput
  repeat for each line l in keys(tSentence)
    put itemoffset(l,tReference) & comma after tOutput
  end repeat
  delete last char of tOutput -- the comma
  if comma is in tOutput then return tOutput
  else return empty
end calcwords
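
One caveat: keys() does not promise to return the keys in any particular order, so the function above may not list the offsets in the order the words appear in the sentence. If that order matters, a possible variant is sketched below. It is only a sketch: the name calcWordsInOrder and the tOffsets lookup array are illustrative and not part of the original function. It builds the lookup once from the reference, then walks the sentence word by word.

```livecode
function calcWordsInOrder pReference, pSentence
  -- Illustrative variant; handler and variable names are assumptions.
  -- Build a lookup once: tOffsets[word] = item number in pReference
  put 0 into tIndex
  repeat for each item tItem in pReference
    add 1 to tIndex
    put tIndex into tOffsets[tItem]
  end repeat
  -- Walk the sentence in order so the output follows word order
  put empty into tOutput
  repeat for each word tWord in pSentence
    if tOffsets[tWord] is not empty then
      put tOffsets[tWord] & comma after tOutput
    end if
  end repeat
  delete last char of tOutput -- the trailing comma
  -- Keep only sentences with more than one reference word
  if comma is in tOutput then return tOutput
  return empty
end calcWordsInOrder
```

With the example from the question, calcWordsInOrder("W1,W2,W3,W4", "Wx W2 Wy Wz W3") should return 2,3. If the same reference is used for many sentences, building tOffsets once outside the function and passing it in would likely save more time.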
I didn't time it but it seems like it should be faster.
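
For anyone who wants to time it, a rough harness along these lines should do. The test data, the pass count, and the button handler are all made up for illustration; it assumes calcwords is available in the same script or higher in the message path.

```livecode
on mouseUp
  -- Hypothetical test data; substitute the real reference and sentences.
  put "W1,W2,W3,W4" into tReference
  put "Wx W2 Wy Wz W3" & cr & "Wa Wb W1" & cr & "W4 W1 Wc" into tSentences
  put the milliseconds into tStart
  repeat 1000 times
    put empty into tResults
    repeat for each line tSentence in tSentences
      put calcwords(tReference, tSentence) into tOffsets
      -- calcwords returns empty when fewer than two reference words match
      if tOffsets is not empty then put tOffsets & cr after tResults
    end repeat
  end repeat
  -- Show the elapsed time in the message box
  put (the milliseconds - tStart) && "ms for 1000 passes"
end mouseUp
```

The inner loop also shows the filtering step from the question: only sentences that yield a non-empty result are kept in tResults, one list of offsets per line.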
--
Jacqueline Landman Gay | [EMAIL PROTECTED]
HyperActive Software | http://www.hyperactivesw.com