OK, this time I'm just typing into email - I haven't tested these
suggestions :-)
On 30/08/2018 10:24, Keith Clarke via use-livecode wrote:
Folks,
Is there a single-pass mechanism or more efficient way of returning the
wordOffset of each instance of ‘the’ in ‘the quick brown fox jumped over the
lazy dog’ than to use two passes through the text?
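Yes. For a single word myWord:
put 0 into tOffset
repeat forever
# the result is counted from the skip point, not from the start of tSource
put trueWordOffset(myWord, tSource, tOffset) into tmp
# 0 means no further match, so stop here or the loop never ends
if tmp = 0 then exit repeat
add tmp to tOffset
put tOffset & comma after tOffsetList
end repeat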
BUT there's a chance that this performs poorly, because each call has to
skip over all the words already searched, so I would also benchmark the simpler
put 0 into tOffset
repeat for each trueWord W in tSource
add 1 to tOffset
if W = myWord then
put tOffset & comma after tOffsetList
end if
end repeat
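A rough way to do that benchmark is to wrap each loop in a pair of 'the milliseconds' calls. This is only a sketch and just as untested as the rest: the field name "Source" and the variables tListA, tListB, tTimeA and tTimeB are made up for the example, and it assumes the offset returned by trueWordOffset is relative to the words skipped, as the dictionary describes for wordOffset. Note too that the two lists may differ if the offset function matches partial words; I believe setting the wholeMatches to true controls that, but check the dictionary.
on mouseUp
   # assumption: a field called "Source" holds the text to search
   put fld "Source" into tSource
   put "the" into myWord
   # time the trueWordOffset version
   put the milliseconds into tStart
   put 0 into tOffset
   put empty into tListA
   repeat forever
      put trueWordOffset(myWord, tSource, tOffset) into tmp
      if tmp = 0 then exit repeat
      add tmp to tOffset
      put tOffset & comma after tListA
   end repeat
   put the milliseconds - tStart into tTimeA
   # time the repeat-for-each version
   put the milliseconds into tStart
   put 0 into tOffset
   put empty into tListB
   repeat for each trueWord W in tSource
      add 1 to tOffset
      if W = myWord then put tOffset & comma after tListB
   end repeat
   put the milliseconds - tStart into tTimeB
   # show both timings in the message box
   put "trueWordOffset:" && tTimeA && "ms" & CR & "repeat for each:" && tTimeB && "ms"
end mouseUp
Run it on a text of realistic size; on a nine-word sentence both timings will probably read 0.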
Pass-1. Count the instances of ‘the’ into an array and then
Pass-2. Repeat for the count of instances using wordOffset, with a wordsToSkip
variable derived from the previous loop’s offset
I’m wondering if there’s something I’ve not yet learned about (nested?) arrays
that might extend the unique word counter code that Alex, Paul & others helped
me to fix a few days ago, to add a sub-array of wordOffset alongside word count?
I'm not entirely sure what you want here, or what the 'N' below are.
Do you want a count and an offsetList for each word ? If so, no need for
nested arrays.
Then I'd change your second loop below to:
put 0 into tOffset
repeat for each trueWord W in tSource
add 1 to tOffset
if tANoise[W] then next repeat
add 1 to tAWordCount[W]
put tOffset & comma after tAWordOffsets[W]
end repeat
and of course the third loop to
repeat for each key K in tAWordCount
put K && tAWordCount[K] & CR after tmp
end repeat
sort lines of tmp descending numeric by word 2 of each
put tmp into fld "Words"
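If you also want the offsets to appear in the field, a variant of that last loop along these lines might do it. Again untested; tReport is just a name I've picked for the example, and it assumes tAWordCount and tAWordOffsets were built by the loop above.
put empty into tReport
repeat for each key K in tAWordCount
   # char 1 to -2 drops the trailing comma left by the building loop
   put K && tAWordCount[K] && (char 1 to -2 of tAWordOffsets[K]) & CR after tReport
end repeat
sort lines of tReport descending numeric by word 2 of each
put tReport into fld "Words"
For your sample sentence, and assuming 'the' is not in your noise words, the line for 'the' should come out as 'the 2 1,7'.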
If I've misunderstood what you want, please say so and I'll try again :-)
Alex.
# Prepare noisewords array
repeat for each trueWord W in tNoiseWords
put true into tANoise[W]
end repeat
# Build unique words array
repeat for each trueWord W in tSource
if tANoise[W] then next repeat
add 1 to tAWords[W][N]
end repeat
# Convert unique words array to list
repeat for each key K in tAWords
put K && tAWords[K][N] & CR after fld "Words"
end repeat
sort lines of field "Words" descending numeric by word 2 of each
end repeat
Any ideas or steer towards a lesson / worked example greatly appreciated.
Best,
Keith
_______________________________________________
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription
preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode