OK, this time I'm just typing into email - haven't tested these suggestions :-)

On 30/08/2018 10:24, Keith Clarke via use-livecode wrote:
Folks,
Is there a single-pass mechanism or more efficient way of returning the 
wordOffset of each instance of ‘the’ in ‘the quick brown fox jumped over the 
lazy dog’ than to use two passes through the text?
Yes. For a single word myWord

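put 0 into tOffset
repeat forever
  -- the value returned is counted from the skip point, not from the start
  put trueWordOffset(myWord, tSource, tOffset) into tmp
  if tmp = 0 then exit repeat  -- no more matches, so stop looping
  add tmp to tOffset           -- convert to an absolute word position
  put tOffset & comma after tOffsetList
end repeat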

BUT there's a chance that this performs poorly, because of the repeated skipping, so I would also benchmark the simpler:

put 0 into tOffset
repeat for each trueWord W in tSource
  add 1 to tOffset
  if W = myWord then
     put tOffset & comma after tOffsetList
  end if
end repeat
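
A rough, untested sketch of how the two approaches might be timed against each other. The handler names (offsetsBySearch, offsetsByScan, compareOffsetTimings) are made up for illustration, and I'm assuming trueWordOffset honours the wholeMatches property the same way wordOffset does:

```livecode
-- Hypothetical handler names; pWord and pSource stand in for myWord and tSource.
function offsetsBySearch pWord, pSource
   local tSkip, tFound, tList
   -- assumption: trueWordOffset respects wholeMatches, as wordOffset does
   set the wholeMatches to true
   put 0 into tSkip
   repeat forever
      -- the result is counted from the skip point, not from the start
      put trueWordOffset(pWord, pSource, tSkip) into tFound
      if tFound = 0 then exit repeat
      add tFound to tSkip
      put tSkip & comma after tList
   end repeat
   return tList
end offsetsBySearch

function offsetsByScan pWord, pSource
   local tIndex, tList
   put 0 into tIndex
   repeat for each trueWord W in pSource
      add 1 to tIndex
      if W = pWord then put tIndex & comma after tList
   end repeat
   return tList
end offsetsByScan

on compareOffsetTimings pWord, pSource
   local tStart, tListA, tListB, tTimeA, tTimeB
   put the milliseconds into tStart
   put offsetsBySearch(pWord, pSource) into tListA
   put the milliseconds - tStart into tTimeA
   put the milliseconds into tStart
   put offsetsByScan(pWord, pSource) into tListB
   put the milliseconds - tStart into tTimeB
   -- with no destination, put writes to the message box
   put "search:" && tTimeA && "ms, scan:" && tTimeB && "ms, same result:" && (tListA is tListB)
end compareOffsetTimings
```

Calling compareOffsetTimings "the", tSource on a realistically large text should show which loop wins for your data, and whether the two lists agree.
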
Pass-1. Count the instances of ‘the’ into an array and then
Pass-2. Repeat for the count of instances using wordOffset, with a wordsToSkip 
variable derived from the previous loop’s offset

I’m wondering if there’s something I’ve not yet learned about (nested?) arrays 
that might extend the unique word counter code that Alex, Paul & others helped 
me to fix a few days ago, to add a sub-array of wordOffset alongside word count?
I'm not entirely sure what you want here, or what the 'N' below are.
Do you want a count and an offsetList for each word ? If so, no need for nested arrays.

Then I'd change your second loop below to:

repeat for each trueWord W in tSource
   add 1 to tOffset
   if tANoise[W] then next repeat
   add 1 to tAWordCount[W]
   put tOffset & comma after tAWordOffsets[W]
end repeat

and of course the third loop to

repeat for each key K in tAWordCount
   put K && tAWordCount[K] & CR after tmp
end repeat
sort lines of tmp descending numeric by word 2 of each
put tmp into fld "Words"
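
If you want the offsets shown alongside the counts, here is an untested sketch that pulls the three loops into one function and reads both arrays back by the same key. The function name wordReport and the tab-separated layout are just my assumptions for illustration:

```livecode
-- Hypothetical function name; pSource and pNoiseWords are plain text.
function wordReport pSource, pNoiseWords
   local tANoise, tAWordCount, tAWordOffsets, tOffset, tOffsets, tReport
   -- build the noise-word lookup
   repeat for each trueWord W in pNoiseWords
      put true into tANoise[W]
   end repeat
   -- one pass: count each word and remember where it occurred
   put 0 into tOffset
   repeat for each trueWord W in pSource
      add 1 to tOffset
      if tANoise[W] is true then next repeat
      add 1 to tAWordCount[W]
      put tOffset & comma after tAWordOffsets[W]
   end repeat
   -- both arrays share the same keys, so no nesting is needed
   repeat for each key K in tAWordCount
      put tAWordOffsets[K] into tOffsets
      delete the last char of tOffsets -- drop the trailing comma
      put K & tab & tAWordCount[K] & tab & tOffsets & CR after tReport
   end repeat
   -- tab-delimited so the commas in the offset list don't split items
   set the itemDelimiter to tab
   sort lines of tReport descending numeric by item 2 of each
   return tReport
end wordReport
```

Usage would be something like: put wordReport(tSource, tNoiseWords) into fld "Words"
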
If I've misunderstood what you want, please say so and I'll try again :-)

Alex.


# Prepare noisewords array
repeat for each trueWord W in tNoiseWords
   put true into tANoise[W]
end repeat

# Build unique words array
repeat for each trueWord W in tSource
   if tANoise[W] then next repeat
   add 1 to tAWords[W][N]
end repeat

# Convert unique words array to list
repeat for each key K in tAWords
   put K && tAWords[K][N] & CR after fld "Words"
end repeat

sort lines of field "Words" descending numeric by word 2 of each

end repeat

Any ideas or steer towards a lesson / worked example greatly appreciated.
Best,
Keith
_______________________________________________
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription 
preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode

