Have pondered the question, and come up with some code which may or may not solve the problem at hand, but which may at least prove helpful in looking for a real solution:
==========================
Assumption: You've got a text document (not HTML, not RTF, just plain TXT) which contains, among other things, however-many place names.
Assumption: You have a return-delimited list of place names, which may or may not be single words.
Assumption: The text document is in the variable SourceDoc.
Assumption: The list of place names is in the variable NamesList.
Assumption: You want a document which contains a complete census of exactly which of the place-names in NamesList occur in SourceDoc.
Assumption: For each place-name which does occur within SourceDoc, you want a list of the word-numbers at which each such occurrence begins.

put "" into PlaceNamesCensus
repeat for each line DisName in NamesList
  put the number of words in DisName into DisNameWords
  put 0 into SearchOffset
  put "" into FoundLocs
  repeat
    -- offset() skips the first SearchOffset chars and returns a position relative to that skip
    put offset(DisName, SourceDoc, SearchOffset) into DisLoc
    if DisLoc = 0 then
      -- there is no (further) character string which matches the place name in question
      exit repeat
    else
      -- there is a character string which matches the place name in question
      -- is it the actual placename, and not finding "chester" in "colchester"?
      put SearchOffset + DisLoc into AbsLoc
      put the number of words in (char 1 to AbsLoc of SourceDoc) into StartWord
      if DisName = (word StartWord to (StartWord + DisNameWords - 1) of SourceDoc) then
        -- it's a match, yay!
        put StartWord into item (1 + the number of items in FoundLocs) of FoundLocs
      end if
      -- SearchOffset now equals AbsLoc, so the next search starts just past this hit
      add DisLoc to SearchOffset
    end if
  end repeat
  if FoundLocs = "" then
    -- nope, DisName wasn't in SourceDoc
    put "[nil]" into DeseLocs
  else
    -- yay! DisName *was* in SourceDoc! at least once!
    put FoundLocs into DeseLocs
  end if
  put DisName & comma & DeseLocs into line (1 + the number of lines in PlaceNamesCensus) of PlaceNamesCensus
end repeat
==========================

Known issue: The above code does not pretend to locate possessive instances of place names (e.g., California's, the United Kingdom's). Am thinking that pre-processing of SourceDoc will be helpful-to-necessary.
This pre-processing may need to accommodate more issues than just possessives.
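A rough sketch of what such pre-processing might look like, offered as a starting point and not a tested solution. The function name StripForMatching is made up for illustration, and the choice of which punctuation to strip is an assumption. It relies on the built-in replaceText function and on the fact that LiveCode's word chunk treats a double-quoted phrase as a single word, which would otherwise throw off the word counts.

```livecode
-- Hypothetical helper, not part of the code above: returns a copy of pText
-- with possessive endings, double quotes, and some clinging punctuation removed.
function StripForMatching pText
   -- quoted phrases count as one "word" in LiveCode, so drop the quote marks
   replace quote with empty in pText
   -- drop possessive endings: California's becomes California
   put replaceText(pText, "'s\b", "") into pText
   -- drop a few common punctuation marks that cling to words (assumed set)
   put replaceText(pText, "[.,;:!?()]", "") into pText
   return pText
end StripForMatching
```

If used, it would probably need to be applied to both variables before running the census, so that a name like "St. Louis" is cleaned the same way on both sides:

put StripForMatching(SourceDoc) into SourceDoc
put StripForMatching(NamesList) into NamesList

The word-numbers in the census would then refer to the cleaned copy of the text, which can differ from the original wherever a quoted phrase or a stand-alone punctuation mark was removed.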