Re: regex question in matchChunk function

2009-12-17 Thread Chris Sheffield
Thanks to all who replied and offered suggestions. I ended up using the find 
command on my field in order to accomplish what I need. While probably not 
super speedy, it seems to be working well. Fortunately the story passages are 
not too long, so the decreased speed is really not that noticeable.

Thanks again,
Chris

--
Chris Sheffield
Read Naturally, Inc.
www.readnaturally.com

___
use-revolution mailing list
use-revolution@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription 
preferences:
http://lists.runrev.com/mailman/listinfo/use-revolution


Re: regex question in matchChunk function

2009-12-17 Thread zryip theSlug
It seems I missed the right thread. Apologies if this is a duplicate
message ;)

To enclose a word without its punctuation, you have to define a list of
substitute strings like this:
- the list of possible forms
 " w ", ",w,", ",w.", ".w,", ".w."
- the substitute list
 " <box>w</box> ", ",<box>w</box>,", ",<box>w</box>.", ".<box>w</box>,", ".<box>w</box>."

With this approach you'll be able to keep your punctuation alive. I'm sure
it'll thank us 8)

To create a list of the possible forms of a whole word, you could:
1) Define the list of punctuation that can start a word
i.e.: colon, space, nothing, comma
2) Define the list of punctuation that can end a word
i.e.: -, colon, dot, comma ...
3) Then combine all the possibilities with two nested loops (okay, it's like
cooking, I presume 8))

So you'll obtain something like this:
-- One character per line rather than per item, because comma is itself
-- one of the characters. The "nothing" case is left out: an empty prefix
-- would also match inside longer words. In htmlText a quote is "&quot;".
put colon & cr & space & cr & comma & cr & "&quot;" into startCharsList
put colon & cr & "." & cr & comma & cr & "-" & cr & space into endCharsList

put "w" into keyWord
put "box" into htmlTag
put empty into wholeWordList
put empty into substituteList

-- Create the list of whole-word forms and the parallel list of substitutes
repeat for each line startChar in startCharsList
   repeat for each line endChar in endCharsList
      put startChar & keyWord & endChar & cr after wholeWordList
      put startChar & "<" & htmlTag & ">" & keyWord & "</" & htmlTag & ">" \
            & endChar & cr after substituteList
   end repeat
end repeat

-- Search for and box one word or a list of words (one word per line)
put the htmlText of fld "yourField" into tText
repeat for each line aWord in wordList
   repeat with i = 1 to the number of lines of wholeWordList
      -- replace the w key in each pattern with the word that you need
      put replaceText(line i of wholeWordList, keyWord, aWord) into wholeWordForm
      put replaceText(line i of substituteList, keyWord, aWord) into substituteForm
      replace wholeWordForm with substituteForm in tText
   end repeat
end repeat
set the htmlText of fld "yourField" to tText

See how it could work?
I'm not sure about the processing time, though...

Not tested, but it's a possibility.

Anyway, you've already found your way, and that is the main thing, so try
this if you like ;)


-Zryip TheSlug- wishes you the best! 8)




regex question in matchChunk function

2009-12-15 Thread Chris Sheffield
I am not very familiar with regular expressions, and I'm wondering if someone 
more knowledgeable could give me a hint as to how to accomplish this.

Given a passage of text, I need to find every instance of certain words within 
that text and draw a box around them. The box drawing I can handle just fine by 
including "box" in the textStyle of the found chunk. But it's finding the 
instances that I'm struggling with. Here is my code. Big warning! This should 
not be run as is, if anyone wants to attempt it. The second repeat will go 
forever.

repeat for each line tWord in tDiffWords
   repeat until matchChunk(tStoryText, "(?i)\b(" & tWord & ")\b", tStartChar, tEndChar) is false

      put the textStyle of char tStartChar to tEndChar of fld "StoryText" into tStyle
      if tStyle is empty or tStyle is "plain" then
         put "box" into tStyle
      else
         put comma & "box" after tStyle
      end if
      set the textStyle of char tStartChar to tEndChar of fld "StoryText" to tStyle

   end repeat
end repeat

What I need is some way to use the matchChunk function and continue the search 
where the last search ended. I read through some regex documentation and came 
across \G, but this doesn't seem to work in Rev. But maybe I'm not putting it 
in the right place in my search string.

Can anyone help? Is there a way to do this? Or can someone recommend another 
method of accomplishing the same thing? Keep in mind that this needs to search 
whole words in a story passage, and we're dealing with all kinds of punctuation 
here, including hyphens, em dashes, etc.
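
One possible way to continue the search where the last match ended is to run matchChunk on the remainder of the text and add the skipped length back to the positions it reports. The following is an untested sketch, not a confirmed Rev idiom; the handler name wholeWordChunks and its parameters are hypothetical. It assumes pWord is not empty and contains no regex metacharacters.

```livecode
function wholeWordChunks pText, pWord
   -- Returns one "start,end" pair per line for every whole-word,
   -- case-insensitive match of pWord in pText.
   -- Assumption: pWord is non-empty and has no regex metacharacters.
   local tSkipped, tStart, tEnd, tResult
   put 0 into tSkipped
   repeat while matchChunk(char (tSkipped + 1) to -1 of pText, \
         "(?i)\b(" & pWord & ")\b", tStart, tEnd)
      -- tStart and tEnd are relative to the remainder that was searched,
      -- so shift them by the number of characters already skipped
      put (tStart + tSkipped) & comma & (tEnd + tSkipped) & cr after tResult
      -- resume just after the end of this match
      add tEnd to tSkipped
   end repeat
   delete last char of tResult
   return tResult
end wholeWordChunks
```

Each returned line could then drive the styling step, for example with "repeat for each line tRange in wholeWordChunks(tStoryText, tWord)" and "char (item 1 of tRange) to (item 2 of tRange) of fld "StoryText"", provided tStoryText holds the same text as the field.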

Thanks,
Chris

--
Chris Sheffield
Read Naturally, Inc.
www.readnaturally.com



Re: regex question in matchChunk function

2009-12-15 Thread Troy Rollins


On Dec 15, 2009, at 1:46 PM, Chris Sheffield wrote:

> Can anyone help? Is there a way to do this? Or can someone recommend
> another method of accomplishing the same thing?


Offset

--
Troy
RPSystems, Ltd.
http://www.rpsystems.net




Re: regex question in matchChunk function

2009-12-15 Thread dunbarx
I am not either, but:

on mouseUp
   get fld "yourField"
   repeat with y = 1 to the number of words in it
      -- yourText is a variable holding the word to look for
      if word y of it = yourText then set the textStyle of word y of fld "yourField" to "box"
   end repeat
end mouseUp

Now this writes to fld yourfield every time it matches. I think if you 
use the htmltext you can work in a variable and set the style all at once. 
Faster.

Craig Newman


Re: regex question in matchChunk function

2009-12-15 Thread Chris Sheffield
Thanks, Troy. Unfortunately, offset doesn't quite work for me, as it does not 
honor the wholeMatches property. So I might search for "use", and it would find 
both "use" and "used", which is not the desired result. However, with some 
extra code I could probably make it work (manually checking for punctuation, 
spaces, etc.). Not pretty, but might work.
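
The manual check described above might look roughly like the following untested sketch. The handler name wholeWordOffsets and the choice of letters and digits as "word characters" are assumptions made for illustration; apostrophes, accented letters and similar cases would need their own handling.

```livecode
function wholeWordOffsets pWord, pText
   -- Returns a comma-delimited list of the start positions of pWord in
   -- pText, keeping only hits that are not glued to a letter or digit.
   local tSkip, tFound, tStart, tEnd, tBefore, tAfter, tList
   put 0 into tSkip
   repeat
      put offset(pWord, pText, tSkip) into tFound
      if tFound = 0 then exit repeat
      -- offset counts from the skipped position, so convert to absolute
      put tSkip + tFound into tStart
      put tStart + length(pWord) - 1 into tEnd
      -- neighbouring characters; empty at the very start or end of pText
      put empty into tBefore
      if tStart > 1 then put char (tStart - 1) of pText into tBefore
      put char (tEnd + 1) of pText into tAfter
      if not matchText(tBefore, "[A-Za-z0-9]") \
            and not matchText(tAfter, "[A-Za-z0-9]") then
         put tStart & comma after tList
      end if
      -- continue searching just after the start of this hit
      put tStart into tSkip
   end repeat
   delete last char of tList
   return tList
end wholeWordOffsets
```

With this check, a search for "use" would skip "used" because the character after the hit is a letter.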

Thanks again,
Chris


--
Chris Sheffield
Read Naturally, Inc.
www.readnaturally.com



Re: regex question in matchChunk function

2009-12-15 Thread Peter Brigham MD
Here is one way. These are utility functions I use constantly for text
processing. Offsets(str,cntr) returns a comma-delimited list of all
the offsets of str in cntr. Lineoffsets(str,cntr) does the same with
lineoffsets. Then you can iterate over the list of offsets to do
whatever you want to each instance of str in cntr. I keep them in a
utility stack that is in the stacksInUse, so it is available to all
stacks. I don't use regex, as I have never gotten the regex syntax to
stick in my head firmly enough to find it natural, and in any case
doing it by script turns out to be as fast or faster.


-- Peter

Peter M. Brigham
pmb...@gmail.com
http://home.comcast.net/~pmbrig

-

function offsets str,cntr
   -- returns a comma-delimited list of
   -- all the offsets of str in cntr
   put "" into oList
   put 0 into startPoint
   repeat
      put offset(str,cntr,startPoint) into os
      if os = 0 then exit repeat
      add os to startPoint
      put startPoint & "," after oList
   end repeat
   if char -1 of oList = "," then delete last char of oList
   if oList = "" then return 0
   return oList
end offsets

function lineOffsets str,cntr
   -- returns a comma-delimited list of
   -- all the lineoffsets of str in cntr
   put offsets(str,cntr) into charList
   if charList = 0 then return 0
   put the number of items of charList into nbr
   put "" into oList
   repeat for each item n in charList
      put the number of lines of (char 1 to n of cntr) \
            & "," after oList
   end repeat
   if char -1 of oList = "," then delete char -1 of oList
   return oList
end lineOffsets

-





Re: regex question in matchChunk function

2009-12-15 Thread J. Landman Gay

Chris Sheffield wrote:

> I am not very familiar with regular expressions, and I'm wondering if
> someone more knowledgeable could give me a hint as to how to
> accomplish this.
>
> Given a passage of text, I need to find every instance of certain
> words within that text and draw a box around them.


All I can think of is to grab the text block and use a series of 
replace commands to replace each punctuation type with a space. That 
should still retain your word boundaries and relative character 
positions. After that, use regex to get the word boundaries. Presumably 
you won't have to box the punctuation.


Ugly, but might work.
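
A rough, untested sketch of that idea follows. It swaps the final regex step for a plain offset search on a space-padded copy, which is a substitution made here for simplicity and not part of the suggestion above. The handler name spacedWordOffsets and the punctuation list are assumptions; em dashes and other characters would have to be added to the list for real story text.

```livecode
function spacedWordOffsets pWord, pText
   -- Works on a copy of the text in which each punctuation character is
   -- replaced by a single space, so character positions stay the same.
   local tPunctuation, tPunct, tSkip, tFound, tList
   put ".,;:!?()-" & quote into tPunctuation
   repeat for each char tPunct in tPunctuation
      replace tPunct with space in pText
   end repeat
   replace cr with space in pText
   replace tab with space in pText
   -- pad both ends so the first and last words are also space-delimited;
   -- this shifts every position in the copy by one
   put space & pText & space into pText
   put 0 into tSkip
   repeat
      put offset(space & pWord & space, pText, tSkip) into tFound
      if tFound = 0 then exit repeat
      -- the hit starts at the leading space of the padded copy, which is
      -- exactly the position of the word's first char in the original
      add tFound to tSkip
      put tSkip & comma after tList
   end repeat
   delete last char of tList
   return tList
end spacedWordOffsets
```

Because only single characters are swapped for single spaces, the returned positions can be applied directly to the untouched field text.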

--
Jacqueline Landman Gay | jac...@hyperactivesw.com
HyperActive Software   | http://www.hyperactivesw.com