Re: MatchText

2010-10-08 Thread DunbarX
I am no expert, but this:

[abcd]

will match any one char of "abcd".

I think.
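
For the question that started this thread (does the text contain ".", "!" or "?"), a character class is probably all that is needed. Below is a minimal sketch in Python rather than Rev, on the assumption that the pattern carries over because both engines use Perl-style syntax. The name contains_end_punct is made up for illustration.

```python
import re

# Inside [...] the dot, "!" and "?" are literal, so no backslashes are needed.
END_PUNCT = re.compile(r"[.!?]")

def contains_end_punct(text: str) -> bool:
    """Return True if text contains at least one of ".", "!" or "?"."""
    return END_PUNCT.search(text) is not None
```

In Rev the same pattern would presumably be passed as the second argument, e.g. matchText(tValue,"[.!?]").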

Craig Newman
___
use-revolution mailing list
use-revolution@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription 
preferences:
http://lists.runrev.com/mailman/listinfo/use-revolution


Re: MatchText

2010-10-08 Thread Mike Bonner
A period matches any single char; an asterisk matches zero or more
consecutive repeats of the previous char.  I.e., p* will match "", "p",
"pp", and so on.  Since . will match anything (with a couple of exceptions,
such as a line break), .* will match any combination of digits, chars,
whatever.  To match letters only you use sets like so: [a-zA-Z]* which will
match any run of upper or lower case letters, however many follow in a row.
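
A quick way to see these rules in action is to ask what each pattern matches at the start of a string. This is a sketch in Python, assuming its Perl-style regex behaves like Rev's PCRE for these basics; leading_match is a hypothetical helper, not a Rev function.

```python
import re

def leading_match(pattern: str, text: str) -> str:
    """Return what the pattern matches at the very start of text."""
    m = re.match(pattern, text)
    return m.group(0) if m else ""

star = leading_match(r"p*", "pppq")            # a run of "p"
empty = leading_match(r"p*", "qqq")            # zero "p" still counts as a match
anything = leading_match(r".*", "abc 123!")    # "." stops only at a line break
letters = leading_match(r"[a-zA-Z]*", "Hello42")  # letters only, stops at "4"
```

Note that p* succeeds even when there is no "p" at all, which is why a bare x* rarely makes a useful test on its own.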

This page is a really good place to get a grip on the easier regex
http://www.regular-expressions.info/quickstart.html


On Fri, Oct 8, 2010 at 11:50 AM, Warren Kuhl wrote:

> What is the syntax of the matchtext command to search a variable for
> multiple characters (".", "!" or "?").  If it contains any characters in my
> list, it returns a TRUE?
>
> I am able to get it working with one value...
>
> answer MatchText(tValue,"(\.)")
>
> Thanks for any help!
> Warren


Re: matchText, does it really exist?

2008-07-10 Thread Mark Schonewille

put

--
Best regards,

Mark Schonewille

Economy-x-Talk Consulting and Software Engineering
http://economy-x-talk.com
http://www.salery.biz

Benefit from our inexpensive hosting services. See http://economy-x-talk.com/server.html 
 for more info.


On 10 jul 2008, at 22:12, Bert Shuler wrote:

This code is from the docs, but seems to fail as if matchText is not  
a function.


on mouseUp
 matchText("Goodbye","bye")
end mouseUp

executing at 4:09:47 PM
Type    Handler: can't find handler
Object  Button
Line    matchText("Goodbye","bye")
Hint    matchText

Any Ideas?

Bert





Re: matchText, does it really exist?

2008-07-10 Thread Bert Shuler

worked. Thanks for the quick help.

Bert




Re: matchText, does it really exist?

2008-07-10 Thread Jan Schenkel
--- Bert Shuler <[EMAIL PROTECTED]> wrote:
> This code is from the docs, but seems to fail as if
> matchText is not a  
> function.
> 
> on mouseUp
>   matchText("Goodbye","bye")
> end mouseUp
> 
> executing at 4:09:47 PM
> Type    Handler: can't find handler
> Object  Button
> Line    matchText("Goodbye","bye")
> Hint    matchText
> 
> Any Ideas?
> 
> Bert
> 

Hi Bert,

'matchText' is a function, not a command - so use it
like this:
##
on mouseUp
  answer matchText("Goodbye","bye")
end mouseUp
##

Admittedly, the docs for other functions sometimes
contain a 'put xxx' example, whereas the entry for
'matchText' does not.

Jan Schenkel.

Quartam Reports & PDF Library for Revolution


=
"As we grow older, we grow both wiser and more foolish at the same time."  (La 
Rochefoucauld)


  


Re: matchText and accented characters

2007-10-17 Thread Chris Sheffield
Thanks, Ken. Using the hex equivalents is an interesting suggestion.  
I may look into that further.


As for replacing the accented characters with their non-accented  
equivalents, that is also something I've done in the past, but the  
problem here is that this is Mac/PC cross platform, so it's quite a  
few extra lines of code.


So I decided to simply try the offset function, with wholeMatches set  
to true (although I can't really determine if wholeMatches affects  
offset or not), and that seems to be working fine for me. Still  
testing it out to make sure, but so far so good.


Thanks again for the suggestions.




Re: matchText and accented characters

2007-10-16 Thread Ken Ray
On Tue, 16 Oct 2007 12:18:54 -0600, Chris Sheffield wrote:

> Thanks, Andres. But that didn't seem to fix the problem. That 
> property, according to the docs, only seems to apply to the numToChar 
> and charToNum functions. I did try it just to make sure.

The issue is that PCRE (which is the lib that Rev uses) *optionally* 
supports locales, so I don't know if any locales were compiled into the 
code that Rev uses. If you knew what you were looking for, you could 
replace the accented characters with their hex equivalents and you'd 
get a match:

  put matchChunk(fld 1,".*(fianc\x8E).*",tStart,tEnd)

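in this case "\x8E" means "use hex code 8E", which is character 142. That 
is beyond 7-bit ASCII, so what it stands for depends on the platform 
encoding: it is é in MacRoman (at least on my Mac), whereas on Windows é 
is typically 233 (hex E9). To determine this, I ran this code: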

  put baseConvert(charToNum("é"),10,16)

which gave me "8E". So if you know specifically the characters to 
match, you can use this.
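
To make the platform dependence concrete, here is a small sketch in Python (not Rev). It assumes the text is held as single-byte MacRoman or Windows-1252 data; "mac_roman" and "cp1252" are Python codec names, and find_fiance is a made-up helper.

```python
import re

E_ACUTE = "\u00e9"  # the character é

mac_byte = E_ACUTE.encode("mac_roman")  # legacy Mac encoding
win_byte = E_ACUTE.encode("cp1252")     # Windows Western encoding

def find_fiance(raw: bytes):
    """Search raw bytes for "fianc" followed by the byte 0x8E."""
    return re.search(rb"fianc\x8e", raw)

hit = find_fiance("my fianc\u00e9 called".encode("mac_roman"))
miss = find_fiance("my fianc\u00e9 called".encode("cp1252"))
```

The same pattern finds the word in the Mac-encoded bytes and misses it in the Windows-encoded ones, so a cross-platform script would need one hex code per platform.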

On the other hand, if you have a big chunk of text and you don't know 
if there are accented chars or not, I would personally run it the 
"brute force" way: 

1) put a copy of the text into another variable
2) replace the accented chars with their non-accented counterparts - a 
dozen or so lines like:
   - replace "é" with "e" in myVar
   - replace "ó" with "o" in myVar
   - etc.
3) run your 'matchChunk' on the second "clean" variable using 
non-accented text (look for "fiance" and not "fiancé")
4) if you get a hit, use the startChar/endChar variables from the 
'matchChunk' to extract the text from the *first* variable (the one 
with the accented text)
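
The four steps above can be sketched as follows. This is Python rather than Rev, and it assumes each accented character is a single character replaced by exactly one plain character, so offsets in the clean copy line up with the original. ACCENT_MAP and find_ignoring_accents are illustrative names only.

```python
import re

# One-to-one replacements keep every character at the same offset.
ACCENT_MAP = str.maketrans("\u00e1\u00e9\u00ed\u00f3\u00fa", "aeiou")

def find_ignoring_accents(pattern: str, text: str):
    """Match on an accent-free copy, then slice the original text."""
    clean = text.translate(ACCENT_MAP)   # steps 1 and 2
    m = re.search(pattern, clean)        # step 3
    if m is None:
        return None
    return text[m.start():m.end()]       # step 4: extract from the original
```

Searching for "fiance" this way returns the accented word from the original text, because the start and end positions are the same in both copies.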

Just my 2 cents,

Ken Ray
Sons of Thunder Software, Inc.
Email: [EMAIL PROTECTED]
Web Site: http://www.sonsothunder.com/


Re: matchText and accented characters

2007-10-16 Thread Chris Sheffield
Thanks, Andres. But that didn't seem to fix the problem. That  
property, according to the docs, only seems to apply to the numToChar  
and charToNum functions. I did try it just to make sure.




Re: matchText and accented characters

2007-10-16 Thread Andres Martinez

Hello Chris

I think you need to check on the unicode setting.

Use the following line before your search...

set the useUnicode to true

Regards,
Andres Martinez
www.baKno.com




Re: matchText and accented characters

2007-10-16 Thread Chris Sheffield
Sorry, I'm using matchChunk, not matchText. But maybe the solution is  
the same?


On Oct 16, 2007, at 11:49 AM, Chris Sheffield wrote:

The matchText function seems to be failing when searching for  
accented characters like á, é, í, ó, or ú. I'm not really up on my  
regex. Is there something special I need to do to make these  
characters work? For example, one search I'm performing is for the  
word "fiancé".


Thanks,
Chris


--
Chris Sheffield
Read Naturally, Inc.
www.readnaturally.com
www.oneminutereader.com

Watch reading achievements rise with Read Naturally's school-to- 
home program, One Minute Reader. Make reading fun straight from  
your classroom right to their home!




Re: matchtext question using regex

2007-05-05 Thread Jim Ault
try studying the "|" symbol, which is OR
There are many ways of using it with strings and substrings and patterns.
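
For the quoted question about finding word A within a few words of word B, one possible pattern counts "a word plus whatever non-word characters follow it" instead of counting characters. The sketch below is Python, assuming \b, \w and \W behave the same way in Rev's PCRE; the function near is hypothetical.

```python
import re

def near(first: str, second: str, max_gap: int, text: str) -> bool:
    """True if second follows first with at most max_gap words in between."""
    pattern = (
        rf"\b{re.escape(first)}\W+"      # first word, then spaces/punctuation
        rf"(?:\w+\W+){{0,{max_gap}}}"    # up to max_gap intervening words
        rf"{re.escape(second)}\b"        # second word
    )
    return re.search(pattern, text) is not None
```

Because \W+ accepts any run of non-word characters, commas and periods between words are handled the same way as spaces.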

Jim Ault
Las Vegas


On 5/4/07 10:41 AM, "ron" <[EMAIL PROTECTED]> wrote:

> Regex question for use in matchtext
> 
> I want to find word A followed by word B.  (quickly)
> So:
> put "this is my big dog called cat." into thetext
> put "my.{0,5}dog" into reg
> 
> And
> put matchtext(thetext,reg)
> returns true because I use a period so it is counting characters but I
> need it to count words. I have tried various combinations of \b and \w
> to no avail.
> 
> Something like :
> put "my([^ ]* ){0,5}dog" into reg
> works but only for words followed by spaces, not punctuation for
> example. These could be included but surely there is a more elegant and
> faster way?
> 
> Can someone help me out with this?
> 
> 
> BTW, is it true that setting the wholematches to true and using
> wordoffset only returns 'words' that are followed by a space? So that
> in the example sentence above, 'cat' is not found because it is
> followed by a period? Is this correct?
> 
> 
> Thanks
> Ron
> 


Re: MatchText, MatchChunk and the needle in the haystack

2007-03-21 Thread Peter Alcibiades
There is a wonderful book I just found for this sort of thing, and am working 
through:  Minimal Perl, by Tim Maher.  

Awk is great, terse, powerful, but a bit opaque.  And more up to date people 
always seem to talk about using Perl for what awk always was used for.  Well, 
if you ever felt you too should come up to date, got tired of pitying looks 
when you mentioned awk, took up some materials on Perl and then threw up your 
hands in despair, get Minimal Perl.  Clear, practical, easy, and with a focus 
on one liners of exactly the sort that you'd use in the situation on this 
thread. 

You could call it 'Perl for the rest of us': text manipulation without 
tears.

Peter


Re: MatchText, MatchChunk and the needle in the haystack

2007-03-21 Thread Jim Ault
On 3/21/07 1:32 AM, "Peter Alcibiades" <[EMAIL PROTECTED]>
wrote:

> Can you do it with a text editor and regular expressions?  I'm genuinely
> diffident about asking, because you all have so much more experience that if
> it were this easy, you'd have suggested it.


My basic approach for this kind of question is to assume that users have
very little experience with regular expressions combined with knowing very
little about the data set they are mining.

Also, the question they actually ask on the list is just one part of the
overall task.  Given these three things, I like to propose tools that let
them "see" some of the pitfalls that making incorrect assumptions about the
data can create.  One pitfall is assuming all occurrences of the date
string will be correctly formatted and intact.

I guess I look at it as 'what will help them build a tool they can trust'.

Don't get me wrong, I like and use regEx in a few of my apps for effectively
extracting clean data from a variety of web sites. I like its power and
flexibility.

As you say, if the user already knew some of the simpler regEx, the question
probably would not have appeared on the list.

I cannot speak for others on the list, but it seems that those who venture
into regEx only occasionally, get frustrated and are better off using the
chunking expressions of Rev.  Even when presented with a good regEx answer,
they are not sure what they are looking at.

By the way, nulls will make MatchText, etc fail, so "replace null with empty
in textBlock" needs to be part of the process for unknown data sources.
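
As an illustration of that cleanup step, here is a small Python sketch. Python's own regex engine does not fail on nulls the way MatchText is described as doing, so the example only shows the sanitizing pass and how a stray null inside a date would hide it from the pattern. The names and sample data are invented.

```python
import re

DATE_PATTERN = r"\d\d-[A-Za-z]{3}-\d\d"

def strip_nulls(raw: str) -> str:
    """Drop NUL characters, the equivalent of 'replace null with empty'."""
    return raw.replace("\x00", "")

dirty = "02-\x00Mar-92some\x00text"   # made-up sample with embedded nulls
clean = strip_nulls(dirty)
found = re.search(DATE_PATTERN, clean)
```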

As far as using a text editor, that is usually my first step.  I like BBEdit
on an OSX platform, so I agree with your basic premise, start simple and
build up.

Nice to know you are paying attention to the big picture  :-)
Good post.

Jim Ault
Las Vegas




Re: MatchText, MatchChunk and the needle in the haystack

2007-03-21 Thread Peter Alcibiades
Can you do it with a text editor and regular expressions?  I'm genuinely 
diffident about asking, because you all have so much more experience that if 
it were this easy, you'd have suggested it.  But anyway, is there something 
wrong with the following?

I made up a fragment of a file like this in the form
02-Mar-92sometext01-Sep-04somemore textand a few more entries of the 
same sort.

Then opened it in Kate (but presumably all programming editors have similar 
functionality?)

Then did a match with regular expressions in the Find part of the menu.  It 
helped construct the following expression:

[\d][\d]-[\D][\D][\D]-[\d][\d]

which really would not have been so very hard to figure out unaided - a 
classic case of the obligatory gui getting in the way of your typing.  This 
picks up all dates and it obviously misses other hyphenated expressions.

Then in the replace section I put

Enter\0

It uses the \0 as backwards reference, so to include all the found string in 
the replacement.

The only hard part, all of ten seconds, was that I didn't seem able to enter a 
line feed character directly, like by \n for instance, but I just copied and 
pasted one and bingo, it worked fine.  I ended up with a bunch of lines like 
this:

02-Mar-92sometext
01-Sep-04somemore text..and so on.

Was that what was wanted?
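
The same find-and-replace can be sketched outside an editor. This is Python, not Kate or Rev; it swaps \D for [A-Za-z] as a slightly tighter assumption about the month, and it uses \g<0>, which is Python's way of referring to the whole match. The function name is illustrative.

```python
import re

# A tighter form of [\d][\d]-[\D][\D][\D]-[\d][\d]:
# \D would accept any non-digit, including punctuation.
DATE = re.compile(r"\d\d-[A-Za-z]{3}-\d\d")

def one_record_per_line(text: str) -> str:
    """Put a line feed in front of every date found in text."""
    # The replacement is a line feed followed by the whole matched date.
    return DATE.sub(r"\n\g<0>", text).lstrip("\n")
```

The final lstrip only removes the line feed that would otherwise appear before the very first record.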

This was almost instant.  I guess if I'd a lot to do, I would think of an awk 
one liner, but have forgotten how to do backward references in awk.  And it 
would be even more embarrassing to have both got the above all wrong and to 
also cite duff awk scripts!

Peter


Re: MatchText, MatchChunk and the needle in the haystack

2007-03-20 Thread Jim Ault
On 3/20/07 9:42 AM, "Devin Asay" <[EMAIL PROTECTED]> wrote:

> Wait, when is it ever *not* sunny in Las Vegas? ;-)
Very seldom.  About twice a year we will have 3 days in a row of cloudy
weather.

Jim




Re: MatchText, MatchChunk and the needle in the haystack

2007-03-20 Thread Devin Asay


On Mar 20, 2007, at 9:29 AM, Jim Ault wrote:

>> On Mar 20, 2007, at 4:12 AM, Bryan McCormick wrote:
>> Jim, Dave, Devin
>>
>> Thanks for your help in making me think harder about this. I literally
>> woke up out of a dream this morning and knew right away what was wrong
>> with the script. There was one error that would have persistently been a
>> problem that I have fixed now.

Glad I was able to help jostle some brain cells.

> I love the mornings when I wake up and realize the answer to a programming
> puzzle.

Amen!

> No matter what the weather, it is a sunny day for me :-)

Wait, when is it ever *not* sunny in Las Vegas? ;-)

Devin

Devin Asay
Humanities Technology and Research Support Center
Brigham Young University



Re: MatchText, MatchChunk and the needle in the haystack

2007-03-20 Thread Jim Ault
> Jim, Dave, Devin
> 
> Thanks for your help in making me think harder about this. I literally
> woke up out of a dream this morning and knew right away what was wrong
> with the script. There was one error that would have persistently been a
> problem that I have fixed now.

Glad it worked out so well.  Data mining is a tricky business, especially if
the originator allows delimiters to also be content (such as commas and
hyphens).  The one change I would make in your routine is to use a tab
instead of a comma as the delimiter, since the comma is such a common
character, but that depends on your data set.  I assume that you are not
encountering any commas in the data.

I love the mornings when I wake up and realize the answer to a programming
puzzle.  No matter what the weather, it is a sunny day for me :-)

Jim Ault
Las Vegas



Re: MatchText, MatchChunk and the needle in the haystack

2007-03-20 Thread Bryan McCormick

Jim, Dave, Devin

Thanks for your help in making me think harder about this. I literally 
woke up out of a dream this morning and knew right away what was wrong 
with the script. There was one error that would have persistently been a 
problem that I have fixed now.


In the interests of anyone else who encounters a similar horrible string 
task, the solution is provided below.


One more thing. You all get credit for making me think harder about what 
else was in the files that might have been a random char throwing things 
off.


Now, I did go and change the script to make it simpler. I realized I 
only needed to find the hyphen at the start of the date and simply 
advance forward past the next hyphen in the date string. Since we were 
dealing with fixed length records forward from the first hyphen (three 
char month, hyphen, two char year) this was the simplest way.


Genius? I thought so.

As luck would happen I had hit upon the few records that were problem 
children right off the bat.


It turned out that a few of the records had the word "in-line" with a 
hyphen which threw off the whole thing. So there is a separate script 
when the file is read in that checks now for nulls, odd-ball ascii 
codes, and our friend "in-line". I was lucky in this case that the 
records were so simple. The alternative would have been to keep the 
"-Jan-...-Dec-" chunks and walk through the file 12 times. No big deal I 
suppose and it could always be done that way if one had different chunks 
to search for.


Anyway, here is the finished script with comments. I hope it helps 
others who might have similar issues. I have over 5000 of these files to 
do which will now take about ten minutes versus the agony (and days) I'd 
have had to endure if there had been no community here to draw upon for 
help and if rev was not so darn handy.


By the way the script that adds the return character also puts in a 
comma in the right place after the date so that I have another delimiter 
to work with and the record in the end is comma delimited with a return 
character as the record marker. Much better than the ugly long single 
string I started out with.


Thanks All.

--


on mouseUp
  put fld 1 into textBlock
  put makeOffsets("-",textBlock,1) into varOffsets
  sort lines of varOffsets numeric descending
  -- this is the only way it works as otherwise the char count gets thrown
  -- off. essentially we are working up from the end of the string forward
  repeat for each line varRecord in varOffsets
    put char varRecord-2 to varRecord-1 of textBlock into eval
    if char 1 of eval is a number and char 2 of eval is a number then
      -- two-digit day
      put comma after char varRecord+6 of textBlock
      put cr before char varRecord-2 of textBlock
    else
      if char 1 of eval is not a number and char 2 of eval is a number then
        -- one-digit day
        put comma after char varRecord+6 of textBlock
        put cr before char varRecord-1 of textBlock
      end if
    end if
  end repeat
  put textBlock into fld 1
end mouseUp

function makeOffsets varChunk,textBlock,posStart
  if posStart = empty then
    put 1 into pos
  else
    put posStart into pos
  end if
  repeat until varOffset = 0
    put offset(varChunk, textBlock, pos) into varOffset
    if varOffset <> 0 then
      put varOffset+pos&return after newText
      -- this was what was mucked-up in the original script
      -- have to add the prior pos to the new one since we
      -- are using the "skip chars" option and need to
      -- add the prior position to the new relative pos
      add varOffset+length(varChunk)+6 to pos
      -- i could get away with adding a fixed number in this
      -- case since the date was never going to be shorter than
      -- six chars + the found offset + chunk, ("-") in this case
    else
      exit repeat
    end if
  end repeat
  return newText
end makeOffsets
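
For readers who want to try the idea outside Rev, here is a rough Python sketch of the same approach: find each hyphen that follows a day number, then work from the end of the string backwards so earlier positions stay valid while text is inserted. It is not a line-by-line port; split_records is a made-up name, and it assumes dates always look like d-Mon-yy or dd-Mon-yy.

```python
def split_records(text: str) -> str:
    """Put a line break before each date and a comma after it."""
    hyphens = [i for i, ch in enumerate(text) if ch == "-"]
    # Work from the last hyphen to the first so inserts never shift
    # the positions that are still waiting to be processed.
    for i in sorted(hyphens, reverse=True):
        if i == 0 or not text[i - 1].isdigit():
            continue  # not the hyphen that follows the day number
        # The day is one or two digits long.
        day_start = i - 2 if i >= 2 and text[i - 2].isdigit() else i - 1
        year_end = i + 7  # index just past "-Mon-yy"
        text = text[:year_end] + "," + text[year_end:]
        text = text[:day_start] + "\n" + text[day_start:]
    return text.lstrip("\n")
```

Requiring a digit before the hyphen is what lets a word such as "in-line" pass through untouched. A record whose text ends in a digit right before a one-digit day would still be ambiguous.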


Re: MatchText, MatchChunk and the needle in the haystack

2007-03-19 Thread Devin Asay


On Mar 19, 2007, at 11:24 AM, Bryan McCormick wrote:


Jim,

Thanks for the script snippet. It didn't quite work as shown, but  
it did get me to think about the problem more carefully. I came up  
with this:


put "-Jan-,-Feb-,-Mar-,-Apr-,-May-,-Jun-,-Jul-,-Aug-,-Sep-,-Oct-,-Nov-,-Dec-" into mthStrings


Bryan,

Is it possible that the original text string is not using hyphens 
consistently? Could there perhaps be en-dash and/or em-dash 
characters there, which look just like hyphens in monospaced fonts? 
If the original text was created in MS Word, for example, it often 
auto-substitutes en- or em-dashes for hyphens.


Just a thought.
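
A quick way to test this idea is to look for dash look-alikes and fold them into plain hyphens before searching. The sketch below is in Python and assumes the text has already been decoded to Unicode; the helper names are made up for the example. In an older single-byte encoding the en and em dash would have different byte values, so the table would need adjusting.

```python
# En dash (U+2013) and em dash (U+2014) mapped to a plain hyphen.
DASHES = {"\u2013": "-", "\u2014": "-"}

def find_odd_dashes(text):
    """Return (position, character) for each en or em dash found."""
    return [(i, ch) for i, ch in enumerate(text) if ch in DASHES]

def normalize_dashes(text):
    """Replace en and em dashes with ordinary hyphens."""
    return text.translate(str.maketrans(DASHES))

print(find_odd_dashes("06\u2013Mar\u201392"))
print(normalize_dashes("06\u2013Mar\u201392"))
```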

Devin

Devin Asay
Humanities Technology and Research Support Center
Brigham Young University



Re: MatchText, MatchChunk and the needle in the haystack

2007-03-19 Thread Jim Ault
On 3/19/07 10:49 AM, "Bryan McCormick" <[EMAIL PROTECTED]> wrote:

> Dave,
> 
> Sadly it does not impact the outcome. Mind you I tried it just in case.
> 
> I have played with all the vars that I can think of and it does nothing.
> It does not even appear to matter (as I thought) if there are multiple
> months (i.e "-Jan-") of the same type in a row (thought it might be
> finding the first and missing the others somehow, but no), it doesn't
> matter if the date is of xx-Month-xx, or x-Month-XX, nor does it matter
> the order or how often these appear in the string and it doesn't seem to
> matter how long or short the record or the file happens to be.
> 
> It should work as far as I can see. I am stumped at this point. It is an
> error for sure (on my end) it is just really subtle it seems. Or it will
> be until someone points the magic finger and says "here it is you idiot"!

A couple ideas 

-1--- make sure that you are not changing the length of the textBlock with
replacements.  This could accumulate to a significant offset error,
depending on how you build your loops.

-2--- test for null chars [ASCII 00] used in some file formats
put length(textBlock) into origCharCnt
replace null with empty in textBlock
answer  length(textBlock) - origCharCnt

-3--- do inspections to see if something is creating a false hit or false
miss

put the number of lines in textBlock into foundMth

repeat 
   replace "-Jan-" with cr &"Jan-" in textBlock
   get the number of lines in textBlock - sum(foundMth)
   put it &","after foundMth
   breakpoint
--now inspect the textBlock & foundMth for any odd occurrences

-- an option is to filter textBlock without "*Jan-", thus purging as you go

end repeat


Jim Ault
Las Vegas




Re: MatchText, MatchChunk and the needle in the haystack

2007-03-19 Thread Dave

Hi,

Not 100% sure, but should you start from 1? e.g.


put 0 into pos


should be:

 put 1 into pos

All the Best
Dave

On 19 Mar 2007, at 17:24, Bryan McCormick wrote:



-- note that i added a third param in case i need to "force" the  
routine to start elsewhere. it is set to 0 when i run this on the  
string in question (which by the way is about 5000 chars long)


function makeOffsets mth,textBlock,posStart
  if posStart = empty then
    put 0 into pos
  else
    put posStart into pos
  end if
  repeat until varOffset = 0
    put offset(mth, textBlock, pos) into varOffset
    if varOffset <> 0 and varOffset <> posStart then
      if pos <> 0 then
        put pos&return after newText
      end if
      add varoffset+length(mth)+1 to pos
    else
      exit repeat
    end if
  end repeat
  return newText
end makeOffsets





Re: Re: MatchText, MatchChunk and the needle in the haystack

2007-03-19 Thread Bryan McCormick

Jim,

Thanks for the script snippet. It didn't quite work as shown, but it did 
get me to think about the problem more carefully. I came up with this:


put "-Jan-,-Feb-,-Mar-,-Apr-,-May-,-Jun-,-Jul-,-Aug-,-Sep-,-Oct-,-Nov-,-Dec-" into mthStrings



-- i seemed to need to separate the routine here, running it with the 
loops as shown didn't function as expected.



repeat for each item mth in mthStrings
  put makeOffsets(mth,textBlock) after varOffsets
end repeat

sort lines of varOffsets numeric

-- note that i added a third param in case i need to "force" the routine 
to start elsewhere. it is set to 0 when i run this on the string in 
question (which by the way is about 5000 chars long)


function makeOffsets mth,textBlock,posStart
  if posStart = empty then
    put 0 into pos
  else
    put posStart into pos
  end if
  repeat until varOffset = 0
    put offset(mth, textBlock, pos) into varOffset
    if varOffset <> 0 and varOffset <> posStart then
      if pos <> 0 then
        put pos&return after newText
      end if
      add varoffset+length(mth)+1 to pos
    else
      exit repeat
    end if
  end repeat
  return newText
end makeOffsets



There is another routine that then does some manipulation on the 
returned offsets since I need to put the return in BEFORE the date and 
as luck would have it the day part of the date (format is 
day-month-year) is not always two characters so I had to add in a 
routine to check for numerics back from the offset position.


Here is the odd thing though. As far as I can see the script should work 
perfectly on a string without any delims and a bunch of dates in it. 
Oddly this is not the case.


It mostly works (which means I've made a mistake or the file isn't quite 
as neat as I think it is) but gets thrown off and does not find offsets 
that it should. It does not seem to matter how long or short the record 
is nor does it happen consistently in the same place. But it always 
happens. I've looked for possible length errors (did I overshoot a 
record) but that does not seem possible or the whole thing would be broken.


What happens is, randomly it seems, some lines contain multiple records 
in a single string.


Thoughts greatly appreciated.

I could (and probably will) write another routine for expediency to walk 
through the lines of the partially correct records to see if there is 
another date line item in it, but I have to say I am stumped as to how 
it could be skipping over some records and then finding them just fine 
after the error occurs.


I checked for random oddball chars and confirmed that the dates not 
found are in fact properly formatted as x or xx-JAN-xx.


And oh yes, I am able to find the offset("-Nov-", fld 1) in the field 
that the resulting partially recovered list is placed in. So it does not 
appear to be an offset bug, not one that I can see anyway.




Re: MatchText, MatchChunk and the needle in the haystack

2007-03-18 Thread Jim Ault
A simplistic approach would be to find the "-mth-" string and work from
there

on untested
  put "-Jan- -Feb- -Mar-" into mthStrings
  repeat for each word MTH in mthStrings
    put 1 into pos
    repeat until pos = 0
      put offset(mth, textBlock, pos+2) into pos
      put cr into char pos - 2 of textBlock
    end repeat
  end repeat
  --now textBlock should have cr's in the right spots
end untested

Jim Ault
Las Vegas


On 3/18/07 4:12 PM, "Bryan McCormick" <[EMAIL PROTECTED]> wrote:

> Folks,
> 
> I have been given a batch of text files that have had their delimiters
> stripped off (by accident) leaving a single string of text to parse back
> into record delimited form. And yes, of course, there is no back-up so
> it is the strings or nothing.
> 
> I really know very little about using RegEx, but I presume this could at
> least in part solve the problem.
> 
> Basically the only good news is that each record was originally
> delimited in the form of "24-Jan-02" so that as long as each date could
> be plucked out of the string it ought to be possible to grab the offset
> and then introduce a return before the next date occurrence. As in the
> text is 06-Mar-92therewasamangledbitoftexttodealwith02-Apr-92therest...
> 
> I cannot seem to get the MatchText to work properly to identify these,
> but I guess really the problem is I still need to find an offset for
> each. Is MatchText even the right thing to use? Can I use it in
> conjunction with 
> offset(MatchText(myVar,[0-9]-(Jan|Feb|Mar...|Dec)-[0-9],someVar)) to
> find each one?
> 
> Or is this a case where the string has to be brute forced?
> 
> Any ideas on how to proceed? Any ideas, snippets would be greatly
> appreciated.


Re: matchText

2007-01-21 Thread Gordon Tillman

Howdy Robert,

On Jan 21, 2007, at 11:14, Robert Mann wrote:


> Don't think I am getting the syntax correct; this is returning nothing
> matchText("tword1","^tword$")
> tword1 and tword are both variables



Yep, for the second argument to the matchText() function, you need to 
construct the regular expression so that it contains "^" as the first 
character, followed by the contents of your variable, followed by the 
"$" character.  In RR you can concatenate strings with the "&" 
operator, so you could do something like this for your second argument:


"^" & tword & "$"

For your first argument, you would just use tword1 by itself.  If 
you put it in quotes like in your example, you are matching the 
literal string "tword1".
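
For comparison, the same concatenation can be sketched in Python, whose re module uses similar "^" and "$" anchors. This is only an illustration and not Revolution syntax; the function name is_exact_match is made up, and the call to re.escape is an addition that the post above does not mention. It keeps characters such as "." in the variable from being read as regex operators.

```python
import re

def is_exact_match(candidate, word):
    """True only when candidate consists of exactly the text in word."""
    # Build the pattern from the variable's contents, not its name.
    pattern = "^" + re.escape(word) + "$"
    return re.search(pattern, candidate) is not None

print(is_exact_match("record_id", "record_id"))
print(is_exact_match("orig_record_id", "record_id"))
```

Without the escaping step, a word such as "a.c" would also match "abc", because "." matches any single character.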


--g


RE: matchText

2007-01-21 Thread Robert Mann
Don't think I am getting the syntax correct; this is returning nothing
matchText("tword1","^tword$")
tword1 and tword are both variables


Thanks
Rob

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Gordon Tillman
Sent: Sunday, January 21, 2007 10:42 AM
To: How to use Revolution
Subject: Re: matchText

On Jan 21, 2007, at 09:23, Robert Mann wrote:

> but I just realized that it is also selecting any strings that have
> that pattern, is there a way to only select exact matches?

Robert you just need to add a bit to the regular expression part to  
tell it that you want to match the entire string.

For example:

matchText("record_id","^record_id$")  -- returns true

But

matchText("orig_record_id","^record_id$")  -- returns false

The "^" at the start of the regex binds whatever follows to the very  
beginning of the string. The "$" at the end of the regex binds  
whatever comes before it to the very end of the string.

--gordy


Re: matchText

2007-01-21 Thread Gordon Tillman

On Jan 21, 2007, at 09:23, Robert Mann wrote:

> but I just realized that it is also selecting any strings that have
> that pattern, is there a way to only select exact matches?


Robert you just need to add a bit to the regular expression part to  
tell it that you want to match the entire string.


For example:

matchText("record_id","^record_id$")  -- returns true

But

matchText("orig_record_id","^record_id$")  -- returns false

The "^" at the start of the regex binds whatever follows to the very  
beginning of the string. The "$" at the end of the regex binds  
whatever comes before it to the very end of the string.


--gordy


Re: Matchtext script results

2006-12-01 Thread J. Landman Gay

John Craig wrote:
> Jeez!  Did the office lights dim and flicker when you ran the regex
> version?


No, but I was testing during the daytime so it was hard to tell. :) I am 
pretty sure it was slower because of the complex structure of the 
pattern, but it was still cool that you could write it. That's more than 
I can say for my regex capabilities.


--
Jacqueline Landman Gay | [EMAIL PROTECTED]
HyperActive Software   | http://www.hyperactivesw.com


Re: Matchtext script results

2006-12-01 Thread John Craig

Jeez!  Did the office lights dim and flicker when you ran the regex version?

:-0


Mark Smith  -- native syntax: 18 ticks. Found 2 matches
Mark Smith -- filter: 18 ticks. Found 2 matches
Mark Smith -- array: 15 ticks. Found 2 matches
Dick Kriesel -- array: 8 ticks. Found 1 match
John Craig -- regex: 242 ticks. Found 2 matches
Dick Kriesel -- array: 8 ticks. Found 2 matches
Brian Yennie -- arrays: 7 ticks; found 2 matches
Jim Ault -- filter: 5 ticks; found 5 matches (strings, not words)
Ken Ray -- regex: 15 ticks (first run), 8 ticks (subsequent runs); found 2 matches

Jacque Gay -- original Rev script: 4 ticks. Found 2 matches





Re: Matchtext script results

2006-12-01 Thread Dick Kriesel
On 11/30/06 3:07 PM, "J. Landman Gay" <[EMAIL PROTECTED]> wrote:

> And that is what surprised me -- that no tinkering with arrays, or
> matchtext, or anything else is faster than the most straightforward
> Revolution syntax.

At least that's true for your sample data.  If the word list were very long,
for example, some other technique(s) would win the contest.

Thanks for publishing your findings.

-- Dick




Re: Matchtext script results

2006-11-30 Thread J. Landman Gay

Robert Brenstein wrote:
>> And that is what surprised me -- that no tinkering with arrays, or
>> matchtext, or anything else is faster than the most straightforward
>> Revolution syntax. I was thinking this would take a long time, but in
>> fact it is the fastest way to do it (that I've seen so far, anyway.)
>> We've mentioned this on the list before, but I guess I need to be hit
>> on the head with the facts occasionally, just to remind me how good
>> we've got it.
>>
>> Surprise. Rev wins out again.
>>
>> --
>> Jacqueline Landman Gay | [EMAIL PROTECTED]
>> HyperActive Software   | http://www.hyperactivesw.com
>
> I think that the time needed to prepare the text for searching should be
> included since it is a required step and for large files can use enough
> ticks to alter the results. Best would be to see both the total time and
> search-only time.


If the processing had to be re-done for each file, then I included it in 
the timing. But there were a few scripts where a one-time variable had 
to be set up before any files were processed. I didn't think this was 
important; it was almost always only one line of script and it only 
happened once. I suppose it added a millisecond or two to the total.


I also didn't consider text preparation in my tests, because the text 
I'll be reading in won't need any cleanup, and the words I'll be looking 
for are stripped of punctuation before being passed to the scan. For 
general use,  you're probably right that we should include that.


I don't have time to redo all the tests (it took a while) but anyone who 
wants to could grab the scripts off the list and see what they get. It's 
always good to get more info, and I'm sure my results aren't definitive.


--
Jacqueline Landman Gay | [EMAIL PROTECTED]
HyperActive Software   | http://www.hyperactivesw.com


Re: Matchtext script results

2006-11-30 Thread J. Landman Gay

Mark Smith wrote:
> Not that it necessarily matters for your application, but your version
> will miss a match if it's preceded or followed by punctuation -- "cat."
> "(dinosaur)" etc.


I cheated by not telling everything. I decided after all this to do a 
string search, so punctuation and plurals won't matter. If I find 
strings that are embedded in other words, that's okay for what I need. 
Instead of "among the words of" I decided to use just "is in".


> By an unnecessary process of elimination, I arrived at just about
> exactly the same simple solution as you did in my last attempt, but
> dealing with punctuation detracts (inevitably, as far as I can see) from
> the speed.


See, what you didn't know was that I am allowed to change the specs. :)

--
Jacqueline Landman Gay | [EMAIL PROTECTED]
HyperActive Software   | http://www.hyperactivesw.com


Re: Matchtext script results

2006-11-30 Thread Robert Brenstein
> And that is what surprised me -- that no tinkering with arrays, or
> matchtext, or anything else is faster than the most straightforward
> Revolution syntax. I was thinking this would take a long time, but
> in fact it is the fastest way to do it (that I've seen so far,
> anyway.) We've mentioned this on the list before, but I guess I need
> to be hit on the head with the facts occasionally, just to remind me
> how good we've got it.
>
> Surprise. Rev wins out again.
>
> --
> Jacqueline Landman Gay | [EMAIL PROTECTED]
> HyperActive Software   | http://www.hyperactivesw.com



I think that the time needed to prepare the text for searching should 
be included since it is a required step and for large files can use 
enough ticks to alter the results. Best would be to see both the 
total time and search-only time.


Robert


Re: Matchtext script results

2006-11-30 Thread Jim Ault
On 11/30/06 3:54 PM, "Mark Smith" <[EMAIL PROTECTED]> wrote:

> Not that it necessarily matters for your application, but your
> version will miss a match if it's preceded or followed by punctuation
> -- "cat."  "(dinosaur)" etc.
> 
> By an unnecessary process of elimination, I arrived at just about
> exactly the same simple solution as you did in my last attempt, but
> dealing with punctuation detracts (inevitably, as far as I can see)
> from the speed.
> 
> And we didn't even consider plurals!

Yes, Mark, and my filter solution will find 'too many' matches [5 in the
test where 2 is the right answer] which is why I designed my original to
capture and track the text blocks that tested true.

As you have noted, further processing is needed for accuracy, and that can
be done on each line in textStr (see below)


put blockNumber & blockName && theTextItself & cr after newBlock
where each text file is concatenated as a single line
with a serial number&filename to mark which file was on each line

then
>> filter textStr with ("*" & WRD & "*")
so that each hit remains a single line in the textStr and the other lines
disappear.

As you have noted, further processing (for plurals and punc, etc) is needed
for accuracy, and that can be done on each line in textStr.

Jim Ault
Las Vegas




Re: Matchtext script results

2006-11-30 Thread Mark Smith
Not that it necessarily matters for your application, but your  
version will miss a match if it's preceded or followed by punctuation  
-- "cat."  "(dinosaur)" etc.


By an unnecessary process of elimination, I arrived at just about  
exactly the same simple solution as you did in my last attempt, but  
dealing with punctuation detracts (inevitably, as far as I can see)  
from the speed.


And we didn't even consider plurals!

Best,

Mark

On 30 Nov 2006, at 23:07, J. Landman Gay wrote:




The last one in the list is the one I thought I had to replace with  
something faster. It is simply this:


  repeat for each line l in tFiles
    put url ("file:" & l) into tText
    repeat for each word w in pWords
      put w is among the words of tText into tMatch
      if tMatch = false then exit repeat
    end repeat
    if tMatch then put l & cr after tList
  end repeat

And that is what surprised me -- that no tinkering with arrays, or  
matchtext, or anything else is faster than the most straightforward  
Revolution syntax. I was thinking this would take a long time, but  
in fact it is the fastest way to do it (that I've seen so far,  
anyway.) We've mentioned this on the list before, but I guess I  
need to be hit on the head with the facts occasionally, just to  
remind me how good we've got it.
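
For readers more used to other languages, the loop quoted above amounts to "keep a file only if every wanted word is one of its whitespace-separated words". Here is a rough Python sketch of that logic, not the Revolution code; the function name and the in-memory dictionary of file contents are assumptions that stand in for reading each file from disk. Like the original, it does not strip punctuation, so "cat." does not count as "cat".

```python
def files_containing_all_words(texts_by_name, wanted_words):
    """Return the names whose text contains every wanted word."""
    matches = []
    for name, text in texts_by_name.items():
        # Whitespace-separated words, compared without regard to case.
        words = {w.lower() for w in text.split()}
        if all(w.lower() in words for w in wanted_words):
            matches.append(name)
    return matches

texts = {
    "a.txt": "the dog chased the cat home",
    "b.txt": "the dog chased the cat.",
}
print(files_containing_all_words(texts, ["dog", "cat"]))
```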


Surprise. Rev wins out again.

--
Jacqueline Landman Gay | [EMAIL PROTECTED]
HyperActive Software   | http://www.hyperactivesw.com


Re: Matchtext for multiple words

2006-11-30 Thread Mark Smith

Another go. Assumes the wordList to be a comma separated list.

function wordMatch wordList, pText
  put "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz" & space & tab & cr into tGoodChars

  repeat for each char c in pText
    if c is in tGoodChars then put c after tText
  end repeat

  repeat for each item i in wordList
    if i is not among the words of tText then return false
  end repeat
  return true
end wordMatch

It seems a bit brutish, but it's reasonably quick, and it won't find  
"tent" in "content" and will find "yikes" in "(yikes!)".


Best,

Mark
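
The same approach can be sketched in Python for comparison: drop every character that is not a letter or whitespace, then test each comma-separated word against the remaining words. This is only an illustration under those assumptions, not the Revolution handler; the names word_match and GOOD_CHARS are made up for the example.

```python
import string

# Letters plus the whitespace characters kept by the handler above.
GOOD_CHARS = set(string.ascii_letters + " \t\n")

def word_match(word_list, text):
    """True when every comma-separated word occurs as a whole word."""
    cleaned = "".join(c for c in text if c in GOOD_CHARS)
    words = set(cleaned.lower().split())
    return all(w.strip().lower() in words for w in word_list.split(","))

print(word_match("yikes", "(yikes!)"))
print(word_match("tent", "the content is here"))
```

Because punctuation is deleted and not replaced by a space, text such as "cat.dog" collapses into the single word "catdog". That holds for this sketch as well.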



Re: Matchtext to find a series of words

2006-11-29 Thread Jim Ault
> I meant a single pass for each block. The filter solution has to make a
> new pass through the text for each word we want to filter on. But
> regardless, it still shows very well in my tests.

Actually, the filter pass does not have to make a new pass through *ALL* of
the text.

Since I wrote it to work on the same variable successively, if the first
filter command does not find a match, the next two work on an empty
variable.  Quite fast.

If it does find matching lines, the successive commands work only on the hit
text, thus optimizing by elimination.

The last email I sent shows that if you make each block a single line by
replacing the cr's, then concatenating the next block, you can make a single
pass for each word for all blocks at ONCE.  If there are no matches for the
first word, then the following words are filtering an empty variable.

By tagging each line with a header as you concatenate, you can even tell
which lines (blocks) meet all the criteria without any speed difference
since the residual variable will contain only hits.

The slowest would obviously be the 'all three words found in all the blocks'
scenario.
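
To make the elimination idea concrete, here is a small Python sketch of it. It is not the filter command itself; the function name and the dictionary of named blocks are assumptions for the example. Each block is flattened to one line and kept with its name, and each word is then applied only to the blocks that survived the previous word. The search stops as soon as nothing is left.

```python
def filter_blocks(blocks, words):
    """Return names of blocks whose text contains every word as a substring."""
    # Flatten each block to a single lowercase line, tagged with its name.
    remaining = [(name, " ".join(text.split()).lower())
                 for name, text in blocks.items()]
    for word in words:
        # Each pass only looks at the hits left by the previous pass.
        remaining = [(n, t) for n, t in remaining if word.lower() in t]
        if not remaining:
            break  # nothing left to search
    return [name for name, _ in remaining]

blocks = {
    "a": "The purple dinosaur\nstepped on the cat.",
    "b": "The white dog howled.",
    "c": "dogs, cats and dinosaurs",
}
print(filter_blocks(blocks, ["dog", "cat", "dinosaur"]))
```

As with the filter approach, this matches strings and not whole words, so "dogs" counts as a hit for "dog".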

Glad you are having fun

Jim Ault
Las Vegas


On 11/29/06 6:02 PM, "J. Landman Gay" <[EMAIL PROTECTED]> wrote:

> Jim Ault wrote:
>> On 11/29/06 3:37 PM, "J. Landman Gay" <[EMAIL PROTECTED]> wrote:
>>> This looks promising, thanks. It looks like there is no single-pass
>>> method, but since filter is pretty fast it may do okay. I didn't even
>>> quote your regex explanation, I don't want to touch it. :)
>> 
>> You mention single pass...
>> Question: Single pass of what?
>> Single pass of each text block or all text blocks together?
> 





Re: Matchtext for multiple words

2006-11-29 Thread J. Landman Gay

John Craig wrote:
> Oops.
>
> I meant to say - check the list is passed as "list,dog,house" (comma
> separated, and without parentheses)


Yeah, that was the problem. I was altering the scripts from the list so 
they would fit into my tests and I didn't change yours right. Now that 
I've made the correction it works fine. Thanks for the pointer, that was 
exactly what was wrong.


--
Jacqueline Landman Gay | [EMAIL PROTECTED]
HyperActive Software   | http://www.hyperactivesw.com


Re: Matchtext for multiple words

2006-11-29 Thread John Craig

Oops.

I meant to say - check the list is passed as "list,dog,house" (comma 
separated, and without parentheses)




J. Landman Gay wrote:

John Craig wrote:
And a script to create the regex from a word list.  My apologies if 
this stuff turns out useless - but you can get absorbed in this mince...


I passed three random words to your script (list,house,dog) and got 
this regex from it:


(?is)\b(list|house|dog)\b\b(?!\1)(list|house|dog)\b\b(?!\1|\2)(list|house|dog)\b 



My test then goes through a bunch of text files on disk and applies 
the regex to the text of each file like this:


put matchText(tText, tRegex) into tMatch

I don't get any matches though, and my knowledge of regex is too 
limited for me to know if I'm doing something wrong. Does this look 
right to you? I think there should have been at least 2 matching files 
(that's what some of the other scripts produced.)






Re: Matchtext for multiple words

2006-11-29 Thread John Craig

J. Landman Gay wrote:

John Craig wrote:
And a script to create the regex from a word list.  My apologies if 
this stuff turns out useless - but you can get absorbed in this mince...


I passed three random words to your script (list,house,dog) and got 
this regex from it:




Here is the correct regex I get when I substitute your new words into 
the script (check that list is passed as "list|house|dog")

(?is)\b(list|house|dog)\b.*\b(?!\1)(list|house|dog)\b.*\b(?!\1|\2)(list|house|dog)\b

a few bits missing from the one below!
(?is)\b(list|house|dog)\b\b(?!\1)(list|house|dog)\b\b(?!\1|\2)(list|house|dog)\b 
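
For anyone who wants to experiment with the corrected pattern, here is a rough Python sketch that builds it from a word list and tries it out. It is not the Revolution script from this thread; the function name build_regex is made up, and Python's re module is used as a stand-in for the PCRE engine behind matchText. Each group after the first is guarded by a negative lookahead on the earlier groups, so the same word cannot be counted twice, and the ".*" between groups (the missing bits) lets the words appear anywhere in the text.

```python
import re

def build_regex(words):
    """Build a pattern that requires every word to appear, in any order."""
    alternatives = "(" + "|".join(re.escape(w) for w in words) + ")"
    parts = []
    for i in range(len(words)):
        if i == 0:
            parts.append(r"\b" + alternatives + r"\b")
        else:
            # Refuse any word already captured by an earlier group.
            refs = "|".join("\\%d" % n for n in range(1, i + 1))
            parts.append(r"\b(?!" + refs + ")" + alternatives + r"\b")
    return "(?is)" + ".*".join(parts)

pattern = build_regex(["list", "house", "dog"])
print(pattern)
print(bool(re.search(pattern, "My dog sleeps in the house.\nIt is on the list.")))
print(bool(re.search(pattern, "dog and house and dog")))
```

A pattern like this can backtrack heavily on long texts, which may be part of why the regex version was the slowest entry in the timings reported elsewhere in this thread.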



My test then goes through a bunch of text files on disk and applies 
the regex to the text of each file like this:


put matchText(tText, tRegex) into tMatch

I don't get any matches though, and my knowledge of regex is too 
limited for me to know if I'm doing something wrong. Does this look 
right to you? I think there should have been at least 2 matching files 
(that's what some of the other scripts produced.)






Re: Matchtext to find a series of words

2006-11-29 Thread J. Landman Gay

Jim Ault wrote:

On 11/29/06 3:37 PM, "J. Landman Gay" <[EMAIL PROTECTED]> wrote:

This looks promising, thanks. It looks like there is no single-pass
method, but since filter is pretty fast it may do okay. I didn't even
quote your regex explanation, I don't want to touch it. :)


You mention single pass...
Question: Single pass of what?
Single pass of each text block or all text blocks together?


I meant a single pass for each block. The filter solution has to make a 
new pass through the text for each word we want to filter on. But 
regardless, it still shows very well in my tests.


--
Jacqueline Landman Gay | [EMAIL PROTECTED]
HyperActive Software   | http://www.hyperactivesw.com


Re: Matchtext to find a series of words

2006-11-29 Thread Mark Smith

On 30 Nov 2006, at 00:31, Brian Yennie wrote:

You just need to pass through the text once, and "cross off" each  
word as you find it. If everything is crossed off when you're done,  
then you're done =).


That's a much better idea than mine, so:

function aMatch pWords,tText
  -- first remove punctuation marks from the word list, perhaps unnecessary
  repeat for each char C in pWords
    if C is cr OR charToNum(C) >= 65 then put C after tWords
  end repeat

  repeat for each word W in tText
    put empty into newWord

    repeat for each char C in W
      get charToNum(C)
      if it >= 65 AND it <= 122 then put C after newWord
    end repeat

    if newWord is among the lines of tWords then
      filter tWords without newWord
    end if

    if tWords is empty then return true
  end repeat

  return false
end aMatch


Re: Matchtext for multiple words

2006-11-29 Thread J. Landman Gay

John Craig wrote:
And a script to create the regex from a word list.  My apologies if this 
stuff turns out useless - but you can get absorbed in this mince...


I passed three random words to your script (list,house,dog) and got this 
regex from it:


(?is)\b(list|house|dog)\b\b(?!\1)(list|house|dog)\b\b(?!\1|\2)(list|house|dog)\b

My test then goes through a bunch of text files on disk and applies the 
regex to the text of each file like this:


put matchText(tText, tRegex) into tMatch

I don't get any matches though, and my knowledge of regex is too limited 
for me to know if I'm doing something wrong. Does this look right to 
you? I think there should have been at least 2 matching files (that's 
what some of the other scripts produced.)


--
Jacqueline Landman Gay | [EMAIL PROTECTED]
HyperActive Software   | http://www.hyperactivesw.com


Re: Matchtext to find a series of words

2006-11-29 Thread Dick Kriesel
On 11/29/06 4:31 PM, "Brian Yennie" <[EMAIL PROTECTED]> wrote:

> I do think that algorithmically
> one-pass is definitely possible. You just need to pass through the
> text once, and "cross off" each word as you find it. If everything is
> crossed off when you're done, then you're done =).

Good idea, Brian.

-- Dick

on mouseUp
  put "The purple dinosaur inadvertently stepped on the cat." & cr \
      & "The white dog howled." into tText
  put "dog dinosaur cat" into tWords
  put textContainsAllWords(tText,tWords)
end mouseUp

function textContainsAllWords tText,tWords
  replace "." with space in tText
  replace "," with space in tText
  replace cr with space in tText -- otherwise a line break stays glued to the next word
  split tText using space and space
  split tWords using space and space
  repeat for each key tWord in tText
    delete variable tWords[tWord]
  end repeat
  return the keys of tWords is empty
end textContainsAllWords




Re: Matchtext for multiple words

2006-11-29 Thread J. Landman Gay

John Craig wrote:

>  -- build the whole damn regex

LOL! I know exactly what you mean. I'll test this. I'm building a test 
suite of all the responses and will report here what I find. So far, I'm 
surprised at the results.


I'm kind of pleased with this whole thread. Scripting contests are cool. 
We should make it a monthly affair.




--
Jacqueline Landman Gay | [EMAIL PROTECTED]
HyperActive Software   | http://www.hyperactivesw.com


Re: Matchtext to find a series of words

2006-11-29 Thread Jim Ault
On 11/29/06 3:37 PM, "J. Landman Gay" <[EMAIL PROTECTED]> wrote:
> This looks promising, thanks. It looks like there is no single-pass
> method, but since filter is pretty fast it may do okay. I didn't even
> quote your regex explanation, I don't want to touch it. :)

You mention single pass...
Question: Single pass of what?
Single pass of each text block or all text blocks together?

Doing all as one block 
--with tracing to know which are matches

put 0 into cnt
repeat for each line LNN in variableList
   add 1 to cnt
   do "get "&LNN
  replace cr with tab in it
  put cnt & LNN && it & cr after newBlock
end repeat
--now all the blocks are their own line in the aggregate

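-- note: as written below, allWordsPresent returns true or false, so it would
-- need to return textStr instead for the surviving lines to land here
put allWordsPresent(newBlock, wordList) into residualBlock

if residualBlock is empty then
   put "no matches anywhere"
else
   --word 1 of each line =
   --   (the variable number & variable name)
   --by concatenating it is unlikely they will form a match to one of your
   --search words or tokens
end if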

-
>> function allWordsPresent textStr, wordList
>>   replace cr with tab in textStr
>>   set the wholematches to true
>>   repeat for each word WRD in wordList
>> filter textStr with ("*" & WRD & "*")
>>   end repeat
>>   return not (textStr is empty)
>> end  allWordsPresent


Jim Ault
Las Vegas






Re: Matchtext for multiple words

2006-11-29 Thread Mark Smith

Here's my version:

put "dog" & cr & "cat" & cr & "dinosaur" into tWords

get aMatch(tWords,tText)


function aMatch pWords,tText
  sort lines of pWords

  repeat for each word W in tText
    put empty into newWord
    repeat for each char C in W
      -- to get rid of punctuation - maybe not ideal!
      if charToNum(C) >= 65 then put C after newWord
    end repeat
    if newWord is among the lines of pWords then put 0 into foundWordArray[newWord]
  end repeat

  put the keys of foundWordArray into foundWords
  sort lines of foundWords

  return foundWords is pWords
end aMatch

Best,

Mark

On 29 Nov 2006, at 23:07, J. Landman Gay wrote:




Basically I need the fastest possible way to scan a large number of  
text blocks for an indefinite number of words which occur in any  
portion of the text.


I'll try Ken's thing too -- thanks Ken.

(I'll send this once and cross my fingers.)
--
Jacqueline Landman Gay | [EMAIL PROTECTED]
HyperActive Software   | http://www.hyperactivesw.com




Re: Matchtext for multiple words

2006-11-29 Thread John Craig
And a script to create the regex from a word list.  My apologies if this 
stuff turns out useless - but you can get absorbed in this mince...



on mouseUp
  -- string to search * SHOULD MATCH
  put "The purple dinosaur inadvertently stepped on the cat." & return & \
      "The white dog howled." into tString

  -- DUFF string to search ** SHOULD NOT MATCH
  put "The purple dinosaur inadvertently stepped on the cat." & return & \
      "The white dinosaur howled." into tString2

  -- words to find
  put "cat,dinosaur,dog" into tWords

  -- build the pattern to match the words
  put "(" into tWordsPattern
  repeat for each item tWord in tWords
    put tWord & "|" after tWordsPattern
  end repeat
  put ")" into char -1 of tWordsPattern

  -- build the whole damn regex
  put num of items in tWords into tTotalWords
  put 0 into tCurrentWord
  put "(?is)" into tRegex
  repeat for each item tWord in tWords
    add 1 to tCurrentWord
    put "\b" after tRegex
    if tCurrentWord > 1 then
      put "(?!" after tRegex
      repeat with i = 1 to tCurrentWord - 1
        put "\" & i & "|" after tRegex
      end repeat
      delete char -1 of tRegex
      put ")" after tRegex
    end if
    put tWordsPattern & "\b" after tRegex
    if tCurrentWord < tTotalWords then
      put ".*" after tRegex
    end if
  end repeat

  -- test our regex against the 2 test strings
  put matchText(tString, tRegex) & return & matchText(tString2, tRegex)

end mouseUp


John Craig wrote:

I still think it's working ok - someone slap me if I'm wrong.
The (?!  is looking ahead and saying 'you can't begin with.

(?!\1) - you can't begin with the first match
(?!\1|\2) - you can't begin with the 1st or second match

JC

J. Landman Gay wrote:
Sorry if this comes through twice, I'm having trouble sending to the 
list.


I need a matchtext/regex that will tell me if all supplied words 
exist in a block of text, regardless of their order, and ignoring 
carriage returns.


For example, see if all these words:  dog dinosaur cat

exist in this text:

"The purple dinosaur inadvertently stepped on the cat.
The white dog howled."

Should return true. Is there such a thing?








Re: Matchtext to find a series of words

2006-11-29 Thread Brian Yennie
This looks promising, thanks. It looks like there is no single-pass  
method, but since filter is pretty fast it may do okay.


Not sure how robust my stab was, but I do think that algorithmically  
one-pass is definitely possible. You just need to pass through the  
text once, and "cross off" each word as you find it. If everything is  
crossed off when you're done, then you're done =).


HTH

- Brian


Re: Matchtext for multiple words

2006-11-29 Thread John Craig

I still think it's working ok - someone slap me if I'm wrong.
The (?!  is looking ahead and saying 'you can't begin with.

(?!\1) - you can't begin with the first match
(?!\1|\2) - you can't begin with the 1st or second match
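
A minimal two-word sketch of that lookahead, with made-up test sentences. One caveat that follows from "begin with": the lookahead compares a prefix, so if a word list held both "cat" and "catalog", a first match of "cat" would also rule out "catalog" later on.

```livecode
on mouseUp
  -- the second word must not begin with whatever the first group captured
  put "(?is)\b(cat|dog)\b.*\b(?!\1)(cat|dog)\b" into tRegex

  -- two different words: matches, and the captures come back in tFirst/tSecond
  put matchText("The cat chased the dog.", tRegex, tFirst, tSecond) into tBoth

  -- the same word twice: the lookahead rejects the second "cat"
  put matchText("The cat chased the cat.", tRegex) into tSame

  -- shows "true cat dog" on the first line and "false" on the second
  put tBoth && tFirst && tSecond & return & tSame
end mouseUp
```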

JC

J. Landman Gay wrote:
Sorry if this comes through twice, I'm having trouble sending to the 
list.


I need a matchtext/regex that will tell me if all supplied words exist 
in a block of text, regardless of their order, and ignoring carriage 
returns.


For example, see if all these words:  dog dinosaur cat

exist in this text:

"The purple dinosaur inadvertently stepped on the cat.
The white dog howled."

Should return true. Is there such a thing?





Re: Matchtext for multiple words

2006-11-29 Thread John Craig

Maybe I'm too tired, but I think this works.

on mouseUp
  put "The purple dinosaur inadvertently stepped on the cat." & return & \
      "The white dog howled." into tString
  put "The purple dinosaur inadvertently stepped on the cat." & return & \
      "The white cat howled." into tString2
  put "(?is)\b(cat|dinosaur|dog)\b.*\b(?!\1)(cat|dinosaur|dog)\b.*\b(?!\1|\2)(cat|dinosaur|dog)\b" \
      into tReg

  put matchText(tString, tReg, tMatch1, tMatch2, tMatch3) into tResult
  put matchText(tString2, tReg, tMatch4, tMatch5, tMatch6) into tResult2
  answer tResult && tMatch1 && tMatch2 && tMatch3 & return & tResult2 && \
      tMatch4 && tMatch5 && tMatch6

end mouseUp


J. Landman Gay wrote:
Sorry if this comes through twice, I'm having trouble sending to the 
list.


I need a matchtext/regex that will tell me if all supplied words exist 
in a block of text, regardless of their order, and ignoring carriage 
returns.


For example, see if all these words:  dog dinosaur cat

exist in this text:

"The purple dinosaur inadvertently stepped on the cat.
The white dog howled."

Should return true. Is there such a thing?





Re: Matchtext for multiple words

2006-11-29 Thread Dick Kriesel
On 11/29/06 1:39 PM, "J. Landman Gay" <[EMAIL PROTECTED]> wrote:

> I need a matchtext/regex that will tell me if all supplied words exist
> in a block of text, regardless of their order, and ignoring carriage
> returns.
> 
> For example, see if all these words:  dog dinosaur cat
> 
> exist in this text:
> 
> "The purple dinosaur inadvertently stepped on the cat.
> The white dog howled."
> 
> Should return true. Is there such a thing?

Since Rev says "cat" and "cat." are different words, punctuation poses a
problem.  Here's an approach that's simple and fast but depends on the
programmer to include a replace statement for each punctuation mark.

-- Dick

on mouseUp
  put ""
  put "The purple dinosaur inadvertently stepped on the cat." & cr \
      & "The white dog howled." into tText
  put "dog dinosaur cat" into tWords
  put textContainsAllWords(tText,tWords)
end mouseUp

function textContainsAllWords tText,pWords
  replace "." with space in tText
  replace "," with space in tText
  repeat for each word tWord in tText
    put 1 into tArray[tWord]
  end repeat
  repeat for each word tWord in pWords
    if tArray[tWord] is empty then return "false"
  end repeat
  return "true"
end textContainsAllWords
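
If one replace statement per punctuation mark gets tedious, a possible alternative is a single replaceText call with a character class. This is only a sketch: stripPunctuation is a hypothetical helper name, and it would also turn apostrophes into spaces, so "don't" becomes two words.

```livecode
function stripPunctuation pText
  -- [[:punct:]] is the POSIX punctuation class, which PCRE understands
  return replaceText(pText, "[[:punct:]]", space)
end stripPunctuation
```

It could stand in for the two replace lines, for example with `put stripPunctuation(tText) into tText` at the top of textContainsAllWords.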




Re: Matchtext for multiple words

2006-11-29 Thread John Craig

Just experimented with:
"(?is)\b(cat|dinosaur|dog)\b.*\b(?!\1)(cat|dinosaur|dog)\b"

and got some success - further investigation needed.

:-)

J. Landman Gay wrote:
Sorry if this comes through twice, I'm having trouble sending to the 
list.


I need a matchtext/regex that will tell me if all supplied words exist 
in a block of text, regardless of their order, and ignoring carriage 
returns.


For example, see if all these words:  dog dinosaur cat

exist in this text:

"The purple dinosaur inadvertently stepped on the cat.
The white dog howled."

Should return true. Is there such a thing?





Re: Matchtext to find a series of words

2006-11-29 Thread J. Landman Gay

Jim Ault wrote:


I would tackle this using the filter command

replace cr with tab in textStr
set the wholematches to true
filter textStr with "*"& token1&"*"
filter textStr with "*"& token2&"*"
filter textStr with "*"& token3&"*"
if textStr  is empty then return false
else return true

A better form would be

function allWordsPresent textStr, wordList
  replace cr with tab in textStr
  set the wholematches to true
  repeat for each word WRD in wordList
filter textStr with ("*" & WRD & "*")
  end repeat
  return not (textStr is empty)
end  allWordsPresent



This looks promising, thanks. It looks like there is no single-pass 
method, but since filter is pretty fast it may do okay. I didn't even 
quote your regex explanation, I don't want to touch it. :)


--
Jacqueline Landman Gay | [EMAIL PROTECTED]
HyperActive Software   | http://www.hyperactivesw.com


Re: Matchtext for multiple words

2006-11-29 Thread Ken Ray
On 11/29/06 5:07 PM, "J. Landman Gay" <[EMAIL PROTECTED]> wrote:


> if "dinosaur" is in tText and "dog" is in tText and "cat" is in tText
> 
> and that would require 3 times the number of lookups over a single
> matchtext. 

Plus, it would match paragraphs with "catastrophe", "doggedly", "muscat",
etc., which you may also not want.
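
A small sketch of the difference, using a made-up sentence: "is in" is a plain substring test, while \b in a regex insists on word boundaries.

```livecode
on mouseUp
  put "A catastrophe struck the muscat vines." into tText

  -- substring test: "cat" is found inside "catastrophe" and "muscat"
  put ("cat" is in tText) into tSubstring

  -- whole-word test: no standalone "cat" in the text
  put matchText(tText, "(?i)\bcat\b") into tWholeWord

  -- shows true on the first line and false on the second
  put tSubstring & return & tWholeWord
end mouseUp
```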

> Also, the number of words can vary so I'd have to construct a
> repeat loop to build the command itself, and use a "do" statement to
> execute it -- and both of those are slow. But if I'm wrong, I'd like to
> know. Has anyone done any speed tests on this stuff?
> 
> Basically I need the fastest possible way to scan a large number of text
> blocks for an indefinite number of words which occur in any portion of
> the text.
> 
> I'll try Ken's thing too -- thanks Ken.

:-)


Ken Ray
Sons of Thunder Software, Inc.
Web site: http://www.sonsothunder.com/
Email: [EMAIL PROTECTED]




Re: Matchtext for multiple words

2006-11-29 Thread John Craig
Although you can invert character matching using [^...], I don't think 
there's an equivalent for words.

You could have used:
"(?is)\b(cat|dinosaur|dog)\b.*\b_(cat|dinosaur|dog)\b"

... if there was a way to say 'not beginning with the first match' where 
the underscore appears in the above. Then it would be possible to do a 
quick one-line regex, since we can use '\1' to back-reference the first match.


:-(

J. Landman Gay wrote:
Sorry if this comes through twice, I'm having trouble sending to the 
list.


I need a matchtext/regex that will tell me if all supplied words exist 
in a block of text, regardless of their order, and ignoring carriage 
returns.


For example, see if all these words:  dog dinosaur cat

exist in this text:

"The purple dinosaur inadvertently stepped on the cat.
The white dog howled."

Should return true. Is there such a thing?





Re: Matchtext for multiple words

2006-11-29 Thread J. Landman Gay

Mark Smith wrote:
Do you really need to do it with MatchText? Aren't words of> etc going to work? Or do you really need it to be a one-liner?


Best,

Mark

ps. That's the third one ;-0


Yeah, I noticed that, and I'm not sure how it happened. I only sent one, 
then waited an hour or so. Then I changed the outgoing server I was 
using and sent again. Then three of them showed up. I didn't do it! ;)


Anyway, thanks to Ken, Eric, and yourself for the suggestions. I 
probably didn't explain enough. If I were only checking a single block 
of text then I'd use some of the built-in commands, but I have to loop 
through a couple of zillion blocks. So I figured matchtext would be 
faster if, hopefully, I could issue a single command for each lookup. If 
I have to do multiple lookups for each text block, then I end up with:


if "dinosaur" is in tText and "dog" is in tText and "cat" is in tText

and that would require 3 times the number of lookups over a single 
matchtext. Also, the number of words can vary so I'd have to construct a 
repeat loop to build the command itself, and use a "do" statement to 
execute it -- and both of those are slow. But if I'm wrong, I'd like to 
know. Has anyone done any speed tests on this stuff?


Basically I need the fastest possible way to scan a large number of text 
blocks for an indefinite number of words which occur in any portion of 
the text.


I'll try Ken's thing too -- thanks Ken.

(I'll send this once and cross my fingers.)
--
Jacqueline Landman Gay | [EMAIL PROTECTED]
HyperActive Software   | http://www.hyperactivesw.com


Re: Matchtext for multiple words

2006-11-29 Thread Eric Chatonet
I was a bit fast: you will probably have to complicate this a bit to  
take into account words at the beginning or the end of a line and all  
punctuation marks.

Now I understand your concern about a regex better ;-)

Le 29 nov. 06 à 23:19, Eric Chatonet a écrit :


Hi Jacque,

Are you sure you need a regex? ;-)

function AreWordsIn pText,pWords
  repeat for each word tWord in pWords
if space & tWord & space is not in pText then return false
  end repeat
  return true
end AreWordsIn

As this way of doing searches for words that are not in the text,  
it should be very fast...


Le 29 nov. 06 à 22:39, J. Landman Gay a écrit :

Sorry if this comes through twice, I'm having trouble sending to  
the list.


I need a matchtext/regex that will tell me if all supplied words  
exist in a block of text, regardless of their order, and ignoring  
carriage returns.


For example, see if all these words:  dog dinosaur cat

exist in this text:

"The purple dinosaur inadvertently stepped on the cat.
The white dog howled."

Should return true. Is there such a thing?



Best Regards from Paris,
Eric Chatonet
-- 


http://www.sosmartsoftware.com/[EMAIL PROTECTED]/





Best Regards from Paris,
Eric Chatonet
 
--

http://www.sosmartsoftware.com/[EMAIL PROTECTED]/




Re: Matchtext to find a series of words

2006-11-29 Thread Jim Ault

On 11/29/06 1:26 PM, "J. Landman Gay" <[EMAIL PROTECTED]> wrote:

> I need a matchtext/regex that will find a series of words in a block of
> text, no matter whether they are together or not, and ignoring carriage
> returns. For example:
> 
> See if all of these words: dog cat dinosaur
> 
> are in this text:
> 
> "The purple dinosaur inadvertently stepped on the cat.
> The white dog howled."
> 
> Should return true. Is there such a thing?

I would tackle this using the filter command

replace cr with tab in textStr
set the wholematches to true
filter textStr with "*"& token1&"*"
filter textStr with "*"& token2&"*"
filter textStr with "*"& token3&"*"
if textStr  is empty then return false
else return true

A better form would be

function allWordsPresent textStr, wordList
  replace cr with tab in textStr
  set the wholematches to true
  repeat for each word WRD in wordList
filter textStr with ("*" & WRD & "*")
  end repeat
  return not (textStr is empty)
end  allWordsPresent


regEx would be as follows

the OR condition is \b(dog|cat|dinosaur)\b
--where the \b says 'word boundary' to regEx

the AND condition
 (?(?=condition)(then1|then2|then3)|(else1|else2|else3))
--major drawback is that you would have to structure the exact number of
words to check [you used 3 in your example], and the text would also be scanned
multiple times (starting with the hit for 'dog') since you would be trying 4
combinations.  RegEx would stop looking as soon as one of these tested TRUE.
dog
   + positive lookbehind (?<=cat)
   + positive lookbehind (?<=dinosaur)
dog
   + positive lookahead (?=cat)
   + positive lookbehind (?<=dinosaur)
dog
   + positive lookahead (?=cat)
   + positive lookahead (?=dinosaur)
dog
   + positive lookbehind (?<=cat)
   + positive lookahead (?=dinosaur)

-- where if any of these = true, then return TRUE, else FALSE


The filter command is far easier to build and debug, and is likely faster
than the complex regex positive lookahead/behind algorithm.

Someone more conversant in regEx may show a better solution and be the better
answer to your question.

Jim Ault
Las Vegas




Re: Matchtext for multiple words

2006-11-29 Thread Brian Yennie

Jacque,

I think the "in any order" part will make a single RegEx a nightmare  
(although it's probably technically possible).
How about using something simple like (or scroll down for a non-RegEx  
idea)


".*(dinosaur|dog|cat).*"

Then capture the actual text matched, and remove that from the  
expression. So in your example, you would first match "dinosaur".  
Then you would run the RegEx again as:


".*(dog|cat).*"

Which would match "cat".

Then finally:

".*dog.*"
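
The shrinking-alternation idea above could be sketched roughly as follows. Assumptions: allWordsFound is a hypothetical name, the word list is comma-separated, the words contain no regex metacharacters, and \b word boundaries are used in place of the leading and trailing ".*".

```livecode
function allWordsFound pText, pWords
  -- pWords is a comma-separated list such as "dinosaur,dog,cat"
  local tAlternation, tFound
  set the wholeMatches to true -- so itemOffset only matches whole items

  repeat while pWords is not empty
    -- turn the remaining words into an alternation like dinosaur|dog|cat
    put pWords into tAlternation
    replace comma with "|" in tAlternation

    -- find any one of the remaining words; give up if none is present
    if not matchText(pText, "(?is)\b(" & tAlternation & ")\b", tFound) then
      return false
    end if

    -- cross the found word off the list and go round again
    delete item itemOffset(tFound, pWords) of pWords
  end repeat

  return true
end allWordsFound
```

Each pass rescans the text from the start, so this is not single-pass: it runs matchText once per word.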

If you're not married to RegEx, you could just do something like  
this. It should be pretty speedy, as it uses array lookups, simple  
comparisons, and only one pass through your text.


## put the words into an array for quick lookup

repeat for each word w in wordList
  put 0 into myWords[w]
end repeat

## loop through your text and mark all of the words you find

repeat for each word w in myText
  if (myWords[w] = 0) then
    put 1 into myWords[w]
  end if
end repeat

## check that all of your words were "marked" with a 1

put TRUE into foundThemAll
repeat for each word w in wordList
  if (myWords[w] <> 1) then
    put FALSE into foundThemAll
    exit repeat
  end if
end repeat



Sorry if this comes through twice, I'm having trouble sending to  
the list.


I need a matchtext/regex that will tell me if all supplied words  
exist in a block of text, regardless of their order, and ignoring  
carriage returns.


For example, see if all these words:  dog dinosaur cat

exist in this text:

"The purple dinosaur inadvertently stepped on the cat.
The white dog howled."

Should return true. Is there such a thing?

--
Jacqueline Landman Gay | [EMAIL PROTECTED]
HyperActive Software   | http://www.hyperactivesw.com






Re: Matchtext for multiple words

2006-11-29 Thread Eric Chatonet

Hi Jacque,

Are you sure you need a regex? ;-)

function AreWordsIn pText,pWords
  repeat for each word tWord in pWords
    if space & tWord & space is not in pText then return false
  end repeat
  return true
end AreWordsIn

Since this approach returns as soon as it finds a word that is not in the text, it  
should be very fast...


Le 29 nov. 06 à 22:39, J. Landman Gay a écrit :

Sorry if this comes through twice, I'm having trouble sending to  
the list.


I need a matchtext/regex that will tell me if all supplied words  
exist in a block of text, regardless of their order, and ignoring  
carriage returns.


For example, see if all these words:  dog dinosaur cat

exist in this text:

"The purple dinosaur inadvertently stepped on the cat.
The white dog howled."

Should return true. Is there such a thing?



Best Regards from Paris,
Eric Chatonet
 
--

http://www.sosmartsoftware.com/[EMAIL PROTECTED]/




Re: Matchtext for multiple words

2006-11-29 Thread Mark Smith
Do you really need to do it with MatchText? Aren't the words of> etc going to work? Or do you really need it to be a one- 
liner?


Best,

Mark

ps. That's the third one ;-0

On 29 Nov 2006, at 21:39, J. Landman Gay wrote:

Sorry if this comes through twice, I'm having trouble sending to  
the list.


I need a matchtext/regex that will tell me if all supplied words  
exist in a block of text, regardless of their order, and ignoring  
carriage returns.


For example, see if all these words:  dog dinosaur cat

exist in this text:

"The purple dinosaur inadvertently stepped on the cat.
The white dog howled."

Should return true. Is there such a thing?

--
Jacqueline Landman Gay | [EMAIL PROTECTED]
HyperActive Software   | http://www.hyperactivesw.com




Re: Matchtext to find a series of words

2006-11-29 Thread Ken Ray
On 11/29/06 3:26 PM, "J. Landman Gay" <[EMAIL PROTECTED]> wrote:

> I need a matchtext/regex that will find a series of words in a block of
> text, no matter whether they are together or not, and ignoring carriage
> returns. For example:
> 
> See if all of these words: dog cat dinosaur
> 
> are in this text:
> 
> "The purple dinosaur inadvertently stepped on the cat.
> The white dog howled."
> 
> Should return true. Is there such a thing?

Well, you can do this, but there may be a more efficient way:

  put (matchText(tText,"(?si)\bdog\b") and \
      matchText(tText,"(?si)\bcat\b") and \
      matchText(tText,"(?si)\bdinosaur\b"))

If I keep trying, maybe I can come up with a more efficient one-liner...
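
For a word list of any length, the same test could be wrapped in a loop, which avoids building a command string and running it with "do". A minimal sketch, assuming the words contain no regex metacharacters; containsAllWords is a hypothetical name.

```livecode
function containsAllWords pText, pWords
  -- pWords is a space-separated list such as "dog cat dinosaur"
  repeat for each word tWord in pWords
    -- one whole-word, case-insensitive lookup per word; stop at the first miss
    if not matchText(pText, "(?si)\b" & tWord & "\b") then return false
  end repeat
  return true
end containsAllWords
```

This still costs one scan of the text per word, but it returns as soon as any word is missing.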

Ken Ray
Sons of Thunder Software, Inc.
Web site: http://www.sonsothunder.com/
Email: [EMAIL PROTECTED]




RE: matchText handler not found

2005-10-11 Thread Harvey Toyama
Hi Sara and Alex,
Thanks for the help. I did, in fact, have the sense of it wrong. My next
line was:

put the result ...

Your help is appreciated,
-- Harvey
-- 
 

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Alex
Tweedly
Sent: Tuesday, October 11, 2005 3:38 PM
To: How to use Revolution
Subject: Re: matchText handler not found

Harvey Toyama wrote:

>Hi,
>
>After yesterday's good advice on regex, I wanted to try "matchText". My
>installation (2.5.1) doesn't find the  handler. Is there something
>special about the matchText function? In fact, I don't recall any of the
>documented functions being found.
>

matchText is not a handler, it's a function.
If you use it in a context where a handler would be used, you'll get the
rather confusing message "handler not found".

It should be used like:

  put matchText(someData, someExpr) into myVar

not simply as

   matchText(someData, someExpr)




-- 
Alex Tweedly   http://www.tweedly.net





Re: matchText handler not found

2005-10-11 Thread Alex Tweedly

Harvey Toyama wrote:

> Hi,
>
> After yesterday's good advice on regex, I wanted to try "matchText". My
> installation (2.5.1) doesn't find the  handler. Is there something
> special about the matchText function? In fact, I don't recall any of the
> documented functions being found.

matchText is not a handler, it's a function.
If you use it in a context where a handler would be used, you'll get the
rather confusing message "handler not found".

It should be used like:

 put matchText(someData, someExpr) into myVar

not simply as

  matchText(someData, someExpr)




--
Alex Tweedly   http://www.tweedly.net





Re: matchText handler not found

2005-10-11 Thread Sarah Reichelt
> After yesterday's good advice on regex, I wanted to try "matchText". My
> installation (2.5.1) doesn't find the  handler. Is there something
> special about the matchText function? In fact, I don't recall any of the
> documented functions being found.
>

matchText is a function, not a handler. If you don't give it somewhere
to put the result, you will get an error.

This script works perfectly:
on mouseUp
   put matchText("Goodbye","bye")
end mouseUp

This gives the "handler not found" error:
on mouseUp
   matchText("Goodbye","bye")
end mouseUp

HTH,
Sarah


Re: MatchText and PCRE

2005-07-20 Thread David Wilkinson
David

Just a couple of points, since Mark has already answered very well.

Frederic Rinaldi's plugin in the rev development/plugin menu is 
invaluable for testing Rev's regex implementation and porting syntax, 
if you haven't discovered it.

Like Mark, I had difficulties with the (?s) switch, though I 
concluded that this was to do with the greediness of the regex engine 
and my ignorance. I resorted to using repeat for each line since I 
could not get lookbehind to work.

I would also recommend Jeffrey Friedl's illuminating book, though 
probably not reading it as I did during a couple of sleepless nights 
with a dental abscess. It is not a recipe book, but then I suspect 
that you already have a good understanding of regex.

David Wilkinson


Re: MatchText and PCRE

2005-07-19 Thread David Vaughan


On 20/07/2005, at 10:15, Mark Greenberg <[EMAIL PROTECTED]> wrote:

> David,
> I am by no means an advanced programmer in Rev, but I have been
> using Regex a lot lately.  This is what I have discovered in my use
> of Regex in Rev:
>
>
> I highly recommend the book Mastering Regular Expressions by
> Jeffrey E. F. Friedl.  There are a bunch more obscure codes that
> work in the Rev flavor of Regex.

Thanks, Mark. I have the book on order. I have also been contemplating
buying RegExplorer, which is written in RealBasic and includes four
checkboxes related to my four questions. Apparently, these are
parameters you set in the RealBasic Class, so I was wondering about
the corresponding behaviours and options in Rev. Now I know how to
set these switches for Rev defaults, and I will change behaviours when
needed by using the RegEx codes rather than the checkboxes in the
program.

regards
David

> Mark Greenberg





Re: MatchText error

2004-12-14 Thread Byron
Thank you.  That did it... obviously I still have a lot to learn.

On Dec 14, 2004, at 12:44 PM, Alex Tweedly wrote:

> put matchText("Goodbye","bye")



Re: MatchText error

2004-12-14 Thread Alex Tweedly
At 12:36 14/12/2004 -0800, Byron wrote:

> I'm getting the error
>
> Type:   Handler: can't find handler
> Object: Test Match Text
> Line:   matchText("Goodbye","bye")
> Hint:   matchText

MatchText is a function - try something like

  put matchText("Goodbye","bye")

-- Alex.


Re: Matchtext

2003-07-31 Thread Alex Rice
On Thursday, July 31, 2003, at 11:42  AM, Steve Gehlbach wrote:

> Alex Rice wrote:
>> put matchText(tLine, "(\d{1,2})/(\d{1,2})/(\d{2,4})", tDay, tMo, tYr)
>
> ??  Although the "\d" and {m,n} syntax is standard regexp well
> documented in Perl, I did not see this in the Rev Regular Expression
> Syntax page.  Do I have bad eyes or is this not documented (maybe Rev
> is a complete reg exp implementation but not completely explained in
> the rev docs)?

Steve, yes: since RR 2.0 the regex engine has used the PCRE library. It
supports nearly all Perl regular expression syntax. This is not explained
in the Rev docs, except for a mention of Perl-compatible regex in the
Release Notes / What's New doc for RR 2.0.x.

Alex Rice, Software Developer
Architectural Research Consultants, Inc.
http://ARCplanning.com


Re: Matchtext

2003-07-31 Thread Dar Scott
On Thursday, July 31, 2003, at 10:58 AM, Alex Rice wrote:

> Here is one that handles different lengths for the digits, but doesn't
> check the ranges of the day and month numbers. But that could be done
> in transcript.
>
> put matchText(tLine, "(\d{1,2})/(\d{1,2})/(\d{2,4})", tDay, tMo, tYr)

I like this better.  The other only did partial checking anyway, so if
date checking is needed, Transcript is a good way to go.  The other did
have an advantage in that it would be less likely to pick up something in
a long text that looked like a date but was not.
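
The split described here, a loose pattern to capture the fields plus ordinary code to check their ranges, might look like the sketch below. It is Python rather than Transcript, `parse_date` is a hypothetical name, the day/month/year order follows the variable names in the matchText call above, and the range test is deliberately crude (it knows nothing about month lengths).

```python
import re

# Same pattern as in the matchText call: 1-2 digits, 1-2 digits, 2-4 digits.
DATE_RE = re.compile(r"(\d{1,2})/(\d{1,2})/(\d{2,4})")

def parse_date(line):
    """Return (day, month, year) for the first d/m/y-looking text, or None."""
    m = DATE_RE.search(line)
    if m is None:
        return None
    day, month, year = (int(g) for g in m.groups())
    # The regex only checks digit counts; the ranges are checked here.
    if not (1 <= day <= 31 and 1 <= month <= 12):
        return None
    return day, month, year

print(parse_date("Meeting on 31/07/2003 at noon"))
print(parse_date("Ratio 45/99/2003"))
```

The second call shows the trade-off mentioned above: the loose pattern happily matches "45/99/2003", and only the follow-up range check rejects it.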

Dar Scott

 

  Dar Scott Consulting    http://www.swcp.com/dsc/    Programming Services
 




Re: Matchtext

2003-07-31 Thread Dar Scott
On Thursday, July 31, 2003, at 11:42 AM, Steve Gehlbach wrote:

> Alex Rice wrote:
>> put matchText(tLine, "(\d{1,2})/(\d{1,2})/(\d{2,4})", tDay, tMo, tYr)
>
> ??  Although the "\d" and {m,n} syntax is standard regexp well
> documented in Perl, I did not see this in the Rev Regular Expression
> Syntax page.  Do I have bad eyes or is this not documented (maybe Rev
> is a complete reg exp implementation but not completely explained in
> the rev docs)?

In my minimal change response, I avoided this issue.  Revolution 2
includes an enhancement that expanded the regex capabilities.

Alex has reason to believe that this is exactly what it has:

http://www.pcre.org/pcre.txt

Ken Ray has also mentioned this reference:

http://www.perldoc.com/perl5.6.1/pod/perlre.html

I fully expect Revolution 2 documentation to catch up and provide a  
good pointer and provide a caveat as to the completeness of its  
description.

Dar Scott

 

  Dar Scott Consulting    http://www.swcp.com/dsc/    Programming Services
 




Re: Matchtext

2003-07-31 Thread Yves COPPE
On Thursday, 31 Jul 2003, at 18:58 Europe/Brussels, Alex Rice wrote:

> On Thursday, July 31, 2003, at 10:39  AM, Dar Scott wrote:
>
>> On Thursday, July 31, 2003, at 07:18 AM, Yves COPPE wrote:
>>
>>> if matchtext(theLine,"(0[1-9]|[12][0-9]|3[01])/(0[1-9]|1[0-2])/([0-9][0-9])",theDay,TheMonth,TheYear) is true then
>>
>> Try this:
>> "([ 0][1-9]|[12][0-9]|3[01])/(0[1-9]|1[0-2])/([0-9][0-9])"
>
> Here is one that handles different lengths for the digits, but doesn't
> check the ranges of the day and month numbers. But that could be done
> in transcript.
>
> put matchText(tLine, "(\d{1,2})/(\d{1,2})/(\d{2,4})", tDay, tMo, tYr)

Thank you Dar and Alex...

Greetings.

Yves COPPE
[EMAIL PROTECTED]


Re: Matchtext

2003-07-31 Thread Steve Gehlbach
Alex Rice wrote:
> put matchText(tLine, "(\d{1,2})/(\d{1,2})/(\d{2,4})", tDay, tMo, tYr)

??  Although the "\d" and {m,n} syntax is standard regexp well 
documented in Perl, I did not see this in the Rev Regular Expression 
Syntax page.  Do I have bad eyes or is this not documented (maybe Rev is 
a complete reg exp implementation but not completely explained in the 
rev docs)?

-Steve



Re: Matchtext

2003-07-31 Thread Alex Rice
On Thursday, July 31, 2003, at 10:39  AM, Dar Scott wrote:

> On Thursday, July 31, 2003, at 07:18 AM, Yves COPPE wrote:
>
>> if matchtext(theLine,"(0[1-9]|[12][0-9]|3[01])/(0[1-9]|1[0-2])/([0-9][0-9])",theDay,TheMonth,TheYear) is true then
>
> Try this:
> "([ 0][1-9]|[12][0-9]|3[01])/(0[1-9]|1[0-2])/([0-9][0-9])"

Here is one that handles different lengths for the digits, but doesn't
check the ranges of the day and month numbers. But that could be done
in Transcript.

put matchText(tLine, "(\d{1,2})/(\d{1,2})/(\d{2,4})", tDay, tMo, tYr)

Alex Rice, Software Developer
Architectural Research Consultants, Inc.
http://ARCplanning.com


Re: Matchtext

2003-07-31 Thread Dar Scott
On Thursday, July 31, 2003, at 07:18 AM, Yves COPPE wrote:

> if matchtext(theLine,"(0[1-9]|[12][0-9]|3[01])/(0[1-9]|1[0-2])/([0-9][0-9])",theDay,TheMonth,TheYear) is true then

Try this:
"([ 0][1-9]|[12][0-9]|3[01])/(0[1-9]|1[0-2])/([0-9][0-9])"

This will include the space in theDay, and there must be a space or zero.
This can be enhanced to exclude the space and to allow other situations,
such as the string starting with a one-digit day.

This assumes a two-digit year, as the original does, so it picks up
only the 20 of 2003.  It can be enhanced to work with four digits and even
to skip over a "20" if needed.

This can also be enhanced to handle a single-digit month.
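
To make those caveats concrete, here is a small sketch that runs the suggested pattern unchanged. It is Python rather than Transcript (the pattern uses only syntax common to both regex engines), and the helper name `groups` is made up for this example.

```python
import re

# The suggested pattern, copied as-is: day may start with a space or a zero.
PATTERN = r"([ 0][1-9]|[12][0-9]|3[01])/(0[1-9]|1[0-2])/([0-9][0-9])"

def groups(line):
    """Return the (day, month, year) captures as strings, or None if no match."""
    m = re.search(PATTERN, line)
    return m.groups() if m else None

# The space is captured as part of the day, and only "20" of 2003 is kept.
print(groups("on 5/07/2003"))
# A one-digit day at the very start of the string has no space or zero
# in front of it, so nothing matches.
print(groups("5/07/2003"))
```

Both limitations come from the pattern itself rather than from matchText, so the same behaviour would be expected in any Perl-compatible engine.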

Dar Scott

 

  Dar Scott Consulting    http://www.swcp.com/dsc/    Programming Services
 

