Regex Help
I need to test for the following date-type strings:

  2004
  2004-12
  2004-12-01

All of the above would be legal; any other combination would be illegal. I have been using the following data to test the regex:

  ''            Should produce a false result.
  '2004'        Should produce a true result.
  '200412'      Should produce a false result.
  '20041201'    Should produce a false result.
  '2004-12'     Should produce a true result.
  '2004-12-01'  Should produce a true result.

My first attempt at a regex is /\d{4}(-\d{2})*/. This produces the following result:

  ''          is false
  2004        is true
  200412      is true
  20041201    is true
  2004-12     is true
  2004-12-01  is true

I would prefer that '200412' and '20041201' would fail, i.e. be reported as false. I'm not sure what's going on with this regex, as I would expect the following: match the first four numeric characters, then match zero or more occurrences of a dash character followed by two numeric characters.

The next regex /\d{4}(-\d{2})+/ produces the following results:

  ''          is false
  2004        is false
  200412      is false
  20041201    is false
  2004-12     is true
  2004-12-01  is true

This is closer, but I also want '2004' to be true.

The next regex /\d{4}(?=-\d{2})/ produces the same results as the previous regex:

  ''          is false
  2004        is false
  200412      is false
  20041201    is false
  2004-12     is true
  2004-12-01  is true

I've done quite a few regexes in the past but have never used the positive lookahead assertion before because I have never really understood it. Out of the three regexes, I would have expected the first to fulfill my requirements. I do not understand why it does not, although I suspect it has something to do with the dash character. What am I doing wrong here? I should be able to accomplish this test using a single regex, right?

Dirk Bremer - Systems Programmer II - ESS/AMS - NISC St. Peters
USA Central Time Zone
636-922-9158 ext. 8652 fax 636-447-4471
[EMAIL PROTECTED] www.nisc.cc
___
Perl-Win32-Users mailing list
Perl-Win32-Users@listserv.ActiveState.com
To unsubscribe: http://listserv.ActiveState.com/mailman/mysubs
RE: Regex Help
Dirk,

I'm not a regex pro, but this worked for me:

  /^\d{4}(-\d{2}){0,2}$/

I tested it with this:

  @test = qw( 2004 200412 20041201 2004-12 2004-12-01 );
  foreach $test (@test) {
      print "$test : " . ($test =~ /^\d{4}(-\d{2}){0,2}$/ ? "TRUE" : "FALSE") . "\n";
  }

Regards,
Mike

-----Original Message-----
From: [EMAIL PROTECTED] On Behalf Of Dirk Bremer
Sent: Thursday, January 27, 2005 10:43 AM
To: perl-win32-users
Subject: Regex Help
RE: Regex Help
Okay, this is amazingly cloddish, but it does work and may point out where some of your regexes are going wrong. Whenever I have difficulty, I try to simplify it down and then analyze.

  @strings = qw/ 2004 200412 20041201 2004-12 2004-12-01/;
  for (@strings){
      if (/^\d{4}$|^\d{4}-\d{2}-\d{2}$|^\d{4}-\d{2}$/){
          print "$_ matches\n";
      }
      else{
          print "$_ does not match\n";
      }
  }

Sam Gardner
GTO Application Development
Keefe, Bruyette Woods, Inc.
212-887-6753

-----Original Message-----
From: Dirk Bremer [mailto:[EMAIL PROTECTED]]
Sent: Thursday, January 27, 2005 11:43 AM
To: perl-win32-users
Subject: Regex Help
RE: Regex Help
The reason that 200412 matches in your first regex is that the first four characters match the pattern (as expected), but nothing in the regex causes the additional characters 12 to result in a mismatch; the regex engine simply ignores them. Anchoring the regex with ^ and $ (as Mike has done) disallows any additional characters in the string, and it should then work as you expected. Note that $ on its own is not enough: without ^, the pattern could still match the last four digits of 200412. Furthermore, using * instead of {0,2} will match strings such as 2004-12-01-02-03.

[EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
01/27/2005 11:06 AM
To: [EMAIL PROTECTED], perl-win32-users@listserv.ActiveState.com
cc:
Subject: RE: Regex Help
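To see which part of each test string the unanchored pattern actually consumes, a small sketch along these lines may help. The helper name matched_part is made up for illustration; the two patterns are the ones quoted in this thread.

```perl
use strict;
use warnings;

# Test strings from the original question, including the empty string.
my @tests = ('', '2004', '200412', '20041201', '2004-12', '2004-12-01');

# Hypothetical helper: return the substring a pattern matched, or undef.
sub matched_part {
    my ($re, $str) = @_;
    return $str =~ $re ? $& : undef;
}

my $unanchored = qr/\d{4}(-\d{2})*/;         # first attempt from the question
my $anchored   = qr/^\d{4}(-\d{2}){0,2}$/;   # Mike's anchored version

for my $t (@tests) {
    my $loose  = matched_part($unanchored, $t);
    my $strict = matched_part($anchored,   $t);
    printf "%-14s unanchored: %-14s anchored: %s\n",
        "'$t'",
        defined $loose  ? "'$loose'"  : 'no match',
        defined $strict ? "'$strict'" : 'no match';
}
```

For '200412' the unanchored pattern reports a match on '2004' only, which is why the test came out true; the anchored pattern has to account for the whole string and therefore fails.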
RE: Regex Help
Yes, my brain was asleep today and I forgot to throw in the anchors. The {0,2} construct is a good idea but not required in this instance, as the field cannot exceed 10 characters. Thanks for everyone's help with this.

-----Original Message-----
From: Lloyd Sartor [mailto:[EMAIL PROTECTED]
Sent: Thursday, January 27, 2005 11:50
To: [EMAIL PROTECTED]
Cc: Dirk Bremer; perl-win32-users@listserv.ActiveState.com; [EMAIL PROTECTED]
Subject: RE: Regex Help
RegEx help
I wrote a RegEx to let me know if a line of text has exactly 36 commas in it (the comma is the separator) and I came up with this. I don't think it is quite right. I could use a little pointing in the right direction. Any thoughts?

  $lines[0] =~/^(.*?,){36}.*?$/

Thanks,
Jeff.
RE: RegEx help
  Any thoughts?

    $lines[0] =~/^(.*?,){36}.*?$/

    $lines[0] =~ /^[^,]*(?:,[^,]*){36}$/

I like Joe's answer, but if you must use a regex this one works:

  print scalar(my @commas = $lines[0] =~ /(,)/g);

- Mark.
RE: RegEx help
[EMAIL PROTECTED] wrote:

: I wrote a RegEx to let me know if a line of text has exactly
: 36 commas in it (the comma is the separator) and I came up
: with this. I don't think it is quite right. I could use a
: little pointing in the right direction. Any thoughts?

You could use tr/// to count the commas. It sounds like you need a test for valid records. This sub counts the commas and returns a comparison of the count with 36.

  foreach my $record_length ( 35 .. 37 ) {
      my $foo = ' ,' x $record_length;
      if ( is_valid_record( $foo ) ) {
          print "valid record: $foo\n";
      }
      else {
          print "invalid record: $foo\n";
      }
  }

  sub is_valid_record {
      return $_[0] =~ tr/,// == 36;
  }

  __END__

HTH,
Charles K. Clarkson
--
Mobile Homes Specialist
254 968-8328
RE: RegEx help
Jeff Williams wrote, on Tue 12/14/2004 11:23:

: I wrote a RegEx to let me know if a line of text has exactly 36 commas in
: it (the comma is the separator) and I came up with this. I don't think it
: is quite right. I could use a little pointing in the right direction. Any
: thoughts?
: $lines[0] =~/^(.*?,){36}.*?$/

Presumably you also want the fields, eventually. I'd use

  @fields = split(/,/, $lines[0]);

and then you can simply check

  if (@fields == 37) { }

(36 commas implies 37 fields). That way you already have the @fields array when you want them later. Presumably the magic number 37 is a constant defined at the top of your script?

Good luck,
Joe

==
Joseph P. Discenza, Sr. Programmer/Analyst
mailto:[EMAIL PROTECTED]
Carleton Inc. http://www.carletoninc.com
574.243.6040 ext. 300 fax: 574.243.6060
Providing Financial Solutions and Compliance for over 30 Years
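One caveat with the split approach, offered as a sketch rather than a correction of Joe's code: by default split discards trailing empty fields, so a record whose last columns are empty would come out short. Passing a negative LIMIT keeps them. The helper name field_count and the sample line are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: count fields as split() sees them.
# A negative LIMIT keeps trailing empty fields; the default drops them.
sub field_count {
    my ($line, $keep_trailing) = @_;
    my @fields = $keep_trailing
        ? split(/,/, $line, -1)
        : split(/,/, $line);
    return scalar @fields;
}

my $line = 'a,b,,';    # 3 commas, so 4 fields, the last two empty

print "default split: ", field_count($line, 0), " fields\n";
print "LIMIT of -1:   ", field_count($line, 1), " fields\n";
```

With the default call the sample line yields 2 fields; with a LIMIT of -1 it yields 4. It is probably also worth chomping the line first, since otherwise the trailing newline ends up in the last field.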
Re: RegEx help
at 09:23 AM 12/14/2004, Jeff Williams wrote:

  I wrote a RegEx to let me know if a line of text has exactly 36 commas in it (the comma is the separator) and I came up with this. I don't think it is quite right. I could use a little pointing in the right direction. Any thoughts?

    $lines[0] =~/^(.*?,){36}.*?$/

$lines[0] =~ /^[^,]*(?:,[^,]*){36}$/

  ^       beginning of line
  [^,]*   grab any stuff before the first comma
  (?:     group but don't capture
  ,       one comma
  [^,]*   followed by zero or more of anything not a comma
  ){36}   36 of these groups
  $       end of line

  Thanks,
  Jeff.
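The difference between the two patterns shows up on a line with more than 36 commas. The following sketch compares them, reading the reply's pattern the way its breakdown describes it ([^,]* followed by 36 non-capturing groups). The helper commas is hypothetical and just builds a test line with a given number of commas.

```perl
use strict;
use warnings;

my $lazy  = qr/^(.*?,){36}.*?$/;          # pattern from the question
my $exact = qr/^[^,]*(?:,[^,]*){36}$/;    # pattern from the reply

# Hypothetical helper: build a line containing exactly $n commas.
sub commas {
    my ($n) = @_;
    return join ',', ('x') x ($n + 1);
}

for my $n (35 .. 37) {
    my $line  = commas($n);
    my $count = ($line =~ tr/,//);        # tr/// counts without changing $line
    printf "%d commas: lazy=%-8s exact=%-8s tr=%s\n", $n,
        ($line =~ $lazy  ? 'match' : 'no match'),
        ($line =~ $exact ? 'match' : 'no match'),
        ($count == 36    ? 'match' : 'no match');
}
```

Because .*? is allowed to swallow commas when it has to, the original pattern accepts any line with at least 36 commas. The [^,]* version cannot step over a comma, so it accepts exactly 36, as does the tr/// count.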
Re: RegEx help
There's a fun little idiom using tr/// to count the number of occurrences of a character in a string:

  perl -le '$str = "a,b,c,d,e"; $count = ($str =~ tr/,/,/); print $count'

will output 4, the number of commas in "a,b,c,d,e".

--Kester

Jeff Williams wrote, on Tue 12/14/2004 11:23:

: I wrote a RegEx to let me know if a line of text has exactly 36 commas in it (the comma is the separator) and I came up with this. I don't think it is quite right. I could use a little pointing in the right direction. Any thoughts?
: $lines[0] =~/^(.*?,){36}.*?$/
Re: regex help
$Bill wrote:

  ... and \1 is deprecated for $1.

I believe that should read [thanks Greg!]

  ... and \1 is deprecated on the right side of s///.

On the left side (as in any regexp), \1 and $1 differ. But as a holdover from sed, \1 means the same as $1 on the right side of the subst.

\1 isn't the same as $1, in that

  s/(.)\1//;

deletes duplicate chars, while

  s/(.)$1//;

depends upon the value of $1, which in this case would be set *before* the subst got invoked. On the right-hand side, $1 is what we expect (the final matched text inside the first set of parens).

\1 (on the LHS) also can and will change during the course of a regex evaluation: as the parser/matcher works, \1 will take on different values during backtracking etc., though in the end \1 is the same as $1. I've never understood it, but I believe Perl golfer/obfu folks can do strange and terrible things with the difference.

a

Andy Bach, Sys. Mangler
Internet: [EMAIL PROTECTED]
VOICE: (608) 261-5738 FAX 264-5932

If there be time to expose through discussion the falsehood and fallacies, to avert the evil by the process of education, the remedy to be applied is more speech, not enforced silence.
Justice Louis Brandeis
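A short sketch of the difference described above. Both helper names are invented for the example. The first uses \1 inside the pattern, so it refers to whatever group 1 matched in the current attempt; the second uses $1 inside the pattern, which is interpolated from an earlier, unrelated match before the substitution starts.

```perl
use strict;
use warnings;

# \1 in the pattern: "the same character again". Doubled letters
# are replaced by a single copy ($1 on the right-hand side).
sub collapse_doubles {
    my ($s) = @_;
    $s =~ s/(.)\1/$1/g;
    return $s;
}

# $1 in the pattern: interpolated before matching begins, so it holds
# the capture of the previous successful match, here the literal 'x'.
sub strip_using_previous_capture {
    my ($s) = @_;
    'x' =~ /(x)/;           # sets $1 to 'x'
    $s =~ s/(.)$1//g;       # pattern is effectively /(.)x/
    return $s;
}

print collapse_doubles('bookkeeper'), "\n";
print strip_using_previous_capture('aaxbx'), "\n";
```

The first call prints bokeper. The second prints a, because every "any character followed by x" pair is removed, which has nothing to do with duplicate characters.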
regex help
From: Malcolm Mill [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Date: Tue, 2 Nov 2004 21:00:27 +
Subject: Regex help

Hi,

Newbie here having trouble with regex. I'm trying to parse an HTML file saved as text to find all instances of time. The parsed file is called myFindTime2.txt and contains text like

  <tr> <td align=right valign=top> <font size=2 color=#00 class=listings> 11:30 am<br /> </font> </td> </tr>

The script is called myFindTime2.pl and contains the lines

  foreach (<>) {
      s/([\t][\d2][:][\d2][\s])/\1/;
      print $_;
  }

The command I use is:

  perl myFindTime2.pl myFindTime2.txt

Any suggestions? Is what I'm trying to do here with s/search/replace_backreference/ valid?
RE: regex help
Assuming your times are always in the same format, this would work:

  print "$1\n" while ($_ =~ /(\d{1,2}\:\d{2} [ap]m)/gi);

-Pete

-----Original Message-----
From: Malcolm Mill [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, November 02, 2004 4:11 PM
To: [EMAIL PROTECTED]
Subject: regex help
RE: regex help
Hi,

Your find pattern is wrong. Use the following pattern:

  s/([\t]\d{2}[:]\d{2}[\s])/\1/;

Regards,
Gopal.R

-----Original Message-----
From: [EMAIL PROTECTED] On Behalf Of Malcolm Mill
Sent: Wednesday, November 03, 2004 2:41 AM
To: [EMAIL PROTECTED]
Subject: regex help
Re: regex help
Ganesh Babu Nallamothu, Integra-India wrote:

  Hi, Your Find patter is wrong. Use the following Pattern:

    s/([\t]\d{2}[:]\d{2}[\s])/\1/;

\t, \s and : do not need to be in a character class, and \1 is deprecated for $1. This should be equivalent:

  s/(\t\d{2}:\d{2}\s)/$1/;

You should also be careful about the leading and trailing WS. This would require leading and trailing WS of any kind and remove it:

  s/\s+(\d{2}:\d{2})\s+/$1/;

You could also replace it with a single leading and trailing space:

  s/\s+(\d{2}:\d{2})\s+/ $1 /;

--
$Bill Luebkert    Mailto:[EMAIL PROTECTED]
DBE Collectibles  Mailto:[EMAIL PROTECTED]
Castle of Medieval Myth Magic http://www.todbe.com/
http://dbecoll.tripod.com/ (My Perl/Lakers stuff)
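A minimal sketch of the whitespace normalization described above, assuming the capture group wraps only the digits so that the surrounding whitespace falls outside $1. The helper name normalize_time_spacing and the sample string are made up, and \d{1,2} is used for the hour (an assumption, so that times such as 9:05 are also handled).

```perl
use strict;
use warnings;

# Hypothetical helper: squeeze any run of whitespace on either side
# of the first HH:MM in a line down to a single space.
sub normalize_time_spacing {
    my ($line) = @_;
    $line =~ s/\s+(\d{1,2}:\d{2})\s+/ $1 /;
    return $line;
}

my $sample = "class=listings>\t\t11:30   am<br />";
print normalize_time_spacing($sample), "\n";
```

For the sample this prints class=listings> 11:30 am<br />. If the parentheses enclosed the \s+ as well, $1 would contain the whitespace and the substitution would put it straight back.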
Re: REGEX help!
Jamie Murray wrote:

  Hi Glenn, I have worked on this further and looked at some of the previous posts. I have tried this with all different combinations of IP address and this has worked. Of course I got the idea from a previous post from Alex and Mark Thomas. Please understand I could have just copied and pasted Mark's solution, but then I wouldn't have learned anything.

    if($num =~ /^(([0-1]{0,1}[0-9]{0,1}[0-9]|2[0-4][0-9]|25[0-5])\.?){4}$/)

  I feel that Mark's post is the better choice though and much slicker than mine.

    my $octet = qr/\d|[01]?\d\d|2[0-4]\d|25[0-5]/;
    my $valid_ip = qr/^$octet\.$octet\.$octet\.$octet$/o;
    print "yes" if $ip =~ $valid_ip;

This should cover some failing cases:

  use strict;
  my $DDD = qr{(?:\d|[01]?\d\d|2[0-4]\d|25[0-5])};    # a dotted decimal digit
  my @ips = ('1.254.255.255', '256.257.258.259', '127.0.0.1', '127.0.1', '127.1', '127');
  foreach (@ips) {
      # made digits 2,3,4 optional
      if (/^$DDD(?:(?:(?:\.$DDD)?\.$DDD)?\.$DDD)?$/o) {
          print "OK: $_\n";
      } else {
          print "BAD: $_\n";
      }
  }
  __END__

This would be faster but uglier:

  /^(?:\d|[01]?\d\d|2[0-4]\d|25[0-5])(((\.(?:\d|[01]?\d\d|2[0-4]\d|25[0-5]))?\.(?:\d|[01]?\d\d|2[0-4]\d|25[0-5]))?\.(?:\d|[01]?\d\d|2[0-4]\d|25[0-5]))$/

Then there's hex (0x7f01) and decimal IP addresses (2130706433) and IPv6 addresses ([::127.0.0.1]) etc., and inet_aton is still the way to go for pure correctness in checking an IP number.

--
$Bill Luebkert    Mailto:[EMAIL PROTECTED]
DBE Collectibles  Mailto:[EMAIL PROTECTED]
Castle of Medieval Myth Magic http://www.todbe.com/
http://dbecoll.tripod.com/ (My Perl/Lakers stuff)
Re: REGEX help!
Hey Alex,

I jumped a little quick there. The previous post does work, but I had a doh moment and forgot your upper range match could only be 254 at most. Sorry about that.

  if($num =~ /^[0-2][0-5][0-4]\.[0-2][0-5][0-4]\.[0-2][0-5][0-4]\.[0-2][0-5][0-4]$/)

After each class [] use {num,num} to adjust for a part of the IP not having a number. So for example

  if($num =~ /^[0-2]{0,1}[0-5][0-4]\.[0-2][0-5][0-4]\.[0-2][0-5][0-4]\.[0-2][0-5][0-4]$/)

matches IPs like these:

  three digit 254 or less.three digit 254 or less.three digit 254 or less.three digit 254 or less

or

  two digit 54 or less.three digit 254 or less.three digit 254 or less.three digit 254 or less

----- Original Message -----
From: alex p [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Monday, January 12, 2004 11:56 AM
Subject: REGEX help!

Hello all,

I have been trying to find a regex to validate IP ranges and found the following:

  m{
      ^  ( \d | [01]?\d\d | 2[0-4]\d | 25[0-5] )
     \.  ( \d | [01]?\d\d | 2[0-4]\d | 25[0-5] )
     \.  ( \d | [01]?\d\d | 2[0-4]\d | 25[0-5] )
     \.  ( \d | [01]?\d\d | 2[0-4]\d | 25[0-5] )
      $            # from the Perl Cookbook, Recipe 6.23,
  }xo              # pages 218-9, as fixed in the 01/00 reprint

Can someone explain this REGEX to me? I have done the following but it's not working:

  if ($ip =~ /^\d[0-254]\.\d[0-254]\.\d[0-254]\.\d[0-254]$/)
  {
      print "$ip is valid\n";
  }
  else { print "$ip is invalid\n"; }
  }

TYIA
Re: REGEX help!
  if ($ip =~ /^\d[0-254]\.\d[0-254]\.\d[0-254]\.\d[0-254]$/)

is an incorrect way of matching an IP address. It will work for 61.14.95.02, but will not work for 66.18.99.07. The problem is that each \d[0-254] matches exactly two characters, never one or three, and the second of them has to come from a small set of digits. A character class [] matches a single character. For example, [aDc] is true for a but not for aa, and true for D but not for aDc.

  \d      = one of 0 1 2 3 4 5 6 7 8 9
  [01]?   = zero or one occurrence of 0 or 1
  [0-4]   = one of 0 1 2 3 4, but not several of them at the same time

Now your code:

  [0-254] = will match one of 0 1 2 4 5.
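The point about character classes can be checked directly. The sketch below lists the single digits that [0-254] accepts and then tries the per-octet piece \d[0-254] on a few values; the helper name two_char_octet is made up for the example.

```perl
use strict;
use warnings;

# Which single digits does the class [0-254] accept?
# It is the range 0-2 plus the individual characters 5 and 4.
my @class_hits = grep { /^[0-254]$/ } 0 .. 9;
print "[0-254] accepts: @class_hits\n";

# Hypothetical helper: the per-octet piece of the pattern under discussion.
sub two_char_octet {
    my ($octet) = @_;
    return $octet =~ /^\d[0-254]$/ ? 1 : 0;
}

print "$_ => ", two_char_octet($_), "\n" for (qw(61 02 66 07 192 7));
```

The first line prints 0 1 2 4 5. Of the sample octets only 61 and 02 pass: 66 and 07 fail on their second character, while 192 and 7 fail because the piece always needs exactly two characters.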
Re: REGEX help!
On approximately 1/12/2004 8:36 PM, came the following characters from the keyboard of Jamie Murray:

> Hey Alex, I jumped a little quick there, the previous post does work but I had a doh moment and forgot your upper range match could only be 254 at most. Sorry about that.
>
>     if($num =~ /^[0-2][0-5][0-4]\.[0-2][0-5][0-4]\.[0-2][0-5][0-4]\.[0-2][0-5][0-4]$/)
>
> after each class [] use {num,num} to adjust for a part of the ip not having a number. so for example
>
>     if($num =~ /^[0-2]{0,1}[0-5][0-4]\.[0-2][0-5][0-4]\.[0-2][0-5][0-4]\.[0-2][0-5][0-4]$/)
>
> matches ip's like these
> three digit 254 or less.three digit 254 or less.three digit 254 or less.three digit 254 or less.
> or
> two digit 54 or less.three digit 254 or less.three digit 254 or less.three digit 254 or less

Oh please. Think about some test cases. Do you think that Alex would want to match 192.168.10.10? I think so, but your regex wouldn't. See $Bill's reply, or the original Cookbook expression, for a much better suggestion.

But to get back to one of the questions that Alex originally asked:

alex p wrote:

> I have been trying to find a regex to validate IP ranges and found the following:
>
>     m{
>         ^  ( \d | [01]?\d\d | 2[0-4]\d | 25[0-5] )
>        \.  ( \d | [01]?\d\d | 2[0-4]\d | 25[0-5] )
>        \.  ( \d | [01]?\d\d | 2[0-4]\d | 25[0-5] )
>        \.  ( \d | [01]?\d\d | 2[0-4]\d | 25[0-5] )
>         $                # from the Perl Cookbook, Recipe 6.23,
>     }xo                  # pages 218-9, as fixed in the 01/00 reprint
>
> can someone explain this REGEX to me

The first thing one observes from the structure of the above REGEX is that it consists of 4 repeated identical expressions (one on each line). The first one starts at the beginning of the string (^), the last one ends at the end of the string ($), and they are separated by periods (\.).

Now let's examine one of the repeated groups. Each group has 4 alternatives, each separated by |.

Alternative 1 is any single digit. This covers the range of values from 0-9 inclusive.

Alternative 2 is any pair of digits, or optionally any pair of digits preceded by either 0 or 1. This covers the range of values from 0-199 inclusive. Note that I did NOT say it covers the range of values from 10-199: the values from 0-9 can also legally be expressed as 01 - 09, or 001 - 009.

Alternative 3 is a 2, followed by a digit from 0-4, followed by any digit. This covers the range of values 200-209, 210-219, 220-229, 230-239, and 240-249, the union of which is 200-249.

Alternative 4 is 25, followed by a digit from 0-5, so it covers the range of values from 250-255.

Put together, the alternatives cover the complete range of 1 to 3 digit numbers ranging in value from 0-255.

The whole expression covers the complete range of 1 to 3 digit numbers ranging in value from 0-255, grouped in 4 groups separated by a single period character. This is the usual form for expressing IP addresses.

--
Glenn -- http://nevcal.com/
The best part about procrastination is that you are never bored, because you have all kinds of things that you should be doing.
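The walk-through of the four alternatives can be turned into a quick test. This is a minimal sketch rather than the Cookbook's own code: it factors one octet into a variable, and the helper name is_dotted_quad and the sample addresses are chosen here purely for illustration.

```perl
use strict;
use warnings;

# One octet, written as the four alternatives described above.
my $octet = qr/ \d | [01]?\d\d | 2[0-4]\d | 25[0-5] /x;

# Four octets separated by literal dots, anchored at both ends.
my $dotted_quad = qr/ ^ $octet \. $octet \. $octet \. $octet $ /x;

# Hypothetical helper name, not from the thread.
sub is_dotted_quad {
    my ($ip) = @_;
    return $ip =~ $dotted_quad ? 1 : 0;
}

for my $ip (qw(0.0.0.0 192.168.10.10 001.002.003.004 255.255.255.255
               256.1.1.1 1.2.3 1.2.3.4.5)) {
    printf "%-16s %s\n", $ip, is_dotted_quad($ip) ? 'valid' : 'invalid';
}
```

The first four addresses are reported as valid and the last three as invalid. Note that `$` also allows a single trailing newline; `\z` would be the stricter choice if that matters.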
Re: REGEX help!
Jamie Murray wrote:

> Hey Alex, I jumped a little quick there, the previous post does work but I had a doh moment and forgot your upper range match could only be 254 at most. Sorry about that.
>
> if($num =~ /^[0-2][0-5][0-4]\.[0-2][0-5][0-4]\.[0-2][0-5][0-4]\.[0-2][0-5][0-4]$/)
                ^^^  ^^^  ^^^

The digits can be 0-9, not 0-2, 0-4 or 0-5. E.g., 192.168.0.1 is a legal IP. You can't check a number range this way.

> after each class [] use {num,num} to adjust for a part of the ip not having a number. so for example
>
> if($num =~ /^[0-2]{0,1}[0-5][0-4]\.[0-2][0-5][0-4]\.[0-2][0-5][0-4]\.[0-2][0-5][0-4]$/)
>
> matches ip's like these
> three digit 254 or less.three digit 254 or less.three digit 254 or less.three digit 254 or less.
> or
> two digit 54 or less.three digit 254 or less.three digit 254 or less.three digit 254 or less

--
$Bill Luebkert
DBE Collectibles
Castle of Medieval Myth Magic http://www.todbe.com/
http://dbecoll.tripod.com/ (My Perl/Lakers stuff)
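$Bill's objection can be reproduced in a few lines. The sketch below copies the per-digit pattern from the quoted message and tries it on a handful of addresses; the helper name per_digit_ok and the sample list are assumptions made for this illustration.

```perl
use strict;
use warnings;

# The per-digit pattern from the quoted message: each position is limited
# independently, which is not the same as limiting the number to 254.
my $per_digit = qr/^[0-2][0-5][0-4]\.[0-2][0-5][0-4]\.[0-2][0-5][0-4]\.[0-2][0-5][0-4]$/;

# Hypothetical helper name.
sub per_digit_ok { return $_[0] =~ $per_digit ? 1 : 0 }

for my $ip (qw(254.254.254.254 100.100.100.100 192.168.0.1 199.199.199.199 10.0.0.1)) {
    printf "%-16s %s\n", $ip, per_digit_ok($ip) ? 'accepted' : 'rejected';
}
```

Only the first two addresses are accepted. 192.168.0.1 is rejected because 9 is not in [0-5], and 10.0.0.1 is rejected because every group must have exactly three digits.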
Re: REGEX help!
Bill,

If you check my second post, I made the correction, but I'm still correct in my example and method. Actually the e-mail from Raul Davletshin pretty much verifies what I had also stated, and he's also correct.

As for explaining [0-2]: 0 or 1 or 2 are all possibilities of course, but only one (unless using ?, but that's another story). So where's the problem? You're explaining something we already know.

Also, the example I gave Alex can be adjusted to his needs using class [] and range {}. At least he will know how to put together some type of expression that works instead of just relying on built-in functions.

As for your post down below, you can check numbers that way. Did you run this in a script before you decided it doesn't work? Because it worked perfectly for me. Please run what I have below and correct what is actually incorrect, not what you think is incorrect. I'm all for learning and am just trying my best.

Thanks!

----- Original Message -----
From: $Bill Luebkert
Sent: Tuesday, January 13, 2004 5:07 AM
Subject: Re: REGEX help!

> Jamie Murray wrote:
>
> > if($num =~ /^[0-2][0-5][0-4]\.[0-2][0-5][0-4]\.[0-2][0-5][0-4]\.[0-2][0-5][0-4]$/)
>
> The digits can be 0-9, not 0-2, 0-4 or 0-5. eg: 192.168.0.1 is a legal IP. You can't check a number range this way.
Re: REGEX help!
On approximately 1/13/2004 6:49 AM, came the following characters from the keyboard of Jamie Murray:

> Hi guys,
> I have seen my error which I have overlooked and don't mind admitting it. : ) Course don't hold it against me cause I'm just eager to learn and try things out.
>
> My regex works: it matches exactly what I want, but not all possibilities. I overlooked the simple fact that Alex wants not 0 or 2 or 5 or 4 but 254 or less. Course with the example I posted Alex can easily adjust for this. So my method excludes 65 and up, 165 and up, but not 254 to 200 or 154 to 100 or 54 or less. So yes Bill, I'm excluding 192 amongst others in my regex. I see your point.
>
> OK, so this gets a little deeper than expected because I can have 199 but not 299:
>
>     /^([0-1]{0,1}[0-9][0-9] | 2[0-4][0-9] | 25[0-5])\.
>       ([0-1]{0,1}[0-9][0-9] | 2[0-4][0-9] | 25[0-5])\.
>       ([0-1]{0,1}[0-9][0-9] | 2[0-4][0-9] | 25[0-5])\.
>       ([0-1]{0,1}[0-9][0-9] | 2[0-4][0-9] | 25[0-5]) $/
>
> So now we are checking for 000 or 00 to 199, or 200 to 249, or 250 to 255, followed by \.
>
> Now I should have this right. Making mistakes sure helps you learn and think things through more thoroughly. How is that, or do you have any more suggestions?

As someone else pointed out, you are rapidly approaching the REGEX given in the Perl Cookbook. Once you add a case to handle single digit numbers you will be there.

The only other differences are that you are using {0,1}, which is exactly the same as ?, and you are using [0-9], which is exactly the same as \d. In both cases, the latter of the two equivalent expressions is shorter to express, and is the one used by the REGEX in the Perl Cookbook.

--
Glenn -- http://nevcal.com/
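The claimed equivalence of {0,1} with ? and of [0-9] with \d can be checked exhaustively for short digit strings. This is only a sketch for ASCII input, written for this purpose and not taken from the thread: on Unicode strings Perl's \d also matches non-ASCII digits unless the /a modifier is used, so "exactly the same" holds for plain ASCII data.

```perl
use strict;
use warnings;

# Long-hand and short-hand spellings of the same octet alternatives
# (without the single-digit case, as in the message being discussed).
my $long  = qr/^(?:[0-1]{0,1}[0-9][0-9]|2[0-4][0-9]|25[0-5])$/;
my $short = qr/^(?:[01]?\d\d|2[0-4]\d|25[0-5])$/;

# Every number 0..999 written with a minimum width of 1, 2 and 3 digits.
my @candidates;
for my $n (0 .. 999) {
    push @candidates, map { sprintf "%0*d", $_, $n } 1 .. 3;
}

# Count the strings on which the two spellings disagree.
my $disagreements = grep {
    ($_ =~ $long ? 1 : 0) != ($_ =~ $short ? 1 : 0)
} @candidates;

print "disagreements: $disagreements\n";
print "single digit '7' accepted? ", ('7' =~ $short ? 'yes' : 'no'), "\n";
```

The second line of output shows the remaining gap mentioned above: without a `\d` alternative, a single-digit octet such as 7 is not accepted.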
Re: REGEX help!
Hi Glenn,

I have worked on this further and looked at some of the previous posts. I have tried this with all different combinations of IP address and this has worked. Of course I got the idea from a previous post from Alex and Mark Thomas. Please understand I could have just copied and pasted Mark's solution, but then I wouldn't have learned anything.

    if($num =~ /^(([0-1]{0,1}[0-9]{0,1}[0-9]|2[0-4][0-9]|25[0-5])\.?){4}$/)

I feel that Mark's post is the better choice though, and much slicker than mine.

    my $octet = qr/\d|[01]?\d\d|2[0-4]\d|25[0-5]/;
    my $valid_ip = qr/^$octet\.$octet\.$octet\.$octet$/o;
    print "yes" if $ip =~ $valid_ip;
Re: REGEX help!
On approximately 1/13/2004 7:01 PM, came the following characters from the keyboard of Jamie Murray:

> Hi Glenn,
> I have worked on this further and looked at some of the previous posts. I have tried this with all different combinations of ip address and this has worked. Of course I got the idea from a previous post from Alex and Mark Thomas. Please understand I could have just copied and pasted Mark's solution but then I wouldn't have learned anything.
>
>     if($num =~ /^(([0-1]{0,1}[0-9]{0,1}[0-9]|2[0-4][0-9]|25[0-5])\.?){4}$/)

Well, this REGEX appears like it will match all the desired cases. However, the optional \. and the {4} combine to allow it to successfully match some things that you might not want to match. Mark's post is better, but let us learn why.

I've mentioned this as a difference before, but the use of \d instead of [0-9] and ? instead of {0,1} both tend to aid readability. They are shorter, so the specific concept that they represent, while in one sense just a shorthand for exactly the way you express it, becomes more readable to the people that understand those shortcuts, because it takes less reading and thought to grasp the concept. When you see [0-9], all 5 characters have to be examined to realize that the expression means a numeric character, whereas \d requires only the examination of 2 characters to arrive at the same conclusion. Similarly for ? vs {0,1}. These shortcuts can save time in reading the expression, and also stand out as different from [0-4] and {0,4}. They exist for very common cases, where knowing and understanding the shortcuts helps significantly in quickly understanding the REGEX.

Now the \.?){4} vs spelling out 4 instances. I really don't know anything about the edge cases that people talk about that have fewer than 3 dots, and/or more than 3 digit groupings of numbers. All the IP addresses I have ever seen have been so-called dotted quads. I'm sure there is an RFC somewhere that describes the exact details of what is and is not legal IP address notation. Clearly, an IP address is just a 32-bit number and can be expressed in a number of ways other than the dotted quad. However, the Cookbook expression that Alex started with was clearly intended to match dotted quads and only dotted quads, and so I think that is the goal you are trying to achieve.

Your REGEX allows fewer than 3 dots. The \.? is optional in each of the 4 groupings, not only for the last one. So your REGEX would permit things that aren't dotted quads, such as

    127.999
    1.2.34
    123234135157

When testing software of any sort, it is important to be sure that it accepts the things you want to accept, and also rejects the things it doesn't understand.

> I feel that Mark's post is the better choice though and much slicker than mine.
>
>     my $octet = qr/\d|[01]?\d\d|2[0-4]\d|25[0-5]/;
>     my $valid_ip = qr/^$octet\.$octet\.$octet\.$octet$/o;
>     print "yes" if $ip =~ $valid_ip;

Yes, Mark's solution is better than yours, in that it only accepts dotted quads, and it is better than the Cookbook solution (in my opinion) because it removes the complex redundancy, and can be expressed in half the lines even with the enhanced clarity.

Happy learning.
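The difference between the optional-dot pattern and the spelled-out dotted quad can be checked directly. The sketch below places the two patterns from the thread side by side; the variable names $jamie and $mark and the list of samples are chosen here for illustration only.

```perl
use strict;
use warnings;

# Pattern with an optional dot repeated four times (from Jamie's message).
my $jamie = qr/^(([0-1]{0,1}[0-9]{0,1}[0-9]|2[0-4][0-9]|25[0-5])\.?){4}$/;

# Pattern with three mandatory dots (from Mark's solution).
my $octet = qr/\d|[01]?\d\d|2[0-4]\d|25[0-5]/;
my $mark  = qr/^$octet\.$octet\.$octet\.$octet$/;

for my $s (qw(192.168.10.10 127.999 1.2.34 123234135157 1.2.3.4.)) {
    printf "%-14s optional-dot=%-3s dotted-quad=%s\n", $s,
        ($s =~ $jamie ? 'yes' : 'no'),
        ($s =~ $mark  ? 'yes' : 'no');
}
```

Both patterns accept 192.168.10.10, but only the optional-dot version accepts the other four strings. For example, 127.999 is read as the groups "127.", "9", "9", "9".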
Re: REGEX help!
Thanks for the discussion, clear explanation and advice, Glenn.
Re: REGEX help!
I think this is why $Bill said to use the syscall rather than the RE. IPs are also legally expressed as the raw 32-bit number in decimal, or as a subset of the dotted quad including only the elements required to disambiguate, depending on the subnet mask (with a subnet mask of 255.0.0.0, 127. is a valid IP representing the network (or is it broadcast?) address...).

Not following standards will always bite you in the behind at some stage. :)
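The thread does not say which call $Bill had in mind. As an assumption, the sketch below uses inet_aton and inet_ntoa from Perl's core Socket module to show the relationship between a dotted quad and the raw 32-bit number; the helper names ip_to_int and int_to_ip are made up for this example.

```perl
use strict;
use warnings;
use Socket qw(inet_aton inet_ntoa);

# Hypothetical helper: dotted quad -> unsigned 32-bit integer,
# or undef if the text cannot be turned into an address.
sub ip_to_int {
    my ($text) = @_;
    my $packed = inet_aton($text);      # 4 packed bytes, network order
    return defined $packed ? unpack('N', $packed) : undef;
}

# Hypothetical helper: unsigned 32-bit integer -> dotted quad.
sub int_to_ip {
    my ($number) = @_;
    return inet_ntoa(pack 'N', $number);
}

print ip_to_int('127.0.0.1'), "\n";     # prints: 2130706433
print int_to_ip(3232238090), "\n";      # prints: 192.168.10.10
```

One caveat on this design: inet_aton also accepts host names and will try to resolve them, so on its own it is a converter rather than a strict syntax check.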
Re: REGEX help!
just a stab starts with any digit(s) = any number of digits or = 0 or 1 or none(?=can be none but at most 1) followed by any 2 numbers = 2 followed by number between 0 and 4 followed by any number = 25 followed by any number between 0 and 5 followed by . the rest is repeat (I think that cookbook regex is crap but anyways) your regex fails for a number of reasons. what you have there is looking to match ip in format oftwo digits.two digits.two digits.two digits. you regex should be if($num =~ /^[0-9]\.[0-9]\.[0-9]\.[0-9]$/) this matches digit.digit.digit.digit. so we need to add {} with a range of occurences if($num =~ /^[0-9]{1,2}\.[0-9]]{3}\.[0-9]]{3}\.[0-9]]{3}$/) this matches one or two digits.three digits.three digits.three digits I put in the {num,num} just to show how you can adjust this to your needs for any match maybe there is some regex guru out there who can add to this or adjust what I have stated. But I did test the above and it works just fine. - Original Message - From: alex p [EMAIL PROTECTED] To: [EMAIL PROTECTED] Sent: Monday, January 12, 2004 11:56 AM Subject: REGEX help! Hello all, I have been trying to find a regex to validate IP ranges and found the following: m{ ^ ( \d | [01]?\d\d | 2[0-4]\d | 25[0-5] ) \. ( \d | [01]?\d\d | 2[0-4]\d | 25[0-5] ) \. ( \d | [01]?\d\d | 2[0-4]\d | 25[0-5] ) \. ( \d | [01]?\d\d | 2[0-4]\d | 25[0-5] ) $ # from the Perl Cookbook, Recipe 6.23, }xo # pages 218-9, as fixed in the 01/00 reprint can someone explain this REGEX to me I have done the following but its not working: if ($ip =~ /^\d[0-254]\.\d[0-254]\.\d[0-254]\.\d[0-254]$/) { print $ip is valid\n; } else {print $ip is invalid\n;} } TYIA _ There are now three new levels of MSN Hotmail Extra Storage! Learn more. 
Regex Help Needed
I have a list of characters. I need to get a list of all possible sequences of these characters. For example, I have a string that consists of '-mevqgn'. I need to pattern match any combination of 'mevqgn' with a preceding - or --. Right now this is what I am doing, but it is very ugly and difficult to come up with the combinations, and it makes my brain hurt!:

  if ($LS_Val =~ /-{1,2}(mevqgn|
      emvqgn|evmqgn|evqmgn|evqgmn|evqgnm|
      veqgnm|vqegnm|vqgenm|vqgnem|vagnme|
      qvgnme|qgvnme|qgnvme|qgnmve|qgnmev|
      gqmnev|gmqnev|gmnqev|gmneqv|gmnevq|
      mgnevq|mngevq|mnegvq|mnevgq|mnevqg|
      nmevqg|nemvqg|nevmqg|nevqmg|nevqgm|
      envqgm|evnqgm|evqngm|evqgnm|evqgmn|
      )/i) {
      #Do Something;
  }

A subroutine that takes the string of characters as an argument and then returns 1 on success and undef on fail would be ideal for my purpose. Any help is appreciated.

Dax
Re: Regex Help Needed
Dax T. Games wrote: [...] A subroutine that takes the string of characters as an argument and then returns 1 on success and undef on fail would be ideal for my purpose.

The above doesn't test all cases of the string - am I missing something in the description? Assuming each of the letters has to be present and there is only one occurrence of each letter (that may be a stretch) and the order can be any order:

  # first check that we have the right letters
  if ($LS_Val =~ /-{1,2}([mevqgn]{6})/i) {
      # then make sure we only have one of each
      if (check ($1)) {
          print "found it\n";
      }
  }

  sub check {
      my %letters = map { $_ => 0 } split '', $LS_Val;
      map { return 0 if not exists $letters{$_} or $letters{$_}++ } split '', $_[0];
      return 1;
  }

-- $Bill Luebkert
RE: Regex Help Needed
I wanted to use tr but was unable to accomplish the task that way. So I used a regex like the following:

  use strict;
  my %MCTWW = qw(m -1 e -1 v -1 q -1 g -1 n -1);
  my $MyCharsToWorkWith = \%MCTWW;
  $_ = '--mepqgn ';
  if ( ! /-{1,2}(\S+)/ ) {
      printf "Expecting a hyphen or two followed by non space, but did not get a hit.\n";
      printf "Data: %-s\n", $_;
      printf "Correct and rerun.\n";
      exit(1);
  }
  my $MyData = lc($1);
  my $MyMiss = 0;
  my $MyOrigData = $MyData;
  foreach my $MyChar (keys %{$MyCharsToWorkWith}) {
      $MyCharsToWorkWith->{$MyChar} = ( $MyData =~ s/$MyChar//g );
      $MyMiss++ if ( ! $MyCharsToWorkWith->{$MyChar} );
  }
  if ( $MyMiss ) {
      printf "Not all characters were present in passed data. I had $MyMiss misses.\n";
      printf "Data was %-s and invalid data was %-s\n", $MyOrigData, $MyData;
  } else {
      printf "All necessary characters were present.\n";
  }

From your description they should only appear once, and the sequence doesn't matter. I believe it could be a starting place.

Wags ;)
Re: Regex Help Needed
Have you tried playing around with character sets? Something like

  $target = 'mevqgn';
  $length_target = length $target;
  if ( $LS_Val =~ /-{1,2}[$target]{$length_target}/ ) {
      #do something
  }

Whether the above would work for you would depend on whether the code can ignore positive matches on $LS_Val = '--mmmqqq' and so forth. It might be worthwhile to look more closely at the data and see whether there are don't-care cases that you can ignore. If there are not, then there is a loop approach:

  $t = 'mevqgn';   # just to save keystrokes
  $x = $LS_Val;
  if ( $x =~ /(-{1,2})/ ) {
      $goodSoFar = $1;
      while (length $t and $x =~ /($goodSoFar([$t]))/ ) {
          $goodSoFar = $1;
          $t =~ s/$2//;
      }
      do_Something unless length $t;
  }

That's undoubtedly slower than your original approach, but would be more versatile and possibly easier to maintain. (Neither snippet above has been tested.)

-- Will Woodhull
RE: Regex Help Needed
Here is another variation...

  #!/usr/bin/perl
  check('-mevqgn');
  check('-memqgn');
  check('-ngmevq');
  check('--meqvgn');

  sub check {
      my $LS_Val = shift;
      if ($LS_Val =~ /-{1,2}([mevqgn]{6})/ and unique_chars($1)) {
          print "Ding Ding! $LS_Val is good!\n";
      } else {
          print "Flopped: $LS_Val\n";
      }
  }

  sub unique_chars {
      my $chars = shift;
      my %chars;
      for (split(//, $chars)) {
          $chars{$_}++;
      }
      for (keys %chars) {
          if ($chars{$_} > 1) {
              return 0;
          }
      }
      return 1;
  }
Re: Regex Help Needed
On Tue, 2 Sep 2003, Dax T. Games wrote: [...] A subroutine that takes the string of characters as an argument and then returns 1 on success and undef on fail would be ideal for my purpose.

How about /^-{1,2}[mevqgn]{6}/, i.e. turn all the characters you wish to match into a character class. This assumes that the alphabetic string you want to match will be 6 characters long. If it could be of any arbitrary length but only consisting of the characters mevqgn, then the pattern might be /^-{1,2}[mevqgn]*/. This also assumes that you are matching at the beginning of a string. The regex used in a match would of course produce a true or false result, so if you really wanted it to be in a subroutine then you could do something like:

  sub matched { return $mystring =~ /^-{1,2}[mevqgn]*/; }

I personally would think that just using the match directly in an if statement would be cleaner, e.g.

  if ($mystring =~ /^-{1,2}[mevqgn]*/) { ... }

Carl Jolley
All opinions are my own and not necessarily those of my employer
Re: Regex Help Needed
Wow... looks like some good replies to this one. Here's a less elegant, recursive approach (until I learn map :-)

  #!perl -w
  # print all 720 permutations using letters: e m v q g n
  use strict;

  sub mutate {
      my ($in) = @_;
      if (length($in) == 6) {
          print "$in\n";
      } else {
          foreach (qw/e m v q g n/) {
              if ($in =~ /$_/) {
                  # do nothing
              } else {
                  mutate($in . $_);
              }
          }
      }
  }

  mutate('');
  #THE END

If you're going the other way (trying to see if a known string is valid), something like this ought to work:

  if (length($a) == 6 and $a =~ /e/ and $a =~ /m/ and $a =~ /v/ [... et al ...] ) {
      #it's valid
  }

It figures TMTOWTDI. Hope this helps :-)

Charlie
RE: Regex Help Needed
It looks like you may be doing standard command line option parsing (or almost standard, as the '--' prefix is reserved for long option names). If this is so, look at Getopt::Std. For a subroutine that does what you specified (tested):

  sub is_DTG_Option ($) {
      my $opt = shift;
      return 0 if $opt !~ /^--?([mevqgn]+)$/;
      my @c = split //, $1;
      my %h = ();
      for ( @c ) {
          return 0 if $h{$_}++;
      }
      return 1;
  }

  # Test various combinations (both legal and illegal)
  my @test = qw( --mevqgn -mevqgn ---mevqgn mevqgn -mv --qm -neq -ef );
  for ( @test ) {
      print $_ . " " . is_DTG_Option( $_ ) . "\n";
  }

-- Mike Arms
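If the string really is a bundle of command line switches, the Getopt::Std suggestion might look like the sketch below. It assumes the six letters are plain on/off switches; parse_switches is a made-up wrapper, and only the single-dash form is shown.

```perl
use strict;
use warnings;
use Getopt::Std;

# Hypothetical wrapper: parse a list of arguments for the single-letter
# switches m, e, v, q, g and n, and return the switches that were set.
sub parse_switches {
    my @args = @_;
    local @ARGV = @args;          # getopts() reads from @ARGV
    my %opts;
    getopts('mevqgn', \%opts) or return;
    my @set = sort { $a cmp $b } keys %opts;
    return @set;
}

print join(',', parse_switches('-mevqgn')), "\n";
print join(',', parse_switches('-q', '-m')), "\n";
```

Note that this accepts any subset of the switches, in any order, which is looser than requiring all six letters exactly once.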
RE: Regex Help Needed
Equally dirty, but possibly more flexible:

  $_ = 'aSdFgHjk';                              # Letters to look for.
  $alpha1 = join('', sort(split(//, lc($_))));
  $LS_Val = shift;
  $LS_Val =~ s/^-+//;                           # Drop preceding dashes
  $alpha2 = join('', sort(split(//, lc($LS_Val))));
  if ($alpha1 eq $alpha2) { print "Pattern found!\n"; }
Re: Regex Help Needed
At 01:26 PM 9/2/2003, Dax T. Games wrote: [...] A subroutine that takes the string of characters as an argument and then returns 1 on success and undef on fail would be ideal for my purpose.

if ($val =~ /^--?[mevqgn]{6}$/) almost works, but allows, e.g. -mmmmmm, i.e. repetitions. I would just do:

  if (($val =~ /^--?([a-z]+)$/) && (length($1)==6) && check($1, qw(m e v q g n))) { ...

Full test program:

  use strict;

  MAIN: {
      test('');
      test('-');
      test('--');
      test('-mev');
      test('-mevqgn');
      test('--mevqgn');
      test('--qvgnme');
      test('-qgvnme');
      test('--qgnvme');
      test('--qgnmve');
      test('-qgnmev');
  }

  # --
  sub test {
      my($str) = @_;
      if (($str =~ /^--?([a-z]+)$/) && (length($1)==6) && check($1, qw(m e v q g n))) {
          print("OK '$str'\n");
      } else {
          print("no '$str'\n");
      }
  } # test

  # --
  sub check {
      my($str, @chars) = @_;
      foreach my $ch (@chars) {
          return undef if (index($str, $ch)==-1);
      }
      return 1;
  } # check

John Deighan
Re: Regex Help Needed
$Bill Luebkert wrote: [...]

  sub check {
      my %letters = map { $_ => 0 } split '', $LS_Val;

The above $LS_Val should be another var that has the letters in it or just 'mevqgn' instead.

      map { return 0 if not exists $letters{$_} or $letters{$_}++ } split '', $_[0];
      return 1;
  }

-- $Bill Luebkert
RE: Regex Help Needed
How about sorting the letters first:

  $var = "meqgvn";
  $sortedvar = join("", sort(split("", $var)));
  if ($sortedvar eq "egmnqv") { print "yes!\n"; }

--ken
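The sorting idea can be wrapped into the kind of subroutine the original post asked for (1 on success, undef on failure). This is a sketch under the assumption that every letter must appear exactly once and case does not matter; is_permutation_option is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if $arg is one or two dashes followed by
# exactly the letters in $letters (any order, each once), undef otherwise.
sub is_permutation_option {
    my ($arg, $letters) = @_;
    my ($body) = $arg =~ /^-{1,2}([A-Za-z]+)$/ or return;
    my $want = join '', sort { $a cmp $b } split //, lc $letters;
    my $got  = join '', sort { $a cmp $b } split //, lc $body;
    return $got eq $want ? 1 : undef;
}

for my $arg (qw(-mevqgn --ngmevq -memqgn ---mevqgn mevqgn)) {
    printf "%-10s %s\n", $arg,
        is_permutation_option($arg, 'mevqgn') ? 'ok' : 'no';
}
```

Comparing sorted strings checks membership, count and length in one step, so no list of permutations is needed.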
Regex help
I'm trying to match a string which would start with http://, then a character string where there are one or more instances of %, followed by one instance of .com, i.e. http://www.%55.com

Here is my current pattern:

  http://.*%+.*(\.com)

However, if there are two or more instances of .com in the string, it matches on the last .com. This is a test string that should not match, but does:

  http://www.test.com/%7Encbath/;http://www.test.com/

In this case, I want the match to fail completely, because there is another .com in the string. I've tried various combinations, and I obviously suck at regexes. Help?

Andrew
RE: Regex help
Andrew wrote: [...] In this case, I want the match to fail completely, because there is another .com in the string.

The problem is that many things can slip through the .*, which allows any number of any character, so a .com will slip through it. Why not first check for only one instance of .com, then use your original pattern? Also, look at using ^ and $ so that the pattern can't slide along the string.

R.
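The two-step suggestion (count the .com occurrences first, then apply an anchored pattern) might be sketched like this. The helper name single_com_with_percent is invented for the example, and it assumes the % has to appear somewhere before the single .com.

```perl
use strict;
use warnings;

# Hypothetical helper: 1 if $url starts with http://, contains a % before
# ".com", and ".com" occurs exactly once in the whole string; 0 otherwise.
sub single_com_with_percent {
    my ($url) = @_;
    my $count = () = $url =~ /\.com/g;      # count every ".com"
    return 0 unless $count == 1;
    return $url =~ m{^http://[^%]*%.*\.com} ? 1 : 0;
}

for my $url ('http://www.%55.com',
             'http://www.test.com/%7Encbath/;http://www.test.com/') {
    print single_com_with_percent($url), " $url\n";
}
```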
RE: RegEx help
  You could cheat a little instead of trying to do it with just a RE, use an embedded tr/// or s///:

    my $text = 'axxxccvvvacassdcxaswrefaejjawerassdcxaswrefaejhhaasera';
    $text =~ s#s(.*?)e# my $x = $1; $x =~ tr/a/8/; sprintf 's%se', $x #eg;

--

Thanks. The main problem I was having was I couldn't find the e option for the RegEx. I did look again and still couldn't find it in perlre. My final code looks like this:

  my $x;
  $Text =~ s#<blockquote>(.*?)<\/blockquote>#
      $x = $1;
      $x =~ s/<br \/>|<br>/\n/;
      $x =~ s//\n/;
      $x =~ s/\t/ /;
      $x =~ s/<pre>/<pre style="font-size:x-small; font-family: monospace;">/;
      sprintf '<blockquote>%s</blockquote>', $x ;
      #egisx;
Re: RegEx help
Bullock, Howard A. wrote: [...]

Your two little dashes make my client think your reply is your .sig - I wouldn't use them. Had to cut-n-paste your reply:

  Thanks. The main problem I was having was I couldn't find the e option for the RegEx. I did look again and still couldn't find it in perlre.

You looked in the wrong place: perlop man page:

  s/PATTERN/REPLACEMENT/egimosx
      Searches a string for a pattern, and if found, replaces that pattern with the replacement text and returns the number of substitutions made. Otherwise it returns false (specifically, the empty string).
  <snip>
      Options are:
        e   Evaluate the right side as an expression.
        g   Replace globally, i.e., all occurrences.
        i   Do case-insensitive pattern matching.
        m   Treat string as multiple lines.
        o   Compile pattern only once.
        s   Treat string as single line.
        x   Use extended regular expressions.

-- $Bill Luebkert
Re: RegEx help
Bullock, Howard A. wrote:

  I want to alter some characters in a text variable, but only those between two markers.

    $text = 'axxxccvvvacassdcxaswrefaejjawerassdcxaswrefaejhhaasera';

  I want to specify the start s and the end e and only change the a's to 8's between the markers. How do I accomplish this?

    $text =~ s/s find a's e/g;

You could cheat a little instead of trying to do it with just a RE, use an embedded tr/// or s///:

  my $text = 'axxxccvvvacassdcxaswrefaejjawerassdcxaswrefaejhhaasera';
  $text =~ s#s(.*?)e# my $x = $1; $x =~ tr/a/8/; sprintf 's%se', $x #eg;

-- $Bill Luebkert
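For readers on a newer Perl, the same /e technique can be written without the temporary variable. This is a sketch, not the code from the thread: it relies on the /r flag of tr/// (Perl 5.14 or later), and swap_between is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper: change a to 8, but only between an s and the next e.
# /e runs the replacement as Perl code; tr///r returns the changed copy
# instead of modifying $1 (requires Perl 5.14+).
sub swap_between {
    my ($text) = @_;
    $text =~ s{s(.*?)e}{ 's' . ($1 =~ tr/a/8/r) . 'e' }eg;
    return $text;
}

print swap_between('axxxccvvvacassdcxaswrefaejjawerassdcxaswrefaejhhaasera'), "\n";
```

The non-greedy (.*?) matters here: it stops at the first e after each s, so every marked stretch is handled separately.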
Re: Regex help - trim doc contents
* Stephen Patterson ([EMAIL PROTECTED]) wrote: I have a scalar, $doc, which contains a plain ascii file, and which I'll be storing on a database (for searching). (...) Can anyone help?

There's a module called Whitespace on CPAN; would that work for you? Additionally, there's also String::Strip.

jb.
Regex Help Please!
I am trying to come up with a script to convert this output from RRDTool DUMP to a format which lends itself to import into Excel 97. Unfortunately, I am just getting started with Perl and do not have a clear enough grasp of how to configure this so that it strips out the unwanted parts and formats it correctly. I would like to be able to feed a file into this script, and then receive a comma delimited formatted file as output. Can anyone point me in the right direction? I have the O'Reilly camel book, but when I read the section on Regex, I feel like an idiot! :(

Input file:
|
(misc header information I want to delete)
#This is how the data I want to pull out is formatted
<!-- 2002-01-08 09:35:00 Eastern Standard Time / 1010500500 --> <row><v> NaN </v><v> NaN </v></row>
<!-- 2002-01-08 09:40:00 Eastern Standard Time / 1010500800 --> <row><v> 6.00e+001 </v><v> 6.90e+001 </v></row>
|---

Output wanted is:
2002-01-08 09:35:00 Eastern Standard Time, 1010500500, NaN, NaN
2002-01-08 09:40:00 Eastern Standard Time, 1010500800, 6.00e+001, 6.90e+001
|--

Thanks in advance.
Gordon
RE: Regex Help Please!
Here is a simplistic approach. It may want more edits, but it is a starting place. Placing the data for testing under DATA:

  while ( <DATA> ) {
      chomp;
      next if ( /^\s*$/ );   # bypass blank lines
      if ( /^<!--\s(\d+.+)\s\/\s(\d+)\s--> <row><v> (.+) <\/v><v> (.+) <\/v><\/row>/ ) {
          printf "%-s, %-s, %-s, %-s\n", $1, $2, $3, $4;
      } else {
          printf "No hit on data:\n%-s\n", $_;
      }
  }
  __DATA__
  <!-- 2002-01-08 09:35:00 Eastern Standard Time / 1010500500 --> <row><v> NaN </v><v> NaN </v></row>
  <!-- 2002-01-08 09:40:00 Eastern Standard Time / 1010500800 --> <row><v> 6.00e+001 </v><v> 6.90e+001 </v></row>
  ^--- Script ends here

Output:
2002-01-08 09:35:00 Eastern Standard Time, 1010500500, NaN, NaN
2002-01-08 09:40:00 Eastern Standard Time, 1010500800, 6.00e+001, 6.90e+001

Wags ;)
RE: Regex Help Please!
Works, but not if you have more or fewer than 2 values in a row. Do you?
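Handling rows with any number of values could be sketched as below. The sketch assumes the input lines carry the usual rrdtool dump markup (an XML comment with the timestamp, then <row><v>...</v></row>); dump_line_to_csv is a hypothetical helper, not part of RRDTool.

```perl
use strict;
use warnings;

# Hypothetical helper: turn one dump data line into a CSV line, however
# many <v>...</v> cells the row has. Returns nothing for other lines.
sub dump_line_to_csv {
    my ($line) = @_;
    my ($stamp, $epoch, $row) =
        $line =~ m{^\s*<!--\s*(.+?)\s*/\s*(\d+)\s*-->\s*<row>(.*)</row>}
        or return;
    my @values = $row =~ m{<v>\s*(.*?)\s*</v>}g;
    return join ', ', $stamp, $epoch, @values;
}

while (my $line = <STDIN>) {
    my $csv = dump_line_to_csv($line);
    print "$csv\n" if defined $csv;     # header lines are skipped silently
}
```

Run as a filter, e.g. perl script.pl < dump.xml > out.csv (file names are placeholders).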
RE: Regex Help Please!
I worked from the data you provided. What can the data really look like? Provide some other samples and I will make a mod to handle them (hopefully).

Wags ;)
RE: Regex Help Please!
:: -----Original Message-----
:: From: Gordon Brandt [mailto:[EMAIL PROTECTED]]
:: Sent: Thursday, January 10, 2002 12:17 PM
:: To: [EMAIL PROTECTED]
:: Subject: Regex Help Please!
::
:: [-snip-]
::
:: Input file:
:: |
::
:: (misc header information I want to delete)
::
:: #This is how the data I want to pull out is formatted
:: <!-- 2002-01-08 09:35:00 Eastern Standard Time / 1010500500
:: --> <row><v> NaN </v><v> NaN </v></row>
:: <!-- 2002-01-08 09:40:00 Eastern Standard Time / 1010500800
:: --> <row><v> 6.00e+001 </v><v> 6.90e+001 </v></row>
::
:: |---
::
:: Output wanted is:
:: 2002-01-08 09:35:00 Eastern Standard Time, 1010500500, NaN, NaN
:: 2002-01-08 09:40:00 Eastern Standard Time, 1010500800,
:: 6.00e+001, 6.90e+001
::
:: |--

The people on this list like nothing more, it seems, than chewing on a regular expression puzzle, so you've come to the right place. However, you'll get better results out of your request if you can fill in some more information about the input source.

Regular expressions get more complicated when they have to deal with more variable/generic data forms, but they're relatively easy if you build one for a single specific case. Programmers in general will usually go for the least amount of complexity in a solution, while at the same time being able to handle all possible scenarios. So to get good regex help, we all need to understand what variables there are in the scenarios.

Some issues that come to mind regarding the input data in your problem are:

1) Is the data guaranteed to contain one record per line? Can data ever spread to 2 or more lines?

2) Inside the <v>...</v> tags are values (or not). Are there ALWAYS EXACTLY TWO <v></v> groups?

3) Can the data ever contain quotation marks or commas? This is important to know when outputting to CSV.

4) Are there any other ways that the input data may vary from EXACTLY the format you've presented in your sample?

- Aaron

--
Aaron Brown - [EMAIL PROTECTED]
Middleware Programmer
University of Kansas
785-864-0423
http://www.ku.edu/~aaronb/
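Question 3 matters because a comma or a double quote inside a value would break a naively joined line. A minimal sketch of the usual CSV quoting convention follows; the helper names csv_field and csv_line are made up for this illustration and do not come from the thread.

```perl
use strict;
use warnings;

# Quote one field for CSV output: wrap it in double quotes when it contains
# a comma, a double quote, or a line break, doubling any embedded quotes.
# (csv_field and csv_line are hypothetical helper names.)
sub csv_field {
    my ($value) = @_;
    if ($value =~ /[",\r\n]/) {
        (my $quoted = $value) =~ s/"/""/g;
        return qq{"$quoted"};
    }
    return $value;
}

# Join a list of fields into one CSV line (no trailing newline).
sub csv_line {
    return join(',', map { csv_field($_) } @_);
}

print csv_line('2002-01-08 09:35:00 Eastern Standard Time', '1010500500', 'NaN', 'NaN'), "\n";
print csv_line('plain', 'has,comma', 'has "quote"'), "\n";
```

With the sample data in this thread none of the fields needs quoting, so the first line comes out unchanged apart from the separators.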
Re: Regex Help Please!
A less elegant (perhaps) solution, but effective, no matter how many rows / values:

while (<>) {
    s/\r//g;                                  # I hate that carriage return
    chomp;
    next if (!/^.*<\!--/);                    # skip non-matching lines
    my @values;
    next unless s/<\!--\s*(.*?)\s*-->//;      # cut out the comment, keep its text
    my $ts = $1;
    my ($ts1, $ts2) = split(/\s*\/\s*/, $ts); # timestamp text, epoch seconds
    while (s/<row>(.*?)<\/row>(.*)/$2/g) {    # peel off one row at a time
        my $row = $1;
        while ($row =~ s/<v>\s*(.*?)\s*<\/v>(.*)/$2/g) {   # peel off one value at a time
            push(@values, $1);
        }
    }
    my $val_str = join(', ', @values);
    print("$ts1, $ts2, $val_str\n");
}

on input:

<!-- 2002-01-08 09:35:00 Eastern Standard Time / 1010500500 --> <row><v> NaN </v><v> NaN </v></row>
<!-- 2002-01-08 09:35:00 Eastern Standard Time / 1010500500 --> <row><v> 59 </v><v> 6000 </v><v> 700</v></row>

returns:

2002-01-08 09:35:00 Eastern Standard Time, 1010500500, NaN, NaN
2002-01-08 09:35:00 Eastern Standard Time, 1010500500, 59, 6000, 700
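The same "any number of values" idea can also be sketched with a global match in list context, which returns every captured value at once. This is only an illustration: it assumes each data line has the shape <!-- timestamp / epoch --> <row><v> value </v>...</row>, and parse_row is a hypothetical helper name.

```perl
use strict;
use warnings;

# Parse one dump line into (timestamp text, epoch seconds, values...).
# Returns an empty list when the line is not a data row.
# (parse_row is a hypothetical name; the line shape is an assumption.)
sub parse_row {
    my ($line) = @_;
    my ($stamp, $epoch) = $line =~ m{<!--\s*(.*?)\s*/\s*(\d+)\s*-->}
        or return;
    # In list context, m//g returns the capture from every match.
    my @values = $line =~ m{<v>\s*(.*?)\s*</v>}g;
    return ($stamp, $epoch, @values);
}

my @lines = (
    '(misc header information)',
    '<!-- 2002-01-08 09:35:00 Eastern Standard Time / 1010500500 --> <row><v> NaN </v><v> NaN </v></row>',
    '<!-- 2002-01-08 09:35:00 Eastern Standard Time / 1010500500 --> <row><v> 59 </v><v> 6000 </v><v> 700</v></row>',
);

for my $line (@lines) {
    my @fields = parse_row($line) or next;   # skip header and other non-data lines
    print join(', ', @fields), "\n";
}
```

Header lines fall through because the comment pattern does not match them, so no separate skip test is needed.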
Re: Regex Help
How about this as a regex:

# $root keeps every directory component; the separator may be \ or /
($root,$file,$ext) = $path =~ m{^((?:[-_:\w]*[\\/])*)([^.\\/]+)\.(.+)$};

[EMAIL PROTECTED] Carl Jolley
All opinions are my own and not necessarily those of my employer

On Tue, 27 Mar 2001, Dirk Bremer wrote:

I would like something without the overhead of a module.

Dirk Bremer - Systems Programmer II - AMS Department - NISC
636-922-9158 ext. 652 fax 636-447-4471
mailto:[EMAIL PROTECTED]

----- Original Message -----
From: [EMAIL PROTECTED]
To: "Dirk Bremer" [EMAIL PROTECTED]
Cc: "perl-win32-users" [EMAIL PROTECTED]
Sent: Tuesday, March 27, 2001 4:05 PM
Subject: Re: Regex Help

Try using this:

use File::Basename; # for picking apart file specs
#
# filename, directory, extension
#
($filena,$ldir,$ext) = fileparse($ARGV[0],'\..*');

From: "Dirk Bremer" [EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
To: "perl-win32-users" [EMAIL PROTECTED]
cc:
Date: 03/27/01 05:03 PM
Subject: Regex Help
Please respond to "Dirk Bremer"

I want to build a regular expression that will separate a filename string into the root name, filename, and extension name, placing the results in $1, $2, $3. For example, given the filename "d:\doc1_directories\journals\test\b1_01_emfe.jrn", I would like the regex to return 3 different items, i.e.:

d:\doc1_directories\journals\test\
b1_01_emfe
jrn

If possible, I would like it to work with filename strings that utilize either forward slashes "/" or backward slashes "\". So far, I am able to get to the filename and extension using:

my $f = 'd:\doc1_directories\journals\test\b1_01_emfe.jrn';
$f =~ /([^\/\\]+)\.(.*)$/;

which returns:

$1 = b1_01_emfe
$2 = jrn

I have tried various combinations to get the root-part without success. This regex should also be able to handle filename strings that do not have the root part, i.e. b1_01_emfe.jrn where $1 would be undefined. Your suggestions will be appreciated.

Dirk Bremer - Systems Programmer II - AMS Department - NISC
636-922-9158 ext. 652 fax 636-447-4471
mailto:[EMAIL PROTECTED]
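As a worked example of the request quoted above, here is a sketch of a single regex that leaves the root undefined when the path has no directory part. The sub name split_path is hypothetical, and treating everything after the first dot of the last component as the extension is an assumption carried over from the patterns in this thread.

```perl
use strict;
use warnings;

# Split a path into (root, name, extension); works with \ or / separators.
# root is undef when there is no directory part. (split_path is a
# hypothetical name used only for this sketch.)
sub split_path {
    my ($path) = @_;
    my ($root, $name, $ext) = $path =~ m{^(.*[\\/])?([^\\/.]+)\.([^\\/]*)$}
        or return;
    return ($root, $name, $ext);
}

for my $path ('d:\doc1_directories\journals\test\b1_01_emfe.jrn',
              'c:/tmp/archive.tar.gz',
              'b1_01_emfe.jrn') {
    my ($root, $name, $ext) = split_path($path);
    printf "root=%s name=%s ext=%s\n",
        defined $root ? $root : '(undef)', $name, $ext;
}
```

The optional group (.*[\\/])? is greedy, so it swallows everything up to the last separator and simply does not take part in the match when no separator is present.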
Re: Regex Help
I gave up the below code in favour of the module ;) Pick at it if you will.

lee

# accepts path, returns (dir, fn)
if ($path !~ /\.(html?|xml)$/) {
    # no recognised file extension: treat the whole path as a directory
    if ($path !~ /(\\|\/)$/) {
        # append whichever separator the path already uses
        # (assumes the path contains at least one separator)
        $path =~ m/(\\|\/)/;
        $path .= $1;
    }
    return ($path, "");
}
else {
    # split after the last separator, trying "\" first and then "/"
    my $sep = rindex($path, "\\") + 1;
    $sep = rindex($path, "/") + 1 if $sep == 0;
    return ( substr($path, 0, $sep), substr($path, $sep) );
}

lee

At 16:39 27/03/2001 -0600, Dirk Bremer wrote:

I would like something without the overhead of a module.

Dirk Bremer - Systems Programmer II - AMS Department - NISC
636-922-9158 ext. 652 fax 636-447-4471
mailto:[EMAIL PROTECTED]

----- Original Message -----
From: [EMAIL PROTECTED]
To: "Dirk Bremer" [EMAIL PROTECTED]
Cc: "perl-win32-users" [EMAIL PROTECTED]
Sent: Tuesday, March 27, 2001 4:05 PM
Subject: Re: Regex Help

Try using this:

use File::Basename; # for picking apart file specs
#
# filename, directory, extension
#
($filena,$ldir,$ext) = fileparse($ARGV[0],'\..*');

From: "Dirk Bremer" [EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
To: "perl-win32-users" [EMAIL PROTECTED]
cc:
Date: 03/27/01 05:03 PM
Subject: Regex Help
Please respond to "Dirk Bremer"

I want to build a regular expression that will separate a filename string into the root name, filename, and extension name, placing the results in $1, $2, $3. For example, given the filename "d:\doc1_directories\journals\test\b1_01_emfe.jrn", I would like the regex to return 3 different items, i.e.:

d:\doc1_directories\journals\test\
b1_01_emfe
jrn

If possible, I would like it to work with filename strings that utilize either forward slashes "/" or backward slashes "\". So far, I am able to get to the filename and extension using:

my $f = 'd:\doc1_directories\journals\test\b1_01_emfe.jrn';
$f =~ /([^\/\\]+)\.(.*)$/;

which returns:

$1 = b1_01_emfe
$2 = jrn

I have tried various combinations to get the root-part without success. This regex should also be able to handle filename strings that do not have the root part, i.e. b1_01_emfe.jrn where $1 would be undefined. Your suggestions will be appreciated.
Regex Help
I have a complicated string:

(! SUBSTR(DB.USER1,2,5)="9") .AND. (LEFT(DB.USER1,1)"1") .AND. (ALLTRIM(DB.OUNCEWT)"2")

that I want to search for and then replace. I have tried this with no luck:

if ($InputLine =~ /\(\! SUBSTR\(DB\.USER1,2,5\)\=\"9\") \.AND\. \(LEFT\(DB\.USER1,1\)\\"1\"\) \.AND\. \(ALLTRIM\(DB\.OUNCEWT\)\\"2\"\)/io)

I have also tried this with no luck:

if ($InputLine =~ /\(ALLTRIM\(DB\.OUNCEWT\)/io)

In the last example, I get a regex error about unbalanced parentheses. I would really like to assign the whole string to a scalar:

$Str = "\(\! SUBSTR\(DB\.USER1,2,5\)\=\"9\") \.AND\. \(LEFT\(DB\.USER1,1\)\\"1\"\) \.AND\. \(ALLTRIM\(DB\.OUNCEWT\)\\"2\"\)"
if ($InputLine =~ /$Str/io)

I welcome your suggestions.

Dirk Bremer - Systems Programmer II - AMS Department - NISC
636-922-9158 ext. 652 fax 636-447-4471
mailto:[EMAIL PROTECTED]
RE: Regex Help
I tried this and it seemed to work:

my $InputLine = '(! SUBSTR(DB.USER1,2,5)="9") .AND. (LEFT(DB.USER1,1)"1") .AND. (ALLTRIM(DB.OUNCEWT)"2")';
my $InputLine1 = '(! SUjSTR(DB.USER1,2,5)="9") .AND. (LEFT(DB.USER1,1)"1") .AND. (ALLTRIM(DB.OUNCEWT)"2")';
my $Str = '(! SUBSTR(DB.USER1,2,5)="9") .AND. (LEFT(DB.USER1,1)"1") .AND. (ALLTRIM(DB.OUNCEWT)"2")';

if ( $InputLine =~ /\Q$Str\E/io ) {
    print "Found the string in InputLine!!\n";
}else {
    printf "No hit: InputLine:\n%-s\nSearch:\n%-s\n", $InputLine, $Str;
}
if ( $InputLine1 =~ /\Q$Str\E/io ) {
    print "Found the string in InputLine1!!\n";
}else {
    printf "No hit: InputLine1:\n%-s\nSearch:\n%-s\n", $InputLine1, $Str;
}

Running on win2000 and AS623.

Wags ;)

-----Original Message-----
From: Dirk Bremer [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, January 31, 2001 15:03
To: perl-win32-users
Subject: Regex Help

I have a complicated string:

(! SUBSTR(DB.USER1,2,5)="9") .AND. (LEFT(DB.USER1,1)"1") .AND. (ALLTRIM(DB.OUNCEWT)"2")

that I want to search for and then replace. I have tried this with no luck:

if ($InputLine =~ /\(\! SUBSTR\(DB\.USER1,2,5\)\=\"9\") \.AND\. \(LEFT\(DB\.USER1,1\)\\"1\"\) \.AND\. \(ALLTRIM\(DB\.OUNCEWT\)\\"2\"\)/io)

I have also tried this with no luck:

if ($InputLine =~ /\(ALLTRIM\(DB\.OUNCEWT\)/io)

In the last example, I get a regex error about unbalanced parentheses. I would really like to assign the whole string to a scalar:

$Str = "\(\! SUBSTR\(DB\.USER1,2,5\)\=\"9\") \.AND\. \(LEFT\(DB\.USER1,1\)\\"1\"\) \.AND\. \(ALLTRIM\(DB\.OUNCEWT\)\\"2\"\)"
if ($InputLine =~ /$Str/io)

I welcome your suggestions.

Dirk Bremer - Systems Programmer II - AMS Department - NISC
636-922-9158 ext. 652 fax 636-447-4471
mailto:[EMAIL PROTECTED]
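Since the original goal was to search for the string and then replace it, the same \Q...\E quoting can be used inside a substitution. The following is a sketch only: the input line and the replacement text are invented for illustration and are not the data from the thread.

```perl
use strict;
use warnings;

# Hypothetical input, search string and replacement, for illustration only.
my $InputLine   = 'IF (ALLTRIM(DB.OUNCEWT)="2") .AND. (LEFT(DB.USER1,1)="1")';
my $Str         = '(ALLTRIM(DB.OUNCEWT)="2")';
my $Replacement = '(ALLTRIM(DB.OUNCEWT)="3")';

# Without \Q, the parentheses in $Str act as grouping and the dot as a
# wildcard, so the literal text is not found.
my $plain_hit = ($InputLine =~ /$Str/) ? 1 : 0;

# \Q...\E escapes every regex metacharacter in $Str, so it matches literally.
(my $Result = $InputLine) =~ s/\Q$Str\E/$Replacement/i;

print "plain match: $plain_hit\n";
print "$Result\n";
```

The search string is kept in single quotes with no hand-written backslashes; the escaping is applied only at the moment the pattern is built.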
Regex: help with '\'
Is there a way to get the following text to match? If the '\' in '1\V' is changed, the regex matches. I would like to maintain the RAW text if possible.

$c = 'JUA_APPS01\VOL1:\APPS';
$d = 'JUA_APPS01\VOL';
if ($c =~ /$d/) {
    print "matched\n";
} else {
    print "Not Matched\n";
}
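The match fails because, once $d is interpolated into the pattern, the backslash is read as the start of a regex escape sequence rather than as a literal character. A minimal sketch of two ways to keep the raw text unchanged, using the core functions quotemeta and index:

```perl
use strict;
use warnings;

my $c = 'JUA_APPS01\VOL1:\APPS';
my $d = 'JUA_APPS01\VOL';

# quotemeta returns a copy of $d with every non-word character
# backslash-escaped; $d itself keeps the raw text.
my $pattern = quotemeta($d);
print "$pattern\n";

print "regex matched\n" if $c =~ /$pattern/;

# index() does a plain substring search and never treats $d as a pattern.
print "index matched\n" if index($c, $d) >= 0;
```

When only a yes/no substring test is needed, index avoids the regex engine entirely; quotemeta is the better fit when the literal text has to be combined with other pattern pieces.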
Re: Regex: help with '\'
"Bullock, Howard A." wrote:

> Is there a way to get the following text to match? If the '\' in '1\V' is changed, the regex matches. I would like to maintain the RAW text if possible.
>
> $c = 'JUA_APPS01\VOL1:\APPS';
> $d = 'JUA_APPS01\VOL';
> if ($c =~ /$d/) {
>     print "matched\n";
> } else {
>     print "Not Matched\n";
> }

my $c = 'JUA_APPS01\VOL1:\APPS';
my $d = 'JUA_APPS01\VOL';
if ($c =~ /\Q$d\E/) {
    print "matched\n";
} else {
    print "Not Matched\n";
}

Or use quotemeta on string $d.

--
$Bill Luebkert   ICQ=14439852
DBE Collectibles   http://www.todbe.com/
Mailto:[EMAIL PROTECTED]   http://dbecoll.webjump.com/
http://www.freeyellow.com/members/dbecoll/