Re: Regex to match "bad" characters in a parameter
"Chris Charley" writes: > You could do that in 1 line - See the following small program. > (The line using a 'grep' solution is commented out. It would work as well). > > > #!/usr/bin/perl > use strict; > use warnings; > > while (my $id = ) { >chomp $id; >#if (grep /itemid=.*?[^\w-]/, split /&/, $id) { >if ($id =~ /itemid/ && $id !~ /itemid=[\w-]+(?:&|$)/) { >print "Bad id: <$id>\n"; >} > } > > __DATA__ > itemid=AT18C&i_AT18C=1&t=main.htm&storeid=1&cols=1&c=detail.htm&ordering=asc > c=detail.htm&itemid=AT18C > itemid=AT18/C > t=main.htm&storeid=1&cols=1&c=detail.htm&ordering=asc This might be a string with a bad item id because there is none: Are you going to process the string, assuming that it is a good item id? How do you determine the beginning of the relevant sequence --- and thus whether the string contains a good item id or not --- when the string might not contain 'itemid' to designate the beginning? I think you might need to work with cleaner definitions, and/or attempt to find the good item ids instead of the bad ones. > itemid=?AT18C > > > When this is run, it prints out: > > Bad id: > Bad id: > > Chris -- To unsubscribe, e-mail: beginners-unsubscr...@perl.org For additional commands, e-mail: beginners-h...@perl.org http://learn.perl.org/
Re: Regex to match "bad" characters in a parameter
On Jan 26, 2016, at 11:22 AM, Chris Charley wrote:

> You could do that in 1 line - See the following small program.

Thanks, Chris. That'll do the trick. And the grep alternative is
interesting, too. I hadn't thought of that.

Regards,
Frank
Re: Regex to match "bad" characters in a parameter
"SSC_perl" wrote in message news:ef7499af-b4a5-4b07-8c69-3192ef782...@surfshopcart.com... On Jan 25, 2016, at 4:59 PM, Shawn H Corey wrote: Use the negative match operator !~ if( $QUERY_STRING !~ m{ itemid = [-0-9A-Za-z_]+? (?: \& | \z ) }msx ){ print "bad: $QUERY_STRING\n"; } Thanks for that, Shawn. It works perfectly except for one criteria that I inadvertently forgot to >include. It's possible that the string will _not_ contain the itemid parameter at all. When that's >missing, the regex matches and it shouldn't. I guess that's why I was trying to stay with the >positive match operator. I tried inverting your regex: if ( $QUERY_STRING =~ m/ itemid= .*? [^-0-9A-Za-z_]+? .*? (?: \& | \z ) /sx ) { > say "bad: $QUERY_STRING"; } but that doesn't work either. It catches even good item numbers. In the meantime, I got it to work by grabbing the itemid and working with that separately: my $item_id = $1 if ($QUERY_STRING =~ m/ itemid=([^&]*) /x); if ( $item_id =~ m/ [^a-zA-Z0-9_-] /x ) { ... however, I'd like to do that with a single line, if possible, so I don't have to create a new variable >just for that. Thanks, Frank= ### ### Hello Frank, You could do that in 1 line - See the following small program. (The line using a 'grep' solution is commented out. It would work as well). #!/usr/bin/perl use strict; use warnings; while (my $id = ) { chomp $id; #if (grep /itemid=.*?[^\w-]/, split /&/, $id) { if ($id =~ /itemid/ && $id !~ /itemid=[\w-]+(?:&|$)/) { print "Bad id: <$id>\n"; } } __DATA__ itemid=AT18C&i_AT18C=1&t=main.htm&storeid=1&cols=1&c=detail.htm&ordering=asc c=detail.htm&itemid=AT18C itemid=AT18/C t=main.htm&storeid=1&cols=1&c=detail.htm&ordering=asc itemid=?AT18C When this is run, it prints out: Bad id: Bad id: Chris -- To unsubscribe, e-mail: beginners-unsubscr...@perl.org For additional commands, e-mail: beginners-h...@perl.org http://learn.perl.org/
Re: Regex to match "bad" characters in a parameter
On Jan 25, 2016, at 4:59 PM, Shawn H Corey wrote:

> Use the negative match operator !~
>
>     if( $QUERY_STRING !~ m{ itemid = [-0-9A-Za-z_]+? (?: \& | \z ) }msx ){
>         print "bad: $QUERY_STRING\n";
>     }

Thanks for that, Shawn. It works perfectly except for one criterion
that I inadvertently forgot to include. It's possible that the string
will _not_ contain the itemid parameter at all. When that's missing,
the regex matches and it shouldn't. I guess that's why I was trying to
stay with the positive match operator.

I tried inverting your regex:

    if ( $QUERY_STRING =~ m/ itemid= .*? [^-0-9A-Za-z_]+? .*? (?: \& | \z ) /sx ) {
        say "bad: $QUERY_STRING";
    }

but that doesn't work either. It catches even good item numbers.

In the meantime, I got it to work by grabbing the itemid and working
with that separately:

    my $item_id = $1 if ($QUERY_STRING =~ m/ itemid=([^&]*) /x);
    if ( $item_id =~ m/ [^a-zA-Z0-9_-] /x ) {

... however, I'd like to do that with a single line, if possible, so I
don't have to create a new variable just for that.

Thanks,
Frank
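The inverted regex flags good item numbers because `.*?` is free to run past the first "&", so the negated class eventually finds a character it rejects (the "&" itself, or "=" or "." in a later parameter). One way to get the single positive match asked for here is a negative lookahead: match `itemid=` only when it is not followed by a clean value and a terminator. This is a sketch under assumptions, not a definitive answer: `itemid_is_bad` is a hypothetical name, and the parameter is assumed to start the string or follow an "&".

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread).
# True (1) only when itemid is present and its value is not clean;
# a string without itemid yields 0.
sub itemid_is_bad {
    my ($query) = @_;
    return $query =~ m{
        (?: \A | & ) itemid=               # the parameter has to be present
        (?! [A-Za-z0-9_-]+ (?: & | \z ) )  # and not followed by a clean value
    }x ? 1 : 0;
}

for my $query (
    'itemid=AT18C&i_AT18C=1',
    'itemid=AT18/C',
    't=main.htm&storeid=1',
) {
    print "bad: $query\n" if itemid_is_bad($query);
}
```

Two side notes. This sketch treats an empty value (`itemid=`) as bad. And if the two-step version is kept, perlsyn documents `my $x = ... if ...` as undefined behaviour; a list-context capture such as `my ($item_id) = $QUERY_STRING =~ /itemid=([^&]*)/;` avoids that, leaving `$item_id` undefined when the parameter is missing.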
Re: Regex to match "bad" characters in a parameter
On Mon, 25 Jan 2016 16:16:40 -0800 SSC_perl wrote:

> I'm trying to find a way to trap bad item numbers. I want to
> parse the parameter "itemid=" and then everything up to either an "&"
> or end-of-string. A good item number will contain only ASCII
> letters, numbers, dashes, and underscores and may terminate with a
> "&" or it may not (see samples below). The following string should
> test negative in the regex below:
>
> my $QUERY_STRING = 'itemid=AT18C&i_AT18C=1';
>
> but a string containing "itemid=AT18/C" should test positive, since
> it has a slash.
>
> I can catch a single bad character and get it to work, e.g.
>
> if ( $QUERY_STRING =~ m| itemid= .*? [/]+? .*? &? |x ) {
>
> but I'd like to do something like this instead to catch others:
>
> if ( $QUERY_STRING =~ m| itemid= (?: .*? [^a-zA-Z0_-]+ .*? ) &? |x )
> { ...
>
> Unfortunately, I can't get it to work. I've read perlretut,
> but can't see the answer. What am I doing wrong?
>
> Thanks,
> Frank
>
> Here are a couple of test strings:
>
> 'itemid=AT18C&i_AT18C=1&t=main.htm&storeid=1&cols=1&c=detail.htm&ordering=asc'
>
> 'c=detail.htm&itemid=AT18C'

Use the negative match operator !~

    if( $QUERY_STRING !~ m{ itemid = [-0-9A-Za-z_]+? (?: \& | \z ) }msx ){
        print "bad: $QUERY_STRING\n";
    }

-- 
Don't stop where the ink does.
        Shawn
Regex to match "bad" characters in a parameter
I'm trying to find a way to trap bad item numbers. I want to parse the
parameter "itemid=" and then everything up to either an "&" or
end-of-string. A good item number will contain only ASCII letters,
numbers, dashes, and underscores and may terminate with a "&" or it may
not (see samples below). The following string should test negative in
the regex below:

    my $QUERY_STRING = 'itemid=AT18C&i_AT18C=1';

but a string containing "itemid=AT18/C" should test positive, since it
has a slash.

I can catch a single bad character and get it to work, e.g.

    if ( $QUERY_STRING =~ m| itemid= .*? [/]+? .*? &? |x ) {

but I'd like to do something like this instead to catch others:

    if ( $QUERY_STRING =~ m| itemid= (?: .*? [^a-zA-Z0_-]+ .*? ) &? |x ) {
    ...

Unfortunately, I can't get it to work. I've read perlretut, but can't
see the answer. What am I doing wrong?

Thanks,
Frank

Here are a couple of test strings:

    'itemid=AT18C&i_AT18C=1&t=main.htm&storeid=1&cols=1&c=detail.htm&ordering=asc'

    'c=detail.htm&itemid=AT18C'
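Two things appear to stop the second pattern from working. The class `[^a-zA-Z0_-]` lists only the digit 0 (it reads as "0", "_", "-", not as the range 0-9), so the "1" in AT18C already counts as a bad character. And `.*?` is not confined to the itemid value, so even with the class corrected it can reach the "&" or a later parameter and find something to reject there. A minimal sketch of a positive match that stays inside the value follows; the helper name `has_bad_itemid` is hypothetical, and the pattern assumes itemid starts the string or follows an "&".

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread).
# True (1) when the itemid value contains a character outside the allowed set.
sub has_bad_itemid {
    my ($query) = @_;
    return $query =~ m{
        (?: \A | & ) itemid=   # parameter name at the start or after an ampersand
        [^&]*?                 # stay inside the value of this parameter
        [^A-Za-z0-9_&-]        # one character outside the allowed set
    }x ? 1 : 0;
}

for my $query (
    'itemid=AT18C&i_AT18C=1&t=main.htm&storeid=1&cols=1&c=detail.htm&ordering=asc',
    'c=detail.htm&itemid=AT18C',
    'itemid=AT18/C',
) {
    printf "%-4s %s\n", has_bad_itemid($query) ? 'bad' : 'ok', $query;
}
```

Because "&" is excluded from both character classes, the match cannot leave the itemid value. A string with no itemid parameter simply fails to match, and so does an empty value (`itemid=`), which this sketch does not flag.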