On Monday, May 13, 2002, at 10:40 AM, Todd Wade wrote:

> Bob Ackerman wrote:
>
>>
>> this one wins the prolix award of the solutions we have seen today.
>> we dare a non-perl programmer to believe this could mean something.
>> I'm not sure i believe it means whatever. especially (?)(.)  - zero or one
>> character followed by a character?
>> followed by a non-greedy run of characters up to dot? something to lose
>> sleep over, i think.
>> and this is the beginner's list. geez. and that was only one line out of
>> three.
>
> sub current_weather_conditions {
>   my($weather) =
>     get('http://weather.noaa.gov/pub/data/forecasts/zone/oh/ohz021.txt');
>   ($weather) or
>     return('Today\'s weather conditions are currently unavailable');
>   $weather =~ s/.*?\n\.[^.]*?\.{3}(.*?)\n\..*/$1/s;
>   $weather =~ tr/[A-Z]\n/[a-z] /;
>   $weather =~ s/( ?)(.)(.*?)\./$1\U$2\E$3\./g;
>   return($weather);
> }
>
> Actually it's not too tough. Obviously get() stores the contents of the web
> document in $weather. The current weather conditions are always the "first"
> weather condition. Each forecast starts with a period ( . ), has a string of
> letters, then has three periods ( ... ) followed by the forecast. This is
> the delimiter for each "section" of weather forecasts.
>
> I call the first forecast in the document the current weather, so I only
> want the first forecast.
>
> .*? matches all characters until the first \n\.[^.]*?\.{3}
>
> \n\.[^.]*?\.{3} is the forecast delimiter: the first place in the string
> where a period follows a newline, then come some characters that are not
> periods, then three periods in a row.
>
> Now I am sitting at the position between the last of those periods and the
> first letter of the current weather conditions, so I start to "record" the
> following text. All the text from the first letter of the current weather
> conditions up to the next period that follows a newline is captured. Then
> the .* matches the rest of the contents of the string (the /s flag lets .
> match newlines too). Then /$1/ replaces $weather with the current weather
> conditions.
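
To see that first substitution do its job without fetching anything, here is a minimal sketch. The sample text is made up to follow the layout described above (it is not real NOAA output), and the sub name first_forecast is only for illustration.

```perl
use strict;
use warnings;

# Made-up sample in the shape the post describes; not real NOAA data.
my $sample = join "\n",
    'OHZ021-131500-',
    'EXAMPLE COUNTY-',
    '1040 AM EDT MON MAY 13 2002',
    '',
    '.TODAY...MOSTLY CLOUDY. HIGH NEAR 60.',
    'WEST WINDS 10 TO 15 MPH.',
    '.TONIGHT...PARTLY CLOUDY. LOW IN THE MID 40S.',
    '';

sub first_forecast {
    my ($text) = @_;
    # Same substitution as in the post: keep only the first forecast.
    $text =~ s/.*?\n\.[^.]*?\.{3}(.*?)\n\..*/$1/s;
    return $text;
}

# Prints the two lines of the TODAY forecast, still in upper case.
print first_forecast($sample), "\n";
```

The header and the .TONIGHT section are both thrown away; only the text between the first label and the next line that starts with a period survives.
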
>
> All the characters in the string are capitalized, so some formatting needs
> to be done before I can send it to wherever it is heading. The next line
> lowercases all characters and turns newlines into spaces. (There are some
> rookie mistakes in there, see the following post)
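
The follow-up post is not quoted here, so which mistakes are meant is a guess. One likely candidate is the square brackets: tr takes plain character lists, not regex character classes, so [ and ] are just mapped to themselves. A sketch of the same step without them (the sub name is made up):

```perl
use strict;
use warnings;

# Lowercase A-Z and turn newlines into spaces.
# tr works on character lists, so no [ ] are needed around the ranges.
sub flatten_and_lowercase {
    my ($text) = @_;
    $text =~ tr/A-Z\n/a-z /;
    return $text;
}

print flatten_and_lowercase("MOSTLY CLOUDY.\nHIGH NEAR 60."), "\n";
# prints: mostly cloudy. high near 60.
```
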
>
> The last substitution before the return capitalizes the first character in
> each sentence. ( ?) matches and records to $1 the space in between every
> sentence.

ah. i missed seeing the space. that is what really confused me before.
seeing (?) then (.)
makes sense with the space.

> The ? makes the space optional because the first sentence in
> $weather has no space in front of it. Then the (.) matches and records to
> $2 any character. This is the first character in the sentence. The (.*?)\.
> matches and records to $3 all the characters up to the next period. Then
> the right hand side of the substitution replaces the matched part of the
> string with the (optional) space that was stored in $1, then turns on
> uppercasing, then inserts $2, the first letter of each sentence, then turns
> off uppercasing, then inserts the rest of the sentence, then puts a period
> at the end of the sentence. (The \ in front of the period doesn't need to be
> there. This was one of my first all-by-myself regexes, so I've learned a lot
> since then. I also would use +'s instead of *'s nowadays...) The g after
> the / makes it so the substitution is applied to every match in the string,
> not just the first one.
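
A small sketch of that last substitution on an already lowercased string, with the unneeded backslash dropped from the replacement side. The sub name and the sample sentence are made up for illustration.

```perl
use strict;
use warnings;

sub capitalize_sentences {
    my ($text) = @_;
    # Group 1 = optional leading space, group 2 = first character,
    # group 3 = rest of the sentence up to the period.
    # \U ... \E uppercases only group 2; the period in the replacement
    # is a plain character and needs no backslash.
    $text =~ s/( ?)(.)(.*?)\./$1\U$2\E$3./g;
    return $text;
}

print capitalize_sentences('mostly cloudy. high near 60. west winds 10 to 15 mph.'), "\n";
# prints: Mostly cloudy. High near 60. West winds 10 to 15 mph.
```
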
>
> I'm only halfway through my CS bachelor's, but I do want to teach this stuff
> one day... How did I do with this?
>
> Todd W.
>

I'm still not sure you would convince anyone to attempt something like
this in any computer language.
still looks like something to go blind working out. perhaps displaying the
regex down a column with explanation to right would work better. seems
like too much prose there. in another language, one would suggest
factoring the problem into subfunctions for comprehensibility. you have
split it into 3 lines - but factoring a regex tends to make it less
efficient.
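
For what it's worth, Perl can do both of those things. The /x modifier allows whitespace and comments inside a pattern, which gives the column layout, and qr// lets pieces of a pattern be named and reused. A sketch of the first substitution written that way; the variable and sub names are invented here, and whether this costs anything measurable in speed is not something the sketch tests.

```perl
use strict;
use warnings;

# Named pieces of the pattern; the names are made up for this sketch.
my $section_start = qr/\n\./;          # a line that begins with a period
my $label         = qr/[^.]*?\.{3}/;   # e.g. TODAY followed by three periods

sub first_forecast_commented {
    my ($text) = @_;
    $text =~ s/
        .*?              # lazily skip the header
        $section_start   # start of the first forecast section
        $label           # its label and the three periods
        (.*?)            # capture group 1: the forecast text
        $section_start   # start of the next section
        .*               # everything after it
    /$1/sx;
    return $text;
}

print first_forecast_commented("HEADER\n.TODAY...SUNNY. WARM.\n.TONIGHT...CLEAR.\n"), "\n";
# prints: SUNNY. WARM.
```

One thing to watch under /x: a literal space in the pattern is ignored, so the ( ?) from the third line of the sub would have to be written as ([ ]?) or (\ ?).
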

