Well, Community,

The string that I thought contained carriage return/line feeds actually has none. (I had copied the message string from within an HTM file, and when I pasted it, the lines broke up accordingly. Naturally, that led me to believe there were CR/LFs in the string.)

I forgot to mention that I use the following code to assign $msg:

    $msg = $element->as_trimmed_text();

where $element points to this line in the HTM file:

<td class="CommentGrid" colspan="20"><textarea name="WeekRow1:CommentTextBox" id="WeekRow1_CommentTextBox" title="Please Write Comment Here" class="CommentTextbox" onchange="OnCommentChange (1); return false;" onclick="OnCommentTextboxClick (1); return false;" ondblclick="OnCommentTextboxClick (1); return false;" style="border-style:None;font-family:Arial Unicode MS;font-size:15px;height:100%;width:100%;display:inline">OCT 31 - Attended CSP weekly meeting. Engaged in third party SCADA host (ZedI) problem in Fairview district NOV 1 - Preparation and attendance of Normandville Phase 1 meetings. NOV 2 - Maintained project list. Troubleshoot software licensing issues. NOV 3 - Attended COG facilities meeting. Follow-up of Autosol software licenses.</textarea> </td>
The as_trimmed_text call made the CR/LFs disappear. I found out they were missing when I ran a character-by-character dump of $msg with code like this (thanks to Perl Monks):

    while ($msg =~ /./gs) { print "$& and " . ord($&); }

So, in reality, $element->as_trimmed_text() returns $msg as one long string. For example,

$msg = "OCT 31 - Attended CSP weekly meeting. Engaged in third party SCADA host (ZedI) problem in Fairview district NOV 1 - Preparation and attendance of Normandville Phase 1 meetings. NOV 2 - Maintained project list. Troubleshoot software licensing issues. NOV 3 - Attended COG facilities meeting. Follow-up of Autosol software licenses."

Now, I could traverse the string word by word until I encounter a known month abbreviation, then push that chunk of $msg onto the array. Or I could look for the pattern MMM D or MMM DD, then any text following up to the next MMM. This leads me to want some sort of look-ahead assertion in the regexp.

Does anyone have a suggestion? Should I simply set up a for loop and hack away at getting $msg into @ans?

Paul

From: paulrousseau...@hotmail.com
To: perl-win32-users@listserv.activestate.com
Subject: How to split up a string with carriage returns into an array
Date: Fri, 4 Nov 2011 12:07:57 -0600

Hello Perl Community,

I have a string variable, $msg, assigned the following text.

OCT 31 - Attended CSP weekly meeting. Engaged in third party SCADA host (ZedI) problem in Fairview district
NOV 1 - Preparation and attendance of Normandville Phase 1 meetings.
NOV 2 - Maintained project list. Troubleshoot software licensing issues.
NOV 3 - Attended COG facilities meeting. Follow-up of Autosol software licenses.

I want to split the string into an array just as it looks. The string does have carriage returns in it. I tried

@ans = split (/\r/s, $msg);
@ans = split (/\r/g, $msg);
@ans = split (/\r\n/s, $msg);
@ans = split (/\r\n/g, $msg);

and I get no split.
For some reason, the regexp can't find the carriage return and/or line feed.

Now I'm thinking about somehow splitting the string using

\w{3}\s\d+\s-\s

Would a look-behind assertion be better? Perhaps a map function call?

Thank you.

_______________________________________________
Perl-Win32-Users mailing list
Perl-Win32-Users@listserv.ActiveState.com
To unsubscribe: http://listserv.ActiveState.com/mailman/mysubs
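
The look-ahead idea raised in this thread could be sketched roughly as follows. This is only an illustration under stated assumptions: every entry is assumed to begin with an upper-case three-letter month abbreviation, a one- or two-digit day, and " - ". The $month pattern and the hard-coded $msg are hypothetical stand-ins, not part of the original script.

```perl
use strict;
use warnings;

# Sample text from the post as one string with no CR/LF characters,
# which is how as_trimmed_text() was reported to return it.
my $msg = 'OCT 31 - Attended CSP weekly meeting. Engaged in third party SCADA host (ZedI) problem in Fairview district NOV 1 - Preparation and attendance of Normandville Phase 1 meetings. NOV 2 - Maintained project list. Troubleshoot software licensing issues. NOV 3 - Attended COG facilities meeting. Follow-up of Autosol software licenses.';

# Assumption: entry headers always use one of these upper-case abbreviations.
my $month = qr/(?:JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)/;

# Split on the whitespace that precedes an entry header. The look-ahead
# is zero-width, so the header stays attached to the entry it introduces.
my @ans = split /\s+(?=$month\s\d{1,2}\s-\s)/, $msg;

print "$_\n" for @ans;
```

Because the look-ahead consumes nothing, only the whitespace in front of each header is removed. Text that happens to contain something like "MAY 5 - " inside an entry would also trigger a split, so the pattern may need tightening for real data.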