Re: XML Parsing error - regex problem?

2006-03-10 Thread Tom Phoenix
On 3/10/06, Graeme McLaren <[EMAIL PROTECTED]> wrote:

> I've checked my XML file and it contains:

> St. Patrick<92>s R.C. P.S.
>
> This is because St. Patrick's contains an apostrophe.

I'm guessing that where I see four characters "<92>", the actual file
has a single character. Some tools render unusual characters that way.
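
One way to check that guess, as a minimal sketch: the sample string below is an assumption standing in for the real data, with one byte of hex value 92 where the apostrophe should be. It prints the string length and a copy in which every byte outside printable ASCII is shown as <hh>, the way some viewers display it.

```perl
use strict;
use warnings;

# Assumed sample: the school name with a single 0x92 byte in place of the apostrophe.
my $name = "St. Patrick\x92s R.C. P.S.";

# Render every byte outside printable ASCII as <hh>, as some tools do.
(my $shown = $name) =~ s/([^\x20-\x7e])/sprintf("<%02x>", ord($1))/ge;

# length() counts the one real character, not the four displayed ones.
print length($name), "\n";
print "$shown\n";
```

If the displayed form is four characters longer than the reported length suggests, the file holds one unusual character rather than the literal text "<92>".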

> I have a couple of
> regexes to handle ampersands and apostrophes, however the apostrophe regex
> doesn't appear to be working correctly:
>
>
> ampersand regex works:
>
> $data->[$i] =~ s/&/&/g;

I'm not sure I know what you mean by "works". It seems to be replacing
every ampersand with an ampersand in the target string, which would be
a no-op if it didn't have side effects.

> apostrophe regex doesn't work:
>
> $data->[$i] =~ s/'/'/g;

It doesn't? It's probably matching any true apostrophes.

> I've worked out that the character is a type of apostrophe which has
> a hex value of 92.  How would I write my regex to substitute this character
> for a normal apostrophe?

> I've tried: s/92/'/g;

> and it didn't work.

I think you're looking for one of these:

s/\x92/'/g
s/\x92/'/g
tr/\x92/'/

Backslash escapes are documented in perlop. Hope this helps!
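
As a minimal sketch of the difference between s/92/'/g and the \x92 forms above: the sample string is an assumption standing in for $data->[$i]. The last variant additionally assumes the file is Windows-1252, where byte 0x92 is the right single quotation mark (U+2019); that variant is not from the thread.

```perl
use strict;
use warnings;

# Assumed sample value standing in for $data->[$i].
my $name = "St. Patrick\x92s R.C. P.S.";

# /92/ looks for the two digits "9" and "2", which the string does not contain.
my $digits_found = ($name =~ /92/) ? 1 : 0;

# \x92 matches the single byte whose hex value is 92.
(my $with_s = $name) =~ s/\x92/'/g;

# tr/// does the same one-for-one swap without the regex engine.
(my $with_tr = $name) =~ tr/\x92/'/;

# Assumption: Windows-1252 input. A numeric character reference keeps
# the curly quote in the XML instead of flattening it to an apostrophe.
(my $with_ref = $name) =~ s/\x92/&#x2019;/g;

print "$with_s\n$with_tr\n$with_ref\n";
```

Which replacement is appropriate depends on what the consumer of the XML expects; a plain apostrophe is the simplest choice if the typographic quote does not matter.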

--Tom Phoenix
Stonehenge Perl Training

--
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
 




RE: XML Parsing error - regex problem?

2006-03-10 Thread Graeme McLaren
Hi all, I've worked out that the character is a type of apostrophe which has 
a hex value of 92.  How would I write my regex to substitute this character 
for a normal apostrophe?


I've tried: s/92/'/g;

and it didn't work.


Any ideas?





From: "Graeme McLaren" <[EMAIL PROTECTED]>
To: beginners@perl.org
Subject: XML Parsing error - regex problem?
Date: Fri, 10 Mar 2006 10:03:50 +




XML Parsing error - regex problem?

2006-03-10 Thread Graeme McLaren

Hi all, I'm getting the following XML parsing error:

[Fri Mar 10 09:37:39 2006] insert_xml.pl: not well-formed (invalid token) at line 13628, column 24, byte 413248:

[Fri Mar 10 09:37:39 2006] insert_xml.pl: LA14
[Fri Mar 10 09:37:39 2006] insert_xml.pl: 5741726
[Fri Mar 10 09:37:39 2006] insert_xml.pl: St. Patricks R.C. P.S.

[Fri Mar 10 09:37:39 2006] insert_xml.pl: ===^
[Fri Mar 10 09:37:39 2006] insert_xml.pl: Falkirk
[Fri Mar 10 09:37:39 2006] insert_xml.pl: CE-511 (Edge)
[Fri Mar 10 09:37:39 2006] insert_xml.pl:  at /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi/XML/Parser.pm line 185



I've checked my XML file and it contains:


St. Patrick<92>s R.C. P.S.

This is because St. Patrick's contains an apostrophe.  I have a couple of
regexes to handle ampersands and apostrophes; however, the apostrophe regex
doesn't appear to be working correctly:



ampersand regex works:

$data->[$i] =~ s/&/&/g;


apostrophe regex doesn't work:

$data->[$i] =~ s/'/'/g;


Any ideas on this one?

G :)

P.S. Thank you to all who replied to my previous post, I got that array 
dereferenced properly.



