Yes, the problem was related to the BOM.

I used this function to remove the BOM bytes from the start of a UTF-8 file:


sub parse {
    my $mydoc = shift;

    # Check for a BOM: look at the first three bytes, if there are that many.
    return $mydoc if length($mydoc) < 3;
    my ($top1, $top2, $top3) = unpack("C3", $mydoc);

    # The UTF-8 BOM is the byte sequence EF BB BF (239 187 191).
    # Compare numerically with ==, not with the string operator eq.
    if ($top1 == 239 && $top2 == 187 && $top3 == 191) {
        $mydoc = substr($mydoc, 3);
    }

    return $mydoc;
}

The idea came from the parse function in the following code:

http://dev.w3.org/cvsweb/p3p-validator/20010928/xml.pl?annotate=1.3&sortby=file
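
For anyone finding this in the archive, the same check can probably be written more compactly with a substitution. This is only a sketch: it assumes the file content was read as raw bytes (no decoding layer on the filehandle), and the name strip_utf8_bom is made up for this example rather than taken from the code above or from the W3C script.

```perl
use strict;
use warnings;

# Hypothetical helper (name invented for this example): remove a leading
# UTF-8 BOM, i.e. the bytes 0xEF 0xBB 0xBF, from a byte string if present.
sub strip_utf8_bom {
    my ($doc) = @_;
    # \A anchors at the very start of the string, so only a leading BOM
    # is removed; the same bytes later in the document are left alone.
    $doc =~ s/\A\xEF\xBB\xBF//;
    return $doc;
}

# Usage: a snippet as Notepad would save it, BOM followed by the HTML.
my $snip = "\xEF\xBB\xBF<p>hello</p>";
print strip_utf8_bom($snip), "\n";   # prints <p>hello</p>
```

Input without a BOM, or shorter than three bytes, comes back unchanged, so it should be safe to call on every snip file.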

Thanks, all, for the help.

-Jalal





>From: Brian Stell <[EMAIL PROTECTED]>
>To: [EMAIL PROTECTED]
>Subject: Re: Starnge characters when displaying html files saved in UTF-8 format
>Date: Tue, 11 Dec 2001 11:34:09 -0800
>
>Jalal,
>
>Kindly reply via the mailing list so others can see the discussion.
>That way others can benefit and/or help.
>
>BOM is the Byte Order Mark used in Unicode to indicate an
>important detail about the Unicode data stream.
>
>Perhaps the Perl people can describe how to inhibit the BOM?
>
>Jalal Kakavand wrote:
> >
> > Hi there,
> >
> > I don't know what a BOM is!
> > I have the same issue with both IE6 and Netscape 4.7, and this is my
> > final page showing it:
> >
> > http://www.khaterat.com/
> >
> > As you can see, there is an extra blank line at the first line and at
> > the start of the other snip files.
> > BTW, the OS is Linux/Unix, I'm using Notepad to save my HTML files in
> > UTF-8 format, and I don't use any special Perl Unicode modules.
> >
> > Thanks,
> >
> > jalal
> >
> > >From: Markus Kuhn <[EMAIL PROTECTED]>
> > >To: "Jalal Kakavand" <[EMAIL PROTECTED]>
> > >CC: [EMAIL PROTECTED]
> > >Subject: Re: Starnge characters when displaying html files saved in UTF-8 format
> > >Date: Tue, 11 Dec 2001 14:36:02 +0000
> > >
> > >"Jalal Kakavand" wrote on 2001-12-10 23:45 UTC:
> > > > I use Windows Notepad for typing my html snip files and then save
> > > > them in UTF-8 format. Then in my Perl program, after reading those
> > > > snip files and printing them to the browser, there is a strange
> > > > character at the start of each snip! How can I remove those extra
> > > > chars? It's a kind of newline character.
> > >
> > >Is it the BOM?
> > >
> > >http://www.cl.cam.ac.uk/~mgk25/unicode.html
> > >
> > >Markus
> > >
> > >--
> > >Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
> > >Email: mkuhn at acm.org,  WWW: <http://www.cl.cam.ac.uk/~mgk25/>
> > >
> >
>
>--
>Brian Stell
>mailto:[EMAIL PROTECTED]



