Re: [PHP-DOC] translations and livedocs release.

2004-08-09 Thread Derick Rethans
On Sun, 8 Aug 2004, moshe doron wrote:

> Hey,
>
> Running livedocs with Hebrew on PHP 5, I found that for some 8-bit
> encodings other than ISO-8859-1, libxml just replaces every byte in
> the 80-FF range with a question mark.
>
> The windows-1255 charset is one of those affected by this behavior, so
> we would have to mark all the Hebrew XML files as ISO-8859-1, but the
> old build system doesn't like that.

Don't mangle charset names. Use the correct one and fix livedocs.

Derick


Re: [PHP-DOC] translations and livedocs release.

2004-08-09 Thread Derick Rethans
On Mon, 9 Aug 2004, moshe doron wrote:

> There is nothing wrong in livedocs; this is libxml2's fault, but
> libxml2 puts enough disclaimers at http://www.xmlsoft.org/encoding.html,
> meaning "I have nothing to do with it".

Are you running on Windows perhaps, as that page states:

More over when compiled on an Unix platform with iconv support the full
set of encodings supported by iconv can be instantly be used by libxml.
On a linux machine with glibc-2.1 the list of supported encodings and
aliases fill 3 full pages, and include UCS-4, the full set of ISO-Latin
encodings, and the various Japanese ones.

The iconv in glibc-2.3 has even more supported encodings. If this is a
Windows-only problem: screw Windows. We should not add kludges if that
platform is broken (but afaik we also use iconv in our Windows build of
libxml2, so I don't see how there is a problem here).
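
One rough way to check whether a given machine has iconv coverage for a
charset is to try a conversion from PHP. This is only a sketch under
stated assumptions: charset_supported() is a hypothetical helper (not
part of PHP or livedocs), it assumes an 8-bit, ASCII-compatible charset,
and it tests the iconv that PHP's iconv extension uses, which may not be
the same one libxml2 was built against.

```php
<?php
// Hypothetical helper: true if iconv can convert from $charset to UTF-8.
function charset_supported($charset)
{
    // "@" hides the message iconv() raises for an unknown charset name;
    // in that case iconv() returns false.
    return @iconv($charset, 'UTF-8', 'a') !== false;
}

foreach (array('ISO-8859-1', 'ISO-8859-8', 'WINDOWS-1255') as $charset) {
    echo $charset, ': ',
        charset_supported($charset) ? 'supported' : 'not supported', "\n";
}
```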

regards,
Derick


Re: [PHP-DOC] translations and livedocs release.

2004-08-09 Thread moshe doron
There is nothing wrong in livedocs; this is libxml2's fault, but libxml2
puts enough disclaimers at http://www.xmlsoft.org/encoding.html, meaning
"I have nothing to do with it".

A livedocs fix would mean a complex preg_replace of the encoding line
just before the XML parsing. Is that the kind of solution you want?
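
A rewrite of the encoding line just before parsing might look roughly
like the sketch below. It is not the livedocs code: to_utf8_xml() is a
hypothetical helper, and it goes one step further than a bare
preg_replace by also converting the bytes to UTF-8 with iconv, so that
the declaration stays truthful. It assumes the iconv extension is
available and that the source charset is ASCII-compatible, so the XML
declaration can be matched on the raw bytes.

```php
<?php
// Hypothetical helper: convert an XML document to UTF-8 and update its
// encoding declaration to match, before handing it to the XML parser.
function to_utf8_xml($xml)
{
    // Capture the declaration prefix, the quote character and the charset.
    $decl = '/^(<\?xml[^>]*?\bencoding\s*=\s*)(["\'])([A-Za-z][A-Za-z0-9._\-]*)\2/';
    if (!preg_match($decl, $xml, $m)) {
        return $xml; // no encoding declared: leave the document untouched
    }
    if (strcasecmp($m[3], 'UTF-8') === 0) {
        return $xml; // already UTF-8
    }
    // iconv() returns false if the charset is unknown or the bytes are invalid.
    $converted = @iconv($m[3], 'UTF-8', $xml);
    if ($converted === false) {
        throw new RuntimeException("iconv cannot convert from {$m[3]}");
    }
    // Rewrite only the first match, i.e. the declaration itself.
    return preg_replace($decl, '${1}${2}UTF-8${2}', $converted, 1);
}
```

The result can then be passed to xml_parse() in place of the original
string; documents without an encoding declaration pass through unchanged.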

Moshe.

Derick Rethans [EMAIL PROTECTED] wrote in message
news:[EMAIL PROTECTED]
> On Sun, 8 Aug 2004, moshe doron wrote:
>
> > Hey,
> >
> > Running livedocs with Hebrew on PHP 5, I found that for some 8-bit
> > encodings other than ISO-8859-1, libxml just replaces every byte in
> > the 80-FF range with a question mark.
> >
> > The windows-1255 charset is one of those affected by this behavior,
> > so we would have to mark all the Hebrew XML files as ISO-8859-1, but
> > the old build system doesn't like that.
>
> Don't mangle charset names. Use the correct one and fix livedocs.
>
> Derick


[PHP-DOC] translations and livedocs release.

2004-08-08 Thread moshe doron
Hey,

Running livedocs with Hebrew on PHP 5, I found that for some 8-bit
encodings other than ISO-8859-1, libxml just replaces every byte in the
80-FF range with a question mark.

The windows-1255 charset is one of those affected by this behavior, so we
would have to mark all the Hebrew XML files as ISO-8859-1, but the old
build system doesn't like that.
So here we have one more incompatibility between the old and new systems.

Other translation projects may want to use this test script to look for
potential conflicts (PHP 5 + libxml only):

<?php
// Pick one charset and the byte range to test (decimal byte values).
//$charset = "ISO-8859-1";   $a = 128; $aend = 255;
//$charset = "WINDOWS-1255"; $a = 224; $aend = 250;
$charset = "iso-8859-8"; $a = 224; $aend = 250;

// Build a string containing every byte in the chosen range.
$ord = '';
for (; $a <= $aend; $a++) $ord .= chr($a);

echo "$ord\n";

// Wrap the bytes in a minimal document that declares the charset.
$xml = <<<XOF
<?xml version="1.0" encoding="$charset"?>
<chapter>$ord</chapter>
XOF;

//$xml = utf8_encode($xml);
$p = xml_parser_create();
xml_parser_set_option($p, XML_OPTION_CASE_FOLDING, 0);
xml_set_element_handler($p, 'start_elem', 'end_elem');
xml_set_character_data_handler($p, 'cdata');

// On a parse error, report the position and dump the offending line.
if (!xml_parse($p, $xml, true)) {
    printf("XML: %d:%d %s\n",
        xml_get_current_line_number($p),
        xml_get_current_column_number($p),
        xml_error_string(xml_get_error_code($p))
    );
    $lines = explode("\n", $xml);
    $l = xml_get_current_line_number($p);
    echo "\nLine: $l is <b>" . htmlentities($lines[$l-1]) . "</b><br />\n";
    echo '<pre>';
    echo htmlentities($xml);
    echo '</pre>';
}
xml_parser_free($p);

// Handlers: elements are ignored, character data is echoed as parsed.
function start_elem($parser, $tagname, $attributes) {}
function end_elem($parser, $tagname) {}
function cdata($parser, $data) { echo $data; }
?>


Re: [PHP-DOC] translations and livedocs release.

2004-08-08 Thread Gabor Hojtsy
Hi,

> Running livedocs with Hebrew on PHP 5, I found that for some 8-bit
> encodings other than ISO-8859-1, libxml just replaces every byte in
> the 80-FF range with a question mark.
>
> The windows-1255 charset is one of those affected by this behavior, so
> we would have to mark all the Hebrew XML files as ISO-8859-1, but the
> old build system doesn't like that.
> So here we have one more incompatibility between the old and new systems.

Since the RTL stuff is not working in the old build system anyway, it
would probably be better to exclude Hebrew from that build system, so
the encoding would no longer be a problem there. Hebrew can also be one
of the good test targets for livedocs :)

> Other translation projects may want to use this test script to look for
> potential conflicts (PHP 5 + libxml only):

Could you please commit this script to phpdoc/scripts so others can use
it later if needed?

Goba