On 5 August 2011 13:04, Ferenc Kovacs <tyr...@gmail.com> wrote:
> On Fri, Aug 5, 2011 at 1:43 PM, Richard Quadling <rquadl...@gmail.com> wrote:
>> Hello all.
>> During the last week, I've been converting the HTML Entities in phpdoc
>> to their Unicode counterparts, in connection to
>> http://news.php.net/php.doc.cvs/8536
>> "Remove html entities (the english translation no longer uses any.. if
>> this breaks translations then they should folow the english one, or if
>> to much work, we can revert this commit)"
>> In examining the translations, there are a significant number of files
>> NOT encoded using UTF-8.
>> As such, embedding a UTF-8 character in these files will produce garbage.
>> As an English-only speaker, I am not confident that my conversion from
>> ISO encoding to UTF-8 encoding is accurate, and I have no
>> realistic way to check.
>> So, here is a list of all the files requiring someone with the
>> language skills to look at them and manually convert them.
>> If someone has a routine that can convert ISO encoded XML to UTF-8
>> accurately, then I can apply that and then process the entities.
>> cs/bookinfo.xml
>> cs/faq/generanl.xml
>> cs/reference/strings/functions/get-html-translation-table.xml
>> hk/variables.xml
>> hu/bookinfo.xml
>> hu/language/control-structures.xml
>> hu/reference/image/functions/imagearc.xml
>> hu/reference/mbstring/functions/mb-strtoupper.xml
>> hu/reference/recode/functions/recode-string.xml
> hi Richard
> I will fix it for the hungarian files.
> --
> Ferenc Kovács
> @Tyr43l - http://tyrael.hu

If you could detail what you do in terms of re-encoding, then I'm
quite happy to rely on that process for the other files.
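
For illustration only, a re-encoding routine of the kind asked about might
look like the sketch below. It is not the process anyone on this thread
actually used. The function name reencode_xml is hypothetical, and the
source encoding (ISO-8859-2 in the usage example) is an assumption that has
to be supplied per file: ISO-8859 decoding never fails, so a wrong guess
silently produces wrong characters, which is exactly why someone who reads
the language should check the result.

```python
import re

# Matches the encoding pseudo-attribute of an XML declaration at the very
# start of the document, e.g. <?xml version="1.0" encoding="iso-8859-2"?>
_DECL_ENCODING = re.compile(r'^(<\?xml[^>]*?encoding=)(["\'])[^"\']*\2')


def reencode_xml(data: bytes, source_encoding: str) -> bytes:
    """Decode bytes from source_encoding and return them as UTF-8.

    source_encoding is an assumption made by the caller; this function
    cannot verify that it is the encoding the file was really written in.
    """
    text = data.decode(source_encoding)
    # Keep the XML declaration consistent with the new byte encoding.
    text = _DECL_ENCODING.sub(r"\g<1>\g<2>UTF-8\g<2>", text, count=1)
    return text.encode("utf-8")


# Usage: a Hungarian sample, assumed to be ISO-8859-2.
sample = '<?xml version="1.0" encoding="iso-8859-2"?>\n<para>Kovács ő</para>'
converted = reencode_xml(sample.encode("iso-8859-2"), "iso-8859-2")
```

Files without an XML declaration pass through with only their bytes
re-encoded; the declaration rewrite simply does not match.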

At some stage, converting all the non-UTF-8 files to UTF-8 would be a
nice step, but it is a significant piece of work. If/when that is
undertaken, I'd suggest adding a pre-commit hook to reject non-UTF-8
encoded XML files from phpdoc.
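
The core of such a hook is just a strict UTF-8 decode of each XML file.
A minimal sketch of that check follows; the names is_utf8 and
non_utf8_paths are hypothetical, and how the hook obtains the files of an
incoming commit depends on the version control setup, so that part is
deliberately left out.

```python
def is_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def non_utf8_paths(paths):
    """Return the subset of paths whose contents are not valid UTF-8."""
    bad = []
    for path in paths:
        with open(path, "rb") as handle:
            if not is_utf8(handle.read()):
                bad.append(path)
    return bad
```

A hook would call non_utf8_paths on the committed .xml files and reject the
commit if the returned list is non-empty. Note that a pure-ASCII file
passes this check whatever its XML declaration says, so the declared
encoding would need a separate check.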

Richard Quadling
Twitter : EE : Zend : PHPDoc
@RQuadling : e-e.com/M_248814.html : bit.ly/9O8vFY : bit.ly/lFnVea
