2011/11/29 Karl Pflästerer <k...@rl.pflaesterer.de>:
> Am 27.11.11 23:47, schrieb Karl Pflästerer:
>> Am 27.11.11 23:23, schrieb Yannick Torrès:
>>>  2011/11/27 Karl Pflästerer<k...@rl.pflaesterer.de>:
>>>>   Hi,
>>>
>>>  Hi,
>>>
>>>>   forgive me if this has already been discussed, but I found nothing
>>>>   in the archives.
>>>>
>>>>   I am trying to help translate some of the docs and saw this box at
>>>>   https://edit.php.net/:
>>>>
>>>>   Check for errors in /language-snippets.ent
>>>>
>>>>   The content of that box seems to be computed by the class
>>>>   http://svn.php.net/repository/web/doc-editor/trunk/php/ToolsError.php
>>>>
>>>>   It has a method attributLinkTag(), which uses a regex to compare the
>>>>   linkend attribute of the <link> tags:
>>>>
>>>>   $reg = '/<link\s*?linkend=("|\')(.*?)("|\')\s*?>/s';
>>>>
>>>>   As you can see, only whitespace is allowed between <link and the
>>>>   linkend attribute. But in the German translation, for example (and also
>>>>   in the English documentation), some <link> tags have another attribute
>>>>   between the element name and "linkend".
>>>
>>>  Could you give me an example of this case, please?
>>
>> From en/language-snippets.ent
>>
>> <!ENTITY seealso.array.sorting 'The <link
>> xmlns="http://docbook.org/ns/docbook" linkend="array.sorting">comparison of
>> array sorting functions</link>'>
>>
>> <!ENTITY seealso.callback 'information about the <link
>> xmlns="http://docbook.org/ns/docbook"
>> linkend="language.types.callback">callback</link> type'>
>>
>> The German translation contains more examples (some of them wrong, IMHO,
>> since they duplicate the xmlns attribute), but I'm not sure such a simple
>> difference should trigger such an error.
>>
>>>
>>>>   An easy fix would be
>>>>   $reg = '/<link[^<>]+linkend=("|\')(.*?)("|\')[^<>]*>/s';
>>>>
>>>>   But that would solve only half of the problem. The other problem is
>>>>   that the check script needs the same order of entities in both files
>>>>   and compares only the position of the found links in both match
>>>>   arrays. So, for example, one extra link in the translation will give
>>>>   false mismatches for all following entries.
>>>
>>>  Yes, that's right.
>>>  The goal here is to check each file and warn as soon as there is a
>>>  single difference, even if it is only an order problem (this can be a
>>>  translation error too).
>>
>> OK. (For a file containing only entity definitions, the order shouldn't
>> matter, should it?)
>>
>>>
>>>>   Does it make sense to rewrite that algorithm so that it compares each
>>>>   entity in the English original and the translation, so we get better
>>>>   errors?
>>>
>>>  You mean to avoid the order check?
>>>  Perhaps we can do this, yes: check the number of these tags, and check
>>>  that all of them are present, even if the order is not respected.
>>
>> I was thinking of checking each entity definition instead: not a simple
>> preg_match_all that compares $match_en[1] to $match_lang[1], but a
>> comparison of the linkend attributes of each entity definition in en and
>> $lang.
>>
>> Then the error could be: Difference in linkend attribute in entity xyz.
>
> To be a little more concrete, here is a code example (just a proof of
> concept):
>
> <?php
>
> // Return array(entity name => list of linkend values) for an .ent file.
> function extract_linkend ($s) {
>
>  // <link> or <xref> tag with a linkend attribute after other content.
>  $rx_linkend = '
>    /
>    <(?: link | xref)
>     [^<>]+
>     linkend=(?:"|\') (.*?) (?:"|\')
>     [^<>]*
>    >
>   /xs';
>
>  // One entity definition: everything up to the next <!ENTITY or the end.
>  $rx_entities = '/(<!ENTITY\s+(\S+).+?)(?=(?:<!ENTITY|$))/s';
>
>  preg_match_all($rx_entities, $s, $m_entities, PREG_SET_ORDER);
>  $linkend_by_entity = array();
>  foreach ($m_entities as $entity) {
>    preg_match_all($rx_linkend, $entity[1], $m_linkend);
>    if ($m_linkend[1])
>      $linkend_by_entity[$entity[2]] = $m_linkend[1];
>  }
>  return $linkend_by_entity;
> }
>
>
> $link_de = extract_linkend(file_get_contents('language-snippets.ent'));
> $link_en = extract_linkend(file_get_contents('../en/language-snippets.ent'));
>
> // Keep the EN entities that are absent from the translation or that have
> // at least one linkend the translation lacks.
> $diff = array_udiff_assoc($link_en, $link_de,
>                           function ($en, $lang) {
>                             return array_diff($en, $lang) ? 1 : 0;
>                           });
>
> foreach ($diff as $entity => $linkends) {
>  echo "Entity: $entity\n";
>  echo 'EN: ' . join('; ', $linkends), "\n";
>  // The entity may have no links in the translation, or be missing there.
>  $de = isset($link_de[$entity]) ? $link_de[$entity] : array();
>  echo 'DE: ' . join('; ', $de), "\n\n";
> }
>
>
> If I run that (with the de translation), I get:
>
> Entity: ini.php.constants
> EN: configuration.changes.modes
> DE: ini
>
> Entity: mysqli.available.mysqlnd
> EN: book.mysqlnd
> DE: mysqli.overview.mysqlnd
>
> That could be helpful (IMHO).
>
>  KP
>
>

Thanks Karl,
I will add this asap

Great work,
Yannick
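
A small illustration of the regex change discussed in this thread: the
sketch below runs the pattern quoted from attributLinkTag() and the relaxed
pattern Karl proposed against the seealso.array.sorting sample. It is only a
demonstration under those assumptions, not code from doc-editor; the variable
names are invented for this example.

```php
<?php
// Pattern quoted in the thread as the one used by attributLinkTag().
$strict = '/<link\s*?linkend=("|\')(.*?)("|\')\s*?>/s';
// Relaxed pattern proposed in the thread.
$relaxed = '/<link[^<>]+linkend=("|\')(.*?)("|\')[^<>]*>/s';

// Sample entity body from en/language-snippets.ent, as quoted in the thread.
$sample = 'The <link xmlns="http://docbook.org/ns/docbook" '
        . 'linkend="array.sorting">comparison of array sorting functions</link>';

// preg_match_all() returns the number of full matches.
$strict_hits  = preg_match_all($strict, $sample, $m_strict);
$relaxed_hits = preg_match_all($relaxed, $sample, $m_relaxed);

echo "strict: $strict_hits, relaxed: $relaxed_hits\n";
// Group 2 holds the linkend value.
echo 'linkend: ' . $m_relaxed[2][0] . "\n";
```

The strict pattern finds nothing because the xmlns attribute sits between
the element name and linkend; the relaxed one captures array.sorting.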

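The ordering problem raised in this thread (one extra link in the
translation shifts every later position) can be seen in a few lines. The
linkend values below are hypothetical and the code is a rough sketch of the
idea, not the check that doc-editor actually performs.

```php
<?php
// Hypothetical linkend lists in file order. The translation carries one
// extra link ("extra.link") in the second position.
$en   = array('a.one', 'b.two', 'c.three');
$lang = array('a.one', 'extra.link', 'b.two', 'c.three');

// Position-based comparison: index i of one list against index i of the other.
$positional_mismatches = 0;
foreach ($en as $i => $linkend) {
    if (!isset($lang[$i]) || $lang[$i] !== $linkend) {
        $positional_mismatches++;
    }
}

// Order-independent comparison: which linkends are missing or extra.
$missing = array_values(array_diff($en, $lang));
$extra   = array_values(array_diff($lang, $en));

echo "positional mismatches: $positional_mismatches\n";
echo 'missing: ' . join('; ', $missing) . "\n";
echo 'extra: ' . join('; ', $extra) . "\n";
```

The positional check flags both entries after the extra link, while the
order-independent check reports only the extra link itself.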