Interesting. I've just taken a careful look at the perluniprops
documentation: a generic \p{Line_Break} doesn't exist, but Ubuntu and Mac OS
let it slip past without an error or warning. I suppose \p{Line_Break}
automatically resolves to \p{Line_Break: Alphabetic} or
\p{Line_Break: Ambiguous} there.

Maybe Unicode property handling differs across machines, OSes, or perl
versions.

\p{Line_Break: Hyphen} might be closer to \p{Hyphen} than the
alphabetic/ambiguous \p{Line_Break}.


I think we can safely change it to \p{Line_Break: Hyphen}. But first,
@Dingyuan, please tell us your OS and perl versions, so we can see whether
it's just that the newer perl versions handle \p{Line_Break}
automatically while the older ones don't.


On Wed, Mar 29, 2017 at 12:36 AM, liling tan <alvati...@gmail.com> wrote:

> Hi Dingyuan, Hieu,
>
> Thanks for highlighting the issue.
>
> The deprecation warning from mteval has been there since early 2015; see
> https://www.mail-archive.com/moses-support@mit.edu/msg12057.html
>
> The fix at https://github.com/moses-smt/mosesdecoder/pull/170 was
> adhering to unicode annex on 2016-06-01.
>
> Actually \p{Line_Break:Hyphen} is a little more restrictive than
> \p{Line_Break}.
> Technically, if the script didn't break with \p{Line_Break:Hyphen},
> the superset \p{Line_Break} shouldn't break either. The Perl Unicode
> properties documentation has detailed explanations:
> http://perl11.org/pod/perluniprops.html
>
> The current mteval-v13a.pl works fine on:
> - Ubuntu 16.04.01 with perl v5.22.1
> - Mac OS Sierra with perl v5.18.2
>
> @Dingyuan, a couple of short questions.
>
> (i) Which OS are you using? Which version of the OS?
>
> (ii) What is your perl version? You can check with "perl -v".
>
> That'll help us to know which OS and/or perl version caused that error.
>
> Regards,
> Liling
>
>
> On Tue, Mar 28, 2017 at 8:52 PM, Hieu Hoang <hieuho...@gmail.com> wrote:
>
>> If you find that {Line_Break:Hyphen} works, please consider checking it
>> in.
>>
>> These compatibility issues are difficult to debug alone and depend on
>> the exact perl/OS version you're running. Your fix will add a little to the
>> body of knowledge.
>>
>> * Looking for MT/NLP opportunities *
>> Hieu Hoang
>> http://moses-smt.org/
>>
>>
>> On 28 March 2017 at 02:48, Dingyuan Wang <abcdoyle...@gmail.com> wrote:
>>
>>> -----BEGIN PGP SIGNED MESSAGE-----
>>> Hash: SHA256
>>>
>>> Recently mteval-v13a.pl stopped working, printing:
>>>
>>> Can't find Unicode property definition "Line_Break" in regex; marked
>>> by <-- HERE in m/\p{Line_Break} <-- HERE \p{Zl}/ at
>>> /home/gumble/software/moses/scripts/generic/mteval-v13a.pl line 953.
>>>
>>> I see this commit
>>> <https://github.com/moses-smt/mosesdecoder/commit/c6c3bc84b7673618f379482cbc6b708f55a9ecd3>.
>>> I found that changing this to \p{Line_Break: Hyphen} worked. Is this
>>> the equivalent of \p{Hyphen}?
>>>
>>> - --
>>> Dingyuan Wang
>>> -----BEGIN PGP SIGNATURE-----
>>>
>>> iQEzBAEBCAAdFiEEjE4PLbCEqfvlC0rjs+TYPj8+X9wFAljZwNYACgkQs+TYPj8+
>>> X9xShggAhSjPEEYXsiRPT9wVljRV7XjBmexe/E7EKzl9b/PEnuxlSNSrz/0Estr5
>>> 8/H4s+lwKdv9xx1jTxOGOVkToiVC95QkuppXX3WS+BCDjajE8fqWc2Y0IhUWRaqf
>>> PAhhotEZmoWAhQC/qVM7lILf29N9OhQ2FStQH9rn+LpD2dkSZweZ0XGJ+CFpCdaP
>>> VA7XPWJCJZeEBUsBqrSxl1Cwzr+KQ4pw/NFP6yxJ+smmTkUSyp2FfYCtvalx/L0d
>>> 2UZ1fiujzco7NHeJW/0ZYwsb+NNMuM7CljBMhQAWIN+D0f6Wz1/bHH8jhFyxUw0B
>>> +/hN/chrAmYX+Kz2j/MKc7eXZuPtmA==
>>> =C+jT
>>> -----END PGP SIGNATURE-----
>>> _______________________________________________
>>> Moses-support mailing list
>>> Moses-support@mit.edu
>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>
>>
>>
>
