Interesting. I've just taken a careful look at the perluniprops documentation: the generic \p{Line_Break} doesn't exist there, but Ubuntu and Mac OS let it slip past without any error or warning. I suppose \p{Line_Break} automatically resolves to \p{Line_Break: Alphabetic} or \p{Line_Break: Ambiguous} on those setups.
Maybe the Unicode property handling differs across machines, OSes, or perl versions. \p{Line_Break: Hyphen} is probably closer to \p{Hyphen} than the alphabetic/ambiguous reading of \p{Line_Break}, so I think we can safely change it to \p{Line_Break: Hyphen}.

But first, @Dingyuan, please tell us your OS and perl versions, so we can see whether it's just that newer perl versions handle \p{Line_Break} automatically while older ones don't.

On Wed, Mar 29, 2017 at 12:36 AM, liling tan <alvati...@gmail.com> wrote:

> Hi Dingyuan, Hieu,
>
> Thanks for highlighting the issue.
>
> The deprecation warning from mteval has been there since early 2015 on
> https://www.mail-archive.com/moses-support@mit.edu/msg12057.html
>
> The fix at https://github.com/moses-smt/mosesdecoder/pull/170 was
> adhering to unicode annex on 2016-06-01.
>
> Actually \p{Line_Break:Hyphen} is a little more restricting than
> \p{Line_Break}.
> Technically, if the script didn't break with \p{Line_Break:Hyphen} ,
> the superset \p{Line_Break} shouldn't break. Perl unicode properties
> documentation has the detailed explanations:
> http://perl11.org/pod/perluniprops.html
>
> The current mteval-v13a.pl works fine on:
> - Ubuntu 16.04.01 with perl v5.22.1
> - Mac OS Sierra with perl v5.18.2
>
> @Dingyuan, a couple of short questions.
>
> (i) Which OS are you using? Which version of the OS?
>
> (ii) What is your perl version? use command "perl -v"
>
> That'll help us to know which OS and/or perl version caused that error.
>
> Regards,
> Liling
>
>
> On Tue, Mar 28, 2017 at 8:52 PM, Hieu Hoang <hieuho...@gmail.com> wrote:
>
>> If you find that {Line_Break:Hyphen} works, please consider checking it
>> in.
>>
>> These compatibility issues are difficult to debug alone and depends on
>> the exact perl/OS version you're running. Your fix will add a little to the
>> body of knowledge
>>
>> * Looking for MT/NLP opportunities *
>> Hieu Hoang
>> http://moses-smt.org/
>>
>>
>> On 28 March 2017 at 02:48, Dingyuan Wang <abcdoyle...@gmail.com> wrote:
>>
>>> -----BEGIN PGP SIGNED MESSAGE-----
>>> Hash: SHA256
>>>
>>> Recently mteval-v13a.pl stopped working, printing:
>>>
>>> Can't find Unicode property definition "Line_Break" in regex; marked
>>> by <-- HERE in m/\p{Line_Break} <-- HERE \p{Zl}/ at
>>> /home/gumble/software/moses/scripts/generic/mteval-v13a.pl line 953.
>>>
>>> I see this commit
>>> <https://github.com/moses-smt/mosesdecoder/commit/c6c3bc84b7673618f379482cbc6b708f55a9ecd3>.
>>> I found that changing this to \p{Line_Break: Hyphen} worked. Is this
>>> the equivalent of \p{Hyphen}?
>>>
>>> - --
>>> Dingyuan Wang
>>> -----BEGIN PGP SIGNATURE-----
>>>
>>> iQEzBAEBCAAdFiEEjE4PLbCEqfvlC0rjs+TYPj8+X9wFAljZwNYACgkQs+TYPj8+
>>> X9xShggAhSjPEEYXsiRPT9wVljRV7XjBmexe/E7EKzl9b/PEnuxlSNSrz/0Estr5
>>> 8/H4s+lwKdv9xx1jTxOGOVkToiVC95QkuppXX3WS+BCDjajE8fqWc2Y0IhUWRaqf
>>> PAhhotEZmoWAhQC/qVM7lILf29N9OhQ2FStQH9rn+LpD2dkSZweZ0XGJ+CFpCdaP
>>> VA7XPWJCJZeEBUsBqrSxl1Cwzr+KQ4pw/NFP6yxJ+smmTkUSyp2FfYCtvalx/L0d
>>> 2UZ1fiujzco7NHeJW/0ZYwsb+NNMuM7CljBMhQAWIN+D0f6Wz1/bHH8jhFyxUw0B
>>> +/hN/chrAmYX+Kz2j/MKc7eXZuPtmA==
>>> =C+jT
>>> -----END PGP SIGNATURE-----
>>> _______________________________________________
>>> Moses-support mailing list
>>> Moses-support@mit.edu
>>> http://mailman.mit.edu/mailman/listinfo/moses-support