Hi,

This is quite common. If you look at the scores, the entries that do
not make sense have very low values, so even though they are in the
phrase table, they will most probably never be used for translation.
I would not bother.
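
To check this on your own table, here is a minimal sketch. It assumes
the usual Moses score order in the third field: phi(f|e), lex(f|e),
phi(e|f), lex(e|f). The function names and the 1e-4 threshold are
hypothetical, chosen only for illustration; they are not part of Moses.

```python
def parse_entry(line):
    """Split one phrase-table line into (source, target, scores)."""
    fields = [f.strip() for f in line.split("|||")]
    source, target = fields[0], fields[1]
    scores = [float(s) for s in fields[2].split()]
    return source, target, scores


def is_implausible(scores, threshold=1e-4):
    """Flag an entry when either lexical weight is below the threshold.

    Assumed score order: phi(f|e), lex(f|e), phi(e|f), lex(e|f).
    The threshold is an arbitrary illustrative value.
    """
    return min(scores[1], scores[3]) < threshold


# Two entries copied from the phrase table quoted in this thread.
EXAMPLES = [
    "! ! ! ||| ! ! ! ||| 0.501834 0.219223 0.716905 0.50463 ||| 0-0 1-1 2-2 ||| 10 7 6 ||| |||",
    "! ! ||| budget ! ! ||| 0.0517067 0.363573 0.00795487 5.66022e-05 ||| 0-1 1-2 ||| 2 13 1 ||| |||",
]

for entry in EXAMPLES:
    source, target, scores = parse_entry(entry)
    label = "suspicious" if is_implausible(scores) else "ok"
    print(f"{label}: {source!r} -> {target!r}")
```

Of these two entries, only the "budget ! !" one is flagged, because
its lex(e|f) value is 5.66022e-05.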

Cheers
--
Felipe

On 23/09/15 at 16:50, Vincent Nguyen wrote:
> I agree and would like to.
> But this is tricky: look at the first 30 lines of my phrase table below.
>
> This happens a lot in the first lines of the table, where there are &apos
> or other odd codes: the EN/FR pairs do not match.
>
>
>
>
> ! ! ! ! ||| ! ! ! ! ||| 0.103413 0.132185 0.103413 0.401758 ||| 0-0 1-1 2-2 3-3 ||| 1 1 1 ||| |||
> ! ! ! ) ||| ! ! ! ) ||| 0.339323 0.167884 0.508985 0.4246 ||| 0-0 1-0 2-0 2-1 2-2 3-3 ||| 3 2 2 ||| |||
> ! ! ! ||| ! ! ! ||| 0.501834 0.219223 0.716905 0.50463 ||| 0-0 1-1 2-2 ||| 10 7 6 ||| |||
> ! ! ! ||| budget ! ! ! ||| 0.0517067 0.219223 0.0147733 4.50635e-05 ||| 0-1 1-2 2-3 ||| 2 7 1 ||| |||
> ! ! ) , ||| ! ! ) - , ||| 0.103413 0.111989 0.103413 0.00192967 ||| 0-0 1-1 2-2 3-3 3-4 ||| 1 1 1 ||| |||
> ! ! ) ||| ! ! ) ||| 0.103413 0.278429 0.103413 0.533321 ||| 0-0 1-1 2-2 ||| 1 1 1 ||| |||
> ! ! ||| ! ! ||| 0.625 0.363573 0.769231 0.633844 ||| 0-0 1-1 ||| 16 13 10 ||| |||
> ! ! ||| . ||| 4.65922e-08 6.71089e-07 0.00795487 0.140779 ||| 0-0 1-0 ||| 2.21954e+06 13 1 ||| |||
> ! ! ||| budget ! ! ||| 0.0517067 0.363573 0.00795487 5.66022e-05 ||| 0-1 1-2 ||| 2 13 1 ||| |||
> ! ! ||| nécessaire ! ! ||| 0.103413 0.363573 0.00795487 0.000130572 ||| 0-1 1-2 ||| 1 13 1 ||| |||
> ! [ never again ! ||| ! ||| 6.51628e-06 5.42074e-13 0.103413 0.796143 ||| 0-0 4-0 ||| 15870 1 1 ||| |||
> ! ] this is ||| tel est ||| 7.38667e-05 9.16191e-11 0.103413 0.00147917 ||| 2-0 3-1 ||| 1400 1 1 ||| |||
> ! ] this ||| tel ||| 1.09594e-05 1.44188e-10 0.103413 0.0035893 ||| 2-0 ||| 9436 1 1 ||| |||
> ! ] ||| ! ] ||| 0.103413 0.352335 0.103413 0.472387 ||| 0-0 1-1 ||| 1 1 1 ||| |||
> ! & quot ; ||| ! " . et ||| 0.0517067 2.36396e-12 0.0517067 1.88268e-05 ||| 0-0 1-1 2-1 3-3 ||| 2 2 1 ||| |||
> ! & quot ; ||| ! " ||| 0.000222394 1.44515e-11 0.0517067 0.518419 ||| 0-0 2-1 ||| 465 2 1 ||| |||
> ! & quot ||| ! " . ||| 0.000662906 8.30626e-09 0.0344711 0.00232791 ||| 0-0 1-1 2-1 ||| 156 3 1 ||| |||
> ! & quot ||| ! " ||| 0.00218918 8.30626e-09 0.339323 0.518419 ||| 0-0 2-1 ||| 465 3 2 ||| |||
> ! & ||| ! ||| 6.51628e-06 7.21755e-05 0.103413 0.796143 ||| 0-0 ||| 15870 1 1 ||| |||
> ! ' ] , addressed ||| ! " adressé ||| 0.103413 3.70838e-07 0.103413 0.00596848 ||| 0-0 1-1 2-1 4-2 ||| 1 1 1 ||| |||
> ! ' ] , ||| ! " ||| 0.000222394 2.49698e-06 0.103413 0.215573 ||| 0-0 1-1 2-1 ||| 465 1 1 ||| |||
> ! ' ] ||| ! " ||| 0.000222394 3.57128e-05 0.103413 0.215573 ||| 0-0 1-1 2-1 ||| 465 1 1 ||| |||
> ! ' ' Alstom shares ||| l' on constate un dysfonctionnement ||| 0.0344711 5.62605e-16 0.103413 1.03361e-14 ||| 1-0 2-0 1-1 3-4 4-4 ||| 3 1 1 ||| |||
> ! ' ' ||| l' on constate un ||| 0.0147733 1.56906e-11 0.0129267 2.2766e-12 ||| 1-0 2-0 1-1 ||| 7 8 1 ||| |||
> ! ' ' ||| l' on constate ||| 0.000984889 1.56906e-11 0.0129267 2.36929e-10 ||| 1-0 2-0 1-1 ||| 105 8 1 ||| |||
> ! ' ' ||| l' on ||| 6.76656e-06 1.56906e-11 0.0129267 6.18613e-06 ||| 1-0 2-0 1-1 ||| 15283 8 1 ||| |||
> ! ' ' ||| ou que l' on constate ||| 0.0344711 1.56906e-11 0.0129267 4.69534e-15 ||| 1-2 2-2 1-3 ||| 3 8 1 ||| |||
> ! ' ' ||| ou que l' on ||| 0.00304157 1.56906e-11 0.0129267 1.22594e-10 ||| 1-2 2-2 1-3 ||| 34 8 1 ||| |||
> ! ' ' ||| que l' on constate un ||| 0.0344711 1.56906e-11 0.0129267 4.56092e-14 ||| 1-1 2-1 1-2 ||| 3 8 1 ||| |||
> ! ' ' ||| que l' on constate ||| 0.00323167 1.56906e-11 0.0129267 4.74661e-12 ||| 1-1 2-1 1-2 ||| 32 8 1 ||| |||
>
>
>
> On 23/09/2015 15:12, Tom Hoar wrote:
>> Vincent,
>>
>> If you suspect bad entries, isn't it better to address the root of the
>> problem and prepare your training corpus better?
>>
>>
>> On 9/23/2015 6:46 PM, moses-support-requ...@mit.edu wrote:
>>> Date: Tue, 22 Sep 2015 20:24:02 +0200
>>> From: Philipp Koehn<p...@jhu.edu>
>>> Subject: Re: [Moses-support] is there a way to remove a bad entry in
>>>     the phrase table ?
>>> To: Vincent Nguyen<vngu...@neuf.fr>
>>> Cc: moses-support<moses-support@mit.edu>
>>>
>>> Hi,
>>>
>>> you can remove it manually (just edit the text file), there will be no
>>> negative consequences.
>>>
>>> However, it is not a realistic strategy to try to remove by hand every
>>> offending phrase table entry.
>>>
>>> -phi
>>>
>>> On Tue, Sep 22, 2015 at 4:05 PM, Vincent Nguyen<vngu...@neuf.fr>  wrote:
>>>
>>>> >Hi,
>>>> >
>>>> >If, after analysing the BLEU-Annotation file, we realize that
>>>> >there must be a bad entry in the phrase table, can we remove it
>>>> >manually or in some other way?
>>>> >
>>>> >Thanks.
>>>> >V.
>>>> >_______________________________________________
>>>> >Moses-support mailing list
>>>> >Moses-support@mit.edu
>>>> >http://mailman.mit.edu/mailman/listinfo/moses-support
>>>> >
>>
>> --
>> Best regards,
>>
>> Tom Hoar
>> Chief Executive Officer
>> /*Precision Translation Tools Pte Ltd*/
>> Singapore/Thailand
>> Web: www.precisiontranslationtools.com
>> <http://www.precisiontranslationtools.com>
>> Thailand Mobile: +66 87 345-1875
>> Skype: tahoar
>>
>>
>
>
>
>

-- 
Felipe Sánchez Martínez
Dep. de Llenguatges i Sistemes Informàtics
Universitat d'Alacant, E-03071 Alacant (Spain)
Tel.: +34 965 903 400, ext: 2966
http://www.dlsi.ua.es/~fsanchez
