Hi,

This is quite common. If you look at the scores, the entries that do not make sense have very low scores, so even though they are in the phrase table, they will most probably never be used for translation. I would not bother.
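
For anyone who wants to check this on their own table, here is a minimal sketch that flags entries where any feature score falls below a cutoff. It assumes the plain-text `|||`-separated layout shown in the quoted table, with the scores in the third field. The names `parse_entry` and `low_scoring` and the `1e-4` threshold are illustrative choices, not anything from Moses itself. The decoder ranks candidates with a weighted combination of all its features, so this is only a rough way to eyeball suspicious entries.

```python
def parse_entry(line):
    """Split one phrase-table line into (source, target, scores)."""
    fields = [field.strip() for field in line.split("|||")]
    source, target = fields[0], fields[1]
    scores = [float(value) for value in fields[2].split()]
    return source, target, scores


def low_scoring(lines, threshold=1e-4):
    """Yield entries with at least one feature score below threshold.

    The threshold is an arbitrary, hypothetical cutoff for inspection only.
    """
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        source, target, scores = parse_entry(line)
        if min(scores) < threshold:
            yield source, target, scores


# Two entries copied from the table quoted in this thread.
sample = [
    "! ! ||| ! ! ||| 0.625 0.363573 0.769231 0.633844 ||| 0-0 1-1 ||| 16 13 10 ||| |||",
    "! ! ||| . ||| 4.65922e-08 6.71089e-07 0.00795487 0.140779 ||| 0-0 1-0 ||| 2.21954e+06 13 1 ||| |||",
]

for source, target, scores in low_scoring(sample):
    print(f"{source!r} -> {target!r}: {scores}")
```

On these two entries only the mismatched pair (`! !` paired with `.`) is flagged; the sensible pair passes. To scan a real table, pass an open file handle instead of `sample`.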

Cheers
--
Felipe

On 23/09/15 at 16:50, Vincent Nguyen wrote:
> I agree and would like to.
> But this is tricky, look at the first 30 lines of my phrase table below.
>
> and this happens a lot in the first line of tables where there are &apos
> or weird codes, EN/FR pairs do not match.
>
> ! ! ! ! ||| ! ! ! ! ||| 0.103413 0.132185 0.103413 0.401758 ||| 0-0 1-1 2-2 3-3 ||| 1 1 1 ||| |||
> ! ! ! ) ||| ! ! ! ) ||| 0.339323 0.167884 0.508985 0.4246 ||| 0-0 1-0 2-0 2-1 2-2 3-3 ||| 3 2 2 ||| |||
> ! ! ! ||| ! ! ! ||| 0.501834 0.219223 0.716905 0.50463 ||| 0-0 1-1 2-2 ||| 10 7 6 ||| |||
> ! ! ! ||| budget ! ! ! ||| 0.0517067 0.219223 0.0147733 4.50635e-05 ||| 0-1 1-2 2-3 ||| 2 7 1 ||| |||
> ! ! ) , ||| ! ! ) - , ||| 0.103413 0.111989 0.103413 0.00192967 ||| 0-0 1-1 2-2 3-3 3-4 ||| 1 1 1 ||| |||
> ! ! ) ||| ! ! ) ||| 0.103413 0.278429 0.103413 0.533321 ||| 0-0 1-1 2-2 ||| 1 1 1 ||| |||
> ! ! ||| ! ! ||| 0.625 0.363573 0.769231 0.633844 ||| 0-0 1-1 ||| 16 13 10 ||| |||
> ! ! ||| . ||| 4.65922e-08 6.71089e-07 0.00795487 0.140779 ||| 0-0 1-0 ||| 2.21954e+06 13 1 ||| |||
> ! ! ||| budget ! ! ||| 0.0517067 0.363573 0.00795487 5.66022e-05 ||| 0-1 1-2 ||| 2 13 1 ||| |||
> ! ! ||| nécessaire ! ! ||| 0.103413 0.363573 0.00795487 0.000130572 ||| 0-1 1-2 ||| 1 13 1 ||| |||
> ! [ never again ! ||| ! ||| 6.51628e-06 5.42074e-13 0.103413 0.796143 ||| 0-0 4-0 ||| 15870 1 1 ||| |||
> ! ] this is ||| tel est ||| 7.38667e-05 9.16191e-11 0.103413 0.00147917 ||| 2-0 3-1 ||| 1400 1 1 ||| |||
> ! ] this ||| tel ||| 1.09594e-05 1.44188e-10 0.103413 0.0035893 ||| 2-0 ||| 9436 1 1 ||| |||
> ! ] ||| ! ] ||| 0.103413 0.352335 0.103413 0.472387 ||| 0-0 1-1 ||| 1 1 1 ||| |||
> ! & quot ; ||| ! " . et ||| 0.0517067 2.36396e-12 0.0517067 1.88268e-05 ||| 0-0 1-1 2-1 3-3 ||| 2 2 1 ||| |||
> ! & quot ; ||| ! " ||| 0.000222394 1.44515e-11 0.0517067 0.518419 ||| 0-0 2-1 ||| 465 2 1 ||| |||
> ! & quot ||| ! " . ||| 0.000662906 8.30626e-09 0.0344711 0.00232791 ||| 0-0 1-1 2-1 ||| 156 3 1 ||| |||
> ! & quot ||| ! " ||| 0.00218918 8.30626e-09 0.339323 0.518419 ||| 0-0 2-1 ||| 465 3 2 ||| |||
> ! & ||| ! ||| 6.51628e-06 7.21755e-05 0.103413 0.796143 ||| 0-0 ||| 15870 1 1 ||| |||
> ! ' ] , addressed ||| ! " adressé ||| 0.103413 3.70838e-07 0.103413 0.00596848 ||| 0-0 1-1 2-1 4-2 ||| 1 1 1 ||| |||
> ! ' ] , ||| ! " ||| 0.000222394 2.49698e-06 0.103413 0.215573 ||| 0-0 1-1 2-1 ||| 465 1 1 ||| |||
> ! ' ] ||| ! " ||| 0.000222394 3.57128e-05 0.103413 0.215573 ||| 0-0 1-1 2-1 ||| 465 1 1 ||| |||
> ! ' ' Alstom shares ||| l' on constate un dysfonctionnement ||| 0.0344711 5.62605e-16 0.103413 1.03361e-14 ||| 1-0 2-0 1-1 3-4 4-4 ||| 3 1 1 ||| |||
> ! ' ' ||| l' on constate un ||| 0.0147733 1.56906e-11 0.0129267 2.2766e-12 ||| 1-0 2-0 1-1 ||| 7 8 1 ||| |||
> ! ' ' ||| l' on constate ||| 0.000984889 1.56906e-11 0.0129267 2.36929e-10 ||| 1-0 2-0 1-1 ||| 105 8 1 ||| |||
> ! ' ' ||| l' on ||| 6.76656e-06 1.56906e-11 0.0129267 6.18613e-06 ||| 1-0 2-0 1-1 ||| 15283 8 1 ||| |||
> ! ' ' ||| ou que l' on constate ||| 0.0344711 1.56906e-11 0.0129267 4.69534e-15 ||| 1-2 2-2 1-3 ||| 3 8 1 ||| |||
> ! ' ' ||| ou que l' on ||| 0.00304157 1.56906e-11 0.0129267 1.22594e-10 ||| 1-2 2-2 1-3 ||| 34 8 1 ||| |||
> ! ' ' ||| que l' on constate un ||| 0.0344711 1.56906e-11 0.0129267 4.56092e-14 ||| 1-1 2-1 1-2 ||| 3 8 1 ||| |||
> ! ' ' ||| que l' on constate ||| 0.00323167 1.56906e-11 0.0129267 4.74661e-12 ||| 1-1 2-1 1-2 ||| 32 8 1 ||| |||
>
> On 23/09/2015 15:12, Tom Hoar wrote:
>> Vincent,
>>
>> If you suspect bad entries, isn't it better to address the root of the
>> problem and prepare your training corpus better?
>>
>> On 9/23/2015 6:46 PM, moses-support-requ...@mit.edu wrote:
>>> Date: Tue, 22 Sep 2015 20:24:02 +0200
>>> From: Philipp Koehn<p...@jhu.edu>
>>> Subject: Re: [Moses-support] is there a way to remove a bad entry in
>>> the phrase table ?
>>> To: Vincent Nguyen<vngu...@neuf.fr>
>>> Cc: moses-support<moses-support@mit.edu>
>>>
>>> Hi,
>>>
>>> you can remove it manually (just edit the text file), there will be no
>>> negative consequences.
>>>
>>> However, it is not a realistic strategy to try to remove by hand every
>>> offending phrase table entry.
>>>
>>> -phi
>>>
>>> On Tue, Sep 22, 2015 at 4:05 PM, Vincent Nguyen<vngu...@neuf.fr> wrote:
>>>
>>>> >Hi,
>>>> >
>>>> >I was wondering if after an analysis of the BLEU-Annotation file we
>>>> >realize that there must be a bad entry in the phrase table,
>>>> >we could remove it manually or in some other ways ?
>>>> >
>>>> >Thanks.
>>>> >V.
>>>> >_______________________________________________
>>>> >Moses-support mailing list
>>>> >Moses-support@mit.edu
>>>> >http://mailman.mit.edu/mailman/listinfo/moses-support
>>
>> --
>> Best regards,
>>
>> Tom Hoar
>> Chief Executive Officer
>> Precision Translation Tools Pte Ltd
>> Singapore/Thailand
>> Web: www.precisiontranslationtools.com
>> Thailand Mobile: +66 87 345-1875
>> Skype: tahoar

--
Felipe Sánchez Martínez
Dep. de Llenguatges i Sistemes Informàtics
Universitat d'Alacant, E-03071 Alacant (Spain)
Tel.: +34 965 903 400, ext: 2966
http://www.dlsi.ua.es/~fsanchez