Hi Megan,
I've also had this problem in the past. In my case it was fixed by
typing "export LC_ALL=C" prior to running the processPhraseTable
command. I hope that helps.
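
For anyone reading this later, here is a minimal sketch of what that
setting changes. It assumes the phrase table is passed through sort
before it reaches processPhraseTable. The helper name sort_bytewise and
the three sample lines are made up for illustration. With LC_ALL=C, sort
compares raw bytes rather than applying the collation rules of the
current locale, so uppercase letters come before lowercase ones.

```shell
# Hypothetical helper: sort stdin in plain byte order, whatever the
# caller's locale happens to be.
sort_bytewise() {
    LC_ALL=C sort
}

# Made-up sample lines in the "source ||| target" layout.
printf 'b ||| x\nB ||| y\na ||| z\n' | sort_bytewise
# Prints:
#   B ||| y
#   a ||| z
#   b ||| x
```

Setting the variable on the sort command alone limits the change to that
one step, whereas "export LC_ALL=C" applies it to everything run
afterwards in the same shell.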
Kevin.
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of
[EMAIL PROTECTED]
Sent: Monda
Ham, Michael wrote:
> Those escape numbers are Unicode characters. The Chinese character set
> does not exist in ASCII, so you have to use UTF-8.
Sorry if I wasn't clear: I'm talking about the Chinese side of
LDC2004E12, which is not in ASCII or Unicode; it's in GB18030.
Apparently, th
I get the same problem too.
The issue seems to be with the obtuse Unix sort command.
In some versions of sort, it may be sorting by a hash index rather than
alphanumerically. Therefore, you need to force it to do an alphanumeric
sort:
sort -t"|" -k1,1
This fixed it for me. It's not th
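
A small sketch of what the sort command above does, with two made-up
phrase-table lines. The helper name sort_by_source is hypothetical, and
pinning LC_ALL=C is an added assumption borrowed from the other
suggestion in this thread, not part of the command as posted. With
-t"|" -k1,1 the sort key is only the text before the first "|", so a
source phrase sorts ahead of longer phrases that begin with it.

```shell
# Hypothetical helper: sort on the text before the first "|" only.
# LC_ALL=C is an assumption added here to make the order byte-based.
sort_by_source() {
    LC_ALL=C sort -t'|' -k1,1
}

# Made-up sample lines in the "source ||| target ||| score" layout.
printf 'a b ||| x ||| 0.5\na ||| y ||| 0.2\n' | sort_by_source
# Prints:
#   a ||| y ||| 0.2
#   a b ||| x ||| 0.5
```

A plain byte-order sort of the whole line puts "a b" first instead,
because "b" compares lower than "|".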
Hello again,
Yes, I was using the command as described on the Moses web site at
http://www.statmt.org/moses/?n=Moses.AdvancedFeatures. I have also tried piping
the results from sort through uniq before piping it into processPhraseTable and
encountered the same error. Perhaps I am unaware of som