Re: [Moses-support] [Bulk] Re: phrase table memory issue

2008-07-14 Thread Hieu Hoang
I get the same problem also. The issue seems to be with the obtuse Unix sort command. In some versions of sort, it may be sorting by a hash index rather than alphanumerically. Therefore, you need to force it to do an alphanumeric sort: sort -t'|' -k1,1 (the quotes keep the shell from treating | as a pipe). This fixed it for me. It's not the
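
A minimal sketch of the sort invocation above, run on a made-up four-line sample. The file names and sample entries are hypothetical, and prefixing the command with LC_ALL=C is an added assumption (it makes sort compare bytes instead of using the locale's collation); the message itself only gives the -t and -k options.

```shell
# Hypothetical sample in the "source ||| target ||| scores" layout.
printf '%s\n' \
  'b ||| y ||| 0.5' \
  'a b ||| x ||| 0.2' \
  'B ||| z ||| 0.1' \
  'a ||| w ||| 0.3' > phrase-table.sample

# -t'|' sets the field separator to '|' (quoted so the shell does not pipe);
# -k1,1 restricts the sort key to the first field, i.e. the source phrase.
# LC_ALL=C (assumption, not in the message) forces byte-wise comparison.
LC_ALL=C sort -t'|' -k1,1 phrase-table.sample > phrase-table.sorted

cat phrase-table.sorted
```

With byte-wise comparison, uppercase letters sort before lowercase ones, which is the kind of difference that shows up between locales.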

Re: [Moses-support] OT: LDC2004E12

2008-07-14 Thread John D. Burger
Ham, Michael wrote: "Those escape numbers are Unicode characters. The Chinese character set does not exist in ASCII, so you have to use UTF-8." Sorry if I wasn't clear: I'm talking about the Chinese side of LDC2004E12, which is in neither ASCII nor Unicode; it's in GB18030. Apparently,

Re: [Moses-support] binary phrase table issue

2008-07-14 Thread Wilson, Kevin
Hi Megan, I've also had this problem in the past. In my case it was fixed by typing export LC_ALL=C prior to running the processPhraseTable command. I hope that helps. Kevin.
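
A rough sketch of that workflow, assuming a hypothetical plain-text phrase table named phrase-table.txt. The sort -c check is an added illustration, not something from the message; it only reports whether the file is already in the order the C locale expects. The processPhraseTable arguments are deliberately left out, since the message does not give them.

```shell
# Hypothetical three-line phrase table, already in byte-wise order.
printf '%s\n' \
  'B ||| z ||| 0.1' \
  'a ||| w ||| 0.3' \
  'b ||| y ||| 0.5' > phrase-table.txt

# Every later command in this shell session now uses byte-wise collation.
export LC_ALL=C

# sort -c sorts nothing; it exits non-zero at the first out-of-order line.
if sort -c phrase-table.txt; then
    status="sorted"
else
    status="unsorted"
fi
echo "$status"

# Run processPhraseTable here, in the same shell, with your usual arguments.
```

If the check reports "unsorted", re-sorting the file in the same shell before binarizing keeps both steps under the same collation.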