Re: [Moses-support] binary phrase table issue

2008-07-14 Thread Wilson, Kevin
Hi Megan, I've also had this problem in the past. In my case it was fixed by typing "export LC_ALL=C" prior to running the processPhraseTable command. I hope that helps. Kevin.
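As a rough illustration of why the locale setting can matter here, the sketch below sorts the same few lines twice: once under whatever locale the shell happens to use, and once with the C locale forced. The sample data and variable names are made up for illustration and do not come from the thread; the processPhraseTable invocation itself is not shown.

```shell
# Illustrative sample data (made up, not from the thread).
sample='b
A
a
B'

# Order under whatever locale the shell currently uses; this varies by system.
locale_sorted=$(printf '%s\n' "$sample" | sort)

# Same tool, but forced to the C locale so comparison is by raw byte value.
c_sorted=$(printf '%s\n' "$sample" | LC_ALL=C sort)

printf 'current locale:\n%s\n\nC locale:\n%s\n' "$locale_sorted" "$c_sorted"

# The suggestion above applies the same setting to the whole session,
# so every later command started from this shell inherits it.
export LC_ALL=C
```

On many systems the two orderings differ, which suggests how a tool that expects one ordering could reject input sorted under another. Exporting the variable, rather than prefixing a single command, affects the sort step and anything run after it in the same shell.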

Re: [Moses-support] OT: LDC2004E12

2008-07-14 Thread John D. Burger
Ham, Michael wrote: > Those escape numbers are Unicode characters. The Chinese character set does not exist in ASCII, so you have to use UTF-8. Sorry if I wasn't clear: I'm talking about the Chinese side of LDC2004E12, which is not in ASCII or Unicode; it's in GB18030. Apparently, th

Re: [Moses-support] [Bulk] Re: phrase table memory issue

2008-07-14 Thread Hieu Hoang
I get the same problem too. The issue seems to be with the obtuse Unix sort command. In some versions of sort, it may be sorting by a hash index rather than doing an alphanumeric sort. Therefore, you need to force it to do an alphanumeric sort: sort -t"|" -k1,1 This fixed it for me. It's not th
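A small sketch of what the -t and -k options in that command do, using three made-up lines in the "source ||| target ||| score" layout. The sample entries and variable names are assumptions for illustration only, and LC_ALL=C is set here just to make the result reproducible across systems.

```shell
# Made-up entries in "source ||| target ||| score" form (not real data).
phrases='the house ||| das Haus ||| 0.5
a house ||| ein Haus ||| 0.4
the ||| der ||| 0.3'

# Whole-line comparison: the "|" separators take part in the ordering.
by_line=$(printf '%s\n' "$phrases" | LC_ALL=C sort)

# -t'|' sets "|" as the field separator and -k1,1 restricts the sort key
# to the first field, i.e. the text before the first "|".
by_key=$(printf '%s\n' "$phrases" | LC_ALL=C sort -t'|' -k1,1)

printf 'whole line:\n%s\n\nfirst field only:\n%s\n' "$by_line" "$by_key"
```

In this sample the two orderings differ: compared as whole lines, "the house ||| ..." comes before "the ||| ...", because "h" has a lower byte value than "|". With the key limited to the first field, "the " sorts ahead of "the house ", so each source phrase is ordered on its own text.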

Re: [Moses-support] phrase table memory issue

2008-07-14 Thread Megan Elmore ([EMAIL PROTECTED])
Hello again, Yes, I was using the command as described on the Moses web site at http://www.statmt.org/moses/?n=Moses.AdvancedFeatures. I have also tried piping the results from sort through uniq before piping it into processPhraseTable and encountered the same error. Perhaps I am unaware of som