Hi,

        I've taken a shot in the dark based on physmem.c to support physical
memory estimation on BSD and OS X.  Please clone

github.com/kpu/kenlm

and compile with

./bjam

If that fails, please let Hieu and me know (maybe Hieu can help since he
has OS X).  If it doesn't fail, run

bin/lmplz

with no arguments.  The help message will include a line such as

"This machine has 135224176640 bytes of memory."

or

"Unable to determine the amount of memory on this machine."

If it works, then I'll push to Moses.  I'm trying not to break Moses master
on OS X.
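
For reference, the BSD and OS X route amounts to asking sysctl for the
memory size.  The following is only a minimal sketch of that idea, not the
code in the repository.  The name GuessPhysicalMemorySketch is made up for
illustration, and it assumes HW_MEMSIZE (OS X) or HW_PHYSMEM64 (NetBSD,
OpenBSD) is available as a 64-bit value; everywhere else it falls back to
sysconf and returns 0 if nothing works.

```cpp
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>
#if defined(__APPLE__) || defined(__FreeBSD__) || defined(__NetBSD__) || defined(__OpenBSD__)
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

// Hypothetical name.  Returns 0 when the size cannot be determined.
uint64_t GuessPhysicalMemorySketch() {
#if defined(CTL_HW) && (defined(HW_MEMSIZE) || defined(HW_PHYSMEM64))
  // Assumption: both of these sysctl names report a 64-bit byte count.
#if defined(HW_MEMSIZE)
  int mib[2] = {CTL_HW, HW_MEMSIZE};
#else
  int mib[2] = {CTL_HW, HW_PHYSMEM64};
#endif
  uint64_t bytes = 0;
  size_t len = sizeof(bytes);
  if (sysctl(mib, 2, &bytes, &len, NULL, 0) == 0 && len == sizeof(bytes))
    return bytes;
#endif
#if defined(_SC_PHYS_PAGES) && defined(_SC_PAGESIZE)
  // Same approach as the existing short function: pages times page size.
  long pages = sysconf(_SC_PHYS_PAGES);
  long page_size = sysconf(_SC_PAGESIZE);
  if (pages > 0 && page_size > 0)
    return static_cast<uint64_t>(pages) * static_cast<uint64_t>(page_size);
#endif
  return 0;
}
```

Unlike the GNU code, this sketch never makes up a number: a return value of
0 always means "unknown".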

Kenneth

On 11/24/13 22:40, Prasanth K wrote:
> Hi Kenneth, 
> 
> Thanks for the clarification w.r.t. calculating the memory size. But I
> am running these on a Mac (10.9 Mavericks). Do you think I should still
> port the lmplz code to Mac for the estimation of probabilities? 
> 
> One thing though, I did change the default clang compiler that comes
> with this new Mac to a gcc-4.8 (not sure that changes anything in this
> context). 
> 
> - Prasanth
> 
> 
> 
> 
> On Fri, Nov 22, 2013 at 6:50 PM, Kenneth Heafield <mo...@kheafield.com
> <mailto:mo...@kheafield.com>> wrote:
> 
>     Hi,
> 
>             What OS are you on?  Cygwin?  Apparently every OS reports
>     memory size in a different way:
> 
>     http://git.savannah.gnu.org/gitweb/?p=gnulib.git;a=blob;f=lib/physmem.c;h=2629936146e3042f927523322f18aca76996cd7f;hb=HEAD
> 
>     The good news is that the above code is LGPLv2:
> 
>     http://git.savannah.gnu.org/gitweb/?p=gnulib.git;a=blob;f=modules/physmem;h=9644522e0493a85a9fb4ae7c4449741c2c1500ea;hb=HEAD
> 
>     But currently I'm just using this short function that will fail on some
>     platforms:
> 
>     uint64_t GuessPhysicalMemory() {
>     #if defined(_WIN32) || defined(_WIN64)
>       return 0;
>     #elif defined(_SC_PHYS_PAGES) && defined(_SC_PAGESIZE)
>       long pages = sysconf(_SC_PHYS_PAGES);
>       if (pages == -1) return 0;
>       long page_size = sysconf(_SC_PAGESIZE);
>       if (page_size == -1) return 0;
>       return static_cast<uint64_t>(pages) * static_cast<uint64_t>(page_size);
>     #else
>       return 0;
>     #endif
>     }
> 
>     If it fails, I just don't let users specify memory as a percentage.  So
>     one thing to fix is putting physmem.{h,c} in util then changing
>     calls to GuessPhysicalMemory.  But I'm also not a fan of the way the GNU
>     code gives up and makes up a number at the end.
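
To illustrate why a result of 0 rules out percentages, here is a minimal
sketch of resolving a memory argument against the detected size.  The name
ResolveMemorySketch and the error strings are hypothetical, and this is not
lmplz's actual parser: it only accepts a plain byte count or a value ending
in '%'.

```cpp
#include <stdint.h>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical helper.  physical_bytes == 0 means "could not be determined".
uint64_t ResolveMemorySketch(const std::string &arg, uint64_t physical_bytes) {
  if (arg.empty()) throw std::invalid_argument("empty memory size");
  char *end = 0;
  if (arg[arg.size() - 1] == '%') {
    // A percentage is meaningless without a known total, so refuse it.
    if (physical_bytes == 0)
      throw std::runtime_error("physical memory size could not be determined");
    const std::string number = arg.substr(0, arg.size() - 1);
    double percent = std::strtod(number.c_str(), &end);
    if (number.empty() || *end != '\0' || percent <= 0.0 || percent > 100.0)
      throw std::invalid_argument("bad percentage: " + arg);
    return static_cast<uint64_t>(static_cast<double>(physical_bytes) * percent / 100.0);
  }
  // Otherwise treat the whole argument as a byte count.
  unsigned long long bytes = std::strtoull(arg.c_str(), &end, 10);
  if (*end != '\0') throw std::invalid_argument("bad memory size: " + arg);
  return bytes;
}
```

With this shape, an absolute size keeps working on platforms where detection
fails, and only the percentage form is rejected.
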
> 
>     The second porting issue is that lmplz makes parallel use of pread,
>     pwrite, and write.  Windows is unsafe in this regard (POSIX requires
>     that pread/pwrite not change the file pointer; Windows has no way to
>     implement that atomically).  To fix this, we'll always specify the file
>     offset in cases that happen concurrently.  Extend util/stream/io.* with
>     a PWrite class based on PWriteOrThrow then change FileBuffer to use
>     PWrite.  Then I guess one should rename PReadOrThrow/PWriteOrThrow to
>     something that indicates they're not-quite-POSIX on Windows.  Also, the
>     macros in these functions should detect Cygwin, bypassing Cygwin's
>     "Function not implemented" and calling Windows APIs directly (they're
>     already there for _WIN32).
> 
>     I don't have a Windows box so I can say what should be changed at a high
>     level, but need an actual user to ensure it compiles and runs correctly.
> 
>     Kenneth
> 
>     On 11/22/13 06:49, Prasanth K wrote:
>     > Hi,
>     >
>     > I am trying to use KenLM for building a language model on the Europarl
>     > corpus. Following the instructions in
>     > (http://www.statmt.org/moses/?n=FactoredTraining.BuildingLanguageModel#ntoc19),
>     > I added the few lines for getting KenLM to estimate the LM probabilities
>     > (order/n=5) to my EMS config file. Language model estimation dies
>     > during training with "Function not implemented" at the counting
>     > and sorting n-grams stage (the first stage itself). Does this mean there
>     > is something wrong with my installation? Or is it just insufficient
>     > memory?
>     >
>     > Incidentally, when I started giving the amount of memory in terms of %
>     > (80%) there was an error "Failed to parse .. into memory size because
>     > physical memory size could not be determined". I am also curious why
>     > this happens.
>     >
>     > Kenneth, can you shed some light on this? Thanks.
>     >
>     > - Regards,
>     > Prasanth
>     >
>     >
>     >
>     > --
>     > "Theories have four stages of acceptance. i) this is worthless nonsense;
>     > ii) this is an interesting, but perverse, point of view, iii) this is
>     > true, but quite unimportant; iv) I always said so."
>     >
>     >   --- J.B.S. Haldane
>     >
>     >
>     > _______________________________________________
>     > Moses-support mailing list
>     > Moses-support@mit.edu <mailto:Moses-support@mit.edu>
>     > http://mailman.mit.edu/mailman/listinfo/moses-support
>     >
> 
> 
> 
> 