Hi Lane,

For the compact phrase table and the compact reordering table you can use --minphr-memory and --minlexr-memory respectively. That will disable memory mapping entirely and just read both tables into RAM.

Best,
Marcin
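P.S. If you pass the options on the decoder command line, the invocation would look something like this (moses.ini, input.txt, and output.txt are placeholders for your own files):

  moses -f moses.ini --minphr-memory --minlexr-memory < input.txt > output.txt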
On 20.02.2016 00:29, Lane Schwartz wrote:
> Hey,
>
> This is mostly addressed to Kenneth, since as far as I know he's the
> author of the data structures involved.
>
> I have access to a cluster at the University of Illinois. The cluster
> here uses GPFS as its file system.
>
> I've observed that when running Moses, especially with lots of
> threads, the threads spend virtually all of their time at near 0% CPU
> usage, in D status (uninterruptible sleep, awaiting IO). When I copy
> my model files and config file to scratch space on local disk (and
> run cat $file > /dev/null on each model file), this issue disappears.
> It appears that cat $file > /dev/null on GPFS does not load the file
> into RAM the way it appears to on other file systems.
>
> I spent quite a bit of time today with three cluster admins / disk
> engineers trying to debug this problem.
>
> Their ultimate solution was for me to cp each $file from GPFS to
> /dev/shm, which as far as I can tell acts like a RAM disk. Doing so
> resolves the issue.
>
> Their best estimate of the problem is that each Moses thread appeared
> (from their perspective) to ask the file system for data in the model
> files, causing a new disk read (with a corresponding disk lock) every
> time. They believe this issue does not arise with local disk because
> cat $file > /dev/null pre-loads each file into RAM in that case, but
> does not do so with GPFS. Thus the threads are (according to this
> theory) getting bogged down by disk locks.
>
> I was puzzled by this, because I thought the probing data structures
> underlying the LM and the phrase table use memory mapping. I had
> (perhaps naively) assumed that when the memory mapping is initiated,
> the OS actively loads all of the file contents into the appropriate
> VM pages. Now the question is: is the memory mapping actually lazy,
> loading data from disk only on an as-needed basis? If so, that could
> explain the horrific disk delays I'm encountering. And if so, is it
> possible to alter the behavior of the memory mapping so that, when
> the map is initiated, it actually does load the entire file into
> memory eagerly?
>
> Thanks,
> Lane
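To the mmap question: yes, the default behavior is lazy. mmap() only establishes the mapping; pages are faulted in from the backing file on first access, one page miss at a time, which on a networked file system like GPFS can turn every cold access into a synchronous read. On Linux a mapping can be made eager with the MAP_POPULATE flag, or the kernel can be hinted with madvise(MADV_WILLNEED) after mapping. Below is a minimal sketch of the difference, assuming generic POSIX/Linux semantics; it is not Moses's actual loading code, and the file path is whatever you pass on the command line.

#define _GNU_SOURCE /* for MAP_POPULATE and madvise() */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(int argc, char **argv) {
    if (argc != 2) {
        fprintf(stderr, "usage: %s FILE\n", argv[0]);
        return 1;
    }

    int fd = open(argv[1], O_RDONLY);
    struct stat st;
    if (fd < 0 || fstat(fd, &st) < 0) {
        perror(argv[1]);
        return 1;
    }

    /* Default: lazy. The mapping is set up immediately, but each page
       is read from disk only when first touched, one fault at a time. */
    void *lazy = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (lazy == MAP_FAILED) { perror("mmap"); return 1; }

    /* Hint: ask the kernel to start readahead on the whole range. */
    madvise(lazy, st.st_size, MADV_WILLNEED);
    munmap(lazy, st.st_size);

    /* Eager (Linux-specific): MAP_POPULATE prefaults the entire file
       before mmap() returns, so later accesses do not block on disk. */
    void *eager = mmap(NULL, st.st_size, PROT_READ,
                       MAP_PRIVATE | MAP_POPULATE, fd, 0);
    if (eager == MAP_FAILED) { perror("mmap(MAP_POPULATE)"); return 1; }

    munmap(eager, st.st_size);
    close(fd);
    return 0;
}

Whether MAP_POPULATE or MADV_WILLNEED actually triggers efficient bulk readahead on GPFS is a separate question; copying the model files to /dev/shm, as the admins suggested, sidesteps the file system entirely.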