Hi Lane,
For the compact phrase table and reordering table you can use 
--minphr-memory and --minlexr-memory, respectively. That disables 
memory mapping entirely and just reads both into RAM.
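For example, both switches can simply be added to the usual moses 
command line (a sketch; file names are placeholders):

    moses -f moses.ini --minphr-memory --minlexr-memory < in.txt > out.txt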
Best,
Marcin

On 20.02.2016 00:29, Lane Schwartz wrote:
> Hey,
>
> This is mostly addressed to Kenneth, since as far as I know he's the 
> author of the data structures involved.
>
> I have access to a cluster at the University of Illinois. The cluster 
> here uses GPFS as its file system.
>
> I've observed that when running moses, especially with lots of 
> threads, the threads spend virtually all of their time at near 0% 
> CPU usage, in D (uninterruptible sleep, awaiting IO) status. When I 
> copy my model files and config file to scratch space on local disk 
> (and cat $file > /dev/null each model file), this issue disappears. 
> It appears that doing cat $file > /dev/null on GPFS does not load 
> the file into RAM the way it does on other file systems.
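>
> (The warm-up step is just a sequential read whose bytes are thrown 
> away; its only point is the side effect of filling the OS page 
> cache. A minimal C++ sketch of what cat $file > /dev/null amounts 
> to, with a hypothetical function name:
>
>     #include <fstream>
>     #include <vector>
>
>     // Read every byte of the file and discard it; on most file
>     // systems the kernel then keeps the pages in its cache.
>     void warm_page_cache(const char *path) {
>       std::ifstream in(path, std::ios::binary);
>       std::vector<char> buf(1 << 20);  // 1 MiB chunks
>       while (in.read(buf.data(), buf.size()) || in.gcount() > 0) {}
>     }
>
> On GPFS that caching evidently does not happen the same way.)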
>
> I spent quite a bit of time today with three cluster admins / disk 
> engineers trying to debug this problem.
>
> Their ultimate solution was for me to cp each $file from GPFS to 
> /dev/shm, which as far as I can tell acts like a RAM disk. Doing so 
> resolves the issue.
>
> Their best estimate is that moses appeared, from their perspective, 
> to ask the file system for data in the model files separately in 
> each thread, causing a new disk read (with a corresponding disk 
> lock) every time. They believe this issue does not show up on local 
> disk because there the cat $file > /dev/null pre-loads each file 
> into RAM, but on GPFS it does not. Thus the threads are (according 
> to this theory) getting bogged down by disk locks.
>
> I was puzzled by this, because I thought that the probing data 
> structure underlying the LM and the phrase table used memory 
> mapping. I had (perhaps naively) assumed that when the memory 
> mapping is initiated, the OS actively loads all of the file 
> contents into appropriate VM pages. Now the question is: is the 
> memory mapping actually lazy, only loading data from disk on an 
> as-needed basis? If so, that could explain the horrific disk delays 
> I'm encountering. And in that case, is it possible to alter the 
> behavior of the memory mapping so that when the memory map is 
> initiated, it actively loads the entire file into memory?
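>
> (For illustration: on Linux, eager loading can be requested at map 
> time with MAP_POPULATE, or hinted afterwards with 
> madvise(MADV_WILLNEED). A minimal stand-alone sketch, assuming the 
> mapping code could be patched to pass these flags; I don't know 
> whether moses or KenLM currently exposes such a knob:
>
>     #include <sys/mman.h>
>     #include <sys/stat.h>
>     #include <fcntl.h>
>     #include <unistd.h>
>     #include <cstdio>
>
>     int main(int argc, char **argv) {
>       if (argc != 2) {
>         std::fprintf(stderr, "usage: %s <model-file>\n", argv[0]);
>         return 1;
>       }
>       int fd = open(argv[1], O_RDONLY);
>       if (fd < 0) { std::perror("open"); return 1; }
>       struct stat st;
>       if (fstat(fd, &st) < 0) { std::perror("fstat"); return 1; }
>       // Plain mmap is lazy: pages are faulted in from disk on first
>       // access. MAP_POPULATE asks the kernel to pre-fault the whole
>       // mapping up front instead.
>       void *base = mmap(NULL, st.st_size, PROT_READ,
>                         MAP_PRIVATE | MAP_POPULATE, fd, 0);
>       if (base == MAP_FAILED) { std::perror("mmap"); return 1; }
>       // Softer alternative after a plain mmap:
>       //   madvise(base, st.st_size, MADV_WILLNEED);
>       munmap(base, st.st_size);
>       close(fd);
>       return 0;
>     }
>
> Whether GPFS services that pre-faulting efficiently is of course a 
> separate question.)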
>
> Thanks,
> Lane
>

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
