Thrashing is actually a fairly common problem on machines whose RAM is 
managed by PBS or social agreement (i.e. users should look at top before 
running).  Put two jobs on one machine, one or both of which require 
disk cache in addition to their virtual size.  Now you've got a 
thrashing machine but PBS won't kill either job because their virtual 
sizes are still below their limits (and the limits sum to less than the 
total RAM in the machine).  On Philipp's group machines, it frequently 
happens that two people run Moses at the same time.  The total memory 
usage looks fine, but whoever runs cat >/dev/null on their phrase 
tables gets all the CPU time while the other thrashes.
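
A rough way to check this before starting a job is to compare the size of the model files against what the kernel could plausibly devote to disk cache. The sketch below is only an illustration under stated assumptions: it treats MemFree plus Cached from /proc/meminfo as the memory usable for caching (an approximation, since Cached also counts other things), and the function names are made up for this example rather than part of Moses or PBS. It also includes a Python equivalent of the cat >/dev/null trick.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of values (bytes where a kB unit is given)."""
    info = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        if not fields:
            continue
        value = int(fields[0])
        # Most entries are reported in kB; a few (e.g. HugePages_Total) are plain counts.
        if len(fields) > 1 and fields[1] == "kB":
            value *= 1024
        info[name.strip()] = value
    return info


def total_size(paths):
    """Sum the on-disk sizes of the model files, in bytes."""
    return sum(os.path.getsize(p) for p in paths)


def cache_headroom(meminfo, model_bytes):
    """Bytes left over after caching all model files; negative means a shortfall.

    Assumption: free memory plus the current page cache approximates what the
    kernel could use to cache the model files.
    """
    usable = meminfo["MemFree"] + meminfo["Cached"]
    return usable - model_bytes


def warm_cache(path, chunk_size=1 << 20):
    """Read a file sequentially and discard the data, like `cat path >/dev/null`."""
    total = 0
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total
```

On Linux one would feed it open("/proc/meminfo").read() and the list of phrase table and language model files. On a shared machine the model files of every concurrent job have to fit in the same cache, so a positive number for one user alone does not rule out thrashing.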

Kenneth

On 05/27/2012 11:23 AM, Tom Hoar wrote:
> Forgot about the disk cache. My idle footprint sizes don't include that.
>
> Have you ever had thrashing problems? The only time we've had I/O
> performance issues was when we tried to tune a model in a virtual machine
> and the host was also trying to share the hardware resources with
> another resource-intensive virtual machine. Since then, we only use
> dedicated hardware and stay away from cloud configurations.
>
> Here are some other points when configuring SSD/RAID-0. We found it's
> best to mount the partition with the option to disable atime updates,
> and use ext2 or configure ext4 partitions to disable journaling.
>
> http://fenidik.blogspot.com/2010/03/ext4-disable-journal.html
>
> Also, when creating the mdadm RAID-0, manually create the device using
> --chunk=1024.
>
> On Sat, 26 May 2012 22:31:27 -0400, Jonathan Clark <jhcl...@cs.cmu.edu>
> wrote:
>
>> Tom,
>> I agree that things can be made to work well under the current setup.
>> It's just neither intuitive nor well-documented.
>> By "hidden", I mean hidden in terms of any standard Unix monitoring
>> tools. It is hidden from both the virtual and resident memory
>> tallies, which are where most Unix admins and hackers expect to see
>> memory usage show up; they're the essential tools for troubleshooting
>> thrashing issues.
>> In terms of thrashing, I'd also say that these issues aren't quite
>> well-documented. As Kenneth points out, the very name "on disk" likely
>> leads even relatively experienced users to think that having enough
>> free disk space is enough to allow the model to run efficiently. The
>> documentation reads:
>> "For larger tasks the phrase tables usually become huge, typically too
>> large to fit into memory. Therefore, moses supports a binary phrase
>> table with on-demand loading, i.e. only the part of the phrase table
>> that is required to translate a sentence is loaded into memory."
>>
>> Depending on disk cache to make repeated random-access disk reads fast
>> works fine until cache runs out, but this requirement needs to be
>> documented. Otherwise, one might do a few tests with medium-sized
>> models and observe that performance is quite good and memory usage is
>> extremely low only to find that Moses thrashes with very large models,
>> despite memory usage remaining low.
>> I agree that a RAID-0 setup over SSDs is a nice solution since they do
>> achieve near-RAM speeds. However, I'm just pointing out that the need
>> for either 1) having the entire model fit into disk cache (and how to
>> determine how much RAM this actually implies given a set of model
>> files) or 2) having your "disk" have near-RAM speeds is not
>> well-documented nor easy to uncover via usual monitoring tools.
>> -Jon
>>
>>
>> On Sat, May 26, 2012 at 9:40 PM, Tom Hoar
>> <tah...@precisiontranslationtools.com> wrote:
>>
>>     I'm not exactly sure how this can be considered a "hidden memory
>>     requirement". By default, train-model.perl creates a moses.ini
>>     configuration that loads these resources into RAM. A user must
>>     edit the configuration to enable mem-mapped language model or
>>     binarized tables, and these on-disk features are well-documented.
>>
>>     Performance issues related to these features are relatively easy
>>     to troubleshoot and overcome. These are read-only files during
>>     translation runtime. Our systems use working copies of these
>>     files stored on mounted RAID-0 devices (mdadm). We typically build
>>     the RAID-0 with 2 or more SSD disks and achieve near-full RAM
>>     speed. Even better, there's no delay loading models into RAM at
>>     start-up time. It really makes for a nice multi-translation server
>>     system without mosesserver.
>>
>>     As for RAM requirements, we've found that moses configured for
>>     on-disk needs about 90 MB per instance when idle (phrase-based, not
>>     hierarchical, but we haven't tested since SVN 4153, Aug 2011). We
>>     frequently run 3-4 instances on a quad-core with 4 GB RAM and the
>>     RAID-0 configuration without problems. With the on-disk
>>     configuration, each translation request loads moses when needed
>>     and releases it and the RAM when finished. We haven't used
>>     mosesserver in a long time because we developed this solution to
>>     overcome some mosesserver memory leaks, which I think have been
>>     resolved.
>>
>>     On Sat, 26 May 2012 12:21:55 -0400, Jonathan Clark
>>     <jon.h.cl...@gmail.com> wrote:
>>
>>         Personally, I would count "The OS sees the process as a
>>         small-memory process and won't be
>>         tempted to kill it when it's running out of memory" as a
>>         disadvantage rather than an advantage. If the OOM killer is
>>         trying to stabilize the system, this will potentially prevent
>>         it from doing so.
>>
>>         The other disadvantage is the lack of accountability. If one
>>         is trying to figure out *why* Moses is going so slowly and
>>         sees that it's not using up much vmem or memory, this would
>>         usually lead to the conclusion that it's not a memory-related
>>         issue and that a solution such as buying more RAM won't help.
>>         This hidden memory requirement placed on the disk cache can be
>>         quite confusing there.
>>         -Jon
>>
>>
>>         On Sat, May 26, 2012 at 9:09 AM, Hieu Hoang
>>         <fishandfrol...@gmail.com> wrote:
>>
>>             You'll also get thrashing with memory-mapped files if you
>>             don't have enough memory.
>>
>>             Advantage of the file API:
>>             1. can access 2+GB files even running a 32 bit OS
>>             2. OS portable
>>             3. The OS sees the process as a small-memory process and won't be
>>             tempted to kill it when it's running out of memory
>>             Disadvantage:
>>             1. Slower (by how much?)
>>
>>             On 25/05/2012 16:16, Kenneth Heafield wrote:
>>             > I have heard people have new phrase table formats.
>>             >
>>             > The OnDiskPt format is a file accessed with file APIs, not memory
>>             > mapping. Functionally, it uses the disk cache as shared memory (and the
>>             > kernel shares the disk cache across processes). There is also some
>>             > funny accounting going on because a process that depends on the disk
>>             > cache is not charged for usage of that memory while a mmapped process
>>             > would be. That means you can run Moses, it looks like it's fitting in
>>             > virtual memory, and still thrash the disk because you also need enough
>>             > disk cache to fit the entire phrase table. In this case, it is very
>>             > slow despite the name OnDiskPt.
>>             >
>>             > Kenneth
>>             >
>>             > On 05/25/2012 10:57 AM, Lane Schwartz wrote:
>>             >> Is there no current option to allow memory mapped phrase tables? I
>>             >> thought that's what the binary phrase table was.
>>             >>
>>             >> Lane
>>             >>
>>             >>
>>             >> On Fri, May 25, 2012 at 10:50 AM, Kenneth Heafield
>>             >> <mo...@kheafield.com> wrote:
>>             >>> Use memory mapping (KenLM 8 or 9 on Linux, 9 on non-Linux, or IRSTLM
>>             >>> with .mm) and the kernel takes care of shared memory for you.
>>             >>>
>>             >>> But there is merit to your argument e.g. different weights with the same
>>             >>> phrase tables. Perhaps the answer is to make the phrase tables memory
>>             >>> mapped. . .
>>             >>>
>>             >>> Kenneth
>>             >>>
>>             >>> On 05/25/2012 09:13 AM, Lane Schwartz wrote:
>>             >>>> I could imagine if you were translating N languages, all into a common
>>             >>>> target language, that it might be a memory footprint savings to be able
>>             >>>> to do this all within a common process. The savings would be from being
>>             >>>> able to have a single language model instance.
>>             >>>>
>>             >>>> Lane
>>             >>>>
>>             >>>> On Fri, May 25, 2012 at 2:00 AM, Philipp Koehn
>>             >>>> <pko...@inf.ed.ac.uk> wrote:
>>             >>>>
>>             >>>> Hi,
>>             >>>>
>>             >>>> my understanding is that this is not currently possible.
>>             >>>>
>>             >>>> But why would you want to do this? If you translate with different
>>             >>>> systems, why not just run different processes?
>>             >>>>
>>             >>>> The motivation to do this in the server process is that it avoids
>>             >>>> keeping multiple server processes at the same time, which is not
>>             >>>> a concern with batch Moses.
>>             >>>>
>>             >>>> -phi
>>             >>>>
>>             >>>> On Thu, May 24, 2012 at 12:55 AM, Fong Po Po
>>             >>>> <fongpui...@yahoo.com.hk> wrote:
>>             >>>>
>>             >>>> Dear all:
>>             >>>> I have read the page at
>>             >>>> http://www.statmt.org/moses/?n=Moses.AdvancedFeatures#ntoc22
>>             >>>> This page says that Moses Server can run multiple
>>             >>>> translation systems.
>>             >>>> Can traditional Moses (not Moses Server) also run
>>             >>>> multiple translation systems?
>>             >>>> Can you help me? Thanks!
>>             >>>> Best Regards,
>>             >>>> Fong Pui Chi
>>             >>>>
>>             >>>>
>>             >>>> _______________________________________________
>>             >>>> Moses-support mailing list
>>             >>>> Moses-support@mit.edu
>>             >>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>             >>>>
>>             >>>>
>>             >>>>
>>             >>>> --
>>             >>>> When a place gets crowded enough to require ID's, social collapse is not
>>             >>>> far away. It is time to go elsewhere. The best thing about space travel
>>             >>>> is that it made it possible to go elsewhere.
>>             >>>> -- R.A. Heinlein, "Time Enough For Love"
>>             >>>>
>>             >>>>
>>
>
>
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
