Forgot about the disk cache. My idle footprint sizes don't include
that. 

Have you ever had thrashing problems? The only time we've had
I/O performance issues was when we tried to tune a model in a virtual
machine and the host was also trying to share the hardware resources
with another resource-intensive virtual machine. Since then, we only use
dedicated hardware and stay away from cloud configurations.

Here are some other points when configuring SSD/RAID-0. We found it's best to
mount the partition with the option to disable atime updates (noatime), and
to use ext2 or configure ext4 partitions to disable journaling.


http://fenidik.blogspot.com/2010/03/ext4-disable-journal.html 

Also, when creating the mdadm RAID-0, manually create the device using
--chunk=1024.
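
As a rough sketch of the steps above (not the exact commands from any production setup), the helper below only builds the command lines; it does not run anything. The device names, the /dev/md0 array name, and the /mnt/models mount point are placeholders, and dropping the journal at mkfs time with "-O ^has_journal" is just one way to get an ext4 partition without journaling.

```python
def raid0_setup_commands(devices, md_device="/dev/md0",
                         mount_point="/mnt/models", chunk_kib=1024):
    """Return argument lists for a RAID-0 + ext4 (no journal) + noatime setup.

    All paths are hypothetical placeholders; nothing is executed here.
    """
    if len(devices) < 2:
        raise ValueError("RAID-0 needs at least two devices")
    return [
        # Create the striped array with an explicit chunk size (in KiB).
        ["mdadm", "--create", md_device, "--level=0",
         f"--raid-devices={len(devices)}", f"--chunk={chunk_kib}", *devices],
        # Make an ext4 filesystem with the journal feature turned off.
        ["mkfs.ext4", "-O", "^has_journal", md_device],
        # Mount without access-time updates.
        ["mount", "-o", "noatime", md_device, mount_point],
    ]
```

Calling raid0_setup_commands(["/dev/sdb", "/dev/sdc"]) returns three argument lists that can be reviewed before being run by hand as root.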

On Sat, 26 May 2012 22:31:27 -0400, Jonathan Clark wrote:
Tom, 

I agree that things can be made to work well under the
current setup. It's just neither intuitive nor well documented.

By "hidden", I mean hidden from any standard Unix monitoring tools.
It is hidden from both the virtual and resident memory tallies,
which are where most Unix admins and hackers expect to see memory usage
show up; they're the essential tools for troubleshooting thrashing
issues.

In terms of thrashing, I'd also say that these issues aren't
quite well-documented. As Kenneth points out, the very name "on disk"
likely leads even relatively experienced users to think that having
enough free disk space is enough to allow the model to run efficiently.
The documentation reads: 

 "For larger tasks the phrase tables usually
become huge, typically too large to fit into memory. Therefore, moses
supports a binary phrase table with on-demand loading, i.e. only the
part of the phrase table that is required to translate a sentence is
loaded into memory." 

Depending on the disk cache to make repeated random-access disk reads
fast works fine until the cache runs out, but this requirement needs to
be documented. Otherwise, one might do a few tests with medium-sized
models, observe that performance is quite good and memory usage is
extremely low, and only later find that Moses thrashes with very large
models despite memory usage remaining low.

I agree that a RAID-0 setup over SSDs is a nice solution since they do
achieve near-RAM speeds. However, I'm just pointing out that the need for
either 1) having the entire model fit into disk cache (and how to determine
how much RAM this actually implies given a set of model files) or 2) having
your "disk" run at near-RAM speeds is neither well documented nor easy to
uncover via the usual monitoring tools.
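
One rough way to check point 1) is sketched below, on the assumption that the whole set of on-disk model files has to fit in the kernel's disk cache. It sums the file sizes and compares them with MemFree + Cached from /proc/meminfo (Linux only). The function names are made up for illustration, and MemFree + Cached is only an approximate upper bound on the cache room actually available.

```python
import os

def total_size_bytes(paths):
    """Sum the on-disk sizes of the given model files."""
    return sum(os.path.getsize(p) for p in paths)

def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> value.

    Fields reported in kB are converted to bytes; plain counts are kept as-is.
    """
    info = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        if not fields:
            continue
        value = int(fields[0])
        if len(fields) > 1 and fields[1] == "kB":
            value *= 1024
        info[name.strip()] = value
    return info

def cache_shortfall_bytes(model_bytes, meminfo):
    """Bytes by which the models exceed MemFree + Cached (0 if they fit)."""
    room = meminfo.get("MemFree", 0) + meminfo.get("Cached", 0)
    return max(0, model_bytes - room)
```

To use it, read /proc/meminfo into a string, pass it to parse_meminfo, and compare the result against total_size_bytes over the binarized phrase table, reordering table and language model files. A non-zero shortfall suggests the models will not stay cached and that repeated reads will go to disk.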

-Jon 

On Sat, May 26, 2012 at 9:40 PM, Tom Hoar wrote:

I'm not exactly sure how this can be
considered a "hidden memory requirement". By default, train-table-perl
creates a moses.ini configuration that loads these resources into RAM. A
user must edit the configuration to enable mem-mapped language model or
binarized tables, and these on-disk features are well-documented.


Performance issues related to these features are relatively easy to
troubleshoot and overcome. These are read-only files during translation
runtime. Our systems use working copies of these files stored on
mounted RAID-0 devices (mdadm). We typically build the RAID-0 with 2 or
more SSDs and achieve near-full RAM speed. Even better, there's no
delay loading models into RAM at start-up time. It really makes for a
nice multi-translation server system without mosesserver.

As for RAM requirements, we've found that moses configured for on-disk
needs about 90 MB per instance when idle (phrase-based, not hierarchical,
though we haven't tested since SVN 4153, Aug 2011). We frequently run 3-4
instances on a quad-core with 4 GB RAM and the RAID-0 configuration without
problems. With the on-disk configuration, each translation request loads
moses when needed and releases it and the RAM when finished. We haven't used
mosesserver in a long time because we developed this solution to
overcome some mosesserver memory leaks, which I think have been
resolved.
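
The figures above can be turned into a quick back-of-the-envelope check. This is only an illustration using the numbers quoted in this message (about 90 MB per idle instance, 4 GB RAM). It ignores the OS, other processes, and whatever each instance allocates while actually translating, and the function name is made up.

```python
def cache_headroom_mb(total_ram_mb, instances, idle_mb_per_instance=90):
    """RAM (in MB) not claimed by idle moses instances.

    This is at most what the kernel could devote to disk cache.
    """
    return total_ram_mb - instances * idle_mb_per_instance

# 4 GB machine, 4 idle instances at about 90 MB each:
print(cache_headroom_mb(4096, 4))  # 3736
```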

On Sat, 26 May 2012 12:21:55 -0400, Jonathan Clark  wrote: 
 

Personally, I would count "The OS sees the process as a small-memory
process and won't be tempted to kill it when it's running out of memory"
as a disadvantage rather than an advantage. If the OOM killer is trying
to stabilize the system, this will potentially prevent it from doing so.

The other disadvantage is the lack of accountability. If one is trying
to figure out *why* Moses is going so slowly and sees that it's not
using up much vmem or memory, this would usually lead to the conclusion
that it's not a memory-related issue and that a solution such as buying
more RAM won't help. This hidden memory requirement placed on the disk
cache can be quite confusing there. 
-Jon   

On Sat, May 26, 2012 at 9:09 AM, Hieu Hoang wrote:

You'll also get thrashing with memory-mapped files if you don't have
enough memory.

 Advantages of the file API:
 1. can access 2+GB files even running a 32 bit OS
 2. OS portable
 3. The OS sees the process as a small-memory process and won't be
    tempted to kill it when it's running out of memory
 Disadvantage:
 1. Slower (by how much?)

 On 25/05/2012 16:16, Kenneth Heafield wrote:
 > I have heard people have new phrase table formats.
 >
 > The OnDiskPt format is a file accessed with file APIs, not memory
 > mapping. Functionally, it uses the disk cache as shared memory (and the
 > kernel shares the disk cache across processes). There is also some
 > funny accounting going on because a process that depends on the disk
 > cache is not charged for usage of that memory while a mmapped process
 > would be. That means you can run Moses, it looks like it's fitting in
 > virtual memory, and still thrash the disk because you also need enough
 > disk cache to fit the entire phrase table. In this case, it is very
 > slow despite the name OnDiskPt.
 >
 > Kenneth
 >
 > On 05/25/2012 10:57 AM, Lane Schwartz wrote:
 >> Is there no current option to allow memory mapped phrase tables? I
 >> thought that's what the binary phrase table was.
 >>
 >> Lane
 >>
 >>
 >> On Fri, May 25, 2012 at 10:50 AM, Kenneth Heafield wrote:
 >>> Use memory mapping (KenLM 8 or 9 on Linux, 9 on non-Linux, or IRSTLM
 >>> with .mm) and the kernel takes care of shared memory for you.
 >>>
 >>> But there is merit to your argument e.g. different weights with the same
 >>> phrase tables. Perhaps the answer is to make the phrase tables memory
 >>> mapped. . .
 >>>
 >>> Kenneth
 >>>
 >>> On 05/25/2012 09:13 AM, Lane Schwartz wrote:
 >>>> I could imagine if you were translating N languages, all into a common
 >>>> target language, that it might be a memory footprint savings to be able
 >>>> to do this all within a common process. The savings would be from being
 >>>> able to have a single language model instance.
 >>>>
 >>>> Lane
 >>>>
 >>>> On Fri, May 25, 2012 at 2:00 AM, Philipp Koehn <pko...@inf.ed.ac.uk [6]> wrote:
 >>>>
 >>>> Hi,
 >>>>
 >>>> my understanding is that this is not currently possible.
 >>>>
 >>>> But why would you want to do this? If you translate with different
 >>>> systems, why not just run different processes?
 >>>>
 >>>> The motivation to do this in the server process is that it avoids
 >>>> keeping multiple server processes at the same time, which is not
 >>>> a concern with batch Moses.
 >>>>
 >>>> -phi
 >>>>
 >>>> On Thu, May 24, 2012 at 12:55 AM, Fong Po Po wrote:
 >>>>
 >>>> Dear all:
 >>>> I have read the page at
 >>>> http://www.statmt.org/moses/?n=Moses.AdvancedFeatures#ntoc22 [9]
 >>>> This page says that Moses Server can run multiple translation
 >>>> systems.
 >>>> Can traditional Moses (not Moses Server) also run multiple
 >>>> translation systems?
 >>>> Can you help me? Thanks!
 >>>> Best Regards,
 >>>> Fong Pui Chi
 >>>>
 >>>>
 >>>> _______________________________________________
 >>>> Moses-support mailing list
 >>>> Moses-support@mit.edu [10]
 >>>> http://mailman.mit.edu/mailman/listinfo/moses-support [12]

 >>>> --
 >>>> When a place gets crowded enough to require ID's, social collapse is not
 >>>> far away. It is time to go elsewhere. The best thing about space travel
 >>>> is that it made it possible to go elsewhere.
 >>>> -- R.A. Heinlein, "Time Enough For Love"


Links:
------
[1] mailto:tah...@precisiontranslationtools.com
[2] mailto:jon.h.cl...@gmail.com
[3] mailto:fishandfrol...@gmail.com
[4] mailto:mo...@kheafield.com
[5] mailto:pko...@inf.ed.ac.uk
[6] mailto:pko...@inf.ed.ac.uk
[7] mailto:fongpui...@yahoo.com.hk
[8] mailto:fongpui...@yahoo.com.hk
[9] http://www.statmt.org/moses/?n=Moses.AdvancedFeatures#ntoc22
[10] mailto:Moses-support@mit.edu
[11] mailto:Moses-support@mit.edu
[12] http://mailman.mit.edu/mailman/listinfo/moses-support
[13] mailto:Moses-support@mit.edu
[14] mailto:Moses-support@mit.edu
[15] http://mailman.mit.edu/mailman/listinfo/moses-support
[16] mailto:Moses-support@mit.edu
[17] http://mailman.mit.edu/mailman/listinfo/moses-support
[18] mailto:Moses-support@mit.edu
[19] http://mailman.mit.edu/mailman/listinfo/moses-support
[20] mailto:Moses-support@mit.edu
[21] http://mailman.mit.edu/mailman/listinfo/moses-support
[22] mailto:Moses-support@mit.edu
[23] http://mailman.mit.edu/mailman/listinfo/moses-support