Urghs, not good. Can I somehow get access to that machine? Is it deterministic?

On 13.04.2016 at 21:06, Barry Haddow wrote:
Hi Ales

Well, bitPos=18446744073708512633 looks bogus. Marcin?

cheers - Barry
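
One way to read that value, as a rough sketch: 18446744073708512633 sits just below 2^64, which is what a small negative number looks like when it is stored in an unsigned 64-bit integer. The snippet below reinterprets it as a signed value. That bitPos is held in an unsigned 64-bit variable is an assumption here; the thread does not confirm the actual type.

```python
# Reinterpret an unsigned 64-bit value as a signed (two's-complement) one.
# Assumption: bitPos is stored in an unsigned 64-bit integer; the thread
# does not confirm the real type used in the compact phrase table code.

def as_signed_64(value: int) -> int:
    """Return the two's-complement signed reading of a 64-bit unsigned value."""
    if not 0 <= value < 2**64:
        raise ValueError("value does not fit in 64 bits")
    return value - 2**64 if value >= 2**63 else value


bit_pos = 18446744073708512633  # value reported in the debug backtrace
print(as_signed_64(bit_pos))    # -1038983
```

A negative offset of roughly a million bits would be consistent with an underflow in an index computation, although an uninitialised or overwritten variable could produce a similar-looking number.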

On 13/04/16 17:23, Aleš Tamchyna wrote:
Hi all,

sorry for the delay. I'm attaching the debug backtrace.

Best,
Ales

On Wed, Apr 13, 2016 at 1:49 PM, Barry Haddow <bhad...@staffmail.ed.ac.uk> wrote:

    Hi

    The backtrace would be more informative if you run with a debug
    build (add variant=debug to bjam). Sometimes this makes bugs go
    away, or new bugs appear, but if not then it will give more
    information. You can run with core files enabled (ulimit -c
    unlimited) to save having to run Moses inside gdb.

    If the bug is random, but not thread related, then it could well
    be memory corruption. Running Moses in valgrind can help track
    this down (again, using a debug build is better). Note that the
    suffix arrays crash valgrind (last time I checked), so don't build
    them in.

    cheers - Barry
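
For runs launched from a script rather than an interactive shell, the core-file step can be sketched as follows. This is only an approximation of `ulimit -c unlimited`: it raises the soft core-size limit to whatever the hard limit allows, and the helper names are made up for illustration. Where the core file ends up depends on the system's core pattern settings, which this sketch does not touch.

```python
import resource
import subprocess


def enable_core_dumps():
    """Raise the soft core-file size limit to the hard limit for this process."""
    _, hard = resource.getrlimit(resource.RLIMIT_CORE)
    # Raising the soft limit up to the hard limit needs no extra privileges.
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return hard


def run_with_core_dumps(cmd):
    """Run cmd with core dumps enabled; the child inherits the raised limit."""
    enable_core_dumps()
    return subprocess.run(cmd).returncode
```

If the hard limit is itself 0, no core file will be written and the limit has to be raised by other means first.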


    On 13/04/16 11:25, Ales Tamchyna wrote:
    Hi,
    Let me add some more information to this: when running Moses in gdb, I get the following backtrace:
    #0  0x00000000006e3ba4 in Moses::PhraseDecoder::CreateTargetPhraseCollection(Moses::Phrase const&, bool, bool) ()
    #1  0x00000000005cd2a7 in Moses::PhraseDictionaryCompact::GetTargetPhraseCollectionNonCacheLEGACY(Moses::Phrase const&) const ()
    #2  0x000000000048efe4 in Moses::PhraseDictionary::GetTargetPhraseCollectionLEGACY(Moses::Phrase const&) const ()
    #3  0x000000000048e6a0 in Moses::PhraseDictionary::GetTargetPhraseCollectionBatch(std::vector<Moses::InputPath*, std::allocator<Moses::InputPath*> > const&) const ()
    #4  0x0000000000560948 in Moses::TranslationOptionCollection::GetTargetPhraseCollectionBatch() ()
    #5  0x0000000000551a39 in Moses::TranslationOptionCollectionText::CreateTranslationOptions() ()
    #6  0x00000000004bddfc in Moses::Manager::Decode() ()
    #7  0x0000000000433bd4 in Moses::TranslationTask::Run() ()
    #8  0x0000000000496088 in Moses::ThreadPool::Execute() ()
    #9  0x00000000007cbdba in thread_proxy ()
    #10 0x00007fffc210c182 in start_thread (arg=0x7ffc23a5d700) at pthread_create.c:312
    #11 0x00007fffc1e3947d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
    This suggests the problem is somewhere in loading phrase translations from the compact phrase table.
    I’m not sure why the LEGACY functions are called but I’m assuming that these are “future” legacy methods and that they are in fact still used by phrase dictionary implementations (?).
    Best,
    Ales
    From: Ondrej Bojar
    Sent: Wednesday, 13 April 2016 12:19
    To: moses-support@mit.edu
    Cc: Roman Sudarikov; Ales Tamchyna
    Subject: Random segfaults with alternative decoding paths


    Hi,

    we're experiencing random segfaults when we use two phrase tables in alternative decoding paths. The exact commit of Moses we use is 6a06e7776a58b09e4ed5b1cf11eb64fbdd6b02a2, from April 1.

    We do have test runs on the exact same 200 input sentences, exact same moses.ini, on the very same machine, where one of the runs succeeds and the other dies after 45 sentences.


    Would anyone have any idea what we should be chasing?

    - it doesn't seem to be thread-related (segfault experienced with -threads 1 as well as -threads 8)
    - not related to nbest-list construction (we first had this problem in mert tuning, so we isolated this)
    - not related to having several LMs (we first had several LMs in the setup; we get the crash with just one as well)
    - not related to -search: the bug is there with -search set to 0, 1 or 4
    - seems related to data or data size: when we trained the first ttable on just a very small corpus, we did not get the segfault (yet)
    - not related to translation options caching: the bug is there even with -no-cache
    - not related to the specification of output-factors: left unspecified or set to 0<CR>1, the bug is there
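
To put a number on how random the crash is, the same command can be rerun many times on the same input and the exit status tallied. The sketch below does that. The function names, the `input.txt` path and the exact command line in the usage comment are placeholders rather than the setup actually used here.

```python
import signal
import subprocess
from collections import Counter


def classify(returncode: int) -> str:
    """Name the outcome of one run from its subprocess return code."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, subprocess reports death by signal N as return code -N.
        try:
            return signal.Signals(-returncode).name
        except ValueError:
            return f"signal {-returncode}"
    return f"exit {returncode}"


def repeat_runs(cmd, input_path=None, runs=10):
    """Run cmd `runs` times, feeding input_path on stdin, and tally outcomes."""
    outcomes = Counter()
    for _ in range(runs):
        stdin = open(input_path, "rb") if input_path else subprocess.DEVNULL
        try:
            proc = subprocess.run(cmd, stdin=stdin,
                                  stdout=subprocess.DEVNULL,
                                  stderr=subprocess.DEVNULL)
        finally:
            if input_path:
                stdin.close()
        outcomes[classify(proc.returncode)] += 1
    return outcomes


# Hypothetical usage (binary location and input file are placeholders):
# print(repeat_runs(["moses", "-f", "moses.ini", "-threads", "1"],
#                   input_path="input.txt", runs=20))
```

A tally such as 14 "ok" against 6 "SIGSEGV" for identical inputs would point away from a purely data-dependent bug and towards memory corruption or uninitialised state.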


    Here is the moses.ini:

    [input-factors]
    0

    [mapping]
    0 T 0
    1 T 1

    [distortion-limit]
    6

    [feature]
    Distortion
    KENLM lazyken=0 name=LM0 factor=0 path=lm.1.trie.lm order=4
    PhraseDictionaryCompact name=TranslationModel0 num-features=4 path=phrase-table.0-0,1.1.1 input-factor=0 output-factor=0,1 table-limit=100
    PhraseDictionaryCompact name=TranslationModel1 num-features=4 path=phrase-table.0-0,1.2.1 input-factor=0 output-factor=0,1 table-limit=100
    PhrasePenalty
    UnknownWordPenalty
    WordPenalty

    [weight]
    Distortion0= 0.3
    LM0= 0.5
    PhrasePenalty0= 0.2
    TranslationModel0= 0.2 0.2 0.2 0.2
    TranslationModel1= 0.2 0.2 0.2 0.2
    UnknownWordPenalty0= 1
    WordPenalty0= -1
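
As a quick sanity check on a configuration like the one above, a small script can confirm that every feature line declaring both name= and num-features= has that many values in [weight]. This is only a sketch with made-up function names and a deliberately simple parser that expects each feature on a single line; it is not how Moses itself reads the file.

```python
def parse_sections(text):
    """Split moses.ini-style text into {section name: [non-empty lines]}."""
    sections, current = {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            sections.setdefault(current, [])
        elif current is not None:
            sections[current].append(line)
    return sections


def weight_mismatches(text):
    """Return (name, expected, found) for features whose weight count is off."""
    sections = parse_sections(text)
    weights = {}
    for line in sections.get("weight", []):
        name, _, values = line.partition("=")
        weights[name.strip()] = values.split()
    problems = []
    for line in sections.get("feature", []):
        # First token is the feature class; the rest are key=value arguments.
        args = dict(tok.split("=", 1) for tok in line.split()[1:] if "=" in tok)
        if "name" in args and "num-features" in args:
            expected = int(args["num-features"])
            found = len(weights.get(args["name"], []))
            if found != expected:
                problems.append((args["name"], expected, found))
    return problems


SAMPLE = """
[feature]
PhraseDictionaryCompact name=TranslationModel0 num-features=4 path=phrase-table.0-0,1.1.1
PhraseDictionaryCompact name=TranslationModel1 num-features=4 path=phrase-table.0-0,1.2.1

[weight]
TranslationModel0= 0.2 0.2 0.2 0.2
TranslationModel1= 0.2 0.2 0.2 0.2
"""

print(weight_mismatches(SAMPLE))  # [] for the two phrase tables shown here
```

For the two phrase tables in this setup the counts agree, so a weight mismatch does not look like the cause of the crash.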


    The large setup that shows these crashes uses these big files:

    -rw-r--r-- 1 bojar ufal 584M Apr 13 09:19 lm.1.trie.lm
    -rw-r--r-- 1 bojar ufal 1.1G Apr 13 09:24 phrase-table.0-0,1.1.1.minphr
    -rw-r--r-- 1 bojar ufal 5.7M Apr 13 09:24 phrase-table.0-0,1.2.1.minphr


    Thanks,
       Ondrej.




    _______________________________________________
    Moses-support mailing list
    Moses-support@mit.edu
    http://mailman.mit.edu/mailman/listinfo/moses-support


    The University of Edinburgh is a charitable body, registered in
    Scotland, with registration number SC005336.
