Thanks, that's useful info! Roberto Sanchez and I logged into a MIPS
machine on Tuesday and found corroborating evidence, which I wrote
up in the forwarded ticket:

https://jira.mongodb.org/browse/CDRIVER-2053

The problem is, the test suite starts hundreds of threads. I think the
solution is to run single-threaded, as you say, since that's simplest
and easiest to debug if a test fails in the future. Out of curiosity,
we'll also test a version that limits concurrency to 10 threads.
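
For illustration, here is a rough sketch of what I mean by limiting
concurrency: a fixed number of workers pull tests from a shared queue,
instead of one thread per test. This is not the actual TestSuite.c
code. The names work_queue_t, run_tests_bounded, sample_test and
MAX_WORKERS are hypothetical, and the cap of 10 is just the number
mentioned above.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define MAX_WORKERS 10 /* hypothetical upper bound on worker threads */

typedef void (*test_fn) (void);

typedef struct {
   test_fn *tests;        /* array of test functions to run */
   size_t n_tests;        /* number of entries in tests */
   size_t next;           /* index of the next unclaimed test */
   pthread_mutex_t mutex; /* guards next */
} work_queue_t;

/* Counter bumped by the placeholder test below. */
static atomic_int tests_run;

/* Hypothetical stand-in for a real test case. */
static void
sample_test (void)
{
   atomic_fetch_add (&tests_run, 1);
}

/* Each worker claims one test at a time until the queue is empty. */
static void *
worker (void *data)
{
   work_queue_t *q = data;

   for (;;) {
      test_fn fn = NULL;

      pthread_mutex_lock (&q->mutex);
      if (q->next < q->n_tests) {
         fn = q->tests[q->next++];
      }
      pthread_mutex_unlock (&q->mutex);

      if (!fn) {
         return NULL;
      }
      fn ();
   }
}

/* Run every test using at most max_workers threads (capped at
 * MAX_WORKERS). Returns the number of threads actually started. */
static size_t
run_tests_bounded (test_fn *tests, size_t n_tests, size_t max_workers)
{
   pthread_t threads[MAX_WORKERS];
   work_queue_t q = {tests, n_tests, 0};
   size_t n_threads = max_workers;
   size_t started = 0;
   size_t i;

   assert (tests || n_tests == 0);

   if (n_threads > MAX_WORKERS) {
      n_threads = MAX_WORKERS;
   }
   if (n_threads > n_tests) {
      n_threads = n_tests;
   }

   pthread_mutex_init (&q.mutex, NULL);

   for (i = 0; i < n_threads; i++) {
      if (pthread_create (&threads[started], NULL, worker, &q) != 0) {
         break; /* could not start more threads; use what we have */
      }
      started++;
   }

   if (started == 0) {
      worker (&q); /* fall back to running in the calling thread */
   }

   for (i = 0; i < started; i++) {
      pthread_join (threads[i], NULL);
   }

   pthread_mutex_destroy (&q.mutex);
   return started;
}
```

Calling run_tests_bounded (tests, n, 1) runs the tests one at a time
on a single worker thread, and passing 10 gives the capped variant. In
both cases the number of live threads stays fixed no matter how many
tests the suite contains.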

On Thu, Feb 16, 2017 at 10:18 AM, James Cowgill <jcowg...@debian.org> wrote:
> Hi,
>
> On Sat, 04 Feb 2017 14:01:10 +0200 Adrian Bunk <b...@debian.org> wrote:
>> Source: libbson
>> Version: 1.5.0-1
>> Severity: serious
>>
>> https://buildd.debian.org/status/package.php?p=libbson&suite=sid
>>
>> ...
>>     { "status": "PASS", "test_file": "/bson/utf8/from_unichar", "seed": 
>> "3897722749", "start": 918303.161886737, "end": 918303.161907618, "elapsed": 
>> 0.000020881 },
>>     { "status": "PASS", "test_file": "/bson/utf8/invalid", "seed": 
>> "3495301312", "start": 918303.172066716, "end": 918303.172096569, "elapsed": 
>> 0.000029853 },
>>     { "status": "PASS", "test_file": 
>> "/bson/decimal128/from_string/exponent_normalization", "seed": "2524386511", 
>> "start": 918303.160139775, "end": 918303.172289377, "elapsed": 0.012149602 },
>>     { "status": "PASS", "test_file": "/bson/as_json/stack_overflow", "seed": 
>> "4062366096", "start": 918303.082477899, "end": 918303.199222704, "elapsed": 
>> 0.116744805 },
>>     { "status": "PASS", "test_file": "/type/decimal128/decimal128-2", 
>> "seed": "973167199", "start": 918303.151844572, "end": 918303.202736059, 
>> "elapsed": 0.050891487 },
>>     { "status": "PASS", "test_file": "/type/decimal128/decimal128-3", 
>> "seed": "3407436287", "start": 918303.160160818, "end": 918303.227553065, 
>> "elapsed": 0.067392247 },
>>     { "status": "PASS", "test_file": "/bson/as_json/x1000", "seed": 
>> "3264176593", "start": 918303.082226314, "end": 918303.360835632, "elapsed": 
>> 0.278609318 },
>> /bin/bash: line 1:  2488 Aborted                 /bin/bash ./libtool 
>> --mode=execute $VALGRIND ./$TEST_PROG --threads --no-fork -F 
>> test-results.json
>
> Backtrace:
> Thread 232 "test-libbson" received signal SIGABRT, Aborted.
> [Switching to Thread 0x444e4a0 (LWP 20412)]
> __GI_raise (sig=<optimized out>) at ../sysdeps/unix/sysv/linux/raise.c:58
> 58      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
> (gdb) bt
> #0  __GI_raise (sig=<optimized out>) at ../sysdeps/unix/sysv/linux/raise.c:58
> #1  0x77de4e74 in __GI_abort () at abort.c:89
> #2  0x77f9186c in bson_realloc (mem=<optimized out>, num_bytes=<optimized 
> out>) at src/bson/bson-memory.c:153
> #3  0x77f7fb4c in _bson_impl_alloc_grow (size=<optimized out>, 
> impl=0x55b00518) at src/bson/bson.c:175
> #4  _bson_grow (bson=0x55b00518, size=<optimized out>) at src/bson/bson.c:209
> #5  0x77f810f8 in _bson_append_va (args=0x444d7cc, first_data=0x77f99183 
> <type> "\022bson_append_int64", first_len=1, n_pairs=4, n_bytes=<optimized 
> out>, bson=0x55b00518) at src/bson/bson.c:313
> #6  _bson_append (bson=0x55b00518, n_pairs=4, n_bytes=<optimized out>, 
> first_len=1, first_data=0x77f99183 <type> "\022bson_append_int64") at 
> src/bson/bson.c:392
> #7  0x77f82d34 in bson_append_int64 (bson=0x55b00518, key=<optimized out>, 
> key_length=<optimized out>, value=271) at src/bson/bson.c:1153
> #8  0x55573be8 in test_bson_writer_shared_buffer () at tests/test-writer.c:83
> #9  0x55554564 in TestSuite_RunTest (suite=0x7fff6ae4, test=0x555a8fc0, 
> mutex=0x7fff69fc, count=0x7fff69f8) at tests/TestSuite.c:444
> #10 0x555547cc in TestSuite_ParallelWorker (data=0x555cf3a8) at 
> tests/TestSuite.c:645
> #11 0x77f3b8c0 in start_thread (arg=0x0) at pthread_create.c:335
> #12 0x77ea39e4 in __thread_start () at 
> ../sysdeps/unix/sysv/linux/mips/clone.S:143
>
> The abort call in frame #2 is due to running out of memory and indeed
> running libbson through strace shows that mmap returns ENOMEM just
> before aborting.
>
> libbson appears to use up all of the MIPS virtual address space. MIPS
> was hit first because it only has 2GB of virtual memory (whereas most
> other 32-bit Linux arches have 3GB). The correct fix would be to try
> and reduce the amount of memory allocated - it seems that a lot of the
> memory being allocated is never used anyway.
>
> Making the testsuite run single threaded (as was mentioned on the
> upstream bug report) may work around this. Possibly the amount of memory
> allocated increases the more threads are in use?
>
> Thanks,
> James
>