I don't think GCC 5 is involved, as postgres is running dockerised
using the standard postgres 9.5 docker image, which is based on Debian
jessie, which in turn uses GCC 4:

https://github.com/docker-library/postgres/blob/master/9.5/Dockerfile

And postgis and open scene graph come from the jessie apt
repositories, which I assume use GCC 4 to compile everything too.

Still, it could well turn out to be the same underlying issue: both the
stack trace I posted and the one in the linked JIRA issues segfault in
dbconnector::postgres::Allocator::free.
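
To illustrate why a destructor in an unrelated library can end up in
MADlib's allocator at all: a replacement of the global operator new and
operator delete is not scoped to the library that defines it, and the
trace quoted below shows libosgDB's std::set cleanup reaching
MADlib's operator delete during exit(). The following is only a rough
sketch of that mechanism, not MADlib's code. toy_palloc, toy_pfree and
g_context_alive are made-up stand-ins for the PostgreSQL routines and
for the lifetime of a memory context, and instead of crashing it just
counts the frees that arrive after the "context" has gone away.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>
#include <set>
#include <string>

// Hypothetical stand-in for "the backend's memory context still exists".
static bool g_context_alive = true;
// Number of deallocations that arrived after the context was torn down.
static int g_late_frees = 0;

// Hypothetical stand-ins for palloc/pfree, backed by malloc/free.
// The real pfree would inspect allocator bookkeeping; this one only counts.
static void* toy_palloc(std::size_t n) { return std::malloc(n ? n : 1); }
static void toy_pfree(void* p) {
    if (!g_context_alive) ++g_late_frees;
    std::free(p);
}

// Replacing the global allocation functions affects every caller that
// resolves to these symbols, including containers owned by other libraries.
void* operator new(std::size_t n) {
    void* p = toy_palloc(n);
    if (!p) throw std::bad_alloc();
    return p;
}
void operator delete(void* p) noexcept { if (p) toy_pfree(p); }
void operator delete(void* p, std::size_t) noexcept { if (p) toy_pfree(p); }

// Returns how many frees reached the toy allocator after "teardown".
int late_frees_after_teardown() {
    g_context_alive = true;
    g_late_frees = 0;
    {
        // Plays the role of another library's registry of names.
        std::set<std::string> registry;
        registry.insert("an entry long enough to need a heap allocation");
        registry.insert("a second entry that also needs a heap allocation");
        assert(registry.size() == 2);
        g_context_alive = false;  // the "backend" releases its memory first
    }  // registry's destructor runs here and still calls our operator delete
    int n = g_late_frees;
    g_context_alive = true;
    return n;
}
```

Calling late_frees_after_teardown() from a small main() and printing the
result should give a non-zero count, since each tree node of the set is
released through the replaced operator delete. In the real backend the
equivalent late call lands in pfree, which would be consistent with the
crash in frame #0, though I have not confirmed that this is the actual
cause.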

On 24 October 2017 at 16:03, Rahul Iyer <[email protected]> wrote:
> Hi James,
>
> I think the problem goes beyond external libraries including postgis.
> Multiple developers have noticed the same issue when just using MADlib with
> GCC 5 and above (see MADLIB-1145
> <https://issues.apache.org/jira/browse/MADLIB-1145> and MADLIB-1068
> <https://issues.apache.org/jira/browse/MADLIB-1068>). There is a serious
> bug in memory management that some of us are looking at. Hopefully we'll
> have answers soon ...
>
> - iR
>
> On Tue, Oct 24, 2017 at 7:53 AM, James Gregory <[email protected]> wrote:
>
>> When running queries that make use of both madlib cosine_similarity
>> and postgis ST_Intersects, we often get segmentation faults. It
>> doesn't happen 100% of the time - perhaps it needs multiple queries
>> running in parallel to make the segfault happen, or it might be some
>> other random thing that triggers it.
>>
>> The stack trace is:
>>
>> Program terminated with signal SIGSEGV, Segmentation fault.
>> #0  0x000056464a888e64 in pfree ()
>> (gdb)
>> (gdb) bt
>> #0  0x000056464a888e64 in pfree ()
>> #1  0x00007f3ddf33ae4d in madlib::dbconnector::postgres::Allocator::free<(madlib::dbal::MemoryContext)0>
>>     (inPtr=inPtr@entry=0x56464b701ae0, this=<optimized out>)
>>     at /tmp/tmpbs9UjC/madlib-1.11.0/src/ports/postgres/dbconnector/Allocator_impl.hpp:189
>> #2  0x00007f3ddf33b16d in operator delete (ptr=0x56464b701ae0)
>>     at /tmp/tmpbs9UjC/madlib-1.11.0/src/ports/postgres/dbconnector/NewDelete.cpp:62
>> #3  0x00007f3ddf04c718 in deallocate (this=0x56464b6fa828, __p=<optimized out>)
>>     at /usr/include/c++/4.9/ext/new_allocator.h:110
>> #4  deallocate (__a=..., __n=1, __p=<optimized out>)
>>     at /usr/include/c++/4.9/ext/alloc_traits.h:185
>> #5  _M_put_node (this=0x56464b6fa828, __p=<optimized out>)
>>     at /usr/include/c++/4.9/bits/stl_tree.h:389
>> #6  _M_destroy_node (this=0x56464b6fa828, __p=<optimized out>)
>>     at /usr/include/c++/4.9/bits/stl_tree.h:410
>> #7  std::_Rb_tree<std::string, std::string,
>> std::_Identity<std::string>, std::less<std::string>,
>> std::allocator<std::string> >::_M_erase
>> (this=this@entry=0x56464b6fa828, __x=<optimized out>)
>>     at /usr/include/c++/4.9/bits/stl_tree.h:1247
>> #8  0x00007f3ddf04c6f4 in std::_Rb_tree<std::string, std::string,
>> std::_Identity<std::string>, std::less<std::string>,
>> std::allocator<std::string> >::_M_erase (this=0x56464b6fa828,
>> __x=0x56464b701a80)
>>     at /usr/include/c++/4.9/bits/stl_tree.h:1245
>> #9  0x00007f3ddc714a99 in osgDB::Registry::~Registry() ()
>>    from /usr/lib/x86_64-linux-gnu/libosgDB.so.100
>> #10 0x00007f3ddc714d99 in osgDB::Registry::~Registry() ()
>>    from /usr/lib/x86_64-linux-gnu/libosgDB.so.100
>> #11 0x00007f3e27b6ab29 in __run_exit_handlers (status=0,
>> listp=0x7f3e27ed85a8 <__exit_funcs>,
>>     run_list_atexit=run_list_atexit@entry=true) at exit.c:82
>> #12 0x00007f3e27b6ab75 in __GI_exit (status=<optimized out>) at exit.c:104
>> #13 0x000056464a748504 in proc_exit ()
>> #14 0x000056464a768c63 in PostgresMain ()
>> #15 0x000056464a501001 in ?? ()
>> #16 0x000056464a70c9d1 in PostmasterMain ()
>> #17 0x000056464a502187 in main ()
>>
>> osgDB::Registry is part of the open scene graph library
>> (libosgDB.so.100 in the stack trace), which is used by postgis due to
>> its dependency on SFCGAL.
>>
>> I can see in madlib's NewDelete.cpp the comment:
>>
>> * We override the C++ global memory allocation and deallocation functions. We
>> * map them to ultimately use the PostgreSQL memory routines to protect against
>> * memory leaks.
>>
>> I guess somehow the memory management in open scene graph interacts
>> badly with these overrides?
>>
>> A colleague is looking into using a more recent version of postgis,
>> which may make the problem go away, though we are already using madlib
>> 1.11 and postgis 2.3.3+dfsg-1.pgdg80+1, which are pretty recent. It's
>> also possible that the conflict only happens when using the
>> Debian/Ubuntu postgis binaries; perhaps installing postgis using pgxn
>> or even compiling from source would resolve the issue.
>>
>> Still, it seems a bit strange that a destructor in open scene graph is
>> being caused to segfault by a custom override of memory deallocation
>> in madlib?
>>
>> --
>> James
>>



-- 
James
