[ 
https://issues.apache.org/jira/browse/ARROW-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588313#comment-16588313
 ] 

Lukasz Bartnik edited comment on ARROW-1380 at 8/22/18 8:45 AM:
----------------------------------------------------------------

The first of these warnings could be probably addressed by not calling exit(0) 
from the signal handler. My impression is that after a signal is caught and 
exit() is called, main() never returns, and thus destructors for its local 
objects are not called. Below is the valgrind warning in question.

{code:java}
==1990== 33 bytes in 1 blocks are still reachable in loss record 1 of 2
==1990== at 0x4C3017F: operator new(unsigned long) (in 
/usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==1990== by 0x513088C: std::string::_Rep::_S_create(unsigned long, unsigned 
long, std::allocator<char> const&) (in 
/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25)
==1990== by 0x5130C55: std::string::_M_mutate(unsigned long, unsigned long, 
unsigned long) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25)
==1990== by 0x5131321: std::string::_M_replace_safe(unsigned long, unsigned 
long, char const*, unsigned long) (in 
/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25)
==1990== by 0x198A23: main (store.cc:937)
{code}

With changes as in I can reduce warnings to the one below. Looking at the code 
it's not clear if CreateObject() should be paired with a delete operation of if 
there is an internal pool/tracking mechanism.

{code}
pyarrow/tests/test_plasma.py::TestPlasmaClient::test_put_and_get command:  
valgrind --track-origins=yes --leak-check=full --show-leak-kinds=all 
--leak-check-heuristics=stdstring --error-exitcode=1 
/io/arrow/python/pyarrow/plasma_store_server -s 
/tmp/test_plasma-k6wtcvi4/plasma.sock -m 100000000
==575== Memcheck, a memory error detector
==575== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==575== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==575== Command: /io/arrow/python/pyarrow/plasma_store_server -s 
/tmp/test_plasma-k6wtcvi4/plasma.sock -m 100000000
==575== 
Allowing the Plasma store to use up to 0.1GB of memory.
Starting object store with directory /dev/shm and huge page support disabled
PASSED==575== 
==575== HEAP SUMMARY:
==575==     in use at exit: 552 bytes in 1 blocks
==575==   total heap usage: 178 allocs, 177 frees, 143,037 bytes allocated
==575== 
==575== 552 bytes in 1 blocks are still reachable in loss record 1 of 1
==575==    at 0x4C2FB0F: malloc (in 
/usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==575==    by 0x567F5F7: fdopen@@GLIBC_2.2.5 (iofdopen.c:122)
==575==    by 0x1BD47F: create_buffer(long) (malloc.cc:105)
==575==    by 0x1BFF17: fake_mmap (malloc.cc:135)
==575==    by 0x1C077B: sys_alloc (dlmalloc.c:4155)
==575==    by 0x1C077B: dlmalloc (dlmalloc.c:4680)
==575==    by 0x1C2850: internal_memalign.constprop.98 (dlmalloc.c:4917)
==575==    by 0x19391A: plasma::PlasmaStore::CreateObject(plasma::UniqueID 
const&, long, long, int, plasma::Client*, plasma::PlasmaObject*) (store.cc:178)
==575==    by 0x197337: plasma::PlasmaStore::ProcessMessage(plasma::Client*) 
(store.cc:740)
==575==    by 0x195E02: 
plasma::PlasmaStore::ConnectClient(int)::{lambda(int)#1}::operator()(int) const 
(store.cc:544)
==575==    by 0x19927B: std::_Function_handler<void (int), 
plasma::PlasmaStore::ConnectClient(int)::{lambda(int)#1}>::_M_invoke(std::_Any_data
 const&, int&&) (std_function.h:297)
==575==    by 0x1B75FD: std::function<void (int)>::operator()(int) const 
(std_function.h:687)
==575==    by 0x1B6F4E: plasma::EventLoop::FileEventCallback(aeEventLoop*, int, 
void*, int) (events.cc:28)
==575== 
==575== LEAK SUMMARY:
==575==    definitely lost: 0 bytes in 0 blocks
==575==    indirectly lost: 0 bytes in 0 blocks
==575==      possibly lost: 0 bytes in 0 blocks
==575==    still reachable: 552 bytes in 1 blocks
==575==         suppressed: 0 bytes in 0 blocks
==575== 
==575== For counts of detected and suppressed errors, rerun with: -v
==575== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
{code}


was (Author: lbartnik):
The first of these warnings could be probably addressed by not calling exit(0) 
from the signal handler. My impression is that after a signal is caught and 
exit() is called, main() never returns, and thus destructors for its local 
objects are not called. Below is the valgrind warning in question.
{code:java}
==1990== 33 bytes in 1 blocks are still reachable in loss record 1 of 2
==1990== at 0x4C3017F: operator new(unsigned long) (in 
/usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==1990== by 0x513088C: std::string::_Rep::_S_create(unsigned long, unsigned 
long, std::allocator<char> const&) (in 
/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25)
==1990== by 0x5130C55: std::string::_M_mutate(unsigned long, unsigned long, 
unsigned long) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25)
==1990== by 0x5131321: std::string::_M_replace_safe(unsigned long, unsigned 
long, char const*, unsigned long) (in 
/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25)
==1990== by 0x198A23: main (store.cc:937)
{code}
 

I see that SIGTERM comes from Python: "Ensure Valgrind and/or coverage have a 
clean exit". Does it make sense to set an exit flag in the signal handler and 
then let the event loop exit on its own in the main call stack?

> [C++] Fix "still reachable" valgrind warnings when PLASMA_VALGRIND=1
> --------------------------------------------------------------------
>
>                 Key: ARROW-1380
>                 URL: https://issues.apache.org/jira/browse/ARROW-1380
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Plasma (C++)
>            Reporter: Wes McKinney
>            Priority: Major
>             Fix For: 0.11.0
>
>         Attachments: LastTest.log, valgrind.supp_
>
>
> I thought I fixed this, but they seem to have recurred:
> https://travis-ci.org/apache/arrow/jobs/266421430#L5220



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to