[jira] [Assigned] (ARROW-1380) [C++] Fix "still reachable" valgrind warnings when PLASMA_VALGRIND=1

2018-08-22 Thread Lukasz Bartnik (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukasz Bartnik reassigned ARROW-1380:
-

Assignee: Lukasz Bartnik

> [C++] Fix "still reachable" valgrind warnings when PLASMA_VALGRIND=1
> --------------------------------------------------------------------
>
> Key: ARROW-1380
> URL: https://issues.apache.org/jira/browse/ARROW-1380
> Project: Apache Arrow
> Issue Type: Bug
> Components: Plasma (C++)
> Reporter: Wes McKinney
> Assignee: Lukasz Bartnik
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.11.0
>
> Attachments: LastTest.log, valgrind.supp_
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> I thought I fixed this, but they seem to have recurred:
> https://travis-ci.org/apache/arrow/jobs/266421430#L5220



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (ARROW-1380) [C++] Fix "still reachable" valgrind warnings when PLASMA_VALGRIND=1

2018-08-22 Thread Lukasz Bartnik (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16588313#comment-16588313
 ] 

Lukasz Bartnik edited comment on ARROW-1380 at 8/22/18 8:48 AM:


The first of these warnings could probably be addressed by not calling exit(0) 
from the signal handler. My impression is that once the signal is caught and 
exit() is called, main() never returns, and thus the destructors of its local 
objects are never run. Below is the valgrind warning in question.

{code:java}
==1990== 33 bytes in 1 blocks are still reachable in loss record 1 of 2
==1990==    at 0x4C3017F: operator new(unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==1990==    by 0x513088C: std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25)
==1990==    by 0x5130C55: std::string::_M_mutate(unsigned long, unsigned long, unsigned long) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25)
==1990==    by 0x5131321: std::string::_M_replace_safe(unsigned long, unsigned long, char const*, unsigned long) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25)
==1990==    by 0x198A23: main (store.cc:937)
{code}
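
The effect described above can be reproduced outside Plasma. The following is a minimal sketch, assuming a POSIX system; `Marker` and `LocalDestructorRan` are hypothetical names invented for this illustration and are not part of the Plasma code. A forked child creates a local object and then leaves either through std::exit() or by falling out of the scope, and the parent checks whether the destructor ran.

```cpp
#include <cstdio>
#include <cstdlib>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical marker type: its destructor reports that it ran by writing
// one byte to a pipe.
struct Marker {
  explicit Marker(int fd) : fd_(fd) {}
  ~Marker() {
    const char byte = 'd';
    ssize_t ignored = write(fd_, &byte, 1);
    (void)ignored;
  }
  int fd_;
};

// Forks a child that creates a local Marker and then leaves either through
// std::exit() or by falling out of the enclosing scope. Returns true if the
// parent observed the byte written by ~Marker (false also on setup failure).
bool LocalDestructorRan(bool leave_via_exit) {
  int fds[2];
  if (pipe(fds) != 0) return false;
  std::fflush(nullptr);  // keep buffered output from being flushed twice
  const pid_t pid = fork();
  if (pid < 0) {
    close(fds[0]);
    close(fds[1]);
    return false;
  }
  if (pid == 0) {
    close(fds[0]);
    {
      Marker marker(fds[1]);
      // std::exit() does not unwind the stack, so ~Marker is skipped.
      if (leave_via_exit) std::exit(0);
    }  // normal scope exit: ~Marker runs here
    _exit(0);
  }
  close(fds[1]);
  char byte = 0;
  const ssize_t n = read(fds[0], &byte, 1);  // 0 means EOF: nothing written
  close(fds[0]);
  int status = 0;
  waitpid(pid, &status, 0);
  return n == 1;
}
```

Calling `LocalDestructorRan(true)` exercises the exit() path and `LocalDestructorRan(false)` the normal return path. Memory owned by an object whose destructor is skipped in this way is what memcheck reports as "still reachable".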

With the changes in 
https://github.com/lbartnik/arrow/commit/089153d518c081d7b9c1b3fb839463bca9ac1a35 
I can reduce the warnings to the one below. Looking at the code, it's not clear 
whether CreateObject() should be paired with a delete operation or whether 
there is an internal pool/tracking mechanism.

{code}
pyarrow/tests/test_plasma.py::TestPlasmaClient::test_put_and_get command:  valgrind --track-origins=yes --leak-check=full --show-leak-kinds=all --leak-check-heuristics=stdstring --error-exitcode=1 /io/arrow/python/pyarrow/plasma_store_server -s /tmp/test_plasma-k6wtcvi4/plasma.sock -m 1
==575== Memcheck, a memory error detector
==575== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==575== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==575== Command: /io/arrow/python/pyarrow/plasma_store_server -s /tmp/test_plasma-k6wtcvi4/plasma.sock -m 1
==575== 
Allowing the Plasma store to use up to 0.1GB of memory.
Starting object store with directory /dev/shm and huge page support disabled
PASSED==575== 
==575== HEAP SUMMARY:
==575==     in use at exit: 552 bytes in 1 blocks
==575==   total heap usage: 178 allocs, 177 frees, 143,037 bytes allocated
==575== 
==575== 552 bytes in 1 blocks are still reachable in loss record 1 of 1
==575==    at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==575==    by 0x567F5F7: fdopen@@GLIBC_2.2.5 (iofdopen.c:122)
==575==    by 0x1BD47F: create_buffer(long) (malloc.cc:105)
==575==    by 0x1BFF17: fake_mmap (malloc.cc:135)
==575==    by 0x1C077B: sys_alloc (dlmalloc.c:4155)
==575==    by 0x1C077B: dlmalloc (dlmalloc.c:4680)
==575==    by 0x1C2850: internal_memalign.constprop.98 (dlmalloc.c:4917)
==575==    by 0x19391A: plasma::PlasmaStore::CreateObject(plasma::UniqueID const&, long, long, int, plasma::Client*, plasma::PlasmaObject*) (store.cc:178)
==575==    by 0x197337: plasma::PlasmaStore::ProcessMessage(plasma::Client*) (store.cc:740)
==575==    by 0x195E02: plasma::PlasmaStore::ConnectClient(int)::{lambda(int)#1}::operator()(int) const (store.cc:544)
==575==    by 0x19927B: std::_Function_handler::_M_invoke(std::_Any_data const&, int&&) (std_function.h:297)
==575==    by 0x1B75FD: std::function::operator()(int) const (std_function.h:687)
==575==    by 0x1B6F4E: plasma::EventLoop::FileEventCallback(aeEventLoop*, int, void*, int) (events.cc:28)
==575== 
==575== LEAK SUMMARY:
==575==    definitely lost: 0 bytes in 0 blocks
==575==    indirectly lost: 0 bytes in 0 blocks
==575==      possibly lost: 0 bytes in 0 blocks
==575==    still reachable: 552 bytes in 1 blocks
==575==         suppressed: 0 bytes in 0 blocks
==575== 
==575== For counts of detected and suppressed errors, rerun with: -v
==575== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
{code}



[jira] [Comment Edited] (ARROW-1380) [C++] Fix "still reachable" valgrind warnings when PLASMA_VALGRIND=1

2018-08-22 Thread Lukasz Bartnik (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16588313#comment-16588313
 ] 

Lukasz Bartnik edited comment on ARROW-1380 at 8/22/18 8:45 AM:


The first of these warnings could probably be addressed by not calling exit(0) 
from the signal handler. My impression is that once the signal is caught and 
exit() is called, main() never returns, and thus the destructors of its local 
objects are never run. Below is the valgrind warning in question.

{code:java}
==1990== 33 bytes in 1 blocks are still reachable in loss record 1 of 2
==1990==    at 0x4C3017F: operator new(unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==1990==    by 0x513088C: std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25)
==1990==    by 0x5130C55: std::string::_M_mutate(unsigned long, unsigned long, unsigned long) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25)
==1990==    by 0x5131321: std::string::_M_replace_safe(unsigned long, unsigned long, char const*, unsigned long) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25)
==1990==    by 0x198A23: main (store.cc:937)
{code}

With changes as in I can reduce the warnings to the one below. Looking at the 
code, it's not clear whether CreateObject() should be paired with a delete 
operation or whether there is an internal pool/tracking mechanism.

{code}
pyarrow/tests/test_plasma.py::TestPlasmaClient::test_put_and_get command:  valgrind --track-origins=yes --leak-check=full --show-leak-kinds=all --leak-check-heuristics=stdstring --error-exitcode=1 /io/arrow/python/pyarrow/plasma_store_server -s /tmp/test_plasma-k6wtcvi4/plasma.sock -m 1
==575== Memcheck, a memory error detector
==575== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==575== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==575== Command: /io/arrow/python/pyarrow/plasma_store_server -s /tmp/test_plasma-k6wtcvi4/plasma.sock -m 1
==575== 
Allowing the Plasma store to use up to 0.1GB of memory.
Starting object store with directory /dev/shm and huge page support disabled
PASSED==575== 
==575== HEAP SUMMARY:
==575==     in use at exit: 552 bytes in 1 blocks
==575==   total heap usage: 178 allocs, 177 frees, 143,037 bytes allocated
==575== 
==575== 552 bytes in 1 blocks are still reachable in loss record 1 of 1
==575==    at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==575==    by 0x567F5F7: fdopen@@GLIBC_2.2.5 (iofdopen.c:122)
==575==    by 0x1BD47F: create_buffer(long) (malloc.cc:105)
==575==    by 0x1BFF17: fake_mmap (malloc.cc:135)
==575==    by 0x1C077B: sys_alloc (dlmalloc.c:4155)
==575==    by 0x1C077B: dlmalloc (dlmalloc.c:4680)
==575==    by 0x1C2850: internal_memalign.constprop.98 (dlmalloc.c:4917)
==575==    by 0x19391A: plasma::PlasmaStore::CreateObject(plasma::UniqueID const&, long, long, int, plasma::Client*, plasma::PlasmaObject*) (store.cc:178)
==575==    by 0x197337: plasma::PlasmaStore::ProcessMessage(plasma::Client*) (store.cc:740)
==575==    by 0x195E02: plasma::PlasmaStore::ConnectClient(int)::{lambda(int)#1}::operator()(int) const (store.cc:544)
==575==    by 0x19927B: std::_Function_handler::_M_invoke(std::_Any_data const&, int&&) (std_function.h:297)
==575==    by 0x1B75FD: std::function::operator()(int) const (std_function.h:687)
==575==    by 0x1B6F4E: plasma::EventLoop::FileEventCallback(aeEventLoop*, int, void*, int) (events.cc:28)
==575== 
==575== LEAK SUMMARY:
==575==    definitely lost: 0 bytes in 0 blocks
==575==    indirectly lost: 0 bytes in 0 blocks
==575==      possibly lost: 0 bytes in 0 blocks
==575==    still reachable: 552 bytes in 1 blocks
==575==         suppressed: 0 bytes in 0 blocks
==575== 
==575== For counts of detected and suppressed errors, rerun with: -v
==575== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
{code}



[jira] [Comment Edited] (ARROW-1380) [C++] Fix "still reachable" valgrind warnings when PLASMA_VALGRIND=1

2018-08-22 Thread Lukasz Bartnik (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16588313#comment-16588313
 ] 

Lukasz Bartnik edited comment on ARROW-1380 at 8/22/18 7:52 AM:


The first of these warnings could probably be addressed by not calling exit(0) 
from the signal handler. My impression is that once the signal is caught and 
exit() is called, main() never returns, and thus the destructors of its local 
objects are never run. Below is the valgrind warning in question.
{code:java}
==1990== 33 bytes in 1 blocks are still reachable in loss record 1 of 2
==1990==    at 0x4C3017F: operator new(unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==1990==    by 0x513088C: std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25)
==1990==    by 0x5130C55: std::string::_M_mutate(unsigned long, unsigned long, unsigned long) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25)
==1990==    by 0x5131321: std::string::_M_replace_safe(unsigned long, unsigned long, char const*, unsigned long) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25)
==1990==    by 0x198A23: main (store.cc:937)
{code}
 

I see that SIGTERM comes from Python: "Ensure Valgrind and/or coverage have a 
clean exit". Does it make sense to set an exit flag in the signal handler and 
then let the event loop exit on its own in the main call stack?
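
The exit-flag idea from the question above could look roughly like the sketch below. It is only an illustration under stated assumptions: `FakeStore`, `HandleSigterm` and `RunUntilSigterm` are hypothetical names, the toy polling loop stands in for the real ae-based event loop, and the signal is raised from inside the loop only to keep the example self-contained.

```cpp
#include <csignal>

// Set by the handler, polled by the loop. volatile std::sig_atomic_t is the
// type the standard guarantees can be written safely from a signal handler.
static volatile std::sig_atomic_t g_stop_requested = 0;

// The handler only records the request; it does not call exit().
extern "C" void HandleSigterm(int /*signum*/) { g_stop_requested = 1; }

// Hypothetical stand-in for an object that lives on main()'s stack.
class FakeStore {
 public:
  explicit FakeStore(bool* destroyed) : destroyed_(destroyed) {}
  ~FakeStore() { *destroyed_ = true; }

 private:
  bool* destroyed_;
};

// Runs a toy loop until SIGTERM has been seen and returns the number of
// iterations, or -1 if the handler could not be installed.
int RunUntilSigterm(bool* store_destroyed) {
  g_stop_requested = 0;
  auto previous = std::signal(SIGTERM, HandleSigterm);
  if (previous == SIG_ERR) return -1;
  int iterations = 0;
  {
    FakeStore store(store_destroyed);
    while (!g_stop_requested) {
      ++iterations;
      // In the tests the signal would come from the Python side.
      if (iterations == 3) std::raise(SIGTERM);
    }
  }  // the loop ended on its own, so ~FakeStore runs here
  std::signal(SIGTERM, previous);  // restore the earlier disposition
  return iterations;
}
```

Because the loop returns through the normal call stack, local objects are destroyed before the process ends. A real event loop blocks in a system call rather than polling, so it would additionally need a way to wake up and notice the flag.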








[jira] [Comment Edited] (ARROW-1380) [C++] Fix "still reachable" valgrind warnings when PLASMA_VALGRIND=1

2018-08-21 Thread Lukasz Bartnik (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16588313#comment-16588313
 ] 

Lukasz Bartnik edited comment on ARROW-1380 at 8/22/18 3:03 AM:


The first of these warnings could probably be addressed by not calling exit(0) 
from the signal handler. My impression is that once the signal is caught and 
exit() is called, main() never returns, and thus the destructors of its local 
objects are never run. Below is the valgrind warning in question.
{code:java}
==1990== 33 bytes in 1 blocks are still reachable in loss record 1 of 2
==1990==    at 0x4C3017F: operator new(unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==1990==    by 0x513088C: std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25)
==1990==    by 0x5130C55: std::string::_M_mutate(unsigned long, unsigned long, unsigned long) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25)
==1990==    by 0x5131321: std::string::_M_replace_safe(unsigned long, unsigned long, char const*, unsigned long) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25)
==1990==    by 0x198A23: main (store.cc:937)
{code}
 

I tried simply commenting out exit(), but that leads to other errors and, I 
assume, is not the intended behavior. I don't see much other signal handling in 
Plasma; my current guess is that it is ae that gets interrupted and then drops 
the event loop.

Why is there even a SIGTERM in the first place? Where does it come from?

I'd be grateful for comments and/or pointers to relevant areas in the code.








[jira] [Comment Edited] (ARROW-1380) [C++] Fix "still reachable" valgrind warnings when PLASMA_VALGRIND=1

2018-08-21 Thread Lukasz Bartnik (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16588313#comment-16588313
 ] 

Lukasz Bartnik edited comment on ARROW-1380 at 8/22/18 3:01 AM:


The first of these warnings could probably be addressed by not calling exit(0) 
from the signal handler. My impression is that once the signal is caught and 
exit() is called, main() never returns, and thus the destructors of its local 
objects are never run. Below is the valgrind warning in question.

{code}
==1990== 33 bytes in 1 blocks are still reachable in loss record 1 of 2
==1990==    at 0x4C3017F: operator new(unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==1990==    by 0x513088C: std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25)
==1990==    by 0x5130C55: std::string::_M_mutate(unsigned long, unsigned long, unsigned long) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25)
==1990==    by 0x5131321: std::string::_M_replace_safe(unsigned long, unsigned long, char const*, unsigned long) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25)
==1990==    by 0x198A23: main (store.cc:937)
{code}
 

I tried simply commenting out exit(), but that leads to other errors and, I 
assume, is not the intended behavior. I don't see much other signal handling in 
Plasma; my current guess is that it is ae that gets interrupted and then drops 
the event loop.

I'd be grateful for comments and/or pointers to relevant areas in the code.








[jira] [Commented] (ARROW-1380) [C++] Fix "still reachable" valgrind warnings when PLASMA_VALGRIND=1

2018-08-21 Thread Lukasz Bartnik (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16588313#comment-16588313
 ] 

Lukasz Bartnik commented on ARROW-1380:
---

The first of these warnings could probably be addressed by not calling exit(0) 
from the signal handler. My impression is that once the signal is caught and 
exit() is called, main() never returns, and thus the destructors of its local 
objects are never run. Below is the valgrind warning in question.

==1990== 33 bytes in 1 blocks are still reachable in loss record 1 of 2
==1990==    at 0x4C3017F: operator new(unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==1990==    by 0x513088C: std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25)
==1990==    by 0x5130C55: std::string::_M_mutate(unsigned long, unsigned long, unsigned long) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25)
==1990==    by 0x5131321: std::string::_M_replace_safe(unsigned long, unsigned long, char const*, unsigned long) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25)
==1990==    by 0x198A23: main (store.cc:937)

 

I tried simply commenting out exit(), but that leads to other errors and, I 
assume, is not the intended behavior. I don't see much other signal handling in 
Plasma; my current guess is that it is ae that gets interrupted and then drops 
the event loop.

I'd be grateful for comments and/or pointers to relevant areas in the code.






[jira] [Comment Edited] (ARROW-1380) [C++] Fix "still reachable" valgrind warnings in Plasma Python unit tests

2018-08-20 Thread Lukasz Bartnik (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586818#comment-16586818
 ] 

Lukasz Bartnik edited comment on ARROW-1380 at 8/21/18 2:43 AM:


All Valgrind errors seem to originate in libpython. I re-ran:

{code}
export PLASMA_VALGRIND=1
python -m pytest -vv -r sxX --durations=15 -s $PYARROW_PATH --parquet
{code}

with a suppression file (attached) aiming at {code}obj:*/libpython3.6m.so*{code}, 
and all errors were suppressed. Another way to show this is to extract all 
suppressions from Valgrind: none of the extracted suppressions contains a 
C++-related context filter.








[jira] [Comment Edited] (ARROW-1380) [C++] Fix "still reachable" valgrind warnings in Plasma Python unit tests

2018-08-20 Thread Lukasz Bartnik (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586818#comment-16586818
 ] 

Lukasz Bartnik edited comment on ARROW-1380 at 8/21/18 2:39 AM:


All Valgrind errors seem to originate in libpython. I re-ran:

{{export PLASMA_VALGRIND=1 }}
 {{python -m pytest -vv -r sxX --durations=15 -s $PYARROW_PATH --parquet}}

with a suppression file (attached) aiming at _obj:*/libpython3.6m.so*_, and all 
errors were suppressed. Another way to show this is to extract all suppressions 
from Valgrind: none of the extracted suppressions contains a C++-related 
context filter.








[jira] [Commented] (ARROW-1380) [C++] Fix "still reachable" valgrind warnings in Plasma Python unit tests

2018-08-20 Thread Lukasz Bartnik (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586818#comment-16586818
 ] 

Lukasz Bartnik commented on ARROW-1380:
---

All valgrind errors seem to originate in libpython. I re-ran:

{{export PLASMA_VALGRIND=1 }}
{{python -m pytest -vv -r sxX --durations=15 -s $PYARROW_PATH --parquet}}

with a suppression file (attached) aiming at _obj:*/libpython3.6m.so*_, and all 
errors were suppressed. Another way to show this is to extract all suppressions 
from Valgrind: none of the extracted suppressions contains a C++-related 
context filter.






[jira] [Updated] (ARROW-1380) [C++] Fix "still reachable" valgrind warnings in Plasma Python unit tests

2018-08-20 Thread Lukasz Bartnik (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukasz Bartnik updated ARROW-1380:
--
Attachment: valgrind.supp_






[jira] [Comment Edited] (ARROW-1380) [C++] Fix "still reachable" valgrind warnings in Plasma Python unit tests

2018-08-16 Thread Lukasz Bartnik (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583276#comment-16583276
 ] 

Lukasz Bartnik edited comment on ARROW-1380 at 8/17/18 2:22 AM:


I took a quick look at a recent build 
(https://travis-ci.org/apache/arrow/builds/417014924). Neither of its C++ 
jobs (https://travis-ci.org/apache/arrow/jobs/417014931 and 
https://travis-ci.org/apache/arrow/jobs/417014927) seems to use valgrind. The 
only job that seems to use valgrind is the openjdk8/gcc one 
(https://travis-ci.org/apache/arrow/jobs/417014925), but there are no reports 
from valgrind in its log; in fact, valgrind doesn't seem to be invoked there at 
all.

Looking at the job descriptions: the original job where the "still reachable" 
blocks were reported was a "gcc C++" one, but there were two such jobs back 
then (3786.1 and 3786.8), whereas there is only one now (9492.7).

It seems that the error was fixed somewhere between builds 3786 and 9492.

I'm attaching LastTest.log, which does not contain any valgrind alarms: 
every "HEAP SUMMARY" line is followed by an "in use at exit: 0 bytes in 0 
blocks" line.
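
The check on the log described above can be automated. The following is a minimal sketch, assuming the log text has already been read into a string; `AllHeapSummariesClean` is a hypothetical helper written for this illustration, not an existing tool.

```cpp
#include <sstream>
#include <string>

// Returns true if every line containing "HEAP SUMMARY:" is immediately
// followed by a line containing "in use at exit: 0 bytes in 0 blocks".
// A log without any heap summary is treated as clean.
bool AllHeapSummariesClean(const std::string& log) {
  std::istringstream in(log);
  std::string line;
  bool expect_usage_line = false;
  while (std::getline(in, line)) {
    if (expect_usage_line) {
      if (line.find("in use at exit: 0 bytes in 0 blocks") ==
          std::string::npos) {
        return false;
      }
      expect_usage_line = false;
    }
    if (line.find("HEAP SUMMARY:") != std::string::npos) {
      expect_usage_line = true;
    }
  }
  // A summary at the very end with no following line is not clean.
  return !expect_usage_line;
}
```

The substring match ignores the `==PID==` prefix, so the same check works for logs that interleave the output of several valgrind processes.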








[jira] [Updated] (ARROW-1380) [C++] Fix "still reachable" valgrind warnings in Plasma Python unit tests

2018-08-16 Thread Lukasz Bartnik (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukasz Bartnik updated ARROW-1380:
--
Attachment: LastTest.log

> [C++] Fix "still reachable" valgrind warnings in Plasma Python unit tests
> -
>
> Key: ARROW-1380
> URL: https://issues.apache.org/jira/browse/ARROW-1380
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Plasma (C++)
>Reporter: Wes McKinney
>Priority: Major
> Fix For: 0.11.0
>
> Attachments: LastTest.log
>
>
> I thought I fixed this, but they seem to have recurred:
> https://travis-ci.org/apache/arrow/jobs/266421430#L5220





[jira] [Commented] (ARROW-1380) [C++] Fix "still reachable" valgrind warnings in Plasma Python unit tests

2018-08-16 Thread Lukasz Bartnik (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583276#comment-16583276
 ] 

Lukasz Bartnik commented on ARROW-1380:
---

I took a quick look at a recent build 
(https://travis-ci.org/apache/arrow/builds/417014924). Neither of its C++ 
jobs (https://travis-ci.org/apache/arrow/jobs/417014931 and 
https://travis-ci.org/apache/arrow/jobs/417014927) seems to use valgrind. The 
only job that seems to use valgrind is the openjdk8/gcc one 
(https://travis-ci.org/apache/arrow/jobs/417014925), but there are no reports 
from valgrind in its log; in fact, valgrind doesn't seem to be run there at 
all.

Looking at job descriptions: the original job where "still reachable" blocks 
are reported was a "gcc C++" one, but there were two such jobs back then 
(3786.1 and 3786.8) whereas there's only one now (9492.7).

It seems that the error was fixed somewhere between builds 3786 and 9492.

I'm attaching the LastTest.log which does not contain any valgrind alarms: 
every "HEAP SUMMARY" line is followed by an "in use at exit: 0 bytes in 0 
blocks" line.

> [C++] Fix "still reachable" valgrind warnings in Plasma Python unit tests
> -
>
> Key: ARROW-1380
> URL: https://issues.apache.org/jira/browse/ARROW-1380
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Plasma (C++)
>Reporter: Wes McKinney
>Priority: Major
> Fix For: 0.11.0
>
>
> I thought I fixed this, but they seem to have recurred:
> https://travis-ci.org/apache/arrow/jobs/266421430#L5220





[jira] [Comment Edited] (ARROW-1799) [Plasma C++] Make unittest does not create plasma store executable

2018-08-16 Thread Lukasz Bartnik (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16582765#comment-16582765
 ] 

Lukasz Bartnik edited comment on ARROW-1799 at 8/16/18 4:18 PM:


{{make unittest}} fails repeatedly unless {{make all}}, which creates 
{{libarrow.\*}} and {{libplasma.\*}} libraries, is run beforehand. Quite 
possibly, the {{unittest}} target needs additional dependencies.
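
Until the dependencies are fixed, the workaround is simply to build in the right order. Here is a small sketch in Python of that ordering, assuming a build directory already configured with cmake as in the issue description. The helpers build_commands and run_build are hypothetical names and are not part of the Arrow build scripts.

```python
import subprocess
from typing import List


def build_commands(jobs: int = 8) -> List[List[str]]:
    """Return the make invocations in the order that works around the issue."""
    parallel = f"-j{jobs}"
    return [
        ["make", parallel, "all"],       # creates libarrow.* and libplasma.*
        ["make", parallel, "unittest"],  # the unittest target, run afterwards
    ]


def run_build(build_dir: str, jobs: int = 8) -> None:
    """Run each command inside build_dir and stop at the first failure."""
    for cmd in build_commands(jobs):
        subprocess.run(cmd, cwd=build_dir, check=True)
```

For the layout in the report, this would be called as run_build("cpp/debug") after the cmake step.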


was (Author: lbartnik):
{{make unittest}} fails repeatedly unless {{make all}}, which creates 
{{libarrow.*}} and {{libplasma.*}} libraries, is run beforehand. Quite 
possibly, the {{unittest}} target needs additional dependencies.

> [Plasma C++] Make unittest does not create plasma store executable
> --
>
> Key: ARROW-1799
> URL: https://issues.apache.org/jira/browse/ARROW-1799
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Plasma (C++)
>Reporter: William Paul
>Priority: Minor
>
> Steps to reproduce from a fresh clone of Arrow:
> mkdir cpp/debug
> cd cpp/debug
> cmake .. -DARROW_PLASMA=on
> make -j8 unittest
> client_tests may then fail due to the store executable not being created. The 
> first time I reproduced the issue the test did fail, but the test passed on 
> subsequent reproductions of this issue. Regardless, if you look in 
> cpp/debug/debug, there is no plasma store executable. If you then call make, 
> the store executable is generated in that directory.





[jira] [Commented] (ARROW-1799) [Plasma C++] Make unittest does not create plasma store executable

2018-08-16 Thread Lukasz Bartnik (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16582765#comment-16582765
 ] 

Lukasz Bartnik commented on ARROW-1799:
---

{{make unittest}} fails repeatedly unless {{make all}}, which creates 
{{libarrow.*}} and {{libplasma.*}} libraries, is run beforehand. Quite 
possibly, the {{unittest}} target needs additional dependencies.

> [Plasma C++] Make unittest does not create plasma store executable
> --
>
> Key: ARROW-1799
> URL: https://issues.apache.org/jira/browse/ARROW-1799
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Plasma (C++)
>Reporter: William Paul
>Priority: Minor
>
> Steps to reproduce from a fresh clone of Arrow:
> mkdir cpp/debug
> cd cpp/debug
> cmake .. -DARROW_PLASMA=on
> make -j8 unittest
> client_tests may then fail due to the store executable not being created. The 
> first time I reproduced the issue the test did fail, but the test passed on 
> subsequent reproductions of this issue. Regardless, if you look in 
> cpp/debug/debug, there is no plasma store executable. If you then call make, 
> the store executable is generated in that directory.


