from:"Philippe Waroquiers via KDE Bugzilla"

[valgrind] [Bug 356044] Dwarf line info reader misinterprets is_stmt register

2015-12-03 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=356044

--- Comment #7 from Philippe Waroquiers  ---
(In reply to Ivo Raisr from comment #6)
> Created attachment 95828 [details]
> proposed patch
> 
> Adjacent DiLoc entries are now merged if they refer to the same line. This
> should give an improvement in terms of memory used.

Yes, the results are good. With the patch, the memory used is now similar to
the trunk.
Just one question: the merging is done when adding a new entry to the loctab
(i.e. in the addLoc function). That is good, as it avoids uselessly growing
the loctab during insertion.

I am wondering, however, whether this merges everything that can be merged.
Maybe it would be useful to also merge adjacent entries in canonicaliseLoctab
(after having sorted on addr)?
This is of course only useful if there are non-successive addLoc calls for
mergeable entries.

Otherwise, a small style remark for the storage.c patch: I think (most of) the
code splits overlong lines before the &&, not after.
So this
+  if ((previous->lineno == loc->lineno) &&
+  (previous->addr + previous->size == loc->addr)) {
should be
+  if ((previous->lineno == loc->lineno)
+  && (previous->addr + previous->size == loc->addr)) {
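
To make the question about a second merging pass concrete, here is a minimal
sketch of "sort on addr, then merge adjacent entries". It is not the Valgrind
code: the Loc struct, MAX_LOC_SIZE and merge_locs are hypothetical names
chosen for this illustration, and the size cap is an assumed value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for a location table entry. */
typedef struct {
    unsigned long addr;   /* start address of the range */
    unsigned int  size;   /* number of bytes covered */
    unsigned int  lineno; /* source line number */
} Loc;

/* Assumed cap on the size of a merged entry (illustration only). */
#define MAX_LOC_SIZE 65535u

static int cmp_addr(const void *a, const void *b)
{
    const Loc *x = a, *y = b;
    return (x->addr > y->addr) - (x->addr < y->addr);
}

/* Sort the table on addr, then merge entries in place when they refer to
   the same line, are contiguous in memory, and the merged size fits.
   Returns the new number of entries. */
size_t merge_locs(Loc *tab, size_t n)
{
    size_t out = 0;
    if (n == 0)
        return 0;
    qsort(tab, n, sizeof tab[0], cmp_addr);
    for (size_t i = 1; i < n; i++) {
        Loc *prev = &tab[out];
        const Loc *cur = &tab[i];
        if (prev->lineno == cur->lineno
            && prev->addr + prev->size == cur->addr
            && prev->size + cur->size <= MAX_LOC_SIZE) {
            prev->size += cur->size;   /* extend the previous entry */
        } else {
            tab[++out] = *cur;         /* keep as a separate entry */
        }
    }
    return out + 1;
}
```

Such a pass only gains something when mergeable entries were not added by
successive addLoc calls; otherwise merging at insertion time already caught
them.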

-- 
You are receiving this mail because:
You are watching all bug changes.

[valgrind] [Bug 356174] Enhance the embedded gdbserver to allow LLDB to use it

2015-12-03 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=356174

Philippe Waroquiers  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |philippe.waroquiers@skynet.be

--- Comment #4 from Philippe Waroquiers  ---
Thanks for the analysis and patches. On debian8/x86, I can reproduce the
problem that makes support for qC necessary.
I have quickly looked at the patches, which look ok.
I should be able to handle and commit the patches in a few days.


[valgrind] [Bug 356174] Enhance the embedded gdbserver to allow LLDB to use it

2015-12-05 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=356174

--- Comment #8 from Philippe Waroquiers  ---
I already committed (svn 15743) the support for qC (slightly modified patch).

With this, lldb somewhat works on debian8/x86 or Ubuntu14/amd64
(e.g. continue till a breakpoint works).

However, in both environments, I encounter several things not working:
* unwind and/or frame discovery does not seem to work properly,
e.g. bt shows an empty frame:
* thread #1: tid = 29071, , stop reason = signal SIGTRAP
  * frame #0: 

register read gives an error:
(lldb) register read
error: invalid frame

Although valgrind gdbserver reports that qXfer:features:read+; is supported,
lldb does not send requests to read the target description.

Finally, it is not very clear how to send monitor commands from lldb.
'process plugin packet monitor v.info scheduler'
sends the command v.info scheduler, but the behaviour is strange:
* the first output line is shown (encoded in protocol layout).
* then if continue is given, the rest of the output is shown properly.

Note that the Hc-1 handling does not seem very critical, and in any case
applying the patch does not improve any of the above.
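
For readers not familiar with the "protocol layout": in the gdb remote serial
protocol, a packet is framed as $payload#xx, where xx is the sum of the
payload bytes modulo 256 in hex, and gdb's monitor command travels as a qRcmd
packet whose argument is the command text encoded in hex. The sketch below
builds such packets. The function names are made up for this illustration,
and I am assuming (not verified) that lldb's 'process plugin packet monitor'
produces the same qRcmd packet as gdb.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Frame `payload` as "$payload#xx", where xx is the sum of the payload
   bytes modulo 256, written as two lowercase hex digits.
   Returns the packet length, or -1 if `out` is too small.
   (Payloads containing '$', '#' or '}' would need escaping; not done here.) */
int rsp_frame(const char *payload, char *out, size_t outsz)
{
    unsigned sum = 0;
    size_t len = strlen(payload);
    if (outsz < len + 5)   /* '$' + payload + '#' + 2 digits + NUL */
        return -1;
    for (size_t i = 0; i < len; i++)
        sum += (unsigned char)payload[i];
    return snprintf(out, outsz, "$%s#%02x", payload, sum & 0xffu);
}

/* Build the payload of a monitor command: "qRcmd," followed by the
   command text with every byte encoded as two hex digits.
   Returns the payload length, or -1 if `out` is too small. */
int rsp_monitor_payload(const char *cmd, char *out, size_t outsz)
{
    static const char hex[] = "0123456789abcdef";
    size_t len = strlen(cmd), pos = 6;
    if (outsz < 6 + 2 * len + 1)
        return -1;
    memcpy(out, "qRcmd,", 6);
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)cmd[i];
        out[pos++] = hex[c >> 4];
        out[pos++] = hex[c & 0x0f];
    }
    out[pos] = '\0';
    return (int)pos;
}
```

The console output of a monitor command comes back hex-encoded in the same
way, which is presumably the "encoded" first line that lldb displays as-is.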


[valgrind] [Bug 356174] Enhance the embedded gdbserver to allow LLDB to use it

2015-12-05 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=356174

--- Comment #9 from Philippe Waroquiers  ---
An additional note: on x86 linux, valgrind gdbserver only reports
qXfer:features:read+ as supported if --vgdb-shadow-registers=yes is given.
On amd64 linux, qXfer:features:read+ is reported as supported if either shadow
registers are requested or the host has AVX registers.
Otherwise, valgrind gdbserver expects the debugger to know the register
layout of x86 or amd64.
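
The rule above can be written down as a small predicate. This is only a
restatement of the description, not the gdbserver source; the Arch enum and
the function name are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names, used only for this illustration. */
typedef enum { ARCH_X86, ARCH_AMD64 } Arch;

/* Returns true when a target description would be advertised
   (qXfer:features:read+), following the rule described above. */
bool reports_target_description(Arch arch, bool shadow_regs, bool host_has_avx)
{
    if (shadow_regs)
        return true;                     /* --vgdb-shadow-registers=yes */
    return arch == ARCH_AMD64 && host_has_avx;
}
```

In the remaining cases no target.xml is offered, so a debugger without
built-in knowledge of the x86/amd64 register layout has nothing to work with.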


[valgrind] [Bug 356174] Enhance the embedded gdbserver to allow LLDB to use it

2015-12-07 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=356174

--- Comment #13 from Philippe Waroquiers  ---
(In reply to Daniel Trebbien from comment #10)
> (In reply to Philippe Waroquiers from comment #9)
> > Otherwise, valgrind gdbserver expects that the debugger knows the register
> > layout of x86 or amd64.
> 
> I think that this is what is causing the problem; i.e. that unlike gdb, lldb
> does not have built-in knowledge of the x86 and amd64 architectures.
> 
> On my system (OS X 10.11.1 and Core i7 with AVX support), the embedded
> gdbserver responds with qXfer:features:read+ and lldb retrieves the
> target.xml.  In my case, the embedded gdbserver sends back
> amd64-avx-coresse.xml.  When I comment out the s leaving just
> the i386:x86-64 element, then I also see "frame
> #0: 0x" and `register read' says "error: invalid frame".
> 
> I think that the solution is to always respond with qXfer:features:read+ and
> for the XML target descriptions to have the generic register information.
> 
> I am looking at the LLDB sources, within source/Plugins/ABI, to see what the
> appropriate generic registers are for the different architectures.

On debian8/x86, when using --vgdb-shadow-registers=yes, valgrind gdbserver
reports
qXfer:features:read+;
but lldb 3.5.0 does not read target.xml.

When I use gdb with the same setup, it requests to read target.xml.


[valgrind] [Bug 356174] Enhance the embedded gdbserver to allow LLDB to use it

2015-12-08 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=356174

--- Comment #16 from Philippe Waroquiers  ---
(In reply to Daniel Trebbien from comment #15)
> Looking through the sources of the release_35 and release_36 branches, I see
> that LLDB 3.5 and 3.6 do not support the target.xml or target definition
> file ways of specifying register information.  See
> ProcessGDBRemote::BuildDynamicRegisterInfo():
> http://llvm.org/svn/llvm-project/lldb/branches/release_35/source/Plugins/
> Process/gdb-remote/ProcessGDBRemote.cpp
> http://llvm.org/svn/llvm-project/lldb/branches/release_36/source/Plugins/
> Process/gdb-remote/ProcessGDBRemote.cpp
> 
> LLDB 3.7 is the first to add this support:
> http://llvm.org/svn/llvm-project/lldb/branches/release_37/source/Plugins/
> Process/gdb-remote/ProcessGDBRemote.cpp
> 
> I think that to get this working with LLDB 3.5 and 3.6, the embedded
> gdbserver would need to respond to 'qRegisterInfo XX' packets. 
> 'qRegisterInfo' is specific to LLDB.  It is documented here:
> http://llvm.org/svn/llvm-project/lldb/trunk/docs/lldb-gdb-remote.txt
> 
> Adding support for 'qRegisterInfo' would entail using an XML parser in the
> embedded server which can build up the XML target description (resolving the
> s if any), and then process the  elements.
> 
> Would adding a dependency on an XML parser be out of the question? 
> Preferably libxml2.
Valgrind core cannot be linked with a library (to avoid problematic
interactions with guest processes that use the same library).
So, linking with libxml2 is out of the question.

Also, if lldb 3.7 properly supports target.xml, then it looks good enough to
me to make only the changes needed to have lldb 3.7 working properly.
As I understand it, this requires some modifications to the xml files, to add
some lldb-specific xml elements, such as altname or generic.
What we have to ensure is that these elements, which are unknown to gdb, do
not cause a problem
(the idea is that valgrind gdbserver supports various gdb versions).
Otherwise, either lldb has to be modified to work with the current xml files,
or alternatively, we maintain 2 sets of xml files: the 'gdb' version and the
'lldb' version
(we can maybe build the first one by automatically editing the second one
during the build phase).


[valgrind] [Bug 356174] Enhance the embedded gdbserver to allow LLDB to use it

2015-12-08 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=356174

--- Comment #17 from Philippe Waroquiers  ---
Also, without a reasonably working equivalent of the gdb 'monitor' command,
a lot of Valgrind gdbserver features are not usable.


[valgrind] [Bug 356273] conserve memory by merging adjacent DiLoc entries in the debug info location table

2015-12-09 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=356273

Philippe Waroquiers  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |philippe.waroquiers@skynet.be

--- Comment #1 from Philippe Waroquiers  ---
Created attachment 95964
  --> https://bugs.kde.org/attachment.cgi?id=95964&action=edit
merging in canonicaliseLoctab

The attached patch adds merging logic in canonicaliseLoctab.
Tested on a big executable, this patch seems useless:
when activating the trace, we only see merging of 0-size entries (which will
be cleaned up in any case) or failed merging due to the max size being
reached.

So, unless we find a test case where significant merging is done, I suggest
closing this bug as WONTFIX.


[valgrind] [Bug 356457] valgrind: m_mallocfree.c:2042 (vgPlain_arena_free): Assertion 'blockSane(a, b)' failed.

2015-12-10 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=356457

Philippe Waroquiers  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |philippe.waroquiers@skynet.be

--- Comment #1 from Philippe Waroquiers  ---
The failed assertion 'blockSane(a, b)' might indicate that there is a bug in
Valgrind
(a buffer overrun when reading the debug information of an mmap-ed library?).
However, without a small reproducer and/or more details, it is unlikely that
much can be done.

Here are a few things you could try, easiest first :). If this is a buffer
overrun in Valgrind, the 2nd action is most likely to find the problem.
* run with -v -v -v -d -d -d and see which library is loaded just before
  the corruption.
  This might give a hint about how to reproduce the problem and make a small
  test case.
* recompile valgrind after having uncommented
     // #define DEBUG_MALLOC
  in m_mallocfree.c and rerun your test case.
* compile valgrind as an 'inner valgrind', and then run
  valgrind under valgrind
  (see the self-hosting section in README_DEVELOPERS).
  This might detect a buffer overrun in valgrind heap allocated blocks.
* run with --vgdb-stop-at=valgrindabexit
  until the problem reproduces. You can then attach with gdb, and debug
  valgrind itself
  (but that will not be easy, and moreover that will be after the corruption
  has happened).


[valgrind] [Bug 356457] valgrind: m_mallocfree.c:2042 (vgPlain_arena_free): Assertion 'blockSane(a, b)' failed.

2015-12-10 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=356457

--- Comment #2 from Philippe Waroquiers  ---
(In reply to Philippe Waroquiers from comment #1)
> in Valgrind, the 2nd action is most likely to find the problem.
The 3rd action (self-hosting) is in fact the most likely to detect the buffer
overrun.


[valgrind] [Bug 356273] conserve memory by merging adjacent DiLoc entries in the debug info location table

2015-12-13 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=356273

--- Comment #5 from Philippe Waroquiers  ---
(In reply to Ivo Raisr from comment #2)
> I tried the patch and I see quite a lot of merged entries on Solaris 12:
Yes, for sure, I also see a lot of 'addLoc merging' traces (produced by the
merging done in addLoc).
(NB: the changes around "addLoc merging" are just minor code cleanup; there is
no functional change there.)

So, this patch is supposed to add merging in a second phase, producing
"canonicaliseLoctab merging" traces.

Even on a big executable, I see only irrelevant "canonicaliseLoctab merging"
traces (i.e. either 0 size or too big a size), which lead to no gain.

So, before committing this patch, I would like to see some real gains.


[valgrind] [Bug 191069] Exiting due to signal not reported in XML output

2015-12-13 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=191069

Philippe Waroquiers  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED

--- Comment #10 from Philippe Waroquiers  ---
Committed (slightly modified) in revision 15747.

Thanks for the patch, and sorry for the long time taken to look at it and
commit it.


[valgrind] [Bug 356457] valgrind: m_mallocfree.c:2042 (vgPlain_arena_free): Assertion 'blockSane(a, b)' failed.

2015-12-19 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=356457

--- Comment #5 from Philippe Waroquiers  ---
(In reply to Joost VandeVondele from comment #3)
> just happened again, but it is really rare. (this is a 12 core server
> running valgrind +-12h a day... and this seems to happen every +- 10 days).
> Is any of the suggestions mentioned above possible without runtime overhead
> and excessive IO ?
Assuming you know which executable/test causes the bug, the first thing to try
is 'self-hosting': run your test executable under valgrind, itself running
under valgrind.

And/or run valgrind in a loop on the executable that gave the problem,
with the options -v -v -v -d -d -d --vgdb-stop-at=valgrindabexit
A failing run will then stop, allowing you to examine the debug output of
valgrind.

The above will for sure consume CPU, but you can sleep during that time :)


[valgrind] [Bug 356457] valgrind: m_mallocfree.c:2042 (vgPlain_arena_free): Assertion 'blockSane(a, b)' failed.

2015-12-19 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=356457

--- Comment #6 from Philippe Waroquiers  ---
(In reply to Kim Rosberg from comment #4)
> I'm experience exactly the same behavior. But the interval is between +-4
> days and I'm running multiple servers 24h. 
> 
> ==247219== Memcheck, a memory error detector
> ==247219== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
> ==247219== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
> ==247219== Command: some test.elf
> ==247219== Parent PID: 247218
> ==247219== 
> blockSane: fail -- redzone-hi
> 
> valgrind: m_mallocfree.c:2047 (vgPlain_arena_free): Assertion 'blockSane(a,
> b)' failed.
The above line number is strange. There is no assertion at line 2047. There is
one at line 2042.

It would be good if you could (both) indicate which distribution and libraries
your executables are using: this might give a hint about the origin.
I am now re-running all valgrind tests under valgrind itself on an amd64
Debian 7, just in case.


[valgrind] [Bug 356457] valgrind: m_mallocfree.c:2042 (vgPlain_arena_free): Assertion 'blockSane(a, b)' failed.

2015-12-24 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=356457

--- Comment #9 from Philippe Waroquiers  ---
(In reply to Joost VandeVondele from comment #8)
> I'll try something similar on the other machine, but the failure is not so
> easy to trigger, seemingly.
...
> There are many more static libraries involved, and all are compiled with
> debug info. The binary is also large (~142Mb).
According to the guest stacktrace, the corruption happens when mmap-ing a
shared lib
(so, when valgrind is reading the debug info of this library), so it is
probably related to the shared lib being loaded.

When self-hosting, you will increase the chance of detecting a possible buffer
overrun
by using --core-redzone-size=xxx on the inner valgrind, with xxx being e.g.
100 bytes, or even 1000 bytes (if that does not give an out of memory).
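
As background on why a bigger redzone helps: a redzone is guard space placed
before and after each heap block, so an overrun that stays within the redzone
damages no neighbouring block and can be noticed. The sketch below shows the
idea with a fill pattern that is checked later. It is a simplified
illustration, not the m_mallocfree.c implementation; Block, RZ_FILL and the
block_* functions are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define RZ_FILL 0xA5   /* arbitrary guard pattern chosen for this sketch */

typedef struct {
    unsigned char *base;  /* start of the low redzone */
    size_t payload;       /* bytes usable by the caller */
    size_t rz;            /* redzone bytes on each side of the payload */
} Block;

/* Allocate `payload` bytes with `rz` guard bytes before and after.
   Returns 0 on success, -1 if out of memory. */
int block_alloc(Block *b, size_t payload, size_t rz)
{
    b->base = malloc(payload + 2 * rz);
    if (b->base == NULL)
        return -1;
    b->payload = payload;
    b->rz = rz;
    memset(b->base, RZ_FILL, rz);                 /* low redzone  */
    memset(b->base + rz + payload, RZ_FILL, rz);  /* high redzone */
    return 0;
}

/* Pointer to the caller-visible payload. */
unsigned char *block_data(const Block *b) { return b->base + b->rz; }

/* Returns 0 if both redzones are intact, 1 if the low redzone was
   overwritten, 2 if the high redzone was overwritten. */
int block_check(const Block *b)
{
    for (size_t i = 0; i < b->rz; i++)
        if (b->base[i] != RZ_FILL)
            return 1;
    for (size_t i = 0; i < b->rz; i++)
        if (b->base[b->rz + b->payload + i] != RZ_FILL)
            return 2;
    return 0;
}

void block_free(Block *b) { free(b->base); b->base = NULL; }
```

With a redzone of 100 bytes, a write 50 bytes past the end of the payload
still lands in guard space; with a small redzone the same write would land in
whatever follows the block and go unnoticed by this kind of check.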


[valgrind] [Bug 357033] VALGRIND_DO_QUICK_LEAK_CHECK reports leaked and dubious memory as reachable in intel-compiled applications.

2015-12-28 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=357033

Philippe Waroquiers  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |philippe.waroquiers@skynet.be

--- Comment #1 from Philippe Waroquiers  ---
This behaviour is not necessarily a bug, as some pointers might e.g. be left on
the stack and/or in registers, so the blocks they point to can still be
considered reachable.

See e.g. memcheck/tests/leak-cases.c for a technique used in regression tests
to avoid (or limit) such leftover pointers.

Note that you can investigate why a block is still reachable using gdb+vgdb:
see
http://www.valgrind.org/docs/manual/mc-manual.html#mc-manual.monitor-commands
e.g. the leak_check, block_list and who_points_at monitor commands.


[valgrind] [Bug 357034] Inlined functions are not reported for intel-compiled applications.

2015-12-28 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=357034

Philippe Waroquiers  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |philippe.waroquiers@skynet.be

--- Comment #2 from Philippe Waroquiers  ---
(In reply to Tatyana from comment #1)
> Created attachment 96256 [details]
> a reproducer application

Can you use
objdump --dwarf inline_info_icc
and
valgrind --trace-symtab=yes --trace-symtab-patt=*inline_info_icc*
to double check that icc has actually inlined calls?
It would be good to compare the dwarf inline information with that generated
by gcc.


[valgrind] [Bug 353660] XML in auxwhat tag not escaping reserved symbols properly

2015-12-28 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=353660

Philippe Waroquiers  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|UNCONFIRMED                 |RESOLVED
                 CC|                            |philippe.waroquiers@skynet.be

--- Comment #2 from Philippe Waroquiers  ---
A fix was committed in revision 15753.
As there are very few xml tests (and there was no reproducer for the
problem),
it would be good to test with the latest SVN version.
Thanks


[valgrind] [Bug 357294] cannot start valgrind with tool dhat

2015-12-29 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=357294

Philippe Waroquiers  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |WONTFIX
             Status|UNCONFIRMED                 |RESOLVED
                 CC|                            |philippe.waroquiers@skynet.be

--- Comment #1 from Philippe Waroquiers  ---
No bug fixes are made for 3.8.0, which is very old.
I understand that this is working in 3.11.0, so closing as WONTFIX.


[valgrind] [Bug 357037] Line numbers are occasionally displayed incorrectly in intel-compiled applications

2015-12-30 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=357037

Philippe Waroquiers  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |philippe.waroquiers@skynet.be

--- Comment #1 from Philippe Waroquiers  ---
It would be good to analyse the debug info generated by icc,
e.g. using objdump
and/or using gdb, e.g. info line 5/6/7
and info line *0x..
and/or the valgrind gdbserver monitor command v.info location <addr>
   (where addr is an address that should be part of line 5).

Alternatively, it might be the unwind info and/or the valgrind unwinder that
is not ok.
You might investigate that by using gdb+vgdb, putting a breakpoint at
vg_replace_malloc.c:299.
You can then compare the gdb unwinder (using the bt gdb command)
with the valgrind unwinder (using monitor v.info scheduler).

All the above might give some hints about what is going wrong.


[valgrind] [Bug 358030] support direct socket calls on x86 32bit (new in linux 4.3)

2016-01-15 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=358030

Philippe Waroquiers  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |philippe.waroquiers@skynet.be

--- Comment #2 from Philippe Waroquiers  ---
(In reply to Ron from comment #1)
> Created attachment 96657 [details]
> Patch that adds the direct socket syscall definitions for x86
Thanks for the patch, which seems reasonable (but quick reading only :).
Have you run the regression tests with your patch?
The testsuite has a bunch of socket-related tests, so if there are a lot of
failures without your patch, and a lot fewer failures with your patch, then
that will help to show that the patch is correct/needed.
Maybe memcheck/tests/x86-linux/scalar.c should/could also be modified?


[valgrind] [Bug 357871] pthread_spin_destroy not properly wrapped

2016-01-17 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=357871

Philippe Waroquiers  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |philippe.waroquiers@skynet.be
         Resolution|---                         |FIXED
             Status|UNCONFIRMED                 |RESOLVED


[valgrind] [Bug 356457] valgrind: m_mallocfree.c:2042 (vgPlain_arena_free): Assertion 'blockSane(a, b)' failed.

2016-01-22 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=356457

--- Comment #13 from Philippe Waroquiers  ---
(In reply to Joost VandeVondele from comment #12)
Thanks for this data.

The warning about the stack switch is normal: valgrind has a heuristic to
detect stack switches. If a program uses huge stackframes, then a call can be
confused with a stack switch, and this warning indicates what to do if this is
*not* a stack switch
(but in a self-hosting setup, such a message is normal: it is a stack switch,
not a huge frame).
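
The heuristic can be pictured as a simple threshold test on how far the stack
pointer moves in one step. The sketch below is only an illustration of that
idea, with a hypothetical function name; the threshold corresponds to the
value given with --max-stackframe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A change of the stack pointer by more than max_stackframe bytes, in
   either direction, is treated as a switch to a different stack rather
   than as the allocation or release of one (huge) frame. */
bool looks_like_stack_switch(uintptr_t old_sp, uintptr_t new_sp,
                             uintptr_t max_stackframe)
{
    uintptr_t delta = old_sp > new_sp ? old_sp - new_sp : new_sp - old_sp;
    return delta > max_stackframe;
}
```

A program that really allocates a frame bigger than the threshold is thus
misclassified, which is why the warning suggests raising --max-stackframe in
that case.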

Sadly, no error is detected by the self-hosting.

So, there are a few more things you could try:
Run (no need to self host, just a normal run) but add the option
 --sanity-level=4
Valgrind will do (more) sanity checks while running, and this might give
a hint.

Another thing is to add the option --vgdb-stop-at=valgrindabexit
when you run all your regression tests. Then, when the problem reproduces,
valgrind will stop and wait for a gdb to connect.
Then attach with gdb to the valgrind process and do e.g.
   bt full
Also go to the frame (image.c:778) and do
   print img->ces[i]->off
   print img->ces[i]->used

You might also print all the non-null ces entries.

But we are really trying to kill the bug by shooting in the dark :(


> Since the error is recurring, I have now tried to run the self-hosting.
> Running :
> 
> /data/vjoost/test/outer/install/bin/valgrind --sim-hints=enable-outer
> --trace-children=yes --smc-check=all-non-file --run-libc-freeres=no
> --tool=memcheck -v /data/vjoost/test/inner/install/bin/valgrind
> --suppressions=/data/vjoost/toolchain-r16494/install/valgrind.supp
> --max-stackframe=2168152 --error-exitcode=42 --vgdb-prefix=./inner
> --core-redzone-size=1000 --tool=memcheck -v
> /data/schuetto/auto_regtesting/regtests/cp2k/exe/local_valgrind/cp2k.sdbg
> ethanol_both_rcut10.0_e1-1_v1-4_RSR.inp
> 
> (I.e. self-hosting with added redzone, on the our executable corresponding
> to a failed run, with its arguments and parameters), I get a seemingly
> correct run. The output will be attached as out.innerouter.2 . Maybe it is
> worthwhile to look with expert eyes.
> 
> However, after observing in that output a warning on stack switching, I
> added --max-stackframe=68009224472 (as suggested, seems a bit large;-), and
> that lead to a run with some other error (Memcheck: the 'impossible'
> happened:   create_MC_Chunk: shadow area is accessible).


[valgrind] [Bug 359133] m_deduppoolalloc.c:258 (vgPlain_allocEltDedupPA): Assertion 'eltSzB <= ddpa->poolSzB' failed.

2016-02-18 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=359133

--- Comment #8 from Philippe Waroquiers  ---
(In reply to David Hallas from comment #7)
> I have attached a reduced test case that shows the problem. I have tested
> with gcc-4.9.3 and clang-3.7.1 using a 64bit Linux PC. I compiled it like
> this:
> 
> g++ -std=c++11 main.cpp -o test
> 
> I also verified that the latest master fixes the problem.
> 
> Let me know if there is anything else you need

Thanks for the test case.
Test added in revision 15799


[valgrind] [Bug 359133] m_deduppoolalloc.c:258 (vgPlain_allocEltDedupPA): Assertion 'eltSzB <= ddpa->poolSzB' failed.

2016-02-20 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=359133

--- Comment #10 from Philippe Waroquiers  ---
(In reply to David Hallas from comment #9)
> So, should I go ahead and close the bug now that a testcase has been added?

Status was changed to RESOLVED/FIXED which seems to be the final status of
valgrind bugs.


[valgrind] [Bug 349128] Access not within mapped region in _pthread_find_thread (OS X 10.11)

2016-02-22 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=349128

Philippe Waroquiers  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |philippe.waroquiers@skynet.be


[valgrind] [Bug 359705] memcheck causes segfault on a dynamically-linked test from rustlang's test suite on i686

2016-02-27 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=359705

Philippe Waroquiers  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |philippe.waroquiers@skynet.be

--- Comment #6 from Philippe Waroquiers  ---
Seeing the first 2 lines of output:
  ==6449== Can't extend stack to 0x4bb9880 during signal delivery for thread 2: 
  ==6449== no stack segment 
it might be worth trying with
--vex-iropt-register-updates=allregs-at-each-insn

just in case your test case does special things with signals.


[valgrind] [Bug 356457] valgrind: m_mallocfree.c:2042 (vgPlain_arena_free): Assertion 'blockSane(a, b)' failed.

2016-01-27 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=356457

--- Comment #15 from Philippe Waroquiers  ---
(In reply to Joost VandeVondele from comment #14)
> Also no luck with --sanity-level=4 
> 
> The fact that it is not reproducible on command is indeed not simplifying
> this. I wonder if this could be related to something external to valgrind
> triggering this.
Yes, this bug is quite mysterious.

The only remaining thing to try that I see is to add 
   --vgdb-stop-at=valgrindabexit
to the valgrind args you use for your regression tests.
Then when the error happens, valgrind will wait for a gdb to connect using
gdb+vgdb.
You can then examine e.g. which library is being mmap-ed.
You might also use gdb to directly attach to valgrind and examine valgrind
internals.


[valgrind] [Bug 303877] valgrind doesn't support compressed debuginfo sections.

2016-01-31 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=303877

Philippe Waroquiers  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |philippe.waroquiers@skynet.be

--- Comment #18 from Philippe Waroquiers  ---
An alternative is the simple/super small 'inflate' implementation in the zlib
code:
zlib-1.2.8/contrib/puff.h and puff.c

This is a fully independent inflate implementation (no #include).

There are some drawbacks: it is 2 times slower than the real zlib inflate, and
as it does not do memory allocation, inflate fails if the target buffer is too
small (so you must redo the inflate with a bigger buffer).
If the debug info stores the uncompressed size, then this is not a problem.
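
The "redo the inflate with a bigger buffer" drawback amounts to a retry loop
like the sketch below. The inflate_fn callback type and its return codes are
assumptions made for this illustration and are not claimed to be puff's
actual interface; toy_rle_decode is a trivial stand-in decoder, present only
so that the loop can be exercised.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical decoder interface for this sketch: returns 0 on success
   (and stores the number of bytes produced in *destlen), 1 if the output
   buffer was too small, any other value on a hard error. */
typedef int (*inflate_fn)(unsigned char *dest, unsigned long *destlen,
                          const unsigned char *src, unsigned long srclen);

/* Decode src into a malloc-ed buffer, doubling the buffer each time the
   decoder reports that it was too small. `guess` is the initial size;
   if the uncompressed size is known, pass it and no retry happens.
   Returns the buffer (to be freed by the caller), or NULL on failure. */
unsigned char *inflate_with_retry(inflate_fn fn, const unsigned char *src,
                                  unsigned long srclen, unsigned long guess,
                                  unsigned long *outlen)
{
    unsigned long cap = guess ? guess : 64;
    for (;;) {
        unsigned char *buf = malloc(cap);
        unsigned long len = cap;
        int rc;
        if (buf == NULL)
            return NULL;
        rc = fn(buf, &len, src, srclen);
        if (rc == 0) {
            *outlen = len;
            return buf;
        }
        free(buf);
        if (rc != 1 || cap > (unsigned long)-1 / 2)
            return NULL;      /* hard error, or cannot grow any further */
        cap *= 2;
    }
}

/* Toy stand-in decoder: src is a sequence of (count, byte) pairs. */
int toy_rle_decode(unsigned char *dest, unsigned long *destlen,
                   const unsigned char *src, unsigned long srclen)
{
    unsigned long out = 0;
    if (srclen % 2 != 0)
        return 2;
    for (unsigned long i = 0; i < srclen; i += 2) {
        if (src[i] > *destlen - out)
            return 1;         /* output buffer too small */
        for (unsigned int k = 0; k < src[i]; k++)
            dest[out++] = src[i + 1];
    }
    *destlen = out;
    return 0;
}
```

Each failed attempt repeats the whole decoding, which is why knowing the
uncompressed size up front removes the drawback.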


[valgrind] [Bug 348345] Assertion fails for negative lineno

2016-02-03 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=348345

Philippe Waroquiers  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |philippe.waroquiers@skynet.be

--- Comment #5 from Philippe Waroquiers  ---
Transformed the other assert for a negative line number into a 'complain once'
message, and refactored the checking. Committed in revision 15780.

Thanks for the patch


[valgrind] [Bug 359133] m_deduppoolalloc.c:258 (vgPlain_allocEltDedupPA): Assertion 'eltSzB <= ddpa->poolSzB' failed.

2016-02-14 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=359133

Philippe Waroquiers  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |philippe.waroquiers@skynet.be

--- Comment #1 from Philippe Waroquiers  ---
Fixed in revision 15787.

Note: the fix was tested by temporarily changing the pool size to a very small
value.
It would be nice if you could produce a small test case which has a string > 64
Kb, so
as to have a regression test for this.


[valgrind] [Bug 359133] m_deduppoolalloc.c:258 (vgPlain_allocEltDedupPA): Assertion 'eltSzB <= ddpa->poolSzB' failed.

2016-02-14 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=359133

Philippe Waroquiers  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|UNCONFIRMED                 |RESOLVED


[valgrind] [Bug 359133] m_deduppoolalloc.c:258 (vgPlain_allocEltDedupPA): Assertion 'eltSzB <= ddpa->poolSzB' failed.

2016-02-15 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=359133

--- Comment #5 from Philippe Waroquiers  ---
(In reply to David Hallas from comment #4)
> I can try :) What would the format of a testcase be? Would a C++ code
> snippet be good enough?
A small compilable C++ testcase is ok.
Bonus points if the testcase consists of a single file.
Thanks


[valgrind] [Bug 199468] Suppressions: stack size limited to 25 while --num-callers allows up to 50 frames

2016-08-10 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=199468

Philippe Waroquiers  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |philippe.waroquiers@skynet.be

--- Comment #2 from Philippe Waroquiers  ---
(In reply to Robin Kuzmin from comment #1)
> I have the same issue.
> I have a suppression:
> {
>google::protobuf::Message::PrintDebugString() const (text_format.cc:110)
>Memcheck:Leak
>match-leak-kinds: reachable
>fun:_Znwm
>...
>fun:_ZNK6google8protobuf7Message11DebugStringEv
>fun:_ZNK6google8protobuf7Message16PrintDebugStringEv
>...
> }
> This suppression is ignored for "--num-callers=20", but works fine
> (suppresses) for "--num-callers=40". 
> I didn't expect the suppressions to depend on "--num-callers".

I think that we have 2 different problems:
The bug speaks about the fact that a suppression is limited to 24 entries,
while the --num-callers can now (in valgrind 3.11) go up to 500.

Your problem is that a suppression works with 40 num callers, but does not
suppress
for 20 num callers. This is not abnormal: if for example 
fun:_ZNK6google8protobuf7Message11DebugStringEv is at depth 30, when you use
--num-callers=20, then this function will not be recorded in the error
stacktrace and the suppression will not match.
This is because the logic is:
  * first an error is made, with a stacktrace recorded as specified with
--num-callers
  * after that, the suppression entries are tried.
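
The two steps above can be sketched in a few lines of Python. This is only a
simplified model, not the matching code of m_errormgr.c: frames are plain
strings, "..." stands for zero or more frames, and the names matches,
is_suppressed and the shortened frame names are made up for the illustration.

```python
def matches(pattern, frames):
    """Return True if `pattern` matches the top of `frames`.

    Simplified model: each pattern entry is either an exact frame name
    or "...", which matches zero or more frames. Frames left over after
    the pattern is exhausted are ignored.
    """
    def go(p, f):
        if p == len(pattern):
            return True
        if pattern[p] == "...":
            # Try to let "..." swallow 0, 1, 2, ... of the remaining frames.
            return any(go(p + 1, k) for k in range(f, len(frames) + 1))
        return f < len(frames) and pattern[p] == frames[f] and go(p + 1, f + 1)

    return go(0, 0)


def is_suppressed(full_stack, pattern, num_callers):
    # Step 1: the error is recorded with at most num_callers frames.
    recorded = full_stack[:num_callers]
    # Step 2: only afterwards are the suppression entries tried.
    return matches(pattern, recorded)


# Hypothetical stack: the frame of interest sits at depth 30.
stack = (["_Znwm"] + ["frame%d" % i for i in range(1, 29)]
         + ["DebugString", "PrintDebugString", "main"])
supp = ["_Znwm", "...", "DebugString", "PrintDebugString", "..."]

print(is_suppressed(stack, supp, 40))
print(is_suppressed(stack, supp, 20))
```

With 40 recorded frames the entry at depth 30 is part of the stacktrace and the
pattern can match; with 20 it has already been cut off before matching starts.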

If suppressions had to be matched without taking --num-callers into
account, then all error stacktraces would first have to be recorded
without limit, and then, after (unsuccessful) suppression matching, the
frames exceeding --num-callers would have to be dropped.

The original problem (suppressions limited to 24 entries while errors can
record up to 500 frames) is somewhat strange. I could not retrieve any comment
or svn log describing why
suppression entries are limited to 24 entries. --num-callers used to be limited
to 50.

If we increase the nr of entries in suppressions to 500 (the max value for
--num-callers), this will however have the side effect that
--gen-suppressions will produce bigger entries, while they were limited to
24 entries. I guess this is not a problem, as in any case, for many
suppression entries, 24 was already a lot (e.g. for suppression entries
intended to suppress errors in libraries).


[valgrind] [Bug 199468] Suppressions: stack size limited to 25 while --num-callers allows up to 50 frames

2016-08-11 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=199468

--- Comment #5 from Philippe Waroquiers  ---
(In reply to Robin Kuzmin from comment #4)
> I have run through this discussion more thoroughly. My understanding how
> valgrind works (or should work) has changed. Here is my vision.
> 
> In order to avoid multiple reports of the same error, valgrind uses top 4
> stack frames to common up the errors (see description of "--num-callers" at
> http://valgrind.org/docs/manual/manual-core.html#manual-core.options). 
> 
> Valgrind (memcheck) deals with (at least) 2 types of the stack traces:
> 
> 1. The Short-term Stack Traces. They are reported immediately upon error,
> e.g. when the analyzed program accesses outside the heap-allocated block. At
> this stage the entire stack trace can be used for suppressions (both for
> applying the suppressions and for generating the suppression with
> --gen-suppressions). After suppressions, if/when reporting these stack
> traces to the console the top --num-callers frames are (or can be) printed.
> Then these stack traces are (or can be) forgotten by valgrind (and only the
> top 4 stack frames are (or can be) saved to common up the errors). Thus for
> the Short-term Stack Traces there is no need for the suppressions to depend
> on the --num-callers. Implementing this mechanism will be relatively cheap
> (the price will be an INsignificant lower down in performance and
> INsignificant (and short-term) increase in resource requirements).
> 
> 2. The Long-term Stack Traces. These stack traces valgrind has to store in
> memory for a long period of time, and potentially till the end of the
> analyzed program. E.g. when the analyzed program makes an allocation, the
> allocation stack trace has to be kept until the CORRECT deallocation (if the
> CORRECT deallocation Ever happens) or until the termination (if the CORRECT
> deallocation Never happens). For the Long-term Stack Traces valgrind can try
> to APPLY suppressions immediately (upon allocation) to the entire stack
> trace. If any of the suppressions is applicable then the top 4 stack frames
> are stored to common up the errors (and the stack trace can be marked as "to
> be suppressed by suppression N"), otherwise (none of the suppressions apply)
> the number of stack frames to be saved depends on the --gen-suppressions. If
> "no" then the --num-callers stack frames are stored, otherwise the entire
> stack trace is stored (and the entire stack trace is used for generating the
> suppression (if the CORRECT deallocation never happens), and only the
> --num-callers stack frames are used for reporting the error to the console).
> 
> Thus the suppressions might be independent of the --num-callers.

For what you call 'long term stacktraces': capturing a big stacktrace is
costly by itself.
A lot of effort was spent to optimise this, but still this is costly. So,
always capturing an (unlimited) stack trace is not desirable.
But applying the suppression entries is even (a lot) more costly. E.g. it
implies translating an IP address into a function name.
A lot of effort was spent already to make this suppression matching faster
(e.g. using lazy completers, see m_errormgr.c is_suppressible_error function).
But such suppression matching can clearly not be done all the time (e.g. for
all allocations; in memcheck or in some tools such as helgrind, even for all
memory accesses).

So, what you suggest cannot be implemented without impacting significantly the
performance.

With that in mind, I think the best approach for you is just to use
suppressions that work properly with the --num-callers you are using.

For what concerns the original problem (suppression entries limited to 24
entries):
I guess we could/should use the same max size as the max value for --num-callers


[valgrind] [Bug 369468] Implement HT_remove_at_Iter, allows removing the current entry from a table during iteration

2016-10-05 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=369468

Philippe Waroquiers  changed:

   What|Removed |Added

 CC||philippe.waroquiers@skynet.
   ||be

--- Comment #1 from Philippe Waroquiers  ---
Created attachment 101438
  --> https://bugs.kde.org/attachment.cgi?id=101438&action=edit
slightly reworked patch, with an alternative implementation of
HT_remove_at_Iter

Find attached a slightly reworked patch (in m_hashtable.c):
Instead of maintaining iterPrevNode and iterPrevChain in HT_Next,
the previous chain and previous node are calculated in VG_(HT_remove_at_Iter).

This has no performance impact on existing usage of VG_(HT_Next).
On the performance test of the auto free pool, this is also slightly faster (a
few percent).
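
To illustrate the idea (this is not the code of the patch, just a small Python
model with made-up names), here is a chained hash table whose iterator keeps
no 'previous node' state. The predecessor is only searched for when an
element is actually removed, so plain iteration pays nothing extra:

```python
class Node:
    def __init__(self, key, value):
        self.key, self.value, self.next = key, value, None


class HashTable:
    """Toy chained hash table with a single built-in iterator."""

    def __init__(self, n_chains=8):
        self.chains = [None] * n_chains
        self.reset_iter()

    def add(self, key, value):
        node = Node(key, value)
        i = key % len(self.chains)      # integer keys, for determinism
        node.next = self.chains[i]
        self.chains[i] = node

    def reset_iter(self):
        self.iter_chain = 0
        # Last node returned by next(); None means "positioned just
        # before the first node of chains[iter_chain]".
        self.iter_node = None

    def next(self):
        if self.iter_chain >= len(self.chains):
            return None
        if self.iter_node is None:
            node = self.chains[self.iter_chain]
        else:
            node = self.iter_node.next
        while node is None:
            self.iter_chain += 1
            if self.iter_chain >= len(self.chains):
                self.iter_node = None
                return None
            node = self.chains[self.iter_chain]
        self.iter_node = node
        return node

    def remove_at_iter(self):
        """Remove the node last returned by next().

        The predecessor is computed here, by walking the current chain,
        rather than being tracked on every call to next().
        """
        node = self.iter_node
        assert node is not None, "call next() before remove_at_iter()"
        prev, p = None, self.chains[self.iter_chain]
        while p is not node:
            prev, p = p, p.next
        if prev is None:
            self.chains[self.iter_chain] = node.next
        else:
            prev.next = node.next
        # Step the iterator back so that next() yields the node that
        # followed the removed one.
        self.iter_node = prev
        return node

    def keys(self):
        return sorted(n.key for c in self.chains for n in self._walk(c))

    @staticmethod
    def _walk(node):
        while node is not None:
            yield node
            node = node.next
```

The walk in remove_at_iter is bounded by the length of one chain, so the cost
is only paid on removal and stays small as long as the chains are short.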


[valgrind] [Bug 369468] Implement HT_remove_at_Iter to avoid quadratic auto free pool algorithm

2016-10-05 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=369468

Philippe Waroquiers  changed:

   What|Removed |Added

Summary|Implement   |Implement HT_remove_at_Iter
   |HT_remove_at_Iter, allows   |to avoid quadratic auto
   |removing the current entry  |free pool algorithm
   |from a table during |
   |iteration   |


[valgrind] [Bug 369468] Implement HT_remove_at_Iter to avoid quadratic auto free pool algorithm

2016-10-15 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=369468

Philippe Waroquiers  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #2 from Philippe Waroquiers  ---
fixed in revision 16041
(reworked implementation applied, to ensure no impact on existing hashtable
uses)


[valgrind] [Bug 356457] valgrind: m_mallocfree.c:2042 (vgPlain_arena_free): Assertion 'blockSane(a, b)' failed.

2016-10-19 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=356457

--- Comment #19 from Philippe Waroquiers  ---
(In reply to Eric Chamberland from comment #18)
> which corresponds to one of the 6 failling tests of last night.
For these 6 failing tests (or the failing tests of the other runs)
 is it always (or often ?) just after seeing the line telling that it is
loading the syms of
  libgiref_opt_Interface.so ?
(wondering if this bug is linked to some specific debug info/symbols in a
library).
Or is it after a seemingly random library load ?

Also, a recent commit has added a check on valgrind's own heap, to detect
double free (that could potentially create such an assert).
So, it would be good if you could try the valgrind svn version, just in case
this would be a double free.


[valgrind] [Bug 356457] valgrind: m_mallocfree.c:2042 (vgPlain_arena_free): Assertion 'blockSane(a, b)' failed.

2016-10-19 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=356457

--- Comment #20 from Philippe Waroquiers  ---
Yet another trial you can do (if you try the svn version) is to activate
DEBUG_MALLOC
in coregrind/m_mallocfree.c file.
This might give a chance to detect a possible corruption closer to the origin
of the corruption.

Thanks


[valgrind] [Bug 365208] valgrind stuck after redirecting "memcpy"

2016-07-08 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=365208

Philippe Waroquiers  changed:

   What|Removed |Added

 CC||philippe.waroquiers@skynet.
   ||be

--- Comment #1 from Philippe Waroquiers  ---
With just the provided info, there is not much we can say.

Several things can be done to investigate what goes wrong:

1. add some traces e.g. start valgrind with
-v -v -v -d -d -d

2. assuming that your application is started up, you can investigate what it is
doing under valgrind
   using valgrind+vgdb + gdb.

3. finally, you could try to debug valgrind itself by using gdb and attach to
the memcheck-ppc32 program, do a backtrace and report what it does


[valgrind] [Bug 360557] helgrind reports data race which I can't see (involves rwlocks)

2016-07-08 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=360557

Philippe Waroquiers  changed:

   What|Removed |Added

 CC||philippe.waroquiers@skynet.
   ||be

--- Comment #1 from Philippe Waroquiers  ---
(In reply to Ari Sundholm from comment #0)

> The reason why I am so puzzled by this is that both on line 36 and line 61
> the mutex for the element is held.

Are you sure the lock is always held on line 36 (that does  b.var2++;) ?
pthread_cond_wait will release the lock.
So, on the next loop, b.B_lock  is not held anymore.


[valgrind] [Bug 365208] valgrind stuck after redirecting "memcpy"

2016-07-08 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=365208

--- Comment #4 from Philippe Waroquiers  ---
Effectively, there is not a lot more info.
It looks however that (some) user level code is being executed, as the user
stack
is being extended.

So, you might try to attach to the valgrind gdb server by doing
   gdb 
and then
  (gdb) target remote  | /usr/lib/valgrind/../../bin/vgdb --pid=X
(as instructed by gdb).

Assuming the above does not work or give any info, you might try
with the none tool:   --tool=none

Also maybe try with the smallest simplest executable you have
(maybe a "hello world" program, or date, or similar).

Other possible options to trace are --trace-signals=yes --trace-syscalls=yes
--trace-flags=0010 --trace-notbelow=1

Just to see if something happens.

But for sure, we shoot in the dark with the above


[valgrind] [Bug 365208] valgrind stuck after redirecting "memcpy"

2016-07-08 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=365208

--- Comment #5 from Philippe Waroquiers  ---
NB/ when adding additional trace, still keep the -v -v -v -d -d -d


[valgrind] [Bug 365273] Invalid write to stack location reported after signal handler runs

2016-07-09 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=365273

Philippe Waroquiers  changed:

   What|Removed |Added

 CC||philippe.waroquiers@skynet.
   ||be

--- Comment #5 from Philippe Waroquiers  ---
Compiled the program on an x86 debian 8.5 and an amd64 debian 7.9.
Tried with several valgrind versions, but could not reproduce any write error.

Can you run with the traces -v -v -v -d -d -d --trace-signals=yes
and attach the resulting trace.

Can you attach the trace both with and without your patch ?

Thanks


[valgrind] [Bug 365273] Invalid write to stack location reported after signal handler runs

2016-07-09 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=365273

--- Comment #6 from Philippe Waroquiers  ---
Just to be sure to be complete, please also add --trace-signals=yes

Thanks


[valgrind] [Bug 365273] Invalid write to stack location reported after signal handler runs

2016-07-09 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=365273

--- Comment #7 from Philippe Waroquiers  ---
Humph, I mean:
Just to be sure to be complete, please also add --trace-syscalls=yes


[valgrind] [Bug 365273] Invalid write to stack location reported after signal handler runs

2016-07-10 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=365273

Philippe Waroquiers  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #11 from Philippe Waroquiers  ---
Thanks for the analysis and the patch, this was a tricky problem.

Patch (slightly modified)  committed in revision 15902.


[valgrind] [Bug 360557] helgrind reports data race which I can't see (involves rwlocks)

2016-07-12 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=360557

--- Comment #3 from Philippe Waroquiers  ---
(In reply to Ari Sundholm from comment #2)
> This is not the case, as, per standard semantics for condition variables,
> pthread_cond_wait re-acquires the mutex before returning to the caller.
Yes, you are correct, I stopped reading pthread_cond_wait manual too quickly
:(.
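
For reference, the contract discussed here (the lock is released while the
thread is blocked in the wait, and held again when the wait returns) can be
observed with a small experiment. The sketch below uses Python's
threading.Condition, which documents the same release/re-acquire behaviour,
as a stand-in for pthread_cond_wait; it says nothing about how helgrind
models condition variables, and all names in it are made up.

```python
import threading


def demo():
    lock = threading.Lock()
    cond = threading.Condition(lock)
    waiting = threading.Event()
    state = {"ready": False}
    observed = {}

    def waiter():
        with cond:                       # lock acquired here
            waiting.set()                # tell the notifier we are about to wait
            while not state["ready"]:
                cond.wait(timeout=5)     # lock is released while blocked
            # wait() re-acquired the lock before returning.
            observed["held_after_wait"] = lock.locked()

    def notifier():
        waiting.wait(timeout=5)
        # The waiter still owns the lock when it sets `waiting`, so this
        # acquire can only succeed once the waiter is blocked in wait().
        with cond:
            observed["acquired_while_waiter_blocked"] = True
            state["ready"] = True
            cond.notify()

    threads = [threading.Thread(target=waiter),
               threading.Thread(target=notifier)]
    for t in threads:
        t.start()
    for t in threads:
        t.join(timeout=10)
    return observed


print(demo())
```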

So, here is another (desperate?) trial to explain the helgrind behaviour.

I have modified the program to comment out the destroy and delete of B_cond and
B_lock.
So, cond and locks are never deleted/destroyed.
This allows then to still show which lock was held, to avoid the  (and 1 that
can't be shown).

Then this has shown that in fact, the read had the A lock and had one B_lock,
but not the same B_lock as the write operation.

This then leads to the hypothesis that the memory of B might be freed and then
reallocated
fast enough (and get another lock) so that effectively there is some kind of
race condition
reported by helgrind.
Or asked otherwise, how is the program ensuring that no thread is waiting
on a lock of B, while in parallel another thread is delete-ing b.
For sure, the delete &b; is done after the unlock of B_lock
but I guess that the A lock ensures that there is no race on B when busy
inserting
or deleting B.

In any case, here is a little bit of tracing produced by --trace-malloc=yes
(this is the modified version that does not destroy B_cond/B_lock):
--21045-- _ZdlPv(0x4B780E8)
--21045-- _Znwj(16) = 0x4B780E8 < allocation of a B
--21045-- _Znwj(24) = 0x4B7DB58 < allocation of new lock for B 
--21045-- _Znwj(48) = 0x4B7DBA0
--21045-- _Znwj(24) = 0x4B7DC00
--21045-- _ZdlPv(0x4B7DC00)
--21045-- _ZdlPv(0x4B780E8) < free previous B
--21045-- calloc(18,8) = 0x4B7DC00
--21045-- _Znwj(16) = 0x4B780E8 < allocation of a B (reallocating just freed block)
--21045-- _Znwj(24) = 0x4B7DCC0 < allocation of new lock for B
--21045-- _Znwj(48) = 0x4B7DD08
--21045-- _Znwj(24) = 0x4B7DD68
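
Spotting such 'freed then immediately reallocated' addresses by eye is
tedious on a long trace. A minimal sketch of a helper that does it, assuming
the --trace-malloc=yes line shapes shown above (the function name and the
list of recognised free functions are my own choice):

```python
import re

# "--PID-- name(args) = 0xADDR" for allocations.
ALLOC = re.compile(r"--\d+-- (\w+)\([^)]*\) = (0x[0-9A-Fa-f]+)")
# "--PID-- name(0xADDR)" for the deallocation functions listed here.
FREE = re.compile(r"--\d+-- (?:free|_ZdlPv|_ZdaPv)\((0x[0-9A-Fa-f]+)\)")


def reused_addresses(lines):
    """Return, in trace order, every address that an allocation handed
    out again after that same address had been freed."""
    freed = set()
    reused = []
    for line in lines:
        m = ALLOC.search(line)
        if m:
            addr = m.group(2).lower()
            if addr in freed:
                reused.append(addr)
                freed.discard(addr)
            continue
        m = FREE.search(line)
        if m:
            freed.add(m.group(1).lower())
    return reused
```

Feeding it the trace above reports 0x4b780e8 (the B object) being handed out
again right after its delete, which is the pattern of interest here.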


When using the helgrind option  --free-is-write=yes, helgrind reports a race
condition
between a write and a previous write, which is the delete/free operation:
==21164== Possible data race during write of size 4 at 0x4B783EC by thread #7
==21164== Locks held: 1, at address 0x4B78698
==21164==    at 0x8049178: A::method1(int) (helgrind_bug_reproducer.orig.cpp:36)
==21164==    by 0x8048E06: thread(void*) (helgrind_bug_reproducer.orig.cpp:137)
==21164==    by 0x402DA76: mythread_wrapper (hg_intercepts.c:389)
==21164==    by 0x405BEFA: start_thread (pthread_create.c:309)
==21164==    by 0x42B2EDD: clone (clone.S:129)
==21164== 
==21164== This conflicts with a previous write of size 4 by thread #11
==21164== Locks held: 1, at address 0x804CBD8
==21164==    at 0x402A868: operator delete(void*) (vg_replace_malloc.c:576)
==21164==    by 0x80492AD: A::method2(int) (helgrind_bug_reproducer.orig.cpp:65)
==21164==    by 0x8048E30: thread(void*) (helgrind_bug_reproducer.orig.cpp:139)
==21164==    by 0x402DA76: mythread_wrapper (hg_intercepts.c:389)
==21164==    by 0x405BEFA: start_thread (pthread_create.c:309)
==21164==    by 0x42B2EDD: clone (clone.S:129)
==21164==  Address 0x4b783ec is 4 bytes inside a block of size 16 alloc'd
==21164==    at 0x40299F6: operator new(unsigned int) (vg_replace_malloc.c:328)
==21164==    by 0x80490D3: A::method1(int) (helgrind_bug_reproducer.orig.cpp:23)
==21164==    by 0x8048E06: thread(void*) (helgrind_bug_reproducer.orig.cpp:137)
==21164==    by 0x402DA76: mythread_wrapper (hg_intercepts.c:389)
==21164==    by 0x405BEFA: start_thread (pthread_create.c:309)
==21164==    by 0x42B2EDD: clone (clone.S:129)
==21164==  Block was alloc'd by thread #9

So, we have 2 indications that the problem might be related to the way the
memory of the B objects is allocated/freed/reallocated.

So that might be a bug in the way helgrind handles allocate/free/reallocate.
It might also be a real race in the program, related to memory management, but
it is not very clear how this can happen with the A lock.

I am wondering also about the limitations helgrind has about the condition
variables
(see helgrind user manual).

So, this helgrind error is not very clear to me, it looks like we need a real
expert to look
at this :)


[valgrind] [Bug 366035] valgrind misses buffer overflow, segfaults in malloc in localtime

2016-07-25 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=366035

Philippe Waroquiers  changed:

   What|Removed |Added

 CC||philippe.waroquiers@skynet.
   ||be

--- Comment #3 from Philippe Waroquiers  ---
Thanks for the bug report.
The program is however still quite big (> 400 SLOC), accesses a bunch of calls
from sound related libraries, is using include files not installed by default
(at least on
my setup), and understanding what this program does is not trivial.
E.g. I do not see why the line after #ifndef BUG
is needed: this seems to get the nr of channels that were set just before ???

So, it would be nice if you could isolate the buffer overflow in the minimal
program that causes the valgrind crash, and very preferably without
dependencies on sound libraries and similar. I understand that this might be
far from easy, so below are other ideas to help determine what is wrong in
Valgrind.


First, it is strange that in the output file, the host stack trace has no
source/line nr.
This valgrind seems to have been built/compiled in an unusual way.
Can you re-run with a valgrind compiled the 'classical way', so that the host
stacktrace contains source/line nrs ?

Also, it is normal that valgrind does not (always) detect buffer overflow.
See e.g. manual for --redzone-size parameter.
Try to re-run with the max allowed redzone size (4096) as this will detect the
buffer overflow
as long as the 'overflow' is less than 4Kb far from the end of the allocated
buffer.
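
To make the role of the redzone concrete, here is a toy model (not memcheck's
real heap layout or shadow memory; the function names and the 'blocks placed
back to back' assumption are mine). An access is reported only if it lands
outside every live block; a far enough overflow lands inside a neighbouring
block and goes unnoticed:

```python
def layout(sizes, redzone):
    """Place blocks back to back, each one surrounded by `redzone`
    inaccessible bytes. Returns a list of (start, end) with end exclusive."""
    blocks = []
    addr = 0
    for size in sizes:
        addr += redzone                  # redzone in front of the block
        blocks.append((addr, addr + size))
        addr += size + redzone           # block, then redzone behind it
    return blocks


def access_is_reported(blocks, addr):
    # In this model, anything outside all live blocks is an error.
    return not any(start <= addr < end for start, end in blocks)


for rz in (16, 4096):
    blocks = layout([64, 64], rz)
    first_start = blocks[0][0]
    near = first_start + 64 + 10         # 10 bytes past the end of block 0
    far = first_start + 64 + 40          # 40 bytes past the end of block 0
    print(rz, access_is_reported(blocks, near), access_is_reported(blocks, far))
```

In this model the 40-byte overrun is missed with a 16-byte redzone because it
falls into the second block, and is caught once the redzone is 4096 bytes.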

(and of course, valgrind memcheck tool detects buffer overflow in malloc-ed
buffers, but not in buffers on the stack and global buffers (you might try
--tool=exp-sgcheck for that) or buffer overflows inside an array inside a
struct).

An undetected buffer overflow normally however by itself should not lead to a
SEGV inside valgrind code, so this is still puzzling.
I am wondering if this is not the result of a side effect of hacking with the
nr of channels
(maybe due to a syscall with arguments not fully checked by Valgrind, leading
to this SIGSEGV).
It would be also nice to re-run with more trace e.g.
   -v -v -v -d -d -d --trace-syscalls=yes

Thanks

Philippe


[valgrind] [Bug 366035] valgrind misses buffer overflow, segfaults in malloc in localtime

2016-07-26 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=366035

--- Comment #7 from Philippe Waroquiers  ---
(In reply to Frederick Eaton from comment #6)
> Hi Philippe,
> 
> Thanks for responding.
> 
> I'm using Arch Linux, it's weird that the default Arch package is not
> to your liking. 
The assumption is that valgrind is (basically) built using
   configure/make/make install
and that the resulting executables are not stripped (to ensure a.o. that the
crash stack traces contain symbolic information).
But everybody is free to compile valgrind as they want, this is free software
:).
And of course, then to get some help from Valgrind developers, very likely,
we will prefer to have non stripped executables :).


> Well I compiled from ABS adding (!strip debug) to
> OPTIONS in /etc/makepkg.conf, the new version gives line numbers for
> the backtrace. I attached the new output and also the output with the
> verbose options you recommended.
Thanks. This line/source nr allows us to confirm that the valgrind malloc
structure has been heavily corrupted.

> In the original version I just called `snd_pcm_hw_params_set_channels`
> and assumed the setting had been honored by ALSA. I requested 1
> channel, ALSA created 2 channels per device constraints, I allocated
> space for 32768 1-channel "frames", ALSA return 32768 2-channel
> frames. I got a buffer overflow. I hope this answers your question. I
> imagine that my original submission was a bit confusing.
Forcing the channel to 1 at that 'BUG' place has other consequences than just
allocating a 'too small' buffer.
As I understand, after this 'forcing to 1', we will also have sfinfo.channels
that will
be 1 instead of 2. No idea if that participates (or not) to the problem.
But what you might do instead of changing channels to 1 is to just allocate a
buffer
with a size equal to frames * 2.
Then we know that the only thing which is 'wrong' is the buffer size.
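
The size mismatch is simple arithmetic. The sketch below spells it out with
the numbers from the report (32768 frames, 1 channel requested, 2 channels
delivered); the sample width of 2 bytes is only an assumption for the
example, as the sample format is not stated here:

```python
def buffer_bytes(frames, channels, bytes_per_sample):
    """Bytes needed to hold `frames` interleaved frames."""
    return frames * channels * bytes_per_sample


FRAMES = 32768
BYTES_PER_SAMPLE = 2  # assumption: 16-bit samples

allocated = buffer_bytes(FRAMES, 1, BYTES_PER_SAMPLE)   # what was requested
written = buffer_bytes(FRAMES, 2, BYTES_PER_SAMPLE)     # what ALSA delivered
print(allocated, written, written - allocated)
```

Whatever the sample width, the data written is twice the size of the buffer,
which is why allocating frames * 2 removes the overflow without touching the
channel settings.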

That being said, assuming the too small buffer is the only cause of this
corruption,
the only 'modify' use of this buffer is likely in the source line:
res = snd_pcm_readi(handle, buffer, frames);
In the detailed trace, we see no matching read system call.
So, I will assume that this snd_pcm_readi is not doing a read syscall but is
rather
an ioctl, probably corresponding to this trace:
SYSCALL[14734,1](16) sys_ioctl ( 4, 0x80184151, 0xffefff170 ) --> [async] ... 
SYSCALL[14734,1](16) ... [async] --> Success(0x0) 

I am not sure how to translate this 0x80184151 into one of the 'symbolic' ioctl
SND/PCM things handled in syswrap-linux.c.
I suspect that some SND/PCM ioctl are not properly described  as
reading/writing the memory
pointed to by the ARG3 of the ioctl. Then of course, that might do a buffer
overrun
which is undetected by Valgrind, which then corrupts the end of the buffer
block
and the malloc data structure/memory/blocks following this (too small) buffer.
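
The request number itself can at least be split into its fields. The sketch
below assumes the generic Linux ioctl encoding (8 bits nr, 8 bits type,
14 bits size, 2 bits direction), which is what x86/amd64 use; some other
architectures split the upper bits differently. The function name is made up.

```python
def decode_ioctl(request):
    """Split an ioctl request number, assuming the asm-generic layout."""
    directions = {0: "none", 1: "write", 2: "read", 3: "read/write"}
    return {
        "dir": directions[(request >> 30) & 0x3],   # from user space's view
        "size": (request >> 16) & 0x3FFF,           # size of the argument
        "type": chr((request >> 8) & 0xFF),         # 'magic' character
        "nr": request & 0xFF,                       # command number
    }


print(decode_ioctl(0x80184151))
# {'dir': 'read', 'size': 24, 'type': 'A', 'nr': 81}
```

So 0x80184151 decodes as an _IOR('A', 0x51, <24 byte argument>) request. A
24 byte argument is a small structure rather than the sample buffer itself,
so the buffer is presumably reached through a pointer inside that structure.
This looks like it could be SNDRV_PCM_IOCTL_READI_FRAMES, but that has to be
confirmed in the kernel's sound/asound.h.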

So, at this point, what we need to confirm is:
  is snd_pcm_readi really doing an ioctl ?
  if yes, which one ? (i.e. which 'symbolic' ioctl is it doing ?)
e.g. maybe it is  case VKI_SOUND_PCM_READ_CHANNELS:
and if this is the case, then syswrap-linux.c describes that this ioctl is
writing the
size of an int, while it clearly reads more than an int, if the ioctl is
reading real data.
 If now that is the bug, and syswrap-linux.c really should describe that it
reads a bunch
 of bytes depending on previously set parameters, then I think that is a lot of
work to do,
 as basically syswrap-linux.c will need to record the previous SND/PCM ioctl to
know what
 is the expected size of such an ioctl ARG3.

 Now, maybe this ioctl is rather something like VKI_SNDRV_CTL_IOCTL_TLV_READ
 but I do not see how that matches this buffer logic and the snd_pcm_readi,
which only
  has a buffer argument.

So, to understand where the undetected buffer overflow comes from, I guess some
alsa/snd lib source code reading might be needed, to see how snd_pcm_readi is
implemented
in terms of ioctl.
We can then check if syswrap-linux.c properly describes what this ioctl is
accessing
in read and/or write mode.

Hoping this helps 


[valgrind] [Bug 366035] valgrind misses buffer overflow, segfaults in malloc in localtime

2016-07-28 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=366035

--- Comment #9 from Philippe Waroquiers  ---

> I hope it helps too! What do you want me to do now? Is there some
> other tracing facility which I should run to help you identify the
> problematic ioctl? Do you want me to make the example program more
> minimal? Perhaps I could do the latter, otherwise I don't really have
> much time - I just wanted to report this bug as a kind of housekeeping
> task, so that upstream knows about the problem. We can maybe rename it
> to something like "valgrind doesn't correctly track buffer overflows
> from certain ALSA ioctls". Thank you,
Yes, it would be nice to continue the investigation, as the ioctl hypothesis is
not yet
confirmed.
So, the following would be useful:
Discover which ioctl are done by snd_pcm_readi
One way is to put a breakpoint on the line calling this function, and on the
next line.
When the first breakpoint is encountered, add a breakpoint on the ioctl
syscall.
Each time the ioctl breakpoint triggers, write down the (symbolic) ioctl
request
   (I guess the symbolic request will be seen in the source e.g. when doing GDB
up)

It is assumed that the ioctl done (or one of the ioctl done by this function)
is using
the buffer address as parameter. Would be nice to confirm that.

The remaining mystery is: if really an ioctl is 'badly' described in
syswrap-linux.c,
then that should create a lot of errors in memcheck (use of undefined memory).

So, the 'badly described ioctl' hypothesis is still not (fully) understandable.

The alternative to find which ioctl is done is to read the library source file.

Thanks


[valgrind] [Bug 367995] Integration of memcheck with custom memory allocator

2016-09-21 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=367995

Philippe Waroquiers  changed:

   What|Removed |Added

 CC||philippe.waroquiers@skynet.
   ||be

--- Comment #14 from Philippe Waroquiers  ---
Quickly read the last version of the patch, sorry for entering in the game so
late

Some comments:

* Typo in the xml documentation:  alocator

* lines like below:  opening brace should be on the same line
+ if (MC_(is_mempool_block)(ch1))
+ {

* for detecting/reporting/asserting the overlap condition
   in case ch1_is_meta != ch2_is_meta, I am wondering if we should not check
   that the non meta block is (fully) inside the meta block.
   It looks to be an error if the non meta block is not fully inside the meta
block.

*  free_mallocs_in_mempool_block : this looks to be an algorithm that will be
O (n * m)   where n is the nr of malloc-ed blocks, and m is the nr of blocks
covered by Start/End address. That might be very slow for big applications
that allocate millions of blocks, e.g. 1 million normal blocks and one
million blocks in meta blocks will take a lot of time to cleanup ?

Thanks


[valgrind] [Bug 367995] Integration of memcheck with custom memory allocator

2016-09-23 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=367995

--- Comment #18 from Philippe Waroquiers  ---
(In reply to Ruurd Beerstra from comment #15)
Thanks for all your work on this, I think this is useful 
(I have not yet looked in depth, but I think this might be used
for the 'self-hosting' of valgrind, as valgrind uses pools).

> Part of the inefficiency is that it has to restart the scan after modifying
> the list. Can't help that.
> Also, I can't find any other way in valgrind to find the chunks with a
> particular address-range other than a brute-force scan.
It is effectively the scan restart which makes it quadratic.
The brute-force scan is reasonable: that will be O(n), and
avoiding this linear scan would imply overhead for the non mempool
uses.
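
The difference between 'restart the scan after each removal' and 'one single
scan' can be made visible by counting how many elements get visited. This is
a model on a plain Python list with made-up function names, not the memcheck
code:

```python
def visits_with_restart(items, doomed):
    """Remove every element of `doomed`, restarting the scan from the
    beginning after each removal. Returns (visits, remaining items)."""
    items = list(items)
    visits = 0
    restart = True
    while restart:
        restart = False
        for i, x in enumerate(items):
            visits += 1
            if x in doomed:
                del items[i]
                restart = True      # the list changed: scan again from the top
                break
    return visits, items


def visits_single_pass(items, doomed):
    """Same result with one linear scan and no restart."""
    visits = 0
    kept = []
    for x in items:
        visits += 1
        if x not in doomed:
            kept.append(x)
    return visits, kept


blocks = range(1000)
in_pool = set(range(500, 1000))     # the m blocks to drop, out of n = 1000
print(visits_with_restart(blocks, in_pool)[0])
print(visits_single_pass(blocks, in_pool)[0])
```

With 500 kept blocks in front of 500 doomed ones, every restart walks over
the kept blocks again, so the work grows with n * m, while the single pass
stays at n.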

Such mempool functionalities will often be used by applications
that use a lot of (small) blocks, so it would be better to avoid this quadratic
aspect.
Various techniques can be used for that, I think the best/easiest
is to add a function
 void* VG_(HT_remove_at_Iter) ( VgHashTable *table)
which removes the item at the current position of the iterator
ensuring that after the removal the iterator is such that VG_(HT_Next)
will return the element following the one just removed (or NULL, if the removed
element was the last element).
This removal will not cause problems (no hash table resize, and proper
maintenance of the iter and chains).

Philippe

NB: more generally, as a hash table can only have one single iterator, it
would be possible to allow calls to the other removal functions, but I think
it is better to have a special function for that.
Handling additions during iteration is more problematic, due to possible hash
table resize.


[valgrind] [Bug 367995] Integration of memcheck with custom memory allocator

2016-09-23 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=367995

--- Comment #19 from Philippe Waroquiers  ---
(In reply to Ivo Raisr from comment #17)
> I will take care of the integration if Philippe is ok with it.
As indicated in comment 18, I think we can avoid relatively
easily the quadratic aspect.
Otherwise, if that would not be easy to do, let's integrate the
patch in any case.

Thanks


[valgrind] [Bug 361615] Inconsistent termination for multithreaded process terminated by signal

2016-09-24 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=361615

Philippe Waroquiers  changed:

   What|Removed |Added

Summary|Inconsistent termination|Inconsistent termination
   |when an instrumented|for multithreaded process
   |multithreaded process is|terminated by signal
   |terminated by signal|


[valgrind] [Bug 361615] Inconsistent termination for multithreaded process terminated by signal

2016-09-24 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=361615

Philippe Waroquiers  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #14 from Philippe Waroquiers  ---
Fixed in revision 15982.

Note that the fix committed consists in calling
 VG_(reap_threads)(tid);
after nuke all threads.

Thanks for the bug report and the test case.

[valgrind] [Bug 369456] callgrind_control failed to find an active callgrind run.

2016-09-29 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=369456

Philippe Waroquiers  changed:

   What |Removed |Added
   -----+--------+------------------------------
   CC   |        |philippe.waroquiers@skynet.be

--- Comment #3 from Philippe Waroquiers  ---
I ported and tested vgdb on macOS some years ago, so the basics of vgdb
should work.
What is not done on macOS is the (optional) vgdb functionality (ptrace-based
on Linux) that takes a process out of a syscall.

Another difference between vgdb on Linux and macOS is that, on macOS, vgdb
cannot find the command line by opening "/proc/%d/cmdline" (with %d replaced
by the pid).
This means vgdb -l reports e.g.
use --pid=93572 for (could not open process command line)

This is probably the reason why callgrind_control does not work, as I guess
it expects something that matches:
 if (/^use --pid=(\d+) for \S*?valgrind\s+(.*?)\s*$/) {

So, I suppose that fixing the perl regexp there to also match the
"(could not open process command line)" text should allow callgrind_control
to work.
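
To make the two line shapes concrete, here is a minimal sketch in C (not the
actual perl code of callgrind_control) that recognises a vgdb -l line and
tells whether it carries a real command line or the "could not open"
placeholder. The function names vgdb_line_pid and vgdb_line_has_cmdline are
hypothetical, invented for this illustration; the placeholder text is the one
shown in the example output above.

```c
#include <stdio.h>
#include <string.h>

/* Placeholder printed when the command line cannot be read (copied from
 * the example output quoted in this comment). */
static const char NO_CMDLINE[] = "(could not open process command line)";

/* Returns the pid from a line shaped like "use --pid=<pid> for <rest>",
 * or -1 if the line does not have that shape.
 * Hypothetical helper, for illustration only. */
int vgdb_line_pid(const char *line)
{
    int pid = -1;
    int consumed = 0;   /* stays 0 unless the " for " part matched too */

    if (sscanf(line, "use --pid=%d for %n", &pid, &consumed) != 1
        || consumed == 0 || pid < 0)
        return -1;
    return pid;
}

/* Returns 1 if the line carries a real command line, 0 if it carries the
 * placeholder, an empty remainder, or is not a "use --pid=" line at all. */
int vgdb_line_has_cmdline(const char *line)
{
    int pid = -1;
    int consumed = 0;

    if (sscanf(line, "use --pid=%d for %n", &pid, &consumed) != 1
        || consumed == 0)
        return 0;
    if (line[consumed] == '\0')
        return 0;
    return strncmp(line + consumed, NO_CMDLINE, strlen(NO_CMDLINE)) != 0;
}
```

A caller could accept the pid in both cases and only check the tool name when
vgdb_line_has_cmdline() returns 1. Whether that is acceptable for
callgrind_control is a separate question.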

[valgrind] [Bug 369456] callgrind_control failed to find an active callgrind run.

2016-09-30 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=369456

Philippe Waroquiers  changed:

   What |Removed |Added
   -----+--------+-------------------
   CC   |        |rhysk...@gmail.com

--- Comment #5 from Philippe Waroquiers  ---
(In reply to Alex from comment #4)
> Fixing regex can help to find PID. Something like that (I'm not familiar
> with perl/regex):
> $ vgdb -l | perl -lpe'/^use --pid=(\d+) for [\S\s]+$/; print $1;'
> 30750
> use --pid=30750 for (could not open process command line)
> 
> But callgrind_control also check  that running tool is callgrind.
> There is no such info in the string returning from vgdb -l.
Yes, a proper fix requires a way to find the command line (or at least the
launched executable) on macOS.
No idea how to do that (and moreover I have no access to a macOS system).
Maybe Rhys has an idea?

[valgrind] [Bug 364058] array overruns are not detected

2016-06-08 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=364058

Philippe Waroquiers  changed:

   What       |Removed      |Added
   -----------+-------------+------------------------------
   CC         |             |philippe.waroquiers@skynet.be
   Status     |UNCONFIRMED  |RESOLVED
   Resolution |---          |INVALID

--- Comment #4 from Philippe Waroquiers  ---
Please (re-)read the user manual about sgcheck,
in particular the sections
   11.3. How SGCheck Works
and
   11.5. Limitations

That should (clearly?) explain why nothing is reported for your example
(false negative).

If the manual is not clear, please re-open the bug, e.g. suggesting what to
add to make it clearer.

[valgrind] [Bug 364058] array overruns are not detected

2016-06-08 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=364058

--- Comment #6 from Philippe Waroquiers  ---
(In reply to Sergey Meirovich from comment #5)
> Sorry. I indeed missed that. But why next also doesn't trigger any error
> message?
> 
> -bash-4.1$ cat t.c 
> int main(int c, char **o)
> {
>   int stack[2]; 
>   stack[0] = c;
>   stack[1] = c++;
>   stack[2] = c++;
>   return stack[2];
> }
exp-sgcheck associates (for each function call) each instruction with the
first array accessed by this instruction. It then checks that (during the
same function call) this instruction continues to access the same array
(and stays within the array bounds).
So, basically, this means that exp-sgcheck will only detect array overruns
or underruns in loops. It will never detect an overrun or underrun on
instructions executed only once (either because they are not in a loop, or
because the loop is executed once).
All these limitations derive from the fact that exp-sgcheck works at binary
level. It has to discover which array is accessed by an instruction 'at run
time'.
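
The rule described above can be illustrated with a toy model in C. This is
only a sketch of the idea, not exp-sgcheck's implementation: the types Array
and CallState, the function toy_access and the toy addresses are all
invented for this example. Each "instruction" remembers the first array it
touched and is only checked on later executions.

```c
#include <stddef.h>

#define MAX_INSNS 8

/* A known array occupying the address range [base, base + size). */
typedef struct { size_t base; size_t size; } Array;

/* Per function call: the array first accessed by each instruction. */
typedef struct { const Array *first[MAX_INSNS]; } CallState;

static int inside(const Array *a, size_t addr)
{
    return addr >= a->base && addr - a->base < a->size;
}

static const Array *find_array(const Array *arrays, size_t n, size_t addr)
{
    for (size_t i = 0; i < n; i++)
        if (inside(&arrays[i], addr))
            return &arrays[i];
    return NULL;
}

/* Simulates instruction `insn` accessing `addr`.
 * Returns 1 if this toy checker would report an error. */
int toy_access(CallState *st, const Array *arrays, size_t n,
               int insn, size_t addr)
{
    if (st->first[insn] == NULL) {
        /* Nothing remembered yet: associate and report nothing. */
        st->first[insn] = find_array(arrays, n, addr);
        return 0;
    }
    return !inside(st->first[insn], addr);
}

/* Models "int stack[2]" at toy address 100, with 4-byte ints. */
static const Array STACK = { 100, 8 };

/* Three different instructions, each executed once, as in t.c. */
int reports_straight_line(void)
{
    CallState st = { { NULL } };
    int reports = 0;
    for (int i = 0; i < 3; i++)
        reports += toy_access(&st, &STACK, 1, i, 100 + 4 * (size_t)i);
    return reports;
}

/* One instruction executed three times, as in a loop over i = 0..2. */
int reports_loop(void)
{
    CallState st = { { NULL } };
    int reports = 0;
    for (int i = 0; i < 3; i++)
        reports += toy_access(&st, &STACK, 1, 0, 100 + 4 * (size_t)i);
    return reports;
}
```

In this model the straight-line variant reports nothing, because the store to
stack[2] is the first (and only) execution of its instruction, while the loop
variant reports the third access.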

[valgrind] [Bug 364497] Run valgrind on nginx

2016-06-19 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=364497

Philippe Waroquiers  changed:

   What |Removed |Added
   -----+--------+------------------------------
   CC   |        |philippe.waroquiers@skynet.be

--- Comment #1 from Philippe Waroquiers  ---
This looks very similar to bug 356393, which was fixed in vex r3213.

Can you check by trying the svn version ?

Thanks

[valgrind] [Bug 364058] clarify in manual limitations of array overruns detections

2016-06-30 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=364058

Philippe Waroquiers  changed:

   What    |Removed                  |Added
   --------+-------------------------+---------------------
   Summary |array overruns are not   |clarify in manual
           |detected                 |limitations of array
           |                         |overruns detections

[valgrind] [Bug 364058] clarify in manual limitations of array overruns detections

2016-06-30 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=364058

Philippe Waroquiers  changed:

   What       |Removed      |Added
   -----------+-------------+----------
   Status     |UNCONFIRMED  |RESOLVED
   Resolution |---          |FIXED

--- Comment #8 from Philippe Waroquiers  ---
(In reply to Sergey Meirovich from comment #7)
> Thanks for the explanation. Is that could be concluded by implication from
> the manual?
IMO, effectively, reading the manual leads to this.
But I have in any case added a few sentences to make this even more clear.
Committed revision 15897.

[valgrind] [Bug 199468] Suppressions: stack size limited to 25 while --num-callers allows more frames

2016-09-07 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=199468

Philippe Waroquiers  changed:

   What    |Removed                     |Added
   --------+----------------------------+--------------------------
   Summary |Suppressions: stack size    |Suppressions: stack size
           |limited to 25 while         |limited to 25 while
           |--num-callers allows up to  |--num-callers allows more
           |50 frames                   |frames

[valgrind] [Bug 199468] Suppressions: stack size limited to 25 while --num-callers allows more frames

2016-09-07 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=199468

Philippe Waroquiers  changed:

   What       |Removed      |Added
   -----------+-------------+----------
   Resolution |---          |FIXED
   Status     |UNCONFIRMED  |RESOLVED

--- Comment #7 from Philippe Waroquiers  ---
Committed revision 15945.
With this commit, a suppression entry can have a number of callers up to the
maximum of --num-callers (500). Generated suppressions are generated with up
to --num-callers entries.

[valgrind] [Bug 368461] mmapunmap test fails on ppc64

2016-09-10 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=368461

Philippe Waroquiers  changed:

   What |Removed |Added
   -----+--------+------------------------------
   CC   |        |philippe.waroquiers@skynet.be

--- Comment #2 from Philippe Waroquiers  ---
(In reply to Carl Love from comment #1)
> Created attachment 100988 [details]
> fix for mmapunmap test on ppc64
> 
> The attached patch fixes the post failure on ppc64.  The patch was submitted
> by Will Schmidt
> 
> Will post the patch to the developers mailing list for review as it touches
> arch independent files.
This looks good to me.

[valgrind] [Bug 368416] Add tc06_two_races_xml.exp output for ppc64.

2016-09-10 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=368416

Philippe Waroquiers  changed:

   What |Removed |Added
   -----+--------+------------------------------
   CC   |        |philippe.waroquiers@skynet.be

--- Comment #2 from Philippe Waroquiers  ---
Adding a specific ppc64 file is for sure something that makes the test pass.
However, this means we have (almost) duplicated files to maintain.
Wouldn't it be possible to instead write a filter for the xml (similar to
the filter for the text output) that removes the ..._WRK ?
If that is reasonably easy, the filter approach is preferable to a
duplicated exp file.

[valgrind] [Bug 368412] False positive result for altivec capability check

2016-09-10 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=368412

Philippe Waroquiers  changed:

   What |Removed |Added
   -----+--------+------------------------------
   CC   |        |philippe.waroquiers@skynet.be

--- Comment #2 from Philippe Waroquiers  ---
Comment on attachment 100973
  --> https://bugs.kde.org/attachment.cgi?id=100973
Fix false positive check for altivec support on PPC machines

Can't really comment on this (autoconf is somewhat mysterious to me) but I
guess this is ok.

[valgrind] [Bug 368416] Add tc06_two_races_xml.exp output for ppc64.

2016-09-13 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=368416

--- Comment #4 from Philippe Waroquiers  ---
(In reply to Will Schmidt from comment #3)
> Created attachment 101061 [details]
> updated and refactored patch
> 
> Per feedback and commentary, this is a different approach to solving the
> problem...
> 
> Update and modify the filter_xml filter to strip out the troublesome frame. 
> And a tweak to the same to suppress the blurb that typically indicates a
> frame has been skipped.
> Appears to work OK across the systems I have access to.  (a mix of ppc64 and
> a couple x86 boxes).

That looks good to me. Feel free to commit.
Thanks

[valgrind] [Bug 361615] Inconsistent termination when an instrumented multithreaded process is terminated by signal

2016-09-13 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=361615

--- Comment #7 from Philippe Waroquiers  ---
(In reply to Julian Seward from comment #6)
> Philippe, didn't you fix something like this recently?
Not recently.
Related to thread termination, I did something some years ago, to fix bug
324227.
I can take a look at this bug and see if I find something.

[valgrind] [Bug 361615] Inconsistent termination when an instrumented multithreaded process is terminated by signal

2016-09-13 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=361615

--- Comment #8 from Philippe Waroquiers  ---
(In reply to earl_chew from comment #4)
> Created attachment 100145 [details]
> Self contained test case

It seems there was an attachment problem.
Could you re-attach the self contained test case ?
Thanks

[valgrind] [Bug 366035] valgrind misses buffer overflow, segfaults in malloc in localtime

2016-09-14 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=366035

--- Comment #13 from Philippe Waroquiers  ---
(In reply to Julian Seward from comment #12)
> Philippe, is there anything we can or should do here?
The current hypothesis is that an ioctl used by alsa or a sound library is
not properly handled by syswrap: the syscall wrapper does not see that the
kernel will overwrite a memory zone.

However, confirming this (and fixing the wrapper) requires compiling the
test program (probably not very difficult, but still it does not compile out
of the box: some include and/or dev libs are needed).
After that, some knowledge of the sound ioctls is needed to see if/what is
wrong.

My knowledge of the sound syscalls is close to 0, so I cannot do this
without a significant amount of time (that I do not have).

So, in summary, yes, there are things to do, but nobody has time to
volunteer for the moment.

[valgrind] [Bug 360008] Contents of Power vr registers contents is not printed correctly when the --vgdb-shadow-registers=yes option is used.

2016-03-29 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=360008

--- Comment #2 from Philippe Waroquiers  ---
Sorry, I saw this bug but forgot to work on it, thanks for the reminder.

Here is what I think is wrong:
valgrind on ppc64 implements a ppc64 version that provides the registers
vs0 .. vs63.
The 'old' vr0 .. vr31 registers are mapped to vs32 .. vs63.
The 'old' floating point registers f0 .. f31 are mapped to the 'low' part of
the registers vs0 .. vs31.
The 'high' part of vs0 .. vs31 is new.

However, the xml files describing the ppc64 target are still describing only
the 'old' ppc64 architecture.

When gdb connects to valgrind with --vgdb-shadow-registers=no, valgrind
gives gdb an xml file that describes the old arch.
I suspect that when giving --vgdb-shadow-registers=yes, the reply packet
does not contain the values as expected by gdb, due to the missing 'new
registers high parts' of vs0 .. vs31.
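Looking at the recent gdb descriptions of power, we see the following:

  [XML target description not preserved in this archive; only the
   architecture name, powerpc:common64, survives]

The valgrind xml target is missing the power-vsx.xml, that describes the

  [XML feature description not preserved in this archive]
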

So, I think that adding power-vsx.xml and the shadow power-vsx-valgrind-s1
and s2.xml will solve the problem.
Note that gdb understands how to rebuild vs0 .. vs31 from the f0 .. f31 and
vs0h .. vs31h registers. I think gdb will not understand how to rebuild
vs0s1 .. vs31s1 from f0s1 and vs0hs1 (a problem similar to the intel avx
registers, described in the user manual in
3.2.7. Examining and modifying Valgrind shadow registers).

[valgrind] [Bug 360008] Contents of Power vr registers contents is not printed correctly when the --vgdb-shadow-registers=yes option is used.

2016-03-30 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=360008

--- Comment #3 from Philippe Waroquiers  ---
Some more info:
* As discussed in comment 2, the vs registers high parts should be described.
  It is however not clear to me if these registers should always be reported
  to GDB or if they should be reported only depending on some hwcaps.
  If that is the case, then some logic similar to 'have_avx()' in
  valgrind-low-amd64.c should be implemented, so as to choose an xml file
  that only reports the registers present in the currently used arch.

* On top of adding the xml file for the vs registers, I suspect that it will
  be needed to either fix or remove the regnum notation in the s1 and s2 xml
  files: in the x86 and amd64 xml files, either the regnum components were
  updated in the s1 and s2 xml files, or the regnum components were removed.
  I see that many xml files (e.g. for power64, but also e.g. for mips) have
  kept the regnum of the 'original non shadow' xml file. Having such regnum
  is for sure also causing problems.
  It might even be the main cause for the vr shadow registers being incorrect.

Finally, a few hints that can help investigating:
* in GDB, you can use the command
 maint print remote-registers
  to see the definition of registers as understood by gdb.
  The columns 'Rmt Nr  g/G Offset' give the protocol register number and the
  offset in the g/G packets.

* to examine directly the valgrind threads registers, you can do in gdb+vgdb:
 mo v.set h
  then
add-symbol-file  0x3800

  Then e.g. to print a register of thread 1, you can from gdb do
 (gdb) p /x vgPlain_threads[1].arch.vex.guest_GPR13
 $2 = 0x0
 (gdb) p /x vgPlain_threads[1].arch.vex_shadow1.guest_GPR13
 $3 = 0x0
(gdb) 
(all valgrind global variables can be examined once host visibility was
activated using
   mo v.set h)

[valgrind] [Bug 360008] Contents of Power vr registers contents is not printed correctly when the --vgdb-shadow-registers=yes option is used.

2016-04-17 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=360008

--- Comment #5 from Philippe Waroquiers  ---
I (quickly) read the patch (did not test), a few minor comments/questions:

* typo in power64-core.xml: regect -> reject
* powerpc-altivec64l-valgrind.xml: I am not sure I fully understand why we
  have 2 new includes for power64-core2-valgrind-s1.xml and
  power64-core2-valgrind-s2.xml, but there was no addition of
  power64-core2.xml after power-fpu.xml: normally, the s1 and s2 files
  should be similar in structure to the 'non shadow' register files.
  I see that the power64-core2*xml files are defining the shadow registers
  for e.g. pc/msr/cr while the equivalent non shadow registers are in
  power64-core.xml.
  It is unclear to me why the shadow registers for these cannot be defined
  in files that are 'similar in structure' to the non shadow files.
* valgrind-low-ppc64.c: typos:
    fpmap -> fp map
    lower lower -> lower
    psuedo -> pseudo  (twice this typo)
    remove final , after +  { "vs31h", 10720, 64 },
* valgrind-low-ppc64.c: we have a bunch of lines that are (almost)
  duplicated for big/little endian.
  As far as I can see, the only difference is that the offset is [0] or [2].
  So, if this is really the only difference, it would be better to use
  something like
     [offset]
  and have offset set to 0 or 2, depending on endianness.
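
As a rough illustration of the last point, here is a sketch in C. It is not
taken from the patch: the type Reg128 and the functions below are
hypothetical, and which endianness gets index 0 and which gets index 2 is an
assumption made only for the example (the patch decides that).

```c
#include <stdint.h>

/* Hypothetical image of a 128-bit register as four 32-bit words. */
typedef struct { uint32_t w32[4]; } Reg128;

/* Run-time endianness probe: looks at the first byte of a 16-bit 1. */
int host_is_big_endian(void)
{
    const uint16_t probe = 1;
    return *(const uint8_t *)&probe == 0;
}

/* Index of the first 32-bit word of the wanted 64-bit half.
 * ASSUMPTION for illustration: 0 on little endian, 2 on big endian. */
int half_word_index(int big_endian)
{
    return big_endian ? 2 : 0;
}

/* One accessor serves both layouts, instead of two duplicated blocks
 * that differ only in the constant index. */
const uint32_t *half_addr(const Reg128 *reg, int big_endian)
{
    return &reg->w32[half_word_index(big_endian)];
}
```

With something like this, the index would be computed once, e.g. from
host_is_big_endian(), and the per-register lines would be written only once.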

Thanks for looking at this

[valgrind] [Bug 362009] Valgrind dumps core on unimplemented functionality before threads are created

2016-04-20 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=362009

Philippe Waroquiers  changed:

   What |Removed |Added
   -----+--------+------------------------------
   CC   |        |philippe.waroquiers@skynet.be

--- Comment #2 from Philippe Waroquiers  ---
If show sched status is called before the threads are created, then nothing
will be visible.
Maybe it would be better to do something like:

   if (VG_(threads) == NULL) {
      VG_(printf) ("cannot show sched status : scheduler not yet initialised\n");
      return;
   }
   ... here the old code ...
rather than report nothing ?

[valgrind] [Bug 360008] Contents of Power vr registers contents is not printed correctly when the --vgdb-shadow-registers=yes option is used.

2016-04-20 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=360008

--- Comment #9 from Philippe Waroquiers  ---
Re-reading the comment at the beginning of valgrind-low-ppc64.c, the register
map is still not clear to me.
The comment says there are 64 VSR registers of 64 bits.
But afterwards, the same comment says that "The 32 vr[0] to vr[31] registers
of size 128-bits map to VSR[31] to VSR[63].".
Probably a VSR register is 128 bits.

The second not very clear thing is for the fp registers: when the comment
says that 'however, these are not "real" floating point registers': is that
speaking about fp[0..31] or rather about the upper 64 bits of vsr[32] to
vsr[63] ? I guess the latter, so maybe say something like: floating point
instructions are referencing the fp registers, and not the vsr registers.

Finally, the last paragraph says 'the 32 floating point registers (AKA
VSR[0] to VSR[31])', which contradicts the previous paragraph that said the
fp registers are in vsr[32..63] upper bits.


In the code that initialises low_offset and high_offset: for VG_BIGENDIAN,
the comment says that the 64-bits are stored as Little Endian. Are the values
really stored as little endian, even on a big endian platform ? The little
endian comment says everything is little endian (so sounds logical), but the
big endian case seems half big/half little.
If this is really the case, then maybe add a sentence such as: "In the big
endian case, a little endian convention is still partially used."

For the xml core2 file: after re-reading the comment and playing with gdb on
gcc110, I finally understood :). GDB is somewhat bizarre: it prints the
general registers, then the fp registers 0 .. 31, then pc etc, even if the
xml file defines them in the order general registers followed by pc etc
followed by fp 0 .. 31.
Thanks for clarifying this.

Otherwise, patch looks good to me. Up to you to see which additional updates
you do for the comments above, but feel free to commit.

Thanks

[valgrind] [Bug 199468] Suppressions: stack size limited to 25 while --num-callers allows up to 50 frames

2016-08-10 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=199468

Philippe Waroquiers  changed:

   What |Removed |Added
   -----+--------+------------------------------
   CC   |        |philippe.waroquiers@skynet.be

--- Comment #2 from Philippe Waroquiers  ---
(In reply to Robin Kuzmin from comment #1)
> I have the same issue.
> I have a suppression:
> {
>google::protobuf::Message::PrintDebugString() const (text_format.cc:110)
>Memcheck:Leak
>match-leak-kinds: reachable
>fun:_Znwm
>...
>fun:_ZNK6google8protobuf7Message11DebugStringEv
>fun:_ZNK6google8protobuf7Message16PrintDebugStringEv
>...
> }
> This suppression is ignored for "--num-callers=20", but works fine
> (suppresses) for "--num-callers=40". 
> I didn't expect the suppressions to depend on "--num-callers".

I think that we have 2 different problems:
The bug speaks about the fact that a suppression is limited to 24 entries,
while the --num-callers can now (in valgrind 3.11) go up to 500.

Your problem is that a suppression works with 40 num callers, but does not
suppress for 20 num callers. This is not abnormal: if for example
fun:_ZNK6google8protobuf7Message11DebugStringEv is at depth 30, when you use
--num-callers=20, then this function will not be recorded in the error
stacktrace and the suppression will not match.
This is because the logic is:
  * first an error is made, with a stacktrace recorded as specified with
--num-callers
  * after that, the suppression entries are tried.
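
The effect of that ordering can be shown with a small toy model in C. This is
a sketch only: real suppression matching (wildcards, "..." entries, etc.) is
much richer, and the names recorded_contains and demo_suppression_matches, as
well as the synthetic 40-frame stack, are invented for this example.

```c
#include <string.h>

#define FULL_DEPTH 40   /* depth of the synthetic call stack */

/* Returns 1 if `wanted` appears among the frames that would be recorded,
 * i.e. the first `num_callers` frames of the full stack (innermost first). */
int recorded_contains(const char *const *stack, int depth,
                      int num_callers, const char *wanted)
{
    int recorded = depth < num_callers ? depth : num_callers;
    for (int i = 0; i < recorded; i++)
        if (strcmp(stack[i], wanted) == 0)
            return 1;
    return 0;
}

/* Builds a synthetic stack where the function named in the suppression
 * sits at depth 30, then checks it against the recorded part only. */
int demo_suppression_matches(int num_callers)
{
    static const char wanted[] =
        "_ZNK6google8protobuf7Message11DebugStringEv";
    const char *stack[FULL_DEPTH];

    for (int i = 0; i < FULL_DEPTH; i++)
        stack[i] = "some_other_function";
    stack[29] = wanted;   /* depth 30, counting the innermost frame as 1 */

    return recorded_contains(stack, FULL_DEPTH, num_callers, wanted);
}
```

In this model, demo_suppression_matches(20) finds nothing because the frame
at depth 30 was never recorded, while demo_suppression_matches(40) matches.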

If suppressions had to be done without taking --num-callers into account,
then all error stacktraces would first have to be recorded without limit,
and then, after (unsuccessful) suppression matching, the frames exceeding
--num-callers would have to be dropped.

The original problem (suppressions limited to 24 entries while errors can
record up to 500 frames) is somewhat strange. I could not retrieve any
comment or svn log describing why suppression entries are limited to 24
entries. --num-callers used to be limited to 50.

If we increase the nr of entries in suppressions to 500 (the max value for
--num-callers), this will however have as a side effect that
--gen-suppressions will produce bigger entries, while they were limited to
24 entries. I guess this is not a problem, as in any case, for many
suppression entries, 24 was already a lot (e.g. for suppression entries
intended to suppress errors in libraries).

[valgrind] [Bug 199468] Suppressions: stack size limited to 25 while --num-callers allows up to 50 frames

2016-08-11 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=199468

--- Comment #5 from Philippe Waroquiers  ---
(In reply to Robin Kuzmin from comment #4)
> I have run through this discussion more thoroughly. My understanding how
> valgrind works (or should work) has changed. Here is my vision.
> 
> In order to avoid multiple reports of the same error, valgrind uses top 4
> stack frames to common up the errors (see description of "--num-callers" at
> http://valgrind.org/docs/manual/manual-core.html#manual-core.options). 
> 
> Valgrind (memcheck) deals with (at least) 2 types of the stack traces:
> 
> 1. The Short-term Stack Traces. They are reported immediately upon error,
> e.g. when the analyzed program accesses outside the heap-allocated block. At
> this stage the entire stack trace can be used for suppressions (both for
> applying the suppressions and for generating the suppression with
> --gen-suppressions). After suppressions, if/when reporting these stack
> traces to the console the top --num-callers frames are (or can be) printed.
> Then these stack traces are (or can be) forgotten by valgrind (and only the
> top 4 stack frames are (or can be) saved to common up the errors). Thus for
> the Short-term Stack Traces there is no need for the suppressions to depend
> on the --num-callers. Implementing this mechanism will be relatively cheap
> (the price will be an INsignificant lower down in performance and
> INsignificant (and short-term) increase in resource requirements).
> 
> 2. The Long-term Stack Traces. These stack traces valgrind has to store in
> memory for a long period of time, and potentially till the end of the
> analyzed program. E.g. when the analyzed program makes an allocation, the
> allocation stack trace has to be kept until the CORRECT deallocation (if the
> CORRECT deallocation Ever happens) or until the termination (if the CORRECT
> deallocation Never happens). For the Long-term Stack Traces valgrind can try
> to APPLY suppressions immediately (upon allocation) to the entire stack
> trace. If any of the suppressions is applicable then the top 4 stack frames
> are stored to common up the errors (and the stack trace can be marked as "to
> be suppressed by suppression N"), otherwise (none of the suppressions apply)
> the number of stack frames to be saved depends on the --gen-suppressions. If
> "no" then the --num-callers stack frames are stored, otherwise the entire
> stack trace is stored (and the entire stack trace is used for generating the
> suppression (if the CORRECT deallocation never happens), and only the
> --num-callers stack frames are used for reporting the error to the console).
> 
> Thus the suppressions might be independent of the --num-callers.

For what you call 'long term stacktraces': capturing a big stacktrace is
costly by itself.
A lot of effort was spent to optimise this, but still this is costly. So,
always capturing an (unlimited) stack trace is not desirable.
But applying the suppression entries is even (a lot) more costly. E.g. it
implies translating an IP address into a function name.
A lot of effort was spent already to make this suppression matching faster
(e.g. using lazy completers, see the is_suppressible_error function in
m_errormgr.c).
But such suppression matching can clearly not be done all the time (e.g. for
all allocations; in memcheck or in some tools such as helgrind, even for all
memory accesses).

So, what you suggest cannot be implemented without significantly impacting
performance.

With that in mind, I think the best approach for you is just to use
suppressions that work properly with the --num-callers you are using.

As for the original problem (suppression entries limited to 24 entries):
I guess we could/should use the same max size as the max value for
--num-callers.

[valgrind] [Bug 357037] Line numbers are occasionally displayed incorrectly in intel-compiled applications

2015-12-30 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=357037

Philippe Waroquiers  changed:

   What |Removed |Added
   -----+--------+------------------------------
   CC   |        |philippe.waroquiers@skynet.be

--- Comment #1 from Philippe Waroquiers  ---
It would be good to analyse the debug info generated by icc
e.g. using objdump
and/or using gdb, e.g.  info line 5/6/7
and info line *0x..
and/or the valgrind gdbserver monitor command   v.info location <addr>
   (where addr is an address that should be part of the line 5)

Alternatively, it might be the unwind info that is not ok and/or the
valgrind unwinder.
You might investigate that by using gdb+vgdb, and putting a break at
vg_replace_malloc.c:299.
You can then compare the gdb unwinder (using the bt gdb command)
with the valgrind unwinder (using monitor v.info scheduler)

All the above might give some hints about what is going wrong.

[valgrind] [Bug 358030] support direct socket calls on x86 32bit (new in linux 4.3)

2016-01-15 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=358030

Philippe Waroquiers  changed:

   What |Removed |Added
   -----+--------+------------------------------
   CC   |        |philippe.waroquiers@skynet.be

--- Comment #2 from Philippe Waroquiers  ---
(In reply to Ron from comment #1)
> Created attachment 96657 [details]
> Patch that adds the direct socket syscall definitions for x86
Thanks for the patch, which seems reasonable (but quick reading only :).
Have you run the regression tests with your patch ?
The testsuite has a bunch of socket related tests, so if there are a lot of
failures without your patch, and a lot fewer failures with your patch, then
that will help to see the patch is correct/needed.
Maybe also memcheck/tests/x86-linux/scalar.c should/could be modified ?

[valgrind] [Bug 357871] pthread_spin_destroy not properly wrapped

2016-01-17 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=357871

Philippe Waroquiers  changed:

   What       |Removed      |Added
   -----------+-------------+------------------------------
   CC         |             |philippe.waroquiers@skynet.be
   Resolution |---          |FIXED
   Status     |UNCONFIRMED  |RESOLVED

[valgrind] [Bug 356457] valgrind: m_mallocfree.c:2042 (vgPlain_arena_free): Assertion 'blockSane(a, b)' failed.

2016-01-22 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=356457

--- Comment #13 from Philippe Waroquiers  ---
(In reply to Joost VandeVondele from comment #12)
Thanks for this data.

The warning about the stack switch is normal: valgrind has a heuristic to
detect a stack switch. If a program uses huge stackframes, then a call can
be confused with a stack switch, and this warning indicates what to do if
this is *not* a stack switch (but in a self-hosting setup, such a message is
normal: it is a stack switch, not a huge frame).
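
For readers not familiar with this heuristic, its principle can be sketched
in a few lines of C. This is a simplified model and not the valgrind code:
the function name looks_like_stack_switch is invented, and treating a change
exactly equal to the limit as "not a switch" is an assumption of the sketch.
The threshold corresponds to the --max-stackframe option.

```c
#include <stdint.h>

/* Toy model: if the stack pointer moves by more than max_stackframe bytes
 * in one step, assume the program switched to another stack rather than
 * allocating or releasing one (huge) frame. */
int looks_like_stack_switch(uintptr_t old_sp, uintptr_t new_sp,
                            uintptr_t max_stackframe)
{
    uintptr_t delta = old_sp > new_sp ? old_sp - new_sp : new_sp - old_sp;
    return delta > max_stackframe;
}
```

In this model, a frame of 2168152 bytes looks like a stack switch with a
limit of 2000000, and is accepted as a normal frame once the limit is raised
to 2168152, which is the role of --max-stackframe=2168152 in the quoted
command line.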

Sadly, no error is detected by the self-hosting.

So, there are a few more things you could try:
Run (no need to self host, just a normal run) but add the option
 --sanity-level=4
Valgrind will do (more) sanity checks while running, and maybe this might
give a hint.

Another thing is to add the option --vgdb-stop-at=valgrindabexit
when you run all your regression tests. And then, when the problem
reproduces, valgrind will stop and wait for a gdb to connect.
Then attach with gdb to the valgrind process and do e.g.
   bt full
Also go to the frame (image.c:778) and do
   print img->ces[i]->off
   print img->ces[i]->used

You might also print all the not null ces entries.

But we are really trying to kill the bug by shooting in the dark :(


> Since the error is recurring, I have now tried to run the self-hosting.
> Running :
> 
> /data/vjoost/test/outer/install/bin/valgrind --sim-hints=enable-outer
> --trace-children=yes --smc-check=all-non-file --run-libc-freeres=no
> --tool=memcheck -v /data/vjoost/test/inner/install/bin/valgrind
> --suppressions=/data/vjoost/toolchain-r16494/install/valgrind.supp
> --max-stackframe=2168152 --error-exitcode=42 --vgdb-prefix=./inner
> --core-redzone-size=1000 --tool=memcheck -v
> /data/schuetto/auto_regtesting/regtests/cp2k/exe/local_valgrind/cp2k.sdbg
> ethanol_both_rcut10.0_e1-1_v1-4_RSR.inp
> 
> (I.e. self-hosting with added redzone, on the our executable corresponding
> to a failed run, with its arguments and parameters), I get a seemingly
> correct run. The output will be attached as out.innerouter.2 . Maybe it is
> worthwhile to look with expert eyes.
> 
> However, after observing in that output a warning on stack switching, I
> added --max-stackframe=68009224472 (as suggested, seems a bit large;-), and
> that lead to a run with some other error (Memcheck: the 'impossible'
> happened:   create_MC_Chunk: shadow area is accessible).


[valgrind] [Bug 359133] m_deduppoolalloc.c:258 (vgPlain_allocEltDedupPA): Assertion 'eltSzB <= ddpa->poolSzB' failed.

2016-02-20 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=359133

--- Comment #10 from Philippe Waroquiers  ---
(In reply to David Hallas from comment #9)
> So, should I go ahead and close the bug now that a testcase has been added?

Status was changed to RESOLVED/FIXED which seems to be the final status of
valgrind bugs.


[valgrind] [Bug 349128] Access not within mapped region in _pthread_find_thread (OS X 10.11)

2016-02-22 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=349128

Philippe Waroquiers  changed:

   What|Removed |Added

 CC||philippe.waroquiers@skynet.
   ||be


[valgrind] [Bug 359705] memcheck causes segfault on a dynamically-linked test from rustlang's test suite on i686

2016-02-27 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=359705

Philippe Waroquiers  changed:

   What|Removed |Added

 CC||philippe.waroquiers@skynet.
   ||be

--- Comment #6 from Philippe Waroquiers  ---
Seeing the first 2 lines of output:
  ==6449== Can't extend stack to 0x4bb9880 during signal delivery for thread 2: 
  ==6449== no stack segment 
it might be worth trying with
--vex-iropt-register-updates=allregs-at-each-insn

just in case your test case does special things with signals.


[valgrind] [Bug 360008] Contents of Power vr registers contents is not printed correctly when the --vgdb-shadow-registers=yes option is used.

2016-03-29 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=360008

--- Comment #2 from Philippe Waroquiers  ---
Sorry, I saw this bug but forgot to work on it, thanks for the reminder.

Here is what I think is wrong:
valgrind on ppc64 implements a ppc64 version that provides the registers
vs0 .. vs63.
The 'old' vr0 .. vr31 registers are mapped to vs32 .. vs63.
The 'old' floating point registers f0 .. f31 are mapped to the 'low' part of
the registers vs0 .. vs31.
The 'high' part of vs0 .. vs31 is new.

However, the xml files describing the ppc64 target still describe only
the 'old' ppc64 architecture.

When gdb connects to valgrind with --vgdb-shadow-registers=no, valgrind gives
gdb an xml file that describes the old arch.
I suspect that with --vgdb-shadow-registers=yes, the reply packet does not
contain the values as expected by gdb, due to the missing 'new registers
high parts' of vs0 .. vs31.

Looking at the recent gdb descriptions of power, we see the following:


  powerpc:common64
  
  
  
  
  


The valgrind xml target is missing the power-vsx.xml, that describes the



  
  


So, I think that adding power-vsx.xml and the shadow files
power-vsx-valgrind-s1.xml and power-vsx-valgrind-s2.xml will solve the problem.
Note that gdb understands how to rebuild vs0 .. vs31 from the f0 .. f31 and
vs0h .. vs31h registers. I think gdb will not understand how to rebuild
vs0s1 .. vs31s1 from f0s1 and vs0hs1 (a problem similar to the one for the
intel avx registers, described in the user manual in
3.2.7. Examining and modifying Valgrind shadow registers).


[valgrind] [Bug 360008] Contents of Power vr registers contents is not printed correctly when the --vgdb-shadow-registers=yes option is used.

2016-03-30 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=360008

--- Comment #3 from Philippe Waroquiers  ---
Some more info:
* As discussed in comment 2, the high parts of the vs registers should be
  described.
  It is however not clear to me whether these registers should always be
  reported to GDB, or only depending on some hwcaps.
  If they depend on hwcaps, then some logic similar to 'have_avx()' in
  valgrind-low-amd64.c should be implemented, so as to choose an xml file
  that only reports the registers present in the currently used arch.

* On top of adding the xml file for the vs registers, I suspect that it will be
  necessary to either fix or remove the regnum notation in the s1 and s2 xml
  files: in the x86 and amd64 s1 and s2 xml files, the regnum components were
  either updated or removed.
  I see that many xml files (e.g. for power64, but also e.g. for mips) have
  kept the regnum of the 'original non shadow' xml file. Having such a regnum
  is surely also causing problems.
  It might even be the main cause of the vr shadow registers being incorrect.

Finally, a few hints that can help with the investigation:
* in GDB, you can use the command
     maint print remote-registers
  to see the definition of registers as understood by gdb.
  The columns 'Rmt Nr  g/G Offset' give the protocol register number and
  the offset in the g/G packets.

* to examine directly the valgrind threads registers, you can do in gdb+vgdb:
 mo v.set h
  then
add-symbol-file  0x3800

  Then e.g. to print a register of thread 1, you can from gdb do
 (gdb) p /x vgPlain_threads[1].arch.vex.guest_GPR13
 $2 = 0x0
 (gdb) p /x vgPlain_threads[1].arch.vex_shadow1.guest_GPR13
 $3 = 0x0
(gdb) 
(All valgrind global variables can be examined once host visibility has been
activated using 'mo v.set h'.)


[valgrind] [Bug 360008] Contents of Power vr registers contents is not printed correctly when the --vgdb-shadow-registers=yes option is used.

2016-04-17 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=360008

--- Comment #5 from Philippe Waroquiers  ---
I (quickly) read the patch (did not test), a few minor comments/questions:

* power64-core.xml : typo: regect -> reject
* powerpc-altivec64l-valgrind.xml : I am not sure I fully understand why we
  have 2 new includes for power64-core2-valgrind-s1.xml and
  power64-core2-valgrind-s2.xml, but no addition of power64-core2.xml after
  power-fpu.xml : normally, the s1 and s2 files should be similar in structure
  to the 'non shadow' register files.
  I see that the power64-core2*xml files define the shadow registers for
  e.g. pc/msr/cr, while the equivalent non shadow registers are in
  power64-core.xml.
  It is unclear to me why the shadow registers for these cannot be defined in
  files that are 'similar in structure' to the non shadow files.
* valgrind-low-ppc64.c : typos:
     fpmap -> fp map
     lower lower -> lower
     psuedo -> pseudo  (twice this typo)
  and remove the final , after +  { "vs31h", 10720, 64 },
* valgrind-low-ppc64.c : we have a bunch of lines that are (almost) duplicated
  for big/little endian.
  As far as I can see, the only difference is that the offset is [0] or [2].
  So, if this is really the only difference, it would be better to use
  something like
     [offset]
  and have offset set to 0 or 2, depending on endianness.
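
To make the [offset] suggestion concrete, here is a minimal sketch. I have not
tested it against the patch: the Reg128 layout and all names below are
hypothetical, and which endianness maps to 0 and which to 2 is only
illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: a 128-bit register held as four 32-bit words. */
typedef struct { uint32_t w32[4]; } Reg128;

/* Word index of the 64-bit half to transfer. The mapping (big endian -> 0,
   little endian -> 2) is an assumption made for this example only. */
static int half_word_offset(int big_endian)
{
   return big_endian ? 0 : 2;
}

/* One code path for both endiannesses: only the index differs. */
static void *half_ptr(Reg128 *r, int big_endian)
{
   assert(r != NULL);
   return &r->w32[half_word_offset(big_endian)];
}
```

The duplicated big/little endian lines then collapse into a single line that
indexes with the computed offset.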

Thanks for looking at this


[valgrind] [Bug 362009] Valgrind dumps core on unimplemented functionality before threads are created

2016-04-20 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=362009

Philippe Waroquiers  changed:

   What|Removed |Added

 CC||philippe.waroquiers@skynet.
   ||be

--- Comment #2 from Philippe Waroquiers  ---
If show sched status is called before the threads are created, then nothing
will be visible.
Maybe it would be better to do something like:

   if (VG_(threads) == NULL) {
      VG_(printf)("cannot show sched status: scheduler not yet initialised\n");
      return;
   }
   ... here the old code ...
rather than report nothing?


[valgrind] [Bug 360008] Contents of Power vr registers contents is not printed correctly when the --vgdb-shadow-registers=yes option is used.

2016-04-20 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=360008

--- Comment #9 from Philippe Waroquiers  ---
Re-reading the comment at the beginning of valgrind-low-ppc64.c, the register
map is still not clear to me.
The comment says there are 64 VSR registers of 64 bits.
But afterwards, the same comment says that "The 32 vr[0] to vr[31] registers of
size 128-bits map to VSR[31] to VSR[63].".
Probably a VSR register is 128 bits.

The second unclear thing concerns the fp registers: when the comment says
that 'however, these are not "real" floating point registers', is that
speaking about fp[0..31], or rather about the upper 64 bits of vsr[32] to
vsr[63]? I guess the latter, so maybe say something like: floating point
instructions reference the fp registers, and not the vsr registers.

Finally, the last paragraph says 'the 32 floating point registers  (AKA VSR[0]
to VSR[31])', which contradicts the previous paragraph, which said that the fp
registers are in the upper bits of vsr[32..63].


In the code that initialises low_offset and high_offset: for VG_BIGENDIAN, the
comment says that the 64-bits are stored as Little Endian. Are the values
really stored as little endian, even on a big endian platform? The little
endian comment says everything is little endian (so sounds logical), but the
big endian case seems half big/half little.
If this is really the case, then maybe add a sentence such as: "In the big
endian case, a little endian convention is still partially used."

For the xml core2 file: after re-reading the comment and playing with gdb on
gcc110, I finally understood :). GDB is somewhat bizarre: it prints the general
registers, then the fp registers 0 .. 31, then pc etc, even if the xml file
defines them in the order general registers, followed by pc etc, followed by
fp 0 .. 31.
Thanks for clarifying this.

Otherwise, the patch looks good to me. Up to you to decide which additional
updates you make for the comments above, but feel free to commit.

Thanks


[valgrind] [Bug 364058] array overruns are not detected

2016-06-08 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=364058

Philippe Waroquiers  changed:

   What|Removed |Added

 CC||philippe.waroquiers@skynet.
   ||be
 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |INVALID

--- Comment #4 from Philippe Waroquiers  ---
Please (re-)read the user manual about sgcheck,
in particular the sections
   11.3. How SGCheck Works
and 11.5. Limitations

That should (clearly?) explain why nothing is reported for your example
(false negative).

If the manual is not clear, please re-open the bug, e.g. suggesting what to add
to make it clearer.


[valgrind] [Bug 364058] array overruns are not detected

2016-06-08 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=364058

--- Comment #6 from Philippe Waroquiers  ---
(In reply to Sergey Meirovich from comment #5)
> Sorry. I indeed missed that. But why next also doesn't trigger any error
> message?
> 
> -bash-4.1$ cat t.c 
> int main(int c, char **o)
> {
>   int stack[2]; 
>   stack[0] = c;
>   stack[1] = c++;
>   stack[2] = c++;
>   return stack[2];
> }
exp-sgcheck associates (for each function call) each instruction with the first
array accessed by this instruction. It then checks that (during the same
function call) this instruction continues to access the same array (and stays
within the array bounds).
So, basically, this means that exp-sgcheck will only detect array over- or
under-runs in loops. It will never detect an over/under-run in instructions
executed only once (either because they are not in a loop, or because the loop
is executed once).
All these limitations derive from the fact that exp-sgcheck works at the binary
level. It has to discover which array is accessed by an instruction 'at run
time'.
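
The behaviour described above can be mimicked with a toy model (a sketch for
illustration only, not exp-sgcheck's actual data structures; ArrayBlock,
InsnState and access_is_error are names invented for this example):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* A known stack or global array: start address and size in bytes. */
typedef struct { uintptr_t start; size_t size; } ArrayBlock;

/* Per-instruction state within one function call: the array hit by the
   instruction's first access, or NULL if nothing has been learnt yet. */
typedef struct { const ArrayBlock *bound; } InsnState;

/* Returns the array containing addr, or NULL if addr is in no known array. */
static const ArrayBlock *find_block(const ArrayBlock *blocks, size_t n,
                                    uintptr_t addr)
{
   for (size_t i = 0; i < n; i++)
      if (addr >= blocks[i].start && addr - blocks[i].start < blocks[i].size)
         return &blocks[i];
   return NULL;
}

/* Returns true if this access would be reported by the toy model. */
static bool access_is_error(InsnState *insn, const ArrayBlock *blocks,
                            size_t n, uintptr_t addr)
{
   const ArrayBlock *hit = find_block(blocks, n, addr);
   if (insn->bound == NULL) {
      /* Nothing learnt yet: remember what was hit, never report. */
      insn->bound = hit;
      return false;
   }
   /* Later executions must stay inside the array learnt earlier. */
   return hit != insn->bound;
}
```

In this model, the single store stack[2] = c++ of t.c is executed once by its
own instruction, so it is only 'learnt' and never reported. The same store done
by one instruction inside a loop over i = 0, 1, 2 would be flagged at i = 2,
because that instruction was bound to stack[] at i = 0.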


[valgrind] [Bug 364497] Run valgrind on nginx

2016-06-19 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=364497

Philippe Waroquiers  changed:

   What|Removed |Added

 CC||philippe.waroquiers@skynet.
   ||be

--- Comment #1 from Philippe Waroquiers  ---
This looks very similar to bug 356393, which has been fixed in vex r3213.

Can you check by trying the svn version ?

Thanks


[valgrind] [Bug 364058] clarify in manual limitations of array overruns detections

2016-06-30 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=364058

Philippe Waroquiers  changed:

   What|Removed |Added

Summary|array overruns are not  |clarify in manual
   |detected|limitations of array
   ||overruns detections


[valgrind] [Bug 364058] clarify in manual limitations of array overruns detections

2016-06-30 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=364058

Philippe Waroquiers  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #8 from Philippe Waroquiers  ---
(In reply to Sergey Meirovich from comment #7)
> Thanks for the explanation. Is that could be concluded by implication from
> the manual?
IMO, yes: reading the manual leads to this conclusion.
But I have in any case added a few sentences to make this even clearer.
Committed revision 15897.


[valgrind] [Bug 356457] valgrind: m_mallocfree.c:2042 (vgPlain_arena_free): Assertion 'blockSane(a, b)' failed.

2016-01-27 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=356457

--- Comment #15 from Philippe Waroquiers  ---
(In reply to Joost VandeVondele from comment #14)
> Also no luck with --sanity-level=4 
> 
> The fact that it is not reproducible on command is indeed not simplifying
> this. I wonder if this could be related to something external to valgrind
> triggering this.
Yes, this bug is quite mysterious.

The only remaining thing to try that I see is to add 
   --vgdb-stop-at=valgrindabexit
to the valgrind args you use for your regression tests.
Then when the error happens, valgrind will wait for a gdb to connect using
gdb+vgdb.
You can then examine e.g. which library is being mmap-ed.
You might also use gdb to directly attach to valgrind and examine valgrind
internals.


[valgrind] [Bug 303877] valgrind doesn't support compressed debuginfo sections.

2016-01-31 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=303877

Philippe Waroquiers  changed:

   What|Removed |Added

 CC||philippe.waroquiers@skynet.
   ||be

--- Comment #18 from Philippe Waroquiers  ---
Another alternative is the simple/super small 'inflate' implementation in the
zlib code:
zlib-1.2.8/contrib/puff/ (puff.h and puff.c)

This is a fully independent inflate implementation (no #include).

There are some drawbacks: it is 2 times slower than the real zlib inflate, and
as it does not do memory allocation, inflate fails if the target buffer is too
small (and so, you must redo the inflate with a bigger buffer).
If the debug info stores the uncompressed size, then this is not a problem.
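
A sketch of the 'redo with a bigger buffer' loop, in case the uncompressed size
is not known. It assumes the inflate function has the signature and return
codes that puff.h declares for puff() (0 on success, 1 when the output space is
exhausted); please check this against the header. inflate_with_retry and
fake_inflate are hypothetical names, fake_inflate is only a stand-in so that
the sketch compiles without puff.c, and valgrind would use its own allocator
rather than malloc.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Same shape as puff(): dest/destlen are the output buffer and its size,
   updated to the number of bytes produced. */
typedef int (*inflate_fn)(unsigned char *dest, unsigned long *destlen,
                          const unsigned char *source,
                          unsigned long *sourcelen);

/* Stand-in for puff(): pretends the data inflates to exactly 100 bytes. */
static int fake_inflate(unsigned char *dest, unsigned long *destlen,
                        const unsigned char *source, unsigned long *sourcelen)
{
   (void)source; (void)sourcelen;
   if (*destlen < 100) return 1;      /* output space exhausted */
   memset(dest, 0, 100);
   *destlen = 100;
   return 0;
}

/* Inflate into a heap buffer, doubling its size while the inflater reports
   that the output space was exhausted. Returns NULL on any other failure.
   The caller frees the returned buffer. */
static unsigned char *inflate_with_retry(inflate_fn inflate,
                                         const unsigned char *src,
                                         unsigned long srclen,
                                         unsigned long guess,
                                         unsigned long *outlen)
{
   assert(guess > 0);
   unsigned long cap = guess;
   for (;;) {
      unsigned char *buf = malloc(cap);
      if (buf == NULL) return NULL;
      unsigned long dlen = cap, slen = srclen;
      int rc = inflate(buf, &dlen, src, &slen);
      if (rc == 0) { *outlen = dlen; return buf; }
      free(buf);
      /* Give up on real errors, or if doubling would overflow. */
      if (rc != 1 || cap > ULONG_MAX / 2) return NULL;
      cap *= 2;
   }
}
```

When the uncompressed size is stored in the debug info, it can be passed as the
initial guess, and the loop then succeeds on the first attempt.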


[valgrind] [Bug 348345] Assertion fails for negative lineno

2016-02-03 Thread Philippe Waroquiers via KDE Bugzilla

https://bugs.kde.org/show_bug.cgi?id=348345

Philippe Waroquiers  changed:

   What|Removed |Added

 CC||philippe.waroquiers@skynet.
   ||be

--- Comment #5 from Philippe Waroquiers  ---
Transformed the other assert for a negative line number into a 'complain once',
and refactored the checking; committed in revision 15780.

Thanks for the patch

