On 12/06/2021 16:45, atn19 via pacman-dev wrote:
> Hi all,
> Using an up-to-date system, I have an issue since I have been using pacman 6
> parallel downloads feature. When retrieving a large number of quite small
> files with a quite good bandwidth, it interrupts the downloads, fails, and I
> get the following error message: "failed to commit transaction (download
> library error)".
Are there any other messages before this? Usually there will be; if not,
you can try running with --debug.
> As a computer science student, I would like to find the root cause of this
> issue. I guess this is due to some critical race, as the issue disappears
> when parallel downloads are disabled.
> As a first try, I have been following some wiki indications, building pacman
> with debug symbols. However, I was unable to pinpoint the bug with gdb, the
> log is very short with only creation and exits of the created threads.
> Then, I have been building pacman-static, hoping to avoid some shared
> libraries issues. This did not work either.
> This is likely due to the fact that I am not very experienced with gdb,
> especially with the use of shared libraries and/or threads.
>
> I guess many people on this mailing-list could fix this bug, the only reason
> I did not report it yet is because I do not have any precise log.
> In order to report it in a more precise way, I would like to have some
> indications.
> Moreover, this could turn out to be quite instructive for me, and perhaps for
> others as I would like to add some indications to the wiki and/or other
> places.
> If you have online resources to recommend, please do.
>
> 1. How should I use gdb to get complete traces, including those of the
> created threads?
Parallel doesn't actually mean multithreaded. All the downloads are
done in one thread, so you don't need to worry about that aspect.
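To make the "parallel but single-threaded" point concrete, here is a
minimal sketch of one thread servicing several streams at once with
poll(). This is NOT pacman's code: as far as I recall, pacman leaves the
multiplexing to the download library rather than calling poll() itself,
and the name drain_all() and the MAX_STREAMS limit are made up purely for
illustration.

```c
#include <errno.h>
#include <poll.h>
#include <unistd.h>

#define MAX_STREAMS 8 /* hypothetical limit, for illustration only */

/* Read several file descriptors to EOF from ONE thread.
 * Returns the total number of bytes read, or -1 on error.
 * Each descriptor is closed once it reaches EOF. */
long drain_all(const int *fds, int nfds)
{
	struct pollfd pfds[MAX_STREAMS];
	char buf[256];
	long total = 0;
	int open_count = nfds;

	if(nfds < 0 || nfds > MAX_STREAMS) {
		return -1;
	}
	for(int i = 0; i < nfds; i++) {
		pfds[i].fd = fds[i];
		pfds[i].events = POLLIN;
		pfds[i].revents = 0;
	}

	while(open_count > 0) {
		/* sleep until at least one stream has data or has hit EOF */
		if(poll(pfds, (nfds_t)nfds, -1) < 0) {
			if(errno == EINTR) {
				continue;
			}
			return -1;
		}
		for(int i = 0; i < nfds; i++) {
			if(pfds[i].fd < 0) {
				continue; /* already finished */
			}
			if(pfds[i].revents & POLLNVAL) {
				return -1; /* not a valid open descriptor */
			}
			if(!(pfds[i].revents & (POLLIN | POLLHUP | POLLERR))) {
				continue; /* nothing to do for this stream yet */
			}
			ssize_t n = read(pfds[i].fd, buf, sizeof(buf));
			if(n < 0) {
				return -1;
			}
			if(n == 0) {
				/* EOF: poll() ignores entries with a negative fd */
				close(pfds[i].fd);
				pfds[i].fd = -1;
				open_count--;
			} else {
				total += n;
			}
		}
	}
	return total;
}
```

Given the read ends of a few pipes or sockets in an array, a call such
as drain_all(fds, 3) interleaves the reads without creating any thread.
The practical consequence for debugging is the same as above: there is
only one call stack to look at.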
Generally, to get a trace with gdb you run the program until it crashes
and then use bt to get a backtrace. However, in this instance there is no
crash; instead, the error is handled within pacman, so you can't just
easily grab a backtrace.
So instead you'd want to use breakpoints. I'd start by looking at
curl_download_internal().
> 2. I am so far not using breakpoints, I would only like to have the complete
> call graph, including those of the created threads. Is this a reasonable
> approach when trying to address a not-yet-narrowed-down bug, especially with
> shared libraries? Do I need some other tools, modifying LD_PRELOAD, etc?
Just build and run pacman from git. ALPM is part of the repo so there's
no need to worry about libraries.
> 3. What is the recommended way of handling symbols from several symbols
> tables?
Just run gdb and it should figure everything out.
> 4. Is there already some sanitizer such as TSAN in use for pacman
> development? If not, is this something planned, or are there reasons against
> using such tools?
> 5. Last but not least, many tests used in the PKGBUILD check function
> unexpectedly fail when I build pacman locally. Is this normal? If not, I can
> attach the log in another email, even though I guess it should be quite
> reproducible.
Works for me!
> Any piece of advice about how to do such things, what is your usual workflow
> with gdb (and perhaps other debuggers), more generally what is your workflow
> to find the cause of bugs, would be of great help.
> Thank you very much for all of the pacman development.
>
> Best regards,
> atn19
>