Re: rpmbuild core dumps
Florian Weimer writes:

> * Sam Varshavchik:
>
>> Stephen Smoogen writes:
>>
>>>> https://github.com/rpm-software-management/rpm/issues/2826
>>>
>>> And thanks for opening a bug. I will watch to see what happens.
>>
>> I'm genuinely curious. Am I really the only one seeing this? The bug seems fairly clear cut to me. What the heck.
>
> I suspect most of us package only files from one user, so the cache never needs evicting?

You know, I think you're right. You need to put files into rpms that have different user and group ownership. Nearly all packages likely %defattr all files to the same user and group ID, so the race condition never gets triggered.

The race condition occurs when the next file that gets added to the rpm is specified to have a different user or group ownership from the previous file.

--
___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
Do not reply to spam, report it: https://pagure.io/fedora-infrastructure/new_issue
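A rough way to see why uniform ownership hides the problem: the sketch below counts how often a one-entry "last owner" cache would have to replace its entry for a given sequence of file owners. It is only a model under stated assumptions, not rpm code. `count_evictions` and the owner names are made up for illustration, and it ignores that the rpmugUid() quoted elsewhere in this thread answers the UID 0 user without touching the cache.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical model, not rpm code: count how many times a one-entry
 * "last owner" cache must throw away its entry while walking a file list.
 * Each replacement corresponds to a free() of the cached name in the
 * real code, which is where two threads can collide.
 */
size_t count_evictions(const char *const owners[], size_t n)
{
    const char *last = NULL;
    size_t evictions = 0;

    for (size_t i = 0; i < n; i++) {
        assert(owners[i] != NULL);
        if (last != NULL && strcmp(last, owners[i]) != 0)
            evictions++;        /* cached entry replaced */
        last = owners[i];
    }
    return evictions;
}
```

With every file owned by the same (hypothetical) user the count stays at 0 however long the list is; for the sequence alice, bob, alice it is 2, so only packages with mixed ownership ever reach the free() path after the first fill.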
Re: rpmbuild core dumps
On 1/2/24 11:49, Florian Weimer wrote:

> * Sam Varshavchik:
>
>> Stephen Smoogen writes:
>>
>>>> https://github.com/rpm-software-management/rpm/issues/2826
>>>
>>> And thanks for opening a bug. I will watch to see what happens.
>>
>> I'm genuinely curious. Am I really the only one seeing this? The bug seems fairly clear cut to me. What the heck.
>
> I suspect most of us package only files from one user, so the cache never needs evicting?

Indeed. Technically the "thread-unsafe" bug has been there since rpm 4.15, which was the first version to parallelize the package generation. It's just that 4.19 eliminates some code that had previously managed to more or less mask it, it seems. It could've manifested as silent user/group name corruption before this, AFAICS.

- Panu -
Re: rpmbuild core dumps
* Sam Varshavchik:

> Stephen Smoogen writes:
>
>>> https://github.com/rpm-software-management/rpm/issues/2826
>>
>> And thanks for opening a bug. I will watch to see what happens.
>
> I'm genuinely curious. Am I really the only one seeing this? The bug seems fairly clear cut to me. What the heck.

I suspect most of us package only files from one user, so the cache never needs evicting?

Thanks,
Florian
Re: rpmbuild core dumps
Stephen Smoogen writes:

>> https://github.com/rpm-software-management/rpm/issues/2826
>
> And thanks for opening a bug. I will watch to see what happens.

I'm genuinely curious. Am I really the only one seeing this? The bug seems fairly clear cut to me. What the heck.
Re: rpmbuild core dumps
On Thu, 28 Dec 2023 at 17:24, Sam Varshavchik wrote:

> Stephen Smoogen writes:
>
>> I am trying to figure out the logic of this section:
>>
>> ```
>> static char * lastUname = NULL; // So lastUname is NULL
>> static uid_t lastUid;
>>
>> if (!thisUname) {
>>     lastUname = rfree(lastUname); // lastUname should still be NULL and
>>                                   // we are freeing NULL and setting itself back to NULL.
>>     return -1;
>> ```
>>
>> I expect this is where I am not understanding something basic in C from too many years in non-pointer land. I looked at the change of these lines and they date back to this commit.
>
> This is a fairly common kind of simple caching to avoid expensive username/userid and groupname/groupid lookups by caching the last one. This

Yeah I completely forgot that static allows for caching so I was misreading this as 'always set to NULL at the beginning.'

> https://github.com/rpm-software-management/rpm/issues/2826

And thanks for opening a bug. I will watch to see what happens.

--
Stephen Smoogen, Red Hat Automotive
Let us be kind to one another, for most of us are fighting a hard battle. -- Ian MacClaren
Re: rpmbuild core dumps
Stephen Smoogen writes:

> I am trying to figure out the logic of this section:
>
> ```
> static char * lastUname = NULL; // So lastUname is NULL
> static uid_t lastUid;
>
> if (!thisUname) {
>     lastUname = rfree(lastUname); // lastUname should still be NULL and
>                                   // we are freeing NULL and setting itself back to NULL.
>     return -1;
> ```
>
> I expect this is where I am not understanding something basic in C from too many years in non-pointer land. I looked at the change of these lines and they date back to this commit.

This is a fairly common kind of simple caching to avoid expensive username/userid and groupname/groupid lookups by caching the last one. This code expects to be called repeatedly to look up the same user/group, so it caches the result of the last lookup and returns it.

Most of the files in a binary rpm would typically have the same uid/gid owner, so these functions are expected to be called with the same values each time. Those values get cached in static variables.

This worked great while everything was single-threaded. Now, if you have two execution threads going step by step right here, and both of them passed in a null pointer, both of them will run this, and both of them will `free` the same pointer. Fail.

This `static` usage pattern is inherently thread unsafe.

> The entries for __thread I found come in around 2019. Did you have a bugzilla or a report on https://github.com/rpm-software-management/rpm/ which I can add anything I find? and most date back from 2013.

https://github.com/rpm-software-management/rpm/issues/2826
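To make the `static` point concrete, here is a minimal single-threaded sketch of the same one-entry cache shape. `cached_uid`, `demo_lookup_uid`, the call counter and the name/ID table are invented stand-ins for rpmugUid() and the real passwd lookup, which are not reproduced here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the expensive lookup; counts how often it runs. */
static int demo_lookup_calls = 0;

static int demo_lookup_uid(const char *name, long *id)
{
    demo_lookup_calls++;
    if (strcmp(name, "alice") == 0) { *id = 1000; return 0; }
    if (strcmp(name, "bob") == 0)   { *id = 1001; return 0; }
    return -1;                      /* unknown user */
}

/* One-entry cache: remembers only the most recent successful lookup. */
int cached_uid(const char *name, long *uid)
{
    /* static locals are initialised once and keep their value between calls */
    static char *last_name = NULL;
    static long last_uid;

    if (name == NULL) {             /* NULL means "drop the cache" */
        free(last_name);
        last_name = NULL;
        return -1;
    }
    assert(uid != NULL);

    if (last_name == NULL || strcmp(name, last_name) != 0) {
        long id;
        size_t len = strlen(name) + 1;
        char *copy;

        if (demo_lookup_uid(name, &id) != 0)
            return -1;
        copy = malloc(len);
        if (copy == NULL)
            return -1;
        memcpy(copy, name, len);

        free(last_name);            /* evict the previous entry */
        last_name = copy;
        last_uid = id;
    }
    *uid = last_uid;
    return 0;
}
```

Calling it twice in a row with the same name runs the lookup once; the second call is served from `last_name` and `last_uid`. Nothing in this sketch is safe if two threads enter `cached_uid` at the same time, which is the situation described above.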
Re: rpmbuild core dumps
On Tue, 26 Dec 2023 at 07:32, Sam Varshavchik wrote:

> Stephen Smoogen writes:
>
>> I am guessing the problem is really with the free(lastUname) since the rfree
>
> Yes. Multiple execution threads will reach lastUname and try to free the same pointer. glibc rightfully complains about the double-free.

I am trying to figure out the logic of this section:

```
static char * lastUname = NULL; // So lastUname is NULL
static uid_t lastUid;

if (!thisUname) {
    lastUname = rfree(lastUname); // lastUname should still be NULL and
                                  // we are freeing NULL and setting itself back to NULL.
    return -1;
```

I expect this is where I am not understanding something basic in C from too many years in non-pointer land. I looked at the change of these lines and they date back to this commit.

fe645f822d (Panu Matilainen 2023-05-04 11:59:36 +0300 136) lastUname = rfree(lastUname);

commit fe645f822dbd71da4145f6174e526a09eb5c815e
Author: Panu Matilainen
Date: Thu May 4 11:59:36 2023 +0300

    Simplify rpmug caching

    The simple cache whose efficiency troubled ewt back in 1997 (see commit 97999ce92c1cad3315d85c02bb3c62007a75d846) has proven more than adequate over the years. In a local testcase based on Fedora 33 server iso contents, an install of 1765 packages consisting of 201344 files did a whopping 27 user + groups combined. So a few more alloc+free is not going to make the damnest difference, don't bother with reallocing the cache buffer, just strdup() a new one when needed.

And the code before that was gnarlier from days of yore.

>> isn't referred to (but not sure if an optimization would have removed it). The comment before this code mentions that this is a hack to try and get things done.. probably from long long ago when rpm was single threaded.
>
> The problem is in all of these functions. It's the same problem with all of them. Here's rpmugUname(), for example. You have two execution threads traversing that nest of "if" statements and all of them winding up here:
>
>     } else {
>         char *uname = NULL;
>
>         if (lookup_str(pwfile(), uid, 2, 0, &uname))
>             return NULL;
>
>         lastUid = uid;
>         free(lastUname);
>
> And now both execution threads will try to free() the same pointer.
>
> The next statement resets lastUname to the newly-allocated uname, but it's too late. If the code that executes in parallel calls rpmugUname, then just say good night.
>
> All of the static variables in all of the functions here must have a mutex wrapped around them.
>
> Or declared with a __thread attribute.
>
> The window of vulnerability is very tiny. But I have 32 cores and have 32 execution threads churning. They have about a 5% chance of hitting the double-free on every build. Worse, I can see how this race condition may not result in a crash but produce a corrupted rpm.

That is ugly. The only mention of mutexes I found

lib/package.c
rpmio/macro.c
rpmio/rpmlog.c

The entries for __thread I found come in around 2019. Did you have a bugzilla or a report on https://github.com/rpm-software-management/rpm/ which I can add anything I find?

and most date back from 2013.

--
Stephen Smoogen, Red Hat Automotive
Let us be kind to one another, for most of us are fighting a hard battle. -- Ian MacClaren
Re: rpmbuild core dumps
Stephen Smoogen writes:

> I am guessing the problem is really with the free(lastUname) since the rfree

Yes. Multiple execution threads will reach lastUname and try to free the same pointer. glibc rightfully complains about the double-free.

> isn't referred to (but not sure if an optimization would have removed it). The comment before this code mentions that this is a hack to try and get things done.. probably from long long ago when rpm was single threaded.

The problem is in all of these functions. It's the same problem with all of them. Here's rpmugUname(), for example. You have two execution threads traversing that nest of "if" statements and all of them winding up here:

    } else {
        char *uname = NULL;

        if (lookup_str(pwfile(), uid, 2, 0, &uname))
            return NULL;

        lastUid = uid;
        free(lastUname);

And now both execution threads will try to free() the same pointer.

The next statement resets lastUname to the newly-allocated uname, but it's too late. If the code that executes in parallel calls rpmugUname, then just say good night.

All of the static variables in all of the functions here must have a mutex wrapped around them.

Or be declared with a __thread attribute.

The window of vulnerability is very tiny. But I have 32 cores and have 32 execution threads churning. They have about a 5% chance of hitting the double-free on every build. Worse, I can see how this race condition may not result in a crash but produce a corrupted rpm.
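The mutex option mentioned above could look roughly like the following. This is a sketch under stated assumptions, not the fix that went into rpm: `locked_uid`, `mutex_demo_lookup` and the name/ID table are hypothetical, and the real code would guard the existing static variables in rpmug.c rather than these stand-ins.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Shared one-entry cache plus the lock that guards every access to it. */
static pthread_mutex_t cache_lock = PTHREAD_MUTEX_INITIALIZER;
static char *shared_name = NULL;
static long shared_uid;

/* Hypothetical stand-in for the passwd lookup. */
static int mutex_demo_lookup(const char *name, long *id)
{
    if (strcmp(name, "alice") == 0) { *id = 1000; return 0; }
    if (strcmp(name, "bob") == 0)   { *id = 1001; return 0; }
    return -1;
}

int locked_uid(const char *name, long *uid)
{
    int rc = -1;

    pthread_mutex_lock(&cache_lock);
    if (name == NULL) {
        free(shared_name);          /* only one thread can be here at a time */
        shared_name = NULL;
    } else if (shared_name != NULL && strcmp(name, shared_name) == 0) {
        rc = 0;                     /* cache hit */
    } else {
        long id;
        size_t len = strlen(name) + 1;
        char *copy = malloc(len);

        if (copy != NULL && mutex_demo_lookup(name, &id) == 0) {
            memcpy(copy, name, len);
            free(shared_name);      /* free and reassign under the same lock */
            shared_name = copy;
            shared_uid = id;
            rc = 0;
        } else {
            free(copy);
        }
    }
    if (rc == 0) {
        assert(uid != NULL);
        *uid = shared_uid;          /* copy out before releasing the lock */
    }
    pthread_mutex_unlock(&cache_lock);
    return rc;
}
```

The free() and the reassignment happen inside one critical section, so no second thread can see the stale pointer. The cost is that every caller serialises on `cache_lock`, even on a cache hit. Depending on the toolchain this may need to be built with `-pthread`.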
Re: rpmbuild core dumps
On Mon, 25 Dec 2023 at 11:07, Stephen Smoogen wrote:

> On Sun, 24 Dec 2023 at 15:51, Sam Varshavchik wrote:
>
>> Stephen Smoogen writes:
>>
>>> My apologies for bad quoting.. email from phone. What version of rpm build is used and what are some packages which are rebuilt that show this issue. This may be needed if the core dump is due to something else in the environment like memory limits etc
>>
>> It's 4.19.1 on FC39, and it's packages that I'm working on. It's glibc complaining about a double-free, and not any resource limits. I can get a backtrace out of it:
>>
>> #1  0x7f05dd8588ee raise (libc.so.6 + 0x3e8ee)
>> #2  0x7f05dd8408ff abort (libc.so.6 + 0x268ff)
>> #3  0x7f05dd8417d0 __libc_message.cold (libc.so.6 + 0x277d0)
>> #4  0x7f05dd8b47a5 malloc_printerr (libc.so.6 + 0x9a7a5)
>> #5  0x7f05dd8b6a3a _int_free (libc.so.6 + 0x9ca3a)
>> #6  0x7f05dd8b93de free (libc.so.6 + 0x9f3de)
>> #7  0x7f05dda984ec rpmugUid (librpm.so.10 + 0x584ec)
>> #8  0x7f05dda84255 rpmfilesStat (librpm.so.10 + 0x44255)
>> #9  0x7f05dda8438f rpmfiStat (librpm.so.10 + 0x4438f)
>> #10 0x7f05dda8 rpmfiArchiveWriteHeader (librpm.so.10 + 0x4)
>> #11 0x7f05dda871c9 iterWriteArchiveNext (librpm.so.10 + 0x471c9)
>>
>> I am looking at this core dump. I see 32 active execution threads at the time this whole thing went kaput, and all the code in rpmug.c is definitely not thread safe. I did not look very hard, I don't know if there are mutexes higher up the call chain, but the overall behavior -- occasional core dumps -- is indicative of thread races.
>
> Thanks. I was wondering if it was dnf/rpm on the system or dnf/rpm in the chroot but it sounds like something changed between 4.19.0.1 (what I had on my system since September?) and 4.19.1 (December)
>
> The changelog doesn't say much beyond
> * Tue Dec 12 2023 Michal Domonkos - 4.19.1-1
> - Update to 4.19.1 (https://rpm.org/wiki/Releases/4.19.1)
>
> I forget if there is a way to pin an rpm in a mock environment so that you don't update past 4.19.0, to see whether
> a) the problem still happens with that (possibly indicating that whatever is calling into rpm is broken) or
> b) the problem doesn't occur and it is a change between 4.19.0.1 and 4.19.1

https://github.com/rpm-software-management/rpm/compare/rpm-4.19.0-release...rpm-4.19.1-release

```
void * rfree (void *ptr)
{
    free(ptr);
    return NULL;
}

/**
 * Test for string equality
 * @param s1    string 1
 * @param s2    string 2
 * @return      0 if strings differ, 1 if equal
 */
static inline int rstreq(const char *s1, const char *s2)
{
    return (strcmp(s1, s2) == 0);
}

int rpmugUid(const char * thisUname, uid_t * uid)
{
    static char * lastUname = NULL;
    static uid_t lastUid;

    if (!thisUname) {
        lastUname = rfree(lastUname);
        return -1;
    } else if (rstreq(thisUname, UID_0_USER)) {
        *uid = 0;
        return 0;
    }

    if (lastUname == NULL || !rstreq(thisUname, lastUname)) {
        long id;

        if (lookup_num(pwfile(), thisUname, 0, 2, &id))
            return -1;

        free(lastUname);
        lastUname = xstrdup(thisUname);
        lastUid = id;
    }

    *uid = lastUid;

    return 0;
}
```

I am guessing the problem is really with the free(lastUname) since the rfree isn't referred to (but not sure if an optimization would have removed it). The comment before this code mentions that this is a hack to try and get things done.. probably from long long ago when rpm was single threaded.

--
Stephen Smoogen, Red Hat Automotive
Let us be kind to one another, for most of us are fighting a hard battle. -- Ian MacClaren
Re: rpmbuild core dumps
Sam Varshavchik writes:

> Looking at a diff between the 4.19.0 and 4.19.1 tags, a call to rpmfiStat() was added to fill_archive_entry(). The backtrace above shows the execution finding its way from rpmfiStat() into very-much-thread-unsafe code in rpmug.c

That code is used only by rpm2archive, though. But the backtrace does show an execution path from packageBinaries into rpmug.c.

I wonder what happens if someone were to try to rebuild texlive, and all of its umpteen packages. That should thoroughly exercise the problematic code…
Re: rpmbuild core dumps
Stephen Smoogen writes:

>> #1  0x7f05dd8588ee raise (libc.so.6 + 0x3e8ee)
>> #2  0x7f05dd8408ff abort (libc.so.6 + 0x268ff)
>> #3  0x7f05dd8417d0 __libc_message.cold (libc.so.6 + 0x277d0)
>> #4  0x7f05dd8b47a5 malloc_printerr (libc.so.6 + 0x9a7a5)
>> #5  0x7f05dd8b6a3a _int_free (libc.so.6 + 0x9ca3a)
>> #6  0x7f05dd8b93de free (libc.so.6 + 0x9f3de)
>> #7  0x7f05dda984ec rpmugUid (librpm.so.10 + 0x584ec)
>> #8  0x7f05dda84255 rpmfilesStat (librpm.so.10 + 0x44255)
>> #9  0x7f05dda8438f rpmfiStat (librpm.so.10 + 0x4438f)
>> #10 0x7f05dda8 rpmfiArchiveWriteHeader (librpm.so.10 + 0x4)
>> #11 0x7f05dda871c9 iterWriteArchiveNext (librpm.so.10 + 0x471c9)
>
> Thanks. I was wondering if it was dnf/rpm on the system or dnf/rpm in the chroot but it sounds like something changed between 4.19.0.1 (what I had on my system since September?) and 4.19.1 (December)

Looking at a diff between the 4.19.0 and 4.19.1 tags, a call to rpmfiStat() was added to fill_archive_entry(). The backtrace above shows the execution finding its way from rpmfiStat() into very-much-thread-unsafe code in rpmug.c

Further up the backtrace is packageBinaries() which, according to the backtrace, is being multithreaded via OMP. Looking at the core dump, I see a bunch of execution threads running packageBinaries().

I'm confident that this is the breakage. 4.19.1 needs to be fixed ASAP. My humble suggestion for the simplest fix is to slap a __thread on all those static variables in rpmug.c. Or, perhaps, throw a mutex around them so that all execution threads share that micro-optimization those static variables are used for…

Hopefully that's the only thing that's thread unsafe in there…
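As an illustration of the `__thread` suggestion, the sketch below gives each thread its own one-entry cache, so no two threads ever free the same pointer. It is an assumption-laden example rather than the change made upstream: `per_thread_uid`, `tls_demo_lookup` and the name/ID table are hypothetical, and `__thread` is the GCC/Clang spelling (C11 calls it `_Thread_local`).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the passwd lookup. */
static int tls_demo_lookup(const char *name, long *id)
{
    if (strcmp(name, "alice") == 0) { *id = 1000; return 0; }
    if (strcmp(name, "bob") == 0)   { *id = 1001; return 0; }
    return -1;
}

int per_thread_uid(const char *name, long *uid)
{
    /* __thread: every thread gets its own copy of these two variables */
    static __thread char *my_last_name = NULL;
    static __thread long my_last_uid;

    if (name == NULL) {             /* drop this thread's cache entry */
        free(my_last_name);
        my_last_name = NULL;
        return -1;
    }
    assert(uid != NULL);

    if (my_last_name == NULL || strcmp(name, my_last_name) != 0) {
        long id;
        size_t len = strlen(name) + 1;
        char *copy;

        if (tls_demo_lookup(name, &id) != 0)
            return -1;
        copy = malloc(len);
        if (copy == NULL)
            return -1;
        memcpy(copy, name, len);

        free(my_last_name);         /* pointer is private to this thread */
        my_last_name = copy;
        my_last_uid = id;
    }
    *uid = my_last_uid;
    return 0;
}
```

Compared with a mutex this needs no locking, but the threads no longer share lookups, and a thread that exits without passing NULL first leaves its cached string unfreed.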
Re: rpmbuild core dumps
On Sun, 24 Dec 2023 at 15:51, Sam Varshavchik wrote:

> Stephen Smoogen writes:
>
>> My apologies for bad quoting.. email from phone. What version of rpm build is used and what are some packages which are rebuilt that show this issue. This may be needed if the core dump is due to something else in the environment like memory limits etc
>
> It's 4.19.1 on FC39, and it's packages that I'm working on. It's glibc complaining about a double-free, and not any resource limits. I can get a backtrace out of it:
>
> #1  0x7f05dd8588ee raise (libc.so.6 + 0x3e8ee)
> #2  0x7f05dd8408ff abort (libc.so.6 + 0x268ff)
> #3  0x7f05dd8417d0 __libc_message.cold (libc.so.6 + 0x277d0)
> #4  0x7f05dd8b47a5 malloc_printerr (libc.so.6 + 0x9a7a5)
> #5  0x7f05dd8b6a3a _int_free (libc.so.6 + 0x9ca3a)
> #6  0x7f05dd8b93de free (libc.so.6 + 0x9f3de)
> #7  0x7f05dda984ec rpmugUid (librpm.so.10 + 0x584ec)
> #8  0x7f05dda84255 rpmfilesStat (librpm.so.10 + 0x44255)
> #9  0x7f05dda8438f rpmfiStat (librpm.so.10 + 0x4438f)
> #10 0x7f05dda8 rpmfiArchiveWriteHeader (librpm.so.10 + 0x4)
> #11 0x7f05dda871c9 iterWriteArchiveNext (librpm.so.10 + 0x471c9)
>
> I am looking at this core dump. I see 32 active execution threads at the time this whole thing went kaput, and all the code in rpmug.c is definitely not thread safe. I did not look very hard, I don't know if there are mutexes higher up the call chain, but the overall behavior -- occasional core dumps -- is indicative of thread races.

Thanks. I was wondering if it was dnf/rpm on the system or dnf/rpm in the chroot but it sounds like something changed between 4.19.0.1 (what I had on my system since September?) and 4.19.1 (December)

The changelog doesn't say much beyond

* Tue Dec 12 2023 Michal Domonkos - 4.19.1-1
- Update to 4.19.1 (https://rpm.org/wiki/Releases/4.19.1)

I forget if there is a way to pin an rpm in a mock environment so that you don't update past 4.19.0, to see whether
a) the problem still happens with that (possibly indicating that whatever is calling into rpm is broken) or
b) the problem doesn't occur and it is a change between 4.19.0.1 and 4.19.1

--
Stephen Smoogen, Red Hat Automotive
Let us be kind to one another, for most of us are fighting a hard battle. -- Ian MacClaren
Re: rpmbuild core dumps
Stephen Smoogen writes:

> My apologies for bad quoting.. email from phone. What version of rpm build is used and what are some packages which are rebuilt that show this issue. This may be needed if the core dump is due to something else in the environment like memory limits etc

It's 4.19.1 on FC39, and it's packages that I'm working on. It's glibc complaining about a double-free, and not any resource limits. I can get a backtrace out of it:

#1  0x7f05dd8588ee raise (libc.so.6 + 0x3e8ee)
#2  0x7f05dd8408ff abort (libc.so.6 + 0x268ff)
#3  0x7f05dd8417d0 __libc_message.cold (libc.so.6 + 0x277d0)
#4  0x7f05dd8b47a5 malloc_printerr (libc.so.6 + 0x9a7a5)
#5  0x7f05dd8b6a3a _int_free (libc.so.6 + 0x9ca3a)
#6  0x7f05dd8b93de free (libc.so.6 + 0x9f3de)
#7  0x7f05dda984ec rpmugUid (librpm.so.10 + 0x584ec)
#8  0x7f05dda84255 rpmfilesStat (librpm.so.10 + 0x44255)
#9  0x7f05dda8438f rpmfiStat (librpm.so.10 + 0x4438f)
#10 0x7f05dda8 rpmfiArchiveWriteHeader (librpm.so.10 + 0x4)
#11 0x7f05dda871c9 iterWriteArchiveNext (librpm.so.10 + 0x471c9)

I am looking at this core dump. I see 32 active execution threads at the time this whole thing went kaput, and all the code in rpmug.c is definitely not thread safe. I did not look very hard, I don't know if there are mutexes higher up the call chain, but the overall behavior -- occasional core dumps -- is indicative of thread races.
Re: rpmbuild core dumps
My apologies for bad quoting.. email from phone. What version of rpm build is used, and what are some packages which are rebuilt that show this issue? This may be needed if the core dump is due to something else in the environment like memory limits etc.

Stephen Smoogen, Red Hat Automotive
Let us be kind to one another, for most of us are fighting a hard battle. -- Ian MacClaren

On Sun, Dec 24, 2023 at 14:27 Sam Varshavchik wrote:

> It seems that rpmbuild dumps core at the end of the build process, every once in a while. Typical:
>
> Wrote: /__w/courier-libs/courier-libs/courier-authlib/rpm/RPMS/x86_64/courier-authlib-userdb-debuginfo-0.72.0.20231223-101.fc39.x86_64.rpm
> Wrote: /__w/courier-libs/courier-libs/courier-authlib/rpm/RPMS/x86_64/courier-authlib-devel-0.72.0.20231223-101.fc39.x86_64.rpm
> double free or corruption (fasttop)
> make[1]: *** [Makefile:2447: dorpm] Aborted (core dumped)
> make[1]: Leaving directory '/__w/courier-libs/courier-libs/courier-authlib'
> make: *** [Makefile:2439: rpm-build] Error 2
>
> Just trying again, with the same build, typically succeeds. In my estimate it dumps core about 5% of the time, randomly.
>
> I can rule out a hardware issue on my side, because this just happened in a github workflow container:
>
> https://github.com/svarshavchik/courier-libs/actions/runs/7315972215/job/19930137170
>
> You'll probably need to be signed into github to see the log, but that's basically it.
>
> I can't find anything relevant in Bugzilla, is anyone else seeing this, too?
rpmbuild core dumps
It seems that rpmbuild dumps core at the end of the build process, every once in a while. Typical:

Wrote: /__w/courier-libs/courier-libs/courier-authlib/rpm/RPMS/x86_64/courier-authlib-userdb-debuginfo-0.72.0.20231223-101.fc39.x86_64.rpm
Wrote: /__w/courier-libs/courier-libs/courier-authlib/rpm/RPMS/x86_64/courier-authlib-devel-0.72.0.20231223-101.fc39.x86_64.rpm
double free or corruption (fasttop)
make[1]: *** [Makefile:2447: dorpm] Aborted (core dumped)
make[1]: Leaving directory '/__w/courier-libs/courier-libs/courier-authlib'
make: *** [Makefile:2439: rpm-build] Error 2

Just trying again, with the same build, typically succeeds. In my estimate it dumps core about 5% of the time, randomly.

I can rule out a hardware issue on my side, because this just happened in a github workflow container:

https://github.com/svarshavchik/courier-libs/actions/runs/7315972215/job/19930137170

You'll probably need to be signed into github to see the log, but that's basically it.

I can't find anything relevant in Bugzilla, is anyone else seeing this, too?