Re: rpmbuild core dumps

2024-01-02 Thread Sam Varshavchik

Florian Weimer writes:


* Sam Varshavchik:

> Stephen Smoogen writes:
>
>> https://github.com/rpm-software-management/rpm/issues/2826
>>
>>
>> And thanks for opening a bug. I will watch to see what happens. 
>
> I'm genuinely curious. Am I really the only one seeing this? The bug
> seems fairly clear cut to me. What the heck.

I suspect most of us package only files from one user, so the cache
never needs evicting?


You know, I think you're right. You need to put files into rpms that have
different user and group ownership. Nearly all packages likely %defattr away
all files to the same user and group id, so the race condition never gets
triggered.


The race condition occurs when the next file that gets added to the rpm is  
specified to have a different user or group ownership from the previous file.
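A rough way to picture this is a single-threaded sketch that counts how often a one-entry cache has to throw away its old value. Everything in it is hypothetical (the names `cache_owner` and `count_evictions`, and the owner strings); it is not rpm's code. It only shows that a run of identical owners never reaches the free-and-replace step, while alternating owners reach it on every change.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a one-entry owner cache; not rpm's code. */
static char *last_name = NULL;
static int evictions = 0;

/* Record a lookup for `name`. The cached string is replaced (and the old
 * one freed) only when the owner differs from the previous call. */
static void cache_owner(const char *name)
{
    assert(name != NULL);
    if (last_name != NULL && strcmp(last_name, name) == 0)
        return;                 /* cache hit: nothing is freed */
    if (last_name != NULL)
        evictions++;            /* an existing entry is being replaced */
    free(last_name);
    last_name = strdup(name);
}

/* Feed a list of file owners through a fresh cache and return how many
 * times an existing entry had to be evicted. */
static int count_evictions(const char *const owners[], size_t n)
{
    free(last_name);
    last_name = NULL;
    evictions = 0;
    for (size_t i = 0; i < n; i++)
        cache_owner(owners[i]);
    return evictions;
}
```

With three files all owned by the same user the count stays at zero; with owners alternating between two users, every change after the first file is an eviction, and each eviction is a point where two threads could collide.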




--
___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
Do not reply to spam, report it: 
https://pagure.io/fedora-infrastructure/new_issue


Re: rpmbuild core dumps

2024-01-02 Thread Panu Matilainen

On 1/2/24 11:49, Florian Weimer wrote:

* Sam Varshavchik:


Stephen Smoogen writes:


https://github.com/rpm-software-management/rpm/issues/2826


And thanks for opening a bug. I will watch to see what happens.


I'm genuinely curious. Am I really the only one seeing this? The bug
seems fairly clear cut to me. What the heck.


I suspect most of us package only files from one user, so the cache
never needs evicting?


Indeed.

Technically the "thread-unsafe" bug has been there since rpm 4.15, which
was the first version to parallelize the package generation. It's just
that 4.19 eliminates some code that, it seems, had previously managed to
more or less mask it. It could've manifested as silent user/group name
corruption before this, AFAICS.


- Panu -


Re: rpmbuild core dumps

2024-01-02 Thread Florian Weimer
* Sam Varshavchik:

> Stephen Smoogen writes:
>
>> https://github.com/rpm-software-management/rpm/issues/2826
>>
>>
>> And thanks for opening a bug. I will watch to see what happens. 
>
> I'm genuinely curious. Am I really the only one seeing this? The bug
> seems fairly clear cut to me. What the heck.

I suspect most of us package only files from one user, so the cache
never needs evicting?

Thanks,
Florian


Re: rpmbuild core dumps

2024-01-01 Thread Sam Varshavchik

Stephen Smoogen writes:


   https://github.com/rpm-software-management/rpm/issues/2826


And thanks for opening a bug. I will watch to see what happens. 


I'm genuinely curious. Am I really the only one seeing this? The bug seems  
fairly clear cut to me. What the heck.






Re: rpmbuild core dumps

2024-01-01 Thread Stephen Smoogen
On Thu, 28 Dec 2023 at 17:24, Sam Varshavchik  wrote:

> Stephen Smoogen writes:
>
> > I am trying to figure out the logic of this section:
> >
> >
> > ```
> >
> > static char * lastUname = NULL; // So lastUname is NULL
> > static uid_t lastUid;
> >
> > if (!thisUname) {
> > lastUname = rfree(lastUname); // lastUname should still be NULL and
> > // we are freeing NULL and setting itself back to NULL.
> > return -1;
> >
> > ```
> >
> >
> >
> > I expect this is where I am not understanding something basic in C from
> > too many years in non-pointer land. I looked at the change of these lines
> > and they date back to this commit.
>
> This is a fairly common kind of simple caching to avoid expensive
> username/userid and groupname/groupid lookups by caching the last one.
> This


Yeah I completely forgot that static allows for caching so I was misreading
this as 'always set to NULL at the beginning.'
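For anyone else who trips over the same thing, here is a tiny sketch of the rule in question. The function name `calls_so_far` is made up for illustration. The initializer on a `static` local is applied once, before the first call, and the variable then keeps its value between calls.

```c
#include <assert.h>

/* Hypothetical counter used only to illustrate static-local lifetime. */
static int calls_so_far(void)
{
    static int calls = 0;   /* initialized once, not on every entry */
    calls++;
    return calls;
}
```

Calling it three times in a row yields 1, then 2, then 3. In the same way, `lastUname` in the quoted snippet is only NULL until something has been cached in it.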


>
> https://github.com/rpm-software-management/rpm/issues/2826
>
>
>
And thanks for opening a bug. I will watch to see what happens.


-- 
Stephen Smoogen, Red Hat Automotive
Let us be kind to one another, for most of us are fighting a hard battle.
-- Ian MacClaren


Re: rpmbuild core dumps

2023-12-28 Thread Sam Varshavchik

Stephen Smoogen writes:


I am trying to figure out the logic of this section:


```

    static char * lastUname = NULL; // So lastUname is NULL
    static uid_t lastUid;

    if (!thisUname) {
        lastUname = rfree(lastUname); // lastUname should still be NULL and
                                      // we are freeing NULL and setting itself back to NULL.
        return -1;

```



I expect this is where I am not understanding something basic in C from too  
many years in non-pointer land. I looked at the change of these lines and  
they date back to this commit. 


This is a fairly common kind of simple caching to avoid expensive  
username/userid and groupname/groupid lookups by caching the last one. This  
code is expecting that it will be called, repeatedly, to look up the same  
user/group, so it caches the results of the last lookup, and returns it.  
Most of the files in a binary rpm would typically have the same uid gid  
owner, so these functions are expected to be called with the same values  
each time.


Which now get cached in static variables. This worked great while everything  
was single-threaded.


Now, if you have two execution threads going step by step right here, and  
both of them passed in a null pointer, both of them will run this, and both  
of them will `free` the same pointer. Fail.


This `static` usage pattern is inherently thread unsafe.
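As a minimal sketch of one way out, assuming GCC or Clang (which accept `__thread`; C11 spells it `_Thread_local`), the same one-entry cache can be made per-thread. The names `cached_id`, `fake_lookup` and `real_lookups` are hypothetical, and the lookup is faked rather than reading the passwd database; this is not rpm's implementation.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-thread, one-entry cache. Each thread gets its own copy
 * of these variables, so no two threads can free the same pointer. */
static __thread char *last_name = NULL;
static __thread long last_id = -1;
static __thread int real_lookups = 0;

/* Stand-in for the expensive lookup: derives an id from the name length. */
static long fake_lookup(const char *name)
{
    real_lookups++;
    return (long)strlen(name);
}

/* Return the id for `name`, consulting the per-thread cache first. */
static long cached_id(const char *name)
{
    assert(name != NULL);
    if (last_name == NULL || strcmp(name, last_name) != 0) {
        long id = fake_lookup(name);
        free(last_name);            /* only this thread owns last_name */
        last_name = strdup(name);
        last_id = id;
    }
    return last_id;
}
```

The trade-off is that each thread warms its own cache, so the lookup is repeated once per thread rather than once per process.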

The entries for __thread I found come in around 2019. Did you have a bugzilla
or a report on https://github.com/rpm-software-management/rpm/ which I can
add anything I find?

and most date back from 2013.


https://github.com/rpm-software-management/rpm/issues/2826






Re: rpmbuild core dumps

2023-12-28 Thread Stephen Smoogen
On Tue, 26 Dec 2023 at 07:32, Sam Varshavchik  wrote:

> Stephen Smoogen writes:
>
> >
> > I am guessing the problem is really with the free(lastUname) since the rfree
>
> Yes. Multiple execution threads will reach lastUname and try to free the
> same pointer. glibc rightfully complains about the double-free.
>
>
I am trying to figure out the logic of this section:

```

static char * lastUname = NULL; // So lastUname is NULL
static uid_t lastUid;

if (!thisUname) {
    lastUname = rfree(lastUname); // lastUname should still be NULL and
                                  // we are freeing NULL and setting itself back to NULL.
    return -1;
```

I expect this is where I am not understanding something basic in C from too
many years in non-pointer land. I looked at the change of these lines and
they date back to this commit.

fe645f822d (Panu Matilainen 2023-05-04 11:59:36 +0300 136)  lastUname = rfree(lastUname);

commit fe645f822dbd71da4145f6174e526a09eb5c815e
Author: Panu Matilainen 
Date:   Thu May 4 11:59:36 2023 +0300

Simplify rpmug caching

The simple cache whose efficiency troubled ewt back in 1997
(see commit 97999ce92c1cad3315d85c02bb3c62007a75d846)
has proven more than adequate over the years.

In a local testcase based on Fedora 33 server iso contents, an install
of 1765 packages consisting of 201344 files did a whopping 27 user +
groups combined. So a few more alloc+free is not going to make the
damnest difference, don't bother with reallocing the cache buffer, just
strdup() a new one when needed.

And the code before that was gnarlier from days of yore.



> > isn't referred to (but not sure if an optimization would have removed it.
> > The comment before this code mentions that this is a hack to try and get
> > things done.. probably from long long ago when rpm was single threaded.
>
> The problem is in all of these functions. It's the same problem with all
> of them. Here's rpmugUname(), for example. You have two execution threads
> traversing that nest of "if" statements and all of them winding up here:
>
> } else {
>     char *uname = NULL;
>
>     if (lookup_str(pwfile(), uid, 2, 0, &uname))
>         return NULL;
>
>     lastUid = uid;
>     free(lastUname);
>
> And now both execution threads will try to free() the same pointer.
>
> The next statement resets lastUname to the newly-allocated uname, but
> it's too late. If the code that executes in parallel calls rpmugUname,
> then just say good night.
>
> All of the static variables in all of the functions here must have a
> mutex wrapped around them.
>
> Or declared with a __thread attribute.
>
> The window of vulnerability is very tiny. But I have 32 cores and have 32
> execution threads churning. They have about a 5% chance of hitting the
> double-free on every build. Worse, I can see how this race condition may
> not result in a crash but produce a corrupted rpm.
>
>
That is ugly. The only mentions of mutexes I found are in

lib/package.c
rpmio/macro.c
rpmio/rpmlog.c

and most date back to 2013.

The entries for __thread I found come in around 2019. Did you have a
bugzilla or a report on https://github.com/rpm-software-management/rpm/
to which I can add anything I find?
-- 
Stephen Smoogen, Red Hat Automotive
Let us be kind to one another, for most of us are fighting a hard battle.
-- Ian MacClaren


Re: rpmbuild core dumps

2023-12-26 Thread Sam Varshavchik

Stephen Smoogen writes:



I am guessing the problem is really with the free(lastUname) since the rfree


Yes. Multiple execution threads will reach lastUname and try to free the  
same pointer. glibc rightfully complains about the double-free.


isn't referred to (but not sure if an optimization would have removed it. The  
comment before this code mentions that this is a hack to try and get things  
done.. probably from long long ago when rpm was single threaded.


The problem is in all of these functions. It's the same problem with all of  
them. Here's rpmugUname(), for example. You have two execution threads  
traversing that nest of "if" statements and all of them winding up here:


   } else {
       char *uname = NULL;

       if (lookup_str(pwfile(), uid, 2, 0, &uname))
           return NULL;

       lastUid = uid;
       free(lastUname);

And now both execution threads will try to free() the same pointer.

The next statement resets lastUname to the newly-allocated uname, but it's  
too late. If the code that executes in parallel calls rpmugUname, then just  
say good night.


All of the static variables in all of the functions here must have a mutex  
wrapped around them.


Or declared with a __thread attribute.

The window of vulnerability is very tiny. But I have 32 cores and have 32  
execution threads churning. They have about a 5% chance of hitting the  
double-free on every build. Worse, I can see how this race condition may not  
result in a crash but produce a corrupted rpm.
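To make the mutex option concrete, here is a minimal sketch, assuming POSIX threads are available. The names `cache_lock`, `cached_id_locked` and `fake_lookup` are hypothetical and the lookup is faked; it only illustrates the shape of the fix and is not a patch against rpmug.c.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical shared one-entry cache guarded by a single mutex. */
static pthread_mutex_t cache_lock = PTHREAD_MUTEX_INITIALIZER;
static char *last_name = NULL;
static long last_id = -1;

/* Stand-in for the expensive lookup: derives an id from the name length. */
static long fake_lookup(const char *name)
{
    return (long)strlen(name);
}

/* Return the id for `name`. The compare, free and replace steps all happen
 * under the lock, so only one thread at a time can evict the cached entry. */
static long cached_id_locked(const char *name)
{
    long id;

    assert(name != NULL);
    pthread_mutex_lock(&cache_lock);
    if (last_name == NULL || strcmp(name, last_name) != 0) {
        id = fake_lookup(name);
        free(last_name);        /* safe: no other thread can be in here */
        last_name = strdup(name);
        last_id = id;
    }
    id = last_id;               /* copy out while still holding the lock */
    pthread_mutex_unlock(&cache_lock);
    return id;
}
```

The result is copied into a local before unlocking, so the caller never reads the shared variables outside the critical section. Unlike the `__thread` variant, all threads share one cache, at the cost of serializing on the lock.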






Re: rpmbuild core dumps

2023-12-26 Thread Stephen Smoogen
On Mon, 25 Dec 2023 at 11:07, Stephen Smoogen  wrote:

>
>
> On Sun, 24 Dec 2023 at 15:51, Sam Varshavchik 
> wrote:
>
>> Stephen Smoogen writes:
>>
>> > My apologies for bad quoting.. email from phone. What version of rpm
>> > build is used and what are some packages which are rebuilt that show this
>> > issue. This may be needed if the core dump is due to something else in the
>> > environment like memory limits etc
>>
>> It's 4.19.1 on FC39, and it's packages that I'm working on. It's glibc
>> complaining about a double-free, and not any resource limits. I can get
>> a backtrace out of it:
>>
>> #1  0x7f05dd8588ee raise (libc.so.6 + 0x3e8ee)
>> #2  0x7f05dd8408ff abort (libc.so.6 + 0x268ff)
>> #3  0x7f05dd8417d0 __libc_message.cold (libc.so.6 + 0x277d0)
>> #4  0x7f05dd8b47a5 malloc_printerr (libc.so.6 + 0x9a7a5)
>> #5  0x7f05dd8b6a3a _int_free (libc.so.6 + 0x9ca3a)
>> #6  0x7f05dd8b93de free (libc.so.6 + 0x9f3de)
>> #7  0x7f05dda984ec rpmugUid (librpm.so.10 + 0x584ec)
>> #8  0x7f05dda84255 rpmfilesStat (librpm.so.10 + 0x44255)
>> #9  0x7f05dda8438f rpmfiStat (librpm.so.10 + 0x4438f)
>> #10 0x7f05dda8 rpmfiArchiveWriteHeader (librpm.so.10 + 0x4)
>> #11 0x7f05dda871c9 iterWriteArchiveNext (librpm.so.10 + 0x471c9)
>>
>> I am looking at this core dump. I see 32 active execution threads at the
>> time this whole thing went kaput, and all the code in rpmug.c is
>> definitely not thread safe. I did not look very hard, I don't know if
>> there are mutexes higher up the call chain, but the overall behavior
>> (occasional core dumps) is indicative of thread races.
>>
>>
> Thanks. I was wondering if it was dnf/rpm on the system or dnf/rpm in the
> chroot but it sounds like something changed between 4.19.0.1 (what I had on
> my system since September?)  and 4.19.1 ( December)
>
> The changelog doesn't say much beyond
> * Tue Dec 12 2023 Michal Domonkos  - 4.19.1-1
> - Update to 4.19.1 (https://rpm.org/wiki/Releases/4.19.1)
>
> I forget if there is a way to pin an rpm in a mock environment so that you
> don't update over 4.19.0, to see if
> a) the problem still happens with that (possibly indicating that whatever
> is calling into rpm is broken) or b) the problem doesn't occur and it is a
> change between 4.19.0.1 and 4.19.1
>
>

https://github.com/rpm-software-management/rpm/compare/rpm-4.19.0-release...rpm-4.19.1-release

```

void * rfree (void *ptr)
{
    free(ptr);
    return NULL;
}

/**
 * Test for string equality
 * @param s1	string 1
 * @param s2	string 2
 * @return	0 if strings differ, 1 if equal
 */
static inline int rstreq(const char *s1, const char *s2)
{
    return (strcmp(s1, s2) == 0);
}

int rpmugUid(const char * thisUname, uid_t * uid) {
    static char * lastUname = NULL;
    static uid_t lastUid;

    if (!thisUname) {
        lastUname = rfree(lastUname);
        return -1;
    } else if (rstreq(thisUname, UID_0_USER)) {
        *uid = 0;
        return 0;
    }

    if (lastUname == NULL || !rstreq(thisUname, lastUname)) {
        long id;

        if (lookup_num(pwfile(), thisUname, 0, 2, &id))
            return -1;
        free(lastUname);
        lastUname = xstrdup(thisUname);
        lastUid = id;
    }
    *uid = lastUid;
    return 0;
}

```
I am guessing the problem is really with the free(lastUname), since the
rfree isn't referred to (though I am not sure whether an optimization would
have removed it). The comment before this code mentions that this is a hack
to try and get things done, probably from long, long ago when rpm was
single-threaded.

-- 
Stephen Smoogen, Red Hat Automotive
Let us be kind to one another, for most of us are fighting a hard battle.
-- Ian MacClaren


Re: rpmbuild core dumps

2023-12-25 Thread Sam Varshavchik

Sam Varshavchik writes:



Looking at a diff between the 4.19.0 and 4.19.1 tags, a call to rpmfiStat()
was added to fill_archive_entry(). The backtrace above shows the execution
finding its way from rpmfiStat() into very-much-thread-unsafe code in rpmug.c.


That code is used only by rpm2archive, though.

But the backtrace does show an execution path from packageBinaries into  
rpmug.c.


I wonder what happens if someone were to try to rebuild texlive, and all of
its umpteen packages. That should thoroughly exercise the problematic code…






Re: rpmbuild core dumps

2023-12-25 Thread Sam Varshavchik

Stephen Smoogen writes:


   #1  0x7f05dd8588ee raise (libc.so.6 + 0x3e8ee)
   #2  0x7f05dd8408ff abort (libc.so.6 + 0x268ff)
   #3  0x7f05dd8417d0 __libc_message.cold (libc.so.6 + 0x277d0)
   #4  0x7f05dd8b47a5 malloc_printerr (libc.so.6 + 0x9a7a5)
   #5  0x7f05dd8b6a3a _int_free (libc.so.6 + 0x9ca3a)
   #6  0x7f05dd8b93de free (libc.so.6 + 0x9f3de)
   #7  0x7f05dda984ec rpmugUid (librpm.so.10 + 0x584ec)
   #8  0x7f05dda84255 rpmfilesStat (librpm.so.10 + 0x44255)
   #9  0x7f05dda8438f rpmfiStat (librpm.so.10 + 0x4438f)
   #10 0x7f05dda8 rpmfiArchiveWriteHeader (librpm.so.10 + 0x4)
   #11 0x7f05dda871c9 iterWriteArchiveNext (librpm.so.10 + 0x471c9)


Thanks. I was wondering if it was dnf/rpm on the system or dnf/rpm in the  
chroot but it sounds like something changed between 4.19.0.1 (what I had on  
my system since September?)  and 4.19.1 ( December)




Looking at a diff between the 4.19.0 and 4.19.1 tags, a call to rpmfiStat()
was added to fill_archive_entry(). The backtrace above shows the execution
finding its way from rpmfiStat() into very-much-thread-unsafe code in rpmug.c.


Further up the backtrace is packageBinaries() which, according to the  
backtrace, is being multithreaded via OMP. Looking at the core dump, I see a  
bunch of execution threads running packageBinaries().


I'm confident that this is the breakage. 4.19.1 needs to be fixed ASAP.

My humble suggestion for the simplest fix is to slap a __thread on all those
static variables in rpmug.c. Or, perhaps, throw a mutex around them so that
all execution threads share the micro-optimization those static variables
are used for…


Hopefully that's the only thing that's thread unsafe in there…





Re: rpmbuild core dumps

2023-12-25 Thread Stephen Smoogen
On Sun, 24 Dec 2023 at 15:51, Sam Varshavchik  wrote:

> Stephen Smoogen writes:
>
> > My apologies for bad quoting.. email from phone. What version of rpm
> > build is used and what are some packages which are rebuilt that show this
> > issue. This may be needed if the core dump is due to something else in the
> > environment like memory limits etc
>
> It's 4.19.1 on FC39, and it's packages that I'm working on. It's glibc
> complaining about a double-free, and not any resource limits. I can get a
> backtrace out of it:
>
> #1  0x7f05dd8588ee raise (libc.so.6 + 0x3e8ee)
> #2  0x7f05dd8408ff abort (libc.so.6 + 0x268ff)
> #3  0x7f05dd8417d0 __libc_message.cold (libc.so.6 + 0x277d0)
> #4  0x7f05dd8b47a5 malloc_printerr (libc.so.6 + 0x9a7a5)
> #5  0x7f05dd8b6a3a _int_free (libc.so.6 + 0x9ca3a)
> #6  0x7f05dd8b93de free (libc.so.6 + 0x9f3de)
> #7  0x7f05dda984ec rpmugUid (librpm.so.10 + 0x584ec)
> #8  0x7f05dda84255 rpmfilesStat (librpm.so.10 + 0x44255)
> #9  0x7f05dda8438f rpmfiStat (librpm.so.10 + 0x4438f)
> #10 0x7f05dda8 rpmfiArchiveWriteHeader (librpm.so.10 + 0x4)
> #11 0x7f05dda871c9 iterWriteArchiveNext (librpm.so.10 + 0x471c9)
>
> I am looking at this core dump. I see 32 active execution threads at the
> time this whole thing went kaput, and all the code in rpmug.c is
> definitely not thread safe. I did not look very hard, I don't know if
> there are mutexes higher up the call chain, but the overall behavior
> (occasional core dumps) is indicative of thread races.
>
>
Thanks. I was wondering if it was dnf/rpm on the system or dnf/rpm in the
chroot but it sounds like something changed between 4.19.0.1 (what I had on
my system since September?)  and 4.19.1 ( December)

The changelog doesn't say much beyond
* Tue Dec 12 2023 Michal Domonkos  - 4.19.1-1
- Update to 4.19.1 (https://rpm.org/wiki/Releases/4.19.1)

I forget if there is a way to pin an rpm in a mock environment so that you
don't update over 4.19.0, to see if
a) the problem still happens with that (possibly indicating that whatever
is calling into rpm is broken) or b) the problem doesn't occur and it is a
change between 4.19.0.1 and 4.19.1




-- 
Stephen Smoogen, Red Hat Automotive
Let us be kind to one another, for most of us are fighting a hard battle.
-- Ian MacClaren


Re: rpmbuild core dumps

2023-12-24 Thread Sam Varshavchik

Stephen Smoogen writes:

My apologies for bad quoting.. email from phone. What version of rpm build
is used and what are some packages which are rebuilt that show this issue.
This may be needed if the core dump is due to something else in the
environment like memory limits etc


It's 4.19.1 on FC39, and it's packages that I'm working on. It's glibc  
complaining about a double-free, and not any resource limits. I can get a  
backtrace out of it:


   #1  0x7f05dd8588ee raise (libc.so.6 + 0x3e8ee)
   #2  0x7f05dd8408ff abort (libc.so.6 + 0x268ff)
   #3  0x7f05dd8417d0 __libc_message.cold (libc.so.6 + 0x277d0)
   #4  0x7f05dd8b47a5 malloc_printerr (libc.so.6 + 0x9a7a5)
   #5  0x7f05dd8b6a3a _int_free (libc.so.6 + 0x9ca3a)
   #6  0x7f05dd8b93de free (libc.so.6 + 0x9f3de)
   #7  0x7f05dda984ec rpmugUid (librpm.so.10 + 0x584ec)
   #8  0x7f05dda84255 rpmfilesStat (librpm.so.10 + 0x44255)
   #9  0x7f05dda8438f rpmfiStat (librpm.so.10 + 0x4438f)
   #10 0x7f05dda8 rpmfiArchiveWriteHeader (librpm.so.10 + 0x4)
   #11 0x7f05dda871c9 iterWriteArchiveNext (librpm.so.10 + 0x471c9)

I am looking at this core dump. I see 32 active execution threads at the
time this whole thing went kaput, and all the code in rpmug.c is definitely
not thread safe. I did not look very hard, and I don't know if there are
mutexes higher up the call chain, but the overall behavior (occasional core
dumps) is indicative of thread races.







Re: rpmbuild core dumps

2023-12-24 Thread Stephen Smoogen
My apologies for bad quoting.. email from phone. What version of rpmbuild
is used, and what are some packages which are rebuilt that show this issue?
This may be needed if the core dump is due to something else in the
environment, like memory limits etc.

Stephen Smoogen, Red Hat Automotive
Let us be kind to one another, for most of us are fighting a hard battle.
-- Ian MacClaren


On Sun, Dec 24, 2023 at 14:27 Sam Varshavchik  wrote:

> It seems that rpmbuild dumps core at the end of the build process, every
> once in a while. Typical:
>
> Wrote: /__w/courier-libs/courier-libs/courier-authlib/rpm/RPMS/x86_64/courier-authlib-userdb-debuginfo-0.72.0.20231223-101.fc39.x86_64.rpm
> Wrote: /__w/courier-libs/courier-libs/courier-authlib/rpm/RPMS/x86_64/courier-authlib-devel-0.72.0.20231223-101.fc39.x86_64.rpm
> double free or corruption (fasttop)
> make[1]: *** [Makefile:2447: dorpm] Aborted (core dumped)
> make[1]: Leaving directory '/__w/courier-libs/courier-libs/courier-authlib'
> make: *** [Makefile:2439: rpm-build] Error 2
>
> Just trying again, with the same build, typically succeeds. In my
> estimate it dumps core about 5% of the time, randomly.
>
> I can rule out a hardware issue on my side, because this just happened in
> a github workflow container:
>
>
> https://github.com/svarshavchik/courier-libs/actions/runs/7315972215/job/19930137170
>
> You'll probably need to be signed into github to see the log, but that's
> basically it.
>
> I can't find anything relevant in Bugzilla, is anyone else seeing this,
> too?
>


rpmbuild core dumps

2023-12-24 Thread Sam Varshavchik
It seems that rpmbuild dumps core at the end of the build process, every  
once in a while. Typical:


Wrote: /__w/courier-libs/courier-libs/courier-authlib/rpm/RPMS/x86_64/courier-authlib-userdb-debuginfo-0.72.0.20231223-101.fc39.x86_64.rpm
Wrote: /__w/courier-libs/courier-libs/courier-authlib/rpm/RPMS/x86_64/courier-authlib-devel-0.72.0.20231223-101.fc39.x86_64.rpm
double free or corruption (fasttop)
make[1]: *** [Makefile:2447: dorpm] Aborted (core dumped)
make[1]: Leaving directory '/__w/courier-libs/courier-libs/courier-authlib'
make: *** [Makefile:2439: rpm-build] Error 2

Just trying again, with the same build, typically succeeds. In my estimate  
it dumps core about 5% of the time, randomly.


I can rule out a hardware issue on my side, because this just happened in a  
github workflow container:


https://github.com/svarshavchik/courier-libs/actions/runs/7315972215/job/19930137170

You'll probably need to be signed into github to see the log, but that's
basically it.


I can't find anything relevant in Bugzilla, is anyone else seeing this, too?


