I am not so sure anymore that this is memory related.

For further debugging, I've updated
http://tracker.ceph.com/issues/16610
with a summary of my findings plus some log files:
- The gdb.txt I get after running
$ gdb /path/to/ceph-fuse core.XXXX
(gdb) set pag off
(gdb) set log on
(gdb) thread apply all bt
(gdb) thread apply all bt full
as advised by Brad
- The debug.out (gzipped) I get after running ceph-fuse in debug mode with
'debug client 20' and 'debug objectcacher = 20'

Cheers
Goncalo

____
From: Gregory Farnum [gfar...@redhat.com]
Sent: 12 July 2016 03:07
To: Goncalo Borges
Cc: John Spray; ceph-users
Subject: Re: [ceph-users] ceph-fuse segfaults ( jewel 10.2.2)
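For reference, the 'debug client 20' and 'debug objectcacher = 20' settings mentioned above would normally be placed in the [client] section of ceph.conf on the node running ceph-fuse; a minimal sketch (the log file path is an assumption, adjust it to your setup), picked up the next time ceph-fuse is started:

[client]
debug client = 20
debug objectcacher = 20
log file = /var/log/ceph/ceph-fuse.log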
On Tue, Jul 12, 2016 at 1:07 AM, Gregory Farnum wrote:
> Oh, is this one of your custom-built packages? Are they using
> tcmalloc? That difference between VSZ and RSS looks like a glibc
> malloc problem.
> -Greg
>
ceph-fuse at http://download.ceph.com/rpm-jewel/el7/x86_64/ is not
linked to libtcmalloc.
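A quick way to confirm whether a given ceph-fuse binary is linked against tcmalloc is to look at its shared library dependencies, for example:

$ ldd $(which ceph-fuse) | grep -i tcmalloc
# no output means ceph-fuse is not linked against libtcmalloc and is
# therefore using glibc malloc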
Oh, is this one of your custom-built packages? Are they using
tcmalloc? That difference between VSZ and RSS looks like a glibc
malloc problem.
-Greg
On Mon, Jul 11, 2016 at 12:04 AM, Goncalo Borges
wrote:
> Hi John...
>
> Thank you for replying.
>
> Here is the result of the tests you asked but I do not see anything abnormal.
Hi Goncalo,
On Fri, Jul 8, 2016 at 3:01 AM, Goncalo Borges
wrote:
> 5./ I have noticed that ceph-fuse (in 10.2.2) consumes about 1.5 GB of
> virtual memory when there are no applications using the filesystem.
>
> 7152 root 20 0 1108m 12m 5496 S 0.0 0.0 0:00.04 ceph-fuse
>
> When I onl
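To keep an eye on how the virtual and resident sizes of the ceph-fuse process evolve over time, something along these lines should be enough (the pid is a placeholder):

$ ps -o pid,vsz,rss,comm -C ceph-fuse
# vsz and rss are reported in KiB; a large gap between them means address
# space that is mapped but not resident
$ pmap -x <pid-of-ceph-fuse> | tail -n 1
# prints the per-mapping totals for the same process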
On Mon, Jul 11, 2016 at 8:04 AM, Goncalo Borges
wrote:
> Hi John...
>
> Thank you for replying.
>
> Here is the result of the tests you asked but I do not see anything abnormal.
Thanks for running through that. Yes, nothing in the output struck me
as unreasonable either :-/
> Actually, your suggestions made me see that:
On 07/11/2016 05:04 PM, Goncalo Borges wrote:
Hi John...
Thank you for replying.
Here is the result of the tests you asked but I do not see anything
abnormal. Actually, your suggestions made me see that:
1) ceph-fuse 9.2.0 is presenting the same behaviour but with less memory
consumption, probably low enough that it doesn't break ceph-fuse
Hi Brad, Patrick, All...
I think I've understood this second problem. In summary, it is memory
related.
This is how I found the source of the problem:
1./ I copied and adapted the user application to run in another
cluster of ours. The idea was for me to understand the application
Hi Goncalo,
If possible it would be great if you could capture a core file for this with
full debugging symbols (preferably glibc debuginfo as well). How you do
that will depend on the ceph version and your OS but we can offer help
if required, I'm sure.
Once you have the core do the following.
On Thu, Jul 7, 2016 at 2:01 AM, Goncalo Borges
wrote:
> Unfortunately, the other user application breaks ceph-fuse again (it is a
> completely different application than in my previous test).
>
> We have tested it in 4 machines with 4 cores. The user is submitting 16
> single core jobs which are a
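On an el7 host running the ceph.com packages, getting usable backtraces out of a core file usually means installing the matching debuginfo packages first and then scripting the gdb steps quoted earlier in the thread; a rough sketch (package and file names are assumptions, adjust to your build):

# install debug symbols for ceph and glibc (requires yum-utils)
$ sudo debuginfo-install ceph-fuse glibc

# dump all thread backtraces from the core into gdb.txt
$ gdb /usr/bin/ceph-fuse /path/to/core.XXXX -batch \
      -ex 'set pagination off' \
      -ex 'set logging on' \
      -ex 'thread apply all bt' \
      -ex 'thread apply all bt full'
# 'set logging on' writes the output to ./gdb.txt by default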
My previous email did not go through because of its size. Here goes a
new attempt:
Cheers
Goncalo
--- * ---
Hi Patrick, Brad...
Unfortunately, the other user application breaks ceph-fuse again (it is
a completely different application than in my previous test).
We have tested it in 4 machines with 4 cores.
On Thu, Jul 7, 2016 at 12:31 AM, Patrick Donnelly wrote:
>
> The locks were missing in 9.2.0. There were probably instances of the
> segfault unreported/unresolved.
Or even unseen :)
Race conditions are funny things and extremely subtle changes in
timing introduced
by any number of things can af
Hi All...
Just to confirm that, after applying the patch and recompiling, we are
no longer seeing segfaults.
I just tested with a user application which would kill ceph-fuse almost
instantaneously. Now it is running for quite some time, reading and
updating the files that it should.
I sho
Will do Brad. From your answer it should be a safe thing to do.
Will report later.
Thanks for the help
Cheers
Goncalo
On 07/05/2016 02:42 PM, Brad Hubbard wrote:
On Tue, Jul 5, 2016 at 1:34 PM, Patrick Donnelly wrote:
Hi Goncalo,
I believe this segfault may be the one fixed here:
https://github.com/ceph/ceph/pull/10027
Hi Brad, Shinobu, Patrick...
Indeed if I run with 'debug client = 20' it seems I get a very similar
log to what Patrick has in the patch. However it is difficult for me to
really say if it is exactly the same thing.
One thing I could try is simply to apply the fix in the source code and
recompile.
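For anyone wanting to try the same experiment, GitHub exposes the pull request as a plain patch, so applying it on top of a jewel checkout could look roughly like this (branch name and build step are only examples):

$ cd ceph                              # a v10.2.2 source checkout
$ git checkout -b jewel-pr10027 v10.2.2
$ curl -L https://github.com/ceph/ceph/pull/10027.patch | git am
# then rebuild and reinstall ceph-fuse the same way the running packages
# were originally built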
On Tue, Jul 5, 2016 at 1:34 PM, Patrick Donnelly wrote:
> Hi Goncalo,
>
> I believe this segfault may be the one fixed here:
>
> https://github.com/ceph/ceph/pull/10027
Ah, nice one Patrick.
Goncalo, the patch is fairly simple, just the addition of a lock on two lines to
resolve the race. Could
Hi Goncalo,
I believe this segfault may be the one fixed here:
https://github.com/ceph/ceph/pull/10027
(Sorry for brief top-post. I'm on mobile.)
On Jul 4, 2016 9:16 PM, "Goncalo Borges"
wrote:
>
> Dear All...
>
> We have recently migrated all our ceph infrastructure from 9.2.0 to
> 10.2.2.
>
> We are currently using ceph-fuse to mount cephfs in a number of clients.
On Tue, Jul 5, 2016 at 12:13 PM, Shinobu Kinjo wrote:
> Can you reproduce with debug client = 20?
In addition to this I would suggest making sure you have debug symbols
in your build
and capturing a core file.
You can do that by setting "ulimit -c unlimited" in the environment
where ceph-fuse is running.
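A minimal way to set that up before reproducing the crash (the core_pattern value is just an example):

$ ulimit -c unlimited        # in the shell that will start ceph-fuse
$ sudo sysctl -w kernel.core_pattern=/var/tmp/core.%e.%p
# reproduce the workload; a segfault should now leave a core file under /var/tmp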
Can you reproduce with debug client = 20?
On Tue, Jul 5, 2016 at 10:16 AM, Goncalo Borges <
goncalo.bor...@sydney.edu.au> wrote:
> Dear All...
>
> We have recently migrated all our ceph infrastructure from 9.2.0 to 10.2.2.
>
> We are currently using ceph-fuse to mount cephfs in a number of clients.
Dear All...
We have recently migrated all our ceph infrastructure from 9.2.0 to 10.2.2.
We are currently using ceph-fuse to mount cephfs in a number of clients.
ceph-fuse 10.2.2 client is segfaulting in some situations. One of the
scenarios where ceph-fuse segfaults is when a user submits a pa
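For context, a typical ceph-fuse invocation on one of these clients would look something like the line below (monitor address, client id and mount point are placeholders):

$ sudo ceph-fuse -m mon1.example.com:6789 --id cephfs-client /cephfs
# reads /etc/ceph/ceph.conf and the keyring for client.cephfs-client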