On Sun, Sep 10, 2017 at 05:22:14PM -0700, Marc MERLIN wrote:
> On Sun, Sep 10, 2017 at 01:16:26PM +, Josef Bacik wrote:
> > Great, if the free space cache is fucked again after the next go
> > around then I need to expand the verifier to watch entries being added
> to the cache as well. Thanks,
On Sun, Sep 10, 2017 at 01:16:26PM +, Josef Bacik wrote:
> Great, if the free space cache is fucked again after the next go
> around then I need to expand the verifier to watch entries being added
> to the cache as well. Thanks,
Well, I copied about 1TB of data, and nothing happened.
So it se
Great, if the free space cache is fucked again after the next go around then I
need to expand the verifier to watch entries being added to the cache as well.
Thanks,
Josef
Sent from my iPhone
> On Sep 10, 2017, at 9:14 AM, Marc MERLIN wrote:
>
>> On Sun, Sep 10, 2017 at 03:12:16AM +, Josef Bacik wrote:
On Sun, Sep 10, 2017 at 03:12:16AM +, Josef Bacik wrote:
> Ok mount -o clear_cache, umount and run fsck again just to make sure. Then
> if it comes out clean mount with ref_verify again and wait for it to blow up
> again. Thanks,
Ok, just did the 2nd fsck, came back clean after mount -o clear_cache
Ok mount -o clear_cache, umount and run fsck again just to make sure. Then if
it comes out clean mount with ref_verify again and wait for it to blow up
again. Thanks,
Josef
Sent from my iPhone
> On Sep 9, 2017, at 10:37 PM, Marc MERLIN wrote:
>
>> On Sat, Sep 09, 2017 at 10:56:14PM +, Josef Bacik wrote:
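The sequence Josef is describing comes down to three steps; a minimal sketch,
assuming the filesystem sits on /dev/mapper/vg-data and mounts at /mnt/btrfs
(both hypothetical names):

  # mount once with clear_cache so the free space cache gets rebuilt
  mount -o clear_cache /dev/mapper/vg-data /mnt/btrfs
  umount /mnt/btrfs

  # offline check; it should now come back clean
  btrfs check /dev/mapper/vg-data

  # remount with the ref verifier from Josef's patches and wait for it to trip
  mount -o ref_verify /dev/mapper/vg-data /mnt/btrfs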
On Sat, Sep 09, 2017 at 10:56:14PM +, Josef Bacik wrote:
> Well that's odd, a block allocated on disk is in the free space cache. Can I
> see the full output of the fsck? I want to make sure it's actually getting
> to the part where it checks the free space cache. If it does then I'll have
Well that's odd, a block allocated on disk is in the free space cache. Can I
see the full output of the fsck? I want to make sure it's actually getting to
the part where it checks the free space cache. If it does then I'll have to
think of how to catch this kind of bug, because you've got a w
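If it helps, one way to capture the complete check output, including the free
space cache pass, assuming the filesystem is unmounted and lives on
/dev/mapper/vg-data (hypothetical name):

  # run the offline check and keep the full log
  btrfs check /dev/mapper/vg-data 2>&1 | tee btrfs-check.log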
On Tue, Sep 05, 2017 at 06:19:25PM +, Josef Bacik wrote:
> Alright I just reworked the build tree ref stuff and tested it to make sure
> it wasn’t going to give false positives again. Apparently I had only ever
> used this with very basic existing fs’es and nothing super complicated, so it
Alright I just reworked the build tree ref stuff and tested it to make sure it
wasn’t going to give false positives again. Apparently I had only ever used
this with very basic existing fs’es and nothing super complicated, so it was
just broken for anything complex. I’ve pushed it to my tree, y
Ok this output looked fishy and so I went and tested it on my box again. It
looks like I wasn't testing modifying a snapshot with an existing fs so I never
saw these errors, but I see them as well. I definitely fucked the building of
the initial ref tree. It's too late tonight for me to rewor
On Sun, Sep 03, 2017 at 05:33:33PM +, Josef Bacik wrote:
> Alright pushed, sorry about that.
I'm reasonably sure I'm running the new code, but still got this:
[ 2104.336513] Dropping a ref for a root that doesn't have a ref on the block
[ 2104.358226] Dumping block entry [115253923840 155648]
Alright pushed, sorry about that.
Josef
Sent from my iPhone
> On Sep 3, 2017, at 10:42 AM, Marc MERLIN wrote:
>
>> On Sun, Sep 03, 2017 at 02:38:57PM +, Josef Bacik wrote:
>> Oh yeah you need CONFIG_STACKTRACE turned on, otherwise this is going to be
>> difficult ;). Thanks,
>
> Right,
Jesus Christ I misspelled it, I'll fix it up when I get home. Thanks,
Josef
Sent from my iPhone
> On Sep 3, 2017, at 10:42 AM, Marc MERLIN wrote:
>
>> On Sun, Sep 03, 2017 at 02:38:57PM +, Josef Bacik wrote:
>> Oh yeah you need CONFIG_STACKTRACE turned on, otherwise this is going to be
>
On Sun, Sep 03, 2017 at 02:38:57PM +, Josef Bacik wrote:
> Oh yeah you need CONFIG_STACKTRACE turned on, otherwise this is going to be
> difficult ;). Thanks,
Right, except that I thought I did:
saruman:/usr/src/linux-btrfs/btrfs-next# grep STACKTRACE .config
CONFIG_STACKTRACE_SUPPORT=y
CO
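For what it's worth, CONFIG_STACKTRACE_SUPPORT only says the architecture can
produce stack traces; the option Josef is asking about is CONFIG_STACKTRACE
itself. If it turns out not to be set, one way to flip it in an existing
.config (run from the kernel source tree):

  # enable the symbol, then let kconfig resolve anything it pulls in
  ./scripts/config --enable STACKTRACE
  make olddefconfig

  # confirm it stuck
  grep '^CONFIG_STACKTRACE=' .config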
Oh yeah you need CONFIG_STACKTRACE turned on, otherwise this is going to be
difficult ;). Thanks,
Josef
Sent from my iPhone
> On Sep 3, 2017, at 10:31 AM, Marc MERLIN wrote:
>
>> On Sun, Sep 03, 2017 at 03:26:34AM +, Josef Bacik wrote:
>> I was looking through the code for other ways to cut down memory usage
On Sun, Sep 03, 2017 at 03:26:34AM +, Josef Bacik wrote:
> I was looking through the code for other ways to cut down memory usage when I
> noticed we only catch improper re-allocations, not adding another ref for
> metadata which is what I suspect your problem is. I added another patch and
I was looking through the code for other ways to cut down memory usage when I
noticed we only catch improper re-allocations, not adding another ref for
metadata which is what I suspect your problem is. I added another patch and
pushed it out, sorry for the churn.
Josef
Sent from my iPhone
>
On Sun, Sep 03, 2017 at 12:30:07AM +, Josef Bacik wrote:
> My bad, I forgot I don't dynamically allocate the stack trace space so my
> patch did nothing, I blame the children for distracting me. I've dropped
> allocating the action altogether for the on disk stuff, that should
> dramatically
My bad, I forgot I don't dynamically allocate the stack trace space so my patch
did nothing, I blame the children for distracting me. I've dropped allocating
the action altogether for the on disk stuff, that should dramatically reduce
the memory usage. You can just do a git pull since I made a
On Sat, Sep 02, 2017 at 04:52:20PM +, Josef Bacik wrote:
> Oops, ok I've updated my tree so we don't save the stack trace of the initial
> scan, which we don't need anyway. That should save a decent amount of memory
> in your case. It was an in place update so you'll need to blow away your
I've just had this happen for the 3rd time in 4 days. I wasn't
subscribed to the list so couldn't reply to the existing thread but
here it is http://www.spinics.net/lists/linux-btrfs/msg68662.html
I can do some limited testing. It's my main dev machine though..
On Sat, Sep 2, 2017 at 10:52 AM
Oops, ok I've updated my tree so we don't save the stack trace of the initial
scan, which we don't need anyway. That should save a decent amount of memory
in your case. It was an in place update so you'll need to blow away your local
branch and pull the new one to get the new code. Thanks,
Josef
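Because the branch was updated in place, a plain git pull would try to merge
the old and new history; resetting the local branch onto the rewritten one is
simpler. A sketch, with the remote assumed to be origin and the branch name
(btrfs-readdir) taken from elsewhere in the thread, so adjust to whatever you
actually track:

  git fetch origin
  git checkout btrfs-readdir
  git reset --hard origin/btrfs-readdir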
On Fri, Sep 01, 2017 at 11:01:30PM +, Josef Bacik wrote:
> You'll be fine, it's only happening on the one fs right? That's 13gib of
> metadata with checksums and all that shit, it'll probably look like 8 or 9gib
> of ram worst case. I'd mount with -o ref_verify and check the slab amount in
You'll be fine, it's only happening on the one fs right? That's 13gib of
metadata with checksums and all that shit, it'll probably look like 8 or 9gib
of ram worst case. I'd mount with -o ref_verify and check the slab amount in
/proc/meminfo to get an idea of real usage. Once the mount is fin
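A rough version of that check, assuming the filesystem is /dev/mapper/vg-data
mounted at /mnt/btrfs (hypothetical names):

  # mount with the ref verifier, then watch slab usage grow
  mount -o ref_verify /dev/mapper/vg-data /mnt/btrfs
  grep -E 'Slab|SReclaimable|SUnreclaim' /proc/meminfo

  # per-cache breakdown, if you want to see where the memory goes
  slabtop -o | head -n 20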
On Thu, Aug 31, 2017 at 05:48:23PM +, Josef Bacik wrote:
> We are using 4.11 in production at fb with backports from recent (a month
> ago?) stuff. I’m relatively certain nothing bad will happen, and this branch
> has the most recent fsync() corruption fix (which exists in your kernel so
>
We are using 4.11 in production at fb with backports from recent (a month ago?)
stuff. I’m relatively certain nothing bad will happen, and this branch has the
most recent fsync() corruption fix (which exists in your kernel so it’s not
new). That said if you are uncomfortable I can rebase this
On Thu, Aug 31, 2017 at 02:52:56PM +, Josef Bacik wrote:
> Hello,
>
> Sorry I really thought I could accomplish this with BPF, but ref tracking is
> just too complicated to work properly with BPF. I forward ported my ref
> verification patch to the latest kernel, you can find it in the btrf
Hello,
Sorry I really thought I could accomplish this with BPF, but ref tracking is
just too complicated to work properly with BPF. I forward ported my ref
verification patch to the latest kernel, you can find it in the btrfs-readdir
branch of my btrfs-next tree here
git://git.kernel.org/pub/
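The repository URL is cut off above, so <josef-btrfs-next-url> below is only a
placeholder for whatever the original mail gave; fetching the branch would look
roughly like:

  git clone --branch btrfs-readdir <josef-btrfs-next-url> btrfs-next
  cd btrfs-next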
I'm going to pile on this thread because I have the same issue. I've
seen this twice in just the past 2 days on a filesystem that was
created a few weeks ago. Un-mounting and mounting again with no
special options gets the filesystem back.
[Aug31 02:59] BTRFS: Transaction aborted (error -17)
[
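The recovery being described is nothing more than a remount once the
filesystem has been forced read-only, e.g. (mount point hypothetical):

  umount /mnt/btrfs
  mount /mnt/btrfs    # assumes an fstab entry; otherwise pass the device too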
On Tue, Aug 29, 2017 at 06:22:38PM +, Josef Bacik wrote:
> How much metadata do you have on this fs? I was going to hold everything in
> bpf hash trees, but I’m worried we’ll hit collisions and then the tracing
> will be useless. If it’s too big I’ll have to dump everything to userspace
>
How much metadata do you have on this fs? I was going to hold everything in
bpf hash trees, but I’m worried we’ll hit collisions and then the tracing will
be useless. If it’s too big I’ll have to dump everything to userspace and let
python take care of keeping everything in memory, so if you h
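The metadata figure Josef is asking for can be read straight off a mounted
filesystem, e.g. (mount point hypothetical):

  # summary of allocated/used space per profile, including metadata
  btrfs filesystem df /mnt/btrfs

  # more detailed per-device view
  btrfs filesystem usage /mnt/btrfs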
Alright I’ll figure out a way to differentiate between the fs’s, but being able
to scan the fs before it’s mounted was the hardest part so that’s perfect.
I’ll get something written up and tested today to make sure it won’t spit out
false positives and send it to you this afternoon or tomorrow.
On Tue, Aug 29, 2017 at 02:30:19PM +, Josef Bacik wrote:
> Sorry Marc, I’ll wire up a bcc script to try and catch when this
> happens. In order for it to work it’ll need to read the extent tree in
> before you mount the fs, is that something you’ll be able to swing or is
> this your root fs?
Sorry Marc, I’ll wire up a bcc script to try and catch when this happens. In
order for it to work it’ll need to read the extent tree in before you mount the
fs, is that something you’ll be able to swing or is this your root fs? Also is
it the only btrfs fs on the system? Thanks,
Josef
On 8/
On Sat, Jul 15, 2017 at 04:12:45PM -0700, Marc MERLIN wrote:
> On Fri, Jul 14, 2017 at 06:22:16PM -0700, Marc MERLIN wrote:
> > Dear Chris and other developers,
> >
> > Can you look at this bug which has been happening since 2012 on apparently
> > all kernels between at least
> > 3.4 and 4.11.
>
2017-07-16 18:06 GMT+02:00 Marc MERLIN :
> On Sun, Jul 16, 2017 at 04:01:53PM +0200, Giuseppe Della Bianca wrote:
>> > On Fri, Jul 14, 2017 at 06:22:16PM -0700, Marc MERLIN wrote:
>> > > Dear Chris and other developers,
>> ]zac[
>> > Others on this thread with the same error: did anyone recover fro
On Sun, Jul 16, 2017 at 04:01:53PM +0200, Giuseppe Della Bianca wrote:
> > On Fri, Jul 14, 2017 at 06:22:16PM -0700, Marc MERLIN wrote:
> > > Dear Chris and other developers,
> ]zac[
> > Others on this thread with the same error: did anyone recover from this
> > without wiping the filesystem?
> >
> On Fri, Jul 14, 2017 at 06:22:16PM -0700, Marc MERLIN wrote:
> > Dear Chris and other developers,
]zac[
> Others on this thread with the same error: did anyone recover from this
> without wiping the filesystem?
>
> Is there a chance a balance might work around the bug so that whatever
> layout I
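Whether a balance actually dodges the bug is exactly the open question above,
but mechanically a cautious attempt would start with only partially-filled
block groups, e.g. (mount point hypothetical):

  # rewrite data and metadata block groups that are at most half full
  btrfs balance start -dusage=50 -musage=50 /mnt/btrfs

  # check on it, or back out
  btrfs balance status /mnt/btrfs
  btrfs balance cancel /mnt/btrfs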
On Fri, Jul 14, 2017 at 06:22:16PM -0700, Marc MERLIN wrote:
> Dear Chris and other developers,
>
> Can you look at this bug which has been happening since 2012 on apparently
> all kernels between at least
> 3.4 and 4.11.
> I didn't look in detail at each thread (took long enough to even find the
b4/0xbc
> [] ret_from_fork+0x1f/0x40
> [] ? init_completion+0x24/0x24
> ---[ end trace feb4b95c83ac065f ]---
> BTRFS: error (device dm-2) in btrfs_run_delayed_refs:2960: errno=-17 Object
> already exists
> BTRFS info (device dm-2): forced readonly
Ok, please try this search in g
On Thu, Jul 13, 2017 at 12:17:16PM -0600, Chris Murphy wrote:
> Well I'd say it's a bug, but that's not a revelation. Is there a
> snapshot being deleted in the approximate time frame for this? I see a
Yep :)
I run btrfs-snaps and it happens right around that time.
It creates a snapshot and delete
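For context, the snapshot rotation happening in that window is a create/delete
cycle along these lines (subvolume and snapshot paths hypothetical, not the
actual btrfs-snaps invocation):

  # take a new read-only snapshot
  btrfs subvolume snapshot -r /mnt/btrfs/data /mnt/btrfs/snaps/data.20170713

  # expire the oldest one
  btrfs subvolume delete /mnt/btrfs/snaps/data.20170601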
On Wed, Jul 12, 2017 at 7:10 PM, Marc MERLIN wrote:
> On Tue, Jul 11, 2017 at 09:48:12AM -0700, Marc MERLIN wrote:
>> On Tue, Jul 11, 2017 at 10:00:40AM -0600, Chris Murphy wrote:
>> > > ---[ end trace feb4b95c83ac065f ]---
>> > > BTRFS: error (device dm-2) in b
On Tue, Jul 11, 2017 at 09:48:12AM -0700, Marc MERLIN wrote:
> On Tue, Jul 11, 2017 at 10:00:40AM -0600, Chris Murphy wrote:
> > > ---[ end trace feb4b95c83ac065f ]---
> > > BTRFS: error (device dm-2) in btrfs_run_delayed_refs:2960: errno=-17
> > > Object already e
On Tue, Jul 11, 2017 at 04:43:06PM -0600, Chris Murphy wrote:
> Assuming it works, settle on 4.9 until 4.14 shakes out a bit. Given
> your setup and the penalty for even small problems, it's probably
> better to go low risk and that means longterm kernels. Maybe one of
> the three systems can use a
On Tue, Jul 11, 2017 at 10:48 AM, Marc MERLIN wrote:
> On Tue, Jul 11, 2017 at 10:00:40AM -0600, Chris Murphy wrote:
>> > ---[ end trace feb4b95c83ac065f ]---
>> > BTRFS: error (device dm-2) in btrfs_run_delayed_refs:2960: errno=-17
>> > Object already exist
On Tue, Jul 11, 2017 at 10:00:40AM -0600, Chris Murphy wrote:
> > ---[ end trace feb4b95c83ac065f ]---
> > BTRFS: error (device dm-2) in btrfs_run_delayed_refs:2960: errno=-17 Object
> > already exists
> > BTRFS info (device dm-2): forced readonly
>
> You've already had this same traceback
; [] ret_from_fork+0x1f/0x40
> [] ? init_completion+0x24/0x24
> ---[ end trace feb4b95c83ac065f ]---
> BTRFS: error (device dm-2) in btrfs_run_delayed_refs:2960: errno=-17 Object
> already exists
> BTRFS info (device dm-2): forced readonly
You've already had this same traceback, not sure whethe
hread+0xb4/0xbc
[] ret_from_fork+0x1f/0x40
[] ? init_completion+0x24/0x24
---[ end trace feb4b95c83ac065f ]---
BTRFS: error (device dm-2) in btrfs_run_delayed_refs:2960: errno=-17 Object
already exists
BTRFS info (device dm-2): forced readonly
Yes, I'm back with 4.8 since I need to get