On 03/14/2018 08:27 PM, Austin S. Hemmelgarn wrote:
> On 2018-03-14 14:39, Goffredo Baroncelli wrote:
>> On 03/14/2018 01:02 PM, Austin S. Hemmelgarn wrote:
>> [...]
In btrfs, a checksum mismatch causes the read() to fail with -EIO. In
a conventional filesystem (or a btrfs filesystem w/o datasum) there is no
checksum, so this problem doesn't exist.
On 03/14/2018 01:02 PM, Austin S. Hemmelgarn wrote:
[...]
>>
>> In btrfs, a checksum mismatch causes the read() to fail with -EIO. In a
>> conventional filesystem (or a btrfs filesystem w/o datasum) there is no
>> checksum, so this problem doesn't exist.
>>
>> I am curious how ZFS solves this…
On 2018-03-13 15:36, Goffredo Baroncelli wrote:
On 03/12/2018 10:48 PM, Christoph Anton Mitterer wrote:
On Mon, 2018-03-12 at 22:22 +0100, Goffredo Baroncelli wrote:
Unfortunately no, the likelihood might be 100%: there are some
patterns which trigger this problem quite easily. See the link which
I posted in my previous email.
On Tue, 2018-03-13 at 20:36 +0100, Goffredo Baroncelli wrote:
> A checksum mismatch is returned as -EIO by a read() syscall. This is
> an event that most programs handle badly.
Then these programs must simply be fixed... otherwise they'll also fail
under normal circumstances with btrfs…
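A minimal sketch of the "fix the programs" point above: treat a failed read (which on btrfs can be -EIO from a checksum mismatch) as a distinct, reportable condition instead of ignoring it. `safe_read` is a hypothetical helper for illustration, not something from the thread.

```shell
# safe_read FILE: returns 0 if FILE can be read end to end.
# dd exits non-zero if the underlying read() fails with EIO,
# which is what a btrfs checksum mismatch surfaces as.
safe_read() {
    if dd if="$1" of=/dev/null bs=1M 2>/dev/null; then
        return 0
    fi
    echo "read of $1 failed; possible checksum mismatch, check dmesg" >&2
    return 1
}
```

On a btrfs filesystem with a redundant profile, a scrub (`btrfs scrub start <mount>`) can then repair the bad copy from a good one; on a single-copy profile the error at least gets reported instead of silently propagating.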
On 9 March 2018 at 20:05, Alex Adriaanse wrote:
>
> Yes, we have PostgreSQL databases running on these VMs that put a heavy
> I/O load on these machines.
Dump the databases and recreate them with --data-checksums and the Btrfs
No_COW attribute.
You can add this to /etc/postgresql-common/createcluster.
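The suggestion above can be sketched as the following admin-command fragment (cluster paths and version are examples, not from the thread; note that No_COW only applies to files created after the attribute is set, so the directory must be empty when `chattr +C` runs):

```shell
pg_dumpall -U postgres > /backup/all.sql      # 1. dump every database
# 2. stop/remove the old cluster, then recreate its data directory
mkdir -p /var/lib/postgresql/10/main
chattr +C /var/lib/postgresql/10/main         # 3. mark the dir No_COW on btrfs
# 4. recreate the cluster with page-level checksums in PostgreSQL itself
initdb --data-checksums -D /var/lib/postgresql/10/main
psql -U postgres -f /backup/all.sql           # 5. restore
```

This trades btrfs data checksumming (lost with No_COW) for PostgreSQL's own page checksums, which the database can verify without the -EIO failure mode discussed earlier in the thread.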
On Mon, 2018-03-12 at 22:22 +0100, Goffredo Baroncelli wrote:
> Unfortunately no, the likelihood might be 100%: there are some
> patterns which trigger this problem quite easily. See the link which
> I posted in my previous email. There was a program which creates a
> bad checksum (in COW+DATASUM mode)…
On 03/11/2018 11:37 PM, Christoph Anton Mitterer wrote:
> On Sun, 2018-03-11 at 18:51 +0100, Goffredo Baroncelli wrote:
>>
>> COW is needed to properly checksum the data. Otherwise it is not
>> possible to ensure the coherency between data and checksum (however I
>> have to point out that BTRFS fails…
On Sun, 2018-03-11 at 18:51 +0100, Goffredo Baroncelli wrote:
>
> COW is needed to properly checksum the data. Otherwise it is not
> possible to ensure the coherency between data and checksum (however I
> have to point out that BTRFS fails even in this case [*]).
> We could rearrange this sentence…
On 03/10/2018 03:29 PM, Christoph Anton Mitterer wrote:
> On Sat, 2018-03-10 at 14:04 +0200, Nikolay Borisov wrote:
>> So for OLTP workloads you definitely want nodatacow enabled, bear in
>> mind this also disables crc checksumming, but your db engine should
>> already have such functionality implemented in it.
On Sat, 2018-03-10 at 14:04 +0200, Nikolay Borisov wrote:
> So for OLTP workloads you definitely want nodatacow enabled, bear in
> mind this also disables crc checksumming, but your db engine should
> already have such functionality implemented in it.
Unlike repeated claims made here on the list…
On 9.03.2018 21:05, Alex Adriaanse wrote:
> Am I correct to understand that nodatacow doesn't really avoid CoW when
> you're using snapshots? In a filesystem that's snapshotted…
Yes, so nodatacow won't interfere with how snapshots operate. For more
information on that topic check the following…
On Mar 9, 2018, at 3:54 AM, Nikolay Borisov wrote:
>
>> Sorry, I clearly missed that one. I have applied the patch you referenced
>> and rebooted the VM in question. This morning we had another FS failure on
>> the same machine that caused it to go into readonly mode. This happened
>> after that device was experiencing 100% I/O utilization for some time…
> Sorry, I clearly missed that one. I have applied the patch you referenced and
> rebooted the VM in question. This morning we had another FS failure on the
> same machine that caused it to go into readonly mode. This happened after
> that device was experiencing 100% I/O utilization for some time…
On Mar 2, 2018, at 11:29 AM, Liu Bo wrote:
> On Thu, Mar 01, 2018 at 09:40:41PM +0200, Nikolay Borisov wrote:
>> On 1.03.2018 21:04, Alex Adriaanse wrote:
>>> Thanks so much for the suggestions so far, everyone. I wanted to report
>>> back on this. Last Friday I made the following changes…
On Thu, Mar 01, 2018 at 09:40:41PM +0200, Nikolay Borisov wrote:
> On 1.03.2018 21:04, Alex Adriaanse wrote:
> > On Feb 16, 2018, at 1:44 PM, Austin S. Hemmelgarn wrote:
...
> > [496003.641729] BTRFS: error (device xvdc) in __btrfs_free_extent:7076:
> > errno=-28 No space left
On 2018-03-02 03:04, Alex Adriaanse wrote:
> On Feb 16, 2018, at 1:44 PM, Austin S. Hemmelgarn wrote:
>> I would suggest changing this to eliminate the balance with '-dusage=10'
>> (it's redundant with the '-dusage=20' one unless your filesystem is in
>> pathologically bad shape), and adding…
On Feb 16, 2018, at 1:44 PM, Austin S. Hemmelgarn wrote:
> I would suggest changing this to eliminate the balance with '-dusage=10'
> (it's redundant with the '-dusage=20' one unless your filesystem is in
> pathologically bad shape), and adding equivalent filters for balancing
> metadata…
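The suggested change can be sketched as one balance pass per usage threshold, with matching `-dusage`/`-musage` filters (the thresholds and mount point here are illustrative, not the poster's exact values):

```shell
# Each pass only rewrites chunks less than N% full, so higher thresholds
# strictly subsume lower ones; the redundant -dusage=10 pass is dropped.
btrfs balance start -dusage=20 -musage=20 /mnt/data
btrfs balance start -dusage=40 -musage=40 /mnt/data
```

Balancing metadata with the same thresholds as data is what reclaims the mostly-empty metadata chunks that contribute to premature ENOSPC.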
> First of all, the ssd mount option does not have anything to do with
> having single or DUP metadata.
Sorry about that, I agree with you. -nossd would not help in
increasing reliability in any way. One alternative would be to format
and force duplication of metadata during filesystem creation on SSDs…
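That alternative is a one-line mkfs choice. A sketch (the device path is an example; at the time of this thread, mkfs.btrfs defaulted to single metadata when it detected an SSD, so DUP has to be forced explicitly):

```shell
# -m dup: keep two copies of metadata even on a device mkfs treats as SSD
mkfs.btrfs -m dup /dev/xvdc
```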
On 02/17/2018 05:34 AM, Shehbaz Jaffer wrote:
>> It's hosted on an EBS volume; we don't use ephemeral storage at all. The EBS
>> volumes are all SSD
>
> I have recently done some SSD corruption experiments on a small set of
> workloads, so I thought I would share my experience.
>
> While creating…
> It's hosted on an EBS volume; we don't use ephemeral storage at all. The EBS
> volumes are all SSD
I have recently done some SSD corruption experiments on a small set of
workloads, so I thought I would share my experience.
While creating btrfs using the mkfs.btrfs command for SSDs, by default the
metadata…
Austin S. Hemmelgarn posted on Fri, 16 Feb 2018 14:44:07 -0500 as
excerpted:
> This will probably sound like an odd question, but does BTRFS think your
> storage devices are SSDs or not? Based on what you're saying, it
> sounds like you're running into issues resulting from the
> over-aggressive…
…freezing, or the filesystem going into readonly mode. We've spent an
enormous amount of time trying to recover corrupted filesystems, and the
time that servers were down as a result of Btrfs instability has
accumulated to many days.
We've made many changes to try to improve Btrfs stability: upgrading to
newer kernels, setting up nightly balances, setting up…
On 16.02.2018 06:54, Alex Adriaanse wrote:
>> On Feb 15, 2018, at 2:42 PM, Nikolay Borisov wrote:
>> On 15.02.2018 21:41, Alex Adriaanse wrote:
>>>> On Feb 15, 2018, at 12:00 PM, Nikolay Borisov wrote:
>>>> So in all of the cases you are hitting some form of premature enospc.
> On Feb 15, 2018, at 12:00 PM, Nikolay Borisov wrote:
>
> So in all of the cases you are hitting some form of premature enospc.
> There was a fix that landed in 4.15 that should have fixed a rather
> long-standing issue with the way metadata reservations are satisfied,
> namely:
>
> 996478ca9c
On Fri, 27 May 2016 00:42:07 +0200, Diego Torres wrote:
> Btrfs is the only fs that can add drives one by one to an existing raid
> setup, and use the new space immediately, without replacing all the drives.
Ext4, XFS, JFS or pretty much any FS which can be resized upwards can also
do that, when…
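The two approaches being contrasted can be sketched as admin commands (device paths and mount points are examples): btrfs adds a whole new device to an existing filesystem, while ext4/XFS grow into a block device that has been enlarged underneath them.

```shell
# btrfs: add a new drive; the space is usable immediately
btrfs device add /dev/sdd /mnt/pool
btrfs balance start /mnt/pool        # optional: restripe existing data onto it
# ext4/XFS: grow after the underlying device/partition has been enlarged
resize2fs /dev/sdb1                  # ext4, online
xfs_growfs /mnt/xfs                  # XFS, takes the mount point, online
```

The difference is that the ext4/XFS path needs the underlying device to grow first (LVM, partition resize, bigger virtual disk), whereas btrfs pools whole independent devices.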
Hi there,
I've been using btrfs with a raid5 configuration with 3 disks for 6
months, and then with 4 disks for a couple of months more. I run a
weekly scrub, and a monthly balance. Btrfs is the only fs that can add
drives one by one to an existing raid setup, and use the new space
immediately, without replacing all the drives…
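The maintenance schedule described above can be sketched as /etc/crontab entries (the mount point and exact times are examples, as is the -dusage threshold):

```shell
# weekly scrub, Sunday 03:00 (-B: run in the foreground so failures are logged)
0 3 * * 0  root  btrfs scrub start -B /mnt/raid5
# monthly balance on the 1st, rewriting data chunks under 50% full
0 4 1 * *  root  btrfs balance start -dusage=50 /mnt/raid5
```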
On Sat, Jan 26, 2013 at 01:27:11PM -0700, Andrew McNabb wrote:
> Here's an update. I tried the new kernel, and I seem to be having some
> new (possibly worse) problems. In my ssh session, I'm seeing many errors
> of this sort:
>
> Message from syslogd@guru at Jan 26 13:13:14 ...
> kernel:[ 308.223834] BUG: soft lockup - CPU#0 stuck for 23s! [btrfs-endio-wri:2073]
Here's an update. I tried the new kernel, and I seem to be having some
new (possibly worse) problems. In my ssh session, I'm seeing many errors
of this sort:
Message from syslogd@guru at Jan 26 13:13:14 ...
kernel:[ 308.223834] BUG: soft lockup - CPU#0 stuck for 23s!
[btrfs-endio-wri:2073]
…
On Fri, Jan 25, 2013 at 03:53:22PM -0500, Josef Bacik wrote:
>
> Actually for this one, how did you remove the disk? Did you just yank it
> out while the box was running? Did you mount -o degraded and then delete
> the device and then remove it? How exactly did you get to this situation?…
On Fri, Jan 25, 2013 at 03:37:17PM -0500, Josef Bacik wrote:
> > https://bugzilla.redhat.com/show_bug.cgi?id=903794
>
> This one is just an allocator warning because the relocator doesn't do
> the right accounting for relocation. It's just complaining, we need to
> fix it but it won't keep it…
On Fri, Jan 25, 2013 at 01:05:14PM -0700, Andrew McNabb wrote:
> I tried creating a multi-device btrfs filesystem for the first time (on
> Fedora 18 with 3.7.2-204.fc18.x86_64), and I ran into some problems. I
> had heard that btrfs is now reasonably stable, and though I expected to
> possibly see
I tried creating a multi-device btrfs filesystem for the first time (on
Fedora 18 with 3.7.2-204.fc18.x86_64), and I ran into some problems. I
had heard that btrfs is now reasonably stable, and though I expected to
possibly see a problem here or there, I was a little surprised at just
how many problems…