Re: 6TB partition, Data only 2TB - aka When you haven't hit the "usual" problem

2016-08-05 Thread Gabriel C

On 04.08.2016 18:53, Lutz Vieweg wrote:
> 
> I was today hit by what I think is probably the same bug:
> A btrfs on a close-to-4TB sized block device, only half filled
> to almost exactly 2 TB, suddenly says "no space left on device"
> upon any attempt to write to it. The filesystem was NOT automatically
> switched to read-only by the kernel, I should mention.
> 
> Re-mounting (which is a pain as this filesystem is used for
> $HOMEs of a multitude of active users who I have to kick from
> the server for doing things like re-mounting) removed the symptom
> for now, but from what I can read in linux-btrfs mailing list
> archives, it is pretty likely the symptom will re-appear.
> 
> Here are some more details:
> 
> Software versions:
>> linux-4.6.1 (vanilla from kernel.org)
...
> 
> dmesg output from the time the "no space left on device"-symptom
> appeared:
> 
>> [5171203.601620] WARNING: CPU: 4 PID: 23208 at fs/btrfs/inode.c:9261 
>> btrfs_destroy_inode+0x263/0x2a0 [btrfs]


> ...
>> [5171230.306037] WARNING: CPU: 18 PID: 12656 at fs/btrfs/extent-tree.c:4233 
>> btrfs_free_reserved_data_space_noquota+0xf3/0x100 [btrfs]


Sounds like the bug I hit too.

To fix this you'll need:


crazy@zwerg:~/Work/linux-git$ git show 8b8b08cbf
commit 8b8b08cbfb9021af4b54b4175fc4c51d655aac8c
Author: Chris Mason 
Date:   Tue Jul 19 05:52:36 2016 -0700

Btrfs: fix delalloc accounting after copy_from_user faults

Commit 56244ef151c3cd11 was almost but not quite enough to fix the
reservation math after btrfs_copy_from_user returned partial copies.

Some users are still seeing warnings in btrfs_destroy_inode, and with a
long enough test run I'm able to trigger them as well.

This patch fixes the accounting math again, bringing it much closer to
the way it was before the sectorsize conversion Chandan did.  The
problem is accounting for the offset into the page/sector when we do a
partial copy.  This one just uses the dirty_sectors variable which
should already be updated properly.

Signed-off-by: Chris Mason 
cc: sta...@vger.kernel.org # v4.6+

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index f3f61d1..bcfb4a2 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1629,13 +1629,11 @@ again:
 		 * managed to copy.
 		 */
 		if (num_sectors > dirty_sectors) {
-			/*
-			 * we round down because we don't want to count
-			 * any partial blocks actually sent through the
-			 * IO machines
-			 */
-			release_bytes = round_down(release_bytes - copied,
-						      root->sectorsize);
+
+			/* release everything except the sectors we dirtied */
+			release_bytes -= dirty_sectors <<
+				root->fs_info->sb->s_blocksize_bits;
+
 			if (copied > 0) {
 				spin_lock(&BTRFS_I(inode)->lock);
 				BTRFS_I(inode)->outstanding_extents++;
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: systemd KillUserProcesses=yes and btrfs scrub

2016-07-31 Thread Gabriel C


On 30.07.2016 22:02, Chris Murphy wrote:
> Short version: When systemd-logind login.conf KillUserProcesses=yes,
> and the user does "sudo btrfs scrub start" in e.g. GNOME Terminal, and
> then logs out of the shell, the user space operation is killed, and
> btrfs scrub status reports that the scrub was aborted. [1]
> 

How is this a bug?

This is exactly what 'KillUserProcesses=yes' is expected to do.



Re: A lot warnings in dmesg while running thunderbird

2016-07-21 Thread Gabriel C


On 21.07.2016 14:56, Chris Mason wrote:
> On 07/20/2016 01:50 PM, Gabriel C wrote:
>>
>> After 24h of running the program and thunderbird, all is still fine here.
>>
>> I'll let it run one more day, but it looks very good.
>>
> 
> Thanks for your time in helping to track this down.  It'll go into the 
> next merge window and be cc'd to stable.
> 

You are welcome :)

The test program ran without problems for 52h. I think your fix is fine :)

Also feel free to add Tested-by: Gabriel Craciunescu <nix.or@gmail.com> to 
your commit.

Regards,

Gabriel


Re: A lot warnings in dmesg while running thunderbird

2016-07-20 Thread Gabriel C


On 20.07.2016 15:50, Chris Mason wrote:
> 
> 
> On 07/19/2016 08:11 PM, Gabriel C wrote:
>>
>>
>> On 19.07.2016 13:05, Chris Mason wrote:
>>> On Mon, Jul 11, 2016 at 11:28:01AM +0530, Chandan Rajendra wrote:
>>>> Hi Chris,
>>>>
>>>> I am able to reproduce the issue with the 'short-write' program. But before
>>>> the call trace associated with btrfs_destroy_inode(), I see the following 
>>>> call
>>>> trace ...
>>>>
>>>> [ cut here ]
>>>> WARNING: CPU: 2 PID: 2311 at 
>>>> /home/chandan/repos/linux/fs/btrfs/extent-tree.c:4303 
>>>> btrfs_free_reserved_data_space_noquota+0xe8/0x100
>>>
>>> [ ... ]
>>>
>>> Ok, the problem is in how we're dealing with the offset into the sector when
>>> we fail. The dirty_sectors variable already has this accounted in it, so
>>> this patch fixes it for me.  I ran overnight, but I'll let it go for a few
>>> days just to make sure:
>>>
>>> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
>>> index fac9b839..5842423 100644
>>> --- a/fs/btrfs/file.c
>>> +++ b/fs/btrfs/file.c
>>> @@ -1629,13 +1629,11 @@ again:
>>>  * managed to copy.
>>>  */
>>> if (num_sectors > dirty_sectors) {
>>> -   /*
>>> -* we round down because we don't want to count
>>> -* any partial blocks actually sent through the
>>> -* IO machines
>>> -*/
>>> -   release_bytes = round_down(release_bytes - copied,
>>> - root->sectorsize);
>>> +
>>> +   /* release everything except the sectors we dirtied */
>>> +   release_bytes -= dirty_sectors <<
>>> +   root->fs_info->sb->s_blocksize_bits;
>>> +
>>> if (copied > 0) {
>>> spin_lock(&BTRFS_I(inode)->lock);
>>> BTRFS_I(inode)->outstanding_extents++;
>>>
>>
>> Since I guess you are testing this on latest git code I started to test on 
>> latest stable.
> 
> Any v4.7-rc or v4.6 stable where the patch applies ;)
> 
>>
>> So far all seems fine; your test program is still running without 
>> triggering the bug.
>>
>> Thunderbird is also running without triggering the bug.
>>
>> I'll let it run overnight and report back.
> 
> Great, thanks!

After 24h of running the program and thunderbird, all is still fine here.

I'll let it run one more day, but it looks very good.


Regards,

Gabriel


Re: A lot warnings in dmesg while running thunderbird

2016-07-19 Thread Gabriel C


On 19.07.2016 13:05, Chris Mason wrote:
> On Mon, Jul 11, 2016 at 11:28:01AM +0530, Chandan Rajendra wrote:
>> Hi Chris,
>>
>> I am able to reproduce the issue with the 'short-write' program. But before
>> the call trace associated with btrfs_destroy_inode(), I see the following 
>> call
>> trace ...
>>
>> [ cut here ]
>> WARNING: CPU: 2 PID: 2311 at 
>> /home/chandan/repos/linux/fs/btrfs/extent-tree.c:4303 
>> btrfs_free_reserved_data_space_noquota+0xe8/0x100
> 
> [ ... ]
> 
> Ok, the problem is in how we're dealing with the offset into the sector when
> we fail. The dirty_sectors variable already has this accounted in it, so
> this patch fixes it for me.  I ran overnight, but I'll let it go for a few
> days just to make sure:
> 
> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> index fac9b839..5842423 100644
> --- a/fs/btrfs/file.c
> +++ b/fs/btrfs/file.c
> @@ -1629,13 +1629,11 @@ again:
>* managed to copy.
>*/
>   if (num_sectors > dirty_sectors) {
> - /*
> -  * we round down because we don't want to count
> -  * any partial blocks actually sent through the
> -  * IO machines
> -  */
> - release_bytes = round_down(release_bytes - copied,
> -   root->sectorsize);
> +
> + /* release everything except the sectors we dirtied */
> + release_bytes -= dirty_sectors <<
> + root->fs_info->sb->s_blocksize_bits;
> +
>   if (copied > 0) {
>   spin_lock(&BTRFS_I(inode)->lock);
>   BTRFS_I(inode)->outstanding_extents++;
> 

Since I guess you are testing this on latest git code I started to test on 
latest stable.

So far all seems fine; your test program is still running without 
triggering the bug.

Thunderbird is also running without triggering the bug.

I'll let it run overnight and report back.



Re: A lot warnings in dmesg while running thunderbird

2016-07-08 Thread Gabriel C

On 08.07.2016 14:41, Chris Mason wrote:
>
>
> On 07/08/2016 05:57 AM, Gabriel C wrote:
>>
>> 2016-07-07 21:21 GMT+02:00 Chris Mason <c...@fb.com>:
>>>
>>>
>>> On 07/07/2016 06:24 AM, Gabriel C wrote:
>>>>
>>>> Hi,
>>>>
>>>> while running thunderbird on linux 4.6.3 and 4.7.0-rc6 (I haven't tested
>>>> other versions)
>>>> I trigger the following:
>>>
>>>
>>> I definitely thought we had this fixed in v4.7-rc.  Can you easily 
>>> fsck this filesystem?  Something strange is going on.
>>
>>
>> Yes, btrfs check and btrfs check --check-data-csum are both fine; no 
>> errors found.
>>
>> If you want me to test any patches, let me know.
>>
>
> Can you please try a v4.5 stable kernel?  I'm curious if this really 
> is the same regression that I tried to fix in v4.7
>

I'm on linux 4.5.7 now and everything is fine. I'm writing this email 
from thunderbird, which was not possible in 4.6.3 or 4.7-rc.

Let me know if you want me to test other kernels or whatever else may help 
fix this problem.


Regards,

Gabriel C



Re: A lot warnings in dmesg while running thunderbird

2016-07-08 Thread Gabriel C
2016-07-08 14:41 GMT+02:00 Chris Mason <c...@fb.com>:
>
>
> On 07/08/2016 05:57 AM, Gabriel C wrote:
>>
>> 2016-07-07 21:21 GMT+02:00 Chris Mason <c...@fb.com>:
>>>
>>>
>>>
>>> On 07/07/2016 06:24 AM, Gabriel C wrote:
>>>>
>>>>
>>>> Hi,
>>>>
>>>> while running thunderbird on linux 4.6.3 and 4.7.0-rc6 (I haven't tested
>>>> other versions)
>>>> I trigger the following:
>>>
>>>
>>>
>>> I definitely thought we had this fixed in v4.7-rc.  Can you easily fsck
>>> this filesystem?  Something strange is going on.
>>
>>
>> Yes, btrfs check and btrfs check --check-data-csum are both fine; no errors
>> found.
>>
>> If you want me to test any patches, let me know.
>>
>
> Can you please try a v4.5 stable kernel?  I'm curious if this really is the
> same regression that I tried to fix in v4.7
>

Sure, I'll test on 4.5.7 and let you know.


Re: A lot warnings in dmesg while running thunderbird

2016-07-08 Thread Gabriel C
2016-07-07 21:21 GMT+02:00 Chris Mason <c...@fb.com>:
>
>
> On 07/07/2016 06:24 AM, Gabriel C wrote:
>>
>> Hi,
>>
>> while running thunderbird on linux 4.6.3 and 4.7.0-rc6 (I haven't tested
>> other versions)
>> I trigger the following:
>
>
> I definitely thought we had this fixed in v4.7-rc.  Can you easily fsck this 
> filesystem?  Something strange is going on.

Yes, btrfs check and btrfs check --check-data-csum are both fine; no errors found.

If you want me to test any patches, let me know.


Regards,

Gabriel C


A lot warnings in dmesg while running thunderbird

2016-07-07 Thread Gabriel C
]  [] ?
block_group_cache_tree_search+0xb1/0xd0 [btrfs]
[ 6509.253610]  [] ? run_delalloc_nocow+0xa60/0xba0 [btrfs]
[ 6509.253627]  [] ? run_delalloc_range+0x390/0x3b0 [btrfs]
[ 6509.253630]  [] ? flush_tlb_page+0x35/0x90
[ 6509.253647]  [] ? writepage_delalloc.isra.20+0xfb/0x170 [btrfs]
[ 6509.253664]  [] ? __extent_writepage+0xb3/0x300 [btrfs]
[ 6509.253668]  [] ? __set_page_dirty_nobuffers+0xea/0x140
[ 6509.253685]  [] ? extent_write_cache_pages.isra.16.constprop.31+0x23c/0x350 [btrfs]
[ 6509.253702]  [] ? extent_writepages+0x48/0x60 [btrfs]
[ 6509.253718]  [] ? btrfs_direct_IO+0x360/0x360 [btrfs]
[ 6509.253723]  [] ? __filemap_fdatawrite_range+0xa2/0xe0
[ 6509.253739]  [] ? btrfs_fdatawrite_range+0x16/0x40 [btrfs]
[ 6509.253755]  [] ? start_ordered_ops+0x10/0x20 [btrfs]
[ 6509.253771]  [] ? btrfs_sync_file+0x41/0x360 [btrfs]
[ 6509.253775]  [] ? do_fsync+0x33/0x60
[ 6509.253778]  [] ? SyS_fsync+0x7/0x10
[ 6509.253782]  [] ? entry_SYSCALL_64_fastpath+0x1a/0xa4

...

See http://paste.opensuse.org/view/simple/86078072 and
http://paste.opensuse.org/view/simple/87276071

This is from running thunderbird for just a few seconds; when I let it run
for a while I have to reboot the system.

$ uname -a
Linux zwerg 4.7.0-rc6 #1 SMP PREEMPT Tue Jul 5 07:48:39 CEST 2016 x86_64 x86_64 x86_64 GNU/Linux

btrfs-progs v4.6.1

sda is HW RAID0

...

Jun 23 14:27:48 localhost kernel: scsi host0: Avago SAS based MegaRAID driver
Jun 23 14:27:48 localhost kernel: scsi 0:0:6:0: Direct-Access ATA WDC WD5002ABYS-5 3B06 PQ: 0 ANSI: 5
Jun 23 14:27:48 localhost kernel: scsi 0:0:7:0: Direct-Access ATA WDC WD5002ABYS-5 3B06 PQ: 0 ANSI: 5
Jun 23 14:27:48 localhost kernel: scsi 0:0:10:0: Direct-Access ATA ST500NM0011 FTM6 PQ: 0 ANSI: 5
Jun 23 14:27:48 localhost kernel: scsi 0:2:0:0: Direct-Access LSI MegaRAID SAS RMB 1.40 PQ: 0 ANSI: 5

...

mount | grep sda
/dev/sda1 on / type btrfs (rw,noatime,compress=lzo,space_cache,autodefrag,subvolid=5,subvol=/)

(Tested with and without compression; with just the defaults the
warnings are still the same.)

btrfs fi show
Label: none  uuid: 67b2e285-e331-42ad-8478-d78b17ea6970
   Total devices 1 FS bytes used 31.47GiB
   devid    1 size 1.36TiB used 37.06GiB path /dev/sda1


btrfs fi df /
Data, single: total=32.00GiB, used=30.43GiB
System, DUP: total=32.00MiB, used=16.00KiB
Metadata, DUP: total=2.50GiB, used=1.04GiB
GlobalReserve, single: total=368.00MiB, used=0.00B


Regards,

Gabriel C