Re: 3.16.0 Debian kernel hang

2015-12-05 Thread Duncan
Austin S Hemmelgarn posted on Fri, 04 Dec 2015 08:08:58 -0500 as
excerpted:

> On 2015-12-04 05:00, Russell Coker wrote:
>>
>> When I mounted the filesystem with a 4.2.0 kernel it said "The free
>> space cache file (1103101952) is invalid, skip it" and then things
>> worked.  Now that the machine is running 4.2.0 everything is fine.
>>
>> I know that there are no plans to backport things to 3.16 and I don't
>> think the Debian people are going to be very interested in this.  So
>> this message is a FYI for users, maybe consider not using the
>> Debian/Jessie kernel for BTRFS systems.
>>
> I'd suggest extending that suggestion to:
> If you're not using an Enterprise distro (RHEL, SLES, CentOS, OEL), then
> you should probably be building your own kernel, ideally using upstream
> sources.

My personal recommendation differs from that somewhat, in both directions.

The first thing to consider is that, in terms of this list at least, 
while btrfs is considered to be stabilizING, it's not yet either fully 
stable or mature.  It's "good enough for daily use" provided you're 
following reasonable admin backup policy in any case: the general admin 
rule that if the data is of more value than the time and resources 
required to back it up, then it IS backed up, and conversely, that if 
it's not backed up to a particular level, you are by your actions 
defining it as worth less than the time and trouble of that Nth-level 
backup, taking into account the risk of actually having to use it.

But that assumes keeping "reasonably" current with both the kernel and 
tools.  Here the recommendation is much more relaxed than it used to be, 
but the assumption and recommendation is that you'll either follow the 
current kernel, staying no more than one release series behind (with 4.3 
out, you might still be on 4.2) unless you have a specific bug (btrfs or 
otherwise) that's still being addressed, OR at least follow an upstream 
LTS kernel series, again staying no more than one such series behind 
(the 4.1 and 3.18 LTS series are thus currently covered, with 4.4 already 
taken on as another LTS series, so those on 3.18 should be well into 
their 4.1 upgrade preparations, as 4.4 should be out around Christmas).
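
As a minimal sketch of checking where you stand -- the output shown here 
is only an example, not from any particular machine:

$ uname -r
4.2.6
$ btrfs --version
btrfs-progs v4.3

That would be one series behind the current 4.3 kernel, so still within 
the range described above, with the userspace tools kept at a roughly 
matching version.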

The reasoning is that development remains quite rapid, so older kernels 
have bugs that are known and long since fixed.  Older LTS kernels (from 
3.12 at least, as that's when the experimental label was stripped), 
while they should still be getting the critical fixes, are simply too 
prehistoric to reasonably support on-list, as too much has changed since 
then and btrfs is /not/ yet fully stable.

_But_, that's in terms of the mainline kernel and list support.  Some 
distros, primarily enterprise but the same idea applies to distros in 
general, have chosen to support btrfs on their old "stable series" 
kernels, where they presumably backport the critical btrfs patches along 
with other critical kernel patches.

If some btrfs users choose to accept their distro's claims of support on 
older kernels at face value, that's between them and the distro.  But 
here's the thing: this list is focused on the mainline kernel, and many 
here don't know or particularly care what specific patches random distro 
Y may have backported... or not.  So users who choose to run their 
distro's kernels, particularly older kernel series, really need to look 
to their distro for that support, not the list, because they're not 
running versions the list supports well.


So contrary to Austin's recommendation, mine would be: yeah, run your 
distro's kernel if it's within the general list-supported range mentioned 
above.  You don't _have_ to build your own kernel, and if you're within 
the two most current kernel series or the two most recent LTS series, 
we'll do our best on-list to help.

But if you choose to run kernels outside that list-supported range, 
regardless of whether you're running distro kernels or building your own, 
you really shouldn't be relying on this list for support.  While we'll 
generally still try to do our best to help, it's outside our focus, and 
the level of support simply isn't going to be as good as it would be if 
you were running a newer kernel series.

For enterprise distro users as well as other distros running ancient 
kernels, that means your best support may well be from your distro, and 
particularly for enterprise distros, that's often what you're paying good 
money to get, so you should be able to expect/demand that support, or you 
can take that money elsewhere.

So my recommendation doesn't really distinguish enterprise users from 
others, except that enterprise users are both often running older kernels 
and paying good money for support.  Enterprise distro or not, however, if 
users choose to run older kernels, they can expect a lower level of 
practical list support, because we're focused on much newer code.

Re: compression disk space saving - what are your results?

2015-12-05 Thread Marc Joliet
On Wednesday 02 December 2015 18:46:30 Tomasz Chmielewski wrote:
>What are your disk space savings when using btrfs with compression?
>
>I have a 200 GB btrfs filesystem which uses compress=zlib, only stores
>text files (logs), mostly multi-gigabyte files.
>
>
>It's a "single" filesystem, so "df" output matches "btrfs fi df":
>
># df -h
>Filesystem  Size  Used Avail Use% Mounted on
>(...)
>/dev/xvdb   200G  124G   76G  62% /var/log/remote
>
>
># du -sh /var/log/remote/
>153G   /var/log/remote/
>
>
> From these numbers (124 GB used where data size is 153 GB), it appears
>that we save around 20% with zlib compression enabled.
>Is 20% reasonable saving for zlib? Typically text compresses much better
>with that algorithm, although I understand that we have several
>limitations when applying that on a filesystem level.
>
>
>Tomasz Chmielewski
>http://wpkg.org

I have a total of three file systems that use compression, on a desktop and a 
laptop.  / on both uses compress=lzo, and my backup drive uses compress=zlib 
(my RAID1 FS does not use compression).  My desktop looks like this:

% df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1       108G   79G   26G  76% /
[...]

For / I get a total of about 8G or at least 9% space saving:

# du -hsc /mnt/rootfs/*
71G     /mnt/rootfs/home
14G     /mnt/rootfs/rootfs
2.3G    /mnt/rootfs/var
87G     total

I write "at least" because this does not include snapshots.  On my laptop the 
difference is merely 1 GB (83 vs. 84 GB), but it was using the autodefrag 
mount option until yesterday (when I migrated it to an SSD using dd), which 
probably accounts for a significant amount of wasted space.  I'll see how it 
develops over the next two weeks, but I expect the ratio to become similar to 
my desktop (probably less, since there is also a lot of music on there).

I would love to answer the question for my backup drive, but du took too long 
(> 1 h) so I stopped it :-( .  I might try it again later, but no promises!
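
For reference, the mount options in question boil down to something like 
these fstab entries -- a sketch only; the backup device name and mount 
point here are made up:

/dev/sda1  /            btrfs  compress=lzo   0  0
/dev/sdc1  /mnt/backup  btrfs  compress=zlib  0  0

The RAID1 filesystem simply omits the compress option.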

Greetings
-- 
Marc Joliet
--
"People who think they know everything really annoy those of us who know we
don't" - Bjarne Stroustrup



Re: compression disk space saving - what are your results?

2015-12-05 Thread Marc Joliet
On Saturday 05 December 2015 14:37:05 Marc Joliet wrote:
>My desktop looks like this:
>
>% df -h
>Filesystem      Size  Used Avail Use% Mounted on
>/dev/sda1       108G   79G   26G  76% /
>[...]
>
>For / I get a total of about 8G or at least 9% space saving:
>
># du -hsc /mnt/rootfs/*
>71G     /mnt/rootfs/home
>14G     /mnt/rootfs/rootfs
>2.3G    /mnt/rootfs/var
>87G     total
>
>I write "at least" because this does not include snapshots.

Just to be explicit, in case it was not clear: I of course meant that the 
*du output* does not account for extra space used by snapshots.

>On my laptop
>the  difference is merely 1 GB (83 vs. 84 GB),

And here I also want to clarify that the df output was 84 GB, and the du 
output was 83 GB.  Again, the du output does not account for snapshots, which 
go back farther on the laptop: 2 weeks of daily snapshots (with autodefrag!) 
instead of up to 2 days of bi-hourly snapshots.

I do think it's interesting that compression (even with LZO) seems to have 
offset the extra space wastage caused by autodefrag.

Greetings
-- 
Marc Joliet
--
"People who think they know everything really annoy those of us who know we
don't" - Bjarne Stroustrup




Re: Subvolume UUID, data corruption?

2015-12-05 Thread Duncan
Christoph Anton Mitterer posted on Sat, 05 Dec 2015 04:28:24 +0100 as
excerpted:

> On Fri, 2015-12-04 at 13:07 +, Hugo Mills wrote:
>> I don't think it'll cause problems.
> Is there any guaranteed behaviour when btrfs encounters two filesystems
> (i.e. not talking about the subvols now) with the same UUID?
> 
> Given that it's long-standing behaviour that people could clone
> filesystems (dd, etc.) and this just worked™, btrfs should at least
> handle such cases gracefully.
> For example, when more than one block device with a btrfs of the same
> UUID is already known, it should refuse to mount any of them.
> 
> And if one is already known and another device pops up it should refuse
> to mount that and continue to normally use the already mounted one.

The problem with btrfs is that because (unlike traditional filesystems) 
it's multi-device, it needs some way to identify what devices belong to a 
particular filesystem.

And UUID is, by definition and expansion, Universally Unique ID.  Btrfs 
simply depends on it being what it says on the tin, universally unique, 
to ID the components of the filesystem and assemble them correctly.

Besides dd, etc., LVM snapshots are another case where this goes screwy.  
If the UUID isn't actually unique and a btrfs device scan runs (which udev 
normally triggers by default these days) so the duplicate UUID is detected, 
btrfs *WILL* eventually start trying to write to all the "newly added" 
devices that the scan found, identified by their Universally Unique IDs, 
aka UUIDs.  It's not a matter of if, but when.


And the UUID is embedded so deeply within the filesystem and its 
operations, as an inextricable part of the metadata, that changing the 
UUID is not the simple operation of changing a few bytes in the superblock 
that it is on other filesystems, which is why there's now a tool that 
walks all those metadata entries and changes it.  (That deep embedding 
also avoids the problem reiserfs had, where a reiserfs stored in a 
loopback file on a reiserfs would screw up reiserfsck; on btrfs, the 
loopback file would have a different UUID and thus couldn't be mixed up.)
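
(The tool in question, if memory serves, is btrfstune in current 
btrfs-progs.  A minimal sketch of its use, with /dev/sdX standing in for 
whatever the cloned device actually is; the filesystem must be unmounted, 
and the rewrite touches every metadata block that embeds the old UUID, so 
it can take a while on a large filesystem:

# btrfstune -u /dev/sdX

After that the clone gets a new random filesystem UUID and can no longer 
be confused with the original.)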


So an aware btrfs admin simply takes pains to avoid triggering a btrfs 
device scan at the wrong time: immediately hide LVM snapshots, immediately 
unplug directly dd-ed devices, etc., and thus doesn't have to deal with 
the filesystem corruption that would otherwise be a matter of when, not 
if, given duplicated UUIDs that aren't as universally unique as the name 
suggests...

And as your followup suggests in a security context, they consider 
masking out their UUIDs before posting them, as well, tho most kernel 
hackers generally consider unsupervised physical access to be game-over, 
security-wise.  (After all, in that case there's often little or nothing 
preventing a reboot to that USB stick, if desired, or simply yanking the 
devices and duping them or plugging them in elsewhere, if the BIOS is 
password protected, with the only thing standing in the way at that point 
being possible device encryption.)


The UUID *as* a UUID, _unique_ at least on that system (if not actually 
universally) as it says on the tin, is so deeply embedded in btrfs that 
at this point it's not going to be removed.  The only real alternative if 
you don't like it is using a different filesystem.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



Re: compression disk space saving - what are your results?

2015-12-05 Thread Duncan
Marc Joliet posted on Sat, 05 Dec 2015 15:11:51 +0100 as excerpted:

> I do think it's interesting that compression (even with LZO) seems to
> have offset the extra space wastage caused by autodefrag.

I've seen (I think) you mention that twice now.  Perhaps I'm missing 
something... How does autodefrag trigger space wastage?

What autodefrag does is watch for seriously fragmented files and queue 
them up for later defrag by a worker thread.  How would that waste space?

Unless of course you're talking about breaking reflinks to existing 
snapshots or other (possibly partial) copies of the file.  But I'd call 
that wasting space due to the snapshots storing old copies, not due to 
autodefrag keeping the current copy defragmented.  And reflinks save 
space by effectively storing parts of two files in the same extent; it's 
not autodefrag wasting it.  The default on a normal filesystem would be 
separate copies, so that's the zero-point base: reflinks save space 
relative to it, and breaking them merely returns things to that base.  
With no snapshots and no reflinks, autodefrag no longer "wastes" space, 
so it's not autodefrag's wastage in the first place, it's the other 
mechanisms' savings.
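
A concrete way to see what I mean, as a sketch with made-up paths, 
assuming /mnt/fs is a btrfs mount:

# cp --reflink=always /mnt/fs/bigfile /mnt/fs/bigfile.copy
# sync; df -h /mnt/fs
# btrfs filesystem defragment /mnt/fs/bigfile.copy
# sync; df -h /mnt/fs

The reflinked copy shares its extents with the original, so the first df 
barely moves; defragmenting the copy rewrites its extents and breaks the 
sharing, so the second df grows by roughly the file's size.  The space 
was "saved" by the reflink; defrag, or autodefrag, merely returns it to 
the unshared baseline.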

From my viewpoint, anyway.  I'd not ordinarily quibble over it one way or 
the other if that's what you're referring to.  But just in case you had 
something else in mind that I'm not aware of, I'm posting the question.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



RE: compression disk space saving - what are your results?

2015-12-05 Thread guido_kuenne
> Subject: compression disk space saving - what are your results?
> 
> What are your disk space savings when using btrfs with compression?

I checked that for some folders when I moved from ext4 to btrfs. I compared
du with df** just to get some numbers. I use lzo since the btrfs wiki said it's
better for speed.


Percent_saving=(1-df/du)*100:
47% (mostly endless text files, source code etc., total amount of data is
about 1TB)
2%-10% (for data which is mostly in the form of large (several hundred MB up
to a few GB) binary files, total amount is about 4TB)
23% (for something in between, total amount is 0.4TB)

The results indicate pretty clearly that large binary files barely compress
- without understanding much of it, that's what I would intuitively expect
(afaik lzo is dictionary based and those binary files offer little redundancy
for that to exploit).


** du -s on the folder I copied to the btrfs drive. df is the difference
between a df before and after the copy. Based on casual checking, results
were consistent with the space needed on the old ext4 drive.
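
A small sketch of the same bookkeeping in shell form -- the paths are made
up, and it assumes GNU df with --output support:

du_bytes=$(du -sB1 /data/folder | cut -f1)
before=$(df -B1 --output=used /mnt/btrfs | tail -1)
cp -a /data/folder /mnt/btrfs/
sync
after=$(df -B1 --output=used /mnt/btrfs | tail -1)
echo "scale=1; (1 - ($after - $before) / $du_bytes) * 100" | bc

which prints the percent saving as defined above.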




Re: btrfs balance: Kernel BUG

2015-12-05 Thread Wolfgang Rohdewald
On Saturday, 5 December 2015 at 11:17:26, Holger Hoffstätte wrote:
> On 12/05/15 11:04, Wolfgang Rohdewald wrote:
> > On Saturday, 5 December 2015 at 10:58:44, Holger Hoffstätte wrote:
> >> Please see: http://www.spinics.net/lists/linux-btrfs/msg49766.html
> >>
> >> You should be able to apply those patches manually (assuming you can/want 
> >> to rebuild).
> > 
> > are my data safe if I just wait for a fixed official kernel and do not
> > balance until then?
> 
> This is not directly caused by balance, so it's possible to trigger it also
> during normal operation. Another option would be to revert to 4.1.13 for the
> time being.

Thanks, I am now building 4.3 with the patch applied.

-- 
Wolfgang


Re: Subvolume UUID, data corruption?

2015-12-05 Thread Hugo Mills
On Sat, Dec 05, 2015 at 04:28:24AM +0100, Christoph Anton Mitterer wrote:
> On Fri, 2015-12-04 at 13:07 +, Hugo Mills wrote:
> > I don't think it'll cause problems.
> Is there any guaranteed behaviour when btrfs encounters two filesystems
> (i.e. not talking about the subvols now) with the same UUID?

   Nothing guaranteed, but the likelihood is that things will go badly
wrong, in the sense of corrupt filesystems.

> Given that it's long-standing behaviour that people could clone
> filesystems (dd, etc.) and this just worked™, btrfs should at least
> handle such cases gracefully.
> For example, when more than one block device with a btrfs of the same
> UUID is already known, it should refuse to mount any of them.
> And if one is already known and another device pops up it should refuse
> to mount that and continue to normally use the already mounted one.

   Except that that's exactly the mechanism that btrfs uses to handle
multi-device filesystems, so you've just broken anything with more
than one device in the FS.

   If you inspect the devid on each device as well, and refuse
duplicates of those, you've just broken any multipathing
configurations.

   Even if you can handle that, if you have two copies of dev1, and
two copies of dev2, how do you guarantee that the "right" pair of dev1
and dev2 is selected? (e.g. if you have them as network devices, and
the device enumeration order is unstable on each boot).

   Hugo.

-- 
Hugo Mills | Geek, n.:
hugo@... carfax.org.uk | Circus sideshow performer specialising in the eating
http://carfax.org.uk/  | of live animals.
PGP: E2AB1DE4  |   OED




btrfs balance: Kernel BUG

2015-12-05 Thread Wolfgang Rohdewald
Unmodified Linux 4.3 tainted with nvidia

after adding disk #4 to RAID1, I did

btrfs filesystem balance /t4

Dec  5 08:07:26 s5 kernel: [55868.756847] BTRFS info (device sdc): relocating 
block group 10768619667456 flags 17
Dec  5 08:07:35 s5 kernel: [55878.297200] BTRFS info (device sdc): found 10 
extents
Dec  5 08:07:40 s5 kernel: [55882.713437] BTRFS info (device sdc): found 10 
extents
Dec  5 08:07:40 s5 kernel: [55883.219850] BTRFS info (device sdc): relocating 
block group 10767545925632 flags 17
Dec  5 08:07:57 s5 kernel: [55899.736052] BTRFS info (device sdc): found 11 
extents
Dec  5 08:07:58 s5 kernel: [55901.082464] [ cut here ]
Dec  5 08:07:58 s5 kernel: [55901.082468] Kernel BUG at a012ef06 
[verbose debug info unavailable]
Dec  5 08:07:58 s5 kernel: [55901.082470] invalid opcode:  [#1] PREEMPT SMP 
Dec  5 08:07:58 s5 kernel: [55901.082472] Modules linked in: 
snd_hda_codec_hdmi(E) joydev(E) rc_tt_1500(E) stb6100(E) lnbp22(E) 
x86_pkg_temp_thermal(E) intel_powercl
amp(E) coretemp(E) nvidia(POE) stb0899(E) kvm_intel(E) kvm(E) 
dvb_usb_pctv452e(E) snd_hda_codec_realtek(E) ttpci_eeprom(E) 
snd_hda_codec_generic(E) crct10dif_pclmul(
E) dvb_usb(E) crc32_pclmul(E) dvb_core(E) rc_core(E) ftdi_sio(E) 
snd_hda_intel(E) hid_logitech_hidpp(E) usbserial(E) snd_hda_codec(E) 
snd_hda_core(E) aesni_intel(E) 
snd_hwdep(E) aes_x86_64(E) lrw(E) gf128mul(E) glue_helper(E) snd_pcm(E) 
ablk_helper(E) cryptd(E) bnep(E) rfcomm(E) snd_seq_midi(E) 
snd_seq_midi_event(E) bluetooth(E)
 snd_rawmidi(E) microcode(E) snd_seq(E) snd_seq_device(E) snd_timer(E) 
serio_raw(E) snd(E) soundcore(E) lpc_ich(E) tpm_tis(E) mei_me(E) mei(E) 
shpchp(E) nfsd(E) auth
_rpcgss(E) nfs_acl(E) nfs(E) lockd(E) grace(E) sunrpc(E) fscache(E) btrfs(E) 
xor(E) raid6_pq(E) hid_logitech_dj(E) hid_generic(E) usbhid(E) hid(E) sd_mod(E) 
psmouse(
E) ahci(E) libahci(E) e1000e(E) ptp(E) pps_core(E) video(E)
Dec  5 08:07:58 s5 kernel: [55901.082502] CPU: 2 PID: 8553 Comm: btrfs Tainted: 
P   OE   4.3.0+55 #1
Dec  5 08:07:58 s5 kernel: [55901.082503] Hardware name:  
/DH87RL, BIOS RLH8710H.86A.0323.2013.1204.1726 12/04/2013
Dec  5 08:07:58 s5 kernel: [55901.082504] task: 8800d5e08000 ti: 
8800305e8000 task.ti: 8800305e8000
Dec  5 08:07:58 s5 kernel: [55901.082505] RIP: 0010:[]  
[] insert_inline_extent_backref+0xc6/0xd0 [btrfs]
Dec  5 08:07:58 s5 kernel: [55901.082518] RSP: 0018:8800305eb868  EFLAGS: 
00010293
Dec  5 08:07:58 s5 kernel: [55901.082519] RAX:  RBX: 
 RCX: 8800305eb898
Dec  5 08:07:58 s5 kernel: [55901.082520] RDX: 0001 RSI: 
0001 RDI: 
Dec  5 08:07:58 s5 kernel: [55901.082521] RBP: 8800305eb8c8 R08: 
4000 R09: 8800305eb780
Dec  5 08:07:58 s5 kernel: [55901.082522] R10:  R11: 
0002 R12: 8800d95c6800
Dec  5 08:07:58 s5 kernel: [55901.082523] R13:  R14: 
 R15: 8800960cc1b0
Dec  5 08:07:58 s5 kernel: [55901.082524] FS:  7f78bd8da880() 
GS:88021ec8() knlGS:
Dec  5 08:07:58 s5 kernel: [55901.082525] CS:  0010 DS:  ES:  CR0: 
80050033
Dec  5 08:07:58 s5 kernel: [55901.082526] CR2: 0932e000 CR3: 
0001428b1000 CR4: 001406e0
Dec  5 08:07:58 s5 kernel: [55901.082527] Stack:
Dec  5 08:07:58 s5 kernel: [55901.082528]   0005 
 
Dec  5 08:07:58 s5 kernel: [55901.082530]  0001 880214e11000 
209e 880214e11000
Dec  5 08:07:58 s5 kernel: [55901.082532]  8800575ceac8 8800960cc1b0 
0005 880214c96000
Dec  5 08:07:58 s5 kernel: [55901.082533] Call Trace:
Dec  5 08:07:58 s5 kernel: [55901.082541]  [] 
__btrfs_inc_extent_ref.isra.49+0x98/0x250 [btrfs]
Dec  5 08:07:58 s5 kernel: [55901.082545]  [] ? 
get_parent_ip+0xd/0x50
Dec  5 08:07:58 s5 kernel: [55901.082551]  [] 
__btrfs_run_delayed_refs.constprop.68+0xd1d/0x10a0 [btrfs]
Dec  5 08:07:58 s5 kernel: [55901.082556]  [] ? 
_raw_write_lock+0x17/0x40
Dec  5 08:07:58 s5 kernel: [55901.082558]  [] ? 
_raw_spin_unlock+0x1a/0x40
Dec  5 08:07:58 s5 kernel: [55901.082565]  [] 
btrfs_run_delayed_refs+0x82/0x290 [btrfs]
Dec  5 08:07:58 s5 kernel: [55901.082573]  [] 
btrfs_commit_transaction+0x43/0xb20 [btrfs]
Dec  5 08:07:58 s5 kernel: [55901.082583]  [] 
prepare_to_merge+0x213/0x240 [btrfs]
Dec  5 08:07:58 s5 kernel: [55901.082592]  [] 
relocate_block_group+0x3ea/0x600 [btrfs]
Dec  5 08:07:58 s5 kernel: [55901.082599]  [] 
btrfs_relocate_block_group+0x1a5/0x290 [btrfs]
Dec  5 08:07:58 s5 kernel: [55901.082607]  [] 
btrfs_relocate_chunk.isra.34+0x47/0xd0 [btrfs]
Dec  5 08:07:58 s5 kernel: [55901.082615]  [] 
btrfs_balance+0x7d1/0xe90 [btrfs]
Dec  5 08:07:58 s5 kernel: [55901.082623]  [] 
btrfs_ioctl_balance+0x400/0x540 [btrfs]
Dec  5 08:07:58 s5 kernel: [55901.082626]  [] ? 

freeze_bdev and scrub/re-balance

2015-12-05 Thread Wang, Zhiye
Hi all,


If I understand it correctly, the defragment operation is done by user-space 
tools, while scrub/re-balance is done in a kernel thread.


So, if my kernel module calls freeze_bdev when scrub/re-balance is in progress, 
will I still be able to get a consistent file system state?
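
(For context: as far as I can tell, the user-space way to exercise that 
same freeze path is util-linux's fsfreeze, e.g.

# fsfreeze --freeze /mnt/btrfs
  ... take the block-level copy here ...
# fsfreeze --unfreeze /mnt/btrfs

with /mnt/btrfs standing in for the real mount point.)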




Thanks
Mike




Re: btrfs balance: Kernel BUG

2015-12-05 Thread Holger Hoffstätte
On 12/05/15 10:09, Wolfgang Rohdewald wrote:
> Unmodified Linux 4.3 tainted with nvidia
> 
> after adding disk #4 to RAID1, I did
> 
> btrfs filesystem balance /t4
> 
> Dec  5 08:07:26 s5 kernel: [55868.756847] BTRFS info (device sdc): relocating 
> block group 10768619667456 flags 17
> Dec  5 08:07:35 s5 kernel: [55878.297200] BTRFS info (device sdc): found 10 
> extents
> Dec  5 08:07:40 s5 kernel: [55882.713437] BTRFS info (device sdc): found 10 
> extents
> Dec  5 08:07:40 s5 kernel: [55883.219850] BTRFS info (device sdc): relocating 
> block group 10767545925632 flags 17
> Dec  5 08:07:57 s5 kernel: [55899.736052] BTRFS info (device sdc): found 11 
> extents
> Dec  5 08:07:58 s5 kernel: [55901.082464] [ cut here ]
> Dec  5 08:07:58 s5 kernel: [55901.082468] Kernel BUG at a012ef06 
> [verbose debug info unavailable]
> Dec  5 08:07:58 s5 kernel: [55901.082470] invalid opcode:  [#1] PREEMPT 
> SMP 
> Dec  5 08:07:58 s5 kernel: [55901.082472] Modules linked in: 
> snd_hda_codec_hdmi(E) joydev(E) rc_tt_1500(E) stb6100(E) lnbp22(E) 
> x86_pkg_temp_thermal(E) intel_powercl
> amp(E) coretemp(E) nvidia(POE) stb0899(E) kvm_intel(E) kvm(E) 
> dvb_usb_pctv452e(E) snd_hda_codec_realtek(E) ttpci_eeprom(E) 
> snd_hda_codec_generic(E) crct10dif_pclmul(
> E) dvb_usb(E) crc32_pclmul(E) dvb_core(E) rc_core(E) ftdi_sio(E) 
> snd_hda_intel(E) hid_logitech_hidpp(E) usbserial(E) snd_hda_codec(E) 
> snd_hda_core(E) aesni_intel(E) 
> snd_hwdep(E) aes_x86_64(E) lrw(E) gf128mul(E) glue_helper(E) snd_pcm(E) 
> ablk_helper(E) cryptd(E) bnep(E) rfcomm(E) snd_seq_midi(E) 
> snd_seq_midi_event(E) bluetooth(E)
>  snd_rawmidi(E) microcode(E) snd_seq(E) snd_seq_device(E) snd_timer(E) 
> serio_raw(E) snd(E) soundcore(E) lpc_ich(E) tpm_tis(E) mei_me(E) mei(E) 
> shpchp(E) nfsd(E) auth
> _rpcgss(E) nfs_acl(E) nfs(E) lockd(E) grace(E) sunrpc(E) fscache(E) btrfs(E) 
> xor(E) raid6_pq(E) hid_logitech_dj(E) hid_generic(E) usbhid(E) hid(E) 
> sd_mod(E) psmouse(
> E) ahci(E) libahci(E) e1000e(E) ptp(E) pps_core(E) video(E)
> Dec  5 08:07:58 s5 kernel: [55901.082502] CPU: 2 PID: 8553 Comm: btrfs 
> Tainted: P   OE   4.3.0+55 #1
> Dec  5 08:07:58 s5 kernel: [55901.082503] Hardware name:  
> /DH87RL, BIOS RLH8710H.86A.0323.2013.1204.1726 12/04/2013
> Dec  5 08:07:58 s5 kernel: [55901.082504] task: 8800d5e08000 ti: 
> 8800305e8000 task.ti: 8800305e8000
> Dec  5 08:07:58 s5 kernel: [55901.082505] RIP: 0010:[]  
> [] insert_inline_extent_backref+0xc6/0xd0 [btrfs]

[..]

Please see: http://www.spinics.net/lists/linux-btrfs/msg49766.html

You should be able to apply those patches manually (assuming you can/want to 
rebuild).
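
Roughly, as a sketch only -- with the patch saved as fix.patch and the 
kernel source tree under /usr/src/linux-4.3, both of which are just 
placeholder names:

$ cd /usr/src/linux-4.3
$ patch -p1 --dry-run < ~/fix.patch    # check it applies cleanly first
$ patch -p1 < ~/fix.patch
$ make -j$(nproc) && sudo make modules_install install

The exact build/install steps of course depend on the distro.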

-h



Re: btrfs balance: Kernel BUG

2015-12-05 Thread Wolfgang Rohdewald
On Saturday, 5 December 2015 at 10:58:44, Holger Hoffstätte wrote:
> Please see: http://www.spinics.net/lists/linux-btrfs/msg49766.html
> 
> You should be able to apply those patches manually (assuming you can/want to 
> rebuild).
> 

are my data safe if I just wait for a fixed official kernel and do not
balance until then?

-- 
Wolfgang


Re: btrfs balance: Kernel BUG

2015-12-05 Thread Holger Hoffstätte
On 12/05/15 11:04, Wolfgang Rohdewald wrote:
> On Saturday, 5 December 2015 at 10:58:44, Holger Hoffstätte wrote:
>> Please see: http://www.spinics.net/lists/linux-btrfs/msg49766.html
>>
>> You should be able to apply those patches manually (assuming you can/want to 
>> rebuild).
> 
> are my data safe if I just wait for a fixed official kernel and do not
> balance until then?

This is not directly caused by balance, so it's possible to trigger it also
during normal operation. Another option would be to revert to 4.1.13 for the
time being.

-h



Re: attacking btrfs filesystems via UUID collisions? (was: Subvolume UUID, data corruption?)

2015-12-05 Thread Christoph Anton Mitterer
On Sat, 2015-12-05 at 13:19 +, Duncan wrote:
> The problem with btrfs is that because (unlike traditional
> filesystems) 
> it's multi-device, it needs some way to identify what devices belong
> to a 
> particular filesystem.
Sure, but that applies to lvm, or MD as well... and I wouldn't know of
any random corruption issues there.


> And UUID is, by definition and expansion, Universally Unique ID.
Nitpicking doesn't help here... the reality is that they're not, whether by
people doing things like dd, other forms of clones, LVM, etc., or, as I've
described, maliciously.


> Btrfs 
> simply depends on it being what it says on the tin, universally 
> unique, to ID the components of the filesystem and assemble them 
> correctly.
Admittedly, I'm not an expert on the internals of btrfs, but it seems
other multi-device containers can handle UUID duplicates fine, or at
least in a way that you don't get any data corruption (or leaks).

This is a showstopper - maybe not under lab conditions, but surely under
real-world scenarios.
I'm actually quite surprised that no one else has complained about
this before, given how long btrfs has existed.


> Besides dd, etc., LVM snapshots are another case where this goes screwy.
> If the UUID isn't actually unique and a btrfs device scan runs (which
> udev normally triggers by default these days) so the duplicate UUID is
> detected, btrfs *WILL* eventually start trying to write to all the
> "newly added" devices that the scan found, identified by their
> Universally Unique IDs, aka UUIDs.  It's not a matter of if, but when.
Well.. as I said... quite scary, with respect to both, accidental and
malicious cases of duplicate UUIDs.


> And the UUID is embedded so deeply within the filesystem and its
> operations, as an inextricable part of the metadata, that changing the
> UUID is not the simple operation of changing a few bytes in the
> superblock that it is on other filesystems, which is why there's now a
> tool that walks all those metadata entries and changes it.  (That deep
> embedding also avoids the problem reiserfs had, where a reiserfs stored
> in a loopback file on a reiserfs would screw up reiserfsck; on btrfs,
> the loopback file would have a different UUID and thus couldn't be
> mixed up.)
I don't think that this design is per se bad or that it prevents the kernel
from handling such situations gracefully.

I would expect that in addition to the fs UUID, it needs a form of
device ID... so why not simply ignore any new device for which there
already is a matching fs UUID and device ID, unless the respective tool
(mount, btrfs, etc.) is explicitly told otherwise via some
device=/dev/sda,/dev/sdb option.

If that means that fewer things work out of the box (in the sense of
"auto-assembly"), well, then this is simply necessary.
Data security and consistency are definitely much more important than
any fancy auto-magic.
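
(For what it's worth, something along those lines already half-exists:
btrfs accepts explicit device= mount options to point the kernel at the
member devices, e.g. for a hypothetical two-device filesystem

mount -t btrfs -o device=/dev/sdb,device=/dev/sdc /dev/sdb /mnt

though as far as I know that supplements the scan-based registration
rather than overriding it, so the "ignore unexpected duplicates unless
explicitly told otherwise" part would still be new behaviour.)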



> So an aware btrfs admin simply takes pains to avoid triggering a btrfs
> device scan at the wrong time: immediately hide LVM snapshots,
> immediately unplug directly dd-ed devices, etc., and thus doesn't have
> to deal with the filesystem corruption that would otherwise be a matter
> of when, not if, given duplicated UUIDs that aren't as universally
> unique as the name suggests...
a) People shouldn't need to do days of study to be able to use btrfs
securely. Of course it's more advanced and not everything can be
simplified in a way that means users don't need to know anything (e.g. all
the well-known effects of CoW)... but when the point is reached where
security and data integrity are threatened, there's definitely a hard
border that mustn't be crossed.

b) Given how complex software is, I doubt that it's easily possible,
even for the aware admin, to really prevent all the situations that can
lead to such corruption.
Not to mention any attack scenarios.



> And as your followup suggests in a security context, they consider 
> masking out their UUIDs before posting them, as well, tho most kernel
> hackers generally consider unsupervised physical access to be game-
> over, 
> security-wise.
Do they? I rather thought many of them had a quite practical and
real-world-based POV.

> (After all, in that case there's often little or nothing 
> preventing a reboot to that USB stick, if desired, or simply yanking
> the 
> devices and duping them or plugging them in elsewhere, if the BIOS is
> password protected, with the only thing standing in the way at that
> point 
> being possible device encryption.)
There's hardware which would, when it detects physical intrusion (like
yanking), lock itself down (securely clearing memory, disconnecting
itself from other nodes, which may be compromised as well when the
filesystem on the attacked node goes crazy).

You have things like ATMs, which are physically usually quite well
secured, but which do have rather easily accessible maintenance ports.
All of us have seen such embedded devices rebooting themselves, where
you see kernel messages.

Re: attacking btrfs filesystems via UUID collisions? (was: Subvolume UUID, data corruption?)

2015-12-05 Thread Christoph Anton Mitterer
On Sat, 2015-12-05 at 12:01 +, Hugo Mills wrote:
> On Sat, Dec 05, 2015 at 04:28:24AM +0100, Christoph Anton Mitterer
> wrote:
> > On Fri, 2015-12-04 at 13:07 +, Hugo Mills wrote:
> > > I don't think it'll cause problems.
> > Is there any guaranteed behaviour when btrfs encounters two
> > filesystems
> > (i.e. not talking about the subvols now) with the same UUID?
> 
>    Nothing guaranteed, but the likelihood is that things will go
> badly
> wrong, in the sense of corrupt filesystems.
Phew... well sorry, but I think that's really something that makes
btrfs not productively usable until fixed.



>    Except that that's exactly the mechanism that btrfs uses to handle
> multi-device filesystems, so you've just broken anything with more
> than one device in the FS.
Don't other containers (e.g. LVM) do something similar, and yet they
don't fail badly in case e.g. multiple PVs with the same UUID appear,
AFAIC.

And shouldn't there be some kind of device UUID, which distinguishes the
different parts of the same btrfs (with the same fs UUID) that are on
different devices?!


>    If you inspect the devid on each device as well, and refuse
> duplicates of those, you've just broken any multipathing
> configurations.
Well, how many people are actually doing this? A minority. So then it
would simply be necessary that multipathing doesn't work out of the box
and one needs to specifically tell the kernel to consider a device with
the same btrfs UUID not a clone but another path to the same device.

In any case, a rare feature like multipathing cannot justify the
possibility of data corruption.
The situation as it is now is IMHO completely unacceptable.



>    Even if you can handle that, if you have two copies of dev1, and
> two copies of dev2, how do you guarantee that the "right" pair of
> dev1
> and dev2 is selected? (e.g. if you have them as network devices, and
> the device enumeration order is unstable on each boot).
Not sure what you mean now:
The multipathing case?
Then, as I've said, such situations would simply require manually
setting things up and explicitly telling the kernel that the devices foo
and bar are to be used (despite their duplicate UUID).

If you mean what happens when I have e.g. two clones of a 2-device
btrfs, as in
fsdev1
fsdev2
fsdev1_clone
fsdev2_clone
then, as I've said before... if one pair of them is already mounted
(i.e. when the *_clone devices appear), then it's likely that these
actually belong together and the kernel should continue to use them and
ignore any other.
If all appear before any is mounted, then it should either refuse to
mount/use any of them, or require manually specifying which devices are
to be used (i.e. via /dev/sda or so).


Cheers,
Chris.



Re: attacking btrfs filesystems via UUID collisions? (was: Subvolume UUID, data corruption?)

2015-12-05 Thread Duncan
Christoph Anton Mitterer posted on Sun, 06 Dec 2015 02:51:20 +0100 as
excerpted:

> You have things like ATMs, which are physically usually quite well
> secured, but which do have rather easily accessible maintenance ports.
> All of us have seen such embedded devices rebooting themselves, where
> you see kernel messages.
> That's the point where an attacker could easily get the btrfs UUID:
> [0.00] Command line: BOOT_IMAGE=/vmlinuz-4.2.0-1-amd64
> root=UUID=bd1ea5a0-9bba-11e5-82fa-502690aa641f
> 
> If you can attack such devices already by just having access to a USB
> port... then holly sh**...

There's actually a number of USB-based hardware and software vulns out
there, from the under $10 common-component-capacitor-based charge-and-zap
(charges off the 5V USB line, zaps the port with several hundred volts
reverse-polarity, if the machine survives the first pulse and continues
supplying 5V power, repeat...), to the ones that act like USB-based input
devices and "type" in whatever commands, to simple USB-boot to a forensic
distro and let you inspect attached hardware (which is where the encrypted
storage comes in, they've got everything that's not encrypted),
to the plain old fashioned boot-sector viruses that quickly jump to
everything else on the system that's not boot-sector protected and/or
secure-boot locked, to...

Which is why most people in the know say that if you have unsupervised
physical access, you effectively own the machine and everything on it, at
least everything that's not encrypted.

There's a reason some places hot-glue the USB ports, if you're plugging
anything untrusted into them...  And that's a well-known social engineering
hack as well: simply drop a few thumb drives in the target parking lot and
wait to see who picks them up and plugs them in, so they can call home...
Pen-testers do it.  NSA does it.  It's said a form of that is how they
bridged the air-gap to the Iranian centrifuges...

If you haven't been keeping up, you really have some reading to do.  If
you're plugging in untrusted USB devices, seriously, a thumb drive with a
few duplicated btrfs UUIDs is the least of your worries!

>> The only real alternative if you don't like it is using a different
>> filesystem.

> As I've said, I don't have a problem with UUIDs... I just can't quite
> believe that btrfs and the userland cannot be modified so that it
> handles such cases gracefully.

As I implied, UUID usage is so deeply embedded that fixing btrfs to not
work that way is pretty much impossible.  You'd pretty much be starting
from scratch and using some of the same ideas; it wouldn't be btrfs any
longer.

> If not, than, to be quite honest, that would be really a major
> showstopper for many usage areas.

Consider the show stopped, then.

> And I'm not talking about ATMs (or any other embedded devices where
> people may have non-supervides access - e.g. TVs in a mall,
> entertainment systems in planes) but also the normal desktops/laptops
> where colleagues, fellow students, etc. may want to play some "prank".

As I said, if you're plugging in untrusted USB devices, or allowing them
to be plugged in, the show's over; they're already playing pretty much any
prank they want, including zapping the hardware.  USB is now less trusted
than a raw Internet hookup with all services exposed.  The only
controlling factor now is the physical presence limitation, and if you're
plugging in devices you get, for instance, as "gifts just for trying us
out" or whatever, that someone mails to you... that's worse than running
MS and mindlessly running any exe someone sends you.


BTW, this is documented (in somewhat simpler "do not do XX" form) on the
wiki, gotchas page.

https://btrfs.wiki.kernel.org/index.php/Gotchas#Block-level_copies_of_devices


-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman
