On Fri, Nov 7, 2014 at 12:37 AM, Alexandre Oliva <ol...@gnu.org> wrote:
> A few days ago, I started using rsync batches to archive old copies of
> ceph OSD snapshots for certain kinds of disaster recovery.  This seems
> to exercise an unexpected race condition in rsync, which happens to
> expose what appears to be a race condition in btrfs, causing random
> scary but harmless errors when replaying the rsync batches.
>
>
> strace has revealed that the two rsync processes running concurrently to
> apply the batch both attempt to access xattrs of the same directory
> concurrently.  I understand rsync is supposed to avoid this, but
> something's going wrong with that.  Here's the smoking gun, snipped from
> strace -p 27251 -p 27253 -o smoking.gun, where both processes are
> started from a single rsync --read-batch=- -aHAX --del ... run:
>
> 0: 27251 stat("osd/0.6ed_head/DIR_D/DIR_E/DIR_6",  <unfinished ...>
> 1: 27253 stat("osd/0.6ed_head/DIR_D/DIR_E/DIR_6", {st_mode=S_IFDIR|0755, 
> st_size=5470, ...}) = 0
> 2: 27251 <... stat resumed> {st_mode=S_IFDIR|0755, st_size=5470, ...}) = 0
> 3: 27253 llistxattr("osd/0.6ed_head/DIR_D/DIR_E/DIR_6", 
> "user.cephos.phash.contents\0", 1024) = 27
> 4: 27251 llistxattr("osd/0.6ed_head/DIR_D/DIR_E/DIR_6",  <unfinished ...>
> 5: 27253 lsetxattr("osd/0.6ed_head/DIR_D/DIR_E/DIR_6", 
> "user.cephos.phash.contents", 
> "\x01F\x00\x00\x00\x00\x00\x00\x00\x0f\x00\x00\x00\x03\x00\x00", 17, 0 
> <unfinished ...>
> 6: 27251 <... llistxattr resumed> "user.cephos.phash.contents\0", 1024) = 27
> 7: 27251 lgetxattr("osd/0.6ed_head/DIR_D/DIR_E/DIR_6", 
> "user.cephos.phash.contents", 0x0, 0) = -1 ENODATA (No data available)
> 8: 27253 <... lsetxattr resumed> )         = 0
> 9: 27253 utimensat(AT_FDCWD, "osd/0.6ed_head/DIR_D/DIR_E/DIR_6", {UTIME_NOW, 
> {1407992261, 0}}, AT_SYMLINK_NOFOLLOW) = 0
> a: 27251 write(2, "rsync: get_xattr_data: lgetxattr"..., 181) = 181
>
> lines 0-2, 3-6 and 5-8, show concurrent access of both rsync processes
> to the same directory.  This wouldn't be a problem, not even for
> replaying batches, for the lsetxattr would put the intended xattr value
> in there regardless of whether the scanner saw the xattr value before or
> after that.
>
>
> What makes the problem visible is that btrfs appears to have a race in
> its handling of xattr replacement, leaving a window between the removal
> of the old value and the insertion of the new one, as shown by lines
> 5-8.  line 3 show the attribute existed before, and lines 5-8 show it
> disappears in line 7, while lsetxattr still runs to replace it.
>
> If rsync tries hard enough to hit this window, the lgetxattr concurrent
> to the lsetxattr eventually hits, and then rsync reports an error:
>
> rsync: get_xattr_data: 
> lgetxattr(""/media/px/snapshots/cluster/20141102-to-20140816/osd/0.6ed_head/DIR_D/DIR_E/DIR_6"","user.cephos.phash.contents",0)
>  failed: No data available (61)
>
> At the end, it exits with a nonzero status, even though nothing really
> wrong went on and the tree ended up looking just as it was supposed to.
>
>
> Now, I'm a bit concerned because the btrfs race condition, if exercised
> on security-related xattrs or ACLs, could cause data to become visible
> that shouldn't, which could turn this into a locally exploitable
> security issue.  Sure enough nobody goes nuts repeatedly changing the
> ACLs of a dir or file containing information that should be guarded by
> it, so as to increase the likelihood that an attacker succeeds in
> accessing the data, but still...  I don't think the temporary removal of
> the xattr for subsequent insertion should be visible at all.

Indeed, the xattr replace code is not atomic in btrfs.
I'll send a fix soon.

thanks

>
> I'm sorry for reporting a potential security issue like that, but by the
> time it occurred to me that it might have potential security
> implications, I'd already mentioned the problem on #btrfs at FreeNode,
> so the horse was out of the barn already :-(
>
>
> I hope this helps,
>
> --
> Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
> You must be the change you wish to see in the world. -- Gandhi
> Be Free! -- http://FSFLA.org/   FSF Latin America board member
> Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
Filipe David Manana,

"Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men."
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to