On Tue, 02/12/2014 at 09:33 +0800, Qu Wenruo wrote:
> -------- Original Message --------
> Subject: btrfs stuck with lots of files
> From: Peter Volkov <p...@gentoo.org>
> To: linux-btrfs@vger.kernel.org <linux-btrfs@vger.kernel.org>
> Date: 2014-12-01 19:46
> > Hi, guys.
> >
> > We have a problem with a btrfs file system: sometimes it becomes stuck
> > without leaving me any way to interrupt it (shutdown -r now is unable to
> > restart the server). By stuck I mean that processes which previously were
> > able to write to disk can no longer cope with the load, and the load
> > average goes up:
> >
> > top - 13:10:58 up 1 day,  9:26,  5 users,  load average: 157.76, 156.61, 149.29
> > Tasks: 235 total,   2 running, 233 sleeping,   0 stopped,   0 zombie
> > %Cpu(s): 19.8 us, 15.0 sy,  0.0 ni, 60.7 id,  3.9 wa,  0.0 hi,  0.6 si,  0.0 st
> > KiB Mem:  65922104 total, 65414856 used,   507248 free,     1844 buffers
> > KiB Swap:        0 total,        0 used,        0 free. 62570804 cached Mem
> >
> >    PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
> >   8644 root      20   0       0      0      0 R  96.5  0.0 127:21.95 kworker/u16:16
> >   5047 dvr       20   0 6884292 122668   4132 S   6.4  0.2 258:59.49 dvrserver
> > 30223 root      20   0   20140   2600   2132 R   6.4  0.0   0:00.01 top
> >      1 root      20   0    4276   1628   1524 S   0.0  0.0   0:40.19 init
> >
> >
> >
> > There are about 300 threads on the server, some of which are writing to disk.
> > A bit of information about this btrfs filesystem: it is a 22-disk file
> > system with RAID1 for metadata and RAID0 for data:
> >
> >   # btrfs filesystem df /store/
> > Data, single: total=11.92TiB, used=10.86TiB
> > System, RAID1: total=8.00MiB, used=1.27MiB
> > System, single: total=4.00MiB, used=0.00B
> > Metadata, RAID1: total=46.00GiB, used=33.49GiB
> > Metadata, single: total=8.00MiB, used=0.00B
> > GlobalReserve, single: total=512.00MiB, used=128.00KiB
> >   # btrfs property get /store/
> > ro=false
> > label=store
> >   # btrfs device stats /store/
> > (shows all zeros)
> >   # btrfs balance status /store/
> > No balance found on '/store/'
> >   # btrfs filesystem show /store/
> > Btrfs v3.17.1
> > (btw, is it supposed to show only the version here?)
> This is a small bug: if there is a trailing '/' in the path given to
> 'btrfs fi show', it can't recognize it...
> A patch has already been sent and may be included in the next version.
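
Good to know. If it is just the trailing slash, then I assume (not yet
verified here) that dropping it, or pointing at the label instead, should
work as a workaround until the patch lands:

  # btrfs filesystem show /store
  # btrfs filesystem show store
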
> >
> > As for the load, we write quite small files (some of 313K, some of 800K),
> > which is why metadata takes up that much space. So, back to the problem:
> > iostat 1 exposes the following:
> >
> > avg-cpu:  %user   %nice %system %iowait  %steal   %idle
> >            16.96    0.00   17.09   65.95    0.00    0.00
> >
> > Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
> > sda               0.00         0.00         0.00          0          0
> > sdc               0.00         0.00         0.00          0          0
> > sdb               0.00         0.00         0.00          0          0
> > sde               0.00         0.00         0.00          0          0
> > sdd               0.00         0.00         0.00          0          0
> > sdf               0.00         0.00         0.00          0          0
> > sdg               0.00         0.00         0.00          0          0
> > sdj               0.00         0.00         0.00          0          0
> > sdh               0.00         0.00         0.00          0          0
> > sdk               0.00         0.00         0.00          0          0
> > sdi               1.00         0.00       200.00          0        200
> > sdl               0.00         0.00         0.00          0          0
> > sdn              48.00         0.00     17260.00          0      17260
> > sdm               0.00         0.00         0.00          0          0
> > sdp               0.00         0.00         0.00          0          0
> > sdo               0.00         0.00         0.00          0          0
> > sdq               0.00         0.00         0.00          0          0
> > sdr               0.00         0.00         0.00          0          0
> > sds               0.00         0.00         0.00          0          0
> > sdt               0.00         0.00         0.00          0          0
> > sdv               0.00         0.00         0.00          0          0
> > sdw               0.00         0.00         0.00          0          0
> > sdu               0.00         0.00         0.00          0          0
> >
> >
> > Writes go to only one disk. I've tried to debug what is going on in the
> > kworker and did:
> >
> > $ echo workqueue:workqueue_queue_work > /sys/kernel/debug/tracing/set_event
> > $ cat /sys/kernel/debug/tracing/trace_pipe > trace_pipe.out2
> >
> > trace_pipe2.out.xz is attached. Could you comment on what goes wrong
> > here?
> It seems the attachment was blocked by the mailing list, so I didn't see
> it.

I've put it here:
https://drive.google.com/file/d/0BygFL6N3ZVUAMWxCQ0tDREE1Uzg/view?usp=sharing

I've put some additional information in another message that I've just
sent to the mailing list.
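
For reference, something like the following should summarise which work
functions show up most often in the trace (a rough sketch; it assumes the
workqueue_queue_work event prints a function= field, and that
trace_pipe.out2 is the captured file):

$ grep -o 'function=[A-Za-z0-9_.]*' trace_pipe.out2 | sort | uniq -c | sort -rn | head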

> > The server has 64GB of RAM. Is it possible that it is unable to keep all
> > the metadata in memory, and can we increase this memory limit, if one exists?
> Not possible, it will never happen (if nothing goes wrong...).
> The kernel has the page cache mechanism: when memory runs short, cached
> metadata/data can be flushed back to disk (if dirty) to free space, and
> re-read from disk later if needed.
> 
> So the kernel doesn't need to load all the metadata/data into memory,
> which would be mostly impossible for a large fs anyway.

Thanks for this explanation! Still, I'm looking for suggestions on how to
cope with btrfs_async_reclaim_metadata_space, which is the function mentioned
most frequently in the kworker trace.
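
Next time it gets stuck I can also try to grab the kernel stack of the
spinning kworker and dump blocked tasks via sysrq, along these lines
(8644 is just the kworker PID from the top output above, so it will
differ next time):

  # cat /proc/8644/stack
  # echo w > /proc/sysrq-trigger
  (the blocked-task dump then appears in dmesg)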

> And one important piece of missing information: the kernel version.

This is kernel 3.16.7-gentoo. 

--
Peter.
