Hi, guys.

We have a problem with a btrfs file system: sometimes it gets stuck
without leaving me any way to interrupt it (shutdown -r now is unable to
restart the server). By stuck I mean that some processes that were
previously able to write to disk can no longer cope with the load, and
the load average goes up:

top - 13:10:58 up 1 day,  9:26,  5 users,  load average: 157.76, 156.61, 149.29
Tasks: 235 total,   2 running, 233 sleeping,   0 stopped,   0 zombie
%Cpu(s): 19.8 us, 15.0 sy,  0.0 ni, 60.7 id,  3.9 wa,  0.0 hi,  0.6 si,  0.0 st
KiB Mem:  65922104 total, 65414856 used,   507248 free,     1844 buffers
KiB Swap:        0 total,        0 used,        0 free. 62570804 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 8644 root      20   0       0      0      0 R  96.5  0.0 127:21.95 kworker/u16:16
 5047 dvr       20   0 6884292 122668   4132 S   6.4  0.2 258:59.49 dvrserver
30223 root      20   0   20140   2600   2132 R   6.4  0.0   0:00.01 top
    1 root      20   0    4276   1628   1524 S   0.0  0.0   0:40.19 init

There are about 300 threads on the server, some of which write to disk.
Some information about this btrfs filesystem: it is a 22-disk file
system with raid1 for metadata and raid0 for data:

 # btrfs filesystem df /store/
Data, single: total=11.92TiB, used=10.86TiB
System, RAID1: total=8.00MiB, used=1.27MiB
System, single: total=4.00MiB, used=0.00B
Metadata, RAID1: total=46.00GiB, used=33.49GiB
Metadata, single: total=8.00MiB, used=0.00B
GlobalReserve, single: total=512.00MiB, used=128.00KiB
 # btrfs property get /store/
ro=false
label=store
 # btrfs device stats /store/
(shows all zeros)
 # btrfs balance status /store/
No balance found on '/store/'
 # btrfs filesystem show /store/
Btrfs v3.17.1
(btw, is it supposed to show only the version here?)

As for the load, we write quite small files (some 313K, some 800K),
which is why metadata takes up that much space. So, back to the problem:
iostat 1 shows the following:

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          16.96    0.00   17.09   65.95    0.00    0.00

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda               0.00         0.00         0.00          0          0
sdc               0.00         0.00         0.00          0          0
sdb               0.00         0.00         0.00          0          0
sde               0.00         0.00         0.00          0          0
sdd               0.00         0.00         0.00          0          0
sdf               0.00         0.00         0.00          0          0
sdg               0.00         0.00         0.00          0          0
sdj               0.00         0.00         0.00          0          0
sdh               0.00         0.00         0.00          0          0
sdk               0.00         0.00         0.00          0          0
sdi               1.00         0.00       200.00          0        200
sdl               0.00         0.00         0.00          0          0
sdn              48.00         0.00     17260.00          0      17260
sdm               0.00         0.00         0.00          0          0
sdp               0.00         0.00         0.00          0          0
sdo               0.00         0.00         0.00          0          0
sdq               0.00         0.00         0.00          0          0
sdr               0.00         0.00         0.00          0          0
sds               0.00         0.00         0.00          0          0
sdt               0.00         0.00         0.00          0          0
sdv               0.00         0.00         0.00          0          0
sdw               0.00         0.00         0.00          0          0
sdu               0.00         0.00         0.00          0          0

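Filtering that output down to the devices with non-zero write throughput
makes the pattern easier to see. A minimal sketch (the helper name
busy_writers is made up here, and the field positions assume the iostat
layout shown above, with kB_wrtn/s in the fourth column):

```shell
# Hypothetical helper: read iostat-style device lines on stdin and print
# only the sd* devices whose kB_wrtn/s (4th field) is greater than zero.
busy_writers() {
    awk '$1 ~ /^sd[a-z]+$/ && $4 + 0 > 0 { print $1, $4 }'
}
```

Fed the sample above, this should leave only sdi and sdn.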

Writes go to one disk. I tried to debug what is going on in the kworker
and did

$ echo workqueue:workqueue_queue_work > /sys/kernel/debug/tracing/set_event
$ cat /sys/kernel/debug/tracing/trace_pipe > trace_pipe.out2

trace_pipe2.out.xz is attached. Could you comment on what goes wrong
here?

The server has 64 GB of RAM. Is it possible that it is unable to keep
all the metadata in memory? Can we increase this memory limit, if one
exists?
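
For scale, here is a back-of-the-envelope estimate of how many files the
figures above imply. It is only a sketch: it assumes that all 10.86TiB of
data consists of files of one of the two sizes mentioned, that "K" means
KiB, and the helper name est_files is made up for illustration.

```shell
# Hypothetical helper: estimate how many files of a given size make up a
# given amount of data. Arguments: data size in TiB, file size in KiB.
est_files() {
    awk -v tib="$1" -v kib="$2" \
        'BEGIN { printf "%d\n", tib * 1024 * 1024 * 1024 / kib }'
}

est_files 10.86 313   # count if every file were 313K
est_files 10.86 800   # count if every file were 800K
```

That works out to somewhere between roughly 15 and 37 million files, with
33.49GiB of metadata in use against 64 GB of RAM.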


Thanks in advance for any pointers,
--
Peter.
