Hello everyone,

Looking closer and comparing our servers, we in fact have two servers
that behave differently, but they also run a different workload (mail
instead of file storage). Their hardware is identical, but they were
installed later than the servers that do have the issue.

They have a stable L2ARC cache size, unlike the server I described
previously, where the L2ARC size is larger than the dataset in the pool.


Non-leaking server:

L2 ARC Summary: (HEALTHY)
        Passed Headroom:                        63.05m
        Tried Lock Failures:                    198.57m
        IO In Progress:                         53.57k
        Low Memory Aborts:                      32
        Free on Write:                          21.40k
        Writes While Full:                      16.50k
        R/W Clashes:                            3.50k
        Bad Checksums:                          0
        IO Errors:                              0
        SPA Mismatch:                           613.80m

L2 ARC Size: (Adaptive)                         443.42  GiB
        Header Size:                    0.27%   1.22    GiB

L2 ARC Evicts:
        Lock Retries:                           1.33k
        Upon Reading:                           0

L2 ARC Breakdown:                               191.36m
        Hit Ratio:                      28.27%  54.09m
        Miss Ratio:                     71.73%  137.27m
        Feeds:                                  1.68m

L2 ARC Buffer:
        Bytes Scanned:                          4.28    PiB
        Buffer Iterations:                      1.68m
        List Iterations:                        107.55m
        NULL List Iterations:                   1.16m

L2 ARC Writes:
        Writes Sent:                    100.00% 915.04k



No bad checksums or IO errors. The L2ARC size of 443 GiB is sensible
compared to the device's actual capacity (373 GB).
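As a rough sanity check (plain arithmetic on the figures in this thread, treating diskinfo's 373 GB as decimal gigabytes), the claimed L2ARC size can be compared against the SSD's capacity:

```python
# Compare the L2ARC size reported by zfs-stats with the cache device's
# real capacity. Figures are taken from the outputs in this thread; the
# 373 GB device size from diskinfo is decimal (10^9) bytes.

def l2arc_ratio(claimed_gib: float, device_gb: float) -> float:
    """Claimed L2ARC size as a multiple of the cache device's capacity."""
    claimed_bytes = claimed_gib * 1024**3
    device_bytes = device_gb * 1000**3
    return claimed_bytes / device_bytes

healthy = l2arc_ratio(443.42, 373.0)        # non-leaking "mail" server
degraded = l2arc_ratio(1.89 * 1024, 373.0)  # leaking server: 1.89 TiB

print(f"non-leaking server: {healthy:.2f}x device capacity")
print(f"leaking server:     {degraded:.2f}x device capacity")
```

A factor slightly above 1 may still be plausible (the L2ARC size counter can account buffers that are compressed or already evicted), but a factor of roughly 5.6x the device is clearly bogus accounting.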

Yet aside from the workload, I cannot find any difference between the
two "mail" file servers and the other 6+ "web" file servers that store
data for websites. They are identical in every regard apart from the
workload and the time of installation; the mail file servers were
installed much later.
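One more cross-check, again just arithmetic on the quoted figures from my earlier mail below (not a statement about the on-disk header format): the 16.98 GiB header is indeed 0.88% of the claimed 1.89 TiB, which suggests the header grows in lockstep with the seemingly unbounded accounted L2ARC size, and that is what squeezes the ARC over time.

```python
# Cross-check the degraded server's zfs-stats figures: header size as a
# percentage of the claimed (inflated) L2ARC size.

GIB = 1024**3
TIB = 1024**4

l2_size_bytes = 1.89 * TIB    # "L2 ARC Size: (Adaptive)  1.89 TiB"
hdr_size_bytes = 16.98 * GIB  # "Header Size:  0.88%  16.98 GiB"

hdr_pct = hdr_size_bytes / l2_size_bytes * 100
print(f"header overhead: {hdr_pct:.2f}%")  # matches the reported 0.88%
```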

I just wanted to add this, as it may be very relevant.

With kind regards,

Daniel




On 08/11/2015 04:42 PM, Daniel Genis wrote:
> Dear FreeBSD community,
> 
> We're facing a somewhat odd issue, perhaps similar to what is discussed
> here: https://forums.freebsd.org/threads/l2arc-degraded.47540/
> 
> The issue is that the L2ARC header seems to grow without limit, similar
> to a memory leak, squeezing more and more memory out of the ARC over time.
> 
> For example, the output of "zpool iostat -v 1"
> 
>                  capacity     operations    bandwidth
> pool          alloc   free   read  write   read  write
> ------------  -----  -----  -----  -----  -----  -----
> syspool       1.15G   275G      0      0      0      0
>   mirror      1.15G   275G      0      0      0      0
>     gpt/zfs0      -      -      0      0      0      0
>     gpt/zfs1      -      -      0      0      0      0
> ------------  -----  -----  -----  -----  -----  -----
> tank          1.21T  1.51T    229  1.99K  3.67M  9.48M
>   mirror       124G   154G     67    125   787K   503K
>     da0           -      -     20     27   440K   503K
>     da1           -      -     45     28   379K   503K
> [...]
>   mirror       124G   154G     34    164   454K   612K
>     da18          -      -     26     12   417K   612K
>     da19          -      -      6     13  58.8K   612K
> logs              -      -      -      -      -      -
>   mirror       117M  74.4G      0    109      0  1.75M
>     da21          -      -      0    109      0  1.75M
>     da22          -      -      0    109      0  1.75M
> cache             -      -      -      -      -      -
>   da23        1.67T  16.0E    302      7  2.85M   223K
> ------------  -----  -----  -----  -----  -----  -----
> 
> 
> Here the cache shows 1.67T in use and 16.0E free.
> The cache is a 373GB Intel SSD.
> 
> # diskinfo -v da23
> da23
>       512             # sectorsize
>       400088457216    # mediasize in bytes (373G)
>       781422768       # mediasize in sectors
>       4096            # stripesize
>       0               # stripeoffset
>       48641           # Cylinders according to firmware.
>       255             # Heads according to firmware.
>       63              # Sectors according to firmware.
>       BTTV4234089C400HGN      # Disk ident.
>       id1,enc@n500e004aaaaaaa3e/type@0/slot@18        # Physical path
> 
> 
> 
> The L2ARC stats section from "zfs-stats -a":
> 
> L2 ARC Summary: (DEGRADED)
>       Passed Headroom:                        133.33m
>       Tried Lock Failures:                    4.90b
>       IO In Progress:                         313.63k
>       Low Memory Aborts:                      1.52k
>       Free on Write:                          589.79k
>       Writes While Full:                      34.57k
>       R/W Clashes:                            46.95k
>       Bad Checksums:                          408.40m
>       IO Errors:                              151.99m
>       SPA Mismatch:                           632.00m
> 
> L2 ARC Size: (Adaptive)                               1.89    TiB
>       Header Size:                    0.88%   16.98   GiB
> 
> L2 ARC Evicts:
>       Lock Retries:                           1.27k
>       Upon Reading:                           2
> 
> L2 ARC Breakdown:                             2.10b
>       Hit Ratio:                      32.89%  691.15m
>       Miss Ratio:                     67.11%  1.41b
>       Feeds:                                  3.70m
> 
> L2 ARC Buffer:
>       Bytes Scanned:                          10.70   PiB
>       Buffer Iterations:                      3.70m
>       List Iterations:                        236.30m
>       NULL List Iterations:                   24.86m
> 
> L2 ARC Writes:
>       Writes Sent:                    100.00% 3.38m
> 
> 
> Here we can see that the Header Size is currently almost 17 GiB.
> This header size grows continuously, without any apparent limit.
> Also, ZFS appears to think it is holding 1.89 TiB in the L2ARC, which
> seems very unlikely.
> 
> # freebsd-version
> 10.1-RELEASE-p13
> 
> # uname -a
> FreeBSD servername 10.1-RELEASE-p10 FreeBSD 10.1-RELEASE-p10 #0: Wed May
> 13 06:54:13 UTC 2015
> r...@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  amd64
> 
> # uptime
>  4:35PM  up 42 days, 15:24, 1 user, load averages: 1.35, 0.96, 0.84
> 
> 
> Does anyone know how we can alleviate the issue?
> We originally thought the issue was caused by
> https://www.freebsd.org/security/advisories/FreeBSD-EN-15:07.zfs.asc
> 
> We have updated our servers since, but the header size still keeps
> growing. For reference, we have multiple BSD file servers which are
> used mostly over NFS, all with identical configuration (but varying
> workloads). They all still show these symptoms.
> 
> Any tips/hints/pointers are appreciated!
> 
> With kind regards,
> 
> Daniel
> _______________________________________________
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
> 


-- 
With kind regards,

Daniel Genis

Technical Staff
Byte Internet

W http://www.byte.nl/
E dan...@byte.nl
T 020 521 6226
F 020 521 6227