Howdy, We are finally getting around to upgrading to the latest version of varnish and are running into quite a weird problem. Everything works fine for a bit (1+day) , then all of a sudden Varnish starts nuking all of the objects from the cache:
About 4 hours ago there were 1 million objects in the cache, now there are just about 172k. This looks a bit weird to me: sms_nbytes 18446744073709548694 . SMS outstanding bytes Here are the options I am passing to varnishd: /usr/local/sbin/varnishd -a 0.0.0.0:8888 -f /etc/varnish/varnish.vcl -P /var/run/varnishd.pid -T 0.0.0.0:47200 -t 600 -w 1,200,300 -p thread_pools 4 -p thread_pool_add_delay 2 -p lru_interval 60 -h classic,500009 -p obj_workspace 4096 -s file,/varnish/cache,150G /varnish is 2 x 80GB Intel X-25M SSDs in a software RAID 0 array. OS is Debian Lenny 64-bit. There is plenty of space: /dev/md0 149G 52G 98G 35% /varnish Here is the output of varnishstat -1 uptime 134971 . Child uptime client_conn 12051037 89.29 Client connections accepted client_drop 0 0.00 Connection dropped, no sess client_req 12048672 89.27 Client requests received cache_hit 10161272 75.28 Cache hits cache_hitpass 133244 0.99 Cache hits for pass cache_miss 1750857 12.97 Cache misses backend_conn 1824594 13.52 Backend conn. success backend_unhealthy 0 0.00 Backend conn. not attempted backend_busy 0 0.00 Backend conn. too many backend_fail 3644 0.03 Backend conn. failures backend_reuse 0 0.00 Backend conn. reuses backend_toolate 0 0.00 Backend conn. was closed backend_recycle 0 0.00 Backend conn. recycles backend_unused 0 0.00 Backend conn. unused fetch_head 5309 0.04 Fetch head fetch_length 1816422 13.46 Fetch with Length fetch_chunked 0 0.00 Fetch chunked fetch_eof 0 0.00 Fetch EOF fetch_bad 0 0.00 Fetch had bad headers fetch_close 0 0.00 Fetch wanted close fetch_oldhttp 0 0.00 Fetch pre HTTP/1.1 closed fetch_zero 0 0.00 Fetch zero len fetch_failed 16 0.00 Fetch failed n_srcaddr 0 . N struct srcaddr n_srcaddr_act 0 . N active struct srcaddr n_sess_mem 578 . N struct sess_mem n_sess 414 . N struct sess n_object 172697 . N struct object n_objecthead 173170 . N struct objecthead n_smf 471310 . N struct smf n_smf_frag 62172 . N small free smf n_smf_large 67978 . N large free smf n_vbe_conn 18446744073709551611 . N struct vbe_conn n_bereq 315 . N struct bereq n_wrk 76 . N worker threads n_wrk_create 3039 0.02 N worker threads created n_wrk_failed 0 0.00 N worker threads not created n_wrk_max 0 0.00 N worker threads limited n_wrk_queue 0 0.00 N queued work requests n_wrk_overflow 25136 0.19 N overflowed work requests n_wrk_drop 0 0.00 N dropped work requests n_backend 4 . N backends n_expired 771687 . N expired objects n_lru_nuked 744693 . N LRU nuked objects n_lru_saved 0 . N LRU saved objects n_lru_moved 8675178 . N LRU moved objects n_deathrow 0 . N objects on deathrow losthdr 25 0.00 HTTP header overflows n_objsendfile 0 0.00 Objects sent with sendfile n_objwrite 11749415 87.05 Objects sent with write n_objoverflow 0 0.00 Objects overflowing workspace s_sess 12051007 89.29 Total Sessions s_req 12050184 89.28 Total Requests s_pipe 2661 0.02 Total pipe s_pass 134858 1.00 Total pass s_fetch 1821721 13.50 Total fetch s_hdrbytes 3932274894 29134.22 Total header bytes s_bodybytes 894452020319 6626994.10 Total body bytes sess_closed 12050925 89.29 Session Closed sess_pipeline 0 0.00 Session Pipeline sess_readahead 0 0.00 Session Read Ahead sess_linger 0 0.00 Session Linger sess_herd 160 0.00 Session herd shm_records 610011852 4519.58 SHM records shm_writes 50380066 373.27 SHM writes shm_flushes 9387 0.07 SHM flushes due to overflow shm_cont 47763 0.35 SHM MTX contention shm_cycles 226 0.00 SHM cycles through buffer sm_nreq 4449213 32.96 allocator requests sm_nobj 341160 . outstanding allocations sm_balloc 11373072384 . bytes allocated sm_bfree 116602589184 . bytes free sma_nreq 0 0.00 SMA allocator requests sma_nobj 0 . SMA outstanding allocations sma_nbytes 0 . SMA outstanding bytes sma_balloc 0 . SMA bytes allocated sma_bfree 0 . SMA bytes free sms_nreq 63997 0.47 SMS allocator requests sms_nobj 0 . SMS outstanding allocations sms_nbytes 18446744073709548694 . SMS outstanding bytes sms_balloc 31161028 . SMS bytes allocated sms_bfree 31162489 . SMS bytes freed backend_req 1821961 13.50 Backend requests made n_vcl 1 0.00 N vcl total n_vcl_avail 1 0.00 N vcl available n_vcl_discard 0 0.00 N vcl discarded n_purge 1 . N total active purges n_purge_add 1 0.00 N new purges added n_purge_retire 0 0.00 N old purges deleted n_purge_obj_test 0 0.00 N objects tested n_purge_re_test 0 0.00 N regexps tested against n_purge_dups 0 0.00 N duplicate purges removed hcb_nolock 0 0.00 HCB Lookups without lock hcb_lock 0 0.00 HCB Lookups with lock hcb_insert 0 0.00 HCB Inserts esi_parse 0 0.00 Objects ESI parsed (unlock) esi_errors 0 0.00 ESI parse errors (unlock) Obviously, this isn't good for my cache hit rate :) It is also using a lot of CPU. Has anyone seen this happen before? -- Barry Abrahamson | Systems Wrangler | Automattic Blog: http://barry.wordpress.com _______________________________________________ varnish-misc mailing list varnish-misc@projects.linpro.no http://projects.linpro.no/mailman/listinfo/varnish-misc