Breaking Varnish

2009-01-21 Thread Tim Kientzle
We're evaluating Varnish as a possible replacement for our
installed Squid servers.  Performance-wise, Varnish is very
impressive, and we're pretty pleased with the configuration
flexibility.

But...

Under heavy load, we're seeing a lot of segfaults and
assertion failures.  I've pasted excerpts below from
two of the issues we've seen using Varnish 2.0.2 on a Linux
2.6.21 kernel with the default VCL (using command-line options
to set the listen address and the addresses of the two back-end
servers).

We're going to repeat these tests and see if we can get
more detail, possibly including core dumps.  What other
information would be useful in diagnosing and fixing
these issues?

Cheers,

Tim Kientzle

==

1) Varnish repeatedly died due to SIGSEGV:

child (2816) Started
Child (2816) said Closed fds: 4 7 8 10 11
Child (2816) said Child starts
Child (2816) said managed to mmap 49392648192 bytes of 49392648192
Child (2816) said Ready
Child (2816) died signal=11
Child cleanup complete

2) Varnish repeatedly died due to SIGABRT:

child (3017) Started
Child (3017) said Closed fds: 4 7 8 10 11
Child (3017) said Child starts
Child (3017) said managed to mmap 49392648192 bytes of 49392648192
Child (3017) said Ready
Child (3017) died signal=6
Child (3017) Panic message: Assert error in cnt_lookup(), cache_center.c line 625:
   Condition(sp->objhead != NULL) not true. thread = (cache-worker)sp = 0x2afee0fb3008 {
   fd = -1, id = 15, xid = 0,
   client = 10.2.8.27:45430,
   step = STP_DONE,
   handling = DELIVER,
   ws = 0x2afee0fb3078 {
 id = sess,
 {s,f,r,e} = {0x2afee0fb37b0,,+587,(nil),+8192},
   },
}, 
___
varnish-misc mailing list
varnish-misc@projects.linpro.no
http://projects.linpro.no/mailman/listinfo/varnish-misc


Re: 2.0.3 planning

2009-01-08 Thread Tim Kientzle
This is a very strange comment.  If Varnish requires a
particular sequence, it should implement its own.  If
it requires particular statistical properties, it should
test for those, not test for a specific sequence.
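
To illustrate the first option, here is a minimal sketch of a
self-contained generator that a test harness could carry with it.
This is not Varnish code: the names test_rng, test_rng_seed() and
test_rng_next() are hypothetical, and I'm assuming the tests only need
a reproducible sequence, not strong randomness.  The constants are the
64-bit linear congruential parameters Knuth gives for MMIX.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical test-only generator; not part of Varnish. */
struct test_rng {
	uint64_t state;
};

/* Start the sequence from a known seed. */
void
test_rng_seed(struct test_rng *rng, uint64_t seed)
{
	assert(rng != NULL);
	rng->state = seed;
}

/*
 * Advance the 64-bit LCG (Knuth's MMIX multiplier and increment) and
 * return the high 32 bits of the new state.  Unsigned arithmetic wraps
 * modulo 2^64, so the result is the same on every platform.
 */
uint32_t
test_rng_next(struct test_rng *rng)
{
	assert(rng != NULL);
	rng->state = rng->state * UINT64_C(6364136223846793005) +
	    UINT64_C(1442695040888963407);
	return ((uint32_t)(rng->state >> 32));
}
```

Two generators seeded with the same value produce the same outputs, so
a test built on something like this would no longer depend on what the
local libc does for srandom(1).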

Tim


On Jan 8, 2009, at 2:12 AM, Tollef Fog Heen wrote:

 
 r3367 | phk | 2008-11-10 10:37:21 +0100 (ma., 10 nov. 2008) | 14 lines

 Add a toplevel word which examines the sequence returned by
 srandom(1) and stops the test if we do not get the same sequence
 as we expect.

 The Open Group does not define which deterministic sequence srandom(1)
 should result in, only that it be deterministic, but I have high hopes
 in the general sanity and expect that UNIX people across the board
 have realized that for portability the same sequence should be
 returned on all platforms.

 At the very least FreeBSD and Linux/GLIBC, as seen on
 projects.linpro.no, agree.



Re: Logged-in users

2008-11-26 Thread Tim Kientzle
Another approach is to simply use a small bit of JavaScript.  It's
easy to test for the existence of the cookie in JavaScript and
set that text conditionally.

Then you have only one copy of the page to be cached.

The problem with the approach you've outlined here is
that other downstream caches won't understand the difference
(although most will simply refuse to cache any response
if the request had a Cookie header).  The JavaScript approach,
by contrast, also lets downstream caches cache everything
efficiently.

Tim



On Nov 26, 2008, at 12:31 PM, Miles wrote:

 Miles wrote:
 Hi,

 I have a site where users can log in.  This sets a cookie with their
 encrypted login details, so they can be authenticated.  There are a
 small number of pages which are user-specific (change your details
 forms, etc), and these are set not to cache.

 When a user is logged in, a message is shown at the top of the page:
 "You are now logged in".  However, nothing on the page depends on the
 individual user.

 My question is, how can I organise the cache to have the most cache
 hits, given that there are effectively two versions of each page: one
 for logged-in users, and one for anonymous users?  I want to
 specifically avoid each user having their own version of the page
 stored in the cache.

 Thanks in advance for any wisdom anyone can share!

 Miles

 Thanks to everyone who suggested using ESI - I may have to use this,
 but would quite like to avoid it, as it's useful to be able to run the
 app without Varnish in front for development/testing.

 I wondered whether it was possible to use vcl_hash for my purposes, as
 follows:

 sub vcl_hash {
     // hash the object with url+host
     set req.hash += req.url;
     set req.hash += req.http.host;

     # see if the user has a cookie to indicate they are logged in
     if (req.http.cookie ~ "__ac=") {
         set req.hash += "authenticated";
     } else {
         set req.hash += "anonymous";
     }
     hash;
 }

 Would this give me the two representations that I require for each
 page - or am I going down a route that will turn out bad?!  I couldn't
 find much information about vcl_hash, so I'm not sure if I'm barking up
 the wrong tree or not...

 Regards,

 Miles



Getting started...

2008-11-12 Thread Tim Kientzle
I'm trying to just run a plain-vanilla varnish so I can see it running  
before I start mucking with configuration.

But I'm not having much luck:

$ uname -a
Darwin tbkk.local 9.5.0 Darwin Kernel Version 9.5.0: Wed Sep  3 11:29:43 PDT 2008; root:xnu-1228.7.58~1/RELEASE_I386 i386
$ sbin/varnishd -a 127.0.0.1:3128 -b 127.0.0.1:80 -d
storage_file: filename: ./varnish.2wA0fp (unlinked) size 669 MB.
Using old SHMFILE
Debugging mode, enter "start" to start child
$ echo $?
2
$

So, varnishd simply exits with no explanation at all.

After the above, bin/varnishlog just hangs with no output.

I finally resorted to running varnishd under GDB, which shows that
vev_schedule_one() is getting NULL from binheap_root(), which leads it
to return zero, which causes vev_schedule() to return, which causes
mgt_schedule() to log "manager dies" and exit(2).

What have I missed?

Are there any good examples just showing how to run varnish?

Tim



Re: Getting started...

2008-11-12 Thread Tim Kientzle
Ah...  It seems to work if I omit the -d option.

Tim



Inspect Request bodies?

2008-11-05 Thread Tim Kientzle
Under certain circumstances, I want to inspect the body of a
POST request at the proxy cache.

I don't see any hooks for this in the current Varnish 2.0.1,
but I've skimmed the source and it looks feasible:

  * I'll need code to actually read and store the POST body in memory
    (including updates to the PASS handler and other places to
    use the in-memory data when it's available)

  * I'll need to add VCL functions to actually analyze the POST body.

The second part looks pretty straightforward.  The VCL engine
seems quite modular and extensible.  Because VCL routines run in
per-request threads, it should be feasible to do more time-consuming
operations using straightforward sequential code.  (I've also looked
at extending Squid or Nginx, but breaking down some of these operations
into the necessary state machines would be rather tedious.)

The first part looks trickier.  Has anyone here tried anything
similar?  Any pointers (particular source files I should pay attention
to or memory-management issues I should keep in mind)?

Finally, has anyone else encountered similar requirements that
might benefit from this?  (I.e., if I do get this to work, is
it worth cleaning up the code to contribute back?)

Of course, if Varnish already provides some of this and I've
simply missed it, then that's even better. ;-)

Cheers,

Tim

P.S.  For the curious, there are two specific issues I'm
exploring:  First, I have an API which prefers GET but supports
POST if the arguments are too long; I'd like to accurately
cache responses to these larger requests.  Second, I've been
exploring request-signing techniques borrowed from OAuth.  Both
of these boil down to computing a hash over all query arguments,
including those in the POST body.  So far, I've been handling
these issues at the app server, but I've got a growing suite
of applications running in that layer and I'd like to move the
redundant code into a common proxy layer, so I've been surveying
existing proxy implementations to see which ones are most amenable
to this kind of extension.
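
To make the "hash over all query arguments" idea concrete, here is a
rough sketch under some loud assumptions: the arguments arrive as plain
key=value pairs separated by '&' in both the query string and a
form-encoded POST body, no percent-decoding is done, and FNV-1a stands
in for whatever hash a real cache key or signature would use (OAuth
itself specifies a different normalization and an HMAC).  The names
hash_request_args(), collect_args() and MAX_ARGS are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ARGS 64	/* arbitrary cap for this sketch */

/* One "key=value" token, referenced in place rather than copied. */
struct arg {
	const char *p;
	size_t len;
};

/* Order tokens bytewise, shorter first when one is a prefix of the other. */
static int
arg_cmp(const void *a, const void *b)
{
	const struct arg *x = a, *y = b;
	size_t n = x->len < y->len ? x->len : y->len;
	int c = memcmp(x->p, y->p, n);

	if (c != 0)
		return (c);
	return ((x->len > y->len) - (x->len < y->len));
}

/* Split s on '&', appending non-empty tokens; -1 if the cap is hit. */
static int
collect_args(const char *s, struct arg *args, int n)
{
	while (s != NULL && *s != '\0') {
		size_t len = strcspn(s, "&");

		if (len > 0) {
			if (n == MAX_ARGS)
				return (-1);
			args[n].p = s;
			args[n].len = len;
			n++;
		}
		s += len;
		if (*s == '&')
			s++;
	}
	return (n);
}

/*
 * Hash the arguments from the query string and the POST body together,
 * independent of where each one appeared.  Returns 0 on success.
 */
int
hash_request_args(const char *query, const char *body, uint64_t *out)
{
	struct arg args[MAX_ARGS];
	uint64_t h = UINT64_C(14695981039346656037);	/* FNV offset basis */
	size_t j;
	int i, n;

	assert(out != NULL);
	n = collect_args(query, args, 0);
	if (n >= 0)
		n = collect_args(body, args, n);
	if (n < 0)
		return (-1);
	qsort(args, (size_t)n, sizeof args[0], arg_cmp);
	for (i = 0; i < n; i++) {
		for (j = 0; j < args[i].len; j++) {
			h ^= (unsigned char)args[i].p[j];
			h *= UINT64_C(1099511628211);	/* FNV prime */
		}
		h ^= (unsigned char)'&';	/* token separator */
		h *= UINT64_C(1099511628211);
	}
	*out = h;
	return (0);
}
```

Because the tokens are sorted before hashing, a GET with
"x=1&y=2&z=3" and a POST that carries some of those pairs in the body
end up with the same value, which is the property I need for caching.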



Re: Inspect Request bodies?

2008-11-05 Thread Tim Kientzle
Thanks, Poul-Henning!  These are exactly the hints I needed.

Agree completely about it being controllable in VCL; my
own environment has a mix of requests of widely-varying sizes
and I certainly don't want this for large uploads.
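
For the buffering side, the shape I have in mind is roughly the
following.  It is only a sketch: it uses a plain heap buffer with a
caller-supplied limit instead of Varnish's own object storage, and the
names body_buf, body_buf_init(), body_buf_append() and body_buf_free()
are hypothetical.  The limit is the part that would be set from VCL.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory request body with a hard size limit. */
struct body_buf {
	char *data;
	size_t len;	/* bytes stored so far */
	size_t cap;	/* bytes allocated */
	size_t limit;	/* bodies larger than this are not buffered */
};

void
body_buf_init(struct body_buf *b, size_t limit)
{
	assert(b != NULL);
	memset(b, 0, sizeof *b);
	b->limit = limit;
}

/*
 * Append n bytes.  Returns 0 on success, or -1 if the limit would be
 * exceeded or memory runs out; the buffer is unchanged on failure.
 */
int
body_buf_append(struct body_buf *b, const void *src, size_t n)
{
	size_t need, ncap;
	char *p;

	assert(b != NULL);
	if (n == 0)
		return (0);
	if (n > b->limit - b->len)
		return (-1);
	need = b->len + n;
	if (need > b->cap) {
		ncap = b->cap ? b->cap : 256;
		while (ncap < need) {
			if (ncap > b->limit / 2) {
				ncap = b->limit;	/* never grow past the limit */
				break;
			}
			ncap *= 2;
		}
		p = realloc(b->data, ncap);
		if (p == NULL)
			return (-1);
		b->data = p;
		b->cap = ncap;
	}
	memcpy(b->data + b->len, src, n);
	b->len += n;
	return (0);
}

void
body_buf_free(struct body_buf *b)
{
	assert(b != NULL);
	free(b->data);
	memset(b, 0, sizeof *b);
}
```

A caller that gets -1 back would stop buffering and pass the body
through untouched, which is the behaviour I'd want for large uploads.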

Tim


On Nov 5, 2008, at 11:37 AM, Poul-Henning Kamp wrote:

 In message [EMAIL PROTECTED], Tim Kientzle writes:

 * I'll need code to actually read and store the POST body in memory
   (including updates to the PASS handler and other places to
   use the in-memory data when it's available)

 We sort of have this as point 15 on our shoppinglist:

   (http://varnish.projects.linpro.no/wiki/PostTwoShoppingList)

 The crucial point here is that we want it to be controllable in
 VCL, so that people can disable it for GB sized uploads and enable
 it for short stuff (or vice versa) if they want.

 * I'll need to add VCL functions to actually analyze the POST body.

 To be honest, I would probably just use the inline C facility and do
 it there, rather than trying to generalize it into a VCL extension.

 The first part looks trickier.  Has anyone here tried anything
 similar?  Any pointers (particular source files I should pay
 attention to or memory-management issues I should keep in mind)?

 It's pretty straightforward really:  allocate a (non-hashed)
 object, add storage to it and store the contents there.

 You can see pretty much all the code you need in cache_fetch.c and
 for it to go into the tree as a patch, I would insist that the
 code gets generalized so we use the same code in both directions,
 rather than have two copies.


 -- 
 Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
 [EMAIL PROTECTED] | TCP/IP since RFC 956
 FreeBSD committer   | BSD since 4.3-tahoe
 Never attribute to malice what can adequately be explained by incompetence.
