> In that case, hashing the URL only would prevent you from adding new domains through your Varnish server. It won't hurt if you know you will only ever have one domain to deal with, but hashing the host will also not hurt as long as you normalize it to a unique value.
Hi,

Let me elaborate on my architecture. I have several backend servers that serve HLS fragments for a live video stream, e.g.:

```
hls_backend_01
hls_backend_02
hls_backend_03
hls_backend_04
hls_backend_05
hls_backend_06
hls_backend_07
hls_backend_08
hls_backend_09
hls_backend_10
```

All HLS backend servers hold the same content, and there are 5 Varnish servers in front of them for caching.

Now, if I use the round-robin director on the Varnish servers, then because Varnish caches on "req.http.host + req.url", I think it would cache the same content twice when it comes from different backends. For example, if the first request for the "test.ts" file goes to the "hls_backend_01" backend server, Varnish caches it. The next request from another client goes to "hls_backend_02" because of the round-robin director, and Varnish would cache the same file again due to the different "req.http.host".

So one solution is to use the shard director with "key=req.url" instead of round-robin. Another way is to keep round-robin but adjust vcl_hash to something like below:

```
sub vcl_hash {
    hash_data(req.url);
    return (lookup);
}
```

This way Varnish hashes only "req.url", not "req.http.host", so it would cache the content based on its uniqueness, not based on the difference between backends.

1. At first, I asked how I can normalize it. Is that possible at all, given what I described? Would you please explain it further with an example?

2. You gave an example about other domains. I do not understand what the domain has to do with my case.

3. Maybe I am thinking about this the wrong way: if Varnish hashes only req.url with 'hash_data(req.url)', it should not cache the same content again for a different backend. For example, my request is http://varnish-01:/hls/test.ts. The first request goes to the "hls_backend_01" backend and is cached, and the next request goes to the "hls_backend_02" backend. Does Varnish cache it again for each request because the backends are different?

Many Thanks,
Hamidreza

________________________________
From: varnish-misc <varnish-misc-bounces+hrhosseini=hotmail....@varnish-cache.org> on behalf of Dridi Boukelmoune <dr...@varni.sh>
Sent: Sunday, August 15, 2021 10:30 PM
To: varnish-misc@varnish-cache.org <varnish-misc@varnish-cache.org>
Subject: Re: Best practice for caching scenario with different backend servers but same content

On Sat, Aug 14, 2021 at 10:54 AM Hamidreza Hosseini <hrhosse...@hotmail.com> wrote:
>
> Hi,
> Thanks to you and all the Varnish team for such answers, which helped me a lot.
> I read the default Varnish cache configuration again:
> https://github.com/varnishcache/varnish-cache/blob/6.0/bin/varnishd/builtin.vcl
> and found vcl_hash as follows:
>
> ```
> sub vcl_hash {
>     hash_data(req.url);
>     if (req.http.host) {
>         hash_data(req.http.host);
>     } else {
>         hash_data(server.ip);
>     }
>     return (lookup);
> }
> ```
>
> So, if I change vcl_hash as follows, would it be enough for my
> purpose? (I mean caching the same object from different backends just once
> with the round-robin director.)
>
> ```
> sub vcl_hash {
>     hash_data(req.url);
>     return (lookup);
> }
> ```
>
> With this config I tell Varnish to cache the content based on 'req.url' only,
> not 'req.http.host', so for the same content from different backends
> Varnish would cache it once (if I want to use the round-robin director instead
> of the shard director). Is this true? What bad consequences might this
> configuration cause in the future?

In this case req.http.host usually refers to the domain end users
resolve to find your varnish server (or other hops in front of it). It
is usually the same for every client, let's take www.myapp.com as an
example. If your varnish server is in front of multiple services, you
should be handling the different host headers explicitly. For example,
if you have exactly two domains you should normalize them to some
canonical form.
Using the same example domain, that could be www.myapp.com and
static.myapp.com for instance.

In that case hashing the URL only would prevent you from adding new
domains through your Varnish server. It won't hurt if you know you
will only ever have one domain to deal with, but hashing the host will
also not hurt as long as you normalize it to a unique value.

You are correct that by default hashing the request appropriately will
help the shard director do the right thing out of the box. I remember
however that you only wanted to hash a subset of the URL for video
segments, so hashing the URL as-is won't provide the behavior you are
looking for.

Dridi
_______________________________________________
varnish-misc mailing list
varnish-misc@varnish-cache.org
https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc
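
For reference, the host normalization described in the quoted reply can be sketched in VCL roughly as follows. This is a minimal sketch under stated assumptions, not the configuration used by anyone in this thread: it assumes VCL 4.1 with vmod_directors, the backend addresses are placeholders, only two of the ten backends are shown, and treating every host other than static.myapp.com as www.myapp.com is an assumption made purely for illustration. The built-in vcl_hash is left untouched, so the cache key is req.url plus the normalized req.http.host. Neither value depends on which backend the director picks.

```vcl
vcl 4.1;

import directors;

# Placeholder backends: the addresses are assumptions (documentation range),
# and only two of the ten HLS backends are listed to keep the sketch short.
backend hls_backend_01 { .host = "192.0.2.11"; .port = "80"; }
backend hls_backend_02 { .host = "192.0.2.12"; .port = "80"; }

sub vcl_init {
    # Round-robin over backends that all serve identical content.
    new hls = directors.round_robin();
    hls.add_backend(hls_backend_01);
    hls.add_backend(hls_backend_02);
}

sub vcl_recv {
    # Normalize the client-supplied Host header to one canonical value
    # per site, so case differences and port suffixes do not create
    # separate cache keys.
    if (req.http.host ~ "(?i)^static\.myapp\.com(:[0-9]+)?$") {
        set req.http.host = "static.myapp.com";
    } else {
        # Assumption: everything else is served as the main site.
        set req.http.host = "www.myapp.com";
    }

    # The backend is only used on a cache miss; it is not part of the hash.
    set req.backend_hint = hls.backend();
}

# No vcl_hash override: the built-in one hashes req.url and req.http.host.
```

With a configuration along these lines, a request for /hls/test.ts is looked up under a single key on a given Varnish server, whichever backend round-robin happened to use for the original fetch.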