Re: [squid-users] Squid with PHP & Apache

Eliezer Croitoru Tue, 26 Nov 2013 21:50:43 -0800

Hey Ghassan,

Moving from PHP to C++ is a nice idea.

I do not know the size of the cache or its limits, but there are a couple of things to consider while implementing the cache:

* client latency
* server overload
* total cost
* efficiency of the cache

Bandwidth can cost a lot of money in some cases, and some are willing to pay for it. YouTube by itself is a beast, since the number of visits per video might not be worth all the effort invested in a single video file\chunk.

Specifically on YouTube you need to grab the response headers and in some cases even filter a couple of them. If you are caching and you are 99.5% sure that this "chunk" or "file" is OK as it is, then as far as the object goes the headers can be considered a side effect, but in some cases they are important. A compromise between serving response headers from a file and fetching them "from source" is to fetch new headers\object only when the headers "file" or container has been deleted, or when the expiration headers are "out-of-date".
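A minimal sketch of that compromise, assuming the script stores each object's response headers as a JSON file next to the object. The file layout, the function names and the list of filtered headers are all assumptions made for illustration, not something Squid or any existing script does:

```python
import json
import os
import time
from email.utils import parsedate_to_datetime

# Hypothetical set of headers to drop before replaying a stored response.
FILTERED_HEADERS = {"set-cookie", "connection", "transfer-encoding", "keep-alive"}


def filter_headers(headers):
    """Return a copy of headers without the ones listed in FILTERED_HEADERS."""
    return {k: v for k, v in headers.items() if k.lower() not in FILTERED_HEADERS}


def headers_are_fresh(headers, now=None):
    """True if the stored headers carry an Expires date that is still in the future."""
    now = time.time() if now is None else now
    expires = next((v for k, v in headers.items() if k.lower() == "expires"), None)
    if expires is None:
        return False  # no expiry information: treat as out-of-date
    try:
        return parsedate_to_datetime(expires).timestamp() > now
    except (TypeError, ValueError):
        return False  # unparsable date (for example "0"): treat as out-of-date


def load_cached_headers(path, now=None):
    """Return the stored headers, or None when they must be fetched from the source."""
    if not os.path.exists(path):
        return None  # the headers file was deleted
    with open(path, "r", encoding="utf-8") as fh:
        headers = json.load(fh)
    return headers if headers_are_fresh(headers, now) else None
```

When load_cached_headers() returns None the caller would go back to the origin for fresh headers (and possibly the object); otherwise it can replay filter_headers() of the stored set.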


The main issue with 302 is the concept behind it.

I have seen that in the past 302 was used to give the upstream proxy\CDN node enough time to fetch more data, but in some cases it was an honest redirection towards the best origin server.

If you know that a site uses 302 responses, handle them per site rather than in a global way.
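Handling 302 per site could look roughly like the sketch below. The hostnames, the policy names and the default are made up for illustration; the point is only that the decision is looked up per site instead of being one global rule:

```python
from urllib.parse import urlsplit

# Hypothetical per-site table; hostnames and policy names are assumptions.
REDIRECT_POLICY = {
    "video.example.com": "follow",  # the 302 only points at a better node: fetch the target
    "cdn.example.net": "pass",      # hand the 302 to the client untouched
}
DEFAULT_POLICY = "pass"


def redirect_policy(url):
    """Return the 302 policy for url, matching the host or any parent domain."""
    host = (urlsplit(url).hostname or "").lower()
    parts = host.split(".")
    for i in range(len(parts)):
        candidate = ".".join(parts[i:])
        if candidate in REDIRECT_POLICY:
            return REDIRECT_POLICY[candidate]
    return DEFAULT_POLICY
```

Sites that are not in the table fall back to passing the redirect through, which is the safe choice when you do not know why the site redirects.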

The Content-Type is taken from the origin server headers, since this is probably what the client application expects. On a web server the Content-Type can be decided by the file extension, but this is not how Squid handles HTTP requests at all.

Squid's algorithms are pretty simple and consider the basic "shape" of the object from the headers.

It is indeed an overhead to fetch a couple of headers from the web, and there are some cases in which it can be avoided, but re-validating the integrity of the object\file is kind of important.


Back to the beginning of the Email:

If you do "know" that the object as it is now will not change, for example as the owner of the web service, you can even serve the client "stale" content.


There is no force in the world that stops you from doing that.

I can say that, for example, for YouTube I was thinking about another approach which would "rank" videos and consider removing videos that were used only once or twice in two weeks (which depends on the size of the storage and the load).
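As a rough sketch of such a ranking, assuming the cache keeps a list of request timestamps per video (the data layout, the function name and the threshold of three hits are assumptions, to be tuned to the storage size and load):

```python
import time

TWO_WEEKS = 14 * 24 * 3600  # window length in seconds


def eviction_candidates(hits, now=None, window=TWO_WEEKS, min_hits=3):
    """hits maps a video id to a list of request timestamps (seconds since epoch).

    Returns the ids with fewer than min_hits requests inside the window,
    least used first.
    """
    now = time.time() if now is None else now
    recent = {
        vid: sum(1 for t in stamps if now - t <= window)
        for vid, stamps in hits.items()
    }
    cold = [vid for vid, count in recent.items() if count < min_hits]
    return sorted(cold, key=lambda vid: (recent[vid], vid))
```

With the defaults, a video requested only once or twice in the last two weeks is returned as a removal candidate, while one requested three or more times stays.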

If you do have a strong server that can run PHP, you can take Squid with StoreID for a spin, which can help you use only Squid for YouTube video caching.

The only thing you will need to take care of is the 302 responses, with an ICAP service for example.

I do know how tempting it is to use PHP, and in many cases it can be better for a network to use a solution other than Squid alone.


I do not know if you have seen this article:
http://wiki.squid-cache.org/ConfigExamples/DynamicContent/Coordinator

The article shows a couple of aspects of YouTube caching.

There was some PHP code at:
http://code.google.com/p/yt-cache/

Which I saw a long time ago (2011-12).

StoreID is in the 3.4 branch of Squid and is still in the beta stage:
http://wiki.squid-cache.org/Features/StoreID

The StoreID code by itself is very well tested, and I have been using it on a daily basis without restarting\reloading my local server even once for a very long time. I have not yet received reports in my email from a very big (clustered) production environment.

The basic idea of StoreID is to take the existing internals of Squid and "unleash" them in a way that they can be exploited\used by an external helper.
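To give a feeling for what such an external helper looks like, here is a minimal sketch. Squid writes one request per line to the helper's stdin (the URL, optionally preceded by a channel ID when concurrency is enabled, and followed by extra fields), and the helper answers "OK store-id=..." or "ERR". The rewrite rule is a made-up example for mirrors under video.example.com and is an assumption, not a working YouTube rule:

```python
import re
import sys

# Hypothetical rule: every host under video.example.com serves the same files,
# so all of them map to one key. The ".squid.internal" suffix is only a
# convention for keys that are never fetched.
RULES = [
    (re.compile(r"^http://[^/]+\.video\.example\.com/(.+)$"),
     r"http://video.example.com.squid.internal/\1"),
]


def store_id_for(url):
    """Return the normalized store ID for url, or None when no rule matches."""
    for pattern, template in RULES:
        match = pattern.match(url)
        if match:
            return match.expand(template)
    return None


def answer(line):
    """Build the reply for one request line: '[channel-ID] URL [extras]'."""
    fields = line.split()
    if not fields:
        return "ERR"
    channel = ""
    if fields[0].isdigit() and len(fields) > 1:
        channel = fields[0] + " "  # echo the channel ID back to Squid
        fields = fields[1:]
    store_id = store_id_for(fields[0])
    if store_id is None:
        return channel + "ERR"  # no rule: Squid keeps using the URL as the key
    return channel + "OK store-id=" + store_id


def main():
    for line in sys.stdin:
        sys.stdout.write(answer(line) + "\n")
        sys.stdout.flush()  # Squid waits for each reply, so do not buffer

# When installed as a helper, finish the script with a call to main().
```

The helper is hooked into squid.conf with the store_id_program directive (together with store_id_children and store_id_access); see the StoreID wiki page for the exact options of your version.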

StoreID is not here to replace PHP or any other method that might fit a network; it is there to let the admin see the power of Squid caching even in this "dead-end" case which requires acrobatics.

You can just try it in a small testing environment and see if it fits you.

One of the benefits that Apache+PHP has is the "threading", which allows one service such as Apache to utilize as much horsepower as the machine has as "metal". Since Squid is already there, the whole internal traffic between Apache and Squid can be "spared" by using StoreID.

Note that fetching *only* the headers from the origin server can still help you decide whether you want to fetch the whole object from it. Fetching a whole header set, which will not exceed 1KB, is worth it even for a 200KB file in many cases.
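A small sketch of that headers-only check, assuming the origin answers a HEAD request with the same headers it would send for a GET (not every server does) and that the cache kept the ETag, Last-Modified and Content-Length of the stored copy. The function names and the choice of those three headers are assumptions:

```python
import urllib.request

VALIDATORS = ("etag", "last-modified", "content-length")


def fetch_headers(url, timeout=10):
    """Send a HEAD request and return the response headers with lower-cased names."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return {name.lower(): value for name, value in response.headers.items()}


def needs_refetch(cached, fresh):
    """Compare stored metadata with fresh headers (both with lower-cased names).

    True means the whole object should be fetched again.
    """
    shared = [n for n in VALIDATORS if n in cached and n in fresh]
    if not shared:
        return True  # nothing to compare against: be conservative
    return any(cached[n] != fresh[n] for n in shared)
```

A call such as needs_refetch(stored_headers, fetch_headers(url)) costs one small round trip and tells you whether the large download can be skipped.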

I have tried not to miss anything, but I do not want to write a whole scroll about it yet, so if there is more interest I will add more later.


Regards,
Eliezer

On 25/11/13 23:13, Ghassan Gharabli wrote:

  Hi,

I have built a PHP script to cache HTTP 1.X 206 Partial Content like
"WindowsUpdates" & allow seeking through YouTube & many websites.

I am willing to move from PHP to C++ hopefully after a while.

The script is almost finished, but I have several questions. I have no
idea if I should always grab the HTTP Response Headers and send them
back to the browsers.

1) Does Squid still grab the "HTTP Response Headers" even if the
object is already in cache, or does Squid already have a cached copy of
the HTTP Response headers? If Squid caches HTTP Response Headers, then how
do you deal with HTTP code 302 if the object is already cached? I am
asking this question because I have already seen most websites use the
same extensions, such as .FLV, including a Location header.

2) Do you also use mime.conf to send the Content-Type to the browser
in case of FTP/HTTP or only FTP ?

3) Does Squid compare the length of the local cached copy with the
remote file if you already have the object file, or do you use
refresh_pattern?

4) What happens if the user modifies a refresh_pattern to cache an
object, for example .xml, which does not have a [Content-Length] header?
Do you still save it, or would you search for the ignore-headers used
to force caching the object? And what happens if the cached copy
expires: do you still refresh the copy even if there is no
Content-Length header?

I am really confused by this issue, because I am always getting a
headers list from the Internet and sending it back to the browser
(using PHP and Apache) even if the object is in cache.

Your help and answers will be much appreciated

Thank you

Ghassan
