[squid-users] refresh_pattern cache dynamic extensions

2010-11-30 Thread Ghassan Gharabli
Hello,

I have several questions about refresh_pattern.

Sometimes I see configurations such as:

refresh_pattern -i *.ico$
refresh_pattern -i .(css|js|xml)   #multiple extensions
refresh_pattern \.(css|js|xml)
refresh_pattern \.(css|js|xml)$
refresh_pattern -i .(css|js|xml)$
refresh_pattern .(\?.*)?$

Can anyone please explain the difference between each of these examples?
I also have another question: how do I cache multiple extensions with the
same rule, whether the URL is dynamic or static?

Example:
#I know this rule catches dynamic websites or files, but I don't know how
#to deal with multiple extensions like gif, jpeg, png
refresh_pattern .(\?.*)?$

Why do we put $, ?, or \?.* in these patterns?
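
To make the question more concrete, here is a small Perl test I put
together (the URLs are made up, just for illustration) to see how the
anchors change what matches:

#!/usr/bin/perl
# Made-up URLs, just to compare the patterns.
use strict;
use warnings;

my @urls = (
    "http://example.com/style.css",          # static
    "http://example.com/style.css?v=3",      # dynamic (query string)
    "http://example.com/main.css/extra.php", # ".css" only in the middle
);

my %patterns = (
    '\.(css|js|xml)'         => qr/\.(css|js|xml)/,          # anywhere in the URL
    '\.(css|js|xml)$'        => qr/\.(css|js|xml)$/,         # only at the very end
    '\.(css|js|xml)(\?.*)?$' => qr/\.(css|js|xml)(\?.*)?$/,  # at the end, optional ?query
);

for my $url (@urls) {
    for my $name (sort keys %patterns) {
        printf "%-25s %-40s %s\n", $name, $url,
            $url =~ $patterns{$name} ? "MATCH" : "no match";
    }
}

So, as far as I can tell, $ anchors the extension to the end of the URL,
and (\?.*)? additionally allows an optional query string after it; without
the $ the extension can appear anywhere in the URL.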


Thank  you


[squid-users] Share HTTPS over SQUID

2010-12-12 Thread Ghassan Gharabli
Hello,

I was trying to share an HTTPS connection through Squid, but it says that
I need a certificate for it to work.


The problem is that I don't use Squid for a domain or with IIS. I'm using
Squid as a transparent HTTP proxy for caching purposes and to save
traffic, so I would also like to share HTTPS, if that is possible with
transparent HTTP enabled.


I thought I could forward HTTPS requests from the MikroTik router to the
Windows Server 2003 machine that has Squid installed.


If that can work properly, how do I create a .pem file or a certificate file?


Many Thanks for your help


Re: [squid-users] Share HTTPS over SQUID

2010-12-13 Thread Ghassan Gharabli
Does OpenSSL work on Windows?


And does that mean I can share HTTPS with customers through Squid?


Any comments, please?

On Mon, Dec 13, 2010 at 9:25 PM, Ghassan Gharabli
 wrote:
> Does OpenSSL work on Windows ?
>
>
> ANd does that mean I can use HTTPS to share with customers through SQUID!
>
>
> Any Comments Please
>
> On Mon, Dec 13, 2010 at 9:33 AM, purgat  wrote:
>> Hey
>> Someone else correct me if I am wrong but I believe you can use the
>> guide at the beginning of
>> http://wiki.squid-cache.org/ConfigExamples/Reverse/SslWithWildcardCertifiate
>> to create your keys. It is written for a *nix system but if you are a
>> windows user I believe you can find a guide somewhere for OS of your
>> choice and do these on your machine.
>> good luck.
>> Purgat
>>
>>
>>
>>
>> On Mon, 2010-12-13 at 08:54 +0200, Ghassan Gharabli wrote:
>>> Hello,
>>>
>>> I was trying to share HTTPS connection through Squid but it says that
>>> i need a certificate to work!
>>>
>>>
>>> The problem that I dont use Squid for a domain or with IIS! Im using
>>> Squid as Transparent HTTP for caching purpose and to save traffic thus
>>> I also want share HTTPS if its possible to work with HTTP Transparent
>>> Enabled .
>>>
>>>
>>> I thought like I can forward https requests from MikroTik Router to
>>> Windows Server 2003 that has Squid installed .
>>>
>>>
>>> If it should work properly then how to create .pem file or a certificate 
>>> file ?
>>>
>>>
>>> Many Thanks for your help
>>
>>
>>
>


[squid-users] SQUID store_url_rewrite

2011-05-29 Thread Ghassan Gharabli
Hello,

I was trying to cache this URL:

http://down2.nogomi.com.xn55571528exgem0o65xymsgtmjiy75924mjqqybp.nogomi.com/M15/Alaa_Zalzaly/Atrak/Nogomi.com_Alaa_Zalzaly-3ali_Tar.mp3

How do you cache it, or rewrite its URL to a static domain such as:
down2.nogomi.com.xn55571528exgem0o65xymsgtmjiy75924mjqqybp.nogomi.com

Does that URL match the regex example below, or can anyone help me match
this Nogomi.com CDN?

#generic http://variable.domain.com/path/filename."ex", "ext" or "exte"
#http://cdn1-28.projectplaylist.com
#http://s1sdlod041.bcst.cdn.s1s.yimg.com
} elsif (m/^http:\/\/(.*?)(\.[^\.\-]*?\..*?)\/([^\?\&\=]*)\.([\w\d]{2,4})\??.*$/) {
   @y = ($1,$2,$3,$4);
   $y[0] =~ s/([a-z][0-9][a-z]dlod[\d]{3})|((cache|cdn)[-\d]*)|([a-zA-A]+-?[0-9]+(-[a-zA-Z]*)?)/cdn/;
   print $x . "storeurl://" . $y[0] . $y[1] . "/" . $y[2] . "." . $y[3] . "\n";

I have also tried to study more about regular expressions, but the
examples I find only cover simple URLs. I really need to learn more about
complex URLs.


Thanks for your help


Regards,
Ghassan


Re: [squid-users] SQUID store_url_rewrite

2011-05-30 Thread Ghassan Gharabli
Hello again,

#generic http://variable.domain.com/path/filename."ex", "ext" or "exte"
#http://cdn1-28.projectplaylist.com
#http://s1sdlod041.bcst.cdn.s1s.yimg.com
#} elsif (m/^http:\/\/(.*?)(\.[^\.\-]*?\..*?)\/([^\?\&\=]*)\.([\w\d]{2,4})\??.*$/) {
#   @y = ($1,$2,$3,$4);
#   $y[0] =~ s/([a-z][0-9][a-z]dlod[\d]{3})|((cache|cdn)[-\d]*)|([a-zA-A]+-?[0-9]+(-[a-zA-Z]*)?)/cdn/;
#   print $x . "storeurl://" . $y[0] . $y[1] . "/" . $y[2] . "." . $y[3] . "\n";


Why did we have to use arrays in this example?
I understand that m/ indicates a regex match operation, that "\n" breaks
the line, and that we assigned @y as an array holding the 4 captured
values so that we can refer to each one; for example, $1 (the first
capture) becomes $y[0]. Up to that point it is fine for me.
Then we apply a substitution to $y[0]:
$y[0] =~ s/([a-z][0-9][a-z]dlod[\d]{3})|((cache|cdn)[-\d]*)|([a-zA-A]+-?[0-9]+(-[a-zA-Z]*)?)/cdn/;
...

Please correct me if I am wrong here. I am still confused about those
values $1, $2, $3...
How does the program know where to find $1 or $2, since there are no
explicit values or strings assigned to them?
I have noticed that $1 refers to a captured group; for example,
http://cdn1-28.projectplaylist.com can be split into groups. I hope I am
correct on this one:
http://(cdn1-28) . (projectplaylist) . (com) would be http:// $1 . $2 . $3
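
To check my understanding of the capture groups, here is a minimal example
I tried, splitting the projectplaylist host on its dots:

#!/usr/bin/perl
# Each pair of parentheses in the pattern creates one capture group,
# filled in from left to right as $1, $2, $3 after a successful match.
use strict;
use warnings;

my $url = "http://cdn1-28.projectplaylist.com";

if ($url =~ m/^http:\/\/([^.]+)\.([^.]+)\.([^.]+)$/) {
    print "capture 1 = $1\n";   # cdn1-28
    print "capture 2 = $2\n";   # projectplaylist
    print "capture 3 = $3\n";   # com
}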

Then let me see if I can solve this one to match this URL:
http://down2.nogomi.com.xn55571528exgem0o65xymsgtmjiy75924mjqqybp.nogomi.com/M15/Alaa_Zalzaly/Atrak/Nogomi.com_Alaa_Zalzaly-3ali_Tar.mp3

So I should work on the FQDN part and leave the rest as is. Please correct
me if you find anything wrong with this.
# Does that match
# http://down2.nogomi.com.xn55571528exgem0o65xymsgtmjiy75924mjqqybp.nogomi.com/M15/Alaa_Zalzaly/Atrak/Nogomi.com_Alaa_Zalzaly-3ali_Tar.mp3 ?
elsif (m/^http:\/\/(.*?)(\.[^\.\-]*?\..*?)\/([^\?\&\=]*)\.([\w\d]{2,4})\??.*$/) {
  @y = ($1,$2,$3,$4);
  $y[0] =~ s/[a-z0-9A-Z\.\-]+/cdn/;
  print $x . "storeurl://" . $y[0] . $y[1] . "/" . $y[2] . "." . $y[3] . "\n";


Does this example match the Nogomi.com domain correctly?

And why did you use s/[a-z0-9A-Z\.\-]+/cdn/ ?

I understood that you are making sure to find lowercase letters, capital
letters and digits, but I believe \. searches for one dot only. What
happens if there are 2 dots, or more than 3 dots, in this case? And the
other part is matching a dash.

The only thing I am confused about is why we added /cdn/ as the
replacement, since the URL does not contain the word "cdn".

And why have we used storeurl:// ? I can see that some of the examples use
print $x . "http://" . $y[0] . $y[1] . "/" . $y[2] . "." . $y[3] . "\n";

Can you give me an example of how to keep the $y[1] portion, please?

Which approach do you prefer: writing one rule that matches the most
similar cases, or writing a separate rule for each FQDN?

For example, sometimes we see
http://down2.xn55571528exgem0o65xymsgtmjiy75924mjqqybp.example.com/folder/filename.ext
or
http://cdn.xn55571528exgem0o65xymsgtmjiy75924mjqqybp.example2.com/xn55571528exgem0o65xymsgtmjiy75924mjqqybp/folder/filename.ext

That is really interesting to me, which is why I would love to match these
as well. If I understood all of these things, everything would be fine for
me.

Again, I want to thank you for answering my questions; I feel like I am
writing a magazine here!


Regards,
Ghassan


Re: [squid-users] SQUID store_url_rewrite

2011-05-31 Thread Ghassan Gharabli
I am sorry to follow up on my last email again, but I have something more to ask.

(m/^http:\/\/(.*?)(\.[^\.\-]*?\..*?)\/([^\?\&\=]*)\.([\w\d]{2,4})\??.*$/)

Now I am talking about the element ([\w\d]{2,4}), which seems to match
extensions of two to four characters such as .ex, .ext or .exte, for
example .mp3.

I understand that \w matches an alphanumeric character, including "_",
the same as [A-Za-z0-9_] in ASCII.

So it matches digits and letters including the underscore, which is
correct here, but the thing that confuses me is that we have also used
\d, which matches a digit, the same as [0-9] in ASCII. So we have included
0-9 twice; any comment about that?
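
To convince myself, I ran this small comparison with a handful of made-up
extensions; it suggests [\w\d]{2,4} and plain \w{2,4} accept exactly the
same strings, since \d is already a subset of \w:

#!/usr/bin/perl
# Compare [\w\d]{2,4} with \w{2,4} on a few sample strings.
use strict;
use warnings;

for my $ext (qw(ex ext exte mp3 3gp a _x c++)) {
    my $a = ($ext =~ /^[\w\d]{2,4}$/) ? "yes" : "no";
    my $b = ($ext =~ /^\w{2,4}$/)     ? "yes" : "no";
    printf "%-5s  [\\w\\d]{2,4}: %-4s \\w{2,4}: %s\n", $ext, $a, $b;
}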


I am also looking at these URLs again:

#generic http://variable.domain.com/path/filename."ex", "ext" or "exte"
#http://cdn1-28.projectplaylist.com
#http://s1sdlod041.bcst.cdn.s1s.yimg.com

^ means that we match the beginning of a line or string.
In m/^http:\/\/ ... we used (.*?) at the start, which seems to match anything.

If we look at this URL: http://s1sdlod041.bcst.cdn.s1s.yimg.com

If I am correct, (.*?) matches "s1sdlod041", and then with the second
element, (\.[^\.\-]*?\..*?), we move to the "." after "s1sdlod041", so now
we have matched "http://s1sdlod041.". But I want to know what
"[^\.\-]*?\..*?" does: the [] with the ^ in front of \. and \- , is that
because we are also looking for dots and dashes? After that we used "*",
meaning anything, and then the question mark "?". The parts "\.." and
"\..*?" also confuse me.


Another question: I think ([^\?\&\=]*) is for the folders, or is it
something else?

I saw the slash \/ before it, so it seems to catch /?url=blah&C=blah2,
where the "*" matches "blah" and "blah2".

But please, if you do not mind, could you explain or illustrate
(\.[^\.\-]*?\..*?) in more detail, in your own way? I am sure you are a
good teacher.

Please explain the whole match to me
(m/^http:\/\/(.*?)(\.[^\.\-]*?\..*?)\/([^\?\&\=]*)\.([\w\d]{2,4})\??.*$/)
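
In the meantime, here is how I am reading it so far, written out with the
/x modifier so each piece can carry a comment (this is only my current
understanding, so please correct anything that is off):

#!/usr/bin/perl
# The same pattern, spaced out with /x so every part can be commented.
use strict;
use warnings;

my $re = qr{
    ^http:\/\/          # literal "http://" at the start of the URL
    (.*?)               # capture 1: first host label, as little as possible
    (\.[^\.\-]*?\..*?)  # capture 2: a dot, one label with no dots or dashes,
                        #            another dot, then the rest of the host
    \/                  # the "/" that ends the hostname part
    ([^\?\&\=]*)        # capture 3: the path, stopping before "?", "&" or "="
    \.                  # the dot before the extension
    ([\w\d]{2,4})       # capture 4: a 2-4 character extension (js, mp3, jpeg)
    \??.*$              # an optional "?" plus whatever query string follows
}x;

my $url = "http://s1sdlod041.bcst.cdn.s1s.yimg.com/some/path/file.flv?x=1";
if ($url =~ $re) {
    print "1: $1\n2: $2\n3: $3\n4: $4\n";
}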


I was eager to ask you all these questions from the start, but I was
afraid you would not have time to help.

What I have been trying to handle so far is the FileHippo domain:

http://fs34.filehippo.com/6574/058e5771e07c467cb38d70ab6fbed3c0/Opera_1150b1_int_Setup.exe

In this case we have to change the store URL into
"cdn.filehippo.com/6574/Opera_1150b1_int_Setup.exe", because we removed
the hashed folder.

It is okay, I already have a script for it:


#cdn, variable 1st path
} elsif (($u =~ /filehippo/) &&
         (m/^http:\/\/(.*?)\.(.*?)\/(.*?)\/(.*)\.([a-z0-9]{3,4})(\?.*)?/)) {
    @y = ($1,$2,$4,$5);
    $y[0] =~ s/[a-z0-9]{2,5}/cdn./;
    print $x . "http://" . $y[0] . $y[1] . "/" . $y[2] . "." . $y[3] . "\n";

and it is working 100%; I can get it from the cache too. What if I want
to add wlxrs.com, as in ($u =~ /filehippo|wlxrs/)?

Does that match this URL?
http://css.wlxrs.com/HGjlAVvMlW6-1!iEEpuBkgo2TZKpU8RH!W4mH-UPgteZ8OD6Oxte!sCQWfQ1OB7A6B-NZoBS1jrItq7zq!v10A/OOB_30_IllustratedKai/15.40.1211/img/Kai_Sunny_thumbnail.jpg
I do not think so, because it has "!" in it. Where should I change the
pattern so that it matches a folder like
"/HGjlAVvMlW6-1!iEEpuBkgo2TZKpU8RH!W4mH-UPgteZ8OD6Oxte!sCQWfQ1OB7A6B-NZoBS1jrItq7zq!v10A/" ?

Sometimes the CDN folder is the 1st, 2nd or 3rd path component; it depends
on the website.

Can you point me to where I should edit this script to handle WLXRS.COM?
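
Here is a rough sketch of the kind of rule I have in mind for wlxrs.com.
It is only a guess on my part: it assumes the first path component is the
variable, hash-like folder (the long token with the "!" characters) and
that everything after it is stable, which I have not verified:

#wlxrs.com: treat the first path folder (the hash-like token) as variable
} elsif (($u =~ /wlxrs\.com/) &&
         (m/^http:\/\/(.*?)\.wlxrs\.com\/([^\/]+)\/(.*)$/)) {
    # capture 1 = host prefix (e.g. "css"), capture 2 = hash-like folder,
    # capture 3 = the stable remainder of the path
    print $x . "http://cdn.wlxrs.com/" . $3 . "\n";

For the example URL that would store
http://cdn.wlxrs.com/OOB_30_IllustratedKai/15.40.1211/img/Kai_Sunny_thumbnail.jpg
but it is only worth doing if the hash-like folder really does change
between requests for the same object.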

By the way, you have really helped a lot with those complicated examples,
which means I can now start matching any known case on my own.

Thank you a lot.


[squid-users] How to apply youtube patch?

2011-06-17 Thread Ghassan Gharabli
Hello,

I was wondering whether there is a way to apply the YouTube diff/patch on
Windows with Squid 2.7.STABLE8.

I am using the build from http://www.serassio.it/SquidNT.htm


Also, if there is no way to apply it on Windows, another option might be
to run two instances of Squid on the same OS; are there step-by-step
instructions I could follow for that?

By the way, caching YouTube videos works well when I use
minimum_object_size 512 bytes. The bad thing is that it then ignores
everything smaller than 512 bytes. Any ideas, please?

Thank you


[squid-users] SQUID Multiple Instances not working on windows!

2011-08-09 Thread Ghassan Gharabli
Hello,

I have read this page:
http://wiki.squid-cache.org/MultipleInstances but it is still not working.
Only the first instance is working well; the second one is not working
yet, and the problem is that I cannot see any logs for it.

I usually install the first instance as:

squid.exe -i -n squid


but for the second instance I tried this:

squid.exe -i -n SquidSurf -f C:/squid2/etc/squid2.conf

I also tried to change the log directory.

Any help?


[squid-users] Vary: * effectively uncacheable

2011-09-12 Thread Ghassan Gharabli
Hello Amos,


I am trying to cache an .exe file from Google, but it seems it is not
cacheable.


I already managed to customize its CDN URL successfully. The bad thing is
that when I looked at its headers, I found that the response is not
cacheable:

http://redbot.org/?uri=http%3A%2F%2Fdl.google.com%2Ftag%2Fs%2Fappguid%253D%257B74AF07D8-FB8F-4d51-8AC7-927721D56EBB%257D%2526iid%253D%257BBE1E0610-0F00-E4AD-7AA2-E3CE722A10A2%257D%2526lang%253Den%2526browser%253D2%2526usagestats%253D0%2526appname%253DGoogle%252520Earth%2526needsadmin%253DTrue%2526brand%253DGGGE%2Fearth%2Fclient%2FGoogleEarthSetup.exe


HTTP/1.1 200 OK
Last-Modified: Thu, 25 Aug 2011 23:00:00 GMT
Accept-Ranges: bytes
Content-Length: 604488
Content-Type: application/x-msdos-program
ETag: 24829
Vary: *
Date: Mon, 12 Sep 2011 15:06:54 GMT
Server: downloads


Is there an option to ignore that [ Vary: * ] header?


Thank you


Regards,
Ghassan


[squid-users] Etag Caching Squid2.7Stable9!

2011-09-19 Thread Ghassan Gharabli
Hello Amos,

I will try to compile Squid 2.7.STABLE9 myself, since no Windows build of
it exists.

Does Squid 2.7.STABLE9 support ETag caching?


My goal is to apply the YouTube patch and several other options I have
seen on the internet, but I wanted to ask you before doing anything. Is it
possible to apply patches such as ignore-no-store, store-stale and
ignore-must-revalidate?

If that is possible, I will be happy, because then at least I can have all
of the configuration in one Squid instance.


I have already thought about installing Lusca HEAD on Windows, but it is
quite buggy there, although I really liked Lusca HEAD.

What is your suggestion for having all of these options in one Squid
version?


Ghassan.


[squid-users] Header_replace

2011-09-22 Thread Ghassan Gharabli
Hello Amos,


Our internet access comes in over satellite, and most websites come up in
German; it is really annoying to always have to switch them to English.


I found header_replace in Squid, and I was wondering whether it can fake
the request header and force the language to English (en-US) instead of
DE.

Ghassan


[squid-users] Storeurl_rewrite Cache Peers

2011-10-18 Thread Ghassan Gharabli
Hello,


My question is about storeurl_rewrite.

I used to have more than 7 Windows servers with Squid 2.7.STABLE8
installed, configured as siblings.

I was wondering why I cannot share cached data that was stored locally
via storeurl_rewrite between all of the Squid proxy servers.

It was working before. Now I am running Squid 2.7.STABLE7; should I
upgrade to 2.7.STABLE8 to make it work like before, or do I have to
change something in squid.conf?


Thank you


Re: [squid-users] Storeurl_rewrite Cache Peers

2011-10-19 Thread Ghassan Gharabli
Hello Amos,


Thank you so much. I have fixed the issue by editing the source and then
compiling it on Windows using MinGW.


I am happy again :D

On 10/19/11, Amos Jeffries  wrote:
> On 19/10/11 12:12, Ghassan Gharabli wrote:
>> Hello,
>>
>>
>> My question is about storeurl_rewrite ...
>>
>> I used to have more than 7 windows servers with Squid2.7 STABLE8
>> installed (Sibling Mode) ..
>>
>> I was wondering why I cant share cached data that was saved locally
>> through storeurl_rewrite between all squid proxy servers!?
>>
>> It was working before .. Now I am working on SQUID2.7STABLE7 but
>> should I upgrade to Squid2.7STABLE8 to make it work like before or I
>> must do soemthing in Squid.Conf?
>>
>
> The output of storeurl_rewrite is a "private" URL for use only within
> that Squid. All external communications including to peers uses the
> public URL which some client is wanting.
>
> You may have hit http://bugs.squid-cache.org/show_bug.cgi?id=2354
>
> ICP/HTCP being how siblings interact to determine the URLs stored. I'm
> not sure why it was working in the earlier version. Perhapse you had
> cache digests working there?
>
> Amos
> --
> Please be using
>Current Stable Squid 2.7.STABLE9 or 3.1.16
>Beta testers wanted for 3.2.0.13
>


[squid-users] Override Max File Descriptors on Windows

2011-10-27 Thread Ghassan Gharabli
Hello,


I am looking for information that might help me, based on your experience.


I have compiled Squid 2.7.STABLE9 on Windows using MinGW and Cygwin, and I
have already tried to work around the file descriptor limitation, but with
no luck so far. The problem is that MinGW links against msvcrt.dll on
Windows, so I am stuck with a maximum of 2048 file descriptors. Is there
any way I can compile Squid against something else to override the FD
definitions?


I spent a few days searching for the source code of msvcrt.dll so I could
change the 2048 maximum, but still no luck. Does anyone know who could
direct me to someone with the msvcrt.dll source? That would be the best
way to improve performance on Windows; otherwise I need to work on
changing Squid's Windows I/O.

Something like replacing the use of the POSIX I/O interfaces on Windows
with the Win32 API calls (CreateFile, WriteFile, etc). As far as I know,
the Windows API does not support opening files in append mode in all
cases.

My question is: where should I start looking for the functions related to
file descriptors in the Squid source code, and which files in the source
are responsible for handling the FDs?

I am willing to start this project with my friends, but I need more
information before going for it.


Thank you

Ghassan.


Re: [squid-users] Override Max File Descriptors on Windows

2011-10-30 Thread Ghassan Gharabli
For now, people can only improve server/system performance by changing
the TIME_WAIT and file descriptor limits in Windows 2003, XP, Vista and
Windows 2008.

Windows has a hard upper file descriptor limit of 512 at the stdio level,
i.e. any program can open at most 512 files simultaneously through stdio.
Using the C runtime library, that hard upper limit can be raised to 2048,
i.e. a program can open up to 2048 files simultaneously by calling the C
runtime '_setmaxstdio' function.

What I meant by extending the FD limit is that if Squid needs to open more
than 2048 files simultaneously, then Squid should use the native Win32 API
calls (e.g. CreateFile) instead of the C runtime library.

I have succeeded in creating and opening more than 15,000 files
simultaneously using the native Win32 API (CreateFile, WriteFile), because
Microsoft does not document a hard upper limit there.

I have also found a problem when the internet comes in over satellite:
the client software routes the download through the satellite link, and
when that link disconnects, Squid terminates:

2011/10/29 08:07:38| comm_select: select failure: (10038) WSAENOTSOCK,
Socket operation on nonsocket.
2011/10/29 08:07:38| Select loop Error. Retry 1
2011/10/29 08:07:38| comm_select: select failure: (10038) WSAENOTSOCK,
Socket operation on nonsocket.
2011/10/29 08:07:38| Select loop Error. Retry 2
2011/10/29 08:07:38| comm_select: select failure: (10038) WSAENOTSOCK,
Socket operation on nonsocket.
2011/10/29 08:07:38| Select loop Error. Retry 3
2011/10/29 08:07:38| comm_select: select failure: (10038) WSAENOTSOCK,
Socket operation on nonsocket.
2011/10/29 08:07:38| Select loop Error. Retry 4
2011/10/29 08:07:38| comm_select: select failure: (10038) WSAENOTSOCK,
Socket operation on nonsocket.
2011/10/29 08:07:38| Select loop Error. Retry 5
2011/10/29 08:07:38| comm_select: select failure: (10038) WSAENOTSOCK,
Socket operation on nonsocket.
2011/10/29 08:07:38| Select loop Error. Retry 6
2011/10/29 08:07:38| comm_select: select failure: (10038) WSAENOTSOCK,
Socket operation on nonsocket.
2011/10/29 08:07:38| Select loop Error. Retry 7
2011/10/29 08:07:38| comm_select: select failure: (10038) WSAENOTSOCK,
Socket operation on nonsocket.
2011/10/29 08:07:38| Select loop Error. Retry 8
2011/10/29 08:07:38| comm_select: select failure: (10038) WSAENOTSOCK,
Socket operation on nonsocket.
2011/10/29 08:07:38| Select loop Error. Retry 9
2011/10/29 08:07:38| comm_select: select failure: (10038) WSAENOTSOCK,
Socket operation on nonsocket.
2011/10/29 08:07:38| Select loop Error. Retry 10
FATAL: Select Loop failed!

You also cannot use anything else on the machine once Squid terminates,
because no resources are left available. But, as you said, Squid is using
the native Win32 API for the file/socket handling, so what is happening
here?

On 10/29/11, Amos Jeffries  wrote:
> On 28/10/11 02:38, Ghassan Gharabli wrote:
>> Hello,
>>
>>
>> I am looking for information might help me with your experience!
>>
>>
>> I have compiled Squid 2.7Stable9 on Windows using MinGW and Cygwin but
>> the thing I have already tried to avoid the file descriptors
>> limitation but with no luck anyway . The problem is MinGW is linking
>> with msvcrt.dll on Windows  and I am being forced to get max fd as
>> 2048 on Windows .. so Is there anyway I can compile Squid Under
>> something to override FD definitions! .
>>
>>
>> I spent few days searching for the source code of msvcrt.dll so I can
>> edit the max 2048 but still no luck . Does anyone know who can direct
>> me to someone who might get the source code of msvcrt.dll and that
>> would be the best idea to improve performance on Windows or else I
>> need to work on changing the Squid Windows I/O .
>>
>> Something like replacing the use of the POSIX I/O interfaces in mysys
>> on Windows with the Win32 API calls (CreateFile, WriteFile, etc). All
>> that I know , the Windows API does not support opening files in append
>> mode in all cases.
>>
>
> The day when network I/O on Windows supports POSIX enough that it can be
> done via CreateFile/WriteFile or some equivalent will be a happy day for
> many programmers around the world currently having to keep two sets of
> FD flying.
>
> Windows FS in Squid uses the native Win32 Overlapping I/O thread
> interfaces for disk access whenever that is available. The FD limit is
> not related to that.
>
> The 2048 FD limit (actually 2048 _handle_ limit per process) is built
> into the select() algorithm. It affects POSIX as well as Windows. But
> Windows support in Squid does not exist for any of the more advanced
> network I/O methods.
>
>   See comm_*.c for available comm modules. The ones with a Win32 API
> version have a win32 suffix.
>If you want to develop a new

[squid-users] REGEX \w+\d

2011-10-30 Thread Ghassan Gharabli
Hello Amos,

I am trying to find and match only hostname labels like "abc123",
"cache-1", "cdn-1", "videos-1".

But s/([a-z]*+[0-9]*)/cdn/; is also matching words without digits!

For example, I want to match
http://(a1.domain.com) but not (a.domain.com), so how do I restrict it
further?

GET
http://profile.ak.fbcdn.net/hprofile-ak-ash2/275314_566041174_1327573084_s.jpg
Store lookup URL:
http://cdn.ak.fbcdn.net/hprofile-ak-ash2/275314_566041174_1327573084_s.jpg

Here is my script:

(m/^http:\/\/(.*?)\.(.*?)\/(.*?)\.(jp(e?g|e|2)|gif|png|tiff?|bmp|ico|flv|wmv|3gp|mp(4|3)|exe|msi|txt|zip|on2|mar|swf|cab).*?/)
{
   @y = ($1,$2,$3,$4);
   $y[0] =~ s/((cache|cdn|videos)[-\d]*)|([a-z]*+[0-9]*)/cdn/;
   print $x . "http://" . $y[0] . "." . $y[1] . "/" . $y[2] . "." . $y[3] . "\n";
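
The best I have come up with so far is to require at least one digit in
the label before collapsing it (a small test, though I am not sure it is
the right approach):

#!/usr/bin/perl
# Only labels that contain at least one digit get collapsed to "cdn".
use strict;
use warnings;

for my $label (qw(profile a a1 abc123 cache-1 cdn-1 videos-1)) {
    my $copy = $label;
    # letters followed by at least one digit, with an optional digits/dash tail
    $copy =~ s/^((cache|cdn|videos)[-\d]*|[a-z]+-?[0-9]+[-0-9]*)$/cdn/;
    printf "%-10s -> %s\n", $label, $copy;
}

With this, "profile" and "a" stay as they are, while "a1", "abc123",
"cache-1", "cdn-1" and "videos-1" all become "cdn".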


[squid-users] httpReadReply: Excess data

2011-10-30 Thread Ghassan Gharabli
Hello,


Why do I get this error in cache.log?

httpReadReply: Excess data from "GET http://rest-core.msg.yahoo.com/v1/session?amIOnline=0&rand=/v1/session"


Ghassan


[squid-users] Problem with HTTP Headers

2011-11-11 Thread Ghassan Gharabli
Hello,

I am facing a problem with the caching of HTTP headers.

Every day I see that the www.facebook.com page is being cached, and then I
have to remove it manually from the cache, and the same for other
websites.

I tried to add these refresh_patterns before any other rule, but
unfortunately with no luck:

refresh_pattern -i \.(htm|html|jhtml|mhtml|php)(\?.*|$)   0 0% 0
refresh_pattern ^http:\/\/www\.facebook\.com$ 0 0% 0

REFRESH_PATTERN CONFIG:

# 1 year = 525600 mins, 1 month = 43800 mins
refresh_pattern (get_video|videoplayback|videodownload|\.flv).*(begin|start)\=[1-9][0-9]*   0 0% 0
refresh_pattern -i \.(htm|html|jhtml|mhtml|php)(\?.*|$)   0 0% 0
refresh_pattern ^http:\/\/www\.facebook\.com$ 0 0% 0
refresh_pattern ^http:\/\/www.filefactory.com.*\.(mp3) 0 0% 0
refresh_pattern imeem.*\.flv 0 0% 0 override-lastmod override-expire
refresh_pattern ^ftp:   40320 20% 40320 override-expire reload-into-ims
refresh_pattern ^gopher:  1440 0% 1440

#youtube's videos
refresh_pattern (get_video\?|videoplayback\?|generate_204\?|videodownload\?|\.flv\?|\.fid\?) 5259487 % 5259487 override-expire ignore-reload store-stale ignore-private negative-ttl=0

#YouTube's Embedded Videos
refresh_pattern ^http:\/\/www\.youtube\.com\/v\/.*   5259487 % 5259487 override-expire ignore-reload store-stale ignore-private negative-ttl=0

#VideoZer
refresh_pattern (video\?) 5259487 % 5259487 override-expire ignore-reload store-stale ignore-private ignore-no-store ignore-must-revalidate ignore-no-cache

#ads
refresh_pattern ^.*(streamate.doublepimp.com.*\.js\?|utm\.gif|ads\?|rmxads\.com|ad\.z5x\.net|bh\.contextweb\.com|bstats\.adbrite\.com|a1\.interclick\.com|ad\.trafficmp\.com|ads\.cubics\.com|ad\.xtendmedia\.com).* 5259487 20% 5259487 ignore-no-cache ignore-no-store ignore-private override-expire ignore-reload ignore-auth ignore-must-revalidate store-stale negative-ttl=40320 max-stale=1440
refresh_pattern ^.*(\.googlesyndication\.com|advertising\.com|yieldmanager|game-advertising\.com|pixel\.quantserve\.com|adperium\.com|doubleclick\.net|adserving\.cpxinteractive\.com|syndication\.com|media.fastclick.net).* 5259487 20% 5259487 ignore-no-cache ignore-no-store ignore-private override-expire ignore-reload ignore-auth ignore-must-revalidate store-stale negative-ttl=40320 max-stale=1440

#Rapidshare
refresh_pattern \.rapidshare.*\/[0-9]*\/.*\/[^\/]* 161280 90% 161280 ignore-no-cache ignore-reload override-expire store-stale

#Uploaded.to
refresh_pattern ^http:\/\/[.a-z0-9-]*\.uploaded\.to.* 161280 90% 161280 ignore-no-cache override-expire ignore-reload ignore-stale-while-revalidate store-stale

#FileSonic
refresh_pattern ^http:\/\/s[0-9]+\.filesonic\.com\/download\/.* 259487 99% 259487 ignore-no-cache ignore-no-store reload-into-ims override-expire ignore-must-revalidate ignore-private store-stale

#WLM MsgrConfig
refresh_pattern ^http:\/\/config\.messenger\.msn\.com\/Config\/MsgrConfig\.asmx\?.* 5259487 99% 5259487 ignore-private ignore-no-cache override-expire ignore-reload ignore-must-revalidate ignore-no-store reload-into-ims store-stale

#reverbnation.com
refresh_pattern ^http:\/\/[a-z0-9]*\.reverbnation\.com\/audio_player\/stream_song\/(.*) 5259487 99% 5259487 ignore-no-cache override-expire ignore-reload ignore-stale-while-revalidate ignore-must-revalidate ignore-no-store reload-into-ims store-stale

#Fileserve.com
refresh_pattern fileserve\.com.*\.(mp3|rar|zip|vob|mpg) 5259487 99% 5259487 ignore-no-cache override-expire ignore-reload ignore-stale-while-revalidate store-stale ignore-no-store ignore-must-revalidate

# Icons + Video Stats
refresh_pattern \.(ico|video-stats) 5259487 99% 5259487 override-expire ignore-reload ignore-no-cache ignore-no-store ignore-private ignore-auth override-lastmod ignore-must-revalidate negative-ttl=10080 store-stale
refresh_pattern \.etology\? 5259487 99% 5259487 override-expire ignore-reload ignore-no-cache store-stale
refresh_pattern galleries\.video(\?|sz) 5259487 99% 5259487 override-expire ignore-reload ignore-no-cache store-stale

#Brazzers
refresh_pattern brazzers\? 5259487 99% 5259487 override-expire ignore-reload ignore-no-cache store-stale
refresh_pattern \.adtology\? 5259487 99% 5259487 override-expire ignore-reload ignore-no-cache store-stale

# Google / Google Safe Browsing
refresh_pattern ^.*safebrowsing.*google 5259487 99% 5259487 override-expire ignore-reload ignore-no-cache ignore-no-store ignore-private ignore-auth ignore-must-revalidate negative-ttl=10080 store-stale
refresh_pattern ^http://((cbk|mt|khm|mlt)[0-9]?)\.google\.co(m|\.uk) 5259487 99% 5259487 override-expire ignore-reload store-stale ignore-private negative-ttl=10080
refresh_pattern ^http://((cbk|mt|khm|mlt)[0-9]?)\.googleapis\.co(m|\.uk) 5259487 99% 5259487 override-expire ignore-reload store-stale ignore-p

Re: [squid-users] Problem with HTTP Headers

2011-11-12 Thread Ghassan Gharabli
Hello Amos,

I understand what you wrote to me, but I really do not have any rule that
tells Squid to cache the .facebook.com pages.

I only used refresh_pattern to match pictures, videos and certain
extensions, using ignore-must-revalidate, ignore-no-store,
ignore-no-cache, store-stale, etc.

And how come this rule does not work?

refresh_pattern -i \.(htm|html|jhtml|mhtml|php)(\?.*|$)   0 0%

I expected this rule to tell Squid not to cache these extensions, whether
the URL is static or dynamic.

I have noticed that every time you open a website, for example
www.mtv.com.lb, and then open it again the next day, you get the same
(yesterday's) news. That confused me and made me think that maybe Squid
ignores all the headers related to a website once you cache, for example,
its pictures and multimedia objects; that is why I was asking which rule
might be affecting the websites.

I cannot spend my time adding a "cache deny" list for every website that
gets cached, so I thought of simply removing whichever rule caused Squid
to cache the pages.

How do I stop www.facebook.com from being cached, while at the same time
still caching pictures, FLV videos, CSS and JS, just not the main
HTML/PHP page?

refresh_pattern ^http:\/\/www\.facebook\.com$ 0 0% 0

I tried to use $ after .com because I only wanted to avoid caching the
main page of Facebook, while still caching pictures and videos on
Facebook, and the same for other websites.

Sorry if I did not explain it well.



Ghassan

On Sat, Nov 12, 2011 at 1:47 PM, Amos Jeffries  wrote:
> On 12/11/2011 10:30 a.m., Ghassan Gharabli wrote:
>>
>> Hello,
>>
>> I am facing a trouble with Caching HTTP Headers.
>>
>> Everyday I see that www.facebook.com header is being cached and then I
>> try to remove it manually from Cache and so other websites ...
>
> ...other websites what?
>
>>
>> I tried to add these refresh_patterns before any rule but
>> unfortunately with no luck!
>
> Okay some basics. This is a bit complex so if I'm not clear please mention.
>
> There are several algorithms affecting caching.
>
> Firstly is absolute expiry.
>  This tells Squid exactly when to erase the object. Down to the second.
> Controlled by Expires: header, or a Cache-Control header with private,
> no-store, max-age= values.
>  As Squid HTTP/1.1 support increases Expires: (a HTTP/1.0 feature) is
> getting ignored more often.
>
> Secondly, there are freshness algorithm.
>  This tells Squid exactly when the object can be used immediately, or needs
> revalidation before use. It is an estimation only.
>  Controlled by the Date, Last-Modified headers, with Cache-Control
> max-stale, etc mixed in as well.
>  This is where refresh_pattern happens, its min/pct/max values are used to
> set boundaries in the decision about staleness. The wiki and refresh_pattern
> config docs cover exactly how it works, so I wont repeat it all here.
>
>
> Thirdly, there are the variant algorithm(s).
>  These tell Squid whether the object in cache is relevant to the request at
> all or needs to be skipped. Controlled by the ETag, Vary, Accept, etc.
>
> To complicate things refresh_pattern has ignore-* an doverride-* options
> which make Squid ignore the particular header. These are mostly HTTP
> violations and can prevent immediate expiry or extend the estimation well
> beyond anything that woudl otherwise be chosen.
> NOTE: all these options and refresh_pattern itself can only *extend* the
> time something is cached for. They cannot and do not prevent caching or
> remove things early. refresh_pattern can have the appearance of shortening
> cache times, *if* and only if, the object was caused to be cached that long
> by another refresh_pattern estimation later down the list (ie our default 1
> week storage time in the "." pattern line).
>
>  To prevent caching use "cache deny ...".
>
>
>>
>> refresh_pattern -i \.(htm|html|jhtml|mhtml|php)(\?.*|$)               0 0%
>> 0
>> refresh_pattern ^http:\/\/www\.facebook\.com$             0 0% 0
>>
>> REFRESH_PATTERN CONFIG :
>> 
>> # 1 year = 525600 mins, 1 month = 43800 mins
>> refresh_pattern
>> (get_video|videoplayback|videodownload|\.flv).*(begin|start)\=[1-9][0-9]*
>>       0
>> 0% 0
>> refresh_pattern -i \.(htm|html|jhtml|mhtml|php)(\?.*|$)               0 0%
>> 0
>> refresh_pattern ^http:\/\/www\.facebook\.com$             0 0% 0
>
> This rule above will never match. Squid uses absolute URLs to pass the
> refresh pattern.
> Absolute URL have a "/" in the path. *never* will  ".com" be the last four
> characters of the absolute URL.
>
> Amos
>


Re: [squid-users] Problem with HTTP Headers

2011-11-13 Thread Ghassan Gharabli
Dear Amos,

After allowing access to the "HEAD" method in the Squid config,
I deleted www.facebook.com from the cache and then tried executing:

squidclient -m head http://www.facebook.com

Results:

HTTP/1.0 302 Moved Temporarily
Location: http://www.facebook.com/common/browser.php
P3P: CP="Facebook does not have a P3P policy. Learn why here: http://fb.me/p3p"
Set-Cookie: datr=hfW_TtrAQmi_2SxwAUY4EjPH; expires=Tue, 12-Nov-2013 16:51:17 GMT
; path=/; domain=.facebook.com; httponly
Content-Type: text/html; charset=utf-8
X-FB-Server: 10.53.10.59
X-Cnection: close
Content-Length: 0
Date: Sun, 13 Nov 2011 16:51:17 GMT
X-Cache: MISS from Peer6.skydsl.net
X-Cache-Lookup: MISS from Peer6.skydsl.net:3128
Connection: close

I am not seeing any Pragma, Cache-Control or Expires headers here, but
redbot shows the correct info.

By the way, I am also using store_url, but I am sure nothing is wrong
there; I am only rewriting dynamic URLs for picture and video extensions.
So I have only one thing left to try, which I would rather not do:

acl facebookPages urlpath_regex -i /(\?.*|$)

First, does this rule affect store_url?

For example, when we have a URL like

http://www.example.com/1.gif?v=1244&y=n

I assumed that urlpath_regex is applied to the full URL, which would mean
the rule also matches:

http://www.example.com/stat?date=11

I will leave this rule out for now and focus on the Facebook problem,
since more than 60% of our traffic is Facebook.


Let me test denying only PHP and HTML for a day, to see whether the
Facebook HTTP page is still being saved in the cache.


Ghassan


On Sun, Nov 13, 2011 at 2:26 AM, Amos Jeffries  wrote:
> On 13/11/2011 12:15 p.m., Ghassan Gharabli wrote:
>>
>> Hello Amos,
>>
>> I understand what you wrote to me but I really do not have any rule
>> that tells squid to cache .facebook.com header ..
>
> According to http://redbot.org/?uri=http%3A%2F%2Fwww.facebook.com%2F
>
> FB front page has Expires, no-store, private, and must-revalidate. Squid
> should not be caching these at all unless somebody has maliciously erased
> the control headers. Or your squid has ignore-* and override-*
> refresfh_patterns for them (I did not see any in your config, which is good)
>
> Can you use:
>   squidclient -m HEAD http://www.facebook.com/
>
> to see if those headers you get match the ones apparently being sent by the
> FB server.
>
>>
>> I only used refresh_pattern to match Pictures , Videos&  certain
>> extensions by using ignore-must-revalidate , ignore-no-store ,
>> ignore-no-cache , store-stale .. etc
>>
>> and howcome this rule doesnt work ?
>>
>> refresh_pattern -i \.(htm|html|jhtml|mhtml|php)(\?.*|$)               0 0%
>>
>> This rule tells squid not to cache these extensions if we had static
>> URL or dynamic URL.
>
> The refresh_pattern algorithm only gets used *if* there are no Expires or
> Cache-Control headers stating specific information.
>
> Such as "private" or "no-store" or "Expires: Sat, 01 Jan 2000 00:00:00 GMT".
>
>
>>
>> As I noticed every time you open a website for example www.mtv.com.lb
>> then you try to open it again next day but you get the same news (
>> yesterday) which confused me and allow me to think that maybe Squid
>> ignore all headers related to website if you cached for example
>> pictures and multimedia objects thats why I was asking which rule
>> might be affecting websites?.
>>
>> I cant spend my time on adding list to "cache deny" on websites that
>> were being cached so I thought of only removing the rule caused squid
>> to cached Websites .
>>
>> How to ignore www.facebook.com not to cache but at the same time I
>> want to cache pictures , FLV Videos , CSS , JS but not the header of
>> the main page (HTML/PHP).
>
> With this config:
>   acl facebook dstdomain .facebook.com
>   acl facebookPages urlpath_regex -i \.([jm]?htm[l]?|php)(\?.*|$)
>   acl facebookPages urlpath_regex -i /(\?.*|$)
>   cache deny facebook facebookPages
>
> and remove all the refresh_pattterns you had about FB content.
>
> Which will cause any FB HTML objects which *might* have been cachable to be
> skipped by your Squid cache.
>
> Note that FLV videos in FB often come directly from youtube, so are not
> easily cached. The JS and CSS will retain the static/dynamic properties they
> are assigned by FB. You have generic refresh_pattern rules later on in yoru
> config which extend their normal storage times a lot.
>
>>
>> refresh_pattern ^http:\/\/www\.facebook\.com$             0 0% 0
>>
>> I tried to use $ after .com as I only wanted not to cache the main
>> page of Facebook but still I want to cache Pictures and Videos at
>> Facebook and so on at other websites .
>
> And I said the main page is not "http://www.facebook.com"; but
> "http://www.facebook.com/";
>
> so you should have added "/$" instead of just "$".
>
> BUT, using "cache deny" as above this becomes not relevant any more.
>
> Amos
>


Re: [squid-users] Problem with HTTP Headers

2011-11-16 Thread Ghassan Gharabli
Hello again,

Sorry, I replied too quickly before, without noticing whether your rule
has the "/" or not. At first I did not want to exclude "/?", because I am
caching several sites with URLs like name.flv/?.* , so now I am using:

acl ExceptExt urlpath_regex -i (mp(3|4)|flv)/(\?.*)
acl facebook dstdomain .facebook.com
acl facebookPages urlpath_regex -i \.([jm]?htm[l]?|php)(\?.*|$)
acl facebookPages urlpath_regex -i /(\?.*|$)
cache deny facebook facebookPages !ExceptExt

Actually, I started to see facebook.com in the cache again once they
changed to https://www.facebook.com. So far, all of the servers with the
same settings are no longer caching the Facebook main page, except one
server; maybe one of the clients there is infected with something
malicious.

It only gets cached when one of the clients opens Facebook, because I
already opened Facebook myself and it is not being cached on this server.

> As you wish. I added that line because I noticed the front page for FB you
> wanted to non-cache has the URL path starting with the two characters "/?"
> instead of .html or .php.
>

How can I debug or trace which requests have a URL path starting with
"/?", and how did you notice that the FB front page has a URL path
starting with those two characters "/?" ?
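
In the meantime, the rough approach I can think of for tracing it is the
sketch below; it is only a sketch, and it assumes the default native
access.log format where the URL is the 7th whitespace-separated field:

#!/usr/bin/perl
# List requests whose URL *path* starts with "/?" (as urlpath_regex sees it).
use strict;
use warnings;

while (<>) {
    my @f   = split ' ';
    my $url = $f[6] or next;
    # strip the scheme and host to get the path part
    (my $path = $url) =~ s{^[a-z]+://[^/]+}{}i;
    print "$path\t$url\n" if $path =~ m{^/(\?.*|$)};
}

I would run it over access.log (for example: perl trace-paths.pl
access.log, where trace-paths.pl is just whatever name the script is saved
under).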

By the way, I am doing my best to tune the Perl script I wrote, and I am
indeed saving much more traffic and getting better performance by reducing
the number of rules and matching the targeted URLs with fewer lines.

I am still studying regexes. Honestly, Squid has saved me 54% of my
traffic, and I can get even more than that.

You have saved me a lot of debugging time.

Thank you again.

Ghassan


On Mon, Nov 14, 2011 at 12:57 AM, Amos Jeffries  wrote:
> On Sun, 13 Nov 2011 19:14:48 +0200, Ghassan Gharabli wrote:
>>
>> Dear Amos,
>>
>> After allowing access  "Head" method in Squid Config
>>
>> I deleted www.facebook.com from cache andthen I tried executing
>>
>> squidclient -m head http://www.facebook.com
>>
>> Results :
>>
>> HTTP/1.0 302 Moved Temporarily
>> Location: http://www.facebook.com/common/browser.php
>> P3P: CP="Facebook does not have a P3P policy. Learn why here:
>> http://fb.me/p3p";
>> Set-Cookie: datr=hfW_TtrAQmi_2SxwAUY4EjPH; expires=Tue, 12-Nov-2013
>> 16:51:17 GMT
>> ; path=/; domain=.facebook.com; httponly
>> Content-Type: text/html; charset=utf-8
>> X-FB-Server: 10.53.10.59
>> X-Cnection: close
>> Content-Length: 0
>> Date: Sun, 13 Nov 2011 16:51:17 GMT
>> X-Cache: MISS from Peer6.skydsl.net
>> X-Cache-Lookup: MISS from Peer6.skydsl.net:3128
>> Connection: close
>>
>> I am not seeing any pragma or cache-control and expires! but redbot
>> shows the correct info there!.
>
> Ah, your squidclient is not sending a user-agent header. You will need to
> add -H "user-Agent: foo"
>
>>
>> BTW .. I am also using store_url but im sure nothing is bad there . I
>> am only playing with Dynamic URL regarding to Pictures and Videos
>> extensions so I have only one thing left for me to try which is unlike
>> to do it ..
>>
>> acl facebookPages urlpath_regex -i /(\?.*|$)
>>
>> First does this rule affect store_url?
>
> This is just a pattern definition. It only has effect where and when the ACL
> is used. The config I gave you only used it in the "cache deny" access line.
>
> That said, "cache deny" prevents things going to the cache, where storeurl*
> happens.
>
>
>>
>> For example when we have url like
>>
>> http://www.example.com/1.gif?v=1244&y=n
>>
>> I can see that urlpath_regex requires Full URL which means this rule
>> matches :
>>
>> http://www.example.com/stat?date=11
>
> The pattern begins with '/' and the "cache" access line I gave you included
> another ACL. Which tested the domain name was *.facebook.com.
>
> It will match things like:
>  http://www.facebook.com/?v=1244&y=n
>
> but *not* match things like:
>  http://www.example.com/1.gif?v=1244&y=n
>
>>
>> I will try to ignore this rule and let me focus on facebook problem
>> since we have more than 60% traffic on Facebook.
>>
>
> As you wish. I added that line because I noticed the front page for FB you
> wanted to non-cache has the URL path starting with the two characters "/?"
> instead of .html or .php.
>
> Amos
>
>


[squid-users] URL with Invalid Expire Date

2011-11-16 Thread Ghassan Gharabli
Hello,

I was wondering what we can do with this URL,
http://www.youtube-nocookie.com/gen_204?attributionpartner=vevo , if we
want to cache it.


I have been looking at it and found that its Expires header has an invalid
date. I thought I could cache it since it only has no-cache, which as I
understand it means the object can be cached but cannot be served without
validation.


Is there any other way that I can force it to be cached?



Ghassan


Re: [squid-users] URL with Invalid Expire Date

2011-11-17 Thread Ghassan Gharabli
I do not think Squid 2.7 can cache this content unless it fully supports
HTTP/1.1, but is there any workaround to force caching of such content?
This URL is causing a big delay at the start of YouTube playback: the
videos themselves are cached, yet you still wait about 7 to 9 seconds
before a YouTube video starts to load.

YouTube keeps changing its content and the way it allows its partner video
providers to show a "LOGO" of their own. When you log the traffic, you
notice a few video URL forms such as (*.youtube.com/generate_204),
(*.youtube.com/get_video) and (*.youtube.com/videoplayback), and sometimes
the ID appears at the end of the URL, sometimes in the middle, and for the
remaining few videos right at the start after "videoplayback". I am
wondering why I cannot find this information on
http://wiki.squid-cache.org/ConfigExamples/DynamicContent/YouTube ..
maybe it has not been updated yet?


>Caching this could present clients with a lie,
> indicating that some server state has been changed when the server has not
> even been contacted.

I thought of caching
http://www.youtube-nocookie.com/gen_204?attributionpartner=vevo just to
reduce the large latency, even when the video itself is already cached. I
find it worthwhile to cache whole YouTube videos, which is why I would
prefer presenting clients with that lie if it avoids the large latency.



Ghassan


On Thu, Nov 17, 2011 at 4:41 AM, Amos Jeffries  wrote:
> On Thu, 17 Nov 2011 02:54:54 +0200, Ghassan Gharabli wrote:
>>
>> Hello,
>>
>> I was wondering what would we do with this URL
>> http://www.youtube-nocookie.com/gen_204?attributionpartner=vevo if we
>> want to cache it.
>>
>>
>> I'e been looking and found that its expire header has an invalid date
>> ! . I thought I can cache it since it has no-cache but as I can see
>> the object can be cached & cant be served without validation.
>>
>>
>> Isthere any other way that I can force it to cache ?
>
>
> RFC 2048 (HTTP/1.0) requires that invalid Expires: headers are treated as
> already expired regardless of their value. Under HTTP/1.1 this could be
> cached and served stale.
>
> However ... the status code is 204. Merely passing the URL to the web server
> changes some state there. Caching this could present clients with a lie,
> indicating that some server state has been changed when the server has not
> even been contacted. Also, the presence of no-cache and absence of max-stale
> means the server must be contacted on every re-use. Since there is no reply
> body content involved there is zero benefit from caching this. In fact a net
> loss in cache efficiency as one entry gets filled with an object which is
> guaranteed to be fully replaced on every usage.
>
>
> Amos
>
>


[squid-users] 206 Partial Content

2011-11-24 Thread Ghassan Gharabli
Hello Amos,

Squid was not able to cache partial-content 206 responses such as the
link below:

http://dc122.4shared.com/img/97459254/dcf5c10e/dlink__2Fdownload_2Fe1D3g7qW_3Ftsid_3D2024-220446-e01441a0/preview.mp3?sId=qnwkz3H9Cqm2NezD&t=1322172503748&s=fb361cc0390859a718b67a4646f7c16c

As far as I can see, the response does not have a Content-Range header,
which makes it harder to understand.

What would it take to cache this kind of response with Squid?



Ghassan


[squid-users] Youtube Issue!

2011-11-26 Thread Ghassan Gharabli
Hello Amos,


Finally, I have managed to capture most YouTube videos, except for one
thing I would like some assistance with.


As I have tested many times before, Chudy's script is outdated.

After testing and logging YouTube videos, I have finally found something
that is not being fully cached. If you remember, I said in my earlier
messages that the ID was not being captured in all places, but that part
is done now; I will post my details after I completely finish them.

Could you please explain to me what is happening here?

If &range=13-2375679 is present in a URL, then Squid does not seem to know
how to cache the full video; it only caches the first 13 seconds, I guess,
and then it stops. If I download this "finished" cached movie, its size is
about 2.2 MB. If I then try to remove it from the cache, Squid cannot even
find it, claiming it is not cached, yet it shows TCP_HIT in access.log.
Strange!

Now look into this URL:
---

"http://o-o.preferred.orange-par1.v4.lscache7.c.youtube.com/videoplayback?sparams=id%2Cexpire%2Cip%2Cipbits%2Citag%2Csource%2Calgorithm%2Cburst%2Cfactor%2Ccp&fexp=907605%2C912600%2C915002&algorithm=throttle-factor&itag=34&ip=84.0.0.0&burst=40&sver=3&signature=8223490C23E48CB708E04666E4
A550422757CEC6.9D8D78E66DD14FEFC4B5F960F493ED4CDFD7C51C&source=youtube&expire=13
22348400&key=yt1&ipbits=8&factor=1.25&cp=U0hRR1NPVl9FSkNOMV9LSVpFOkpsV3BkS1B1ZXN
F&id=e120643085f56831&range=13-2375679"

HTTP/1.0 200 OK
Last-Modified: Fri, 27 Nov 2009 12:44:54 GMT
Content-Type: video/x-flv
Date: Sat, 26 Nov 2011 16:06:29 GMT
Expires: Sat, 26 Nov 2011 16:06:29 GMT
Cache-Control: private, max-age=24511
Accept-Ranges: bytes
Content-Length: 2375667
X-Content-Type-Options: nosniff
Server: gvs 1.0
X-Cache: MISS from Peer6
X-Cache-Lookup: MISS from Peer6:3128
Connection: close

What is the job of "Accept-Ranges: bytes" here?

And here is the confusing part again: you can see another similar URL
with the same "/videoplayback?.*(id)" form, but this time the ID at the
end of the URL leads to a "moved temporarily" response. I should mention
that this URL points to the FLV URL (Squid has already logged it in
access.log) and then adds &ir=1&playretry=1 or pr=1&playretry, which means
Squid ends up caching the same FLV twice.

EXAMPLE:
---

"http://o-o.preferred.orange-par1.v3.lscache3.c.youtube.com/videoplayback?sparams=id%2Cexpire%2Cip%2Cipbits%2Citag%2Csource%2Calgorithm%2Cburst%2Cfactor%2Ccp&fexp=908525%2C910207%2C916201&algorithm=throttle
-factor&itag=34&ip=84.0.0.0&burst=40&sver=3&signature=0489805DCC95F6EADBA9D43C3F
D8C107FC768662.73AA6897FE78CF78BE7819E089F1A4FC47534C7D&source=youtube&expire=13
22344800&key=yt1&ipbits=8&factor=1.25&cp=U0hRR1NPUl9FSkNOMV9LSVZJOmdmQWdwWC01dlp
n&id=283246f338ece5ad"

HTTP/1.0 302 Moved Temporarily
Last-Modified: Wed, 02 May 2007 10:26:10 GMT
Date: Sat, 26 Nov 2011 15:50:47 GMT
Expires: Sat, 26 Nov 2011 15:50:47 GMT
Cache-Control: private, max-age=900
Location: http://r9.orange-par2.c.youtube.com/videoplayback?sparams=id%2Cexpire%2Cip%2Cipbits%2Citag%2Csource%2Calgorithm%2Cburst%2Cfactor%2Ccp&fexp=908525%2C910207%2C916201&algorithm=throttle-factor&itag=34&ip=84.0.0.0&burst=40&sver=3&signature=0489805DCC95F6EADBA9D43C3FD8C107FC768662.73AA6897FE78CF78BE7819E089F1A4FC47534C7D&source=youtube&expire=1322344800&key=yt1&ipbits=8&factor=1.25&cp=U0hRR1NPUl9FSkNOMV9LSVZJOmdmQWdwWC01dlpn&id=283246f338ece5ad&ir=1
X-Content-Type-Options: nosniff
Content-Type: text/html
Server: gvs 1.0
Age: 2068
Content-Length: 0
X-Cache: HIT from Peer6
X-Cache-Lookup: HIT from Peer6:3128
Connection: close

Note that its Content-Length is 0, yet Squid treats it as a video. It
would help if we could specifically ignore or deny caching of replies with
a 0-byte body; in that case I will try to add &ir=1 / &pr=1 handling to
the script.

My real question, though, is about Accept-Ranges, so I would appreciate
your help with that one.


Ghassan


Re: [squid-users] Youtube Issue!

2011-11-26 Thread Ghassan Gharabli
?itag=34" same thing for "ID"


Now Im only getting errors on those videos with 302 Redirection and
Loop patch was applied successfully before compiling Squid and
access.log shows that it is normally moving to the location of the
video url but the 2 URLs are being cached since we are caching
"/videoplayback\?" and both are producing FLV Videos.

When somebody skip the portion of the video to a timestap which hasnt
been downloaded yet then YT adds to its URL something like
&begin=[0-9]. I have denied caching those URLs because it will make
your cache directory bigger & more bigger by a short time.


Ghassan



On Sun, Nov 27, 2011 at 4:02 AM, Amos Jeffries  wrote:
> On 27/11/2011 5:32 a.m., Ghassan Gharabli wrote:
>>
>> Hello Amos,
>>
>>
>> Finally, I have almost captured the most YouTube Videos except
>> something I want to get some asistance from you .
>>
>>
>> As I have tested before and tried so many times .. Chudy's script is
>> outdated.
>>
>> After testinig and logging Youtube Videos . I finally have found
>> something not being fully cached . If you still remember I have said
>> before with my old messages that ID isnt being captured in all places
>> but its okay I have done this . I will post my details after I
>> completelly finish them.
>>
>> Could you please explain to me whats happening here?
>>
>> If&range=13-2375679 was found in a URL then Squid doesnt understand
>> how to cache the full video .. as it only cache the first 13 seconds I
>> guess! and then it stops . If I try to download this finished cached
>> movie then you notice its size about 2.2 MB . You try to remove it
>> from cache then Squid cant even find it as it claims not cached but
>> shows TCP_HIT in access.log . STRANGE!
>
> (NP: by remove you mean PURGE request? HUT just means cached data was found
> to service the request, which is right since purging the data involves
> locating it (HITing) before erasing the cached entry. Followup requests
> after the purge should not be HIT.).
>
> I took a look at these"range" replies being generated by YT a while back.
>
> What I found was that a request for video URL would send back a FLV object
> with bytes eg "[SWF...]ABCDEFGH". All fine and good this is the cacheable
> video.
>
> If the user skips around in the video the player generates a range= request
> stating what timestamp or bytes they want to strat at. Its not clear which
> due to the reply which comes back having a *different* byte sequence than
> the video at the same URL.  For example, on the "[SWF...]ABCDEFGH" video it
> would produce:   "[SWF...]EFGH" or something similar.
>
> Under the HTTP rules the range object to be combined must be a snippet
> portion of the base object (range 4-999, should have been just "DEFGH"). By
> adding the SWF headers on each reply YT are making them unique and different
> objects. Combining them in the middle (ie by a caching app) will cause
> errors in the binary object and crash the Flash player or cause it to
> display an error message instead of the video
>
> This range request only seems to happen if the user skips into a portion of
> video the player has not yet downloaded. So sending them the whole video,
> which is what we try to do with Squid, will cause a display lag for the user
> but not cause problems in their player.
>
>
>>
>> Now look into this URL:
>> ---
>>
>>
>> "http://o-o.preferred.orange-par1.v4.lscache7.c.youtube.com/videoplayback?sparams=id%2Cexpire%2Cip%2Cipbits%2Citag%2Csource%2Calgorithm%2Cburst%2Cfactor%2Ccp&fexp=907605%2C912600%2C915002&algorithm=throttle-factor&itag=34&ip=84.0.0.0&burst=40&sver=3&signature=8223490C23E48CB708E04666E4
>>
>> A550422757CEC6.9D8D78E66DD14FEFC4B5F960F493ED4CDFD7C51C&source=youtube&expire=13
>>
>> 22348400&key=yt1&ipbits=8&factor=1.25&cp=U0hRR1NPVl9FSkNOMV9LSVpFOkpsV3BkS1B1ZXN
>> F&id=e120643085f56831&range=13-2375679"
>>
>> HTTP/1.0 200 OK
>> Last-Modified: Fri, 27 Nov 2009 12:44:54 GMT
>> Content-Type: video/x-flv
>> Date: Sat, 26 Nov 2011 16:06:29 GMT
>> Expires: Sat, 26 Nov 2011 16:06:29 GMT
>> Cache-Control: private, max-age=24511
>> Accept-Ranges: bytes
>> Content-Length: 2375667
>> X-Content-Type-Options: nosniff
>> Server: gvs 1.0
>> X-Cache: MISS from Peer6
>> X-Cache-Lookup: MISS from Peer6:3128
>> Connection: close
>>
>> Whats the job of "Accept_ranges: bytes" here?
>
> Accept-* means the software producing th

Re: [squid-users] Youtube Issue!

2011-11-27 Thread Ghassan Gharabli
Hello again,

I have tested this video myself, and "&range=*" comes along with some
videos even without any skipping.

Now everything is okay, but some videos are being cached twice with the
same Content-Length!

Please see this one:

1322399742.127  66732 192.168.10.14 TCP_HIT/200 69489664 GET http://o-o.preferred.orange-par1.v14.lscache1.c.youtube.com/videoplayback?sparams=id%2Cexpire%2Cip%2Cipbits%2Citag%2Csource%2Calgorithm%2Cburst%2Cfactor%2Ccp&fexp=903311&algorithm=throttle-factor&itag=34&ip=84.0.0.0&burst=40&sver=3&signature=81E9381A2DF2C1F61388DB08F270607E4CF8F67E.233A3E093009D8EE0123DFC0C3CAE35FB97D7348&source=youtube&expire=1322424000&key=yt1&ipbits=8&factor=1.25&cp=U0hRR1RNUl9FSkNOMV9MR1ZBOl9kd3dzRzJKZlhJ&id=db5b3a6267109fd6 - NONE/- video/x-flv
1322399847.393  79657 192.168.10.14 TCP_HIT/200 69489664 GET http://o-o.preferred.orange-par1.v14.lscache1.c.youtube.com/videoplayback?sparams=id%2Cexpire%2Cip%2Cipbits%2Citag%2Csource%2Calgorithm%2Cburst%2Cfactor%2Ccp&fexp=903311&algorithm=throttle-factor&itag=34&ip=84.0.0.0&burst=40&sver=3&signature=81E9381A2DF2C1F61388DB08F270607E4CF8F67E.233A3E093009D8EE0123DFC0C3CAE35FB97D7348&source=youtube&expire=1322424000&key=yt1&ipbits=8&factor=1.25&cp=U0hRR1RNUl9FSkNOMV9MR1ZBOl9kd3dzRzJKZlhJ&id=db5b3a6267109fd6&range=13-2375679 - NONE/- video/x-flv

The content moves from (id=db5b3a6267109fd6) to
id=db5b3a6267109fd6&range=13-2375679, and that is really strange: how come
it is being "skipped" when no human skipped it manually? As far as I know,
when someone manually skips forward in the timeline you see "&begin=*" or
"start=*", and those are already ignored, so I am not getting errors when
playing those videos on YT, because "&range=*" URLs are no longer being
cached.

This is what I used to deny it:
refresh_pattern (get_video|videoplayback|videodownload|\.flv).*range\=[0-9\-]*  0 0% 0

I have also ignored it in the storeurl_rewrite helper.
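
For reference, here is a minimal sketch of the kind of rule I mean, in the
same style as my other rules. It assumes $x holds whatever prefix the
helper already prints, and that keying the stored copy purely on the id=
value (dropping range= and begin=) is good enough, which may not be
exactly how others do it:

#youtube videoplayback: key the cached copy on the video id only
} elsif (m/^http:\/\/[^\/]+\.youtube\.com\/(videoplayback|get_video)\?.*[?&]id=([0-9a-f]+)/) {
    # capture 1 = request type, capture 2 = video id; range=/begin= are ignored
    # the key host below is arbitrary, it just has to stay consistent
    print $x . "storeurl://videos.youtube.SQUIDINTERNAL/" . $1 . "?id=" . $2 . "\n";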

Amos, if &range only appears when the user skips during playback, then how
come it is happening like this?

Ghassan



On 11/27/11, Ghassan Gharabli  wrote:
> BTW, that is what was happening to me while testing YT, and of course
> you cannot even think of caching videos after they have been skipped
> by the client.
>
> Concerning the FLV object: yes, I noticed before that when you upload
> a YouTube video they split the whole video into frames, which seems to
> send different objects with the same video ID; of course those should
> be ignored by Squid.
>
> The 302 redirection was only found in the "240p" FLV by default, and
> for sure I have applied the code just so as not to hit a LOOP.
>
> ACCESS.LOG
> ---
> 1322360339.081 88 192.168.10.14 TCP_HIT/200 86436 GET
> http://o-o.preferred.orange-par1.v3.lscache3.c.youtube.com/videoplayback?sparams=id%2Cexpire%2Cip%2Cipbits%2Citag%2Csource%2Calgorithm%2Cburst%2Cfactor%2Ccp&fexp=907605%2C912600%2C915002&algorithm=throttle-factor&itag=34&ip=84.0.0.0&burst=40&sver=3&signature=712F1A94A31D43D03E1DB0F67FF9B7F1A9EDA4EC.029774C29E789ACC1D557E1172163D90F6610205&source=youtube&expire=1322384400&key=yt1&ipbits=8&factor=1.25&cp=U0hRR1NTUl9FSkNOMV9LTVZFOkpsV3BkS1RxZXNF&id=283246f338ece5ad
> - NONE/- video/x-flv
> 1322360339.242445 192.168.10.14 TCP_MISS/204 229 GET
> http://clients1.google.com/generate_204 - DIRECT/209.85.148.138
> text/html
> 1322360339.549453 192.168.10.14 TCP_MISS/204 422 GET
> http://s.youtube.com/stream_204?event=streamingerror&erc=1&retry=1&ec=100&fexp=912600,907605,915002&plid=AASyrgMkZZEo1OUT&v=KDJG8zjs5a0&el=detailpage&rt=0.749&fmt=34&shost=o-o.preferred.orange-par1.v3.lscache3.c.youtube.com&scoville=1&fv=WIN%2011,0,1,152
> - DIRECT/74.125.39.100 text/html
> 1322360339.619434 192.168.10.14 TCP_MISS/204 422 GET
> http://s.youtube.com/stream_204?fv=WIN%2011,0,1,152&event=streamingerror&el=detailpage&erc=2&rt=0.873&fexp=912600,907605,915002&fmt=34&v=KDJG8zjs5a0&shost=tc.v3.cache3.c.youtube.com&plid=AASyrgMkZZEo1OUT&scoville=1&ec=100
> - DIRECT/74.125.39.101 text/html
> 1322360340.112  10781 192.168.10.14 TCP_MISS/204 230 GET
> http://o-o.preferred.orange-par1.v3.lscache3.c.youtube.com/generate_204?sparams=id%2Cexpire%2Cip%2Cipbits%2Citag%2Csource%2Calgorithm%2Cburst%2Cfactor%2Ccp&fexp=907605%2C912600%2C915002&algorithm=throttle-factor&itag=34&ip=84.0.0.0&burst=40&sver=3&signature=712F1A94A31D43D03E1DB0F67FF9B7F1A9EDA4EC.029774C29E789ACC1D557E1172163D90F6610205&source=youtube&expire=1322384400&key=yt1&ipbits=8&factor=1.25&cp=U0hRR1NTUl9FSkNOMV9LTVZFOkpsV3BkS1RxZXNF&id=283246f338ece5ad
> - DIRECT/64.15.1

[squid-users] Squid with more storage!

2012-03-10 Thread Ghassan Gharabli
Hello,


Does Squid 2.7 STABLE9 work well if we move to another server that has
a six-core Xeon processor with 6 x 15k SCSI hard drives, where each HDD
would carry 100 GB of Squid cache objects?

I am willing to add more SCSI HDDs, 12 or more, just to hold more
cached data. The current installation is Squid 2.7 STABLE9 on a Core 2
Duo with 3 x SATA HDDs, and we are almost filling all cache folders and
disk storage.


My main idea is to add more fast hard drives so I can split more cache
folders across all six disks. What do you think, and what is your
recommendation for such a setup? I sketch the layout I have in mind
below.
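
Roughly this layout, where the mount points and sizes are only
placeholders and each cache_dir is assumed to sit on its own physical
disk:

# one cache_dir per physical disk, about 100 GB each (sizes are in MB)
cache_dir ufs /cache1/squid 100000 64 256
cache_dir ufs /cache2/squid 100000 64 256
cache_dir ufs /cache3/squid 100000 64 256
cache_dir ufs /cache4/squid 100000 64 256
cache_dir ufs /cache5/squid 100000 64 256
cache_dir ufs /cache6/squid 100000 64 256

The point being that no two cache_dir entries share a spindle, so the
disks can seek in parallel.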


BTW, we are caching CDN content, and each month we try to update our
script so it matches as much content as possible and we reach a high
cache hit ratio.



Ghassan


[squid-users] Squid with more storage!

2012-03-11 Thread Ghassan Gharabli
Hello again,

I just want to know what would happen to Squid if we added more
storage, such as 6 x 15k SCSI HDDs. Would that affect the performance
of the system, or cause Squid to behave more slowly?

Thank you




Re: [squid-users] anyone knows some info about youtube "range" parameter?

2012-04-25 Thread Ghassan Gharabli
Hello,

As I remember, I already discussed this subject before, mentioning that
YouTube added a new "range" URL parameter several months ago. I first
tried to deny all URLs that come with "range" to avoid triggering the
error in the YouTube player, but then I investigated further and came
up with a solution like this:



# youtube 360p itag=34, 480p itag=35 [ITAG/ID/RANGE]
if (m/^http:\/\/([0-9.]{4}|.*\.youtube\.com|.*\.googlevideo\.com|.*\.video\.google\.com)\/.*(itag=[0-9]*).*(id=[a-zA-Z0-9]*).*(range\=[0-9\-]*)/) {
    # key the stored copy on id + range only
    print $x . "http://video-srv.youtube.com.SQUIDINTERNAL/" . $3 . "&" . $4 . "\n";

# youtube 360p itag=34, 480p itag=35 [ID/ITAG/RANGE]
} elsif (m/^http:\/\/([0-9.]{4}|.*\.youtube\.com|.*\.googlevideo\.com|.*\.video\.google\.com)\/.*(id=[a-zA-Z0-9]*).*(itag=[0-9]*).*(range\=[0-9\-]*)/) {
    print $x . "http://video-srv.youtube.com.SQUIDINTERNAL/" . $2 . "&" . $4 . "\n";

# youtube 360p itag=34, 480p itag=35 [RANGE/ITAG/ID]
} elsif (m/^http:\/\/([0-9.]{4}|.*\.youtube\.com|.*\.googlevideo\.com|.*\.video\.google\.com)\/.*(range\=[0-9\-]*).*(itag=[0-9]*).*(id=[a-zA-Z0-9]*)/) {
    print $x . "http://video-srv.youtube.com.SQUIDINTERNAL/" . $4 . "&" . $2 . "\n";
--
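
For completeness, the branches above sit inside a loop roughly like
this (a sketch only; the host alternation is abbreviated, and the
channel-ID handling assumes helper concurrency is enabled in
squid.conf):

#!/usr/bin/perl
# sketch of the store_url_rewrite helper loop around the branches above
$| = 1;                     # flush every reply immediately, required for helpers
while (<STDIN>) {
    chomp;
    my @f = split;
    my $x = '';
    # with helper concurrency the first field is a numeric channel ID; keep it as a prefix
    if (@f > 1 && $f[0] =~ /^\d+$/) { $x = shift(@f) . " "; }
    $_ = $f[0];             # the requested URL

    # youtube 360p itag=34, 480p itag=35 [ITAG/ID/RANGE] -- same branch as above
    if (m/^http:\/\/(.*\.youtube\.com|.*\.googlevideo\.com)\/.*(itag=[0-9]*).*(id=[a-zA-Z0-9]*).*(range\=[0-9\-]*)/) {
        print $x . "http://video-srv.youtube.com.SQUIDINTERNAL/" . $3 . "&" . $4 . "\n";
    }
    # ... the other elsif branches go here ...
    else {
        print $x . $_ . "\n";    # pass everything else through unchanged
    }
}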

I have already discovered that rewriting them and storing them as
videoplayback?id=000&range=00-00 solves the problem, but the
cache grows faster because we are no longer saving one file per video;
we are saving multiple range files for one ID!

For me it saves a lot of bandwidth at the cost of a bigger cache. If
you check and analyze it further, you will notice that for the same ID,
the same video, the URL layout changes while watching. For example:

It starts as [ITAG/ID/RANGE], then changes to [ID/ITAG/RANGE], and
finally to [RANGE/ITAG/ID], so with the three patterns above you can
capture all of the orderings.


Ghassan

On 4/25/12, Eliezer Croitoru  wrote:
> On 25/04/2012 06:02, Amos Jeffries wrote:
>> On 25/04/2012 6:02 a.m., Eliezer Croitoru wrote:
>>> as for some people asking me recently about youtube cache i have
>>> checked again and found that youtube changed their video uris and
>>> added an argument called "range" that is managed by the youtube player.
>>> the original url\uri dosnt include range but the youtube player is
>>> using this argument to save bandwidth.
>>>
>>> i can implement the cahing with ranges on nginx but i dont know yet
>>> the way that range works.
>>> it can be based on user bandwidth or "fixed" size of chunkes.
>>>
>>> if someone up to the mission of analyzing it a bit more to understand
>>> it so the "range" cache will be implemented i will be happy to get
>>> some help with it.
>>>
>>> Thanks,
>>> Eliezer
>>>
>>>
>>
>> I took a look at it a while back...
>>
>> I got as far as determining that the "range" was roughly byte-ranges as
>> per the HTTP spec BUT (and this is a huge BUT). Each response was
>> prefixed with some form of file intro bytes. Meaning the rages were not
>> strictly sub-bytes of some original object. At this point there is no
>> way for Squid to correctly generate the intro bytes, or to merge/split
>> these "ranges" for servicing other clients.
>>
>> When used the transfer is relatively efficient, so the impact of
>> bypassing the storeurl cache feature is not too bad. The other option is
>> to re-write the URL without range and simply reply with the whole video
>> regardless. It is a nasty mapping problem with bandwidth waste either way.
>>
> they have changed something in the last month or so.
> the was using a "begin"
> and now they are usinn " "rang=13-X" 13 is the first..
> i was thinking also on rewriting the address cause it works perfectly
> with my testing.
>
> will update more later.
>
> Eliezer
>> That was a year or two ago, so it may be worth re-investigating.
>>
>> Amos
>
>
> --
> Eliezer Croitoru
> https://www1.ngtech.co.il
> IT consulting for Nonprofit organizations
> eliezer  ngtech.co.il
>


Re: [squid-users] anyone knows some info about youtube "range" parameter?

2012-04-29 Thread Ghassan Gharabli
Hello Eliezer,

Are you trying to save all of the video chunks as separate parts, or to
capture/download the whole video object through cURL or something
similar? I don't think that will work, since it will trigger an error
in the new YouTube player.

What I have reached lately is saving YouTube video chunks for 360p
(itag=34) and 480p (itag=35) without storing the itag in the cache key,
because I want to save more bandwidth (that is why I only sent you the
scripts as an example). This means that if someone asks for 480p, he
gets the cached 360p content; that is why I did not include the itag.
If he chooses 720p and above, another rule catches it, matching itag 37
or 22. I know that is not the best solution, but at least it works
pretty well with no errors at all, and the client can still fast
forward.
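
Just to make that concrete, the 360p/480p collapsing is basically this
(a sketch; $x and the fake internal host are the same as in my earlier
snippets, and the parameter names are just the ones I see in the URLs):

# let 360p (itag=34) and 480p (itag=35) share one store URL keyed only on id + range
my ($vid)   = m/[?&]id=([a-zA-Z0-9_-]+)/;
my ($range) = m/[?&]range=([0-9\-]+)/;
if (m/[?&]itag=3[45]\b/ && defined $vid && defined $range) {
    print $x . "http://video-srv.youtube.com.SQUIDINTERNAL/id=" . $vid . "&range=" . $range . "\n";
}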

I'm using Squid 2.7 STABLE9 compiled on Windows 64-bit with Perl x64.

Regarding the 302 redirection: I made sure to patch client_side.c to
fix the 302 redirection loop, so I really don't have to worry about
that any more. What is your target regarding YouTube's range argument,
and what is the remaining problem so far?

I have a RAID with 5 HDDs and the average is about 2732.6 HTTP requests
per minute. Because I want to save more bandwidth, I keep analyzing the
HTTP requests so I can update my Perl script to match the most wanted
websites, targeting videos, MP3s, etc.

For a second I thought someone might help write an intelligent external
helper that could capture and reassemble the whole object from its byte
ranges, but I know that is really hard to do precisely because we are
dealing with byte ranges.

I only have one question that keeps teasing me: how does Squid compare
with Blue Coat? Is it a matter of hardware performance, or does Blue
Coat simply have more tricks to cache everything and reach a higher hit
ratio?



Ghassan



On Mon, Apr 30, 2012 at 1:29 AM, Eliezer Croitoru  wrote:
> On 24/04/2012 21:02, Eliezer Croitoru wrote:
>>
>> as for some people asking me recently about youtube cache i have checked
>> again and found that youtube changed their video uris and added an
>> argument called "range" that is managed by the youtube player.
>> the original url\uri dosnt include range but the youtube player is using
>> this argument to save bandwidth.
>>
>> i can implement the cahing with ranges on nginx but i dont know yet the
>> way that range works.
>> it can be based on user bandwidth or "fixed" size of chunkes.
>>
>> if someone up to the mission of analyzing it a bit more to understand it
>> so the "range" cache will be implemented i will be happy to get some
>> help with it.
>>
>> Thanks,
>> Eliezer
>>
>>
> as for now the "minimum_object_size 512 bytes" wont do the trick for 302
> redirection on squid2.7 because the 302 response is 963 big size.
> so i have used:
> minimum_object_size 1024 bytes
> just to make sure it will work.
> and also this is a youtube videos dedicated server so it's on with this
> limit.
>
> Regards,
>
> Eliezer
>
> --
> Eliezer Croitoru
> https://www1.ngtech.co.il
> IT consulting for Nonprofit organizations
> eliezer  ngtech.co.il


Re: [squid-users] anyone knows some info about youtube "range" parameter?

2012-05-01 Thread Ghassan Gharabli
The cached contents cover a lot of sites:

-Youtube

-GoogleSyndication

- Ads, which always give me a strong headache: I follow every ad by
caching its content rather than blocking it, because blocking an ad URL
would produce a JavaScript error on the browser's side.

-VideoZer

- android Google ( MARKET .. etc )

- Ziddu

- xHamster

- SoundCloud

- Some websites have a CDN folder that puts a (sometimes long) key in
the middle of the URL; every time you refresh the page a new dynamic
folder is generated, even though the object itself ends in .jpg or some
other multimedia extension. Sometimes you also see URLs that end like
abc.jpg;blablabla or abc.jpg?blablabla, so I strip that CDN folder out
of the store URL and at the same time remove everything that comes
after the extension when it is followed by ";" or "?" (see the sketch
after this list).

- wlxrs.com

- reverbnation , megavideo , xaxtube

- Nogomi / Ahlasoot, for example, to save bandwidth. Usually, before
you download the MP3 file they let you listen online through their web
player, which serves the same file size and the same file name; but if
you download the file you get a different domain name. That made me
rewrite matching URLs so the same store URL is used whether a client
listens or downloads, and a lot of websites follow the same idea.

- vkontakte , depositfiles , eporner , 4shared , letitbit , sendspace
, filesonic , uploaded , uploading , turbobit, wupload , redtubefiles
, filehippo , oron , rapishare , tube8 , pornhub , xvideos , telstra ,
scribdassets , przeklej , hardsextube , fucktube , imageshack , beeg ,
yahoo videos , youjizz , gcdn

For example, look at this URL:
#http://b2htks6oia9cm5m4vthd6hhulo.gcdn.biz/d/r/*/FileName without extension
(*) dynamic content; just look at what the subdomain looks like!

There are many more websites, but unfortunately EASY-SHARE uses a POST
response and I cannot cache it.
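
To make the CDN-folder case above concrete, this is roughly the kind of
rule I mean; the host, the 16-character key length and the extension
list are only an illustration, not a real site:

# drop a per-request key segment and anything after the file extension
# e.g. http://cdn.example.com/<random-key>/images/abc.jpg;sessionjunk
#   -> http://cdn.example.com/images/abc.jpg
if (m/^(http:\/\/[^\/]+)\/[A-Za-z0-9]{16,}\/(.*?\.(?:jpe?g|gif|png|flv|mp3|mp4))(?:[;?].*)?$/) {
    print $x . $1 . "/" . $2 . "\n";
}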

Lately I have been monitoring Nokia phones and I added the Ovi Store.

Do you have any website that is not cacheable, or that uses a CDN or
something similar? I am really interested to take a look. :)



Ghassan

On Mon, Apr 30, 2012 at 2:53 AM, Eliezer Croitoru  wrote:
> On 30/04/2012 02:18, Ghassan Gharabli wrote:
>>
>> Hello Eliezer,
>>
>> Are you trying to save all video chunks into same parts or capture /
>> download the whole video object through CURL or whatever! but i dont
>> think it should work since it will occure an error with the new
>> Youtube Player.
>>
>> What I have reached lately is saving same youtube video chunks  that
>> has  youtube 360p itag=34 ,480p itag=35 without saving its itag since
>> i want to save more bandwidth (thats why i only wrote scripts to you
>> as an example) which means if someone wants to watch 480p then he
>> would get the cached 360p contents thats why i didnt add the itag but
>> if he thought of watching 720p and above then another script would
>> catch it matching ITAG 37 , 22 ... I know that is not the best
>> solution but at least its working pretty well with no erros at all as
>> long as the client can always fast forward .
>>
>> Im using Squid 2.7 Stable9 compiled on windows 64-bit with PERL x64.
>>
>> Regarding the 302 Redirection  .. I have made sure to update the
>> source file client_side.c to fix the loop 302 Redirection but really I
>> dont have to worry now about anything so what is your target regarding
>> Youtube with argument Range and whats the problem till now ?
>>
>> I have RAID with 5 HDD and the average HTTP Requests per minute :
>> 2732.6 and because I want to save more bandwidth I try to analyze HTTP
>> Requests so i can always update my perl script to match most wanted
>> websites targetting Videos , Mp3 etc.
>>
>> For a second I thought of maybe someone would help to compile an
>> intelligent external helper script that would capture  the whole
>> byte-range and I know it is really hard to do that since we are
>> dealing with byte-range.
>>
>> I only have one question that is always teasing me .. what are the
>> comnparison between SQUID and BLUE COAT so is it because it is a
>> hardware perfromance or just it has more tricks to cache everything
>> and reach a maximum ratio ?
>>
>>
>>
>> Ghassan
>
> i was messing with store_url_rewrite and url_rewrite quite some time just
> for knowledge.
>
> i was researching every concept exists with squid until now.
> a while back (year or more)  i wrote store_url_rewrite using java and posted
> the code somewhere.
> the reason i was using java was because it's the fastest and simples from
> all other languages i know (ruby perl python).
> i was saving bandwidth using nginx because it was simple to setup.
> i dont really like the idea of faking 

Re: [squid-users] Re: anyone knows some info about youtube "range" parameter?

2012-05-01 Thread Ghassan Gharabli
As I understand it, nginx removes the Range header, as well as several
other headers, before passing the request upstream, to make sure the
full response is cached and other requests for the same URI are served
correctly.

Squid, at least in versions 2.7 and up, refuses to cache 206 Partial
Content responses by default. It is possible to use a combination of
range_offset_limit and quick_abort_min to force Squid to fetch and
cache the whole object, but Squid will still strip off the Range header
and request the entire object from the origin, so it is still not a
good solution.
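
In squid.conf terms that combination is just the following (applied
globally here for simplicity; newer Squid releases can scope
range_offset_limit with an ACL):

# fetch the whole object even when the client only asks for a Range
range_offset_limit -1
# keep downloading the object even if the client aborts early
quick_abort_min -1 KB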

Do you know anything about Varnish? It has experimental support for
range requests, which you can enable via the "http_range" runtime
parameter.

What I have found so far is that Varnish will try to request the entire
file from the back-end server, and only if the entire file is in the
cache will it respond with a partial content response without
contacting the back-end.

Does Squid have the ability to remove the Range header?


Ghassan

On Tue, May 1, 2012 at 5:28 PM, Eliezer Croitoru  wrote:
> On 01/05/2012 17:09, x-man wrote:
>>
>> I like the option to use nginx as cache_peer, who is doing the youtube
>> handling and I'm keen on using it.
>>
>> The only think I don't know in this case is how the nginx will mark the
>> traffic as CACHE HIT or CACHE MISS, because I want to have the CACHE HIT
>> traffic marked with DSCP so I can use the Zero penalty hit in the NAS and
>> give high speed to users for the cached videos?
>>
>> Anyone has idea about that?
>>
>> --
>> View this message in context:
>> http://squid-web-proxy-cache.1019090.n4.nabble.com/anyone-knows-some-info-about-youtube-range-parameter-tp4584388p4600792.html
>> Sent from the Squid - Users mailing list archive at Nabble.com.
>
> i do remember that nginx logged the file when a hit happens on the main
> access log file but i'm not sure about it.
> i have found  that store_url_rewrite is much more effective then nginx cache
> with ranges but didnt had the time to analyze the reason yet.
> by the way you can use squid2.7 instance as a cache_peer instead of nginx.
>
> did you tried my code(ruby)?
> i will need to make some changes to make sure it will fit more videos that
> doesn't use range parameter(there are couple).
>
>
> Eliezer
>
> --
> Eliezer Croitoru
> https://www1.ngtech.co.il
> IT consulting for Nonprofit organizations
> eliezer  ngtech.co.il


[squid-users] Interception proxy with DNAT using squid 3.3.1

2013-02-17 Thread Ghassan Gharabli
Hello,

I've been trying to solve this problem for the past three days but
wasn't successful.


I want to set up an interception proxy with DNAT.

SQUID ---> MIKROTIK Router > CLIENT PC

Squid Configure Options: --enable-ssl --enable-ssl-crtd
--enable-icap-client --with-filedescriptors=8192
--enable-ltdl-convenience



MY Squid config :
---
#
# Recommended minimum configuration:
#

# Example rule allowing access from your local networks.
# Adapt to list your (internal) IP networks from where browsing
# should be allowed
acl localnet src 10.0.0.0/8 # RFC1918 possible internal network
acl localnet src 172.16.0.0/12  # RFC1918 possible internal network
acl localnet src 192.168.0.0/16 # RFC1918 possible internal network
acl localnet src fc00::/7   # RFC 4193 local private network range
acl localnet src fe80::/10  # RFC 4291 link-local (directly
plugged) machines

acl SSL_ports port 443
acl Safe_ports port 80  # http
acl Safe_ports port 21  # ftp
acl Safe_ports port 443 # https
acl Safe_ports port 70  # gopher
acl Safe_ports port 210 # wais
acl Safe_ports port 1025-65535  # unregistered ports
acl Safe_ports port 280 # http-mgmt
acl Safe_ports port 488 # gss-http
acl Safe_ports port 591 # filemaker
acl Safe_ports port 777 # multiling http
acl CONNECT method CONNECT

#
# Recommended minimum Access Permission configuration:
#
# Only allow cachemgr access from localhost
http_access allow localhost manager
http_access deny manager

# Deny requests to certain unsafe ports
http_access deny !Safe_ports

# Deny CONNECT to other than secure SSL ports
http_access deny CONNECT !SSL_ports

# We strongly recommend the following be uncommented to protect innocent
# web applications running on the proxy server who think the only
# one who can access services on "localhost" is a local user
#http_access deny to_localhost
http_access allow localnet
http_access allow localhost
http_access deny all


http_port 0.0.0.0:8080
http_port 0.0.0.0:3128 intercept
#http_port 192.168.10.4:3128 intercept ssl-bump
generate-host-certificates=on dynamic_cert_mem_cache_size=10MB
cert=/usr/local/squid/ssl_cert/myCA.pem
#https_port 192.168.10.4:3129 intercept ssl-bump
generate-host-certificates=on dynamic_cert_mem_cache_size=10MB
cert=/usr/local/squid/ssl_cert/myCA.pem

cache_dir ufs /usr/local/squid/var/cache/squid 1 16 256

# Leave coredumps in the first cache dir
coredump_dir /usr/local/squid/var/cache/squid

# Add any of your own refresh_pattern entries above these.
refresh_pattern ^ftp:   144020% 10080
refresh_pattern ^gopher:14400%  1440
refresh_pattern -i (/cgi-bin/|\?) 0 0%  0
refresh_pattern .   0   20% 4320

always_direct allow all
acl broken_sites dstdomain .example.com
ssl_bump none localhost
ssl_bump none broken_sites
ssl_bump server-first

sslproxy_cert_error allow all
sslproxy_flags DONT_VERIFY_PEER
sslproxy_cert_adapt setCommonName
#sslproxy_cert_sign signTrusted
sslcrtd_program /usr/local/squid/libexec/ssl_crtd -s
/usr/local/squid/var/lib/ssl_db -M 10MB
sslcrtd_children 5


forwarded_for transparent
#visible_hostname cache2.skydsl.net
#offline_mode on
maximum_object_size 10 KB

ERROR I AM GETTING :
--
The following error was encountered while trying to retrieve the URL:
http://www.cnn.com/
Connection to 192.168.10.4 failed.
The system returned: (111) Connection refused
The remote host or network may be down. Please try the request again.
--

I tried everything was mentioned at
http://wiki.squid-cache.org/ConfigExamples/Intercept/LinuxDnat

[root@cache2 ~] # iptables -t nat --list rules
-P PREROUTING ACCEPT
-P INPUT ACCEPT
-P OUTPUT ACCEPT
-P POSTROUTING ACCEPT
-A PREROUTING -s 192.168.10.4/32 -p tcp -m tcp --dport 80 -j ACCEPT
-A PREROUTING -p tcp --dport 80 -j REDIRECT --to-ports 3128
-A POSTROUTING -j MASQUERADE

[root@cache2 ~] # iptables -t mangle --list-rules
-P PREROUTING ACCEPT
-P INPUT ACCEPT
-P FORWARD ACCEPT
-P OUTPUT ACCEPT
-P POSTROUTING ACCEPT
-A PREROUTING -s 192.168.10.4/32 -p tcp -m tcp --dport 80 -j ACCEPT


What do you think the problem might be?

I am using Fedora 17 x64.

Any suggestions will be appreciated.

Thank you


Ghassan


Re: [squid-users] Interception proxy with DNAT using squid 3.3.1

2013-02-18 Thread Ghassan Gharabli
Hello Amos,

Thank you for your help.

I hadn't noticed that I have dnsmasq installed, but I stopped the
dnsmasq service and I still get the same error.

I am using DNAT on the MikroTik with a masquerade rule, and the NAT
behaviour is still the same.

For example, the NAT rule:
add chain=dstnat src-address=0.0.0.0/0 protocol=tcp dst-port=80 action=dstnat to-addresses=192.168.10.4 to-ports=3128

This example works with Squid 2.7, but right now I have changed the
rules and use a mark-routing rule instead, though I really want to use
NAT rather than routing via the gateway.

Stopping dnsmasq didn't help.

Another question: if I buy this SSL certificate,
http://www.digicert.com/welcome/ssl-plus.htm, will that get rid of the
certificate errors in the clients' IE? Or is there any chance of
continuing to use the fake CA .pem generated with OpenSSL and simply
ignoring all the errors?


Thank you


Ghassan

On Mon, Feb 18, 2013 at 5:08 AM, Amos Jeffries  wrote:
> On 18/02/2013 2:47 p.m., Ghassan Gharabli wrote:
>>
>> Hello,
>>
>> Ive been trying to solve this problem for the past three days but
>> wasnt successfull.
>>
>>
>> I want to setup an interception proxy with DNAT.
>>
>> SQUID ---> MIKROTIK Router > CLIENT PC
>>
>> Squid Configure Options: --enable-ssl --enable-ssl-crtd
>> --enable-icap-client --with-filedescriptors=8192
>> --enable-ltdl-convenience
>>
>>
>>
>> MY Squid config :
>> ---
>> #
>> # Recommended minimum configuration:
>> #
>>
>> # Example rule allowing access from your local networks.
>> # Adapt to list your (internal) IP networks from where browsing
>> # should be allowed
>> acl localnet src 10.0.0.0/8 # RFC1918 possible internal network
>> acl localnet src 172.16.0.0/12  # RFC1918 possible internal network
>> acl localnet src 192.168.0.0/16 # RFC1918 possible internal network
>> acl localnet src fc00::/7   # RFC 4193 local private network range
>> acl localnet src fe80::/10  # RFC 4291 link-local (directly
>> plugged) machines
>>
>> acl SSL_ports port 443
>> acl Safe_ports port 80  # http
>> acl Safe_ports port 21  # ftp
>> acl Safe_ports port 443 # https
>> acl Safe_ports port 70  # gopher
>> acl Safe_ports port 210 # wais
>> acl Safe_ports port 1025-65535  # unregistered ports
>> acl Safe_ports port 280 # http-mgmt
>> acl Safe_ports port 488 # gss-http
>> acl Safe_ports port 591 # filemaker
>> acl Safe_ports port 777 # multiling http
>> acl CONNECT method CONNECT
>>
>> #
>> # Recommended minimum Access Permission configuration:
>> #
>> # Only allow cachemgr access from localhost
>> http_access allow localhost manager
>> http_access deny manager
>>
>> # Deny requests to certain unsafe ports
>> http_access deny !Safe_ports
>>
>> # Deny CONNECT to other than secure SSL ports
>> http_access deny CONNECT !SSL_ports
>>
>> # We strongly recommend the following be uncommented to protect innocent
>> # web applications running on the proxy server who think the only
>> # one who can access services on "localhost" is a local user
>> #http_access deny to_localhost
>> http_access allow localnet
>> http_access allow localhost
>> http_access deny all
>>
>>
>> http_port 0.0.0.0:8080
>> http_port 0.0.0.0:3128 intercept
>> #http_port 192.168.10.4:3128 intercept ssl-bump
>> generate-host-certificates=on dynamic_cert_mem_cache_size=10MB
>> cert=/usr/local/squid/ssl_cert/myCA.pem
>> #https_port 192.168.10.4:3129 intercept ssl-bump
>> generate-host-certificates=on dynamic_cert_mem_cache_size=10MB
>> cert=/usr/local/squid/ssl_cert/myCA.pem
>>
>> cache_dir ufs /usr/local/squid/var/cache/squid 1 16 256
>>
>> # Leave coredumps in the first cache dir
>> coredump_dir /usr/local/squid/var/cache/squid
>>
>> # Add any of your own refresh_pattern entries above these.
>> refresh_pattern ^ftp:   144020% 10080
>> refresh_pattern ^gopher:14400%  1440
>> refresh_pattern -i (/cgi-bin/|\?) 0 0%  0
>> refresh_pattern .   0   20% 4320
>>
>> always_direct allow all
>> acl broken_sites dstdomain .example.com
>> ssl_bump none localhost
>> ssl_bump none broken_sites
>> ssl_bump server-first
>>
>> sslproxy_cert_error allow all
>> sslproxy_flags DONT_VERIFY_PEER
>> sslproxy_cert_adapt setCommonName
>> #sslproxy_cert_sign signTrusted
>> sslcrtd_program /usr/local/squid/libexec/ssl_crtd -s
>> /usr/local/squid/var/lib/ssl

[squid-users] Youtube Changes

2013-04-22 Thread Ghassan Gharabli
Hello,

Did anyone notice the changes in the YouTube videoplayback URL? I have
noticed that most YouTube videos are no longer cached because the id is
not static any more.

Most video ids now start with "o", i.e. id=o- .

I even changed my Perl script to rewrite each videoplayback request to
remove the range, and I could successfully get the whole video file
into the YouTube player, but that is not my target.

I also tried to take the video_id and attach it to each videoplayback
request. That worked as long as I watched one video at a time, but as
soon as several videos are opened at the same time, the Perl helper
starts attaching a random video_id to each videoplayback request, which
is not good at all.

I have noticed that each video page fetches a get_video_info file which
includes all the URLs related to the video, and the video_id is
included there too, so does anyone have more ideas? The only way I see
is to take the $video_id from each video request and correlate it,
because we now need to compare the $cpn value that is common between
the videoplayback\? URL and the s\? URL. Can we compare the $cpn using
Squid, given that we only get requests line by line? For example, we
read an s\? request, take its "&video_id" and "&cpn" and remember them,
and then when a videoplayback request arrives with the same "&cpn", we
store it under that video_id.

Is that hard to do with Squid? What would the solution be?
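
A minimal sketch of that correlation as a helper (the %cpn_to_id map
and the rewritten key are only an illustration of the idea, not tested
code; it assumes a single helper process sees both requests, and the
concurrency channel-ID prefix is left out for brevity):

#!/usr/bin/perl
$| = 1;
my %cpn_to_id;    # video_id remembered per cpn, filled from the s?/get_video_info requests
while (<STDIN>) {
    chomp;
    my ($url) = split;
    my ($cpn) = $url =~ /[?&]cpn=([\w\-]+)/;
    my ($vid) = $url =~ /[?&]video_id=([\w\-]+)/;
    if (defined $cpn && defined $vid) {
        # the stats ping carries both values: remember the pairing
        $cpn_to_id{$cpn} = $vid;
        print $url . "\n";
    } elsif ($url =~ /\/videoplayback\?/ && defined $cpn && exists $cpn_to_id{$cpn}) {
        # key the cached copy on the stable video_id instead of the volatile o-... id
        print "http://video-srv.youtube.com.SQUIDINTERNAL/id=" . $cpn_to_id{$cpn} . "\n";
    } else {
        print $url . "\n";    # everything else passes through unchanged
    }
}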

I already tried coding it in PHP, but Perl was better.


Fwd: [squid-users] Re: Youtube Changes

2013-04-23 Thread Ghassan Gharabli
Hello again,

I liked the idea of what you did, especially that you are using the
Perl ReadBackwards module: Squid writes a dedicated log that only
records \.youtube\.com requests (or puts the two links on the same
line), and then a Perl subroutine extracts the two returned parameters,
(v/docid/video_id) and cpn, etc.


It only writes the video_id when the id does not start with
o-[a-zA-Z0-9-_] (those are written into the cache as-is), but I solved
that by rewriting the whole videoplayback URL without
([&?]range=[^&]*) and ([&?]ratebypass=[a-z]*), and by changing
[&?]vq=auto to medium or large, just to get rid of the automatic
quality switching caused by the slow connection.
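
In other words, something along these lines, applied to the URL before
it is used as the store key (a sketch of the substitutions only; $url
stands for the requested URL):

# strip the volatile parameters and pin the quality selector
$url =~ s/[&?]range=[^&]*//;
$url =~ s/[&?]ratebypass=[a-z]*//;
$url =~ s/([&?])vq=auto/$1vq=medium/;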

Did you check it with ranges when the id starts with "o-"?

What else will YouTube do in the future: change the video ID
completely, or keep everything hidden?


I liked the idea of what you did and congratulations :-). Thank you so much

Ghassan


[squid-users] HTTPS Caching between Squid's Parent and Child

2013-09-04 Thread Ghassan Gharabli
Hi,

I am trying to set up SSL-Bump between a parent Squid proxy and a child proxy.

I am using Squid version 3.3.8 for both the parent and the child,
installed on the same system (Fedora 64-bit).

Configure options: --enable-ssl --enable-ssl-crtd --enable-icap-client
--with-filedescriptors=65536 --enable-ltdl-convenience

My target is to cache HTTPS traffic because bandwidth here is very
expensive, and I have also noticed that most websites are moving to
HTTPS.

I am having difficulties establishing the connection between the parent
and the child Squid.

I am able to cache HTTPS traffic by installing a certificate file on
each customer's PC or phone.

Is there any possible way to make the parent proxy cache just the HTTPS
traffic, and let the child proxy negotiate with the parent and
establish the SSL connection using the required certificate, so that
the child could then share the connection again without requiring
customers to install the certificate?

Parent Proxy Settings:
-

#
# Recommended minimum configuration:
#

# Example rule allowing access from your local networks.
# Adapt to list your (internal) IP networks from where browsing
# should be allowed
acl localnet src 10.0.0.0/8 # RFC1918 possible internal network
acl localnet src 172.16.0.0/12 # RFC1918 possible internal network
acl localnet src 192.168.0.0/16 # RFC1918 possible internal network
acl localnet src fc00::/7   # RFC 4193 local private network range
acl localnet src fe80::/10  # RFC 4291 link-local (directly
plugged) machines

acl SSL_ports port 443
acl Safe_ports port 80 # http
acl Safe_ports port 21 # ftp
acl Safe_ports port 443 # https
acl Safe_ports port 70 # gopher
acl Safe_ports port 210 # wais
acl Safe_ports port 1025-65535 # unregistered ports
acl Safe_ports port 280 # http-mgmt
acl Safe_ports port 488 # gss-http
acl Safe_ports port 591 # filemaker
acl Safe_ports port 777 # multiling http
acl CONNECT method CONNECT
acl SSL method CONNECT

#
# Recommended minimum Access Permission configuration:
#
# Only allow cachemgr access from localhost
http_access allow localhost manager
#http_access deny manager

# Deny requests to certain unsafe ports
http_access deny !Safe_ports

# Deny CONNECT to other than secure SSL ports
http_access deny CONNECT !SSL_ports

# We strongly recommend the following be uncommented to protect innocent
# web applications running on the proxy server who think the only
# one who can access services on "localhost" is a local user
#http_access deny to_localhost

#
# INSERT YOUR OWN RULE(S) HERE TO ALLOW ACCESS FROM YOUR CLIENTS
#

# Example rule allowing access from your local networks.
# Adapt localnet in the ACL section to list your (internal) IP networks
# from where browsing should be allowed
http_access allow localnet
http_access allow localhost
# And finally deny all other access to this proxy
http_access deny all

# Squid normally listens to port 3128
http_port 0.0.0.0:9000
http_port 0.0.0.0:3128 intercept ssl-bump
generate-host-certificates=on dynamic_cert_mem_cache_size=16MB
cert=/usr/local/squidparent/ssl_cert/myCA.pem
https_port 3129 intercept ssl-bump generate-host-certificates=on
dynamic_cert_mem_cache_size=16MB
cert=/usr/local/squidparent/ssl_cert/myCA.pem

# Uncomment and adjust the following to add a disk cache directory.
cache_dir ufs /usr/local/squidparent/var/cache/squid 1 16 256

# Leave coredumps in the first cache dir
coredump_dir /usr/local/squidparent/var/cache/squid

# Add any of your own refresh_pattern entries above these.
refresh_pattern ^ftp: 1440 20% 10080
refresh_pattern ^gopher: 1440 0% 1440
refresh_pattern ^https:\/\/.*\.(jp(eg|g|e|2)|tiff?|bmp|gif|png|kmz|eot|css|js) 129600 99% 129600 ignore-no-cache ignore-no-store reload-into-ims override-expire ignore-must-revalidate store-stale ignore-private ignore-auth
refresh_pattern \.(class|css|cssz|js|jsz|xml|jhtml|txt|tif|swf|zsci|arc|asc) 129600 99% 129600 ignore-no-cache ignore-no-store reload-into-ims override-expire ignore-must-revalidate store-stale ignore-private ignore-auth
refresh_pattern \.(doc|xls|ppt|ods|odt|odp|pdf|rtf|inf|ini) 129600 99% 129600 ignore-no-cache ignore-no-store reload-into-ims override-expire ignore-must-revalidate store-stale ignore-private
refresh_pattern \.(jp(eg|g|e|2)|tiff?|bmp|gif|png|kmz|eot) 129600 99% 129600 ignore-no-cache ignore-no-store override-lastmod reload-into-ims override-expire ignore-must-revalidate store-stale ignore-private ignore-auth
refresh_pattern \.(z(ip|[0-9]{2})|r(ar|[0-9]{2})|jar|tgz|bz2|grf|gpf|lz|lzh|lha|arj|sis|gz|ipa|tar|rpm|vpu|amz|img) 129600 99% 129600 ignore-no-cache ignore-no-store override-lastmod reload-into-ims override-expire ignore-must-revalidate store-stale ignore-private
refresh_pattern \.(mp(2|3|4)|wav|og(g|a)|flac|mid|midi?|r(m|mvb)|aac|mka|ap(e|k)) 129600 99% 129600 ignore-no-cache ignore-no-store override-lastmod reload-into-ims override-expire ignore-must-revalidate store-stale ignore-private
refre

[squid-users] Squid with PHP & Apache

2013-11-25 Thread Ghassan Gharabli
 Hi,

I have built a PHP script to cache HTTP/1.x 206 Partial Content, for
things like Windows Updates, and to allow seeking through YouTube and
many other websites.

I am willing to move from PHP to C++ hopefully after a while.

The script is almost finished, but I have several questions. I have no
idea whether I should always fetch the HTTP response headers and send
them back to the browsers.

1) Does Squid still fetch the HTTP response headers even if the object
is already in cache, or does Squid already hold a cached copy of the
HTTP response headers? If Squid caches the response headers, how do you
deal with an HTTP 302 when the object is already cached? I am asking
because I have seen many websites use the same extensions, such as
.FLV, together with a Location header.

2) Do you also use mime.conf to send the Content-Type to the browser
for both FTP and HTTP, or only for FTP?

3) Does Squid compare the length of the local cached copy with the
remote file when the object is already cached, or does it rely on
refresh_pattern?

4) What happens if the user adds a refresh_pattern to cache an object,
for example .xml, which has no Content-Length header? Do you still save
it, or do you look for the ignore-* options used to force caching the
object? And when the cached copy expires, do you still refresh it even
though there is no Content-Length header?

I am really confused by this issue, because I currently always fetch
the header list from the Internet and send it back to the browser
(using PHP and Apache), even when the object is in cache.

Your help and answers will be much appreciated

Thank you

Ghassan


Re: [squid-users] Squid with PHP & Apache

2013-11-26 Thread Ghassan Gharabli
On Tue, Nov 26, 2013 at 5:30 AM, Amos Jeffries  wrote:
> On 26/11/2013 10:13 a.m., Ghassan Gharabli wrote:
>>  Hi,
>>
>> I have built a PHP script to cache HTTP 1.X 206 Partial Content like
>> "WindowsUpdates" & Allow seeking through Youtube & many websites .
>>
>
> Ah. So you have written your own HTTP caching proxy in PHP. Well done.
> Did you read RFC 2616 several times? your script is expected to to obey
> all the MUST conditions and clauses in there discussing "proxy" or "cache".
>

Yes, I have read it and I will read it again. The reason I am building
such a script is that Internet access here in Lebanon is really
expensive and scarce.

As you know, YouTube sends dynamic chunks for each video. For example,
if a video is watched on YouTube more than 10 times, Squid fills the
cache with more than 90 chunks for that one video; that is why allowing
seeking to any position of the video through my script would save me
that headache.

>
>
> NOTE: the easy way to do this is to upgrade your Squid to the current
> series and use ACLs on the range_offset_limit directive. That way Squid
> will convert Range requests to normal fetch requests and cache the
> object before sending the requested pieces of it back to the client.
> http://www.squid-cache.org/Doc/config/range_offset_limit/
>
>

I have successfully supported HTTP 206 responses when the object is
cached, and my target is to honour Range headers: I can see that
iPhones and Google Chrome check whether the server sends an
Accept-Ranges: bytes header, and if so they request bytes=x-y, or
multiple ranges like bytes=x-y,x-y.
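
The request side is simple to parse; a small sketch (written in Perl
here rather than my PHP, purely to illustrate the header format):

# parse "Range: bytes=0-499,1000-1499" into (start, end) pairs;
# an empty start means "last N bytes", an empty end means "to end of file"
my $range = "bytes=0-499,1000-1499";
my @ranges;
if ($range =~ /^bytes=(.+)$/i) {
    for my $spec (split /\s*,\s*/, $1) {
        my ($start, $end) = $spec =~ /^(\d*)-(\d*)$/;
        push @ranges, [$start, $end];
    }
}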

>> I am willing to move from PHP to C++ hopefully after a while.
>>
>> The script is almost finished , but I have several question, I have no
>> idea if I should always grab the HTTP Response Headers and send them
>> back to the borwsers.
>
> The response headers you get when receiving the object are meta data
> describing that object AND the transaction used to fetch it AND the
> network conditions/pathway used to fetch it. The cachs job is to store
> those along with the object itself and deliver only the relevant headers
> when delivering a HIT.
>
>>
>> 1) Does Squid still grab the "HTTP Response Headers", even if the
>> object is already in cache or Squid has already a cached copy of the
>> HTTP Response header . If Squid caches HTTP Response Headers then how
>> do you deal with HTTP CODE 302 if the object is already cached . I am
>> asking this question because I have already seen most websites use
>> same extensions such as .FLV including Location Header.
>
> Yes. All proxies on the path are expected to relay the end-to-end
> headers, drop the hop-by-hop headers, and MUST update/generate the
> feature negotiation and state information headers to match its
> capabilities in each direction.
>
>

Do you mean "yes" to fetching the HTTP response headers even if the
object is already in cache, so that network latency is always added
whether it is a MISS or a HIT? I have tested Squid and noticed that
reading HIT objects from Squid takes about 0.x ms, so I believe cached
objects are served entirely offline until they expire. Right?

So far I am using $http_response_headers as it is the fastest method by
far, but I still have a latency problem: for each request the call
takes about 0.30 s, which is really high even though my network latency
is only 100~150 ms. That is why I thought I could fetch the HTTP
response headers the first time and store them, so that if the same URI
is requested again I send the cached headers instead of fetching them
again, eliminating the network latency. But then I still have an issue:
how am I going to know if the website answers with HTTP 302 (some
websites send HTTP 302 for the same requested file name) if I never
fetch the header again on a HIT, just to improve latency? The second
issue is saving the headers coming from CDNs.



>>
>> 2) Do you also use mime.conf to send the Content-Type to the browser
>> in case of FTP/HTTP or only FTP ?
>
> Only FTP and Gopher *if* Squid is translating from the native FTP/Gopher
> connection to HTTP. HTTP and protocols relayed using HTTP message format
> are expected to supply the correct header.
>
>>
>> 3) Does squid compare the length of the local cached copy with the
>> remote file if you already have the object file or you use
>> refresh_pattern?.
>
> Content-Length is a declaration of how many payload bytes are following
> the response headers. It has no relation to the servers object except in
> the special case where the entire object is being delivered as payload
> without any encoding.
>
>

I am only caching obj

Re: [squid-users] Squid with PHP & Apache

2013-11-28 Thread Ghassan Gharabli
her
> approach which would "rank" videos and will consider removing videos that
> was used once or twice per two weeks(which is depends on the size of the
> storage and load).
>
> If you do have a strong server that can run PHP you can try to take for a
> spin squid with StoreID that can help you to use only squid for youtube
> video caching.
>

Good idea. I have already thought of adding a ranking script with help
from MySQL, so I can calculate the percentage of HIT and MISS requests.


> The only thing you will need to take care off is 302 response with an ICAP
> service for example.
>
> I do know how tempting it is to use PHP and it can be in many cases better
> for a network to use another solution then only squid.
>
> I do not know if you have seen this article:
> http://wiki.squid-cache.org/ConfigExamples/DynamicContent/Coordinator
>
> The article shows couple aspect of youtube caching.
>
> There was some PHP code at:
> http://code.google.com/p/yt-cache/
>
> Which I have seen long time ago.(2011-12)
>

I have seen this website before and I think this project is old. They
are saving chunks, but they made an impressive settings page.

> StoreID is at the 3.4 branch of squid and is still on the Beta stage:
> http://wiki.squid-cache.org/Features/StoreID
>
> StoreID code by itself is very well tested and I am using it on a daily
> basis not even once restarting\reloading my local server for a very long
> time.
> I have not heard about a very big production environment(clustered) reports
> in my email yet.
>
> The basic idea of StoreID is to take the current existing internals of squid
> and to "unleash" them in a way that they can be exploited\used by external
> helper.
>
> StoreID is not here to replace the PHP or any other methods that might fit
> any network, it comes to allow the admin and see the power of squid caching
> even in this "dead-end" case which requires acrobatics.
>
> You can try to just test it in a small testing environment and to see if it
> fits to you.
>
> One of the benefits that Apache+PHP has is the "Threading" which allows one
> service such as apache to utilize as much horse power as the machine has as
> a "metal".
> Since squid is already there the whole internal traffic between the apache
> and squid can be "spared" while using StoreID.
>
> Note that fetching the headers *only* from the origin server can still help
> you to decide if you want to fetch the whole object from it.
> A fetch of a whole headers set which will not exceed 1KB is worth for even a
> 200KB file size in many cases.
>

The problem is that I am running the Windows build of Squid on Windows
Server 2008 R2 x64 (32 GB RAM and 6 TB of disk), not Linux, so I cannot
benefit from the new features Squid provides. The whole idea of
building such a script is to reduce the pain I am still suffering, and
I really hope I will also be able to cache SSL more efficiently.

I heard that Blue Coat appliances cache SSL, but I am not sure whether
they act as a man-in-the-middle, which requires a certificate to be
installed on each client machine.

Thank you again for providing me with more information to the subject.

I really appreciate your correspondence.

> I have tried to not miss somethings but I do not want to write a whole
> Scroll about yet so if there is more interest in it I will add more later.
>
> Regards,
> Eliezer
>
>
> On 25/11/13 23:13, Ghassan Gharabli wrote:
>>
>>   Hi,
>>
>> I have built a PHP script to cache HTTP 1.X 206 Partial Content like
>> "WindowsUpdates" & Allow seeking through Youtube & many websites .
>>
>> I am willing to move from PHP to C++ hopefully after a while.
>>
>> The script is almost finished , but I have several question, I have no
>> idea if I should always grab the HTTP Response Headers and send them
>> back to the browsers.
>>
>>
>> 1) Does Squid still grab the "HTTP Response Headers", even if the
>> object is already in cache or Squid has already a cached copy of the
>> HTTP Response header . If Squid caches HTTP Response Headers then how
>> do you deal with HTTP CODE 302 if the object is already cached . I am
>> asking this question because I have already seen most websites use
>> same extensions such as .FLV including Location Header.
>>
>> 2) Do you also use mime.conf to send the Content-Type to the browser
>> in case of FTP/HTTP or only FTP ?
>>
>> 3) Does squid compare the length of the local cached copy with the
>> remote file if you already have the object file or you use
>> refresh_pattern?.
>>
>> 4) What happens if the user modifies a refresh_pattern to cache an
>>
>> object, for example .xml which does not have [Content-Length] header.
>> Do you still save it, or would you search for the ignore-headers used
>> to force caching the object and what happens if the cached copy
>> expires , do you still refresh the copy even if there is no
>> Content-Length header?.
>>
>> I am really confused with this issue , because I am always getting a
>> headers list from the Internet and I send them back to the browser
>> (using PHP and Apache) even if the object is in cache.
>>
>> Your help and answers will be much appreciated
>>
>> Thank you
>>
>> Ghassan
>>
>


Re: [squid-users] Squid with PHP & Apache

2013-11-28 Thread Ghassan Gharabli
On Wed, Nov 27, 2013 at 1:28 PM, Amos Jeffries  wrote:
> On 27/11/2013 5:30 p.m., Ghassan Gharabli wrote:
>> On Tue, Nov 26, 2013 at 5:30 AM, Amos Jeffries wrote:
>>> On 26/11/2013 10:13 a.m., Ghassan Gharabli wrote:
>>>>  Hi,
>>>>
>>>> I have built a PHP script to cache HTTP 1.X 206 Partial Content like
>>>> "WindowsUpdates" & Allow seeking through Youtube & many websites .
>>>>
>>>
>>> Ah. So you have written your own HTTP caching proxy in PHP. Well done.
>>> Did you read RFC 2616 several times? your script is expected to to obey
>>> all the MUST conditions and clauses in there discussing "proxy" or "cache".
>>>
>>
>> Yes , I have read it and I will read it again , but the reason i am
>> building such a script is because internet here in Lebanon is really
>> expensive and scarce.
>>
>> As you know Youtube is sending dynamic chunks for each video . For
>> example , if you watch a video on Youtube more than 10 times , then
>> Squid fill up the cache with more than 90 chunks per video , that is
>> why allowing to seek at any position of the video using my script
>> would save me the headache .
>>
>
> Youtube is a special case. They do not strictly use Range requests for
> the video seeking. If you are getting that lucky you.
> They are also multiplexing videos via multiple URLs.
>

Hi Amos,

The YouTube application mostly uses Range requests on iPhone and
Android, but in browsers the range argument appears only if Flash
Player 11 is installed; YouTube sends the full length when Flash Player
10 is installed. That was the result of my investigation of YouTube.

>
>>>
>>> NOTE: the easy way to do this is to upgrade your Squid to the current
>>> series and use ACLs on the range_offset_limit directive. That way Squid
>>> will convert Range requests to normal fetch requests and cache the
>>> object before sending the requested pieces of it back to the client.
>>> http://www.squid-cache.org/Doc/config/range_offset_limit/
>>>
>>>
>>
>> I have successfully supported HTTP/206, if the object is cached and my
>> target is to enable Range headers, as I can see that iPhones or Google
>> Chrome check if the server has a header Accept-Ranges: Bytes then they
>> send a request bytes=x-y or multiple bytes like bytes=x-y,x-y .
>>
>
> Yes that is how Ranges requests and responses work.
>
> What I meant was that Squid already contained a feature to selectively
> cause the entire object to cache so it could generate the 206 response
> for clients.
>
>
>>>> I am willing to move from PHP to C++ hopefully after a while.
>>>>
>>>> The script is almost finished , but I have several question, I have no
>>>> idea if I should always grab the HTTP Response Headers and send them
>>>> back to the borwsers.
>>>
>>> The response headers you get when receiving the object are meta data
>>> describing that object AND the transaction used to fetch it AND the
>>> network conditions/pathway used to fetch it. The cachs job is to store
>>> those along with the object itself and deliver only the relevant headers
>>> when delivering a HIT.
>>>
>>>>
>>>> 1) Does Squid still grab the "HTTP Response Headers", even if the
>>>> object is already in cache or Squid has already a cached copy of the
>>>> HTTP Response header . If Squid caches HTTP Response Headers then how
>>>> do you deal with HTTP CODE 302 if the object is already cached . I am
>>>> asking this question because I have already seen most websites use
>>>> same extensions such as .FLV including Location Header.
>>>
>>> Yes. All proxies on the path are expected to relay the end-to-end
>>> headers, drop the hop-by-hop headers, and MUST update/generate the
>>> feature negotiation and state information headers to match its
>>> capabilities in each direction.
>>>
>>>
>>
>> Do you mean by Yes , for grabbing the Http Response Headers even if
>> the object is already in cache, so therefore latency of network is
>> always added even if MISS or HIT situation?
>
> No. I mean the headers received along with the object need to be stored
> with it and sent on HITs.
> I see many people thinking they can just store the object by itself same
> as a webs server stores it. But that way looses the vital header
> information.
>

Do you mean within one call , you store the object and the header in
the same file and then you extra