Re: [squid-users] YouTube Videos rating lists

2017-07-08 Thread Eliezer Croitoru
Hey Marcus,

The analyzer is a full-blown video and image analysis system of a company I
work with (lots of GPUs and CPUs...).
I do not know the exact details about gambling, since their product was
designed mainly to look for pornography, but I have seen that it can
identify gambling in more than one case.
From my point of view this system is a black box which I throw videos at,
and it spits back one of a couple of possible responses.
I don't care if they have a couple of monkeys clicking a mouse or a
keyboard doing this job.

I will also add the source code and initialization scripts repository at:
http://gogs.ngtech.co.il/elicro/squid-external-acl-by-url.git

I have tested it to work on CentOS, Ubuntu, Debian and a couple of other
distributions.
The initialization script was written for the amd64 binaries but can be
adapted to any of the offered binaries.
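For anyone who wants to wire the helper into Squid, the configuration side
would look roughly like the sketch below. The helper path, ACL names and
options here are illustrative guesses, not the repository's documented
settings; check the repository's own docs for the real ones.

    # squid.conf sketch: pass every request URL to the helper. In the
    # "feeder" mode described in the original message below, the helper
    # always answers ERR, so this ACL never matches and nothing is
    # blocked - it only feeds URLs out for offline rating.
    external_acl_type yt_feed ttl=60 children-max=2 %URI \
        /usr/local/libexec/squid/squid-external-acl-by-url
    acl yt_feeder external yt_feed
    http_access deny yt_feeder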

Eliezer


Eliezer Croitoru
Linux System Administrator
Mobile: +972-5-28704261
Email: elie...@ngtech.co.il



-Original Message-
From: squid-users [mailto:squid-users-boun...@lists.squid-cache.org] On Behalf 
Of Marcus Kool
Sent: Sunday, July 9, 2017 01:25
To: squid-users@lists.squid-cache.org
Subject: Re: [squid-users] YouTube Videos rating lists

Hi Eliezer,
what is the analyzer looking at?
Does it detect gambling and support languages other than English?
Thanks
Marcus

On 08/07/17 18:47, Eliezer Croitoru wrote:
> Hey All,
> 
> I have been working for quite some time on basic YouTube video filtering
> integration into SquidBlocker.
> I have a video and image analysis and categorization system that I can
> use to rate the videos and the images, but I am lacking one thing:
> YouTube URL feeds.
> 
> I have a running server that is dedicated to receiving YouTube video URLs
> for analysis, which then queues them for testing.
> For this to work I added a feature to the external_acl helper I wrote,
> called "feeder" mode, which first answers the request with an ERR and in
> the background sends the URL to the remote system.
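To illustrate the feeder idea: such a helper can be tiny, since the
external ACL protocol is just one line in on stdin and one line out on
stdout. A minimal sh sketch follows; it is not the actual helper from the
repository, and the collector URL is made up:

    #!/bin/sh
    # Minimal "feeder"-style external ACL helper: always answer ERR so no
    # request is ever blocked, and hand each URL to a remote collector in
    # the background so Squid is never delayed by it.
    while read url rest; do
        curl -s -o /dev/null --data-urlencode "url=$url" \
            http://collector.example.local/queue &
        echo ERR
    done

The real helper in the repository is the authoritative version; this only
shows the shape of the stdin/stdout exchange.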
> The end result would be publicly available rating lists, categorized in a
> similar way to Netflix's ratings, i.e.:
> https://help.netflix.com/en/node/2064
> 
> i.e.:
> Movies and TV:
> Little Kids   Older Kids   Teens   Adults
> All           7+           13+     16+
> 
> I found that Netflix sometimes misses the exact match, with adult content
> being treated as "7+"; I hope that I will not have this issue.
> As a first step I will have the API set up and the helper released with
> its sources.
> When these are ready I hope to start analyzing and categorizing YouTube
> videos for white- and blacklisting.
> After I have a baseline of black and white lists I will move on to
> weight-based categorization, which will also return the age from which
> the video is allowed to be watched.
> 
> I need help from anyone who is willing to send specific URL patterns and
> leave the analysis and categorization to the automated system.
> 
> Thanks in advance,
> Eliezer
> 
> 
> Eliezer Croitoru
> Linux System Administrator
> Mobile: +972-5-28704261
> Email: elie...@ngtech.co.il
> 
> 
> 
> ___
> squid-users mailing list
> squid-users@lists.squid-cache.org
> http://lists.squid-cache.org/listinfo/squid-users
> 

___
squid-users mailing list
squid-users@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-users


Re: [squid-users] Huge amount of time_wait connections after upgrade from v2 to v3

2017-07-08 Thread Ivan Larionov
RPS didn't change. Throughput didn't change. Our prod load is 200-700 RPS
per server (it changes during the day) and my load test was a constant 470
RPS.

Clients didn't change. It doesn't matter whether they use HTTP/1.1 or 1.0,
because the only thing that changed is the Squid version. And as I figured
out, it's not actually about the 2.7 to 3.5 update; it's all about the
difference between 3.5.20 and 3.5.21.

I'm sorry, but anything you say about throughput doesn't make any sense.
The load pattern didn't change. Squid still handles the same number of
requests.

I think I'm going to load test every patch applied to 3.5.21 from this
page:
http://www.squid-cache.org/Versions/v3/3.5/changesets/SQUID_3_5_21.html
so I'll be able to point to the exact change which introduced this
behavior. I'll try to do it during the weekend or maybe on Monday.
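(For anyone reproducing this: the TIME_WAIT counts above can be sampled
during a test run with something like

    # print the number of sockets in TIME_WAIT once per second
    while sleep 1; do netstat -an | grep -c TIME_WAIT; done

on the proxy host.)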

On Sat, Jul 8, 2017 at 5:46 AM, Amos Jeffries wrote:

> On 08/07/17 02:06, Ivan Larionov wrote:
>
>> Thank you for the fast reply.
>>
>> On Jul 7, 2017, at 01:10, Amos Jeffries wrote:
>>>
>>> On 07/07/17 13:55, Ivan Larionov wrote:
>>>>
>>>> However I assumed that this is a bug and that I could find an older
>>>> version which worked fine. I started testing from 3.1.x all the way
>>>> to 3.5.26 and this is what I found:
>>>> * All versions until 3.5.21 work fine. There are no issues with a
>>>> huge number of TIME_WAIT connections under load.
>>>> * 3.5.20 is the latest version that works fine.
>>>> * 3.5.21 is the first broken version.
>>>> * 3.5.23, 3.5.25, 3.5.26 are broken as well.
>>>> This effectively means that the bug is somewhere between 3.5.20 and
>>>> 3.5.21.
>>>> I hope this helps and I hope you'll be able to find the issue. If you
>>>> can create a bug report based on this information and post it here,
>>>> that would be awesome.
>>>
>>> The changes in 3.5.21 were fixes for some common crashes and better
>>> caching behaviour. So I expect at least some of the change is due to
>>> higher traffic throughput on proxies previously restricted by those
>>> problems.
>>
>> I can't imagine how a throughput increase could result in a 500 times
>> higher TIME_WAIT connection count.
>
> More requests per second generally means more TCP connection churn.
>
> Also when going from Squid-2 to Squid-3 there is a change from HTTP/1.0
> to HTTP/1.1 and the accompanying switch from MISS to near-HIT
> revalidations. Revalidations usually have only headers without payload,
> so the same bytes/sec can contain orders of magnitude more of those than
> MISSes - which is the point of having them.
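(To make the revalidation point concrete: a near-HIT revalidation is just
a conditional request answered by a body-less 304, along these lines, with
hypothetical header values:

    GET /logo.png HTTP/1.1
    Host: example.com
    If-None-Match: "abc123"

    HTTP/1.1 304 Not Modified
    ETag: "abc123"

so far more of these fit into the same bytes/sec than full-payload
MISSes.)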
>> In our prod environment when we updated from 2.7.x to 3.5.25 we saw an
>> increase from 100 to 10,000. This is 100x.
>
> Compared to what RPS change? Given the above traffic change this may be
> reasonable for a v2 to v3 jump. Our own very rough lab tests on old
> hardware have shown rates for Squid-2 at ~900 RPS and Squid-3 at around
> 1900 RPS.
>
>> When I was load testing different versions yesterday I was always
>> sending the same RPS to them. The update from 3.5.20 to 3.5.21 resulted
>> in a jump from 20 to 10,000 in the TIME_WAIT count. This is 500x.
>>
>> I know that TIME_WAIT is fine in general. Until you have too many of
>> them.
>
> At this point I'd check that your testing software supports HTTP/1.1
> pipelines. It may be giving you worst-case results with per-message TCP
> churn rather than what will occur normally (pipelines of N requests per
> TCP connection).
>
> Though seeing such a jump between Squid-3 releases is worrying.
>
> Amos
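(A side note on the pipelines point: with ApacheBench, for example,
persistent connections are off unless -k is given, so a run like the one
below, with a made-up proxy address and URL,

    # keep-alive on: connections are reused, TIME_WAIT should stay low
    ab -k -n 100000 -c 100 -X 127.0.0.1:3128 http://origin.example/some-url

is worth comparing against the same run without -k, where every request
costs a fresh TCP connection and whichever side closes first accumulates
the TIME_WAIT.)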



-- 
With best regards, Ivan Larionov.
___
squid-users mailing list
squid-users@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-users



Re: [squid-users] Build Squid-3.5.26 failure in FreeBSD 10.3-STABLE

2017-07-08 Thread Amos Jeffries

On 08/07/17 23:18, Willsz.net wrote:

> Hi,
>
> I've got a problem with this Squid version after deciding last night to
> upgrade FreeBSD from 9.3-STABLE to 10.3-STABLE. Here's some of the error
> output:
>
> root:/usr/local/src/squid-3.5.26# ./configure `cat squid.configure` > log
> Segmentation fault (core dumped)
> Segmentation fault (core dumped)
> Segmentation fault (core dumped)
> Segmentation fault (core dumped)
> Segmentation fault (core dumped)
> Segmentation fault (core dumped)
> Segmentation fault (core dumped)
> configure: WARNING: Neither SASL nor SASL2 found
> configure: WARNING: Samba smbclient not found in default location.
> basic_smb_auth may not work on this machine
> configure: WARNING: Neither SASL nor SASL2 found
> configure: WARNING: Samba wbinfo not found in default location.
> ext_wbinfo_group_acl may not work on this machine
> configure: WARNING: cppunit does not appear to be installed. squid does
> not require this, but code testing with 'make check' will fail.


Did you read those warnings? Take the last one, for example.




> root:/usr/local/src/squid-3.5.26# make check
>
> ...
>
> In file included from testPreCompiler.cc:10:0:
> testPreCompiler.h:12:45: fatal error: cppunit/extensions/HelperMacros.h:
> No such file or directory
> compilation terminated.


... surprise.
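For the archives: cppunit is available prebuilt on FreeBSD, so 'make
check' can be made to work by installing it first, e.g.:

    pkg install cppunit

(or by building the devel/cppunit port) and re-running ./configure.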

Amos
___
squid-users mailing list
squid-users@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-users


[squid-users] Build Squid-3.5.26 failure in FreeBSD 10.3-STABLE

2017-07-08 Thread Willsz.net
Hi,

I've got a problem with this Squid version after deciding last night to
upgrade FreeBSD from 9.3-STABLE to 10.3-STABLE. Here's some of the error
output:

root:/usr/local/src/squid-3.5.26# uname -srm
FreeBSD 10.3-STABLE i386

root:/usr/local/src/squid-3.5.26# cat squid.configure
--prefix=/usr/local --includedir=/usr/local/include --bindir=/usr/local/sbin
--libexecdir=/usr/local/libexec/squid --sysconfdir=/usr/local/etc/squid
--with-default-user=squid --localstatedir=/var/cache/squid
--libdir=/usr/local/lib --with-logdir=/var/log/squid
--with-pidfile=/var/run/squid.pid --with-swapdir=/var/cache/squid
--without-gnutls --enable-build-info --enable-loadable-modules
--enable-removal-policies=lru,heap --disable-epoll --disable-linux-netfilter
--disable-linux-tproxy --disable-translation --disable-arch-native
--mandir=/usr/local/man --infodir=/usr/local/info --disable-wccp
--disable-wccpv2 --enable-ipfw-transparent --with-large-files --disable-htcp
--disable-eui --enable-cachemgr-hostname=ip.proxy-cache.willsz.net
--disable-auth-negotiate

root:/usr/local/src/squid-3.5.26# ./configure `cat squid.configure` > log
Segmentation fault (core dumped)
Segmentation fault (core dumped)
Segmentation fault (core dumped)
Segmentation fault (core dumped)
Segmentation fault (core dumped)
Segmentation fault (core dumped)
Segmentation fault (core dumped)
configure: WARNING: Neither SASL nor SASL2 found
configure: WARNING: Samba smbclient not found in default location.
basic_smb_auth may not work on this machine
configure: WARNING: Neither SASL nor SASL2 found
configure: WARNING: Samba wbinfo not found in default location.
ext_wbinfo_group_acl may not work on this machine
configure: WARNING: cppunit does not appear to be installed. squid does not
require this, but code testing with 'make check' will fail.

root:/usr/local/src/squid-3.5.26# cat log
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... cfgaux/install-sh -c -d
checking for gawk... no
checking for mawk... no
checking for nawk... nawk
checking whether make sets $(MAKE)... yes
checking whether make supports nested variables... yes
checking whether UID '0' is supported by ustar format... yes
checking whether GID '0' is supported by ustar format... yes
checking how to create a ustar tar archive... gnutar
checking whether to enable maintainer-specific portions of Makefiles... no
checking for gcc... gcc
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables... 
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking whether gcc understands -c and -o together... yes
checking for style of include used by make... GNU
checking dependency style of gcc... gcc3
checking for g++... g++
checking whether we are using the GNU C++ compiler... yes
checking whether g++ accepts -g... yes
checking dependency style of g++... gcc3
checking build system type... i386-unknown-freebsd10.3
checking host system type... i386-unknown-freebsd10.3
configure: CPU arch native optimization enabled: no
checking simplified host os... freebsd (version 10.3)
checking what kind of compiler we're using... gcc
checking whether g++ supports C++11 features by default... no
checking whether g++ supports C++11 features with -std=c++11... yes
checking for ranlib... ranlib
checking how to run the C preprocessor... gcc -E
checking whether ln -s works... yes
checking for egrep... /usr/bin/egrep
checking for sh... /bin/sh
checking for false... /usr/bin/false
checking for true... /usr/bin/true
checking for mv... /bin/mv
checking for mkdir... /bin/mkdir
checking for ln... /bin/ln
checking for chmod... /bin/chmod
checking for tr... /usr/bin/tr
checking for rm... /bin/rm
checking for cppunit-config... false
checking for perl... /usr/local/bin/perl
checking for pod2man... /usr/local/bin/pod2man
checking for ar... /usr/bin/ar
checking for linuxdoc... /usr/bin/false
configure: strict error checking enabled: yes
checking whether to use loadable modules... yes
checking how to print strings... printf
checking for a sed that does not truncate output... /usr/bin/sed
checking for fgrep... /usr/bin/fgrep
checking for ld used by gcc... /usr/local/bin/ld
checking if the linker (/usr/local/bin/ld) is GNU ld... yes
checking for BSD- or MS-compatible name lister (nm)... /usr/bin/nm -B
checking the name lister (/usr/bin/nm -B) interface... BSD nm
checking the maximum length of command line arguments... 196608
checking how to convert i386-unknown-freebsd10.3 file names to
i386-unknown-freebsd10.3 format... func_convert_file_noop
checking how to convert i386-unknown-freebsd10.3 file names to toolchain
format... func_convert_file_noop
checking for /usr/local/bin/ld