Re: Number of SpamAssassin installations

2020-01-31 Thread Henrik K
On Sat, Jun 29, 2019 at 12:07:29AM +0300, Henrik K wrote:
>
> Here's sa-update.spamassassin.org, about a month's duration, weight=10 should
> guarantee a quite accurate number for that.
> 
> cat sa-update-access_log* | egrep 'tar\.gz .*"(curl|Wget|fetch|libwww|sa-update)' | awk '{print $1}' | sort -u | wc -l
> 1138747
> (unique C-classes from those: 352857)
> 
> Here are some interesting User-Agents from those that use LWP, unique IP count:
> 
>  128978 sa-update/svn917659/3.3.1
>   88423 sa-update/svn917659/3.3.2
>   20717 sa-update/svn1652181/3.4.1
>    7315 sa-update/3.4.2 / svn1840377/3.4.2
>    6282 sa-update/svn1475932/3.4.0
>    1553 sa-update/svnunknown/3.4.2
>
> Amazing to see some 3.1 there too. Hopefully most are just some useless
> boxes with cron left running.

Here are the most recent stats for the last ~month.  They might be interesting
regarding the latest SHA-1 debacle..

1127974 unique IPs
355623 unique C-classes

Below are some of the User-Agents processed.  The list is made from unique
IP/User-Agent pairs to better reflect the number of users.

It's nice to see 3.4.3 on top, yet worrying to see all those unpatched
Red Hat derivatives..

 296712 sa-update/3.4.3 / svn1869639/3.4.3
 203254 curl/7.29.0 redhat/centos7?
 117014 curl/7.19.7 redhat/centos6
 114180 sa-update/svn917659/3.3.1 redhat/centos6?
  82072 sa-update/svn917659/3.3.2 redhat/centos6 fedora/atomic?
  14047 sa-update/svn1652181/3.4.1
  13311 sa-update/3.4.2 / svn1840377/3.4.2
   7038 curl/7.15.5 redhat/centos5?
   5827 sa-update/svn1475932/3.4.0
   2241 sa-update/svnunknown/3.4.2
    594 sa-update/3.4.4 / svn1869639/3.4.4
    370 sa-update/svn507100/3.1.8
    277 sa-update/svn897929/3.3.0
    232 sa-update/svn917659/3.4.2
    211 sa-update/3.4.4-rc1 / svn1869639/3.4.4
    190 sa-update/svnunknown/3.4.3
    144 sa-update/svn607589/3.2.4
...dropped rest
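
A per-User-Agent table built from unique IP/User-Agent pairs, as described above, could be produced along these lines. This is only a sketch, not the command actually used for the list: it assumes the Apache/nginx "combined" log format, in which the User-Agent is the third double-quoted field, and the function name `ua_counts` is made up for illustration.

```shell
# Count unique IP/User-Agent pairs per User-Agent.
# Assumption: "combined" log format, so splitting a line on double quotes
# puts the client IP at the start of field 1 and the User-Agent in field 6.
ua_counts() {
    cat "$@" |
        grep -E 'tar\.gz ' |
        awk -F'"' '{ split($1, a, " "); print a[1] "\t" $6 }' |
        sort -u |
        cut -f2 |
        sort |
        uniq -c |
        sort -rn
}
```

Usage would be something like `ua_counts sa-update-access_log*`. Deduplicating on the IP/User-Agent pair before counting means a host that downloads every day is counted once per User-Agent it presents, not once per request.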



Re: Number of SpamAssassin installations

2019-06-28 Thread Henrik K
On Fri, Jun 28, 2019 at 10:09:56AM +0300, Henrik K wrote:
> On Thu, Jun 27, 2019 at 06:43:42PM -0400, Kevin A. McGrail wrote:
> > On 6/27/2019 4:40 AM, Jari Fredriksson wrote:
> > >
> > >> Kevin A. McGrail wrote on 26.6.2019 at 19.09:
> > >>
> > >> Agreed.  I think David just had a simple command he ran on his logs or
> > >> the project's mirror.  I can get you access to the mirror the project
> > >> runs if you want to look at it.
> > >>
> > >> On 6/26/2019 8:51 AM, Henrik K wrote:
> > >>> It's just simple awk/grep, no need for fancy scripts.. :-)
> > >>>
> > >>> One month from single mirror should be enough to get a ballpark, weight can be
> > >>> calculated to it.
> > >>>
> > >>> On Wed, Jun 26, 2019 at 08:41:32AM -0400, Kevin A. McGrail wrote:
> > >>>> I seem to remember that yes, David Jones wrote a log parser for some
> > >>>> information on this but there is no centralization of logs.  Each mirror
> > >>>> would have to run and report.
> > >>>>
> > >>>> On 6/25/2019 8:33 AM, Henrik K wrote:
> > >>>>> Has someone calculated weekly/monthly unique IPs seen in some mirror logs?
> > >>>>> I'm curious how many active SA installations there actually are?  I realize
> > >>>>> it's just a ballpark figure..
> > >
> > > Here is mine for 6 last months. This is sa-update.bitwell.fi
> > >
> > > jarif@gauntlet ~ $ wc -l /tmp/unique-combined.txt 
> > > 1700178 /tmp/unique-combined.txt
> > >
> > > Br. jarif
> > 
> > What command did you run to get that so I run the same and I'll get my
> > numbers.
> 
> I would check tar.gz downloads and correct user-agents to ignore bots etc.
> 
> zcat access*log* | egrep 'tar\.gz .*"(curl|Wget|fetch|libwww)' | awk '{print $1}' | sort -u | wc -l

Here's sa-update.spamassassin.org, about a month's duration, weight=10 should
guarantee a quite accurate number for that.

cat sa-update-access_log* | egrep 'tar\.gz .*"(curl|Wget|fetch|libwww|sa-update)' | awk '{print $1}' | sort -u | wc -l
1138747
(unique C-classes from those: 352857)
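
The "unique C-classes" figure can be derived from the same list of client IPs by truncating each IPv4 address to its first three octets (its /24 network). A minimal sketch, assuming one address per line on stdin; the function name `count_c_classes` is hypothetical, and this is not necessarily how the number above was computed:

```shell
# Count unique IPv4 /24 networks ("C-classes") among client addresses.
# Reads one IP address per line on stdin. Lines that do not split into
# exactly four dot-separated parts (e.g. IPv6 addresses) are skipped.
count_c_classes() {
    awk -F. 'NF == 4 { print $1 "." $2 "." $3 }' | sort -u | wc -l | tr -d ' '
}
```

It could be fed from the pipeline above by replacing the final `sort -u | wc -l` with `sort -u | count_c_classes`.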

Here are some interesting User-Agents from those that use LWP, unique IP count:

 128978 sa-update/svn917659/3.3.1
  88423 sa-update/svn917659/3.3.2
  20717 sa-update/svn1652181/3.4.1
   7315 sa-update/3.4.2 / svn1840377/3.4.2
   6282 sa-update/svn1475932/3.4.0
   1553 sa-update/svnunknown/3.4.2
    411 sa-update/svn507100/3.1.8
    346 sa-update/svn897929/3.3.0
    206 sa-update/svn917659/3.4.2
     68 sa-update/svn540384/3.2.1
     58 sa-update/svn607589/3.2.4
     37 sa-update/svn917659/3.4.0
     32 sa-update/svn607589/3.2.5
     20 sa-update/svn910278/3.4.0
      8 sa-update/svn540384/3.2.3
      7 sa-update/svn607589/3.3.2
      5 sa-update/svn540384/3.3.2
      4 sa-update/3.4.2 / svn1854476/3.4.2
      3 sa-update/svn540384/3.4.1
      3 sa-update/svn454083/3.1.7
      2 sa-update/svn897929/3.3.2
      2 sa-update/svn882245/3.3.0
      2 sa-update/svn540384/3.3.1
      1 sa-update/svn815500/3.3.0
      1 sa-update/svn540384/3.4.0
      1 sa-update/svn540384/3.3.0
      1 sa-update/svn540384/3.2.2
      1 sa-update/svn523403/3.2.0
      1 sa-update/svn1028810/3.4.0
      1 sa-update/4.0.0-r1854477 / svn1861181/4.0.0
      1 sa-update/4.0.0-r1854477 / svn1860877/4.0.0

Amazing to see some 3.1 there too. Hopefully most are just some useless
boxes with cron left running.



Re: Number of SpamAssassin installations

2019-06-28 Thread Jari Fredriksson



> Jari Fredriksson wrote on 28.6.2019 at 10.18:
> 
> 
> 
>> Kevin A. McGrail wrote on 28.6.2019 at 1.43:
>> 
>> On 6/27/2019 4:40 AM, Jari Fredriksson wrote:
>>> 
>>>> Kevin A. McGrail wrote on 26.6.2019 at 19.09:
>>>> 
>>>> Agreed.  I think David just had a simple command he ran on his logs or
>>>> the project's mirror.  I can get you access to the mirror the project
>>>> runs if you want to look at it.
>>>> 
>>>> On 6/26/2019 8:51 AM, Henrik K wrote:
>>>>> It's just simple awk/grep, no need for fancy scripts.. :-)
>>>>> 
>>>>> One month from single mirror should be enough to get a ballpark, weight can be
>>>>> calculated to it.
>>>>> 
>>>>> On Wed, Jun 26, 2019 at 08:41:32AM -0400, Kevin A. McGrail wrote:
>>>>>> I seem to remember that yes, David Jones wrote a log parser for some
>>>>>> information on this but there is no centralization of logs.  Each mirror
>>>>>> would have to run and report.
>>>>>> 
>>>>>> On 6/25/2019 8:33 AM, Henrik K wrote:
>>>>>>> Has someone calculated weekly/monthly unique IPs seen in some mirror logs?
>>>>>>> I'm curious how many active SA installations there actually are?  I realize
>>>>>>> it's just a ballpark figure..
>>> 
>>> Here is mine for 6 last months. This is sa-update.bitwell.fi
>>> 
>>> jarif@gauntlet ~ $ wc -l /tmp/unique-combined.txt 
>>> 1700178 /tmp/unique-combined.txt
>>> 
>>> Br. jarif
>> 
>> What command did you run to get that so I run the same and I'll get my
>> numbers.
> 
> I have two machines for this and the command was different on each of them, as
> the field holding the IP varies between their logs. But this one is more stock:
> 
> # grep -w "pound:" /var/log/messages*|awk '{print $7;}'|sort|uniq >/tmp/unique-as-updaters-www.txt
> 
> br. jarif

I use the pound reverse proxy in front of my nginx, so I took the data from its
log. The command is of course different for other logs, such as an nginx or
Apache access log.





Re: Number of SpamAssassin installations

2019-06-28 Thread Jari Fredriksson



> Kevin A. McGrail wrote on 28.6.2019 at 1.43:
> 
> On 6/27/2019 4:40 AM, Jari Fredriksson wrote:
>> 
>>> Kevin A. McGrail wrote on 26.6.2019 at 19.09:
>>> 
>>> Agreed.  I think David just had a simple command he ran on his logs or
>>> the project's mirror.  I can get you access to the mirror the project
>>> runs if you want to look at it.
>>> 
>>> On 6/26/2019 8:51 AM, Henrik K wrote:
>>>> It's just simple awk/grep, no need for fancy scripts.. :-)
>>>> 
>>>> One month from single mirror should be enough to get a ballpark, weight can be
>>>> calculated to it.
>>>> 
>>>> On Wed, Jun 26, 2019 at 08:41:32AM -0400, Kevin A. McGrail wrote:
>>>>> I seem to remember that yes, David Jones wrote a log parser for some
>>>>> information on this but there is no centralization of logs.  Each mirror
>>>>> would have to run and report.
>>>>> 
>>>>> On 6/25/2019 8:33 AM, Henrik K wrote:
>>>>>> Has someone calculated weekly/monthly unique IPs seen in some mirror logs?
>>>>>> I'm curious how many active SA installations there actually are?  I realize
>>>>>> it's just a ballpark figure..
>> 
>> Here is mine for 6 last months. This is sa-update.bitwell.fi
>> 
>> jarif@gauntlet ~ $ wc -l /tmp/unique-combined.txt 
>> 1700178 /tmp/unique-combined.txt
>> 
>> Br. jarif
> 
> What command did you run to get that so I run the same and I'll get my
> numbers.

I have two machines for this and the command was different on each of them, as
the field holding the IP varies between their logs. But this one is more stock:

# grep -w "pound:" /var/log/messages*|awk '{print $7;}'|sort|uniq >/tmp/unique-as-updaters-www.txt
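
Since the per-machine lists are produced separately, a combined count needs the duplicates across machines removed as well. A minimal sketch of that merge step, assuming each input file holds one client IP per line; the function name `count_combined` and the file names in the usage line are hypothetical, and this is not necessarily how `/tmp/unique-combined.txt` was built:

```shell
# Merge per-machine lists of client IPs and count the distinct addresses.
# Each argument is a file with one IP address per line.
count_combined() {
    sort -u "$@" | wc -l | tr -d ' '
}
```

For example: `count_combined /tmp/unique-machine-a.txt /tmp/unique-machine-b.txt`. Simply adding the two per-machine counts would overstate the total, because a client that reached both machines would be counted twice.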

br. jarif 

Re: Number of SpamAssassin installations

2019-06-28 Thread Henrik K
On Thu, Jun 27, 2019 at 06:43:42PM -0400, Kevin A. McGrail wrote:
> On 6/27/2019 4:40 AM, Jari Fredriksson wrote:
> >
> >> Kevin A. McGrail wrote on 26.6.2019 at 19.09:
> >>
> >> Agreed.  I think David just had a simple command he ran on his logs or
> >> the project's mirror.  I can get you access to the mirror the project
> >> runs if you want to look at it.
> >>
> >> On 6/26/2019 8:51 AM, Henrik K wrote:
> >>> It's just simple awk/grep, no need for fancy scripts.. :-)
> >>>
> >>> One month from single mirror should be enough to get a ballpark, weight can be
> >>> calculated to it.
> >>>
> >>> On Wed, Jun 26, 2019 at 08:41:32AM -0400, Kevin A. McGrail wrote:
> >>>> I seem to remember that yes, David Jones wrote a log parser for some
> >>>> information on this but there is no centralization of logs.  Each mirror
> >>>> would have to run and report.
> >>>>
> >>>> On 6/25/2019 8:33 AM, Henrik K wrote:
> >>>>> Has someone calculated weekly/monthly unique IPs seen in some mirror logs?
> >>>>> I'm curious how many active SA installations there actually are?  I realize
> >>>>> it's just a ballpark figure..
> >
> > Here is mine for 6 last months. This is sa-update.bitwell.fi
> >
> > jarif@gauntlet ~ $ wc -l /tmp/unique-combined.txt 
> > 1700178 /tmp/unique-combined.txt
> >
> > Br. jarif
> 
> What command did you run to get that so I run the same and I'll get my
> numbers.

I would check tar.gz downloads and correct user-agents to ignore bots etc.

zcat access*log* | egrep 'tar\.gz .*"(curl|Wget|fetch|libwww)' | awk '{print $1}' | sort -u | wc -l



Re: Number of SpamAssassin installations

2019-06-27 Thread Kevin A. McGrail
On 6/27/2019 4:40 AM, Jari Fredriksson wrote:
>
>> Kevin A. McGrail wrote on 26.6.2019 at 19.09:
>>
>> Agreed.  I think David just had a simple command he ran on his logs or
>> the project's mirror.  I can get you access to the mirror the project
>> runs if you want to look at it.
>>
>> On 6/26/2019 8:51 AM, Henrik K wrote:
>>> It's just simple awk/grep, no need for fancy scripts.. :-)
>>>
>>> One month from single mirror should be enough to get a ballpark, weight can be
>>> calculated to it.
>>>
>>> On Wed, Jun 26, 2019 at 08:41:32AM -0400, Kevin A. McGrail wrote:
>>>> I seem to remember that yes, David Jones wrote a log parser for some
>>>> information on this but there is no centralization of logs.  Each mirror
>>>> would have to run and report.
>>>>
>>>> On 6/25/2019 8:33 AM, Henrik K wrote:
>>>>> Has someone calculated weekly/monthly unique IPs seen in some mirror logs?
>>>>> I'm curious how many active SA installations there actually are?  I realize
>>>>> it's just a ballpark figure..
>
> Here is mine for 6 last months. This is sa-update.bitwell.fi
>
> jarif@gauntlet ~ $ wc -l /tmp/unique-combined.txt 
> 1700178 /tmp/unique-combined.txt
>
> Br. jarif

What command did you run to get that, so I can run the same and get my
numbers?

-- 
Kevin A. McGrail
Member, Apache Software Foundation
Chair Emeritus Apache SpamAssassin Project
https://www.linkedin.com/in/kmcgrail - 703.798.0171



Re: Number of SpamAssassin installations

2019-06-27 Thread Jari Fredriksson



> Kevin A. McGrail wrote on 26.6.2019 at 19.09:
> 
> Agreed.  I think David just had a simple command he ran on his logs or
> the project's mirror.  I can get you access to the mirror the project
> runs if you want to look at it.
> 
> On 6/26/2019 8:51 AM, Henrik K wrote:
>> It's just simple awk/grep, no need for fancy scripts.. :-)
>> 
>> One month from single mirror should be enough to get a ballpark, weight can be
>> calculated to it.
>> 
>> On Wed, Jun 26, 2019 at 08:41:32AM -0400, Kevin A. McGrail wrote:
>>> I seem to remember that yes, David Jones wrote a log parser for some
>>> information on this but there is no centralization of logs.  Each mirror
>>> would have to run and report.
>>> 
>>> On 6/25/2019 8:33 AM, Henrik K wrote:
>>>> Has someone calculated weekly/monthly unique IPs seen in some mirror logs?
>>>> I'm curious how many active SA installations there actually are?  I realize
>>>> it's just a ballpark figure..
>>>>

Here is mine for 6 last months. This is sa-update.bitwell.fi

jarif@gauntlet ~ $ wc -l /tmp/unique-combined.txt 
1700178 /tmp/unique-combined.txt

Br. jarif

Re: Number of SpamAssassin installations

2019-06-26 Thread Kevin A. McGrail
Agreed.  I think David just had a simple command he ran on his logs or
the project's mirror.  I can get you access to the mirror the project
runs if you want to look at it.

On 6/26/2019 8:51 AM, Henrik K wrote:
> It's just simple awk/grep, no need for fancy scripts.. :-)
>
> One month from single mirror should be enough to get a ballpark, weight can be
> calculated to it.
>
> On Wed, Jun 26, 2019 at 08:41:32AM -0400, Kevin A. McGrail wrote:
>> I seem to remember that yes, David Jones wrote a log parser for some
>> information on this but there is no centralization of logs.  Each mirror
>> would have to run and report.
>>
>> On 6/25/2019 8:33 AM, Henrik K wrote:
>>> Has someone calculated weekly/monthly unique IPs seen in some mirror logs? 
>>> I'm curious how many active SA installations there actually are?  I realize
>>> it's just a ballpark figure..
>>>


-- 
Kevin A. McGrail
Member, Apache Software Foundation
Chair Emeritus Apache SpamAssassin Project
https://www.linkedin.com/in/kmcgrail - 703.798.0171



Re: Number of SpamAssassin installations

2019-06-26 Thread Henrik K


It's just simple awk/grep, no need for fancy scripts.. :-)

One month from single mirror should be enough to get a ballpark, weight can be
calculated to it.
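
The thread does not spell out the weight calculation, so the following is only one possible way to reason about it. It assumes that every download picks a mirror independently at random in proportion to its weight, and that the sum of all mirror weights is known; both are assumptions, and the function name `mirror_coverage` is hypothetical. Under that model, the chance that a given client shows up at least once in one mirror's logs after n downloads is 1 - (1 - w/W)^n:

```shell
# Probability that one client contacts a given mirror at least once.
#   $1 = weight of this mirror
#   $2 = sum of the weights of all mirrors (assumed to be known)
#   $3 = number of downloads the client makes during the period
# Assumption: each download chooses a mirror independently, weighted by weight.
mirror_coverage() {
    awk -v w="$1" -v total="$2" -v n="$3" \
        'BEGIN { printf "%.4f\n", 1 - (1 - w / total) ^ n }'
}
```

For instance, `mirror_coverage 1 2 2` prints `0.7500`: a mirror carrying half of the total weight would be expected to see three quarters of the clients that download twice. Dividing the unique IP count from one mirror by this probability would give a rough estimate of the total, with all the caveats of the model.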

On Wed, Jun 26, 2019 at 08:41:32AM -0400, Kevin A. McGrail wrote:
> I seem to remember that yes, David Jones wrote a log parser for some
> information on this but there is no centralization of logs.  Each mirror
> would have to run and report.
> 
> On 6/25/2019 8:33 AM, Henrik K wrote:
> > Has someone calculated weekly/monthly unique IPs seen in some mirror logs? 
> > I'm curious how many active SA installations there actually are?  I realize
> > it's just a ballpark figure..
> >
> 


Re: Number of SpamAssassin installations

2019-06-26 Thread Kevin A. McGrail
I seem to remember that yes, David Jones wrote a log parser for some
information on this but there is no centralization of logs.  Each mirror
would have to run and report.

On 6/25/2019 8:33 AM, Henrik K wrote:
> Has someone calculated weekly/monthly unique IPs seen in some mirror logs? 
> I'm curious how many active SA installations there actually are?  I realize
> it's just a ballpark figure..
>

-- 
Kevin A. McGrail
Member, Apache Software Foundation
Chair Emeritus Apache SpamAssassin Project
https://www.linkedin.com/in/kmcgrail - 703.798.0171



Number of SpamAssassin installations

2019-06-25 Thread Henrik K


Has someone calculated weekly/monthly unique IPs seen in some mirror logs? 
I'm curious how many active SA installations there actually are?  I realize
it's just a ballpark figure..