We have finally been able to get the error log of jya and cryptome.
Assistance in interpreting logs would be appreciated.
Background: late Friday, July 21, service for both sites became
very slow, and there have been repeated outages since then.
Our ISP, Digital Nation, has checked our system several times during the
outages and has stated each time that nothing is wrong except an
overload of hits: our server is underpowered for the load. There is
no evidence of DoS or other attack. One administrator stated that, due to
the volume of hits, he could not access the machine and had to turn it off
and then quickly get inside for review before the hits built up sufficiently
to prevent access.
On Monday, July 23, upon Digital Nation's advice that more CPU power was
needed to handle the load, and that there was no other cause of the problem,
we rented a more powerful server (about 8-fold increase) which should come
online at the end of this week.
Declan's article ran on Friday, July 21, and the hits from it did not
seem to affect the sites. On Saturday an AP story appeared, but it did not
include links to the site. However, Drudge Report picked up the AP story
and provided a munged link to jya.com:
http://jya.com/crypto.htmhttp://jya.com/crypto.htm
Thousands of hits on this non-existent file began to appear in the
error log, and there have now been tens of thousands of them (possibly
hundreds of thousands; no count has been made, and each is
multiplied by Digital Nation's error page with its graphics).
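Since no count has been made, a rough one could be obtained with a few lines of Python such as the sketch below. It assumes the munged request appears in the error log with the fragment "crypto.htmhttp" intact (the server may rewrite the rest of the path), and the function name and log file name are our own placeholders, not anything taken from the logs.

```python
from typing import Iterable

# Assumption: the munged Drudge link shows up in the log with this
# fragment intact, even if the server normalizes the rest of the path.
MUNGED_FRAGMENT = "crypto.htmhttp"


def count_munged_hits(lines: Iterable[str], fragment: str = MUNGED_FRAGMENT) -> int:
    """Count log lines that contain the given fragment."""
    return sum(1 for line in lines if fragment in line)


# Hypothetical usage; "error_log" stands in for the real file name:
#     with open("error_log", errors="replace") as log:
#         print(count_munged_hits(log))
```

The file is read line by line, so a log of this size does not need to fit in memory.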
Late Saturday night a Washington Post article appeared which provided
a link to http://jya.com/crypto.htm. That article later appeared on
a number of popular sites. Later articles in Reuters and the Financial
Times also provided links, and the access log shows folks coming in
from those sites without problems. Still, the Drudge errors were
predominant by far.
The size of the access log for the two sites jumped from ~11MB before
the problem began (May 3 to July 21) to over 113MB in four days, a
ten-fold increase.
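The ten-fold figure compares sizes; the rate of growth is much steeper. A quick worked calculation follows, assuming the ~11MB log began on May 3 and using only the sizes and dates given above. The year is an assumption and does not affect the day count.

```python
from datetime import date

YEAR = 2000  # assumption; May 3 to July 21 is 79 days in any year

# Days covered by the log before the problem began.
days_before = (date(YEAR, 7, 21) - date(YEAR, 5, 3)).days

mb_before = 11        # approximate size on July 21
mb_after = 113        # size four days later ("over 113MB")
days_after = 4

rate_before = mb_before / days_before              # about 0.14 MB/day
rate_after = (mb_after - mb_before) / days_after   # 25.5 MB/day
ratio = rate_after / rate_before                   # roughly 180-fold

print(f"before: {rate_before:.2f} MB/day over {days_before} days")
print(f"after:  {rate_after:.1f} MB/day over {days_after} days")
print(f"growth rate increased about {ratio:.0f}-fold")
```

On these figures the access log is growing at well over a hundred times its earlier daily rate.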
The error log has grown only from 13MB to 15MB since July 21. (By far the
largest cause of previous errors is the pernicious "favicon.ico.")
Soon after the Drudge attack began, the following entry started to appear
in the error log, repeating every few minutes, sometimes every minute
(entries numbered by us for reference):
(1) (32)Broken pipe: accept: (client socket)
This entry had appeared only infrequently previously.
Several hours later entry (2) appeared dozens of times at the
same clock time:
(2) [warn] child process 736 still did not exit, sending a SIGTERM
Followed by several iterations of entry (3) at the same clock time:
(3) [error] child process 628 still did not exit, sending a SIGKILL
And then:
(4) Site site1 has invalid certificate: 4999 Certificate files do not exist.
(5) Site site2 has invalid certificate: 4999 Certificate files do not exist.
(6) [crit] (98)Address already in use: make_sock: could not bind to port 80
(7) [notice] caught SIGTERM, shutting down
(8) Site site1 has invalid certificate: 4999 Certificate files do not exist.
(9) Site site2 has invalid certificate: 4999 Certificate files do not exist.
(10) [notice] Apache/1.3.6 (Unix) mod_perl/1.21 mod_ssl/2.2.8 OpenSSL/0.9.2b configured -- resuming normal operations
This sequence of entries keeps recurring, with shutdowns and restarts
repeating since Saturday, July 22.
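To see how often each kind of entry occurs, and how many restarts there have been, the error log could be tallied with a short Python sketch like the one below. The substrings are copied from the entries quoted above. The labels, function name and file name are our own inventions, and the sketch assumes each entry occupies a single line in the log.

```python
from collections import Counter
from typing import Iterable

# Substrings copied from entries (1)-(10) above; the labels are our own.
PATTERNS = {
    "broken_pipe": "Broken pipe: accept",
    "sigterm_sent": "sending a SIGTERM",
    "sigkill_sent": "sending a SIGKILL",
    "bad_certificate": "has invalid certificate",
    "port_bind_failed": "could not bind to port",
    "shutdown": "caught SIGTERM, shutting down",
    "restart": "resuming normal operations",
}


def tally_entries(lines: Iterable[str]) -> Counter:
    """Count log lines by the first pattern each one matches."""
    counts = Counter()
    for line in lines:
        for label, needle in PATTERNS.items():
            if needle in line:
                counts[label] += 1
                break  # count each line at most once
    return counts


# Hypothetical usage; "error_log" stands in for the real file name:
#     with open("error_log", errors="replace") as log:
#         for label, n in tally_entries(log).most_common():
#             print(f"{n:8d}  {label}")
```

Comparing the "restart" count with the number of outages observed would show whether every outage corresponds to a server restart.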
During the outage period we have been sent frequent automatic messages like
the following:
(11) Over the past fifteen minutes, the CPU has been heavily loaded.
This will result in noticible performace loss. Consider moving some of the
services to other Cobalt servers, or reduce the complexity of the CGI
scripts running on the Cobalt server itself.
1 minute load average: 27.79
5 minute load average: 68.67
15 minute load average: 84.27
(12) Memory on the Cobalt server is heavily used.
The Cobalt server needs more memory than it currently has.
Consider adding more DRAM to the server.
Total memory is: 162376 KB
Used memory is: 161012 KB
Free memory is: 1364 KB
Percent used is: 99
(13) Your server (cob487) is not responding on the port (80) we are
monitoring - please let us know if this is going to be a permanent condition.
If you have a support contract with us, and this is within normal business
hours, feel free to send an e-mail to [EMAIL PROTECTED] or [EMAIL PROTECTED]
regarding the problem you are having.
If you are doing work on your server, you can reply to this message and it
will be noted by our SOC staff. If this is an unexpected problem for you, you
may wish to contact anyone else at your company who might be working on the
server to find out if they are aware of the situation.
This ticket will remain open until the server is back online, is accepting
connections, or you are notified by our SOC staff that the port we are
monitoring has been changed to avoid alarms on our end.
Please let us know if you need anything.
Thanks,
--Server Operations Center
We would appreciate advice on whether these log entries and messages are
consistent with simple overloading or could indicate an attack, even a
presumably accidental attack by Drudge (who has still not answered my
Saturday e-mail to correct the URL).
Thanks very much.