Re: [Sugar-devel] [Systems] git.sugarlabs.org down for unplanned maintenance

2014-04-14 Thread Aleksey Lim
On Mon, Apr 14, 2014 at 09:12:05AM -0400, Bernie Innocenti wrote:
 On 04/12/2014 02:07 AM, Sebastian Silva wrote:
  Here I just got home. Sorry for the inconvenience I might have caused.
  
  Bernie, do you know which log was/is growing out of hand?
 
 Both access.log and node.sugarlabs.org.log. I discarded the first and
 compressed the second (it compresses very well). You can still examine
 it by doing:
 
   xzless access.log-20140411.xz | tail
 
 You'll see lines like this one:
 
 node.sugarlabs.org:80 181.65.159.107 - - [11/Apr/2014:19:51:36 -0400]
 GET /?cmd=subscribe HTTP/1.1 200 232 - python-requests/1.2.1
 CPython/2.7.0 Linux/2.6.35.13_xo1.5-20120508.1139.olpc.eb0c7a8
 
 
 The problem seems to be that laptops retry the connection to
 /context.atom and /feedback.atom quickly. It's probably near the end of
 the file though. Don't try to uncompress the whole file because it's
 over 2GB.

`GET /?cmd=subscribe` SN API calls are intended to be long-lived, to
provide HTML5 Server-Sent Events. If I'm getting it right, Apache is not
expected to handle such long-living connections well, and SN nodes will
eventually be switched to direct HTTP access. But for now, they are
behind an Apache proxy with a 600-second proxy timeout, so the logs
should not grow too fast.
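
For reference, the relevant bit of the proxy setup looks roughly like the
snippet below (a sketch only; the backend address, port and vhost details
are assumptions, not the actual config on our machines):

  <VirtualHost *:80>
      ServerName node.sugarlabs.org
      ProxyPreserveHost On
      # Long-lived /?cmd=subscribe (Server-Sent Events) connections are
      # cut by Apache after 600 s; clients then reconnect, which is what
      # shows up as the repeated GETs in the log.
      ProxyPass        / http://127.0.0.1:8000/ timeout=600
      ProxyPassReverse / http://127.0.0.1:8000/
  </VirtualHost>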

-- 
Aleksey
___
Sugar-devel mailing list
Sugar-devel@lists.sugarlabs.org
http://lists.sugarlabs.org/listinfo/sugar-devel


Re: [Sugar-devel] [Systems] git.sugarlabs.org down for unplanned maintenance

2014-04-11 Thread Bernie Innocenti
Ok, we're back in business, with a snappier database too.

Today I felt lucky so I also replaced the aging MySQL 5.1 with a shiny
new MariaDB 10. Don't worry, MariaDB should be 100% backwards
compatible, and we do daily dumps in case anything goes wrong.
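
For the curious, the daily dump is nothing fancy; it amounts to a cron
job along these lines (paths and options here are illustrative, not the
exact job on jita):

  # e.g. /etc/cron.daily/mysql-backup
  mysqldump --all-databases --single-transaction \
      | gzip > /var/backups/mysql/all-$(date +%F).sql.gz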

I also deleted several *millions* of records from the database for old
login sessions and logs of clone actions. You may have to log in again,
but Gitorious feels a lot faster now.

We still need to keep an eye on those evil XOs that keep reconnecting to
network.sugarlabs.org. Icarito, can you look at implementing some form
of exponential backoff? If the fix can't be deployed within a few days,
we should defend ourselves with iptables rules or at least stop logging
every connection.
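
Something along these lines on the client side would do the trick (a
rough sketch only; the URL handling and the print() stand-in for real
event handling are placeholders, not the actual Sugar Network client
code):

  import random
  import time

  import requests

  def subscribe_with_backoff(url, max_delay=600):
      """Keep a subscribe connection open, backing off on failure."""
      delay = 1
      while True:
          try:
              # Long-lived Server-Sent Events connection; stream=True
              # keeps it open instead of buffering the whole response.
              response = requests.get(url, params={'cmd': 'subscribe'},
                                      stream=True, timeout=600)
              for line in response.iter_lines():
                  if line:
                      print(line)  # hand off to the real event handler
              delay = 1  # server closed the stream cleanly, reset
          except requests.RequestException:
              pass
          # Exponential backoff with jitter, capped at max_delay, so a
          # whole school of XOs does not hammer the server in lockstep.
          time.sleep(delay + random.uniform(0, delay))
          delay = min(delay * 2, max_delay)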

As always, please notify us if anything malfunctions. Note that alsroot
said he would be offline until Apr 13.


On 04/11/2014 08:51 PM, Bernie Innocenti wrote:
 I was notified that git.sugarlabs.org was showing errors.
 
 After some head scratching I realized that the root filesystem on jita was
 full. I looked around and found giant request logs containing millions
 of requests apparently originating from XOs located in Peru.
 
 We've been DDOSed by our own creature :-)
 
 Anyway, the machine also had a giant, very fragmented mysql database
 that I'm currently cleaning up. Gitorious will be back online in less
 than 1 hour. Contact me on IRC if this is blocking your work, I can
 postpone the maintenance.

-- 
Bernie Innocenti
Sugar Labs Infrastructure Team
http://wiki.sugarlabs.org/go/Infrastructure_Team
___
Sugar-devel mailing list
Sugar-devel@lists.sugarlabs.org
http://lists.sugarlabs.org/listinfo/sugar-devel