Re: Release Announcements

2012-07-23 Thread J.H.
On 07/23/2012 02:22 AM, Borislav Petkov wrote:
> On Sun, Jul 22, 2012 at 12:08:34PM -0400, Shea Levy wrote:
>> The linux-kernel-announce doesn't seem to have had any traffic
>> since 3.1-rc4 (maybe due to the kernel.org break-in?). Is there a
>> recommended way to get email news of kernel releases without being
>> subscribed to the main kernel list?
> 
> Let's CC some more people about this.
> 

Follow the respective gitweb RSS feeds?  I'll have to do some digging to
figure out where those e-mails got generated from.  I do want to say
that that *SHOULD* be working, but if the e-mails aren't showing up in
the archives that may have gotten broken somewhere.

- John 'Warthog9' Hawley

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[KORG] Downtime - Master & other servers

2007-09-23 Thread J.H.
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Good Evening,

Just a heads up master.kernel.org (hera) will be rebooted at roughly
1900 UTC (1200 PDT), for a kernel update.  I apologize for the short
notice as this is an unexpected downtime.  Estimated downtime should be
less than five minutes.

As always if you have questions, comments or concerns please e-mail
[EMAIL PROTECTED]

- - John 'Warthog9' Hawley
Kernel.org Admin
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.7 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org

iD8DBQFG9iop/E3kyWU9dicRAsO6AJ0QTswAoRYRaIIZttDObhR7nWjoBQCfXKxw
oXH3SD4ci4eCV7uor7xDyEw=
=vxiW
-END PGP SIGNATURE-


Re: kernel.org needs gitweb help

2007-07-14 Thread J.H.
There's a todo list in the back of my head; I will try to write it up
this weekend and get it posted.

- John 'Warthog9' Hawley

On Sat, 2007-07-14 at 01:12 -0700, Junio C Hamano wrote:
> "H. Peter Anvin" <[EMAIL PROTECTED]> writes:
> 
> > A lot of people have asked me if there is anything they can do to help
> > out kernel.org.  At this point, the number one thing anyone could do to
> > help, and which would be a reasonably self-contained project, would be
> > to help maintain our fork of gitweb:
> >
> > http://git.kernel.org/?p=git/warthog9/gitweb.git;a=summary
> >
> > We really need the caching version of gitweb, but it does have a number
> > of problems, including the non-working tarball generator.
> 
> Is there an issues list for the forked gitweb somewhere, or
> would the first step of people who would want to help be to
> build such a list?


[KORG] Zeus1 Downtime

2007-03-06 Thread J.H.
Just a heads up to everyone,

Zeus1 (one of the frontend machines) will be offline for an extended
period starting late on March 6th, 2007 (PST).  This is to make some
array changes (moving from raid5 to raid6) and to flip the file system
on the mirrors to xfs.  There is no ETA on when zeus1 will come back up
as it will need to resync the entire mirrors after the changes.  Zeus2
will still be online and serving all of the public kernel.org traffic in
the interim, so please be kind to it.

If you have any questions, comments or concerns please don't hesitate to
get ahold of me.

- John 'Warthog9' Hawley
Kernel.org Admin



Re: [KORG] Re: kernel.org lies about latest -mm kernel

2007-01-08 Thread J.H.
On Tue, 2007-01-09 at 08:01 +0100, Jean Delvare wrote:
> Hi JH,
> 
> On Mon, 08 Jan 2007 13:33:04 -0800, J.H. wrote:
> > On Mon, 2007-01-08 at 22:20 +0100, Jean Delvare wrote:
> > > * Drop the bandwidth graphs. Most visitors certainly do not care, and
> > > their presence generates traffic on all web servers regardless of the
> > > one the visitor is using, as each graph is generated by the respective
> > > server. If you really like these graphs, just move them to a separate
> > > page for people who want to watch them. As far as I am concerned, I
> > > find them rather confusing and uninformative - from a quick look you
> > > just can't tell if the servers are loaded or not, you have to look at
> > > the numbers, so what's the point of drawing a graph...
> > 
> > While I agree that most users don't care, they are useful.  If someone
> 
> So moving them to a separate page would make sense.

Not really.

> 
> > notices that 1 has an incredibly high load and is moving lots of traffic in
> > comparison to 2, then they can manually redirect to 2 for better &
> > faster service on their own.  Since these images aren't particularly big
> 
> Unfortunately the images actually fail to present this information to
> the visitor clearly. One problem is the time range displayed. 17
> minutes is either too much (hardly better than an instant value, but
> harder to read) or not enough (you can't really see the trend.) With
> stats on the last 24 hours, people could see the daily usage curve and
> schedule their rsyncs at times of lesser load, for example, if they see
> a daily pattern in the load.

They are useful, even if they are more or less a snapshot in time, as they
at least give people a vague idea of what's going on.  So instead of
having to e-mail the admins and say 'my download is slow' they can at
least glance at the graphs and possibly realize 'Oh, it's release day
and kernel.org is moving close to 2 Gbps between the two machines.'

> 
> Another problem is the fact that the vertical scales are dynamically
> chosen, and thus different between both servers, making it impossible to
> quickly compare the bandwidth usage. If the bandwidth usage on both
> servers is stable, both images will look the same, even though one
> server might be overloaded and the other one underused. The user also
> can't compare from one visit to the next, the graphs look essentially
> the same each time, regardless of the actual bandwidth use. So, if you
> really want people to use these graphs to make decisions and help
> balance the load better, you have to use fixed scales.
> 
> I also notice that the graphs show primarily the bandwidth, while what
> seems to matter is the server load.
> 
> > they cache just fine and it's not that big of a deal, and there are much
> > longer poles in the tent right now.
> 
> The images are being regenerated every other minute or so, so I doubt
> they can actually be cached.
> 

Considering how many times the front page of kernel.org is viewed, yes
they are cached and sitting in ram on the kernel.org boxes.
Realistically - we are arguing over something that barely even registers
as a blip within the entirety of the load on kernel.org, and we have
bigger things to worry about than a restructuring of our front page when
it won't greatly affect our loads.

- John 'Warthog9'



Re: [KORG] Re: kernel.org lies about latest -mm kernel

2007-01-08 Thread J.H.
On Mon, 2007-01-08 at 22:20 +0100, Jean Delvare wrote:
> Hi JH,
> 
> On Sat, 16 Dec 2006 11:30:34 -0800, J.H. wrote:
> > The root cause boils down to with git, gitweb and the normal mirroring
> > on the frontend machines our basic working set no longer stays resident
> > in memory, which is forcing more and more to actively go to disk causing
> > a much higher I/O load.  You have the added problem that one of the
> > frontend machines is getting hit harder than the other due to several
> > factors: various DNS servers not round robining, people explicitly
> > hitting [git|mirrors|www|etc]1 instead of 2 for whatever reason and
> 
> I am trying to be a good citizen by explicitly asking for
> www2.kernel.org, unfortunately I notice that many links on the main
> page point to www.kernel.org rather than www2.kernel.org. Check the
> location, patchtype, full source, patch, view patch, and changeset
> links for example. Fixing these links would let people really use www2
> if they want to, that might help.

True - however if you look at the underlying link for those you'll
notice that most of the links will continue to use www2 instead of www.
The ones that explicitly point to www probably have a good reason for
doing so, but I'll have to check on that.  Regardless the kernel.org
webpages need some work and it's on my todo list (maybe I should post
that somewhere...)

> 
> BTW, I'm no DNS expert, but isn't it possible to favor one host in the
> round robin mechanism? E.g. by listing the server 2 twice, so that it
> gets 2/3 of the load? This could also help if server 1 otherwise gets
> more load.

Could, but the bigger problem seems to be people explicitly pointing
rsync at 1 instead of the generic name or 2.  Beyond that traffic seems
to distribute as we are expecting.
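
The weighting idea quoted above can be illustrated with a minimal sketch.
It assumes an idealized resolver that hands out the listed records in
strict rotation, which real resolvers and caches do not guarantee.  It
also assumes server 2 is given a second address rather than the same
record listed twice, since identical records in one record set are
normally collapsed.  The addresses are documentation placeholders, not
kernel.org's, and the function name is hypothetical.

```python
from collections import Counter
from itertools import cycle, islice

# Hypothetical record set: one address for server 1, two for server 2.
# All addresses are placeholders from the documentation range.
records = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]
owner = {
    "192.0.2.1": "server1",
    "192.0.2.2": "server2",
    "192.0.2.3": "server2",
}

def share_of_lookups(records, lookups):
    """Count which server each lookup lands on under strict rotation."""
    answers = islice(cycle(records), lookups)
    return Counter(owner[address] for address in answers)

counts = share_of_lookups(records, 3000)
print(counts["server1"], counts["server2"])  # 1000 2000
```

Under these assumptions server 2 receives two thirds of the lookups.
None of this affects clients that name host 1 explicitly, which is the
case described in the reply above.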

> 
> > So we know the problem is there, and we are working on it - we are
> > getting e-mails about it if not daily then every other day or so.  If
> > there are suggestions we are willing to hear them - but the general
> > feeling with the admins is that we are probably hitting the biggest
> > problems already.
> 
> I have a few suggestions although I realize that the other things
> you're working on are likely to be much more helpful:
> 
> * Shorten the www.kernel.org main page. I guess that 99% of the hits on
> this page are by people who just want to know the latest versions, and
> possibly download a patch or access Linus' git tree through gitweb. All
> the rest could be moved to a separate page, or if you think it's
> better to keep all the general info on the main page, move the array
> with the versions to a separate page, which developers can bookmark.
> Splitting the dynamic content (top) from the essentially static content
> (bottom) of this page should help with caching, BTW.

The frontpage itself caches pretty nicely and the upper 'dynamic'
content isn't constantly being generated on every page request, so by and
large this caches and we don't have any real issue with it.

> 
> * Drop the bandwidth graphs. Most visitors certainly do not care, and
> their presence generates traffic on all web servers regardless of the
> one the visitor is using, as each graph is generated by the respective
> server. If you really like these graphs, just move them to a separate
> page for people who want to watch them. As far as I am concerned, I
> find them rather confusing and uninformative - from a quick look you
> just can't tell if the servers are loaded or not, you have to look at
> the numbers, so what's the point of drawing a graph...

While I agree that most users don't care, they are useful.  If someone
notices that 1 has an incredibly high load and is moving lots of traffic in
comparison to 2, then they can manually redirect to 2 for better &
faster service on their own.  Since these images aren't particularly big
they cache just fine and it's not that big of a deal, and there are much
longer poles in the tent right now.

> 
> Of course the interest of these proposals directly depends on how much
> the www.kernel.org/index page accounts in the total load of the servers.
> 

Honestly - negligible at best.  We have bigger issues from trying to
service 200 separate rsync processes on top of http, ftp, git, gitweb,
etc. than worrying about a couple of small, 90% static pages.

- John 'Warthog9' Hawley
Kernel.org Admin



Re: How git affects kernel.org performance

2007-01-07 Thread J.H.
With my gitweb caching changes this isn't as big of a deal as the front
page is only generated once every 10 minutes or so (and with the changes
I'm working on today that timeout will be variable)
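
As a rough illustration of time-based caching with a configurable
timeout, here is a minimal sketch.  It is not the gitweb caching code
(gitweb is written in Perl); the class and function names are
hypothetical, and the 600 second value simply mirrors the "10 minutes or
so" mentioned above.

```python
import time

class TimedCache:
    """Regenerate a value only when the cached copy is older than ttl seconds."""

    def __init__(self, generate, ttl_seconds, clock=time.monotonic):
        self._generate = generate
        self._ttl = ttl_seconds      # the variable timeout
        self._clock = clock
        self._value = None
        self._stamp = None           # time of the last regeneration

    def get(self):
        now = self._clock()
        if self._stamp is None or now - self._stamp >= self._ttl:
            self._value = self._generate()
            self._stamp = now
        return self._value

# A fake clock makes the behaviour easy to follow without waiting.
fake_now = [0.0]
calls = []

def render_front_page():
    """Stand-in for the expensive page generation."""
    calls.append(fake_now[0])
    return f"front page built at t={fake_now[0]:.0f}s"

cache = TimedCache(render_front_page, ttl_seconds=600, clock=lambda: fake_now[0])
first = cache.get()      # generated
fake_now[0] = 300.0
second = cache.get()     # still fresh, served from the cache
fake_now[0] = 600.0
third = cache.get()      # timeout reached, generated again
```

Three requests trigger only two generations here; every request inside
the timeout window costs a timestamp comparison instead of a rebuild.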

- John

On Sun, 2007-01-07 at 14:57 +, Robert Fitzsimons wrote:
> > Some more data on how git affects kernel.org...
> 
> I have a quick question about the gitweb configuration, does the
> $projects_list config entry point to a directory or a file?
> 
> When it is a directory gitweb ends up doing the equivalent of a 'find
> $projects_list' to find all the available projects, so it really should
> be changed to a projects list file.
> 
> Robert
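
The difference between the two modes can be sketched as follows.  This is
an illustration in Python, not gitweb's Perl code: the function names are
hypothetical, a repository is crudely detected by the presence of a HEAD
file, and the list file is assumed to carry one project path as the first
whitespace-separated field of each line.

```python
import os
import tempfile

def projects_from_directory(root):
    """Walk the whole tree looking for repositories; cost grows with the tree."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        if "HEAD" in filenames:       # crude marker for a bare repository
            found.append(os.path.relpath(dirpath, root).replace(os.sep, "/"))
            dirnames[:] = []          # do not descend into the repository
    return sorted(found)

def projects_from_file(list_path):
    """Read project paths from a list file; one small sequential read."""
    found = []
    with open(list_path, encoding="utf-8") as handle:
        for line in handle:
            fields = line.split()
            if fields:
                found.append(fields[0])
    return sorted(found)

# Build a tiny throwaway tree and a matching list file to compare both.
with tempfile.TemporaryDirectory() as root:
    for name in ("a.git", os.path.join("sub", "b.git")):
        os.makedirs(os.path.join(root, name))
        open(os.path.join(root, name, "HEAD"), "w").close()
    list_path = os.path.join(root, "projects.list")
    with open(list_path, "w", encoding="utf-8") as handle:
        handle.write("a.git\nsub/b.git\n")
    scanned = projects_from_directory(root)
    listed = projects_from_file(list_path)

print(scanned == listed)  # True
```

Both calls return the same projects, but the directory variant has to
touch every directory above the repositories on each request, while the
file variant reads a single file.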



Re: [KORG] Re: kernel.org lies about latest -mm kernel

2007-01-06 Thread J.H.
It's an issue of load, and both machines are running 'hot' so to speak.
When the load on the machines climbs, our update rsyncs take longer to
complete (considering that our load is completely I/O based, this
isn't surprising).  More or less nothing has changed since
http://lkml.org/lkml/2006/12/14/347, with the exception that git & gitweb
are no longer the main concern (the caching layer I put into
kernel.org seems to be taking care of the worst problems we were seeing,
and I have a couple more changes to put up this weekend).  Right now the
work is getting the load between the two machines evened out and lowering
the number of allowed rsyncs on each machine to better bound the load
problem.

- John

On Sat, 2007-01-06 at 10:33 -0800, Randy Dunlap wrote:
> On Mon, 18 Dec 2006 22:52:51 -0800 J.H. wrote:
> 
> > On Tue, 2006-12-19 at 07:34 +0100, Willy Tarreau wrote:
> > > On Sat, Dec 16, 2006 at 11:30:34AM -0800, J.H. wrote:
> > > (...)
> > > 
> > > > So we know the problem is there, and we are working on it - we are
> > > > getting e-mails about it if not daily then every other day or so.  If
> > > > there are suggestions we are willing to hear them - but the general
> > > > feeling with the admins is that we are probably hitting the biggest
> > > > problems already.
> > > 
> > > BTW, yesterday my 2.4 patches were not published, but I noticed that
> > > they were not even signed nor bzipped on hera. At first I simply thought
> > > it was related, but right now I have a doubt. Maybe the automatic script
> > > has temporarily been disabled on hera too ?
> > 
> > The script that deals with the uploads also deals with the packaging -
> > so yes the problem is related.
> 
> and with the finger_banner and version info on www.kernel.org page?
> 
> They currently say:
> 
> The latest stable version of the Linux kernel is:           2.6.19.1
> The latest prepatch for the stable Linux kernel tree is:    2.6.20-rc3
> The latest snapshot for the stable Linux kernel tree is:    2.6.20-rc3-git4
> The latest 2.4 version of the Linux kernel is:              2.4.34
> The latest 2.2 version of the Linux kernel is:              2.2.26
> The latest prepatch for the 2.2 Linux kernel tree is:       2.2.27-rc2
> The latest -mm patch to the stable Linux kernels is:        2.6.20-rc2-mm1
> 
> 
> but there are 2.6.20-rc3-git[567] and 2.6.20-rc3-mm1 out there,
> so when is the finger version info updated?
> 
> Thanks,
> ---
> ~Randy



Re: [PATCH] gitweb: Fix shortlog only showing HEAD revision.

2007-01-05 Thread J.H.
On Fri, 2007-01-05 at 16:21 -0500, Michael Krufky wrote:
> Robert Fitzsimons wrote:
> > My change in 190d7fdcf325bb444fa806f09ebbb403a4ae4ee6 had a small bug
> > found by Michael Krufky which caused the passed in hash value to be
> > ignored, so shortlog would only show the HEAD revision.
> > 
> > Signed-off-by: Robert Fitzsimons <[EMAIL PROTECTED]>
> > ---
> > 
> > Thanks for finding this Michael.  It's just a small bug introduced by a
> > recent change I made.  Including John 'Warthog9' so hopefully he can add
> > this to the version of gitweb which is hosted on kernel.org.
> > 
> > Robert
> 
> Robert,
> 
> Thank you for fixing this bug so quickly.  I've noticed that the gitweb
> templates on kernel.org have changed at least once since you wrote this
> email to me... (I can tell, based on the fact that the git:// link has
> moved from the project column to a link labeled, "git" all the way to the
> right.)
> 
> Unfortunately, however, the bug that I had originally reported has not yet
> been fixed on the kernel.org www server.  Either the patch in question
> hasn't yet been applied to that installation, or it HAS in fact been
> applied, but doesn't fix the problem as intended.

Simple answer - it's sitting in my tree waiting for me to have enough
time to get back to gitweb.  There are several things in flight and I'm
not prepared to push them out in their current state.

So yes the problem is fixed, but it will probably be sometime this
weekend before it gets pushed out to the kernel.org servers.

> 
> Do you know which of the above is true?
> 
> Thanks again,
> 
> Mike Krufky
> 
> > >  gitweb/gitweb.perl |    2 +-
> > >  1 files changed, 1 insertions(+), 1 deletions(-)
> > > 
> > > diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> > > index d845e91..2e94c2c 100755
> > > --- a/gitweb/gitweb.perl
> > > +++ b/gitweb/gitweb.perl
> > > @@ -4423,7 +4423,7 @@ sub git_shortlog {
> > >     }
> > >     my $refs = git_get_references();
> > >  
> > > -   my @commitlist = parse_commits($head, 101, (100 * $page));
> > > +   my @commitlist = parse_commits($hash, 101, (100 * $page));
> > >  
> > >     my $paging_nav = format_paging_nav('shortlog', $hash, $head, $page, (100 * ($page+1)));
> > >     my $next_link = '';
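
As a rough illustration of the bug this patch fixes, here is a toy model in
Python. It is not gitweb's code: the history list, the function names and the
page size are all made up for the example. The only thing taken from the
thread is the idea that the buggy version always started from HEAD instead of
from the hash passed in.

```python
# Toy history, newest first. "c5" plays the role of HEAD.
HISTORY = ["c5", "c4", "c3", "c2", "c1"]
HEAD = HISTORY[0]


def parse_commits(start, max_count, skip):
    """Return up to max_count commits starting at `start`, after skipping `skip`."""
    first = HISTORY.index(start) + skip
    return HISTORY[first:first + max_count]


def shortlog_buggy(requested, page=0, per_page=2):
    # Bug: the requested commit is ignored, so every view starts at HEAD.
    return parse_commits(HEAD, per_page, per_page * page)


def shortlog_fixed(requested, page=0, per_page=2):
    # Fix: start the listing from the commit the caller asked for.
    return parse_commits(requested, per_page, per_page * page)


print(shortlog_buggy("c3"))  # starts at HEAD regardless of the argument
print(shortlog_fixed("c3"))  # starts at the requested commit
```

The one-token change in the patch (`$head` to `$hash`) corresponds to the
difference between the two functions above.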


Re: [KORG] Re: kernel.org lies about latest -mm kernel

2006-12-18 Thread J.H.
On Tue, 2006-12-19 at 07:46 +0100, Willy Tarreau wrote:
> On Sun, Dec 17, 2006 at 04:42:56PM -0800, J.H. wrote:
> > On Mon, 2006-12-18 at 00:37 +0200, Matti Aarnio wrote:
> > > On Sun, Dec 17, 2006 at 10:23:54AM -0800, Randy Dunlap wrote:
> > > > J.H. wrote:
> > > ...
> > > > >The root cause boils down to with git, gitweb and the normal mirroring
> > > > >on the frontend machines our basic working set no longer stays resident
> > > > >in memory, which is forcing more and more to actively go to disk causing
> > > > >a much higher I/O load.  You have the added problem that one of the
> > > > >frontend machines is getting hit harder than the other due to several
> > > > >factors: various DNS servers not round robining, people explicitly
> > > > >hitting [git|mirrors|www|etc]1 instead of 2 for whatever reason and
> > > > >probably several other factors we aren't aware of.  This has caused the
> > > > >average load on that machine to hover around 150-200 and if for whatever
> > > > >reason we have to take one of the machines down the load on the
> > > > >remaining machine will skyrocket to 2000+.
> > > 
> > > Relying on DNS and clients doing round-robin load-balancing is doomed.
> > > 
> > > You really, REALLY, need external L4 load-balancer switches.
> > > (And installation help from somebody who really knows how to do this
> > > kind of services on a cluster.)
> > 
> > While this is a really good idea when you have systems that are all in a
> > single location, with a single uplink and what not - this isn't the case
> > with kernel.org.  Our machines are currently in three separate
> > facilities in the US (spanning two different states), with us working on
> > a fourth in Europe.
> 
> On multi-site setups, you have to rely on DNS, but the DNS should not
> announce the servers themselves, but the local load balancers, each of
> which knows other sites.
> 
> While people often find it dirty, there's no problem forwarding a
> request from one site to another via the internet as long as there
> are big pipes. Generally, I play with weights to slightly smooth
> the load and reduce the bandwidth usage on the pipe (eg: 2/3 local,
> 1/3 remote).
> 
> With LVS, you can even use the tunneling mode, with which the request
> comes to LB on site A, is forwarded to site B via the net, but the data
> returns from site B to the client.
> 
> If the frontend machines are not taken off-line too often, it should
> be no big deal for them to handle something such as LVS, and would
> help spreading the load.
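
The "2/3 local, 1/3 remote" weighting mentioned above can be sketched as a
smooth weighted round-robin. This is only an illustration of the weighting
idea, not how LVS schedules connections; the site names and the function name
are hypothetical.

```python
def smooth_weighted_rr(weights, n):
    """Return n picks, spreading them according to the integer weights."""
    current = {name: 0 for name in weights}
    total = sum(weights.values())
    picks = []
    for _ in range(n):
        # Every candidate gains its weight each round.
        for name, weight in weights.items():
            current[name] += weight
        # The candidate with the highest running score is chosen,
        # then pays back the total so the others can catch up.
        chosen = max(current, key=current.get)
        current[chosen] -= total
        picks.append(chosen)
    return picks


# Weights 2 and 1 give roughly two thirds of requests to the local site.
picks = smooth_weighted_rr({"local": 2, "remote": 1}, 6)
print(picks)
```

With integer weights the split is exact over each full cycle, and the picks
are interleaved rather than sent in bursts to one site.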

I'll have to look into it - but by and large the round robining tends to
work.  Specifically, as I am writing this the machines are both pushing
right around 150 Mbps; however, the load on zeus1 is 170 vs. zeus2's 4.
Also, when we peak the bandwidth we do use every last kb we can get our
hands on, so doing any tunneling takes just that much bandwidth away
from the total.

Number of processes running

process   #1    #2

rsync     162   69
http      734   642
ftp       353   190

as a quick snapshot.  I would agree with HPA's recent statement - that
people who are mirroring against kernel.org have probably hard coded the
first machine into their scripts, combine that with a few dns servers
that don't honor or deal with round robining and you have the extra load
on the first machine vs. the second.
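
To make the imbalance in that snapshot easier to see, here is a small sketch
that computes the share of each service's processes running on the first
machine. The numbers are the ones quoted above; the variable and function
names are just for illustration.

```python
# (machine #1, machine #2) process counts from the snapshot above.
snapshot = {
    "rsync": (162, 69),
    "http": (734, 642),
    "ftp": (353, 190),
}


def first_machine_share(counts):
    """Fraction of a service's processes that run on machine #1."""
    first, second = counts
    return first / (first + second)


shares = {name: first_machine_share(counts) for name, counts in snapshot.items()}
totals = tuple(sum(column) for column in zip(*snapshot.values()))

for name, share in shares.items():
    print(f"{name:6s} {share:.0%} on machine #1")
print(f"total  {totals[0]} vs {totals[1]} processes")
```

On these figures rsync and ftp lean most heavily toward the first machine,
while http is close to even, which fits the idea that mirroring scripts with a
hard-coded hostname are a large part of the skew.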

> 
> > > > >Since it's apparent not everyone is aware of what we are doing, I'll
> > > > >mention briefly some of the bigger points.
> > > ...
> > > > >- We've cut back on the number of ftp and rsync users to the machines.
> > > > >Basically we are cutting back where we can in an attempt to keep the
> > > > >load from spiraling out of control, this helped a bit when we recently
> > > > >had to take one of the machines down and instead of loads spiking into
> > > > >the 2000+ range we peaked at about 500-600 I believe.
> > > 
> > > How about having filesystems mounted with "noatime" ?
> > > Or do you already do that ?
> > 
> > We've been doing that for over a year.
> 
> Couldn't we temporarily *cut* the services one after the other on www1
> to find which ones are the most I/O consuming, and see which ones can
> coexist without bad interaction ?
> 
> Also, I see that keepalive is still enabled on apache, I guess there
> are thousands of processes and that apache is eating gigs of RAM by
> itself. I strongly suggest disabling keepalive there.
> 
> > - John
> 
> Just my 2 cents,
> Willy



Re: [KORG] Re: kernel.org lies about latest -mm kernel

2006-12-18 Thread J.H.
On Tue, 2006-12-19 at 07:34 +0100, Willy Tarreau wrote:
> On Sat, Dec 16, 2006 at 11:30:34AM -0800, J.H. wrote:
> (...)
> > Since it's apparent not everyone is aware of what we are doing, I'll
> > mention briefly some of the bigger points.
> > 
> > - We have contacted HP to see if we can get additional hardware, mind
> > you though this is a long term solution and will take time, but if our
> > request is approved it will double the number of machines kernel.org
> > runs.
> 
> Just evil suggestion, but if you contact someone else than HP, they
> might be _very_ interested in taking HP's place and providing whatever
> you need to get their name on www.kernel.org. Sun and IBM do such
> monster machines too. That would not be very kind to HP, but it might
> help getting hardware faster.

I leave the actual hardware acquisitions up to HPA, I just try to keep
the machines up and running without too many problems.  HP has been
incredibly supportive of kernel.org in the past and I for one have been
very appreciative of their hardware and would love to continue working
with them.

> 
> > - Gitweb is causing us no end of headache, there are (known to me
> > anyway) two different things happening on that.  I am looking at Jeff
> > Garzik's suggested caching mechanism as a temporary stop-gap, with an
> > eye more on doing a rather heavy re-write of gitweb itself to include
> > semi-intelligent caching.  I've already started in on the latter - and I
> > just about have the caching layer put in.  But this is still at least a
> > week out before we could even remotely consider deploying it.
> 
> Couldn't we disable gitweb for as long as we don't get newer machines ?
> I've been using it in the past, but it was just a convenience. If needed,
> we can explode all the recent patches with a "git-format-patch -k -m" in a
> directory.

I've mentioned this to the other admins and the consensus was that there
would be quite the outcry to suggest this - if the consensus is to
disable gitweb until we can get it under control we would take doing
that into consideration.

> 
> > - We've cut back on the number of ftp and rsync users to the machines.
> > Basically we are cutting back where we can in an attempt to keep the
> > load from spiraling out of control, this helped a bit when we recently
> > had to take one of the machines down and instead of loads spiking into
> > the 2000+ range we peaked at about 500-600 I believe.
> 
> I did not imagine FTP and rsync being so much used !

On average we are moving anywhere from 400-600 Mbps between the two
machines; on release days we max out both of the connections at 1 Gbps
each and have seen that draw last for 48 hours.  For instance, when FC6
was released, in the first 12 hours or so we moved 13 TBytes of data.
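
To put those figures in perspective, here is a back-of-the-envelope sketch.
It assumes decimal units (1 TByte = 10^12 bytes, 1 Gbps = 10^9 bits/s) and
exactly 12 and 48 hours, none of which the message states precisely; the
function names are made up for the example.

```python
def average_gbps(total_bytes, hours):
    """Average throughput in Gbit/s for total_bytes moved over `hours`."""
    return total_bytes * 8 / (hours * 3600) / 1e9


def terabytes_moved(gbps, hours):
    """Decimal terabytes moved at a sustained rate of `gbps` for `hours`."""
    return gbps * 1e9 / 8 * hours * 3600 / 1e12


fc6_avg = average_gbps(13e12, 12)        # 13 TBytes in roughly 12 hours
two_links_48h = terabytes_moved(2, 48)   # two 1 Gbps links saturated for 48 h

print(f"FC6 release: about {fc6_avg:.1f} Gbit/s on average")
print(f"Two saturated 1 Gbps links for 48 h: about {two_links_48h:.1f} TB")
```

Under these assumptions the FC6 average comes out a little above what two
1 Gbps links could carry, so the 13 TByte figure is presumably approximate or
includes traffic beyond those two links; the message does not say which.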

> 
> > So we know the problem is there, and we are working on it - we are
> > getting e-mails about it if not daily then every other day or so.  If
> > there are suggestions we are willing to hear them - but the general
> > feeling with the admins is that we are probably hitting the biggest
> > problems already.
> 
> BTW, yesterday my 2.4 patches were not published, but I noticed that
> they were not even signed nor bzipped on hera. At first I simply thought
> it was related, but right now I have a doubt. Maybe the automatic script
> has temporarily been disabled on hera too ?

The script that deals with the uploads also deals with the packaging -
so yes the problem is related.

> 
> > - John 'Warthog9' Hawley
> > Kernel.org Admin
> 
> Thanks for keeping us informed !
> Willy

Doing what I can :-)

- John



Re: [KORG] Re: kernel.org lies about latest -mm kernel

2006-12-17 Thread J.H.
On Mon, 2006-12-18 at 00:37 +0200, Matti Aarnio wrote:
> On Sun, Dec 17, 2006 at 10:23:54AM -0800, Randy Dunlap wrote:
> > J.H. wrote:
> ...
> > >The root cause boils down to with git, gitweb and the normal mirroring
> > >on the frontend machines our basic working set no longer stays resident
> > >in memory, which is forcing more and more to actively go to disk causing
> > >a much higher I/O load.  You have the added problem that one of the
> > >frontend machines is getting hit harder than the other due to several
> > >factors: various DNS servers not round robining, people explicitly
> > >hitting [git|mirrors|www|etc]1 instead of 2 for whatever reason and
> > >probably several other factors we aren't aware of.  This has caused the
> > >average load on that machine to hover around 150-200 and if for whatever
> > >reason we have to take one of the machines down the load on the
> > >remaining machine will skyrocket to 2000+.  
> 
> Relying on DNS and clients doing round-robin load-balancing is doomed.
> 
> You really, REALLY, need external L4 load-balancer switches.
> (And installation help from somebody who really knows how to do this
> kind of services on a cluster.)

While this is a really good idea when you have systems that are all in a
single location, with a single uplink and what not - this isn't the case
with kernel.org.  Our machines are currently in three separate
facilities in the US (spanning two different states), with us working on
a fourth in Europe.

> > >Since it's apparent not everyone is aware of what we are doing, I'll
> > >mention briefly some of the bigger points.
> ...
> > >- We've cut back on the number of ftp and rsync users to the machines.
> > >Basically we are cutting back where we can in an attempt to keep the
> > >load from spiraling out of control, this helped a bit when we recently
> > >had to take one of the machines down and instead of loads spiking into
> > >the 2000+ range we peaked at about 500-600 I believe.
> 
> How about having filesystems mounted with "noatime" ?
> Or do you already do that ?

We've been doing that for over a year.

- John



Re: [KORG] Re: kernel.org lies about latest -mm kernel

2006-12-16 Thread J.H.
The problem has been hashed over quite a bit recently, and I would be
curious what you would consider the real problem after you see the
situation.

The root cause boils down to this: with git, gitweb and the normal
mirroring on the frontend machines, our basic working set no longer stays
resident in memory, which is forcing more and more requests to actively go
to disk, causing a much higher I/O load.  You have the added problem that
one of the frontend machines is getting hit harder than the other due to
several factors: various DNS servers not round robining, people explicitly
hitting [git|mirrors|www|etc]1 instead of 2 for whatever reason, and
probably several other factors we aren't aware of.  This has caused the
average load on that machine to hover around 150-200, and if for whatever
reason we have to take one of the machines down, the load on the
remaining machine will skyrocket to 2000+.

Since it's apparent not everyone is aware of what we are doing, I'll
mention briefly some of the bigger points.

- We have contacted HP to see if we can get additional hardware, mind
you though this is a long term solution and will take time, but if our
request is approved it will double the number of machines kernel.org
runs.

- Gitweb is causing us no end of headache, there are (known to me
anyway) two different things happening on that.  I am looking at Jeff
Garzik's suggested caching mechanism as a temporary stop-gap, with an
eye more on doing a rather heavy re-write of gitweb itself to include
semi-intelligent caching.  I've already started in on the latter - and I
just about have the caching layer put in.  But this is still at least a
week out before we could even remotely consider deploying it.

- We've cut back on the number of ftp and rsync users to the machines.
Basically we are cutting back where we can in an attempt to keep the
load from spiraling out of control, this helped a bit when we recently
had to take one of the machines down and instead of loads spiking into
the 2000+ range we peaked at about 500-600 I believe.

So we know the problem is there, and we are working on it - we are
getting e-mails about it if not daily then every other day or so.  If
there are suggestions we are willing to hear them - but the general
feeling with the admins is that we are probably hitting the biggest
problems already.

- John 'Warthog9' Hawley
Kernel.org Admin

On Sat, 2006-12-16 at 10:02 -0800, Randy Dunlap wrote:
> Andrew Morton wrote:
> > On Sat, 16 Dec 2006 09:44:21 -0800
> > Randy Dunlap <[EMAIL PROTECTED]> wrote:
> > 
> >> On Thu, 14 Dec 2006 23:37:18 +0100 Pavel Machek wrote:
> >>
> >>> Hi!
> >>>
> >>> [EMAIL PROTECTED]:/data/pavel$ finger @www.kernel.org
> >>> [zeus-pub.kernel.org]
> >>> ...
> >>> The latest -mm patch to the stable Linux kernels is: 2.6.19-rc6-mm2
> >>> [EMAIL PROTECTED]:/data/pavel$ head /data/l/linux-mm/Makefile
> >>> VERSION = 2
> >>> PATCHLEVEL = 6
> >>> SUBLEVEL = 19
> >>> EXTRAVERSION = -mm1
> >>> ...
> >>> [EMAIL PROTECTED]:/data/pavel$
> >>>
> >>> AFAICT 2.6.19-mm1 is newer than 2.6.19-rc6-mm2, but kernel.org does
> >>> not understand that.
> >> Still true (not listed) for 2.6.20-rc1-mm1  :(
> >>
> >> Could someone explain what the problem is and what it would
> >> take to correct it?
> > 
> > 2.6.20-rc1-mm1 still hasn't propagated out to the servers (it's been 36
> > hours).  Presumably the front page non-update is a consequence of that.
> 
> Agreed on the latter part.  Can someone address the real problem???
> 
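
For the version-ordering point raised by Pavel (2.6.19-mm1 being newer than
2.6.19-rc6-mm2), here is a minimal sketch of a sort key that ranks a final
release above its own release candidates. It is not the script kernel.org
uses, and the thread itself attributes the stale banner to mirroring lag
rather than to comparison logic; the regular expression and function name are
assumptions made for the example.

```python
import re

# Matches strings such as "2.6.19-mm1" or "2.6.20-rc1-mm1".
_VERSION = re.compile(r"^(\d+(?:\.\d+)*)(?:-rc(\d+))?(?:-mm(\d+))?$")


def mm_sort_key(version):
    """Sort key: base version, then rc before final, then the -mm number."""
    match = _VERSION.match(version)
    if match is None:
        raise ValueError(f"unrecognised version: {version!r}")
    base = tuple(int(part) for part in match.group(1).split("."))
    # A release candidate sorts below the final release of the same base.
    rc = (0, int(match.group(2))) if match.group(2) else (1, 0)
    mm = int(match.group(3)) if match.group(3) else 0
    return (base, rc, mm)


releases = ["2.6.19-rc6-mm2", "2.6.19-mm1", "2.6.20-rc1-mm1"]
ordered = sorted(releases, key=mm_sort_key)
latest = ordered[-1]
print(ordered)
```

A plain string comparison would order these differently, which is why the
release-candidate marker is handled explicitly in the key.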



Re: [KORG] Re: kernel.org lies about latest -mm kernel

2006-12-16 Thread J.H.
The problem has been hashed over quite a bit recently, and I would be
curious what you would consider the real problem after you see the
situation.

The root cause boils down to with git, gitweb and the normal mirroring
on the frontend machines our basic working set no longer stays resident
in memory, which is forcing more and more to actively go to disk causing
a much higher I/O load.  You have the added problem that one of the
frontend machines is getting hit harder than the other due to several
factors: various DNS servers not round robining, people explicitly
hitting [git|mirrors|www|etc]1 instead of 2 for whatever reason and
probably several other factors we aren't aware of.  This has caused the
average load on that machine to hover around 150-200 and if for whatever
reason we have to take one of the machines down the load on the
remaining machine will skyrocket to 2000+.  

Since it's apparent not everyone is aware of what we are doing, I'll
mention briefly some of the bigger points.

- We have contacted HP to see if we can get additional hardware.  Mind
you, this is a long-term solution and will take time, but if our
request is approved it will double the number of machines kernel.org
runs.

- Gitweb is causing us no end of headaches; there are (known to me
anyway) two different things happening on that front.  I am looking at
Jeff Garzik's suggested caching mechanism as a temporary stop-gap, with
an eye more on doing a rather heavy rewrite of gitweb itself to include
semi-intelligent caching.  I've already started in on the latter - and I
just about have the caching layer put in.  But this is still at least a
week out before we could even remotely consider deploying it.

- We've cut back on the number of ftp and rsync users to the machines.
Basically we are cutting back where we can in an attempt to keep the
load from spiraling out of control.  This helped a bit when we recently
had to take one of the machines down: instead of loads spiking into
the 2000+ range, we peaked at about 500-600, I believe.
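
To make the caching idea in the gitweb item above concrete, here is a
minimal sketch of a time-based page cache.  It is not Jeff Garzik's
mechanism and not the gitweb rewrite, neither of which is described in
this message.  The class name TTLCache, the ttl parameter and the
render callback are all hypothetical names chosen for illustration.
The point is only that a repeated request inside the time window is
answered from memory instead of going back to git and the disk.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """Hypothetical page cache: reuse a rendered page until it is ttl seconds old."""

    def __init__(self, ttl: float, clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl
        self.clock = clock  # injectable so the expiry logic can be tested
        self._store: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str, render: Callable[[], str]) -> str:
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]  # fresh enough: no git or disk access needed
        page = render()  # expensive path, taken only on a miss or expiry
        self._store[key] = (now, page)
        return page
```

A caller would wrap the expensive page generation in the render
callback and key the cache on the request URL.  The choice of ttl
trades staleness against load; a real deployment would also need to
bound the cache size, which this sketch does not do.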

So we know the problem is there, and we are working on it - we are
getting e-mails about it, if not daily then every other day or so.  If
there are suggestions we are willing to hear them - but the general
feeling among the admins is that we are probably hitting the biggest
problems already.

- John 'Warthog9' Hawley
Kernel.org Admin

On Sat, 2006-12-16 at 10:02 -0800, Randy Dunlap wrote:
> Andrew Morton wrote:
> > On Sat, 16 Dec 2006 09:44:21 -0800
> > Randy Dunlap [EMAIL PROTECTED] wrote:
> > 
> >> On Thu, 14 Dec 2006 23:37:18 +0100 Pavel Machek wrote:
> >>
> >>> Hi!
> >>>
> >>> [EMAIL PROTECTED]:/data/pavel$ finger @www.kernel.org
> >>> [zeus-pub.kernel.org]
> >>> ...
> >>> The latest -mm patch to the stable Linux kernels is: 2.6.19-rc6-mm2
> >>> [EMAIL PROTECTED]:/data/pavel$ head /data/l/linux-mm/Makefile
> >>> VERSION = 2
> >>> PATCHLEVEL = 6
> >>> SUBLEVEL = 19
> >>> EXTRAVERSION = -mm1
> >>> ...
> >>> [EMAIL PROTECTED]:/data/pavel$
> >>>
> >>> AFAICT 2.6.19-mm1 is newer than 2.6.19-rc6-mm2, but kernel.org does
> >>> not understand that.
> >> Still true (not listed) for 2.6.20-rc1-mm1  :(
> >>
> >> Could someone explain what the problem is and what it would
> >> take to correct it?
> > 
> > 2.6.20-rc1-mm1 still hasn't propagated out to the servers (it's been 36
> > hours).  Presumably the front page non-update is a consequence of that.
> 
> Agreed on the latter part.  Can someone address the real problem???
> 