Brent -

But I think that's what I said - I'm sure you saw the results you describe, but they were in one specific situation and one specific configuration, comparing one pair of different machines. I don't think that justifies the generic recommendation that step 1 should be to move to Linux. The choice of operating system, when placed in the universe of possible performance optimizations, is probably pretty far down the list after hard disk, RAM, network, caching, and CPU selection. And I list those hardware choices in a very rough order of priority, with the first two being by far the most important.
I have no experience in comparing AMD vs. Intel CPUs, although I do run both. I also have found, subjectively, that in most cases raster and vector reprojection (tasks TopoZone does all the time) are far more efficient than expected. It is entirely possible that Kim's application issue is primarily one of optimizing hardware for PostgreSQL/PostGIS performance - MapServer may be the easy part if the application is dominated by database query time. Part of the problem with asking about hardware recommendations on this list is that a MapServer/PostgreSQL/PostGIS application is fairly complex, and there are many such beasts. It is exceptionally hard to extrapolate one user's experience to provide useful advice for another user with a different application. Except for recommending fast disks and more RAM!

- Ed

-----Original Message-----
From: Brent Wood [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, January 08, 2008 1:02 PM
To: Ed McNierney; MAPSERVER-USERS@LISTS.UMN.EDU
Subject: Re: [UMN_MAPSERVER-USERS] hardware

--- Ed McNierney <[EMAIL PROTECTED]> wrote:

> Brent -
>
> I have to disagree with such a sweeping generalization concerning Windows and
> Linux servers. Having run both Windows and Linux servers for MapServer,
> MySQL, PostGIS and other applications for several years, I find it hard to
> claim that on identical hardware one is noticeably faster or slower than the
> other. Your mileage can (and obviously does) vary, but since you're running
> on two different machines your "inferior" system may well be better suited to
> the load you're giving it than the Windows system. I'm sure there are
> exceptions, and applications where there IS a noticeable difference between
> otherwise identical systems, but my experience does not show that a broad
> statement based on your observations on one pair of servers is warranted.

Hi Ed,

I agree in general, but with this particular MapServer/PostGIS/Java application, this is exactly the result we had.
The tech support was more Windows-capable than Linux, so the result was totally unexpected, and so far unexplained. Hence the abstracting of the PostGIS/MapServer layer behind WMS to support the application. We found no need to further separate PostGIS & MapServer to meet performance needs, but this is obviously feasible.

I have tried the repeat-query comparison under Windows & Linux several times, and Linux has invariably been faster than Windows the second time around; but, as I said, the initial un-cached query times were similar. In situations where a substantial amount of data is in memory vs. being read from disk, I believe Linux (especially 64-bit) will outperform Windows (generally still 32-bit).

Another example: I recently helped put together a 16-core mini-cluster for fisheries modelling. (Note: very different loads from a MapServer/PostGIS application - mostly diskless.) We benchmarked AMD vs. Intel & Linux vs. Windows fairly thoroughly, with surprising (to me, at least) results. Supposedly similar AMD A64 dual core & Intel Core Duo CPUs (according to published benchmarks) ran very differently, with AMD winning by 15-20% on runtimes of several-hour-long iterations. On the same (AMD & Intel) hardware, Linux beat Windows by 20-30%. The Windows binary running under Wine ran 10-15% faster than it did natively under Windows.

This may differ from your experiences, but that's why a forum like this usefully allows a variety of input from many people. It's also why running your own benchmarks for your own application is a very good idea. I recommend that, where possible, people benchmark their own applications, because I have found it not uncommon for particular apps to behave unexpectedly, and identifying specific hardware bottlenecks can obviously be very useful.

> First, the load Kim is talking about is quite light. If there are 200 hits
> per hour, and a 3x increase is expected, that's 600 hits per hour or 10 hits
> per minute; one hit every six seconds.
Perhaps; 1.5m points is a rough figure, but how many are to be rendered in any one map? Introducing polygons makes it very hard to know what PostGIS response times will be without knowing exactly how complex the polygons are - high-resolution coastlines with millions of points per feature vs. simple square cells. The former may well require more than 6 seconds to return a query result. Is PostGIS or MapServer doing any vector or raster reprojection? Again, this can substantially impact performance.

I agree, the simple numbers suggest that the load will not be huge, but the main thrust (or at least intent) of my response was that optimising data structures, pre-reprojecting data if required, etc., will often yield better performance improvements than adding another spindle to an array or a couple of CPU cores.

> Second, I think it's very important to separate the MapServer and PostGIS
> portions of the application, at least for discussion purposes. The optimal
> arrangement for each application may not be the same, and there may need to
> be some compromise if they're to run on one system.

I agree here totally - note my illustration of abstracting MapServer/PostGIS from the application server. The suggested loads, as you detail them, don't really imply that further separation via hardware is needed, at least not without more information as above. And asking for hardware advice without an indicative budget is always difficult to answer except in generalities. More fast disk & memory :-)

> Third, I do agree that fast RAID 5 disks and lots of RAM are always a good
> idea!
>
> Finally, I think the most important thing to remember is to work with what
> you know. If your staff are accustomed to and trained in using Windows
> systems, moving to Linux is hard and expensive - in terms of training,
> consultants, time to resolve issues, etc. The reverse is equally true.
> There is a high cost of moving to an operating system environment you don't
> know, and there needs to be a very good reason why you're willing to incur
> that cost. The suggestion that Kim "go to Linux ASAP" is a very expensive
> suggestion if the needed Linux skills aren't available. IMHO, the first step
> is to ensure optimal hardware design using the operating system Kim's most
> familiar with.

Except that Kim did say they were looking to do this anyway, and my experience suggests Linux _may_ offer substantial performance benefits. That's quite a different situation from one where they are strictly a Windows house with no intent to migrate at all - in which case I would agree with you 100% :-)

If they have the time & resources, a benchmarking exercise on a prototype application to give some real-world numbers could be very useful. But generally this is unrealistically expensive, & people ask lists like this for advice instead :-)

Cheers,

Brent
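P.P.S. On pre-reprojecting data: a rough sketch of what I mean in PostGIS (the table, column, and SRID here are hypothetical - adjust for your own schema). Materialising a copy of the geometry already in the map's output SRS, with its own spatial index, lets MapServer skip per-request reprojection entirely:

```sql
-- Hypothetical source table "coastline" with geometry stored in EPSG:4326.
-- Materialise a copy already projected to the map SRS (EPSG:2193 as an example).
CREATE TABLE coastline_projected AS
  SELECT gid, ST_Transform(geom, 2193) AS geom
  FROM coastline;

-- Spatial index so bounding-box queries on the new table stay fast.
CREATE INDEX coastline_projected_gix
  ON coastline_projected USING GIST (geom);

ANALYZE coastline_projected;
```

The trade-off is extra disk and a second copy of the data to keep in sync, but for read-mostly map layers that is usually cheap compared to reprojecting millions of vertices on every request.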
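P.S. For anyone wanting to try the repeat-query comparison I described (first run un-cached, later runs showing the effect of OS and database caching), a minimal timing harness along these lines works. The workload here is a placeholder - substitute a function that runs your own PostGIS query (e.g. via psycopg2); the function name and run count are just illustrative choices, not anything MapServer- or PostGIS-specific.

```python
import time


def bench(run_query, runs=5):
    """Time repeated executions of run_query.

    The first element approximates the cold (un-cached) time; the
    later elements show how much repeat queries gain from caching.
    """
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        run_query()
        timings.append(time.perf_counter() - start)
    return timings


# Placeholder workload standing in for a real database query.
times = bench(lambda: sum(i * i for i in range(100_000)))
print(["%.4fs" % t for t in times])
```

Running the same harness on each candidate OS/hardware combination, and comparing the first timing against the later ones, separates raw disk/query speed from cache behaviour - which is where I saw Linux pull ahead.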