Re: [mapserver-users] Ed's Rules for the Best Raster Performance

2008-09-17 Thread Doug_Newcomb
>One other thing I just noticed: to effectively use the tiled tiffs I need
>BigTiff support which looks like it is a real new thing (needs libtiff 4.0
>which is still in beta according to the remotesensing.org ftp site.)
If you compile gdal 1.5.2 from source and use the tiff=internal option, you
get bigtiff support, without installing libtiff 4.0 yourself,
http://www.gdal.org/formats_list.html .  I have not tested this as an image
source for mapserver yet, but I have created a couple of bigtiff images.
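
For anyone wanting to try the same route, the steps are roughly as follows (a
sketch only, not Doug's actual build; the configure flag spelling and the
BIGTIFF creation option should be checked against the GDAL version in use, and
the file names are placeholders):

    # build GDAL against its bundled libtiff, which carries the BigTIFF code
    ./configure --with-libtiff=internal --with-geotiff=internal
    make && make install

    # ask for a BigTIFF container explicitly when converting a large mosaic
    gdal_translate -of GTiff -co BIGTIFF=YES -co TILED=YES in_mosaic.tif out_bigtiff.tif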

Doug


Doug Newcomb
USFWS
Raleigh, NC
919-856-4520 ext. 14 [EMAIL PROTECTED]
-

The opinions I express are my own and are not representative of the
official policy of the U.S.Fish and Wildlife Service or Dept. of Interior.
Life is too short for undocumented, proprietary data formats.


   
 "Jim Klassen" 
 <[EMAIL PROTECTED] 
 tpaul.mn.us>   To 
 Sent by:  <[EMAIL PROTECTED]>,  
 mapserver-users-b <[EMAIL PROTECTED]>  
 [EMAIL PROTECTED]  cc 
 o.org mapserver-users@lists.osgeo.org 
   Subject 
   Re: [mapserver-users] Ed's Rules
 09/16/2008 04:57  for the Best Raster Performance 
 PM
   
   
   
   
   




One other thing I just noticed: to effectively use the tiled tiffs I need
BigTiff support which looks like it is a real new thing (needs libtiff 4.0
which is still in beta according to the remotesensing.org ftp site.)
Anyway, I'm still going to give this a try and check the performance
difference.  For us, with existing hardware, disk space would be an issue
for uncompressed images so I will be trying JPEG in TIFF. All I can say
about the JPEG->JPEG re-compression artifacts is with the existing setup,
we haven't had any complaints.

The "large" number of files doesn't (in our case < 500k) doesn't seem to
effect the operational performance of our server in a meaningful way. (I
don't remember the exact numbers, but I have measured file access time in
directories with 10 files and 100k files and they the difference was much
less than the total mapserver run time.) It is a pain though when making
copies or running backups. As Bob said, for a typical image request sizes
of less than 1000px, the "tiles/overviews" design pretty much limits the
number of files mapserver has to touch to 4, so access time stays fairly
constant across different views.

Also, I forgot to mention that the disk subsystem here isn't exactly your
average desktop PC. (8*10K SCSI drives in RAID 10 with 1GB dedicated to the
RAID controller.) Similar requests on a much older machine we have around
here with slower disk/processor take about 800ms. I don't have any info to
attribute this to disk vs. cpu or both.

One of the main reasons we decided on JPEG here early on was the licensing
headaches surrounding mrsid/ecw/jp2. JPEG was easy and supported well by
just about everything. Actually, for that matter, TIFFs are a lot harder to
use directly by most (non-GIS) applications than JPEG/PNG too. We may be
past this being an issue, but once upon a time, before our use of
mapserver, the JPEG tiles were being accessed directly from the webserver
by various client applications (that didn't all understand tiff). Instead
of using world files to determine the extents, the tiles were accessed by a
predictable naming convention relating to the extent. When we started using
mapserver we retained the existing tiling scheme (adding world files and
tileindexes so mapserver could position the tiles) and it seemed to work
well, so haven't given it much thought since.

Thanks for all the interest and discussion around this.

Jim K

>>> Jeff Hoffmann <[EMAIL PROTECTED]> 09/16/08 4:03 PM >>>
Ed McNierney wrote:
> If you want to shrink the file size in this thought experiment that’s
> fine, but realize that you are thereby increasing the number of files
> that need to be opened for a random image request. And each new open
> file incurs a relatively high cost (directory/disk seek overhead,
> etc.); those thousands o

Re: [mapserver-users] Ed's Rules for the Best Raster Performance

2008-09-16 Thread Jim Klassen
One other thing I just noticed: to effectively use the tiled tiffs I need 
BigTiff support which looks like it is a real new thing (needs libtiff 4.0 
which is still in beta according to the remotesensing.org ftp site.) Anyway, 
I'm still going to give this a try and check the performance difference.  For 
us, with existing hardware, disk space would be an issue for uncompressed 
images so I will be trying JPEG in TIFF. All I can say about the JPEG->JPEG 
re-compression artifacts is that, with the existing setup, we haven't had any 
complaints.
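
(For illustration only, not Jim's actual command: a tiled, JPEG-compressed
GeoTIFF can be produced with something like the call below. File names, block
size and quality are placeholders, and PHOTOMETRIC=YCBCR only applies to
3-band RGB imagery.)

    gdal_translate -of GTiff \
        -co TILED=YES -co BLOCKXSIZE=256 -co BLOCKYSIZE=256 \
        -co COMPRESS=JPEG -co JPEG_QUALITY=80 -co PHOTOMETRIC=YCBCR \
        ortho_uncompressed.tif ortho_jpeg_in_tiff.tif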

The "large" number of files doesn't (in our case < 500k) doesn't seem to effect 
the operational performance of our server in a meaningful way. (I don't 
remember the exact numbers, but I have measured file access time in directories 
with 10 files and 100k files and they the difference was much less than the 
total mapserver run time.) It is a pain though when making copies or running 
backups. As Bob said, for a typical image request sizes  of less than 1000px, 
the "tiles/overviews" design pretty much limits the number of files mapserver 
has to touch to 4, so access time stays fairly constant across different views.

Also, I forgot to mention that the disk subsystem here isn't exactly your 
average desktop PC. (8*10K SCSI drives in RAID 10 with 1GB dedicated to the 
RAID controller.) Similar requests on a much older machine we have around here 
with slower disk/processor take about 800ms. I don't have any info to attribute 
this to disk vs. cpu or both.

One of the main reasons we decided on JPEG here early on was the licensing 
headaches surrounding mrsid/ecw/jp2. JPEG was easy and supported well by just 
about everything. Actually, for that matter, TIFFs are a lot harder to use 
directly by most (non-GIS) applications than JPEG/PNG too. We may be past this 
being an issue, but once upon a time, before our use of mapserver, the JPEG 
tiles were being accessed directly from the webserver by various client 
applications (that didn't all understand tiff). Instead of using world files to 
determine the extents, the tiles were accessed by a predictable naming 
convention relating to the extent. When we started using mapserver we retained 
the existing tiling scheme (adding world files and tileindexes so mapserver 
could position the tiles) and it seemed to work well, so haven't given it much 
thought since.

Thanks for all the interest and discussion around this.

Jim K

>>> Jeff Hoffmann <[EMAIL PROTECTED]> 09/16/08 4:03 PM >>>
Ed McNierney wrote:
> If you want to shrink the file size in this thought experiment that’s 
> fine, but realize that you are thereby increasing the number of files 
> that need to be opened for a random image request. And each new open 
> file incurs a relatively high cost (directory/disk seek overhead, 
> etc.); those thousands or millions of JPEGs aren’t just hard to keep 
> track of – they hurt performance. I have been the keeper of tens of 
> millions of such files, and have seen some of those issues myself.
That's certainly a consideration, but you could also counter that by 
using jpeg compressed geotiffs. You'd want to make sure to tile them, 
otherwise you'd have that same big jpeg performance problem -- I think 
tiling effectively treats them as individual jpegs wrapped in one big 
file. No clue on what the actual performance of that would be, but it's 
something to consider if you've got filesystem performance problems.

> The example I gave (and my other examples) are, however, primarily 
> intended to help people think about all the aspects of the problem. 
> File access performance in an application environment is a complex 
> issue with many variables and any implementation should be prototyped 
> and tested. All I really care about is that you don’t think it’s 
> simple and you try to think through all the consequences of an 
> implementation plan.
One of the reasons why I replied to this originally is that I think it's 
good to keep options open so people can evaluate them for their specific 
circumstances. What I was hearing you say was "if you make bad choices, 
it'll perform badly" & I'm just trying to throw out some other choices 
that would perform better and probably make it worth a try for a lot of 
people. It's pretty common for me to get imagery in 5000x5000 or 
1x1 geotiff tiles. I just got imagery for one county like that 
that weighs in at close to 1TB; if I were to decide I can't afford that 
kind of disk space for whatever reason, I'd investigate some compressed 
options. If I don't know any different, I might just compress that tile 
into one large jpeg (like in your example), discover the performance is 
terrible, discard it & file away in my mind that jpegs perform terribly. 
I might not understand that a 5000x5000 jpeg is going to use 75MB of 
memory and take an order of magnitude longer to decompress than that 
1000x1000 jpeg that only takes up 3MB in memory and decompresses nearly 
instantly while giving you

Re: [mapserver-users] Ed's Rules for the Best Raster Performance

2008-09-16 Thread Jim Klassen
Ed,

Good points about using tiled tiffs so that mapserver doesn't have to read the 
whole file. I was thinking TIFFs were scanline based where you would have to do 
a lot of reading or a lot of seeking anyway if you wanted to pull out a subset 
of the file to render. JPEG in TIFF is also interesting... I have seen that 
before, but there were compatibility issues with it at the time (not with 
mapserver or gdal) so I've avoided it since. It sounds like this is at least 
worth some experimentation on my part.

Also, I agree this is a fairly complex problem when taking into account many 
simultaneous requests and that is why I asked the question.

BTW: if I were to do it again (using basically the same approach I did the 
first time), I'd use 1024x1024 pixel tiles instead of 1000x1000. It would make 
it easier to handle the factors of two in resolution and would fit block 
boundaries better.
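
(As an aside, recent GDAL releases ship gdal_retile.py, which could cut a
mosaic into 1024x1024 tiles in one pass. A sketch with placeholder names, not
part of Jim's setup:)

    mkdir -p tiles_1024
    gdal_retile.py -ps 1024 1024 -co TILED=YES -targetDir tiles_1024 county_mosaic.tif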

Jim

>>> Ed McNierney <[EMAIL PROTECTED]> 09/15/08 9:14 PM >>>
Damn, I'm going to have to get around to unsubscribing soon so I can shut 
myself up!

Jim, please remember that your disk subsystem does not read only the precise 
amount of data you request.  The most expensive step is telling the disk head 
to seek to a random location to start reading the data.  The actual reading 
takes much less time in almost every case.  Let's invent an example so we don't 
have to do too much hard research .

A 7,200-RPM IDE drive has about a 9 ms average read seek time, and most are 
able to really transfer real data at around 60 MB/s or so (these are very rough 
approximations).  So to read 256KB of sequential data, you spend 9 ms seeking 
to the right track and then 4 ms reading the data - that's 13 ms.  Doubling the 
read size to 512KB will only take 4 ms (or 30%) longer, not 100% longer.  But 
even that's likely to be an exaggeration, because your disk drive - knowing 
that seeks are expensive - will typically read a LOT of data after doing a 
seek.  Remember that "16MB buffer" on the package?  The drive will likely read 
far more than you need, so the "improvement" you get by cutting the amount of 
data read in a given seek in half is likely to be nothing at all.

There are limits, of course.  The larger your data read is, the more likely it 
is to be split up into more than one location on disk.  That would mean another 
seek, which would definitely hurt.  But in general if you're already reading 
modest amounts of data in each shot, reducing the amount of data read by 
compression is likely to save you almost nothing in read time and cost you 
something in decompression time (CPUs are fast, so it might not cost much, but 
it will very likely require more RAM, boosting your per-request footprint, 
which means you're more at risk of starting to swap, etc.).

And remember that not all formats are created equal.  In order to decompress 
ANY portion of a JPEG image, you must read the WHOLE file.  If I have a 
4,000x4,000 pixel 24-bit TIFF image that's 48 megabytes, and I want to read a 
256x256 piece of it, I may only need to read one megabyte or less of that file. 
 But if I convert it to a JPEG and compress it to only 10% of the TIFF's size, 
I'll have a 4.8 megabyte JPEG but I will need to read the whole 4.8 megabytes 
(and expand it into that RAM you're trying to conserve) in order to get that 
256x256 piece!

Paul is right - sometimes compression is necessary when you run out of disk 
(but disks are pretty darn cheap - the cost per megabyte of the first hard 
drive I ever purchased (a Maynard Electronics 10 MB drive for my IBM PC) is 
approximately 450,000 times higher than it is today).  If you are inclined 
toward JPEG compression, read about and think about using tiled TIFFs with JPEG 
compression in the tiles; it's a reasonable compromise that saves space while 
reducing the whole-file-read overhead of JPEG.

Where the heck is that unsubscribe button?

- Ed


On 9/15/08 9:23 PM, "Paul Spencer" <[EMAIL PROTECTED]> wrote:

Jim, you would think that ;)  However, in practice I wouldn't expect
the disk access time for geotiffs to be significantly different from
jpeg if you have properly optimized your geotiffs using gdal_translate
-co "TILED=YES" - the internal structure is efficiently indexed so
that gdal only has to read the minimum number of 256x256 blocks to
cover the requested extent.  And using gdaladdo to generate overviews
just makes it that much more efficient.

Even if you are reading less physical data from the disk to get the
equivalent coverage from jpeg, the decompression overhead is enough to
negate the difference in IO time based on Ed's oft quoted advice (and
other's experience too I think).  The rules that apply in this case
seem to be 'tile your data', 'do not compress it' and 'buy the fastest
disk you can afford'.

Compression is useful and probably necessary if you hit disk space
limits.

Cheers

Paul

On 15-Sep-08, at 5:48 PM, Jim Klassen wrote:

> Just out of curiosity, has anyone tested the performance

Re: [mapserver-users] Ed's Rules for the Best Raster Performance

2008-09-16 Thread Jeff Hoffmann

Ed McNierney wrote:
If you want to shrink the file size in this thought experiment that’s 
fine, but realize that you are thereby increasing the number of files 
that need to be opened for a random image request. And each new open 
file incurs a relatively high cost (directory/disk seek overhead, 
etc.); those thousands or millions of JPEGs aren’t just hard to keep 
track of – they hurt performance. I have been the keeper of tens of 
millions of such files, and have seen some of those issues myself.
That's certainly a consideration, but you could also counter that by 
using jpeg compressed geotiffs. You'd want to make sure to tile them, 
otherwise you'd have that same big jpeg performance problem -- I think 
tiling effectively treats them as individual jpegs wrapped in one big 
file. No clue on what the actual performance of that would be, but it's 
something to consider if you've got filesystem performance problems.


The example I gave (and my other examples) are, however, primarily 
intended to help people think about all the aspects of the problem. 
File access performance in an application environment is a complex 
issue with many variables and any implementation should be prototyped 
and tested. All I really care about is that you don’t think it’s 
simple and you try to think through all the consequences of an 
implementation plan.
One of the reasons why I replied to this originally is that I think it's 
good to keep options open so people can evaluate them for their specific 
circumstances. What I was hearing you say was "if you make bad choices, 
it'll perform badly" & I'm just trying to throw out some other choices 
that would perform better and probably make it worth a try for a lot of 
people. It's pretty common for me to get imagery in 5000x5000 or 
1x1 geotiff tiles. I just got imagery for one county like that 
that weighs in at close to 1TB; if I were to decide I can't afford that 
kind of disk space for whatever reason, I'd investigate some compressed 
options. If I don't know any different, I might just compress that tile 
into one large jpeg (like in your example), discover the performance is 
terrible, discard it & file away in my mind that jpegs perform terribly. 
I might not understand that a 5000x5000 jpeg is going to use 75MB of 
memory and take an order of magnitude longer to decompress than that 
1000x1000 jpeg that only takes up 3MB in memory and decompresses nearly 
instantly while giving you that same 500x500 chunk of image. There are 
nice things about jpegs, like you don't need commercial libraries like 
you would with ecw, mrsid, jp2, you don't have to worry about licensing 
issues, size constraints, compiler environment, all that, which makes it 
a pretty attractive compressed format if you can get it to perform well, 
but if you don't know to break them up into smallish chunks I don't 
think getting to that performance level is really possible (for exactly 
the reasons you describe).
I will also admit to being very guilty of not designing for 
“low-moderate load” situations, as I always like my Web sites to be 
able to survive the situation in which they accidentally turn out to 
be popular!
I had second thoughts about saying this, because one man's "low" load 
might be "high" for someone else especially if you're talking to someone 
who has run a pretty high profile site, but I'd wager you're the 
exception and there are a lot of smaller fish out there. I'd think that 
Jim is probably more in line with an average user, a moderately sized 
city/county that would probably come nowhere near maxing out even modest 
hardware with those jpegs of his. It's probably those smaller fish where 
compression is more important, maybe they're fighting for space on a 
department-level server or can't get budget approval to upgrade their 
drives. I'd hate for those folks to have to settle for a slow (cpu 
intensive) wavelet-based compression when a properly configured jpeg 
layer might be the compromise they're looking for.


jeff
___
mapserver-users mailing list
mapserver-users@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapserver-users


Re: [mapserver-users] Ed's Rules for the Best Raster Performance

2008-09-16 Thread Bob Basques

Hi All,

I work with Jim, so I suppose I should add some stuff here.

We use Jpegs for Aerial Photography only, and PNGs for everything else, 
for the same reasons stated here.   I'll let Jim talk to the technical 
stuff related to the pyramiding process as he put it all together into 
an automated process.


Just to clear up some small points, we only store 1000x1000 pixel images 
behind MapServer for serving the raster data sets; no more than four 
(4) of these source images should (in the majority of cases) need to be 
accessed per web call.


bobb


[EMAIL PROTECTED] wrote:

Hmmm the discussion here has only looked at jpgs & geotiffs.

I tend to use PNG format, which I believe is less lossy than jpgs, & supports 
transparency, which has worked fine for small scale implementations.

Can any experts here comment on the pros/cons of png vs jpg?


Thanks,

  Brent Wood



--- On Wed, 9/17/08, Ed McNierney <[EMAIL PROTECTED]> wrote:

  

From: Ed McNierney <[EMAIL PROTECTED]>
Subject: Re: [mapserver-users] Ed's Rules for the Best Raster Performance
To: "Jeff Hoffmann" <[EMAIL PROTECTED]>
Cc: "Jim Klassen" <[EMAIL PROTECTED]>, "mapserver-users@lists.osgeo.org" 

Date: Wednesday, September 17, 2008, 3:45 AM
Jeff -

I'm not convinced, either, but I have never seen a
real-world test that has shown otherwise.  There haven't
been many such tests, but I have done them myself and
several others have done them as well and posted the results
on this list.  There may be tradeoffs which require a
different implementation - that's life in the real world
- but the data (the real, measured data, not theoretical
speculation) has always been consistent.

If you want to shrink the file size in this thought
experiment that's fine, but realize that you are thereby
increasing the number of files that need to be opened for a
random image request.  And each new open file incurs a
relatively high cost (directory/disk seek overhead, etc.);
those thousands or millions of JPEGs aren't just hard to
keep track of - they hurt performance.  I have been the
keeper of tens of millions of such files, and have seen some
of those issues myself.

The example I gave (and my other examples) are, however,
primarily intended to help people think about all the
aspects of the problem.  File access performance in an
application environment is a complex issue with many
variables and any implementation should be prototyped and
tested.  All I really care about is that you don't think
it's simple and you try to think through all the
consequences of an implementation plan.

I will also admit to being very guilty of not designing for
"low-moderate load" situations, as I always like
my Web sites to be able to survive the situation in which
they accidentally turn out to be popular!

- Ed


On 9/16/08 11:21 AM, "Jeff Hoffmann"
<[EMAIL PROTECTED]> wrote:

Ed McNierney wrote:
And remember that not all formats are created equal. In order to
decompress ANY portion of a JPEG image, you must read the WHOLE file.
If I have a 4,000x4,000 pixel 24-bit TIFF image that's 48 megabytes,
and I want to read a 256x256 piece of it, I may only need to read one
megabyte or less of that file. But if I convert it to a JPEG and
compress it to only 10% of the TIFF's size, I'll have a 4.8 megabyte
JPEG but I will need to read the whole 4.8 megabytes (and expand it
into that RAM you're trying to conserve) in order to get that 256x256
piece!

I have a feeling like I'm throwing myself into a
religious war, but here
goes. I think the problem that you have in your estimates
is that you're
using large (well, sort of large) jpegs. When you're
using properly
sized jpegs on modern servers at low-moderate load, you can
pretty much
disregard the processor time and memory issues, and just
compare on the
basis of the slowest component, disk access. 4000x4000 is
big & the
performance isn't going to be good (for the reasons you
mention), but he
never claimed to be using images that big. What he claimed
is that he's
using 1000x1000 jpegs. The 1000x1000 jpegs is pretty
critical because
it's that sweet spot where the decompress time is
small, the memory
demands manageable but the images are large enough that you
keep the
number of tiles down to a minimum for most uses. Those
jpegs might be in
the 200k size range, compared to a 256x256 block = 64k (x3
bands =192k?)
so he's reading a full 1000x1000 image in the disk
space of 1 256x256
block. If you're serving up 500x500 finished image,
you're using at
least 4 blocks in the geotiff, maybe 9 compared 1-4 with
the 1000x1000
jpeg. You could easily be spending 2x the time reading the
disk with
geotiff as you would

Re: [mapserver-users] Ed's Rules for the Best Raster Performance

2008-09-16 Thread pcreso
Hmmm the discussion here has only looked at jpgs & geotiffs.

I tend to use PNG format, which I believe is less lossy than jpgs, & supports 
transparency, which has worked fine for small scale implementations.

Can any experts here comment on the pros/cons of png vs jpg?


Thanks,

  Brent Wood



--- On Wed, 9/17/08, Ed McNierney <[EMAIL PROTECTED]> wrote:

> From: Ed McNierney <[EMAIL PROTECTED]>
> Subject: Re: [mapserver-users] Ed's Rules for the Best Raster Performance
> To: "Jeff Hoffmann" <[EMAIL PROTECTED]>
> Cc: "Jim Klassen" <[EMAIL PROTECTED]>, "mapserver-users@lists.osgeo.org" 
> 
> Date: Wednesday, September 17, 2008, 3:45 AM
> Jeff -
> 
> I'm not convinced, either, but I have never seen a
> real-world test that has shown otherwise.  There haven't
> been many such tests, but I have done them myself and
> several others have done them as well and posted the results
> on this list.  There may be tradeoffs which require a
> different implementation - that's life in the real world
> - but the data (the real, measured data, not theoretical
> speculation) has always been consistent.
> 
> If you want to shrink the file size in this thought
> experiment that's fine, but realize that you are thereby
> increasing the number of files that need to be opened for a
> random image request.  And each new open file incurs a
> relatively high cost (directory/disk seek overhead, etc.);
> those thousands or millions of JPEGs aren't just hard to
> keep track of - they hurt performance.  I have been the
> keeper of tens of millions of such files, and have seen some
> of those issues myself.
> 
> The example I gave (and my other examples) are, however,
> primarily intended to help people think about all the
> aspects of the problem.  File access performance in an
> application environment is a complex issue with many
> variables and any implementation should be prototyped and
> tested.  All I really care about is that you don't think
> it's simple and you try to think through all the
> consequences of an implementation plan.
> 
> I will also admit to being very guilty of not designing for
> "low-moderate load" situations, as I always like
> my Web sites to be able to survive the situation in which
> they accidentally turn out to be popular!
> 
> - Ed
> 
> 
> On 9/16/08 11:21 AM, "Jeff Hoffmann"
> <[EMAIL PROTECTED]> wrote:
> 
> Ed McNierney wrote:
> >
> > And remember that not all formats are created equal.
> In order to
> > decompress ANY portion of a JPEG image, you must read
> the WHOLE file.
> > If I have a 4,000x4,000 pixel 24-bit TIFF image
> that's 48 megabytes,
> > and I want to read a 256x256 piece of it, I may only
> need to read one
> > megabyte or less of that file. But if I convert it to
> a JPEG and
> > compress it to only 10% of the TIFF's size,
> I'll have a 4.8 megabyte
> > JPEG but I will need to read the whole 4.8 megabytes
> (and expand it
> > into that RAM you're trying to conserve) in order
> to get that 256x256
> > piece!
> I have a feeling like I'm throwing myself into a
> religious war, but here
> goes. I think the problem that you have in your estimates
> is that you're
> using large (well, sort of large) jpegs. When you're
> using properly
> sized jpegs on modern servers at low-moderate load, you can
> pretty much
> disregard the processor time and memory issues, and just
> compare on the
> basis of the slowest component, disk access. 4000x4000 is
> big & the
> performance isn't going to be good (for the reasons you
> mention), but he
> never claimed to be using images that big. What he claimed
> is that he's
> using 1000x1000 jpegs. The 1000x1000 jpegs is pretty
> critical because
> it's that sweet spot where the decompress time is
> small, the memory
> demands manageable but the images are large enough that you
> keep the
> number of tiles down to a minimum for most uses. Those
> jpegs might be in
> the 200k size range, compared to a 256x256 block = 64k (x3
> bands =192k?)
> so he's reading a full 1000x1000 image in the disk
> space of 1 256x256
> block. If you're serving up 500x500 finished image,
> you're using at
> least 4 blocks in the geotiff, maybe 9 compared 1-4 with
> the 1000x1000
> jpeg. You could easily be spending 2x the time reading the
> disk with
> geotiff as you would be with jpegs. I haven't sat down
> and done any side
> by side tests, but I can see how they would be competitive
> for certain
> uses when you look at it th

Re: [mapserver-users] Ed's Rules for the Best Raster Performance

2008-09-16 Thread Ed McNierney
Jeff -

I'm not convinced, either, but I have never seen a real-world test that has 
shown otherwise.  There haven't been many such tests, but I have done them 
myself and several others have done them as well and posted the results on this 
list.  There may be tradeoffs which require a different implementation - that's 
life in the real world - but the data (the real, measured data, not theoretical 
speculation) has always been consistent.

If you want to shrink the file size in this thought experiment that's fine, but 
realize that you are thereby increasing the number of files that need to be 
opened for a random image request.  And each new open file incurs a relatively 
high cost (directory/disk seek overhead, etc.); those thousands or millions of 
JPEGs aren't just hard to keep track of - they hurt performance.  I have been 
the keeper of tens of millions of such files, and have seen some of those 
issues myself.

The example I gave (and my other examples) are, however, primarily intended to 
help people think about all the aspects of the problem.  File access 
performance in an application environment is a complex issue with many 
variables and any implementation should be prototyped and tested.  All I really 
care about is that you don't think it's simple and you try to think through all 
the consequences of an implementation plan.

I will also admit to being very guilty of not designing for "low-moderate load" 
situations, as I always like my Web sites to be able to survive the situation 
in which they accidentally turn out to be popular!

- Ed


On 9/16/08 11:21 AM, "Jeff Hoffmann" <[EMAIL PROTECTED]> wrote:

Ed McNierney wrote:
>
> And remember that not all formats are created equal. In order to
> decompress ANY portion of a JPEG image, you must read the WHOLE file.
> If I have a 4,000x4,000 pixel 24-bit TIFF image that's 48 megabytes,
> and I want to read a 256x256 piece of it, I may only need to read one
> megabyte or less of that file. But if I convert it to a JPEG and
> compress it to only 10% of the TIFF's size, I'll have a 4.8 megabyte
> JPEG but I will need to read the whole 4.8 megabytes (and expand it
> into that RAM you're trying to conserve) in order to get that 256x256
> piece!
I have a feeling like I'm throwing myself into a religious war, but here
goes. I think the problem that you have in your estimates is that you're
using large (well, sort of large) jpegs. When you're using properly
sized jpegs on modern servers at low-moderate load, you can pretty much
disregard the processor time and memory issues, and just compare on the
basis of the slowest component, disk access. 4000x4000 is big & the
performance isn't going to be good (for the reasons you mention), but he
never claimed to be using images that big. What he claimed is that he's
using 1000x1000 jpegs. The 1000x1000 jpegs is pretty critical because
it's that sweet spot where the decompress time is small, the memory
demands manageable but the images are large enough that you keep the
number of tiles down to a minimum for most uses. Those jpegs might be in
the 200k size range, compared to a 256x256 block = 64k (x3 bands =192k?)
so he's reading a full 1000x1000 image in the disk space of 1 256x256
block. If you're serving up 500x500 finished image, you're using at
least 4 blocks in the geotiff, maybe 9 compared 1-4 with the 1000x1000
jpeg. You could easily be spending 2x the time reading the disk with
geotiff as you would be with jpegs. I haven't sat down and done any side
by side tests, but I can see how they would be competitive for certain
uses when you look at it that way. Of course there are other issues like
lossy compression on top of lossy compression, plus you've got to worry
about keeping track of thousands (millions?) of jpegs, but they're
probably manageable tradeoffs. Oh, and you don't really get the option
to have nodata areas with jpegs, either. There's probably other
drawbacks, too, but I'm not convinced that performance is one of them.

jeff

___
mapserver-users mailing list
mapserver-users@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapserver-users


Re: [mapserver-users] Ed's Rules for the Best Raster Performance

2008-09-16 Thread Jeff Hoffmann

Ed McNierney wrote:


And remember that not all formats are created equal. In order to 
decompress ANY portion of a JPEG image, you must read the WHOLE file. 
If I have a 4,000x4,000 pixel 24-bit TIFF image that’s 48 megabytes, 
and I want to read a 256x256 piece of it, I may only need to read one 
megabyte or less of that file. But if I convert it to a JPEG and 
compress it to only 10% of the TIFF’s size, I’ll have a 4.8 megabyte 
JPEG but I will need to read the whole 4.8 megabytes (and expand it 
into that RAM you’re trying to conserve) in order to get that 256x256 
piece!
I have a feeling like I'm throwing myself into a religious war, but here 
goes. I think the problem that you have in your estimates is that you're 
using large (well, sort of large) jpegs. When you're using properly 
sized jpegs on modern servers at low-moderate load, you can pretty much 
disregard the processor time and memory issues, and just compare on the 
basis of the slowest component, disk access. 4000x4000 is big & the 
performance isn't going to be good (for the reasons you mention), but he 
never claimed to be using images that big. What he claimed is that he's 
using 1000x1000 jpegs. The 1000x1000 jpegs is pretty critical because 
it's that sweet spot where the decompress time is small, the memory 
demands manageable but the images are large enough that you keep the 
number of tiles down to a minimum for most uses. Those jpegs might be in 
the 200k size range, compared to a 256x256 block = 64k (x3 bands =192k?) 
so he's reading a full 1000x1000 image in the disk space of 1 256x256 
block. If you're serving up 500x500 finished image, you're using at 
least 4 blocks in the geotiff, maybe 9 compared 1-4 with the 1000x1000 
jpeg. You could easily be spending 2x the time reading the disk with 
geotiff as you would be with jpegs. I haven't sat down and done any side 
by side tests, but I can see how they would be competitive for certain 
uses when you look at it that way. Of course there are other issues like 
lossy compression on top of lossy compression, plus you've got to worry 
about keeping track of thousands (millions?) of jpegs, but they're 
probably manageable tradeoffs. Oh, and you don't really get the option 
to have nodata areas with jpegs, either. There's probably other 
drawbacks, too, but I'm not convinced that performance is one of them.


jeff
___
mapserver-users mailing list
mapserver-users@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapserver-users


Re: [mapserver-users] Ed's Rules for the Best Raster Performance

2008-09-16 Thread Flavio Hendry
Hi

> #3. Don't compress your data
> - avoid jpg, ecw, and mrsid formats.

mmmh, my experience is, that ecw is extremely fast ...

Mit freundlichem Gruss / Best Regards
Flavio Hendry


TYDAC Web-Site:  http://www.tydac.ch
TYDAC MapServer: http://www.mapserver.ch
TYDAC SwissMaps: http://www.mapplus.ch

  Mit freundlichen Gruessen / Kind Regards
 mailto:[EMAIL PROTECTED]
 TYDAC AG - http://www.tydac.ch
Geographic Information Solutions
 Luternauweg 12 -- CH-3006 Bern
   Tel +41 (0)31 368 0180 - Fax +41 (0)31 368 1860

Location: http://www.mapplus.ch/adr/bern/luternauweg/12




___
mapserver-users mailing list
mapserver-users@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapserver-users


Re: [mapserver-users] Ed's Rules for the Best Raster Performance

2008-09-15 Thread Ed McNierney
Damn, I'm going to have to get around to unsubscribing soon so I can shut 
myself up!

Jim, please remember that your disk subsystem does not read only the precise 
amount of data you request.  The most expensive step is telling the disk head 
to seek to a random location to start reading the data.  The actual reading 
takes much less time in almost every case.  Let's invent an example so we don't 
have to do too much hard research .

A 7,200-RPM IDE drive has about a 9 ms average read seek time, and most are 
able to really transfer real data at around 60 MB/s or so (these are very rough 
approximations).  So to read 256KB of sequential data, you spend 9 ms seeking 
to the right track and then 4 ms reading the data - that's 13 ms.  Doubling the 
read size to 512KB will only take 4 ms (or 30%) longer, not 100% longer.  But 
even that's likely to be an exaggeration, because your disk drive - knowing 
that seeks are expensive - will typically read a LOT of data after doing a 
seek.  Remember that "16MB buffer" on the package?  The drive will likely read 
far more than you need, so the "improvement" you get by cutting the amount of 
data read in a given seek in half is likely to be nothing at all.

There are limits, of course.  The larger your data read is, the more likely it 
is to be split up into more than one location on disk.  That would mean another 
seek, which would definitely hurt.  But in general if you're already reading 
modest amounts of data in each shot, reducing the amount of data read by 
compression is likely to save you almost nothing in read time and cost you 
something in decompression time (CPUs are fast, so it might not cost much, but 
it will very likely require more RAM, boosting your per-request footprint, 
which means you're more at risk of starting to swap, etc.).

And remember that not all formats are created equal.  In order to decompress 
ANY portion of a JPEG image, you must read the WHOLE file.  If I have a 
4,000x4,000 pixel 24-bit TIFF image that's 48 megabytes, and I want to read a 
256x256 piece of it, I may only need to read one megabyte or less of that file. 
 But if I convert it to a JPEG and compress it to only 10% of the TIFF's size, 
I'll have a 4.8 megabyte JPEG but I will need to read the whole 4.8 megabytes 
(and expand it into that RAM you're trying to conserve) in order to get that 
256x256 piece!

Paul is right - sometimes compression is necessary when you run out of disk 
(but disks are pretty darn cheap - the cost per megabyte of the first hard 
drive I ever purchased (a Maynard Electronics 10 MB drive for my IBM PC) is 
approximately 450,000 times higher than it is today).  If you are inclined 
toward JPEG compression, read about and think about using tiled TIFFs with JPEG 
compression in the tiles; it's a reasonable compromise that saves space while 
reducing the whole-file-read overhead of JPEG.

Where the heck is that unsubscribe button?

- Ed


On 9/15/08 9:23 PM, "Paul Spencer" <[EMAIL PROTECTED]> wrote:

Jim, you would think that ;)  However, in practice I wouldn't expect
the disk access time for geotiffs to be significantly different from
jpeg if you have properly optimized your geotiffs using gdal_translate
-co "TILED=YES" - the internal structure is efficiently indexed so
that gdal only has to read the minimum number of 256x256 blocks to
cover the requested extent.  And using gdaladdo to generate overviews
just makes it that much more efficient.

Even if you are reading less physical data from the disk to get the
equivalent coverage from jpeg, the decompression overhead is enough to
negate the difference in IO time based on Ed's oft quoted advice (and
other's experience too I think).  The rules that apply in this case
seem to be 'tile your data', 'do not compress it' and 'buy the fastest
disk you can afford'.

Compression is useful and probably necessary if you hit disk space
limits.

Cheers

Paul

On 15-Sep-08, at 5:48 PM, Jim Klassen wrote:

> Just out of curiosity, has anyone tested the performance of Jpegs
> vs. GeoTiffs?
>
> I would expect at some point the additional disk access time
> required for GeoTiffs (of the same pixel count) as Jpegs would
> outweigh the additional processor time required to decompress the
> Jpegs. (Also the number of Jpegs that can fit in disk cache is
> greater than for similar GeoTiffs.)
>
> For reference we use 1000px by 1000px Jpeg tiles (with world files).
> We store multiple resolutions of the dataset, each in its own
> directory. We start at the native dataset resolution, and half that
> for each step, stopping when there are less than 10 tiles produced
> at that particular resolution. (I.e for one of our county wide
> datasets 6in/px, 1ft/px, 2ft/px, ... 32ft/px). A tileindex is then
> created for each resolution (using gdaltindex followed by shptree)
> and a layer is created in the mapfile for each tileindex and
> appropriate min/maxscales are set. The outputformat in the mapfile
> is set to jpeg.
>
> Our

Re: [mapserver-users] Ed's Rules for the Best Raster Performance

2008-09-15 Thread Paul Spencer
Jim, you would think that ;)  However, in practice I wouldn't expect  
the disk access time for geotiffs to be significantly different from  
jpeg if you have properly optimized your geotiffs using gdal_translate  
-co "TILED=YES" - the internal structure is efficiently indexed so  
that gdal only has to read the minimum number of 256x256 blocks to  
cover the requested extent.  And using gdaladdo to generate overviews  
just makes it that much more efficient.
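
(Spelled out, that recipe amounts to something like the two commands below;
file names, resampling method and overview levels are placeholders, not
Paul's exact settings.)

    # rewrite the image with 256x256 internal tiles
    gdal_translate -of GTiff -co TILED=YES ortho_raw.tif ortho_tiled.tif

    # add internal overviews so zoomed-out requests read far fewer blocks
    gdaladdo -r average ortho_tiled.tif 2 4 8 16 32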


Even if you are reading less physical data from the disk to get the  
equivalent coverage from jpeg, the decompression overhead is enough to  
negate the difference in IO time based on Ed's oft quoted advice (and  
other's experience too I think).  The rules that apply in this case  
seem to be 'tile your data', 'do not compress it' and 'buy the fastest  
disk you can afford'.


Compression is useful and probably necessary if you hit disk space  
limits.


Cheers

Paul

On 15-Sep-08, at 5:48 PM, Jim Klassen wrote:

Just out of curiosity, has anyone tested the performance of Jpegs  
vs. GeoTiffs?


I would expect at some point the additional disk access time  
required for GeoTiffs (of the same pixel count) as Jpegs would  
outweigh the additional processor time required to decompress the  
Jpegs. (Also the number of Jpegs that can fit in disk cache is  
greater than for similar GeoTiffs.)


For reference we use 1000px by 1000px Jpeg tiles (with world files).  
We store multiple resolutions of the dataset, each in its own  
directory. We start at the native dataset resolution, and half that  
for each step, stopping when there are less than 10 tiles produced  
at that particular resolution. (I.e for one of our county wide  
datasets 6in/px, 1ft/px, 2ft/px, ... 32ft/px). A tileindex is then  
created for each resolution (using gdaltindex followed by shptree)  
and a layer is created in the mapfile for each tileindex and  
appropriate min/maxscales are set. The outputformat in the mapfile  
is set to jpeg.


Our typical tile size is 200KB. There are about 20k tiles in the 6in/px 
dataset, 80k tiles in the 3in/px dataset (actually 4in data, but  
stored in 3in so it fits with the rest of the datasets well). I have  
tested and this large number of files in a directory doesn't seem to  
affect performance on our system.


Average access time for a 500x500px request to mapserver is 300ms  
measured at the client using perl/LWP and about 220ms with shp2img.


Machine is mapserver 5.2.0/x86-64/2.8GHz Xeon/Linux 2.6.16/ext3  
filesystem.


Jim Klassen
City of Saint Paul


"Fawcett, David" <[EMAIL PROTECTED]> 09/15/08 1:10 PM >>>

Better yet,

Add your comments to:

http://mapserver.gis.umn.edu/docs/howto/optimizeraster

and

http://mapserver.gis.umn.edu/docs/howto/optimizevector

I had always thought that all we needed to do to make these pages great
was to grok the list for all of Ed's posts...

David.

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Brent
Fraser
Sent: Monday, September 15, 2008 12:55 PM
To: mapserver-users@lists.osgeo.org
Subject: [mapserver-users] Ed's Rules for the Best Raster Performance


In honor of Ed's imminent retirement from the Mapserver Support Group,
I've put together "Ed's List for the Best Raster Performance":


#1. Pyramid the data
   - use MAXSCALE and MINSCALE in the LAYER object.

#2. Tile the data (and merge your upper levels of the pyramid for fewer files).
   - see the TILEINDEX object

#3. Don't compress your data
   - avoid jpg, ecw, and mrsid formats.

#4. Don't re-project your data on-the-fly.

#5. Get the fastest disks you can afford.


(Ed, feel free to edit...)

Brent Fraser
___
mapserver-users mailing list
mapserver-users@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapserver-users



__

   Paul Spencer
   Chief Technology Officer
   DM Solutions Group Inc
   http://www.dmsolutions.ca/

___
mapserver-users mailing list
mapserver-users@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapserver-users


Re: [mapserver-users] Ed's Rules for the Best Raster Performance

2008-09-15 Thread Robert Sanson
Has anyone tried Erdas Imagine .img files?
 
 
 
Robert Sanson, BVSc, MACVSc, PhD
Geospatial Services
AsureQuality Limited
PO Box 585, Palmerston North
NEW ZEALAND

Phone: +64 6 351-7990
Fax: +64 6 351-7919
Mobile: 021 448-472
E-mail: [EMAIL PROTECTED] 

>>> Gregor Mosheh <[EMAIL PROTECTED]> 16/09/2008 10:57 a.m. >>>
Jim Klassen wrote:
> Just out of curiosity, has anyone tested the performance of Jpegs vs. 
> GeoTiffs?

Yep. In my tests, GeoTIFF was the fastest format by some margin, even up 
to 2 GB filesizes. That was on 8-CPU machines, too. If you check the 
mailing list archive, you'll likely find the "papers" I posted to the 
list putting real numbers to it.

-- 
Gregor Mosheh / Greg Allensworth, BS, A+
System Administrator
HostGIS cartographic development & hosting services
http://www.HostGIS.com/ 

"Remember that no one cares if you can back up,
  only if you can restore." - AMANDA
___
mapserver-users mailing list
mapserver-users@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapserver-users


Re: [mapserver-users] Ed's Rules for the Best Raster Performance

2008-09-15 Thread Gregor Mosheh

Jim Klassen wrote:

Just out of curiosity, has anyone tested the performance of Jpegs vs. GeoTiffs?


Yep. In my tests, GeoTIFF was the fastest format by some margin, even up 
to 2 GB filesizes. That was on 8-CPU machines, too. If you check the 
mailing list archive, you'll likely find the "papers" I posted to the 
list putting real numbers to it.


--
Gregor Mosheh / Greg Allensworth, BS, A+
System Administrator
HostGIS cartographic development & hosting services
http://www.HostGIS.com/

"Remember that no one cares if you can back up,
 only if you can restore." - AMANDA
___
mapserver-users mailing list
mapserver-users@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapserver-users


RE: [mapserver-users] Ed's Rules for the Best Raster Performance

2008-09-15 Thread Jim Klassen
Just out of curiosity, has anyone tested the performance of Jpegs vs. GeoTiffs?

I would expect at some point the additional disk access time required for 
GeoTiffs (of the same pixel count) as Jpegs would outweigh the additional 
processor time required to decompress the Jpegs. (Also the number of Jpegs that 
can fit in disk cache is greater than for similar GeoTiffs.)

For reference we use 1000px by 1000px Jpeg tiles (with world files). We store 
multiple resolutions of the dataset, each in its own directory. We start at the 
native dataset resolution, and half that for each step, stopping when there are 
less than 10 tiles produced at that particular resolution. (I.e for one of our 
county wide datasets 6in/px, 1ft/px, 2ft/px, ... 32ft/px). A tileindex is then 
created for each resolution (using gdaltindex followed by shptree) and a layer 
is created in the mapfile for each tileindex and appropriate min/maxscales are 
set. The outputformat in the mapfile is set to jpeg.
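
(A rough illustration of that setup, not Jim's actual scripts or mapfile; the
paths, layer name and scale bounds are invented, and only one pyramid level is
shown.)

    # one tile index per resolution directory, plus a spatial index on it
    gdaltindex index_6in.shp 6in/*.jpg
    shptree index_6in.shp

    # matching mapfile layer for that level
    LAYER
      NAME "ortho_6in"
      TYPE RASTER
      STATUS ON
      TILEINDEX "index_6in.shp"
      TILEITEM "location"    # default field name written by gdaltindex
      MINSCALE 1             # placeholder scale bounds
      MAXSCALE 4800
    END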

Our typical tile size is 200KB. There are about 20k tiles in the 6in/px 
dataset, 80k tiles in the 3in/px dataset (actually 4in data, but stored in 3in 
so it fits with the rest of the datasets well). I have tested and this large 
number of files in a directory doesn't seem to affect performance on our system.

Average access time for a 500x500px request to mapserver is 300ms measured at 
the client using perl/LWP and about 220ms with shp2img.

Machine is mapserver 5.2.0/x86-64/2.8GHz Xeon/Linux 2.6.16/ext3 filesystem.
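
(A timing like that can be reproduced roughly as follows, assuming a mapfile
named city.map whose default EXTENT covers the area of interest; the 500x500
size matches the requests described above.)

    time shp2img -m city.map -o /tmp/test.jpg -s 500 500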

Jim Klassen
City of Saint Paul

>>> "Fawcett, David" <[EMAIL PROTECTED]> 09/15/08 1:10 PM >>>
Better yet, 

Add your comments to:

http://mapserver.gis.umn.edu/docs/howto/optimizeraster

and 

http://mapserver.gis.umn.edu/docs/howto/optimizevector

I had always thought that all we needed to do to make these pages great
was to grok the list for all of Ed's posts...

David.

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Brent
Fraser
Sent: Monday, September 15, 2008 12:55 PM
To: mapserver-users@lists.osgeo.org
Subject: [mapserver-users] Ed's Rules for the Best Raster Performance


In honor of Ed's imminent retirement from the Mapserver Support Group,
I've put together "Ed's List for the Best Raster Performance":


#1. Pyramid the data 
- use MAXSCALE and MINSCALE in the LAYER object.

#2. Tile the data (and merge your upper levels of the pyramid for fewer
files).
- see the TILEINDEX object

#3. Don't compress your data
- avoid jpg, ecw, and mrsid formats.

#4. Don't re-project your data on-the-fly.

#5. Get the fastest disks you can afford.


(Ed, feel free to edit...)

Brent Fraser
___
mapserver-users mailing list
mapserver-users@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapserver-users


Re: [mapserver-users] Ed's Rules for the Best Raster Performance

2008-09-15 Thread Brent Fraser
I love it!  I had some half-baked convoluted list-of-tileindexes idea; this is much better.  


We may have to allow PROJECTION=AUTO for the tiles (in the case where 
the tiles are in UTM across several zones), and allow the tile index to be in a 
different SRS (e.g. geographic) than the tiles (I can't recall if this is already 
implemented; I know it caused me a problem some years ago).

An elegant enhancement with great potential...

Brent



Steve Lime wrote:

Interesting idea. This could take the form of a tile index with a couple of 
additional columns, minscale and maxscale. Tiles would
be grouped together by common values in those columns. You could do interesting things like have high resolution data in 
some areas with other areas covered with lower resolution data over a broader range of scales. The whole layer could have its 
own floor/ceiling but tiles would be handled individually.


I wouldn't handle this as a new layer type, but rather by introducing 
parameters to indicate which columns to use, kinda like
TILEITEM. Your pyramids would be defined in the tile index...

I think the gdal tools would already support this nicely since they add to an 
index if it already exists so you could run those tools
over multiple datasets. Vector layers could also be handled this way, no reason 
you couldn't have 1 tile per scale range.

Steve


On 9/15/2008 at 2:18 PM, in message <[EMAIL PROTECTED]>, Brent

Fraser <[EMAIL PROTECTED]> wrote:


Jeff McKenna wrote:

my quick comments:

- adding overviews with GDAL's 'gdaladdo' utility is very important
In some cases.  It depends on your data.  As Ed once posted, it may be a 
good idea to switch to external overviews and merge some of the files to 
limit the number of file-opens Mapserver must do when the view is zoomed way 
out.


- I find your use of the word "pyramid" confusing (this seems to be a 
word that Arc* users are familiar with but has no real meaning in the 
MapServer world...I guess I've been on the "good" side for too long ha)


Not being an Arc* user I can't comment on its origins.  "Overview" is a 
good alternative, but it doesn't seem to convey the same "multi-levelness" as 
pyramids.  For example to be able to display the Global Landsat Mosaic, I 
created seven levels of a pyramid (each an external overview of the 
higher-resolution below it).  


Hmmm, maybe the plural of overview is pyramid... :)

Hey Steve L., maybe we should have a "PYRAMID" Layer type, to replace a set 
of scale-sensitive TILEINDEX Layers (this would help my 
every-layer-is-exposed-in-WMS problem too: 
http://trac.osgeo.org/mapserver/ticket/300).


Brent
___
mapserver-users mailing list
mapserver-users@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapserver-users


Re: [mapserver-users] Ed's Rules for the Best Raster Performance

2008-09-15 Thread Brent Fraser

Jeff,

 Very good point (sometimes I get stuck on just one way of doing things).   As 
David Fawcett pointed out, I should add those comments to the doc.

Thanks!
Brent


Jeff McKenna wrote:

Brent Fraser wrote:


Jeff McKenna wrote:

my quick comments:

- adding overviews with GDAL's 'gdaladdo' utility is very important


In some cases.  It depends on your data.  As Ed once posted, it may be 
a good idea to switch to external overviews and merge some of the 
files to limit the number of file-opens Mapserver must do when the 
view is zoomed way out.


I'm not here to argue with you, I am only pointing out the importance of 
the utility, which you did not mention in your notes.




___
mapserver-users mailing list
mapserver-users@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapserver-users


Re: [mapserver-users] Ed's Rules for the Best Raster Performance

2008-09-15 Thread Steve Lime
Interesting idea. This could take the form of a tile index with a couple of 
additional columns, minscale and maxscale. Tiles would be grouped together 
by common values in those columns. You could do interesting 
things like have high resolution data in 
some areas with other areas covered with lower resolution data over a broader 
range of scales. The whole layer could have its 
own floor/ceiling but tiles would be handled individually.

I wouldn't handle this as a new layer type, but rather by introducing 
parameters to indicate which columns to use, kinda like
TILEITEM. Your pyramids would be defined in the tile index...

I think the gdal tools would already support this nicely since they add to an 
index if it already exists so you could run those tools
over multiple datasets. Vector layers could also be handled this way, no reason 
you couldn't have 1 tile per scale range.

Steve

>>> On 9/15/2008 at 2:18 PM, in message <[EMAIL PROTECTED]>, Brent
Fraser <[EMAIL PROTECTED]> wrote:

> Jeff McKenna wrote:
>> my quick comments:
>> 
>> - adding overviews with GDAL's 'gdaladdo' utility is very important
> 
> In some cases.  It depends on your data.  As Ed once posted, it may be a 
> good idea to switch to external overviews and merge some of the files to 
> limit the number of file-opens Mapserver must do when the view is zoomed way 
> out.
> 
>> - I find your use of the word "pyramid" confusing (this seems to be a 
>> word that Arc* users are familiar with but has no real meaning in the 
>> MapServer world...I guess I've been on the "good" side for too long ha)
>> 
> Not being an Arc* user I can't comment on its origins.  "Overview" is a 
> good alternative, but it doesn't seem to convey the same "multi-levelness" as 
> pyramids.  For example to be able to display the Global Landsat Mosaic, I 
> created seven levels of a pyramid (each an external overview of the 
> higher-resolution below it).  
> 
> Hmmm, maybe the plural of overview is pyramid... :)
> 
> Hey Steve L., maybe we should have a "PYRAMID" Layer type, to replace a set 
> of scale-sensitive TILEINDEX Layers (this would help my 
> every-layer-is-exposed-in-WMS problem too: 
> http://trac.osgeo.org/mapserver/ticket/300).
> 
> Brent
> ___
> mapserver-users mailing list
> mapserver-users@lists.osgeo.org 
> http://lists.osgeo.org/mailman/listinfo/mapserver-users

___
mapserver-users mailing list
mapserver-users@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapserver-users


Re: [mapserver-users] Ed's Rules for the Best Raster Performance

2008-09-15 Thread Jeff McKenna

Brent Fraser wrote:


Jeff McKenna wrote:

my quick comments:

- adding overviews with GDAL's 'gdaladdo' utility is very important


In some cases.  It depends on your data.  As Ed once posted, it may be a 
good idea to switch to external overviews and merge some of the files to 
limit the number of file-opens Mapserver must do when the view is zoomed 
way out.


I'm not here to argue with you, I am only pointing out the importance of 
the utility, which you did not mention in your notes.



--
Jeff McKenna
FOSS4G Consulting and Training Services
http://www.gatewaygeomatics.com/

___
mapserver-users mailing list
mapserver-users@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapserver-users


Re: [mapserver-users] Ed's Rules for the Best Raster Performance

2008-09-15 Thread Brent Fraser


Jeff McKenna wrote:

my quick comments:

- adding overviews with GDAL's 'gdaladdo' utility is very important


In some cases.  It depends on your data.  As Ed once posted, it may be a good 
idea to switch to external overviews and merge some of the files to limit the 
number of file-opens Mapserver must do when the view is zoomed way out.
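
(For reference, a hedged example of the external-overview variant: with a GDAL
build recent enough to have the -ro flag, gdaladdo leaves the source file
untouched and writes the overviews to a side-car .ovr file. File name and
levels are placeholders.)

    gdaladdo -ro -r average merged_upper_levels.tif 2 4 8 16 32 64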

- I find your use of the word "pyramid" confusing (this seems to be a 
word that Arc* users are familiar with but has no real meaning in the 
MapServer world...I guess I've been on the "good" side for too long ha)


Not being an Arc* user I can't comment on its origins.  "Overview" is a good alternative, but it doesn't seem to convey the same "multi-levelness" as pyramids.  For example to be able to display the Global Landsat Mosaic, I created seven levels of a pyramid (each an external overview of the higher-resolution below it).  


Hmmm, maybe the plural of overview is pyramid... :)

Hey Steve L., maybe we should have a "PYRAMID" Layer type, to replace a set of 
scale-sensitive TILEINDEX Layers (this would help my every-layer-is-exposed-in-WMS 
problem too: http://trac.osgeo.org/mapserver/ticket/300).

Brent
___
mapserver-users mailing list
mapserver-users@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapserver-users


RE: [mapserver-users] Ed's Rules for the Best Raster Performance

2008-09-15 Thread Fawcett, David
Better yet, 

Add your comments to:

http://mapserver.gis.umn.edu/docs/howto/optimizeraster

and 

http://mapserver.gis.umn.edu/docs/howto/optimizevector

I had always thought that all we needed to do to make these pages great
was to grok the list for all of Ed's posts...

David.

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Brent
Fraser
Sent: Monday, September 15, 2008 12:55 PM
To: mapserver-users@lists.osgeo.org
Subject: [mapserver-users] Ed's Rules for the Best Raster Performance


In honor of Ed's imminent retirement from the Mapserver Support Group,
I've put together "Ed's List for the Best Raster Performance":


#1. Pyramid the data 
- use MAXSCALE and MINSCALE in the LAYER object.

#2. Tile the data (and merge your upper levels of the pyramid for fewer
files).
- see the TILEINDEX object

#3. Don't compress your data
- avoid jpg, ecw, and mrsid formats.

#4. Don't re-project your data on-the-fly.

#5. Get the fastest disks you can afford.


(Ed, feel free to edit...)

Brent Fraser
___
mapserver-users mailing list
mapserver-users@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapserver-users


Re: [mapserver-users] Ed's Rules for the Best Raster Performance

2008-09-15 Thread Jeff McKenna

my quick comments:

- adding overviews with GDAL's 'gdaladdo' utility is very important
- I find your use of the word "pyramid" confusing (this seems to be a 
word that Arc* users are familiar with but has no real meaning in the 
MapServer world...I guess I've been on the "good" side for too long ha)


--
Jeff McKenna
FOSS4G Consulting and Training Services
http://www.gatewaygeomatics.com/


Brent Fraser wrote:
In honor of Ed's imminent retirement from the Mapserver Support Group, 
I've put together "Ed's List for the Best Raster Performance":



#1. Pyramid the data
   - use MAXSCALE and MINSCALE in the LAYER object.


#2. Tile the data (and merge your upper levels of the pyramid for fewer files).
   - see the TILEINDEX object

#3. Don't compress your data
   - avoid jpg, ecw, and mrsid formats.

#4. Don't re-project your data on-the-fly.

#5. Get the fastest disks you can afford.


(Ed, feel free to edit...)

Brent Fraser


___
mapserver-users mailing list
mapserver-users@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapserver-users