Re: [mapserver-users] Ed's Rules for the Best Raster Performance

2008-09-17 Thread Doug_Newcomb
One other thing I just noticed: to effectively use the tiled tiffs I need
BigTiff support which looks like it is a real new thing (needs libtiff 4.0
which is still in beta according to the remotesensing.org ftp site.)
If you compile gdal 1.5.2 from source and use the tiff=internal option, you
get bigtiff support, without installing libtiff 4.0 yourself,
http://www.gdal.org/formats_list.html .  I have not tested this as an image
source for mapserver yet, but I have created a couple of bigtiff images.
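In case it helps anyone, a rough sketch of both steps (flags are from memory
and the filenames are placeholders, so check them against your GDAL build):

    # build gdal 1.5.2 against its bundled libtiff, which has bigtiff support
    ./configure --with-libtiff=internal --with-geotiff=internal
    make && make install

    # then force bigtiff output; internal tiling is worth adding at the same time
    gdal_translate -of GTiff -co BIGTIFF=YES -co TILED=YES \
        input.tif output_bigtiff.tif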

Doug


Doug Newcomb
USFWS
Raleigh, NC
919-856-4520 ext. 14 [EMAIL PROTECTED]
-

The opinions I express are my own and are not representative of the
official policy of the U.S.Fish and Wildlife Service or Dept. of Interior.
Life is too short for undocumented, proprietary data formats.


   
From: Jim Klassen [EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
Date: 09/16/2008 04:57 PM
To: [EMAIL PROTECTED], [EMAIL PROTECTED]
Cc: mapserver-users@lists.osgeo.org
Subject: Re: [mapserver-users] Ed's Rules for the Best Raster Performance

One other thing I just noticed: to effectively use the tiled tiffs I need
BigTiff support which looks like it is a real new thing (needs libtiff 4.0
which is still in beta according to the remotesensing.org ftp site.)
Anyway, I'm still going to give this a try and check the performance
difference.  For us, with existing hardware, disk space would be an issue
for uncompressed images, so I will be trying JPEG in TIFF. All I can say
about the JPEG-JPEG re-compression artifacts is that with the existing setup,
we haven't had any complaints.

The large number of files (in our case about 500k) doesn't seem to
affect the operational performance of our server in a meaningful way. (I
don't remember the exact numbers, but I have measured file access time in
directories with 10 files and with 100k files, and the difference was much
less than the total mapserver run time.) It is a pain though when making
copies or running backups. As Bob said, for typical image request sizes
of less than 1000px, the tiles/overviews design pretty much limits the
number of files mapserver has to touch to 4, so access time stays fairly
constant across different views.

Also, I forgot to mention that the disk subsystem here isn't exactly your
average desktop PC. (8*10K SCSI drives in RAID 10 with 1GB dedicated to the
RAID controller.) Similar requests on a much older machine we have around
here with slower disk/processor take about 800ms. I don't have any info to
attribute this to disk vs. cpu or both.

One of the main reasons we decided on JPEG here early on was the licensing
headaches surrounding mrsid/ecw/jp2. JPEG was easy and supported well by
just about everything. Actually, for that matter, TIFFs are a lot harder to
use directly by most (non-GIS) applications than JPEG/PNG too. We may be
past this being an issue, but once upon a time, before our use of
mapserver, the JPEG tiles were being accessed directly from the webserver
by various client applications (that didn't all understand tiff). Instead
of using world files to determine the extents, the tiles were accessed by a
predictable naming convention relating to the extent. When we started using
mapserver we retained the existing tiling scheme (adding world files and
tileindexes so mapserver could position the tiles) and it seemed to work
well, so we haven't given it much thought since.
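For anyone unfamiliar with that part, the world file side is trivial: six
lines per tile (pixel x size, two rotation terms, negative pixel y size,
then the x/y of the center of the upper-left pixel). A .jgw for one of our
1000x1000 tiles at 6in/px would look something like this (the coordinates
here are made up):

    0.5
    0.0
    0.0
    -0.5
    500000.25
    250000.75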

Thanks for all the interest and discussion around this.

Jim K

 Jeff Hoffmann [EMAIL PROTECTED] 09/16/08 4:03 PM 
Ed McNierney wrote:
 If you want to shrink the file size in this thought experiment that’s
 fine, but realize that you are thereby increasing the number of files
 that need to be opened for a random image request. And each new open
 file incurs a relatively high cost (directory/disk seek overhead,
 etc.); those thousands or millions of JPEGs aren’t 

Re: [mapserver-users] Ed's Rules for the Best Raster Performance

2008-09-16 Thread Flavio Hendry
Hi

 #3. Don't compress your data
 - avoid jpg, ecw, and mrsid formats.

mmmh, my experience is that ecw is extremely fast ...

Mit freundlichem Gruss / Best Regards
Flavio Hendry


TYDAC Web-Site:  http://www.tydac.ch
TYDAC MapServer: http://www.mapserver.ch
TYDAC SwissMaps: http://www.mapplus.ch

  Mit freundlichen Gruessen / Kind Regards
 mailto:[EMAIL PROTECTED]
 TYDAC AG - http://www.tydac.ch
Geographic Information Solutions
 Luternauweg 12 -- CH-3006 Bern
   Tel +41 (0)31 368 0180 - Fax +41 (0)31 368 1860

Location: http://www.mapplus.ch/adr/bern/luternauweg/12




___
mapserver-users mailing list
mapserver-users@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapserver-users


Re: [mapserver-users] Ed's Rules for the Best Raster Performance

2008-09-16 Thread Jeff Hoffmann

Ed McNierney wrote:


And remember that not all formats are created equal. In order to 
decompress ANY portion of a JPEG image, you must read the WHOLE file. 
If I have a 4,000x4,000 pixel 24-bit TIFF image that’s 48 megabytes, 
and I want to read a 256x256 piece of it, I may only need to read one 
megabyte or less of that file. But if I convert it to a JPEG and 
compress it to only 10% of the TIFF’s size, I’ll have a 4.8 megabyte 
JPEG but I will need to read the whole 4.8 megabytes (and expand it 
into that RAM you’re trying to conserve) in order to get that 256x256 
piece!
I have a feeling like I'm throwing myself into a religious war, but here 
goes. I think the problem that you have in your estimates is that you're 
using large (well, sort of large) jpegs. When you're using properly 
sized jpegs on modern servers at low-moderate load, you can pretty much 
disregard the processor time and memory issues, and just compare on the 
basis of the slowest component, disk access. 4000x4000 is big & the
performance isn't going to be good (for the reasons you mention), but he 
never claimed to be using images that big. What he claimed is that he's 
using 1000x1000 jpegs. The 1000x1000 jpegs is pretty critical because 
it's that sweet spot where the decompress time is small, the memory 
demands manageable but the images are large enough that you keep the 
number of tiles down to a minimum for most uses. Those jpegs might be in 
the 200k size range, compared to a 256x256 block = 64k (x3 bands = 192k?)
so he's reading a full 1000x1000 image in the disk space of 1 256x256
block. If you're serving up a 500x500 finished image, you're using at
least 4 blocks in the geotiff, maybe 9, compared to 1-4 with the 1000x1000
jpeg. You could easily be spending 2x the time reading the disk with 
geotiff as you would be with jpegs. I haven't sat down and done any side 
by side tests, but I can see how they would be competitive for certain 
uses when you look at it that way. Of course there are other issues like 
lossy compression on top of lossy compression, plus you've got to worry 
about keeping track of thousands (millions?) of jpegs, but they're 
probably manageable tradeoffs. Oh, and you don't really get the option 
to have nodata areas with jpegs, either. There's probably other 
drawbacks, too, but I'm not convinced that performance is one of them.


jeff
___
mapserver-users mailing list
mapserver-users@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapserver-users


Re: [mapserver-users] Ed's Rules for the Best Raster Performance

2008-09-16 Thread Ed McNierney
Jeff -

I'm not convinced, either, but I have never seen a real-world test that has 
shown otherwise.  There haven't been many such tests, but I have done them 
myself and several others have done them as well and posted the results on this 
list.  There may be tradeoffs which require a different implementation - that's 
life in the real world - but the data (the real, measured data, not theoretical 
speculation) has always been consistent.

If you want to shrink the file size in this thought experiment that's fine, but 
realize that you are thereby increasing the number of files that need to be 
opened for a random image request.  And each new open file incurs a relatively 
high cost (directory/disk seek overhead, etc.); those thousands or millions of 
JPEGs aren't just hard to keep track of - they hurt performance.  I have been 
the keeper of tens of millions of such files, and have seen some of those 
issues myself.

The example I gave (and my other examples) are, however, primarily intended to 
help people think about all the aspects of the problem.  File access 
performance in an application environment is a complex issue with many 
variables and any implementation should be prototyped and tested.  All I really 
care about is that you don't think it's simple and you try to think through all 
the consequences of an implementation plan.

I will also admit to being very guilty of not designing for low-moderate load 
situations, as I always like my Web sites to be able to survive the situation 
in which they accidentally turn out to be popular!

- Ed


On 9/16/08 11:21 AM, Jeff Hoffmann [EMAIL PROTECTED] wrote:

Ed McNierney wrote:

 And remember that not all formats are created equal. In order to
 decompress ANY portion of a JPEG image, you must read the WHOLE file.
 If I have a 4,000x4,000 pixel 24-bit TIFF image that's 48 megabytes,
 and I want to read a 256x256 piece of it, I may only need to read one
 megabyte or less of that file. But if I convert it to a JPEG and
 compress it to only 10% of the TIFF's size, I'll have a 4.8 megabyte
 JPEG but I will need to read the whole 4.8 megabytes (and expand it
 into that RAM you're trying to conserve) in order to get that 256x256
 piece!
I have a feeling like I'm throwing myself into a religious war, but here
goes. I think the problem that you have in your estimates is that you're
using large (well, sort of large) jpegs. When you're using properly
sized jpegs on modern servers at low-moderate load, you can pretty much
disregard the processor time and memory issues, and just compare on the
basis of the slowest component, disk access. 4000x4000 is big & the
performance isn't going to be good (for the reasons you mention), but he
never claimed to be using images that big. What he claimed is that he's
using 1000x1000 jpegs. The 1000x1000 jpegs is pretty critical because
it's that sweet spot where the decompress time is small, the memory
demands manageable but the images are large enough that you keep the
number of tiles down to a minimum for most uses. Those jpegs might be in
the 200k size range, compared to a 256x256 block = 64k (x3 bands =192k?)
so he's reading a full 1000x1000 image in the disk space of 1 256x256
block. If you're serving up 500x500 finished image, you're using at
least 4 blocks in the geotiff, maybe 9 compared 1-4 with the 1000x1000
jpeg. You could easily be spending 2x the time reading the disk with
geotiff as you would be with jpegs. I haven't sat down and done any side
by side tests, but I can see how they would be competitive for certain
uses when you look at it that way. Of course there are other issues like
lossy compression on top of lossy compression, plus you've got to worry
about keeping track of thousands (millions?) of jpegs, but they're
probably manageable tradeoffs. Oh, and you don't really get the option
to have nodata areas with jpegs, either. There's probably other
drawbacks, too, but I'm not convinced that performance is one of them.

jeff

___
mapserver-users mailing list
mapserver-users@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapserver-users


Re: [mapserver-users] Ed's Rules for the Best Raster Performance

2008-09-16 Thread pcreso
Hmmm the discussion here has only looked at jpgs & geotiffs.

I tend to use PNG format, which I believe is less lossy than jpgs, & supports
transparency, which has worked fine for small scale implementations.

Can any experts here comment on the pros/cons of png vs jpg?


Thanks,

  Brent Wood



--- On Wed, 9/17/08, Ed McNierney [EMAIL PROTECTED] wrote:

 From: Ed McNierney [EMAIL PROTECTED]
 Subject: Re: [mapserver-users] Ed's Rules for the Best Raster Performance
 To: Jeff Hoffmann [EMAIL PROTECTED]
 Cc: Jim Klassen [EMAIL PROTECTED], mapserver-users@lists.osgeo.org 
 mapserver-users@lists.osgeo.org
 Date: Wednesday, September 17, 2008, 3:45 AM
 Jeff -
 
 I'm not convinced, either, but I have never seen a
 real-world test that has shown otherwise.  There haven't
 been many such tests, but I have done them myself and
 several others have done them as well and posted the results
 on this list.  There may be tradeoffs which require a
 different implementation - that's life in the real world
 - but the data (the real, measured data, not theoretical
 speculation) has always been consistent.
 
 If you want to shrink the file size in this thought
 experiment that's fine, but realize that you are thereby
 increasing the number of files that need to be opened for a
 random image request.  And each new open file incurs a
 relatively high cost (directory/disk seek overhead, etc.);
 those thousands or millions of JPEGs aren't just hard to
 keep track of - they hurt performance.  I have been the
 keeper of tens of millions of such files, and have seen some
 of those issues myself.
 
 The example I gave (and my other examples) are, however,
 primarily intended to help people think about all the
 aspects of the problem.  File access performance in an
 application environment is a complex issue with many
 variables and any implementation should be prototyped and
 tested.  All I really care about is that you don't think
 it's simple and you try to think through all the
 consequences of an implementation plan.
 
 I will also admit to being very guilty of not designing for
 low-moderate load situations, as I always like
 my Web sites to be able to survive the situation in which
 they accidentally turn out to be popular!
 
 - Ed
 
 
 On 9/16/08 11:21 AM, Jeff Hoffmann
 [EMAIL PROTECTED] wrote:
 
 Ed McNierney wrote:
 
  And remember that not all formats are created equal.
 In order to
  decompress ANY portion of a JPEG image, you must read
 the WHOLE file.
  If I have a 4,000x4,000 pixel 24-bit TIFF image
 that's 48 megabytes,
  and I want to read a 256x256 piece of it, I may only
 need to read one
  megabyte or less of that file. But if I convert it to
 a JPEG and
  compress it to only 10% of the TIFF's size,
 I'll have a 4.8 megabyte
  JPEG but I will need to read the whole 4.8 megabytes
 (and expand it
  into that RAM you're trying to conserve) in order
 to get that 256x256
  piece!
 I have a feeling like I'm throwing myself into a
 religious war, but here
 goes. I think the problem that you have in your estimates
 is that you're
 using large (well, sort of large) jpegs. When you're
 using properly
 sized jpegs on modern servers at low-moderate load, you can
 pretty much
 disregard the processor time and memory issues, and just
 compare on the
 basis of the slowest component, disk access. 4000x4000 is
 big & the
 performance isn't going to be good (for the reasons you
 mention), but he
 never claimed to be using images that big. What he claimed
 is that he's
 using 1000x1000 jpegs. The 1000x1000 jpegs is pretty
 critical because
 it's that sweet spot where the decompress time is
 small, the memory
 demands manageable but the images are large enough that you
 keep the
 number of tiles down to a minimum for most uses. Those
 jpegs might be in
 the 200k size range, compared to a 256x256 block = 64k (x3
 bands =192k?)
 so he's reading a full 1000x1000 image in the disk
 space of 1 256x256
 block. If you're serving up 500x500 finished image,
 you're using at
 least 4 blocks in the geotiff, maybe 9 compared 1-4 with
 the 1000x1000
 jpeg. You could easily be spending 2x the time reading the
 disk with
 geotiff as you would be with jpegs. I haven't sat down
 and done any side
 by side tests, but I can see how they would be competitive
 for certain
 uses when you look at it that way. Of course there are
 other issues like
 lossy compression on top of lossy compression, plus
 you've got to worry
 about keeping track of thousands (millions?) of jpegs, but
 they're
 probably manageable tradeoffs. Oh, and you don't really
 get the option
 to have nodata areas with jpegs, either. There's
 probably other
 drawbacks, too, but I'm not convinced that performance
 is one of them.
 
 jeff
 
 ___
 mapserver-users mailing list
 mapserver-users@lists.osgeo.org
 http://lists.osgeo.org/mailman/listinfo/mapserver-users
___
mapserver-users mailing list

Re: [mapserver-users] Ed's Rules for the Best Raster Performance

2008-09-16 Thread Bob Basques

Hi All,

I work with Jim, so I suppose I should add some stuff here.

We use Jpegs for Aerial Photography only, and PNGs for everything else, 
for the same reasons stated here.   I'll let Jim talk to the technical 
stuff related to the pyramiding process as he put it all together into 
an automated process.


Just to clear up some small points, we only store 1000x1000 pixel images 
behind MapServer for serving of the raster data sets, no more than four 
(4) of these source images should (in the majority of cases) need to be 
accessed per web call.


bobb


[EMAIL PROTECTED] wrote:

Hmmm the discussion here has only looked at jpgs & geotiffs.

I tend to use PNG format, which I believe is less lossy than jpgs, & supports
transparency, which has worked fine for small scale implementations.

Can any experts here comment on the pros/cons of png vs jpg?


Thanks,

  Brent Wood



--- On Wed, 9/17/08, Ed McNierney [EMAIL PROTECTED] wrote:

  

From: Ed McNierney [EMAIL PROTECTED]
Subject: Re: [mapserver-users] Ed's Rules for the Best Raster Performance
To: Jeff Hoffmann [EMAIL PROTECTED]
Cc: Jim Klassen [EMAIL PROTECTED], mapserver-users@lists.osgeo.org 
mapserver-users@lists.osgeo.org
Date: Wednesday, September 17, 2008, 3:45 AM
Jeff -

I'm not convinced, either, but I have never seen a
real-world test that has shown otherwise.  There haven't
been many such tests, but I have done them myself and
several others have done them as well and posted the results
on this list.  There may be tradeoffs which require a
different implementation - that's life in the real world
- but the data (the real, measured data, not theoretical
speculation) has always been consistent.

If you want to shrink the file size in this thought
experiment that's fine, but realize that you are thereby
increasing the number of files that need to be opened for a
random image request.  And each new open file incurs a
relatively high cost (directory/disk seek overhead, etc.);
those thousands or millions of JPEGs aren't just hard to
keep track of - they hurt performance.  I have been the
keeper of tens of millions of such files, and have seen some
of those issues myself.

The example I gave (and my other examples) are, however,
primarily intended to help people think about all the
aspects of the problem.  File access performance in an
application environment is a complex issue with many
variables and any implementation should be prototyped and
tested.  All I really care about is that you don't think
it's simple and you try to think through all the
consequences of an implementation plan.

I will also admit to being very guilty of not designing for
low-moderate load situations, as I always like
my Web sites to be able to survive the situation in which
they accidentally turn out to be popular!

- Ed


On 9/16/08 11:21 AM, Jeff Hoffmann
[EMAIL PROTECTED] wrote:

Ed McNierney wrote:

 And remember that not all formats are created equal. In order to
 decompress ANY portion of a JPEG image, you must read the WHOLE file.
 If I have a 4,000x4,000 pixel 24-bit TIFF image that's 48 megabytes,
 and I want to read a 256x256 piece of it, I may only need to read one
 megabyte or less of that file. But if I convert it to a JPEG and
 compress it to only 10% of the TIFF's size, I'll have a 4.8 megabyte
 JPEG but I will need to read the whole 4.8 megabytes (and expand it
 into that RAM you're trying to conserve) in order to get that 256x256
 piece!

I have a feeling like I'm throwing myself into a religious war, but here
goes. I think the problem that you have in your estimates is that you're
using large (well, sort of large) jpegs. When you're using properly
sized jpegs on modern servers at low-moderate load, you can pretty much
disregard the processor time and memory issues, and just compare on the
basis of the slowest component, disk access. 4000x4000 is big & the
performance isn't going to be good (for the reasons you mention), but he
never claimed to be using images that big. What he claimed is that he's
using 1000x1000 jpegs. The 1000x1000 jpegs is pretty critical because
it's that sweet spot where the decompress time is small, the memory
demands manageable but the images are large enough that you keep the
number of tiles down to a minimum for most uses. Those jpegs might be in
the 200k size range, compared to a 256x256 block = 64k (x3 bands = 192k?)
so he's reading a full 1000x1000 image in the disk space of 1 256x256
block. If you're serving up a 500x500 finished image, you're using at
least 4 blocks in the geotiff, maybe 9, compared to 1-4 with the
1000x1000 jpeg. You could easily be spending 2x the time reading the
disk with geotiff as you would be with jpegs. I haven't sat down and
done any side by side tests, but I can see how they would be competitive
for certain uses when you look at it that way. Of course there are other
issues like lossy

Re: [mapserver-users] Ed's Rules for the Best Raster Performance

2008-09-16 Thread Jeff Hoffmann

Ed McNierney wrote:
If you want to shrink the file size in this thought experiment that’s 
fine, but realize that you are thereby increasing the number of files 
that need to be opened for a random image request. And each new open 
file incurs a relatively high cost (directory/disk seek overhead, 
etc.); those thousands or millions of JPEGs aren’t just hard to keep 
track of – they hurt performance. I have been the keeper of tens of 
millions of such files, and have seen some of those issues myself.
That's certainly a consideration, but you could also counter that by 
using jpeg compressed geotiffs. You'd want to make sure to tile them, 
otherwise you'd have that same big jpeg performance problem -- I think 
tiling effectively treats them as individual jpegs wrapped in one big 
file. No clue on what the actual performance of that would be, but it's 
something to consider if you've got filesystem performance problems.
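If anyone wants to experiment with that, the geotiff driver in gdal can
write jpeg-in-tiff directly; something like the following (the quality and
256x256 block size are just reasonable starting guesses):

    gdal_translate -of GTiff \
        -co COMPRESS=JPEG -co JPEG_QUALITY=75 \
        -co TILED=YES -co BLOCKXSIZE=256 -co BLOCKYSIZE=256 \
        input.tif jpeg_in_tiff.tif

For RGB imagery, adding -co PHOTOMETRIC=YCBCR should shrink the file
further, though I haven't checked what it does to read performance.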


The example I gave (and my other examples) are, however, primarily 
intended to help people think about all the aspects of the problem. 
File access performance in an application environment is a complex 
issue with many variables and any implementation should be prototyped 
and tested. All I really care about is that you don’t think it’s 
simple and you try to think through all the consequences of an 
implementation plan.
One of the reasons why I replied to this originally is that I think it's 
good to keep options open so people can evaluate them for their specific 
circumstances. What I was hearing you say was if you make bad choices, 
it'll perform badly & I'm just trying to throw out some other choices
that would work better and probably make it worth a try for a lot of
people. It's pretty common for me to get imagery in 5000x5000 or
10000x10000 geotiff tiles. I just got imagery for one county like that,
weighing in at close to 1TB; if I were to decide I can't afford that
kind of disk space for whatever reason, I'd investigate some compressed 
options. If I don't know any different, I might just compress that tile 
into one large jpeg (like in your example), discover the performance is 
terrible, discard it  file away in my mind that jpegs perform terribly. 
I might not understand that a 5000x5000 jpeg is going to use 75MB of 
memory and take an order of magnitude longer to decompress than that 
1000x1000 jpeg that only takes up 3MB in memory and decompresses nearly 
instantly while giving you that same 500x500 chunk of image. There are 
nice things about jpegs, like you don't need commercial libraries like 
you would with ecw, mrsid, jp2, you don't have to worry about licensing 
issues, size constraints, compiler environment, all that, which makes it 
a pretty attractive compressed format if you can get it to perform well, 
but if you don't know to break them up into smallish chunks I don't 
think getting to that performance level is really possible (for exactly 
the reasons you describe).
I will also admit to being very guilty of not designing for 
“low-moderate load” situations, as I always like my Web sites to be 
able to survive the situation in which they accidentally turn out to 
be popular!
I had second thoughts about saying this, because one man's low load 
might be high for someone else, especially if you're talking to someone
who has run a pretty high profile site, but I'd wager you're the 
exception and there are a lot of smaller fish out there. I'd think that 
Jim is probably more in line with an average user, a moderately sized 
city/county that would probably come nowhere near maxing out even modest 
hardware with those jpegs of his. It's probably those smaller fish where 
compression is more important, maybe they're fighting for space on a 
department-level server or can't get budget approval to upgrade their 
drives. I'd hate for those folks to have to settle for a slow (cpu 
intensive) wavelet-based compression when a properly configured jpeg 
layer might be the compromise they're looking for.


jeff
___
mapserver-users mailing list
mapserver-users@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapserver-users


Re: [mapserver-users] Ed's Rules for the Best Raster Performance

2008-09-16 Thread Jim Klassen
Just out of curiosity, has anyone tested the performance of Jpegs vs. GeoTiffs?

 I would expect at some point the additional disk access time
 required for GeoTiffs (of the same pixel count) as Jpegs would
 outweigh the additional processor time required to decompress the
 Jpegs. (Also the number of Jpegs that can fit in disk cache is
 greater than for similar GeoTiffs.)

 For reference we use 1000px by 1000px Jpeg tiles (with world files).
 We store multiple resolutions of the dataset, each in its own
 directory. We start at the native dataset resolution, and halve that
 for each step, stopping when there are less than 10 tiles produced
 at that particular resolution. (I.e., for one of our county-wide
 datasets 6in/px, 1ft/px, 2ft/px, ... 32ft/px). A tileindex is then
 created for each resolution (using gdaltindex followed by shptree)
 and a layer is created in the mapfile for each tileindex and
 appropriate min/maxscales are set. The outputformat in the mapfile
 is set to jpeg.

 Our typical tile size is 200KB. There are about 20k tiles in the 6in/
 px dataset, 80k tiles in the 3in/px dataset (actually 4in data, but
 stored in 3in so it fits with the rest of the datasets well). I have
 tested and this large number of files in a directory doesn't seem to
 affect performance on our system.

 Average access time for a 500x500px request to mapserver is 300ms
 measured at the client using perl/LWP and about 220ms with shp2img.

 Machine is mapserver 5.2.0/x86-64/2.8GHz Xeon/Linux 2.6.16/ext3
 filesystem.

 Jim Klassen
 City of Saint Paul

 Fawcett, David [EMAIL PROTECTED] 09/15/08 1:10 PM 
 Better yet,

 Add your comments to:

 http://mapserver.gis.umn.edu/docs/howto/optimizeraster

 and

 http://mapserver.gis.umn.edu/docs/howto/optimizevector

 I had always thought that all we needed to do to make these pages
 great
 was to grok the list for all of Ed's posts...

 David.

 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Brent
 Fraser
 Sent: Monday, September 15, 2008 12:55 PM
 To: mapserver-users@lists.osgeo.org
 Subject: [mapserver-users] Ed's Rules for the Best Raster Performance


 In honor of Ed's imminent retirement from the Mapserver Support Group,
 I've put together Ed's List for the Best Raster Performance:


 #1. Pyramid the data
- use MAXSCALE and MINSCALE in the LAYER object.

 #2. Tile the data (and merge your upper levels of the pyramid for
 fewer
 files).
- see the TILEINDEX object

 #3. Don't compress your data
- avoid jpg, ecw, and mrsid formats.

 #4. Don't re-project your data on-the-fly.

 #5. Get the fastest disks you can afford.


 (Ed, feel free to edit...)

 Brent Fraser
 ___
 mapserver-users mailing list
 mapserver-users@lists.osgeo.org
 http://lists.osgeo.org/mailman/listinfo/mapserver-users
 ___
 mapserver-users mailing list
 mapserver-users@lists.osgeo.org
 http://lists.osgeo.org/mailman/listinfo/mapserver-users

 ___
 mapserver-users mailing list
 mapserver-users@lists.osgeo.org
 http://lists.osgeo.org/mailman/listinfo/mapserver-users


__

Paul Spencer
Chief Technology Officer
DM Solutions Group Inc
http://www.dmsolutions.ca/

___
mapserver-users mailing list
mapserver-users@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapserver-users


___
mapserver-users mailing list
mapserver-users@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapserver-users


RE: [mapserver-users] Ed's Rules for the Best Raster Performance

2008-09-15 Thread Fawcett, David
Better yet, 

Add your comments to:

http://mapserver.gis.umn.edu/docs/howto/optimizeraster

and 

http://mapserver.gis.umn.edu/docs/howto/optimizevector

I had always thought that all we needed to do to make these pages great
was to grok the list for all of Ed's posts...

David.

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Brent
Fraser
Sent: Monday, September 15, 2008 12:55 PM
To: mapserver-users@lists.osgeo.org
Subject: [mapserver-users] Ed's Rules for the Best Raster Performance


In honor of Ed's imminent retirement from the Mapserver Support Group,
I've put together Ed's List for the Best Raster Performance:


#1. Pyramid the data 
- use MAXSCALE and MINSCALE in the LAYER object.

#2. Tile the data (and merge your upper levels of the pyramid for fewer
files).
- see the TILEINDEX object

#3. Don't compress your data
- avoid jpg, ecw, and mrsid formats.

#4. Don't re-project your data on-the-fly.

#5. Get the fastest disks you can afford.


(Ed, feel free to edit...)

Brent Fraser
___
mapserver-users mailing list
mapserver-users@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapserver-users
___
mapserver-users mailing list
mapserver-users@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapserver-users


Re: [mapserver-users] Ed's Rules for the Best Raster Performance

2008-09-15 Thread Brent Fraser


Jeff McKenna wrote:

my quick comments:

- adding overviews with GDAL's 'gdaladdo' utility is very important


In some cases.  It depends on your data.  As Ed once posted, it may be a good 
idea to switch to external overviews and merge some of the files to limit the 
number of file-opens Mapserver must do when the view is zoomed way out.
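Both steps can be done with the stock gdal utilities; a sketch (the
overview levels are arbitrary here, and the -ro flag for external .ovr
files depends on your gdaladdo version):

    # merge the small source tiles into one file for the zoomed-out levels
    gdal_merge.py -o merged.tif tiles/*.tif

    # write overviews to an external .ovr rather than into the tiff itself
    gdaladdo -ro merged.tif 2 4 8 16 32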

- I find your use of the word pyramid confusing (this seems to be a 
word that Arc* users are familiar with but has no real meaning in the 
MapServer world...I guess I've been on the good side for too long ha)


Not being an Arc* user I can't comment on its origins.  Overview is a good
alternative, but it doesn't seem to convey the same multi-levelness as
pyramids.  For example, to be able to display the Global Landsat Mosaic, I
created seven levels of a pyramid (each an external overview of the
higher-resolution level below it).


Hmmm, maybe the plural of overview is pyramid... :)

Hey Steve L., maybe we should have a PYRAMID Layer type, to replace a set of 
scale-sensitive TILEINDEX Layers (this would help my every-layer-is-exposed-in-WMS 
problem too: http://trac.osgeo.org/mapserver/ticket/300).

Brent
___
mapserver-users mailing list
mapserver-users@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapserver-users


Re: [mapserver-users] Ed's Rules for the Best Raster Performance

2008-09-15 Thread Jeff McKenna

Brent Fraser wrote:


Jeff McKenna wrote:

my quick comments:

- adding overviews with GDAL's 'gdaladdo' utility is very important


In some cases.  It depends on your data.  As Ed once posted, it may be a 
good idea to switch to external overviews and merge some of the files to 
limit the number of file-opens Mapserver must do when the view is zoomed 
way out.


I'm not here to argue with you, I am only pointing out the importance of 
the utility, which you did not mention in your notes.



--
Jeff McKenna
FOSS4G Consulting and Training Services
http://www.gatewaygeomatics.com/

___
mapserver-users mailing list
mapserver-users@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapserver-users


Re: [mapserver-users] Ed's Rules for the Best Raster Performance

2008-09-15 Thread Brent Fraser

Jeff,

 Very good point (sometimes I get stuck on just one way of doing things).   As 
David Fawcett pointed out, I should add those comments to the doc.

Thanks!
Brent


Jeff McKenna wrote:

Brent Fraser wrote:


Jeff McKenna wrote:

my quick comments:

- adding overviews with GDAL's 'gdaladdo' utility is very important


In some cases.  It depends on your data.  As Ed once posted, it may be 
a good idea to switch to external overviews and merge some of the 
files to limit the number of file-opens Mapserver must do when the 
view is zoomed way out.


I'm not here to argue with you, I am only pointing out the importance of 
the utility, which you did not mention in your notes.




___
mapserver-users mailing list
mapserver-users@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapserver-users


Re: [mapserver-users] Ed's Rules for the Best Raster Performance

2008-09-15 Thread Brent Fraser
I love it!  I had some half-baked convoluted list-of-tileindexes idea; this is much better.  


We may have to allow PROJECTION=AUTO for the tiles (in the case where
the tiles are in several UTM zones), and allow the tile index to be in a
different SRS (e.g. geographic) than the tiles (I can't recall if this is
already implemented; I know it caused me a problem some years ago).

An elegant enhancement with great potential...

Brent



Steve Lime wrote:

Interesting idea. This could take the form of a tile index with a couple of
additional columns, minscale and maxscale. Tiles would be grouped together
using common values for those columns. You could do interesting things like
have high resolution data in some areas with other areas covered with lower
resolution data over a broader range of scales. The whole layer could have
its own floor/ceiling but tiles would be handled individually.


I wouldn't handle this as a new layer type, but rather by introducing 
parameters to indicate which columns to use, kinda like
TILEITEM. Your pyramids would be defined in the tile index...

I think the gdal tools would already support this nicely since they add to an 
index if it already exists so you could run those tools
over multiple datasets. Vector layers could also be handled this way; no reason
you couldn't have 1 tile per scale range.

Steve


On 9/15/2008 at 2:18 PM, in message [EMAIL PROTECTED], Brent

Fraser [EMAIL PROTECTED] wrote:


Jeff McKenna wrote:

my quick comments:

- adding overviews with GDAL's 'gdaladdo' utility is very important
In some cases.  It depends on your data.  As Ed once posted, it may be a 
good idea to switch to external overviews and merge some of the files to 
limit the number of file-opens Mapserver must do when the view is zoomed way 
out.


- I find your use of the word pyramid confusing (this seems to be a 
word that Arc* users are familiar with but has no real meaning in the 
MapServer world...I guess I've been on the good side for too long ha)


Not being an Arc* user I can't comment on its origins.  Overview is a 
good alternative, but it doesn't seem to convey the same multi-levelness as 
pyramids.  For example to be able to display the Global Landsat Mosaic, I 
created seven levels of a pyramid (each an external overview of the 
higher-resolution below it).  


Hmmm, maybe the plural of overview is pyramid... :)

Hey Steve L., maybe we should have a PYRAMID Layer type, to replace a set 
of scale-sensitive TILEINDEX Layers (this would help my 
every-layer-is-exposed-in-WMS problem too: 
http://trac.osgeo.org/mapserver/ticket/300).


Brent
___
mapserver-users mailing list
mapserver-users@lists.osgeo.org 
http://lists.osgeo.org/mailman/listinfo/mapserver-users




___
mapserver-users mailing list
mapserver-users@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapserver-users


RE: [mapserver-users] Ed's Rules for the Best Raster Performance

2008-09-15 Thread Jim Klassen
Just out of curiosity, has anyone tested the performance of Jpegs vs. GeoTiffs?

I would expect that at some point the additional disk access time required for
GeoTiffs (of the same pixel count) compared to Jpegs would outweigh the additional
processor time required to decompress the Jpegs. (Also the number of Jpegs that 
can fit in disk cache is greater than for similar GeoTiffs.)

For reference we use 1000px by 1000px Jpeg tiles (with world files). We store 
multiple resolutions of the dataset, each in its own directory. We start at the 
native dataset resolution, and halve that for each step, stopping when there are
less than 10 tiles produced at that particular resolution. (I.e., for one of our
county-wide datasets 6in/px, 1ft/px, 2ft/px, ... 32ft/px). A tileindex is then
created for each resolution (using gdaltindex followed by shptree) and a layer 
is created in the mapfile for each tileindex and appropriate min/maxscales are 
set. The outputformat in the mapfile is set to jpeg.
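In concrete terms, each resolution level gets something like the following
(the names and scale values here are illustrative rather than our actual
config):

    gdaltindex index_6in.shp tiles_6in/*.jpg
    shptree index_6in.shp

and a matching layer in the mapfile:

    LAYER
      NAME "ortho_6in"
      TYPE RASTER
      STATUS ON
      TILEINDEX "index_6in.shp"
      TILEITEM "location"   # the column gdaltindex writes by default
      MINSCALE 1
      MAXSCALE 4800
    END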

Our typical tile size is 200KB. There are about 20k tiles in the 6in/px 
dataset, 80k tiles in the 3in/px dataset (actually 4in data, but stored in 3in 
so it fits with the rest of the datasets well). I have tested and this large 
number of files in a directory doesn't seem to affect performance on our system.

Average access time for a 500x500px request to mapserver is 300ms measured at 
the client using perl/LWP and about 220ms with shp2img.
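(Measured with the obvious thing, roughly:

    time shp2img -m ortho.map -o test.jpg -s 500 500 \
        -e 500000 250000 500250 250250

with the extent sized to cover 500x500px at 6in/px; the mapfile name and
coordinates are placeholders.)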

Machine is mapserver 5.2.0/x86-64/2.8GHz Xeon/Linux 2.6.16/ext3 filesystem.

Jim Klassen
City of Saint Paul

 Fawcett, David [EMAIL PROTECTED] 09/15/08 1:10 PM 
Better yet, 

Add your comments to:

http://mapserver.gis.umn.edu/docs/howto/optimizeraster

and 

http://mapserver.gis.umn.edu/docs/howto/optimizevector

I had always thought that all we needed to do to make these pages great
was to grok the list for all of Ed's posts...

David.

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Brent
Fraser
Sent: Monday, September 15, 2008 12:55 PM
To: mapserver-users@lists.osgeo.org
Subject: [mapserver-users] Ed's Rules for the Best Raster Performance


In honor of Ed's imminent retirement from the Mapserver Support Group,
I've put together Ed's List for the Best Raster Performance:


#1. Pyramid the data 
- use MAXSCALE and MINSCALE in the LAYER object.

#2. Tile the data (and merge your upper levels of the pyramid for fewer
files).
- see the TILEINDEX object

#3. Don't compress your data
- avoid jpg, ecw, and mrsid formats.

#4. Don't re-project your data on-the-fly.

#5. Get the fastest disks you can afford.


(Ed, feel free to edit...)

Brent Fraser
___
mapserver-users mailing list
mapserver-users@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapserver-users
___
mapserver-users mailing list
mapserver-users@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapserver-users

___
mapserver-users mailing list
mapserver-users@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapserver-users


Re: [mapserver-users] Ed's Rules for the Best Raster Performance

2008-09-15 Thread Gregor Mosheh

Jim Klassen wrote:

Just out of curiosity, has anyone tested the performance of Jpegs vs. GeoTiffs?


Yep. In my tests, GeoTIFF was the fastest format by some margin, even up 
to 2 GB filesizes. That was on 8-CPU machines, too. If you check the 
mailing list archive, you'll likely find the papers I posted to the 
list putting real numbers to it.


--
Gregor Mosheh / Greg Allensworth, BS, A+
System Administrator
HostGIS cartographic development  hosting services
http://www.HostGIS.com/

Remember that no one cares if you can back up,
 only if you can restore. - AMANDA
___
mapserver-users mailing list
mapserver-users@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapserver-users


Re: [mapserver-users] Ed's Rules for the Best Raster Performance

2008-09-15 Thread Robert Sanson
Has anyone tried Erdas Imagine .img files?
 
 
 
Robert Sanson, BVSc, MACVSc, PhD
Geospatial Services
AsureQuality Limited
PO Box 585, Palmerston North
NEW ZEALAND

Phone: +64 6 351-7990
Fax: +64 6 351-7919
Mobile: 021 448-472
E-mail: [EMAIL PROTECTED] 

 Gregor Mosheh [EMAIL PROTECTED] 16/09/2008 10:57 a.m. 
Jim Klassen wrote:
 Just out of curiosity, has anyone tested the performance of Jpegs vs. 
 GeoTiffs?

Yep. In my tests, GeoTIFF was the fastest format by some margin, even up 
to 2 GB filesizes. That was on 8-CPU machines, too. If you check the 
mailing list archive, you'll likely find the papers I posted to the 
list putting real numbers to it.

-- 
Gregor Mosheh / Greg Allensworth, BS, A+
System Administrator
HostGIS cartographic development  hosting services
http://www.HostGIS.com/ 

Remember that no one cares if you can back up,
  only if you can restore. - AMANDA
___
mapserver-users mailing list
mapserver-users@lists.osgeo.org 
http://lists.osgeo.org/mailman/listinfo/mapserver-users 

--
The contents of this email are confidential to AsureQuality. If you have 
received this communication in error please notify the sender immediately and 
delete the message and any attachments. The opinions expressed in this email 
are not necessarily those of AsureQuality. This message has been scanned for 
known viruses before delivery. AsureQuality supports the Unsolicited Electronic 
Messages Act 2007. If you do not wish to receive similar communications in 
future, please notify the sender of this message.
--


This message has been scanned for malware by SurfControl plc. 
www.surfcontrol.com
___
mapserver-users mailing list
mapserver-users@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapserver-users


Re: [mapserver-users] Ed's Rules for the Best Raster Performance

2008-09-15 Thread Paul Spencer
Jim, you would think that ;)  However, in practice I wouldn't expect  
the disk access time for geotiffs to be significantly different from  
jpeg if you have properly optimized your geotiffs using gdal_translate  
-co TILED=YES - the internal structure is efficiently indexed so  
that gdal only has to read the minimum number of 256x256 blocks to  
cover the requested extent.  And using gdaladdo to generate overviews  
just makes it that much more efficient.
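For anyone who hasn't done it, that optimization amounts to (the resampling
method and overview levels are your choice):

    gdal_translate -of GTiff -co TILED=YES source.tif tiled.tif
    gdaladdo -r average tiled.tif 2 4 8 16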


Even if you are reading less physical data from the disk to get the  
equivalent coverage from jpeg, the decompression overhead is enough to  
negate the difference in IO time based on Ed's oft quoted advice (and  
other's experience too I think).  The rules that apply in this case  
seem to be 'tile your data', 'do not compress it' and 'buy the fastest  
disk you can afford'.


Compression is useful and probably necessary if you hit disk space  
limits.


Cheers

Paul

On 15-Sep-08, at 5:48 PM, Jim Klassen wrote:

Just out of curiosity, has anyone tested the performance of Jpegs  
vs. GeoTiffs?


I would expect at some point the additional disk access time  
required for GeoTiffs (of the same pixel count) as Jpegs would  
outweigh the additional processor time required to decompress the  
Jpegs. (Also the number of Jpegs that can fit in disk cache is  
greater than for similar GeoTiffs.)


For reference we use 1000px by 1000px Jpeg tiles (with world files).  
We store multiple resolutions of the dataset, each in its own  
directory. We start at the native dataset resolution, and halve that
for each step, stopping when there are less than 10 tiles produced  
at that particular resolution. (I.e., for one of our county-wide
datasets 6in/px, 1ft/px, 2ft/px, ... 32ft/px). A tileindex is then  
created for each resolution (using gdaltindex followed by shptree)  
and a layer is created in the mapfile for each tileindex and  
appropriate min/maxscales are set. The outputformat in the mapfile  
is set to jpeg.


Our typical tile size is 200KB. There are about 20k tiles in the 6in/ 
px dataset, 80k tiles in the 3in/px dataset (actually 4in data, but  
stored in 3in so it fits with the rest of the datasets well). I have  
tested and this large number of files in a directory doesn't seem to  
affect performance on our system.


Average access time for a 500x500px request to mapserver is 300ms  
measured at the client using perl/LWP and about 220ms with shp2img.


Machine is mapserver 5.2.0/x86-64/2.8GHz Xeon/Linux 2.6.16/ext3  
filesystem.


Jim Klassen
City of Saint Paul


Fawcett, David [EMAIL PROTECTED] 09/15/08 1:10 PM 

Better yet,

Add your comments to:

http://mapserver.gis.umn.edu/docs/howto/optimizeraster

and

http://mapserver.gis.umn.edu/docs/howto/optimizevector

I had always thought that all we needed to do to make these pages  
great

was to grok the list for all of Ed's posts...

David.

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Brent
Fraser
Sent: Monday, September 15, 2008 12:55 PM
To: mapserver-users@lists.osgeo.org
Subject: [mapserver-users] Ed's Rules for the Best Raster Performance


In honor of Ed's imminent retirement from the Mapserver Support Group,
I've put together Ed's List for the Best Raster Performance:


#1. Pyramid the data
   - use MAXSCALE and MINSCALE in the LAYER object.

#2. Tile the data (and merge your upper levels of the pyramid for  
fewer

files).
   - see the TILEINDEX object

#3. Don't compress your data
   - avoid jpg, ecw, and mrsid formats.

#4. Don't re-project your data on-the-fly.

#5. Get the fastest disks you can afford.


(Ed, feel free to edit...)

Brent Fraser
___
mapserver-users mailing list
mapserver-users@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapserver-users
___
mapserver-users mailing list
mapserver-users@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapserver-users

___
mapserver-users mailing list
mapserver-users@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapserver-users



__

   Paul Spencer
   Chief Technology Officer
   DM Solutions Group Inc
   http://www.dmsolutions.ca/

___
mapserver-users mailing list
mapserver-users@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapserver-users


Re: [mapserver-users] Ed's Rules for the Best Raster Performance

2008-09-15 Thread Ed McNierney
Our typical tile size is 200KB. There are about 20k tiles in the 6in/
 px dataset, 80k tiles in the 3in/px dataset (actually 4in data, but
 stored in 3in so it fits with the rest of the datasets well). I have
 tested and this large number of files in a directory doesn't seem to
 affect performance on our system.

 Average access time for a 500x500px request to mapserver is 300ms
 measured at the client using perl/LWP and about 220ms with shp2img.

 Machine is mapserver 5.2.0/x86-64/2.8GHz Xeon/Linux 2.6.16/ext3
 filesystem.

 Jim Klassen
 City of Saint Paul

 Fawcett, David [EMAIL PROTECTED] 09/15/08 1:10 PM 
 Better yet,

 Add your comments to:

 http://mapserver.gis.umn.edu/docs/howto/optimizeraster

 and

 http://mapserver.gis.umn.edu/docs/howto/optimizevector

 I had always thought that all we needed to do to make these pages
 great
 was to grok the list for all of Ed's posts...

 David.

 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Brent
 Fraser
 Sent: Monday, September 15, 2008 12:55 PM
 To: mapserver-users@lists.osgeo.org
 Subject: [mapserver-users] Ed's Rules for the Best Raster Performance


 In honor of Ed's imminent retirement from the Mapserver Support Group,
 I've put together Ed's List for the Best Raster Performance:


 #1. Pyramid the data
- use MAXSCALE and MINSCALE in the LAYER object.

 #2. Tile the data (and merge your upper levels of the pyramid for
 fewer
 files).
- see the TILEINDEX object

 #3. Don't compress your data
- avoid jpg, ecw, and mrsid formats.

 #4. Don't re-project your data on-the-fly.

 #5. Get the fastest disks you can afford.


 (Ed, feel free to edit...)

 Brent Fraser
 ___
 mapserver-users mailing list
 mapserver-users@lists.osgeo.org
 http://lists.osgeo.org/mailman/listinfo/mapserver-users
 ___
 mapserver-users mailing list
 mapserver-users@lists.osgeo.org
 http://lists.osgeo.org/mailman/listinfo/mapserver-users

 ___
 mapserver-users mailing list
 mapserver-users@lists.osgeo.org
 http://lists.osgeo.org/mailman/listinfo/mapserver-users


__

Paul Spencer
Chief Technology Officer
DM Solutions Group Inc
http://www.dmsolutions.ca/

___
mapserver-users mailing list
mapserver-users@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapserver-users

___
mapserver-users mailing list
mapserver-users@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapserver-users