Richard,

Data management is a common problem. The best practice for me has been to separate physical storage from logical storage. This is easiest to do on Linux systems with symbolic links. For physical storage, I like to keep datasets self-contained, especially if I have to update them with any regularity. Because each dataset is self-contained (i.e. in a single directory tree), it is easy to create a parallel tree with new data and swap out the old data for the new by changing the symlink to point at the new tree. This also allows any data tree to reside on any partition.

For logical storage, I think in terms of maps or applications, and I build a single directory for each. Into this directory I link the physical datasets I need, and I create all the tileindexes relative to that directory. Then in the mapfile I set DATAPATH to point to that directory. So for example, I have tiger data directories for the separate tiger releases with physical names like:

/u/data/tiger2004fe/
/u/data/tiger2004se/
/u2/data/tiger2005fe/

In my application directory I have something like:

/u/application/tiger -> /u2/data/tiger2005fe/

I refer to the tiger data as "tiger" regardless of the version I am showing. That way I can change the underlying data without the application caring, and I don't need to rebuild the tileindexes.
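
The swap described above can be scripted. What follows is only a minimal sketch in Python, assuming a Linux system; the helper name swap_symlink and the temporary directory layout in the demo are hypothetical and stand in for paths like the ones listed above. It builds the new link under a temporary name and renames it over the old one, so the application never sees a missing "tiger" link.

```python
import os
import tempfile


def swap_symlink(link_path, new_target):
    """Point link_path at new_target, replacing an existing symlink if present."""
    link_dir = os.path.dirname(os.path.abspath(link_path))
    # Create the new link under a temporary name in the same directory,
    # so the final rename never crosses a filesystem boundary.
    tmp_name = ".%s.tmp.%d" % (os.path.basename(link_path), os.getpid())
    tmp_link = os.path.join(link_dir, tmp_name)
    os.symlink(new_target, tmp_link)
    try:
        # rename() replaces the old symlink itself; it does not follow it.
        os.replace(tmp_link, link_path)
    except OSError:
        os.unlink(tmp_link)
        raise


if __name__ == "__main__":
    # Hypothetical layout built in a scratch directory for demonstration.
    root = tempfile.mkdtemp()
    old_data = os.path.join(root, "data", "tiger2004se")
    new_data = os.path.join(root, "data", "tiger2005fe")
    app_dir = os.path.join(root, "application")
    for d in (old_data, new_data, app_dir):
        os.makedirs(d)

    link = os.path.join(app_dir, "tiger")
    swap_symlink(link, old_data)   # initial link
    swap_symlink(link, new_data)   # switch to the newer release
    print(os.readlink(link))
```

Because the mapfile and tileindexes only ever refer to the logical name, nothing else needs to change after the swap.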

If I want to move the application to another server, I move the physical datasets I need along with the application directory, then fix up the symlinks to point to their respective new locations. 99% of the time I do not need to rebuild the tileindexes.
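
Fixing up the symlinks after such a move can also be sketched in Python. This is an illustration under assumptions, not a tested migration tool: the function name retarget_links and the example prefixes (/u/data/ and /srv/data/) are hypothetical, and it assumes every link to be changed lives directly in the application directory and stores an absolute target.

```python
import os


def retarget_links(app_dir, old_prefix, new_prefix):
    """Rewrite symlinks in app_dir whose targets start with old_prefix.

    Returns a sorted list of (link name, new target) pairs that were changed.
    """
    changed = []
    # Snapshot the directory first, since links are replaced while looping.
    with os.scandir(app_dir) as it:
        entries = list(it)
    for entry in entries:
        if not entry.is_symlink():
            continue
        target = os.readlink(entry.path)
        if not target.startswith(old_prefix):
            continue
        new_target = new_prefix + target[len(old_prefix):]
        # Build the replacement under a temporary name, then rename over the old link.
        tmp_link = entry.path + ".tmp"
        os.symlink(new_target, tmp_link)
        os.replace(tmp_link, entry.path)
        changed.append((entry.name, new_target))
    return sorted(changed)


if __name__ == "__main__":
    import tempfile

    app_dir = tempfile.mkdtemp()
    # The link may dangle here; only the stored target text matters for the demo.
    os.symlink("/u/data/tiger2005fe/", os.path.join(app_dir, "tiger"))
    print(retarget_links(app_dir, "/u/data/", "/srv/data/"))
```

Links whose targets do not match the old prefix are left alone, so the function can be run more than once without harm.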

Hope this helps,
  -Steve W.

Richard Taylor wrote:
Hello LIST

this is not just a MapServer question, but perhaps some of you farther down the path have insights that you are willing to pass on.

As my learning curve progresses i find that local data volume is increasing rapidly. It started of course with local apps, then expanded with my introduction to MapServer, in my case ms4w, for getting the basics, then has continued on to local directories to send up to remote unix system instances.

While the mapfiles allow one to give a full path to your data, meaning locally you can get at it wherever it is, that structure does not hold well, or at all, with remote instances. The end result is multiple copies of many files, some of which are quite large: one for local apps, one for ms4w, and one for each remote mapserver.

One solution is to keep getting larger storage space, but feeling this might be a common problem, I wonder if any of the long term users or those with large data volumes have come to a 'best practises' solution to this issue.

thanks in advance

richard taylor
