Re: VS: [mapserver-users] Mapserver search performance
Hi Andreas,

I have looked and looked but could not find how to use ogr2ogr for the operation I am interested in. Could you point me in the right direction on how ogr2ogr can be used to add new fields to an existing DBF file from another DBF file? The external DBF file from which I need to fetch data does not have a corresponding SHP file associated with it. It's just a simple data file that has a field in common with the shapefile's attributes.

Varun

On Wed, May 4, 2011 at 4:23 AM, Eichner, Andreas - SID-NLKM wrote:
>
> Hi. The first thing to note: editing a DBF with Excel & Co. seems to be
> a _really_ bad idea. Those who tried that got shapes wired to the wrong
> attribute lines. So DBF, SHP, SHX and QIX files have to be handled as a
> whole, or you will usually end up with a corrupted dataset. I would suggest
> "ogr2ogr" from the GDAL suite. It's fast, reliable, can do joins and is
> aware of the mentioned dependencies.
> Please note that such a fileset can only provide a spatial index via a
> QIX file. This is OK if you only want to filter by BBOX. If you want to
> filter by attribute, all lines of the DBF still need to be scanned. In
> such cases it's wise to use ogr2ogr to split the data into pre-filtered
> sets.
> Using a more sophisticated database like PostGIS or SpatiaLite can help
> you implement more complex scenarios. Since MapServer has no native
> driver for SpatiaLite, it's probably not as fast as it is supposed to
> be. This mostly depends on OGR's implementation.
>
> Greetings
>
>> Is there a DBF editor out
>> there that can be used to import the fields from any external data
>> source into the shapefile attribute DBF without affecting the
>> structure? I looked for a lot but they do not have the capability of
>> doing a JOIN based on a common field and pulling data into the
>> shapefile DBF automatically.
>
___
mapserver-users mailing list
mapserver-users@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapserver-users
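One possible way to approach the question above, offered as a sketch rather than a confirmed answer from the thread: the OGR SQL dialect accepts a LEFT JOIN against a table in a second datasource, written as 'path'.layer, so ogr2ogr can write a new shapefile whose DBF carries the joined columns. The helper name build_join_command and the output name BLKS_01_joined.shp are hypothetical; BLKS_01, data/externalData.dbf and Field1 are taken from the map file quoted elsewhere in this thread, and the .shp extension on the source is assumed. The snippet only builds and prints the command so it can be inspected before it is run.

```python
import os
import shlex


def build_join_command(src_shp, ext_dbf, key, out_shp):
    """Return an ogr2ogr argument list that left-joins ext_dbf onto src_shp."""
    # OGR names a shapefile or DBF layer after the file's base name.
    layer = os.path.splitext(os.path.basename(src_shp))[0]
    ext_layer = os.path.splitext(os.path.basename(ext_dbf))[0]
    # OGR SQL can join a table from a second datasource: 'path'.layer
    sql = (
        f"SELECT * FROM {layer} "
        f"LEFT JOIN '{ext_dbf}'.{ext_layer} "
        f"ON {layer}.{key} = {ext_layer}.{key}"
    )
    # ogr2ogr takes the destination first, then the source.
    return ["ogr2ogr", "-f", "ESRI Shapefile", "-sql", sql, out_shp, src_shp]


# Hypothetical file names, based on the layer definition quoted in the thread.
cmd = build_join_command(
    "BLKS_01.shp", "data/externalData.dbf", "Field1", "BLKS_01_joined.shp"
)
print(shlex.join(cmd))  # inspect the command before running it
```

If the printed command looks right, it could be run with subprocess.run(cmd, check=True). Because the result goes to a new shapefile, the original SHP/SHX/DBF set stays untouched. DBF field names are limited to 10 characters, so the joined column names may come out truncated or renamed; checking the output with ogrinfo and re-running shptree on the new file would be sensible next steps.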
Re: VS: [mapserver-users] Mapserver search performance
I agree with Andreas: if you just want to manipulate the shapefile and continue to use a modified shapefile with MapServer, then SpatiaLite (or ogr2ogr) can help you, but if you want query flexibility with MapServer, then PostGIS is the way to go.
RE: VS: [mapserver-users] Mapserver search performance
I was thinking the same thing... You could either just import the shapefiles or use the virtual shape functionality.

David.

-----Original Message-----
From: mapserver-users-boun...@lists.osgeo.org [mailto:mapserver-users-boun...@lists.osgeo.org] On Behalf Of Mark Korver
Sent: Tuesday, May 03, 2011 4:06 PM
To: Varun saraf
Cc: mapserver-users@lists.osgeo.org
Subject: Re: VS: [mapserver-users] Mapserver search performance

Spatialite jumps into my mind.
http://www.gaia-gis.it/spatialite/
Re: VS: [mapserver-users] Mapserver search performance
Spatialite jumps into my mind.
http://www.gaia-gis.it/spatialite/
Re: VS: [mapserver-users] Mapserver search performance
Hi,

Thanks a lot for the quick help. I am a PHP/Java guy. I shall try my luck with some sort of PHP scripting, as I need a solution fairly quickly. I suggested shifting to a PostGIS system quite some time back, but I guess you know how it is with approvals :)

Thanks again,
Varun
Re: VS: [mapserver-users] Mapserver search performance
On May 3, 2011, at 3:36 PM, Varun saraf wrote:

> Hi Puneet,
>
> Thanks a lot for the prompt reply. I tried using Excel 2007 and was
> not able to re-save the DBF file after editing. Also, I have DBF
> files with about a million records, and Excel tends to hang during
> these operations.

Now you tell us ;-). My sense is that most GUI-based tools will choke on a million+ rows. You might well want to "upgrade" to a Pg/PostGIS solution at some point, but I realize that is not what you are asking for... (Also, I believe the most recent Excel versions might have lost the DBF translation capabilities. I am not an Excel person, so I can't confirm. Besides, I use Macs, and Excel is most likely hobbled on Macs anyway.)

> Is there some tool other than Excel which can do these
> operations?

I remember using Perl and XBase.pm to do this. It was really very quick and trivial, but it was a long time ago. Pick your language of choice. Try R.

>
> Thanks,
> Varun
>
> On Tue, May 3, 2011 at 4:27 PM, Mr. Puneet Kishor wrote:
>>
>> On May 3, 2011, at 3:19 PM, Varun saraf wrote:
>>
>>> ..Is there a DBF editor out
>>> there that can be used to import the fields from any external data
>>> source into the shapefile attribute DBF without affecting the
>>> structure? I looked for a lot but they do not have the capability of
>>> doing a JOIN based on a common field and pulling data into the
>>> shapefile DBF automatically.
>>
>> Just use MS-Excel or any program that can open and write DBF. As long as
>> you are careful not to change the number of rows, only add additional
>> columns, make sure the column names are not changed, and follow the
>> various DBF limitations, you should be OK. Make sure to keep a backup of
>> the original DBF in case things go ka-pow!
>>
>> Since the DBF data and the geometry are in separate files, there is no
>> issue with adding more attributes, provided you follow the care noted above.
>> ..
>>
>> Puneet.
Re: VS: [mapserver-users] Mapserver search performance
Hi,

Thanks a lot Andreas. That JOIN was the culprit, as you rightly suggested. Once I removed the join, performance improved dramatically: the 5-minute CGI run now takes under 5 seconds, which is simply amazing. I just need one more bit of help. Is there a DBF editor out there that can import fields from an external data source into the shapefile attribute DBF without affecting the structure? I looked at a lot of them, but they cannot do a JOIN based on a common field and pull the data into the shapefile DBF automatically. My company cannot afford the ArcGIS software.

Thanks,
Varun

On Fri, Apr 15, 2011 at 2:40 AM, Eichner, Andreas - SID-NLKM wrote:
>
> Hi,
>
> AFAIK dBase files don't provide an index themselves and there's no other
> way to provide one. shptree only creates a spatial index. Therefore
> only queries like 'does this geometry touch/intersect/lie within a given
> rectangle' can be accelerated.
>
>> I tried the shptree tool but did not see any performance improvement.
>
> So this becomes clear: by doing a JOIN, MapServer basically runs a loop:
> for each geometry that matches, search within the external data for a line
> matching the join condition.
>
>> Could it be because all of this information that I require is coming
>> from an external DBF file that I join to the layer/shape's DBF? Will
>> including all these fields/information in the shape's DBF file itself
>> help?
>
> I'm pretty sure that this would help, since this would avoid the
> (unaccelerated) join. With a database like PostgreSQL/PostGIS or SQLite
> it's basically the same problem: if you don't create an appropriate
> index for the join condition, this becomes a costly operation. Although
> those columns are usually primary and foreign key columns with an
> appropriate index, and the join condition is usually a simple equality
> match.
>
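The loop Andreas describes, and the effect of an index on the join column, can be illustrated with a small sketch. This is not MapServer's actual code: the function names are hypothetical, rows are plain dictionaries, and the external_ prefix simply mimics the [external_NAME] style used in the template quoted in this thread. The first function rescans the external table for every matched feature; the second builds a lookup table once, which is roughly what a database index on the join key provides.

```python
def _merge(feature, row):
    """Copy the feature and append the external columns with a prefix."""
    merged = dict(feature)
    if row is not None:
        merged.update({f"external_{name}": value for name, value in row.items()})
    return merged


def nested_loop_join(features, external, key):
    """Unindexed join: one scan of the external table per feature, O(n * m)."""
    joined = []
    for feature in features:
        match = None
        for row in external:  # linear scan, repeated for every feature
            if row[key] == feature[key]:
                match = row
                break
        joined.append(_merge(feature, match))
    return joined


def indexed_join(features, external, key):
    """Indexed join: build a lookup once, then one probe per feature, O(n + m)."""
    index = {}
    for row in external:
        index.setdefault(row[key], row)  # keep the first row for each key
    return [_merge(feature, index.get(feature[key])) for feature in features]


features = [
    {"GEOID": "a", "Field1": 1},
    {"GEOID": "b", "Field1": 2},
    {"GEOID": "c", "Field1": 9},  # no matching external row
]
external = [{"Field1": 2, "NAME": "two"}, {"Field1": 1, "NAME": "one"}]

assert nested_loop_join(features, external, "Field1") == indexed_join(
    features, external, "Field1"
)
```

With about a million DBF rows, as mentioned elsewhere in the thread, the first variant does up to a million comparisons per returned feature, which is consistent with the roughly one second per feature reported for the joined layer. Copying the external columns into the shapefile's own DBF removes the lookup altogether.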
Re: VS: [mapserver-users] Mapserver search performance
Hi,

I tried the shptree tool but did not see any meaningful performance improvement: it fetched results only about 4-5 seconds faster. Without QIX files, it took 5 minutes, and with QIX files, it took about 4 minutes and 56 seconds. All of my requests are based on the "MAPSHAPE" parameter in NQUERY mode. I noticed that MapServer was taking about 1 second per feature to output the information in the template file. Could it be because all of the information that I require is coming from an external DBF file that I join to the layer/shape's DBF? Will including all these fields/information in the shape's DBF file itself help?

Does MapServer pick up each feature and use a point-in-polygon approach to check whether it lies in the provided shape? If yes, how can I make this process faster? Is there a way of making it multi-threaded or doing more checks per second?

I am providing sample code. Hope this helps. I have about 35 layers in my map file. Should I merge all of these into a single layer?

MAP FILE
========
LAYER
  NAME "L1"
  METADATA
    qstring_validation_pattern '.'
  END
  STATUS DEFAULT
  TYPE point
  DATA BLKS_01
  TOLERANCE 0
  TOLERANCEUNITS METERS
  CLASS
    STYLE
      OUTLINECOLOR 255 0 0
    END
  END
  JOIN
    NAME "external"
    TABLE "data/externalData.dbf"
    FROM "Field1"
    TO "Field1"
    TYPE ONE-TO-ONE
  END
  TEMPLATE "blockTemplate.xml"
END

Template File
=============
[GEOID],[Field1],[LON],[LAT],[external_ID],[external_NAME],[external_STA],[external_NEEDSCORE],[external_MINSCORE];

Request
=======
http://localhost/cgi-bin/mapserv.exe?map=C:/ms4w/apps/GIS/centroides.map&mode=nquery&mapshape=-99.757833 32.474433 -99.758005 32.450679 -99.726591 32.447637 -99.727621 32.474723

Any help in this regard is greatly appreciated. Let me know if you need any more information.

Thanks,
Varun
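On the point-in-polygon question: the following is only a generic sketch of the usual two-stage approach (a cheap bounding-box test first, then an exact test on the survivors), not a description of MapServer's internals. The function names are hypothetical, the polygon is the MAPSHAPE from the request in this thread, and the three sample points are invented. Note that this sketch still visits every point for the bounding-box test; a spatial index such as a .qix quadtree exists precisely to avoid that full scan.

```python
def bounding_box(polygon):
    """Return (min_x, min_y, max_x, max_y) of a list of (x, y) vertices."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def point_in_polygon(x, y, polygon):
    """Ray casting: count crossings of a ray going in the +x direction."""
    inside = False
    j = len(polygon) - 1
    for i in range(len(polygon)):
        xi, yi = polygon[i]
        xj, yj = polygon[j]
        if (yi > y) != (yj > y):  # edge straddles the horizontal line at y
            x_cross = xi + (y - yi) * (xj - xi) / (yj - yi)
            if x < x_cross:
                inside = not inside
        j = i
    return inside


def query_points(points, polygon):
    """Return (matches, exact_tests): bbox prefilter, then the exact test."""
    min_x, min_y, max_x, max_y = bounding_box(polygon)
    matches, exact_tests = [], 0
    for x, y in points:
        if not (min_x <= x <= max_x and min_y <= y <= max_y):
            continue  # rejected cheaply, no exact test needed
        exact_tests += 1
        if point_in_polygon(x, y, polygon):
            matches.append((x, y))
    return matches, exact_tests


# MAPSHAPE coordinates from the request in this thread.
shape = [
    (-99.757833, 32.474433),
    (-99.758005, 32.450679),
    (-99.726591, 32.447637),
    (-99.727621, 32.474723),
]
# Invented sample points: inside, outside the bbox, inside the bbox only.
points = [(-99.74, 32.46), (-99.70, 32.46), (-99.7575, 32.4480)]
matches, exact_tests = query_points(points, shape)
```

Here only two of the three points reach the exact test, and one of them matches. For point layers this geometric work is cheap per feature, which fits the rest of the thread: the time was going into the attribute join, not into the spatial test.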
Re: VS: [mapserver-users] Mapserver search performance
Thanks a lot for the quick reply. I will give it a try.

On Mon, Apr 11, 2011 at 1:50 PM, Rahkonen Jukka wrote:
> Hi,
>
> Shptree will help, and a stopwatch will tell you how much. Without a spatial
> index MapServer needs to go through the whole shapefile every time. Have
> a try; it will not take very many seconds to run shptree. Make different
> requests, take times with and without .qix files, and you will get some
> numbers. Change the requests and request order (query 1 with
> .qix/without .qix, query 2 without .qix/with .qix) so that you can see
> whether you are actually testing the speed of disk access and memory cache
> access rather than the effect of having a spatial index.
> By adding DEBUG 5 to your layer you don't need a stopwatch; you'll get
> timings in the ms_errorfile.
>
>
> -Jukka Rahkonen-
>
> Varun saraf wrote:
>
>>
>> Hello Everyone,
>>
>> I have programmed a GIS application using MapServer, Google Maps and
>> TileCache. The application extracts the data (from the DBF file) for
>> all features (points) within a shape freely drawn by the user and
>> performs some statistical operations on that data. I use NQUERY mode
>> with the MAPSHAPE parameter to get all the data for the user-drawn
>> shape. MapServer takes about 5-10 seconds for a small shape
>> (a couple of square miles), but as the shape gets bigger (hundreds of
>> square miles), the time taken to fetch all data related to the
>> features/points lying in the shape grows very quickly (up to 2 hours
>> for some shapes). Until now, we were restricting the maximum area a
>> shape can have, but we have to get rid of that. Is there a way to
>> improve the performance in any way? Will SHPTREE work for this
>> purpose? The features are currently points only, but we may move to
>> polygons in future. We use the .shp files for the shapes. Is it
>> advantageous to move to a database instead? If yes, what database
>> works best?
>>
>> What I did notice is that for any given request to MapServer, however
>> large the shape, the CPU utilization never crosses 12%. Can we improve
>> performance by increasing the RAM or maybe moving to a solid state
>> drive? There is also the possibility of moving this application to
>> cloud computing. Anything that will improve the performance, really.
>> Can someone point me in the right direction as to what might be the
>> current bottleneck?
>>
>> The current setup is on Windows and uses MS4W on an Apache server.
>>
>> Thanks,
>> Varun
>>