Re: VS: [mapserver-users] Mapserver search performance

2011-04-11 Thread Varun saraf
Thanks a lot for the quick reply. I will give it a try.

On Mon, Apr 11, 2011 at 1:50 PM, Rahkonen Jukka
 wrote:
> Hi,
>
> Shptree will help and a stopwatch will tell you how much. Without a spatial
> index MapServer needs to go through the whole shapefile every time. Have
> a try; it will not take many seconds to run shptree. Make different
> requests, take times with and without .qix files, and you will get some
> numbers. Change the requests and the request order (query 1 with
> .qix/without .qix, query 2 without .qix/with .qix) so that you can see
> whether you are actually testing the speed of disk access and memory cache
> access and not the effect of having a spatial index.
> By adding DEBUG 5 to your layer you don't need a stopwatch; you'll get
> timings in the ms_errorfile.
>
>
> -Jukka Rahkonen-
>
> Varun saraf wrote:
>
>>
>> Hello Everyone,
>>
>> I have programmed a GIS application using Mapserver, Google maps and
>> Tilecache. The functionality of this GIS application is to extract the
>> data (from the dbf file) for all features (Points) within a randomly
>> drawn user shape and do some statistical operations on that data. I
>> use NQUERY mode with the MAPSHAPE parameter to get all the data for the
>> user-drawn shape. MapServer takes about 5-10 seconds for a small shape
>> (a couple of square miles) but as the shape gets bigger (hundreds of
>> square miles), the time taken to fetch all data related to the
>> features/points lying in the shape grows exponentially (up to 2 hours
>> for some shapes). Until now, we were restricting the maximum area a
>> shape can have but we have to get rid of that. Is there a way to
>> improve the performance in any way? Will SHPTREE work for this
>> purpose? The features are currently points only but we may move to
>> polygons in future. We use the .shp files for the shapes. Is it
>> advantageous to move to a database instead? If yes, what database
>> works best?
>>
>> What I did notice is that for any given request to mapserver, however
>> large the shape, the CPU utilization never crosses 12%. Can we improve
>> performance by increasing the RAM or maybe move to a solid state hard
>> drive? There is also the possibility of moving this application to
>> Cloud computing. Anything that will improve the performance actually.
>> Can someone point me in the right direction as to what might be the
>> current bottleneck?
>>
>> The current setup is on Windows and uses MS4W on an Apache server.
>>
>> Thanks,
>> Varun
>> ___
>> mapserver-users mailing list
>> mapserver-users@lists.osgeo.org
>> http://lists.osgeo.org/mailman/listinfo/mapserver-users
>>
>
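The alternating-order measurement Jukka describes can be sketched in Python as below. This is only an illustration: the function name benchmark_alternating and the idea of passing each variant as a callable are assumptions made for the sketch, and in practice each callable would issue the real request with the .qix file either present or moved aside.

```python
import statistics
import time
from typing import Callable, Dict, List


def benchmark_alternating(
    variants: Dict[str, Callable[[], object]], rounds: int = 4
) -> Dict[str, float]:
    """Time each variant once per round, rotating which variant runs first.

    Rotating the order keeps one variant from always running on a warm
    disk/memory cache. Returns the median wall-clock seconds per variant.
    """
    names = list(variants)
    if not names:
        return {}
    timings: Dict[str, List[float]] = {name: [] for name in names}
    for r in range(rounds):
        shift = r % len(names)
        order = names[shift:] + names[:shift]  # rotate the starting variant
        for name in order:
            start = time.perf_counter()
            variants[name]()
            timings[name].append(time.perf_counter() - start)
    return {name: statistics.median(vals) for name, vals in timings.items()}


if __name__ == "__main__":
    # Placeholder workloads standing in for "request with .qix" / "without .qix".
    result = benchmark_alternating(
        {"with_qix": lambda: time.sleep(0.01), "without_qix": lambda: time.sleep(0.02)}
    )
    for name, seconds in result.items():
        print(f"{name}: {seconds:.3f} s (median)")
```

Using the median rather than the mean keeps a single cold-cache run from dominating the comparison.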
___
mapserver-users mailing list
mapserver-users@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapserver-users


Re: VS: [mapserver-users] Mapserver search performance

2011-04-14 Thread Varun saraf
Hi,

I tried the shptree tool but did not see any performance improvement.
It fetched results about 4-5 seconds faster. Without QIX files, it was
5 minutes and with QIX files, it took about 4 minutes and 56 seconds.
All of my requests are based on the "MAPSHAPE" parameter in NQUERY
mode. I noticed that mapserver was taking about 1 second for
outputting the information in the template file for each feature.
Could it be because all of this information that I require is coming
from an external DBF file that I join to the layer/shape's DBF? Will
including all these fields/information in the shape's DBF file itself
help? Does MapServer pick up each feature and use a point-in-polygon
approach to check if it lies in the provided shape? If yes, how can I
make this process faster? Is there a way of making it multi-threaded
or doing more checks per second?
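
For readers unfamiliar with the term, a point-in-polygon check with a bounding-box prefilter can be sketched as follows. This is a generic textbook illustration (ray casting), not a description of MapServer's internals; all function names are made up for the sketch.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def bbox(ring: Sequence[Point]) -> Tuple[float, float, float, float]:
    """Return (minx, miny, maxx, maxy) of a polygon ring."""
    xs = [p[0] for p in ring]
    ys = [p[1] for p in ring]
    return min(xs), min(ys), max(xs), max(ys)


def point_in_ring(pt: Point, ring: Sequence[Point]) -> bool:
    """Ray casting: toggle 'inside' for every edge crossed by a ray going in +x."""
    x, y = pt
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal line through pt
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def points_in_shape(points: Sequence[Point], ring: Sequence[Point]) -> List[Point]:
    """Cheap rectangle test first, exact polygon test only for the survivors."""
    minx, miny, maxx, maxy = bbox(ring)
    hits = []
    for pt in points:
        if not (minx <= pt[0] <= maxx and miny <= pt[1] <= maxy):
            continue  # rejected by the bounding box alone
        if point_in_ring(pt, ring):
            hits.append(pt)
    return hits


if __name__ == "__main__":
    triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
    print(points_in_shape([(1.0, 1.0), (3.0, 3.0), (5.0, 5.0)], triangle))
```

The rectangle test is the part a spatial index can speed up; the exact test still has to run once for every candidate that passes it.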

I am providing sample code. Hope this helps. I have about 35 layers in
my map file. Should I merge all these into a single layer?

MAP FILE

LAYER
  NAME "L1"
  METADATA
    qstring_validation_pattern '.'
  END
  STATUS DEFAULT
  TYPE point
  DATA BLKS_01
  TOLERANCE 0
  TOLERANCEUNITS METERS
  CLASS
    STYLE
      OUTLINECOLOR 255 0 0
    END  # STYLE
  END  # CLASS

  # Attribute join against an external DBF, matched on Field1
  JOIN
    NAME "external"
    TABLE "data/externalData.dbf"
    FROM "Field1"
    TO "Field1"
    TYPE ONE-TO-ONE
  END  # JOIN

  TEMPLATE "blockTemplate.xml"
END  # LAYER

Template File
==
[GEOID],[Field1],[LON],[LAT],[external_ID],[external_NAME],[external_STA],[external_NEEDSCORE],[external_MINSCORE];

Request
===
http://localhost/cgi-bin/mapserv.exe?map=C:/ms4w/apps/GIS/centroides.map&mode=nquery&mapshape=-99.757833 32.474433 -99.758005 32.450679 -99.726591 32.447637 -99.727621 32.474723
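
Because the mapshape value contains spaces, it may be safer to build the request URL programmatically so the spaces are percent-encoded. A minimal Python sketch, assuming the same base URL, map path and coordinates as the request above (the helper name build_nquery_url is made up for illustration):

```python
from typing import Sequence, Tuple
from urllib.parse import quote, urlencode


def build_nquery_url(
    base_url: str, map_path: str, ring: Sequence[Tuple[float, float]]
) -> str:
    """Build an NQUERY request whose mapshape is a space-separated x y list."""
    coords = " ".join(f"{x} {y}" for x, y in ring)
    params = {"map": map_path, "mode": "nquery", "mapshape": coords}
    # quote_via=quote encodes spaces as %20; '/' and ':' stay readable in the path.
    return base_url + "?" + urlencode(params, quote_via=quote, safe="/:")


if __name__ == "__main__":
    ring = [
        (-99.757833, 32.474433),
        (-99.758005, 32.450679),
        (-99.726591, 32.447637),
        (-99.727621, 32.474723),
    ]
    print(
        build_nquery_url(
            "http://localhost/cgi-bin/mapserv.exe",
            "C:/ms4w/apps/GIS/centroides.map",
            ring,
        )
    )
```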

Any help in this regard is greatly appreciated. Let me know if you
need any more information.

Thanks,
Varun



Re: VS: [mapserver-users] Mapserver search performance

2011-05-03 Thread Varun saraf
Hi,

Thanks a lot Andreas. That JOIN was the culprit, as you rightly
suggested. Once I removed the join, performance improved
dramatically: the 5-minute CGI run now takes under 5 seconds, which is
simply amazing. I need help with one more thing. Is there a DBF editor out
there that can be used to import the fields from any external data
source into the shapefile attribute DBF without affecting the
structure? I looked at many, but they do not have the capability of
doing a JOIN based on a common field and pulling data into the
shapefile DBF automatically. My company cannot afford the ArcGIS
software.

Thanks,
Varun

On Fri, Apr 15, 2011 at 2:40 AM, Eichner, Andreas - SID-NLKM
 wrote:
>
> Hi,
>
> AFAIK dBase files don't provide an index themselves and there's no other
> way to provide one. shptree only creates a spatial index. Therefore
> only queries like 'does this geometry touch/intersect/lie within a given
> rectangle' can be accelerated.
>
>> I tried the shptree tool but did not see any performance improvement.
>
> So this becomes clear: by doing a JOIN, MapServer basically runs a loop:
> for each geometry that matches, search within the external data for a line
> matching the join condition.
>
>> Could it be because all of this information that I require is coming
>> from an external DBF file that I join to the layer/shape's DBF? Will
>> including all these fields/information in the shape's DBF file itself
>> help?
>
> I'm pretty sure that this would help, since this would avoid the
> (unaccelerated) join. With a database like PostgreSQL/PostGIS or SQLite
> it's basically the same problem: if you don't create an appropriate
> index for the join condition, this becomes a costly operation. In
> practice, though, those columns are usually primary and foreign key
> columns with an appropriate index, and the join condition is usually a
> simple equality match.
>
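
The cost difference Andreas describes, a scan of the external table per matched feature versus a lookup through an index on the join column, can be illustrated with a small Python sketch. It models only the idea; it is not MapServer's code, and the function names and the comparison counter are assumptions made for the example.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]


def nested_loop_join(features: List[Row], table: List[Row], key: str) -> Tuple[List[Row], int]:
    """Unindexed join: scan the external table for every feature."""
    joined, comparisons = [], 0
    for feat in features:
        match: Row = {}
        for row in table:
            comparisons += 1
            if row[key] == feat[key]:
                match = row
                break  # one-to-one: stop at the first match
        joined.append({**feat, **match})
    return joined, comparisons


def indexed_join(features: List[Row], table: List[Row], key: str) -> Tuple[List[Row], int]:
    """Indexed join: build a lookup on the join column once, then probe it."""
    index: Dict[object, Row] = {}
    for row in table:
        index.setdefault(row[key], row)  # keep the first row per key
    joined = [{**feat, **index.get(feat[key], {})} for feat in features]
    # Work done: one pass over the table plus one lookup per feature.
    return joined, len(table) + len(features)


if __name__ == "__main__":
    feats = [{"Field1": i} for i in range(1000)]
    ext = [{"Field1": i, "NAME": f"n{i}"} for i in range(1000)]
    _, slow = nested_loop_join(feats, ext, "Field1")
    _, fast = indexed_join(feats, ext, "Field1")
    print(f"nested loop: {slow} comparisons, indexed: {fast} steps")
```

With a million-row external table the nested loop grows with the product of the two sizes, which is consistent with the minutes-versus-seconds difference reported in this thread.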


Re: VS: [mapserver-users] Mapserver search performance

2011-05-03 Thread Mr. Puneet Kishor

On May 3, 2011, at 3:36 PM, Varun saraf wrote:

> Hi Puneet,
> 
> Thanks a lot for the prompt reply. I tried using Excel 2007 and was
> not able to re-save the dbf file after editing. Also, I am having DBF
> files with about a million records and Excel tends to hang for these
> operations.


Now you tell us ;-). My sense is that most GUI-based tools will choke on 
million+ rows. You might well want to "upgrade" to a Pg/PostGIS solution at 
some point, but I realize that is not what you are asking for... (also, I 
believe the most recent Excel versions might have lost the DBF translation 
capabilities -- I am not an Excel person, so I can't confirm... besides, I use 
Macs, and Excel is most likely hobbled on Macs anyway).


> Is there some tool other than Excel which can do these
> operations?


I remember using Perl and XBase.pm to do this. It was really very quick and 
trivial, but it was a long time ago. Use whichever language you prefer. Try R.




> 
> Thanks,
> Varun
> 
> On Tue, May 3, 2011 at 4:27 PM, Mr. Puneet Kishor  wrote:
>> 
>> On May 3, 2011, at 3:19 PM, Varun saraf wrote:
>> 
>>> ..Is there a DBF editor out
>>> there that can be used to import the fields from any external data
>>> source into the shapefile attribute DBF without affecting the
>>> structure? I looked for a lot but they do not have the capability of
>>> doing a JOIN based on a common field and pulling data into the
>>> shapefile DBF automatically.
>> 
>> Just use MS-Excel or any program that can open up and write DBF. As long as 
>> you are careful to not change the number of rows, just add additional 
>> columns, and make sure the column names are not changed, and follow the 
>> various DBF limitations, you should be ok. Make sure to keep a backup of the 
>> original DBF in case things go ka-pow!
>> 
>> Since the DBF data and the geometry are in separate files, there is no issue 
>> with adding more attributes, provided you take the care noted above.
>> ..
>> 
>> 
>> 
>> Puneet.
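
The constraints Puneet lists (same number of rows, same order, existing column names untouched) can be expressed as a small Python sketch that works on rows already loaded as dictionaries. Reading and writing the DBF itself is deliberately left out, since that needs a third-party library; merge_columns and its parameters are hypothetical names, and the 10-character check reflects the classic dBase field-name limit.

```python
from typing import Dict, List, Optional, Sequence

Row = Dict[str, Optional[object]]


def merge_columns(
    shape_rows: Sequence[Row],
    external_rows: Sequence[Row],
    key: str,
    new_fields: Sequence[str],
) -> List[Row]:
    """Append new_fields from external_rows to shape_rows, matched on key.

    Row count and row order of shape_rows are preserved; rows without a
    match get None for the new fields.
    """
    for name in new_fields:
        if len(name) > 10:
            raise ValueError(f"DBF field names are limited to 10 characters: {name!r}")
    lookup: Dict[object, Row] = {}
    for row in external_rows:
        lookup.setdefault(row[key], row)  # first row wins if keys repeat
    merged: List[Row] = []
    for row in shape_rows:
        out = dict(row)  # copy so existing columns stay untouched
        ext = lookup.get(row[key], {})
        for name in new_fields:
            if name in row:
                raise ValueError(f"column {name!r} already exists")
            out[name] = ext.get(name)
        merged.append(out)
    return merged


if __name__ == "__main__":
    shapes = [{"Field1": "A", "GEOID": 1}, {"Field1": "B", "GEOID": 2}]
    external = [{"Field1": "B", "NAME": "beta"}]
    for r in merge_columns(shapes, external, "Field1", ["NAME"]):
        print(r)
```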



Re: VS: [mapserver-users] Mapserver search performance

2011-05-03 Thread Varun saraf
Hi,

Thanks a lot for the quick help. I am a PHP/Java guy. I shall try my
luck with some sort of PHP scripting as I need a solution fairly
quickly. I had made the suggestion of shifting to a PostGIS system
quite some time back but I guess you know how it is with approvals :)

Thanks again,
Varun



Re: VS: [mapserver-users] Mapserver search performance

2011-05-03 Thread Mark Korver
Spatialite jumps into my mind.

http://www.gaia-gis.it/spatialite/



RE: VS: [mapserver-users] Mapserver search performance

2011-05-03 Thread Fawcett, David (MPCA)
I was thinking the same thing...

You could either just import the shapefiles or use the virtual shape 
functionality.

David.

-Original Message-
From: mapserver-users-boun...@lists.osgeo.org 
[mailto:mapserver-users-boun...@lists.osgeo.org] On Behalf Of Mark Korver
Sent: Tuesday, May 03, 2011 4:06 PM
To: Varun saraf
Cc: mapserver-users@lists.osgeo.org
Subject: Re: VS: [mapserver-users] Mapserver search performance

Spatialite jumps into my mind.

http://www.gaia-gis.it/spatialite/





Re: VS: [mapserver-users] Mapserver search performance

2011-05-04 Thread Mark Korver
I agree with Andreas,

if you just want to manipulate the shapefile and continue to use a
modified shapefile with MapServer, then SpatiaLite (or ogr2ogr) can help
you, but if you want query flexibility with MapServer then PostGIS is
the way to go.

On Wed, May 4, 2011 at 3:23 AM, Eichner, Andreas - SID-NLKM
 wrote:
>
> Hi. The first thing to note: editing a DBF with Excel & Co. seems to be
> a _really_ bad idea. Those who tried that got shapes wired to the wrong
> attribute lines. The DBF, SHP, SHX and QIX files have to be handled as a
> whole, or you will usually end up with a corrupted dataset. I would suggest
> "ogr2ogr" from the GDAL suite. It's fast, reliable, can do joins and is
> aware of the mentioned dependencies.
> Please note that such a fileset can only provide a spatial index via a
> QIX file. This is OK if you only want to filter by BBOX. If you want to
> filter by attribute, all lines of the DBF still need to be scanned. In
> such cases it's wise to use ogr2ogr to split the data into pre-filtered
> sets.
> Using a more sophisticated database like PostGIS or SpatiaLite can help
> you implement more complex scenarios. Since MapServer has no native
> driver for SpatiaLite it's probably not as fast as it is supposed to
> be. This mostly depends on OGR's implementation.
>
> Greetings
>
>> Is there a DBF editor out
>> there that can be used to import the fields from any external data
>> source into the shapefile attribute DBF without affecting the
>> structure? I looked for a lot but they do not have the capability of
>> doing a JOIN based on a common field and pulling data into the
>> shapefile DBF automatically.


Re: VS: [mapserver-users] Mapserver search performance

2011-05-04 Thread Varun saraf
Hi Andreas,

I have looked and looked but could not find how to use ogr2ogr to do
the operation I am interested in. Could you point me in the right
direction as to how ogr2ogr can be used to add new fields to an
existing DBF file from another DBF file? The external DBF file from
which I need to fetch data does not have a corresponding SHP file
associated with it. It's just a simple data file that has a common
field with the SHP file attributes.

Varun
