Hi,
> As long as there aren't too many shells, it's ok to use a linear search,
> isn't it?
Must we do this assertion ? I started a test with your dataset as a
single multipolygon.
Had to stop it after 1h30... :-(
> Did you confirm that OGR is actually reading and forming the
> geometries? I still find it hard to believe that it can read the geoms
> in only 4 s!
And yet, JTS can do too.
If I use the entire PolygonHandler procedure but replace the geometry by
an empty
geometry at the last minute (when Shapefile class feed the geometry
collection) to
avoid envelope calculation, the read takes about 5 s ! (instead of 16
for a full reading
with envelope calculation on my machine).
Seems consistent with what you get comparing jeql and ogr.
> It's interesting about the envelope computation. In JTS these are
> computed lazily, but in the ShapefileReader they're used right away to
> test for candidate shells for holes.
In your dataset (only simple polygon), your tip #3 completely avoid tests on
envelope. So there should be no overhead while reading. Envelope are only
computed when the FeatureDataset is built.
So reading is very fast, but at one point, envelope calculation is
necessary and
it takes more time than reading.
Now the problem is how to efficiently read multipolygons, with the overhead
of envelope calculation, envelope contain test, and PiP test.
Not sure that finding holes with only one shell candidate (tip #2) will
help much
as in complex cases (like your landuse dataset), most holes will have
several shell
candidates.
With your file saved as a single multipolygon, even without any test ( just
getEnvelopeIntenal() and getCoordinates let in the inner loop), the
computation has
no end (ok 250000 holes x 750000 shells > 180 billions of method calls
which is
something).
Maybe for such cases, indexing shells with a strtree would save time,
but I'm not
sure it would be very efficient because of the many large areas in the
dataset.
Michaël
>
> On 4/21/2012 10:42 AM, Michaël Michaud wrote:
>> Hi,
>>
>> Oh yes, the short circuit will not improve performance much in our case
>> because most often, the equalsExact will also be performed.
>> (I don't think equalsExact should be removed as it could break
>> algorithms relying on it)
>>
>> Maybe the good structure would have been to map shells to hole lists
>> with a IdentityHashMap ?
>>
>> I did some profiling to try to understand the difference between our
>> reader and OGR's
>> and found that most of the time in PolygonHandler was spent in Envelope
>> computation
>>
>> I think envelope are required to display data efficiently,
>> but I could not find where exactly they were computed
>> (tried to deactivate envelope computation everywhere in FeatureDataset
>> without performance gain).
>>
>> I also found that we missed a test for the single-ring case (where ccw
>> computation
>> can be avoided), but there is not much difference at the end.
>>
>> Michaël
>>> Oh - except that in the cases where the geometries are *not* identical,
>>> then a slower equals test is performed. So still better to just test
>>> reference equality, I think.
>>>
>>> On 4/21/2012 8:05 AM, Martin Davis wrote:
>>>> That does seem like a good idea. I'll look at adding that.
>>>>
>>>> On 4/21/2012 12:48 AM, Michaël Michaud wrote:
>>>>> Implementing #3 I was wondering why not implementing the following
>>>>> short circuit in Geometry equals method
>>>>> public boolean equals(Object o) {
>>>>> if (this == o) return true;
>>>>> etc.
>>>>> }
>>>>> Can you see any drawback ?
>>>>> Seems that this way, many algorithms could automatically benefit your
>>>>> #3 trick.
>>>>>
>>>>>
>>>> ------------------------------------------------------------------------------
>>>> For Developers, A Lot Can Happen In A Second.
>>>> Boundary is the first to Know...and Tell You.
>>>> Monitor Your Applications in Ultra-Fine Resolution. Try it FREE!
>>>> http://p.sf.net/sfu/Boundary-d2dvs2
>>>> _______________________________________________
>>>> Jump-pilot-devel mailing list
>>>> [email protected]
>>>> https://lists.sourceforge.net/lists/listinfo/jump-pilot-devel
>>>>
>>>>
>>>> -----
>>>> No virus found in this message.
>>>> Checked by AVG - www.avg.com
>>>> Version: 2012.0.1913 / Virus Database: 2411/4950 - Release Date: 04/21/12
>>>>
>>>>
>>>>
>>> ------------------------------------------------------------------------------
>>> For Developers, A Lot Can Happen In A Second.
>>> Boundary is the first to Know...and Tell You.
>>> Monitor Your Applications in Ultra-Fine Resolution. Try it FREE!
>>> http://p.sf.net/sfu/Boundary-d2dvs2
>>> _______________________________________________
>>> Jump-pilot-devel mailing list
>>> [email protected]
>>> https://lists.sourceforge.net/lists/listinfo/jump-pilot-devel
>>>
>>>
>> ------------------------------------------------------------------------------
>> For Developers, A Lot Can Happen In A Second.
>> Boundary is the first to Know...and Tell You.
>> Monitor Your Applications in Ultra-Fine Resolution. Try it FREE!
>> http://p.sf.net/sfu/Boundary-d2dvs2
>> _______________________________________________
>> Jump-pilot-devel mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/jump-pilot-devel
>>
>>
>> -----
>> No virus found in this message.
>> Checked by AVG - www.avg.com
>> Version: 2012.0.1913 / Virus Database: 2411/4950 - Release Date: 04/21/12
>>
>>
>>
> ------------------------------------------------------------------------------
> For Developers, A Lot Can Happen In A Second.
> Boundary is the first to Know...and Tell You.
> Monitor Your Applications in Ultra-Fine Resolution. Try it FREE!
> http://p.sf.net/sfu/Boundary-d2dvs2
> _______________________________________________
> Jump-pilot-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/jump-pilot-devel
>
>
------------------------------------------------------------------------------
For Developers, A Lot Can Happen In A Second.
Boundary is the first to Know...and Tell You.
Monitor Your Applications in Ultra-Fine Resolution. Try it FREE!
http://p.sf.net/sfu/Boundary-d2dvs2
_______________________________________________
Jump-pilot-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/jump-pilot-devel