I would beg to differ. One only needs to look at most of the data
contained in some spatial data sets to see why we need to worry about
topology. For instance, polygons should not overlap each other, and gaps
should not exist between polygons. Although these are not
"advanced topological" issues, they are critical for presenting data
that makes sense to users. The FAO GAUL project has invested
significant effort in the creation of a topologically intact dataset,
specifically to allow spatial analyses to take place. Routing
applications in some countries, such as the one that I am in at the
moment, are increasingly being considered. This implies that points
(health facilities) must have some type of topological relationship
with another layer (such as roads) for routing and optimization
analyses to take place. It would not make sense to have DHIS messing
with data that someone has spent a lot of time establishing topology
in, unless it is required for presentation purposes. Again, my point
is this is perfectly valid, as long as we are not using DHIS as a
spatial repository, but rather as a presentation mechanism for
thematic maps.

The answer to your question of course is going to depend on how the
data (which is imported into DHIS) is processed. If we allow the
application to start truncating digits, we run the risk of perturbing
the answer. I recall performing an analysis some years ago related to
refugee camps. Refugee camps are often located
very close to borders. Due to the imprecise topology (and precision)
of the dataset that we were using, there were differences between the
number of refugee camps located in a particular country when
determined through a data field, as compared to when they were
determined spatially. My point is that topology matters a lot, and any
simplification that is performed by DHIS should not be done without
the users actually knowing what is happening. My recommendation, then,
is that we (meaning the DHIS2 community) should provide a clear workflow
for preparing data that is suitable for DHIS2, keeping all of
these points and other considerations (such as browser payload) in
mind.
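
To make the border problem concrete, here is a minimal sketch in Python.
It is not anything DHIS does today. It uses a plain ray-casting
point-in-polygon test, and the coordinates, the helper names
(point_in_polygon, truncate) and the choice of 3 decimals are all made up
for illustration. The only point is that a camp sitting a few tens of
meters inside a border can end up "outside" once the border is truncated.

```python
import math


def point_in_polygon(x, y, ring):
    """Ray-casting test: True if (x, y) lies inside the ring of (x, y) vertices."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate where this edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def truncate(value, decimals):
    """Drop digits beyond `decimals` without rounding."""
    factor = 10 ** decimals
    return math.trunc(value * factor) / factor


# Hypothetical country boundary (lon, lat) whose eastern border is at 31.00045.
country = [(30.0, 0.0), (31.00045, 0.0), (31.00045, 1.0), (30.0, 1.0)]

# Hypothetical camp just inside the eastern border.
camp = (31.0004, 0.5)

# The same boundary after truncating every coordinate to 3 decimals.
coarse = [(truncate(x, 3), truncate(y, 3)) for x, y in country]

print(point_in_polygon(camp[0], camp[1], country))  # True
print(point_in_polygon(camp[0], camp[1], coarse))   # False
```

The camp itself never moved. Only the boundary was truncated, which is
exactly the kind of silent change users would need to know about.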

However, the initial subject of this email really has to do with
GDAL/ogr2ogr appending additional decimals to coordinates that do not
exist in the original file. After having chatted with Bob, this seems
to be a potential issue with ogr2ogr itself. I totally agree that in
most cases the removal of unnecessary decimal places is not going to
be an issue for the vast majority of the data that we are dealing
with. However, I also feel that users should not be compelled to
perform these manipulations of the data if they choose not to. We
should instead document a way to produce the data that DHIS2 requires
using a set of open source tools, and then
allow people to choose other methods if this suits their needs, which
I think is possible but needs further investigation.
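
As one possible shape for such an optional step, here is a rough Python
sketch that rounds the coordinates of a GeoJSON geometry. The function
names (round_coords, reduce_precision) and the default of 6 decimals are
my own assumptions, not part of DHIS2 or ogr2ogr. The sample coordinates
are the ones from Knut's GeoJSON vs GML comparison quoted further down.

```python
import json


def round_coords(coords, decimals):
    """Recursively round a GeoJSON coordinate array to `decimals` places."""
    if isinstance(coords, (int, float)):
        return round(coords, decimals)
    return [round_coords(c, decimals) for c in coords]


def reduce_precision(geometry, decimals=6):
    """Return a copy of a GeoJSON geometry with rounded coordinates.

    The input geometry is left untouched, so the full-precision version
    can still be kept wherever the user wants to keep it.
    """
    rounded = round_coords(geometry["coordinates"], decimals)
    return {**geometry, "coordinates": rounded}


# Coordinates taken from the GML output quoted later in this thread.
point = {"type": "Point", "coordinates": [38.415411724082148, 1.750212388592194]}

slim = reduce_precision(point)
print(json.dumps(slim))
print(len(json.dumps(point)), "->", len(json.dumps(slim)), "characters")
```

Because it works on a copy, a step like this could be run by those who
want a smaller payload and skipped by those who do not.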

> I am not aware of very advanced topological relationships when GIS is used 
> for Public Health. I have seen much more advanced implementation in GIS 
> solutions for telecommunications or utility.
> In the context of the DHIS, the main spatial relationship I can think of is
> to answer the question: how many health facilities are contained within this
> district?
> The generalization which is currently performed simplifies the data a lot and
> I don't think it will be possible to preserve the topology using the same 
> parameters. If you use a better generalizing algorithm which will keep 
> relationships among spatial objects, the size of the geoJSON will be much 
> bigger. At this point, I don't think it is relevant to talk about topological 
> relationships or scale when the original layers are generalized with the 
> currently specified tolerance.
> Johan
> Hi Johan and Bob,
> Johan, you are indeed correct that the generalization process may
> remove the "cartographic intricacies", but this is very likely because
> the generalization is performed either on geographical data where
> there are no topological relationships between objects, or the
> generalization process does not respect the topology when it is
> performed.
> It would be possible to generalize a given set of polygons without
> affecting their intrinsic topological relationships, but much more
> care needs to be exercised when the generalization is performed. This
> generalization could take place by removing unnecessary points
> (simplification) and/or by reducing the precision of the data.
> Ultimately the point in doing this is to decrease the "bulk" of the
> data that is presented to the client. I can imagine that a data set of
> 100 points with 15 decimals would behave more or less the same as a
> dataset of 1000 points with 6 decimals (just guessing here). My point
> is that there is a certain payload associated with each dataset.
> Typically, server side processing in the form of processing the GIS
> layer to an image would be employed. However, since we are using
> vector data on the client side, the data should be preprocessed in
> order to preserve these cartographic details that are important, as
> automated simplification routines normally do not handle this. The
> result is that the payload of the layer is decreased to a
> point that is "acceptable" to users. I am sceptical about whether this
> step will be possible to automate at all for reasonably complex
> polygon layers (i.e. districts) that DHIS typically deals with.
> I want to come back to the use of DHIS as a repository. At this
> point, IMHO, DHIS does not seem appropriate as a health facility
> repository. There is no way to adjust the metadata of a given
> organizational unit object easily. I suppose we could use things like
> orgunit groups to provide some type of metadata, but for instance, we
> may want each orgunit to have a property such as "Address", "Fax" or
> "Elevation". Additionally, the proposed clipping of precision further
> complicates matters in this regard. Ultimately, we want a quick,
> responsive map for users as the first priority, and we should set our
> sights on this.
> In summary, I think that the current approach that we have, namely a
> recommended workflow for how to preprocess a given set of data, should
> not be supplanted by the system itself truncating precision of
> coordinates. There are many different generalization algorithms, each
> with their pros and cons. Additionally, the generalization is highly
> dependent on the scale of the map, and ultimately the pixel size of
> the users screen, implying that different datasets may need to be
> generalized in different ways depending on their scale. The gory details
> of how this is done by Geoserver (using Geotools) are here:
> http://docs.geoserver.org/stable/en/user/tutorials/feature-pregeneralized/feature-pregeneralized_tutorial.html
> We certainly do not need to recreate GeoTools or Geoserver, as they
> are very good already at what they do. I would say that we
> should consider leveraging these tools instead, and letting them
> decide how to generalize or not generalize features, depending on the
> scale of the map that is requested by the users. I guess I am
> expressing some fundamental gripe, that we should not baby users too
> much. If people want to have 15 decimals, well let them. They may have
> reasons for this. It obviously does not make much sense, any more than
> using 50,000 points to represent a simple polygon that could be
> represented with four vertices. In both cases, the GIS guys need to do
> their work and understand what type of data is required by the client.
> Providing clear recommendations for a workflow, coupled with
> guidelines on what a "reasonable" payload to the browser would be,
> e.g. 30kb versus 30MB for a given layer, would be the best way to go I
> think.
>> The number of decimals is not really the issue. If you use 6 decimals, it is 
>> already enough for the type of GIS application we are interested in. The use 
>> of 15 decimals will not change the precision of your map much and it is not
>> really necessary.
>> 0 decimal places = approx. 111 km (69 miles) (precision depending on the
>> latitude)
>> 3 decimal places = approx. 111 m (365 feet)
>> 6 decimal places = < 0.3 m (< 1 foot)
>> The maps used by the system are not accurate enough to need more than
>> 6 decimal places anyway, because they are not very large scale maps (1:1 000
>> or 1:500). They are medium scale maps (1:50 000 or 1:100 000) or small scale
>> maps.
>> The issue is more the cartographic generalization and the fact that it is 
>> not preserving all intricate geographical or other cartographic details. It 
>> is necessary to run the generalization process in order to use the GeoJSON 
>> format, but it removes a lot of data and simplifies it as well. As a 
>> significant amount of data is lost in the process, the output files are not 
>> relevant regarding purpose and scale and the simplified GeoJSON files can't 
>> really be used in a GIS.
>> Johan
>> On Mon, Jul 26, 2010 at 9:38 AM, Bob Jolliffe <bobjolli...@gmail.com> wrote:
>>> Hi Jason
>>> On 26 July 2010 04:49, Jason Pickering <jason.p.picker...@gmail.com> wrote:
>>>> Hi Knut,
>>>>> It may be that we want to use DHIS as both a repository with full
>>>>> precision (though not ridiculously artificial ones like 15 decimal
>>>>> lat/lon) and have a faster way of rendering. But for a repo, I think
>>>>> something like PostGIS is in order. Or we could just store things as
>>>>> GML...
>>>> Well, this is really the issue. If DHIS is going to be a repository,
>>>> any self-respecting GIS geek would not use it if the application
>>>> clipped precision. Although a few meters is not significant in terms
>>>> of rendering a map, it may cause havoc on certain datasets,
>>>> particularly if there are topological relationships between different
>>>> layers. If a facility is related topologically to a road network, and
>>>> the point is shifted a few meters, this may result in disturbance of
>>>> the topology between these layers, rendering DHIS useless as a
>>>> repository. ogr2ogr is perfectly OK as long as we are not dealing with
>>>> these types of layers, but as soon as we start to think about
>>>> relationships to other layers, we need to be very careful about how
>>>> the data is preprocessed.
>>> Would you suggest then that the best place to clip precision would be
>>> when the data is retrieved from the database for the specific view/map
>>> rendering, rather than prior to it being stored?
>>> This would render the current convenience of storing as a geojson
>>> string redundant as we would need to process the string on checkout
>>> anyway.
>>> Can anyone say what the precision is on the shapefiles prior to
>>> ogr2ogr conversion, i.e. are we introducing a new level of precision
>>> here or is that 15 digit precision the precision of the source
>>> shapefiles?
>> Quoting myself:
>> "Here is a comparison of what I get in GeoJSON vs GML (converting from the
>> same shapefile):
>> GeoJSON: 38.415412, 1.750212
>> GML:        38.415411724082148,1.750212388592194"
>> Both using ogr2ogr. So 6 vs 15 decimals.
>> Knut
>>> Bob
>>>>> We should be very careful not to push the new, very simple
>>>>> solution too far; for more complex functionality we should rather
>>>>> employ Geoserver and PostGIS - and I still think this is the best
>>>>> solution for a national repository. Our new way of storing orgunit
>>>>> boundaries is a very small subset of such a full blown GIS solution,
>>>>> but has the advantage of being simple, lightweight and portable.
>>>> Agreed on both points, namely that the solution is lightweight and
>>>> aimed at thematic mapping but other solutions would be more
>>>> appropriate for use as a repository of GIS data.
>>>> Regards,
>>>> Jason
