Kevin,

Nice work on the proposal. Scraping geo-data from PDF is a feat. Kudos!
I've read through your proposal and, having worked on a statewide protected
area import, I support this one. protection_object=water is a novel one. Are
there other uses of this tag, e.g. protection_object=trees for state forest?

Cadastral data is always tough to manage in OSM, especially in backwoodsy
areas where parcels are only one or two steps removed from the original
subdivision by the King of England back in the 1600s or earlier. Much of
Maryland's state forest boundaries are what I'd call a "best guess" in GIS.
On the ground, they are usually marked, especially near populated areas, so
they are observable. With this, you'd be open to locals finding the
boundary to be incorrect and then updating, correct?

Best,

Elliott

On Sun, May 22, 2016 at 11:36 PM Kevin Kenny <kken...@nycap.rr.com> wrote:

> One-line summary: I want to import the boundaries of New York City
> watershed recreation areas.
>
> Side note: This project ties in closely with Paul Norman's
> identification of a need to clean up the NYS DEC Lands import.
> Many of the NYC DEP watershed lands share borders with the DEC lands,
> and performing this import together with or after the DEC Lands
> cleanup would yield a topology that is more nearly consistent. (Some
> property lines simply are mapped inconsistently in the real world as
> well as the digital world, so there will unavoidably be misalignment
> of some parcels after the import.)
>
> I welcome comments about any aspects of this proposal. I'm still new
> to this game.
>
> ------------------------------------------------------------------------
>
> PROPOSED IMPORT:
>      New York City
>      Department of Environmental Protection
>      Bureau of Water Supply
>      Open Recreation Areas and Use Designations
>      http://www.nyc.gov/html/dep/pdf/recreation/open_rec_areas.pdf
>
> 1. OVERVIEW
>
> New York City owns, and makes accessible for public recreational use
> (activities such as hiking, fishing, hunting and trapping) about four
> hundred parcels of land in the Catskill and Croton watersheds. All of
> these lands are outside the boundaries of the city itself. The vast
> majority of these parcels do not yet appear in OpenStreetMap.  This
> proposal is made to solicit community buy-in for the project of
> importing multipolygons giving the boundaries of these reserves.
>
> I expect that this import should be relatively non-controversial. The
> data arise from an authoritative source - the agency that manages the
> lands in question. They are readily obtainable in no other
> way. Cadastre of public parks, nature reserves, and the like has been
> imported many times before.
>
> The import is of relatively small scale, comprising fewer than 400
> multipolygons and associated tags. The total area of the parcels in
> question is roughly 145 square miles (375 km**2).
>
> 2. LICENSING
>
> I believe that the data are, by law, in the public domain under New
> York City's open data access policy. The OSM community has relied on
> this policy in the past, most notably in the import of the New York
> City address and building footprint data. The relevant paragraph is
> in the Administrative Code of the City of New York, Chapter 5,
> paragraph 23-502, subparagraph d. The text may be found at
> http://www1.nyc.gov/assets/doitt/downloads/pdf/nyc_open_data_tsm.pdf,
> page 27. The data in question do appear on the single web portal
> described in subparagraph a.
>
> 3. TECHNICAL DETAILS
>
> The data in question consist of the PDF file
> http://www.nyc.gov/html/dep/pdf/recreation/open_rec_areas.pdf, and the
> PDF maps to which it links. I've successfully made a script to scrape
> the tabular data from the PDF, resulting in a set of 367 distinct
> unit names, together with the 'paa', 'hike', 'fish', 'hunt', 'trap'
> and 'dua' columns, and the URLs of the corresponding maps.
>
> These maps are all in PDF format. They are fully georeferenced, and
> I've been able to work out GDAL scripts to extract the boundaries and
> produce well-formed polygons from all but four of them. These four are
> the "Day Use Areas" or "Designated Use Areas" (the web site fairly
> consistently uses the former phrasing, the posters on the land use the
> latter) of Devasego Park, the Ashokan fountains, and the Kensico and
> Cross River dams. These are popular areas for walking and picnicking,
> but are more of the nature of city parks than of nature reserves.
> On the initial import I propose simply to ignore these four, leaving
> 363 recreation areas to import.
>
> The proposed tagging is as follows:
>      leisure=nature_reserve
>          For the benefit of legacy renderers that do not yet comprehend
>          the details of boundary=protected_area
>      boundary=protected_area
>      protect_class=12
>      protection_object=water
>          Tailor-made for this data set!
>      operator='New York City, Department of Environmental Protection,
>                Bureau of Water Supply'
>      website=http://www.nyc.gov/html/dep/html/recreation/index.shtml
>      name=(obtained from the 'unit' column of the list of sites, with
>              the word 'Unit' appended)
>      access=yes (if the 'PAA' column is 'Y') or access=license (if the
>              PAA column is 'N')
>      access:license=http://www.nyc.gov/html/dep/html/watershed_protection/recreation.shtml
>              (if access=license)
>      access:hiking=(value of the 'hike' column, normalized to 'yes' or
>              'no')
>      access:fishing=(value of the 'fish' column, normalized to 'yes' or
>              'no')
>      access:hunting=(value of the 'hunt' column, normalized to 'yes' or
>              'no')
>      access:trapping=(value of the 'trap' column, normalized to 'yes'
>              or 'no')
>      nycdep:version=YYYYMMDDHHMMSS
>          UTC time returned as Date-Modified from the web site. See
>          below for rationale of retaining this information.
>
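
The tag list above can be sketched as a small mapping function. This is only an illustration, not the script from the proposal: it assumes each scraped table row is a dict keyed by the lowercase column names ('unit', 'paa', 'hike', 'fish', 'hunt', 'trap'), that cells hold 'Y' or 'N', and that the modification time arrives as an HTTP date string. It also assumes the fishing and hunting values are meant to come from the 'fish' and 'hunt' columns. The names tags_for, yes_no and version_stamp are made up for the example.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime

OPERATOR = ("New York City, Department of Environmental Protection, "
            "Bureau of Water Supply")
WEBSITE = "http://www.nyc.gov/html/dep/html/recreation/index.shtml"
PERMIT_URL = ("http://www.nyc.gov/html/dep/html/"
              "watershed_protection/recreation.shtml")


def yes_no(value):
    """Normalize a scraped 'Y'/'N' cell to 'yes' or 'no' (assumed cell format)."""
    return "yes" if value.strip().upper() == "Y" else "no"


def version_stamp(http_date):
    """Convert an HTTP date string (e.g. '... GMT') to YYYYMMDDHHMMSS in UTC."""
    moment = parsedate_to_datetime(http_date).astimezone(timezone.utc)
    return moment.strftime("%Y%m%d%H%M%S")


def tags_for(row, http_date):
    """Build the proposed OSM tags for one row of the scraped table."""
    tags = {
        "leisure": "nature_reserve",
        "boundary": "protected_area",
        "protect_class": "12",
        "protection_object": "water",
        "operator": OPERATOR,
        "website": WEBSITE,
        "name": row["unit"] + " Unit",
        "access:hiking": yes_no(row["hike"]),
        "access:fishing": yes_no(row["fish"]),    # assumed column
        "access:hunting": yes_no(row["hunt"]),    # assumed column
        "access:trapping": yes_no(row["trap"]),
        "nycdep:version": version_stamp(http_date),
    }
    if yes_no(row["paa"]) == "yes":
        tags["access"] = "yes"
    else:
        # PAA=N parcels require the free access permit.
        tags["access"] = "license"
        tags["access:license"] = PERMIT_URL
    return tags
```

Keeping the mapping in one function would make it easy to swap in a different 'access' scheme if the list settles on one.
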
> I'm more than open to a different tagging scheme for 'access'. The
> relevant restrictions are:
>
> PAA=Y areas are open to all comers, no permission needed, for the
> activities specified. PAA=N areas require a free access permit
> obtainable at the web site
> http://www.nyc.gov/html/dep/html/watershed_protection/recreation.shtml
>
> HIKE, FISH, HUNT, and TRAP describe the permitted activities (HIKE
> encompasses related activities such as photography, bird watching,
> etc.)
>
> The areas in which HIKE=N are all areas adjoining the
> reservoirs. Hiking with no other purpose is forbidden in these areas,
> as is the trapping of game. Hunters, fishermen and boaters accessing
> these areas must have valid licenses for these activities, and boats
> must be tagged by NYCDEP. Since all of the HIKE=N areas are also
> PAA=N, lawful users will have applied for an access permit and been
> presented with the restrictions, so I don't propose to model this
> complexity in the tagging, unless someone suggests a more obvious
> tagging scheme than I've been able to invent.
>
> CONFLATION AND UPDATE PLAN
>
> The initial conflation should be quite straightforward - simply query
> a PostGIS mirror for area features that overlap the supplied
> multipolygons by more than a trivial amount. (The cadastral data from
> the different agencies are not 100% consistent, so I expect that a few
> per cent of some parcels will overlap adjacent state forests, and
> intend to import these data as is. Rectifying misdrawn property lines
> is not our problem!) I propose simply to import the parcels into JOSM,
> resolve any JOSM-reported errors and warnings, and upload. I will
> likely work either by county or by township, depending on the number
> of parcels in a county, to keep each upload to a manageable size.
>
> Further updates in semi-automatic fashion should also be fairly
> straightforward. I propose to maintain a record of what has been
> uploaded, and when changes appear, check whether the OSM data for a
> parcel have changed from the previous upload. For unchanged parcels,
> the old can be replaced with the new without stepping on any mapper's
> manual work. For new parcels, the upload can proceed. For changed
> parcels, the change has to be alerted for manual review. I expect that
> this last situation will be vanishingly rare. Of course, if the new
> upload results in a conflict (e.g., a substantial overlap with an area
> feature already in the database), the change will have to be flagged
> for manual review.
>
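
The update logic described above could be sketched roughly as follows. This is a hypothetical illustration only: it assumes the upload record is kept as a mapping from a parcel identifier to a fingerprint of what was uploaded, and that the same fingerprint can be computed from the parcel as it currently stands in OSM. The function names and the fingerprint format are inventions for the example, and the overlap check against existing area features is left out.

```python
import hashlib
import json


def fingerprint(coords, tags):
    """Hash a parcel's geometry and tags into a comparable string.

    coords is a list of [lon, lat] pairs and tags a dict; both are
    assumptions about how the parcels are held in memory.
    """
    payload = json.dumps({"coords": coords, "tags": tags}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def classify(parcel_id, uploaded, in_osm):
    """Decide how to treat one parcel from a fresh download.

    uploaded: parcel id -> fingerprint recorded at the previous upload
    in_osm:   parcel id -> fingerprint of the parcel as it is in OSM now
    """
    if parcel_id not in uploaded:
        return "upload"    # new parcel: the upload can proceed
    if in_osm.get(parcel_id) == uploaded[parcel_id]:
        return "replace"   # untouched since our upload: safe to replace
    return "review"        # edited or deleted by a mapper: manual review
```

A parcel that a mapper has deleted falls into "review" here as well, since its fingerprint no longer matches the record.
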
> FURTHER NOTES
>
> I'd much rather work from the bureau's own shapefiles, of course, but
> I've not yet managed to locate an appropriate contact to request
> them. Filing a demand under the Freedom of Information Law is often
> regarded as a hostile act, and I'd rather stay on good terms with the
> officials involved, so I prefer to proceed by less formal means. The
> 'web scraping' outlined above at least works, although I expect that
> it will be a brittle process in the long run. I'll keep casting about
> for a more robust way to handle this data set.
>
> NEXT STEPS
>
> Of course, I'll make source code of all scripts available for review
> and so that I can pass the baton for others to carry out the
> semiautomated update process if needed.
>
> If this proposal doesn't get roundly shot down, the next steps will be
> to create a project page on the wiki, link it to Import/Catalogue,
> clean up and publish the scripts, perform the import onto the test
> server, get a data review, and then update the Contributors page and
> do the import for real.
>
> Comments?
>
> --
> 73 de ke9tv/2, Kevin
>
>
> _______________________________________________
> Talk-us mailing list
> talk...@openstreetmap.org
> https://lists.openstreetmap.org/listinfo/talk-us
>
-- 
Elliott Plack
http://elliottplack.me
_______________________________________________
Imports mailing list
Imports@openstreetmap.org
https://lists.openstreetmap.org/listinfo/imports
