On 2013-10-23 10:28, Marc Gemis wrote:
You could also make a csv file with the diffs and open that with the
OpenData plugin in JOSM. (see my presentation at ESI on import VMM
monitoring stations )
But of course that requires people to install this plugin.
The great thing about having to dig deep into the data itself is that
it will be fairly easy to structure it the way JOSM does. Since you
already have to deal with different formats, why spend time on an extra
one? All you need to do is save the result set locally and start
looking at the format. It would be a huge feature. I'm not against
CSVs, but their use is limited; XML (not my favorite!) does give
structure to the data and also makes it more human-readable, next to
being an established standard.
or you could add all changes in such a way that they are also added to
the tool that Ben proposes.
I would just not invent a new output format to work with locally...
Although using sqlite as local storage (in whatever tool) would also
help a lot speed-wise.
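To make the sqlite idea concrete, here is a minimal sketch of what local storage for a working set could look like. The table layout, column names, and the sample node are my own assumptions for illustration, not the schema of any existing tool:

```python
import sqlite3

# Hypothetical local cache for a working set of OSM nodes.
conn = sqlite3.connect(":memory:")  # use a file path for a persistent cache
conn.execute("""
    CREATE TABLE nodes (
        id      INTEGER PRIMARY KEY,  -- OSM node id
        lat     REAL,
        lon     REAL,
        version INTEGER               -- needed later to build a changeset
    )
""")
conn.execute("""
    CREATE TABLE tags (
        node_id INTEGER REFERENCES nodes(id),
        k       TEXT,
        v       TEXT
    )
""")

# Insert one made-up node with its tags.
conn.execute("INSERT INTO nodes VALUES (?, ?, ?, ?)", (1234, 50.85, 4.35, 7))
conn.executemany("INSERT INTO tags VALUES (?, ?, ?)",
                 [(1234, "name", "Colruyt"), (1234, "shop", "supermarket")])

# Indexed key/value lookups are where the speed gain comes from.
conn.execute("CREATE INDEX tags_kv ON tags (k, v)")
rows = conn.execute(
    "SELECT n.id FROM nodes n JOIN tags t ON t.node_id = n.id "
    "WHERE t.k = 'name' AND t.v = 'Colruyt'").fetchall()
print(rows)  # -> [(1234,)]
```

Querying a local indexed database like this is much faster than re-parsing a big XML file for every selection.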
When using subsets of data instead of 'everything from a bounding box',
the plus side of working with a limited dataset is that manual editing
and selecting things in bulk are a lot easier in a tool like JOSM; that
way JOSM becomes your tool to manually process and verify the data. A
well-known tool... So you don't need to reinvent the wheel (validation,
for example). Just glue JOSM right in.
As an example, when Colruyt is taken over by Delhaize and you want to
change the operator on all stores with name 'Colruyt', my steps would be:
- Export using Overpass, in the query decide if you want to do nodes or
ways (or both). -> .osm file only containing the nodes I'm interested in
(and metadata)
- Validate right off the bat to get an idea of the current state
- Open in JOSM, search/replace using all features in JOSM (regex, case-
insensitive, exact and non-exact key search)
- Do this again for all typos in common keys (and delete the crap)
- Validate (use errors to focus on certain keys)
- search for bad keys, stuff that doesn't belong on those nodes (always
something lingering around)
- Check addr:* keys, all of them.
- Use Address plugins to complete missing information
- Validate
- Search for notes, comments and read them (they might give clues about
problems you missed). Update them as you fix.
- Validate
- Prepare for merge problems (someone might have touched one of the
nodes/ways)
- Now fix remaining validation errors until the data is 100% clean (or
the remaining errors are false positives)
- Validate
- Upload
- Merge (if needed)
- Upload
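The bulk retag step in that workflow (changing the operator on every store named 'Colruyt') could even be scripted against the exported .osm file before opening it in JOSM. A rough sketch using Python's stdlib XML parser; the inline sample data and the new operator value are placeholders for the example, and in practice you would parse the real Overpass export instead:

```python
import xml.etree.ElementTree as ET

# Tiny stand-in for an Overpass .osm export; in practice use
# ET.parse("export.osm") on the real file.
OSM = """<osm version="0.6">
  <node id="1" lat="50.8" lon="4.3" version="3">
    <tag k="name" v="Colruyt"/>
    <tag k="operator" v="Colruyt"/>
  </node>
  <node id="2" lat="51.0" lon="4.4" version="1">
    <tag k="name" v="Delhaize"/>
  </node>
</osm>"""

root = ET.fromstring(OSM)
changed = 0
for node in root.iter("node"):
    tags = {t.get("k"): t for t in node.findall("tag")}
    name = tags.get("name")
    if name is not None and name.get("v") == "Colruyt":
        if "operator" in tags:
            tags["operator"].set("v", "Delhaize")  # hypothetical new operator
        else:
            ET.SubElement(node, "tag", k="operator", v="Delhaize")
        node.set("action", "modify")  # how JOSM marks edited objects
        changed += 1

print(changed)  # -> 1
```

Saving the modified tree back to a .osm file gives you something JOSM will open, so you can still verify and validate by hand before uploading.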
Any tool that comes our way that supports the OSM format can be injected
into any of those phases. That's awesome. I would be using a tool like
that, since it can take input from any source that you are able to
bring to common ground, that being the OSM format (XML). I would use
all tools that are suited and add value in a way that meets a certain
problem.
If the tools that parse CRAB would use plugins to read from a(ny)
datasource... then you might be creating something that lasts. I
would love to try it out.
Glenn
m
On Wed, Oct 23, 2013 at 10:13 AM, Glenn Plas <gl...@byte-consult.be
<mailto:gl...@byte-consult.be>> wrote:
On 2013-10-22 20:53, Kurt Roeckx wrote:
On Mon, Oct 21, 2013 at 10:45:22PM +0200, Kurt Roeckx wrote:
On Mon, Oct 21, 2013 at 10:06:03PM +0200, Kurt Roeckx wrote:
I really see no good reason not to add those IDs at
this point.
I don't see the harm in them. I can only see them
being useful.
I would actually want to propose a different import strategy:
- Add the CRAB IDs to all existing addresses in Flanders
- Import the rest or large parts of CRAB in one big import
So after feedback on this, I want to propose that instead of
actually importing this, we provide the data that this import
tool would generate in such a way that it's easy for people to
take the data and import it themselves, potentially after fixing
things.
This would make it easier to improve the import tool after getting
feedback of what it generates wrong.
If you could export to OSM format, that would be awesome, like the
way Overpass does it.
In pseudo:
- get data from OSM (assuming here the data is partial, so let's
say everything with an 'addr' tag in your field of view), the
same effect you have when exporting a certain key using Overpass.
- get data from CRAB, craft it as such (preparse it) to facilitate
merging with the OSM data set.
- make the diff, but create an OSM-compliant XML (with metadata,
otherwise you won't be able to create a changeset from it)
- open the changeset with JOSM, verify, correct, validate and push.
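The diff step of that pseudo-flow might look like this in code. The dict-based representation, keyed on (street, housenumber), and the sample records are assumptions for illustration; a real tool would first parse both sources into the same structure:

```python
# Both sources normalised to a common ground: a dict keyed on the
# address itself (street, housenumber) -> tags. Records are made up.
osm_addresses = {
    ("Kerkstraat", "1"): {"addr:city": "Gent"},
    ("Kerkstraat", "3"): {"addr:city": "Gent"},
}
crab_addresses = {
    ("Kerkstraat", "1"): {"addr:city": "Gent"},
    ("Kerkstraat", "5"): {"addr:city": "Gent"},
}

# Addresses CRAB has but OSM lacks would become new objects in the
# generated .osm file; the ones only in OSM may need manual review.
missing_in_osm = sorted(set(crab_addresses) - set(osm_addresses))
only_in_osm = sorted(set(osm_addresses) - set(crab_addresses))

print(missing_in_osm)  # -> [('Kerkstraat', '5')]
print(only_in_osm)     # -> [('Kerkstraat', '3')]
```

The important part is that both sides are brought to the same keying first; once that holds, the diff itself is trivial set arithmetic.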
So, truthfully, I think a tool like you envision is still
interesting, and the more it does, the less manual JOSM work
remains. But we need to do chunks of it; we should do this for
small areas. It's also easier to (later on) fix things that
went wrong yet unnoticed; that way you don't have to deal with
huge changesets, hunting for that single node on page 450 (ever
tried paging through changesets using the site? ;-) ). Even a
perfect full import in one go would give us headaches later. It
keeps things manageable.
I think it's great you want to do this; I'm just not too positive
about the success. It's not that I doubt your skills, it's that
I doubt we'll be able to cover all the exceptions that you usually
run into in a decent timeframe. The problem is not so much the
bulk of perfectly tagged stuff, but the items that need special
treatment. It could turn out to be a bigger job than anticipated
right now.
Glenn
_______________________________________________
Talk-be mailing list
Talk-be@openstreetmap.org <mailto:Talk-be@openstreetmap.org>
https://lists.openstreetmap.org/listinfo/talk-be