Re: [Talk-GB] Postcode data
Aidan, I've had a look at your list and would say it's way under: you only have 3 B27 codes, and I've completed addressing this whole postcode area. I've not got every one complete, but I'm sure there are far more than the 3 you're showing, including the one for my own house, which is missing! Similarly for B72, which I know is also complete with every address mapped. Regards Brian ___ Talk-GB mailing list Talk-GB@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-gb
Re: [Talk-GB] Postcode data
Brian, The list was of invalid postcodes. Hopefully everything you entered was a valid postcode (it seems that way given that only 3 invalid B27 postcodes appear in the list). Rob On 12 March 2013 22:37, Brian Prangle bpran...@gmail.com wrote: Aidan I've had a look at your list and would say it's way under - you only have 3 B27 codes and I've completed addressing this whole postcode area - I've not got every one complete but I'm sure there's far more than the 3 you're showing, including the one for my own house which is missing! Similarly for B72 which I know is also complete with every address mapped Regards Brian
Re: [Talk-GB] Postcode data
Rob, Probably, I'm a little busy at the moment so not really going to get round to doing it in the short term. Can probably rustle up a list mapping the ways and nodes to the incorrect postcode fairly quickly which would probably help? Aidan On 1 March 2013 17:35, Rob Nickerson rob.j.nicker...@gmail.com wrote: That's an interesting list for anyone who is concerned with data cleansing! Some of the results are because only the first part of a postcode has been entered, however even these have numerous formats (e.g. CV3, CV3 ???, CV3 ///). For the other errors, it tends to be typos (e.g. CO!6 7BJ, where ! is a probably a typo of 1 - Shift+1=!), but there are also road names, numbers, and web URLs in the postcode tag. Would it be possible to create a list of these where we could add the correct postcode in a new column and then upload the new data into OSM? Rob On 1 March 2013 17:24, Aidan McGinley aidmcgin+openstreet...@gmail.comwrote: * How accurate is the data already in OSM? Interesting question Rob, as of today there's approximately 200,000 ways or nodes tagged with postcodes in OSM, this is made up of about 29,000 unique postcodes. Those numbers are not 100% accurate as my bounding box for getting the data overlaps a bit with France and Ireland. I've removed the obvious French postcodes (5 digits) there might be a few I missed although I'm pretty sure the extras don't skew the numbers too much. I've compared the unique values from that list with the ONS dataset (excluding terminated postcodes) and come up with the list linked below [1] There's 1119 unique invalid postcodes, which of of course doesn't account for ways or nodes that are incorrectly tagged with a valid postcode but is a useful stat nonetheless. It should also be relatively easy to get those cleaned up I would think. Couple of notes about the data, there are a few postcodes that look like they are valid (e.g. 
BR3 1AZ, WC2H 9BD) but they have in fact got some invalid characters at the end that are not visible so that's why they are listed. It also includes postcodes in lowercase as well since it breaks from the convention of uppercase postcodes, you could argue that they should be in or out, but it was easier to leave them in. [1] https://docs.google.com/file/d/0B0viaV_xKHyCNmJDY1A1X092Zkk/edit?usp=sharing On 28 February 2013 23:44, Rob Nickerson rob.j.nicker...@gmail.comwrote: Interestingly out of the 95 you also identified 2 postcodes that are incorrect in OSM... raising the obvious questions: * How accurate is the data already in OSM? * Should imports be compared to 100% accuracy or a more realistic measure of OSM accuracy? Rob ___ Talk-GB mailing list Talk-GB@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-gb ___ Talk-GB mailing list Talk-GB@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-gb
Re: [Talk-GB] Postcode data
* How accurate is the data already in OSM? Interesting question Rob. As of today there are approximately 200,000 ways or nodes tagged with postcodes in OSM, made up of about 29,000 unique postcodes. Those numbers are not 100% accurate, as my bounding box for getting the data overlaps a bit with France and Ireland. I've removed the obvious French postcodes (5 digits); there might be a few I missed, although I'm pretty sure the extras don't skew the numbers too much. I've compared the unique values from that list with the ONS dataset (excluding terminated postcodes) and come up with the list linked below [1]. There are 1119 unique invalid postcodes, which of course doesn't account for ways or nodes that are incorrectly tagged with a valid postcode, but it is a useful stat nonetheless. It should also be relatively easy to get those cleaned up, I would think. A couple of notes about the data: there are a few postcodes that look like they are valid (e.g. BR3 1AZ, WC2H 9BD), but they have in fact got some invalid characters at the end that are not visible, so that's why they are listed. The list also includes postcodes in lowercase, since they break from the convention of uppercase postcodes; you could argue that they should be in or out, but it was easier to leave them in. [1] https://docs.google.com/file/d/0B0viaV_xKHyCNmJDY1A1X092Zkk/edit?usp=sharing On 28 February 2013 23:44, Rob Nickerson rob.j.nicker...@gmail.com wrote: Interestingly out of the 95 you also identified 2 postcodes that are incorrect in OSM... raising the obvious questions: * How accurate is the data already in OSM? * Should imports be compared to 100% accuracy or a more realistic measure of OSM accuracy? Rob
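The comparison described in the message above (unique OSM values checked against a reference list, with lowercase values and invisible trailing characters also flagged) could be sketched roughly as follows. This is only an illustration, not the script used in the thread: the function names are made up, and `valid_postcodes` stands in for whatever list of live postcodes is extracted from the ONS file.

```python
import unicodedata


def normalise(value):
    """Remove all whitespace and upper-case the value for lookup."""
    return "".join(value.split()).upper()


def has_hidden_characters(value):
    """True if the value holds control or format characters (Unicode Cc/Cf),
    which are usually invisible when the tag is displayed."""
    return any(unicodedata.category(ch) in {"Cc", "Cf"} for ch in value)


def find_invalid(osm_values, valid_postcodes):
    """Return the unique OSM values that would be listed as invalid.

    A value is listed if it is not in the reference set, if it is not
    upper case, or if it carries hidden characters.
    """
    valid = {normalise(p) for p in valid_postcodes}
    invalid = []
    for value in sorted(set(osm_values)):
        if (
            has_hidden_characters(value)
            or value != value.upper()
            or normalise(value) not in valid
        ):
            invalid.append(value)
    return invalid
```

Whether lowercase values belong in the list is a judgment call, as the message says; dropping the `value != value.upper()` test would leave them out.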
Re: [Talk-GB] Postcode data
That's an interesting list for anyone who is concerned with data cleansing! Some of the results are because only the first part of a postcode has been entered, however even these have numerous formats (e.g. CV3, CV3 ???, CV3 ///). For the other errors, it tends to be typos (e.g. CO!6 7BJ, where ! is a probably a typo of 1 - Shift+1=!), but there are also road names, numbers, and web URLs in the postcode tag. Would it be possible to create a list of these where we could add the correct postcode in a new column and then upload the new data into OSM? Rob On 1 March 2013 17:24, Aidan McGinley aidmcgin+openstreet...@gmail.comwrote: * How accurate is the data already in OSM? Interesting question Rob, as of today there's approximately 200,000 ways or nodes tagged with postcodes in OSM, this is made up of about 29,000 unique postcodes. Those numbers are not 100% accurate as my bounding box for getting the data overlaps a bit with France and Ireland. I've removed the obvious French postcodes (5 digits) there might be a few I missed although I'm pretty sure the extras don't skew the numbers too much. I've compared the unique values from that list with the ONS dataset (excluding terminated postcodes) and come up with the list linked below [1] There's 1119 unique invalid postcodes, which of of course doesn't account for ways or nodes that are incorrectly tagged with a valid postcode but is a useful stat nonetheless. It should also be relatively easy to get those cleaned up I would think. Couple of notes about the data, there are a few postcodes that look like they are valid (e.g. BR3 1AZ, WC2H 9BD) but they have in fact got some invalid characters at the end that are not visible so that's why they are listed. It also includes postcodes in lowercase as well since it breaks from the convention of uppercase postcodes, you could argue that they should be in or out, but it was easier to leave them in. 
[1] https://docs.google.com/file/d/0B0viaV_xKHyCNmJDY1A1X092Zkk/edit?usp=sharing On 28 February 2013 23:44, Rob Nickerson rob.j.nicker...@gmail.comwrote: Interestingly out of the 95 you also identified 2 postcodes that are incorrect in OSM... raising the obvious questions: * How accurate is the data already in OSM? * Should imports be compared to 100% accuracy or a more realistic measure of OSM accuracy? Rob ___ Talk-GB mailing list Talk-GB@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-gb ___ Talk-GB mailing list Talk-GB@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-gb
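The kinds of error Rob lists above (outward code only in several formats, Shift+digit typos, URLs and other text in the postcode tag) lend themselves to a rough automatic triage before anyone fills in corrections by hand. The sketch below is an assumption-laden illustration: the category names are invented, the Shift+digit table assumes a UK keyboard layout, and the regular expression only checks the general shape of a postcode, not whether it exists.

```python
import re

# Shift+digit symbols on a UK keyboard layout (assumed), mapped back to digits.
SHIFTED_DIGITS = str.maketrans('!"\u00a3$%^&*()', "1234567890")

# Rough shape checks only; a value matching FULL may still not be a real postcode.
FULL = re.compile(r"^[A-Z]{1,2}[0-9][0-9A-Z]? ?[0-9][A-Z]{2}$")
OUTWARD = re.compile(r"^[A-Z]{1,2}[0-9][0-9A-Z]?$")


def classify(value):
    """Assign a postcode tag value to a rough error category."""
    text = value.strip().upper()
    if FULL.match(text):
        return "well-formed"
    if text.startswith(("HTTP://", "HTTPS://", "WWW.")):
        return "url"
    repaired = text.translate(SHIFTED_DIGITS)
    if repaired != text and FULL.match(repaired):
        return "shifted-digit typo"
    tokens = text.split()
    first = tokens[0] if tokens else ""
    rest = text[len(first):]
    # Outward code followed by nothing, or only filler such as "???" or "///".
    if OUTWARD.match(first) and not any(ch.isalnum() for ch in rest):
        return "outward only"
    return "other"
```

Values in the "shifted-digit typo" group are the ones where a suggested correction could be pre-filled for review; everything in "other" would still need a person to look at it.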
Re: [Talk-GB] Postcode data
On 27 February 2013 09:03, Andy Allan gravityst...@gmail.com wrote: On 26 February 2013 22:08, Aidan McGinley aidmcgin+openstreet...@gmail.com wrote: is the actual output that would get loaded onto OSM. Please don't load this data into OpenStreetMap. It's not a good idea. 1) The source data appears to be heavily overprocessed. 2) The license is unclear 3) We don't want to import this stuff anyway +1 As I said before when this was first raised: I'm not sure I see much benefit to the import. It's presumably going to add relatively few postcodes [as a percentage of the total number of postcodes in the UK], so won't be that much use for anyone wanting to use OSM data for postcode look-ups. Indeed anyone wanting to do that could just as easily use the centroid data directly to map a postcode to a location, and then use that location to do whatever searching they want to do on OSM. There is obviously some advantage in that we'll have more buildings / amenities with properly assigned post-codes. But because of the relatively low benefit (unless I'm missing something) I would say that the community should see good evidence for an extremely low error rate on the import before agreeing that it would be a good thing to do. So what evidence is there for this low error rate? What is the result of my suggestion to look at buildings where there is an existing postcode in OSM and your method would have a postcode to assign to that building? What percentage of those buildings result in a match for the postcodes and what percentage result in disagreement? (Rather than just producing some comparison data and asking others to check it, I would say that the onus is on the importer to come up with the evidence that the import is going to be accurate.) In any case, as Andy says, I think a better use for this data would be an ITO-OSM-Locator-style view where local editors can compare the data with OSM and use it that way to help verify / improve OSM manually. Robert. 
-- Robert Whittaker ___ Talk-GB mailing list Talk-GB@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-gb
Re: [Talk-GB] Postcode data
I haven't had time to do the analysis yet, Robert. I've been focussing on fixing the issues I've already identified; once they are done I'll do something more detailed. But at a high level these are the numbers:

* 95 total buildings with an existing postcode tag (after removing ways mapped to more than one postcode)
* 77 of these match exactly the postcode tag identified by the script (ignoring whitespace)
* 12 of these buildings only have the first part of the postcode. All of these partially match what is output by the script; for example, way 5042255 is tagged in OSM as SW15 and the script identifies it as SW15B2U.
* 6 entries do not match. These are obviously the important ones that need investigation to see whether there are fundamental issues with the import or whether there is a valid reason for the mismatch, such as a typo or a postcode no longer in use. If it turns out there is an issue with these then obviously the import will have to be shelved.

On 28 February 2013 09:51, Robert Whittaker (OSM lists) robert.whittaker+...@gmail.com wrote: On 27 February 2013 09:03, Andy Allan gravityst...@gmail.com wrote: On 26 February 2013 22:08, Aidan McGinley aidmcgin+openstreet...@gmail.com wrote: is the actual output that would get loaded onto OSM. Please don't load this data into OpenStreetMap. It's not a good idea. 1) The source data appears to be heavily overprocessed. 2) The license is unclear 3) We don't want to import this stuff anyway +1 As I said before when this was first raised: I'm not sure I see much benefit to the import. It's presumably going to add relatively few postcodes [as a percentage of the total number of postcodes in the UK], so won't be that much use for anyone wanting to use OSM data for postcode look-ups. Indeed anyone wanting to do that could just as easily use the centroid data directly to map a postcode to a location, and then use that location to do whatever searching they want to do on OSM.
There is obviously some advantage in that we'll have more buildings / amenities with properly assigned post-codes. But because of the relatively low benefit (unless I'm missing something) I would say that the community should see good evidence for an extremely low error rate on the import before agreeing that it would be a good thing to do. So what evidence is there for this low error rate? What is the result of my suggestion to look at buildings where there is an existing postcode in OSM and your method would have a postcode to assign to that building? What percentage of those buildings result in a match for the postcodes and what percentage result in disagreement? (Rather than just producing some comparison data and asking others to check it, I would say that the onus is on the importer to come up with the evidence that the import is going to be accurate.) In any case, as Andy says, I think a better use for this data would be an ITO-OSM-Locator-style view where local editors can compare the data with OSM and use it that way to help verify / improve OSM manually. Robert. -- Robert Whittaker ___ Talk-GB mailing list Talk-GB@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-gb ___ Talk-GB mailing list Talk-GB@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-gb
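The three-way breakdown reported in the message above (exact match ignoring whitespace, partial match where OSM holds only the first part, and mismatch) can be reproduced with a small comparison helper. This is a sketch under assumptions: the function names are hypothetical, and "partial" is taken to mean that the OSM value equals the outward part of the script's postcode, which may be stricter than the check actually used.

```python
from collections import Counter


def compact(postcode):
    """Strip all whitespace and upper-case, so 'sw15 2bu' equals 'SW152BU'."""
    return "".join(postcode.split()).upper()


def compare(osm_value, script_value):
    """Classify an existing OSM postcode against the script's postcode."""
    osm, script = compact(osm_value), compact(script_value)
    if osm == script:
        return "exact"
    # The inward code is the last three characters, so the rest is the outward code.
    if len(osm) <= 4 and script[:-3] == osm:
        return "partial"
    return "mismatch"


def summarise(pairs):
    """Count outcomes over (osm_value, script_value) pairs."""
    return Counter(compare(osm, script) for osm, script in pairs)
```

Only the "mismatch" group needs manual investigation; dividing its count by the total gives the disagreement rate Robert asked about.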
Re: [Talk-GB] Postcode data
On 28/02/2013 12:27, Aidan McGinley wrote: 12 of these buildings only have the first part of the postcode. All of these partially match what is output by the script, for example way 5042255 is tagged in OSM as SW15 and the script identifies it as SW15B2U. Hopefully that's a typo for SW152BU? More importantly, that highlights an enhancement that I'd like to request: always make sure the two parts of the postcode are separated, even if they are not in the source data. I think this should be easy, as the second part is always the last three characters. -- Steve
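Steve's request above amounts to a one-line string operation, since the second (inward) part is always the final three characters. A minimal sketch, with a hypothetical function name and the assumption that anything shorter than five characters is a partial value to be left alone:

```python
def split_postcode(raw):
    """Return the postcode with a single space before the inward code.

    Assumes a complete postcode, whose inward code is its last three
    characters. Shorter values such as "SW15" are returned without a
    space, only upper-cased and stripped of whitespace.
    """
    compact = "".join(raw.split()).upper()
    if len(compact) < 5:  # too short to hold both parts
        return compact
    return f"{compact[:-3]} {compact[-3:]}"
```

For example, `split_postcode("SW152BU")` gives `"SW15 2BU"`. The function does no validation, so a malformed value is simply split at the same position.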
Re: [Talk-GB] Postcode data
Steve, yes you're right, that was a typo on my part :) Here is the full analysis of the problematic ways:

* 61130908: SW2 4RT in OSM vs SW2 4SG from ONS
* 112682060: SW1 2SE in OSM vs SW1V 2SE from ONS
* 139941192: SE11 5EN in OSM vs SE11 5EF from ONS
* 116957518: SW1V 1DX in OSM vs SW1P 1JQ from ONS
* 124038826: SW3 4UD in OSM vs SW3 4UJ from ONS
* 185247746: SW11 6QF in OSM vs SW11 6LD from ONS

61130908 can be accounted for, as the postcode SW2 4RT was retired in Dec 2011 and simply has not been updated on OSM, so it is not an issue.

112682060: the postcode SW1 2SE does not exist. It looks like a typo on OSM, so again it is not cause for concern.

139941192: this is an interesting one. Essentially it is the SOCA HQ building. The published address for SOCA (from their website) is a PO Box with the postcode as entered on OSM (5EN). I'm assuming the ultimate destination is this building, so if the destination is the basis for a correctly tagged postcode then it is correct on OSM. Equally, the 5EF postcode applies to Citadel Place and the building SOCA is in, so I would say this is an accurate identification by the script and the building could correctly be tagged with both.

116957518: judging by the Bing aerial image, this building is not mapped correctly and is slightly off from its correct location. That is a clear danger of doing the import and an issue I've always been conscious of. There is no obvious way to determine whether a building is correctly aligned without manual inspection.

124038826 and 185247746: I'm not 100% sure what the issue with the remaining two is, but at a guess I would say it is similar to the issue raised by Ed Loach already, namely residential property above a business having a separate postcode from the business operating beneath it. Both of these ways represent a business, but it looks like there are flats above them. If someone has any better ideas then I would love to hear them.

In summary, there are 3 false positives out of 95 in this sample data.
It is not going to be possible to remedy the cause of those false positives, and it's not clear how prevalent the two issues behind them are in OSM - namely misaligned buildings and multiuse properties. I'm going to assume that uncertainty is sufficient reason not to import so will cease work on preparing the import. Having said that I'm sure the data is useful so I'm interested in exploring any ideas the community might have. On 28 February 2013 16:14, Steve Doerr doerr.step...@gmail.com wrote: On 28/02/2013 12:27, Aidan McGinley wrote: 12 of these buildings only have the first part of the postcode. All of these partially match what is output by the script, for example way 5042255 is tagged in OSM as SW15 and the script identifies it as SW15B2U. Hopefully that's a typo for SW152BU? More importantly, that highlights an enhancement that I'd like to request: always make sure the two parts of the postcode are separated, even if they are not in the source data. I think this should be easy, as the second part is always the last three characters. -- Steve
Re: [Talk-GB] Postcode data
Interestingly out of the 95 you also identified 2 postcodes that are incorrect in OSM... raising the obvious questions:

* How accurate is the data already in OSM?
* Should imports be compared to 100% accuracy or a more realistic measure of OSM accuracy?

Rob
Re: [Talk-GB] Postcode data
On 26 February 2013 22:08, Aidan McGinley aidmcgin+openstreet...@gmail.com wrote: is the actual output that would get loaded onto OSM. Please don't load this data into OpenStreetMap. It's not a good idea. 1) The source data appears to be heavily overprocessed. Users should note that postcodes that straddle two geographic areas will be assigned to the area where the mean grid reference of all the addresses within the postcode falls. So while you're trying to map postcodes to a particular building in OSM, what's actually happening is that the real postcode locations are first being averaged to a centroid, then that postcode centroid is assigned to a given geography (e.g. a LSOA, or whichever geography you are using), and then you're taking the centroid of the geography (not the centroid of the postcodes) and finding a random building in OSM that overlaps that geography centroid, then adding the postcode to the building. So you're adding postcodes to whatever building just happens to be at the centroid of the geography, when all we know is that the centroid of the postcodes is somewhere within that geography. Having postcode data in OSM is useful, but this appears to be very haphazard. There's no guarantee that the given building is anywhere near the postcode centroid (the postcode centroid could be at the edge of a given geography) and it's no surprise that each geography could have multiple postcode centroids. There are other approaches. We have access to postcode centroids from elsewhere, if we were to pick just one building per postcode to assign a postcode to, it would be better to use the centroid of the postcodes, rather than the centroid of a geography that the centroid of the postcodes happens to fall within. 2) The license is unclear ONS Intellectual Property in the postcode products is supplied under Open Government Licence terms (see Related Links). Sure, OGL, great, but... 
The ONSPD is a Gridlink® branded product that pulls together data from members of the Gridlink® Consortium (Royal Mail, Ordnance Survey, National Records of Scotland, Land Property Services (Northern Ireland) and ONS). So the ONS might be happy to put their own IP (presumably the act of mapping postcode centroids to geographies) under OGL, but as it says above there's a bunch of other IP rights in the database, and the ONS makes no statement on the licensing of the data. 3) We don't want to import this stuff anyway Postcode centroids have been discussed many times before, and the position we've taken is that importing them does not help our mappers. It's derived data, not the kind of thing that we actually map. We use the centroids in various visualisations and QA tools, and we can expand them out to Voronoi polygons to help figure out what the real postcodes might be, but what we're aiming for is for buildings to be assigned the *correct*, actual postcodes. Until we get some real, full detail, all 28m buildings, data (e.g. the PAF) under a suitable license, then please don't import centroids or anything derived from them. Cheers, Andy
Re: [Talk-GB] Postcode data
Aidan, On 27 Feb 2013 09:04, Andy Allan gravityst...@gmail.com wrote: Please don't load this data into OpenStreetMap. It's not a good idea. 100% agree with Andy. To be acceptable your script would need to do at least as good a job as mappers could do by hand which I don't think is possible with only centroids being available. It's easy for a person to look at a postcode overlay and spot that a postcode just applies to one side of a street but I don't see how your script can do this with any degree of confidence. What I would like to see is a tool (similar to ITO's OS Locator reconciliation) to encourage people to add more postcodes. Kevin
Re: [Talk-GB] Postcode data
Thanks for the feedback Andy, I'll tackle each point below, @Kevin hopefully #3 should address your concerns about accuracy 1) The source data appears to be heavily overprocessed. This only applies to data other than the postcode centroids, such as the Census Output Areas and other categorizations The postcode centroid is accurate in the data, the OSM wiki[1] even says the data matches the Code Point centroids (not sure who did that analysis), so I'm not sure how much more accurate you would require it to be? 2) The license is unclear I think this has been visited several times before, but I based my decision to use this dataset by the information on the wiki[1] which states: Office of National Statistics Postcode centroids. The ONS have released postcode centroids under the standard Open Goverment License (OGL). These centroids match (to a few cm) those in the Code-Point Open data and can be used in OpenStreetMap If that is incorrect, then I'm happy to work with another data set that is compatible with OSM. 3) We don't want to import this stuff anyway I'm getting a contradictory message from you on this one, but I think we're on the same page. Let me explain how the data is filtered to *ensure* the postcode is mapped to a correct building which you say should be the aim. I am not simply using the centroid, I am combining it with the quality indicator in the ONS data. I've only included the highest quality data which is postcode centroids that fall within a building within the area of the postcode. The only time I've seen an issue with postcodes that fall in this category is when there are multiple postcodes that map to the same building, and I am filtering all of those cases out. Does that allay any concerns about the import? [1] http://wiki.openstreetmap.org/wiki/Ordnance_Survey_Opendata On 27 February 2013 09:46, Kevin Peat k...@k3v.eu wrote: Aidan, On 27 Feb 2013 09:04, Andy Allan gravityst...@gmail.com wrote: Please don't load this data into OpenStreetMap. 
It's not a good idea. 100% agree with Andy. To be acceptable your script would need to do at least as good a job as mappers could do by hand which I don't think is possible with only centroids being available. It's easy for a person to look at a postcode overlay and spot that a postcode just applies to one side of a street but I don't see how your script can do this with any degree of confidence. What I would like to see is a tool (similar to ITO's OS Locator reconciliation) to encourage people to add more postcodes. Kevin ___ Talk-GB mailing list Talk-GB@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-gb
Re: [Talk-GB] Postcode data
That's right Ed, if a building already has a postcode I won't be changing it. I'm actually outputting them separately and using them for some quality assurance. I'd be interested to know the two postcodes you are referring to just to check how they look in the source data. On 27 February 2013 12:47, Ed Loach e...@loach.me.uk wrote: Does that allay any concerns about the import? If buildings already have postcodes tagged you won't replace them? I am aware of at least one local instance where the postcode of the business which is mapped is not the same as those of the upstairs flats (accessed from the rear of the buildings) but the centroid of the flats falls in the building which is tagged as the business, and the centroid of the businesses postcode is elsewhere. Replacing the postcode would make it wrong. Ed ___ Talk-GB mailing list Talk-GB@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-gb
Re: [Talk-GB] Postcode data
Aidan, On 27 February 2013 11:12, Aidan McGinley ... I've only included the highest quality data which is postcode centroids that fall within a building within the area of the postcode... Does that allay any concerns about the import? Does the centroid always fall within the postcode area? I didn't realise that was the case if true, or is there an indicator in the dataset for that? I still think that these kinds of things are best structured so mappers can run them themselves against their own areas if they want to. In that way there is always someone to check the results and to clean-up any problems that do occur. Kevin ___ Talk-GB mailing list Talk-GB@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-gb
Re: [Talk-GB] Postcode data
Kevin, Yes, the centroid falls within the postcode area for the highest quality indicator by definition. To be marked in the highest quality data set, the centroid must fall inside a building within the postcode area (see the user guide that accompanies the data for more details). These high quality postcodes are the only one's I'm using, I'm throwing the rest away. I'm pretty sure it would be possible for the centroid to fall outside the postcode area for some of the lower quality indicators though, so that might be what you are thinking of. To your other point, I'll only be running this if I can be sure the data is accurate, if I have any indication that this will result in anything less than perfect accuracy I will not do it. This will involve a lot of manual checks by me, and hopefully other users when I get the final version running. For anyone willing to do some manual checking, I'm happy to generate data for their local area, and the script will also be available for people to run themselves once I have the current kinks ironed out. Aidan On 27 February 2013 14:53, Kevin Peat k...@k3v.eu wrote: Aidan, On 27 February 2013 11:12, Aidan McGinley ... I've only included the highest quality data which is postcode centroids that fall within a building within the area of the postcode... Does that allay any concerns about the import? Does the centroid always fall within the postcode area? I didn't realise that was the case if true, or is there an indicator in the dataset for that? I still think that these kinds of things are best structured so mappers can run them themselves against their own areas if they want to. In that way there is always someone to check the results and to clean-up any problems that do occur. Kevin ___ Talk-GB mailing list Talk-GB@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-gb
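The matching step discussed in this part of the thread, finding which OSM building outline a postcode centroid falls inside, is a point-in-polygon test. The sketch below uses the standard ray-casting method and is only an illustration: it is not taken from Aidan's script, the function names are hypothetical, and it assumes the centroid and the building vertices are already in the same planar coordinate system (for example eastings and northings).

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test: True if (x, y) lies inside the polygon.

    `ring` is a list of (x, y) vertices; the closing edge back to the
    first vertex is implied. Points exactly on an edge are not handled
    specially.
    """
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge cross the horizontal line through y?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def buildings_containing(centroid, buildings):
    """Return the ids of all building outlines that contain the centroid.

    `buildings` maps a way id to its list of vertices.
    """
    x, y = centroid
    return [way_id for way_id, ring in buildings.items()
            if point_in_polygon(x, y, ring)]
```

A centroid that matches no building, or more than one, would simply be skipped under the cautious approach described above.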
Re: [Talk-GB] Postcode data
Hi Andy, Aidan approached the talk-gb mailing list last month about this postcode idea [1]. A discussion was held, and on the back of this several changes were made (mainly related to reducing the size of the import to ensure that only the very best data was used, but also to ensure that QA was considered). Furthermore, based on the discussions, the idea has since been posted to the imports mailing list and a wiki page [2] has been created. As no more complaints were raised at the time, Aidan has continued to develop this idea and his script. I am supportive of his work as I am confident of the quality and the benefit it will bring to OSM. In regards to the licence, the data is available under the Open Government Licence and is therefore compatible with OpenStreetMap. I do however share your general concern about imports, but on this occasion I'm 100% satisfied. Kind regards, Rob [1] http://lists.openstreetmap.org/pipermail/talk-gb/2013-January/014358.html [2] http://wiki.openstreetmap.org/wiki/ONS_Postcode_Import
Re: [Talk-GB] Postcode data
Time for an update on this. I've done my first proper test run of this today using the latest ONS file. The test was run over an area of about 60 square miles in southwest London. There are about 3000 postcodes in the result set, which I'm pretty impressed with assuming they are all valid, so we should see a significant uplift in the number of mapped postcodes once I get the issues ironed out. I'm posting it up here to see if anyone else can spot things that need to be addressed.

The first file [1] contains all buildings in OSM that already have a postcode which the script picked up. This is good for QAing, and it looks fairly good from my first look over it, but I need to do some proper analysis. This file isn't valid OSM XML as it has 2 tags for the postcode, but it is useful for analysis anyway; I'll make it a bit better in the next iteration. The second [2] is the actual output that would get loaded onto OSM. I've noticed two issues myself (see below), and would appreciate any input from others as well.

Issue 1 - Some of the buildings are coming out with multiple postcodes, e.g. way #117697674 maps to SW147NX and SW147PQ. This appears to happen for large ways and I don't think there's anything that can be done other than to filter these out. That is very easy to do, and it looks like it'll remove about 10% from the result set.

Issue 2 - The second issue will require a bit more work. Some ways have international characters that are getting garbled at some point during the transformation, as the script isn't handling the encoding correctly. I'm currently looking into it; worst case scenario, I'll have to filter these out somehow. An example way is the one for Westminster Abbey - 23093437.

[1] http://paste.ubuntu.com/5568746/ - The first postcode tag on the way is the existing postcode in OSM, second is the one identified by the script
[2] http://paste.ubuntu.com/5568754/ - In the final version I'll be splitting these output files into sets containing 1000 ways each.
On 21 January 2013 16:13, Aidan McGinley aidmcgin+openstreet...@gmail.comwrote: @Brian - Yes I need to formulate how to QA this. I'd like to automate the QA as much as possible but having some elements done manually is obviously beneficial and the more people that can cast their eye over it the better. Any volunteers please do let me know, and also if anyone has any ideas for how to QA this do let me know. @Robert - That accuracy check would be very easy to do as part of the QA process, I'll add it to my list of To Do items. I'll be able to give you an indication of the number/percentage of postcodes potentially added after I do a run against the full postcode file, right now I just don't know as I've only been working with a very small subset. Bear in mind there is only 27,013 unique UK postcodes in OSM at present so any import is going to be significant in my eyes. For comparison the number of postcodes in the ONS data that matches the criteria I outline above is 1.7M, so even a tiny hit rate will result in a significant uplift to the data in OSM. On 18 January 2013 10:43, Matt Williams li...@milliams.com wrote: On 17 January 2013 23:01, Rob Nickerson rob.j.nicker...@gmail.com wrote: I would imagine that this would add a fair number of postcodes, and although those interested in address lookup can just use the centroid database without needing to go to OSM, this requires knowledge of the database (which non-UK developers might not have) and does not link postcodes back to address numbers and street names. Also recall that the Auto industry asked in 2012 how OSM intends to bridge the gap between us and commercial map providers. Something like this would be a good step in the right direction in my opinion. From what I have heard, this sounds like a very cautious import and I am happy to support it. It may even have lower error rates than some manual edits!! RobJN p.s. Matt, if you are reading this, do you still update your graph of number of postcodes added to OSM? 
Might be interesting to see it. Sure, the latest version (from the update a few days ago) is attached. The vertical axis represents my interpretation of how many delivery points we have with an address in the UK in OSM at the moment. This means I've expanded out interpolated ways and buildings with multiple addresses. The big straight section in the middle is from where I didn't update the tool for ages. -- Matt Williams http://milliams.com ___ Talk-GB mailing list Talk-GB@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-gb ___ Talk-GB mailing list Talk-GB@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-gb
Re: [Talk-GB] Postcode data
@Brian - Yes I need to formulate how to QA this. I'd like to automate the QA as much as possible but having some elements done manually is obviously beneficial and the more people that can cast their eye over it the better. Any volunteers please do let me know, and also if anyone has any ideas for how to QA this do let me know. @Robert - That accuracy check would be very easy to do as part of the QA process, I'll add it to my list of To Do items. I'll be able to give you an indication of the number/percentage of postcodes potentially added after I do a run against the full postcode file, right now I just don't know as I've only been working with a very small subset. Bear in mind there is only 27,013 unique UK postcodes in OSM at present so any import is going to be significant in my eyes. For comparison the number of postcodes in the ONS data that matches the criteria I outline above is 1.7M, so even a tiny hit rate will result in a significant uplift to the data in OSM. On 18 January 2013 10:43, Matt Williams li...@milliams.com wrote: On 17 January 2013 23:01, Rob Nickerson rob.j.nicker...@gmail.com wrote: I would imagine that this would add a fair number of postcodes, and although those interested in address lookup can just use the centroid database without needing to go to OSM, this requires knowledge of the database (which non-UK developers might not have) and does not link postcodes back to address numbers and street names. Also recall that the Auto industry asked in 2012 how OSM intends to bridge the gap between us and commercial map providers. Something like this would be a good step in the right direction in my opinion. From what I have heard, this sounds like a very cautious import and I am happy to support it. It may even have lower error rates than some manual edits!! RobJN p.s. Matt, if you are reading this, do you still update your graph of number of postcodes added to OSM? Might be interesting to see it. 
Sure, the latest version (from the update a few days ago) is attached. The vertical axis represents my interpretation of how many delivery points we have with an address in the UK in OSM at the moment. This means I've expanded out interpolated ways and buildings with multiple addresses. The big straight section in the middle is from where I didn't update the tool for ages. -- Matt Williams http://milliams.com ___ Talk-GB mailing list Talk-GB@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-gb ___ Talk-GB mailing list Talk-GB@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-gb
Re: [Talk-GB] Postcode data
On 15 January 2013 19:28, Rob Nickerson rob.j.nicker...@gmail.com wrote: Hi Matt, I'm getting results from other cities in the Land Registry tool. Should this apply the same +- 0.1 degrees logic? Would also be nice if you could add the edit this way in external links (like those on a way's page on OSM.org). If you search for a more precise postcode, do you still get matches from far away? Could you give me an example so I can see if I can track it down? I'll add 'edit this way' to my TODO list. -- Matt Williams http://milliams.com ___ Talk-GB mailing list Talk-GB@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-gb
Re: [Talk-GB] Postcode data
On 17 January 2013 23:01, Rob Nickerson rob.j.nicker...@gmail.com wrote: I would imagine that this would add a fair number of postcodes, and although those interested in address lookup can just use the centroid database without needing to go to OSM, this requires knowledge of the database (which non-UK developers might not have) and does not link postcodes back to address numbers and street names. Also recall that the Auto industry asked in 2012 how OSM intends to bridge the gap between us and commercial map providers. Something like this would be a good step in the right direction in my opinion. From what I have heard, this sounds like a very cautious import and I am happy to support it. It may even have lower error rates than some manual edits!! RobJN p.s. Matt, if you are reading this, do you still update your graph of number of postcodes added to OSM? Might be interesting to see it. Sure, the latest version (from the update a few days ago) is attached. The vertical axis represents my interpretation of how many delivery points we have with an address in the UK in OSM at the moment. This means I've expanded out interpolated ways and buildings with multiple addresses. The big straight section in the middle is from where I didn't update the tool for ages. -- Matt Williams http://milliams.com attachment: postcode_houses.png___ Talk-GB mailing list Talk-GB@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-gb
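A rough sketch of what "expanding out" interpolated ways and multi-address buildings could look like when counting delivery points. This is not Matt's tool: the function names are hypothetical, and it assumes addr:interpolation values of all/odd/even with numeric house numbers at both ends, and multiple addresses on a building written as a ";" or "," separated addr:housenumber value.

```python
import re


def interpolation_count(start, end, mode):
    """Number of house numbers implied by an interpolation way.

    start/end are the numeric house numbers on the two end nodes; mode is
    the addr:interpolation value. For "odd"/"even" both ends are assumed
    to already have the right parity.
    """
    lo, hi = sorted((start, end))
    if mode == "all":
        return hi - lo + 1
    if mode in ("odd", "even"):
        return (hi - lo) // 2 + 1
    raise ValueError("unsupported addr:interpolation value: " + mode)


def housenumber_count(value):
    """Count the addresses in one addr:housenumber value (assumed ; or , separated)."""
    parts = [p.strip() for p in re.split(r"[;,]", value)]
    return sum(1 for p in parts if p)


def delivery_points(housenumbers, interpolations):
    """Total of tagged house numbers plus the expanded interpolation ways."""
    tagged = sum(housenumber_count(v) for v in housenumbers)
    expanded = sum(interpolation_count(s, e, m) for s, e, m in interpolations)
    return tagged + expanded
```

For example, delivery_points(["7", "12;14;16"], [(1, 9, "odd")]) counts four tagged addresses plus five interpolated ones. End nodes that also appear in the tagged list would be counted twice, so a real tool would need to deduplicate them.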
Re: [Talk-GB] Postcode data
On 16 January 2013 13:04, Brian Prangle bpran...@gmail.com wrote: You might like to get a volunteer to check a pilot import that's limited within a manageable area - suggest a limited range of postcodes. Another useful check would be to apply your matching over the OSM database, and pull out all the potential polygons that are already tagged with a postcode. Then compare the existing tagging with the postcode you get from the external data. Look at the number / percentage of discrepancies, and for each one try to work out which source is correct. This will give you another indication of the accuracy of the proposed import. (Personally, I'm not sure I see much benefit to the import. It's presumably going to add relatively few postcodes, so won't be that much use for anyone wanting to use OSM data for postcode look-ups. Indeed anyone wanting to do that could just as easily use the centroid data directly to map a postcode to a location, and then use that location to do whatever searching they want to do on OSM. There is obviously some advantage in that we'll have more buildings / amenities with properly assigned postcodes. But because of the relatively low benefit (unless I'm missing something) I would say that the community should see good evidence for an extremely low error rate on the import before agreeing that it would be a good thing to do.) Robert. -- Robert Whittaker ___ Talk-GB mailing list Talk-GB@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-gb
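The comparison Robert describes could be sketched roughly as below. It assumes two hypothetical dictionaries keyed by OSM way id: one holding the addr:postcode already tagged in OSM, the other holding the postcode the centroid matching would assign. The function names and the postcodes in the usage note are made up for illustration.

```python
def normalise(postcode):
    """Upper-case and drop whitespace so 'cv4 8aa' and 'CV48AA' compare equal."""
    return "".join(postcode.split()).upper()


def compare(existing, proposed):
    """Compare postcodes on ways present in both inputs.

    existing: {way_id: postcode already tagged in OSM}
    proposed: {way_id: postcode suggested by the external data}
    Returns (ways checked, list of mismatches, mismatch percentage).
    """
    shared = sorted(set(existing) & set(proposed))
    mismatches = [
        (way_id, existing[way_id], proposed[way_id])
        for way_id in shared
        if normalise(existing[way_id]) != normalise(proposed[way_id])
    ]
    percent = 100.0 * len(mismatches) / len(shared) if shared else 0.0
    return len(shared), mismatches, percent
```

Calling compare({1: "cv4 8aa", 2: "B27 6AB"}, {1: "CV4 8AA", 2: "B27 6AC"}) reports two ways checked and one mismatch. Each mismatch still needs a manual look to decide which source is wrong; the percentage alone does not say that.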
Re: [Talk-GB] Postcode data
I would imagine that this would add a fair number of postcodes, and although those interested in address lookup can just use the centroid database without needing to go to OSM, this requires knowledge of the database (which non-UK developers might not have) and does not link postcodes back to address numbers and street names. Also recall that the Auto industry asked in 2012 how OSM intends to bridge the gap between us and commercial map providers. Something like this would be a good step in the right direction in my opinion. From what I have heard, this sounds like a very cautious import and I am happy to support it. It may even have lower error rates than some manual edits!! RobJN p.s. Matt, if you are reading this, do you still update your graph of number of postcodes added to OSM? Might be interesting to see it. ___ Talk-GB mailing list Talk-GB@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-gb
Re: [Talk-GB] Postcode data
Hi Aidan You might like to get a volunteer to check a pilot import that's limited within a manageable area - suggest a limited range of postcodes Regards Brian On 15 January 2013 11:32, Aidan McGinley aidmcgin+openstreet...@gmail.comwrote: @Rob yes I had seen that. It is a great tool, but as you say it's difficult to be absolutely sure that what you get back is accurate To summarise what I'll be looking at doing Filter the following from the ONS Postcode data: - Postcodes which have a date of termination set - Postcodes whose centroid is shared with other postcodes - Postcodes which have a quality indicator other than Within the building of the matched address closest to the postcode Then match these filtered centroids to ways from Openstreetmap that have the following criteria: - The way is closed - The postcode centroid is inside the way - The way does not already have a addr:postcode tag - The way is tagged building=* There are future enhancements that can be done around tackling ways tagged amenity=* as per Rovastars comments, and reporting on postcode accuracy but the above is enough for the initial version I think. If there are no objections to the above, I will start work on optimizing the script and get in touch with the import mailing list to discuss how to actually tackle the task of importing the data. On 14 January 2013 19:54, Rob Nickerson rob.j.nicker...@gmail.com wrote: Hi Aidan, Sounds like you have thought this through to ensure that this import will work well in practice. I would be more satisfied if imports were to closed ways with building=* only (as mentioned by others). Have you seen the Land Registry 'price paid' open data that includes addresses and postcodes. Matt had a go at creating a simple tool that matches their data to OSM address tags [1]. It's a great start, but be aware that the nearby houses only matches on street name and number so you can end up with addresses in different towns. 
Also it's quickest if you specify at least the first number after the space (e.g. CV4+8) Rob [1] http://milliams.dev.openstreetmap.org/postcodefinder/landregistry/ ___ Talk-GB mailing list Talk-GB@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-gb ___ Talk-GB mailing list Talk-GB@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-gb
Re: [Talk-GB] Postcode data
@Rob yes I had seen that. It is a great tool, but as you say it's difficult to be absolutely sure that what you get back is accurate. To summarise what I'll be looking at doing: Filter out the following from the ONS Postcode data: - Postcodes which have a date of termination set - Postcodes whose centroid is shared with other postcodes - Postcodes which have a quality indicator other than "Within the building of the matched address closest to the postcode". Then match these filtered centroids to ways from OpenStreetMap that meet the following criteria: - The way is closed - The postcode centroid is inside the way - The way does not already have an addr:postcode tag - The way is tagged building=*. There are future enhancements that can be done around tackling ways tagged amenity=* as per Rovastar's comments, and reporting on postcode accuracy, but the above is enough for the initial version I think. If there are no objections to the above, I will start work on optimizing the script and get in touch with the import mailing list to discuss how to actually tackle the task of importing the data. On 14 January 2013 19:54, Rob Nickerson rob.j.nicker...@gmail.com wrote: Hi Aidan, Sounds like you have thought this through to ensure that this import will work well in practice. I would be more satisfied if imports were to closed ways with building=* only (as mentioned by others). Have you seen the Land Registry 'price paid' open data that includes addresses and postcodes? Matt had a go at creating a simple tool that matches their data to OSM address tags [1]. It's a great start, but be aware that the nearby houses only matches on street name and number so you can end up with addresses in different towns. Also it's quickest if you specify at least the first number after the space (e.g. CV4+8) Rob [1] http://milliams.dev.openstreetmap.org/postcodefinder/landregistry/ ___ Talk-GB mailing list Talk-GB@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-gb
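The filter and match steps summarised above could be sketched as follows. This is not Aidan's script: the Postcode and Way classes, their field names and the ray-casting test are assumptions made for illustration, and a real run would read the ONS file and OSM data rather than in-memory lists. Ways enclosing more than one postcode are skipped, as suggested earlier in the thread.

```python
from dataclasses import dataclass, field
from typing import Optional

# Wording of the top quality indicator as quoted in the message above.
BEST_QUALITY = "Within the building of the matched address closest to the postcode"


@dataclass
class Postcode:  # hypothetical record for one row of the ONS data
    code: str
    lat: float
    lon: float
    terminated: Optional[str]  # date of termination, None if still live
    quality: str


@dataclass
class Way:  # hypothetical, simplified OSM way
    way_id: int
    nodes: list  # [(lat, lon), ...]; closed when first == last
    tags: dict = field(default_factory=dict)


def filter_postcodes(postcodes):
    """Keep live, top-quality postcodes whose centroid no other row shares."""
    counts = {}
    for p in postcodes:  # assumption: sharing is counted over all input rows
        counts[(p.lat, p.lon)] = counts.get((p.lat, p.lon), 0) + 1
    return [p for p in postcodes
            if p.terminated is None
            and p.quality == BEST_QUALITY
            and counts[(p.lat, p.lon)] == 1]


def is_closed(way):
    return len(way.nodes) >= 4 and way.nodes[0] == way.nodes[-1]


def contains(way, lat, lon):
    """Ray-casting point-in-polygon test on the way's ring."""
    inside = False
    for (lat1, lon1), (lat2, lon2) in zip(way.nodes, way.nodes[1:]):
        if (lat1 > lat) != (lat2 > lat):
            crossing = lon1 + (lat - lat1) * (lon2 - lon1) / (lat2 - lat1)
            if lon < crossing:
                inside = not inside
    return inside


def match(postcodes, ways):
    """Return {way_id: postcode} for ways meeting all four criteria."""
    hits = {}
    for way in ways:
        if (not is_closed(way) or "building" not in way.tags
                or "addr:postcode" in way.tags):
            continue
        found = [p.code for p in postcodes if contains(way, p.lat, p.lon)]
        if len(found) == 1:  # skip ways that enclose several postcodes
            hits[way.way_id] = found[0]
    return hits
```

The nested loop is quadratic, which is fine for a pilot area but would need a spatial index for the full 1.7M postcode file.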
Re: [Talk-GB] Postcode data
Hi Matt, I'm getting results from other cities in the Land Registry tool. Should this apply the same +- 0.1 degrees logic? Would also be nice if you could add the edit this way in external links (like those on a way's page on OSM.org). Cheers, Rob ___ Talk-GB mailing list Talk-GB@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-gb
Re: [Talk-GB] Postcode data
Hi Aidan, Sounds like you have thought this through to ensure that this import will work well in practice. I would be more satisfied if imports were to closed ways with building=* only (as mentioned by others). Have you seen the Land Registry 'price paid' open data that includes addresses and postcodes. Matt had a go at creating a simple tool that matches their data to OSM address tags [1]. It's a great start, but be aware that the nearby houses only matches on street name and number so you can end up with addresses in different towns. Also it's quickest if you specify at least the first number after the space (e.g. CV4+8) Rob [1] http://milliams.dev.openstreetmap.org/postcodefinder/landregistry/ ___ Talk-GB mailing list Talk-GB@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-gb
Re: [Talk-GB] Postcode data
Hi Aidan, If you were to do this then there are two things you should consider: 1. only tag closed ways where the tag is building=xx; AFAIK water, woods, gardens etc. don't have postcodes 2. how to treat buildings where there is already a postcode, and whether it is correct or incorrect Regards Brian On 13 January 2013 15:21, Aidan McGinley aidmcgin+openstreet...@gmail.com wrote: Been toying with some ideas for how to use the ONS Postcode data[1]. One idea that I have been exploring is to check if the value for the centre of the postcode is inside a closed way, and if so then tag that way with the appropriate addr:postcode. I mocked up a script to check this using the overpass API. Some sample output is in the attached link [2]. Essentially the output shows the postcode and the associated way or ways that enclose it if more than one. I've excluded ways tagged landuse=*. The script is pretty inefficient at the moment, and needs to be optimised, but before I do that I wanted to check with the wider community that this is a viable approach, and if so the best way to do the import. Worth noting that if the data were imported, then ways that map to multiple postcodes would need to be excluded, as discussed previously on the mailing list[3] [1] http://www.ons.gov.uk/ons/guide-method/geography/products/postcode-directories/-nspp-/index.html [2] http://paste.ubuntu.com/1527642/ [3] http://lists.openstreetmap.org/pipermail/talk-gb/2013-January/014336.html ___ Talk-GB mailing list Talk-GB@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-gb
Re: [Talk-GB] Postcode data
On 13/01/13 15:21, Aidan McGinley wrote: Been toying with some ideas for how to use the ONS Postcode data[1]. One idea that I have been exploring is to check if the value for the centre of the postcode is inside a closed way, and if so then tag that way with the appropriate addr:postcode. I mocked up a script to check this using the overpass API. Some sample output is in the attached link [2]. Essentially the output shows the postcode and the associated way or ways that enclose it if more than one. I've excluded ways tagged landuse=*. The script is pretty inefficient at the moment, and needs to be optimised, but before I do that I wanted to check with the wider community that this is a viable approach, and if so the best way to do the import. Worth noting that if the data were imported, then ways that map to multiple postcodes would need to be excluded, as discussed previously on the mailing list[3] [1] http://www.ons.gov.uk/ons/guide-method/geography/products/postcode-directories/-nspp-/index.html [2] http://paste.ubuntu.com/1527642/ [3] http://lists.openstreetmap.org/pipermail/talk-gb/2013-January/014336.html In case you haven't seen it, I produce and maintain a postcode layer based on ONS data that can be used in the editors. It helps determine the postcode, but always as a manual process. You can see info here [1] I doubt importing the data is practical and certainly would not be welcomed by many. The resolution of the postcode centroids leaves doubt as to the edge cases. The closest postcode centroid is sometimes related to the properties on a different road, easy to spot by eye. [1] http://onspd.raggedred.net/ -- Cheers, Chris user: chillly ___ Talk-GB mailing list Talk-GB@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-gb
Re: [Talk-GB] Postcode data
Thanks for the feedback. @Brian - The script is filtering to only include closed ways, although I wasn't sure whether to restrict it to just ones with building=xxx, as this would mean some of the amenity=XXX type tags would get missed. It's trivial to do though should this go ahead. Had not thought of the postcode being already set, would probably simply ignore those ways and not overwrite the data. @Chris, yes I've seen your tiles, and have been using them to verify some of the output. I agree many of the centroids are not accurate, however ONS includes in the data a field called Grid Reference positional Quality Indicator. The highest quality status indicator is Within the building of the matched address closest to the postcode. If an import was done, it would need to filter out all the postcodes which don't have that quality indicator set to that value which I'm already doing. That in combination with limiting to closed ways with building=* seems like it would result in an accurate import? On 13 January 2013 18:38, Chris Hill o...@raggedred.net wrote: On 13/01/13 15:21, Aidan McGinley wrote: Been toying with some ideas for how to use the ONS Postcode data[1]. One idea that I have been exploring is to check if the value for the centre of the postcode is inside a closed way, and if so then tag that way with the appropriate addr:postcode. I mocked up a script to check this using the overpass API. Some sample output is in the attached link [2]. Essentially the output shows the postcode and the associated way or ways that enclose it if more than one. I've excluded ways tagged landuse=*. The script is pretty inefficient at the moment, and needs to be optimised, but before I do that I wanted to check with the wider community that this is a viable approach, and if so the best way to do the import. 
Worth noting that if the data were imported, then ways that map to multiple postcodes would need to be excluded, as discussed previously on the mailing list[3] [1] http://www.ons.gov.uk/ons/guide-method/geography/products/postcode-directories/-nspp-/index.html [2] http://paste.ubuntu.com/1527642/ [3] http://lists.openstreetmap.org/pipermail/talk-gb/2013-January/014336.html In case you haven't seen it, I produce and maintain a postcode layer based on ONS data that can be used in the editors. It helps determine the postcode, but always as a manual process. You can see info here [1] I doubt importing the data is practical and certainly would not be welcomed by many. The resolution of the postcode centroids leaves doubt as to the edge cases. The closest postcode centroid is sometimes related to the properties on a different road, easy to spot by eye. [1] http://onspd.raggedred.net/ -- Cheers, Chris user: chillly ___ Talk-GB mailing list Talk-GB@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-gb
Re: [Talk-GB] Postcode data
Postcodes don't have to be assigned to buildings in OSM. In the last couple of weeks I have not added any postcodes to buildings, but I have added postcodes to amenities mapped as areas, including stadiums, schools, hospitals and police stations. None of these were buildings. Once you have a plan here, remember to run it by the import mailing list. -- View this message in context: http://gis.19327.n5.nabble.com/Postcode-data-tp5744277p5744382.html Sent from the Great Britain mailing list archive at Nabble.com. ___ Talk-GB mailing list Talk-GB@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-gb
Re: [Talk-GB] Postcode data to be free in 2010 - well not exactly.....
On 09/12/09 21:29, Peter Miller wrote: I'm sure this is old news to some of you, but... http://news.bbc.co.uk/1/hi/technology/8402327.stm Arhhh. but Correction - Poscodes will not be free http://giscussions.blogspot.com/2009/12/correction-poscodes-will-not-be-free.html I don't think either piece really tells us what is going on. The BBC story is essentially about the announcement made a couple of weeks ago about the OS data, which did indeed say it would include postcode boundary data. So RM have denied any plan to release the PAF, but then that isn't what was ever announced. The real problem with the government announcement was that it talked about the OS releasing postcode boundary data but, as far as I know, the OS doesn't have any such data other than, presumably, what it has by agreement with RM. I suspect that the real truth of the situation is that whoever wrote the government press release assumed that the OS had the data and could be made to release it, but that one of the things the consultation will establish is that they don't/can't. Tom -- Tom Hughes (t...@compton.nu) http://www.compton.nu/ ___ Talk-GB mailing list Talk-GB@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-gb