On 25/5/21 4:41 pm, Daniel O'Connor wrote:
> I'd make a polite argument there is still value in at least the suburb,
> possibly postcode being still provided. When exporting data via
> overpass as CSV; it's not currently easy or obvious to appropriately
> bring in the parent attributes; even if it is for a Real Human looking
> at the map.
> There's a fair number of use cases for "data in a spreadsheet
> friendly format" I feel.
You don't need to add addr:suburb to get that, all you need is a little
Assuming you have a CSV dump of the address points from OSM, e.g.:
node,34495141,-35.2641690,149.1223146,,3,Sargood Street
node,40293773,-35.2640376,149.1226107,,9,Sargood Street
node,254020381,-35.2623407,149.1451050,1,5,Edgar Street
node,291548764,-35.3847749,149.0720245,,56,Mannheim Street
node,318854867,-35.3339561,149.1697838,,289,Canberra Avenue
node,318855426,-35.3244730,149.1792480,4,59-61,Wollongong Street
node,318856277,-35.3150098,149.1417359,,19,Jardine Street
node,318859652,-35.3627241,149.0815960,,70,Hodgson Crescent
node,318859688,-35.3627835,149.0817144,,70,Hodgson Crescent
and you've the corresponding admin_level 10 and post code boundaries in
then you import the libraries you need:
import pandas as pd
import geopandas as gpd
read in the address points:
addlist = pd.read_csv('act_address_dump.csv', low_memory=False)
convert the list to a geoframe:
address_points =
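The assignment above is cut off in the archived message. As a rough sketch of what such a conversion could look like, and not necessarily what was originally written, the following builds a GeoDataFrame from two of the sample rows. The column names (`type`, `id`, `lat`, `lon`, `unit`, `number`, `street`) are assumptions, since the dump is shown without a header row.

```python
import io

import geopandas as gpd
import pandas as pd

# Two rows copied from the sample dump; the column names are
# hypothetical because the dump is shown without a header row.
sample = io.StringIO(
    "node,34495141,-35.2641690,149.1223146,,3,Sargood Street\n"
    "node,254020381,-35.2623407,149.1451050,1,5,Edgar Street\n"
)
cols = ["type", "id", "lat", "lon", "unit", "number", "street"]
addlist = pd.read_csv(sample, names=cols)

# points_from_xy takes x (longitude) first, then y (latitude).
address_points = gpd.GeoDataFrame(
    addlist,
    geometry=gpd.points_from_xy(addlist["lon"], addlist["lat"]),
    crs="EPSG:4326",  # WGS 84, the coordinate system OSM uses
)
```

Setting the CRS here is optional for the join itself, but it lets geopandas warn if the boundary files turn out to be in a different coordinate system.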
read in the suburb boundaries:
suburbs = gpd.read_file('act_suburbs.geojson')
drop all of the tags that we will not need:
suburbs = suburbs[['name','geometry']]
then do the same for the post code boundaries:
postcodes = gpd.read_file('postcodes.geojson')
postcodes = postcodes[['postal_code','geometry']]
now we merge the three data sets together with a series of spatial
joins. First the suburb names:
address_points = gpd.sjoin(address_points, suburbs, predicate="within")
the join creates a column we don't need so get rid of that:
address_points = address_points.drop(['index_right'], axis=1)
then join the post codes:
address_points = gpd.sjoin(address_points, postcodes, predicate="within")
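One thing worth knowing about these joins: `sjoin` defaults to an inner join, so any address point that falls outside every boundary is silently dropped. The toy example below illustrates this with made-up geometry (a unit square standing in for a suburb); passing `how="left"` keeps the unmatched points with an empty suburb instead. It uses the `predicate` keyword from newer geopandas releases, which replaced the older `op` keyword.

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Made-up geometry: one unit-square "suburb" and two points,
# only one of which falls inside the square.
suburbs = gpd.GeoDataFrame(
    {"name": ["Exampleton"]}, geometry=[box(0, 0, 1, 1)], crs="EPSG:4326"
)
points = gpd.GeoDataFrame(
    {"number": ["3", "9"]},
    geometry=[Point(0.5, 0.5), Point(5, 5)],
    crs="EPSG:4326",
)

# Default inner join: the point outside the square disappears.
inner = gpd.sjoin(points, suburbs, predicate="within")

# Left join: every point is kept; unmatched ones get a missing name.
left = gpd.sjoin(points, suburbs, how="left", predicate="within")
```

For the postcode join in particular, a left join may be the safer choice wherever postcode boundaries are incomplete.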
we've now got all of the data into the one frame but we need to clean up
the column labels before we write it out, so do a rename:
address_points =
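The rename above is also cut off in the archive. A minimal sketch of a rename along these lines, on a stand-in frame, might look like the following. The target labels `suburb` and `postcode` are guesses; only `name` and `postal_code` come from the boundary columns kept earlier.

```python
import pandas as pd

# Hypothetical stand-in for the joined frame. "name" and "postal_code"
# are the columns kept from the boundary files; "street" is assumed.
joined = pd.DataFrame(
    {"street": ["National Circuit"], "name": ["Barton"], "postal_code": ["2600"]}
)

# rename returns a new frame; columns not listed in the mapping are untouched.
renamed = joined.rename(columns={"name": "suburb", "postal_code": "postcode"})
```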
and we can then write out the columns we want to a csv file:
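The write-out command itself did not survive in the archive. One way to write a chosen subset of columns is the `columns` argument of `to_csv`, sketched below on a one-row stand-in frame. The column names are assumptions, and the example writes to an in-memory buffer rather than a file so it can be run anywhere.

```python
import io

import pandas as pd

# One-row stand-in for the joined frame; all column names are assumed.
frame = pd.DataFrame(
    {
        "type": ["node"],
        "id": [2441363738],
        "number": ["7"],
        "street": ["National Circuit"],
        "suburb": ["Barton"],
        "postcode": ["2600"],
        "geometry": ["POINT (149.1333269 -35.3076927)"],  # placeholder text only
    }
)

# Only the listed columns are written; the geometry column is left out.
wanted = ["type", "id", "number", "street", "suburb", "postcode"]
buffer = io.StringIO()
frame.to_csv(buffer, columns=wanted)
csv_text = buffer.getvalue()
```

The row index is written as the first column by default, which would match the leading numbers in the output shown here; pass `index=False` to leave it out.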
which gives you:
310,node,2441363738,-35.3076927,149.1333269,,7,National Circuit,Barton,2600
2280,way,564187362,-35.1539837,149.1117804,,5,Jimmy Little
4414,way,823380125,-35.2242021,149.0456133,,55,Ennor Crescent,Florey,2615
2249,way,547120674,-35.2540932,149.1531645,,24,Piper Street,Ainslie,2602
1548,way,220316259,-35.3349388,149.0923894,,27,Coxen Street,Hughes,2605
4511,way,847394981,-35.2353182,149.0470223,,2,Diggles Street,Page,2614
3747,way,796706631,-35.2288001,149.0513507,,4,Caddy Place,Florey,2615
555,node,4214686496,-35.318041,149.1264149,,39,Empire Circuit,Forrest,2603
1052,node,7930404220,-35.1705767,149.0708312,,13,Gladstone Street,Hall,2618
I did this in an interactive IPython session, but if this is something
people want it could easily be turned into a Python script that does the
pull from Overpass and writes out the file.
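As a starting point for such a script, here is a rough sketch of the Overpass side. It is an illustration only: the endpoint URL, the `ISO3166-2` area filter, the timeout and the function names are all assumptions, not something taken from the session described above. The field list mirrors the seven columns of the sample dump.

```python
import urllib.parse
import urllib.request

# Assumed public Overpass endpoint; any other instance should work too.
OVERPASS_URL = "https://overpass-api.de/api/interpreter"


def build_query(area_code: str) -> str:
    """Return an Overpass QL query for addressed objects in one area."""
    fields = '::type,::id,::lat,::lon,"addr:unit","addr:housenumber","addr:street"'
    return (
        # false = no header row, "," = comma separator (the default is a tab)
        f'[out:csv({fields};false;",")][timeout:300];\n'
        f'area["ISO3166-2"="{area_code}"]->.region;\n'
        'nwr["addr:housenumber"](area.region);\n'
        # "out center" gives ways and relations a lat/lon as well
        "out center;\n"
    )


def fetch_csv(area_code: str) -> str:
    """Send the query and return the CSV text (needs network access)."""
    data = urllib.parse.urlencode({"data": build_query(area_code)}).encode()
    with urllib.request.urlopen(OVERPASS_URL, data=data) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_csv("AU-ACT")` and saving the result would give a file in roughly the shape of the dump at the top, which could then go through the join steps.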
I did the whole country in one go to see how well it scales and the run
time was pretty much the same. Of course you can't do postcodes for
everywhere, as we haven't put them all in yet.
Talk-au mailing list