Hi Nolan,

I think I can answer a few of your questions. Firstly, some background. The graph model of the OSM data is based largely on the XML-formatted OSM documents. There you will find 'nodes', 'ways', 'relations' and 'tags', each as its own XML tag, and as a consequence each also has its own neo4j-node in the graph. Another point is that a geometry can be based on one or more nodes or ways, so we always create another node for the geometry and link it to the osm-node, way or relation that represents that geometry.
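
To picture the layout described above, here is a toy in-memory sketch in plain Java. It does not use the Neo4j or neo4j-spatial API. The class ToyNode, the property values and the link directions are all assumptions made for illustration; only the idea of separate way, tags and geometry nodes, and the relationship names TAGS and GEOM, come from this thread.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class ToyOsmGraph {

    /** Stand-in for a graph node: a property map plus named links in both directions. */
    static final class ToyNode {
        final Map<String, Object> properties = new HashMap<>();
        private final Map<String, ToyNode> outgoing = new HashMap<>();
        private final Map<String, ToyNode> incoming = new HashMap<>();

        /** Creates a directed link this -[type]-> target (direction is an assumption). */
        void linkTo(String type, ToyNode target) {
            outgoing.put(type, target);
            target.incoming.put(type, this);
        }

        ToyNode outgoing(String type) { return outgoing.get(type); }
        ToyNode incoming(String type) { return incoming.get(type); }
    }

    /** Builds one way with its tags node and geometry node; returns the geometry node. */
    static ToyNode buildExample() {
        ToyNode way = new ToyNode();
        way.properties.put("way_osm_id", 1001L);      // made-up id
        way.properties.put("name", "Main Street");    // copied onto the way, but not "the tags"

        ToyNode tags = new ToyNode();
        tags.properties.put("highway", "residential");
        tags.properties.put("name", "Main Street");
        way.linkTo("TAGS", tags);

        ToyNode geometry = new ToyNode();             // holds neither tags nor coordinates
        geometry.properties.put("kind", "LineString"); // hypothetical property name
        way.linkTo("GEOM", geometry);
        return geometry;
    }

    /** Two hops: geometry <-GEOM- way -TAGS-> tags. Returns an empty map if a hop is missing. */
    static Map<String, Object> tagsForGeometry(ToyNode geometry) {
        ToyNode way = geometry.incoming("GEOM");
        if (way == null) return Collections.emptyMap();
        ToyNode tags = way.outgoing("TAGS");
        if (tags == null) return Collections.emptyMap();
        return tags.properties;
    }

    public static void main(String[] args) {
        ToyNode geometry = buildExample();
        // No TAGS link touches the geometry node directly, so this lookup finds nothing.
        System.out.println(geometry.incoming("TAGS"));
        // Going through the way node reaches the tags.
        System.out.println(tagsForGeometry(geometry));
    }
}
```

The point of the toy is only that the geometry node is one hop away from the way and two hops away from the tags; the real graph has more node types and relationships than this.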
What all this boils down to is that you cannot find the tags on the geometry node itself. You cannot even find the location on that node. If you want to use the graph model directly, as you have been trying, you really do need to know how the OSM data is modeled. For example, for a LineString geometry, you would need to traverse from the geometry node to the way node and finally to the tags node (to get the tags). Getting to the locations is even more complex.

Rather than do that, I would suggest that you work with the OSM API we provide through the OSMLayer, OSMDataset and OSMGeometryEncoder classes. Then you do not need to know the graph model at all. For example, OSMDataset has a method for getting a Way object from a node, and the returned object can be queried for its nodes, geometry, etc. Currently we provide methods for returning neo4j-nodes as well as objects that make spatial sense. One minor issue here is the ambiguity inherent in the fact that both neo4j and OSM use the term 'node', but for different things. We have various solutions to this, sometimes replacing 'node' with 'point' and sometimes prefixing with 'osm'. The unit tests in TestsForDocs include some tests for the OSM API.

> My first goal is to find the nearest OSM node to a given lat, lon. My
> attempts seem to be made of fail thus far, however. Here's my code:

Most of the OSM dataset is converted into LineStrings, and what you really want to do is find the closest vertex of the closest LineString. We have a utility function 'findClosestEdges' in the SpatialTopologyUtils class for that. The unit tests in TestSpatialUtils, and the testSnapping() method in particular, show how it is used.

> My thinking is that nodes should be represented as points, so I can't
> see why this fails. When I run this in a REPL, I do get a node back. So
> far so good. Next, I want to get the node's tags. So I run:

The spatial search will return 'geometries', which are spatial objects.
In neo4j-spatial every geometry is represented by a unique node, but it is not required that that node contain coordinates or tags. That is up to the GeometryEncoder. In the case of the OSM model, this information is elsewhere, because of the nature of the OSM graph, which is a highly interconnected network of points, most of which do not represent Point geometries, but are part of much more complex geometries (streets, regions, buildings, etc.).

> n.getSingleRelationship(OSMRelation.TAGS, Direction.INCOMING)

The geometry node is not connected directly to the tags node. You need two steps to get there. But again, rather than figure out the graph yourself, use the API. In this case, instead of getting the geometry node from the SpatialDatabaseRecord, just get the properties using getPropertyNames and getProperty(String). This API works the same on all kinds of spatial data, and in the case of OSM data will return the TAGS, since those are interpreted as attributes of the geometries.

> n.getSingleRelationship(OSMRelationship.GEOM,
> Direction.INCOMING).getOtherNode(n).getPropertyKeys
> I see what appears to be a series of tags (oneway, name, etc.) Why are
> these being returned for OSMRelation.GEOM rather than OSMRelation.TAGS?

These are not the tags. Now you have found the node representing an OSM 'Way'. This has a few properties on it that are relevant to the way: the name, whether the street is oneway or not, etc. Sometimes these are based on values in the tags, but they are not the tags themselves. This node is connected to the geometry node and the tags node, so you were half-way there (to the tags, that is). You started at the geometry node and stepped over to the way node, and one more step (this time with the TAGS relationship) would have got you to the tags. But again, I advise against trying to explore the OSM graph by itself. As you have already found, it is not completely trivial.
What you should have done is access the attributes directly from the search results.

> Additionally, I see the property way_osm_id, which clearly isn't a tag.
> It would also seem to indicate that this query returned a way rather
> than a node like I'd hoped. This conclusion is further born out by the
> tag names. So clearly I'm not getting the search correct. But beyond
> that, the way being returned by this search isn't close to the lat,lon I
> provided. What am I missing?

The lat/long values are quite a bit deeper in the graph. In the case of 'ways', we have a chain of nodes that runs from the first to the last node of the way. Each of these nodes has a relationship to another node that contains the location. The intermediate nodes are needed because a location node can belong to multiple ways. Needless to say, it is a bit complex to traverse all this completely manually as you are trying.

Another complication is that most points in the OSM model are not exposed as Point geometries in the spatial index. This is because most of them are intended as parts of bigger geometries. For example, if someone created a lake in OSM, made of 100 points in a polygon, those 100 points would not be indexed in the spatial index, but the Polygon would be. So using the spatial index to find points like this will not work. Only points that are tagged individually will appear in the spatial index. We have some rules in the importer to decide if points that are part of bigger geometries can also be indexed by themselves. Usually the presence of user tags is good enough for this. Very few points have their own tags, but the ways they belong to will have tags.

So if you are interested only in points of interest, then using the index to search for points will work. If you are interested in all OSM nodes, including those that are only vertices of ways, then use the SpatialTopologyUtils.findClosestEdges(Point,Layer) method.
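
To make the idea of "the closest vertex of the closest LineString" concrete, here is a minimal sketch in plain Java. It is not how SpatialTopologyUtils.findClosestEdges is implemented and uses no neo4j-spatial classes. The class and method names are hypothetical, lines are plain arrays of {x, y} vertices, and distances are planar (treating lon/lat as x/y), which is an assumption that only holds approximately over small areas.

```java
import java.util.Arrays;
import java.util.List;

public class ClosestVertex {

    /** Squared planar distance from point (px, py) to the segment (ax, ay)-(bx, by). */
    static double segmentDistSq(double px, double py,
                                double ax, double ay, double bx, double by) {
        double dx = bx - ax, dy = by - ay;
        double lenSq = dx * dx + dy * dy;
        // t is the position of the projected point along the segment, clamped to [0, 1].
        double t = lenSq == 0 ? 0 : ((px - ax) * dx + (py - ay) * dy) / lenSq;
        t = Math.max(0, Math.min(1, t));
        double cx = ax + t * dx, cy = ay + t * dy;
        return (px - cx) * (px - cx) + (py - cy) * (py - cy);
    }

    /** Squared distance from a point to a polyline; infinity if it has fewer than two vertices. */
    static double lineDistSq(double px, double py, double[][] line) {
        double best = Double.POSITIVE_INFINITY;
        for (int i = 0; i + 1 < line.length; i++) {
            best = Math.min(best, segmentDistSq(px, py,
                    line[i][0], line[i][1], line[i + 1][0], line[i + 1][1]));
        }
        return best;
    }

    /**
     * Picks the line nearest to the point, then the vertex of that line nearest to the point.
     * Returns null if no line has at least two vertices.
     */
    static double[] closestVertexOfClosestLine(double px, double py, List<double[][]> lines) {
        double[][] nearestLine = null;
        double bestLineDist = Double.POSITIVE_INFINITY;
        for (double[][] line : lines) {
            double d = lineDistSq(px, py, line);
            if (d < bestLineDist) {
                bestLineDist = d;
                nearestLine = line;
            }
        }
        if (nearestLine == null) return null;

        double[] bestVertex = null;
        double bestVertexDist = Double.POSITIVE_INFINITY;
        for (double[] v : nearestLine) {
            double d = (px - v[0]) * (px - v[0]) + (py - v[1]) * (py - v[1]);
            if (d < bestVertexDist) {
                bestVertexDist = d;
                bestVertex = v;
            }
        }
        return bestVertex;
    }

    public static void main(String[] args) {
        List<double[][]> lines = Arrays.asList(
                new double[][] {{0, 0}, {10, 0}},   // passes within 1 unit of the query point
                new double[][] {{4, 3}, {4, 9}});   // has a nearer vertex, but the line is farther
        double[] v = closestVertexOfClosestLine(4, 1, lines);
        System.out.println(Arrays.toString(v));
    }
}
```

Note that the vertex found this way is not always the globally nearest vertex: in the example, the vertex (4, 3) of the second line is nearer to the query point than either end of the first line, yet the first line is the nearer line, so one of its vertices is returned.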
> As an aside, in looking at the latest OSM import testcase, it seems like
> the batch inserter may now be optional. Is this true, and what
> benefits/disadvantages are there to its use? I tried importing the Texas
> OSM data on my fairly powerful laptop, but gave up after 12 hours and
> 170000 way imports (I think there are over a million in that dataset.)
> Other geospatial formats seem to do the import in a matter of hours, but
> this import seemed like it'd go on for days if I let it.

There is a problem with the lucene index. Well, not specifically lucene, but a problem with using a general purpose index to map between OSM-ways and OSM-nodes. This problem really affects the scalability of the importer. It certainly slows down a lot for larger datasets. We have investigated various other solutions. Peter has tried BDB and other external indexes, and I started writing a HashMap/Array based index. I did not finish that, but have seen that Chris Gioran has now started a similar (and apparently more complete / better) solution at https://github.com/digitalstain/BigDataImport. I would like to try it out when I get a chance. The issue we have scaling the OSM import is not unique, so a solution like this will probably help many people.

_______________________________________________
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user