OK, thanks for this. Unfortunately my project is getting less attention now than it ever has, but I finally sat down and reworked my architecture. Instead of working with Neo4J Nodes, I've reworked my library to use SpatialDatabaseRecord and am now having a bit more success.
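For anyone following along, here is a minimal sketch of the idea, not my actual library code: tags are read from the search result's properties instead of by traversing the graph. RecordLike and RecordTags are hypothetical names. RecordLike only stands in for the getPropertyNames and getProperty(String) accessors that Craig describes on SpatialDatabaseRecord in the message quoted below, so the snippet runs without neo4j-spatial on the classpath.

```scala
// Hypothetical stand-in for the two accessors Craig mentions on
// SpatialDatabaseRecord. The real class lives in neo4j-spatial; this trait
// is an assumption that keeps the sketch self-contained.
trait RecordLike {
  def getPropertyNames: Array[String]
  def getProperty(name: String): Any
}

object RecordTags {
  // Collect every attribute the record exposes into an immutable Map.
  // For OSM data these attributes are the tags (highway, name, oneway, ...).
  // Assumes getProperty returns a non-null value for every listed name.
  def tags(record: RecordLike): Map[String, String] =
    record.getPropertyNames
      .map(name => name -> record.getProperty(name).toString)
      .toMap
}
```

With the real library, the same loop would presumably be written directly against the SpatialDatabaseRecord returned by a search, with no Neo4J relationship traversal involved.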
I have three more query types to implement, and I'm not immediately seeing how to do this using the OSM API. I'm not sure if I need to drop to the lower-level JTS searches for these, so any guidance would be appreciated. Note that, in all that follows, "node" means OSM node, not Neo4J node:

def nearestWay(lat:Double, lon:Double, allowedTypes:Option[List[String]] = None) = {

Finds the nearest way to a given lat, lon. The list of allowed types lets me filter out power lines, footways, etc. by specifying values for the highway= tag.

def nearestNodeWithWay(lat:Double, lon:Double, allowedTypes:Option[List[String]]) = {

Finds the nearest node on any way.

def waysForNode(n:Node) = {

Lists the ways on which a given node is found. OSMDataset lets me do the opposite and returns the nodes for a way, but I don't immediately see how to go from a node to its ways.

Thanks.

On 06/19/2011 04:47 PM, Craig Taverner wrote:
> Hi Nolan,
>
> I think I can answer a few of your questions. Firstly, some background. The
> graph model of the OSM data is based largely on the XML formatted OSM
> documents, and there you will find 'nodes', 'ways', 'relations' and 'tags'
> each as their own xml-tag, and as a consequence each will also have their
> own neo4j-node in the graph. Another point is that the geometry can be based
> on one or more nodes or ways, and so we always create another node for the
> geometry, and link it to the osm-node, way or relation that represents that
> geometry.
>
> What all this boils down to is that you cannot find the tags on the geometry
> node itself. You cannot even find the location on that node. If you want to
> use the graph model in a direct way, as you have been trying, you really do
> need to know how the OSM data is modeled. For example, for a LineString
> geometry, you would need to traverse from the geometry node to the way node
> and finally to the tags node (to get the tags). To get to the locations is
> even more complex.
> Rather than do that, I would suggest that you work with
> the OSM API we provided with the OSMLayer, OSMDataset and OSMGeometryEncoder
> classes. Then you do not need to know the graph model at all.
>
> For example, OSMDataset has a method for getting a Way object from a node,
> and the returned object can be queried for its nodes, geometry, etc.
> Currently we provide methods for returning neo4j-nodes as well as objects
> that make spatial sense. One minor issue here is the ambiguity inherent in
> the fact that both neo4j and OSM make use of the term 'node', but for
> different things. We have various solutions to this, sometimes replacing
> 'node' with 'point' and sometimes prefixing with 'osm'. The unit tests in
> TestsForDocs include some tests for the OSM API.
>
>> My first goal is to find the nearest OSM node to a given lat, lon. My
>> attempts seem to be made of fail thus far, however. Here's my code:
>>
> Most of the OSM dataset is converted into LineStrings, and what you really
> want to do is find the closest vertex of the closest LineString. We have a
> utility function 'findClosestEdges' in the SpatialTopologyUtils class for
> that. The unit tests in TestSpatialUtils, and the testSnapping() method in
> particular, show use of this.
>
>> My thinking is that nodes should be represented as points, so I can't
>> see why this fails. When I run this in a REPL, I do get a node back. So
>> far so good. Next, I want to get the node's tags. So I run:
>>
> The spatial search will return 'geometries', which are spatial objects. In
> neo4j-spatial every geometry is represented by a unique node, but it is not
> required that that node contain coordinates or tags. That is up to the
> GeometryEncoder.
> In the case of the OSM model, this information is
> elsewhere, because of the nature of the OSM graph, which is a highly
> interconnected network of points, most of which do not represent Point
> geometries, but are part of much more complex geometries (streets, regions,
> buildings, etc.).
>
>> n.getSingleRelationship(OSMRelation.TAGS, Direction.INCOMING)
>
> The geometry node is not connected directly to the tags node. You need two
> steps to get there. But again, rather than figure out the graph yourself,
> use the API. In this case, instead of getting the geometry node from the
> SpatialDatabaseRecord, rather just get the properties using getPropertyNames
> and getProperty(String). This API works the same on all kinds of spatial
> data, and in the case of OSM data will return the TAGS, since those are
> interpreted as attributes of the geometries.
>
>> n.getSingleRelationship(OSMRelation.GEOM,
>> Direction.INCOMING).getOtherNode(n).getPropertyKeys
>> I see what appears to be a series of tags (oneway, name, etc.). Why are
>> these being returned for OSMRelation.GEOM rather than OSMRelation.TAGS?
>>
> These are not the tags. Now you have found the node representing an OSM
> 'Way'. This has a few properties on it that are relevant to the way, the
> name, whether the street is oneway or not, etc. Sometimes these are based on
> values in the tags, but they are not the tags themselves. This node is
> connected to the geometry node and the tags node, so you were half-way there
> (to the tags that is). You started at the geometry node, and stepped over to
> the way node, and one more step (this time with the TAGS relationship) would
> have got you to the tags.
>
> But again, I advise against trying to explore the OSM graph by itself. As
> you have already found, it is not completely trivial. What you should have
> done is access the attributes directly from the search results.
>
>> Additionally, I see the property way_osm_id, which clearly isn't a tag.
>> It would also seem to indicate that this query returned a way rather
>> than a node like I'd hoped. This conclusion is further borne out by the
>> tag names. So clearly I'm not getting the search correct. But beyond
>> that, the way being returned by this search isn't close to the lat,lon I
>> provided. What am I missing?
>>
> The lat/long values are quite a bit deeper in the graph. In the case of
> 'ways', we have a chain of nodes that run from the first to the last node of
> the way. Each of these nodes has a relationship to another node that
> contains the location. The reason for the intermediate nodes is that the
> location nodes can exist in multiple ways. Needless to say it is a bit
> complex to traverse all this completely manually as you are trying.
>
> Another complication is that most points in the OSM model are not exposed as
> Point Geometries in the spatial index. This is because most of them are
> intended as parts of bigger geometries. For example, if someone created a
> lake in OSM, made of 100 points in a polygon, those 100 points would not be
> indexed in the spatial index, but the Polygon would be. So using the spatial
> index to find points like this will not work. Only points that are tagged
> individually will appear in the spatial index. We have some rules in the
> importer to decide if points that are part of bigger geometries can also be
> indexed by themselves. Usually the presence of user tags is good enough for
> this. Needless to say, very few points have their own tags, but the ways
> they belong to will have tags.
>
> So if you are interested in only points of interest, then using the index to
> search for points will work. If you are interested in all OSM nodes,
> including those that are only vertices of ways, then use the
> SpatialTopologyUtils.findClosestEdges(Point,Layer) method.
>
>> As an aside, in looking at the latest OSM import testcase, it seems like
>> the batch inserter may now be optional.
>> Is this true, and what
>> benefits/disadvantages are there to its use? I tried importing the Texas
>> OSM data on my fairly powerful laptop, but gave up after 12 hours and
>> 170000 way imports (I think there are over a million in that dataset.)
>> Other geospatial formats seem to do the import in a matter of hours, but
>> this import seemed like it'd go on for days if I let it.
>>
> There is a problem with the lucene index. Well, not specifically lucene, but
> a problem with using a general purpose index to map between OSM-ways and
> OSM-nodes. This problem really affects the scalability of the importer. It
> certainly slows down a lot for larger datasets. We have investigated various
> other solutions. Peter has tried BDB and other external indexes, and I
> started writing a HashMap/Array based index. I did not finish that, but have
> seen that Chris Gioran has now started a similar (and apparently more
> complete / better) solution at https://github.com/digitalstain/BigDataImport.
> I would like to try it out when I get a chance. The issue we have scaling
> the OSM import is not unique, so a solution like this will probably help
> many people.
> _______________________________________________
> Neo4j mailing list
> User@lists.neo4j.org
> https://lists.neo4j.org/mailman/listinfo/user