Re: [Neo4j] bdb-index
Looking over these

On Sat, Jul 30, 2011 at 1:42 PM, John cyuczieekc cyuczie...@gmail.com wrote:

Those got concatenated for some reason, so I'll repost them here so I can see them:

Relationship:
  To associated Node: RelId - NodeId
  From associated Node: NodeId - RelId

RelationshipType:
  To associated Node: RelationshipType.name - NodeId
  From associated Node: NodeId - RelationshipType.name

Here, this may indeed be faster, as you said (because it doesn't require a double lookup in both Relationship: and RelationshipType: above), but I'm wondering if it would be neater if RelationshipType: here were RelId - RelationshipType, expanding this notation as:

  To associated RelId: RelationshipType - RelId
  From associated RelId: RelId - RelationshipType

The logic would be: if you have the start node, you can get the RelId (via Relationship: above), and then you can get the RelationshipType; and if you have the RelationshipType, you can get all RelIds of that type, and with those you can look up the nodes that form them in Relationship: above. But of course the way you said it is faster, I guess. I don't suppose lookup by name would be much slower than lookup by long/id, and XA transactions (or any transactions) would make sure Relationship: is consistent with RelationshipType: anyway. I should probably be thinking about/doing other things; this seemed useless.

RelationshipRole:
  To associated Node: RelationshipRole.name - NodeId
  From associated Node: NodeId - RelationshipRole.name

PropertyType:
  To associated Node: PropertyType.name - NodeId
  From associated Node: NodeId - PropertyType.name

Property:
  To associated Node: Node, PropertyType.name - NodeId
  From associated Node: NodeId - Node, PropertyType.name

On Fri, Jul 29, 2011 at 5:27 PM, Niels Hoogeveen pd_aficion...@hotmail.com wrote:

What I need to store in an index depends on the type of element that needs to be reified.
Relationship:
  To associated Node: RelId - NodeId
  From associated Node: NodeId - RelId

RelationshipType:
  To associated Node: RelationshipType.name - NodeId
  From associated Node: NodeId - RelationshipType.name

RelationshipRole:
  To associated Node: RelationshipRole.name - NodeId
  From associated Node: NodeId - RelationshipRole.name

PropertyType:
  To associated Node: PropertyType.name - NodeId
  From associated Node: NodeId - PropertyType.name

Property:
  To associated Node: Node, PropertyType.name - NodeId
  From associated Node: NodeId - Node, PropertyType.name

Niels

Date: Fri, 29 Jul 2011 06:49:31 +0200
From: cyuczie...@gmail.com
To: user@lists.neo4j.org
Subject: Re: [Neo4j] bdb-index

Hi xD I'm not clear what you need to store here. If I understand correctly, you could store in 2 primary bdb databases the nodeID (ie. long) of each node in a relationship, ie. key-value:

  dbForward: A-B A-C X-D X-B
  dbBackward: B-A B-X C-A D-X

A, B, C, D, X are all nodeIDs, ie. longs. This way you could check if A-B exists, or get all of A's endNodes, or find what startNodes are pointing to the endNode B. These would be stored sorted, in a BTree, so lookup would be fast. You can consider ie. A as being a set of B and C, and X as being a set of B and D (that is, you cannot set the order as in a list; they are sorted by bdb for fast retrievals). (But upon these sets you can build lists np, that is, using only bdb; tho you won't need that using neo4j.)

So, if this is the kind of index you wanted... (I am not aware of specific indexes with bdb, though that doesn't mean they don't exist.) Insertions would require transaction protection so both A-B in dbForward and B-A in dbBackward are inserted atomically. Parsing A then X of B- in dbBackward, for example, can only be done with a cursor...

Either way, I'm taking a look at that bdb-index thingy; will report back if I have any ideas heh

John.

On Thu, Jul 28, 2011 at 9:42 PM, Niels Hoogeveen pd_aficion...@hotmail.com wrote:

Thank you, Peter. There is no rush here.
It would be nice to investigate this option, but it can wait until Mattias has returned and sifted through urgent matters.

The question is even whether it would be a good idea to use an index to do the bookkeeping for Enhanced API. As it is now, the reification of eg. a Relationship requires one property to be set on the relationship, containing the node ID of the associated node. On the associated node is a property containing the ID of the relationship, so there is a bidirectional lookup. Introducing an index would remove the need for these additional properties, but would lead to slower lookup times (no matter how fast the index).

So it's a trade-off between speed and cleanliness of namespace. Using the Enhanced API disallows certain property names from being used in user applications. The property names used in Enhanced API all start with org.neo4j.collections.graphbd., so there is little chance a
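
The dbForward/dbBackward layout John describes in the quoted message can be sketched in plain Java. This is only an in-memory illustration under stated assumptions: sorted maps stand in for the two Berkeley DB databases, a single lock stands in for the transaction that keeps both directions consistent, and the class and method names (`ForwardBackwardIndex`, `put`, `contains`, `endNodes`, `startNodes`) are hypothetical, not part of bdb-index or Neo4j.

```java
import java.util.NavigableSet;
import java.util.TreeMap;
import java.util.TreeSet;

// In-memory stand-in for the dbForward/dbBackward pair; this is not Berkeley DB code.
class ForwardBackwardIndex {
    // start node ID -> sorted set of end node IDs (plays the role of dbForward)
    private final TreeMap<Long, TreeSet<Long>> forward = new TreeMap<Long, TreeSet<Long>>();
    // end node ID -> sorted set of start node IDs (plays the role of dbBackward)
    private final TreeMap<Long, TreeSet<Long>> backward = new TreeMap<Long, TreeSet<Long>>();

    // Both directions are written under one lock, so no reader sees A-B in
    // forward without B-A in backward (a stand-in for a real transaction).
    synchronized void put(long start, long end) {
        forward.computeIfAbsent(start, k -> new TreeSet<Long>()).add(end);
        backward.computeIfAbsent(end, k -> new TreeSet<Long>()).add(start);
    }

    // Does the pair start-end exist?
    synchronized boolean contains(long start, long end) {
        TreeSet<Long> ends = forward.get(start);
        return ends != null && ends.contains(end);
    }

    // All end nodes of the given start node, in sorted order.
    synchronized NavigableSet<Long> endNodes(long start) {
        return copyOf(forward.get(start));
    }

    // All start nodes pointing at the given end node, in sorted order.
    synchronized NavigableSet<Long> startNodes(long end) {
        return copyOf(backward.get(end));
    }

    // Defensive copy so callers cannot modify the index through the result.
    private static NavigableSet<Long> copyOf(TreeSet<Long> ids) {
        return ids == null ? new TreeSet<Long>() : new TreeSet<Long>(ids);
    }
}
```

With A, B, C, D, X mapped to the longs 1, 2, 3, 4, 5, inserting A-B, A-C, X-D and X-B lets `endNodes(1)` return B and C, and `startNodes(2)` return A and X, which matches the two listings in the message. The results come back sorted rather than in insertion order, which is the set-not-list behaviour John points out.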
[Neo4j] Neo4j Disk Space
Hi, I have a question about optimizing the disk space required by Neo4j. Currently I have a Neo4j database containing 238796 nodes, 5418484 properties and 1328479 relationships. The disk space taken by Neo4j is about 17 GB, which I think is too much. Is there any optimization that can be considered to reduce the disk space, given that I only index the nodes? Thanks a lot

--
Ahmad Bakr
Software Engineer
eSpace Technologies
www.espace.com.eg
Mob: (+2) 010-410-2280

___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Neo4j Disk Space
Could you explain the number, types and sizes of the properties (distribution)? Nodes take 9 bytes per node and rels 33 bytes on disk. Simple properties up to a long take up 25 bytes; strings that don't fit into the short-string compression, and arrays, take up more space in the stringstore and arraystore.

Please list the length of the files in your store directory as well as the directory sizes for each index (under index/lucene/nodes/* and index/lucene/relationships/*).

Thanks
Michael

On 31.07.2011 at 11:05, Ahmad Bakr wrote:

Hi, I have a question about optimizing the disk space required by Neo4j. Currently I have a Neo4j database containing 238796 nodes, 5418484 properties and 1328479 relationships. The disk space taken by Neo4j is about 17 GB, which I think is too much. Is there any optimization that can be considered to reduce the disk space, given that I only index the nodes? Thanks a lot
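
The record sizes Michael quotes can be turned into a rough lower bound for Ahmad's store. The sketch below is back-of-the-envelope arithmetic only: it assumes every property is a simple one (25 bytes) and ignores string and array stores, indexes and any other files, and the class and method names are made up for illustration.

```java
// Rough lower bound on store size from the per-record sizes quoted in the thread.
class StoreSizeEstimate {
    static final long NODE_BYTES = 9;             // per node, as quoted
    static final long RELATIONSHIP_BYTES = 33;    // per relationship, as quoted
    static final long SIMPLE_PROPERTY_BYTES = 25; // per simple property (up to a long), as quoted

    // Assumption: all properties are "simple"; long strings and arrays would add more.
    static long estimateBytes(long nodes, long relationships, long simpleProperties) {
        return nodes * NODE_BYTES
                + relationships * RELATIONSHIP_BYTES
                + simpleProperties * SIMPLE_PROPERTY_BYTES;
    }

    public static void main(String[] args) {
        // Counts taken from Ahmad's message.
        long bytes = estimateBytes(238_796L, 1_328_479L, 5_418_484L);
        System.out.printf("%d bytes (about %.0f MiB)%n", bytes, bytes / (1024.0 * 1024.0));
    }
}
```

For the counts in the question this comes to 181,451,071 bytes, roughly 173 MiB. That is far below the reported 17 GB, which is presumably why Michael asks for the individual file lengths and index directory sizes: under these assumptions, most of the space would have to sit outside the fixed-size node, relationship and simple-property records.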
Re: [Neo4j] Synchronization of EmbeddedReadOnlyGraphDatabase - Bug?
Hello Neo4J-Team, has anyone looked into this issue? Any insights? Thanks again, Mathias

-- View this message in context: http://neo4j-community-discussions.438527.n3.nabble.com/Neo4j-Synchronization-of-EmbeddedReadOnlyGraphDatabase-Bug-tp3174626p3213450.html Sent from the Neo4j Community Discussions mailing list archive at Nabble.com.
[Neo4j] Brainstorming on my project: neo4john
Hey guys, I've been thinking that I would like to have a topic (like this current one) where I would be allowed to post anything related to brainstorming on my project, which is currently a mix of neo4j and berkeleydb java edition. That is, I would like to start from scratch and explain and explore ideas, where anyone could step in and say what's on their mind, especially with notes on how that would be better with neo4j rather than berkeleydb.

But I'd like to know if this is a good idea to do here, and if any neo4j people would allow me to do this here. This would probably mean you'd receive lots of emails with this subject and Re: this subject, which you may not want to receive, in which case I would suggest a filter to ignore such emails (easily done within gmail for example) - but be sure not to ignore the sender, which is always user@lists.neo4j.org for any topic/subject, not just this one. So anyone could potentially ignore the emails that I send here, should they be annoying or too many too soon.

Still, I would not do this unless most (if not all) of you (mainly neo4j devs I'd say) agree to allow me to post here. I would post replies only to this topic... well you get the idea :) Though you should reserve the right at any time to say stop if you don't want me to post anymore (due to ie. too frequent posts, too dumb content, content that seems like noise and doesn't help anyone) - that is, in the case you allow me to post :) - so if allowed, please reply and say so; otherwise, if there are no replies saying allowed or not, it will default to `not allowed`, so I won't try to post anymore :) - be kind lol

If I know Peter, and I don't lol, he'd be happy with some brainstorming I think, right? :) Then, what about the others? Btw, if you feel like saying that I'm allowed would be too much of a responsibility or taking it from others, then maybe say that you wouldn't mind if I posted or not, or that it would make no difference to you. Though the neo4j guys/girls (ie.
devs) would probably know if `me posting on this topic would be a good idea, for them and the users using this mailing list`.

If you're wondering why I would do this: most importantly, because it helps me to type my thoughts rather than just think them in my head. If I don't type them I get easily distracted by other things and they end up being postponed/abandoned. Expressing my thoughts by typing them seems to be bridging both the physical and the mental in a way that they're both happy to do this heh. And also this might be helpful to others reading this, unless they get annoyed by my way of writing (which means both me and him are at fault, or rather the cause of his annoyance) or they get annoyed for other reasons but still triggered by them reading what I write.

I am not good at writing or at programming for that matter, and I'm aware of this, but I believe that expressing myself in this written form might help (at least) me (and I hope not at the expense of others ie. like spam) and will likely, along the way, trigger some progress in me, which if you ask me is in everyone's interest: the more people evolve the better it is for everyone, no? yes, good :)

No one is required (or expected by me) to read what I write, btw; but you should know that my subconscious, for some reason, likes knowing that someone did read and got beneficial results from it, ie. got something positive rather than negative (though any change is progress, except ie. if you make a system on top of that saying that ie.
`counter` must increase for it to be considered progressing, so then while any change is progress at the lower level even if counter decreases, at this higher level, counter decreasing is not considered progress anymore; but then again at an even higher level, over time counter could be increasing by 10 then decreasing by 10 such that it would seem to be oscillating, and this would be considered no progress, rather it would be considered constant, unless the oscillation amplitude would change or increase ie. counter would increase by 15 and then decrease by 15 over time, this would be progress as considered at this level). So while I am sort of waiting for a good enough reason to hack my own subconscious and change it (assumed that it's possible, hey neuroplasticity would say so heh) such that i wouldn't require expressing my thoughts in writing or feeling empowered knowing that others are reading that, (while that) I am going along with what seems to be the next feel-empowered step...(kinda forgot what I wanted to say here xD) Also the subconscious(not just mine I'd say) likes to know that it did something, sort of like has a foundation for allowing itself to feel empowered, by having something done in the physical worlds that it is proud of, can be used by it as a permission slip to allow itself to feel happy about it or rather empowered; so in this respect this me writing my thoughts here stuff also helps with that. :) There's also some inherent desire to
[Neo4j] updateOrAdd method
I'm facing strange problems with this method. I am sure I am giving it the right node ID and the new properties, but I still find the same old, un-updated node in the DB. I also followed the instructions regarding flushing and shutting down the index. Can you please help me?

-- View this message in context: http://neo4j-community-discussions.438527.n3.nabble.com/updateOrAdd-method-tp3213619p3213619.html
Re: [Neo4j] Brainstorming on my project: neo4john
Hi John, I think when approaching a project there are two distinct issues at play: one is the tooling level, the other is the actual solution you are trying to create for an actual problem.

When looking at the tooling level it is great to have as much covered as possible. Neo4j offers a graph database and pretty good integration with Lucene. This overall is a good choice of tools, because there is hardly any overlapping functionality. Neo4j offers storage and navigation, while Lucene provides indexing. So the tools are pretty much orthogonal to each other.

When adding BDB to the mix, things become a bit messier. BDB offers indexing and storage, so now you have to decide what to use BDB for. If you choose to only use it for indexing, like an alternative to Lucene, things remain pretty much orthogonal. When you decide to use BDB for storage, the question becomes: what to store in Neo4j and what to store in BDB.

When it comes to storing and retrieving properties of entities, both seem to be pretty fast, and unless you have serious performance issues with the storage of properties, either Neo4j or BDB is suitable for the task. When it comes to storing relationships between entities, Neo4j is by far the better solution. Fetching a relationship is a really cheap action, since it only involves moving a file pointer to a certain position (id * record length) and reading the record (ie. if that data is not available in the cache already). Once you have a relationship, it is also cheap to fetch the associated nodes (again moving a file pointer to a position, or reading from the cache). And while we are at it, once you have a node or a relationship, it is again cheap to fetch the properties associated with it.

The motto of Neo4j seems to be: keep it local, stupid. This works great, unless things are not local, and this is where indexing comes into play.
Suppose we know a name or a certain value and want to know what nodes or relationships it is associated with; doing a local search then becomes ineffective. We could iterate over all nodes (and/or all relationships) and check for that particular value, but that doesn't scale beyond a couple of thousand nodes or relationships.

One option could be to do the indexing in the graph. We could create a node that can easily be addressed through the reference node and that functions as a tree root, and traverse over the index to find a particular node or relationship. It works, but is not as fast as dedicated indexing. A dedicated index will fetch index blocks in one read operation and manipulate those index blocks in memory, whereas an index built in Neo4j would model an index block as a set of nodes that need to be read one after another (and likely from very different places in the store). So a dedicated index is more local than Neo4j can be when manipulating the index trees. A dedicated index will win hands down over Neo4j when it comes to raw speed of an index lookup/manipulation, and will likely consume less memory doing so.

Neo4j already supports Lucene, which is great for certain jobs (full text indexing, composite queries), but is probably (I would have to run tests to verify this assumption) slower than BDB when it comes to simple key-value mappings. Lucene is also not very good at handling unicity constraints, an area where a more regular key-value store like BDB has an advantage too.

All this is just about the tooling level of an application (fun in its own right, but it doesn't solve any real problems). Things become more interesting when we start looking at an actual application. So my question is: what use cases do you want to solve with your neo4john project? Your example with buttons on a screen is a bit too high level, because it contains a lot more tooling than just neo4j and/or BDB.
You would need presentation (GUI or HTML) and reactiveness (how to respond to input) and you would need to somehow model your domain. So my suggestion would be to first list a couple of real world scenarios you want to solve with your neo4john project and then look at your tooling to see what trade-offs you need to make to implement it. You may need a mix of Neo4j, Lucene and BDB, but maybe you don't need all three to solve your particular problem.

In any case, it's important to rise above the tooling level, because that is only a means to a goal. Even if your project provides additional tooling, there is still an application level to it. Focusing on the application level is good practice, because only there do you actually provide solutions.

Niels

Date: Sun, 31 Jul 2011 15:09:20 +0200
From: cyuczie...@gmail.com
To: user@lists.neo4j.org
Subject: [Neo4j] Brainstorming on my project: neo4john

Hey guys, I've been thinking that I would like to have a topic (like this current one) where I would be allowed to post anything related to brainstorming on my project which is currently a mix of neo4j and berkeleydb java
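
Niels' point that fetching a relationship only means jumping to position (id * record length) can be illustrated with a small fixed-size record store. This is a sketch under loud assumptions: the 33-byte record length is the relationship size mentioned elsewhere on the list, but the field layout inside the record (start node ID at offset 0, end node ID at offset 8) and the names `FixedRecordStore`, `offsetOf`, `write` and `read` are invented here and are not Neo4j's actual store format.

```java
import java.nio.ByteBuffer;

// Toy fixed-size record store: a record is located by arithmetic, not by search.
class FixedRecordStore {
    static final int RECORD_LENGTH = 33; // assumed record size; the layout below is hypothetical

    private final ByteBuffer store;

    FixedRecordStore(int capacity) {
        store = ByteBuffer.allocate(capacity * RECORD_LENGTH);
    }

    // Position of a record is simply id * record length.
    static long offsetOf(long id) {
        return id * RECORD_LENGTH;
    }

    // Hypothetical layout: bytes 0-7 hold the start node ID, bytes 8-15 the end node ID.
    void write(long id, long startNode, long endNode) {
        int pos = Math.toIntExact(offsetOf(id));
        store.putLong(pos, startNode);
        store.putLong(pos + 8, endNode);
    }

    // One positioned read per record, independent of how many records exist.
    long[] read(long id) {
        int pos = Math.toIntExact(offsetOf(id));
        return new long[] { store.getLong(pos), store.getLong(pos + 8) };
    }
}
```

The cost of `read` does not depend on the number of records, which is the "local" property being described. Finding a record by value rather than by ID has no such shortcut, and that is where an index comes in.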
Re: [Neo4j] updateOrAdd method
Hi Ahmed, The best way for us to help is if you send a unit test that demonstrates the problem you're having. Can you send some JUnit code please?

Jim

On 31 Jul 2011, at 16:01, ahmed.elsharkasy wrote:

i face strange problems with this method , i am sure i am giving it the right node id and the new properties , but i found the same old un-updated node in the DB , also i followed the instructions regarding flushing and shutting down the index please can you help me?
Re: [Neo4j] Brainstorming on my project: neo4john
Hi John, Niels, I think of indexes in Neo4j as long-lived names. Not quite the keep it local that Niels mentioned, but not entirely dissimilar either. Those long-lived names tend to give you starting points in the graph from where you perform graph operations. Indexing therefore constitutes less of your database design than it would in an RDBMS. Marko had a good line about this: graphs are adjacency free indexes (or words to that effect).

Jim
Re: [Neo4j] Brainstorming on my project: neo4john
Marko had a good line about this: graphs are adjacency free indexes (or words to that effect).

:) -- Index-free Adjacency

Marko.
[Neo4j] Remote database contents inspection
Has anyone tried running Neoclipse with X11 forwarding on a remote Linux server? Any pointers? It does not seem to work for me :( Any other tips for inspecting the remote database contents of an embedded installation, without the web admin?

Regards,
Dima Gutzeit
Re: [Neo4j] Remote database contents inspection
neo4j-shell, or you could try to use WrappingNeoServerBootstrapper, which runs a neo4j-server with an existing, embedded GraphDatabaseService. See this thread for context: http://neo4j-community-discussions.438527.n3.nabble.com/How-to-download-neo4j-1-3-for-using-jo4Neo-tp3191002p3206212.html

Cheers
Michael

On 31.07.2011 at 18:52, Dima Gutzeit wrote:

Has anyone tried running Neoclipse with X11 Forwarding on a remote linux server? Any pointers ? It does not seem to work for me :( Any other tips for inspecting remote database contents of embedded installation, without the web admin ? Regards , Dima Gutzeit
Re: [Neo4j] Node#getRelationshipTypes
Good point. It could for all practical purposes even be Iterable<RelationshipType>, so they can be lazily fetched, as long as the underlying implementation makes certain that any iteration of the RelationshipTypes forms a set (no duplicates). There is no need to have RelationshipTypes in any particular order, and if that is needed in the application, they can usually be sorted locally, since Nodes will generally have associated Relationships of only a handful of RelationshipTypes.

That said, the more important question is whether the Neo4j store can produce this meta-information. For sparsely connected nodes, it is possible to iterate over the relationships and return the set of RelationshipTypes, but this is not a proper solution when nodes are densely connected. So there is no general solution for this question yet.

Niels

From: j...@neotechnology.com
Date: Sun, 31 Jul 2011 17:29:29 +0100
To: user@lists.neo4j.org
Subject: Re: [Neo4j] Node#getRelationshipTypes

Hi Niels, Ignoring the operational use for getting relationship types, I do think these should be generalised from:

  RelationshipType[] getRelationshipTypes();
  RelationshipType[] getRelationshipTypes(Direction);

to:

  Set<RelationshipType> getRelationshipTypes();
  Set<RelationshipType> getRelationshipTypes(Direction);

Unless you need the ordering and you think the overhead of creating some kind of Set is too onerous from a performance point of view.

Jim
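
The fallback Niels describes for sparsely connected nodes, iterating over the relationships and collecting their types into a set, might look like the sketch below. It deliberately avoids the Neo4j API: `Rel` is a hypothetical stand-in that only carries a type name, and `distinctTypes` is an invented helper, not a proposed implementation of getRelationshipTypes().

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Collects the distinct relationship type names seen while iterating a node's relationships.
class RelationshipTypeSets {
    // Hypothetical stand-in for a relationship; only its type name matters here.
    static final class Rel {
        final String typeName;

        Rel(String typeName) {
            this.typeName = typeName;
        }
    }

    // Visits every relationship once, so the cost grows with the node's degree.
    static Set<String> distinctTypes(Iterable<Rel> relationships) {
        Set<String> types = new LinkedHashSet<String>();
        for (Rel rel : relationships) {
            types.add(rel.typeName); // duplicates are dropped by the set
        }
        return Collections.unmodifiableSet(types);
    }
}
```

Because every relationship is visited, the work is proportional to the number of relationships on the node rather than to the number of distinct types. That is acceptable for a handful of relationships and is exactly the problem for densely connected nodes that the message raises.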
Re: [Neo4j] Brainstorming on my project: neo4john
Interesting thought, and it is certainly true that indexing is much less of a concern in a graph database than in a normal RDBMS, where generally every table needs to have a primary key and where you need an index on the primary key to be able to do joins (at least to do them somewhat quickly).

In a graph database relationships are explicit and static, while in an RDBMS inter-table relationships are implicit and dynamic. This distinction means that an RDBMS can answer some ad-hoc relationship questions where this would be impractical in a graph database. For example, in an RDBMS I can ask for a join over the Persons and the Country tables and return the Person_ID and the Country_ID if the country code is contained in the last name of the person. In a graph database asking that same question is not that easy, unless of course we have explicitly created relationships from Person nodes to Country nodes if the country code is contained in the last name of the person (unlikely).

Being able to find relationships in an implicit and dynamic way has of course a performance penalty. After all, it's much cheaper to follow a file pointer than to look up a value in an index (or worse, do a full table scan).

That said, there are situations where we need to jump to another position in the graph. One way is through the use of IDs, which is a very cheap non-local jump. The other is through indexes, which come in two variations: in-graph (using a traversal to mimic a non-local jump), or through an external index service.

In-graph indexes can work really well, but are not as optimized for the task as dedicated index services are. The main reason is that dedicated index services can map index blocks to memory, while neo4j is much more fine-grained, having to load the content of an index block node by node and relationship by relationship. This means that in-graph indexes don't really scale all that well, especially when they get bigger than the memory allocated.
On a cache miss, a dedicated index service can swap out a couple of index blocks, where neo4j needs to swap out individual nodes and relationships. If index blocks are needed again, a dedicated index service can simply load those blocks in one read operation, while an in-graph index would have to reload those individual nodes and relationships one at a time.

Niels

From: j...@neotechnology.com
Date: Sun, 31 Jul 2011 17:27:33 +0100
To: user@lists.neo4j.org
Subject: Re: [Neo4j] Brainstorming on my project: neo4john

Hi John, Niels, I think of indexes in Neo4j as long-lived names. Not quite the keep it local that Niels mentioned, but not entirely dissimilar either. Those long lived-names tend to give you starting points in the graph from where you perform graph operations. Indexing therefore constitutes less of your database design than it would in a RDBMS. Marko had a good line about this: graphs are adjacency free indexes (or words to that affect). Jim
Re: [Neo4j] Brainstorming on my project: neo4john
Hey Niels, thanks for the concise reply.

On Sun, Jul 31, 2011 at 5:10 PM, Niels Hoogeveen pd_aficion...@hotmail.com wrote:

Hi John, I think when approaching a project there are two distinct issues at play, one is the tooling level, another is the actual solution you are trying to create for an actual problem.

I seem to want a generic solution for multiple problems. Something generic enough that it can be applied specifically. The tooling (if I understand this right) should be able to be used by the user to mould to his own needs, sort of like java and/or eclipse can be used to build whatever program the user wants to code. I want this to be a foundation, such that supposedly I could spend 99% of the time inside this doing my work, rather than using the OS and its applications...

When looking at the tooling level it is great to have as much covered as possible. Neo4j offers a graph database and pretty good integration with Lucene. This overall is a good choice of tools, because there is hardly any overlapping functionality. Neo4j offers storage and navigation, while Lucene provides indexing. So the tools are pretty much orthogonal to each other. When adding BDB to the mix, things become a bit messier. BDB offers indexing and storage, so now you have to decide what to use BDB for. If you choose to only use it for indexing, like an alternative to Lucene, things remain pretty much orthogonal.

as I understand it here that's exactly what bdb-index is supposed to do for you (replace or be similar in interface/usage as Lucene index) for your graph-collections

When you decide to use BDB for storage, the question becomes: what to store in Neo4j and what to store in BDB.
the way it is right now, I could implement what I got in either bdb or in neo4j; i don't need both; btw, lucene index seems as fast as bdb, possibly 1ms slower, from what I've tested (granted that it was superficially tested)

When it comes to storing and retrieving properties to entities both seem to be pretty fast, and unless you have serious performance issues with the storage of properties, either Neo4j or BDB is suitable for the task. When it comes to storing relationships between entities, Neo4j is by far the better solution. Fetching a relationship is a really cheap action, since it only involves moving a file pointer to a certain position (id * record length) and read the record (ie. if that data is not available in the cache already).

as I've seen it though, I need to use an index (ie. lucene) such that I could check with neo4j if A-B exists where A has 1 million outgoing relationships to 1 million different nodes of which B is one; else, it's over 700ms to a few seconds by using findSinglePath (of course I might've missed something); however, when using an index, it's ~1ms

When having a relationships it is also cheap to fetch the associated nodes (again moving a file pointer to a position, or read it from the cache). And while we are at it, when having a node or a relationship, it is again cheap to fetch the properties associated to that node. The motto of Neo4j seems to be, keep it local stupid. This works great, unless things are not local and this is where indexing comes into play. Suppose we know a name or a certain value and want to know what nodes or relationships it is associated with, doing a local search becomes ineffective. We could iterated over all nodes (and or all relationships) and check for that particular value, but that doesn't scale beyond a couple of thousand nodes or relationships.
that is one use case that I need, but this search is done by bdb in ~0ms instead of me doing any iterations via java code, though in my case the value is either the string name of a Node or just another node id

One option could be to do the indexing in the graph. We could create a node that can easily be addressed through the reference node, that functions as a tree root and traverse over he index to find a particular node or relationship.

did you do this btw, with SortedTree or similar, within graph-collections? I admit I only superficially skimmed it at some point and noticed some acquireLock() method that attracted my attention - unrelated

It works, but is not as fast as dedicated indexing. A dedicated index will fetch index blocks in one read operation and manipulate those index blocks in memory, where an index build in Neo4j would model an index block as a set of nodes that need to be read one after another (and likely from very different places in the store). So a dedicated index is more local than Neo4j can be when manipulating the index trees.

that lucene index, ie. RelationshipIndex, works rather well, 0 to 1ms results with it, similar to bdb, so using it would be a must for me, assuming I have millions of relationships :)

A dedicated index will win hands down from Neo4j when it comes to raw speed of an index lookup/manipulation and likely consume less memory doing so.
Re: [Neo4j] Remote database contents inspection
Cool, which dependency is needed to use WrappingNeoServerBootstrapper?

Regards,
Dima Gutzeit.

On Sun, Jul 31, 2011 at 7:57 PM, Michael Hunger michael.hun...@neotechnology.com wrote:

neo4j-shell or you could try to use WrappingNeoServerBootstrapper which runs a neo4j-server with an existing, embedded GraphDatabaseservice. See this thread for context: http://neo4j-community-discussions.438527.n3.nabble.com/How-to-download-neo4j-1-3-for-using-jo4Neo-tp3191002p3206212.html Cheers Michael
Re: [Neo4j] Remote database contents inspection
I am sorry, I missed it; it's all there in the original post: org.neo4j.app:server:1.4

On Sun, Jul 31, 2011 at 8:46 PM, Dima Gutzeit dima.gutz...@mailvision.com wrote:

Cool, which dependency is needed to use WrappingNeoServerBootstrapper ? Regards , Dima Gutzeit.
Re: [Neo4j] Remote database contents inspection
To be exact, it's org.neo4j.app:neo4j-server:1.4, and it's not posted on the central repository (why?) but should be fetched from: http://m2.neo4j.org/releases/org/neo4j/app/neo4j-server/
Re: [Neo4j] Remote database contents inspection
Update and a question. I was able to get the server JARs and run the WrappingNeoServerBootstrapper. The problem is that after getting everything from Maven, neo4j-server-static-web.jar is still missing. It's not part of any POM and is not listed in the dependencies; nevertheless it's required when starting the server (a runtime dependency). Can this be fixed somehow in the POM of neo4j-server? Thanks in advance.
Re: [Neo4j] Remote database contents inspection
A runtime dependency with the classifier 'static-web' is required. It should be documented somewhere so that others will not find it out the hard way :)
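For anyone following along, the coordinates mentioned in this thread could be written in a POM roughly as below. This is only a sketch: the repository id is made up, the repository root is inferred from the download URL given earlier in the thread, and nothing here has been checked against the actual neo4j-server 1.4 POM.

```xml
<repositories>
  <repository>
    <!-- hypothetical id; root URL inferred from the path quoted in the thread -->
    <id>neo4j-releases</id>
    <url>http://m2.neo4j.org/releases</url>
  </repository>
</repositories>

<dependencies>
  <!-- the server artifact named in the thread -->
  <dependency>
    <groupId>org.neo4j.app</groupId>
    <artifactId>neo4j-server</artifactId>
    <version>1.4</version>
  </dependency>
  <!-- same artifact with the 'static-web' classifier, needed at runtime -->
  <dependency>
    <groupId>org.neo4j.app</groupId>
    <artifactId>neo4j-server</artifactId>
    <version>1.4</version>
    <classifier>static-web</classifier>
  </dependency>
</dependencies>
```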
Re: [Neo4j] Brainstorming on my project: neo4john
Aiming to be as generic as possible can be good, but at some point you need to be specific too. You mention Java and Eclipse as being generic, but they are only generic up to a point. When Java was introduced, some of its main features were platform independence, static type checking, garbage collection/managed memory and checked exceptions. Those were deliberate design decisions that made Java a very specific sort of language, suitable for certain types of applications and less suitable for others. Java is very suitable for large applications that need modularisation, but it's not that great for ad-hoc scripting. The same is true for Eclipse. It is a great platform to build an IDE in, but would be overkill for a simple game of tic-tac-toe.

Creating something that is completely generic has the downside that it actually becomes bad at doing something specific. Another downside to being completely generic is that it doesn't give people clues about what it can do. This is most noticeable in the programming language LISP, which is so generic that every construct looks like every other construct, giving no visual clues to what the program is actually doing. It's a wonderful language where you can do the most amazing dynamic magic in only a few lines of code (with lots of parentheses), but it has always been a niche language, because it doesn't offer programmers concrete clues about what they can do with it.

Another language from that same era, COBOL, took the opposite approach and very explicitly made every feature available at the language level. This made COBOL a very special-purpose language. At the time, COBOL was an easy choice for application programmers, because it offered many of the features their applications needed. Much of that could be achieved in LISP too, but LISP never made those features explicit, so no one ever considered writing a business app in it.

That said, the demise of COBOL came because of a changing environment, in which its specific strengths eventually became weaknesses. Still, the changing environment didn't make LISP a winner; instead a language like PHP became hugely successful, because it focused on doing one thing well: the creation of HTML pages. So my point is, when you want to create something, try to have a concrete vision of what you want it to do.

Now as to the discussion of what is tooling level and what is application level. I think that, as a rule of thumb, you can say that tools can be replaced by something else without functionally changing the application, while you cannot replace part of the application with something else without functionally changing the application.

Let's look at Neo4j. For me as an application programmer, it is a tool. I could in principle swap Neo4j out and replace it with another storage engine. I would probably take a performance hit in some areas doing so, but functionally my application could very much remain the same. For Neo Tech on the other hand, Neo4j is an application. There the tooling level consists of things like Maven and the Java NIO API. In principle the tooling could be replaced. Instead of Maven, ANT scripts could be used to do the build, and instead of the NIO API, the old-fashioned IO API could be used. There would be a huge performance penalty in swapping out NIO for IO, but functionally Neo4j could remain the same, only much slower. Yet it is not possible to remove the Node API and replace it with something else without changing the functionality of the application.

So the question remains what functionality you want to provide with your neo4john project. You could think of a storage API that is independent of the storage engine used, so you could swap out Neo4j and replace it with BDB, and vice versa. If you do that, ask yourself who would be interested in that and what purpose it serves. What are the benefits of replacing one storage engine with the other?

When I started working on the Enhanced API, I had some concrete goals in mind which I wanted to achieve:

1.) Make every element of the database reifiable, so they can all be used as first-class citizens.
2.) Provide a pluggable architecture for properties and relationships.

Both these goals make the Enhanced API more general than the standard API, but this is a result of the goals and not a goal in and of itself.

Date: Sun, 31 Jul 2011 19:45:50 +0200
From: cyuczie...@gmail.com
To: user@lists.neo4j.org
Subject: Re: [Neo4j] Brainstorming on my project: neo4john

Hey Niels, thanks for the concise reply.

On Sun, Jul 31, 2011 at 5:10 PM, Niels Hoogeveen pd_aficion...@hotmail.com wrote:

Hi John, I think when approaching a project there are two distinct issues at play: one is the tooling level, another is the actual solution you are trying to create for an actual problem.

I seem to want a generic solution for multiple problems. Something generic enough that it can be applied
[Neo4j] Checking for unfinished transactions
How can I check if there are unfinished transactions? I want to do the same thing that is done when EmbeddedGraphDatabase is instantiated after a non-clean shutdown:

Jul 31, 2011 5:06:12 PM org.neo4j.kernel.impl.transaction.xaframework.XaLogicalLog doInternalRecovery
INFO: Non clean shutdown detected on log [/home/felipe/graph.db/nioneo_logical.log.2]. Recovery started ...
Jul 31, 2011 5:06:12 PM org.neo4j.kernel.impl.transaction.TxManager init
INFO: Unresolved transactions found, recovery started ...
Jul 31, 2011 5:06:12 PM org.neo4j.kernel.impl.transaction.TxManager init
INFO: Recovery completed, all transactions have been resolved to a consistent state.

Is it too slow? I want to check whether there are unfinished transactions so that I can roll them all back. I'm writing a server, and clients may not call finish() for some reason (unexpected errors), contrary to the guidance at http://wiki.neo4j.org/content/Neo_Mistakes#Transactions_usage. I want to clean up transactions on every new client connection so that every connection starts clean.

Thanks,
Felipe
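One way to keep a server from depending on clients calling finish() is to own the transaction boundary on the server side and run client work inside a try/finally. The sketch below only illustrates that pattern; `Tx`, `Work` and `runInTx` are hypothetical stand-ins written for this example (mirroring a success()/finish() pair), not classes from the Neo4j API, and it does not answer how to enumerate transactions that are already open.

```java
class TxGuard {
    /** Hypothetical stand-in for a transaction with a success()/finish() pair. */
    interface Tx {
        void success();
        void finish();
    }

    /** Hypothetical unit of client work that may fail unexpectedly. */
    interface Work {
        void run() throws Exception;
    }

    /**
     * Runs the work inside the given transaction. finish() is called on
     * every path; success() is only called if the work returns normally.
     */
    static void runInTx(Tx tx, Work work) throws Exception {
        try {
            work.run();
            tx.success();
        } finally {
            tx.finish();
        }
    }
}
```

With this shape, an exception thrown by the client work still reaches the caller, but the transaction is finished first and is never marked successful.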
Re: [Neo4j] GSoC 2011 Weekly report - OSM data mining and editing capabilities in uDig and Geotools
==Weekly Report==

Hi all, last week I started implementing OSM data editing from uDig and the functionality to save the edits to the database. This week I will continue this task.

Regards,
Mirco

2011/7/24 Mirco Franzago mircofranz...@gmail.com:

==Weekly Report==

Hi all, this week I studied the uDig tutorials and the uDig code to learn how to use the renderers. I made some experiments with OSM data import and editing using JOSM and Potlatch2. Now I'm trying to find a way to edit OSM data directly from uDig and a way to commit (save) the data to the neo4j-spatial database after the editing phase.

Regards,
Mirco.
Re: [Neo4j] Node#getRelationshipTypes
Imho it would have to iterate as well, as the type is stored with the relationship record and so can only be accessed after having read it. There might be some minimal performance improvement in that relationships would not have to be fully loaded, nor put into the cache, for that. But this is always a question of the use case: what will be done next with those rel-types? What was the use case for this operation again?

Cheers
Michael

On 31.07.2011 at 18:59, Niels Hoogeveen wrote:

Good point. It could for all practical purposes even be Iterable<RelationshipType>, so they can be lazily fetched, as long as the underlying implementation makes certain that any iteration of the RelationshipTypes forms a set (no duplicates). There is no need to have RelationshipTypes in any particular order, and if that is needed in the application, they can usually be sorted locally, since Nodes will generally have associated Relationships of only a handful of RelationshipTypes.

That said, the more important question is whether the Neo4j store can produce this meta-information. For sparsely connected nodes, it is possible to iterate over the relationships and return the set of RelationshipTypes, but this is not a proper solution when nodes are densely connected. So there is no general solution for this question yet.

Niels

From: j...@neotechnology.com
Date: Sun, 31 Jul 2011 17:29:29 +0100
To: user@lists.neo4j.org
Subject: Re: [Neo4j] Node#getRelationshipTypes

Hi Niels,

Ignoring the operational use for getting relationship types, I do think these should be generalised from:

RelationshipType[] getRelationshipTypes();
RelationshipType[] getRelationshipTypes(Direction);

to:

Set<RelationshipType> getRelationshipTypes();
Set<RelationshipType> getRelationshipTypes(Direction);

unless you need the ordering and you think the overhead of creating some kind of Set is too onerous from a performance point of view.

Jim
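The "iterate and collect" approach discussed in this thread can be sketched in a few lines. The `Rel` class below is a hypothetical stand-in for a relationship that only carries a type name; it is not the Neo4j Relationship interface, and the method name is made up for this example.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class RelTypeScan {
    /** Hypothetical stand-in for a relationship; only the type name matters here. */
    static final class Rel {
        final String typeName;

        Rel(String typeName) {
            this.typeName = typeName;
        }
    }

    /**
     * Collects the distinct type names by reading every relationship once.
     * The result is a set (no duplicates), in first-seen order.
     */
    static Set<String> distinctTypes(Iterable<Rel> relationships) {
        Set<String> types = new LinkedHashSet<>();
        for (Rel r : relationships) {
            types.add(r.typeName);
        }
        return types;
    }
}
```

The cost is one read per relationship, which is why this is acceptable for sparsely connected nodes but not a general answer for densely connected ones.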
Re: [Neo4j] Node#getRelationshipTypes
I have two specific use cases for these methods:

I'd like to present a node with the property types (names) it has content for and with the relationship types it has relationships for, while loading those properties/relationships on demand (i.e. click here to see details). This can be done for properties: there is a getPropertyKeys() method, but there is no getRelationshipTypes() method.

The other use case has to do with the Enhanced API. There I want to have pluggable relationships and properties. With respect to relationships there are already three implementations: the regular Relationship, SortedRelations (which use an in-graph Btree for storage) and HyperRelationships, which allow n-ary relationships. Every Element in the Enhanced API has a getRelationships() method, much like the getRelationships() method in Node, which should return every relationship attached to an Element, irrespective of its implementation. Right now the Element implementation has to perform the logic to distinguish which relationship is used for which implementation (under the hood it all works using normal Relationships). It would be much more elegant to iterate over the RelationshipTypes and dispatch the getRelationships() method to the appropriate RelationshipType implementations. That way the logic for SortedRelationships and HyperRelationships remains in their associated classes and is not spread around the implementation.

Niels
[Neo4j] js visualizer from the console
Hey All, How would I go about re-using that js visualizer that's in the console? Has anyone made a generic version? Many thanks!
[Neo4j] Collaborative filtering in Cypher
Hi, I'm new to graph databases and have been trying to understand the power of Cypher and/or Gremlin as a way to develop suggestion queries. I've watched a few webinars and read through some of the documentation, but I've had a hard time figuring out complex suggestion-type queries other than stuff like "find me the top 3 movies my friends have recommended", or "given that I rated this movie 5 stars, find me other people who liked this movie".

To give a simplified example of what I'm trying to achieve: I want to suggest users based on attributes they weight from 0 to 1, where those attributes can have relationships between each other. I want to compute something like the Tanimoto coefficient (but with weights as opposed to strictly binary attributes) or slope one from a start user to end users, and then list the top X. I have the O'Reilly book on Collective Intelligence, but I've only seen examples for set theory, not for graph theory.

I was thinking of something like:

UserA likes Digital Photography with a weight of 1
UserA likes Wine with a weight of .8
UserA likes Rock music with a weight of 1
UserB likes Film Photography with a weight of 1
UserB likes Rock music with a weight of .5
UserB likes Beer with a weight of 1
Digital Photography is related to Film Photography with a weight of .5

I want to return how alike these two users are. I understand the graph theory behind it: I want to follow all "likes" relationships and "is related" relationships until I hit another user, then for each path I have to that user I want to multiply the weights of the relationships along the path to get a score, and then add those scores to get a final score. So in the above example it would look something like:

UserA-Digital Photography-Film Photography-UserB = 1 * .5 * 1 = .5
UserA-Rock-UserB = 1 * .5 = .5
Final tally = 1.

And then if there were more users, it would follow the paths to find those users as well, compute their scores, and then sort the users by score.

Looking at Gremlin and Cypher, I'm not sure where to even start on a query that can do this, or whether it is even possible. I know what I described isn't slope one or the Tanimoto coefficient, because it doesn't take into account the full set of attributes for the second user, but I'm just getting used to this, and right now my potential solution is just to have all unrelated attributes connected by edges of weight 0, but yeah, I'm probably getting ahead of myself. I'm just looking for a pointer in the right direction for places to research, and perhaps, if they're available, some actual Cypher queries that do weighted suggestions based on attributes.

Sorry for my ignorance and thanks in advance,
Mike
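The scoring rule described in this message (multiply the weights along each path from the start user to another user, then sum the path products per user) can be prototyped outside the database before looking for a Cypher or Gremlin equivalent. The sketch below is plain in-memory Java, not a Cypher or Gremlin query and not the Neo4j API; the class and method names are made up, edges are treated as undirected, and a hop limit is assumed so the walk stays bounded.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class WeightedPathScore {
    // node name -> (neighbour name -> edge weight), stored in both directions
    private final Map<String, Map<String, Double>> edges = new HashMap<>();
    private final Set<String> users = new HashSet<>();

    void addUser(String name) {
        users.add(name);
    }

    void connect(String a, String b, double weight) {
        edges.computeIfAbsent(a, k -> new HashMap<>()).put(b, weight);
        edges.computeIfAbsent(b, k -> new HashMap<>()).put(a, weight);
    }

    /** For each user reachable from start, sums the weight products of all simple paths. */
    Map<String, Double> scoresFrom(String start, int maxHops) {
        Map<String, Double> scores = new HashMap<>();
        Set<String> visited = new HashSet<>();
        visited.add(start);
        walk(start, 1.0, maxHops, visited, scores);
        return scores;
    }

    private void walk(String node, double product, int hopsLeft,
                      Set<String> visited, Map<String, Double> scores) {
        if (hopsLeft == 0) {
            return;
        }
        Map<String, Double> neighbours =
                edges.getOrDefault(node, Collections.<String, Double>emptyMap());
        for (Map.Entry<String, Double> e : neighbours.entrySet()) {
            String next = e.getKey();
            if (visited.contains(next)) {
                continue; // keep paths simple
            }
            double p = product * e.getValue();
            if (users.contains(next)) {
                scores.merge(next, p, Double::sum); // a path ends at the first user it hits
            } else {
                visited.add(next);
                walk(next, p, hopsLeft - 1, visited, scores);
                visited.remove(next);
            }
        }
    }

    /** Builds the UserA/UserB example from the message above. */
    static WeightedPathScore example() {
        WeightedPathScore g = new WeightedPathScore();
        g.addUser("UserA");
        g.addUser("UserB");
        g.connect("UserA", "Digital Photography", 1.0);
        g.connect("UserA", "Wine", 0.8);
        g.connect("UserA", "Rock music", 1.0);
        g.connect("UserB", "Film Photography", 1.0);
        g.connect("UserB", "Rock music", 0.5);
        g.connect("UserB", "Beer", 1.0);
        g.connect("Digital Photography", "Film Photography", 0.5);
        return g;
    }
}
```

Calling `WeightedPathScore.example().scoresFrom("UserA", 4)` yields a single entry for UserB with a score of 1.0, matching the tally worked out by hand above; sorting the resulting map by value would give the "top X" list.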