Re: [Neo4j] neoviz + kernel error

2010-11-09 Thread Johan Svensson
Hi,

If it is a very old store that has been running on pre-1.0 beta
releases, download Neo4j 1.0 and perform a startup+shutdown. After that
you will be able to run 1.1 or the current milestone/snapshot
releases.

If you have been running in HA mode and switched back to, for example,
the 1.1 release, you will also get that exception: downgrading from a
newer to an older release is not automatically supported.

Regards,
Johan

On Tue, Nov 9, 2010 at 1:46 AM, Karen Nomorosa
 wrote:
> Hi all,
>
> Tried running the neo-graphviz component and got the following error:
> org.neo4j.kernel.impl.nioneo.store.IllegalStoreVersionException: Store 
> version [NeoStore v0.9.6]. Please make sure you are not running old Neo4j 
> kernel  towards a store that has been created by newer version  of Neo4j.
>                at 
> org.neo4j.kernel.impl.nioneo.store.NeoStore.versionFound(NeoStore.java:355)...
>
> So I looked at NeoStore.java of the latest kernel source (at 
> /components/kernel/trunk/src/main/java/org/neo4j/kernel/impl/nioneo/store/NeoStore.java)
>  and it seems to only check for v0.9.5, throwing an exception for any other 
> version.
>
> Would there be a quick workaround for this?
>
> Thanks,
> Karen
>
> 
> Karen Joy Nomorosa
> Semantic Analyst
> Project Q
> Tell a little... Get a lot...
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


[Neo4j] many small graphs in 1 Database

2010-11-09 Thread Thomas Strunz

Hi all,

I have following questions:

Is Neo4j also suited to a database that contains several hundred thousand
small graphs (5-30 nodes, mostly around 1-4 relationships per node)? (As far
as I understand this is not the main purpose of the product, but it doesn't
hurt to ask.)

If yes, how can you perform subgraph matching and what is its performance?
(Especially considering that most nodes are the same, and the relationship
types between them too.)
To be specific: graph = chemical structure (mainly C and H atoms (nodes)
connected by bonds (single, double, ...)).

A query typically only contains nodes and relationships that appear in 100% of 
the "small graphs" and multiple times per graph.
I read 

http://lists.neo4j.org/pipermail/user/2009-June/001331.html

and this seems to hint it will be rather tricky to achieve this? (defines the 
entry point, and only enter each "small graph" once)

Note that prior filtering steps unrelated to graphs must be done previously 
anyway and hence the number of "small graphs" to traverse is usually much lower 
than the total number. 


And an additional question:

Can a node be a traversable graph too?
Example: chemical Structure XYZ (a graph) was made by John Doe and is stored in 
Room 123.
(the chemical Structure XYZ must be seen as a single object (=Node) for the 
additional context).
Query would be: find all chemical Structures made by John Doe that match a 
given chemical Structure

I hope it's understandable what I'm trying to get at.

Best Regards,

Thomas

  


Re: [Neo4j] Neo4j.py 0.2

2010-11-09 Thread Francois Kassis
Thx, I will be waiting for it.
Francois.

--
From: "Mattias Persson" 
Sent: Tuesday, November 09, 2010 2:55 PM
To: "Neo4j user discussions" 
Subject: Re: [Neo4j] Neo4j.py 0.2

> 2010/11/9 Mattias Persson 
>
>> I lack some knowledge about the python bindings, but I know that they
>> haven't received proper love lately so there is no (that I'm aware of)
>> up2date binding for latest neo4j version. I don't know if fulltext 
>> indexing
>> is supported (we're  talking about the old 
>> IndexService).
>> When the python bindings become up to date you'll find the new index
>> framework  instead which
>> has fulltext support of course.
>>
>> Check out
>>
> ... http://components.neo4j.org/neo4j.py/ for more information.
>
>>
>> 2010/11/8 Francois Kassis 
>>
>> Hi all,
>>> where can I find a python binding for neo4j equivalent to last neo4j 
>>> java
>>> version. my problem is that the database was created in neo4j java 
>>> latest
>>> version, and when accessing the index in python 0.1 snapshot it gives me 
>>> a
>>> version conflict error.
>>> one more question, is lucene fulltext search implemented in python
>>> version?
>>> THX in advance.
>>> Francois.
>>>
>>
>>
>>
>> --
>> Mattias Persson, [matt...@neotechnology.com]
>> Hacker, Neo Technology
>> www.neotechnology.com
>>
>
>
>
> -- 
> Mattias Persson, [matt...@neotechnology.com]
> Hacker, Neo Technology
> www.neotechnology.com


Re: [Neo4j] Neo4j.py 0.2

2010-11-09 Thread Mattias Persson
2010/11/9 Mattias Persson 

> I lack some knowledge about the python bindings, but I know that they
> haven't received proper love lately so there is no (that I'm aware of)
> up2date binding for latest neo4j version. I don't know if fulltext indexing
> is supported (we're  talking about the old 
> IndexService).
> When the python bindings become up to date you'll find the new index
> framework  instead which
> has fulltext support of course.
>
> Check out
>
... http://components.neo4j.org/neo4j.py/ for more information.

>
> 2010/11/8 Francois Kassis 
>
> Hi all,
>> where can I find a python binding for neo4j equivalent to last neo4j java
>> version. my problem is that the database was created in neo4j java latest
>> version, and when accessing the index in python 0.1 snapshot it gives me a
>> version conflict error.
>> one more question, is lucene fulltext search implemented in python
>> version?
>> THX in advance.
>> Francois.
>>
>
>
>
> --
> Mattias Persson, [matt...@neotechnology.com]
> Hacker, Neo Technology
> www.neotechnology.com
>



-- 
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com


Re: [Neo4j] Neo4j.py 0.2

2010-11-09 Thread Mattias Persson
I lack some knowledge about the python bindings, but I know that they
haven't received proper love lately so there is no (that I'm aware of)
up2date binding for latest neo4j version. I don't know if fulltext indexing
is supported (we're  talking about the old
IndexService).
When the python bindings become up to date you'll find the new index
framework  instead which has
fulltext support of course.

Check out

2010/11/8 Francois Kassis 

> Hi all,
> where can I find a python binding for neo4j equivalent to last neo4j java
> version. my problem is that the database was created in neo4j java latest
> version, and when accessing the index in python 0.1 snapshot it gives me a
> version conflict error.
> one more question, is lucene fulltext search implemented in python version?
> THX in advance.
> Francois.
>



-- 
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com


Re: [Neo4j] many small graphs in 1 Database

2010-11-09 Thread Peter Neubauer
Thomas,
IMHO, the examination of the graphs should be much helped by the new
Index API, where you can ask and store composite indexes. I would
imagine that you could do a lot of the exclusion work by indexing the
chemical structures by not only one node, but possibly construct a
typical path of nodes and relationships and index that one with
http://wiki.neo4j.org/content/Index_Framework#Compound_queries, so
that you can ask complex queries involving the whole structure, and
get the "entry node" for the subgraph back. Also, that entry node
could be used to connect to e.g. John Doe in order to represent the
whole compound.
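
To make the idea concrete, here is a minimal sketch in plain Python, not
the real Index API. The CompoundIndex class, the field names (formula,
path, chemist) and the node ids are all hypothetical; the point is only
that a compound query intersects several key/value lookups and hands back
the entry nodes.

```python
from collections import defaultdict


class CompoundIndex:
    """Toy inverted index: (key, value) -> set of entry-node ids.

    Hypothetical stand-in for a compound index, not Neo4j code.
    """

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, entry_node, **fields):
        # Register the entry node under every key/value pair.
        for key, value in fields.items():
            self._postings[(key, value)].add(entry_node)

    def query(self, **fields):
        # Return the entry nodes that match ALL given key/value pairs.
        sets = [self._postings.get(item, set()) for item in fields.items()]
        return set.intersection(*sets) if sets else set()


index = CompoundIndex()
index.add("mol-1", formula="C2H6O", path="C-C-O", chemist="John Doe")
index.add("mol-2", formula="C2H6O", path="C-O-C", chemist="John Doe")
index.add("mol-3", formula="CH4O", path="C-O", chemist="Jane Roe")

# Compound query: a typical path AND the chemist who made the compound.
hits = index.query(path="C-C-O", chemist="John Doe")
print(hits)
```

Each hit would then be the starting point for a traversal into the
sub-graph, so the expensive matching only runs on the candidates.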

Would that be feasible?

Cheers,

/peter neubauer

GTalk:      neubauer.peter
Skype       peter.neubauer
Phone       +46 704 106975
LinkedIn   http://www.linkedin.com/in/neubauer
Twitter      http://twitter.com/peterneubauer

http://www.neo4j.org               - Your high performance graph database.
http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.



On Tue, Nov 9, 2010 at 10:47 AM, Thomas Strunz  wrote:
>
> Hi all,
>
> I have following questions:
>
> is neo4j also suited for a database, that contains many 100k of small graphs 
> (5-30 nodes, mostly around 1-4 relationships per node)? (As far as I 
> understood not the main purpose of the product but doesn't hurt to ask)
>
> If yes, how can you perform subgraph matching and what is its performance?
> (especially considering that most nodes are the same and the relationship 
> types between them too)
> To be specific: graph = chemical Structure (mainly C and H Atoms (nodes) 
> connected by bonds (single, double,..)
>
> A query typically only contains nodes and relationships that appear in 100% 
> of the "small graphs" and multiple times per graph.
> I read
>
> http://lists.neo4j.org/pipermail/user/2009-June/001331.html
>
> and this seems to hint it will be rather tricky to achieve this? (defines the 
> entry point, and only enter each "small graph" once)
>
> Note that prior filtering steps unrelated to graphs must be done previously 
> anyway and hence the number of "small graphs" to traverse is usually much 
> lower than the total number.
>
>
> And an additional question:
>
> Can a node be a traversable graph too?
> Example: chemical Structure XYZ (a graph) was made by John Doe and is stored 
> in Room 123.
> (the chemical Structure XYZ must be seen as a single object (=Node) for the 
> additional context).
> Query would be: find all chemical Structures made by John Doe that match a 
> given chemical Structure
>
> I hope it's understandable what I'm trying to get at.
>
> Best Regards,
>
> Thomas
>
>


Re: [Neo4j] Accessing neo4j index created with indexservice

2010-11-09 Thread Peter Neubauer
Vasco,
sorry for not getting back to you. Is this still relevant, do you want
me to look into this?

Cheers,

/peter neubauer

GTalk:      neubauer.peter
Skype       peter.neubauer
Phone       +46 704 106975
LinkedIn   http://www.linkedin.com/in/neubauer
Twitter      http://twitter.com/peterneubauer

http://www.neo4j.org               - Your high performance graph database.
http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.



On Mon, Oct 4, 2010 at 10:07 PM, Vasco Pedro  wrote:
>  long node = inserter.createNode( nodeProps );
>            indexService.index( node, "uri", nodeuri );
>            indexService.index(node, "label", label);


Re: [Neo4j] many small graphs in 1 Database

2010-11-09 Thread Craig Taverner
It's been 13 years since I left Chemistry, but I think I have some residual
interest in the subject :-)

My two cents worth for this problem is that it is possible to model
everything in one single graph:

   - Store both the chemical structures and the relationships as graphs,
   differentiating using relationship types.
   - Atoms connected together would have BOND relationships, but the
   molecule itself can also be represented by a single node, with an ATOMS
   relationship into the sub-graph representing the molecular structure
   - Meta-data structures related to the molecules would be a graph
   connecting them all together and connected to the root node. So, for
   example, the suggestion of a molecule having been made by John Doe and
   stored in Room 123 would be modeled by having a 'rooms' node connected to a
   node for each room, and the room node with 'name'='123' would be connected
   using STORES relationship to the molecule node above. Similarly we would
   have a 'chemists' node connected to each chemist, and the chemist with
   'name'='John Doe' would be related by CREATED to the molecule. The CREATED
   relationship could have properties like created_on, etc. John Doe could also
   have properties like email, phone number, etc.
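
A rough sketch of that layout, using a tiny in-memory graph in plain
Python rather than the Neo4j API. The Graph class and the CHEMISTS,
CHEMIST, ROOMS and ROOM relationship names are assumptions made for the
example; BOND, ATOMS, STORES and CREATED are the types named above.

```python
from collections import defaultdict


class Graph:
    """Minimal property graph (hypothetical helper, not Neo4j code)."""

    def __init__(self):
        self.props = {}                # node id -> property dict
        self.out = defaultdict(list)   # node id -> [(rel_type, target id)]
        self._next_id = 0

    def node(self, **props):
        node_id = self._next_id
        self._next_id += 1
        self.props[node_id] = props
        return node_id

    def relate(self, start, rel_type, end):
        self.out[start].append((rel_type, end))

    def follow(self, start, rel_type):
        # Targets of all outgoing relationships of one type.
        return [end for rel, end in self.out[start] if rel == rel_type]


g = Graph()
root = g.node(name="root")
chemists = g.node(name="chemists")
rooms = g.node(name="rooms")
g.relate(root, "CHEMISTS", chemists)
g.relate(root, "ROOMS", rooms)

john = g.node(name="John Doe")
room = g.node(name="123")
g.relate(chemists, "CHEMIST", john)
g.relate(rooms, "ROOM", room)

# One node stands for the whole molecule and carries the meta-data links.
ethanol = g.node(name="ethanol")
g.relate(john, "CREATED", ethanol)
g.relate(room, "STORES", ethanol)

# Molecular sub-graph (heavy atoms only, to keep the example short).
c1, c2, o = g.node(element="C"), g.node(element="C"), g.node(element="O")
g.relate(ethanol, "ATOMS", c1)
g.relate(c1, "BOND", c2)
g.relate(c2, "BOND", o)


def molecules_by(graph, root_node, chemist_name):
    # Traverse root -> chemists -> chemist -> CREATED molecules.
    found = []
    for group in graph.follow(root_node, "CHEMISTS"):
        for chemist in graph.follow(group, "CHEMIST"):
            if graph.props[chemist]["name"] == chemist_name:
                found.extend(graph.follow(chemist, "CREATED"))
    return found


made_by_john = [g.props[m]["name"] for m in molecules_by(g, root, "John Doe")]
print(made_by_john)
```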

In this approach we have no need for the external index, since all queries
suggested can be achieved using a traversal. If you want composite queries
and know in advance the main queries you will make, you can also optimise
the graph structure for those queries. For example, if a standard query is
to ask the database which molecules were created by chemist X between March
and August 2010, then create month nodes between the chemist node and the
molecule node, so all molecules made by X in January would be connected
first to X's January node and from there to X himself. This is in effect
building a custom index into the graph. It is a good solution if you know
very well what kind of queries you will make.
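
As a sketch of the month-node idea (plain Python, hypothetical names,
with nested dicts standing in for the chemist -> month -> molecule
relationships), the date-range query only has to visit the month nodes of
one chemist:

```python
from collections import defaultdict

# chemist -> (year, month) -> molecules. Each (year, month) key plays the
# role of that chemist's month node in the graph.
month_nodes = defaultdict(lambda: defaultdict(list))


def record_creation(chemist, year, month, molecule):
    # Attach the molecule to the chemist's node for that month.
    month_nodes[chemist][(year, month)].append(molecule)


def created_between(chemist, start, end):
    # Molecules created by the chemist in the inclusive (year, month) range.
    result = []
    for key in sorted(month_nodes[chemist]):
        if start <= key <= end:
            result.extend(month_nodes[chemist][key])
    return result


record_creation("X", 2010, 1, "methanol")
record_creation("X", 2010, 3, "ethanol")
record_creation("X", 2010, 8, "acetone")
record_creation("X", 2010, 9, "benzene")
record_creation("Y", 2010, 5, "toluene")

march_to_august = created_between("X", (2010, 3), (2010, 8))
print(march_to_august)
```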

However, as Peter suggests, using the lucene index, especially with the new
composite query support, you do not need to think too hard about having your
own index graph, but would simply add both the chemist's name and the date to
a single index on the molecule node itself. So the lucene query for the
chemist and date should return a set of molecule nodes, and you can then do
further pattern matching, if needed, on those.

One other idea I would consider for pattern matching is to generate a
signature, a kind of hash of the molecule shape that is representative of
the shape. Then you can index that hash also, and effectively get the
molecular shape to be a lucene searchable field. This is only possible if
you know your domain well enough to create a hash that makes sense for your
situation. In the case of chemistry, it really depends on what you mean by
'shape' when doing the search. For example, perhaps a search on chemical
formula is a good enough description of the shape, and in that case your
'hash' is simply the formula. So, for example, ethanol would be C2H5OH.
Searches on that hash should yield ethanol and perhaps a few similar
compounds. If we spent a little more time thinking about this, we could
possibly come up with a few better hashes, more likely to match 'shape', but
I hope you at least got my idea :-)
(I suspect that there are probably standard ways of writing down a chemical
shape uniquely, and if you get the shape hash to be truly unique, you can
also not bother to store the molecule as a sub-graph at all, saving space
and complexity).
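
A minimal sketch of the simplest such hash, the chemical formula, in
plain Python. The function name and the atom lists are made up for the
example, and it writes ethanol as C2H6O (carbon, then hydrogen, then the
other elements alphabetically) rather than the C2H5OH used above.

```python
from collections import Counter


def formula_signature(elements):
    # Carbon first, then hydrogen, then the rest alphabetically; without
    # carbon, everything alphabetically. A count of 1 is left implicit.
    counts = Counter(elements)
    if "C" in counts:
        order = ["C"] + (["H"] if "H" in counts else [])
        order += sorted(e for e in counts if e not in ("C", "H"))
    else:
        order = sorted(counts)
    return "".join(e + (str(counts[e]) if counts[e] > 1 else "") for e in order)


ethanol = ["C", "C", "H", "H", "H", "H", "H", "H", "O"]
dimethyl_ether = ["C", "H", "H", "H", "O", "C", "H", "H", "H"]

# The signature is what would go into the searchable index field.
signature_index = {}
for name, atoms in (("ethanol", ethanol), ("dimethyl ether", dimethyl_ether)):
    signature_index.setdefault(formula_signature(atoms), []).append(name)

print(signature_index)
```

Both compounds land under the same signature, which is the "ethanol and
perhaps a few similar compounds" effect: the formula narrows the search
but does not identify the shape uniquely.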

Regards, Craig

On Tue, Nov 9, 2010 at 2:32 PM, Peter Neubauer <
peter.neuba...@neotechnology.com> wrote:

> Thomas,
> IMHO, the examination of the graphs should be much helped by the new
> Index API, where you can ask and store composite indexes. I would
> imagine that you could do a lot of the exclusion work by indexing the
> chemical structures by not only one node, but possibly construct a
> typical path of nodes and relationships and index that one with
> http://wiki.neo4j.org/content/Index_Framework#Compound_queries, so
> that you can ask complex queries involving the whole structure, and
> get the "entry node" for the subgraph back. Also, that entry node
> could be used to connect to e.g. John Doe in order to represent the
> whole compound.
>
> Would that be feasible?
>
> Cheers,
>
> /peter neubauer
>
> GTalk:  neubauer.peter
> Skype   peter.neubauer
> Phone   +46 704 106975
> LinkedIn   http://www.linkedin.com/in/neubauer
> Twitter  http://twitter.com/peterneubauer
>
> http://www.neo4j.org   - Your high performance graph database.
> http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.
>
>
>
> On Tue, Nov 9, 2010 at 10:47 AM, Thomas Strunz 
> wrote:
> >
> > Hi all,
> >
> > I have following questions:
> >
> > is neo4j also suited for a database, that contains many 100k of small
> graphs (5-30 nodes, mostly around 1-4 relationships per node)? (As far as I
> 

Re: [Neo4j] neo4j.py issue reading index

2010-11-09 Thread Chris Diehl
Hi Peter,

Just wanted to check in with you to see if you all had a chance to
look at that index reading issue with neo4j.py. We're hoping it's a
quick fix as we would like to use neo4j.py in some experiments we're
doing now.

Thanks again for looking into this.

Cheers, Chris

Message: 1
Date: Mon, 25 Oct 2010 16:24:39 +0200
From: Peter Neubauer 
Subject: Re: [Neo4j] neo4j.py issue reading index
To: Neo4j user discussions 
Message-ID:
       
Content-Type: text/plain; charset=ISO-8859-1

Chris,
Andres looked at this today, seems we can reproduce it. The Lucene
folder for the "address" index is created during the transaction, but
not found afterwards.

So, we will check with Tobias who is maintaining the bindings ASAP (he
is on holiday this week) and get back to you on this. Is that ok?

Cheers,

/peter neubauer

VP Product Management, Neo Technology

GTalk:      neubauer.peter
Skype       peter.neubauer
Phone       +46 704 106975
LinkedIn   http://www.linkedin.com/in/neubauer
Twitter      http://twitter.com/peterneubauer


[Neo4j] JTA support for Neo

2010-11-09 Thread Chris Gioran
Hi people,

I have started work on providing support for integrating Neo in a JTA
environment. This is the first time I have done something like this so
some feedback would be welcome.

My setup is this: I have an ApacheDS instance running to provide an
LDAP service to store objects. I have tweaked (very slightly) a
standalone JOTM instance to provide its TransactionManager via JNDI to
the directory (I prefer that over RMI). I then start an
EmbeddedGraphDatabase and replace the native neo TxManager with the
JOTM provided implementation as I retrieve it from the directory, both
in the PersistenceModule and the TxModule. From there, instead of
asking the graphDb to beginTx() I ask the JOTM TransactionManager to
begin(). I perform some Node creations and they are persisted on
commit(). Also, via a debugger, I can see that the XAResources of Neo
are indeed enlisted on the remote Transaction.

I know that this is very premature and for the time being the changes
are very rough but I think I have a proof of concept. Comments? Does
anyone have any experience in a similar setting? Next steps?

This has a very high coolness factor, btw :)

cheers,
CG


Re: [Neo4j] neo4j.py issue reading index

2010-11-09 Thread Peter Neubauer
Chris,
now that Tobias is on site again, I got the code to work, it looks
like this now:

from __future__ import with_statement
import neo4j

graphdb = neo4j.GraphDatabase("neodb/test")
with graphdb.transaction:
   addIdx = graphdb.index("address",create=True)
   n = graphdb.node(color="Red", width=16, height=32)
   addIdx[n['color']] = n

graphdb.shutdown()
graphdb = neo4j.GraphDatabase("neodb/test")
with graphdb.transaction:
   addIdx = graphdb.index("address",create=True)


Two points:

1. An index is created lazily: unless you actually add something to
it, it will not be created, and thus will not be available the next
time the db is opened.

2. graphdb.index("address") without create=True is currently broken,
so always pass create=True, even when opening an existing index. This
will be rectified upon overhaul of the python bindings.

Does this help?

Cheers,

/peter neubauer

GTalk:      neubauer.peter
Skype       peter.neubauer
Phone       +46 704 106975
LinkedIn   http://www.linkedin.com/in/neubauer
Twitter      http://twitter.com/peterneubauer

http://www.neo4j.org               - Your high performance graph database.
http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.



On Tue, Nov 9, 2010 at 4:38 PM, Chris Diehl  wrote:
> Hi Peter,
>
> Just wanted to check in with you to see if you all had a chance to
> look at that index reading issue with neo4j.py. We're hoping it's a
> quick fix as we would like to use neo4j.py in some experiments we're
> doing now.
>
> Thanks again for looking into this.
>
> Cheers, Chris
>
> Message: 1
> Date: Mon, 25 Oct 2010 16:24:39 +0200
> From: Peter Neubauer 
> Subject: Re: [Neo4j] neo4j.py issue reading index
> To: Neo4j user discussions 
> Message-ID:
>        
> Content-Type: text/plain; charset=ISO-8859-1
>
> Chris,
> Andres looked at this today, seems we can reproduce it. The Lucene
> folder for the "address" index is created during the transaction, but
> not found afterwards.
>
> So, we will check with Tobias who is maintaining the bindings ASAP (he
> is on holiday this week) and get back to you on this. Is that ok?
>
> Cheers,
>
> /peter neubauer
>
> VP Product Management, Neo Technology
>
> GTalk:      neubauer.peter
> Skype       peter.neubauer
> Phone       +46 704 106975
> LinkedIn   http://www.linkedin.com/in/neubauer
> Twitter      http://twitter.com/peterneubauer


Re: [Neo4j] neo4j.py issue reading index

2010-11-09 Thread Peter Neubauer
Also,
I updated the info on http://components.neo4j.org/neo4j.py/ to reflect
this. Sorry for the inconvenience!

Cheers,

/peter neubauer

GTalk:      neubauer.peter
Skype       peter.neubauer
Phone       +46 704 106975
LinkedIn   http://www.linkedin.com/in/neubauer
Twitter      http://twitter.com/peterneubauer

http://www.neo4j.org               - Your high performance graph database.
http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.



On Tue, Nov 9, 2010 at 6:14 PM, Peter Neubauer
 wrote:
> Chris,
> now that Tobias is on site again, I got the code to work, it looks
> like this now:
>
> from __future__ import with_statement
> import neo4j
>
> graphdb = neo4j.GraphDatabase("neodb/test")
> with graphdb.transaction:
>   addIdx = graphdb.index("address",create=True)
>   n = graphdb.node(color="Red", width=16, height=32)
>   addIdx[n['color']] = n
>
> graphdb.shutdown()
> graphdb = neo4j.GraphDatabase("neodb/test")
> with graphdb.transaction:
>   addIdx = graphdb.index("address",create=True)
>
>
> Two points:
>
> 1. An index is created lazily: unless you actually add something to
> it, it will not be created, and thus will not be available the next
> time the db is opened.
>
> 2. graphdb.index("address") without create=True is currently broken,
> so always pass create=True, even when opening an existing index. This
> will be rectified upon overhaul of the python bindings.
>
> Does this help?
>
> Cheers,
>
> /peter neubauer
>
> GTalk:      neubauer.peter
> Skype       peter.neubauer
> Phone       +46 704 106975
> LinkedIn   http://www.linkedin.com/in/neubauer
> Twitter      http://twitter.com/peterneubauer
>
> http://www.neo4j.org               - Your high performance graph database.
> http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.
>
>
>
> On Tue, Nov 9, 2010 at 4:38 PM, Chris Diehl  wrote:
>> Hi Peter,
>>
>> Just wanted to check in with you to see if you all had a chance to
>> look at that index reading issue with neo4j.py. We're hoping it's a
>> quick fix as we would like to use neo4j.py in some experiments we're
>> doing now.
>>
>> Thanks again for looking into this.
>>
>> Cheers, Chris
>>
>> Message: 1
>> Date: Mon, 25 Oct 2010 16:24:39 +0200
>> From: Peter Neubauer 
>> Subject: Re: [Neo4j] neo4j.py issue reading index
>> To: Neo4j user discussions 
>> Message-ID:
>>        
>> Content-Type: text/plain; charset=ISO-8859-1
>>
>> Chris,
>> Andres looked at this today, seems we can reproduce it. The Lucene
>> folder for the "address" index is created during the transaction, but
>> not found afterwards.
>>
>> So, we will check with Tobias who is maintaining the bindings ASAP (he
>> is on holiday this week) and get back to you on this. Is that ok?
>>
>> Cheers,
>>
>> /peter neubauer
>>
>> VP Product Management, Neo Technology
>>
>> GTalk:      neubauer.peter
>> Skype       peter.neubauer
>> Phone       +46 704 106975
>> LinkedIn   http://www.linkedin.com/in/neubauer
>> Twitter      http://twitter.com/peterneubauer


Re: [Neo4j] JTA support for Neo

2010-11-09 Thread Peter Neubauer
Chris,
Awesome! I think the next step would be to start testing things when
neo4j needs to recover, rollback etc, I think this is where the
problems arise :)

Also, any chance of making a maven project out of it and having the
project as a test component somewhere in the Svn repo, so it can be
run as part of the QA process for releases etc?

/Peter

On Tuesday, November 9, 2010, Chris Gioran  wrote:
> Hi people,
>
> I have started work on providing support for integrating Neo in a JTA
> environment. This is the first time I have done something like this so
> some feedback would be welcome.
>
> My setup is this: I have an ApacheDS instance running to provide an
> LDAP service to store objects. I have tweaked (very slightly) a
> standalone JOTM instance to provide its TransactionManager via JNDI to
> the directory (I prefer that over RMI). I then start an
> EmbeddedGraphDatabase and replace the native neo TxManager with the
> JOTM provided implementation as I retrieve it from the directory, both
> in the PersistenceModule and the TxModule. From there, instead of
> asking the graphDb to beginTx() I ask the JOTM TransactionManager to
> begin(). I perform some Node creations and they are persisted on
> commit(). Also, via a debugger, I can see that the XAResources of Neo
> are indeed enlisted on the remote Transaction.
>
> I know that this is very premature and for the time being the changes
> are very rough but I think I have a proof of concept. Comments? Does
> anyone have any experience in a similar setting? Next steps?
>
> This has a very high coolness factor, btw :)
>
> cheers,
> CG


Re: [Neo4j] JTA support for Neo

2010-11-09 Thread Chris Gioran
> Chris,
> Awesome! I think the next step would be to start testing things when
> neo4j needs to recover, rollback etc, I think this is where the
> problems arise :)
>
> Also, any chance of making a maven project out of it and having the
> project as a test component somewhere in the Svn repo, so it can be
> run as part of the QA process for releases etc?

OK, so the plan I have in my mind is this:

- Do runs to see what problems exist when rolling back/starting in
recovery mode with the only resource being neo.
- See how the whole thing works when another XA compatible resource is
thrown in the mix, probably a RDBMS, checking that 2PC works.
- Find the least intrusive way of making neo fit in the picture, in
terms of configuration/code changes etc, approve and commit those.
- Write test cases and a maven project so that it can be integrated in
the release cycle to be checked for correct functionality.
- After that probably I would like to fill in the gaps so that from an
app server I can do a container managed tx over a jdbc connection and
a neo connection. After all, this is the ultimate purpose of this
exercise.
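
For anyone on the list new to 2PC, the check in the second step boils
down to the protocol below. This is a toy sketch in Python, not JTA or
JOTM code: the Resource class and its methods are hypothetical stand-ins
for XA resources such as neo and an RDBMS.

```python
class Resource:
    """Hypothetical stand-in for one XA resource in the transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self):
        # Phase 1 vote: promise to commit, or refuse.
        self.state = "prepared" if self.can_commit else "failed"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled back"


def two_phase_commit(resources):
    # Phase 1: ask every enlisted resource to prepare and collect the votes.
    votes = [r.prepare() for r in resources]
    # Phase 2: commit only if all voted yes, otherwise roll everyone back.
    if all(votes):
        for r in resources:
            r.commit()
        return True
    for r in resources:
        r.rollback()
    return False


neo = Resource("neo")
rdbms = Resource("rdbms")
committed = two_phase_commit([neo, rdbms])

neo2 = Resource("neo")
broken = Resource("rdbms", can_commit=False)
aborted = two_phase_commit([neo2, broken])

print(committed, neo.state, rdbms.state)
print(aborted, neo2.state, broken.state)
```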

I will fill you in as I go through each of the above. Thanks for your time.

cheers,
CG


Re: [Neo4j] JTA support for Neo

2010-11-09 Thread Peter Neubauer
This is seriously cool, Chris,
Let the list know if you need any support, I can stay up some nights for this :)

Peter

On Tuesday, November 9, 2010, Chris Gioran  wrote:
>> Chris,
>> Awesome! I think the next step would be to start testing things when
>> neo4j needs to recover, rollback etc, I think this is where the
>> problems arise :)
>>
>> Also, any chance of making a maven project out of it and having the
>> project as a test component somewhere in the Svn repo, so it can be
>> run as part of the QA process for releases etc?
>
> OK, so the plan I have in my mind is this:
>
> - Do runs to see what problems exist when rolling back/starting in
> recovery mode with the only resource being neo.
> - See how the whole thing works when another XA compatible resource is
> thrown in the mix, probably a RDBMS, checking that 2PC works.
> - Find the least intrusive way of making neo fit in the picture, in
> terms of configuration/code changes etc, approve and commit those.
> - Write test cases and a maven project so that it can be integrated in
> the release cycle to be checked for correct functionality.
> - After that probably I would like to fill in the gaps so that from an
> app server I can do a container managed tx over a jdbc connection and
> a neo connection. After all, this is the ultimate purpose of this
> exercise.
>
> I will fill you in as I go through each of the above. Thanks for your time.
>
> cheers,
> CG


Re: [Neo4j] many small graphs in 1 Database

2010-11-09 Thread Thomas Strunz

Thanks for this very detailed answer. Some comments: I'm a novice at this
(graphs, graph databases, traversal, the math behind it). I'm more from the
science and SQL corner. I have heard of Lucene but never used it myself, so
that is new to me too.
The BOND relationship is of course a simplification, given possible double,
triple, ... bonds and stereochemistry.

I have some follow-up questions:

1. If I make a Molecule node that has an ATOMS relationship to the actual
"chemical structure graph", which node would this relationship be attached to?
To all atoms? (Yes makes the most sense IMHO; just choosing one at random does
not seem to make much sense.)

2. Having a "rooms" and "chemists" nodes, isn't that kind of a fallback to a 
table-like representation? Or: why not just omit them and link a room directly 
to a molecule?

3. Using an index seems to be easier and more adaptable? 

4. Concerning the signature/hash: I know 2 basic possibilities:

a) For full structure search
A code that fully represents the molecule but cannot be used to determine
substructures:
canonical SMILES, InChI, InChIKey (= hashed InChI), proprietary canonical codes
(no need for a graph database)

b) Substructure search
Here so-called fingerprints are used. A fingerprint is just a set of bits (like
a Java BitSet), with each bit representing a certain feature.
A typical size is 1024 bits.
The idea is that you can compare them logically very quickly: Structure AND
SubStructure = SubStructure (the total structure has all features of the
substructure).
The point is that you get some false positives but no false negatives.

After this step you need to compare structures by graph matching, which of 
course is much more expensive but must be done only for a handful of all the 
structures.
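
A minimal sketch of that screening step, in Python with plain ints as
the bit sets. The feature strings and the CRC-based bit assignment are
assumptions for illustration only; real fingerprints derive their bits
from paths or substructure keys.

```python
import zlib


def fingerprint(features, size=1024):
    # Set one bit per feature; a Python int serves as the bit set.
    bits = 0
    for feature in features:
        bits |= 1 << (zlib.crc32(feature.encode("utf-8")) % size)
    return bits


def may_contain(structure_fp, query_fp):
    # Structure AND SubStructure == SubStructure: every query bit is set.
    return (structure_fp & query_fp) == query_fp


library = {
    "ethanol": fingerprint(["C-C", "C-O", "O-H"]),
    "methanol": fingerprint(["C-O", "O-H"]),
    "ethane": fingerprint(["C-C"]),
}

query = fingerprint(["C-O", "O-H"])

# Cheap screen first; only the survivors go on to full graph matching.
candidates = sorted(name for name, fp in library.items()
                    if may_contain(fp, query))
print(candidates)
```

Structures that really contain the query features always pass the
screen. A structure without them passes only if its bits happen to
collide, which is where the false positives come from.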

Regards,

Thomas


> Date: Tue, 9 Nov 2010 15:45:55 +0100
> From: cr...@amanzi.com
> To: user@lists.neo4j.org
> Subject: Re: [Neo4j] many small graphs in 1 Database
> 
> It's been 13 years since I left Chemistry, but I think I have some residual
> interest in the subject :-)
> 
> My two cents worth for this problem is that it is possible to model
> everything in one single graph:
> 
>    - Store both the chemical structures and the relationships as graphs,
>    differentiating using relationship types.
>    - Atoms connected together would have BOND relationships, but the
>    molecule itself can also be represented by a single node, with an ATOMS
>    relationship into the sub-graph representing the molecular structure
>    - Meta-data structures related to the molecules would be a graph
>    connecting them all together and connected to the root node. So, for
>    example, the suggestion of a molecule having been made by John Doe and
>    stored in Room 123 would be modeled by having a 'rooms' node connected to a
>    node for each room, and the room node with 'name'='123' would be connected
>    using STORES relationship to the molecule node above. Similarly we would
>    have a 'chemists' node connected to each chemist, and the chemist with
>    'name'='John Doe' would be related by CREATED to the molecule. The CREATED
>    relationship could have properties like created_on, etc. John Doe could also
>    have properties like email, phone number, etc.
> 
> In this approach we have no need for the external index, since all queries
> suggested can be achieved using a traversal. If you want composite queries
> and know in advance the main queries you will make, you can also optimise
> the graph structure for those queries. For example, if a standard query is
> to ask the database which molecules were created by chemist X between March
> and August 2010, then create month nodes between the chemist node and the
> molecule node, so all molecules made by X in January would be connected
> first to X's January node and from there to X himself. This is in effect
> building a custom index into the graph. It is a good solution if you know
> very well what kind of queries you will make.
> 
> However, as Peter suggests, using the Lucene index, especially with the new
> composite query support, you do not need to think too hard about having your
> own index graph, but would simply add both the chemist's name and the date to
> a single index on the molecule node itself. So the Lucene query for the
> chemist and date should return a set of molecule nodes, and you can then do
> further pattern matching, if needed, on those.
> 
> One other idea I would consider for pattern matching is to generate a
> signature, a kind of hash of the molecule shape that is representative of
> the shape. Then you can index that hash also, and effectively get the
> molecular shape to be a lucene searchable field. This is only possible if
> you know your domain well enough to create a hash that makes sense for your
> situation. In the case of chemistry, it really depends on what you mean by
> 'shape' when doing the search. For example, perhaps a search on chemical
> formula is a good enough description of the shape, an

Re: [Neo4j] Question about setConfiguration()

2010-11-09 Thread Marko Rodriguez
Hi,

>> However, I seem to be running into a bug where IndexHits.size() > 0, but I'm
>> not getting back what I indexed :( ... I will try and isolate issue and
>> bring it up in another email.
> 
> Weird, that can be the case if it finds hits, but entities have been
> deleted or something.

I found the problem. It was me. Blueprints doesn't allow for the "reference 
node" that Neo4j creates. A new database must be empty with two automatic 
indices. Well, the way I ensure this in the test cases is that, when a graphdb 
loads, I call graph.removeVertex(graph.getVertex(0)). However, in this 
particular test case I was opening/closing the Neo4jGraph so much that the ID 
system "reset" and allowed me to create a vertex with ID 0. And, well, it was 
deleted when the test case retrieved a new instance of the graph. Thus, the 
test case failed. It had nothing to do with the test case per se, but with the 
damn reference node stuff.

To make a long story short, this is all Peter's fault.

Marko.
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] Eigenvector Centrality subclasses

2010-11-09 Thread Paul A. Jackson
Not sure if this fell through the cracks.

Here are some more specific questions.

I get inconsistent results from run to run using eigenvector centrality.  It 
doesn't seem to matter which implementation I use but I have used Arnoldi most, 
for no reason other than it returns the iteration count.

The iteration count is not consistent from run to run when run against the 
exact same graph using the exact same precision.  In a graph with 32 nodes and 
117 edges, I get anywhere from 18 to 24 iterations needed to get a precision of 
0.001.  The variance is easier to see when the test is run on different 
computers.

Also, I experience the same problem as Piyush below.  Not sure if anything ever 
came from this:

On Wed, Jul 28, 2010 at 10:20 AM, Piyush Kanti Bhunre wrote:
> Hi,
>
>   I am getting some negative centrality values for nodes of a 
> network using Neo4j's EigenvectorCentralityArnoldi. I am using this 
> on small networks with a few thousand nodes. I am not sure if it 
> is due to instability of the algorithm or bugs in the implementation. 
> Could you please comment on that?
>
> Thanks.
> Piyush

Thanks,
-Paul


-Original Message-
From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On 
Behalf Of Paul A. Jackson
Sent: Monday, November 08, 2010 10:10 AM
To: Neo4j user discussions
Subject: [Neo4j] Eigenvector Centrality subclasses

Anyone know the pros/cons of the Arnoldi eigenvector centrality implementation 
over the Power implementation?  I see that Arnoldi gives a little more 
information on number of iterations, but it seems neither is deterministic.

Thanks,
Paul Jackson
Pitney Bowes
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user



Re: [Neo4j] Eigenvector Centrality subclasses

2010-11-09 Thread Marko Rodriguez
Hey Paul,

> I get inconsistent results from run to run using eigenvector centrality.  It 
> doesn't seem to matter which implementation I use but I have used Arnoldi 
> most, for no reason other than it returns the iteration count.

Given that eigenvector components sum to 1, the individual values get very 
small on large graphs, so you may be running into floating point precision 
issues. In general, different eigenvector methods may have small variations in 
their values (even though it's the same eigenvector!), but, if you are getting 
a Spearman rank order correlation of ~1.0, then I think it's 'all good.' Also, 
note that for those eigenvector centrality implementations that are based on 
random walks, variations are sure to show up.
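
A rough way to run the Spearman check mentioned above on the scores from two runs is sketched here in plain Java. The class and method names are hypothetical, the scores in main are made-up numbers, and the simple formula used (1 - 6 * sum(d^2) / (n * (n^2 - 1))) assumes there are no tied values.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Sketch: Spearman rank correlation between two score vectors (no ties assumed). */
class SpearmanCheck {

    /** Rank of each element, 1 = smallest. Assumes all values are distinct. */
    static int[] ranks(double[] values) {
        Integer[] order = new Integer[values.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        // Sort the indices by the value they point at.
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> values[i]));
        int[] rank = new int[values.length];
        for (int position = 0; position < order.length; position++) {
            rank[order[position]] = position + 1;
        }
        return rank;
    }

    /** Spearman's rho for two equally long vectors without ties. */
    static double spearman(double[] a, double[] b) {
        if (a.length != b.length || a.length < 2) {
            throw new IllegalArgumentException("need two vectors of equal length >= 2");
        }
        int[] rankA = ranks(a);
        int[] rankB = ranks(b);
        double sumSquaredDiff = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = rankA[i] - rankB[i];
            sumSquaredDiff += d * d;
        }
        double n = a.length;
        return 1.0 - 6.0 * sumSquaredDiff / (n * (n * n - 1.0));
    }

    public static void main(String[] args) {
        // Made-up centrality scores for the same four nodes from two runs.
        double[] run1 = {0.045, 0.120, 0.300, 0.010};
        double[] run2 = {0.038, 0.125, 0.290, 0.012};
        System.out.println(spearman(run1, run2)); // 1.0: values differ, ordering is identical
    }
}
```

A value near 1.0 says the two runs order the nodes the same way even if the raw numbers differ; it says nothing about how close either run is to the exact eigenvector.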

> The iteration count is not consistent from run to run when run against the 
> exact same graph using the exact same precision.  In a graph with 32 nodes 
> and 117 edges, I get anywhere from 18 to 24 iterations needed to get a 
> precision of 0.001.  The variance is easier to see when the test is run on 
> different computers.

Hmm...  What code are you using? I'm talking in general and not specifically 
about anything Neo4j related... 

Thanks,
Marko.

http://markorodriguez.com
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] Eigenvector Centrality subclasses

2010-11-09 Thread Paul A. Jackson
I'm using:
import org.neo4j.graphalgo.impl.centrality.EigenvectorCentrality;
import org.neo4j.graphalgo.impl.centrality.EigenvectorCentralityArnoldi;
import org.neo4j.graphalgo.impl.centrality.EigenvectorCentralityPower;

The variance I am seeing is far greater than anything that could be explained 
by floating point precision issues.  For example, a result coming back after 
one call as 0.045 and then on the next call with identical options it could 
return 0.038.

I glanced over the code and I see that they both use java.util.Random, so that 
could explain why it is not deterministic.  Maybe that answers everything.

Unfortunately, what it means is that you might randomly have two subsequent 
calls that appear to return similar results, but actually you have not zeroed 
in on the correct answer within the actual level of precision that is desired.

The JavaDoc explicitly states that precision doesn't mean proximity to the 
correct result, but that doesn't make the results any less unsatisfying.
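
To illustrate the point, here is a generic power iteration sketch in plain Java. It is NOT the Neo4j graph-algo code; the class, the tiny example graph and the stop criterion (sum of absolute changes between two successive iterations) are assumptions made for illustration. With a random start vector, the number of iterations and the last digits of the result depend on the seed, and "precision" only bounds the change between iterations.

```java
import java.util.Random;

/** Generic power iteration sketch; not the Neo4j implementation. */
class PowerIterationSketch {

    /** Example graph: triangle 0-1-2 plus node 3 attached to node 2. */
    static final double[][] PAW = {
        {0, 1, 1, 0},
        {1, 0, 1, 0},
        {1, 1, 0, 1},
        {0, 0, 1, 0},
    };

    /** Scores plus the number of iterations that were needed. */
    static final class Result {
        final double[] scores;
        final int iterations;

        Result(double[] scores, int iterations) {
            this.scores = scores;
            this.iterations = iterations;
        }
    }

    static Result run(double[][] adjacency, double precision, Random random) {
        int n = adjacency.length;
        double[] current = new double[n];
        for (int i = 0; i < n; i++) {
            current[i] = 0.1 + random.nextDouble(); // random, strictly positive start vector
        }
        normalize(current);
        for (int iteration = 1; iteration <= 10_000; iteration++) {
            double[] next = new double[n];
            for (int i = 0; i < n; i++) {
                for (int j = 0; j < n; j++) {
                    next[i] += adjacency[i][j] * current[j];
                }
            }
            normalize(next);
            double change = 0.0;
            for (int i = 0; i < n; i++) {
                change += Math.abs(next[i] - current[i]);
            }
            current = next;
            if (change < precision) { // change between iterations, not distance to the true answer
                return new Result(current, iteration);
            }
        }
        throw new IllegalStateException("no convergence within 10000 iterations");
    }

    /** Scales the vector to Euclidean length 1. */
    static void normalize(double[] vector) {
        double sumOfSquares = 0.0;
        for (double value : vector) {
            sumOfSquares += value * value;
        }
        double norm = Math.sqrt(sumOfSquares);
        for (int i = 0; i < vector.length; i++) {
            vector[i] /= norm;
        }
    }

    public static void main(String[] args) {
        for (long seed = 1; seed <= 3; seed++) {
            Result result = run(PAW, 0.001, new Random(seed));
            System.out.println("seed " + seed + ": " + result.iterations
                    + " iterations, score of node 2 = " + result.scores[2]);
        }
    }
}
```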

-Paul

___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user



Re: [Neo4j] Eigenvector Centrality subclasses

2010-11-09 Thread Paul A. Jackson
Perhaps if "new Random( System.currentTimeMillis() )" were replaced with "new 
Random( 0 )", you would get the benefits of pseudo-random behavior but also 
deterministic results from run to run.
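
The effect of a fixed seed can be checked with nothing but java.util.Random. This is a minimal sketch; the class and the startVector helper are hypothetical and only stand in for whatever the centrality code does with its random numbers.

```java
import java.util.Arrays;
import java.util.Random;

/** Sketch: a fixed seed makes the pseudo-random start values repeatable. */
class SeededStart {

    /** Hypothetical helper: fills a vector with pseudo-random values. */
    static double[] startVector(int size, Random random) {
        double[] vector = new double[size];
        for (int i = 0; i < size; i++) {
            vector[i] = random.nextDouble();
        }
        return vector;
    }

    public static void main(String[] args) {
        double[] first = startVector(4, new Random(0));
        double[] second = startVector(4, new Random(0));
        System.out.println(Arrays.equals(first, second)); // true: same seed, same sequence

        // A time-based seed normally differs from run to run, so no output is predicted here.
        double[] timeSeeded = startVector(4, new Random(System.currentTimeMillis()));
        System.out.println(Arrays.toString(timeSeeded));
    }
}
```

Making the seed a constructor argument instead of hard-coding 0 would keep both options open: repeatable runs for tests and varying seeds for checking how sensitive the result is.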

-Paul

___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user



Re: [Neo4j] Question about setConfiguration()

2010-11-09 Thread Andres Taylor
It most often is. But he is so nice that we keep him around anyway... :)

Andrés

On Tue, Nov 9, 2010 at 11:09 PM, Marko Rodriguez wrote:

> Hi,
>
> >> However, I seem to be running into a bug where IndexHits.size() > 0,
> >> but I'm not getting back what I indexed :( ... I will try and isolate
> >> issue and bring it up in another email.
> >
> > Weird, that can be the case if it finds hits, but entities have been
> > deleted or something.
>
> I found the problem. It was me. Blueprints doesn't allow for the "reference
> node" that Neo4j creates. A new database must be empty with two automatic
> indices. Well, the way I ensure this in the test cases is that, when a
> graphdb loads, I call graph.removeVertex(graph.getVertex(0)). However, in
> this particular test case I was opening/closing the Neo4jGraph so much that
> the ID system "reset" and allowed me to create a vertex with ID 0. And,
> well, it was deleted when the test case retrieved a new instance of the
> graph. Thus, the test case failed. It had nothing to do with the test case
> per se, but with the damn reference node stuff.
>
> To make a long story short, this is all Peter's fault.
>
> Marko.
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user