Re: [Neo4j] Synchronization model between embedded client and Standalone server

2011-02-21 Thread Mattias Persson
2011/2/21 Brendan 

> Hi,
>
> I want to design a system where each client has an embedded db storing only
> a subset of a master server db of all clients.  Should I then design a
> server plugin that extends the currently stateless server services to
> support transactions, by simply wrapping a few requests into one atomic
> request and response?  And also modify your online backup to synchronize
> the clients according to these transactions.
>

Are you trying to serve only a subset of the transaction stream to each
client? I don't think that's possible, and your data could get into a pretty
weird state. What is the problem you are trying to solve that can't be
solved with, say, normal high availability?

>
> I'm sure there is a better idea.  Please let me know if you have.
>
> Brendan
>
> Sent from my iPad
> ___
> Neo4j mailing list
> User@lists.neo4j.org
> https://lists.neo4j.org/mailman/listinfo/user
>



-- 
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] Social Networks And Graph Databases

2011-02-21 Thread Mattias Persson
2011/2/22 J T 

> Hmm, I hadn't considered the apache approach but it still kind of goes
> against the grain - perhaps i just want too much or its my innate laziness
> ... hehe ;)
>
> Its not just about data size, its more about not wanting to have to
> re-engineer/re-factor as things grow - whether that growth is concurrent
> access or in data quantity.
>
>
There are not that many cases (fewer than you might imagine) where you'd
need to scale/shard Neo4j out to multiple machines just to handle the load
put on it. It's great to think ahead and be aware of limitations, but
there's a pretty high chance you just won't run into those. And if/when you
do, Neo4j will probably have evolved to handle that load for you anyway,
maybe even with sharding :)

>
>
>
> On Mon, Feb 21, 2011 at 11:47 PM, Michael Hunger <
> michael.hun...@neotechnology.com> wrote:
>
> > Hi J.T.,
> >
> > of course you can have the cache sharding taken care of by the server
> side,
> > e.g. use an apache proxy for
> > client sticky routing, redirecting according to URL patterns etc. But
> that
> > doesn't cover your "domain".
> >
> > The problem is that other than simple kv stores, where the sharding the
> key
> > is pretty easy, sharding graphs is much more
> > demanding. You would like to have traversal locality (so that you don't
> > have to cross servers for a single traversal).
> > That means something that keeps (and also updates) your subgraphs to be
> in
> > just one server.
> > And deciding which subgraphs should be put together is either a pure
> domain
> > driven thing or something that could be achieved by having lots of (long
> > running?)
> > clients (and their request URLs) and looking at their traversal / query
> > statistics and optimizing the data held permanently (or even "mastered")
> on
> > the specific node for a certain set of requests.
> >
> > It would also mean that the occasional cross-server traversal should
> result
> > in local caches being updated for the remote data.
> >
> > Is the problem we're talking about just data size? You can already store
> > pretty big graphs in a single neo4j node (esp. when you go for big
> > machines).
> >
> > Michael
> >
> > On 22.02.2011 at 00:15, J T wrote:
> >
> > > I realise that there are different qualities that can come in to play
> > with
> > > the labels 'scalability' & 'performance' and I can see how your
> strategy
> > > would help with some of those qualities but it relies on custom logic
> in
> > the
> > > client application to do the sharding and load spreading and doesn't
> > address
> > > scaling the underlying persistant storage engine.
> > >
> > > One of the things that attracted me to Riak and Cassandra (for the use
> > cases
> > > I can apply them to) is that sharding, load balancing and persistance
> > > scaling was available out-of-the-box and and pretty much invisible to
> the
> > > client application. The client app didn't have to do anything special.
> I
> > > appreciate that perhaps because they have different semantics that its
> an
> > > easier for them to solve.
> > >
> > > I had a read of this page you wrote the other day :
> > >
> >
> http://jim.webber.name/2011/02/16/3b8f4b3d-c884-4fba-ae6b-7b75a191fa22.aspx
> > >
> > > It was your comment "it's hard to achieve in practice" that prompted me
> > to
> > > post my initial message yesterday to enquire further.
> > >
> > > I'm no specialist in the field, I just know what I want hehe :)
> > >
> > > The only player in the field I've been able to find that might have
> more
> > of
> > > the qualities I am interested is InfiniteGraph, its a shame that it
> > doesn't
> > > have a 'server' version like neo does for me to do a proper comparison.
> > >
> > > I'll stick with neo for now, and see how the marketplace matures in the
> > > coming months - i'm amazed at how much movement there has been in the
> > last
> > > year.
> > >
> > >
> > >
> > >
> > > On Mon, Feb 21, 2011 at 3:09 PM, Jim Webber 
> > wrote:
> > >
> > >> Yup, you nailed it better than I did Rick.
> > >>
> > >> Though your partition strategy might not be just "per user." For
> example
> > in
> > >> the geo domain, it makes sense to route requests for particular cities
> > to
> > >> specific nodes. It'll depend on your application how you generate your
> > >> routing rules.
> > >>
> > >> Jim
> > >>
> > >> On 21 Feb 2011, at 14:51, Michael Hunger wrote:
> > >>
> > >>> You shouldn't be confused because you got it right :)
> > >>>
> > >>> Cheers
> > >>>
> > >>> Michael
> > >>>
> > >>> On 21.02.2011 at 15:40, Rick Otten wrote:
> > >>>
> >  Ok, I'm following this discussion, and now I'm confused.
> > 
> >  My understanding was that the (potentially very large) database is
> >  replicated across all instances.
> > 
> >  If someone needed to traverse to something that wasn't cached,
> they'd
> > >> take
> >  a performance hit, but still be able to get to it.
> > 
> >  I had understood the idea behind the load balancing is to minimize
> >  traversals out of cache by grouping similar sets of users on a particular
> >  server.

Re: [Neo4j] Unable to parse mapped memory

2011-02-21 Thread Mattias Persson
Yup, there doesn't seem to be support for the dot notation when specifying
mapped memory. I'm not sure there ever has been.
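In other words, the workaround is to express the value in whole megabytes
rather than with a decimal suffix. An illustrative configuration fragment
(only the strings line changes; all other settings stay as in Craig's dump):

```properties
# Not parsed by this version ("1.2" can't be read as a mapped-memory size):
# neostore.propertystore.db.strings.mapped_memory=1.2G

# Workaround: the same amount, expressed without the decimal point:
neostore.propertystore.db.strings.mapped_memory=1200M
```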

2011/2/22 Peter Neubauer 

> Craig,
> I guess we need to put in 1200M instead?
>
> /peter
>
> On Monday, February 21, 2011, Craig Taverner  wrote:
> > Hi,
> >
> > Recently, since Peter moved Neo4j Spatial to 1.3M02 I think, I stopped
> being
> > able to run the Neo4j Spatial unit tests in maven. Most of them now
> generate
> > an error like:
> >
> > INFO: Unable to parse mapped memory[1.2] string for
> >
> /home/craig/dev/neo4j/neo4j-spatial/target/var/neo4j-db/neostore.propertystore.db.strings
> >
> >
> > I am making sure to create the database from scratch. The full error is
> > below. Any ideas?
> >
> > Regards, Craig
> >
> > Physical mem: 2517MB, Heap size: 1020MB
> > Feb 21, 2011 10:37:37 PM
> > org.neo4j.kernel.impl.nioneo.store.CommonAbstractStore getMappedMem
> > INFO: Unable to parse mapped memory[1.2] string for
> >
> /home/craig/dev/neo4j/neo4j-spatial/target/var/neo4j-db/neostore.propertystore.db.strings
> > create=true
> > dump_configuration=true
> >
> logical_log=/home/craig/dev/neo4j/neo4j-spatial/target/var/neo4j-db/nioneo_logical.log
> >
> neo_store=/home/craig/dev/neo4j/neo4j-spatial/target/var/neo4j-db/neostore
> > neostore.nodestore.db.mapped_memory=50M
> > neostore.propertystore.db.arrays.mapped_memory=50M
> > neostore.propertystore.db.index.keys.mapped_memory=1M
> > neostore.propertystore.db.index.mapped_memory=1M
> > neostore.propertystore.db.mapped_memory=400M
> > neostore.propertystore.db.strings.mapped_memory=1.2G
> > neostore.relationshipstore.db.mapped_memory=150M
> > rebuild_idgenerators_fast=true
> > store_dir=/home/craig/dev/neo4j/neo4j-spatial/target/var/neo4j-db
> > ___
> > Neo4j mailing list
> > User@lists.neo4j.org
> > https://lists.neo4j.org/mailman/listinfo/user
> >
> ___
> Neo4j mailing list
> User@lists.neo4j.org
> https://lists.neo4j.org/mailman/listinfo/user
>





Re: [Neo4j] Lucene fulltext index batch inserter Lock obtain problem

2011-02-21 Thread Mattias Persson
2011/2/22 Shae Griffiths 

>
> Hi Mattias,
>
> >
> >Are you using multiple threads during batch insertion (not allowed b.t.w.)
> ?
> >
>
> It's only a single thread as far as I'm aware, unless the Lucene stuff is
> doing something funky under the covers.
>
> I was hoping it was common, and you could say "oh you've forgotten this
> line" but my lack of googling results wasn't promising!
>
> >>
> >> I'm thinking that maybe my problem lies in the fact i'm indexing based
> on
> >> properties that may not exist on all nodes. eg. My graph is made of
> >> entities
> >> such as people and cars, a person has a first and last name, but does
> not
> >> have a registration number, and for a car it would be the reverse.
> >>
> >
> >That doesn't matter, the index can handle such data.
> >
>
> I thought this was the case, when using a non-batch approach it seemed to
> go
> for a while before it got confused!
>
> >
> >Weird... I would like to see your code to get more ideas to what it might
> >be.
> >
>
> The code to create the nodes and index them looks like this: (i'll simplify
> irrelevant bits :))
>
> // Index service and inserter are declared as:
> private static BatchInserter inserter = new BatchInserterImpl(DB_PATH);
> private static LuceneIndexBatchInserter indexService = new
> LuceneFulltextIndexBatchInserter(inserter);
>
> ...
>
> While(There_are_nodes_to_Create)
> {
>// Build up properties
>...
>// Node is Created just before this
>long node = inserter.createNode( properties );
>
>// Each node has an ID and a Type
>indexService.index( node, ID_KEY, ID );
>indexService.index( node, TYPE_KEY, Type );
>
>// All other properties vary based on the Type of node and are read in
> from a file prior to this
>for (int j = 0; j < Number_of_properties_to_put_on_node; j ++)
>{
>if (Property_Value_isnt_null_or_empty)
>{
>indexService.index( node, Node_Property_Name, Property_Value );
>}
>}
> }
> indexService.optimize();
>
>
So with the new framework your code would look something like this (assuming
the "properties" map also contains the ID_KEY and TYPE_KEY properties):

   BatchInserter db = new BatchInserterImpl( ... );
   BatchInserterIndexProvider indexProvider =
           new LuceneBatchInserterIndexProvider( db );
   BatchInserterIndex personIndex = indexProvider.nodeIndex( "persons",
           MapUtil.stringMap( "type", "exact" ) );
   BatchInserterIndex carIndex = indexProvider.nodeIndex( "cars",
           MapUtil.stringMap( "type", "fulltext" ) );

   while ( has more nodes )
   {
       long node = db.createNode( properties );
       personIndex.add( node, properties );
       // ... or add to carIndex if it's a car
   }

   indexProvider.shutdown();
   db.shutdown();

More information:
http://wiki.neo4j.org/content/Index_Framework#Advanced_creation_and_fulltext
and http://wiki.neo4j.org/content/Index_Framework#Batch_insertion

NOTE: you'll have to recreate your indices with the new framework since it
isn't compatible with the old one.


>
> >
> >Another thing: You seem to use the old index framework. Please consider
> the
> new
> >one  instead. Heck, it
> might
> >even get rid of your problems also :)
> >
>
> I just had a look at that new framework and tried to copy the code (at the
> bottom, the batch insert stuff) and eclipse doesnt seem to be able to work
> out what LuceneBatchInserterIndexProvider is. Also, when looking thorough
> that documentation you sent me, I can't seem to find anything about a
> fulltext style index (or is it implicit and it works it out itself?)
>
> This indexing is my last hurdle, then once thats done, I just need to make
> a
> UI and my project is finished! YAY!
>
Great to hear!


> Thanks,
>
> Shae
> --
> View this message in context:
> http://neo4j-user-list.438527.n3.nabble.com/Neo4j-Lucene-fulltext-index-batch-inserter-Lock-obtain-problem-tp2542866p2549074.html
> Sent from the Neo4J User List mailing list archive at Nabble.com.
> ___
> Neo4j mailing list
> User@lists.neo4j.org
> https://lists.neo4j.org/mailman/listinfo/user
>





Re: [Neo4j] Neo4j rest server indexing

2011-02-21 Thread francoisk6

Hi All,
I am trying to build a non-exact search plug-in for the Neo4j REST server. Is
there a document or example on how to accomplish that? I checked
get_all_nodes. I am a newbie in Java, and my main problem is how to send
parameters like the index name and the value to be searched for.

Thx for your help.
Francois Kassis.

-
Regards,
Francois Kassis.
-- 
View this message in context: 
http://neo4j-user-list.438527.n3.nabble.com/Neo4j-rest-server-indexing-tp2526084p2550661.html
Sent from the Neo4J User List mailing list archive at Nabble.com.


Re: [Neo4j] Unable to parse mapped memory

2011-02-21 Thread Peter Neubauer
Craig,
I guess we need to put in 1200M instead?

/peter

On Monday, February 21, 2011, Craig Taverner  wrote:
> Hi,
>
> Recently, since Peter moved Neo4j Spatial to 1.3M02 I think, I stopped being
> able to run the Neo4j Spatial unit tests in maven. Most of them now generate
> an error like:
>
> INFO: Unable to parse mapped memory[1.2] string for
> /home/craig/dev/neo4j/neo4j-spatial/target/var/neo4j-db/neostore.propertystore.db.strings
>
>
> I am making sure to create the database from scratch. The full error is
> below. Any ideas?
>
> Regards, Craig
>
> Physical mem: 2517MB, Heap size: 1020MB
> Feb 21, 2011 10:37:37 PM
> org.neo4j.kernel.impl.nioneo.store.CommonAbstractStore getMappedMem
> INFO: Unable to parse mapped memory[1.2] string for
> /home/craig/dev/neo4j/neo4j-spatial/target/var/neo4j-db/neostore.propertystore.db.strings
> create=true
> dump_configuration=true
> logical_log=/home/craig/dev/neo4j/neo4j-spatial/target/var/neo4j-db/nioneo_logical.log
> neo_store=/home/craig/dev/neo4j/neo4j-spatial/target/var/neo4j-db/neostore
> neostore.nodestore.db.mapped_memory=50M
> neostore.propertystore.db.arrays.mapped_memory=50M
> neostore.propertystore.db.index.keys.mapped_memory=1M
> neostore.propertystore.db.index.mapped_memory=1M
> neostore.propertystore.db.mapped_memory=400M
> neostore.propertystore.db.strings.mapped_memory=1.2G
> neostore.relationshipstore.db.mapped_memory=150M
> rebuild_idgenerators_fast=true
> store_dir=/home/craig/dev/neo4j/neo4j-spatial/target/var/neo4j-db
> ___
> Neo4j mailing list
> User@lists.neo4j.org
> https://lists.neo4j.org/mailman/listinfo/user
>


Re: [Neo4j] Social Networks And Graph Databases

2011-02-21 Thread J T
Hmm, I hadn't considered the Apache approach, but it still kind of goes
against the grain - perhaps I just want too much, or it's my innate laziness
... hehe ;)

It's not just about data size; it's more about not wanting to have to
re-engineer/re-factor as things grow - whether that growth is in concurrent
access or in data quantity.




On Mon, Feb 21, 2011 at 11:47 PM, Michael Hunger <
michael.hun...@neotechnology.com> wrote:

> Hi J.T.,
>
> of course you can have the cache sharding taken care of by the server side,
> e.g. use an apache proxy for
> client sticky routing, redirecting according to URL patterns etc. But that
> doesn't cover your "domain".
>
> The problem is that other than simple kv stores, where the sharding the key
> is pretty easy, sharding graphs is much more
> demanding. You would like to have traversal locality (so that you don't
> have to cross servers for a single traversal).
> That means something that keeps (and also updates) your subgraphs to be in
> just one server.
> And deciding which subgraphs should be put together is either a pure domain
> driven thing or something that could be achieved by having lots of (long
> running?)
> clients (and their request URLs) and looking at their traversal / query
> statistics and optimizing the data held permanently (or even "mastered") on
> the specific node for a certain set of requests.
>
> It would also mean that the occasional cross-server traversal should result
> in local caches being updated for the remote data.
>
> Is the problem we're talking about just data size? You can already store
> pretty big graphs in a single neo4j node (esp. when you go for big
> machines).
>
> Michael
>
> On 22.02.2011 at 00:15, J T wrote:
>
> > I realise that there are different qualities that can come in to play
> with
> > the labels 'scalability' & 'performance' and I can see how your strategy
> > would help with some of those qualities but it relies on custom logic in
> the
> > client application to do the sharding and load spreading and doesn't
> address
> > scaling the underlying persistant storage engine.
> >
> > One of the things that attracted me to Riak and Cassandra (for the use
> cases
> > I can apply them to) is that sharding, load balancing and persistance
> > scaling was available out-of-the-box and and pretty much invisible to the
> > client application. The client app didn't have to do anything special. I
> > appreciate that perhaps because they have different semantics that its an
> > easier for them to solve.
> >
> > I had a read of this page you wrote the other day :
> >
> http://jim.webber.name/2011/02/16/3b8f4b3d-c884-4fba-ae6b-7b75a191fa22.aspx
> >
> > It was your comment "it's hard to achieve in practice" that prompted me
> to
> > post my initial message yesterday to enquire further.
> >
> > I'm no specialist in the field, I just know what I want hehe :)
> >
> > The only player in the field I've been able to find that might have more
> of
> > the qualities I am interested is InfiniteGraph, its a shame that it
> doesn't
> > have a 'server' version like neo does for me to do a proper comparison.
> >
> > I'll stick with neo for now, and see how the marketplace matures in the
> > coming months - i'm amazed at how much movement there has been in the
> last
> > year.
> >
> >
> >
> >
> > On Mon, Feb 21, 2011 at 3:09 PM, Jim Webber 
> wrote:
> >
> >> Yup, you nailed it better than I did Rick.
> >>
> >> Though your partition strategy might not be just "per user." For example
> in
> >> the geo domain, it makes sense to route requests for particular cities
> to
> >> specific nodes. It'll depend on your application how you generate your
> >> routing rules.
> >>
> >> Jim
> >>
> >> On 21 Feb 2011, at 14:51, Michael Hunger wrote:
> >>
> >>> You shouldn't be confused because you got it right :)
> >>>
> >>> Cheers
> >>>
> >>> Michael
> >>>
> On 21.02.2011 at 15:40, Rick Otten wrote:
> >>>
>  Ok, I'm following this discussion, and now I'm confused.
> 
>  My understanding was that the (potentially very large) database is
>  replicated across all instances.
> 
>  If someone needed to traverse to something that wasn't cached, they'd
> >> take
>  a performance hit, but still be able to get to it.
> 
>  I had understood the idea behind the load balancing is to minimize
>  traversals out of cache by grouping similar sets of users on a
> >> particular
>  server.  (That way you don't need a ton of RAM to stash everything in
> >> the
>  database, just the most frequently accessed nodes and relationships
>  associated with a subset of the users.)
> 
> 
> 
> 
> > Hello JT,
> >
> >> One thing, when you say route requests to specific instances .. does
> >> that
> >> imply that node relationships can't span instances ?
> >
> > Yes that's right. What I'm suggesting here is that each instance is a
> >> full
> > replica that works on a subset of requests which are likely to keep the
> > caches warm.

Re: [Neo4j] Lucene fulltext index batch inserter Lock obtain problem

2011-02-21 Thread Shae Griffiths

Hi Mattias,

>
>Are you using multiple threads during batch insertion (not allowed b.t.w.)?
>

It's only a single thread as far as I'm aware, unless the Lucene stuff is
doing something funky under the covers.

I was hoping it was common, and you could say "oh you've forgotten this
line" but my lack of googling results wasn't promising!

>>
>> I'm thinking that maybe my problem lies in the fact i'm indexing based on
>> properties that may not exist on all nodes. eg. My graph is made of
>> entities
>> such as people and cars, a person has a first and last name, but does not
>> have a registration number, and for a car it would be the reverse.
>>
>
>That doesn't matter, the index can handle such data.
>

I thought this was the case, when using a non-batch approach it seemed to go
for a while before it got confused!

>
>Weird... I would like to see your code to get more ideas to what it might
>be.
>

The code to create the nodes and index them looks like this (I'll simplify
irrelevant bits :)):

// Index service and inserter are declared as:
private static BatchInserter inserter = new BatchInserterImpl(DB_PATH);
private static LuceneIndexBatchInserter indexService =
        new LuceneFulltextIndexBatchInserter(inserter);

...

while (There_are_nodes_to_Create)
{
    // Build up properties
    ...
    // Node is created just before this
    long node = inserter.createNode( properties );

    // Each node has an ID and a Type
    indexService.index( node, ID_KEY, ID );
    indexService.index( node, TYPE_KEY, Type );

    // All other properties vary based on the Type of node and are read in
    // from a file prior to this
    for (int j = 0; j < Number_of_properties_to_put_on_node; j++)
    {
        if (Property_Value_isnt_null_or_empty)
        {
            indexService.index( node, Node_Property_Name, Property_Value );
        }
    }
}
indexService.optimize();


>
>Another thing: You seem to use the old index framework. Please consider the
>new one instead. Heck, it might even get rid of your problems also :)
>

I just had a look at that new framework and tried to copy the code (at the
bottom, the batch insert stuff), and Eclipse doesn't seem to be able to work
out what LuceneBatchInserterIndexProvider is. Also, when looking through
the documentation you sent me, I can't seem to find anything about a
fulltext-style index (or is it implicit and it works it out itself?)

This indexing is my last hurdle; once that's done, I just need to make a
UI and my project is finished! YAY!

Thanks,

Shae
-- 
View this message in context: 
http://neo4j-user-list.438527.n3.nabble.com/Neo4j-Lucene-fulltext-index-batch-inserter-Lock-obtain-problem-tp2542866p2549074.html
Sent from the Neo4J User List mailing list archive at Nabble.com.


Re: [Neo4j] Social Networks And Graph Databases

2011-02-21 Thread Michael Hunger
Hi J.T.,

of course you can have the cache sharding taken care of by the server side, 
e.g. use an apache proxy for 
client sticky routing, redirecting according to URL patterns etc. But that 
doesn't cover your "domain".
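As an illustration of that idea, URL-pattern routing could be expressed as a
reverse-proxy rule set along these lines (a sketch only; the backend
hostnames, port and paths are invented, not from this thread):

```apache
# Route each region's requests to the replica whose cache is warm for it.
ProxyPass        /db/data/cities/london http://neo1.example.com:7474/db/data/cities/london
ProxyPassReverse /db/data/cities/london http://neo1.example.com:7474/db/data/cities/london
ProxyPass        /db/data/cities/paris  http://neo2.example.com:7474/db/data/cities/paris
ProxyPassReverse /db/data/cities/paris  http://neo2.example.com:7474/db/data/cities/paris
```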

The problem is that, unlike simple k/v stores, where sharding on the key is
pretty easy, sharding graphs is much more demanding. You would like to have
traversal locality (so that you don't have to cross servers for a single
traversal). That means something that keeps (and also updates) each of your
subgraphs on just one server.
And deciding which subgraphs should be put together is either a purely
domain-driven decision or something that could be achieved by collecting lots
of (long-running?) clients' request URLs, looking at their traversal / query
statistics, and optimizing which data is held permanently (or even
"mastered") on a specific node for a certain set of requests.

It would also mean that the occasional cross-server traversal should result in 
local caches being updated for the remote data.

Is the problem we're talking about just data size? You can already store pretty 
big graphs in a single neo4j node (esp. when you go for big machines).

Michael

On 22.02.2011 at 00:15, J T wrote:

> I realise that there are different qualities that can come in to play with
> the labels 'scalability' & 'performance' and I can see how your strategy
> would help with some of those qualities but it relies on custom logic in the
> client application to do the sharding and load spreading and doesn't address
> scaling the underlying persistant storage engine.
> 
> One of the things that attracted me to Riak and Cassandra (for the use cases
> I can apply them to) is that sharding, load balancing and persistance
> scaling was available out-of-the-box and and pretty much invisible to the
> client application. The client app didn't have to do anything special. I
> appreciate that perhaps because they have different semantics that its an
> easier for them to solve.
> 
> I had a read of this page you wrote the other day :
> http://jim.webber.name/2011/02/16/3b8f4b3d-c884-4fba-ae6b-7b75a191fa22.aspx
> 
> It was your comment "it's hard to achieve in practice" that prompted me to
> post my initial message yesterday to enquire further.
> 
> I'm no specialist in the field, I just know what I want hehe :)
> 
> The only player in the field I've been able to find that might have more of
> the qualities I am interested is InfiniteGraph, its a shame that it doesn't
> have a 'server' version like neo does for me to do a proper comparison.
> 
> I'll stick with neo for now, and see how the marketplace matures in the
> coming months - i'm amazed at how much movement there has been in the last
> year.
> 
> 
> 
> 
> On Mon, Feb 21, 2011 at 3:09 PM, Jim Webber  wrote:
> 
>> Yup, you nailed it better than I did Rick.
>> 
>> Though your partition strategy might not be just "per user." For example in
>> the geo domain, it makes sense to route requests for particular cities to
>> specific nodes. It'll depend on your application how you generate your
>> routing rules.
>> 
>> Jim
>> 
>> On 21 Feb 2011, at 14:51, Michael Hunger wrote:
>> 
>>> You shouldn't be confused because you got it right :)
>>> 
>>> Cheers
>>> 
>>> Michael
>>> 
>>> On 21.02.2011 at 15:40, Rick Otten wrote:
>>> 
 Ok, I'm following this discussion, and now I'm confused.
 
 My understanding was that the (potentially very large) database is
 replicated across all instances.
 
 If someone needed to traverse to something that wasn't cached, they'd
>> take
 a performance hit, but still be able to get to it.
 
 I had understood the idea behind the load balancing is to minimize
 traversals out of cache by grouping similar sets of users on a
>> particular
 server.  (That way you don't need a ton of RAM to stash everything in
>> the
 database, just the most frequently accessed nodes and relationships
 associated with a subset of the users.)
 
 
 
 
> Hello JT,
> 
>> One thing, when you say route requests to specific instances .. does
>> that
>> imply that node relationships can't span instances ?
> 
> Yes that's right. What I'm suggesting here is that each instance is a
>> full
> replica that works on a subset of requests which are likely to keep the
> caches warm.
> 
> So if you can split your requests (e.g all customers beginning with "A"
>> go
> to instance "1" ... all customers beginning with "Z" go to instance
>> "26"),
> they will benefit from having warm caches for reading, while the HA
> infrastructure deals with updates across instances transactionally.
> 
> Jim
> ___
> Neo4j mailing list
> User@lists.neo4j.org
> https://lists.neo4j.org/mailman/listinfo/user
> 
 
 
 --
 Rick Otten
 rot...@windfish.net
 O=='=+
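The letter-based request splitting Jim describes above ("customers beginning
with A go to instance 1 ... Z to instance 26") is simple to sketch on the
client side. The class below is purely illustrative - the names are invented
and no Neo4j API is involved:

```java
// Hypothetical client-side router for the "first letter of customer name"
// partitioning rule quoted above.
public class ShardRouter {

    /** Map a customer name to a replica instance: 'A' -> 1 ... 'Z' -> 26. */
    public static int instanceFor(String customer) {
        if (customer == null || customer.isEmpty()) {
            return 0; // fallback instance for unroutable names
        }
        char c = Character.toUpperCase(customer.charAt(0));
        if (c < 'A' || c > 'Z') {
            return 0; // names not starting with a letter also fall back
        }
        return c - 'A' + 1;
    }

    public static void main(String[] args) {
        System.out.println(instanceFor("Alice")); // routed to instance 1
        System.out.println(instanceFor("zoe"));   // routed to instance 26
    }
}
```

A real router would additionally map the instance number to a host list and
deal with load skew (far more customers start with 'S' than with 'X').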
 
 

Re: [Neo4j] Can't get a spatial layer

2011-02-21 Thread Craig Taverner
>
>  def performImport() {
>    database.shutdown()
>    val batchInserter = new MyBatchInserter
>    importer.importFile(batchInserter, filename)
>    importer.reIndex(database, 1000)
>    batchInserter.shutdown()
>  }
>

But now you are running the reIndex method on a database you previously
shut down. You need to re-open the database for reIndex to work. Also, you
should shut down the batch inserter after the importFile and before opening
the database again. The sequence should be:

   - start batch inserter
   - import file
   - shutdown batch inserter
   - start database
   - reindex
   - shutdown database
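Applied to the method quoted at the top of this message, that sequence might
look like the sketch below (untested; it assumes `database` is reassignable
and that `openDatabase()` stands in for however the original code creates
its database - `MyBatchInserter` and `importer` are the poster's own classes):

```scala
def performImport() {
  database.shutdown()                           // close the running db first
  val batchInserter = new MyBatchInserter       // 1. start batch inserter
  importer.importFile(batchInserter, filename)  // 2. import file
  batchInserter.shutdown()                      // 3. shut down batch inserter
  database = openDatabase()                     // 4. start database again
  importer.reIndex(database, 1000)              // 5. reindex on the live db
  database.shutdown()                           // 6. shut down database
}
```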


Re: [Neo4j] Unable to parse mapped memory

2011-02-21 Thread Anders Nawroth
Hi!

Do you mean moved to 1.3.M01? I can't find any branch using
1.3.M02 yet.


/anders

2011-02-21 22:43, Craig Taverner wrote:
> Hi,
>
> Recently, since Peter moved Neo4j Spatial to 1.3M02 I think, I stopped being
> able to run the Neo4j Spatial unit tests in maven. Most of them now generate
> an error like:
>
> INFO: Unable to parse mapped memory[1.2] string for
> /home/craig/dev/neo4j/neo4j-spatial/target/var/neo4j-db/neostore.propertystore.db.strings
>
>
> I am making sure to create the database from scratch. The full error is
> below. Any ideas?
>
> Regards, Craig
>
> Physical mem: 2517MB, Heap size: 1020MB
> Feb 21, 2011 10:37:37 PM
> org.neo4j.kernel.impl.nioneo.store.CommonAbstractStore getMappedMem
> INFO: Unable to parse mapped memory[1.2] string for
> /home/craig/dev/neo4j/neo4j-spatial/target/var/neo4j-db/neostore.propertystore.db.strings
> create=true
> dump_configuration=true
> logical_log=/home/craig/dev/neo4j/neo4j-spatial/target/var/neo4j-db/nioneo_logical.log
> neo_store=/home/craig/dev/neo4j/neo4j-spatial/target/var/neo4j-db/neostore
> neostore.nodestore.db.mapped_memory=50M
> neostore.propertystore.db.arrays.mapped_memory=50M
> neostore.propertystore.db.index.keys.mapped_memory=1M
> neostore.propertystore.db.index.mapped_memory=1M
> neostore.propertystore.db.mapped_memory=400M
> neostore.propertystore.db.strings.mapped_memory=1.2G
> neostore.relationshipstore.db.mapped_memory=150M
> rebuild_idgenerators_fast=true
> store_dir=/home/craig/dev/neo4j/neo4j-spatial/target/var/neo4j-db
> ___
> Neo4j mailing list
> User@lists.neo4j.org
> https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] Social Networks And Graph Databases

2011-02-21 Thread J T
I realise that there are different qualities that can come into play with
the labels 'scalability' & 'performance', and I can see how your strategy
would help with some of those qualities, but it relies on custom logic in the
client application to do the sharding and load spreading, and doesn't address
scaling the underlying persistent storage engine.

One of the things that attracted me to Riak and Cassandra (for the use cases
I can apply them to) is that sharding, load balancing and persistence
scaling were available out-of-the-box and pretty much invisible to the
client application. The client app didn't have to do anything special. I
appreciate that it is perhaps easier for them to solve because they have
different semantics.

I had a read of this page you wrote the other day :
http://jim.webber.name/2011/02/16/3b8f4b3d-c884-4fba-ae6b-7b75a191fa22.aspx

It was your comment "it's hard to achieve in practice" that prompted me to
post my initial message yesterday to enquire further.

I'm no specialist in the field, I just know what I want hehe :)

The only player in the field I've been able to find that might have more of
the qualities I am interested is InfiniteGraph, its a shame that it doesn't
have a 'server' version like neo does for me to do a proper comparison.

I'll stick with Neo for now, and see how the marketplace matures in the
coming months - I'm amazed at how much movement there has been in the last
year.
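The per-letter routing rule discussed further down this thread (customers beginning with "A" go to instance "1", etc.) can be sketched in plain, self-contained Java. All names here are invented for illustration; this is not Neo4j API code, just the kind of deterministic client-side rule that keeps each replica's caches warm:

```java
import java.util.Locale;

// Toy cache-affinity router: the first letter of the customer name picks
// the replica, so requests for the same customer always hit the same
// warm-cached instance. Class and method names are illustrative only.
public class CacheAffinityRouter {
    private final int instances;

    public CacheAffinityRouter(int instances) {
        this.instances = instances;
    }

    public int instanceFor(String customerName) {
        String key = customerName.trim().toLowerCase(Locale.ROOT);
        char first = key.isEmpty() ? '#' : key.charAt(0);
        if (first >= 'a' && first <= 'z') {
            return (first - 'a') % instances;
        }
        // Fall back to a stable hash for non-alphabetic names.
        return Math.floorMod(key.hashCode(), instances);
    }

    public static void main(String[] args) {
        CacheAffinityRouter router = new CacheAffinityRouter(26);
        System.out.println(router.instanceFor("Alice")); // 0
        System.out.println(router.instanceFor("Zorro")); // 25
    }
}
```

The only property that matters is determinism: the same customer must always land on the same instance, or the cache-warming benefit disappears.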




On Mon, Feb 21, 2011 at 3:09 PM, Jim Webber  wrote:

> Yup, you nailed it better than I did Rick.
>
> Though your partition strategy might not be just "per user." For example in
> the geo domain, it makes sense to route requests for particular cities to
> specific nodes. It'll depend on your application how you generate your
> routing rules.
>
> Jim
>
> On 21 Feb 2011, at 14:51, Michael Hunger wrote:
>
> > You shouldn't be confused because you got it right :)
> >
> > Cheers
> >
> > Michael
> >
> > Am 21.02.2011 um 15:40 schrieb Rick Otten:
> >
> >> Ok, I'm following this discussion, and now I'm confused.
> >>
> >> My understanding was that the (potentially very large) database is
> >> replicated across all instances.
> >>
> >> If someone needed to traverse to something that wasn't cached, they'd
> take
> >> a performance hit, but still be able to get to it.
> >>
> >> I had understood the idea behind the load balancing is to minimize
> >> traversals out of cache by grouping similar sets of users on a
> particular
> >> server.  (That way you don't need a ton of RAM to stash everything in
> the
> >> database, just the most frequently accessed nodes and relationships
> >> associated with a subset of the users.)
> >>
> >>
> >>
> >>
> >>> Hello JT,
> >>>
>  One thing, when you say route requests to specific instances .. does
>  that
>  imply that node relationships can't span instances ?
> >>>
> >>> Yes that's right. What I'm suggesting here is that each instance is a
> full
> >>> replica that works on a subset of requests which are likely to keep the
> >>> caches warm.
> >>>
> >>> So if you can split your requests (e.g all customers beginning with "A"
> go
> >>> to instance "1" ... all customers beginning with "Z" go to instance
> "26"),
> >>> they will benefit from having warm caches for reading, while the HA
> >>> infrastructure deals with updates across instances transactionally.
> >>>
> >>> Jim
> >>>
> >>
> >>
> >> --
> >> Rick Otten
> >> rot...@windfish.net
> >> O=='=+
> >>
> >>
> >
>
>


Re: [Neo4j] Can't get a spatial layer

2011-02-21 Thread Nolan Darilek

> I think it is the missing call to import.reindex that is causing your
> problem.

I think so too, especially since I didn't know I had to do it. :) So now
my code looks like:

  def performImport() {
database.shutdown()
val batchInserter = new MyBatchInserter
importer.importFile(batchInserter, filename)
importer.reIndex(database, 1000)
batchInserter.shutdown()
  }

And I see this:


Re-indexing with GraphDatabaseService: EmbeddedGraphDatabase
[/home/nolan/database] (class: class
org.neo4j.kernel.EmbeddedGraphDatabase)
scala.actors.Actor$$anon$1@11dea651: caught java.lang.NullPointerException
java.lang.NullPointerException
at
org.neo4j.kernel.impl.nioneo.xa.ReadTransaction.nodeLoadLight(ReadTransaction.java:74)

at
org.neo4j.kernel.impl.nioneo.xa.NioNeoDbPersistenceSource$ReadOnlyResourceConnection.nodeLoadLight(NioNeoDbPersistenceSource.java:235)

at
org.neo4j.kernel.impl.persistence.PersistenceManager.loadLightNode(PersistenceManager.java:74)

at
org.neo4j.kernel.impl.core.NodeManager.getNodeById(NodeManager.java:386)
at
org.neo4j.kernel.impl.core.NodeManager.getReferenceNode(NodeManager.java:464)

at
org.neo4j.kernel.EmbeddedGraphDbImpl.getReferenceNode(EmbeddedGraphDbImpl.java:258)

at
org.neo4j.kernel.EmbeddedGraphDatabase.getReferenceNode(EmbeddedGraphDatabase.java:120)

at
org.neo4j.gis.spatial.SpatialDatabaseService.getSpatialRoot(SpatialDatabaseService.java:83)

at
org.neo4j.gis.spatial.SpatialDatabaseService.getLayer(SpatialDatabaseService.java:105)

at
org.neo4j.gis.spatial.SpatialDatabaseService.getOrCreateLayer(SpatialDatabaseService.java:150)

at org.neo4j.gis.spatial.osm.OSMImporter.reIndex(OSMImporter.java:197)
at org.neo4j.gis.spatial.osm.OSMImporter.reIndex(OSMImporter.java:187)
at
info.hermesgps.core.model.data.impl.neo4j.Neo4jImport.performImport(neo4j.scala:55)

at
info.hermesgps.core.model.data.Import$$anonfun$start$1.apply$mcV$sp(data.scala:25)

...

I even tried bracketing that in a transaction but that didn't seem to help.


[Neo4j] Unable to parse mapped memory

2011-02-21 Thread Craig Taverner
Hi,

Recently, since Peter moved Neo4j Spatial to 1.3M02 I think, I stopped being
able to run the Neo4j Spatial unit tests in Maven. Most of them now generate
an error like:

INFO: Unable to parse mapped memory[1.2] string for
/home/craig/dev/neo4j/neo4j-spatial/target/var/neo4j-db/neostore.propertystore.db.strings


I am making sure to create the database from scratch. The full error is
below. Any ideas?

Regards, Craig

Physical mem: 2517MB, Heap size: 1020MB
Feb 21, 2011 10:37:37 PM
org.neo4j.kernel.impl.nioneo.store.CommonAbstractStore getMappedMem
INFO: Unable to parse mapped memory[1.2] string for
/home/craig/dev/neo4j/neo4j-spatial/target/var/neo4j-db/neostore.propertystore.db.strings
create=true
dump_configuration=true
logical_log=/home/craig/dev/neo4j/neo4j-spatial/target/var/neo4j-db/nioneo_logical.log
neo_store=/home/craig/dev/neo4j/neo4j-spatial/target/var/neo4j-db/neostore
neostore.nodestore.db.mapped_memory=50M
neostore.propertystore.db.arrays.mapped_memory=50M
neostore.propertystore.db.index.keys.mapped_memory=1M
neostore.propertystore.db.index.mapped_memory=1M
neostore.propertystore.db.mapped_memory=400M
neostore.propertystore.db.strings.mapped_memory=1.2G
neostore.relationshipstore.db.mapped_memory=150M
rebuild_idgenerators_fast=true
store_dir=/home/craig/dev/neo4j/neo4j-spatial/target/var/neo4j-db
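For what it's worth, the "[1.2]" in the message suggests the store may be choking on the fractional size "1.2G". A possible workaround (an assumption on my part, not a confirmed fix) is to express the same amount in whole megabytes:

```properties
# Hypothetical workaround: avoid a fractional gigabyte value
neostore.propertystore.db.strings.mapped_memory=1200M
```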


Re: [Neo4j] Can't get a spatial layer

2011-02-21 Thread Craig Taverner
>
> Uh, I would too actually. ;) I thought that all I needed to do was
> create the SpatialDatabaseService and start grabbing layers.


Well, you can make empty layers if you want, but in order to see layers in
the database you must either import data (creates a layer) or create an
empty layer (which you can add geometries to later if you want).


> Or by
> "load" do you mean import from OSM? If so, then here's what I'm doing:
>

Yes. The OSM import should create a single layer.

 class MyBatchInserter extends BatchInserterImpl(database.getStoreDir) {
>

I'm slightly suspicious that you might be opening the batch inserter on an
already open database, which is not allowed. The database should be closed.
And yet you said you are seeing import messages, so perhaps there is some
Scala trick I am missing.

 def performImport() {
>val batchInserter = new MyBatchInserter
>importer.importFile(batchInserter, filename)
>batchInserter.shutdown()
>  }
>

I do not see the code for indexing the layer. The OSMImporter has a two-phase
import: the first phase imports the OSM data into an OSM graph structure,
which can be traversed as OSM data, but has not been published as a layer of
JTS Geometries. You need to run the second phase, which means calling the
importer.reIndex method with a normal GraphDatabaseService instance.

This would certainly explain why you can see the import happening, but then
not find the layer.

That's extending a few local traits and such, but it does output lots of
> log messages indicating that it is importing and does leave me with a
> database in the end, so it seems to be working.
>

I think it is the missing call to importer.reIndex that is causing your
problem. The reason we split the OSM import into two phases was that we wanted
to use the batch inserter for the heavy-duty task of getting the bulk of the
data into the database, but we could not use it for the indexing task,
because the RTree needs to delete nodes during index tree splits, and the
batch inserter does not allow deletes. So we have to switch back to the
normal GraphDatabaseService for the second phase, which creates the layer
and index and adds all geometries to the index.

We are currently evaluating alternative performance improvements which will
allow us to use the normal API for the entire process, in which case we will
remove the second phase and index the geometries as we find them. But right
now that is not possible.

Regards, Craig


Re: [Neo4j] Can't get a spatial layer

2011-02-21 Thread Nolan Darilek

On 02/21/2011 02:49 PM, Craig Taverner wrote:
> 
> My Scala is too rusty to comment on any issues with the code, but I must
> suspect that the code for loading the layers is probably at fault. You are
> only showing the code for listing layers below, so I would like to see the
> code for loading layers before making any suggestions.
> 

Uh, I would too actually. ;) I thought that all I needed to do was
create the SpatialDatabaseService and start grabbing layers. Or by
"load" do you mean import from OSM? If so, then here's what I'm doing:

class Neo4jImport(filename:String, layer:String = "map") extends Import {

  val importer = new OSMImporter(layer)

  private var processed = 0

  def processedEntities = processed

  private val database = dataset.asInstanceOf[Neo4JDataSet].database

  class MyBatchInserter extends BatchInserterImpl(database.getStoreDir) {

override def createNode(properties:JMap[String, Object]) = {
  processed += 1
  super.createNode(properties)
}

override def createNode(id:Long, properties:JMap[String, Object]){
  super.createNode(id, properties)
  processed += 1
}

override def createRelationship(n1:Long, n2:Long,
rt:RelationshipType, properties:JMap[String, Object]) = {
  processed += 1
  super.createRelationship(n1, n2, rt, properties)
}

  }

  def performImport() {
val batchInserter = new MyBatchInserter
importer.importFile(batchInserter, filename)
batchInserter.shutdown()
  }

}

That's extending a few local traits and such, but it does output lots of
log messages indicating that it is importing and does leave me with a
database in the end, so it seems to be working.

Thanks.


Re: [Neo4j] Can't get a spatial layer

2011-02-21 Thread Craig Taverner
Hi Nolan,

My Scala is too rusty to comment on any issues with the code, but I suspect
that the code for loading the layers is probably at fault. You are
only showing the code for listing layers below, so I would like to see the
code for loading layers before making any suggestions.

I know I tested the getLayerNames quite a bit with my recent Ruby wrapper on
Neo4j Spatial, and it worked fine for that, so I do not believe there are
any issues in the underlying code.

Regards, Craig

On Mon, Feb 21, 2011 at 9:23 PM, Nolan Darilek wrote:

>
> What conditions would cause there to be no layer names returned by
> spatialService.getLayerNames(), and null for all calls to
> spatialService.getLayer(...)?
>
> I tweaked my test script slightly such that it dumps me into a console
> so I can see how well I've ported my data model. Only, try though I
> might, I can't seem to load any layers. I'm pretty sure that my code is
> correct. Here's what I have:
>
> class Neo4JDataSet(val database:GraphDatabaseService) extends DataSet {
>
>  val spatialService = new SpatialDatabaseService(database)
>  println("Layers")
>  spatialService.getLayerNames.foreach(println)
>  println("Done")
> ...
> }
>
> The foreach line should print a list of the layers, one per line. The
> database is being initialized, as I occasionally get unclean shutdown
> errors if I don't shutdown() before killing the Scala REPL. I've even
> tried the foreach in a transaction, but can't get a list of layers. If I
> poke the database in the shell, I do see a node called "map", which
> happens to be the layer name I chose.
>
> What am I missing, or what additional information can I provide? It's
> hard to know exactly, since I have a couple layers of indirection at
> work here. But if I access the database directly and create a
> SpatialDatabaseService from the REPL, I still get 0 layers, so I must be
> missing a step somewhere.
>
> Thanks.
>


Re: [Neo4j] Getting started with Neo4J Spatial

2011-02-21 Thread Craig Taverner
>
> I've jumped right in and am implementing a rough cut of my database
> abstraction layer for Neo4J. This brought up a couple more questions.
>  The Neo4J Spatial readme shows the search query being done inside a
> transaction. Is this necessary? My reading of the wiki suggests that a
> TX is only needed for mutating the graph, which I assume queries aren't
> doing. I've checked the test cases and don't see a tx, but I'm not
> unwinding the callstack all the way back through superclasses and might
> be missing it.
>

The first versions of Neo4j Spatial were built with Neo4j 1.0, where
read transactions were still required. These have since been removed, but
sadly the readme is still out of date. I will rectify this as soon as
possible (hopefully later today).

I have two methods which I'm trying to figure out how to implement:
>
> nearestNode(lat:Double, lon:Double):Option[Node]
> nearestWay(lat:Double, lon:Double, allowedTypes:Option[List[String]] =
> None):Option[Way]
>

There are a number of Scala people on the list who should be able to correct
me if my Scala is rusty, but I see no problems with your ideas. I think both
of these can be implemented using the ideas I presented in the previous
reply. For nearest node, we are talking about a simple distance, point to
point. For nearest way, I think you might need to define what you mean by
nearest. Do you mean the way with the nearest edge? Or perhaps the way with
the nearest centre of gravity or average position? I believe the built-in
JTS functions will help in either case.
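To make the point-to-point "nearest node" case concrete, here is a self-contained sketch in plain Java with no Neo4j or JTS dependency. Everything here (class name, sample coordinates) is invented for illustration, and the linear scan stands in for what the spatial index does in practice:

```java
// Self-contained illustration of the point-to-point "nearest node" case:
// great-circle (haversine) distance plus a linear scan. In a real system
// an RTree would prune the scan; coordinates below are invented.
public class NearestNode {
    static final double EARTH_RADIUS_KM = 6371.0;

    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                 + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                 * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    // points holds {lat, lon} pairs; returns the index of the nearest
    // point, or -1 for an empty input.
    static int nearest(double[][] points, double lat, double lon) {
        int best = -1;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int i = 0; i < points.length; i++) {
            double d = haversineKm(points[i][0], points[i][1], lat, lon);
            if (d < bestDist) {
                bestDist = d;
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Stockholm, Gothenburg, Malmo; query point near Copenhagen.
        double[][] nodes = { {59.33, 18.06}, {57.70, 11.97}, {55.60, 13.00} };
        System.out.println(nearest(nodes, 55.68, 12.57)); // 2 (Malmo)
    }
}
```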

I do like the Scala Option and None features, but have to admit I still
prefer the Ruby solution with the absolutely brilliant decision to make '0'
(zero) be true :-)
(and from that follows a whole stream of beautifully compact boolean logic
:-)

Node or Way, and nearestWay may or may not be given an option of allowed
> types (allowed values for the "highway" tag, basically.)
>

The filtering on 'highway' is nicely supported by the DynamicLayer class,
which OSMLayer extends. Take a look at the code in TestDynamicLayers for
examples of that. You can require that a tag like "highway" exists, or
require that it match a specific value, "highway"="primary". You can combine
multiple constraints like this also. The only restriction currently is that
all constraints are ANDed together. We need to think a little more about OR.
I think my current code allows for it, but I have not verified it.

I see there is a SearchClosest() query, but I don't see how to tell it
> what I'm looking for when I give it a lat/lon.
>

If you define the dynamic layers first, that will set up the filter; then run
the distance search on the dynamic layer. This will cause both the distance
test and the filter test to be performed during the search. In effect, what
it does is perform the distance test using a bounding box for all nodes in
the RTree, until it gets to leaf nodes that are within regions that match
the distance criteria (or range of distances if you take my last suggestions
from the previous email), and at the leaf nodes it then performs both the
tag test (from the dynamic layer) and the more exact distance test.
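The two primitives behind that search can be illustrated with a toy, planar Java sketch that does not use any Neo4j Spatial code: the minimum distance from a point to a bounding box (used to prune whole RTree subtrees) and a tag-match test (the DynamicLayer-style filter applied at the leaves). All names are illustrative:

```java
import java.util.Map;

// Toy versions of the two tests described above: a cheap bounding-box
// distance prune, and a tag filter applied only to surviving candidates.
// Planar geometry and all names here are illustrative only.
public class PrunedSearch {
    // Minimum distance from (x, y) to the axis-aligned box
    // [minX, maxX] x [minY, maxY]; 0 when the point is inside.
    static double distToBox(double x, double y,
                            double minX, double minY, double maxX, double maxY) {
        double dx = Math.max(0, Math.max(minX - x, x - maxX));
        double dy = Math.max(0, Math.max(minY - y, y - maxY));
        return Math.hypot(dx, dy);
    }

    // value == null means "the tag must merely exist", mirroring the
    // "require that a tag like highway exists" case above.
    static boolean matches(Map<String, String> tags, String key, String value) {
        return value == null ? tags.containsKey(key) : value.equals(tags.get(key));
    }

    public static void main(String[] args) {
        // A subtree whose box is farther than the current best distance
        // can be skipped without any exact geometry or tag tests.
        System.out.println(distToBox(0, 0, 1, 1, 2, 2));     // sqrt(2) ~ 1.414
        System.out.println(distToBox(1.5, 1.5, 1, 1, 2, 2)); // 0.0 (inside)
    }
}
```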

Is this where dynamic layers come in? So for instance, if I wanted the
> closest node then I'd just create a dynamic layer on top of my map layer
> for all points? And if I then wanted the closest way of any type, I
> guess I'd then create a dynamic layer for polygons? Then, assuming my
> nearestWay function is called with a list of allowed types, I'd just
> iterate through that list and add ("highway", type) to that same layer
> to further narrow the search? Just want to make sure I have it right
> before running off to read the dynamic layer code.
>

Yes, you are correct, almost. The dynamic layers do filter on geometry type,
but they also do the tag filtering for you at the same time. So you do not
need to do any iterating on the results; the final results are the final
results :-)


[Neo4j] Can't get a spatial layer

2011-02-21 Thread Nolan Darilek

What conditions would cause there to be no layer names returned by
spatialService.getLayerNames(), and null for all calls to
spatialService.getLayer(...)?

I tweaked my test script slightly such that it dumps me into a console
so I can see how well I've ported my data model. Only, try though I
might, I can't seem to load any layers. I'm pretty sure that my code is
correct. Here's what I have:

class Neo4JDataSet(val database:GraphDatabaseService) extends DataSet {

  val spatialService = new SpatialDatabaseService(database)
  println("Layers")
  spatialService.getLayerNames.foreach(println)
  println("Done")
...
}

The foreach line should print a list of the layers, one per line. The
database is being initialized, as I occasionally get unclean shutdown
errors if I don't shutdown() before killing the Scala REPL. I've even
tried the foreach in a transaction, but can't get a list of layers. If I
poke the database in the shell, I do see a node called "map", which
happens to be the layer name I chose.

What am I missing, or what additional information can I provide? It's
hard to know exactly, since I have a couple layers of indirection at
work here. But if I access the database directly and create a
SpatialDatabaseService from the REPL, I still get 0 layers, so I must be
missing a step somewhere.

Thanks.


Re: [Neo4j] Neovigator Weekend Project

2011-02-21 Thread Javier de la Rosa
Very nice!

On Mon, Feb 21, 2011 at 09:15, Mattias Persson
 wrote:
> Pretty cool !
>
> 2011/2/21 Tobias Ivarsson 
>
>> Nice stuff!
>>
>> On Mon, Feb 21, 2011 at 6:18 AM, Max De Marzi Jr. > >wrote:
>>
>> > Guys,
>> >
>> > So I ran into the Ask Ken project ( http://askken.heroku.com/ ) by
>> > Michael Aufreiter yesterday, and thought it was pretty awesome... so I
>> > ported it to using Neo4j.
>> >
>> > Check it out: http://neovigator.heroku.com/
>> >
>> > On github at https://github.com/maxdemarzi/neovigator
>> >
>> > Regards,
>> > Max
>> >
>>
>>
>>
>> --
>> Tobias Ivarsson 
>> Hacker, Neo Technology
>> www.neotechnology.com
>> Cellphone: +46 706 534857
>>
>
>
>
> --
> Mattias Persson, [matt...@neotechnology.com]
> Hacker, Neo Technology
> www.neotechnology.com
>



-- 
Javier de la Rosa
http://versae.es


Re: [Neo4j] Index question (ReST API)

2011-02-21 Thread Mattias Persson
2011/2/21 Mark Nijhof 

> Ok perhaps I don't entirely get indexes yet:
>
> - Does the key value part need to map to an actual property_name and
> property_value of a node?
>

Doesn't need to, no


> - If the value changes then I should update the index manually? (i.e.
> delete
> the old index and create a new one).
>

Yup. But to have this done automatically you'll have to make use of
transaction event handlers, which can sit and listen to properties
added/removed and update indexes on commit. However there are no nice tools
for that yet.


> - Are indexes the only way I can find a node where property name is Mark?
> And for that I actually have to create an index /index_name/name/mark?
>

It's the best way IMHO, if you don't create some index in the graph itself
or loop through all persons linearly (which of course doesn't scale). You
can have an index "persons" where you map the key/value pair "name", "Mark"
to the Node denoting Mark:

    Node mark = graphDb.createNode();
    mark.setProperty( "name", "Mark" );
    Index<Node> personsIndex = graphDb.index().forNodes( "persons" );
    personsIndex.add( mark, "name", "Mark" );
    ...
    personsIndex.get( "name", "Mark" ).getSingle();


> - Would I create an index users/dummy/dummy to get all users or this should
> just go through a relation?
>

If you, at creation time, connect all users to a "users reference" node you
can reach all users later by iterating from that reference node to all
users:

    // Assume this exists (perhaps created earlier).
    Node usersRefNode = graphDb.getReferenceNode().getSingleRelationship(
            USERS_REFERENCE, Direction.OUTGOING ).getEndNode();
    ...
    Node mark = graphDb.createNode();
    usersRefNode.createRelationshipTo( mark, USER );

    // Iterate over all the users
    for ( Relationship rel : usersRefNode.getRelationships( USER,
            Direction.OUTGOING ) ) {
        Node person = rel.getEndNode();
        ...
    }


>
> -Mark
>
>
>
>
> On Sat, Feb 19, 2011 at 11:37 PM, Max De Marzi Jr.  >wrote:
>
> > The indexing piece is really lacking in Neography.  I keep meaning to
> > get around to it, and it's about time I did (next week).
> >
> > It would be nice if we had full indexing support in the REST API first
> > since whatever I implement will need to change when we do.
> >
> > If the specs are done, but not yet implemented, can we get a preview?
> >
> > On Sat, Feb 19, 2011 at 5:03 PM, Michael Hunger
> >  wrote:
> > > The problem (as before) is that you have both in neography - the direct
> > REST API methods that just expose the procedural calls to ruby, and the
> more
> > OO-like Node and Relationship classes.
> > >
> > > Easy to mix them up and take them for the same API, but they aren't.
> > >
> > > Cheers
> > >
> > > Michael
> > >
> > > Am 19.02.2011 um 21:34 schrieb Mark Nijhof:
> > >
> > >> I think I got confused because neography has node classes that contain
> > >> properties. So key didn't make much sense to me. I actually think I'll
> > drop
> > >> neography and start using actual ReST command to get a better
> > understanding.
> > >>
> > >> -Mark
> > >>
> > >> On Sat, Feb 19, 2011 at 9:21 PM, Peter Neubauer <
> > >> peter.neuba...@neotechnology.com> wrote:
> > >>
> > >>> Yes, that should be about right! Is
> > >>>
> > >>>
> >
> http://components.neo4j.org/neo4j-server/1.3-SNAPSHOT/rest.html#Add_to_index
> > >>> unclear? In that case, we need to put that out more clearly ...
> > >>>
> > >>> Cheers,
> > >>>
> > >>> /peter neubauer
> > >>>
> > >>> GTalk:  neubauer.peter
> > >>> Skype   peter.neubauer
> > >>> Phone   +46 704 106975
> > >>> LinkedIn   http://www.linkedin.com/in/neubauer
> > >>> Twitter  http://twitter.com/peterneubauer
> > >>>
> > >>> http://www.neo4j.org   - Your high performance graph
> > database.
> > >>> http://startupbootcamp.org/- Öresund - Innovation happens HERE.
> > >>> http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing
> > party.
> > >>>
> > >>>
> > >>>
> > >>> On Sat, Feb 19, 2011 at 8:56 PM, Mark Nijhof
> > >>>  wrote:
> >  Hmm ok, is it this?
> > 
> >  Key is the property name
> >  Value is the property value
> > 
> >  ?
> > 
> >  -Mark
> > 
> >  On Sat, Feb 19, 2011 at 8:09 PM, Mark Nijhof <
> > >>> mark.nij...@cre8ivethought.com
> > > wrote:
> > 
> > > Hi,
> > >
> > > I have a question about indexes, when looking at neography I see a
> > >>> method: add_node_to_index(index,
> > > key, value, node)
> > >
> > > I can understand that index is the name of the index that I want to
> > put
> > >>> the
> > > node into, what I don't understand is the key value part of it.
> > >
> > > -Mark
> > >
> > >
> > >
> > > --
> > > Mark Nijhof
> > > m: 0047 95 00 99 37
> > > e:  mark.nij...@cre8ivethought.com
> > > b:  cre8ivethought.com/blog/index
> > >
> > >
> > >
> > > "Walking on water and developing software from a specification are
> > easy
> > 

[Neo4j] Synchronization model between embedded client and Standalone server

2011-02-21 Thread Brendan
Hi,

I want to design a system where each client has an embedded db to store only a
subset of a master server db of all clients.  Then, should I design a server
plug-in to extend the currently stateless server services to support
transactions by simply wrapping a few requests into one atomic request and
response? Also, I would modify your online backup to synchronize the clients
according to these transactions.

I'm sure there is a better idea.  Please let me know if you have one.

Brendan

Sent from my iPad


Re: [Neo4j] Index question (ReST API)

2011-02-21 Thread Mark Nijhof
Ok perhaps I don't entirely get indexes yet:

- Does the key value part need to map to an actual property_name and
property_value of a node?
- If the value changes then I should update the index manually? (i.e. delete
the old index and create a new one).
- Are indexes the only way I can find a node where property name is Mark?
And for that I actually have to create an index /index_name/name/mark?
- Would I create an index users/dummy/dummy to get all users or this should
just go through a relation?

-Mark




On Sat, Feb 19, 2011 at 11:37 PM, Max De Marzi Jr. wrote:

> The indexing piece is really lacking in Neography.  I keep meaning to
> get around to it, and it's about time I did (next week).
>
> It would be nice if we had full indexing support in the REST API first
> since whatever I implement will need to change when we do.
>
> If the specs are done, but not yet implemented, can we get a preview?
>
> On Sat, Feb 19, 2011 at 5:03 PM, Michael Hunger
>  wrote:
> > The problem (as before) is that you have both in neography - the direct
> REST API methods that just expose the procedural calls to ruby, and the more
> OO-like Node and Relationship classes.
> >
> > Easy to mix them up and take them for the same API, but they aren't.
> >
> > Cheers
> >
> > Michael
> >
> > Am 19.02.2011 um 21:34 schrieb Mark Nijhof:
> >
> >> I think I got confused because neography has node classes that contain
> >> properties. So key didn't make much sense to me. I actually think I'll
> drop
> >> neography and start using actual ReST command to get a better
> understanding.
> >>
> >> -Mark
> >>
> >> On Sat, Feb 19, 2011 at 9:21 PM, Peter Neubauer <
> >> peter.neuba...@neotechnology.com> wrote:
> >>
> >>> Yes, that should be about right! Is
> >>>
> >>>
> http://components.neo4j.org/neo4j-server/1.3-SNAPSHOT/rest.html#Add_to_index
> >>> unclear? In that case, we need to put that out more clearly ...
> >>>
> >>> Cheers,
> >>>
> >>> /peter neubauer
> >>>
> >>> GTalk:  neubauer.peter
> >>> Skype   peter.neubauer
> >>> Phone   +46 704 106975
> >>> LinkedIn   http://www.linkedin.com/in/neubauer
> >>> Twitter  http://twitter.com/peterneubauer
> >>>
> >>> http://www.neo4j.org   - Your high performance graph
> database.
> >>> http://startupbootcamp.org/- Öresund - Innovation happens HERE.
> >>> http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing
> party.
> >>>
> >>>
> >>>
> >>> On Sat, Feb 19, 2011 at 8:56 PM, Mark Nijhof
> >>>  wrote:
>  Hmm ok, is it this?
> 
>  Key is the property name
>  Value is the property value
> 
>  ?
> 
>  -Mark
> 
>  On Sat, Feb 19, 2011 at 8:09 PM, Mark Nijhof <
> >>> mark.nij...@cre8ivethought.com
> > wrote:
> 
> > Hi,
> >
> > I have a question about indexes, when looking at neography I see a
> >>> method: add_node_to_index(index,
> > key, value, node)
> >
> > I can understand that index is the name of the index that I want to
> put
> >>> the
> > node into, what I don't understand is the key value part of it.
> >
> > -Mark
> >
> >
> >
> > --
> > Mark Nijhof
> > m: 0047 95 00 99 37
> > e:  mark.nij...@cre8ivethought.com
> > b:  cre8ivethought.com/blog/index
> >
> >
> >
> > "Walking on water and developing software from a specification are
> easy
> >>> if
> > both are frozen."
> >
> > -- Edward V Berard
> >
> >
> >
> >
> 
> 
>  --
>  Mark Nijhof
>  m: 0047 95 00 99 37
>  e:  mark.nij...@cre8ivethought.com
>  b:  cre8ivethought.com/blog/index
> 
> 
> 
>  "Walking on water and developing software from a specification are
> easy
> >>> if
>  both are frozen."
> 
>  -- Edward V Berard
> 
> >>>
> >>
> >>
> >>
> >> --
> >> Mark Nijhof
> >> m: 0047 95 00 99 37
> >> e:  mark.nij...@cre8ivethought.com
> >> b:  cre8ivethought.com/blog/index
> >>
> >>
> >>
> >> "Walking on water and developing software from a specification are easy
> if
> >> both are frozen."
> >>
> >> -- Edward V Berard
> >
> >
>



-- 
Mark Nijhof
m: 0047 95 00 99 37
e:  mark.nij...@cre8ivethought.com
b:  cre8ivethought.com/blog/index



"Walking on water and developing software from a specification are easy if
both are frozen."

-- Edward V Berard

Re: [Neo4j] Batch Inserter - db scaling issue (not index scaling issue)

2011-02-21 Thread Mark Harwood
Thanks for taking the time to look over my example, Johan.

I was hoping that the batch inserter's memory costs would not be
directly linear with the volume of data inserted - sounds like they are.
My assumption was that the indexing service was the service with the
comparatively hard task of random lookups on arbitrary keys on an
ever-changing index, with sub-linear memory cost, sub-linear lookup
speed and background merge tasks to avoid fragmentation over time. I
kind of hoped the graph db could have similar qualities in its tasks
of allocating new node ids and storing/retrieving related edges.

Cheers,
Mark

On Mon, Feb 21, 2011 at 2:30 PM, Johan Svensson  wrote:
> Mark,
>
> I had a look at this and you try to inject 130M relationships with a
> relationship store configured to 700M. That will not be an efficient
> insert. If your relationships and data are not sorted the batch
> inserter would have to unload and load blocks of data as soon as you
> get over around 22M relationships. To inject 130M relationships at
> full speed with random connections would require around 4G for the
> relationship store.
>
> -Johan
>
> On Fri, Feb 18, 2011 at 8:07 AM, Mark @ Gmail  wrote:
>> Hi Johan and others
>>>I am having a hard time to follow what the problems really are since 
>>>conversation is split up in several thread
>> My fault, sorry. I was replying to a message posted before I subscribed to 
>> the list so didn't have the orginal poster's email.
>>
>>>as I understand it you are saying that it is the index lookups that are 
>>>taking to long time?
>>
>> In your current implementation, "Yes" - in the indexing implementation I 
>> provide on that Google code project there is no performance issue.
>> However, having fixed the Lucene indexing issue it only reveals that the 
>> *database* is now the bottleneck and blows up after 30 million edge inserts. 
>> That is now the issue here.
>>
>> See the test results here : 
>> http://code.google.com/p/graphdb-load-tester/wiki/TestResults
>>
>>>For example inserting 500M relationships
>>>requiring 1B index lookups (one for each node) with an avg index
>>>lookup time of 1ms is 11 days worth of index lookup time.
>> That is why I suggested to Peter when he asked for help with indexing that a 
>> Bloom filter helps "know what you don't know" and an LRU Cache helps hang 
>> onto popular nodes. These are in my implementation and both avoid reads.
>> Re your suggestion about avoiding indexes by inserting in batches - I can't 
>> see how that will help because you can sort input data by from node key or 
>> to node key but will not necessarily end up with node pairs that are joined 
>> by edges conveniently located in the same batch and will therefore need an 
>> index service to add any edges - but as I say this is fixed in my 
>> implementation and indexing is not the remaining issue - the database is.
>> I do encourage you to try running it.
>>
>> Cheers,
>> Mark
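
Mark's "LRU Cache helps hang onto popular nodes" point is easy to sketch with a
plain LinkedHashMap in access order, so that hot key-to-nodeId lookups never
have to touch the Lucene index. The capacity and key type here are
illustrative, not taken from the graphdb-load-tester code:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal LRU cache sketch: a LinkedHashMap in access order that evicts
// the least-recently-used key -> nodeId mapping once capacity is reached.
// Capacity and String keys are illustrative assumptions.
public class NodeIdLruCache extends LinkedHashMap<String, Long> {
    private final int capacity;

    public NodeIdLruCache(int capacity) {
        super(16, 0.75f, true); // true = access order, giving LRU behaviour
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<String, Long> eldest) {
        return size() > capacity; // evict once we exceed capacity
    }
}
```

A Bloom filter would sit in front of this cache to answer "definitely not
indexed yet" without a read at all; the cache then absorbs the repeated
lookups for popular nodes.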


Re: [Neo4j] Gremlin from Java Error

2011-02-21 Thread Alfredas Chmieliauskas
Thanks a lot. Updated to 0.8-SNAPSHOT. Strangely, Maven did not resolve the
Groovy dependency automatically - I had to add it by hand.

The ScriptEngine approach works fine, but Gremlin.compile(expr) does not?

I understand that it could be faster and cleaner to use Groovy, but I need
to be able to pass string queries and get results back.
In my use case this is the defining use of Gremlin - arbitrary string
queries. For "static" business logic I can use Pipes - I'm not
bothered by the wordy expressions.

Alfredas


On Mon, Feb 21, 2011 at 4:23 PM, Marko Rodriguez  wrote:
> Hi,
>
> outE(String) is in Gremlin 0.8-SNAPSHOT, not Gremlin 0.7.
>
> Note at the bottom of the main Wiki page: "Gremlin documentation is up to 
> date with the current Gremlin codebase, not with the latest Gremlin release."
>
> I do this so its easier for me to maintain documentation. The documentation 
> for Gremlin 0.7 is in the doc/wiki directory of the distribution.
>
> If you still want to use Gremlin 0.7, do this:
>
> engine.eval("v.outE[[label='KNOWS']].inV >> results");
>
> Finally, I would recommend using Groovy in your codebase instead of JSR 223 
> ScriptEngine. Groovy and Java work seamlessly together and its so much 
> handier/cleaner/faster than through JSR 223. See:
>        https://github.com/tinkerpop/gremlin/wiki/Using-Gremlin-through-Groovy
>
> Hope that helps,
> Marko.
>
> http://markorodriguez.com
>
>
> On Feb 21, 2011, at 9:15 AM, Alfredas Chmieliauskas wrote:
>
>> Dear all,
>>
>> I have the following code:
>>
>> ScriptEngine engine = new GremlinScriptEngineFactory().getScriptEngine();
>> List results = new ArrayList();
>> engine.getBindings(ScriptContext.ENGINE_SCOPE).put("g", getGraph());
>> engine.getBindings(ScriptContext.ENGINE_SCOPE).put("v", 
>> getVertex(startNode));
>> engine.getBindings(ScriptContext.ENGINE_SCOPE).put("results", results);
>> try {
>>    engine.eval("v.outE('KNOWS').inV >> results");
>> } catch (ScriptException e) {
>>    logger.error(e.getMessage(), e);
>> }
>>
>> produces the following error:
>>
>>
>> ERROR javax.script.ScriptException:
>> groovy.lang.MissingMethodException: No signature of method:
>> com.tinkerpop.blueprints.pgm.impls.neo4j.Neo4jVertex.outE() is
>> applicable for argument types: (java.lang.String) values: [KNOWS]
>> Possible solutions: outE(groovy.lang.Closure), dump(),
>> use([Ljava.lang.Object;), getAt(java.lang.String),
>> getAt(java.lang.String), with(groovy.lang.Closure)
>> javax.script.ScriptException: javax.script.ScriptException:
>> groovy.lang.MissingMethodException: No signature of method:
>> com.tinkerpop.blueprints.pgm.impls.neo4j.Neo4jVertex.outE() is
>> applicable for argument types: (java.lang.String) values: [KNOWS]
>> Possible solutions: outE(groovy.lang.Closure), dump(),
>> use([Ljava.lang.Object;), getAt(java.lang.String),
>> getAt(java.lang.String), with(groovy.lang.Closure)
>>       at 
>> org.codehaus.groovy.jsr223.GroovyScriptEngineImpl.eval(GroovyScriptEngineImpl.java:117)
>>       at 
>> com.tinkerpop.gremlin.jsr223.GremlinScriptEngine.eval(GremlinScriptEngine.java:36)
>>       at 
>> javax.script.AbstractScriptEngine.eval(AbstractScriptEngine.java:247)
>>       at 
>> com.tinkerpop.gremlin.jsr223.GremlinScriptEngine.eval(GremlinScriptEngine.java:32)
>>       at 
>> alfredas.springdatagraph.template.domain.AbstractRepository.findAllByGremlin2(AbstractRepository.java:94)
>>       at alfredas.springdatagraph.template.App.(App.java:84)
>>       at alfredas.springdatagraph.template.App.main(App.java:93)
>> Caused by: javax.script.ScriptException:
>> groovy.lang.MissingMethodException: No signature of method:
>> com.tinkerpop.blueprints.pgm.impls.neo4j.Neo4jVertex.outE() is
>> applicable for argument types: (java.lang.String) values: [KNOWS]
>> Possible solutions: outE(groovy.lang.Closure), dump(),
>> use([Ljava.lang.Object;), getAt(java.lang.String),
>> getAt(java.lang.String), with(groovy.lang.Closure)
>>       at 
>> org.codehaus.groovy.jsr223.GroovyScriptEngineImpl.eval(GroovyScriptEngineImpl.java:318)
>>       at 
>> org.codehaus.groovy.jsr223.GroovyScriptEngineImpl.eval(GroovyScriptEngineImpl.java:111)
>>       ... 6 more
>> Caused by: groovy.lang.MissingMethodException: No signature of method:
>> com.tinkerpop.blueprints.pgm.impls.neo4j.Neo4jVertex.outE() is
>> applicable for argument types: (java.lang.String) values: [KNOWS]
>> Possible solutions: outE(groovy.lang.Closure), dump(),
>> use([Ljava.lang.Object;), getAt(java.lang.String),
>> getAt(java.lang.String), with(groovy.lang.Closure)
>>       at 
>> org.codehaus.groovy.runtime.ScriptBytecodeAdapter.unwrap(ScriptBytecodeAdapter.java:54)
>>       at 
>> org.codehaus.groovy.runtime.callsite.PojoMetaClassSite.call(PojoMetaClassSite.java:46)
>>       at 
>> org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:40)
>>       at 
>> org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:116)
>>       at 
>> org.codehaus.groovy.runtime.callsite.AbstractCallSit

Re: [Neo4j] Getting started with Neo4J Spatial

2011-02-21 Thread Craig Taverner
>
> Actually, MongoDB is just geohashes AFAIK. So it works great for use
> cases like Foursquare, but only partially well for me, where I want to
> know things like the closest way to x latitude and y longitude. So
> perhaps I'm comparing apples and orangutans when comparing MongoDB to
> Neo4J.
>

"Apples and Orangutans" - that's a quote for the books :-)

OK, but can you tell me how to translate that into code? In looking
> through the source, I see there's a query for points within a specific
> bounding box, but how would I make that appear infinite? Do I use a
> sufficiently large bounding box such that the user will only reach the
> end after 100 pages or so, or is there another way? I wouldn't mind that
> so much, and could then require the user to hit the normal search API for
> anything beyond such a large radius.
>

Off the top of my head, I can see four options that should work on current
code:

   - The bounding box query you suggest.
   - The distance query in the class SearchPointsWithinOrthodromicDistance.
   This should be a bit faster and more memory efficient than the plain
   bounding box search because the distance test is done within the index.
   The calculated distance is saved to the result set so you can sort the
   results later.
   - Do a SearchWithin on a polygon with a hole, so that the outer ring is
   the new distance and the inner ring is the previous distance, creating a
   pagination effect. Since the polygon (with hole) is tested inside the
   index traversal, the efficiency should be higher than doing a post-query
   pagination, especially for later pages. One issue is that you do not know
   how many results will occur within, so you need to do partial post-query
   pagination anyway.
   - Create a new special search query that combines the distance
   calculation in the orthodromic distance query with the inner/outer ring.
   I think this is the most efficient, because you can use a combination of
   the two-ring polygon and a distance calculation to refine the results.
   And this is done during the traversal, so the result set never gets too
   big.
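
The "polygon with a hole" option can be pictured as an annulus filter: each
page is the ring between the previous search radius and the new one, so
earlier pages are never re-scanned. A minimal sketch, assuming distances have
already been computed by the distance query (the Hit record here is a
stand-in for Neo4j Spatial's result objects, which save the calculated
distance to the result set):

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Ring (annulus) pagination sketch: each page covers the band between the
// previous radius and the new one. Hit is an illustrative stand-in type.
public class RingPagination {
    public record Hit(long nodeId, double distanceKm) {}

    // Keep only hits with innerKm < distance <= outerKm, nearest first.
    public static List<Hit> page(List<Hit> hits, double innerKm, double outerKm) {
        return hits.stream()
                .filter(h -> h.distanceKm() > innerKm && h.distanceKm() <= outerKm)
                .sorted(Comparator.comparingDouble(Hit::distanceKm))
                .collect(Collectors.toList());
    }
}
```

In the real thing this band test would run inside the index traversal rather
than over a materialized result list, which is exactly why the later pages
stay cheap.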

I'm not only the president, but also a client. :) So I
> literally can't see your slides.
>

OK. Then I guess it would be best for me to give a voice presentation, on
Skype perhaps. That is certainly possible, but tricky to arrange because I have a
tight schedule. Contact me off-list and we can see if something can be
arranged.

OK, good to know that all of the tests pass.
>

Indeed. The automatic build fails if the tests fail. It is a nice safety
catch.

I'm still familiarizing myself with how the shell displays things, but
> according to that last ls, it would appear that there is a map node
> under osm_root ("map" was my layer name) with relations called
> "relations" and "ways". Would I be wrong in my expectation to find a
> "nodes" relation there as well?
>

Ah. I see your point. Well, actually I confess I removed direct access to
the nodes in a very early version of the OSMImporter because we can get to
the nodes through the ways, or through the RTree index, if the nodes are
points of interest. So linking the nodes to the dataset only serves to
increase the number of relations in an already large data structure. I
needed to save space. So, if you want all the nodes, you need to traverse
down the ways, and for each way, traverse to its first node proxy. Then
there is a chain of node proxies, each with a side relationship to the node
itself. The reason for the node-proxies is because the same node can be part
of many ways.
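
That proxy-chain layout can be modelled in a few lines: a way points to its
first node proxy, the proxies form a chain, and each proxy has a side link to
the real OSM node, so the same node can appear in many ways. The class and
field names below are illustrative, not Neo4j Spatial's actual types or
relationship names:

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the way -> proxy chain -> node structure described above.
// Names are illustrative assumptions, not the real Neo4j Spatial model.
public class WayNodeChain {
    static class OsmNode { final long osmId; OsmNode(long osmId) { this.osmId = osmId; } }
    static class Proxy { final OsmNode node; Proxy next; Proxy(OsmNode node) { this.node = node; } }

    // "Traverse down the way": walk the proxy chain, hopping sideways to
    // each real node to collect its OSM id.
    static List<Long> nodeIdsOfWay(Proxy firstProxy) {
        List<Long> ids = new ArrayList<>();
        for (Proxy p = firstProxy; p != null; p = p.next) {
            ids.add(p.node.osmId);
        }
        return ids;
    }
}
```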

I haven't looked at the code yet, but instead have been familiarizing
> myself with more Neo4J concepts, as well as the shell.
>

A good place to start.

Via "index --indexes", I see that there is a "nodes" index. Actually,
> given that I haven't seen "ways" or "relations" indexes but have seen
> relations rooted off of "map", I now wonder if the lack of
> ways/relations indexes and the lack of a nodes relationship are related?
>

Actually there should be three indices, for nodes, ways and relations. The
nodes index is used when loading ways to find the nodes for each way. The
ways index is used when loading relations. The relations index is also used
when loading relations since relations can contain relations.

But the pure graph has, as you observed, only two ways to the nodes, from
the dataset through the ways, or from the layer, through the RTree index,
which is not a lucene index like the others, but instead a tree structure in
the graph itself.


> In any case, is it possible to list the keywords associated with a given
> index from the shell? Based on my reading of the wiki, I assume that I'd
> query the nodes index by a keyword-value pair with the node ID as the
> value

Re: [Neo4j] Getting started with Neo4J Spatial

2011-02-21 Thread Nolan Darilek
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 02/18/2011 06:00 PM, Craig Taverner wrote:
> 
> Internally Neo4j Spatial is working with bounding boxes too. I have to
> assume MongoDb does also. Our RTree index is optimized for general spatial
> objects, and while it can do distance queries, I think there are better
> ways. I'm thinking that a fixed size grid b-tree like index, similar to my
> own 'amanzi-index' would work very well for distance searches, because
> instead of traversing from the top of the tree, you would traverse from the
> central point outwards radially. This would be one way to have the graph
> really benefit the distance search. We have considered plugging in other
> indices into Neo4j Spatial, but have not actually done it (yet).
> 

Actually, MongoDB is just geohashes AFAIK. So it works great for use
cases like Foursquare, but only partially well for me, where I want to
know things like the closest way to x latitude and y longitude. So
perhaps I'm comparing apples and orangutans when comparing MongoDB to Neo4J.

> It is still worth trying out the current index, it might be fast enough, not
> for infinite scrolling lists, but at least for the first many pages (so
> effectively infinite from the users point of view).
> 

OK, but can you tell me how to translate that into code? In looking
through the source, I see there's a query for points within a specific
bounding box, but how would I make that appear infinite? Do I use a
sufficiently large bounding box such that the user will only reach the
end after 100 pages or so, or is there another way? I wouldn't mind that
so much, and could then require the user to hit the normal search API for
anything beyond such a large radius.

> 
> Why can you not see them? Site blocked? Shall I email you the original
> presentation?
> 

My product is a navigation platform for people who are blind or visually
impaired. I'm translating OSM data into natural language such that, for
instance, a node that happens to be an intersection of several ways is
described in natural language relevant to both the node's content and to
the user's current direction of travel. And, to borrow a tired
colloquialism, I'm not only the president, but also a client. :) So I
literally can't see your slides.

> 
> The directory containing that test class contains many others, and all of
> them are run every night, so they should all work. Neo4j Spatial is sadly
> lacking in good documentation, so these test classes are the best place to
> investigate how to use it.
> 

OK, good to know that all of the tests pass.

> 
> The graph created has a tree structure rooted at the database root node. You
> can traverse down to a collection of nodes and a collection of ways. The
> Ruby wrapper actually exposes node and way counts through this, so you could
> look at that code to see how it is done.
> 

OK, so perhaps I'm confused about something. Here's a shell transcript:

Welcome to the Neo4j Shell! Enter 'help' for a list of commands

neo4j-sh (0)$ cd 1
neo4j-sh (osm_root,1)$ ls
*name =[osm_root]
*type =[osm]
(me) --[OSM]-> (map,2)
(me) <-[OSM]-- (0)
neo4j-sh (osm_root,1)$ cd 2
neo4j-sh (map,2)$ ls
*generator =[Osmosis 0.36]
*name  =[map]
*type  =[osm]
*version   =[0.6]
(me) <-[OSM]-- (osm_root,1)
(me) --[RELATIONS]-> (Walden of Westchase,50976192)
(me) --[WAYS]-> (Sam Rayburn Tollway,21973837)
neo4j-sh (map,2)$ quit

I'm still familiarizing myself with how the shell displays things, but
according to that last ls, it would appear that there is a map node
under osm_root ("map" was my layer name) with relations called
"relations" and "ways". Would I be wrong in my expectation to find a
"nodes" relation there as well?

> 
> As discussed above, we use a lucene index to track the OSM node ids (I
> usually referred to them as the osm-id above). Obviously the neo4j id is not
> the same number, as you suspected, so we needed this index. This is also the
> bottleneck for loading large OSM files.
> 

I haven't looked at the code yet, but instead have been familiarizing
myself with more Neo4J concepts, as well as the shell.

Via "index --indexes", I see that there is a "nodes" index. Actually,
given that I haven't seen "ways" or "relations" indexes but have seen
relations rooted off of "map", I now wonder if the lack of
ways/relations indexes and the lack of a nodes relationship are related?
In any case, is it possible to list the keywords associated with a given
index from the shell? Based on my reading of the wiki, I assume that I'd
query the nodes index by a keyword-value pair with the node ID as the
value. I'll be looking at that code later today, now that I'm grounded
in more of the concepts, but I wanted to understand those first before
reading code. Is the keyword I'd check "osm-id"? I'd like to learn how
to discover indexes from the shell for future reference, if indeed that
is possible.

Also, one more semi-related question. How do layers work in queries? My
app stores user POIs and, h

Re: [Neo4j] Gremlin from Java Error

2011-02-21 Thread Marko Rodriguez
Hi,

outE(String) is in Gremlin 0.8-SNAPSHOT, not Gremlin 0.7.

Note at the bottom of the main Wiki page: "Gremlin documentation is up to date 
with the current Gremlin codebase, not with the latest Gremlin release."

I do this so it's easier for me to maintain documentation. The documentation for 
Gremlin 0.7 is in the doc/wiki directory of the distribution.

If you still want to use Gremlin 0.7, do this:

engine.eval("v.outE[[label='KNOWS']].inV >> results");

Finally, I would recommend using Groovy in your codebase instead of the JSR 223 
ScriptEngine. Groovy and Java work seamlessly together and it's so much 
handier/cleaner/faster than going through JSR 223. See:
https://github.com/tinkerpop/gremlin/wiki/Using-Gremlin-through-Groovy

Hope that helps,
Marko.

http://markorodriguez.com


On Feb 21, 2011, at 9:15 AM, Alfredas Chmieliauskas wrote:

> Dear all,
> 
> I have the following code:
> 
> ScriptEngine engine = new GremlinScriptEngineFactory().getScriptEngine();
> List results = new ArrayList();
> engine.getBindings(ScriptContext.ENGINE_SCOPE).put("g", getGraph());
> engine.getBindings(ScriptContext.ENGINE_SCOPE).put("v", getVertex(startNode));
> engine.getBindings(ScriptContext.ENGINE_SCOPE).put("results", results);
> try {
>engine.eval("v.outE('KNOWS').inV >> results");
> } catch (ScriptException e) {
>logger.error(e.getMessage(), e);
> }
> 
> produces the following error:
> 
> 
> ERROR javax.script.ScriptException:
> groovy.lang.MissingMethodException: No signature of method:
> com.tinkerpop.blueprints.pgm.impls.neo4j.Neo4jVertex.outE() is
> applicable for argument types: (java.lang.String) values: [KNOWS]
> Possible solutions: outE(groovy.lang.Closure), dump(),
> use([Ljava.lang.Object;), getAt(java.lang.String),
> getAt(java.lang.String), with(groovy.lang.Closure)
> javax.script.ScriptException: javax.script.ScriptException:
> groovy.lang.MissingMethodException: No signature of method:
> com.tinkerpop.blueprints.pgm.impls.neo4j.Neo4jVertex.outE() is
> applicable for argument types: (java.lang.String) values: [KNOWS]
> Possible solutions: outE(groovy.lang.Closure), dump(),
> use([Ljava.lang.Object;), getAt(java.lang.String),
> getAt(java.lang.String), with(groovy.lang.Closure)
>   at 
> org.codehaus.groovy.jsr223.GroovyScriptEngineImpl.eval(GroovyScriptEngineImpl.java:117)
>   at 
> com.tinkerpop.gremlin.jsr223.GremlinScriptEngine.eval(GremlinScriptEngine.java:36)
>   at javax.script.AbstractScriptEngine.eval(AbstractScriptEngine.java:247)
>   at 
> com.tinkerpop.gremlin.jsr223.GremlinScriptEngine.eval(GremlinScriptEngine.java:32)
>   at 
> alfredas.springdatagraph.template.domain.AbstractRepository.findAllByGremlin2(AbstractRepository.java:94)
>   at alfredas.springdatagraph.template.App.(App.java:84)
>   at alfredas.springdatagraph.template.App.main(App.java:93)
> Caused by: javax.script.ScriptException:
> groovy.lang.MissingMethodException: No signature of method:
> com.tinkerpop.blueprints.pgm.impls.neo4j.Neo4jVertex.outE() is
> applicable for argument types: (java.lang.String) values: [KNOWS]
> Possible solutions: outE(groovy.lang.Closure), dump(),
> use([Ljava.lang.Object;), getAt(java.lang.String),
> getAt(java.lang.String), with(groovy.lang.Closure)
>   at 
> org.codehaus.groovy.jsr223.GroovyScriptEngineImpl.eval(GroovyScriptEngineImpl.java:318)
>   at 
> org.codehaus.groovy.jsr223.GroovyScriptEngineImpl.eval(GroovyScriptEngineImpl.java:111)
>   ... 6 more
> Caused by: groovy.lang.MissingMethodException: No signature of method:
> com.tinkerpop.blueprints.pgm.impls.neo4j.Neo4jVertex.outE() is
> applicable for argument types: (java.lang.String) values: [KNOWS]
> Possible solutions: outE(groovy.lang.Closure), dump(),
> use([Ljava.lang.Object;), getAt(java.lang.String),
> getAt(java.lang.String), with(groovy.lang.Closure)
>   at 
> org.codehaus.groovy.runtime.ScriptBytecodeAdapter.unwrap(ScriptBytecodeAdapter.java:54)
>   at 
> org.codehaus.groovy.runtime.callsite.PojoMetaClassSite.call(PojoMetaClassSite.java:46)
>   at 
> org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:40)
>   at 
> org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:116)
>   at 
> org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:124)
>   at Script1.run(Script1.groovy:43)
>   at 
> org.codehaus.groovy.jsr223.GroovyScriptEngineImpl.eval(GroovyScriptEngineImpl.java:315)
>   ... 7 more
> 
> 
> And the alternative method:
> 
> Gremlin.compile("outE('KNOWS').inV");
> 
> gives:
> 
> Exception in thread "main" groovy.lang.MissingMethodException: No
> signature of method: Script1.outE() is applicable for argument types:
> (java.lang.String) values: [KNOWS]
> Possible solutions: run(), run(), dump(), use([Ljava.lang.Object;),
> putAt(java.lang.String, java.lang.Object), with(groovy.lang.Closure)
>   at 
> org.codehaus.groovy.runtime.ScriptBytecodeAdapter.un

[Neo4j] Gremlin from Java Error

2011-02-21 Thread Alfredas Chmieliauskas
Dear all,

I have the following code:

ScriptEngine engine = new GremlinScriptEngineFactory().getScriptEngine();
List results = new ArrayList();
engine.getBindings(ScriptContext.ENGINE_SCOPE).put("g", getGraph());
engine.getBindings(ScriptContext.ENGINE_SCOPE).put("v", getVertex(startNode));
engine.getBindings(ScriptContext.ENGINE_SCOPE).put("results", results);
try {
engine.eval("v.outE('KNOWS').inV >> results");
} catch (ScriptException e) {
logger.error(e.getMessage(), e);
}

produces the following error:


ERROR javax.script.ScriptException:
groovy.lang.MissingMethodException: No signature of method:
com.tinkerpop.blueprints.pgm.impls.neo4j.Neo4jVertex.outE() is
applicable for argument types: (java.lang.String) values: [KNOWS]
Possible solutions: outE(groovy.lang.Closure), dump(),
use([Ljava.lang.Object;), getAt(java.lang.String),
getAt(java.lang.String), with(groovy.lang.Closure)
javax.script.ScriptException: javax.script.ScriptException:
groovy.lang.MissingMethodException: No signature of method:
com.tinkerpop.blueprints.pgm.impls.neo4j.Neo4jVertex.outE() is
applicable for argument types: (java.lang.String) values: [KNOWS]
Possible solutions: outE(groovy.lang.Closure), dump(),
use([Ljava.lang.Object;), getAt(java.lang.String),
getAt(java.lang.String), with(groovy.lang.Closure)
   at 
org.codehaus.groovy.jsr223.GroovyScriptEngineImpl.eval(GroovyScriptEngineImpl.java:117)
   at 
com.tinkerpop.gremlin.jsr223.GremlinScriptEngine.eval(GremlinScriptEngine.java:36)
   at javax.script.AbstractScriptEngine.eval(AbstractScriptEngine.java:247)
   at 
com.tinkerpop.gremlin.jsr223.GremlinScriptEngine.eval(GremlinScriptEngine.java:32)
   at 
alfredas.springdatagraph.template.domain.AbstractRepository.findAllByGremlin2(AbstractRepository.java:94)
   at alfredas.springdatagraph.template.App.(App.java:84)
   at alfredas.springdatagraph.template.App.main(App.java:93)
Caused by: javax.script.ScriptException:
groovy.lang.MissingMethodException: No signature of method:
com.tinkerpop.blueprints.pgm.impls.neo4j.Neo4jVertex.outE() is
applicable for argument types: (java.lang.String) values: [KNOWS]
Possible solutions: outE(groovy.lang.Closure), dump(),
use([Ljava.lang.Object;), getAt(java.lang.String),
getAt(java.lang.String), with(groovy.lang.Closure)
   at 
org.codehaus.groovy.jsr223.GroovyScriptEngineImpl.eval(GroovyScriptEngineImpl.java:318)
   at 
org.codehaus.groovy.jsr223.GroovyScriptEngineImpl.eval(GroovyScriptEngineImpl.java:111)
   ... 6 more
Caused by: groovy.lang.MissingMethodException: No signature of method:
com.tinkerpop.blueprints.pgm.impls.neo4j.Neo4jVertex.outE() is
applicable for argument types: (java.lang.String) values: [KNOWS]
Possible solutions: outE(groovy.lang.Closure), dump(),
use([Ljava.lang.Object;), getAt(java.lang.String),
getAt(java.lang.String), with(groovy.lang.Closure)
   at 
org.codehaus.groovy.runtime.ScriptBytecodeAdapter.unwrap(ScriptBytecodeAdapter.java:54)
   at 
org.codehaus.groovy.runtime.callsite.PojoMetaClassSite.call(PojoMetaClassSite.java:46)
   at 
org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:40)
   at 
org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:116)
   at 
org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:124)
   at Script1.run(Script1.groovy:43)
   at 
org.codehaus.groovy.jsr223.GroovyScriptEngineImpl.eval(GroovyScriptEngineImpl.java:315)
   ... 7 more


And the alternative method:

Gremlin.compile("outE('KNOWS').inV");

gives:

Exception in thread "main" groovy.lang.MissingMethodException: No
signature of method: Script1.outE() is applicable for argument types:
(java.lang.String) values: [KNOWS]
Possible solutions: run(), run(), dump(), use([Ljava.lang.Object;),
putAt(java.lang.String, java.lang.Object), with(groovy.lang.Closure)
   at 
org.codehaus.groovy.runtime.ScriptBytecodeAdapter.unwrap(ScriptBytecodeAdapter.java:54)
   at 
org.codehaus.groovy.runtime.callsite.PogoMetaClassSite.callCurrent(PogoMetaClassSite.java:78)
   at 
org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCallCurrent(CallSiteArray.java:44)
   at 
org.codehaus.groovy.runtime.callsite.AbstractCallSite.callCurrent(AbstractCallSite.java:141)
   at 
org.codehaus.groovy.runtime.callsite.AbstractCallSite.callCurrent(AbstractCallSite.java:149)
   at Script1.run(Script1.groovy:22)
   at groovy.lang.GroovyShell.evaluate(GroovyShell.java:576)
   at groovy.lang.GroovyShell.evaluate(GroovyShell.java:614)
   at groovy.lang.GroovyShell.evaluate(GroovyShell.java:585)
   at groovy.lang.GroovyShell$evaluate.call(Unknown Source)
   at 
org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:40)
   at 
org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:116)
   at 
org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(

Re: [Neo4j] Social Networks And Graph Databases

2011-02-21 Thread Jim Webber
Yup, you nailed it better than I did Rick.

Though your partition strategy might not be just "per user." For example, in the 
geo domain it makes sense to route requests for particular cities to specific 
nodes. How you generate your routing rules will depend on your application.

Jim
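
A rough sketch of such a routing rule, where every instance is a full HA
replica but requests for the same customer always land on the same instance so
its caches stay warm for that subset. The first-letter rule ("A" to instance
1 ... "Z" to instance 26) is the one from this thread; the instance URL list
and the fallback hash are my own illustrative choices:

```java
import java.util.List;

// Cache-affinity router sketch: deterministic request routing so each
// replica serves a stable subset of customers. URLs are illustrative.
public class CacheAffinityRouter {
    private final List<String> instanceUrls;

    public CacheAffinityRouter(List<String> instanceUrls) {
        this.instanceUrls = instanceUrls;
    }

    // "A" -> instance 0, ..., "Z" -> instance 25; hash anything else.
    public String instanceFor(String customerName) {
        char c = Character.toUpperCase(customerName.charAt(0));
        int idx = (c >= 'A' && c <= 'Z')
                ? (c - 'A') % instanceUrls.size()
                : Math.floorMod(customerName.hashCode(), instanceUrls.size());
        return instanceUrls.get(idx);
    }
}
```

The key property is only that the mapping is stable, not what it keys on; a
geo application might key on city instead of customer name.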

On 21 Feb 2011, at 14:51, Michael Hunger wrote:

> You shouldn't be confused because you got it right :)
> 
> Cheers
> 
> Michael
> 
> Am 21.02.2011 um 15:40 schrieb Rick Otten:
> 
>> Ok, I'm following this discussion, and now I'm confused.
>> 
>> My understanding was that the (potentially very large) database is
>> replicated across all instances.
>> 
>> If someone needed to traverse to something that wasn't cached, they'd take
>> a performance hit, but still be able to get to it.
>> 
>> I had understood the idea behind the load balancing is to minimize
>> traversals out of cache by grouping similar sets of users on a particular
>> server.  (That way you don't need a ton of RAM to stash everything in the
>> database, just the most frequently accessed nodes and relationships
>> associated with a subset of the users.)
>> 
>> 
>> 
>> 
>>> Hello JT,
>>> 
 One thing, when you say route requests to specific instances .. does
 that
 imply that node relationships can't span instances ?
>>> 
>>> Yes that's right. What I'm suggesting here is that each instance is a full
>>> replica that works on a subset of requests which are likely to keep the
>>> caches warm.
>>> 
>>> So if you can split your requests (e.g all customers beginning with "A" go
>>> to instance "1" ... all customers beginning with "Z" go to instance "26"),
>>> they will benefit from having warm caches for reading, while the HA
>>> infrastructure deals with updates across instances transactionally.
>>> 
>>> Jim
>> 
>> 
>> -- 
>> Rick Otten
>> rot...@windfish.net
>> O=='=+
>> 
>> 
> 



Re: [Neo4j] Social Networks And Graph Databases

2011-02-21 Thread Michael Hunger
You shouldn't be confused because you got it right :)

Cheers

Michael

Am 21.02.2011 um 15:40 schrieb Rick Otten:

> Ok, I'm following this discussion, and now I'm confused.
> 
> My understanding was that the (potentially very large) database is
> replicated across all instances.
> 
> If someone needed to traverse to something that wasn't cached, they'd take
> a performance hit, but still be able to get to it.
> 
> I had understood the idea behind the load balancing is to minimize
> traversals out of cache by grouping similar sets of users on a particular
> server.  (That way you don't need a ton of RAM to stash everything in the
> database, just the most frequently accessed nodes and relationships
> associated with a subset of the users.)
> 
> 
> 
> 
>> Hello JT,
>> 
>>> One thing, when you say route requests to specific instances .. does
>>> that
>>> imply that node relationships can't span instances ?
>> 
>> Yes that's right. What I'm suggesting here is that each instance is a full
>> replica that works on a subset of requests which are likely to keep the
>> caches warm.
>> 
>> So if you can split your requests (e.g all customers beginning with "A" go
>> to instance "1" ... all customers beginning with "Z" go to instance "26"),
>> they will benefit from having warm caches for reading, while the HA
>> infrastructure deals with updates across instances transactionally.
>> 
>> Jim
> 
> 
> -- 
> Rick Otten
> rot...@windfish.net
> O=='=+
> 
> 



Re: [Neo4j] Social Networks And Graph Databases

2011-02-21 Thread Rick Otten
Ok, I'm following this discussion, and now I'm confused.

My understanding was that the (potentially very large) database is
replicated across all instances.

If someone needed to traverse to something that wasn't cached, they'd take
a performance hit, but still be able to get to it.

I had understood the idea behind the load balancing is to minimize
traversals out of cache by grouping similar sets of users on a particular
server.  (That way you don't need a ton of RAM to stash everything in the
database, just the most frequently accessed nodes and relationships
associated with a subset of the users.)




> Hello JT,
>
>> One thing, when you say route requests to specific instances .. does
>> that
>> imply that node relationships can't span instances ?
>
> Yes that's right. What I'm suggesting here is that each instance is a full
> replica that works on a subset of requests which are likely to keep the
> caches warm.
>
> So if you can split your requests (e.g. all customers beginning with "A" go
> to instance "1" ... all customers beginning with "Z" go to instance "26"),
> they will benefit from having warm caches for reading, while the HA
> infrastructure deals with updates across instances transactionally.
>
> Jim
>


-- 
Rick Otten
rot...@windfish.net
O=='=+




Re: [Neo4j] Batch Inserter - db scaling issue (not index scaling issue)

2011-02-21 Thread Johan Svensson
Mark,

I had a look at this and you try to inject 130M relationships with a
relationship store configured to 700M. That will not be an efficient
insert. If your relationships and data are not sorted the batch
inserter would have to unload and load blocks of data as soon as you
get over around 22M relationships. To inject 130M relationships at
full speed with random connections would require around 4G for the
relationship store.

-Johan
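Johan's figures check out with back-of-envelope arithmetic, assuming the roughly 33-byte relationship record of the Neo4j store format of this era (treat the exact record size as an assumption):

```java
// Back-of-envelope check of the relationship-store sizing above, assuming
// a ~33-byte relationship record (an assumption about the store format of
// this era, not an official constant).
public class RelStoreSizing {
    static final long RECORD_BYTES = 33;

    /** Bytes of mapped memory needed to keep N relationship records in memory. */
    static long bytesFor(long relationships) {
        return relationships * RECORD_BYTES;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024;
        // 700M of mapped memory holds roughly 22M records...
        System.out.println(700 * mb / RECORD_BYTES);     // ~22.2 million
        // ...while 130M relationships need roughly 4G.
        System.out.println(bytesFor(130_000_000L) / mb); // ~4091 MB
    }
}
```

So a 700M mapped-memory setting caps full-speed random insertion at around 22M relationships, and 130M relationships need on the order of 4G, matching the numbers in the message above.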

On Fri, Feb 18, 2011 at 8:07 AM, Mark @ Gmail  wrote:
> Hi Johan and others
>>>I am having a hard time following what the problems really are since the
>>>conversation is split up in several threads
> My fault, sorry. I was replying to a message posted before I subscribed to
> the list, so I didn't have the original poster's email.
>
>>>as I understand it you are saying that it is the index lookups that are
>>>taking too long?
>
> In your current implementation, "Yes" - in the indexing implementation I 
> provide on that Google code project there is no performance issue.
> However, fixing the Lucene indexing issue only reveals that the
> *database* is now the bottleneck; it blows up after 30 million edge inserts.
> That is now the issue here.
>
> See the test results here : 
> http://code.google.com/p/graphdb-load-tester/wiki/TestResults
>
>>>For example inserting 500M relationships
>>>requiring 1B index lookups (one for each node) with an avg index
>>>lookup time of 1ms is 11 days worth of index lookup time.
> That is why I suggested to Peter when he asked for help with indexing that a 
> Bloom filter helps "know what you don't know" and an LRU Cache helps hang 
> onto popular nodes. These are in my implementation and both avoid reads.
> Re your suggestion about avoiding indexes by inserting in batches - I can't
> see how that will help: you can sort the input data by "from" node key or by "to"
> node key, but you will not necessarily end up with the node pairs joined by
> edges conveniently located in the same batch, so you will still need an index
> service to add any edges. But as I say, this is fixed in my implementation,
> and indexing is not the remaining issue - the database is.
> I do encourage you to try run it.
>
> Cheers,
> Mark
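The "Bloom filter plus LRU cache" combination Mark mentions can be sketched in a few lines of plain Java. This is an illustration of the idea only, not code from his graphdb-load-tester project; the class name and hash mixing are made up:

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of the two read-avoidance tricks: a Bloom filter answers
// "definitely never inserted" without touching the index, and an LRU cache
// keeps node ids of recently seen keys in memory, so most lookups never
// reach the (slow) index at all.
public class NodeKeyCache {
    private final BitSet bloom;
    private final int bloomBits;
    private final LinkedHashMap<String, Long> lru;

    public NodeKeyCache(int bloomBits, final int lruCapacity) {
        this.bloomBits = bloomBits;
        this.bloom = new BitSet(bloomBits);
        // access-order LinkedHashMap evicting the eldest entry = simple LRU
        this.lru = new LinkedHashMap<String, Long>(lruCapacity, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Long> eldest) {
                return size() > lruCapacity;
            }
        };
    }

    private int bit(String key, int seed) {
        int h = key.hashCode() ^ (seed * 0x9E3779B9); // crude per-seed mixing
        return Math.floorMod(h, bloomBits);
    }

    /** False means the key was certainly never recorded: skip the index lookup. */
    public boolean mightContain(String key) {
        return bloom.get(bit(key, 1)) && bloom.get(bit(key, 2));
    }

    public void record(String key, long nodeId) {
        bloom.set(bit(key, 1));
        bloom.set(bit(key, 2));
        lru.put(key, nodeId);
    }

    /** Cached node id, or null: fall through to the index. */
    public Long cachedNodeId(String key) {
        return lru.get(key);
    }

    public static void main(String[] args) {
        NodeKeyCache cache = new NodeKeyCache(1 << 20, 10_000);
        cache.record("node-key-42", 42L);
        System.out.println(cache.mightContain("node-key-42")); // true
        System.out.println(cache.cachedNodeId("node-key-42")); // 42
    }
}
```

The Bloom filter turns the common "this node key has never been seen" case into a pure in-memory bit test, and the LRU cache serves the popular keys; only the remainder pays for a real index read.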


Re: [Neo4j] Online backup works in full but not incremental

2011-02-21 Thread Mattias Persson
I just can't reproduce your problems. How are you doing the backup? It looks
(from your wrapper.log) like it's the server that somehow requests to
do the incremental backup. Aren't you backing up the server database?

2011/2/21 Brendan Cheng 

> Tobias,
>
> Here s the stack trace and part of the wrapper.log:
> run:
> Mon Feb 21 10:28:59 CST 2011: Client connected to 192.168.1.101:6362
> Mon Feb 21 10:29:00 CST 2011: Opened a new channel to /192.168.1.101:6362
> 21-Feb-2011 10:29:20 itags.Sync.SyncData main
> SEVERE: null
> org.neo4j.com.ComException: org.neo4j.com.ComException:
> org.jboss.netty.handler.queue.BlockingReadTimeoutException
>at org.neo4j.com.Client.sendRequest(Client.java:183)
>at
> org.neo4j.com.backup.BackupClient.incrementalBackup(BackupClient.java:48)
>at
> org.neo4j.com.backup.OnlineBackup.incremental(OnlineBackup.java:115)
>at
> org.neo4j.com.backup.OnlineBackup.incremental(OnlineBackup.java:105)
>at itags.Sync.SyncData.RunningOnlineBackup(SyncData.java:57)
>at itags.Sync.SyncData.main(SyncData.java:247)
> Caused by: org.neo4j.com.ComException:
> org.jboss.netty.handler.queue.BlockingReadTimeoutException
>at
> org.neo4j.com.DechunkingChannelBuffer.readNext(DechunkingChannelBuffer.java:61)
>at org.neo4j.com.Client$2.readNext(Client.java:155)
>at
> org.neo4j.com.DechunkingChannelBuffer.readNextChunk(DechunkingChannelBuffer.java:79)
>at
> org.neo4j.com.DechunkingChannelBuffer.(DechunkingChannelBuffer.java:50)
>at org.neo4j.com.Client$2.(Client.java:151)
>at org.neo4j.com.Client.sendRequest(Client.java:150)
>... 5 more
> Caused by: org.jboss.netty.handler.queue.BlockingReadTimeoutException
>at
> org.jboss.netty.handler.queue.BlockingReadHandler.readEvent(BlockingReadHandler.java:236)
>at
> org.jboss.netty.handler.queue.BlockingReadHandler.read(BlockingReadHandler.java:167)
>at
> org.neo4j.com.DechunkingChannelBuffer.readNext(DechunkingChannelBuffer.java:57)
>... 10 more
>
> Do I miss any configuration?
>
> cheers,
>
> Brendan
>
> > Date: Mon, 21 Feb 2011 01:11:19 +0800
> > From: Brendan 
> > Subject: [Neo4j] Online backup works in full but not incremental
> > To: "user@lists.neo4j.org" 
> > Message-ID: 
> > Content-Type: text/plain;   charset=us-ascii
> >
> > Hi,
> >
> > After I installed neo4j on the Ubuntu server I'm able to back up the full
> database, even repeatedly, but it crashed on incremental.
> >
> > I also fixed the problem with webadmin page loading, but I found 2 issues:
> (1) the timeline graph cannot be loaded in IE 8; (2) I experienced a stack
> overflow on the webadmin page while the full online backup was running. Also,
> the Save changes command issued from webadmin was unresponsive.
> >
> > I then tried webadmin from Chrome and I saw the timeline graph, but the
> add-relationship panel loaded without being functional; I can't enter the node ID
> URL.
> >
> > Which browser is most compatible with webadmin?
> >
> > Your comment is highly appreciated!
> >
> > Brendan
> >
> > Sent from my iPad
> > Date: Sun, 20 Feb 2011 19:07:02 +0100
> > From: Tobias Ivarsson 
> > Subject: Re: [Neo4j] Online backup works in full but not incremental
> > To: Neo4j user discussions 
> > Message-ID:
> >
> > Content-Type: text/plain; charset=ISO-8859-1
> >
> > On Sun, Feb 20, 2011 at 6:11 PM, Brendan  wrote:
> >
> >> Hi,
> >>
> >> After I install the neo4j on ubuntu server I'm able to backup the full
> >> database, even repeatedly but it crashed on incremental.
> >>
> >
> > Could you please provide a stacktrace, and other kinds of error output
> from
> > this crash.
> >
> > Thank you,
> > --
> > Tobias Ivarsson 
> > Hacker, Neo Technology
> > www.neotechnology.com
> > Cellphone: +46 706 534857
> >
>
>
>


-- 
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com


Re: [Neo4j] Neovigator Weekend Project

2011-02-21 Thread Mattias Persson
Pretty cool !

2011/2/21 Tobias Ivarsson 

> Nice stuff!
>
> On Mon, Feb 21, 2011 at 6:18 AM, Max De Marzi Jr.  wrote:
>
> > Guys,
> >
> > So I ran into the Ask Ken project ( http://askken.heroku.com/ ) by
> > Michael Aufreiter yesterday, and thought it was pretty awesome... so I
> > ported it to using Neo4j.
> >
> > Check it out: http://neovigator.heroku.com/
> >
> > On github at https://github.com/maxdemarzi/neovigator
> >
> > Regards,
> > Max
> >
>
>
>
> --
> Tobias Ivarsson 
> Hacker, Neo Technology
> www.neotechnology.com
> Cellphone: +46 706 534857
>



-- 
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com


Re: [Neo4j] Root Node

2011-02-21 Thread Mattias Persson
2011/2/20 Michael Hunger 

> Yep. Just think in graph index :)
>
> To clarify: in-graph index, the "natural" index that is the graph itself.

>
> Michael
>
> Am 20.02.2011 um 22:26 schrieb Mark Nijhof:
>
> > Ah right, so I could connect via a relationship my type nodes to this
> > reference node (atm I was using an index to get to them).
> >
> > -Mark
> >
> >
> >
> > On 20. feb. 2011, at 22:24, Michael Hunger
> >  wrote:
> >
> >> One purpose of the reference node is that you don't have to rely on
> indexing for getting to certain nodes.
> >>
> >> If you connect your nodes to the reference node in a way that puts them
> in certain categories you can always get to them via traversal.
> >> Connections to the reference node are also used for visualization (e.g.
> for Neoclipse)
> >>
> >> HTH
> >>
> >> Michael
> >>
> >> From the docs of the method:
> >> Node getReferenceNode()
> >> Returns the reference node, which is a "starting point" in the node
> space. Usually, a client attaches relationships to this node that leads into
> various parts of the node space. For more information about common node
> space organizational patterns, see the design guide at
> http://wiki.neo4j.org/content/Design_Guide.
> >>
> >> Specifically: http://wiki.neo4j.org/content/Design_Guide#Subreferences
> >>
> >> Am 20.02.2011 um 21:19 schrieb Mark Nijhof:
> >>
> >>> Hi,
> >>>
> >>> Silly question perhaps, but what is the purpose of the root node? Why
> would
> >>> I want to get it?
> >>>
> >>> -Mark
> >>>
> >>> --
> >>> Mark Nijhof
> >>> m: 0047 95 00 99 37
> >>> e:  mark.nij...@cre8ivethought.com
> >>> b:  cre8ivethought.com/blog/index
> >>>
> >>>
> >>>
> >>> "Walking on water and developing software from a specification are easy
> if
> >>> both are frozen."
> >>>
> >>> -- Edward V Berard
>
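The sub-reference pattern from the Design Guide discussed above can be modeled abstractly: the reference node links to one sub-reference node per category, and category members hang off that, so a two-hop traversal replaces an index lookup. The toy Node class below is purely illustrative plain Java, not the Neo4j API:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of the sub-reference pattern: reference node -> category
// sub-reference node -> members. This is NOT the Neo4j API; it only shows
// the shape of the "in-graph index".
public class SubRefModel {
    static class Node {
        final String name;
        final Map<String, List<Node>> rels = new HashMap<>();
        Node(String name) { this.name = name; }
        void relateTo(String type, Node other) {
            rels.computeIfAbsent(type, t -> new ArrayList<>()).add(other);
        }
        List<Node> traverse(String type) {
            return rels.getOrDefault(type, List.of());
        }
    }

    public static void main(String[] args) {
        Node reference = new Node("reference");  // the fixed starting point
        Node users = new Node("USERS_SUBREF");   // category sub-reference node
        reference.relateTo("USERS", users);

        Node alice = new Node("alice");
        users.relateTo("MEMBER", alice);

        // No index lookup needed: reach all users via two traversal hops.
        for (Node subref : reference.traverse("USERS"))
            for (Node user : subref.traverse("MEMBER"))
                System.out.println(user.name); // prints "alice"
    }
}
```

In real Neo4j code the same shape is built with the reference node from the graph database and ordinary relationships, which is exactly why Michael calls it the "natural" index.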



-- 
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com


Re: [Neo4j] Lucene fulltext index batch inserter Lock obtain problem

2011-02-21 Thread Mattias Persson
2011/2/21 Shae Griffiths 

>
> Hi guys,
>
> I'm trying to use a LuceneFulltextIndexBatchInserter to index all my data
> as
> I import it, so I can search on properties other than just an ID, but it
> fairly quickly (~5 seconds) comes back with an
> "org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out:
> NativeFSLock.write.lock" exception.
>

Are you using multiple threads during batch insertion (which is not allowed, b.t.w.)?


>
> When searching I found a bug report that looked like this from late 2008
> that said it was fixed, so maybe i'm doing something wrong, is this a
> fairly
> common error?
>

Nope, can't say that it is.

>
> I'm thinking that maybe my problem lies in the fact i'm indexing based on
> properties that may not exist on all nodes. eg. My graph is made of
> entities
> such as people and cars, a person has a first and last name, but does not
> have a registration number, and for a car it would be the reverse.
>

That doesn't matter, the index can handle such data.

>
> Then using a regular FulltextIndex (not a batch one) it seems to run fine,
> but after a while i get a gc overhead limit exceeded error :(
>

Weird... I would like to see your code to get more ideas to what it might
be.

Another thing: you seem to be using the old index framework. Please consider the new
one  instead. Heck, it might
even get rid of your problems too :)


>
> Any ideas?
>
> Shae
> --
> View this message in context:
> http://neo4j-user-list.438527.n3.nabble.com/Neo4j-Lucene-fulltext-index-batch-inserter-Lock-obtain-problem-tp2542866p2542866.html
> Sent from the Neo4J User List mailing list archive at Nabble.com.
>



-- 
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com


Re: [Neo4j] Spring data and Pipes

2011-02-21 Thread Alfredas Chmieliauskas
Thanks.

On Sun, Feb 20, 2011 at 10:18 PM, Michael Hunger
 wrote:
> Alfredas,
>
> very interesting ideas. I think that kind of repository would have to be 
> built in your application, _but_ we should provide the building blocks so 
> that you can easily put it together.

Yep. I certainly feel that the methods I'm trying to assemble could be
part of the generic repo provided.

>
> I'm thinking about providing support for Repositories in the Hades/Spring JPA 
> style where you get a generic repository as base with additional 
> finder-method-name parsing and whatever you add to that
> as a kind of custom extension gets exposed in your repo too.

Yes. Exactly.
The way I'm providing the generic methods right now is in the fashion of:
 List findByTraversalDescription(TraversalDescription td)
 List findByPipeline(Pipeline pipe)
 List findBySparql(String query)


>
> So if you're interested in this we can make this a use case for the M4 
> release of Spring-Data-Graph and experiment with that.
Sure. That would be very interesting.

>
> I'm currently working on the Neo4jTemplate API for Spring-Data-Graph, if you 
> (or anyone else) would like to look at that and provide feedback, I'd be 
> happy.
>
> http://gist.github.com/835408

Will do.
>
> I think you spotted a least issue with the relateTo. Currently it assumes 
> that you relate the father -> OUTGOING -> child. (Would you please try 
> child.relateTo(father) for incoming rels and communicate the results).
>

The child.relateTo(father) when child --outgoing--> father works fine.
It seems that one should rather just use collections (add/remove)
instead of relateTo as it might be confusing. Is there any reason to
provide that method?

> That is because it doesn't rely on existing annotated relationships when doing
> that operation. I could extend it to look for such annotated fields and
> use the values given there.
> But then you can also just add the child to the collection to create the 
> relationship.
>
>
> Cheers Michael
>


>
> Am 20.02.2011 um 21:22 schrieb Alfredas Chmieliauskas:
>
>> On Sun, Feb 20, 2011 at 8:20 PM, Michael Hunger
>>  wrote:
>>> all finders map back to the underlying graph so it doesnt matter which one 
>>> to use.
>>>
>>> one could still add a toVertex wrapper to your domain objects to have them 
>>> be usable in tinkerpop
>>>
>>> what is the usecase you want to achieve with the combination of spring data 
>>> graph and tinkerpop
>>>
>> So far I'm just experimenting with different ways query the graph :-)
>> In the end I hope to build an "AbstractRepository" class that would
>> allow me to find things in the same underlying graph using
>> TraversalDescriptions, Pipes, Gremlin or even SPARQL.
>> Having that would add a lot of flexibility in writing domain methods.
>>
>>> glad that spring data graph works well for you if you have any feedback or 
>>> issues just ping me
>>
>> Yes. Spring data is very interesting. Although I am still trying to
>> understand how to query a graph of heterogeneous nodes and
>> relationships and discover patterns.
>>
>> A quick/trick question:
>> I noticed that father.relateTo(child, RelationshipTypes.PARENT) works
>> only if father has Direction.BOTH or Direction.OUTGOING; and
>>
>> @RelatedTo(type = "PARENT", elementClass = Person.class, direction =
>> Direction.INCOMING)
>> private Set children;
>> 
>> father.relateTo(child, RelationshipTypes.PARENT);
>> 
>> father.getChildren()
>>
>> would return empty in case of  Direction.INCOMING
>>
>> this might be confusing!
>>
>> Thanks,
>>
>> Alfredas


Re: [Neo4j] Exception creating EmbeddedGraphDatabase from a Servlet

2011-02-21 Thread Pablo Pareja
Hi Jim,

I already did it, I'm using a constant defined in the code for the DB folder,
which is the same one
I used for testing things with a really simple jar.
I also tried changing the permissions for every DB file granting every kind
of permission to any kind of user (I know that's
kind of crazy but just wanted to make sure it didn't have anything to do
with that...)

Pablo


On Mon, Feb 21, 2011 at 11:00 AM, Jim Webber  wrote:

> Hi Pablo,
>
> This caught my eye in your stacktrace: Unable to create directory path[]
> for Neo4j
>
> Can you confirm that you have provided the right path for your database
> into your Jetty app?
>
> Jim
>
>
>



-- 
Pablo Pareja Tobes
LinkedInhttp://www.linkedin.com/in/pabloparejatobes
Twitter   http://www.twitter.com/pablopareja

http://about.me/pablopareja
http://www.ohnosequences.com


Re: [Neo4j] Exception creating EmbeddedGraphDatabase from a Servlet

2011-02-21 Thread Jim Webber
Hi Pablo,

This caught my eye in your stacktrace: Unable to create directory path[] for 
Neo4j

Can you confirm that you have provided the right path for your database into 
your Jetty app?
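A defensive way to avoid the "Unable to create directory path[]" failure Jim points at is to resolve and validate the store directory before handing it to the database. The helper below is a hypothetical sketch (the class name and system property are made-up examples, not Neo4j or Jetty API):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Fail fast if the configured store directory is blank or missing, instead
// of letting the kernel die later with "Unable to create directory path[]".
// The property name "graphdb.path" is a made-up example.
public class StoreDirResolver {
    public static Path resolve(String configured) throws IOException {
        if (configured == null || configured.trim().isEmpty()) {
            throw new IllegalStateException(
                "Neo4j store directory is not configured (got: '" + configured + "')");
        }
        Path dir = Paths.get(configured).toAbsolutePath();
        // Idempotent; also surfaces permission problems before the DB starts.
        Files.createDirectories(dir);
        return dir;
    }

    public static void main(String[] args) throws IOException {
        String configured = System.getProperty("graphdb.path", "target/graph.db");
        System.out.println(resolve(configured));
    }
}
```

In a servlet container the working directory is often not what a standalone jar sees, so resolving to an absolute path (or deriving it from the servlet context) before constructing the embedded database makes the empty-path case an explicit, early error.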

Jim




[Neo4j] Exception creating EmbeddedGraphDatabase from a Servlet

2011-02-21 Thread Pablo Pareja
Hi!

I'm having some trouble getting Neo4j DB to work from servlets in my Jetty
server.
There's no problem when I open the DB from a simple jar program file;
however, when
I put the same code inside a servlet I get this exception:

org.neo4j.graphdb.TransactionFailureException: Could not create data source
[nioneodb], see nested exception for cause of error
at
org.neo4j.kernel.impl.transaction.TxModule.registerDataSource(TxModule.java:154)
at org.neo4j.kernel.GraphDbInstance.start(GraphDbInstance.java:107)
at
org.neo4j.kernel.EmbeddedGraphDbImpl.(EmbeddedGraphDbImpl.java:168)
at
org.neo4j.kernel.EmbeddedGraphDatabase.(EmbeddedGraphDatabase.java:81)
at
org.neo4j.kernel.EmbeddedGraphDatabase.(EmbeddedGraphDatabase.java:65)
at
com.era7.bioinfo.bioinfoneo4j.Neo4jManager.(Neo4jManager.java:28)
at
com.era7.bioinfo.servletlibraryneo4j.servlet.BasicServletNeo4j.servletLogic(BasicServletNeo4j.java:138)
at
com.era7.bioinfo.servletlibraryneo4j.servlet.BasicServletNeo4j.doGet(BasicServletNeo4j.java:285)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at
org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:530)
at
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:426)
at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:119)
at
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:457)
at
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:229)
at
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:931)
at
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:361)
at
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:186)
at
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:867)
at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
at
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:245)
at
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
at
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:113)
at org.eclipse.jetty.server.Server.handle(Server.java:337)
at
org.eclipse.jetty.server.HttpConnection.handleRequest(HttpConnection.java:581)
at
org.eclipse.jetty.server.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:1005)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:560)
at
org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:222)
at
org.eclipse.jetty.server.HttpConnection.handle(HttpConnection.java:417)
at
org.eclipse.jetty.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:474)
at
org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:437)
at java.lang.Thread.run(Thread.java:636)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
Method)
at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:532)
at
org.neo4j.kernel.impl.transaction.XaDataSourceManager.create(XaDataSourceManager.java:74)
at
org.neo4j.kernel.impl.transaction.TxModule.registerDataSource(TxModule.java:148)
... 31 more
Caused by: java.io.IOException: Unable to create directory path[] for Neo4j
store.
at
org.neo4j.kernel.impl.nioneo.xa.NeoStoreXaDataSource.autoCreatePath(NeoStoreXaDataSource.java:178)
at
org.neo4j.kernel.impl.nioneo.xa.NeoStoreXaDataSource.(NeoStoreXaDataSource.java:124)
... 37 more

I have to say that I'm not at all an expert at dealing with Jetty, so it's
possible that this exception
is related to server configuration stuff I don't know about.
Any ideas?

Thanks in advance,
Cheers


-- 
Pablo Pareja Tobes
LinkedInhttp://www.linkedin.com/in/pabloparejatobes
Twitter   http://www.twitter.com/pablopareja

http://about.me/pablopareja
http://www.ohnosequences.com


Re: [Neo4j] Slow down on insertion as db grow

2011-02-21 Thread Massimo Lusetti
On Fri, Feb 18, 2011 at 2:31 PM, Tobias Ivarsson
 wrote:

> I recently (yesterday) committed a new feature for Neo4j that will store
> these kinds of short strings, making Neo4j store them without having to
> involve the DynamicStringStore at all. You should see a substantial speedup
> from using that. IPv4 addresses will always be storeable as a short string,
> so if all you store as properties are IPv4 addresses the DynamicStringStore
> wouldn't be used at all in your use case.

I'll give it a go in the next few days and report back...
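The reason an IPv4 address always qualifies as a "short string" is its tiny information content: the four octets fit in a single 32-bit value, far below anything that would need the dynamic string store. A rough illustration (Neo4j's actual short-string encoding is character-based and more general than this packing):

```java
// Illustration of why an IPv4 address fits inline in a property record:
// the four octets pack into one 32-bit int. (Neo4j's real short-string
// encoding is more general; this only shows the information content.)
public class Ipv4Pack {
    static int pack(String dotted) {
        String[] parts = dotted.split("\\.");
        int packed = 0;
        for (String part : parts) {
            packed = (packed << 8) | Integer.parseInt(part);
        }
        return packed;
    }

    static String unpack(int packed) {
        return ((packed >>> 24) & 0xFF) + "." + ((packed >>> 16) & 0xFF) + "."
             + ((packed >>> 8) & 0xFF) + "." + (packed & 0xFF);
    }

    public static void main(String[] args) {
        int packed = pack("192.168.1.101");
        System.out.println(packed);
        System.out.println(unpack(packed)); // round-trips to the dotted form
    }
}
```

Since every value in this workload is that compact, the short-string feature should let the property records hold them inline and bypass the DynamicStringStore entirely, which is where Tobias expects the speedup to come from.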

> WRT the slowdown you've seen, I'll have to investigate that further.

Really appreciated, if you want feedback, let me know.

Cheers
-- 
Massimo
http://meridio.blogspot.com