On Tue, Jun 28, 2011 at 2:25 PM, Rick Bullotta
rick.bullo...@thingworx.com wrote:
Ah! Got it. That makes sense, and that most definitely is an EXTREME edge
case!!! ;-)
Does it make more sense to have each hash index point to its own node
... so I end up having more than 350,000,000 of
On Mon, Jun 27, 2011 at 11:56 PM, Michael Hunger
michael.hun...@neotechnology.com wrote:
Massimo,
could you please look into the Lucene Document instance that you add all the
fields to?
You're right... I add only the NodeId and my own hash... Which fields
do you add?
If it also contains
On Tue, Jun 28, 2011 at 2:22 PM, Mattias Persson
matt...@neotechnology.com wrote:
I think what Michael is saying is that you normally don't have many
key/value pairs associated with one node/relationship in an index. To have
250M key/value pairs indexed in one lucene document is at most a
On Sat, Jun 25, 2011 at 2:59 AM, Michael Hunger
michael.hun...@neotechnology.com wrote:
Massimo,
when profiling this it quickly becomes apparent that the issue is within the
lucene document.
(org.apache.lucene.document.Document)
it holds an arraylist of all its fields which amount to all
On Thu, Jun 23, 2011 at 9:08 PM, Mattias Persson
matt...@neotechnology.com wrote:
That should be quite fine. I could try this out locally perhaps. Something
like:
Index&lt;Node&gt; index = db.index().forNodes("myIndex");
Transaction tx = db.beginTx();
Node node = db.createNode();
for ( int i = 0; i
On Fri, Jun 24, 2011 at 10:00 AM, Massimo Lusetti mluse...@gmail.com wrote:
On Thu, Jun 23, 2011 at 9:08 PM, Mattias Persson
matt...@neotechnology.com wrote:
That should be quite fine. I could try this out locally perhaps. Something
like:
Index&lt;Node&gt; index = db.index().forNodes("myIndex
On Thu, Jun 23, 2011 at 9:08 PM, Mattias Persson
matt...@neotechnology.com wrote:
That should be quite fine. I could try this out locally perhaps. Something
like:
Index&lt;Node&gt; index = db.index().forNodes("myIndex");
Transaction tx = db.beginTx();
Node node = db.createNode();
for ( int i = 0; i
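The loop in the snippet above is cut off by the archive; as a self-contained sketch of the pattern under discussion (a plain HashMap standing in for Neo4j's Index&lt;Node&gt;, with an invented node id and key names), many key/value pairs all resolving to one node looks like:

```java
import java.util.HashMap;
import java.util.Map;

public class IndexSketch {
    // Stand-in for Index<Node>: maps an indexed key to a node id.
    // In Neo4j this would be index.add(node, key, value) inside a Transaction.
    static Map<String, Long> buildIndex(long nodeId, int pairs) {
        Map<String, Long> index = new HashMap<>();
        for (int i = 0; i < pairs; i++) {
            index.put("key" + i, nodeId); // every key points at the same node
        }
        return index;
    }

    public static void main(String[] args) {
        // The "extreme edge case" from the thread, scaled down from 250M to 1000.
        Map<String, Long> index = buildIndex(42L, 1000);
        System.out.println(index.size());        // 1000
        System.out.println(index.get("key500")); // 42
    }
}
```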
On Thu, Jun 16, 2011 at 5:20 PM, Mattias Persson
matt...@neotechnology.com wrote:
Hi, could you clarify a bit what you mean? Do you have one node which is
indexed in one index with millions of key/value pairs? What about
relationships, you said that you had many relationships to a single node.
On Tue, Jun 14, 2011 at 12:56 PM, Johan Svensson
jo...@neotechnology.com wrote:
Hi,
Looks like there was an OOME during commit but commit partially
succeeded (removing the xid branch id association for the xa resource)
causing the subsequent rollback call to fail. To guarantee consistency
From
http://components.neo4j.org/neo4j/1.4.M04/apidocs/org/neo4j/graphdb/Transaction.html
I read:
... Read operations inside of a transaction will also read
uncommitted data from the same transaction.
So my understanding is that if I create a Node or a Relationship and
then add it to an
Hi All,
I'm going to give a try again to my apps on neo4j with the current
1.4.M03 implementations.
After a while I got this stack trace for which I hope someone could
give me a clue:
org.neo4j.graphdb.TransactionFailureException: Unable to commit transaction
at
On Mon, Mar 21, 2011 at 11:07 PM, Rick Bullotta
rick.bullo...@thingworx.com wrote:
Here's the quick summary of what we're encountering:
We are inserting large numbers of activity stream entries on a nearly
constant basis. To optimize transactioning, we queue these up and have a
single
On Thu, Mar 17, 2011 at 12:15 AM, David Montag
david.mon...@neotechnology.com wrote:
One key point of David's suggestion is that it takes into account that each
action of the user could take place from a different IP. Massimo's original
model implied that the user would always be at the same IP
On Tue, Mar 22, 2011 at 6:40 PM, Rick Bullotta
rick.bullo...@thingworx.com wrote:
Hi, Massimo.
When you say you are using an externally managed Lucene index, does that
imply that you are not using the Neo index framework and interacting with
Lucene directly?
Summarizing, yes.
My use case
On Tue, Mar 22, 2011 at 6:40 PM, Rick Bullotta
rick.bullo...@thingworx.com wrote:
Hi, Massimo.
When you say you are using an externally managed Lucene index, does that
imply that you are not using the Neo index framework and interacting with
Lucene directly?
Thanks for any advice!
Oh...
On Tue, Mar 22, 2011 at 7:19 PM, Peter Neubauer
neubauer.pe...@gmail.com wrote:
Interesting,
Could you write a test scenario that shows this? Would be very helpful in
tracking down why this is. I don't think there is much magic going on besides
building Lucene indices in memory before
I remember having read about some design smells, but I cannot find it
in the Design_Guide wiki, so I post it here.
I've got IP addresses and uids (unique usernames); each uid performs
actions on domains (kind of like URLs). So I've got a db with a small to
medium number of Nodes for uids, IPs and domains (with
On Wed, Mar 16, 2011 at 6:10 PM, Rick Bullotta
rick.bullo...@burningskysoftware.com wrote:
I'll be interested in the response(s), Massimo, since some of the more
performance-critical aspects of our application are also
relationship-heavy.
I'll let you know, certainly.
Now with the prototype
On Wed, Mar 16, 2011 at 6:17 PM, David Montag
david.mon...@neotechnology.com wrote:
Massimo,
So just to understand your graph layout, you have:
(UID) --PERFORMED_ACTION_ON-- (DOMAIN)
(UID) --ACTION_TOOK_PLACE_FROM-- (IP)
Is this correct? Could you elaborate a bit more on the use case, along
On Wed, Mar 16, 2011 at 6:41 PM, David Montag
david.mon...@neotechnology.com wrote:
Massimo,
It sounds like certain PERFORMED_ACTION_ON and ACTION_TOOK_PLACE_FROM
relationships are logically grouped/related. Is this a correct statement?
If so, then you might want to consider something like:
On Wed, Mar 16, 2011 at 7:03 PM, David Montag
david.mon...@neotechnology.com wrote:
Massimo,
If you'd like, I could skype with you later this afternoon (in 4-5 hours)
and discuss it?
David
Wow, that would be cool... But hopefully I'll be sleeping then, I
need it... Anyway I'll do my
On Tue, Mar 15, 2011 at 9:11 AM, Mattias Persson
matt...@neotechnology.com wrote:
I'm positive that some nice API will enter the kernel at some point, e.g.
I'm experimenting with an API like this:
for(Node node :
On Tue, Mar 15, 2011 at 9:11 AM, Mattias Persson
matt...@neotechnology.com wrote:
more advanced. If you would like to force the traversal down a very defined
path then go with the core API, like:
for(Relationship relA : startNode.getRelationships(A)) {
Node nodeA =
On Tue, Mar 15, 2011 at 4:26 PM, Chris Gioran
chris.gio...@neotechnology.com wrote:
Of course it does! Node is the interface implemented by NodeProxy -
the objects you get back and use as Nodes. NodeProxy implements
equals() and hashCode() based on the id of the Node. Using them in a
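Chris's point, that NodeProxy derives equals() and hashCode() from the node id, can be illustrated with a minimal stand-in class (this is an invented sketch, not Neo4j's actual implementation):

```java
import java.util.HashSet;
import java.util.Set;

public class NodeIdentitySketch {
    // Minimal stand-in for NodeProxy: identity comes from the node id only.
    static final class FakeNode {
        final long id;
        FakeNode(long id) { this.id = id; }
        @Override public boolean equals(Object o) {
            return o instanceof FakeNode && ((FakeNode) o).id == id;
        }
        @Override public int hashCode() { return Long.hashCode(id); }
    }

    public static void main(String[] args) {
        Set<FakeNode> visited = new HashSet<>();
        visited.add(new FakeNode(5));
        // A second proxy object for the same underlying node compares equal,
        // so sets and maps used during traversal deduplicate correctly.
        System.out.println(visited.add(new FakeNode(5))); // false
        System.out.println(visited.size());               // 1
    }
}
```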
On Mon, Mar 14, 2011 at 9:26 AM, Mattias Persson
matt...@neotechnology.com wrote:
Hmm, that doesn't look very good. I'm very keen on looking at your code for
this test, if possible, since I haven't experienced a slowdown like this
before.
I just did an insertion test of 1.5M indexed nodes
On Mon, Mar 14, 2011 at 12:39 PM, Mattias Persson
matt...@neotechnology.com wrote:
Head over to http://wiki.neo4j.org/content/Traversal_Framework for more
information. And know that the exact limitation you ran into spawned the
creation of this new API.
I started using this framework from day
On Mon, Mar 14, 2011 at 1:32 PM, Emil Eifrem e...@neotechnology.com wrote:
On Mon, Mar 14, 2011 at 12:15, Axel Morgner a...@morgner.de wrote:
Hi everybody,
as said, here's a new thread for the idea of having beer and talk
meetings.
Possible locations so far:
Malmö
London
Berlin
I'm using the Traversal framework
(http://wiki.neo4j.org/content/Traversal_Framework) and I would like
to know if I'm using it the way it is meant to be used.
I need to deeply traverse the graph going down through different
RelationshipTypes so I do a first TraversalDescription and while
iterating
On Mon, Mar 14, 2011 at 3:13 PM, Mattias Persson
matt...@neotechnology.com wrote:
Would you like to do a traversal where relationships of different types can
be traversed? That can/should be done with one traversal, one
TraversalDescription:
Traversal.description()
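Mattias's point, that a single TraversalDescription can follow several relationship types at once, can be sketched outside Neo4j with a toy typed-edge graph and one breadth-first pass (graph shape and names are invented for illustration):

```java
import java.util.*;

public class MultiTypeTraversalSketch {
    // One breadth-first pass that follows edges of several types at once,
    // analogous to Traversal.description().relationships(A).relationships(B).
    static Set<String> traverse(Map<String, List<String[]>> graph,
                                String start, Set<String> followTypes) {
        Set<String> visited = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>(List.of(start));
        while (!queue.isEmpty()) {
            String node = queue.poll();
            if (!visited.add(node)) continue;
            for (String[] edge : graph.getOrDefault(node, List.of()))
                if (followTypes.contains(edge[0])) queue.add(edge[1]);
        }
        return visited;
    }

    public static void main(String[] args) {
        // node -> list of (relationshipType, targetNode)
        Map<String, List<String[]>> graph = Map.of(
            "start", List.of(new String[]{"A", "n1"}, new String[]{"B", "n2"}),
            "n1", List.of(new String[]{"B", "n3"}));
        // Both relationship types handled in a single traversal:
        System.out.println(traverse(graph, "start", Set.of("A", "B")));
    }
}
```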
On Fri, Mar 11, 2011 at 10:31 AM, Peter Neubauer
peter.neuba...@neotechnology.com wrote:
No,
things are not failing; it is just that in big insertion scenarios, when
looking up nodes to join them together into relationships, often just an
exact index is needed in order to do that. We
On Fri, Mar 11, 2011 at 10:37 AM, Mattias Persson
matt...@neotechnology.com wrote:
And I'm curious about why the neo4j lucene layer adds overhead and what your
code looks like in your own solution.
I really don't know; I didn't have time to investigate the neo4j code, but
I'm indexing a SHA1 hash
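An exact-lookup key like the SHA1 hash mentioned here can be computed with the JDK alone; this sketch (the field layout is invented, the thread only says a SHA1 hash is indexed) shows the kind of value that would then be added to and looked up in the index:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Sha1KeySketch {
    // Hash a record's identifying fields into the exact-match key
    // that would then be added to (and looked up in) the index.
    static String sha1Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(input.getBytes(StandardCharsets.UTF_8)))
                hex.append(String.format("%02x", b));
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-1 ships with every JRE
        }
    }

    public static void main(String[] args) {
        String key = sha1Hex("uid|domain|ip"); // hypothetical field layout
        System.out.println(key.length());      // 40 hex characters
    }
}
```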
On Fri, Mar 11, 2011 at 12:31 PM, Tobias Ivarsson
tobias.ivars...@neotechnology.com wrote:
I need at least these two files:
neostore.nodestore.db
neostore.relationshipstore.db
I managed to get permission to send you data and tomorrow while in
office I'll let you know the URL.
Thanks so much
On Sun, Mar 13, 2011 at 10:29 AM, Peter Neubauer
neubauer.pe...@gmail.com wrote:
Let me know how it goes!
Here are the results: http://dl.dropbox.com/u/22802242/neo4j-stats.ods
As you can see, JDBM is slower than Lucene in my tests, and the growth
trend in ms is steeper.
Every round parses
On Sun, Mar 13, 2011 at 4:43 PM, Massimo Lusetti mluse...@gmail.com wrote:
On Sun, Mar 13, 2011 at 10:29 AM, Peter Neubauer
neubauer.pe...@gmail.com wrote:
Let me know how it goes!
Here are the results: http://dl.dropbox.com/u/22802242/neo4j-stats.ods
As you can see JDBM is slower
On Sun, Mar 13, 2011 at 4:50 PM, Peter Neubauer
neubauer.pe...@gmail.com wrote:
Thanks Massimo,
Will check it out tomorrow!
You're welcome, it's easy and fun to play with neo4j.
Please know that every test has been conducted with neo4j 1.3.M04
untuned, and that every round involves the creation
On Thu, Mar 10, 2011 at 9:58 PM, Massimo Lusetti mluse...@gmail.com wrote:
On Thu, Mar 10, 2011 at 6:11 PM, Axel Morgner a...@morgner.de wrote:
Hi,
I'm getting an InvalidRecordException
org.neo4j.kernel.impl.nioneo.store.InvalidRecordException: Node[5] is
neither firstNode[37781] nor
On Fri, Mar 11, 2011 at 10:14 AM, Johan Svensson
jo...@neotechnology.com wrote:
Hi,
I am assuming no manual modifying of log files or store files at
runtime or between shutdowns/crashes and startups has been performed.
Sure.
What filesystem are you running this on (and with what
On Fri, Mar 11, 2011 at 10:12 AM, Peter Neubauer
peter.neuba...@neotechnology.com wrote:
Nice Ashwin,
sounds like a great ac, will definitely keep track of it. If I do a
Neo4j Index provider for JDBM2, would you be able to help me tweak
it to behave well?
Did it really fail Lucene with
On Fri, Mar 11, 2011 at 10:49 AM, Tobias Ivarsson
tobias.ivars...@neotechnology.com wrote:
Please tell me that you have the database files in the state they were in
when this happened. That you have not tried to repair the database in any
way.
If you do, could you please send me those database
On Wed, Mar 9, 2011 at 8:24 PM, Mattias Persson
matt...@neotechnology.com wrote:
Correct, and go with the M03 milestone first because M04 will
introduce changes which requires an upgrade from a cleanly shut down
database.
Not so gentle of you to force an upgrade from M03 to M04, sir...
;-)
Hi all,
I'm still new to neo4j, but I feel I'm getting more used to it and I like
that feeling. Anyway, I'm getting strange results; at least they seem
strange to me, but I bet they have a pretty easy explanation.
The nice pictures below (thanks Andreas neoclipse rocks!) show what I
got in a part of my
On Thu, Mar 10, 2011 at 3:55 PM, Massimo Lusetti mluse...@gmail.com wrote:
Hi all,
I'm still new to neo4j, but I feel I'm getting more used to it and I like
that feeling. Anyway, I'm getting strange results; at least they seem
strange to me, but I bet they have a pretty easy explanation.
As it seems
On Thu, Mar 10, 2011 at 6:11 PM, Axel Morgner a...@morgner.de wrote:
Hi,
I'm getting an InvalidRecordException
org.neo4j.kernel.impl.nioneo.store.InvalidRecordException: Node[5] is
neither firstNode[37781] nor secondNode[37782] for Relationship[188125]
at
On Sat, Mar 5, 2011 at 6:58 PM, Anders Nawroth and...@neotechnology.com wrote:
Hi!
I just uploaded Neoclipse 1.3.M03, you can download it from here:
http://neo4j.org/download/
From this version, the search function uses the new integrated index.
At the moment it can only search for exact
I'm using neoclipse 1.2 to access a db made with 1.3.M03 and I'm having
issues browsing the node space... I always get Unknown enum type:11
I tested both my db and one made with the code at The Matrix
example, same results.
I read that I should use the same version for the DB and Neoclipse, is
that
On Sun, Feb 27, 2011 at 11:37 PM, Tobias Ivarsson
tobias.ivars...@neotechnology.com wrote:
I'm not sure the input from those statistics would influence anything at
this point, but since I haven't looked at any of these statistics for the
string store WITH short strings in place, it would at
Hi neo4j developers,
I'm repopulating a neo4j db with fresh data made after the M03 release,
so I guess I could run Tobias's statistics tool to gather some
info... is this interesting to you, and safe for the data?
Cheers
--
Massimo
http://meridio.blogspot.com
On Fri, Feb 18, 2011 at 2:31 PM, Tobias Ivarsson
tobias.ivars...@neotechnology.com wrote:
I recently (yesterday) committed a new feature for Neo4j that will store
these kinds of short strings, making Neo4j store them without having to
involve the DynamicStringStore at all. You should see a
On Wed, Feb 16, 2011 at 11:12 AM, Mark Harwood markharw...@gmail.com wrote:
However, the underlying Neo4J database doesn't seem to be able to cope
with inserting these volumes of data on my available hardware and I
don't have (and would hope not to need) 10s of gigabytes of RAM to
throw at
On Thu, Feb 17, 2011 at 12:54 PM, Pablo Pareja ppar...@era7.com wrote:
Hi Massimo,
It's too bad you are running into the same kind of situations (especially
when
the conclusion you came to is that Neo4j just degrades...).
However, did you already try dividing the big insertion process
On Thu, Feb 17, 2011 at 3:10 PM, Johan Svensson jo...@neotechnology.com wrote:
Hello,
I am having a hard time following what the problems really are since
the conversation is split across several threads.
Totally right and sorry about that, I'll start a new thread for my use case.
Cheers
--
On Thu, Feb 17, 2011 at 3:10 PM, Johan Svensson jo...@neotechnology.com wrote:
Massimo, you had problems with injection that created duplicates due
to a synchronization issue. That issue has been resolved and now you
are experiencing a slowdown during batch inserter injection?
Yep, as I said
On Wed, Feb 16, 2011 at 2:46 AM, David Montag
david.mon...@neotechnology.com wrote:
Hi Massimo,
I just want to understand your use case.
You have a stream of records (log rows in your case) coming in. You process
each record, somehow mutating the graph. Then you want to remember that
On Mon, Feb 14, 2011 at 10:08 AM, Tobias Ivarsson
tobias.ivars...@neotechnology.com wrote:
That is correct. The lucene indexes in Neo4j are tied to the same
transaction life cycle.
Thanks for the confirmation Tobias and Chris.
I find my domain model somehow suggests that I use more than one
On Thu, Feb 10, 2011 at 12:08 PM, Tobias Ivarsson
tobias.ivars...@neotechnology.com wrote:
This tool does not gather statistics for string arrays, since they are not
affected by the proposed patch.
The tool still runs, but if there are no String properties, output will be
pretty boring
On Tue, Feb 15, 2011 at 9:14 PM, Mattias Persson
matt...@neotechnology.com wrote:
100 million sounds strange :) but to have a handful of key/value pairs
pointing to the same entity is rather normal. Could you elaborate more on
that use case to let us know why you apparently have super many
On Thu, Feb 10, 2011 at 12:08 PM, Tobias Ivarsson
tobias.ivars...@neotechnology.com wrote:
This tool does not gather statistics for string arrays, since they are not
affected by the proposed patch.
The tool still runs, but if there are no String properties, output will be
pretty boring
Hi all,
Does the inner Lucene index commit together with the Transaction
success()/finish() cycle?
I mean, if I start a Transaction and do it like
http://wiki.neo4j.org/content/Transactions#Big_transactions I suppose
I'm guaranteed the Lucene index is synced to disk as soon as the
Transaction
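The "big transactions" pattern referenced above commits every N operations instead of once per operation. Stripped of the Neo4j API, the batching logic alone can be sketched like this (commit() is a stand-in for tx.success(); tx.finish(); followed by opening a new transaction; the numbers are illustrative):

```java
public class BatchCommitSketch {
    // Count how many commits a commit-every-N strategy produces;
    // commit points stand in for tx.success(); tx.finish(); db.beginTx().
    static int run(int totalOps, int batchSize) {
        int commits = 0;
        for (int i = 1; i <= totalOps; i++) {
            // ... create node / relationship / index entry here ...
            if (i % batchSize == 0) commits++;       // commit a full batch
        }
        if (totalOps % batchSize != 0) commits++;    // flush the final partial batch
        return commits;
    }

    public static void main(String[] args) {
        // 95,000 operations batched 10,000 at a time: 10 commits, not 95,000.
        System.out.println(run(95_000, 10_000));
    }
}
```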
On Fri, Feb 4, 2011 at 5:13 PM, Tobias Ivarsson
tobias.ivars...@neotechnology.com wrote:
Friends,
Please read this, your involvement will make Neo4j more efficient!
Does this run with a db with only String[] data?!
Cheers
--
Massimo
http://meridio.blogspot.com
On Wed, Feb 2, 2011 at 1:20 PM, Tobias Ivarsson
tobias.ivars...@neotechnology.com wrote:
You are doing I/O-bound work. More than two threads is most likely just
going to add overhead and make things slower!
I'm certainly doing something weird because the performance of my tests
isn't linear.
On Thu, Feb 3, 2011 at 11:30 AM, Mattias Persson
matt...@neotechnology.com wrote:
Lucene lookup performance degrades the bigger the index gets. That may be a
reason.
I don't think Lucene is unable to handle an index with 6-7 million
entries. Maybe there are some logs around?
Cheers
--
Massimo
On Thu, Feb 3, 2011 at 2:01 PM, Peter Neubauer
peter.neuba...@neotechnology.com wrote:
Massimo,
I yesterday just tried to import the Germany OpenStreetMap dataset
into Neo4j using Lucene indexing. There are around 60M nodes that all
are indexed into Lucene, and then looked up when the Ways,
On Tue, Feb 1, 2011 at 10:19 PM, Tobias Ivarsson
tobias.ivars...@neotechnology.com wrote:
For getting a performance boost out of writes, doing multiple operations in
one transaction will give a much bigger gain than multiple threads though.
For your use case, I think two writer threads and a
Hi everyone,
I'm new to neo4j and I'm gaining experience with it. I've got a fairly
big table (in my current db) which consists of somewhat more than 220
million rows.
I want to put that in a graphdb, for instance neo4j, and graph it to
do some statistics on them. Every row will be a node in my
On Tue, Feb 1, 2011 at 6:36 PM, Tobias Ivarsson
tobias.ivars...@neotechnology.com wrote:
Since you are checking for existence before inserting the conflict you are
getting is strange. Are you running multiple insertion threads?
Yep, I've got 20 concurrent threads doing the job. I'd forgotten about
On Tue, Feb 1, 2011 at 6:43 PM, Peter Neubauer
peter.neuba...@neotechnology.com wrote:
Also,
have you been running this insert multiple times without cleaning up
the database between runs?
Nope, for the tests I wipe (rm -rf) the db dir every run.
Cheers
--
Massimo
http://meridio.blogspot.com
On Tue, Feb 1, 2011 at 8:02 PM, Mattias Persson
matt...@neotechnology.com wrote:
Seems a little weird, the commit rate won't affect the end result,
just performance (more operations per commit means faster
performance). Your code seems correct for single threaded use btw.
Does it mean that I
On Tue, Feb 1, 2011 at 10:19 PM, Tobias Ivarsson
tobias.ivars...@neotechnology.com wrote:
No, it means that you have to synchronize the threads so that they don't
insert the same data concurrently.
That would be a typical issue, but I'm sure mine are not duplicated since
they come from the (old)
On Tue, Feb 1, 2011 at 10:25 PM, Michael Hunger
michael.hun...@neotechnology.com wrote:
What about batch insertion of the nodes and indexing them after the fact?
The data to be entered will change values in other nodes (statistics),
so I absolutely need to be sure not to insert data twice and
On Tue, Feb 1, 2011 at 10:50 PM, Tobias Ivarsson
tobias.ivars...@neotechnology.com wrote:
That is correct, the Isolation of ACID says that data isn't visible to other
threads until after commit.
The CHM should not replace the index check though, since you want to limit
the number of items in
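Tobias's advice here, that an in-memory map can guard against concurrent duplicate insertion within one run while the index check still covers pre-existing data, can be sketched with a ConcurrentHashMap (class and method names are invented; the index/node work is reduced to a counter):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

public class DedupGuardSketch {
    static final ConcurrentHashMap<String, Boolean> claimed = new ConcurrentHashMap<>();
    static final AtomicLong inserted = new AtomicLong();

    // Returns true only for the first thread to claim this key in this run.
    // The index check (for data from earlier runs) would still happen before
    // the actual insert -- the map only guards against concurrent duplicates.
    static boolean tryInsert(String hashKey) {
        if (claimed.putIfAbsent(hashKey, Boolean.TRUE) != null) return false;
        inserted.incrementAndGet(); // stand-in for: check index, create node, index it
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        Runnable worker = () -> { for (int i = 0; i < 1000; i++) tryInsert("key" + i); };
        Thread t1 = new Thread(worker), t2 = new Thread(worker);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(inserted.get()); // 1000: each key inserted exactly once
    }
}
```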