I'm not sure it's such a good idea to call tx.success() on every iteration
of the loop. I suggest calling it only when you commit, and after the loop (i.e.
move it two lines down).
Also I think a commit size of 50k is a little large. You're probably not
going to see much improvement past 10k. In fact I
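
A minimal sketch of the batching pattern suggested above: mark the transaction successful once per batch, right before committing it, and once more after the loop for the remainder. The `Tx` and `TxSource` interfaces are hypothetical stand-ins so the example runs on its own; they are not the Neo4j API, and the batch size is just a parameter to experiment with.

```java
import java.util.Collections;
import java.util.List;

final class BatchCommitSketch {
    /** Hypothetical stand-in for a transaction handle; NOT the Neo4j API. */
    interface Tx {
        void success();
        void finish();
    }

    /** Hypothetical stand-in for whatever opens a new transaction. */
    interface TxSource {
        Tx begin();
    }

    /** A source whose transactions do nothing, so the sketch can run without a database. */
    static TxSource noOpSource() {
        return () -> new Tx() {
            @Override public void success() { }
            @Override public void finish() { }
        };
    }

    /** Processes every row, committing once per full batch; returns the number of non-empty commits. */
    static int importRows(List<String> rows, int batchSize, TxSource db) {
        int commits = 0;
        int pending = 0;
        Tx tx = db.begin();
        for (String row : rows) {
            // ... create the nodes and relationships for this row here ...
            pending++;
            if (pending == batchSize) {
                tx.success();   // called once per batch, not once per row
                tx.finish();
                commits++;
                tx = db.begin();
                pending = 0;
            }
        }
        tx.success();           // commit whatever is left after the loop
        tx.finish();
        if (pending > 0) {
            commits++;
        }
        return commits;
    }

    public static void main(String[] args) {
        List<String> rows = Collections.nCopies(25, "row");
        System.out.println(importRows(rows, 10, noOpSource()));
    }
}
```

Real import code would also wrap each batch in try/finally so a failed batch is still finished; that is left out here to keep the pattern visible.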
OK, I changed that and will test it if it improves the runtime.
Btw. I also changed my timestamp String into a long to reduce the size of my
database.
Hope to get some tips about faster parsing or optimizing my CSV-file from
you guys soon.
Cheers
Stephan
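
For readers wondering what the String-to-long timestamp change mentioned above might look like, here is a small sketch. It assumes the timestamps are ISO 8601 strings in UTC such as `2011-05-26T00:00:00Z`; the message does not state the actual format, so treat that as an assumption. The class and method names are made up for illustration.

```java
import java.time.Instant;

final class TimestampSketch {
    /** Converts an ISO 8601 UTC timestamp (assumed format) to seconds since the Unix epoch. */
    static long toEpochSeconds(String isoTimestamp) {
        return Instant.parse(isoTimestamp).getEpochSecond();
    }

    /** Converts the stored long back to the ISO 8601 string, e.g. for display. */
    static String fromEpochSeconds(long epochSeconds) {
        return Instant.ofEpochSecond(epochSeconds).toString();
    }

    public static void main(String[] args) {
        long stored = toEpochSeconds("2011-05-26T00:00:00Z");
        System.out.println(stored + " -> " + fromEpochSeconds(stored));
    }
}
```

A long takes a fixed 8 bytes, whereas the string form above is 20 characters, which is where the space saving would come from.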
Hello Michael,
I got the zipfile from here:
http://download.wikimedia.org/enwiki/20110526/enwiki-20110526-stub-meta-history.xml.gz
The unzipped file is an XML file and I extracted the important information
Hi all,
I'm new to neo4j and graph databases.
I have two questions about creating my graph database:
1.
I want to create a graph database out of a huge CSV file.
The problem is that I need to index the nodes I have already created, so
that I don't create duplicate nodes.
My CSV file looks
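
One simple way to avoid the duplicate nodes described above is an in-memory lookup from a key (for example the author name) to the id of the node already created for it. This is a plain `HashMap` sketch, not the lucene index discussed elsewhere in the thread, and the counter stands in for whatever id the database would return; all names here are hypothetical. Whether a map like this fits in the heap for millions of keys depends on your memory settings.

```java
import java.util.HashMap;
import java.util.Map;

final class NodeCacheSketch {
    private final Map<String, Long> idByKey = new HashMap<>();
    private long nextId = 0; // stand-in for the id a real database would hand back

    /** Returns the id for this key, creating a new entry only if the key is unseen. */
    long getOrCreate(String key) {
        Long existing = idByKey.get(key);
        if (existing != null) {
            return existing;
        }
        long id = nextId++; // real code would create the node here and use its id
        idByKey.put(key, id);
        return id;
    }

    /** Number of distinct keys seen so far. */
    int size() {
        return idByKey.size();
    }

    public static void main(String[] args) {
        NodeCacheSketch cache = new NodeCacheSketch();
        String[] authors = {"alice", "bob", "alice"};
        for (String author : authors) {
            System.out.println(author + " -> " + cache.getOrCreate(author));
        }
        System.out.println("distinct: " + cache.size());
    }
}
```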
Stephan,
This is a common thing when inserting data.
You should be able to use lucene in both settings (6M authors is not that much).
Please have a look at your heap memory settings (and in transactional mode also
your memory-map settings for neo4j).
For the batch inserter: you can query the
Hi,
thanks for your fast answer.
Right now I'm using lucene for 6M authors, but my whole dataset consists of
nearly 25M authors.
Can I use lucene there as well? I think it is getting really slow to
check whether a user already exists.
How can I change my heap memory settings and my memory-map
Stephan,
can you perhaps share your CSV file, or at least give a few sample lines and a
typical distribution (articles per author etc.)? You tested this with 20M
articles and 6M authors? What is the current runtime of that import, and on what
kind of hardware? (when working on a similar test I