Re: [Neo4j] Batch find
I don't see why there should be any delay. If you just try this, it should be able to add several thousand nodes per second to the graph.

    import org.neo4j.graphdb.GraphDatabaseService;
    import org.neo4j.graphdb.Node;
    import org.neo4j.graphdb.Transaction;
    import org.neo4j.graphdb.index.Index;
    import org.neo4j.kernel.EmbeddedGraphDatabase;

    GraphDatabaseService graphdb = new EmbeddedGraphDatabase("words.db");
    Index<Node> index = graphdb.index().forNodes("words");
    // One transaction per document.
    for (Document doc : documents) {
        Transaction tx = graphdb.beginTx();
        try {
            for (String word : doc.words()) {
                Node node = index.get("word", word).getSingle();
                if (node == null) {
                    // First occurrence: create the node and index it.
                    node = graphdb.createNode();
                    node.setProperty("word", word);
                    node.setProperty("count", 1);
                    index.add(node, "word", word);
                } else {
                    // Known word: increment its counter.
                    node.setProperty("count", (Integer) node.getProperty("count") + 1);
                }
            }
            tx.success();
        } finally {
            tx.finish();
        }
    }

On 08.08.2011 at 17:06, ahmed.elsharkasy wrote:
> Also, what is still the reason for the delay?

___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Batch find
Also, what is still the reason for the delay?

--
View this message in context: http://neo4j-community-discussions.438527.n3.nabble.com/Batch-find-tp3221634p3235850.html
Sent from the Neo4j Community Discussions mailing list archive at Nabble.com.
Re: [Neo4j] Batch find
Yes, this is my initial load of the db.

Yes, I know I may be mixing both of them and that this is not right, but how can I get the same functionality using the batch operations? Can I remove the transaction and insert/update with batch?
Re: [Neo4j] Batch find
Ahmed,

is this your initial load of the graphdb? It looks like you're mixing batch insertion and the normal transactional API in a single program. Please try to use just one of them in one program.

I'd really suggest just going with the transactional API and inserting/updating one or more documents per transaction.

What are you using "reference" for? Is it set to the "created" or the "result" node (id)?

On 08.08.2011 at 16:29, ahmed.elsharkasy wrote:
> Transaction tx = graphDb.beginTx();
> try {
>     for (all words in a document) {
>         // search for the word
>         if (result == null) {
>             long created = inserter.createNode(properties);
>             wordsIndex.add(created, properties);
>             Map<String, Object> properties2 = MapUtil.map("value", reference);
>             // create relation
>             reference = created;
>         } else {
>             // update with the new properties
>             inserter.setNodeProperties(result, new_properties);
>             // create relation
>             reference = result;
>         }
>     }
> } finally {
>     tx.finish();
>     index.shutdown();
>     inserter.shutdown();
>     graphDb.shutdown();
> }
Re: [Neo4j] Batch find
    Transaction tx = graphDb.beginTx();
    try {
        for (all words in a document) {
            // search for the word
            if (result == null) {
                long created = inserter.createNode(properties);
                wordsIndex.add(created, properties);
                Map<String, Object> properties2 = MapUtil.map("value", reference);
                // create relation
                reference = created;
            } else {
                // update with the new properties
                inserter.setNodeProperties(result, new_properties);
                // create relation
                reference = result;
            }
        }
    } finally {
        tx.finish();
        index.shutdown();
        inserter.shutdown();
        graphDb.shutdown();
    }
Re: [Neo4j] Batch find
Ahmed,

could you please share some code?

The batch inserter should really only be used to insert millions or billions of nodes. With the normal API you can insert/update about 10k nodes/rels per transaction without any issues.

You should be able to insert/update several thousand nodes per second into your graph and index. Anything slower than that is not OK.

The REST API is at least one order of magnitude slower than the normal Java API (probably even two orders of magnitude, depending on the types of operations).

Michael

On 08.08.2011 at 15:04, ahmed.elsharkasy wrote:
> Yes, I am inserting 3 nodes, where for each node I use the batch index to
> search whether this word is already in the graph, to either update it or
> create another node. This took 1 second, which is too high for me.
>
> Another problem besides the time is that I have to open a transaction and
> use a batch inserter inside it. That makes me open a graph database service
> to begin the transaction, then close it and start my batch operations, which
> I think is not good either. Shutting down the service and starting the batch
> also swallows a good bunch of milliseconds.
>
> Each node carries only 1 string property besides the id.
>
> Do you think this time is acceptable? From the REST API I used to insert more
> than 50 nodes in less than a second.
>
> By the way, when I increased the number of inserted nodes to 500, the time
> was still 1 second. The problem is that I want to decrease this 1 second to
> maybe half a second or something like that.
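The figure of roughly 10k operations per transaction mentioned above could be applied by cutting the word list into chunks and committing one chunk per transaction. The following is only a minimal sketch of the chunking step in plain Java: the class name `TxBatches` and the helper `partition` are hypothetical, and the Neo4j calls appear only as comments because the surrounding program is assumed, not taken from the thread.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Neo4j: splits work into transaction-sized chunks.
final class TxBatches {

    /** Splits items into consecutive chunks of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<List<T>>();
        for (int from = 0; from < items.size(); from += batchSize) {
            int to = Math.min(from + batchSize, items.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<T>(items.subList(from, to)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<String>();
        for (int i = 0; i < 25000; i++) {
            words.add("word" + i);
        }
        for (List<String> batch : partition(words, 10000)) {
            // Assumed usage in a real program:
            //   Transaction tx = graphdb.beginTx();
            //   try { /* create or update one node per word */ tx.success(); }
            //   finally { tx.finish(); }
            System.out.println("would commit " + batch.size() + " words in one transaction");
        }
    }
}
```

The batch size of 10000 simply mirrors the number quoted in the message; the best value for a given machine and data set would have to be measured.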
Re: [Neo4j] Batch find
Yes, I am inserting 3 nodes, where for each node I use the batch index to search whether this word is already in the graph, to either update it or create another node. This took 1 second, which is too high for me.

Another problem besides the time is that I have to open a transaction and use a batch inserter inside it. That makes me open a graph database service to begin the transaction, then close it and start my batch operations, which I think is not good either. Shutting down the service and starting the batch also swallows a good bunch of milliseconds.

Each node carries only 1 string property besides the id.

Do you think this time is acceptable? From the REST API I used to insert more than 50 nodes in less than a second.

By the way, when I increased the number of inserted nodes to 500, the time was still 1 second. The problem is that I want to decrease this 1 second to maybe half a second or something like that.
Re: [Neo4j] Batch find
Yes, just executing a number of Java API calls in a single tx; this is just what the REST API does.

Batching there is on the protocol level, i.e. you need only one network operation (and serializer/deserializer call) for the whole set of operations (and none of those concerns are relevant in the Java API).

The question is: do you run into performance issues?

Michael

On 08.08.2011 at 13:57, ahmed.elsharkasy wrote:
> I got your point. The reason for asking this question is that I have already
> done similar operations from the REST API, i.e. finding groups of words with
> one request, doing more than one operation in one call.
>
> Isn't there a Java equivalent to a request like this in the REST API:
> http://docs.neo4j.org/chunked/1.4.M04/rest-api-batch-ops.html
Re: [Neo4j] Batch find
I got your point. The reason for asking this question is that I have already done similar operations from the REST API, i.e. finding groups of words with one request, doing more than one operation in one call.

Isn't there a Java equivalent to a request like this in the REST API:
http://docs.neo4j.org/chunked/1.4.M04/rest-api-batch-ops.html
Re: [Neo4j] Batch find
How many words are contained in your text document? Probably not millions or billions? Then using the batch-inserter API for that is not sensible.

Otherwise (unless you're really experiencing performance issues) I would stay with the iteration across the words (of your word set).

You might use the Lucene query syntax that I mentioned before to construct a query that looks for nodes with your words. That will give you the nodes already in the graph; you'd have to keep track of the words of your set that have already been dealt with and create the others afterwards.

On 08.08.2011 at 13:23, ahmed.elsharkasy wrote:
> Still, how can I get all the words of a document in one shot, so that I can
> determine which nodes should be inserted by batch and which nodes should be
> updated?
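The Lucene query suggested in this reply could be assembled along the following lines. This is a sketch under assumptions: the class `WordQuery` and its method `anyOf` are hypothetical helpers, the field name "word" is borrowed from the code elsewhere in the thread, and each term is simply wrapped in double quotes rather than escaped character by character.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Iterator;

// Hypothetical helper: builds one Lucene query string that matches any of the given words.
final class WordQuery {

    /** Builds e.g. word:("graph" OR "node") from the given words. */
    static String anyOf(String field, Collection<String> words) {
        if (words.isEmpty()) {
            throw new IllegalArgumentException("at least one word is required");
        }
        StringBuilder query = new StringBuilder(field).append(":(");
        Iterator<String> it = words.iterator();
        while (it.hasNext()) {
            // Quote each term; escape backslashes and double quotes inside it.
            String escaped = it.next().replace("\\", "\\\\").replace("\"", "\\\"");
            query.append('"').append(escaped).append('"');
            if (it.hasNext()) {
                query.append(" OR ");
            }
        }
        return query.append(')').toString();
    }

    public static void main(String[] args) {
        String query = anyOf("word", Arrays.asList("graph", "node"));
        // Assumed usage against the node index from the thread:
        //   IndexHits<Node> hits = index.query(query);
        System.out.println(query);
    }
}
```

Words that come back from the query are the ones to update; the remaining words of the set still need new nodes. Lucene limits the number of clauses in a boolean query (1024 by default), so a very large word set would probably have to be split across several queries.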
Re: [Neo4j] Batch find
Still, how can I get all the words of a document in one shot, so that I can determine which nodes should be inserted by batch and which nodes should be updated?
Re: [Neo4j] Batch find
You'll probably want to use an index for this: either a Lucene index or an in-graph index. I would recommend a Lucene index, since you can also leverage Lucene's (and even Solr's) analyzers and parsers to process your document.

-----Original Message-----
From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On Behalf Of ahmed.elsharkasy
Sent: Wednesday, August 03, 2011 7:15 AM
To: user@lists.neo4j.org
Subject: Re: [Neo4j] Batch find

I am trying to insert a document containing a list of words, and I want to check whether some of these words are already in my graph. In that case I will update their properties; otherwise I will create new nodes for the new words.
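As a rough illustration of what "processing the document" before touching the index might look like, the sketch below tokenizes a text and counts each distinct word, so that every word needs only one index lookup. It is a plain-Java stand-in and does not use a Lucene analyzer; the class `WordCounts`, the lower-casing, and the splitting rule are all assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for an analyzer: lower-cases, splits on non-alphanumerics, counts words.
final class WordCounts {

    /** Returns each distinct word with its number of occurrences, in first-seen order. */
    static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new LinkedHashMap<String, Integer>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (token.isEmpty()) {
                continue; // a leading separator produces one empty token
            }
            Integer seen = counts.get(token);
            counts.put(token, seen == null ? 1 : seen + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(count("The graph, the node; the GRAPH!"));
    }
}
```

With the counts in hand, a word that occurs many times in one document costs a single lookup and a single property update instead of one per occurrence.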
Re: [Neo4j] Batch find
That should be "without having to do any lookups".

> From: pd_aficion...@hotmail.com
> To: user@lists.neo4j.org
> Date: Wed, 3 Aug 2011 13:37:44 +0200
> Subject: Re: [Neo4j] Batch find
>
> The batch insert is intended to push data into the database with having to
> do any look ups.
> You could preprocess your input data, such that it can be loaded in one go.
> You could for example read your input file against an existing database,
> fetch the IDs of nodes and relationships that contain the information you
> need to update, and create two new input files: one containing data that can
> be inserted using the batch inserter, and one containing the information
> that needs to be updated (including the IDs of the PropertyContainers that
> need to be updated).
> Niels
Re: [Neo4j] Batch find
The batch insert is intended to push data into the database with having to do any look ups.

You could preprocess your input data, such that it can be loaded in one go. You could for example read your input file against an existing database, fetch the IDs of nodes and relationships that contain the information you need to update, and create two new input files: one containing data that can be inserted using the batch inserter, and one containing the information that needs to be updated (including the IDs of the PropertyContainers that need to be updated).

Niels

> Date: Wed, 3 Aug 2011 04:14:44 -0700
> From: ahmed.elshark...@gmail.com
> To: user@lists.neo4j.org
> Subject: Re: [Neo4j] Batch find
>
> I am trying to insert a document containing a list of words, and I want to
> check whether some of these words are already in my graph. In that case I
> will update their properties; otherwise I will create new nodes for the new
> words.
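The preprocessing step described above, splitting the input into data to insert and data to update, might be sketched as follows. Everything here is hypothetical: the class `LoadPlan`, its fields, and the `existingIds` map (word to node ID, assumed to have been fetched from the existing database beforehand) are illustrative names, and in-memory collections stand in for the two input files.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;

// Hypothetical result of preprocessing: what to batch-insert and what to update by ID.
final class LoadPlan {
    final List<String> toInsert = new ArrayList<String>();
    final Map<Long, String> toUpdate = new LinkedHashMap<Long, String>();

    /** Routes each distinct word to the insert list or, if its ID is known, to the update map. */
    static LoadPlan split(Collection<String> words, Map<String, Long> existingIds) {
        LoadPlan plan = new LoadPlan();
        for (String word : new LinkedHashSet<String>(words)) { // drop duplicates, keep order
            Long id = existingIds.get(word);
            if (id == null) {
                plan.toInsert.add(word);
            } else {
                plan.toUpdate.put(id, word);
            }
        }
        return plan;
    }

    public static void main(String[] args) {
        Map<String, Long> existingIds = new HashMap<String, Long>();
        existingIds.put("node", 7L);
        LoadPlan plan = split(Arrays.asList("graph", "node", "graph", "edge"), existingIds);
        System.out.println("insert: " + plan.toInsert + ", update: " + plan.toUpdate);
    }
}
```

The insert list could then feed a pure batch-insertion run, while the update map would be applied separately, which keeps the two APIs out of the same program.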
Re: [Neo4j] Batch find
I am trying to insert a document containing a list of words, and I want to check whether some of these words are already in my graph. In that case I will update their properties; otherwise I will create new nodes for the new words.
Re: [Neo4j] Batch find
Ahmed,

are you trying to find a text or name, or a node? I am not sure what you mean. Do you have some example code so we can understand your problem better?

Cheers,

/peter neubauer

GTalk: neubauer.peter
Skype: peter.neubauer
Phone: +46 704 106975
LinkedIn: http://www.linkedin.com/in/neubauer
Twitter: http://twitter.com/peterneubauer

http://www.neo4j.org - Your high performance graph database.
http://startupbootcamp.org/ - Öresund - Innovation happens HERE.
http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.

On Wed, Aug 3, 2011 at 1:25 AM, ahmed.elsharkasy wrote:
> How can I batch find a whole document in Neo4j instead of looping through
> the document words and searching one by one?
> I am using Java.