Hi Rob,
As mentioned above by Frank and Michele, your initial approach used
single-document operations that cannot run in parallel (using the synchronous
driver).
All options stated above are based on 2M records:
Option 1: using Java threads and a transaction size of 500: *3 min 15 sec.*
Option 2: using AQL: *1 min 12 sec.*
Option 3: using batches of documents: *1 min 5 sec.*
Your mileage will vary; this was a test on a local machine.
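
For reference, here is a rough sketch of what options 2 and 3 can look like
with the Java driver. It is not the exact code behind the numbers above; it
just reuses the database "mydb" and the collection "MY_test_vertex_from1" from
Michele's test code below, and the batch size of 500 and the key prefixes are
arbitrary placeholders. Option 1 is essentially the per-record stream
transaction code further down, submitted to a thread pool instead of run in a
single loop.

import com.arangodb.ArangoDB;
import com.arangodb.ArangoDatabase;
import com.arangodb.entity.BaseDocument;
import com.arangodb.util.MapBuilder;

import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

public class BulkInsertSketch {
    public static void main(String[] args) {
        ArangoDB arangoDB = new ArangoDB.Builder().host("localhost", 8529).build();
        ArangoDatabase db = arangoDB.db("mydb");

        // Option 3: send a whole batch of documents to the server in one request
        List<BaseDocument> batch = new ArrayList<>();
        for (int i = 0; i < 500; i++) {
            batch.add(new BaseDocument("batch-" + UUID.randomUUID()));
        }
        db.collection("MY_test_vertex_from1").insertDocuments(batch);

        // Option 2: insert a batch of documents through a single AQL query
        List<BaseDocument> aqlBatch = new ArrayList<>();
        for (int i = 0; i < 500; i++) {
            aqlBatch.add(new BaseDocument("aql-" + UUID.randomUUID()));
        }
        db.query("FOR d IN @docs INSERT d INTO MY_test_vertex_from1",
                new MapBuilder().put("docs", aqlBatch).get(), Void.class);

        arangoDB.shutdown();
    }
}

The point of both variants is the same: amortize the client/server round trip
over many documents instead of paying it once per document.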
On Wednesday, February 26, 2020 at 3:26:22 PM UTC+1, Rob Gratz wrote:
>
>
> If you extrapolate your results across 2M records, you end up with the
> results I'm getting, which is about 35+ minutes. I am adding the same data
> through Neo4j and it takes roughly 6-7 minutes.
>
>
> On Saturday, February 15, 2020 at 11:05:47 AM UTC-7, Michele Rastelli
> wrote:
>>
>> What and how are you measuring exactly? What numbers do you get? And what
>> is the execution time that you get in Neo4j?
>> Your code is correct and you should get good performance running it.
>>
>> I have slightly modified your code to measure the performance, and on my
>> machine it takes on average around 1.1 ms to execute your code. I executed
>> it against a single-instance DB (version 3.6.1-community) running in a
>> local Docker container.
>>
>>
>> Here is the code:
>>
>> import com.arangodb.ArangoDB;
>> import com.arangodb.ArangoGraph;
>> import com.arangodb.entity.BaseDocument;
>> import com.arangodb.entity.BaseEdgeDocument;
>> import com.arangodb.entity.EdgeDefinition;
>> import com.arangodb.entity.StreamTransactionEntity;
>> import com.arangodb.model.EdgeCreateOptions;
>> import com.arangodb.model.StreamTransactionOptions;
>> import com.arangodb.model.VertexCreateOptions;
>>
>> import java.util.Collections;
>> import java.util.Date;
>> import java.util.UUID;
>>
>> public class Test {
>>     static String DATABASE = "mydb";
>>     static String GRAPH = "mygraph";
>>
>>     public static void main(String[] args) {
>>         ArangoDB arangoDB = new ArangoDB.Builder()
>>                 .host("localhost", 8529)
>>                 .build();
>>
>>         if (arangoDB.db(DATABASE).exists()) {
>>             arangoDB.db(DATABASE).drop();
>>         }
>>         arangoDB.db(DATABASE).create();
>>
>>         arangoDB.db(DATABASE).createCollection("MY_test_vertex_from1");
>>         arangoDB.db(DATABASE).createCollection("MY_test_vertex_from2");
>>         arangoDB.db(DATABASE).createCollection("MY_test_vertex_to");
>>         arangoDB.db(DATABASE).createGraph(GRAPH,
>>                 Collections.singletonList(new EdgeDefinition()
>>                         .collection("MY_test_edge")
>>                         .from("MY_test_vertex_from1", "MY_test_vertex_from2")
>>                         .to("MY_test_vertex_to")
>>                 ));
>>
>>         int iterations = 1_000;
>>
>>         // warmup
>>         for (int i = 0; i < iterations; i++) {
>>             String from1 = "from1-" + UUID.randomUUID().toString();
>>             String from2 = "from2-" + UUID.randomUUID().toString();
>>             String to = "to-" + UUID.randomUUID().toString();
>>             addNodes(arangoDB, from1, from2, to);
>>             addEdge(arangoDB, from1, from2, to);
>>         }
>>
>>         long start = new Date().getTime();
>>         for (int i = 0; i < iterations; i++) {
>>             String from1 = "from1-" + UUID.randomUUID().toString();
>>             String from2 = "from2-" + UUID.randomUUID().toString();
>>             String to = "to-" + UUID.randomUUID().toString();
>>             addNodes(arangoDB, from1, from2, to);
>>             addEdge(arangoDB, from1, from2, to);
>>         }
>>         long end = new Date().getTime();
>>         long elapsed = end - start;
>>         System.out.println("elapsed: " + elapsed + " ms");
>>         System.out.println("avg: " + (1.0 * elapsed / iterations) + " ms");
>>         arangoDB.shutdown();
>>     }
>>
>>     private static void addEdge(ArangoDB arangoDB, String from1, String from2, String to) {
>>         ArangoGraph graph = arangoDB.db(DATABASE).graph(GRAPH);
>>         String[] collections = new String[]{"MY_test_edge"};
>>
>>         StreamTransactionEntity tx = graph.db().beginStreamTransaction(
>>                 new StreamTransactionOptions()
>>                         .waitForSync(false)
>>                         .writeCollections(collections));
>>         EdgeCreateOptions options = new EdgeCreateOptions()
>>                 .streamTransactionId(tx.getId())
>>                 .waitForSync(false);
>>
>>         try {
>>             BaseEdgeDocument edge = new BaseEdgeDocument("MY_test_vertex_from1/" + from1, "MY_test_vertex_to/" + to);
>>             graph.edgeCollection("MY_test_edge").insertEdge(edge, options);
>>
>>             edge = new BaseEdgeDocument("MY_test_vertex_from2/" + from2, "MY_test_vertex_to/" + to);
>>             graph.edgeCollection("MY_test_edge").insertEdge(edge, options);
>>
>>             graph.db().commitStreamTransaction(tx.getId());
>>         } catch (Exception e) {
>>             graph.db().abortStreamTransaction(tx.getId());
>>             throw e;
>>         }
>>     }
>>
>>     private static void addNodes(ArangoDB arangoDB, String from1, String from2, String to) {
>>         ArangoGraph graph = arangoDB.db(DATABASE).graph(GRAPH);
>>
>>         String[] collections = new String[]{"MY_test_vertex_from1", "MY_test_vertex_from2", "MY_test_vertex_to"};
>>         StreamTransactionEntity tx = graph.db().beginStreamTransaction(
>>                 new StreamTransactionOptions()
>>                         .waitForSync(false)
>>                         .writeCollections(collections));
>>         VertexCreateOptions options = new VertexCreateOptions()
>>                 .streamTransactionId(tx.getId())
>>                 .waitForSync(false);
>>         try {
>>             graph.vertexCollection("MY_test_vertex_from1").insertVertex(new BaseDocument(from1), options);
>>             graph.vertexCollection("MY_test_vertex_from2").insertVertex(new BaseDocument(from2), options);
>>             graph.vertexCollection("MY_test_vertex_to").insertVertex(new BaseDocument(to), options);
>>             graph.db().commitStreamTransaction(tx.getId());
>>         } catch (Exception e) {
>>             e.printStackTrace();
>>             graph.db().abortStreamTransaction(tx.getId());
>>             throw e;
>>         }
>>     }
>> }
>>
>>
>>
>>
>>
>>
>> On Friday, 14 February 2020 19:27:49 UTC+1, Rob Gratz wrote:
>>>
>>>
>>> I am in the process of evaluating a number of different graph databases
>>> for use in an existing application. This application currently uses Neo4j
>>> as the repository, but we are looking at whether a switch would make sense.
>>> As part of the evaluation process, we have created a test harness to
>>> perform consistent tests across the databases we are evaluating, ArangoDB
>>> being one of them. One of the simple tests we are doing is to add nodes
>>> and edges individually and in batches, since that is how our application
>>> would interact with the DB. What I have found is that ArangoDB is 5-6 times
>>> slower than Neo4j in these tests. I am currently using the Java driver to
>>> perform the tests and doing the inserts using the graph API, not the
>>> generic collection API. I have been following the ArangoDB-provided Java
>>> guides for adding the nodes and edges and using the stream transaction API
>>> for doing the batches, so I'm not sure where I could be going wrong in my
>>> approach. With that said, I don't know how ArangoDB could be that much
>>> slower than Neo4j.
>>>
>>> Following are examples of how I am adding the data (this isn't the
>>> harness, just an example of how we are adding data). Any feedback as to
>>> how I can improve the performance would be greatly appreciated.
>>>
>>> private void addEdge(ArangoDB arangoDB)
>>> {
>>>     ArangoGraph graph = arangoDB.db(DATABASE).graph(GRAPH);
>>>     String[] collections = new String[] {"MY_test_edge"};
>>>
>>>     StreamTransactionEntity tx = graph.db().beginStreamTransaction(
>>>             new StreamTransactionOptions()
>>>                     .waitForSync(false)
>>>                     .writeCollections(collections));
>>>     EdgeCreateOptions options = new EdgeCreateOptions()
>>>             .streamTransactionId(tx.getId())
>>>             .waitForSync(false);
>>>
>>>     System.out.println("Transaction collections: " + String.join(",", collections));
>>>     try
>>>     {
>>>         BaseEdgeDocument edge = new BaseEdgeDocument("MY_test_vertex_from1/MY_from_key1",
>>>                 "MY_test_vertex_to/MY_to_key");
>>>         graph.edgeCollection("MY_test_edge").insertEdge(edge, options);
>>>
>>>         edge = new BaseEdgeDocument("MY_test_vertex_from2/MY_from_key2",
>>>                 "MY_test_vertex_to/MY_to_key");
>>>         graph.edgeCollection("MY_test_edge").insertEdge(edge, options);
>>>
>>>         graph.db().commitStreamTransaction(tx.getId());
>>>     }
>>>     catch (Exception e)
>>>     {
>>>         graph.db().abortStreamTransaction(tx.getId());
>>>         throw e;
>>>     }
>>> }
>>>
>>> private void addNodes(ArangoDB arangoDB)
>>> {
>>>     ArangoGraph graph = arangoDB.db(DATABASE).graph(GRAPH);
>>>
>>>     String[] collections = new String[] {"MY_test_vertex_from1",
>>>             "MY_test_vertex_from2", "MY_test_vertex_to"};
>>>     StreamTransactionEntity tx = graph.db().beginStreamTransaction(
>>>             new StreamTransactionOptions()
>>>                     .waitForSync(false)
>>>                     .writeCollections(collections));
>>>     VertexCreateOptions options = new VertexCreateOptions()
>>>             .streamTransactionId(tx.getId())
>>>             .waitForSync(false);
>>>     try
>>>     {
>>>         graph.vertexCollection("MY_test_vertex_from1").insertVertex(new BaseDocument("MY_from_key1"), options);
>>>         graph.vertexCollection("MY_test_vertex_from2").insertVertex(new BaseDocument("MY_from_key2"), options);
>>>         graph.vertexCollection("MY_test_vertex_to").insertVertex(new BaseDocument("MY_to_key"), options);
>>>         graph.db().commitStreamTransaction(tx.getId());
>>>     }
>>>     catch (Exception e)
>>>     {
>>>         graph.db().abortStreamTransaction(tx.getId());
>>>     }
>>> }
>>>
>>>