> How big is your file the sort cannot write?

One bil-ee-on lines... :-P
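(For what it's worth, GNU sort spills its intermediate runs to a temporary directory, so "No space left on device" often means the temp volume filled up rather than the output disk; pointing `-T` at a bigger volume and raising the in-memory buffer with `-S` may get it through. A sketch, using a toy stand-in for the real billion-line file:)

```shell
# Toy stand-in for the real payments file (FROM_ACCOUNT TO_ACCOUNT AMOUNT)
printf 'ACC2 ACC9 10\nACC1 ACC7 5\nACC1 ACC3 2\n' > payments.txt

# Sort on FROM_ACCOUNT (field 1) only.
# -T: spill temp runs to a volume with room (here the current directory);
#     the usual failure mode is the default temp dir filling up.
# -S: in-memory buffer size before sort starts spilling runs to disk.
sort -t ' ' -k1,1 -S 64M -T . payments.txt -o payments.sorted.txt

head -n 1 payments.sorted.txt   # first line is now an ACC1 row
```

On the real file, `--parallel=N` is also worth adding so the merge uses several cores.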
> ...This should help a lot.

The trouble is that the size of a block of contiguous accounts in the real
data is non-uniform (even if it might be with my test data). Therefore, it
is highly likely that a contiguous block of account numbers will span 2 or
more batches. This will lead to a lot of contention. In your example, if
Account 2 spills over into the next batch, chances are I'll have to roll
back that batch. Don't you also have a problem that if X, Y, Z and W in
your example are account numbers in the next batch, you'll also get
contention? Admittedly, randomization doesn't solve this problem either.

> you can use the special Batch Importer: OGraphBatchInsert

Would this not be subject to the same contention problems? At what point is
it flushed to disk? (Obviously, it can't live in the heap forever.)

> You should definitely be using transactions with a batch size of 100 items.

I thought I read somewhere else (can't find the link at the moment) that
you said to only use transactions when using the remote protocol?

> Please use the latest 2.2.10. ... try to define 50GB of DISKCACHE and 14GB of Heap

Will do on the next run.

> If it happens again, could you please send a thread dump?

I have the full thread dump but it's on my work machine, so I can't post it
in this forum (all access to Google Groups is banned by the bank, so I am
writing this on my personal computer). Happy to email it to you. Which
email shall I use?

Phill

On Friday, September 23, 2016 at 7:41:29 AM UTC+1, l.garulli wrote:
>
> On 23 September 2016 at 00:49, Phillip Henry <phill...@gmail.com> wrote:
>
>> Hi, Luca.
>
> Hi Phillip.
>
>> I have:
>>
>> 4. sorting is an overhead, albeit outside of Orient. Using the Unix sort
>> command failed with "No space left on device". Oops. OK, so I ran my
>> program to generate the data again, this time it is ordered by the first
>> account number.
>> Performance was much slower as there appeared to be a lot of contention
>> for this account (ie, all writes were contending for this account, even
>> if the other account had less contention). More randomized data was
>> faster.
>
> How big is your file the sort cannot write? Anyway, if you have the
> accounts sorted, you should have transactions of about 100 items where the
> bank account and its edges are in the same transaction. This should help a
> lot. Example:
>
> Account 1 -> Payment 1 -> Account X
> Account 1 -> Payment 2 -> Account Y
> Account 1 -> Payment 3 -> Account Z
> Account 2 -> Payment 1 -> Account X
> Account 2 -> Payment 1 -> Account W
>
> If the transaction batch is 5 (I suggest you start with 100), all these
> operations are executed in one transaction. If another thread has:
>
> Account 99 -> Payment 1 -> Account W
>
> it could conflict because of the shared Account W.
>
> If you can export Accounts' IDs as incremental numbers, you can use the
> special Batch Importer, OGraphBatchInsert. Example:
>
> OGraphBatchInsert batch = new OGraphBatchInsert("plocal:/temp/mydb", "admin", "admin");
> batch.begin();
>
> batch.createEdge(0L, 1L, null); // CREATE AN EDGE BETWEEN VERTICES 0 AND 1.
>                                 // IF THE VERTICES DON'T EXIST, THEY ARE
>                                 // CREATED IMPLICITLY
> batch.createEdge(1L, 2L, null);
> batch.createEdge(2L, 0L, null);
>
> batch.createVertex(3L); // CREATE A NON-CONNECTED VERTEX
>
> Map<String, Object> vertexProps = new HashMap<String, Object>();
> vertexProps.put("foo", "foo");
> vertexProps.put("bar", 3);
> batch.setVertexProperties(0L, vertexProps); // SET PROPERTIES FOR VERTEX 0
> batch.end();
>
> This is blazing fast, but it uses the heap, so run it with plenty of it.
>
>> 6. I've multithreaded my loader. The details are now:
>>
>> - using plocal
>> - using 30 threads
>> - not using transactions (OrientGraphFactory.getNoTx)
>
> You should definitely be using transactions with a batch size of 100
> items. This speeds up things.
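The batching scheme described above, transactions of roughly 100 rows that keep one account's contiguous block inside a single transaction, can be sketched in plain Java. This is illustrative only: `cutBatches` and the row layout are assumptions, not the actual loader or OrientDB API, but it shows how to size batches without ever splitting an account's block across two transactions, which is what causes the cross-batch conflicts discussed earlier.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: rows are {FROM_ACCOUNT, TO_ACCOUNT, AMOUNT}, pre-sorted by
// FROM_ACCOUNT. A batch is closed only at an account boundary, once it has
// reached the target size, so no account's block spans two batches.
public class BatchCutter {

    static List<List<String[]>> cutBatches(List<String[]> sortedRows, int target) {
        List<List<String[]>> batches = new ArrayList<>();
        List<String[]> current = new ArrayList<>();
        for (int i = 0; i < sortedRows.size(); i++) {
            current.add(sortedRows.get(i));
            boolean lastRow = (i == sortedRows.size() - 1);
            // True when the next row belongs to a different FROM_ACCOUNT.
            boolean accountBoundary = lastRow
                    || !sortedRows.get(i)[0].equals(sortedRows.get(i + 1)[0]);
            if (current.size() >= target && accountBoundary) {
                batches.add(current);
                current = new ArrayList<>();
            }
        }
        if (!current.isEmpty()) batches.add(current);
        return batches;
    }
}
```

Each resulting batch would then be written and committed in one transaction, retrying the whole batch if it hits a write conflict.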
>
>> - retrying forever upon write collisions
>> - using Orient 2.2.7
>
> Please use the latest 2.2.10.
>
>> - using -XX:MaxDirectMemorySize=258040m
>
> This is not really important; it's just an upper bound for the JVM. Please
> set it to 512GB so you can forget about it. The 2 most important values are
> DISKCACHE and JVM heap. Their sum must be lower than the RAM available on
> the server before you run OrientDB.
>
> If you have 64GB, try to define 50GB of DISKCACHE and 14GB of Heap.
>
> If you use the Batch Importer, you should use more Heap and less DISKCACHE.
>
>> The good news is I've achieved an initial write throughput of about
>> 30k/second.
>>
>> The bad news is I've tried several runs and only been able to achieve
>> 200mil < number of writes < 300mil.
>>
>> The first time I tried it, the loader deadlocked. Using jstack showed
>> that the deadlock was between 3 threads at:
>> - OOneKeyEntryPerKeyLockManager.acquireLock(OOneKeyEntryPerKeyLockManager.java:173)
>> - OPartitionedLockManager.acquireExclusiveLock(OPartitionedLockManager.java:210)
>> - OOneKeyEntryPerKeyLockManager.acquireLock(OOneKeyEntryPerKeyLockManager.java:171)
>
> If it happens again, could you please send a thread dump?
>
>> The second time it failed was due to a NullPointerException at
>> OByteBufferPool.java:297. I've looked at the code and the only way I can
>> see this happening is if OByteBufferPool.allocateBuffer throws an error
>> (perhaps an OutOfMemoryError in java.nio.Bits.reserveMemory). This
>> StackOverflow posting
>> (http://stackoverflow.com/questions/8462200/examples-of-forcing-freeing-of-native-memory-direct-bytebuffer-has-allocated-us)
>> seems to indicate that this can happen if the underlying DirectByteBuffer's
>> Cleaner doesn't have its clean() method called.
>
> This is because the database was bigger than this setting:
> -XX:MaxDirectMemorySize=258040m. Please set it to 512GB (see above).
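Spelled out as JVM flags, the suggested memory split for a 64GB box might look like the following sketch. The launch command and jar name are hypothetical; `storage.diskCache.bufferSize` is the OrientDB system property behind DISKCACHE, expressed in MB.

```shell
# Sketch: 14GB heap + 50GB disk cache on a 64GB server.
# MaxDirectMemorySize is only an upper bound, so set it high and forget it.
java -Xmx14g \
     -XX:MaxDirectMemorySize=512g \
     -Dstorage.diskCache.bufferSize=51200 \
     -cp orientdb-server.jar com.orientechnologies.orient.server.OServerMain
```

If you run the Batch Importer instead, shift the balance the other way: more `-Xmx`, less disk cache.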
>
>> Alternatively, I followed the SO suggestion and lowered the heap space to
>> a mere 1gb (it was 50gb) to make the GC more active. Unfortunately, after a
>> good start, the job is still running some 15 hours later with a hugely
>> reduced write throughput (~7k/s). Jstat shows 4292 full GCs taking a total
>> time of 4597s - not great but not hugely awful either. At this rate, the
>> remaining 700mil or so payments are going to take another 30 hours.
>
> See above for the suggested settings.
>
>> 7. Even with the highest throughput I have achieved, 30k writes per
>> second, I'm looking at about 20 hours of loading. We've taken the same data
>> and, after trial and error that was not without its own problems, put it
>> into Neo4J in 37 minutes. This is a significant difference. It appears that
>> they are approaching the problem differently to avoid contention on
>> updating the vertices during an edge write.
>
> With all these suggestions you should be able to get much better numbers.
> If you can use the Batch Importer, the numbers should be close to Neo4j's.
>
>> Thoughts?
>>
>> Regards,
>>
>> Phillip
>
> Best Regards,
>
> Luca Garulli
> Founder & CEO
> OrientDB LTD <http://orientdb.com/>
>
> Want to share your opinion about OrientDB?
> Rate & review us at Gartner's Software Review
> <https://www.gartner.com/reviews/survey/home>
>
>> On Thursday, September 15, 2016 at 10:06:44 PM UTC+1, l.garulli wrote:
>>>
>>> On 15 September 2016 at 09:54, Phillip Henry <phill...@gmail.com> wrote:
>>>
>>>> Hi, Luca.
>>>
>>> Hi Phillip,
>>>
>>>> 3. Yes, default configuration. Apart from adding an index for ACCOUNTS,
>>>> I did nothing further.
>>>
>>> Ok, so you have writeQuorum="majority", which means 2 synchronous writes
>>> and 1 asynchronous per transaction.
>>>
>>>> 4. Good question. With real data, we expect it to be as you suggest:
>>>> some nodes with the majority of the payments (eg, supermarkets).
>>>> However, for the test data, payments were assigned randomly and,
>>>> therefore, should be uniformly distributed.
>>>
>>> What's your average in terms of number of edges? <10, <50, <200, <1000?
>>>
>>>> 2. Yes, I tried plocal minutes after posting (d'oh!). I saw a good
>>>> improvement. It started about 3 times faster and got faster still (about
>>>> 10 times faster) by the time I checked this morning on a job running
>>>> overnight. However, even though it is now running at about 7k transactions
>>>> per second, a billion edges is still going to take about 40 hours. So, I
>>>> ask myself: is there any way I can make it faster still?
>>>
>>> What's missing here is the use of the AUTO-SHARDING INDEX. Example:
>>>
>>> accountClass.createIndex("Account.number",
>>>     OClass.INDEX_TYPE.UNIQUE.toString(), (OProgressListener) null,
>>>     (ODocument) null, "AUTOSHARDING", new String[] { "number" });
>>>
>>> In this way you should get more parallelism, because the index is
>>> distributed across all the shards (clusters) of the Account class. You
>>> should have 32 of them by default because you have 32 cores.
>>>
>>> Please let me know if, with the from_accounts sorted and with this
>>> change, it's much faster.
>>>
>>> This is the best you can have out of the box. To push the numbers up is
>>> slightly more complicated: you should make sure that transactions go in
>>> parallel and aren't serialized. This is possible by playing with internal
>>> OrientDB settings (mainly the distributed workerThreads) and by having
>>> many clusters per class (you could try 128 first and see how it goes).
>>>
>>>> I assume when I start the servers up in distributed mode once more, the
>>>> data will then be distributed across all nodes in the cluster?
>>>
>>> That's right.
>>>
>>>> 3. I'll return to concurrent, remote inserts when this job has
>>>> finished.
>>>> Hopefully, a smaller batch size will mean there is no degradation in
>>>> performance either... FYI: with a somewhat unscientific approach, I was
>>>> polling the server JVM with jstack and saw only a single thread doing all
>>>> the work, and it *seemed* to spend a lot of its time in ODirtyManager on
>>>> collection manipulation.
>>>
>>> I think that's because you didn't use the AUTO-SHARDING index.
>>> Furthermore, running distributed, unfortunately, means the tree RidBag is
>>> not available (we will support it in the future), so every change to the
>>> edges takes a lot of CPU to unmarshal and marshal the entire edge list
>>> every time you update a vertex. That's the reason for my recommendation
>>> about sorting the vertices.
>>>
>>>> I totally appreciate that performance tuning is an empirical science,
>>>> but do you have any opinions as to which would probably be faster:
>>>> single-threaded plocal or multithreaded remote?
>>>
>>> With v2.2 you can go in parallel by using the tips above. For sure the
>>> replication has a cost. I'm sure you can go much faster with just one node
>>> and then start the other 2 nodes to have the database replicated
>>> automatically, at least for the first massive insertion.
>>>
>>>> Regards,
>>>>
>>>> Phillip
>>>
>>> Luca
>>>
>>>> On Wednesday, September 14, 2016 at 3:48:56 PM UTC+1, Phillip Henry
>>>> wrote:
>>>>>
>>>>> Hi, guys.
>>>>>
>>>>> I'm conducting a proof-of-concept for a large bank (Luca, we had a
>>>>> 'phone conf on August 5...) and I'm trying to bulk insert a humongous
>>>>> amount of data: 1 million vertices and 1 billion edges.
>>>>>
>>>>> Firstly, I'm impressed by how easy it was to configure a cluster.
>>>>> However, the performance of batch inserting is bad (and seems to get
>>>>> considerably worse as I add more data). It starts at about 2k
>>>>> vertices-and-edges per second and deteriorates to about 500/second after
>>>>> only about 3 million edges have been added.
>>>>> This also takes ~30 minutes. Needless to say, 1 billion payments
>>>>> (edges) will take over a week at this rate.
>>>>>
>>>>> This is a show-stopper for us.
>>>>>
>>>>> My data model is simply payments between accounts and I store it in
>>>>> one large file. It's just 3 fields and looks like:
>>>>>
>>>>> FROM_ACCOUNT TO_ACCOUNT AMOUNT
>>>>>
>>>>> In the test data I generated, I had 1 million accounts and 1 billion
>>>>> payments randomly distributed between pairs of accounts.
>>>>>
>>>>> I have 2 classes in OrientDB: ACCOUNTS (extending V) and PAYMENT
>>>>> (extending E). There is a UNIQUE_HASH_INDEX on ACCOUNTS for the account
>>>>> number (a string).
>>>>>
>>>>> We're using OrientDB 2.2.7.
>>>>>
>>>>> My batch size is 5k and I am using the "remote" protocol to connect to
>>>>> our cluster.
>>>>>
>>>>> I'm using JDK 8 and my 3 boxes are beefy machines (32 cores each) but
>>>>> without SSDs. I wrote the importing code myself but did nothing 'clever'
>>>>> (I think) and used the Graph API. This client code has been given lots of
>>>>> memory and, using jstat, I can see it is not excessively GCing.
>>>>>
>>>>> So, my questions are:
>>>>>
>>>>> 1. what kind of performance can I realistically expect and can I
>>>>> improve what I have at the moment?
>>>>>
>>>>> 2. what kind of degradation should I expect as the graph grows?
>>>>>
>>>>> Thanks, guys.
>>>>>
>>>>> Phillip
>>>>
>>>> --
>>>> ---
>>>> You received this message because you are subscribed to the Google
>>>> Groups "OrientDB" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to orient-databa...@googlegroups.com.
>>>> For more options, visit https://groups.google.com/d/optout.