Hello Danny,
Yes, I'm using 7.0-4.

>> What are you comparing it to on the Oracle side?
>> In MarkLogic, the content will be all indexed and searchable. Is that true on the Oracle side too?

The Oracle side is doing a basic CLOB insert with no indexing. The Oracle server we are comparing against is a higher-capacity system, so we expected Oracle to ingest faster. I didn't expect the MarkLogic side to be 4 times slower.

Yes, we tried tweaking the batch size. The 500 batch size had the fastest load times.

I will investigate further, but I believe the bottleneck is on the MarkLogic side. I believe the MarkLogic CPU has some room for more parallelization.

I'll create a custom REST extension that will spawn multiple threads for the doc-inserts. I assume the REST API bulk ingestion already does this, but I can't say for sure.

I'll keep you posted.

Thanks Danny

- Gary R

From: general-boun...@developer.marklogic.com [mailto:general-boun...@developer.marklogic.com] On Behalf Of Danny Sokolsky
Sent: Tuesday, October 14, 2014 2:00 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] How to optimize the REST API Bulk Ingestion Performance?

Hi Gary,

A few thoughts here.

You are using 7.0-4 on this?

What are you comparing it to on the Oracle side? In MarkLogic, the content will be all indexed and searchable. Is that true on the Oracle side too? What indexes do you have enabled? Maybe you do not need them all (or maybe you should put the equivalent indexing on the Oracle side)?

Have you tried tweaking the batch size? I would try a smaller number, say 50 or 100.

Have you analyzed where you are spending the time? In the C# code? In the code loading the doc on MarkLogic?

Do you have multiple threads loading from your .NET program? If you are not maxing out your CPU on the MarkLogic side, you probably have room for more parallelization.
-Danny

From: general-boun...@developer.marklogic.com [mailto:general-boun...@developer.marklogic.com] On Behalf Of Gary Russo
Sent: Tuesday, October 14, 2014 9:21 AM
To: general@developer.marklogic.com
Subject: [MarkLogic Dev General] How to optimize the REST API Bulk Ingestion Performance?

MarkLogic Bulk ingestion processing is slower than an equivalent Oracle ingestion process. The MarkLogic ingestion takes 30 minutes. An Oracle equivalent only takes 7 minutes.

I'm using the REST API to bulk ingest multiple documents as described here.
=> http://docs.marklogic.com/guide/rest-dev/bulk#id_54649

Notes:
- C# code is used to call the MarkLogic Bulk Ingest REST API.
- Document batch size used is 500.
- Average doc size is 1 KB.
- JSON Conversion and Validation logic occurs in the C# code.

Any thoughts on how to optimize the MarkLogic bulk ingest to make it as fast as Oracle's 7 minute load time?

Thanks,
Gary R

Gary Russo
Enterprise NoSQL Developer
http://garyrusso.wordpress.com
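
For anyone following the thread, the client-side parallelization Danny suggests (smaller batches, several loader threads) could be sketched roughly as below. This is only an illustration in Python rather than the C# used in the thread, and it makes assumptions: `make_batches`, `load_in_parallel`, and the `send_batch` callback are hypothetical names, and `send_batch` stands in for whatever code issues one bulk-write request (as described in the linked guide) and returns the number of documents it wrote. No MarkLogic API is called here.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterator, List, Sequence, Tuple

# A document is represented as (uri, json_text); this shape is an assumption.
Doc = Tuple[str, str]


def make_batches(docs: Sequence[Doc], batch_size: int) -> Iterator[List[Doc]]:
    """Yield consecutive slices of at most batch_size documents."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(docs), batch_size):
        yield list(docs[start:start + batch_size])


def load_in_parallel(
    docs: Sequence[Doc],
    send_batch: Callable[[List[Doc]], int],
    batch_size: int = 100,
    workers: int = 4,
) -> int:
    """Send batches concurrently and return the total number of docs written.

    send_batch is a hypothetical caller-supplied function that performs one
    bulk-write request for the given batch and returns how many docs it wrote.
    Any exception raised by send_batch propagates to the caller.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() hands each batch to a worker thread; results come back in order.
        return sum(pool.map(send_batch, make_batches(docs, batch_size)))
```

With 1 KB documents, one way to use this would be to start with a batch size of 50 or 100 and a handful of workers, then raise the worker count while watching CPU on the MarkLogic host, stopping once the server is close to saturated or load times stop improving.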