I reported earlier that my profiling application was causing MarkLogic to restart after handling about 20,000 tasks. It turns out it was an out-of-memory issue on the server itself (currently configured with 256GB of RAM). We could see a distinct spike in memory usage, at which point the server restarted MarkLogic. I tried different input data sets, and it doesn't appear to be an issue with a particular input document (my data set has a few outliers that are much larger than typical, but only a few).
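For context, each task is profiled with the request-level Profile API (prof:enable, prof:disable, prof:report) and the report is saved for later analysis. Roughly like this simplified sketch (the task function and URIs here are placeholders, not my actual code):

```xquery
xquery version "1.0-ml";

(: Placeholder for the real per-task work :)
declare function local:process-task($uri as xs:string) {
  fn:doc($uri)
};

let $task-uri := "/tasks/example.xml"
return (
  (: Turn on profiling for the current request :)
  prof:enable(xdmp:request()),
  let $result := local:process-task($task-uri)
  return (
    prof:disable(xdmp:request()),
    (: Persist the profile report so it can be analyzed later :)
    xdmp:document-insert(
      fn:concat("/profiles/", xdmp:request(), ".xml"),
      document { prof:report(xdmp:request()) }
    ),
    $result
  )
)
```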
Subsequent testing determined that it was the use of the MarkLogic profiler that was causing the memory spike: if I turned off the profiler, memory usage was flat and all the tasks completed as expected. This is ML 8.03. I'm still working on getting my server upgraded to a newer version of MarkLogic so I can check whether this is an issue that has already been fixed.

So it looks like there's some kind of memory leak related to the profiler, and I'd like to understand what that issue is so I can either avoid it or report it formally. If it's a general potential problem with large-scale processing, I'd like to understand how to avoid it or plan for it. If it's a problem specific to the profiler, then I need to report it formally and provide appropriate diagnostics.

So my questions:

1. Is this a known issue with profiling? I'm guessing not, in that I'm probably doing something out-of-the-ordinary vis-à-vis profiling, something nobody would see in typical single-instance ad-hoc profiling.

2. What types of MarkLogic processing would cause this kind of memory spike that persists across the execution of multiple tasks? I would expect the memory required for a given task to be released as soon as the task is complete, so I'm guessing it must be an issue with caches or something?

Thanks,

Eliot
--
Eliot Kimber
http://contrext.com
_______________________________________________
General mailing list
[email protected]
Manage your subscription at:
http://developer.marklogic.com/mailman/listinfo/general
