Re: what Geert responded with, I can see a scenario where a very large file ingests fine, then index settings are changed in a way that makes its termlist size expand significantly. With no knowledge of the document or its size, it’s hard to say what the best thing to do here is. I would look at:
· Should this (presumably) huge document be one big document, or would it be better modeled as multiple, smaller documents, as Geert suggests? Keep in mind that in MarkLogic, a document should equate to a record in a traditional database. Think of it as a de-normalized row rather than as a table full of rows. With this in mind, consider splitting it into multiple, smaller documents.
· Are the index settings necessary for the application? If the index blew up significantly, it’s probably due to positions and/or wildcard indexes. Which indexes did you enable that require reindexing? Do you need them all? If you just turned on positions and you require them, consider setting a lower positions list max size.

From: [email protected] On Behalf Of Indrajeet Verma
Sent: Monday, May 11, 2015 2:35 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] XDMP-FRAGTOOLARGE

Shashi,

I think you should review your logic for splitting and loading the XML document. What method or tool are you using to ingest content (mlcp, recordloader, etc.)? You can write your own logic to split files on the element that you want to become the root element of each document.

Also, please take a look at Geert’s suggestions about the journal size. I am sure there is some problem with your ingestion and XML size; otherwise the default configuration also works without any issues. Of course, you can optimize it later if needed.

You should also look at your number of forests; otherwise you cannot show your customer the actual power of ML (mainly search performance, Big Data management, etc.).

Regards,
Indy

On Mon, May 11, 2015 at 11:49 PM, Geert Josten <[email protected]> wrote:

Hi Shashidhar,

I’m wondering how large the original file was, probably not 32 GB. I’m also wondering how it ended up getting inserted without trouble. It is almost as if memory values have been tuned down afterwards.
I’d decrease the in memory list size to a more reasonable value, and take a look at the in memory tree size as well. It is also suggested to keep the journal size larger than list size + tree size at minimum.

You could of course delete the file; that should just work, but I can’t judge whether the file contains valuable information or not.

It is also suggested to split that file into smaller parts. It is best to do that at ingest time, but if that is not an option, fragmentation might help here. A word of warning, though: fragmentation also influences how queries behave and has some other side effects as well. We typically recommend against it.

Kind regards,
Geert

From: Shashidhar Rao <[email protected]>
Date: Monday, May 11, 2015 at 6:52 PM
To: Geert Josten <[email protected]>
Subject: XDMP-FRAGTOOLARGE

Hi Geert,

I am getting this error. I tried posting but am not getting any replies. Can you suggest anything to resolve it?

There is currently an XDMP-FORESTERR: Error in reindex of forest PROD_DB_1: XDMP-REINDEX: Error reindexing fn:doc("/home/data/TD078999.XML"): XDMP-FRAGTOOLARGE: Fragment of /home/data/TD078999.XML too large for in-memory storage: : In-memory list storage full; list: table=100%, wordsused=50%, wordsfree=25%, overhead=25%; tree: table=0%, wordsused=6%, wordsfree=94%, overhead=0% exception. Information on this page may be missing.

Any suggestion on how to resolve this error? My in memory list is 32699 MB. How can I increase this value, or can I delete this file? Please help.

Thanks

_______________________________________________
General mailing list
[email protected]
Manage your subscription at: http://developer.marklogic.com/mailman/listinfo/general
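
Several replies in this thread recommend splitting the large file into smaller documents before ingest, with each record element becoming the root of its own document. A minimal sketch of that idea in Python follows. It is only an illustration and not a MarkLogic tool or API: the function name `split_xml`, the `record_tag` parameter, and the `part-NNNNNN.xml` output naming are all hypothetical, and it assumes the record elements are direct, non-nested children of the root element. For namespaced input, `record_tag` would have to be given in ElementTree's `{namespace}name` form.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def split_xml(source, record_tag, out_dir):
    """Write every <record_tag> element of `source` to its own XML file.

    Returns the list of files written, in document order.
    Assumes (hypothetically) that records are direct children of the root.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []

    # iterparse streams the input instead of loading it all at once.
    events = ET.iterparse(source, events=("start", "end"))
    _, root = next(events)  # the first event is the start of the root element

    for event, elem in events:
        if event != "end" or elem.tag != record_tag:
            continue
        elem.tail = None  # drop any text that followed the record's end tag
        path = out_dir / f"part-{len(written) + 1:06d}.xml"
        ET.ElementTree(elem).write(path, encoding="utf-8", xml_declaration=True)
        written.append(path)
        root.clear()  # release records that have already been written

    return written
```

Calling `split_xml("TD078999.XML", "record", "parts")` would write one file per `record` element into `parts/`, where `record` stands in for whatever element marks a record in the real data. Each output file could then be loaded as a separate document with whichever ingestion tool is already in use.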
