Hi Mike,

Please find my comments below.

Which version of MarkLogic are you using? On what OS?

Gnana: MarkLogic 5.0.3, Linux

50 threads could be too many. How many CPU cores does the host have? How much RAM?

Gnana: As MarkLogic supports 256 threads, I thought 50 threads would be fine. I am just performing the validation and, if a file is not valid, quarantining it.

CPU cores: 2
RAM: 2GB

             total       used       free     shared    buffers     cached
Mem:          2008       1953         55          0        118        234
-/+ buffers/cache:       1601        407
Swap:         3327         37       3290

Standalone validation is a read-only query. But with triggers it changes to a database update context. It shouldn't be surprising that updates are slower than read-only operations.

How are you using triggers, exactly? Did you modify an existing CPF pipeline, or are you using the raw triggers API?

Gnana: I am currently using raw triggers, as I am generating the files dynamically. Once the files are copied into a particular folder, the triggers fire and are handled by an XQuery module.

Where is the XSLT stored? What about the schema?

Gnana: As I am currently using Schematron, I stored the Schematron files in the content database, and the XSLTs are stored in the Modules database.

Thanks and Regards,
Gnanaprakash Bodireddy

-----Original Message-----
From: general-boun...@developer.marklogic.com [mailto:general-boun...@developer.marklogic.com] On Behalf Of general-requ...@developer.marklogic.com
Sent: Wednesday, November 28, 2012 1:30 AM
To: general@developer.marklogic.com
Subject: General Digest, Vol 101, Issue 54

Send General mailing list submissions to
    general@developer.marklogic.com

To subscribe or unsubscribe via the World Wide Web, visit
    http://developer.marklogic.com/mailman/listinfo/general
or, via email, send a message with subject or body 'help' to
    general-requ...@developer.marklogic.com

You can reach the person managing the list at
    general-ow...@developer.marklogic.com

When replying, please edit your Subject line so it is more specific than "Re: Contents of General digest..."

Today's Topics:

   1. Re: Need help with SQL (Mary Holstege)
   2. Re: cts:element-value-match and cts:element-range-query questions (Michael Blakeley)
   3. Re: Performance Issue with Schematron Validation (Michael Blakeley)
   4. Re: cts:element-value-match and cts:element-range-query questions (Gajanan Chinchwadkar)
   5. Re: cts:element-value-match and cts:element-range-query questions (John Zhong)

----------------------------------------------------------------------

Message: 1
Date: Tue, 27 Nov 2012 08:30:50 -0800
From: "Mary Holstege" <mary.holst...@marklogic.com>
Subject: Re: [MarkLogic Dev General] Need help with SQL
To: "MarkLogic Developer Discussion" <general@developer.marklogic.com>
Message-ID: <op.wofxhow9fi7...@mary-apple.marklogic.com>
Content-Type: text/plain; charset=utf-8; format=flowed; delsp=yes

On Mon, 26 Nov 2012 18:01:18 -0800, Ankur Patwa <ankur.pa...@icainformatics.com> wrote:

> All,
> I found the answer in SQL Modelling guide in Fragment Roots in
> Database section.
> One caveat with it is that it breaks other cts queries and free text
> searches.
> Our fragments are defined at document level right now and would like
> to keep it that way.
> I've tried creating a Fragment Root at document's root element but it
> looks like the fragments do not jive well with one another.
> Should I be creating a Fragment Parent at the root level but to no avail.
> What should I try next?
> Any help/insight is much appreciated.
>
> Thanks in advance.
>
> Best,
> Ankur
>

When we are computing rows, we compute the cross product of all values in the fragment for each of the columns. Since your fragment contains multiple instances of the given columns, you will get multiple rows. What is more, all of this selection is unfiltered (as is execution of MATCH clauses), so you need to ensure that you have sufficient indexes to give the results you want under these circumstances.

Self joining isn't going to help at all, because all those rows from the same document will have the same URI anyway.
Right now there is no way to require that the columns for a row all fall under the *same* element instance. Nothing in your view specification is saying that that should be the case.

We do understand that people want richer data modeling options for SQL, and are looking at ways we can improve. The right way to think about modeling for SQL access is that a document (fragment) is a row.

//Mary

mary.holst...@marklogic.com
Principal Engineer
MarkLogic Corporation

------------------------------

Message: 2
Date: Tue, 27 Nov 2012 09:22:02 -0800
From: Michael Blakeley <m...@blakeley.com>
Subject: Re: [MarkLogic Dev General] cts:element-value-match and cts:element-range-query questions
To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Message-ID: <9dfd5d1d-e00c-4132-89fc-b08e4afac...@blakeley.com>
Content-Type: text/plain; charset=us-ascii

Typically one wouldn't create a range index for complex elements like 'chapter'. Range indexes are designed for simple element values and element-attribute values.

Have you considered creating a field and using http://docs.marklogic.com/cts:field-value-match or http://docs.marklogic.com/cts:field-word-match instead? You should be able to create a 'chapter' field that includes the 'chapter' element but excludes 'b'.

Or you *might* be able to do something with ML6 and a path index: see http://docs.marklogic.com/guide/admin/range_index#id_54948 for details. But the path '//chapter/text()' seems to be invalid so I'm not sure what the right approach would be.

-- Mike

On 26 Nov 2012, at 21:11 , John Zhong <j...@yuxipacific.com> wrote:

> Hi,
>
> I am using ML 5.0-2 version, and configured a element range index (string
> type) on a complex element, for example, chapter, which has both text node
> and child element.
> Like:
>
> <chapter>This is chapter <b>one</b>.</chapter>
>
> Then, I used cts:element-value-match to search the value:
>
> cts:element-value-match(
>     xs:QName("chapter"),
>     ("*one*"),
>     ("concurrent")
> )
>
> It returned the:
>
> This is chapter one.
>
> Is this expected behavior? If so, how to constrain that if I just want
> to match the direct text node? (In this case, I don't want this
> chapter element returned)
>
> Also, if I use cts:element-range-query to search all the chapter elements
> that have the value "This is chapter one.", that chapter element is returned
> too. Is this expected behavior either? Same question, if I don't want this
> chapter element returned, how to do that?
>
> Thanks,
> John
> _______________________________________________
> General mailing list
> General@developer.marklogic.com
> http://developer.marklogic.com/mailman/listinfo/general

------------------------------

Message: 3
Date: Tue, 27 Nov 2012 09:22:41 -0800
From: Michael Blakeley <m...@blakeley.com>
Subject: Re: [MarkLogic Dev General] Performance Issue with Schematron Validation
To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Message-ID: <7b452bb4-02e7-4647-8e89-fb51d78b5...@blakeley.com>
Content-Type: text/plain; charset=windows-1252

Which version of MarkLogic are you using? On what OS?

50 threads could be too many. How many CPU cores does the host have? How much RAM?

Standalone validation is a read-only query. But with triggers it changes to a database update context. It shouldn't be surprising that updates are slower than read-only operations.

How are you using triggers, exactly? Did you modify an existing CPF pipeline, or are you using the raw triggers API?

Where is the XSLT stored? What about the schema?

-- Mike

On 26 Nov 2012, at 23:48 , <gnanaprakash.bodire...@cognizant.com> wrote:

> Hi
>
> I am currently validating XML's using Schematron. But I am facing a
> performance issue with this.
>
> Currently I am generating large number of documents on the fly and using
> triggers (50 Threads) to validate the XML's.
>
> Each XML when validated individually is taking around 0.4s but when using
> triggers it is shooting up to 10 seconds per XML.
>
> While profiling the validation the below line of code is taking
> around 0.3 seconds which is part of schematron.xqy file
>
> fn:unordered(xdmp:xslt-invoke($include, $sch))
>
> Can anyone help me in improving the performance in validating xml using
> Schematron.
>
> Thanks and Regards,
>
> Gnanaprakash Bodireddy
>
> This e-mail and any files transmitted with it are for the sole use of
> the intended recipient(s) and may contain confidential and privileged
> information. If you are not the intended recipient(s), please reply to
> the sender and destroy all copies of the original message. Any
> unauthorized review, use, disclosure, dissemination, forwarding,
> printing or copying of this email, and/or any action taken in reliance
> on the contents of this e-mail is strictly prohibited and may be
> unlawful.

------------------------------

Message: 4
Date: Tue, 27 Nov 2012 09:44:22 -0800
From: Gajanan Chinchwadkar <gajanan.chinchwad...@marklogic.com>
Subject: Re: [MarkLogic Dev General] cts:element-value-match and cts:element-range-query questions
To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Message-ID: <c9924d15b04672479b089f7d55ffc132226a4cc...@exchg-be.marklogic.com>
Content-Type: text/plain; charset="us-ascii"

Michael's suggestion is right. Fields is a better approach.

ML6 Path indexes will not help: the value in the path index is the same as an element index at the leaf level will have.
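
The difference between what John's range index matched and what he wanted can be illustrated outside MarkLogic. The sketch below is plain Python with the standard-library xml.etree.ElementTree, not MarkLogic code, and the helper names string_value and direct_text are made up for this illustration. It assumes, as John's result suggests, that the range index on 'chapter' holds the element's full string value (all descendant text), while a field that excludes 'b' would only see the text that remains once 'b' is left out.

```python
import xml.etree.ElementTree as ET


def string_value(elem):
    """Full string value: the text of the element and all its descendants."""
    return "".join(elem.itertext())


def direct_text(elem):
    """Text that belongs to the element itself, skipping child elements.

    ElementTree keeps the text that follows a child in that child's
    .tail, so the element's own text is its .text plus every child's .tail.
    """
    return (elem.text or "") + "".join(child.tail or "" for child in elem)


chapter = ET.fromstring("<chapter>This is chapter <b>one</b>.</chapter>")

print(repr(string_value(chapter)))  # value including the text of <b>
print(repr(direct_text(chapter)))   # value with the text of <b> left out
```

A wildcard such as "*one*" matches the first value but not the second, which is the effect John was asking for.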

------------------------------

Message: 5
Date: Tue, 27 Nov 2012 12:47:39 -0500
From: John Zhong <j...@yuxipacific.com>
Subject: Re: [MarkLogic Dev General] cts:element-value-match and cts:element-range-query questions
To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Message-ID: <CA+yakFknYOkzxmq8HFTTfs=e0beqb3fzs2tzbutlxyivd59...@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Thank you both.

On Tue, Nov 27, 2012 at 12:44 PM, Gajanan Chinchwadkar <gajanan.chinchwad...@marklogic.com> wrote:

> Michael's suggestion is right. Fields is a better approach.
>
> ML6 Path indexes will not help: the value in the path index is the
> same as an element index at the leaf level will have.

------------------------------

End of General Digest, Vol 101, Issue 54
****************************************