Hi Mike,

Please find my comments below.

Which version of MarkLogic are you using? On what OS?
Gnana: MarkLogic 5.0.3, Linux

50 threads could be too many. How many CPU cores does the host have? How much 
RAM?
Gnana:
MarkLogic supports 256 threads, so I thought 50 threads would be reasonable. 
I am only performing the validation and quarantining the files that are not 
valid.
CPU cores: 2
RAM: 2 GB
             total       used       free     shared    buffers     cached
Mem:          2008       1953         55          0        118        234
-/+ buffers/cache:       1601        407
Swap:         3327         37       3290
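
A rough back-of-the-envelope check of the figures in this thread, assuming (hypothetically) that validation is CPU-bound and that the 50 trigger threads share the 2 cores evenly. Under that assumption each core serves about 25 tasks at once, so 0.4 s of work stretches to roughly the 10 s observed. The variable names are illustrative only, and the model ignores update-context overhead and the memory pressure visible in the free output above.

```xquery
xquery version "1.0";

(: Figures taken from this thread; the even-sharing model is an assumption. :)
let $seconds-per-document := 0.4   (: standalone validation time :)
let $threads := 50                 (: concurrent trigger threads :)
let $cores := 2                    (: CPU cores on the host :)
(: With CPU-bound work shared evenly, each core serves $threads div $cores tasks. :)
let $tasks-per-core := $threads div $cores
return $seconds-per-document * $tasks-per-core
```

The product comes to 10 seconds per document, which suggests that a thread count closer to the number of cores may be worth trying first.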

Standalone validation is a read-only query. But with triggers it changes to a 
database update context. It shouldn't be surprising that updates are slower 
than read-only operations.

How are you using triggers, exactly? Did you modify an existing CPF pipeline, 
or are you using the raw triggers API?
Gnana: I am currently using raw triggers, because I am generating the files 
dynamically. Once the files are copied into a particular folder, the triggers 
fire and the files are handled by an XQuery module.

Where is the XSLT stored? What about the schema?
Gnana: I am currently using Schematron. The Schematron files are stored in the 
content database and the XSLTs are stored in the Modules database.

Thanks and Regards,

Gnanaprakash Bodireddy


-----Original Message-----
From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of 
general-requ...@developer.marklogic.com
Sent: Wednesday, November 28, 2012 1:30 AM
To: general@developer.marklogic.com
Subject: General Digest, Vol 101, Issue 54



Today's Topics:

   1. Re: Need help with SQL (Mary Holstege)
   2. Re: cts:element-value-match and cts:element-range-query
      questions (Michael Blakeley)
   3. Re: Performance Issue with Schematron Validation
      (Michael Blakeley)
   4. Re: cts:element-value-match and cts:element-range-query
      questions (Gajanan Chinchwadkar)
   5. Re: cts:element-value-match and cts:element-range-query
      questions (John Zhong)


----------------------------------------------------------------------

Message: 1
Date: Tue, 27 Nov 2012 08:30:50 -0800
From: "Mary Holstege" <mary.holst...@marklogic.com>
Subject: Re: [MarkLogic Dev General] Need help with SQL
To: "MarkLogic Developer Discussion" <general@developer.marklogic.com>
Message-ID: <op.wofxhow9fi7...@mary-apple.marklogic.com>
Content-Type: text/plain; charset=utf-8; format=flowed; delsp=yes


On Mon, 26 Nov 2012 18:01:18 -0800, Ankur Patwa 
<ankur.pa...@icainformatics.com> wrote:

> All,
> I found the answer in SQL Modelling guide in Fragment Roots in
> Database section.
> One caveat with it is that it breaks other cts queries and free text
> searches.
> Our fragments are defined at document level right now and would like
> to keep it that way.
> I've tried creating a Fragment Root at document's root element but it
> looks like the fragments do not jive well with one another.
> I tried creating a Fragment Parent at the root level, but to no avail.
> What should I try next?
> Any help/insight is much appreciated.
>
> Thanks in advance.
>
> Best,
> Ankur
>


When we are computing rows, we compute the cross product of all values in the 
fragment for each of the columns. Since your fragment contains multiple 
instances of the given columns, you will get multiple rows. What is more, all 
of this selection is unfiltered (as is execution of MATCH clauses), so you need 
to ensure that you have sufficient indexes to give the results you want under 
these circumstances.
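
A small sketch of the cross-product behaviour described above, written in plain XQuery rather than SQL. The people document and the name and city columns are hypothetical stand-ins, not Ankur's actual view; the point is only that one fragment holding two instances of each column yields four rows, not two.

```xquery
xquery version "1.0";

(: Hypothetical fragment: one document holding two <person> elements. :)
let $doc :=
  <people>
    <person><name>Ann</name><city>Oslo</city></person>
    <person><name>Bob</name><city>Rome</city></person>
  </people>
(: Every name value in the fragment is paired with every city value. :)
for $name in $doc//name, $city in $doc//city
return <row>{ concat($name, ", ", $city) }</row>
```

This returns four row elements (Ann with Oslo and Rome, Bob with Oslo and Rome), because nothing ties a name to the city inside the same person element.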

Self joining isn't going to help at all, because all those rows from the same 
document will have the same URI anyway.

Right now there is no way to require that the columns for a row all fall under 
the *same* element instance. Nothing in your view specification is saying that 
that should be the case. We do understand that people want richer data modeling 
options for SQL, and are looking at ways we can improve.

The right way to think about modeling for SQL access is that a document
(fragment) is a row.

//Mary

mary.holst...@marklogic.com
Principal Engineer
MarkLogic Corporation





------------------------------

Message: 2
Date: Tue, 27 Nov 2012 09:22:02 -0800
From: Michael Blakeley <m...@blakeley.com>
Subject: Re: [MarkLogic Dev General] cts:element-value-match and
        cts:element-range-query questions
To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Message-ID: <9dfd5d1d-e00c-4132-89fc-b08e4afac...@blakeley.com>
Content-Type: text/plain; charset=us-ascii

Typically one wouldn't create a range index for complex elements like 
'chapter'. Range indexes are designed for simple element values and 
element-attribute values.
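
A minimal sketch of why a complex element behaves this way, using only standard XQuery functions on the chapter example from this thread. It assumes the value seen by the index is the element's string value, which is consistent with the result John reports; the variable names are illustrative.

```xquery
xquery version "1.0";

let $chapter := <chapter>This is chapter <b>one</b>.</chapter>
(: String value: all descendant text concatenated, including the <b> child. :)
let $string-value := string($chapter)
(: Direct text nodes only: the text inside <b> is left out. :)
let $direct-text := string-join($chapter/text(), "")
return (
  $string-value,                  (: "This is chapter one." :)
  contains($string-value, "one"), (: true :)
  $direct-text,                   (: "This is chapter ." :)
  contains($direct-text, "one")   (: false :)
)
```

The wildcard pattern "*one*" can only match the first form, so excluding the child element from the indexed value is what removes the unwanted match.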

Have you considered creating a field and using 
http://docs.marklogic.com/cts:field-value-match or 
http://docs.marklogic.com/cts:field-word-match instead? You should be able to 
create a 'chapter' field that includes the 'chapter' element but excludes 'b'.

Or you *might* be able to do something with ML6 and a path index: see 
http://docs.marklogic.com/guide/admin/range_index#id_54948 for details. But the 
path '//chapter/text()' seems to be invalid so I'm not sure what the right 
approach would be.

-- Mike

On 26 Nov 2012, at 21:11 , John Zhong <j...@yuxipacific.com> wrote:

> Hi,
>
> I am using ML 5.0-2 and have configured an element range index (string 
> type) on a complex element, for example chapter, which has both a text node 
> and a child element, like:
>
> <chapter>This is chapter <b>one</b>.</chapter>
>
> Then, I used cts:element-value-match to search the value:
>
> cts:element-value-match(
>   xs:QName("chapter"),
>   ("*one*"),
>   ("concurrent")
> )
>
> It returned:
>
> This is chapter one.
>
> Is this expected behavior? If so, how can I constrain the match to the
> direct text node only? (In this case, I don't want this
> chapter element returned.)
>
> Also, if I use cts:element-range-query to search for all chapter elements 
> that have the value "This is chapter one.", that chapter element is returned 
> too. Is this also expected behavior? Again, if I don't want this 
> chapter element returned, how do I do that?
>
> Thanks,
> John
> _______________________________________________
> General mailing list
> General@developer.marklogic.com
> http://developer.marklogic.com/mailman/listinfo/general



------------------------------

Message: 3
Date: Tue, 27 Nov 2012 09:22:41 -0800
From: Michael Blakeley <m...@blakeley.com>
Subject: Re: [MarkLogic Dev General] Performance Issue with Schematron
        Validation
To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Message-ID: <7b452bb4-02e7-4647-8e89-fb51d78b5...@blakeley.com>
Content-Type: text/plain; charset=windows-1252

Which version of MarkLogic are you using? On what OS?

50 threads could be too many. How many CPU cores does the host have? How much 
RAM?

Standalone validation is a read-only query. But with triggers it changes to a 
database update context. It shouldn't be surprising that updates are slower 
than read-only operations.

How are you using triggers, exactly? Did you modify an existing CPF pipeline, 
or are you using the raw triggers API?

Where is the XSLT stored? What about the schema?

-- Mike

On 26 Nov 2012, at 23:48 , <gnanaprakash.bodire...@cognizant.com> wrote:

> Hi
>
> I am currently validating XML documents using Schematron, but I am facing a 
> performance issue.
>
> I am generating a large number of documents on the fly and using 
> triggers (50 threads) to validate them.
>
> Each document takes around 0.4 s when validated individually, but with 
> triggers this shoots up to 10 seconds per document.
>
> While profiling the validation, the line below, which is part of the
> schematron.xqy file, takes around 0.3 seconds:
>
> fn:unordered(xdmp:xslt-invoke($include, $sch))
>
> Can anyone help me improve the performance of validating XML with 
> Schematron?
>
> Thanks and Regards,
>
> Gnanaprakash Bodireddy
>
> This e-mail and any files transmitted with it are for the sole use of
> the intended recipient(s) and may contain confidential and privileged
> information. If you are not the intended recipient(s), please reply to
> the sender and destroy all copies of the original message. Any
> unauthorized review, use, disclosure, dissemination, forwarding,
> printing or copying of this email, and/or any action taken in reliance
> on the contents of this e-mail is strictly prohibited and may be
> unlawful.
> _______________________________________________
> General mailing list
> General@developer.marklogic.com
> http://developer.marklogic.com/mailman/listinfo/general



------------------------------

Message: 4
Date: Tue, 27 Nov 2012 09:44:22 -0800
From: Gajanan Chinchwadkar <gajanan.chinchwad...@marklogic.com>
Subject: Re: [MarkLogic Dev General] cts:element-value-match and
        cts:element-range-query questions
To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Message-ID:
        <c9924d15b04672479b089f7d55ffc132226a4cc...@exchg-be.marklogic.com>
Content-Type: text/plain; charset="us-ascii"

Michael's suggestion is right. Fields are a better approach.

ML6 path indexes will not help: the value in the path index is the same as the 
value an element index at the leaf level would have.



------------------------------

Message: 5
Date: Tue, 27 Nov 2012 12:47:39 -0500
From: John Zhong <j...@yuxipacific.com>
Subject: Re: [MarkLogic Dev General] cts:element-value-match and
        cts:element-range-query questions
To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Message-ID:
        <CA+yakFknYOkzxmq8HFTTfs=e0beqb3fzs2tzbutlxyivd59...@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Thank you both.


------------------------------

_______________________________________________
General mailing list
General@developer.marklogic.com
http://developer.marklogic.com/mailman/listinfo/general


End of General Digest, Vol 101, Issue 54
****************************************
