On Mon, 20 Jun 2016 07:52:31 -0700, <abhinav.mish...@cognizant.com> wrote:

> Hi All,
>
> We ran a test on 15,000 XML documents that contain xi:include elements
> (sample given below). The required large XML (the hierarchy XML) is
> generated in just PT23.040552S, i.e. about 23 seconds. We used the
> node-expand API to generate the XML, whereas our old recursive approach
> takes more than 30 minutes to perform the same operation. Can you please
> share any thoughts? Is there anything else we should consider?

If you look at how xinc:node-expand is implemented  
(Modules/MarkLogic/xinclude/xinclude.xqy), it makes use of two things to  
improve performance:
(1) cts:element-walk, which is like cts:walk except that it looks for
elements with particular matching QNames;
(2) pruning entire branches of the tree where possible.

You could perhaps use similar tricks to build your own recursive expansion  
function that does exactly what you want in a more targeted way.
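
As a rough illustration of those two tricks, here is a minimal sketch in
Python (chosen only so it can be run anywhere; it is not the xinclude.xqy
implementation and uses no MarkLogic API). The names expand, load and docs,
and the /data/root style URIs, are hypothetical. The xpointer is ignored on
the assumption that it always selects the root element, as
xpath(/*:object) does in the sample.

```python
import xml.etree.ElementTree as ET

# Clark-notation name of xi:include, so only this QName is targeted.
XI_INCLUDE = "{http://www.w3.org/2001/XInclude}include"


def expand(elem, load, seen=frozenset()):
    """Return elem with every xi:include replaced by its expanded target.

    load is a caller-supplied (hypothetical) function mapping an href to
    the root element of the referenced document.
    """
    if elem.tag == XI_INCLUDE:
        href = elem.get("href")
        if href in seen:
            # Guard against documents that include each other.
            raise ValueError("circular include: %s" % href)
        return expand(load(href), load, seen | {href})

    # Pruning: a branch with no xi:include below it is reused untouched
    # instead of being walked and copied node by node.
    if elem.find(".//" + XI_INCLUDE) is None:
        return elem

    # Otherwise copy this element shallowly and recurse into its children.
    copy = ET.Element(elem.tag, dict(elem.attrib))
    copy.text, copy.tail = elem.text, elem.tail
    for child in elem:
        copy.append(expand(child, load, seen))
    return copy


# Tiny in-memory stand-in for the database (assumption, for the demo only).
docs = {
    "/data/root": '<object name="package"><properties/><relationships>'
                  '<include xmlns="http://www.w3.org/2001/XInclude" href="/data/child"/>'
                  '</relationships></object>',
    "/data/child": '<object name="myImage"><properties/><relationships>'
                   '<include xmlns="http://www.w3.org/2001/XInclude" href="/data/leaf"/>'
                   '</relationships></object>',
    "/data/leaf": '<object name="thumbnail"><properties/><relationships/></object>',
}


def load(href):
    return ET.fromstring(docs[href])


expanded = expand(load("/data/root"), load)
print(ET.tostring(expanded, encoding="unicode"))
```

The properties branches are returned as-is because they contain no
xi:include, which is the pruning idea; only the relationships branches are
rebuilt.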

//Mary

>
> import module namespace xinc = "http://marklogic.com/xinclude" at
> "/MarkLogic/xinclude/xinclude.xqy";
> xinc:node-expand(fn:doc("/data/d14d44ec-59d5-4ada-b47d-3d62b69633c8") )
>
> Where "/data/d14d44ec-59d5-4ada-b47d-3d62b69633c8" is the root xml URI  
> in the hierarchy.
>
> 1- Root object which contains relationships
> <object name="package" id="d14d44ec-59d5-4ada-b47d-3d62b69633c8">
>                <properties>
>                               <property name="myPackage" type="string">
>                                              <value>somevalue</value>
>                               </property>
>                .....
>                ....
>                </properties>
>
>                <relationships>
>                               <include href="/data/c525e14d-59d5-4ada-b47d-3d62b69633c8" xpointer="xpath(/*:object)" xmlns="http://www.w3.org/2001/XInclude"/>
>                               <include href="/data/12970f40-053d-4f22-8e39-073ca3a17454" xpointer="xpath(/*:object)" xmlns="http://www.w3.org/2001/XInclude"/>
>     ....
>                </relationships>
> </object>
>
> 2- Child object which contains further relationships (it is one of the
> children inside the relationships)
>
> <object name="myImage" id="c525e14d-59d5-4ada-b47d-3d62b69633c8">
>                <properties>
>                               <property name="pixelXDimension" type="int">
>                                              <value>645</value>
>                               </property>
>                .....
>                ....
>                </properties>
>
>                <relationships>
>                               <include href="/data/xyzzqqka-59d5-4ada-b47d-125shydtt2bs" xpointer="xpath(/*:object)" xmlns="http://www.w3.org/2001/XInclude"/>
>     ....
>                </relationships>
> </object>
>
>
> 3- Further Child object which contains other relationships
>
> <object name="thumbnail" id="xyzzqqka-59d5-4ada-b47d-125shydtt2bs">
>                <properties>
>                               <property name="pixelXDimension" type="int">
>                                              <value>645</value>
>                               </property>
>                .....
>                ....
>                </properties>
>
>                <relationships>
>                               <include href="/data/abcgdt13-59d5-125a-b47d-425shydtt2bs" xpointer="xpath(/*:object)" xmlns="http://www.w3.org/2001/XInclude"/>
>     ....
>                </relationships>
> </object>
>
> And so on. The final XML which we want is:
>
> <object name="package" id="d14d44ec-59d5-4ada-b47d-3d62b69633c8">
>   <properties>
>     <property name="myPackage" type="string">
>       <value>somevalue</value>
>     </property>
>     .....
>     ....
>   </properties>
>
>   <relationships>
>     <object name="myImage" id="c525e14d-59d5-4ada-b47d-3d62b69633c8">
>       <properties>
>         <property name="pixelXDimension" type="int">
>           <value>645</value>
>         </property>
>         .....
>         ....
>       </properties>
>
>       <relationships>
>         <object name="thumbnail" id="xyzzqqka-59d5-4ada-b47d-125shydtt2bs">
>           <properties>
>             <property name="pixelXDimension" type="int">
>               <value>645</value>
>             </property>
>             .....
>             ....
>           </properties>
>
>           <relationships>
>             ....
>           </relationships>
>         </object>
>         ....
>       </relationships>
>     </object>
>     ....
>   </relationships>
> </object>
>
> Regards,
> Abhinav
>
> From: Mishra, Abhinav Kumar (Cognizant)
> Sent: Thursday, June 16, 2016 12:55 PM
> To: MarkLogic Developer Discussion
> Cc: Singh, Vikas (Cognizant)
> Subject: RE: [MarkLogic Dev General] performance issue for creating  
> large xml
>
> Hi Geert,
> We are creating an XML document that looks like a hierarchy. Once the
> hierarchy has been assembled from small chunks, we use an XSLT to
> transform it into another format. The small chunks contain metadata for
> various files.
> Currently we have more than 30,000 small chunks, and we have to create a
> large XML document (the hierarchy XML) out of these chunks in memory. The
> generated XML will be more than 30MB in size, and this process takes more
> than 45 minutes to complete. So we are looking for a design change. Vikas
> suggested using xi:include, so we thought of having a discussion here.
>
> Let me try to explain what we are doing.
>
> 1- Root object which contains relationships
> <object name="package" id="d14d44ec-59d5-4ada-b47d-3d62b69633c8">
>                <properties>
>                               <property name="myPackage" type="string">
>                                              <value>somevalue</value>
>                               </property>
>                .....
>                ....
>                </properties>
>
>                <relationships>
>                               <value>c525e14d-59d5-4ada-b47d-3d62b69633c8</value>
>                               <value>12970f40-053d-4f22-8e39-073ca3a17454</value>
>     ....
>                </relationships>
> </object>
>
> 2- Child object which contains further relationships (it is one of the
> children inside the relationships)
>
> <object name="myImage" id="c525e14d-59d5-4ada-b47d-3d62b69633c8">
>                <properties>
>                               <property name="pixelXDimension" type="int">
>                                              <value>645</value>
>                               </property>
>                .....
>                ....
>                </properties>
>
>                <relationships>
>                               <value>xyzzqqka-59d5-4ada-b47d-125shydtt2bs</value>
>     ....
>                </relationships>
> </object>
>
>
> 3- Further Child object which contains other relationships
>
> <object name="thumbnail" id="xyzzqqka-59d5-4ada-b47d-125shydtt2bs">
>                <properties>
>                               <property name="pixelXDimension" type="int">
>                                              <value>645</value>
>                               </property>
>                .....
>                ....
>                </properties>
>
>                <relationships>
>                               <value>abcgdt13-59d5-125a-b47d-425shydtt2bs</value>
>     ....
>                </relationships>
> </object>
>
> And so on. At the end we create a large XML document which looks
> like:
>
> <object name="package" id="d14d44ec-59d5-4ada-b47d-3d62b69633c8">
>   <properties>
>     <property name="myPackage" type="string">
>       <value>somevalue</value>
>     </property>
>     .....
>     ....
>   </properties>
>
>   <contains>
>     <object name="myImage" id="c525e14d-59d5-4ada-b47d-3d62b69633c8">
>       <properties>
>         <property name="pixelXDimension" type="int">
>           <value>645</value>
>         </property>
>         .....
>         ....
>       </properties>
>
>       <contains>
>         <object name="thumbnail" id="xyzzqqka-59d5-4ada-b47d-125shydtt2bs">
>           <properties>
>             <property name="pixelXDimension" type="int">
>               <value>645</value>
>             </property>
>             .....
>             ....
>           </properties>
>
>           <contains>
>             ....
>           </contains>
>         </object>
>         ....
>       </contains>
>     </object>
>     ....
>   </contains>
> </object>
>
> Now we are using XSLT to transform it into another format, which we need
> as a business requirement.
>
>
> Regards
> Abhinav
>
> From: general-boun...@developer.marklogic.com
> [mailto:general-boun...@developer.marklogic.com] On Behalf Of Geert
> Josten
> Sent: Thursday, June 16, 2016 10:29 AM
> To: MarkLogic Developer Discussion
> Subject: Re: [MarkLogic Dev General] performance issue for creating  
> large xml
>
> Hi Vikas,
>
> XInclude processing requires building the large XML in memory too,
> regardless of where it will be going. So whether this will work well
> enough for your case depends on how large `large` is.
>
> Kind regards,
> Geert
>
> From: <general-boun...@developer.marklogic.com> on behalf of
> "vikas.sin...@cognizant.com" <vikas.sin...@cognizant.com>
> Reply-To: MarkLogic Developer Discussion
> <general@developer.marklogic.com>
> Date: Thursday, June 16, 2016 at 4:24 PM
> To: "general@developer.marklogic.com"
> <general@developer.marklogic.com>
> Subject: Re: [MarkLogic Dev General] performance issue for creating  
> large xml
>
> Thanks, Geert, for the quick reply.
>
> In our current process we are also creating a large XML document by
> adding all related fragments, but we are not committing this large XML to
> the database. So we are planning to create XML like the following.
>
>  <object name="Test" xmlns:xi="http://www.w3.org/2001/XInclude">
>    <!--Some metadata properties -->
>    <relationships>
>      <relationship type="reference">
>        <value>49d7116c24d541aea73328b761cdd89f</value>
>        <xi:include href="/49d7116c24d541aea73328b761cdd89f.xml"
>                    xpointer="49d7116c24d541aea73328b761cdd89f" />
>      </relationship>
>    </relationships>
>  </object>
>
> In the XML above we plan to add one more element, <xi:include>, which
> carries the same reference as the value element but with the exact
> XPath. When we want the expanded form, it will then be expanded
> automatically based on the xi:include. Will this approach improve our
> performance? Each xi:include will point to different content with the
> same structure.
>
> Regards,
> Vikas Singh
> From: general-boun...@developer.marklogic.com
> [mailto:general-boun...@developer.marklogic.com] On Behalf Of Geert
> Josten
> Sent: Thursday, June 16, 2016 7:29 PM
> To: MarkLogic Developer Discussion
> Subject: Re: [MarkLogic Dev General] performance issue for creating
> large xml
>
> Hi Vikas,
>
> Keep in mind you will be buffering all related fragments in memory while  
> building this large XML. It might work out, but it won't scale well. To  
> allow keeping memory usage small, and streaming through the results, you  
> are better off returning all xml chunks without wrapping them in a  
> single large document or element node.
>
> Not very elegant, but this would probably work:
>
> "<wrapper>",
> <p>hello world</p>,
> <p>hello world</p>,
> "</wrapper>"
>
> You can replace the p elements with anything that produces results in a
> streaming manner.
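
The streaming idea quoted above can be sketched outside MarkLogic as a
Python generator. This is only an illustration under stated assumptions:
stream_wrapped and fake_chunks are hypothetical names, and the chunks are
assumed to be already-serialized XML strings. The wrapper tags are emitted
as plain strings around the chunks, so no single large tree is ever built.

```python
import sys


def stream_wrapped(chunks, wrapper="wrapper"):
    """Yield an opening tag, each chunk in turn, then the closing tag."""
    yield "<%s>" % wrapper
    for chunk in chunks:
        # Each chunk is handed on as soon as it is produced; nothing is
        # accumulated here.
        yield chunk
    yield "</%s>" % wrapper


def fake_chunks(n):
    """Hypothetical stand-in for a lazily evaluated query result."""
    for i in range(n):
        yield '<p id="%d">hello world</p>' % i


for piece in stream_wrapped(fake_chunks(3)):
    sys.stdout.write(piece + "\n")
```

Memory use stays proportional to one chunk rather than to the whole
result, at the cost of the consumer receiving text instead of a node tree.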
>
> Cheers,
> Geert
>
> From: <general-boun...@developer.marklogic.com> on behalf of
> "vikas.sin...@cognizant.com" <vikas.sin...@cognizant.com>
> Reply-To: MarkLogic Developer Discussion
> <general@developer.marklogic.com>
> Date: Thursday, June 16, 2016 at 3:47 PM
> To: "general@developer.marklogic.com"
> <general@developer.marklogic.com>
> Subject: [MarkLogic Dev General] performance issue for creating large xml
>
> Hi All,
>
> In the current design of our project we create a large XML document by
> combining many small XML chunks into a final result. To achieve this we
> use cts:search, and this search runs recursively.
>
> Example: we have one XML document that contains metadata and all of its
> references. When we create the final result, we fetch all the references
> and the metadata of each reference, and build one large XML document.
> Child references also contain other references, and so on.
>
> This process takes around one hour to create the final result.
>
> Can we change our design and use XInclude in all the parent documents, so
> that when we want the final output it is expanded automatically for all
> children, with no need to search the database?
> Will this improve our performance when generating the final result?
>
> Regards,
> Vikas Singh
>
> This e-mail and any files transmitted with it are for the sole use of  
> the intended recipient(s) and may contain confidential and privileged  
> information. If you are not the intended recipient(s), please reply to  
> the sender and destroy all copies of the original message. Any  
> unauthorized review, use, disclosure, dissemination, forwarding,  
> printing or copying of this email, and/or any action taken in reliance  
> on the contents of this e-mail is strictly prohibited and may be  
> unlawful. Where permitted by applicable law, this e-mail and other  
> e-mail communications sent to and from Cognizant e-mail addresses may be  
> monitored.


_______________________________________________
General mailing list
General@developer.marklogic.com
Manage your subscription at: 
http://developer.marklogic.com/mailman/listinfo/general
