Hello,

  This is a common problem in XQuery. There is hardly any way to
optimize it without introducing a special "group by" operator into
FLWOR. We are currently working on it.
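
  For illustration only, this is roughly how such a query could look with
the "group by" clause defined in XQuery 3.0. It is a sketch under
assumptions, not something Sedna 3.4 accepts: the inline $tmp data and the
variable name $name are made up for the example.

```xquery
xquery version "3.0";

(: Hypothetical stand-in for the s:index elements returned by the scan. :)
let $tmp := (<index>a</index>, <index>b</index>, <index>a</index>)
return
<result>
{
  for $b in $tmp
  (: Grouping key: the string value of each element. :)
  let $name := string($b)
  group by $name
  (: After "group by", $b holds all items that share the same $name. :)
  return <section><name>{$name}</name><total>{count($b)}</total></section>
}
</result>
```

With such a clause a processor can build the groups in one pass over $tmp,
whereas count($tmp[.=$i]) rescans the whole sequence for every distinct
value.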

  Which version of Sedna are you using? The development branch contains
an optimized "distinct-values" function, so it may be faster.

Ilya Taranov,
Sedna Team.

On Mon, Mar 14, 2011 at 3:04 PM, giocondo sticca
<[email protected]> wrote:
> Hi,
>
> I have a performance problem with this query:
>
> declare namespace s="http://www.schemata.it/lml/1.0";
> declare namespace l="http://www.schemata.it/lml/1.0/linker";
>
> let $tmp := for $b in ftindex-scan("ft_body","art. 15","nosort")/../s:meta/s:classification/s:index
>
> return $b
>
> return
>
> <result>
> {for $i in distinct-values($tmp)
> return
> <section><name>{$i}</name><total>{count($tmp[.=$i])}</total></section>
> }
> </result>
>
> Total time: 56500ms
>
> I have also tried this variant, but the total time is still too high:
>
> declare namespace s="http://www.schemata.it/lml/1.0";
> declare namespace l="http://www.schemata.it/lml/1.0/linker";
>
> let $tmp := for $b in ftindex-scan("ft_body","art. 15","nosort")/../s:meta/s:classification/s:index
>
> return $b
>
> return
>
> <result>
> {for $sec in distinct-values($tmp)
> return <section><name>{$sec}</name><total>{count($tmp intersect
> index-scan("test",$sec,"EQ"))}</total></section>
> }
> </result>
>
> Total time: 20453ms
>
> Any suggestions?
>
> Thanks.
>
> P.S.
>
> This is the profile of the second query:
>
> <profile xmlns="http://www.modis.ispras.ru/sedna">
>   <total-time>23.970</total-time>
> </profile><prolog xmlns="http://www.modis.ispras.ru/sedna">
>   <namespace prefix="l" uri="http://www.schemata.it/lml/1.0/linker"/>
>   <namespace prefix="s" uri="http://www.schemata.it/lml/1.0"/>
> </prolog><query xmlns="http://www.modis.ispras.ru/sedna">
>   <operation xmlns="" name="PPQueryRoot" time="23.970" calls="1">
>     <operation name="PPLet" position="4:5" time="23.968" calls="2">
>       <produces>
>         <variable descriptor="1" name="tmp"/>
>       </produces>
>       <operation name="PPReturn" position="4:17" time="2.272" calls="22341">
>         <produces>
>           <variable descriptor="0" name="b"/>
>         </produces>
>         <operation name="PPDDO" position="4:93" time="2.255" calls="22341">
>           <operation name="PPAxisChild" step="child::element(s:index)"
> position="4:93" time="2.109" calls="22341">
>             <operation name="PPAxisChild"
> step="child::element(s:classification)" position="4:76" time="1.986"
> calls="22341">
>               <operation name="PPAxisChild" step="child::element(s:meta)"
> position="4:69" time="1.820" calls="22341">
>                 <operation name="PPAxisParent" step="parent::node()"
> position="4:66" time="1.634" calls="22341">
>                   <operation name="PPSeqChecker" mode="node" position="4:66"
> time="1.460" calls="22341">
>                     <operation name="PPFtIndexScan" position="4:23"
> time="1.456" calls="22341">
>                       <operation name="PPConst" type="xs:string"
> value="ft_body" position="4:36" time="0.000" calls="2"/>
>                       <operation name="PPConst" type="xs:string"
> value="art. 15" position="4:46" time="0.000" calls="2"/>
>                       <operation name="PPConst" type="xs:string"
> value="nosort" position="4:56" time="0.000" calls="2"/>
>                     </operation>
>                   </operation>
>                 </operation>
>               </operation>
>             </operation>
>           </operation>
>         </operation>
>         <operation name="PPVariable" descriptor="0" variable-name="b"
> position="6:8" time="0.004" calls="44680"/>
>       </operation>
>       <operation name="PPElementConstructor" element-name="result"
> deep-copy="true" namespace-inside="false" position="10:1" time="23.968"
> calls="2">
>         <operation name="PPSequence" position="10:1" time="23.962"
> calls="467">
>           <operation name="PPSpaceSequence" doc-order="false"
> position="11:1" time="23.962" calls="467">
>             <operation name="PPReturn" position="11:6" time="23.962"
> calls="467">
>               <produces>
>                 <variable descriptor="2" name="sec"/>
>               </produces>
>               <operation name="PPFnDistinctValues" position="11:14"
> time="3.207" calls="467">
>                 <operation name="PPVariable" descriptor="1"
> variable-name="tmp" position="11:30" time="2.282" calls="22341"/>
>               </operation>
>               <operation name="PPElementConstructor" element-name="section"
> deep-copy="false" namespace-inside="false" position="12:8" time="20.755"
> calls="932">
>                 <operation name="PPSequence" position="12:8" time="20.752"
> calls="1398">
>                   <operation name="PPElementConstructor" element-name="name"
> deep-copy="false" namespace-inside="false" position="12:17" time="0.006"
> calls="932">
>                     <operation name="PPSequence" position="12:17"
> time="0.001" calls="932">
>                       <operation name="PPSpaceSequence" doc-order="false"
> position="12:23" time="0.001" calls="932">
>                         <operation name="PPVariable" descriptor="2"
> variable-name="sec" position="12:24" time="0.001" calls="932"/>
>                       </operation>
>                     </operation>
>                   </operation>
>                   <operation name="PPElementConstructor"
> element-name="total" deep-copy="false" namespace-inside="false"
> position="12:36" time="20.745" calls="932">
>                     <operation name="PPSequence" position="12:36"
> time="20.737" calls="932">
>                       <operation name="PPSpaceSequence" doc-order="false"
> position="12:43" time="20.736" calls="932">
>                         <operation name="PPFnCount" position="12:44"
> time="20.736" calls="932">
>                           <operation name="PPIntersect" doc-order="false"
> position="12:50" time="20.731" calls="22806">
>                             <operation name="PPSXptr" position="12:50"
> time="17.913" calls="10342429">
>                               <operation name="PPVariable" descriptor="1"
> variable-name="tmp" position="12:50" time="2.116" calls="10410906"/>
>                             </operation>
>                             <operation name="PPSXptr" position="12:65"
> time="1.511" calls="487246">
>                               <operation name="PPIndexScan"
> index-scan-condition="EQ" position="12:65" time="0.222" calls="487266">
>                                 <operation name="PPConst" type="xs:string"
> value="test" position="12:76" time="0.000" calls="932"/>
>                                 <operation name="PPVariable" descriptor="2"
> variable-name="sec" position="12:83" time="0.000" calls="932"/>
>                                 <operation name="PPConst" type="xs:integer"
> value="0" position="12:65" time="0.000" calls="0"/>
>                               </operation>
>                             </operation>
>                           </operation>
>                         </operation>
>                       </operation>
>                     </operation>
>                   </operation>
>                 </operation>
>               </operation>
>             </operation>
>           </operation>
>         </operation>
>       </operation>
>     </operation>
>   </operation>
> </query>
>
> All the tests were performed on a DELL PowerEdge 2009 (Quad-Core XEON
> E5410, 4GB RAM, 3 x 750GB RAID SAS) with Debian Lenny (kernel version
> 2.6.26-2-amd64) and Sedna 3.4.228.
>
> ------------------------------------------------------------------------------
> Colocation vs. Managed Hosting
> A question and answer guide to determining the best fit
> for your organization - today and in the future.
> http://p.sf.net/sfu/internap-sfd2d
> _______________________________________________
> Sedna-discussion mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/sedna-discussion
>
>



-- 
Thanks in advance,
Ilya Taranov.