Ivan,

The data that is loaded into Sedna comes from the same source. The only difference is that one is chunked up into individual XML files and the other is a single large XML document containing the same info.

So when I create two variables containing the same content from different sources (one a collection, the other a single XML doc) and then do a union on them, I should get only the unique nodes, as per your example below. But that is not what is happening. It returns all 22 album nodes (11 from each). When I output each variable to its own file and compare the files with Oxygen's XML diff tool, the tool reports them as identical. Am I missing something about how union is supposed to work?

As for the second issue, I was returning just the @id attributes to check that the order was correct. Both variables are sorted the same way, and when I view the XML (from the output files above) they are identical and sorted exactly the same; I can't stress that enough. For the first variable (l_Content1), the IDs are output in the order in which they were sorted and appear in the XML file. For the second variable (l_Content2), the IDs do not come out in sorted order, even though the output XML file has them in sorted order. I find this confusing, since both variables contain exactly the same data.

The XML that I gave must be inserted into Sedna as is for the first variable, but for the second it must be chunked into individual XML files (with album as root) and then inserted into a collection. If you have not done that, it may explain why you cannot reproduce the results I am seeing.

I hope that helps in clarifying what I am seeing.

Marijan (Mario) Madunic

On 7/31/2012 9:43 AM, Ivan Shcheklein wrote:
Hi Marijan,

    I'm having an issue with union that I can't seem to see what's wrong.
    It always returns the entire result of both sets of XML but they are
    basically exactly the same.


See http://www.w3.org/TR/xquery/#combining_seq .

"All these operators [/including union/] eliminate duplicate nodes from their result sequences based on node identity"

This means that the following query returns two nodes:

let $a := <a id='1'/>
let $b := <b id='1'/>
return $a/@id union $b/@id

while:

let $a := <a id='1'/>
return $a/@id union $a/@id

returns one node.
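
The difference between node identity and node value can also be checked directly with the `is` operator and the `deep-equal` function. The following is a minimal sketch in standard XQuery 1.0; the `album` elements are illustrative stand-ins, not the actual data from the database:

```xquery
(: Two separately constructed elements: same content, distinct identity. :)
let $a := <album id="album_505"/>
let $b := <album id="album_505"/>
return (
  $a is $b,            (: false: they are two different nodes :)
  deep-equal($a, $b),  (: true: same name, attributes and content :)
  count($a union $b),  (: 2: union removes duplicates by identity only :)
  count($a union $a)   (: 1 :)
)
```

A node loaded into a standalone document and a node loaded into a collection are in the same situation as `$a` and `$b` here: equal by content, but distinct nodes.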

The union operation in XQuery doesn't compare the values of nodes (IDs in your case). There is the distinct-values function to remove duplicates by value (note that it returns atomic values rather than nodes):

let $l_UnionContent := distinct-values(($l_Content1, $l_Content2))
...
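
If the album elements themselves are needed rather than their atomized values, one option is to deduplicate on the @id value and keep the first node for each ID. This is only a sketch: the two `let` clauses with inline elements stand in for the real `$l_Content1` and `$l_Content2`, and it assumes that @id uniquely identifies an album.

```xquery
(: Stand-ins for the two sources; in the real query these come from the
   document and from the collection. :)
let $l_Content1 := (<album id="album_505"/>, <album id="album_506"/>)
let $l_Content2 := (<album id="album_505"/>, <album id="album_506"/>)
let $l_All := ($l_Content1, $l_Content2)
let $l_UnionContent :=
  (: One iteration per distinct ID value, not per node. :)
  for $l_Id in distinct-values($l_All/@id)
  (: distinct-values does not guarantee an order, so sort again. :)
  order by number(substring-after($l_Id, 'album_'))
  (: Keep the first node carrying this ID. :)
  return ($l_All[@id = $l_Id])[1]
return $l_UnionContent
```

With the sample data this returns two album elements instead of four.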

    Another issue I found was that $l_Content2 even though sorted (when I
    view the result XML both $l_Content1 and $l_Content2 are ordered
    exactly the same) is out of order when I want to view just the IDs.


I've failed to reproduce this issue:

[shcheklein@dhcp-218-16-wifi ~/sedna-3.5.161/bin]$ ./se_term x
Welcome to term, the SEDNA Interactive Terminal. Type \? for help.

x> create collection "c_AlbumsIndividual"&
UPDATE is executed successfully

x> load "./test.xml" "albums" "c_AlbumsIndividual"&
Bulk load succeeded

x> CREATE INDEX "AlbumsByArtistCollection" ON collection("c_AlbumsIndividual")/albums/album BY artists/artist/@artistID AS xs:string&
UPDATE is executed successfully

x> declare variable $p_Index2 as xs:string := "AlbumsByArtistCollection";
declare variable $p_Value2 as xs:string := "artist_709";
let $l_Content2 :=
  for $l_ContentTemp2 in index-scan($p_Index2, $p_Value2, "EQ")
  order by number(substring-after($l_ContentTemp2/@id, 'album_'))
  return $l_ContentTemp2
return $l_Content2/@id&

id="album_505"
id="album_506"
id="album_507"
id="album_508"
id="album_509"
id="album_982"
id="album_2476"
id="album_2591"
id="album_2596"
id="album_2599"
id="album_2874"

x> doc('$version')&
<sedna version="3.5" build="161"/>
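
One possible cause of the ordering difference, not confirmed in this thread: in XQuery a path step such as `$l_Content2/@id` returns its result in document order, not in the order of the input sequence. When all albums live in one document (as in the session above), document order may happen to match the sort order; when each album is a separate document in a collection, the relative order of the documents is implementation-dependent. A sketch that keeps the sorted order by iterating instead of using a path step (the inline elements are illustrative stand-ins for the index-scan result):

```xquery
(: Sort two stand-in albums by the numeric part of @id. :)
let $l_Content2 :=
  for $l_Album in (<album id="album_2476"/>, <album id="album_505"/>)
  order by number(substring-after($l_Album/@id, 'album_'))
  return $l_Album
return
  (: A for clause preserves sequence order; "$l_Content2/@id" would not
     be required to. :)
  for $l_Album in $l_Content2
  return string($l_Album/@id)
```

This yields the strings album_505 and album_2476, in that order.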

Ivan Shcheklein,
Sedna Team
