Hello, Marijan!
> So when I create two variables containing the same content but from
> different sources (one a collection and the other a single XML doc) and
> then do a union on them I should only get the unique nodes as per your
> example below. But that is not what is happening. It is returning all 22
> album nodes (11 album nodes from each). When I output each to their own
> file and do a comparison using Oxygen's XML diff tool, according to that
> tool they are identical. Am I missing something here on how union is
> supposed to work?

As Ivan already said, duplicate elimination in union is based on node
identity, not on node value. This means that two separate nodes with the
same content are treated as different nodes. So if you have loaded the same
XML file into two different documents in Sedna, the matching nodes in them
remain distinct in a union, because they are represented by different
physical objects in the database.
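
To illustrate the difference, here is a minimal sketch. It uses made-up
inline album elements standing in for the two sources, not the actual data,
and $l_All is just a name introduced for this example. The union keeps all
four nodes, while selecting by @id value keeps one album per distinct id:

```xquery
(: Two sequences with equal content but distinct node identities.
   The inline elements are illustrative stand-ins for the real sources. :)
let $l_Content1 := (<album id="album_505"/>, <album id="album_506"/>)
let $l_Content2 := (<album id="album_505"/>, <album id="album_506"/>)
let $l_All := ($l_Content1, $l_Content2)
return (
  (: union removes duplicates by node identity only, so this is 4 :)
  count($l_Content1 union $l_Content2),
  (: one album per distinct @id value, so this is 2 :)
  count(
    for $l_Id in distinct-values($l_All/@id)
    return ($l_All[@id = $l_Id])[1]
  )
)
```

Which copy survives for each id is decided by the [1] predicate, so this
only makes sense when albums with the same id really are interchangeable.
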
> Also the second issue is I was just returning @ids to see that the order
> was correct. They are both sorted the same and when I view the XML (from
> the output files above) they are identical and both sorted exactly the
> same, I can't stress that enough. The first variable (l_Content1) the IDs
> are output in the order they were sorted and appear in the XML file but the
> second variable's (l_Content2) the IDs do not appear in the sorted order
> but the output XML file has them in the sorted order. I find this confusing
> as to why this is happening. Both variables contain the exact same data.

I tried to reproduce this issue on both linux-x64 and win-x64 (with the
albums split into separate files), but it worked correctly on my machine.
I simply executed all the queries from your first letter. Did you perform
any other manipulations on these documents before you noticed the strange
behavior? If so, can you try to reproduce the issue without them in a new
database? Also, can you check that you executed exactly the same queries
as those presented in the letter?
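
One possibility worth ruling out (an assumption, not a confirmed diagnosis
of this case): in XQuery a path step such as $l_Content2/@id returns nodes
in document order, not in the order of the sorted sequence, and document
order across separate documents in a collection is implementation-dependent.
A small sketch with a made-up inline document shows the effect:

```xquery
(: Illustrative inline data: document order is 982, then 505. :)
let $l_Doc := <albums><album id="album_982"/><album id="album_505"/></albums>
let $l_Sorted :=
  for $a in $l_Doc/album
  order by number(substring-after($a/@id, 'album_'))
  return $a
return (
  (: the path step re-sorts into document order:
     <path>album_982 album_505</path> :)
  <path>{ for $x in $l_Sorted/@id return string($x) }</path>,
  (: iterating over the sequence keeps the sorted order:
     <flwor>album_505 album_982</flwor> :)
  <flwor>{ for $a in $l_Sorted return string($a/@id) }</flwor>
)
```

If that is what happens here, iterating with
"for $a in $l_Content2 return $a/@id" should keep the sorted order.
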
2012/7/31 Marijan (Mario) Madunic <[email protected]>
> Ivan,
>
> The data that is loaded into Sedna comes from the same source. The only
> difference is that one is chunked up into individual XML files and the
> other is a single large XML document containing the same info.
>
> So when I create two variables containing the same content but from
> different sources (one a collection and the other a single XML doc) and
> then do a union on them I should only get the unique nodes as per your
> example below. But that is not what is happening. It is returning all 22
> album nodes (11 album nodes from each). When I output each to their own
> file and do a comparison using Oxygen's XML diff tool, according to that
> tool they are identical. Am I missing something here on how union is
> supposed to work?
>
> Also the second issue is I was just returning @ids to see that the order
> was correct. They are both sorted the same and when I view the XML (from
> the output files above) they are identical and both sorted exactly the
> same, I can't stress that enough. The first variable (l_Content1) the IDs
> are output in the order they were sorted and appear in the XML file but the
> second variable's (l_Content2) the IDs do not appear in the sorted order
> but the output XML file has them in the sorted order. I find this confusing
> as to why this is happening. Both variables contain the exact same data.
>
> The XML that I gave must be inserted into Sedna as is for the first
> variable but must be chunked into individual XML files (with album as root)
> and then inserted into a collection. I don't know whether you have done
> that or not and hence the results I'm seeing are not being duplicated.
>
> I hope that helps in clarifying what I am seeing.
>
> Marijan (Mario) Madunic
>
>
> On 7/31/2012 9:43 AM, Ivan Shcheklein wrote:
>
> Hi Marijan,
>
>> I'm having an issue with union that I can't seem to see what's wrong. It
>> always returns the entire result of both sets of XML but they are
>> basically exactly the same.
>>
>
> See http://www.w3.org/TR/xquery/#combining_seq .
>
> "All these operators [*including union*] eliminate duplicate nodes from
> their result sequences based on node identity"
>
> It means, that the following query returns two nodes:
>
> let $a := <a id='1'/>
> let $b := <b id='1'/>
> return $a/@id union $b/@id
>
> while:
>
> let $a := <a id='1'/>
> return $a/@id union $a/@id
>
> returns one node.
>
> Union operation in XQuery doesn't compare values of nodes (IDs in your
> case). There is distinct-values function to remove duplicates by value:
>
> let $l_UnionContent := distinct-values(($l_Content1, $l_Content2))
> ...
>
>> Another issue I found was that $l_Content2 even though sorted (when I
>> view the result XML both $l_Content1 and $l_Content2 are ordered exactly
>> the same) is out of order when I want to view just the IDs.
>>
>
> I've failed to reproduce this issue:
>
> [shcheklein@dhcp-218-16-wifi ~/sedna-3.5.161/bin]$ ./se_term x
> Welcome to term, the SEDNA Interactive Terminal. Type \? for help.
>
> x> create collection "c_AlbumsIndividual"&
> UPDATE is executed successfully
>
> x> load "./test.xml" "albums" "c_AlbumsIndividual"&
>
> Bulk load succeeded
>
> x> CREATE INDEX "AlbumsByArtistCollection" ON
> collection("c_AlbumsIndividual")/albums/album BY artists/artist/@artistID
> AS xs:string&
> UPDATE is executed successfully
>
> x> declare variable $p_Index2 as xs:string := "AlbumsByArtistCollection";
> declare variable $p_Value2 as xs:string := "artist_709";
>
> let $l_Content2 :=
>   for $l_ContentTemp2 in index-scan($p_Index2, $p_Value2, "EQ")
>   order by number(substring-after($l_ContentTemp2/@id, 'album_'))
>   return $l_ContentTemp2
> return $l_Content2/@id&
>
> id="album_505"
> id="album_506"
> id="album_507"
> id="album_508"
> id="album_509"
> id="album_982"
> id="album_2476"
> id="album_2591"
> id="album_2596"
> id="album_2599"
> id="album_2874"
>
> x> doc('$version')&
> <sedna version="3.5" build="161"/>
>
> Ivan Shcheklein,
> Sedna Team
>
>
>
>
> ------------------------------------------------------------------------------
> Live Security Virtual Conference
> Exclusive live event will cover all the ways today's security and
> threat landscape has changed and how IT managers can respond. Discussions
> will include endpoint security, mobile security and the latest in malware
> threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
> _______________________________________________
> Sedna-discussion mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/sedna-discussion
>
>
--
Best regards,
Konstantin Abakumov.