Dear Christian,
>> "Checks if the specified resource
>> exists and if it is an XML document". That being the case, I would
>> think it would return false if my path argument actually contained two
>> document-nodes.
> Do your documents have the same name?
In the case I was testing, yes. I used the db:add function twice (for
example):
db:add('testing',document{<a/>},'parent/doc.xml')
db:add('testing',document{<b/>},'parent/doc.xml')
If I then call the following, the result is true:
db:is-xml('testing','parent/doc.xml')
However, this isn't an XML document. It is a collection of XML documents.
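To make the ambiguity visible, a query along the following lines can be run
against the same 'testing' database. This is only a sketch: it assumes that
db:open returns every document-node stored under the given path, which is
what the documented return type (document-node()*) suggests.

```xquery
(: How many document-nodes are stored under the one path? :)
count(db:open('testing', 'parent/doc.xml')),
(: Root element name of each document found under that path :)
db:open('testing', 'parent/doc.xml')/*/name()
```

If the count is greater than one, the path is behaving as a collection, even
though db:is-xml reports it as an XML document.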
It isn't that I disagree with treating paths as collections as that is very
useful as well. My issue is that it should be possible to give a unique name
to a document and to be explicit as to when that is your intention.
In terms of what would have to be changed, I think the following would
suffice:
First, document-nodes already have an identifier assigned to them (the
db:node-id can be used to see this), so it would seem to me that it becomes
a matter of allowing a map of document names to these identifiers. Although
the recommendation would seem to allow for multiple names per document-node,
I don't see much value in that and it would probably require new functions
to allow such a thing. To keep it simple, my recommendation will only focus
on what could be done in the existing functions, rather than adding new
ones. Since my issue centers around the ambiguity of a path referring to a
collection or a single document, I will focus on the functions that deal
with these (and only the XML related ones).
- db:open($db as item(), $path as xs:string) as document-node()*: I think
this can remain unchanged. If the path is a document, the result would
technically be a single document-node, but that is already true.
- db:add($db as item(), $input as item(), $path as xs:string) as
empty-sequence(): This should remain unchanged as it shouldn't be required
to assign a name to a document in order to add a document-node to an
existing collection. However, one must be able to distinguish whether the
path identifies a collection or a document within a collection. Further, as
laid out in the XQuery recommendation, if the path is a document there
should be a relation of that name to the existing collection names. I would
think the cleanest approach would be to add an overload:
db:add($db as item(), $input as item(), $path as xs:string, $doc_name as
xs:boolean) as empty-sequence(): The $doc_name parameter is true if the path
is intended to identify a document name, and false if it identifies a
collection. The default value is false, so that nothing changes in terms of
how the function currently works. As with the existing path, the delimiter
character ('/') is significant in that it represents hierarchy. Therefore,
the following:
db:add('db', document{<a/>}, 'level_1/level_2/my_doc', true())
Adds the document-node to the database named 'db' with the document name
('level_1/level_2/my_doc'). This document-node is also available under the
collections 'level_1' and 'level_1/level_2' (as it is currently implemented).
If the $doc_name parameter is true, and the supplied path already exists, an
error should be raised.
- db:rename($db as item(), $path as xs:string, $newpath as xs:string) as
empty-sequence(): This should raise an error if the rename results in a
document name conflict. For example, if I have the documents 'A/doc_1.xml' and
'B/doc_1.xml' and invoke db:rename('db','A','B'), the change would not be
allowed since this would result in two document-nodes with the name
'B/doc_1.xml'. Renaming a document name to an existing collection name could
simply remove the document name from the document-node (i.e. unmap it).
- db:replace($db as item(), $path as xs:string, $input as item()) as
empty-sequence(): This should work as it already does, since it raises an
error if the path refers to more than one document-node. Basically, you can
replace a collection assuming it contains only one document-node. If the
path is a document, no further check would be required since you know it
contains a single document-node.
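Until something like the $doc_name flag exists, the "raise an error if the
path already exists" rule can be approximated in user code. The following is
only a sketch: local:add-unique and the error name local:path-exists are
hypothetical names of my own, not part of the Database Module, and the check
relies on db:open, so it only sees XML documents (not binary resources) under
the path.

```xquery
(: Hypothetical helper: add $input under $path only if no XML document
   is already stored at or below that path in database $db. :)
declare updating function local:add-unique(
  $db    as xs:string,
  $input as item(),
  $path  as xs:string
) {
  if (exists(db:open($db, $path)))
  then error(xs:QName('local:path-exists'),
             concat('Path is already in use: ', $path))
  else db:add($db, $input, $path)
};

local:add-unique('db', document{<a/>}, 'level_1/level_2/my_doc')
```

Note that updates are only applied at the end of a query, so two calls with
the same path inside one query would both pass the check. A built-in
implementation would not have this limitation, which is one more reason I
would prefer to see it in the function itself.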
I would think that should cover it, but I am sure this is not exhaustive.
Basically, we simply need a way to provide a unique name within the scope of
a database to a single document-node. I would think that this would also
allow you to implement the fn:doc and fn:document-uri functions better.
I hope this helps to clarify.
Jack
P.S. My assumption is also that the changes above would be applied to the
other APIs (e.g. Java) as well.
-----Original Message-----
From: Christian Grün [mailto:[email protected]]
Sent: Wednesday, February 08, 2012 6:25 AM
To: J Gager
Cc: [email protected]
Subject: Re: [basex-talk] Collections and Documents
Dear Jack,
> My confusion mainly arises from the documentation for the Database
> Module in the XQuery portal
> (http://docs.basex.org/wiki/Database_Module). Throughout this page,
> the examples provided for the functions seem to indicate that it is
> possible to provide a single name which maps to a single document-node.
We have added one introductory paragraph "Commonalities" on that page that
is supposed to explain the $db variable, but it may well be that it's not
really noticed, or may be misleading.
> When I find more time, I can
> provide more detailed recommendations for the above wiki page.
That would be great; you'll probably be more efficient in rephrasing the
relevant snippets than we are (maybe it's just one or two sentences that need
to be replaced).
> "Checks if the specified resource
> exists and if it is an XML document". That being the case, I would
> think it would return false if my path argument actually contained two
> document-nodes.
Do your documents have the same name?
> In fact, it would seem from some quick tests that I am even able to
> store binary resource and XML under the same path (which I would
> expect with folders but not with documents).
True, that's currently possible (but may be prohibited in future versions).
> I hope this is useful. I still think that having a true document-node
> to document mapping would be useful, as it would allow one to use the
> handy database module functions such as add, delete, rename, and
> replace confidently.
What would have to be changed in your opinion to end up with a true
document-node to document mapping?
Thanks,
Christian