I suspect the answer is "no" ... but I'm just picking the brains out there.

For better or worse, I use this archetype.

I have many "summary" documents, say "/logs/1.xml" and "/logs/2.xml", which
belong to the collection "/summaries".

There can be many of them (100k+).

Each summary document lists references to external URLs (in this case Amazon
S3) from which data can be loaded.
If I load the data, I put each group into a collection named by the URL of the
summary.
So say I have 10,000 XML documents referenced by doc("/logs/1.xml"). If I
choose to load them, they end up in the collection "/logs/1.xml". The
summaries themselves are in the collection "/summaries".

The reason for this is to be able to easily bulk-delete blocks of
documents based on their summaries.
I can list the summaries and, by a simple
                exists( collection( $url) )

can tell whether any actual log documents have been loaded.


NOW: I want to be able to delete all records by summary, but only if the
documents have been loaded.
Suppose I had 100k summary URLs. I could do

                (: Delete each summary's collection only if it has documents. :)
                for $url in collection("/summaries")/document-uri(.)
                return
                    if (exists(collection($url))) then
                        xdmp:collection-delete($url)
                    else ()


This works, but suppose I want something more efficient.
Overall, perhaps only 1% of the summary documents are actually loaded.
Furthermore, if LOTS of them were loaded, the above would time out.

So I spawn a thread to delete, say, [1 to 10] of every summary collection ...
but if I have 100k collections, most of the threads do nothing.
So I have to revert to the above and first check whether the collection has
anything in it before spawning a thread.
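
A rough sketch of that check-then-spawn step, for illustration only. The
module path "/tasks/delete-collection.xqy" and its external variable $url are
hypothetical names, and xdmp:estimate is swapped in for exists() here on the
assumption that an index-only count is cheaper than fetching documents:

    (: Main query: spawn a delete task only for summaries whose
       collection is non-empty. :)
    for $url in collection("/summaries")/xdmp:node-uri(.)
    where xdmp:estimate(cts:search(doc(), cts:collection-query($url))) gt 0
    return
        xdmp:spawn("/tasks/delete-collection.xqy", (xs:QName("url"), $url))

    (: /tasks/delete-collection.xqy (hypothetical module) :)
    xquery version "1.0-ml";
    declare variable $url as xs:string external;
    xdmp:collection-delete($url)

This still visits every summary, so it only avoids the empty tasks, not the
loop itself.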

Question: Is there a cts:search option which can do a collection query based
on the results of the search itself?
That is (pseudocode), in one cts:search:

    for $c in collection("x")/document-uri(.)
    return
        if (exists(collection($c))) then $c else ()

Doing this in FLWOR is very slow ...
but it's what I'm resorting to ....
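
For comparison, here is a sketch of a lexicon-based alternative. It is not a
cts:search option but a different technique, and it assumes that both the
collection lexicon and the URI lexicon are enabled on the database (nothing
above says they are). The idea is to list the collections that actually
contain documents and keep those whose name is also a summary URI:

    (: Assumes the collection lexicon and URI lexicon are enabled. :)
    let $summaries := map:map()
    let $_ :=
        (: Record every summary URI as a map key for constant-time lookup. :)
        for $u in cts:uris((), (), cts:collection-query("/summaries"))
        return map:put($summaries, string($u), true())
    (: cts:collections() only returns collections that are in use. :)
    for $c in cts:collections()
    where exists(map:get($summaries, string($c)))
    return $c

The result would be the summary URIs that have loaded documents, which could
then be passed to xdmp:collection-delete or to spawned tasks.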

----------------------------------------
David A. Lee
Senior Principal Software Engineer
Epocrates, Inc.
[email protected]
812-482-5224
