I think what Karl is expressing is frustration that basic XPath
expressions appear not to use indexes.
I too am 'in the dark' about that ... and would love some advice.
Why, for example, does
        cts:search( doc("doc.xml")/FOO/BAR , "text")

use the indexes ("instant results"),

while
        doc("doc.xml")/FOO[BAR eq 'text']

seems to iterate through the list without using indexes (painfully slow
results ...)?


I'm sure this is a misunderstanding.  But I've hit it myself when
trying to port RDBMS-like structures over to ML ... random-access
lookups of 'record-like things' using key values and XPath just don't
seem to use indexes.  There Must Be A Way!
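
One possible direction, as a minimal sketch only: state the key lookup as a
cts:query instead of an XPath predicate, so the server has an explicit
query it can try to resolve from its indexes. FOO, BAR and doc.xml are the
placeholder names from the examples above. The sketch assumes BAR holds
the key value and that each record sits in its own document or fragment,
since index resolution works per fragment.

```xquery
xquery version "1.0-ml";

(: Sketch only. cts:element-value-query matches BAR elements whose whole
   text content is "text", which is close in intent to FOO[BAR eq 'text'].
   Default matching can be looser than "eq" (an all-lowercase term is
   matched case-insensitively by default), so this is not an exact
   equivalent of the XPath predicate. :)
cts:search(
  doc("doc.xml")/FOO,
  cts:element-value-query(xs:QName("BAR"), "text")
)
```

To check what the server actually does with either form, wrapping the
expression in xdmp:plan( ... ) or turning on xdmp:query-trace(true())
should show which parts are resolved from indexes; the exact output
depends on the server version and index settings.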

-David




-----Original Message-----
From: general-boun...@developer.marklogic.com
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Geert
Josten
Sent: Monday, November 23, 2009 2:29 PM
To: General Mark Logic Developer Discussion
Subject: RE: [MarkLogic Dev General] XML structure/schema design for MLS

Hi Karl,

Personally, I would choose the shortest way to make things work. ;-) And
MarkLogic Server doesn't require you to choose between the three. You
can intermingle if you like as well.

If your current data follows a certain standard, then it likely does so
for a reason. Perhaps it is necessary to be able
to exchange data with other parties or applications. This is a very
strong reason to preserve the content in its original format, whether
MarkLogic Server can handle that well or not. But thanks to namespaces
and document properties in MarkLogic Server, it is quite easy to add
information that is optimized for searching or user presentation, to
make less optimally structured content work better in MarkLogic Server.
You can always store calculated data in document properties, add
namespaced attributes to specific nodes to optimize certain things and
filter them out when exchanging data with other systems, add meta
information in a separate xml structure that is inserted in the existing
data structure, or wrap the contents in a new root element which allows
additional information at root level. Document properties keep the
added data from mingling with the original content, and with the last
option (a wrapper root element) separating the two again is very easy.
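
As an illustration of the document-properties option, a minimal sketch:
the URI, the "idx" namespace and the search-key element below are
hypothetical names chosen for this example, not part of any standard or
of Karl's data. The document itself is left untouched; the calculated
value lives in its properties.

```xquery
xquery version "1.0-ml";

(: Hypothetical namespace for search-only metadata. :)
declare namespace idx = "http://example.com/search-hints";

(: Store a calculated value as a property of an existing document.
   The URI is an assumed example; the document content is not changed. :)
xdmp:document-set-property(
  "/records/patient-001.xml",
  <idx:search-key>text</idx:search-key>
)
```

In a later query, xdmp:document-get-properties("/records/patient-001.xml",
xs:QName("idx:search-key")) with the same namespace declaration should
return the stored element. The update is not visible inside the
transaction that makes it.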

But apart from that, it might be just as likely that MarkLogic Server
could perform really well with the existing structure, if indices and
search expressions are chosen carefully. Unfortunately, you leave
us in the dark why you think solution #2 should dominate entirely over
the others. Perhaps you could elaborate on that first? And while at it,
give us some hints on the big picture. What are you trying to achieve in
general with MarkLogic Server?

Kind regards,
Geert



Drs. G.P.H. Josten
Consultant


http://www.daidalos.nl/
Daidalos BV
Source of Innovation
Hoekeindsehof 1-4
2665 JZ Bleiswijk
Tel.: +31 (0) 10 850 1200
Fax: +31 (0) 10 850 1199
KvK 27164984
The information sent in or with this email message originates from
Daidalos BV and is intended solely for the addressee. If you have
received this message unintentionally, we ask that you delete it.
No rights can be derived from this message.


> From: general-boun...@developer.marklogic.com
> [mailto:general-boun...@developer.marklogic.com] On Behalf Of
> Karl Erisman
> Sent: Monday, November 23, 2009 3:14
> To: general@developer.marklogic.com
> Subject: [MarkLogic Dev General] XML structure/schema design for MLS
>
> I have a general question about choosing an XML structure
> (schema design if using schemas) for use with MarkLogic.  My
> particular situation involves storing clinical data.  There
> are multiple opposing forces that could motivate choosing one
> schema structure over another.  The main ones are:
>
> (1) standards compliance: it would be nice if the internal
> storage format is compatible with existing standard schemas
> for clinical data in XML (to take advantage of existing tools
> that work against the standard schemas and to allow exchange
> with external systems without requiring transformation)
> (2) ease of handling in MLS, specifically *indexing* and *searching*
> (3) "clean" XML (structure that makes sense semantically to a
> human viewer)
>
> The more I experiment with cts:query and search:search, the
> more I tend to think that #2 should dominate entirely, to the
> point of ignoring the others.  As it turns out, some standard
> data formats are really awkward to work with in MLS.
>
> So, do others just organize their content specifically for
> MLS and run transformations when needed?  What does Mark
> Logic recommend?  What have your experiences been?
>
> Thank you,
> Karl
> _______________________________________________
> General mailing list
> General@developer.marklogic.com
> http://xqzone.com/mailman/listinfo/general
>
