That query doesn't do what I want, because (shame on me) I have
multiple docs with //row elements.
And do those row elements have RXAUI child elements that would match
the value?
But just for testing I ran it, and it performs about the same as the
cts:search case (4.5 sec), so it seems to be using indexes in that case.
Even that seems too slow for me, given that the result set is 8
records of about 100 bytes each.
Does the query I gave you return 8 results or does it return a lot
more because of your spurious row entries?
My real question here is one I'm still trying to work out, and one
that I think many people are asking: can I get MarkLogic to perform
like an RDBMS in the (hopefully rare) cases where the data really is
like RDB data?
You can definitely get MarkLogic to return 8 records in a fraction of
a second, and faster still if the 8 records have their expanded tree
cache records in memory. The fact that you're not seeing that makes
me think something else is going on.
Your original query, for example, was actually crossing fragments
because of the full path you were using, which required fragment join
work that will slow things down. The query I provided had the
possibility of reducing that (and if it had been faster we could have
added a more efficient rule to limit to the right document). Doing
what Kelly suggested and breaking the fragments into their own docs
will definitely eliminate the join. Let's see what happens there.
Also, what does query-trace() say about index utilization?
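If tracing isn't on already, something like this sketch will show what's
happening (element name RXAUI is from your earlier mail; the key value
here is just a placeholder):

```xquery
(: Enable query tracing for this request, then run the search.
   Trace output showing how each constraint resolved against the
   indexes appears in the server ErrorLog. :)
xdmp:query-trace(fn:true()),
cts:search(//row,
           cts:element-value-query(xs:QName("RXAUI"), "12345"))
```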
Note that philosophically, tiny fragments are just fine if the
fragments are the unit of retrieval, as they are in your case. The
advice not to make your fragments too small assumes that you might
want to sometimes retrieve a full document made up of fragments and in
that case you don't want to have to build your doc out of lots of tiny
fragment retrievals. You won't be doing that here, so tiny fragments
are fine.
That is: lots (millions) of small, identical "rows" of data where I'd
like to 'simply' look up a row by an exact key match. Not word or
phrase or wildcard searching of big docs in the haystack, but a real
RDBMS-style single-key-lookup index.
Should be able to run extremely fast. MarkMail does it all the time
fetching one mail out of millions, with each mail being fairly small.
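For that shape of data, the lookup itself can be a one-liner. A minimal
sketch, assuming each row ends up in its own document with an RXAUI
child (the key value is a placeholder):

```xquery
(: Exact-key lookup resolved from the element-value index.
   The "unfiltered" option skips the per-fragment filtering pass,
   which is safe when the index alone identifies the right docs. :)
cts:search(fn:doc(),
           cts:element-value-query(xs:QName("RXAUI"), "12345"),
           "unfiltered")
```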
What I'm going to experiment with next is sticking this particular
file in an RDBMS and using the SQL connector code ... Yuck. I was
really hoping not to do that.
You shouldn't have to.
Another idea, which I think is pretty ugly, but might help, is to
artificially create structure where none exists. For example, group
the records by the first 2 digits of the key value into a document and
reduce the fragmentation by 100x. But even getting this restructuring
done is painful, because the doc is too big to load into memory, so I
need to use a DB just to get at it.
Which probably means I load it into an RDBMS to restructure the XML or
maybe just leave it there.
You shouldn't have to do that either.
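If you do decide you want one small doc per row, the restructuring can
happen inside MarkLogic once the big file is loaded — no RDBMS needed.
A sketch (the /big.xml URI and directory layout are assumptions; for
millions of rows you'd run this in batches rather than one transaction):

```xquery
(: Split a loaded document into one small doc per row, using each
   row's key value as part of the new document URI. :)
for $row in fn:doc("/big.xml")//row
return xdmp:document-insert(
  fn:concat("/rows/", fn:string($row/RXAUI), ".xml"),
  $row)
```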
-jh-
_______________________________________________
General mailing list
[email protected]
http://xqzone.com/mailman/listinfo/general