As many of you may know, one of our problems in working with XML
documents (especially in web applications) has been that none of the
(native) XML databases support Unicode correctly (or completely). I had
tested two or three of the open source ones and hadn't found a usable
one. I had also worked a bit on the sources of Xindice (the Apache
group's solution) and had given up in despair.

But since this plays an essential role for web pages based on XML
documents, I started playing with them again today. This time I focused
on another one, called eXist. It is still very young, but its developers
seem to be much more active than those of Xindice, since many versions
have been released in the one or two months I've known it. eXist also
has problems with Unicode, but as I found
out, no corruption occurs in the data. Instead, a well-formedness check
runs when data is sent from the client to the server, and documents with
non-English tags (I'm not sure exactly which characters, but certainly
many of the Persian ones) are rejected. So up to this point the only
problem is with documents that use Persian tags.

That wasn't good enough in my view, so I worked with it some more, and I
found that it can also work directly with a server process started
inside your own application instead of the real server. (The real server
keeps running on its own and answering requests, while our program uses
some of its code to change the database directly instead of asking the
server to do it. Of course the process has to run on the server machine
and have the needed permissions.) When documents are added this way, the
well-formedness check passes without problems, and a document with
Persian tags is added to the database. So after the long story, I can
give the good news: we can use eXist to store any XML
document.

I also checked out Xindice again (I had forgotten exactly what its
problem was), and I found out that it can't store documents with
non-English tags at all, and that it also corrupts some Persian letters
(the first one I noticed was Farsi yeh, but it wasn't the last), so it
is not usable for Persian information
anyway.

After all of this, there was still the question of performance. As I've
heard, a simple test had been run on Xindice (Mehran told me something
about it), and judging by the results, the program seemed to have little
hope of working in the real world. I'm not sure what they did, but from
what I've read here and there, to get acceptable performance from these
databases on complicated XPath queries, one has to define good indexes
for one's documents. None of the currently existing XML database engines
can automatically find a good indexing for a given schema (or DTD); in
fact, they hardly support the use of schemas and namespaces in XML
documents at all. Fortunately, both Xindice and eXist let the user
define an indexing method for some documents, so anyone who tries to use
them should read about this.

Also on performance: as I read somewhere, the maximum number of children
that any node has is critical, and this number should not be more than
800 or 1000 to get an acceptable result. I will test the performance of
both as soon as possible, and will inform those who are interested.
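
For anyone who wants to check their own documents against the
child-count figure mentioned above before loading them, here is a
minimal sketch in Python using only the standard library. It parses a
document (which also confirms it is well-formed, Persian tags included)
and reports the element with the most direct children. The function
name max_children, the sample document and the threshold of 1000 are my
own assumptions for illustration; none of this is part of eXist or
Xindice, and the threshold is only the figure I read, not something I
have measured.

```python
import xml.etree.ElementTree as ET


def max_children(xml_text):
    """Return (tag, child_count) for the element with the most direct children.

    Raises xml.etree.ElementTree.ParseError if xml_text is not well-formed.
    """
    root = ET.fromstring(xml_text)
    # root.iter() walks the root and every descendant element;
    # len(element) is the number of direct child elements.
    widest = max(root.iter(), key=len)
    return widest.tag, len(widest)


# Hypothetical sample with Persian element names, built in memory:
# one title element followed by 1200 item elements under a single root.
items = "".join("<مورد>{}</مورد>".format(i) for i in range(1200))
sample = "<فهرست><عنوان>آزمایش</عنوان>" + items + "</فهرست>"

LIMIT = 1000  # assumed threshold, taken from the figure quoted in the post

tag, count = max_children(sample)
if count > LIMIT:
    print("{} has {} children, above the assumed limit of {}".format(tag, count, LIMIT))
```

If a document goes over the limit, one option would be to group the
children under intermediate elements before storing it, but whether that
actually helps is something the performance tests would have to show.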