About:   xindice-1.1b4 - command-line tool
Question: How can encodings be preserved across export/import?

--------------------------------
Scenario explaining the problem
--------------------------------
Step 1.
Adding document:
xindice ad -c xmldb:xindice://localhost:8080/db/foo/ -n 36.xml -f 36.xml
on document 36.xml:
<?xml version="1.0" encoding="iso-8859-1"?> ... etc ...
This works OK


Step 2.
Retrieving document:
xindice rd -c xmldb:xindice://localhost:8080/db/foo/ -n 36.xml -f 36a.xml
results in document 36a.xml:
<?xml version="1.0"?> ... etc ...
The payload of the extracted document (36a.xml) is essentially identical to the payload of the original document (36.xml); only the encoding declaration has been dropped.
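
The difference between the two files can be checked mechanically. The following is a minimal sketch, not part of Xindice: the helper name declared_encoding and the two sample byte strings are made up for illustration, and the regular expression only handles the simple declaration form shown above.

```python
import re

# Matches the encoding pseudo-attribute of an XML declaration, if present.
# The pattern cannot run past the closing '>' of the declaration.
_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def declared_encoding(head: bytes):
    """Return the encoding named in the XML declaration, or None if absent."""
    match = _DECL.match(head)
    return match.group(1).decode("ascii").lower() if match else None


# Hypothetical stand-ins for the first bytes of 36.xml and 36a.xml.
original = b'<?xml version="1.0" encoding="iso-8859-1"?><doc/>'
exported = b'<?xml version="1.0"?><doc/>'

print(declared_encoding(original))  # iso-8859-1
print(declared_encoding(exported))  # None
```

For real files, the same helper could be applied to the first few hundred bytes read in binary mode.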


Step 3.
Adding document:
xindice ad -c xmldb:xindice://localhost:8080/db/foo/ -n 36a.xml -f 36a.xml
on document 36a.xml above.


This results in an error:
    ERROR : Invalid byte 2 of 3-byte UTF-8 sequence.

--------------------------------
What is happening here?
--------------------------------

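The source document of step 1 contains an ISO-8859-1 character (which is why it declares its encoding explicitly). The document retrieved in step 2 has lost that declaration, so in step 3 the parser presumably falls back to the XML default of UTF-8 and rejects the ISO-8859-1 byte.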
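
The mismatch can be reproduced outside Xindice. This is a small illustration only; the payload is invented, and it assumes the exported file still contains ISO-8859-1 bytes. In ISO-8859-1 the letter "å" is the single byte 0xE5, which in UTF-8 announces a 3-byte sequence; the byte that follows is plain ASCII, not a continuation byte, which matches the wording of the error above.

```python
# A made-up payload containing U+00E5 (a-ring), stored as ISO-8859-1.
payload = "<name>Sm\u00e5land</name>".encode("iso-8859-1")

# With the declaration, a parser knows to decode the bytes as ISO-8859-1.
text = payload.decode("iso-8859-1")
print(text)

# Without a declaration, an XML parser assumes UTF-8. The byte 0xE5 looks
# like the lead byte of a 3-byte sequence, but the next byte ("l") is not
# a continuation byte, so decoding fails.
try:
    payload.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError as err:
    utf8_ok = False
    print("UTF-8 decoding failed:", err.reason)
```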

This is a practical problem.
In the off-the-shelf installation of Xindice 1.0 this was a non-problem -- that tool seems to be more tolerant.
How can one make it work in Xindice 1.1?
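
One possible stopgap, untested against Xindice 1.1, would be to re-encode the exported file as UTF-8 before adding it again, so that the missing declaration no longer matters. The sketch below assumes the exported bytes really are ISO-8859-1; the function name to_utf8 and the sample document are hypothetical. It does not answer whether Xindice itself can be told to keep the declaration.

```python
def to_utf8(data: bytes, source_encoding: str = "iso-8859-1") -> bytes:
    """Re-encode document bytes as UTF-8, the default for undeclared XML.

    source_encoding is an assumption about how the exported file was written.
    """
    return data.decode(source_encoding).encode("utf-8")


# Hypothetical stand-in for the contents of 36a.xml (no encoding declared).
exported = b'<?xml version="1.0"?><name>Sm\xe5land</name>'

fixed = to_utf8(exported)
fixed.decode("utf-8")  # now decodes cleanly as UTF-8
```

The converted bytes would be written back to 36a.xml (in binary mode) before repeating step 3.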


--------------------------------
Run-time environment
--------------------------------
The Xindice server runs in an off-the-shelf Tomcat/Cocoon framework.
- Tomcat 4.1.12
- Cocoon 2.1.6



-- 
------------------------------------------------------------------
Olle Olsson   [EMAIL PROTECTED]   Tel: +46 8 633 15 19   Fax: +46 8 751 72 30
[Svenska W3C-kontoret: [EMAIL PROTECTED]]
SICS [Swedish Institute of Computer Science]
Box 1263
SE - 164 29 Kista
Sweden
------------------------------------------------------------------



