I'm outputting XML from my search engine for use in other people's websites,
and I'm having a small problem.

Some of the sites I'm indexing are made in Word [I've no control over this]
and output as HTML.

They're also in odd character sets like windows-125{0,1,2}.

When I output the XML, it contains things like <92> (byte 0x92), which is
Word's equivalent of a normal '. Is there any way I can translate these,
either in the indexer or in the PHP? [I'm using the PHP front end and the
crc-multi DB schema].

Basically, I'd like to see nothing more than US-ASCII or friends; it's much
easier to use, and won't break Perl scripts on Unix boxes.
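
To make that concrete, here is a rough sketch of the kind of translation I
have in mind. It is written in Python only for illustration; the name
to_ascii and the mapping table are my own invention, not part of the indexer
or the PHP front end, and it assumes the input bytes are windows-1252. Bytes
it does not know about are replaced with '?', which may well be too blunt
for accented text.

```python
# Hypothetical helper: fold Windows-1252 "smart" punctuation down to US-ASCII.
# Assumption: the input is raw windows-1252 bytes as produced by Word's HTML export.
CP1252_TO_ASCII = {
    0x85: "...",  # horizontal ellipsis
    0x91: "'",    # left single quote
    0x92: "'",    # right single quote / apostrophe
    0x93: '"',    # left double quote
    0x94: '"',    # right double quote
    0x96: "-",    # en dash
    0x97: "-",    # em dash
}


def to_ascii(raw: bytes) -> str:
    """Return an ASCII-only string; unmapped high bytes become '?'."""
    out = []
    for b in raw:
        if b < 0x80:
            # Plain 7-bit ASCII passes through unchanged.
            out.append(chr(b))
        else:
            # Known Word punctuation is translated, everything else is replaced.
            out.append(CP1252_TO_ASCII.get(b, "?"))
    return "".join(out)


if __name__ == "__main__":
    print(to_ascii(b"Gary\x92s search engine"))
```

Something along these lines could presumably run either before the text goes
into the database or just before the XML is written out.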

Anybody?

Ta,
Gary (-;

PS I never got any response to my RFC on my code for putting stuff INTO the
database from XML. Does anyone have anything to add to it?