Hi,

Just to let you know: our programmer has eventually come up with a workaround that we think may be useful to others, so we are sharing our solution:

All characters in the added content with a character code higher than 127 (i.e. outside plain ASCII) have been converted to HTML entities :-).
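For illustration, here is a minimal Perl sketch of that kind of conversion (not our actual module code, just the general idea; the helper name is made up and it assumes the provider returns UTF-8 encoded bytes):

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Encode qw(decode);

    # Convert every character above code point 127 to a numeric HTML entity,
    # so the resulting string is plain ASCII and survives any later
    # encoding/decoding step unchanged.
    sub entity_encode_high_chars {
        my ($bytes) = @_;
        my $text = decode('UTF-8', $bytes);   # assumed UTF-8 input from the provider
        $text =~ s/([^\x00-\x7F])/sprintf('&#%d;', ord($1))/ge;
        return $text;
    }

    # "ř" arrives as the UTF-8 bytes 0xC5 0x99 and comes out as "&#345;":
    print entity_encode_high_chars("\xc5\x99"), "\n";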

Linda

On 7.9.2016 08:37, Linda Jansova wrote:

Hi,

We are currently developing a module for pulling added content (book jackets, tables of contents etc.) from our local provider obalkyknih.cz.

In our programming endeavor we have come across an encoding issue which we have not been able to resolve so far. The problem is that all textual added content gets messed up (diacritics are garbled) when shown on a record webpage. Our programmer has run a number of tests to make sure that the encoding does not go wrong inside the module we are developing (a list of the tests performed is available at https://bugs.launchpad.net/evergreen/+bug/1610678). We have also tested Open Library, and tables of contents with diacritics are messed up there as well.
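(Just to illustrate the symptom, purely as an example and not a diagnosis: diacritics typically get mangled like this when UTF-8 bytes are mistakenly reinterpreted as a single-byte encoding such as Windows-1252 somewhere along the way. Whether that is what happens in our case we do not know; it is just what the garbled output looks like:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Encode qw(decode encode);

    # Illustration only: "ř" encoded as UTF-8 (bytes 0xC5 0x99) and then
    # misread as Windows-1252 turns into the two characters "Å™".
    my $utf8_bytes = encode('UTF-8', "\x{159}");     # "ř"
    my $mangled    = decode('cp1252', $utf8_bytes);  # the wrong decode step
    print encode('UTF-8', $mangled), "\n";           # prints "Å™" instead of "ř"
)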

To investigate the issue further, could anyone provide us with any hints as to:

 * how (and where) the record webpage is actually put together?
 * how the AddedContent.pm methods (which provide the added content) are called?

One more question: is there any developer documentation available that describes the Evergreen architecture in more detail?

BTW, could our developer's hint that "interestingly, at a URL of the form http://evergreen-server/opac/extras/ac/toc/html/r/23225 I could see the TOC in the correct encoding" be of any use? It seems to me that if we figured out the difference between how the data are processed for a record webpage and for the sample URL above, we could actually hit the nail on the head...

Thank you in advance for sharing any clues!

Linda

