The issue may be that the code blocks in question are not supported by the
version of Java in use. Different versions of Java support different
versions of Unicode, which include different sets of code blocks. Please
include the exact version of Java and the code point / code block in any
bug reports.

For example, for a long time 'Linear B' was a problem for us, but now it's
not.
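
When filing such a report, something like the following sketch could
collect the details. It is only an illustration: the class
CodePointReport and its methods are made-up names, not part of DSpace or
Woodstox, and the XML check simply transcribes the Char production from
the XML 1.0 recommendation. Note that the value from the log in this
thread, 0xDBC0, lies in the surrogate range 0xD800-0xDFFF, which that
production excludes, so on its own it is not a valid XML character.

```java
// Hypothetical helper for bug reports; not part of DSpace.
class CodePointReport {

    // XML 1.0 Char production:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isValidXml10Char(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // UTF-16 surrogate code units are never characters on their own.
    static boolean isSurrogate(int cp) {
        return cp >= 0xD800 && cp <= 0xDFFF;
    }

    // One line with the Java version, the code point, the Unicode block
    // this JVM assigns to it (null if the JVM knows no block for it),
    // and the two checks above.
    static String describe(int cp) {
        Character.UnicodeBlock block = Character.UnicodeBlock.of(cp);
        return String.format("java=%s U+%04X block=%s surrogate=%b xml10=%b",
                System.getProperty("java.version"), cp, block,
                isSurrogate(cp), isValidXml10Char(cp));
    }

    public static void main(String[] args) {
        System.out.println(describe(0xDBC0));   // code from the log above
        System.out.println(describe(0x10000));  // first code point of Linear B
    }
}
```

Pasting the printed lines into a bug report gives the Java version, the
code point and the code block in one place.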

cheers
stuart

--
...let us be heard from red core to black sky

On Thu, Jun 9, 2016 at 11:09 PM, Tiago Guimarães <
tiagommguimarae...@gmail.com> wrote:

> DB is in UTF.
> Note that there is no error when using the JSPUI; the error only
> appears when harvesting over OAI.
>
> The log has the line that I posted:
> com.ctc.wstx.exc.WstxParsingException: Illegal character entity:
> expansion character (code 0xdbc0) not a valid XML character
>
> The really weird thing is that this char code is not in the range of
> invalid XML chars in the W3C documentation.
>
> From my understanding, these errors appear when somebody copies and
> pastes the abstract from a PDF and carries over some weird chars.
>
>
> On Thursday, June 9, 2016 at 12:02:17 UTC+1, Luiz dos Santos
> wrote:
>>
>> Hi,
>>
>>    To me it seems to be a charset problem. Are you sure that the
>> database is UTF-8? Do you see any error in the log?
>>
>> Best
>> Luiz
>>
>> On Thursday, June 9, 2016, Tiago Guimarães <tiagommgu...@gmail.com>
>> wrote:
>>
>>> Hi all,
>>>
>>>
>>> I'm having problems with bad characters in OAI.
>>>
>>>
>>> It's the same as this JIRA ticket:
>>> https://jira.duraspace.org/projects/DS/issues/DS-2806
>>>
>>>
>>> This problem is appearing here: basically, OAI returns malformed XML
>>> because of weird chars.
>>>
>>> Example:  https://i.gyazo.com/22b7f355b0e71b830ec08378a9076c34.png
>>>
>>> Shouldn't DSpace take care of that? At least it could warn the user
>>> when they paste invalid chars while depositing an item.
>>>
>>>
>>> DSpace should probably have a feature that detects characters that break
>>> the OAI XML. I'm up for creating a PR that does that, but I need guidance.
>>>
>>>
>>>
>>> Also, according to https://www.w3.org/TR/REC-xml/#NT-Char the char
>>> 0xdbc0 should be valid in XML, but OAI is giving me this:
>>> com.ctc.wstx.exc.WstxParsingException: Illegal character entity: expansion
>>> character (code 0xdbc0) not a valid XML character
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "DSpace Technical Support" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to dspace-tech+unsubscr...@googlegroups.com.
>>> To post to this group, send email to dspace-tech@googlegroups.com.
>>> Visit this group at https://groups.google.com/group/dspace-tech.
>>> For more options, visit https://groups.google.com/d/optout.
>>>
