[dspace-tech] Bad characters for OAI

2017-10-25 Thread Mariangels
Hello,

We are working with DSpace 5.5 and Mirage theme.

We are having problems with characters in the OAI output: words containing
accented letters are affected. Please see:

http://repositori.uvic.cat/oai/request?verb=ListSets

Where can I look to try to fix this problem?

Thanks in advance.

-- 
You received this message because you are subscribed to the Google Groups 
"DSpace Technical Support" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to dspace-tech+unsubscr...@googlegroups.com.
To post to this group, send email to dspace-tech@googlegroups.com.
Visit this group at https://groups.google.com/group/dspace-tech.
For more options, visit https://groups.google.com/d/optout.


Re: [dspace-tech] Bad Characters making OAI returning mal formed XML

2016-06-20 Thread Tiago Guimarães
The Java version is 1.7.0_85; the character giving me problems is 0xdbc0.



Re: [dspace-tech] Bad Characters making OAI returning mal formed XML

2016-06-09 Thread Stuart A. Yeates
The issue may be that the code blocks in question are not supported by the
version of java in use. Different versions of java support different
versions of unicode which include different sets of code blocks. Please
include the exact version of java and the code point / code block in any
bug reports.

For example, for a long time 'Linear B' was a problem for us, but now it's
not.
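
As a minimal sketch of how the details requested above could be collected, the following Python snippet (Python is used only for brevity; DSpace itself is Java) reports the facts about one code point. The helper name `describe_code_point` is hypothetical. Python's `unicodedata` module exposes the general category and the Unicode database version but not the block name, so the block would still have to be looked up in the Unicode charts.

```python
import unicodedata


def describe_code_point(cp: int) -> dict:
    """Collect the facts worth quoting in a bug report about one code point."""
    ch = chr(cp)
    return {
        "code_point": f"U+{cp:04X}",
        # Two-letter general category, e.g. "Lu" (uppercase letter)
        # or "Cs" (surrogate).
        "category": unicodedata.category(ch),
        # Surrogates and unassigned code points have no name,
        # so fall back to a marker instead of raising ValueError.
        "name": unicodedata.name(ch, "<unnamed>"),
        # Unicode version of this runtime's character database.
        "unicode_version": unicodedata.unidata_version,
    }


if __name__ == "__main__":
    # 0xDBC0 is the code reported in the error message in this thread.
    for key, value in describe_code_point(0xDBC0).items():
        print(f"{key}: {value}")
```

For 0xDBC0 the category comes back as `Cs`, that is, a surrogate code point rather than an assigned character.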

cheers
stuart

--
...let us be heard from red core to black sky

On Thu, Jun 9, 2016 at 11:09 PM, Tiago Guimarães <
tiagommguimarae...@gmail.com> wrote:

> The DB is in UTF-8.
> Note that there is no error when using the JSPUI; the error appears only
> when trying to harvest via OAI.
>
> The log has the line that I posted:
> com.ctc.wstx.exc.WstxParsingException: Illegal character entity:
> expansion character (code 0xdbc0) not a valid XML character
>
> The really weird thing is that, as far as I can tell, this character code
> is not in the range of invalid XML characters in the W3C documentation.
>
> From my understanding, these errors appear when somebody copies and pastes
> an abstract from a PDF and carries over some unusual characters.
>
>
> On Thursday, 9 June 2016 12:02:17 UTC+1, Luiz dos Santos
> wrote:
>>
>> Hi,
>>
>> To me it seems to be a charset problem. Are you sure that the database is
>> UTF-8? Do you see any errors in the log?
>>
>> Best
>> Luiz
>>



[dspace-tech] Bad Characters making OAI returning mal formed XML

2016-06-09 Thread Tiago Guimarães


Hi all,


I'm having problems with bad characters in OAI.


It's the same as this JIRA ticket:  
https://jira.duraspace.org/projects/DS/issues/DS-2806


The problem is appearing here too: basically, OAI returns malformed XML
because of unusual characters.

Example:  https://i.gyazo.com/22b7f355b0e71b830ec08378a9076c34.png

Shouldn't DSpace take care of that? At the least, it could warn the user when
they paste invalid characters while depositing an item.


DSpace should probably have a feature that detects characters that break
the OAI XML. I'm up for creating a PR that does that, but I need guidance.
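
To illustrate what such a detection feature might look like, here is a minimal sketch in Python (not DSpace code; an actual PR would be in Java, and where it would hook into the deposit or OAI code is not settled in this thread). It finds and removes every character that falls outside the XML 1.0 `Char` production. The function names `find_invalid_xml_chars` and `strip_invalid_xml_chars` are hypothetical.

```python
import re

# Matches anything outside the XML 1.0 "Char" production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_INVALID_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def find_invalid_xml_chars(text: str) -> list:
    """Return (index, 'U+XXXX') for each character XML 1.0 does not allow."""
    return [
        (m.start(), f"U+{ord(m.group()):04X}")
        for m in _INVALID_XML_CHARS.finditer(text)
    ]


def strip_invalid_xml_chars(text: str) -> str:
    """Drop every character that an XML 1.0 parser would reject."""
    return _INVALID_XML_CHARS.sub("", text)


if __name__ == "__main__":
    # Hypothetical abstract with a stray code unit pasted in from a PDF.
    abstract = "Results\udbc0 of the study"
    print(find_invalid_xml_chars(abstract))
    print(strip_invalid_xml_chars(abstract))
```

Reporting the offending positions (for a warning at deposit time) and silently stripping them (at export time) are two different design choices; the sketch shows both so they can be compared.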



Also, as I read https://www.w3.org/TR/REC-xml/#NT-Char, the character
0xdbc0 should be valid in XML, but OAI is giving me this:
com.ctc.wstx.exc.WstxParsingException: Illegal character entity: expansion
character (code 0xdbc0) not a valid XML character
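
A closer reading of the `Char` production linked above suggests the parser is right: the production allows [#x20-#xD7FF] and [#xE000-#xFFFD], which leaves out the range D800 to DFFF reserved for UTF-16 surrogates, and 0xdbc0 lies inside that gap. The short Python check below spells the ranges out; the names `XML_CHAR_RANGES`, `is_xml_char`, and `is_surrogate` are illustrative only.

```python
# Inclusive ranges allowed by the XML 1.0 Char production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
XML_CHAR_RANGES = [
    (0x9, 0x9),
    (0xA, 0xA),
    (0xD, 0xD),
    (0x20, 0xD7FF),
    (0xE000, 0xFFFD),
    (0x10000, 0x10FFFF),
]


def is_xml_char(cp: int) -> bool:
    """True if the code point is allowed in an XML 1.0 document."""
    return any(lo <= cp <= hi for lo, hi in XML_CHAR_RANGES)


def is_surrogate(cp: int) -> bool:
    """True if the code point is a UTF-16 surrogate (D800 to DFFF)."""
    return 0xD800 <= cp <= 0xDFFF


if __name__ == "__main__":
    cp = 0xDBC0
    print(f"{cp:#06x}: xml_char={is_xml_char(cp)} surrogate={is_surrogate(cp)}")
    # prints: 0xdbc0: xml_char=False surrogate=True
```

A lone surrogate like this usually means half of a surrogate pair was lost somewhere, which would be consistent with text copied out of a PDF, although the thread does not confirm where the character came from.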
