Re: [Dspace-tech] Encoding problem in discovery search results, xmlui

2013-09-17 Thread helix84
On Mon, Sep 16, 2013 at 11:40 PM, Alcides Carlos de Moraes Neto wrote:
> A successful execution of update-discovery-index -b with the proper LANG
> environment variable (pt_BR, UTF-8) seems to have fixed the issue.
That makes sense. Thanks for reporting back.
Regards, ~~helix84

Re: [Dspace-tech] Encoding problem in discovery search results, xmlui

2013-09-16 Thread Alcides Carlos de Moraes Neto
A successful execution of update-discovery-index -b with the proper LANG environment variable (pt_BR, UTF-8) seems to have fixed the issue. Regards, Alcides Carlos de Moraes Neto
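
A quick way to check what a LANG value declares before re-running the indexer is sketched below. The helper name `lang_declares_utf8` is hypothetical, and the idea that the indexing JVM takes its default charset from LANG is an assumption suggested by this fix, not something the thread confirms.

```python
import os


def lang_declares_utf8(lang: str) -> bool:
    """Return True if a LANG-style value such as 'pt_BR.UTF-8' names a UTF-8 codeset.

    Hypothetical helper for illustration; it only parses the string and does
    not check that the locale is actually installed on the system.
    """
    # POSIX locale names look like language[_territory][.codeset][@modifier].
    without_modifier = lang.split("@", 1)[0]
    if "." not in without_modifier:
        # No codeset given (e.g. 'pt_BR' or 'C'): the encoding is left to
        # system defaults, which may not be UTF-8.
        return False
    codeset = without_modifier.split(".", 1)[1]
    return codeset.replace("-", "").lower() == "utf8"


if __name__ == "__main__":
    value = os.environ.get("LANG", "")
    print(f"LANG={value!r} declares UTF-8: {lang_declares_utf8(value)}")
```

If this reports False in the shell (or cron environment) that launches the reindex, that would be consistent with the behaviour described in this message.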

Re: [Dspace-tech] Encoding problem in discovery search results, xmlui

2013-09-11 Thread Alcides Carlos de Moraes Neto
As I suspected, it's the SOLR index that's messed up. Executing this SOLR query: http://localhost:8080/solr/search/select?indent=true&q=andr%C3%A9%20luiz%20lopes%20de%20alcantara&fq=-type:(not%C3%ADcia%20de%20jornal)
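
The percent-escapes in that query can be reproduced with a minimal Python sketch. It only shows that the URL itself carries valid UTF-8 (so the request side is not where the characters get damaged); it says nothing about what is stored in the index. The variable names are illustrative.

```python
from urllib.parse import quote

# The two accented terms from the query above.
query = "andré luiz lopes de alcantara"
excluded_type = "notícia de jornal"

# quote() percent-encodes the UTF-8 bytes of each non-ASCII character
# and encodes spaces as %20.
encoded_query = quote(query)
encoded_filter = quote(excluded_type)

print(encoded_query)   # andr%C3%A9%20luiz%20lopes%20de%20alcantara
print(encoded_filter)  # not%C3%ADcia%20de%20jornal
```

Both results match the escapes in the posted URL, which is consistent with the conclusion that the request is fine and the indexed text is what is wrong.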

Re: [Dspace-tech] Encoding problem in discovery search results, xmlui

2013-09-10 Thread Alcides Carlos de Moraes Neto
I ran update-discovery-index -f, but the results still show encoding issues. http://www2.senado.leg.br/bdsf/discover?filtertype_0=type&filter_relational_operator_0=notequals&filter_0=not%C3%ADcia+de+jornal&submit_apply_filter=Aplicar&query=andr%C3%A9+luiz+lopes+de+alcantara I'm stumped right now,

Re: [Dspace-tech] Encoding problem in discovery search results, xmlui

2013-09-06 Thread Alcides Carlos de Moraes Neto
Just a follow-up. filter-media -f seems to have fixed the issue with the OCR txt, but some search results still show encoding issues. I believe I need to regenerate the Solr index. Regards, Alcides Carlos de Moraes Neto

Re: [Dspace-tech] Encoding problem in discovery search results, xmlui

2013-09-03 Thread helix84
On Tue, Sep 3, 2013 at 1:24 AM, Alcides Carlos de Moraes Neto wrote: > I have checked the .txt media-filter generates, they are all UTF-8. What I see (see attachment) looks like double-encoded UTF-8 (it happens when a charset converter is told that a file is to be encoded from one character set t
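
Double-encoded UTF-8 can be reproduced in a few lines of Python. This is a minimal sketch of the general mechanism, assuming the wrongly assumed source charset is ISO-8859-1; the function names are hypothetical and this is not the code path DSpace or filter-media actually uses.

```python
def double_encode(text: str) -> str:
    """Simulate the fault: the UTF-8 bytes of `text` are misread as
    ISO-8859-1, so each multi-byte character becomes two characters."""
    return text.encode("utf-8").decode("latin-1")


def repair(text: str) -> str:
    """Undo one round of double encoding.

    Returns the input unchanged when the reverse conversion is not
    possible, which is the case for text that was never double-encoded
    but contains accented characters.
    """
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


broken = double_encode("andré")
print(broken)          # andrÃ©
print(repair(broken))  # andré
```

Seeing pairs such as "Ã©" where a single accented letter is expected is the usual visible symptom of this fault.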

Re: [Dspace-tech] Encoding problem in discovery search results, xmlui

2013-09-03 Thread Alcides Carlos de Moraes Neto
Thank you all. Tomcat is set to URIEncoding="UTF-8", so that's not the issue. I suspect that filter-media is generating invalid .txt files, but I haven't found anything yet. Regards, Alcides Carlos de Moraes Neto 2013/9/3 Tiago Rodrigo Marçal Murakami > Hi Alcides, > > We edit the Tomcat file server
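
For context on why URIEncoding is worth ruling out, the sketch below decodes the same percent-escaped term under two charsets. It uses Python's `urllib.parse.unquote` purely as a stand-in for what a servlet container does with the request URI; it is an illustration, not Tomcat's implementation.

```python
from urllib.parse import unquote

# The escaped form of "andré" as it appears in the search URLs in this thread.
raw = "andr%C3%A9"

# Decoding the escaped bytes as UTF-8 recovers the intended text.
as_utf8 = unquote(raw, encoding="utf-8")

# Decoding the same bytes as ISO-8859-1 turns each byte into its own character.
as_latin1 = unquote(raw, encoding="latin-1")

print(as_utf8)    # andré
print(as_latin1)  # andrÃ©
```

With the connector already set to UTF-8, the first case applies, so the damage has to come from somewhere else in the pipeline.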

Re: [Dspace-tech] Encoding problem in discovery search results, xmlui

2013-09-03 Thread Alcides Carlos de Moraes Neto
Hello helix, thank you for your input. Indeed, it is a problem with the txt generated by filter-media. Running filter-media -f resolved the issue for this specific item. I have scheduled a full filter-media -f run over the repository for tonight. Regards, Alcides Carlos de Moraes Neto 2013/9/3 helix84 > On Tue, Sep 3,

Re: [Dspace-tech] Encoding problem in discovery search results, xmlui

2013-09-03 Thread Tiago Rodrigo Marçal Murakami
Hi Alcides, We edit Tomcat's server.xml file to force UTF-8: Regards, Tiago R. M. Murakami Comunicação Científica e Acadêmica Departamento Técnico - Sistema Integrado de Bibliotecas Universidade de São Paulo Brasil Rua da Biblioteca S/N - Complexo Brasiliana Tel: (11) 3091-4195 2013/9/2 Alcide

Re: [Dspace-tech] Encoding problem in discovery search results, xmlui

2013-09-02 Thread Keir Vaughan-Taylor
Is it a problem that the extension is pdf.txt? On Mon, 2013-09-02 at 20:24 -0300, Alcides Carlos de Moraes Neto wrote: > Hello all, > > > We have this problem with our current dspace 3.1 installation. > Discovery search results show some invalid characters due to encoding > issues. > Only th

[Dspace-tech] Encoding problem in discovery search results, xmlui

2013-09-02 Thread Alcides Carlos de Moraes Neto
Hello all, We have this problem with our current DSpace 3.1 installation. Discovery search results show some invalid characters due to encoding issues. Only the full-text search/highlight portion of the results has this problem. Example: http://www2.senado.leg.br/bdsf/discover?filtertype_0=type&fi