Hi,

I have that problem too. But I noticed that it only happens when I send my
data via solrj. If I send it via the solr-ruby gem
(http://wiki.apache.org/solr/solr-ruby), everything is fine.

Here is my jruby script:
-------------------------------
require 'rubygems'

require 'solr'
require 'rexml/document'

include Java

def send_via_solrj(text, url)
  doc = org.apache.solr.common.SolrInputDocument.new
  doc.addField('id', '1')
  doc.addField('text', text)

  server = org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.new(url)
  server.add(doc)
  server.commit
end

def send_via_gem(text, url)
  solr_doc = Solr::Document.new
  solr_doc['id'] = '2'
  solr_doc['text'] = text

  options = {
    :autocommit => :on
  }

  conn = Solr::Connection.new(url, options)
  conn.add(solr_doc)
end

host = 'localhost'
port = '8888'
path = '/solr/core0'
url = "http://#{host}:#{port}#{path}"

text = "eaiou with circumflexes: êâîôû"

send_via_solrj(text, url)
send_via_gem(text, url)

puts "done!"
-------------------------------

If I watch the HTTP messages with tcpmon, I see that the data sent via solrj
is encoded in cp1252, while the data sent via the gem is UTF-8.
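
To make the difference concrete, here is a minimal sketch of what the two
encodings look like on the wire for the accented characters in my test string.
It is only an illustration, not what solrj or the gem do internally, and it
assumes a Ruby with String#encode (1.9 or later); the hex helper is my own
name, not part of any library.

```ruby
# encoding: utf-8
# Illustration only: byte sequences of the same text in UTF-8 and in cp1252.
text = "êâîôû"

utf8_bytes   = text.encode("UTF-8").bytes
cp1252_bytes = text.encode("Windows-1252").bytes

# Hypothetical helper: format a byte array as space-separated hex pairs.
def hex(bytes)
  bytes.map { |b| format("%02x", b) }.join(" ")
end

puts "utf-8:  #{hex(utf8_bytes)}"    # two bytes per character
puts "cp1252: #{hex(cp1252_bytes)}"  # one byte per character
```

A server that expects UTF-8 cannot decode the single cp1252 bytes (ea, e2, ...)
as valid UTF-8 sequences, which would be consistent with the garbled values in
the index.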

Does anyone have an idea how we can configure solrj to send UTF-8?

Thanks in advance.


Walid ABDELKABIR wrote:
> 
> when executing this code I got in my index the field "includes" with this
> value : "????? ???? ????????????? ?????" :
> ---------------------------
> String content ="eaiou with circumflexes: êâîôû";
> SolrInputDocument doc = new SolrInputDocument();
> doc.addField( "id", "123", 1.0f );
> doc.addField( "includes", content, 1.0f );
> server.add( doc );
> ---------------------------
> 
> but this code works fine :
> 
> -------------------------------
> String addContent =   "<add><doc boost=\"1.0\">"
>                               +"<field name=\"id\">123</field>"
>                               +"<field name=\"includes\">eaiou with circumflexes:âîôû</field>"
>                               +"</doc></add>";
> DirectXmlRequest up = new DirectXmlRequest( "/update", addContent );
> server.request( up );
> -------------------------------
> 
> thanks for help
> 
> 

-- 
View this message in context: 
http://www.nabble.com/solrj-%3A-probleme-with-utf-8-content-tp22577377p22620317.html
Sent from the Solr - User mailing list archive at Nabble.com.