Re: Problem adding unicoded docs to Solr through SolrJ

2009-04-30 Thread Michael Ludwig
ahmed baseet wrote: I tried something stupid but working though. I first converted the whole string to byte array and then used that byte array to create a new utf-8 encoded string like this, // Encode in Unicode UTF-8 byte [] utfEncodeByteArray = textOnly.getBytes(); This yi

Re: Problem adding unicoded docs to Solr through SolrJ

2009-04-30 Thread Gunnar Wagenknecht
ahmed baseet wrote: > I first converted the whole string to > byte array and then used that byte array to create a new utf-8 encoded string > like this, I'm not sure that this is required at all. Java strings have the same representation internally no matter what they were created from. Thus, the
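The preview above is cut off, but the point it starts to make can be illustrated with a minimal sketch using only the JDK. The class name `EncodingDemo`, its methods, and the sample text are hypothetical and not taken from the thread. The sketch assumes the text is already a correctly decoded Java `String`: encoding it to bytes and decoding those bytes with the same charset gives back an equal string, so the round trip adds nothing. A charset only matters where raw bytes are turned into a `String` (or back), and note that `getBytes()` without an argument uses the platform default charset, not necessarily UTF-8.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingDemo {

    // Encode a string to UTF-8 bytes and decode them again with the same charset.
    // For any well-formed string the result equals the input.
    public static String roundTripUtf8(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    // Decode raw bytes with an explicitly named charset.
    // This is the step where choosing the wrong charset corrupts text.
    public static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        String text = "Gr\u00fc\u00dfe"; // sample text with non-ASCII characters

        // The round trip yields an equal string, so it changes nothing.
        System.out.println(text.equals(roundTripUtf8(text))); // prints "true"

        // Decoding UTF-8 bytes with a different charset yields different text.
        byte[] raw = text.getBytes(StandardCharsets.UTF_8);
        String wrong = decode(raw, StandardCharsets.ISO_8859_1);
        System.out.println(text.equals(wrong)); // prints "false"
    }
}
```

If the text shows up garbled in the index, this suggests looking at the place where bytes were first decoded (for example, when the page was fetched), or at how the request is encoded on its way to the server, rather than re-encoding the string in between.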

Re: Problem adding unicoded docs to Solr through SolrJ

2009-04-29 Thread ahmed baseet
Thanks a lot for your quick and detailed response. I got the point. But as I mentioned earlier, I have a string of raw text (in the default encoding) that needs to be encoded in UTF-8, so I tried something stupid but working though. I first converted the whole string to byte array and then used that byte a

Re: Problem adding unicoded docs to Solr through SolrJ

2009-04-29 Thread Michael Ludwig
ahmed baseet wrote: public void postToSolrUsingSolrj(String rawText, String pageId) { doc.addField("features", rawText ); In the above the param rawText is just the HTML stripped of all its tags, js, css etc and pageId is the URL for that page. When I'm using this for Eng

Problem adding unicoded docs to Solr through SolrJ

2009-04-29 Thread ahmed baseet
Hi All, I'm trying to automate the process of posting XML documents to Solr using SolrJ. Essentially I'm extracting the text from a given URL, then creating a solrDoc and posting the same using the following function, public void postToSolrUsingSolrj(String rawText, String pageId) { String url = "