Check the analyzers for the field types containing Hindi text to be sure that they are not using a character mapping or "folding" filter that might mangle the Hindi characters. Post the field type, say for the "title" field.

Also, try manually adding a single document that contains Hindi data (using curl or post.jar) and see whether it is indexed and returned correctly.
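
As a rough illustration of that single-document test, here is a minimal Python sketch. It assumes Solr is reachable at http://localhost:8983/solr/update and that the schema has "id" and "title" fields; both are assumptions, so adjust them to your setup. The point is only that the payload is sent as UTF-8 and that the charset is declared in the Content-Type header.

```python
import urllib.request
from xml.sax.saxutils import escape, quoteattr

# Assumed values: adjust to the actual Solr host, port, and core.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update?commit=true"


def build_add_xml(doc):
    """Build a Solr <add> message for one document, encoded as UTF-8 bytes."""
    fields = "".join(
        "<field name={}>{}</field>".format(quoteattr(name), escape(value))
        for name, value in doc.items()
    )
    return "<add><doc>{}</doc></add>".format(fields).encode("utf-8")


def post_document(doc, url=SOLR_UPDATE_URL):
    """POST one document to Solr, declaring the charset explicitly."""
    request = urllib.request.Request(
        url,
        data=build_add_xml(doc),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


# Field names "id" and "title" are assumed; the title text is from the example record.
sample = {"id": "26913", "title": "सौर ऊर्जा Saur oorja"}
payload = build_add_xml(sample)
# post_document(sample)  # uncomment to send it to a running Solr instance
```

If a document posted this way can be found with a Hindi keyword and displays correctly, the analyzers are probably fine and the problem is more likely in how the data is read from MySQL.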

-- Jack Krupansky

-----Original Message-----
From: KP Sanjailal
Sent: Thursday, May 17, 2012 5:55 AM
To: solr-user@lucene.apache.org
Subject: Indexing & Searching MySQL table with Hindi and English data

Hi,

I tried to set up indexing of MySQL tables in Apache Solr 3.6.

Everything works fine, but text in Hindi script (only some 10% of the
total records) is not getting indexed properly.

A search with a keyword in Hindi retrieves an empty result set. Also, a
retrieved Hindi record displays junk characters.

The database tables contain bibliographical details of books such as
title, author, publisher, ISBN, place of publication, series, etc. About
10% of the records contain Hindi text in the title, author, and
publisher fields.

Example:

*Search Results from MySQL using PHP*

  1. *Title:* सौर ऊर्जा Saur oorja
     <http://192.168.0.132/shared/biblio_view.php?bibid=26913&tab=opac>
     *Author(s):* विनोद कुमार मिश्र MISHRA (VK)
     *Material:* Books

*Search Results from Apache Solr (searched using keyword in English)*

  1. *Title:* सौर ऊर्जा Saur oorja
     <http://192.168.0.132/test/biblio_view.php?bibid=26913&tab=opac>
     *Author(s):* विनोद कुमार मिश्र MISHRA (VK)
     *Material:* Books


How do I go about solving this language problem?

Thanks in advance.

K. P. Sanjailal