Hi Stave,

Thanks for your continued investigation.

This has improved the search a little, but I am now facing another issue:
I am getting records that do not contain one of the words in my query.

Please note that you indexed only 9 records, whereas I shared more than 76
sample records with you (please refer to the earlier attachment
Arabic_Characters2.xlsx, Examples sheet) to index so that you can reproduce
the issue.

For example, I searched with (bizNameAr: شرطة ازكي) and got:

{
  "responseHeader": {
    "status": 0,
    "QTime": 3,
    "params": {
      "indent": "true",
      "q": "bizNameAr: شرطة ازكي",
      "_": "1488089550104",
      "wt": "json"
    }
  },
  "response": {
    "numFound": 4,
    "start": 0,
    "docs": [
      {
        "id": "82",
        "bizNameAr": "شرطة عمان السلطانية - قيادة شرطة محافظة الداخلية - - مركز شرطة إزكي",
        "_version_": 1560298301338681300
      },
      {
        "id": "63",
        "bizNameAr": "شركة ظفار للتأمين ش.م.ع.ع - فرع ازكي",
        "_version_": 1560298301325049900
      },
      {
        "id": "56",
        "bizNameAr": "شرطة عمان السلطانية - قيادة شرطة محافظة شمال الشرقية -  - مركز شرطة إبراء",
        "_version_": 1560298301319807000
      },
      {
        "id": "79",
        "bizNameAr": "شرطة عمان السلطانية - قيادة شرطة محافظة شمال الشرقية - - مركز شرطة إبراء",
        "_version_": 1560298301335535600
      }
    ]
  }
}



The expected result is only this record:

        "id": "82",
        "bizNameAr": "شرطة عمان السلطانية - قيادة شرطة محافظة الداخلية - - مركز *شرطة إزكي*",

It contains both words of the query (marked in bold), whereas the other
records contain only one of them:

        "id": "63",
        "bizNameAr": "شركة ظفار للتأمين ش.م.ع.ع - فرع ازكي"

It contains only one word of the query (ازكي).

        "id": "56",
        "bizNameAr": "شرطة عمان السلطانية - قيادة شرطة محافظة شمال الشرقية -  - مركز شرطة إبراء"

It contains only one word of the query (شرطة).

        "id": "79",
        "bizNameAr": "شرطة عمان السلطانية - قيادة شرطة محافظة شمال الشرقية - - مركز شرطة إبراء"

It contains only one word of the query (شرطة).

These 3 records should not appear in the result: the query contains 2 words,
and only one record contains both of them.
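
To make the expectation concrete, here is a minimal Python sketch of the
matching rule I have in mind: a record should match only when every word of
the query occurs in bizNameAr. This is not Solr code. The helper names
(tokens, matches_all) are made up for illustration, and the alef folding is
only my assumption of what the Arabic normalization in the analyzer does, so
that "إزكي" and "ازكي" compare equal.

```python
import re
import unicodedata

# Assumption: fold hamza/madda alef forms (U+0623, U+0625, U+0622) to bare
# alef (U+0627), roughly imitating an Arabic normalization filter.
ALEF_VARIANTS = str.maketrans({"\u0623": "\u0627", "\u0625": "\u0627", "\u0622": "\u0627"})


def tokens(text):
    """Return the set of normalized words in the text (hypothetical helper)."""
    composed = unicodedata.normalize("NFC", text)
    return set(re.findall(r"\w+", composed.translate(ALEF_VARIANTS)))


def matches_all(query, field_value):
    """True only when every query word occurs in the field value."""
    return tokens(query) <= tokens(field_value)


# The four records returned by the search above.
docs = {
    "82": "شرطة عمان السلطانية - قيادة شرطة محافظة الداخلية - - مركز شرطة إزكي",
    "63": "شركة ظفار للتأمين ش.م.ع.ع - فرع ازكي",
    "56": "شرطة عمان السلطانية - قيادة شرطة محافظة شمال الشرقية -  - مركز شرطة إبراء",
    "79": "شرطة عمان السلطانية - قيادة شرطة محافظة شمال الشرقية - - مركز شرطة إبراء",
}

query = "شرطة ازكي"
expected = [doc_id for doc_id, name in docs.items() if matches_all(query, name)]
print(expected)
```

With this rule only record 82 remains, which is the result I expect from the
search.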


I would really suggest that we give you a live demo on our system together
with my Arab colleague, so that the issue is clearer to you. Let us know if
we can do that.

Thanks



