Akash Mehta created SOLR-9418:
---------------------------------
Summary: Probabilistic-Query-Parser RequestHandler
Key: SOLR-9418
URL: https://issues.apache.org/jira/browse/SOLR-9418
Project: Solr
Issue Type: New Feature
Security Level: Public (Default Security Level. Issues are Public)
Reporter: Akash Mehta
The main aim of this RequestHandler is to find the best parsing for a given
query, that is, to recognize the distinct phrases within the query. Some
training data is needed to identify these phrases. The project works as
follows:
1.) Generate all possible parsings of the given query.
2.) Calculate a naive-Bayes-like score for each possible parsing.
3.) The main scoring is done by going through all the documents in the training
set and estimating the probability that a group of words occurs together as a
phrase, compared to the same words occurring separately in the same document.
The score is then normalized. The title field is given more weight than the
content field, and this weighting is configurable.
4.) Finally, after every possible parsing has been scored, the one with the
highest score is returned.
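
The four steps above can be sketched roughly as follows. This is only an
illustration in Python, not the code proposed in this issue: the function
names, the document layout (dicts with "title" and "content"), the neutral
prior for single words, the smoothing constant and the exact scoring formula
are all assumptions made for the example. It treats a "parsing" as a split of
the query into contiguous phrases and scores each phrase by how often its words
appear adjacent among the training texts that contain all of them.

```python
from itertools import combinations
from math import log

def segmentations(tokens):
    """Step 1: yield every split of tokens into contiguous phrases."""
    n = len(tokens)
    for k in range(n):
        for cuts in combinations(range(1, n), k):
            bounds = (0, *cuts, n)
            yield [tuple(tokens[a:b]) for a, b in zip(bounds, bounds[1:])]

def contains_phrase(words, phrase):
    """True if the words of phrase appear adjacent and in order in words."""
    m = len(phrase)
    return any(tuple(words[i:i + m]) == phrase
               for i in range(len(words) - m + 1))

def phrase_ratio(phrase, texts):
    """Step 3: among texts containing all words of the phrase, the fraction
    in which they occur together as a phrase (assumed estimator)."""
    together = cooccur = 0
    for text in texts:
        words = text.lower().split()
        if all(w in words for w in phrase):
            cooccur += 1
            if contains_phrase(words, phrase):
                together += 1
    return together / cooccur if cooccur else 0.0

def score_parsing(parsing, docs, title_weight=2.0, content_weight=1.0,
                  smoothing=1e-3):
    """Step 2: naive-Bayes-like score, a sum of log phrase probabilities,
    normalized by the number of phrases. Weights are configurable."""
    titles = [d["title"] for d in docs]
    contents = [d["content"] for d in docs]
    total = 0.0
    for phrase in parsing:
        if len(phrase) == 1:
            p = 0.5  # assumed neutral prior for a single word
        else:
            p = (title_weight * phrase_ratio(phrase, titles)
                 + content_weight * phrase_ratio(phrase, contents))
            p /= title_weight + content_weight
        total += log(p + smoothing)
    return total / len(parsing)

def best_parsing(query, docs, **weights):
    """Step 4: return the highest-scoring parsing of the query."""
    tokens = query.lower().split()
    if not tokens:
        return []
    return max(segmentations(tokens),
               key=lambda p: score_parsing(p, docs, **weights))

# Toy training set, invented for this example.
docs = [
    {"title": "New York travel guide",
     "content": "things to do in new york this spring"},
    {"title": "Pizza in New York",
     "content": "the best pizza in new york comes from small shops"},
    {"title": "Homemade pizza dough",
     "content": "new recipes for pizza from york bakers"},
]
best = best_parsing("new york pizza", docs)
print(best)
```

With this toy training set the sketch groups "new york" into one phrase and
leaves "pizza" on its own. Enumerating every split grows as 2^(n-1) for an
n-word query, so a real handler would probably need to cap the query length or
prune candidates.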
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)