Akash Mehta created SOLR-9418:
---------------------------------

             Summary: Probabilistic-Query-Parser RequestHandler
                 Key: SOLR-9418
                 URL: https://issues.apache.org/jira/browse/SOLR-9418
             Project: Solr
          Issue Type: New Feature
      Security Level: Public (Default Security Level. Issues are Public)
            Reporter: Akash Mehta


The main aim of this requestHandler is to find the best parsing for a given
query, that is, to recognize the distinct phrases within the query. Training
data is needed to identify these phrases. The project works as follows:
1.) Generate all possible parsings of the given query.
2.) For each possible parsing, calculate a naive-Bayes-like score.
3.) The main scoring is done by going through all the documents in the training
set and estimating the probability that a group of words occurs together as a
phrase, compared to the same words occurring separately in the same document.
The score is then normalized. The title field is given more weight than the
content field, and this weighting is configurable.
4.) Finally, after every possible parsing has been scored, the one with the
highest score is returned.
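
To make the four steps concrete, here is a minimal sketch in Python. It is
not the patch itself, only an illustration under assumptions: the names
FIELD_WEIGHTS, SINGLE_WORD_SCORE and ALPHA, the add-one style smoothing, the
fixed score for one-word phrases, and the geometric-mean normalization are
all hypothetical choices, since the description above does not specify them.

```python
from itertools import product
from math import exp, log

# Hypothetical settings; the issue only says the title/content weighting is configurable.
FIELD_WEIGHTS = {"title": 2.0, "content": 1.0}
SINGLE_WORD_SCORE = 0.5  # assumed neutral score for a one-word "phrase"
ALPHA = 1.0              # assumed smoothing constant, keeps every score above zero


def parsings(words):
    """Step 1: yield every split of `words` into contiguous phrases."""
    if not words:
        return
    # Each of the len(words) - 1 gaps is either a phrase boundary or not.
    for cuts in product([False, True], repeat=len(words) - 1):
        parsing, start = [], 0
        for i, cut in enumerate(cuts, start=1):
            if cut:
                parsing.append(tuple(words[start:i]))
                start = i
        parsing.append(tuple(words[start:]))
        yield parsing


def contains_phrase(tokens, phrase):
    """True if `phrase` appears as a contiguous run inside `tokens`."""
    k = len(phrase)
    return any(tuple(tokens[i:i + k]) == phrase for i in range(len(tokens) - k + 1))


def phrase_score(phrase, docs):
    """Step 3: per field, the smoothed fraction of documents containing all the
    words in which they also appear as a phrase, averaged with field weights."""
    if len(phrase) == 1:
        return SINGLE_WORD_SCORE
    total = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        together = as_phrase = 0
        for doc in docs:
            tokens = doc.get(field, "").lower().split()
            if all(word in tokens for word in phrase):
                together += 1
                if contains_phrase(tokens, phrase):
                    as_phrase += 1
        total += weight * (as_phrase + ALPHA) / (together + 2 * ALPHA)
    return total / sum(FIELD_WEIGHTS.values())


def parsing_score(parsing, docs):
    """Step 2: combine phrase scores in log space, naive-Bayes style, then
    normalize by the number of phrases (a geometric mean)."""
    logs = [log(phrase_score(phrase, docs)) for phrase in parsing]
    return exp(sum(logs) / len(logs))


def best_parsing(query, docs):
    """Step 4: return the highest-scoring parsing of `query`."""
    words = query.lower().split()
    if not words:
        return []
    return max(parsings(words), key=lambda p: parsing_score(p, docs))


# Toy training set, invented purely for illustration.
DOCS = [
    {"title": "New York pizza guide", "content": "the best pizza in new york city"},
    {"title": "New York travel", "content": "new york is a city where pizza is popular"},
    {"title": "Pizza recipes", "content": "a new recipe for pizza from york"},
]

print(best_parsing("new york pizza", DOCS))  # -> [('new', 'york'), ('pizza',)]
```

Enumerating every parsing is exponential in the query length (2^(n-1)
parsings for n words), which is acceptable for short queries; a real handler
would probably also precompute the counts rather than rescan the training
documents for every phrase.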


