Re: About Query Parser
That's an impressive answer. I actually wanted to know exactly how the query parser works. I'm supposed to collect some fields, values and other related info and build a Solr query from them, and I wanted to know whether I should use this query parser or Java code to build that query.

Anyway, it looks like I have to go with Java code to build it, and I'm on it.

Thanks,
Vivek

On Fri, Jun 20, 2014 at 6:06 PM, Daniel Collins wrote:
> [...]
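For what it's worth, the "collect fields and values, then build a Solr query" step can be sketched roughly as below. This is only an illustration, written in Python rather than Java to keep it short; the names escape_term, build_q and build_select_url are made up for this example, and the escaped character set is an assumption modelled on what SolrJ's ClientUtils.escapeQueryChars handles. In real Java code that SolrJ helper is probably the better starting point.

```python
import re
from urllib.parse import urlencode

# Characters treated as special by the standard query syntax, plus whitespace.
# Assumption: this roughly mirrors SolrJ's ClientUtils.escapeQueryChars.
_SPECIAL = re.compile(r'([+\-!(){}\[\]^"~*?:\\/&|;\s])')


def escape_term(value: str) -> str:
    """Backslash-escape query-syntax characters in a single value."""
    return _SPECIAL.sub(r"\\\1", value)


def build_q(clauses: dict, operator: str = "AND") -> str:
    """Join field/value pairs into one q string, e.g. 'a:1 AND b:2'."""
    if not clauses:
        return "*:*"  # nothing collected: fall back to match-all
    parts = [f"{field}:{escape_term(str(value))}" for field, value in clauses.items()]
    return f" {operator} ".join(parts)


def build_select_url(base_url: str, clauses: dict, **params) -> str:
    """Build a /select URL; base_url and the extra params come from the caller."""
    query = {"q": build_q(clauses), **params}
    return f"{base_url}/select?{urlencode(query)}"
```

The string produced by build_q is still "human-style" query syntax; Solr's query parser turns it into Lucene Query objects on the server side.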
Re: About Query Parser
I would say "*:*" is a human-readable/writable query, as is "inStock:false". The former will be converted by the query parser into a MatchAllDocsQuery, which is what Lucene understands. The latter will be converted (again by the query parser) into some query. Now this is where *which* query parser you are using is important. Is "inStock" a word to be queried, or a field in your schema? Probably the latter, but the query parser has to determine that using the Solr schema. So I would expect that query to be converted to a TermQuery(Term("inStock", "false")), i.e. a query for the value false in the field inStock.

This is all interesting, but what are you really trying to find out? If you just want to run queries and see what they translate to, you can use the debug options when you send the query in, and then Solr will return to you both the raw query (with any other options that the query handler might have added to your query) and the Lucene Query generated from it.

E.g. from running "*:*" on a Solr instance:

  "rawquerystring": "*:*",
  "querystring": "*:*",
  "parsedquery": "MatchAllDocsQuery(*:*)",
  "parsedquery_toString": "*:*",
  "QParser": "LuceneQParser",

Or (this shows the difference between raw query syntax and parsed query syntax):

  "rawquerystring": "body_en:test AND headline_en:hello",
  "querystring": "body_en:test AND headline_en:hello",
  "parsedquery": "+body_en:test +headline_en:hello",
  "parsedquery_toString": "+body_en:test +headline_en:hello",
  "QParser": "LuceneQParser",

On 20 June 2014 13:05, Vivekanand Ittigi wrote:
> [...]
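As a rough illustration of the debug round trip described above, the sketch below builds a request URL with debugQuery=true and pulls the raw and parsed query out of an already-decoded JSON response. The helper names (debug_url, summarize_debug) are made up for this example, and it assumes the debug fields sit under a top-level "debug" key when wt=json is used; the sample values are the ones quoted in this thread, not output from a new run.

```python
from urllib.parse import urlencode


def debug_url(base_url: str, q: str) -> str:
    """Build a /select URL that asks Solr to include debug information."""
    params = {"q": q, "wt": "json", "debugQuery": "true"}
    return f"{base_url}/select?{urlencode(params)}"


def summarize_debug(response: dict) -> str:
    """Return 'raw => parsed (parser)' from a decoded Solr JSON response."""
    debug = response.get("debug", {})  # assumed location of the debug section
    return "{} => {} ({})".format(
        debug.get("rawquerystring"),
        debug.get("parsedquery"),
        debug.get("QParser"),
    )


# Sample values copied from the debug output quoted in this thread.
sample = {
    "debug": {
        "rawquerystring": "body_en:test AND headline_en:hello",
        "parsedquery": "+body_en:test +headline_en:hello",
        "QParser": "LuceneQParser",
    }
}
print(summarize_debug(sample))
```

Fetching the URL and decoding the JSON is left out on purpose, since that part depends on how your Solr instance is reachable.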
Re: About Query Parser
All right, let me put it this way.

http://192.168.1.78:8983/solr/collection1/select?q=inStock:false&facet=true&facet.field=popularity&wt=xml&indent=true

I just want to know what form this is. Is it a Lucene query, or does this query have to go through the query parser to be converted into a Lucene query?

Thanks,
Vivek

On Fri, Jun 20, 2014 at 5:19 PM, Alexandre Rafalovitch wrote:
> [...]
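One way to look at a URL like the one above is to split it into its parts: the request handler path, the "q" string that is handed to the query parser, and the remaining parameters that other components (faceting, response writer) consume. A small sketch, where split_solr_request is a hypothetical helper written only for this illustration:

```python
from urllib.parse import urlsplit, parse_qs


def split_solr_request(url: str) -> dict:
    """Separate handler path, q string and remaining parameters of a Solr URL."""
    parts = urlsplit(url)
    # parse_qs also undoes percent-encoding, so q=*%3A* comes back as *:*
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return {
        "handler": "/" + parts.path.rsplit("/", 1)[-1],
        "query_string": params.get("q"),
        "other_params": {k: v for k, v in params.items() if k != "q"},
    }


url = (
    "http://192.168.1.78:8983/solr/collection1/select"
    "?q=inStock:false&facet=true&facet.field=popularity&wt=xml&indent=true"
)
print(split_solr_request(url))
```

Only the query_string part is query syntax; it is still a plain string at this point, not a Lucene query.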
Re: About Query Parser
That's *:*, and it's a special case. There is no scoring here, nor searching. Just a dump of documents. Not even filtering or faceting. I sure hope you have more interesting examples.

Regards,
Alex

On 20/06/2014 6:40 pm, "Vivekanand Ittigi" wrote:
> [...]
Re: About Query Parser
Hi Daniel,

You said inputs are "human-generated" and outputs are "Lucene objects". So my question is: what does the query below mean? Does it fall under human-generated or Lucene?

http://localhost:8983/solr/collection1/select?q=*%3A*&wt=xml&indent=true

Thanks,
Vivek

On Fri, Jun 20, 2014 at 3:55 PM, Daniel Collins wrote:
> [...]
Re: About Query Parser
Alexandre's response is very thorough, so I'm really simplifying things, I confess, but here's my "query parsers for dummies". :)

In terms of inputs/outputs, a QueryParser takes a string (generally assumed to be "human generated", i.e. something a user might type in, so maybe a sentence or a set of words; the format can vary) and outputs a Lucene Query object (http://lucene.apache.org/core/4_8_1/core/org/apache/lucene/search/Query.html), which in fact is a kind of "tree" (again, I'm simplifying, I know) since a query can contain nested expressions.

So, very loosely, it's a translator from a human-generated query into the structure that Lucene can handle. There are several different query parsers, since they all use different input syntax and different ways of handling the same constructs (to handle A and B, should the user type "+A +B" or "A and B" or just "A B", for example?), and have different levels of support for the various Query structures that Lucene can handle: SpanQuery, FuzzyQuery, PhraseQuery, etc.

We, for example, use an XML-based query parser. Why (you might well ask!)? Well, we had an already used and supported query syntax of our own, which our users understood, so we couldn't use an off-the-shelf query parser. We could have built our own in Java, but for a variety of reasons we parse our queries in a C++-based front-end system ahead of Solr, so we needed an interim format to pass queries to Solr that was as near to a Lucene Query object as we could get (and there was an existing XML parser to save us starting from square one!).

As part of that Query construction (but independent of which QueryParser you use), Solr will also make use of a set of Tokenizers and Filters (https://cwiki.apache.org/confluence/display/solr/Understanding+Analyzers,+Tokenizers,+and+Filters), but that's more to do with dealing with the terms in the query (so in my examples above: is A a real word, does it need stemming, lowercasing, removing because it's a stopword, etc.).
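To make the "string in, tree out" idea concrete, here is a deliberately tiny toy parser. It is not how Lucene's or Solr's parsers are implemented; the classes MatchAll, Term and BooleanAnd are hypothetical stand-ins for Lucene's Query objects, it only understands "field:value", bare words, "*:*" and " AND ", and the lowercasing is just a placeholder for the real analysis chain.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class MatchAll:
    """Stand-in for a match-all-documents query."""


@dataclass(frozen=True)
class Term:
    """Stand-in for a single term query on one field."""
    field: str
    text: str


@dataclass(frozen=True)
class BooleanAnd:
    """Stand-in for a boolean query whose clauses are all required."""
    clauses: Tuple["Node", ...]


Node = Union[MatchAll, Term, BooleanAnd]


def parse_clause(clause: str, default_field: str) -> Node:
    clause = clause.strip()
    if clause == "*:*":
        return MatchAll()
    field, sep, text = clause.partition(":")
    if not sep:  # bare word: look for it in the default field
        field, text = default_field, clause
    return Term(field, text.lower())  # lowercasing stands in for analysis


def parse(query: str, default_field: str = "text") -> Node:
    """Turn a query string into a small tree of query objects."""
    clauses = [parse_clause(c, default_field) for c in query.split(" AND ")]
    return clauses[0] if len(clauses) == 1 else BooleanAnd(tuple(clauses))


print(parse("body_en:test AND headline_en:Hello"))
```

A different toy parser could accept "+A +B" or "A B" for the same tree, which is the sense in which several parsers can share one output structure.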
Re: About Query Parser
I am going to have a go at this. Maybe others can add/correct.

When you make a request to Solr, it hits a request handler first, e.g. a "/select" request handler. That's defined in solrconfig.xml.

The request handler can change your request with some defaults, required and overriding parameters. For "solr.SearchHandler", it can also define which stack of search components then processes the actual request. A handler can define the stack explicitly (e.g. the "/suggest" request handler), use the default stack, or append/prepend to the default stack (e.g. the "/spell" request handler).

The default search component stack can be seen in the commented-out section of solrconfig.xml and consists of 6 components: query, facet, mlt (MoreLikeThis), highlight, stats, and debug.

The query component is the one that actually does the searching and figures out what the result documents are. And it uses query parsers for that. There are multiple query parsers available. The most common are "standard/lucene", "dismax" and "edismax". But there are a bunch more: https://cwiki.apache.org/confluence/display/solr/Query+Syntax+and+Parsing

If you don't have a query component, you are not actually searching for documents; you are doing something else (e.g. spelling).

These parsers transform what you sent in your URL (in the "q" parameter, but also others) into the Lucene or internal queries that return documents with some ranking attached.

Then, other components do their own things too. The facet component adds facets, the highlight component adds highlight sections based on the already collected information, and so on.

Then, all that gets serialized into one of many supported formats (XML, JSON, Ruby, etc.) and sent back to the client.

If you want examples, then just read through solrconfig.xml and schema.xml and understand how they hang together. That's why they are so long, so people can see the defaults and examples. If you did not care for that, your solrconfig.xml could be as small as: https://github.com/arafalov/solr-indexing-book/blob/master/published/collection1/conf/solrconfig.xml

Regards,
Alex.

P.S. The interesting question in return is "where are you stuck that you think that knowing what a query parser is will move you further ahead?"

Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency

On Fri, Jun 20, 2014 at 3:55 PM, Vivekanand Ittigi wrote:
> [...]
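The flow described above (handler merges defaults, then a stack of components each adds its part to the response) can be modelled very loosely as below. This is a toy model only, not Solr's actual classes or configuration: every function name here is invented for the illustration, the "index" is a plain list of dicts, and the "search" is a substring check.

```python
def query_component(request, response, index):
    # Stand-in for real searching: keep documents whose text contains q.
    q = request["q"]
    response["docs"] = [d for d in index if q == "*:*" or q in d["text"]]


def facet_component(request, response, index):
    # Only acts when faceting was asked for; counts values in the result set.
    if request.get("facet") != "true":
        return
    field = request["facet.field"]
    counts = {}
    for doc in response["docs"]:
        counts[doc[field]] = counts.get(doc[field], 0) + 1
    response["facets"] = {field: counts}


def debug_component(request, response, index):
    if request.get("debugQuery") == "true":
        response["debug"] = {"rawquerystring": request["q"]}


# Hypothetical three-component stack; Solr's default stack has six components.
DEFAULT_COMPONENTS = (query_component, facet_component, debug_component)


def handle(request, index, defaults=None, components=DEFAULT_COMPONENTS):
    """Merge handler defaults into the request, then run each component in order."""
    merged = {**(defaults or {}), **request}
    response = {}
    for component in components:
        component(merged, response, index)
    return response


index = [
    {"text": "solr in action", "cat": "book"},
    {"text": "lucene in action", "cat": "book"},
    {"text": "solr mug", "cat": "gift"},
]
print(handle({"q": "solr", "facet": "true", "facet.field": "cat"}, index))
```

Leaving query_component out of the stack corresponds to the "not actually searching" case: later components would have no result documents to work with.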
About Query Parser
Hi,

I think this might be a silly question, but I want to be clear about it.

What is a query parser? What does it do? I know it's used for converting a query, but from what to what? What is the input and what is the output of a query parser? And where exactly can this feature be used?

If possible, please explain with an example. It would really help a lot.

Thanks,
Vivek