Combining edismax Parser with Block Join Parent Query Parser

2021-01-11 Thread Ravi Lodhi
Hello Guys,

Does Solr support combining the edismax parser with the Block Join Parent
Query Parser? If yes, could you provide the syntax or point me to a reference
document? And how does it affect performance?

I am working on a search screen in an eCommerce application's backend. The
requirement is to design an order search screen. We were thinking of using a
nested document approach: the order information document as the parent and all
of its items as child documents. We need to perform a keyword search on both
parent and child documents. Using the Block Join Parent Query Parser we can
search only on child documents and retrieve their parents. The sample document
structure is given below. We need an "OR" condition between the edismax query
and the Block Join Parent Query Parser.

Is the nested document a good approach for the order and order items data, or
should we denormalize the data at either the parent level or the child level?
Which schema design would be best suited to this scenario?

e.g. If I search "WEB" and it is found in any of the child documents, the
parent doc should be returned; if it is found in any parent document, that
parent should be returned as well.
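One possible sketch of that "OR" (untested; the qf field lists, the docType parent marker, and the keyword parameter are assumptions for illustration, not something from this thread) combines the two clauses with the bool query parser:

```text
q={!bool should=$parentClause should=$childClause}
parentClause={!edismax qf="salesChannel salesRepNames orderStatusDescription" v=$keyword}
childClause={!parent which="docType:order" v=$childQ}
childQ={!edismax qf="productName productBrandName productInternalName" v=$keyword}
keyword=WEB
```

The parent clause matches the keyword on parent fields directly, while the child clause runs edismax over child fields and maps any hit back to its parent; bool then ORs the two together.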

Sample Parent doc:
{
"orderId": "ORD1",
"orderTypeId": "SALES",
"orderStatusId": "ORDER_APPROVED",
"orderStatusDescription": "Approved",
"orderDate": "2021-01-09T07:00:00Z",
"orderGrandTotal": "200",
"salesChannel": "WEB",
"salesRepNames": "Demo Supplier",
"originFacilityId": "FACILITY_01"
}

Sample Child doc:

{
"orderItemId": "ORD1",
"itemStatusId": "ORDER_APPROVED",
"itemStatusDescription": "Approved",
"productId": "P01",
"productName": "Demo Product",
"productInternalName": "Demo Product 01",
"productBrandName": "Demo Brand"
}

Any help on this will be much appreciated!

Thanks!
Ravi Lodhi


Re: Developing update processor/Query Parser

2020-06-26 Thread Vincenzo D'Amore
Sharing a static object between the URP and the QParser is easy. But when the
core is reloaded, this shared static object is rewritten from scratch.
An old QParser that still references that object could run into serious
problems: inconsistencies, concurrent modification exceptions, etc.
On the other hand, trying to solve this problem with synchronization,
mutexes, or semaphores would lead to a poorly performing solution.
Another real problem I have is that it is not clear what happens
internally.
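As a toy illustration of that reload hazard (plain Python, no Solr APIs — purely a sketch of the ownership pattern): if each reloaded core owns its own config, and each parser reads through the core that created it, an in-flight parser keeps a consistent view instead of racing on a rewritten static.

```python
# Toy model of the reload problem: each "core" owns a private copy of its
# config, and each parser reads through the core that created it, so a
# reload never mutates the object an in-flight parser is using.

class QParser:
    def __init__(self, core):
        self.core = core                      # parser is bound to one core

    def size(self):
        return self.core.config["size"]       # reads the owning core's config


class Core:
    def __init__(self, config):
        self.config = dict(config)            # private copy, never shared
        self.parser = QParser(self)


old_core = Core({"size": 50})
in_flight = old_core.parser                   # request being served mid-reload

new_core = Core({"size": 80})                 # "core reload" builds a new core

# The in-flight parser still sees the old, consistent value...
assert in_flight.size() == 50
# ...while new requests routed to the new core see the new one.
assert new_core.parser.size() == 80
print("ok")
```

This mirrors the per-core PluginBag behaviour described later in the thread, as opposed to a JVM-wide static that every parser shares.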

On Fri, Jun 26, 2020 at 9:19 PM Vincenzo D'Amore  wrote:

> Hi Gus, thanks for the thorough explanation.
> In fact, I was concerned about how to hold the shared information (between
> URP and QParser), for example when a core is reloaded.
> What happens when a core is reloaded? Very likely I have a couple of new
> URP/QParser but an older QParser may still be serving a request.
> If my assumption is true, now I have a clearer idea.
> So the question is, how to share an object (CustomConfig) between URP and
> QParser.
> And this CustomConfig object is created by URP every time the core is
> reloaded.
>
> On Fri, Jun 26, 2020 at 8:47 PM Gus Heck  wrote:
>
>> During the request, the parser plugin is retrieved from a PluginBag on the
>> SolrCore object, so it should be reloaded at the same time as the update
>> component (which comes from another PluginBag on SolrCore). If the
>> components are deployed with consistent configuration in solrconfig.xml,
>> any given SolrCore instance should have a consistent set of both. If you
>> want to avoid repeating the information, one possibility is to use a
>> system
>> property
>>
>> https://lucene.apache.org/solr/guide/8_4/configuring-solrconfig-xml.html#jvm-system-properties
>> though
>> the suitability of that may depend on the size of your cluster and your
>> deployment infrastructure.
>>
>> On Thu, Jun 25, 2020 at 2:47 PM Mikhail Khludnev  wrote:
>>
>> > Hello, Vincenzo.
>> > Please find above about a dedicated component doing nothing, but just
>> > holding a config.
>> > Also you may extract config into a file and load it by
>> > SolrResourceLoaderAware.
>> >
>> > On Thu, Jun 25, 2020 at 2:06 PM Vincenzo D'Amore 
>> > wrote:
>> >
>> > > Hi Mikhail, yup, I was trying to avoid putting logic in Solr.
>> > > Just to be a little bit more specific, consider that if the update
>> > factory
>> > > writes a field that has a size of 50.
>> > > The QParser should be aware of the current size when writing a query.
>> > >
>> > > Is it possible to have in solrconfig.xml file a shared configuration?
>> > >
>> > > I mean a snippet of configuration shared between update processor
>> factory
>> > > and QParser.
>> > >
>> > >
>> > > On Wed, Jun 24, 2020 at 10:33 PM Mikhail Khludnev 
>> > wrote:
>> > >
>> > > > Hello, Vincenzo.
>> > > > Presumably you can introduce a component which just holds a config
>> > data,
>> > > > > and then this component might be looked up from QParser and
>> > UpdateFactory.
>> > > > Overall, it seems like embedding logic into Solr core, which rarely
>> > works
>> > > > well.
>> > > >
>> > > > On Wed, Jun 24, 2020 at 8:00 PM Vincenzo D'Amore <
>> v.dam...@gmail.com>
>> > > > wrote:
>> > > >
>> > > > > Hi all,
>> > > > >
>> > > > > I've started to work on a couple of components that are very tightly
>> coupled together.
>> > > > > An update processor that writes few fields in the solr index and a
>> > > Query
>> > > > > Parser that, well, then reads such fields from the index.
>> > > > >
>> > > > > Such components share few configuration parameters together, I'm
>> > asking
>> > > > if
>> > > > > there is a pattern, a draft, a sample, some guidelines or best
>> > > practices
>> > > > > that explains how to properly save configuration parameters.
>> > > > >
>> > > > > The configuration is written into the solrconfig.xml file, for
>> > example:
>> > > > >
>> > > > >
>> > > > >  
>> > > > >x1
>> > > > >x2
>> > > > >  
>> > > > >
>> > > > >
>> > > > > And then query parser :
>> > >

Re: Developing update processor/Query Parser

2020-06-26 Thread Vincenzo D'Amore
Hi Gus, thanks for the thorough explanation.
In fact, I was concerned about how to hold the shared information (between
URP and QParser), for example when a core is reloaded.
What happens when a core is reloaded? Very likely I have a couple of new
URP/QParser but an older QParser may still be serving a request.
If my assumption is true, now I have a clearer idea.
So the question is, how to share an object (CustomConfig) between URP and
QParser.
And this CustomConfig object is created by URP every time the core is
reloaded.

On Fri, Jun 26, 2020 at 8:47 PM Gus Heck  wrote:

> During the request, the parser plugin is retrieved from a PluginBag on the
> SolrCore object, so it should be reloaded at the same time as the update
> component (which comes from another PluginBag on SolrCore). If the
> components are deployed with consistent configuration in solrconfig.xml,
> any given SolrCore instance should have a consistent set of both. If you
> want to avoid repeating the information, one possibility is to use a system
> property
>
> https://lucene.apache.org/solr/guide/8_4/configuring-solrconfig-xml.html#jvm-system-properties
> though
> the suitability of that may depend on the size of your cluster and your
> deployment infrastructure.
>
> On Thu, Jun 25, 2020 at 2:47 PM Mikhail Khludnev  wrote:
>
> > Hello, Vincenzo.
> > Please find above about a dedicated component doing nothing, but just
> > holding a config.
> > Also you may extract config into a file and load it by
> > SolrResourceLoaderAware.
> >
> > On Thu, Jun 25, 2020 at 2:06 PM Vincenzo D'Amore 
> > wrote:
> >
> > > Hi Mikhail, yup, I was trying to avoid putting logic in Solr.
> > > Just to be a little bit more specific, consider that if the update
> > factory
> > > writes a field that has a size of 50.
> > > The QParser should be aware of the current size when writing a query.
> > >
> > > Is it possible to have in solrconfig.xml file a shared configuration?
> > >
> > > I mean a snippet of configuration shared between update processor
> factory
> > > and QParser.
> > >
> > >
> > > On Wed, Jun 24, 2020 at 10:33 PM Mikhail Khludnev 
> > wrote:
> > >
> > > > Hello, Vincenzo.
> > > > Presumably you can introduce a component which just holds a config
> > data,
> > > > and then this component might be looked up from QParser and
> > UpdateFactory.
> > > > Overall, it seems like embedding logic into Solr core, which rarely
> > works
> > > > well.
> > > >
> > > > On Wed, Jun 24, 2020 at 8:00 PM Vincenzo D'Amore  >
> > > > wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > I've started to work on a couple of components that are very tightly coupled.
> > > > > An update processor that writes few fields in the solr index and a
> > > Query
> > > > > Parser that, well, then reads such fields from the index.
> > > > >
> > > > > Such components share few configuration parameters together, I'm
> > asking
> > > > if
> > > > > there is a pattern, a draft, a sample, some guidelines or best
> > > practices
> > > > > that explains how to properly save configuration parameters.
> > > > >
> > > > > The configuration is written into the solrconfig.xml file, for
> > example:
> > > > >
> > > > >
> > > > >  
> > > > >x1
> > > > >x2
> > > > >  
> > > > >
> > > > >
> > > > > And then query parser :
> > > > >
> > > > >  > > > > class="com.example.query.MyCustomQueryParserPlugin" />
> > > > >
> > > > > I'm struggling because the change of configuration on the updated
> > > > processor
> > > > > has an impact on the query parser.
> > > > > For example the configuration info shared between those two
> > components
> > > > can
> > > > > be overwritten during a core reload.
> > > > > Basically, during an update or a core reload, there is a query
> parser
> > > > that
> > > > > is serving requests while some other component is updating the
> index.
> > > > > So I suppose there should be a pattern, an approach, a common
> > solution
> > > > when
> > > > > a piece of configuration has to be loaded at boot, or when the core
> > is
> > > > > loaded.
> > > > > Or when, after an update a new searcher is created and a new query
> > > parser
> > > > > is created.
> > > > >
> > > > > Any suggestion is really appreciated.
> > > > >
> > > > > Best regards,
> > > > > Vincenzo
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Vincenzo D'Amore
> > > > >
> > > >
> > > >
> > > > --
> > > > Sincerely yours
> > > > Mikhail Khludnev
> > > >
> > >
> > >
> > > --
> > > Vincenzo D'Amore
> > >
> >
> >
> > --
> > Sincerely yours
> > Mikhail Khludnev
> >
>
>
> --
> http://www.needhamsoftware.com (work)
> http://www.the111shift.com (play)
>


-- 
Vincenzo D'Amore


Re: Developing update processor/Query Parser

2020-06-26 Thread Gus Heck
During the request, the parser plugin is retrieved from a PluginBag on the
SolrCore object, so it should be reloaded at the same time as the update
component (which comes from another PluginBag on SolrCore). If the
components are deployed with consistent configuration in solrconfig.xml,
any given SolrCore instance should have a consistent set of both. If you
want to avoid repeating the information, one possibility is to use a system
property
https://lucene.apache.org/solr/guide/8_4/configuring-solrconfig-xml.html#jvm-system-properties
though
the suitability of that may depend on the size of your cluster and your
deployment infrastructure.
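A hedged sketch of that system-property approach (the class names, chain name, property name, and default are invented for illustration): both plugins read the same ${...} substitution in solrconfig.xml, so the value is defined once per JVM:

```xml
<!-- solrconfig.xml: both plugins read the same JVM property, set at startup
     with e.g. bin/solr start -Dmy.shared.size=50 -->
<updateRequestProcessorChain name="myChain">
  <processor class="com.example.update.MyUpdateProcessorFactory">
    <int name="size">${my.shared.size:50}</int>
  </processor>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>

<queryParser name="myParser" class="com.example.query.MyCustomQueryParserPlugin">
  <int name="size">${my.shared.size:50}</int>
</queryParser>
```

The :50 suffix supplies a default when the property is unset, and since the property is fixed for the life of the JVM, every reload hands both plugins the same value.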

On Thu, Jun 25, 2020 at 2:47 PM Mikhail Khludnev  wrote:

> Hello, Vincenzo.
> Please find above about a dedicated component doing nothing, but just
> holding a config.
> Also you may extract config into a file and load it by
> SolrResourceLoaderAware.
>
> On Thu, Jun 25, 2020 at 2:06 PM Vincenzo D'Amore 
> wrote:
>
> > Hi Mikhail, yup, I was trying to avoid putting logic in Solr.
> > Just to be a little bit more specific, consider that if the update
> factory
> > writes a field that has a size of 50.
> > The QParser should be aware of the current size when writing a query.
> >
> > Is it possible to have in solrconfig.xml file a shared configuration?
> >
> > I mean a snippet of configuration shared between update processor factory
> > and QParser.
> >
> >
> > On Wed, Jun 24, 2020 at 10:33 PM Mikhail Khludnev 
> wrote:
> >
> > > Hello, Vincenzo.
> > > Presumably you can introduce a component which just holds a config
> data,
> > > and then this component might be looked up from QParser and
> UpdateFactory.
> > > Overall, it seems like embedding logic into Solr core, which rarely
> works
> > > well.
> > >
> > > On Wed, Jun 24, 2020 at 8:00 PM Vincenzo D'Amore 
> > > wrote:
> > >
> > > > Hi all,
> > > >
> > > > I've started to work on a couple of components that are very tightly coupled.
> > > > An update processor that writes few fields in the solr index and a
> > Query
> > > > Parser that, well, then reads such fields from the index.
> > > >
> > > > Such components share few configuration parameters together, I'm
> asking
> > > if
> > > > there is a pattern, a draft, a sample, some guidelines or best
> > practices
> > > > that explains how to properly save configuration parameters.
> > > >
> > > > The configuration is written into the solrconfig.xml file, for
> example:
> > > >
> > > >
> > > >  
> > > >x1
> > > >x2
> > > >  
> > > >
> > > >
> > > > And then query parser :
> > > >
> > > >  > > > class="com.example.query.MyCustomQueryParserPlugin" />
> > > >
> > > > I'm struggling because the change of configuration on the updated
> > > processor
> > > > has an impact on the query parser.
> > > > For example the configuration info shared between those two
> components
> > > can
> > > > be overwritten during a core reload.
> > > > Basically, during an update or a core reload, there is a query parser
> > > that
> > > > is serving requests while some other component is updating the index.
> > > > So I suppose there should be a pattern, an approach, a common
> solution
> > > when
> > > > a piece of configuration has to be loaded at boot, or when the core
> is
> > > > loaded.
> > > > Or when, after an update a new searcher is created and a new query
> > parser
> > > > is created.
> > > >
> > > > Any suggestion is really appreciated.
> > > >
> > > > Best regards,
> > > > Vincenzo
> > > >
> > > >
> > > >
> > > > --
> > > > Vincenzo D'Amore
> > > >
> > >
> > >
> > > --
> > > Sincerely yours
> > > Mikhail Khludnev
> > >
> >
> >
> > --
> > Vincenzo D'Amore
> >
>
>
> --
> Sincerely yours
> Mikhail Khludnev
>


-- 
http://www.needhamsoftware.com (work)
http://www.the111shift.com (play)


RE: Unexpected results using Block Join Parent Query Parser

2020-06-26 Thread Tor-Magne Stien Hagen
Alright, that solved the problem. Thank you very much!

Tor-Magne Stien Hagen

-Original Message-
From: Mikhail Khludnev  
Sent: Thursday, June 25, 2020 12:13 PM
To: solr-user 
Subject: Re: Unexpected results using Block Join Parent Query Parser

Ok. My fault. Old sport, you know. When retrieving intermediate scopes, the 
parents bitmask should include all enclosing scopes as well. It's a dark side of 
the BJQ:
{!parent which='class:(section OR composition)'}
I'm not sure what you are trying to achieve by specifying grandchildren as a 
parent bitmask. Note, the algorithm assumes that the parents' bitmask has the 
last doc in the segment set. I.e. the 'which' query supplied at runtime should 
strictly correspond to the block structure indexed beforehand.

On Thu, Jun 25, 2020 at 12:05 PM Tor-Magne Stien Hagen  wrote:

> If I modify the query like this:
>
> {!parent which='class:instruction'}class:observation
>
> It still returns a result for the instruction document, even though 
> the document with class instruction does not have any children...
>
> Tor-Magne Stien Hagen
>
> -Original Message-
> From: Mikhail Khludnev 
> Sent: Wednesday, June 24, 2020 2:14 PM
> To: solr-user 
> Subject: Re: Unexpected results using Block Join Parent Query Parser
>
> Jan, thanks for the clarification.
> Sure, you can use {!parent which=class:section} to return children which
> have a grandchild matching the subordinate query.
> Note: there's something about named scopes, which I didn't get into 
> yet, but it might be relevant to the problem.
>
> On Wed, Jun 24, 2020 at 1:43 PM Jan Høydahl  wrote:
>
> > I guess the key question here is whether «parent» in BlockJoin is 
> > strictly top-level parent/root, i.e. class:composition for the 
> > example in this thread? Or can {!parent} parser also be used to 
> > select the «child» level in a child/grandchild relationship inside a block?
> >
> > Jan
> >
> > > 24. jun. 2020 kl. 11:36 skrev Tor-Magne Stien Hagen :
> > >
> > > Thanks for your answer,
> > >
> > > What kind of rules exists for the which clause? In other words, 
> > > how can
> > you identify parents without using some sort of filtering?
> > >
> > > Tor-Magne Stien Hagen
> > >
> > > -Original Message-
> > > From: Mikhail Khludnev 
> > > Sent: Wednesday, June 24, 2020 10:01 AM
> > > To: solr-user 
> > > Subject: Re: Unexpected results using Block Join Parent Query 
> > > Parser
> > >
> > > Hello,
> > >
> > > Please check warning box titled Using which
> > >
> > > https://lucene.apache.org/solr/guide/8_5/other-parsers.html#block-join-parent-query-parser
> > >
> > > On Wed, Jun 24, 2020 at 10:01 AM Tor-Magne Stien Hagen 
> > > 
> > wrote:
> > >
> > >> Hi,
> > >>
> > >> I have indexed the following nested document in Solr:
> > >>
> > >> {
> > >>"id": "1",
> > >>"class": "composition",
> > >>"children": [
> > >>{
> > >>"id": "2",
> > >>"class": "section",
> > >>"children": [
> > >>{
> > >>"id": "3",
> > >>"class": "observation"
> > >>}
> > >>]
> > >>},
> > >>{
> > >>"id": "4",
> > >>"class": "section",
> > >>"children": [
> > >>{
> > >>        "id": "5",
> > >>"class": "instruction"
> > >>}
> > >>]
> > >>}
> > >>]
> > >> }
> > >>
> > >> Given the following query:
> > >>
> > >> {!parent which='id:4'}id:3
> > >>
> > >> I expect the result to be empty as document 3 is not a child 
> > >> document of document 4.
> > >>
> > >> To reproduce, use the docker container provided here:
> > >> https://github.com/tormsh/Solr-Example
> > >>
> > >> Have I misunderstood something regarding the Block Join Parent 
> > >> Query Parser?
> > >>
> > >> Tor-Magne Stien Hagen
> > >>
> > >>
> > >
> > > --
> > > Sincerely yours
> > > Mikhail Khludnev
> >
> >
>
> --
> Sincerely yours
> Mikhail Khludnev
>


--
Sincerely yours
Mikhail Khludnev


Re: Developing update processor/Query Parser

2020-06-25 Thread Mikhail Khludnev
Hello, Vincenzo.
Please see above regarding a dedicated component that does nothing but
hold a config.
Also, you may extract the config into a file and load it via
SolrResourceLoaderAware.
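One hedged reading of that suggestion (the component name and class are invented for the example): register a component whose only job is to carry the parameters, and let both plugins fetch it once the core is ready:

```xml
<!-- solrconfig.xml: a do-nothing component that only holds configuration -->
<searchComponent name="sharedConfig" class="com.example.SharedConfigComponent">
  <int name="size">50</int>
</searchComponent>
```

Both the QParserPlugin and the update processor factory could then implement SolrCoreAware and call core.getSearchComponent("sharedConfig") from inform(SolrCore), so each reload hands every new core instance its own freshly initialized holder.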

On Thu, Jun 25, 2020 at 2:06 PM Vincenzo D'Amore  wrote:

> Hi Mikhail, yup, I was trying to avoid putting logic in Solr.
> Just to be a little bit more specific, consider that if the update factory
> writes a field that has a size of 50.
> The QParser should be aware of the current size when writing a query.
>
> Is it possible to have in solrconfig.xml file a shared configuration?
>
> I mean a snippet of configuration shared between update processor factory
> and QParser.
>
>
> On Wed, Jun 24, 2020 at 10:33 PM Mikhail Khludnev  wrote:
>
> > Hello, Vincenzo.
> > Presumably you can introduce a component which just holds a config data,
> > and then this component might be looked up from QParser and UpdateFactory.
> > Overall, it seems like embedding logic into Solr core, which rarely works
> > well.
> >
> > On Wed, Jun 24, 2020 at 8:00 PM Vincenzo D'Amore 
> > wrote:
> >
> > > Hi all,
> > >
> > > I've started to work on a couple of components that are very tightly coupled.
> > > An update processor that writes few fields in the solr index and a
> Query
> > > Parser that, well, then reads such fields from the index.
> > >
> > > Such components share few configuration parameters together, I'm asking
> > if
> > > there is a pattern, a draft, a sample, some guidelines or best
> practices
> > > that explains how to properly save configuration parameters.
> > >
> > > The configuration is written into the solrconfig.xml file, for example:
> > >
> > >
> > >  
> > >x1
> > >x2
> > >  
> > >
> > >
> > > And then query parser :
> > >
> > >  > > class="com.example.query.MyCustomQueryParserPlugin" />
> > >
> > > I'm struggling because the change of configuration on the updated
> > processor
> > > has an impact on the query parser.
> > > For example the configuration info shared between those two components
> > can
> > > be overwritten during a core reload.
> > > Basically, during an update or a core reload, there is a query parser
> > that
> > > is serving requests while some other component is updating the index.
> > > So I suppose there should be a pattern, an approach, a common solution
> > when
> > > a piece of configuration has to be loaded at boot, or when the core is
> > > loaded.
> > > Or when, after an update a new searcher is created and a new query
> parser
> > > is created.
> > >
> > > Any suggestion is really appreciated.
> > >
> > > Best regards,
> > > Vincenzo
> > >
> > >
> > >
> > > --
> > > Vincenzo D'Amore
> > >
> >
> >
> > --
> > Sincerely yours
> > Mikhail Khludnev
> >
>
>
> --
> Vincenzo D'Amore
>


-- 
Sincerely yours
Mikhail Khludnev


Re: Developing update processor/Query Parser

2020-06-25 Thread Vincenzo D'Amore
Hi Mikhail, yup, I was trying to avoid putting logic in Solr.
To be a little more specific, consider that the update factory writes a field
that has a size of 50; the QParser should be aware of the current size when
building a query.

Is it possible to have a shared configuration in the solrconfig.xml file?

I mean a snippet of configuration shared between the update processor factory
and the QParser.


On Wed, Jun 24, 2020 at 10:33 PM Mikhail Khludnev  wrote:

> Hello, Vincenzo.
> Presumably you can introduce a component which just holds a config data,
> and then this component might be looked up from QParser and UpdateFactory.
> Overall, it seems like embedding logic into Solr core, which rarely works
> well.
>
> On Wed, Jun 24, 2020 at 8:00 PM Vincenzo D'Amore 
> wrote:
>
> > Hi all,
> >
> > I've started to work on a couple of components that are very tightly coupled.
> > An update processor that writes few fields in the solr index and a Query
> > Parser that, well, then reads such fields from the index.
> >
> > Such components share few configuration parameters together, I'm asking
> if
> > there is a pattern, a draft, a sample, some guidelines or best practices
> > that explains how to properly save configuration parameters.
> >
> > The configuration is written into the solrconfig.xml file, for example:
> >
> >
> >  
> >x1
> >x2
> >  
> >
> >
> > And then query parser :
> >
> >  > class="com.example.query.MyCustomQueryParserPlugin" />
> >
> > I'm struggling because the change of configuration on the updated
> processor
> > has an impact on the query parser.
> > For example the configuration info shared between those two components
> can
> > be overwritten during a core reload.
> > Basically, during an update or a core reload, there is a query parser
> that
> > is serving requests while some other component is updating the index.
> > So I suppose there should be a pattern, an approach, a common solution
> when
> > a piece of configuration has to be loaded at boot, or when the core is
> > loaded.
> > Or when, after an update a new searcher is created and a new query parser
> > is created.
> >
> > Any suggestion is really appreciated.
> >
> > Best regards,
> > Vincenzo
> >
> >
> >
> > --
> > Vincenzo D'Amore
> >
>
>
> --
> Sincerely yours
> Mikhail Khludnev
>


-- 
Vincenzo D'Amore


Re: Unexpected results using Block Join Parent Query Parser

2020-06-25 Thread Mikhail Khludnev
Ok. My fault. Old sport, you know. When retrieving intermediate scopes, the
parents bitmask should include all enclosing scopes as well. It's a dark
side of the BJQ:
{!parent which='class:(section OR composition)'}
I'm not sure what you are trying to achieve by specifying grandchildren as a
parent bitmask. Note, the algorithm assumes that the parents' bitmask has the
last doc in the segment set. I.e. the 'which' query supplied at runtime should
strictly correspond to the block structure indexed beforehand.
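Applied to the nested example discussed in this thread (a sketch, not a verified query), the contrast looks like:

```text
# violates the assumption: 'which' matches only the intermediate scope,
# so the parents bitset does not end each indexed block
{!parent which='class:section'}class:observation

# includes every enclosing scope, matching the indexed block structure
{!parent which='class:(section OR composition)'}class:observation
```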

On Thu, Jun 25, 2020 at 12:05 PM Tor-Magne Stien Hagen  wrote:

> If I modify the query like this:
>
> {!parent which='class:instruction'}class:observation
>
> It still returns a result for the instruction document, even though the
> document with class instruction does not have any children...
>
> Tor-Magne Stien Hagen
>
> -Original Message-
> From: Mikhail Khludnev 
> Sent: Wednesday, June 24, 2020 2:14 PM
> To: solr-user 
> Subject: Re: Unexpected results using Block Join Parent Query Parser
>
> Jan, thanks for the clarification.
> Sure, you can use {!parent which=class:section} to return children which
> have a grandchild matching the subordinate query.
> Note: there's something about named scopes, which I didn't get into yet,
> but it might be relevant to the problem.
>
> On Wed, Jun 24, 2020 at 1:43 PM Jan Høydahl  wrote:
>
> > I guess the key question here is whether «parent» in BlockJoin is
> > strictly top-level parent/root, i.e. class:composition for the example
> > in this thread? Or can {!parent} parser also be used to select the
> > «child» level in a child/grandchild relationship inside a block?
> >
> > Jan
> >
> > > 24. jun. 2020 kl. 11:36 skrev Tor-Magne Stien Hagen :
> > >
> > > Thanks for your answer,
> > >
> > > What kind of rules exists for the which clause? In other words, how
> > > can
> > you identify parents without using some sort of filtering?
> > >
> > > Tor-Magne Stien Hagen
> > >
> > > -Original Message-
> > > From: Mikhail Khludnev 
> > > Sent: Wednesday, June 24, 2020 10:01 AM
> > > To: solr-user 
> > > Subject: Re: Unexpected results using Block Join Parent Query Parser
> > >
> > > Hello,
> > >
> > > Please check warning box titled Using which
> > >
> > > https://lucene.apache.org/solr/guide/8_5/other-parsers.html#block-join-parent-query-parser
> > >
> > > On Wed, Jun 24, 2020 at 10:01 AM Tor-Magne Stien Hagen 
> > wrote:
> > >
> > >> Hi,
> > >>
> > >> I have indexed the following nested document in Solr:
> > >>
> > >> {
> > >>"id": "1",
> > >>"class": "composition",
> > >>"children": [
> > >>{
> > >>"id": "2",
> > >>"class": "section",
> > >>"children": [
> > >>{
> > >>"id": "3",
> > >>"class": "observation"
> > >>}
> > >>]
> > >>},
> > >>{
> > >>"id": "4",
> > >>"class": "section",
> > >>"children": [
> > >>{
> > >>"id": "5",
> > >>"class": "instruction"
> > >>}
> > >>]
> > >>}
> > >>]
> > >> }
> > >>
> > >> Given the following query:
> > >>
> > >> {!parent which='id:4'}id:3
> > >>
> > >> I expect the result to be empty as document 3 is not a child
> > >> document of document 4.
> > >>
> > >> To reproduce, use the docker container provided here:
> > >> https://github.com/tormsh/Solr-Example
> > >>
> > >> Have I misunderstood something regarding the Block Join Parent
> > >> Query Parser?
> > >>
> > >> Tor-Magne Stien Hagen
> > >>
> > >>
> > >
> > > --
> > > Sincerely yours
> > > Mikhail Khludnev
> >
> >
>
> --
> Sincerely yours
> Mikhail Khludnev
>


-- 
Sincerely yours
Mikhail Khludnev


RE: Unexpected results using Block Join Parent Query Parser

2020-06-25 Thread Tor-Magne Stien Hagen
If I modify the query like this:

{!parent which='class:instruction'}class:observation

It still returns a result for the instruction document, even though the 
document with class instruction does not have any children...

Tor-Magne Stien Hagen

-Original Message-
From: Mikhail Khludnev  
Sent: Wednesday, June 24, 2020 2:14 PM
To: solr-user 
Subject: Re: Unexpected results using Block Join Parent Query Parser

Jan, thanks for the clarification.
Sure, you can use {!parent which=class:section} to return children which have a
grandchild matching the subordinate query.
Note: there's something about named scopes, which I didn't get into yet, but it 
might be relevant to the problem.

On Wed, Jun 24, 2020 at 1:43 PM Jan Høydahl  wrote:

> I guess the key question here is whether «parent» in BlockJoin is 
> strictly top-level parent/root, i.e. class:composition for the example 
> in this thread? Or can {!parent} parser also be used to select the 
> «child» level in a child/grandchild relationship inside a block?
>
> Jan
>
> > 24. jun. 2020 kl. 11:36 skrev Tor-Magne Stien Hagen :
> >
> > Thanks for your answer,
> >
> > What kind of rules exists for the which clause? In other words, how 
> > can
> you identify parents without using some sort of filtering?
> >
> > Tor-Magne Stien Hagen
> >
> > -Original Message-
> > From: Mikhail Khludnev 
> > Sent: Wednesday, June 24, 2020 10:01 AM
> > To: solr-user 
> > Subject: Re: Unexpected results using Block Join Parent Query Parser
> >
> > Hello,
> >
> > Please check warning box titled Using which
> >
> > https://lucene.apache.org/solr/guide/8_5/other-parsers.html#block-join-parent-query-parser
> >
> > On Wed, Jun 24, 2020 at 10:01 AM Tor-Magne Stien Hagen 
> wrote:
> >
> >> Hi,
> >>
> >> I have indexed the following nested document in Solr:
> >>
> >> {
> >>"id": "1",
> >>"class": "composition",
> >>"children": [
> >>{
> >>"id": "2",
> >>"class": "section",
> >>"children": [
> >>{
> >>"id": "3",
> >>"class": "observation"
> >>}
> >>]
> >>},
> >>{
> >>"id": "4",
> >>"class": "section",
> >>"children": [
> >>{
> >>"id": "5",
> >>"class": "instruction"
> >>}
> >>]
> >>}
> >>]
> >> }
> >>
> >> Given the following query:
> >>
> >> {!parent which='id:4'}id:3
> >>
> >> I expect the result to be empty as document 3 is not a child 
> >> document of document 4.
> >>
> >> To reproduce, use the docker container provided here:
> >> https://github.com/tormsh/Solr-Example
> >>
> >> Have I misunderstood something regarding the Block Join Parent 
> >> Query Parser?
> >>
> >> Tor-Magne Stien Hagen
> >>
> >>
> >
> > --
> > Sincerely yours
> > Mikhail Khludnev
>
>

--
Sincerely yours
Mikhail Khludnev


Re: Developing update processor/Query Parser

2020-06-24 Thread Mikhail Khludnev
Hello, Vincenzo.
Presumably you can introduce a component which just holds a config data,
and then this component might be lookedup from QParser and UpdateFactory.
Overall, it seems like embedding logic into Solr core, which rarely works
well.
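A rough sketch of that suggestion (the component name and class below are hypothetical, not an existing Solr class): declare one named component in solrconfig.xml that only holds the shared settings, and have both plugins resolve it through the core.

```xml
<!-- hypothetical shared-config holder, registered once in solrconfig.xml -->
<searchComponent name="sharedConfig" class="com.example.SharedConfigComponent">
  <lst name="fields">
    <str name="field">x1</str>
    <str name="field">x2</str>
  </lst>
</searchComponent>
```

A QParserPlugin can then fetch it with req.getCore().getSearchComponent("sharedConfig") at parse time, and the update processor factory can grab the same instance in inform(SolrCore) if it implements SolrCoreAware; since a core reload rebuilds the component together with both plugins, they stay consistent across reloads.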

On Wed, Jun 24, 2020 at 8:00 PM Vincenzo D'Amore  wrote:

> Hi all,
>
> I've started to work on a couple of components that are very tightly coupled:
> an update processor that writes a few fields into the Solr index, and a query
> parser that, well, then reads those fields from the index.
>
> These components share a few configuration parameters, and I'm asking if
> there is a pattern, a draft, a sample, some guidelines or best practices
> that explain how to properly save configuration parameters.
>
> The configuration is written into the solrconfig.xml file, for example:
>
>
>  
>    x1
>    x2
>  
>
>
> And then query parser :
>
>  class="com.example.query.MyCustomQueryParserPlugin" />
>
> I'm struggling because a configuration change on the update processor
> has an impact on the query parser.
> For example the configuration info shared between those two components can
> be overwritten during a core reload.
> Basically, during an update or a core reload, there is a query parser that
> is serving requests while some other component is updating the index.
> So I suppose there should be a pattern, an approach, a common solution when
> a piece of configuration has to be loaded at boot, or when the core is
> loaded.
> Or when, after an update a new searcher is created and a new query parser
> is created.
>
> Any suggestion is really appreciated.
>
> Best regards,
> Vincenzo
>
>
>
> --
> Vincenzo D'Amore
>


-- 
Sincerely yours
Mikhail Khludnev


Developing update processor/Query Parser

2020-06-24 Thread Vincenzo D'Amore
Hi all,

I've started to work on a couple of components that are very tightly coupled:
an update processor that writes a few fields into the Solr index, and a query
parser that, well, then reads those fields from the index.

These components share a few configuration parameters, and I'm asking if
there is a pattern, a draft, a sample, some guidelines or best practices
that explain how to properly save configuration parameters.

The configuration is written into the solrconfig.xml file, for example:

   
 
   x1
   x2
 
   

And then query parser :



I'm struggling because a configuration change on the update processor
has an impact on the query parser.
For example the configuration info shared between those two components can
be overwritten during a core reload.
Basically, during an update or a core reload, there is a query parser that
is serving requests while some other component is updating the index.
So I suppose there should be a pattern, an approach, a common solution when
a piece of configuration has to be loaded at boot, or when the core is
loaded.
Or when, after an update a new searcher is created and a new query parser
is created.

Any suggestion is really appreciated.

Best regards,
Vincenzo



-- 
Vincenzo D'Amore


Re: Unexpected results using Block Join Parent Query Parser

2020-06-24 Thread Mikhail Khludnev
Jan, thanks for the clarification.
Sure, you can use {!parent which=class:section} to return children which
have grandchildren matching the subordinate query.
Note: there's something about named scopes, which I didn't get into yet,
but it might be relevant to the problem.
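To make the warning concrete: `which` must be a query that matches *every* document at the chosen parent level (it defines the parent/child boundary for the whole index), not a filter for one particular parent. A sketch for the document in this thread, assuming the `class` field reliably marks the levels:

```
# return section-level parents whose children match the subordinate query
q={!parent which='class:section' v='class:observation'}
# narrow down to one specific parent afterwards, rather than inside `which`
fq=id:4
```

Using which='id:4' instead marks only doc 4 as a parent, so the block boundaries are computed incorrectly and documents that are not actually its children (such as id:3, which precedes it in the block) appear to match.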

On Wed, Jun 24, 2020 at 1:43 PM Jan Høydahl  wrote:

> I guess the key question here is whether «parent» in BlockJoin is strictly
> top-level parent/root, i.e. class:composition for the example in this
> thread? Or can {!parent} parser also be used to select the «child» level in
> a child/grandchild relationship inside a block?
>
> Jan
>
> > 24. jun. 2020 kl. 11:36 skrev Tor-Magne Stien Hagen :
> >
> > Thanks for your answer,
> >
> > What kind of rules exist for the which clause? In other words, how can
> you identify parents without using some sort of filtering?
> >
> > Tor-Magne Stien Hagen
> >
> > -Original Message-
> > From: Mikhail Khludnev 
> > Sent: Wednesday, June 24, 2020 10:01 AM
> > To: solr-user 
> > Subject: Re: Unexpected results using Block Join Parent Query Parser
> >
> > Hello,
> >
> > Please check warning box titled Using which
> >
> https://lucene.apache.org/solr/guide/8_5/other-parsers.html#block-join-parent-query-parser
> >
> > On Wed, Jun 24, 2020 at 10:01 AM Tor-Magne Stien Hagen 
> wrote:
> >
> >> Hi,
> >>
> >> I have indexed the following nested document in Solr:
> >>
> >> {
> >>"id": "1",
> >>"class": "composition",
> >>"children": [
> >>{
> >>"id": "2",
> >>"class": "section",
> >>"children": [
> >>{
> >>"id": "3",
> >>"class": "observation"
> >>}
> >>]
> >>},
> >>{
> >>"id": "4",
> >>"class": "section",
> >>"children": [
> >>{
> >>"id": "5",
> >>"class": "instruction"
> >>}
> >>]
> >>}
> >>    ]
> >> }
> >>
> >> Given the following query:
> >>
> >> {!parent which='id:4'}id:3
> >>
> >> I expect the result to be empty as document 3 is not a child document
> >> of document 4.
> >>
> >> To reproduce, use the docker container provided here:
> >> https://github.com/tormsh/Solr-Example
> >>
> >> Have I misunderstood something regarding the Block Join Parent Query
> >> Parser?
> >>
> >> Tor-Magne Stien Hagen
> >>
> >>
> >
> > --
> > Sincerely yours
> > Mikhail Khludnev
>
>

-- 
Sincerely yours
Mikhail Khludnev


Re: Unexpected results using Block Join Parent Query Parser

2020-06-24 Thread Jan Høydahl
I guess the key question here is whether «parent» in BlockJoin is strictly 
top-level parent/root, i.e. class:composition for the example in this thread? Or 
can {!parent} parser also be used to select the «child» level in a 
child/grandchild relationship inside a block?

Jan

> 24. jun. 2020 kl. 11:36 skrev Tor-Magne Stien Hagen :
> 
> Thanks for your answer,
> 
> What kind of rules exist for the which clause? In other words, how can you 
> identify parents without using some sort of filtering?
> 
> Tor-Magne Stien Hagen
> 
> -Original Message-
> From: Mikhail Khludnev  
> Sent: Wednesday, June 24, 2020 10:01 AM
> To: solr-user 
> Subject: Re: Unexpected results using Block Join Parent Query Parser
> 
> Hello,
> 
> Please check warning box titled Using which
> https://lucene.apache.org/solr/guide/8_5/other-parsers.html#block-join-parent-query-parser
> 
> On Wed, Jun 24, 2020 at 10:01 AM Tor-Magne Stien Hagen  wrote:
> 
>> Hi,
>> 
>> I have indexed the following nested document in Solr:
>> 
>> {
>>"id": "1",
>>"class": "composition",
>>"children": [
>>{
>>"id": "2",
>>"class": "section",
>>"children": [
>>{
>>"id": "3",
>>"class": "observation"
>>}
>>]
>>},
>>{
>>"id": "4",
>>"class": "section",
>>"children": [
>>{
>>"id": "5",
>>"class": "instruction"
>>}
>>]
>>}
>>]
>> }
>> 
>> Given the following query:
>> 
>> {!parent which='id:4'}id:3
>> 
>> I expect the result to be empty as document 3 is not a child document 
>> of document 4.
>> 
>> To reproduce, use the docker container provided here:
>> https://github.com/tormsh/Solr-Example
>> 
>> Have I misunderstood something regarding the Block Join Parent Query 
>> Parser?
>> 
>> Tor-Magne Stien Hagen
>> 
>> 
> 
> --
> Sincerely yours
> Mikhail Khludnev



RE: Unexpected results using Block Join Parent Query Parser

2020-06-24 Thread Tor-Magne Stien Hagen
Thanks for your answer,

What kind of rules exist for the which clause? In other words, how can you 
identify parents without using some sort of filtering?

Tor-Magne Stien Hagen

-Original Message-
From: Mikhail Khludnev  
Sent: Wednesday, June 24, 2020 10:01 AM
To: solr-user 
Subject: Re: Unexpected results using Block Join Parent Query Parser

Hello,

Please check warning box titled Using which
https://lucene.apache.org/solr/guide/8_5/other-parsers.html#block-join-parent-query-parser

On Wed, Jun 24, 2020 at 10:01 AM Tor-Magne Stien Hagen  wrote:

> Hi,
>
> I have indexed the following nested document in Solr:
>
> {
> "id": "1",
> "class": "composition",
> "children": [
> {
> "id": "2",
> "class": "section",
> "children": [
> {
> "id": "3",
> "class": "observation"
> }
> ]
> },
> {
> "id": "4",
> "class": "section",
> "children": [
> {
> "id": "5",
> "class": "instruction"
> }
> ]
> }
> ]
> }
>
> Given the following query:
>
> {!parent which='id:4'}id:3
>
> I expect the result to be empty as document 3 is not a child document 
> of document 4.
>
> To reproduce, use the docker container provided here:
> https://github.com/tormsh/Solr-Example
>
> Have I misunderstood something regarding the Block Join Parent Query 
> Parser?
>
> Tor-Magne Stien Hagen
>
>

--
Sincerely yours
Mikhail Khludnev


Re: Unexpected results using Block Join Parent Query Parser

2020-06-24 Thread Mikhail Khludnev
Hello,

Please check warning box titled Using which
https://lucene.apache.org/solr/guide/8_5/other-parsers.html#block-join-parent-query-parser

On Wed, Jun 24, 2020 at 10:01 AM Tor-Magne Stien Hagen  wrote:

> Hi,
>
> I have indexed the following nested document in Solr:
>
> {
> "id": "1",
> "class": "composition",
> "children": [
> {
> "id": "2",
> "class": "section",
> "children": [
> {
> "id": "3",
> "class": "observation"
> }
> ]
> },
> {
> "id": "4",
> "class": "section",
> "children": [
> {
> "id": "5",
> "class": "instruction"
> }
> ]
> }
> ]
> }
>
> Given the following query:
>
> {!parent which='id:4'}id:3
>
> I expect the result to be empty as document 3 is not a child document of
> document 4.
>
> To reproduce, use the docker container provided here:
> https://github.com/tormsh/Solr-Example
>
> Have I misunderstood something regarding the Block Join Parent Query
> Parser?
>
> Tor-Magne Stien Hagen
>
>

-- 
Sincerely yours
Mikhail Khludnev


Unexpected results using Block Join Parent Query Parser

2020-06-24 Thread Tor-Magne Stien Hagen
Hi,

I have indexed the following nested document in Solr:

{
"id": "1",
"class": "composition",
"children": [
{
"id": "2",
"class": "section",
"children": [
{
"id": "3",
"class": "observation"
}
]
},
{
"id": "4",
"class": "section",
"children": [
{
"id": "5",
"class": "instruction"
}
]
}
]
}

Given the following query:

{!parent which='id:4'}id:3

I expect the result to be empty as document 3 is not a child document of 
document 4.

To reproduce, use the docker container provided here:
https://github.com/tormsh/Solr-Example

Have I misunderstood something regarding the Block Join Parent Query Parser?

Tor-Magne Stien Hagen



Re: strange behavior of solr query parser

2020-03-02 Thread Hongtai Xue
Hi Phil.Staley

Thanks for your reply.
but I'm afraid that's a different problem.

Our problem can be reproduced since at least SOLR 7.3.0 (the oldest version we
have),
and we guess it might have existed since SOLR-9786.
https://github.com/apache/lucene-solr/commit/bf9db95f218f49bac8e7971eb953a9fd9d13a2f0#diff-269ae02e56283ced3ce781cce21b3147R563

sincerely 
hongtai

From: "Staley, Phil R - DCF" 
Reply-To: "d...@lucene.apache.org" 
Date: Monday, March 2, 2020, 22:38
To: solr_user lucene_apache , 
"d...@lucene.apache.org" 
Subject: Re: strange behavior of solr query parser

I believe we are experiencing the same thing.

We recently upgraded our Drupal 8 sites to SOLR 8.3.1.  We are now getting 
reports of certain patterns of search terms resulting in an error that reads, 
“The website encountered an unexpected error. Please try again later.”
 
Below is a list of example terms that always result in this error and a similar 
list that works fine.  The problem pattern seems to be a search term that 
contains 2 or 3 characters followed by a space, followed by additional text.
 
To confirm that the problem is version 8 of SOLR, I have updated our local and 
UAT sites with the latest Drupal updates that did include an update to the 
Search API Solr module and tested the terms below under SOLR 7.7.2, 8.3.1, and 
8.4.1.  Under version 7.7.2 everything works fine. Under either of the version
8 releases, the problem returns.
 
Thoughts?
 
Search terms that result in error
• w-2 agency directory
• agency w-2 directory
• w-2 agency
• w-2 directory
• w2 agency directory
• w2 agency
• w2 directory
 
Search terms that do not result in error
• w-22 agency directory
• agency directory w-2
• agency w-2directory
• agencyw-2 directory
• w-2
• w2
• agency directory
• agency
• directory
• -2 agency directory
• 2 agency directory
• w-2agency directory
• w2agency directory
 



From: Hongtai Xue 
Sent: Monday, March 2, 2020 3:45 AM
To: solr_user lucene_apache 
Cc: d...@lucene.apache.org 
Subject: strange behavior of solr query parser 
 
Hi,
 
Our team found a strange behavior of solr query parser.
In some specific cases, some conditional clauses on an unindexed field will be 
ignored.
 
for a query like q=A:1 OR B:1 OR A:2 OR B:2
if field B is not indexed (but docValues="true"), "B:1" will be lost.
 
but if you write the query like q=A:1 OR A:2 OR B:1 OR B:2,
it will work perfectly.

the only difference between the two queries is that they are written in
different orders: one is ABAB, the other AABB.
 
■reproduce steps and example explanation
you can easily reproduce this problem on a solr collection with _default 
configset and exampledocs/books.csv data.
 
1. create a _default collection
bin/solr create -c books -s 2 -rf 2
 
2. post books.csv.
bin/post -c books example/exampledocs/books.csv
 
3. run following query.
http://localhost:8983/solr/books/select?q=%2B%28name_str%3AFoundation+OR+cat%3Abook+OR+name_str%3AJhereg+OR+cat%3Acd%29&debug=query
 
 
I printed query parsing debug information. 
you can tell "name_str:Foundation" is lost.
 
query: "name_str:Foundation OR cat:book OR name_str:Jhereg OR cat:cd"
(please note "Jhereg" is "4a 68 65 72 65 67" and "Foundation" is "46 6f 75 6e 
64 61 74 69 6f 6e")

  "debug":{
    "rawquerystring":"+(name_str:Foundation OR cat:book OR name_str:Jhereg OR 
cat:cd)",
    "querystring":"+(name_str:Foundation OR cat:book OR name_str:Jhereg OR 
cat:cd)",
    "parsedquery":"+(cat:book cat:cd (name_str:[[4a 68 65 72 65 67] TO [4a 68 
65 72 65 67]]))",
    "parsedquery_toString":"+(cat:book cat:cd name_str:[[4a 68 65 72 65 67] TO 
[4a 68 65 72 65 67]])",
    "QParser":"LuceneQParser"}}

 
but for query: "name_str:Foundation OR name_str:Jhereg OR cat:book OR cat:cd",
everything is OK. "name_str:Foundation" is not lost.

  "debug":{
    "rawquerystring":"+(name_str:Foundation OR name_str:Jhereg OR cat:book OR 
cat:cd)",
    "querystring":"+(name_str:Foundation OR name_str:Jhereg OR cat:book OR 
cat:cd)",
    "parsedquery":"+(cat:book cat:cd ((name_str:[[46 6f 75 6e 64 61 74 69 6f 
6e] TO [46 6f 75 6e 64 61 74 69 6f 6e]]) (name_str:[[4a 68 65 72 65 67] TO [4a 
68 65 72 65 67]])))",
    "parsedquery_toString":"+(cat:book cat:cd (name_str:[[46 6f 75 6e 64 61 74 
69 6f 6e] TO [46 6f 75 6e 64 61 74 69 6f 6e]] name_str:[[4a 68 65 72 65 67] TO 
[4a 68 65 72 65 67]]))",
    "QParser":"LuceneQParser"}}

http://localhost:8983/solr/books/select?q=%2B%28name_str%3AFoundation+OR+name_str%3AJhereg+OR+cat%3Abook+OR+cat%3Acd%29&debug=query
 
we did a little research, and we wonder if it is a bug in SolrQueryParser.
more specifically, we think the if statement here might be wrong.
https://github.com/apache/lucene-solr/blob/branch_8_4/solr/core/src/java/org/apache/solr/parser/SolrQueryParserBase.java#L711
 
Could you please tell us if it is a bug, or if it's just a wrong query statement.
 
Thanks,
Hongtai Xue



Question About Solr Query Parser

2020-03-02 Thread Kayak28
Hello, Community:

I have a question about interpreting a parsed query from Debug Query.
I used Solr 8.4.1 and LuceneQueryParser.
I was learning the behavior of ManagedSynonymFilter because I was curious
about how "ManagedSynonymGraphFilter" fails to generate a graph.
So, I try to interpret the parsed query, which gives me:
MultiPhraseQuery(managed_synonym_filter_query:\"AA (B SYNONYM_AA_1)
SYNOYM_AA_2 SYNONYM_AA
SYNONYM_AA_3 SYNONYM_AA_4 SYNONYM_AA_5\")
when I query q=managed_synonym_filter_query:"AAB" (SYNONYM_AA_n means
synonyms for AA that is defined in the managed-resource (1 <= n <= 5) )

I wonder why (B SYNONYM_AA_1) appears and what these parentheses mean.

If anyone knows any reasons or clues, I would very much appreciate you
sharing the information.

Sincerely,
Kaya Ota


Re: strange behavior of solr query parser

2020-03-02 Thread Staley, Phil R - DCF
I believe we are experiencing the same thing.


We recently upgraded our Drupal 8 sites to SOLR 8.3.1.  We are now getting 
reports of certain patterns of search terms resulting in an error that reads, 
“The website encountered an unexpected error. Please try again later.”



Below is a list of example terms that always result in this error and a similar 
list that works fine.  The problem pattern seems to be a search term that 
contains 2 or 3 characters followed by a space, followed by additional text.



To confirm that the problem is version 8 of SOLR, I have updated our local and 
UAT sites with the latest Drupal updates that did include an update to the 
Search API Solr module and tested the terms below under SOLR 7.7.2, 8.3.1, and 
8.4.1.  Under version 7.7.2 everything works fine. Under either of the version 
8 releases, the problem returns.



Thoughts?



Search terms that result in error

  *   w-2 agency directory
  *   agency w-2 directory
  *   w-2 agency
  *   w-2 directory
  *   w2 agency directory
  *   w2 agency
  *   w2 directory



Search terms that do not result in error

  *   w-22 agency directory
  *   agency directory w-2
  *   agency w-2directory
  *   agencyw-2 directory
  *   w-2
  *   w2
  *   agency directory
  *   agency
  *   directory
  *   -2 agency directory
  *   2 agency directory
  *   w-2agency directory
  *   w2agency directory





From: Hongtai Xue 
Sent: Monday, March 2, 2020 3:45 AM
To: solr_user lucene_apache 
Cc: d...@lucene.apache.org 
Subject: strange behavior of solr query parser


Hi,



Our team found a strange behavior of solr query parser.

In some specific cases, some conditional clauses on an unindexed field will be 
ignored.



for a query like q=A:1 OR B:1 OR A:2 OR B:2

if field B is not indexed (but docValues="true"), "B:1" will be lost.



but if you write the query like q=A:1 OR A:2 OR B:1 OR B:2,

it will work perfectly.



the only difference between the two queries is that they are written in different orders.

one is ABAB, the other AABB.



■reproduce steps and example explanation

you can easily reproduce this problem on a solr collection with _default 
configset and exampledocs/books.csv data.



1. create a _default collection

bin/solr create -c books -s 2 -rf 2



2. post books.csv.

bin/post -c books example/exampledocs/books.csv



3. run following query.

http://localhost:8983/solr/books/select?q=%2B%28name_str%3AFoundation+OR+cat%3Abook+OR+name_str%3AJhereg+OR+cat%3Acd%29&debug=query





I printed query parsing debug information.

you can tell "name_str:Foundation" is lost.



query: "name_str:Foundation OR cat:book OR name_str:Jhereg OR cat:cd"

(please note "Jhereg" is "4a 68 65 72 65 67" and "Foundation" is "46 6f 75 6e 
64 61 74 69 6f 6e")



  "debug":{

"rawquerystring":"+(name_str:Foundation OR cat:book OR name_str:Jhereg OR 
cat:cd)",

"querystring":"+(name_str:Foundation OR cat:book OR name_str:Jhereg OR 
cat:cd)",

"parsedquery":"+(cat:book cat:cd (name_str:[[4a 68 65 72 65 67] TO [4a 68 
65 72 65 67]]))",

"parsedquery_toString":"+(cat:book cat:cd name_str:[[4a 68 65 72 65 67] TO 
[4a 68 65 72 65 67]])",

"QParser":"LuceneQParser"}}





but for query: "name_str:Foundation OR name_str:Jhereg OR cat:book OR cat:cd",

everything is OK. "name_str:Foundation" is not lost.



  "debug":{

"rawquerystring":"+(name_str:Foundation OR name_str:Jhereg OR cat:book OR 
cat:cd)",

"querystring":"+(name_str:Foundation OR name_str:Jhereg OR cat:book OR 
cat:cd)",

"parsedquery":"+(cat:book cat:cd ((name_str:[[46 6f 75 6e 64 61 74 69 6f 
6e] TO [46 6f 75 6e 64 61 74 69 6f 6e]]) (name_str:[[4a 68 65 72 65 67] TO [4a 
68 65 72 65 67]])))",

"parsedquery_toString":"+(cat:book cat:cd (name_str:[[46 6f 75 6e 64 61 74 
69 6f 6e] TO [46 6f 75 6e 64 61 74 69 6f 6e]] name_str:[[4a 68 65 72 65 67] TO 
[4a 68 65 72 65 67]]))",

"QParser":"LuceneQParser"}}



http://localhost:8983/solr/books/select?q=%2B%28name_str%3AFoundation+OR+name_str%3AJhereg+OR+cat%3Abook+OR+cat%3Acd%29&debug=query



we did a little research, and we wonder if it is a bug in SolrQueryParser.

more specifically, we think the if statement here might be wrong.

https://github.com/apache/lucene-solr/blob/branch_8_4/solr/core/src/java/org/apache/solr/parser/SolrQueryParserBase.java#L711



Could you please tell us if it is a bug, or if it's just a wrong query statement.



Thanks,

Hongtai Xue


strange behavior of solr query parser

2020-03-02 Thread Hongtai Xue
Hi,

Our team found a strange behavior of solr query parser.
In some specific cases, some conditional clauses on an unindexed field will be 
ignored.

for a query like q=A:1 OR B:1 OR A:2 OR B:2
if field B is not indexed (but docValues="true"), "B:1" will be lost.

but if you write the query like q=A:1 OR A:2 OR B:1 OR B:2,
it will work perfectly.

the only difference between the two queries is that they are written in
different orders: one is ABAB, the other AABB.

■reproduce steps and example explanation
you can easily reproduce this problem on a solr collection with _default 
configset and exampledocs/books.csv data.

1. create a _default collection
bin/solr create -c books -s 2 -rf 2

2. post books.csv.
bin/post -c books example/exampledocs/books.csv

3. run following query.
http://localhost:8983/solr/books/select?q=%2B%28name_str%3AFoundation+OR+cat%3Abook+OR+name_str%3AJhereg+OR+cat%3Acd%29&debug=query


I printed query parsing debug information.
you can tell "name_str:Foundation" is lost.

query: "name_str:Foundation OR cat:book OR name_str:Jhereg OR cat:cd"
(please note "Jhereg" is "4a 68 65 72 65 67" and "Foundation" is "46 6f 75 6e 
64 61 74 69 6f 6e")

  "debug":{
"rawquerystring":"+(name_str:Foundation OR cat:book OR name_str:Jhereg OR 
cat:cd)",
"querystring":"+(name_str:Foundation OR cat:book OR name_str:Jhereg OR 
cat:cd)",
"parsedquery":"+(cat:book cat:cd (name_str:[[4a 68 65 72 65 67] TO [4a 68 
65 72 65 67]]))",
"parsedquery_toString":"+(cat:book cat:cd name_str:[[4a 68 65 72 65 67] TO 
[4a 68 65 72 65 67]])",
"QParser":"LuceneQParser"}}


but for query: "name_str:Foundation OR name_str:Jhereg OR cat:book OR cat:cd",
everything is OK. "name_str:Foundation" is not lost.

  "debug":{
"rawquerystring":"+(name_str:Foundation OR name_str:Jhereg OR cat:book OR 
cat:cd)",
"querystring":"+(name_str:Foundation OR name_str:Jhereg OR cat:book OR 
cat:cd)",
"parsedquery":"+(cat:book cat:cd ((name_str:[[46 6f 75 6e 64 61 74 69 6f 
6e] TO [46 6f 75 6e 64 61 74 69 6f 6e]]) (name_str:[[4a 68 65 72 65 67] TO [4a 
68 65 72 65 67]])))",
"parsedquery_toString":"+(cat:book cat:cd (name_str:[[46 6f 75 6e 64 61 74 
69 6f 6e] TO [46 6f 75 6e 64 61 74 69 6f 6e]] name_str:[[4a 68 65 72 65 67] TO 
[4a 68 65 72 65 67]]))",
"QParser":"LuceneQParser"}}

http://localhost:8983/solr/books/select?q=%2B%28name_str%3AFoundation+OR+name_str%3AJhereg+OR+cat%3Abook+OR+cat%3Acd%29&debug=query

we did a little research, and we wonder if it is a bug in SolrQueryParser.
more specifically, we think the if statement here might be wrong.
https://github.com/apache/lucene-solr/blob/branch_8_4/solr/core/src/java/org/apache/solr/parser/SolrQueryParserBase.java#L711

Could you please tell us if it is a bug, or if it's just a wrong query statement.

Thanks,
Hongtai Xue
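Until the root cause is settled, a sketch of the workaround implied by the report above: keep clauses on the same field adjacent, so the clauses on the docValues-only field are not dropped.

```
# ABAB ordering - the first clause on the docValues-only field is silently dropped
q=name_str:Foundation OR cat:book OR name_str:Jhereg OR cat:cd

# AABB ordering - all four clauses survive parsing
q=name_str:Foundation OR name_str:Jhereg OR cat:book OR cat:cd
```

Alternatively, setting indexed="true" on the field sidesteps the docValues-rewrite code path entirely.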


Re: Graph Query Parser Syntax

2020-02-27 Thread sambasivarao giddaluri
Hi All ,
any suggestions?


On Fri, Feb 14, 2020 at 5:20 PM sambasivarao giddaluri <
sambasiva.giddal...@gmail.com> wrote:

> Hi All,
> In our project we have to use multiple graph queries with AND and OR
> conditions, but the graph query parser does not work for the scenario below. Can
> anyone suggest how to overcome this problem? This is blocking our
> pre-prod release.
> We are also using traversalFilter, but our use case still needs multiple OR
> and AND graph queries.
>
>
>
> *works*
> {!graph from=parentId to=parentId returnRoot=false}id:abc
> *works*
> ({!graph from=parentId to=parentId returnRoot=false}id:abc)
> *works*
> ({!graph from=parentId to=parentId returnRoot=false}id:abc AND name:test)
> *works*
> {!graph from=parentId to=parentId returnRoot=false}(id:abc AND name:test)
>
> *Fails Syntax Error *
> ({!graph from=parentId to=parentId returnRoot=false}(id:abc AND
> name:test))
>
> *Fails Syntax Error  *
> ({!graph from=parentId to=parentId returnRoot=false}(id:abc AND
> name:test))  OR (({!graph from=parentId to=parentId
> returnRoot=false}(description :abc AND name:test))
>
>
> '(id:abc': Encountered "<EOF>" at line 1, column 13.
> Was expecting one of:
>     <AND> ... <OR> ... <NOT> ... "+" ... "-" ... <BAREOPER> ...
>     "(" ... ")" ... "*" ... "^" ... <QUOTED> ... <TERM> ...
>     <FUZZY_SLOP> ... <PREFIXTERM> ... <WILDTERM> ... <REGEXPTERM> ...
>     "[" ... "{" ... <NUMBER> ... "filter(" ... <LPARAMS> ...
>
> Regards
> sam
>
>
>


Graph Query Parser Syntax

2020-02-14 Thread sambasivarao giddaluri
Hi All,
In our project we have to use multiple graph queries with AND and OR
conditions, but the graph query parser does not work for the scenario below. Can
anyone suggest how to overcome this problem? This is blocking our
pre-prod release.
We are also using traversalFilter, but our use case still needs multiple OR
and AND graph queries.



*works*
{!graph from=parentId to=parentId returnRoot=false}id:abc
*works*
({!graph from=parentId to=parentId returnRoot=false}id:abc)
*works*
({!graph from=parentId to=parentId returnRoot=false}id:abc AND name:test)
*works*
{!graph from=parentId to=parentId returnRoot=false}(id:abc AND name:test)

*Fails Syntax Error *
({!graph from=parentId to=parentId returnRoot=false}(id:abc AND name:test))

*Fails Syntax Error  *
({!graph from=parentId to=parentId returnRoot=false}(id:abc AND
name:test))  OR (({!graph from=parentId to=parentId
returnRoot=false}(description :abc AND name:test))


'(id:abc': Encountered "<EOF>" at line 1, column 13.
Was expecting one of:
    <AND> ... <OR> ... <NOT> ... "+" ... "-" ... <BAREOPER> ...
    "(" ... ")" ... "*" ... "^" ... <QUOTED> ... <TERM> ...
    <FUZZY_SLOP> ... <PREFIXTERM> ... <WILDTERM> ... <REGEXPTERM> ...
    "[" ... "{" ... <NUMBER> ... "filter(" ... <LPARAMS> ...

Regards
sam
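One workaround worth trying here (a sketch, not verified on every Solr version): move the subordinate query into the `v` local parameter. That makes each `{!graph}` expression self-contained, so it can be wrapped in parentheses and combined with OR/AND without tripping the parser.

```
({!graph from=parentId to=parentId returnRoot=false v='id:abc AND name:test'})
OR
({!graph from=parentId to=parentId returnRoot=false v='description:abc AND name:test'})
```

Parameter dereferencing also works: v=$gq1, with gq1=id:abc AND name:test sent as a separate request parameter.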


Extending SOLR default/eDisMax query parser with Span queries functionalities

2020-01-07 Thread Kaminski, Adi
Hi,
We would like to extend the SOLR default query parser (named 'lucene' per: 
https://lucene.apache.org/solr/guide/6_6/the-standard-query-parser.html)
or eDisMax with the additional functionality of Lucene span queries, in order to 
allow position search (SpanFirst, etc.) to be executed via the standard parsers
through a more trivial interface (for example an 'sq=' clause).

Is there any guideline/HowTo regarding required areas to focus on/implement, 
important notes/checklist, etc. ?
(the idea I guess is to inherit the default/eDisMax relevant classes and expand 
functionality, without harming the existing ones)

We've found the below attempt to do something similar, but it was from 2012 and on a 
very old Solr version (4.x), and I assume the default SOLR/eDisMax
parsers have changed since then (we are on Solr 8.3 right now).
https://issues.apache.org/jira/browse/SOLR-3925

Thanks a lot in advance,
Adi



This electronic message may contain proprietary and confidential information of 
Verint Systems Inc., its affiliates and/or subsidiaries. The information is 
intended to be for the use of the individual(s) or entity(ies) named above. If 
you are not the intended recipient (or authorized to receive this e-mail for 
the intended recipient), you may not use, copy, disclose or distribute to 
anyone this message or any information contained in this message. If you have 
received this electronic message in error, please notify us by replying to this 
e-mail.


Re: Using Deep Paging with Graph Query Parser

2019-12-17 Thread Chris Hostetter


: Is there a way to combine deep paging's cursor feature with the graph query
: parser?

it should work just fine -- the cursorMark logic doesn't care what query 
parser you use.

Is there a particular problem you are running into when you send requests 
using both?


-Hoss
http://www.lucidworks.com/
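For completeness, a combined request might look like this (a sketch; cursorMark requires a sort on the uniqueKey field as a tiebreaker):

```
q={!graph from=parentId to=parentId returnRoot=false}id:abc
&sort=id asc
&rows=100
&cursorMark=*
```

Each response carries a nextCursorMark value; pass it back as cursorMark on the following request, and stop when it no longer changes between responses.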


Graph Query Parser with pagination

2019-12-11 Thread sambasivarao giddaluri
Hi All,

Is it possible to search on an index using the graph query parser with
pagination?
ex:
1 <--2<--3
1 <--4<--5
1 <--6<--7
and so on

1 is parent of 2,4  and 2 is parent of 3 and 4 is parent of 5
1 is doc type A  and 2,4 are of type doc B and 3,5 are of type C

similarly if i have 200 children similar to 2,4,6
schema example:
doc A
{
id : 1
name: Laptop
}
doc B
{
id : 2
parent:1
name: Dell
}
doc C
{
id : 3
parent:2
mainparent:1
name: latitude 15inch
}

doc A
{
id : 1
name: Laptop
}
doc B
{
id : 4
parent:1
name: Dell Desktop
}
doc C
{
id : 5
parent:4
mainparent:1
name: latitude 15inch
}



So my query is: doc C.name=latitude 15inch and doc A.name=laptop

this will give me two results from doc C when I am using the graph query
parser, but instead of getting all results in one call, can I add some kind
of pagination?

Or any other suggestions which can be used to achieve the results below,
where we have multiple docs involved in the query?


Regards
sambasiva


Using Deep Paging with Graph Query Parser

2019-12-08 Thread mmb1234


Is there a way to combine deep paging's cursor feature with the graph query
parser?

Background:
I have a hierarchical data structure that is split into N different flat
json docs and updated (inserted) into solr with from/to fields. Using the
from/to join syntax a graph query is needed since different queries need
parents (given certain child filters) and different queries need children
(given certain parent filters).

Graph query parser (though not distributed) looks ideal. However I need
pagination to iterate on the results. Hints for custom code are ok, since
current solr install has lots of admin, core and collection handlers already
running.

-M



--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: igain query parser generating invalid output

2019-10-12 Thread Peter Davie

Hi,

I have created the bug report in Jira and attached the patch to it.

Kind Regards,
Peter

On 12/10/2019 2:34 am, Joel Bernstein wrote:

This sounds like a great patch. I can help with the review and commit after
the jira is created.

Thanks!

Joel


On Fri, Oct 11, 2019 at 1:06 AM Peter Davie <
peter.da...@convergentsolutions.com.au> wrote:


Hi,

I apologise in advance for the length of this email, but I want to share
my discovery steps to make sure that I haven't missed anything during my
investigation...

I am working on a classification project and will be using the
classify(model()) stream function to classify documents.  I have noticed
that models generated include many noise terms from the (lexically)
early part of the term list.  To test, I have used the /BBC articles
fulltext and category //dataset from Kaggle/
(https://www.kaggle.com/yufengdev/bbc-fulltext-and-category). I have
indexed the data into a Solr collection (news_categories) and am
performing the following operation to generate a model for documents
categorised as "BUSINESS" (only keeping the 100th iteration):

having(
  train(
  news_categories,
  features(
  news_categories,
  zkHost="localhost:9983",
  q="*:*",
  fq="role:train",
  fq="category:BUSINESS",
  featureSet="business",
  field="body",
  outcome="positive",
  numTerms=500
  ),
  fq="role:train",
  fq="category:BUSINESS",
  zkHost="localhost:9983",
  name="business_model",
  field="body",
  outcome="positive",
  maxIterations=100
  ),
  eq(iteration_i, 100)
)

The output generated includes "noise" terms, such as the following
"1,011.15", "10.3m", "01", "02", "03", "10.50", "04", "05", "06", "07",
"09", and these terms all have the same value for idfs_ds ("-Infinity").

Investigating the "features()" output, it seems that the issue is that
the noise terms are being returned with NaN for the score_f field:

  "docs": [
{
  "featureSet_s": "business",
  "score_f": "NaN",
  "term_s": "1,011.15",
  "idf_d": "-Infinity",
  "index_i": 1,
  "id": "business_1"
},
{
  "featureSet_s": "business",
  "score_f": "NaN",
  "term_s": "10.3m",
  "idf_d": "-Infinity",
  "index_i": 2,
  "id": "business_2"
},
{
  "featureSet_s": "business",
  "score_f": "NaN",
  "term_s": "01",
  "idf_d": "-Infinity",
  "index_i": 3,
  "id": "business_3"
},
{
  "featureSet_s": "business",
  "score_f": "NaN",
  "term_s": "02",
  "idf_d": "-Infinity",
  "index_i": 4,
  "id": "business_4"
},...

I have examined the code within
org/apache/solr/client/solrj/io/stream/FeatureSelectionStream.java and
see that the scores being returned by {!igain} include NaN values, as
follows:

{
"responseHeader":{
  "zkConnected":true,
  "status":0,
  "QTime":20,
  "params":{
"q":"*:*",
"distrib":"false",
"positiveLabel":"1",
"field":"body",
"numTerms":"300",
"fq":["category:BUSINESS",
  "role:train",
  "{!igain}"],
"version":"2",
"wt":"json",
"outcome":"positive",
"_":"1569982496170"}},
"featuredTerms":[
  "0","NaN",
  "0.0051","NaN",
  "0.01","NaN",
  "0.02","NaN",
  "0.03","NaN",

Looking into org/apache/solr/search/IGainTermsQParserPlugin.java, it
seems that when a term is not included in the positive or negative
documents, the docFreq calculation (docFreq = xc + nc) is 0, which means
that subsequent calculations result in NaN (division by 0) which
generates these meaningless values for the computed score.
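The failure mode can be illustrated in miniature (a toy log-odds score, not Solr's actual igain formula; the skip_empty guard just mirrors the idea of the patch):

```python
import math

def toy_term_score(xc, nc, num_pos, num_docs, skip_empty=True):
    """Toy illustration of the docFreq problem: xc/nc are the counts of
    positive/negative documents containing the term. When both are zero,
    docFreq is 0 and the probability terms divide by zero (NaN in Java).
    The skip_empty guard drops such terms instead of scoring them."""
    doc_freq = xc + nc
    if doc_freq == 0:
        if skip_empty:
            return None           # patched idea: skip the term entirely
        return float("nan")       # unpatched outcome
    p_pos_given_term = xc / doc_freq   # well defined once doc_freq > 0
    p_pos = num_pos / num_docs         # base rate of the positive class
    eps = 1e-9                         # avoid log(0)
    return math.log((p_pos_given_term + eps) / (p_pos + eps))
```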

I have patched a local version of Solr to skip terms for which docFreq
is 0 in the finish() method of IGainTermsQParserPlugin and this is now
the result:

{
"responseHeader":{
  "zkConnected":true,
  "status":0,
  "QTime":260,
  "params":{
"q":"*:*",
"distrib":"false",
"positiveLabel":"1",
"field":"body",
"numTerms":"300",
"fq":["category:BUSINESS",
  "role:train",
  "{!igain}"],
"version":"2",
"wt":"json",
"outcome":"positive",
"_":"1569983546342"}},
"featuredTerms":[
  "3",-0.0173133558644304,
  "authority",-0.0173133558644304,
  "brand",-0.0173133558644304,
  "commission",-0.0173133558644304,
  "compared",-0.0173133558644304,
  "condition",-0.0173133558644304,
  "continuing",-0.0173133558644304,
  "deficit",-0.0173133558644304,
  "expectation",-0.0173133558644304,

To my (admittedly inexpert) eye, it seems like this is producing more
reasonable results.

With this change in 

Re: igain query parser generating invalid output

2019-10-11 Thread Joel Bernstein
This sounds like a great patch. I can help with the review and commit after
the jira is created.

Thanks!

Joel


On Fri, Oct 11, 2019 at 1:06 AM Peter Davie <
peter.da...@convergentsolutions.com.au> wrote:

> Hi,
>
> I apologise in advance for the length of this email, but I want to share
> my discovery steps to make sure that I haven't missed anything during my
> investigation...
>
> I am working on a classification project and will be using the
> classify(model()) stream function to classify documents.  I have noticed
> that models generated include many noise terms from the (lexically)
> early part of the term list.  To test, I have used the /BBC articles
> fulltext and category //dataset from Kaggle/
> (https://www.kaggle.com/yufengdev/bbc-fulltext-and-category). I have
> indexed the data into a Solr collection (news_categories) and am
> performing the following operation to generate a model for documents
> categorised as "BUSINESS" (only keeping the 100th iteration):
>
> having(
>  train(
>  news_categories,
>  features(
>  news_categories,
>  zkHost="localhost:9983",
>  q="*:*",
>  fq="role:train",
>  fq="category:BUSINESS",
>  featureSet="business",
>  field="body",
>  outcome="positive",
>  numTerms=500
>  ),
>  fq="role:train",
>  fq="category:BUSINESS",
>  zkHost="localhost:9983",
>  name="business_model",
>  field="body",
>  outcome="positive",
>  maxIterations=100
>  ),
>  eq(iteration_i, 100)
> )
>
> The output generated includes "noise" terms, such as the following
> "1,011.15", "10.3m", "01", "02", "03", "10.50", "04", "05", "06", "07",
> "09", and these terms all have the same value for idfs_ds ("-Infinity").
>
> Investigating the "features()" output, it seems that the issue is that
> the noise terms are being returned with NaN for the score_f field:
>
>  "docs": [
>{
>  "featureSet_s": "business",
>  "score_f": "NaN",
>  "term_s": "1,011.15",
>  "idf_d": "-Infinity",
>  "index_i": 1,
>  "id": "business_1"
>},
>{
>  "featureSet_s": "business",
>  "score_f": "NaN",
>  "term_s": "10.3m",
>  "idf_d": "-Infinity",
>  "index_i": 2,
>  "id": "business_2"
>},
>{
>  "featureSet_s": "business",
>  "score_f": "NaN",
>  "term_s": "01",
>  "idf_d": "-Infinity",
>  "index_i": 3,
>  "id": "business_3"
>},
>{
>  "featureSet_s": "business",
>  "score_f": "NaN",
>  "term_s": "02",
>  "idf_d": "-Infinity",
>  "index_i": 4,
>  "id": "business_4"
>},...
>
> I have examined the code within
> org/apache/solr/client/solrj/io/streamFeatureSelectionStream.java and
> see that the scores being returned by {!igain} include NaN values, as
> follows:
>
> {
>"responseHeader":{
>  "zkConnected":true,
>  "status":0,
>  "QTime":20,
>  "params":{
>"q":"*:*",
>"distrib":"false",
>"positiveLabel":"1",
>"field":"body",
>"numTerms":"300",
>"fq":["category:BUSINESS",
>  "role:train",
>  "{!igain}"],
>"version":"2",
>"wt":"json",
>"outcome":"positive",
>"_":"1569982496170"}},
>"featuredTerms":[
>  "0","NaN",
>  "0.0051","NaN",
>  "0.01","NaN",
>  "0.02","NaN",
>  "0.03","NaN",
>
> Looking intoorg/apache/solr/search/IGainTermsQParserPlugin.java, it
> seems that when a term is not included in the positive or negative
> documents, the docFreq calculation (docFreq = xc + nc) is 0, which means
> that subsequent calculations result in NaN (division by 0) which
> generates these meaningless values for the computed score.
>
> I have patched a local version of Solr to skip terms for which docFreq
> is 0 in the finish() method of IGainTermsQParserPlugin and this is now
> the result:
>
> {
>"responseHeader":{
>  "zkConnected":true,
>  "status":0,
>  "QTime":260,
>  "params":{
>"q":"*:*",
>"distrib":"false",
>"positiveLabel":"1",
>"field":"body",
>"numTerms":"300",
>"fq":["category:BUSINESS",
>  "role:train",
>  "{!igain}"],
>"version":"2",
>"wt":"json",
>"outcome":"positive",
>"_":"1569983546342"}},
>"featuredTerms":[
>  "3",-0.0173133558644304,
>  "authority",-0.0173133558644304,
>  "brand",-0.0173133558644304,
>  "commission",-0.0173133558644304,
>  "compared",-0.0173133558644304,
>  "condition",-0.0173133558644304,
>  "continuing",-0.0173133558644304,
>  "deficit",-0.0173133558644304,
>  "expectation",-0.0173133558644304,
>
> To my (admittedly inexpert) eye, it seems like this is producing more

igain query parser generating invalid output

2019-10-10 Thread Peter Davie

Hi,

I apologise in advance for the length of this email, but I want to share 
my discovery steps to make sure that I haven't missed anything during my 
investigation...


I am working on a classification project and will be using the 
classify(model()) stream function to classify documents.  I have noticed 
that models generated include many noise terms from the (lexically) 
early part of the term list.  To test, I have used the BBC articles 
fulltext and category dataset from Kaggle 
(https://www.kaggle.com/yufengdev/bbc-fulltext-and-category). I have 
indexed the data into a Solr collection (news_categories) and am 
performing the following operation to generate a model for documents 
categorised as "BUSINESS" (only keeping the 100th iteration):


having(
    train(
        news_categories,
        features(
        news_categories,
        zkHost="localhost:9983",
        q="*:*",
        fq="role:train",
        fq="category:BUSINESS",
        featureSet="business",
        field="body",
        outcome="positive",
        numTerms=500
        ),
        fq="role:train",
        fq="category:BUSINESS",
        zkHost="localhost:9983",
        name="business_model",
        field="body",
        outcome="positive",
        maxIterations=100
    ),
    eq(iteration_i, 100)
)

The output generated includes "noise" terms, such as the following 
"1,011.15", "10.3m", "01", "02", "03", "10.50", "04", "05", "06", "07", 
"09", and these terms all have the same value for idfs_ds ("-Infinity").


Investigating the "features()" output, it seems that the issue is that 
the noise terms are being returned with NaN for the score_f field:


    "docs": [
  {
    "featureSet_s": "business",
    "score_f": "NaN",
    "term_s": "1,011.15",
    "idf_d": "-Infinity",
    "index_i": 1,
    "id": "business_1"
  },
  {
    "featureSet_s": "business",
    "score_f": "NaN",
    "term_s": "10.3m",
    "idf_d": "-Infinity",
    "index_i": 2,
    "id": "business_2"
  },
  {
    "featureSet_s": "business",
    "score_f": "NaN",
    "term_s": "01",
    "idf_d": "-Infinity",
    "index_i": 3,
    "id": "business_3"
  },
  {
    "featureSet_s": "business",
    "score_f": "NaN",
    "term_s": "02",
    "idf_d": "-Infinity",
    "index_i": 4,
    "id": "business_4"
  },...

I have examined the code within 
org/apache/solr/client/solrj/io/stream/FeatureSelectionStream.java and 
see that the scores being returned by {!igain} include NaN values, as 
follows:


{
  "responseHeader":{
    "zkConnected":true,
    "status":0,
    "QTime":20,
    "params":{
  "q":"*:*",
  "distrib":"false",
  "positiveLabel":"1",
  "field":"body",
  "numTerms":"300",
  "fq":["category:BUSINESS",
    "role:train",
    "{!igain}"],
  "version":"2",
  "wt":"json",
  "outcome":"positive",
  "_":"1569982496170"}},
  "featuredTerms":[
    "0","NaN",
    "0.0051","NaN",
    "0.01","NaN",
    "0.02","NaN",
    "0.03","NaN",

Looking into org/apache/solr/search/IGainTermsQParserPlugin.java, it 
seems that when a term is not included in the positive or negative 
documents, the docFreq calculation (docFreq = xc + nc) is 0, which means 
that subsequent calculations result in NaN (division by 0) which 
generates these meaningless values for the computed score.


I have patched a local version of Solr to skip terms for which docFreq 
is 0 in the finish() method of IGainTermsQParserPlugin and this is now 
the result:


{
  "responseHeader":{
    "zkConnected":true,
    "status":0,
    "QTime":260,
    "params":{
  "q":"*:*",
  "distrib":"false",
  "positiveLabel":"1",
  "field":"body",
  "numTerms":"300",
  "fq":["category:BUSINESS",
    "role:train",
    "{!igain}"],
  "version":"2",
  "wt":"json",
  "outcome":"positive",
  "_":"1569983546342"}},
  "featuredTerms":[
    "3",-0.0173133558644304,
    "authority",-0.0173133558644304,
    "brand",-0.0173133558644304,
    "commission",-0.0173133558644304,
    "compared",-0.0173133558644304,
    "condition",-0.0173133558644304,
    "continuing",-0.0173133558644304,
    "deficit",-0.0173133558644304,
    "expectation",-0.0173133558644304,

To my (admittedly inexpert) eye, it seems like this is producing more 
reasonable results.


With this change in place, train() now produces:

    "idfs_ds": [
  0.6212826193303013,
  0.6434237452075148,
  0.7169578292536639,
  0.741349282377823,
  0.86843471069652,
  1.0140549006400466,
  1.0639267306802198,
  1.0753554265038423,...

    "terms_ss": [
  "â",
  "company",
  "market",
  "firm",
  "month",
  "analyst",
  "chief",
  "time",...

I am not sure if I have missed anything, but this seems like it's 
producing better outcomes. I would appreciate any input on whether I 
have missed 

Re: more like this query parser with faceting

2019-08-12 Thread Szűcs Roland
Thanks David.
This is the page I was looking for.

Roland

David Hastings  ezt írta (időpont: 2019. aug.
12., H, 20:52):

> should be fine,
> https://cwiki.apache.org/confluence/display/solr/MoreLikeThisHandler
>
> for more info
>
> On Mon, Aug 12, 2019 at 2:49 PM Szűcs Roland 
> wrote:
>
> > Hi David,
> > Thanks the fast reply. Am I right that I can combine fq with mlt only if
> I
> > use more like this as a query parser?
> >
> > Is there a way to achieve the same with mlt as a request handler?
> > Roland
> >
> > David Hastings  ezt írta (időpont: 2019.
> > aug.
> > 12., H, 20:44):
> >
> > > The easiest way will be to pass in a filter query (fq)
> > >
> > > On Mon, Aug 12, 2019 at 2:40 PM Szűcs Roland <
> > szucs.rol...@bookandwalk.hu>
> > > wrote:
> > >
> > > > Hi All,
> > > >
> > > > Is there any tutorial or example how to use more like this
> > functionality
> > > > when we have some other constraints set by the user through faceting
> > > > parameters like price range, or product category for example?
> > > >
> > > > Cheers,
> > > > Roland
> > > >
> > >
> >
>


Re: more like this query parser with faceting

2019-08-12 Thread David Hastings
should be fine,
https://cwiki.apache.org/confluence/display/solr/MoreLikeThisHandler

for more info

On Mon, Aug 12, 2019 at 2:49 PM Szűcs Roland 
wrote:

> Hi David,
> Thanks the fast reply. Am I right that I can combine fq with mlt only if I
> use more like this as a query parser?
>
> Is there a way to achieve the same with mlt as a request handler?
> Roland
>
> David Hastings  ezt írta (időpont: 2019.
> aug.
> 12., H, 20:44):
>
> > The easiest way will be to pass in a filter query (fq)
> >
> > On Mon, Aug 12, 2019 at 2:40 PM Szűcs Roland <
> szucs.rol...@bookandwalk.hu>
> > wrote:
> >
> > > Hi All,
> > >
> > > Is there any tutorial or example how to use more like this
> functionality
> > > when we have some other constraints set by the user through faceting
> > > parameters like price range, or product category for example?
> > >
> > > Cheers,
> > > Roland
> > >
> >
>
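As an illustration of David's suggestion, the {!mlt} query parser clause and the facet-driven filters travel as separate request parameters (the document id and field names below are placeholders, not values from the thread):

```python
# Sketch of request params combining the {!mlt} query parser with
# filter queries produced by faceting. "SOME_DOC_ID", the qf fields,
# and the fq values are illustrative only.
params = {
    "q": "{!mlt qf=title,description mintf=2 mindf=5}SOME_DOC_ID",
    "fq": ["category:ebooks", "price:[0 TO 20]"],
    "rows": 10,
}
```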


Re: more like this query parser with faceting

2019-08-12 Thread Szűcs Roland
Hi David,
Thanks for the fast reply. Am I right that I can combine fq with mlt only if I
use more like this as a query parser?

Is there a way to achieve the same with mlt as a request handler?
Roland

David Hastings  ezt írta (időpont: 2019. aug.
12., H, 20:44):

> The easiest way will be to pass in a filter query (fq)
>
> On Mon, Aug 12, 2019 at 2:40 PM Szűcs Roland 
> wrote:
>
> > Hi All,
> >
> > Is there any tutorial or example how to use more like this functionality
> > when we have some other constraints set by the user through faceting
> > parameters like price range, or product category for example?
> >
> > Cheers,
> > Roland
> >
>


Re: more like this query parser with faceting

2019-08-12 Thread David Hastings
The easiest way will be to pass in a filter query (fq)

On Mon, Aug 12, 2019 at 2:40 PM Szűcs Roland 
wrote:

> Hi All,
>
> Is there any tutorial or example how to use more like this functionality
> when we have some other constraints set by the user through faceting
> parameters like price range, or product category for example?
>
> Cheers,
> Roland
>


more like this query parser with faceting

2019-08-12 Thread Szűcs Roland
Hi All,

Is there any tutorial or example how to use more like this functionality
when we have some other constraints set by the user through faceting
parameters like price range, or product category for example?

Cheers,
Roland


Re: KeywordTokenizerFactory and Standard Query Parser

2019-04-02 Thread Chris Ulicny
Actually, nevermind. I found the part of the upgrade to 7 that was missed

" The sow (split-on-whitespace) request param now defaults to false (true
in previous versions). This affects the edismax and standard/"lucene" query
parsers: if the sow param is not specified, query text will not be split on
whitespace before analysis. See
https://lucidworks.com/2017/04/18/multi-word-synonyms-solr-adds-query-time-support/
."
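Given that default change, explicitly sending sow=true restores the pre-7 splitting behaviour. A sketch of the request (URL and extra params are illustrative):

```python
from urllib.parse import urlencode

# With sow defaulting to false since Solr 7, "testField:(34 27)" reaches
# the keyword tokenizer as a single chunk; sow=true asks the parser to
# split the query text on whitespace before analysis again.
params = {"q": "testField:(34 27)", "sow": "true", "wt": "json"}
query_string = urlencode(params)
url = "http://localhost:8983/solr/collection1/select?" + query_string
```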

On Tue, Apr 2, 2019 at 8:11 AM Chris Ulicny  wrote:

> Hi all,
>
> We have a multivalued field that has an integer at the beginning followed
> by a space, and the index analyzer chain extracts that value to search on
>
> 
>
> 
>
>
> testField:[
> 34 blah blah blah
> 27 blah blah blah
> ...
> ]
>
> The query analyzer chain is just a keyword tokenizer factory since the
> clients are searching only for the number on that field. So one process
> will attempt to send in the following query
>
> 
>   
>
>
> q=testField:(34 27)
>
> However, this will not pickup the document with the example testField
> value above in version 7.4.0. Passing it as an fq parameter has the same
> result.
>
> My understanding was that the query parser should split the (34 27) into
> search terms "34" and "27" before the query analyzer chain is even entered.
> Is that not correct anymore?
>
> Thanks,
> Chris
>
>
>
>
>
>


KeywordTokenizerFactory and Standard Query Parser

2019-04-02 Thread Chris Ulicny
Hi all,

We have a multivalued field that has an integer at the beginning followed
by a space, and the index analyzer chain extracts that value to search on


   


testField:[
34 blah blah blah
27 blah blah blah
...
]

The query analyzer chain is just a keyword tokenizer factory since the
clients are searching only for the number on that field. So one process
will attempt to send in the following query


  


q=testField:(34 27)

However, this will not pick up the document with the example testField value
above in version 7.4.0. Passing it as an fq parameter has the same result.

My understanding was that the query parser should split the (34 27) into
search terms "34" and "27" before the query analyzer chain is even entered.
Is that not correct anymore?

Thanks,
Chris


Re: Query of death? Collapsing Query Parser - Solr 7.5

2019-03-26 Thread IZaBEE_Keeper
OK..

The intent is to collapse on the field domain..

Here's a query that works fine and the way I want with the Collapsing query
parser..

/select?defType=dismax=score,content,description,keywords,title={!collapse%20field=domain%20nullPolicy=expand}=content^0.05%20description^0.03%20keywords^0.03%20title^0.05%20url^0.06=bernie+sanders=title%20description%20keywords%20content%20url

This is a complex query with 20 terms of mixed single alpha & numeric
characters..

/select?defType=dismax=score,content,description,keywords,title={!collapse%20field=domain%20nullPolicy=expand}=content^0.05%20description^0.03%20keywords^0.03%20title^0.05%20url^0.06=1+2+e+3+s+a+d+f+r+4+5+t+g+6+7+8+7+1+2+3+6=title%20description%20keywords%20content%20url

This query crashes solr with the OOM process killer..

Removing the collapsing query parser {!collapse field=domain
nullPolicy=expand} eliminates the problem and never crashes solr on any
query by my testing.. A search of 20 alpha & numeric characters with spaces
is very slow though..

With the collapsing query parser the single numeric terms cause solr to
crash.. using whole words works but slow if there's too many terms..

The debug on all successful queries shows no errors.. the default is 10
rows.. a cold search (not cached) on a 2 word phrase takes 2-4 seconds.
Adding more than 3-4 numbers with spaces to the search kills it..

There is no debug for the failed queries as solr is killed by the process
killer..

Extreme queries are long multi term queries or long queries of single number
& letters with spaces in between.  Something like '1 3 s 2 c 4 5 t s 5 6 3 a
s 4 e 6 1 4 3 2 4 5 6 ' will cause it to search for all those individual
terms which are likely to be very frequent.. This type of query seems to
make solr work really hard..

While it's not likely that users would make such searches I need to prevent
solr from crashing with the collapsing query parser.. This type of query can
cause a heavy load on various types of search systems and can be used in DOS
attacks targeting search systems.. You can try a 20 term query made of
numbers & letters with spaces between to see what I mean if you have a 100m
doc index handy..

I can try to prevent these types of queries through the search API by
rewriting the user input.. However if there is a way to make solr time out
instead of being killed that would be preferable.. Otherwise I'll have to
find a different way to limit the number of results per domain..
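Rewriting the user input along those lines could be as simple as the sketch below (the thresholds are guesses, not recommendations, and would need tuning per corpus):

```python
def sanitize_user_query(q, max_terms=8, min_token_len=2):
    """Defensively rewrite free-text input before it reaches the collapsed
    dismax query: drop single-character tokens (extremely high document
    frequency, hence expensive) and cap the total number of terms."""
    tokens = [t for t in q.split() if len(t) >= min_token_len]
    if not tokens:
        # everything was dropped; keep one token rather than an empty query
        tokens = q.split()[:1]
    return " ".join(tokens[:max_terms])
```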

I have some more ram to put in the server tomorrow, that might help..  I
don't mind if the complex searches are slow.. but crashing out is not good..
especially with the process killer killing solr completely..

Currently this is on a master/slave setup, 150m docs 800GB, 24GB ram, 16GB
heap..



-
Bee Keeper at IZaBEE.com
--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Query of death? Collapsing Query Parser - Solr 7.5

2019-03-26 Thread Michael Gibney
Would you be willing to share your query-time analysis chain config, and
perhaps the "debug=true" (or "debug=query") output for successful queries
of a similar nature to the problematic ones? Also, re: "only times out on
extreme queries" -- what do you consider to be an "extreme query", in this
context?

On Mon, Mar 25, 2019 at 10:06 PM IZaBEE_Keeper 
wrote:

> Hi..
>
> I'm wondering if I've found a query of death or just a really expensive
> query.. It's killing my solr with OOM..
>
> Collapsing query parser using:
> fq={!collapse field=domain nullPolicy=expand}
>
> Everything works fine using words & phrases.. However as soon as there are
> numbers involved it crashes out with OOM Killer..
>
> The server has nowhere near enough ram for the index of 800GB & 150M docs..
>
> But a dismax query like '1 2 s 2 s 3 e d 4 r f 3 e s 7 2 1 4 6 7 8 2 9 0 3'
> will make it crash..
>
> fq={!collapse field=domain nullPolicy=expand}
> PhraseFields( 'content^0.05 description^0.03 keywords^0.03 title^0.05
> url^0.06' )
> BoostQuery( 'host:"' . $q . '"^0.6 host:"twitter.com"^0.35 domain:"' . $q
> .
> '"^0.6' )
>
> Without the fq it works just fine and only times out on extreme queries..
> eventually it finds them..
>
> Do I just need more ram or is there another way to prevent solr from
> crashing?
>
> Solr 7.5 24GB ram 16gb heap with ssd lv..
>
>
>
> -
> Bee Keeper at IZaBEE.com
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>


Query of death? Collapsing Query Parser - Solr 7.5

2019-03-25 Thread IZaBEE_Keeper
Hi..

I'm wondering if I've found a query of death or just a really expensive
query.. It's killing my solr with OOM..

Collapsing query parser using:
fq={!collapse field=domain nullPolicy=expand}

Everything works fine using words & phrases.. However as soon as there are
numbers involved it crashes out with OOM Killer..

The server has nowhere near enough ram for the index of 800GB & 150M docs..

But a dismax query like '1 2 s 2 s 3 e d 4 r f 3 e s 7 2 1 4 6 7 8 2 9 0 3'
will make it crash..

fq={!collapse field=domain nullPolicy=expand}
PhraseFields( 'content^0.05 description^0.03 keywords^0.03 title^0.05
url^0.06' )
BoostQuery( 'host:"' . $q . '"^0.6 host:"twitter.com"^0.35 domain:"' . $q .
'"^0.6' )

Without the fq it works just fine and only times out on extreme queries..
eventually it finds them..

Do I just need more ram or is there another way to prevent solr from
crashing?

Solr 7.5 24GB ram 16gb heap with ssd lv..



-
Bee Keeper at IZaBEE.com
--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: graph query parser: depth dependent score?

2019-02-27 Thread Jochen Barth

Dear reader, I've found a different solution for my problem
and don't need a depth-dependent score anymore.
Kind regards, Jochen

Am 19.02.19 um 14:42 schrieb Jochen Barth:

Dear reader,

I'll have a hierarchical graph "like a book":

{ id:solr_doc1; title:book }

{ id:solr_doc2; title:chapter; parent_ids: solr_doc1 }

{ id:solr_doc3; title:subchapter; parent_ids: solr_doc2 }

etc.

Now to match all docs with "title" and "chapter" I could do:

+_query_:"{!graph from=parent_ids to=id}title:book"

+_query_:"{!graph from=parent_ids to=id}title:chapter",

The result would be solr_doc2 and solr_doc3;

but is there a way to "boost" or "put a higher score" on solr_doc2 
than on solr_doc3 because of direct match (and not via {!graph... ) ?



The only way to do so seems a {!boost before {!graph, but what I can 
do there is not dependent on the match nor {!graph, I think.



Kind regards,

Jochen



--
Jochen Barth * Universitätsbibliothek Heidelberg, IT * Telefon 06221 54-2580



graph query parser: depth dependent score?

2019-02-19 Thread Jochen Barth

Dear reader,

I'll have a hierarchical graph "like a book":

{ id:solr_doc1; title:book }

{ id:solr_doc2; title:chapter; parent_ids: solr_doc1 }

{ id:solr_doc3; title:subchapter; parent_ids: solr_doc2 }

etc.

Now to match all docs with "title" and "chapter" I could do:

+_query_:"{!graph from=parent_ids to=id}title:book"

+_query_:"{!graph from=parent_ids to=id}title:chapter",

The result would be solr_doc2 and solr_doc3;

but is there a way to "boost" or "put a higher score" on solr_doc2 than 
on solr_doc3 because of direct match (and not via {!graph... ) ?



The only way to do so seems to be a {!boost before the {!graph, but what I can do 
there is not dependent on the match nor on {!graph, I think.



Kind regards,

Jochen

--
Jochen Barth * Universitätsbibliothek Heidelberg, IT * Telefon 06221 54-2580
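For the record, one shape the boosting could take (an untested sketch using the field names from the example; the boost value is arbitrary) is to OR a boosted direct clause with the graph clause, so a direct match outscores a graph-only match:

```python
def direct_plus_graph(field, term, direct_boost=2.0):
    """OR a boosted direct clause with the graph clause so that documents
    matching field:term directly outscore documents that only reach the
    match via parent_ids traversal."""
    return (f"{field}:{term}^{direct_boost} OR "
            f'_query_:"{{!graph from=parent_ids to=id}}{field}:{term}"')
```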



Re: Terms Query Parser: filtering on null and strings with whitespace.

2019-02-13 Thread Mikhail Khludnev
Oh yeah, my pet peeve. This is the cure.
(*:* AND -department_name:[* TO *]) OR {!tag=department_name terms
f=department_name v='Kirurgisk avdeling'}
no comments.
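The pattern generalises into a small helper that builds Mikhail's filter (a sketch; the backslash-escaping of apostrophes is an assumption to hedge against values containing quotes):

```python
def null_or_terms_fq(field, values):
    """Match documents with no value for `field` OR documents whose value
    is one of `values`. The v='...' local parameter keeps whitespace
    inside values from being treated as a separator."""
    joined = ",".join(values).replace("'", "\\'")
    return (f"(*:* AND -{field}:[* TO *]) OR "
            f"{{!tag={field} terms f={field} v='{joined}'}}")
```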

On Wed, Feb 13, 2019 at 1:49 PM Andreas Lønes  wrote:

> I am experiencing some weird behaviour when using terms query parser where
> I am filtering on documents that has no value for a given field(null) and
> strings with whitespaces.
>
> I can filter on documents not having a value OR having some specific
> values for the field as long as the value does not have a
> whitespace(example 1). I can also filter on specific values with
> whitespace)(example 2).
> What I am not able to make work is to filter on documents not having a
> value OR certain terms with whitespace(example 3)
>
> These work:
> 1. (*:* AND -department_shortname:[* TO *]) OR
> {!tag=department_shortname terms f=department_shortname}BARN,KIR
> 2. {!tag=department_name terms f=department_name}Kirurgisk avdeling
> Does not work:
> 3. (*:* AND -department_name:[* TO *]) OR {!tag=department_name terms
> f=department_name}Kirurgisk avdeling
>
> The configuration of the fields if that is of any value:
>  stored="true" required="false" multiValued="false" />
>  stored="true" required="false" multiValued="false" />
>
> So far, the only solution I can come up with is to index some kind of
> value that represents null, but as of my understanding it is recommended
> that null is not indexed.
>
>
> Thanks,
> Andreas
>


-- 
Sincerely yours
Mikhail Khludnev


Terms Query Parser: filtering on null and strings with whitespace.

2019-02-13 Thread Andreas Lønes
I am experiencing some weird behaviour when using the terms query parser where I am 
filtering on documents that have no value for a given field (null) and on strings 
with whitespace.

I can filter on documents not having a value OR having some specific values for 
the field, as long as the value does not have whitespace (example 1). I can 
also filter on specific values with whitespace (example 2).
What I am not able to make work is to filter on documents not having a value OR 
certain terms with whitespace (example 3)

These work:
1. (*:* AND -department_shortname:[* TO *]) OR {!tag=department_shortname 
terms f=department_shortname}BARN,KIR
2. {!tag=department_name terms f=department_name}Kirurgisk avdeling
Does not work:
3. (*:* AND -department_name:[* TO *]) OR {!tag=department_name terms 
f=department_name}Kirurgisk avdeling

The configuration of the fields if that is of any value:



So far, the only solution I can come up with is to index some kind of value 
that represents null, but as far as I understand it is recommended that null 
is not indexed.


Thanks,
Andreas


Re: How to debug empty ParsedQuery from Edismax Query Parser

2019-01-04 Thread Kay Wrobel
I'd like to follow up on this post here because it has become relevant to me 
now.

I have set up a debugging environment and took a deep dive into the SOLR 7.6.0 
source code with Eclipse as my IDE of choice for this task. I have isolated the 
exact line where things fall apart for my two sample queries, "q=a3f*" and 
"q=aa3f*". As you can see, the only visible difference between the two search 
terms is that the second one has two letters in succession before switching to 
the numerical portion.

First things first, the Extended Dismax Query Parser hands over portions of the 
parsing to the Standard Query Parser early on in the parsing process. 
Following the rabbit hole down, I ended up in the 
SolrQueryParserBase.getPrefixQuery() method. On line 1173 of this method, we 
have the following statement:

termStr = analyzeIfMultitermTermText(field, termStr, 
schema.getFieldType(field));

This statement, when executing with the "a3f" search term, returns "a3f" as a 
result. However, when using "aa3f", it throws a SolrException with exactly the 
same multi-term error as shown below, only like this:
> analyzer returned too many terms for multiTerm term: aa3f

At this point, I would like to reiterate the purpose of our search: we are a 
part number house. We deal with millions of part numbers in our system and on 
our web site. A customer of ours typically searches our site with a given part 
number (or SKU if you will). Some part numbers are intelligent, and so 
customers might reduce the part number string to a portion at the beginning. 
Either way, it is *not* a typical "word" based search. Yet, the system (Drupal) 
does treat those two query fields like standard "Text" search fields. Those who 
know Drupal Commerce will recognize the Title field of a node and also possible 
the Product Variation or (SKU) field.

With that in mind, multi-term analysis was introduced with SOLR 5, and I think 
this error (or limitation) has probably been present ever since. Can anyone 
closer to the matter, or anyone who has struggled with this same issue, chime 
in on the subject?
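For readers following along, the check that throws here can be reduced to something like the following (a simplification, not the actual TextField code; the lambdas in the usage stand in for the schema-built multiterm analysis chain):

```python
def analyze_multiterm(term, analyze):
    """Simplified version of the constraint in TextField.analyzeMultiTerm:
    the analysis chain applied to a wildcard/prefix term must yield at
    most one token, otherwise the prefix query cannot be built and the
    "too many terms" exception is raised."""
    tokens = analyze(term)
    if len(tokens) > 1:
        raise ValueError(
            "analyzer returned too many terms for multiTerm term: " + term)
    return tokens[0] if tokens else term
```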

Kind regards,

Kay

> On Dec 28, 2018, at 9:57 AM, Kay Wrobel  wrote:
> 
> Here are my log entries:
> 
> SOLR 7.x (non-working)
> 2018-12-28 15:36:32.786 INFO  (qtp1769193365-20) [   x:collection1] 
> o.a.s.c.S.Request [collection1]  webapp=/solr path=/select 
> params={q=ac6023*=tm_field_product^21.0=tm_title_field^8.0=all=10=xml=true}
>  hits=0 status=0 QTime=2
> 
> SOLR 4.x (working)
> INFO  - 2018-12-28 15:43:41.938; org.apache.solr.core.SolrCore; [collection1] 
> webapp=/solr path=/select 
> params={q=ac6023*=tm_field_product^21.0=tm_title_field^8.0=all=10=xml=true}
>  hits=32 status=0 QTime=8 
> 
> EchoParams=all did not show anything different in the resulting XML from SOLR 
> 7.x.
> 
> 
> I found out something curious yesterday. When I try to force the Standard 
> query parser on SOLR 7.x using the same query, but adding "defType=lucene" at 
> the beginning, SOLR 7 raises a SolrException with this message: "analyzer 
> returned too many terms for multiTerm term: ac6023" (full response: 
> https://pastebin.com/ijdBj4GF)
> 
> Log entry for that request:
> 2018-12-28 15:50:58.804 ERROR (qtp1769193365-15) [   x:collection1] 
> o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: analyzer 
> returned too many terms for multiTerm term: ac6023
>at 
> org.apache.solr.schema.TextField.analyzeMultiTerm(TextField.java:180)
>at 
> org.apache.solr.parser.SolrQueryParserBase.analyzeIfMultitermTermText(SolrQueryParserBase.java:992)
>at 
> org.apache.solr.parser.SolrQueryParserBase.getPrefixQuery(SolrQueryParserBase.java:1173)
>at 
> org.apache.solr.parser.SolrQueryParserBase.handleBareTokenQuery(SolrQueryParserBase.java:781)
>at org.apache.solr.parser.QueryParser.Term(QueryParser.java:421)
>at org.apache.solr.parser.QueryParser.Clause(QueryParser.java:278)
>at org.apache.solr.parser.QueryParser.Query(QueryParser.java:162)
>at 
> org.apache.solr.parser.QueryParser.TopLevelQuery(QueryParser.java:131)
>at 
> org.apache.solr.parser.SolrQueryParserBase.parse(SolrQueryParserBase.java:254)
>at org.apache.solr.search.LuceneQParser.parse(LuceneQParser.java:49)
>at org.apache.solr.search.QParser.getQuery(QParser.java:173)
>at 
> org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:160)
>at 
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:279)
>at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:199)
>at org.apache.solr.core.SolrCore.e

Re: How to debug empty ParsedQuery from Edismax Query Parser

2019-01-02 Thread Kay Wrobel
Thanks for your thoughts, Shawn. Are you a developer on SOLR?

Anyway, the configuration (solrconfig.xml) was provided by search_api_solr 
(Drupal 7 module) and is untouched. You can find it here:
https://cgit.drupalcode.org/search_api_solr/tree/solr-conf/7.x/solrconfig.xml?h=7.x-1.x

Thank you for pointing out the capital E on echoParams. I re-ran the query, but 
it doesn't change the output (at least on the surface of it).

> On Jan 2, 2019, at 1:11 PM, Shawn Heisey  wrote:
> 
> On 12/28/2018 8:57 AM, Kay Wrobel wrote:
>> Here are my log entries:
>> 
>> SOLR 7.x (non-working)
>> 2018-12-28 15:36:32.786 INFO  (qtp1769193365-20) [   x:collection1] 
>> o.a.s.c.S.Request [collection1] webapp=/solr path=/select 
>> params={q=ac6023*=tm_field_product^21.0=tm_title_field^8.0=all=10=xml=true}
>>  hits=0 status=0 QTime=2
>> 
>> SOLR 4.x (working)
>> INFO  - 2018-12-28 15:43:41.938; org.apache.solr.core.SolrCore; 
>> [collection1] webapp=/solr path=/select 
>> params={q=ac6023*=tm_field_product^21.0=tm_title_field^8.0=all=10=xml=true}
>>  hits=32 status=0 QTime=8
> 
> Neither of those requests includes anything that would change from the 
> default lucene parser to edismax.  The logging *would* include all parameters 
> set by the configuration as well as those specified on the URL.
> 
> You ought to try adding "defType=edismax" to the URL parameters, or to the 
> definition of "/select" in solrconfig.xml.
> 
>> EchoParams=all did not show anything different in the resulting XML from 
>> SOLR 7.x.
> 
> The parameter requested was "echoParams" and not "EchoParams".  There *is* a 
> difference, and the latter will not work.
> 
>> I found out something curious yesterday. When I try to force the Standard 
>> query parser on SOLR 7.x using the same query, but adding "defType=lucene" 
>> at the beginning, SOLR 7 raises a SolrException with this message: "analyzer 
>> returned too many terms for multiTerm term: ac6023"
> 
> I do not know what this is about.  I did find the message in the source code. 
>  I don't understand the low-level code, and it looks to me like that section 
> of code will *always* throw an exception, which isn't very useful.
> 
> Thanks,
> Shawn


-- 

The information in this e-mail is confidential and is intended solely for 
the addressee(s). Access to this email by anyone else is unauthorized. If 
you are not an intended recipient, you may not print, save or otherwise 
store the e-mail or any of the contents thereof in electronic or physical 
form, nor copy, use or disseminate the information contained in the email.  
If you are not an intended recipient,  please notify the sender of this 
email immediately.


Re: How to debug empty ParsedQuery from Edismax Query Parser

2019-01-02 Thread Shawn Heisey

On 12/28/2018 8:57 AM, Kay Wrobel wrote:

Here are my log entries:

SOLR 7.x (non-working)
2018-12-28 15:36:32.786 INFO  (qtp1769193365-20) [   x:collection1] o.a.s.c.S.Request [collection1]  
webapp=/solr path=/select 
params={q=ac6023*=tm_field_product^21.0=tm_title_field^8.0=all=10=xml=true}
 hits=0 status=0 QTime=2

SOLR 4.x (working)
INFO  - 2018-12-28 15:43:41.938; org.apache.solr.core.SolrCore; [collection1] webapp=/solr path=/select 
params={q=ac6023*=tm_field_product^21.0=tm_title_field^8.0=all=10=xml=true}
 hits=32 status=0 QTime=8


Neither of those requests includes anything that would change from the 
default lucene parser to edismax.  The logging *would* include all 
parameters set by the configuration as well as those specified on the URL.


You ought to try adding "defType=edismax" to the URL parameters, or to 
the definition of "/select" in solrconfig.xml.



EchoParams=all did not show anything different in the resulting XML from SOLR 
7.x.


The parameter requested was "echoParams" and not "EchoParams".  There 
*is* a difference, and the latter will not work.



I found out something curious yesterday. When I try to force the Standard query parser on SOLR 7.x 
using the same query, but adding "defType=lucene" at the beginning, SOLR 7 raises a 
SolrException with this message: "analyzer returned too many terms for multiTerm term: 
ac6023"


I do not know what this is about.  I did find the message in the source 
code.  I don't understand the low-level code, and it looks to me like 
that section of code will *always* throw an exception, which isn't very 
useful.


Thanks,
Shawn



Re: How to debug empty ParsedQuery from Edismax Query Parser

2019-01-02 Thread Kay Wrobel
Well, I was putting that info out there because I am literally hunting down 
this issue without any guidance. The real problem for me is still that the 
Edismax Query Parser behaves abnormally from version 5 through the current 
release, giving me an empty parsedQuery. Forcing the request through the Lucene 
parser was one way I was hoping to get to the bottom of this. Frankly, 
multi-term seems to be *the* new feature introduced in SOLR 5, so I may be 
jumping to conclusions here.

I would hate to go as low-level as debugging SOLR source just to find out what 
is going on here, but it sure seems that way. By the way, I have tried a 
multitude of other search terms (ending in *), like:
602* (works)
602K* (does NOT work)
A3F* (works!, but is also single changing characters, so...)
AC* (works, but MANY results for obvious reasons)
6023* (works)

So again, it seems that as soon as letters and digits are mixed in the term and 
a "word" is somewhat detected, the parser fails (in my specific case).

I am contemplating going down to the source-code level and debugging this 
issue; I am a programmer and should be able to understand some of it. That 
said, it seems like a very time-consuming thing to do. One last attempt right 
now is for me to change some logging levels in the SOLR 7 instance and see what 
it spits out. I changed the following to "DEBUG":
org.apache.solr.search.PayloadCheckQParserPlugin
org.apache.solr.search.SurroundQParserPlugin
org.apache.solr.search.join.ChildFieldValueSourceParser

That didn't add any new information in the log file at all.


> On Jan 2, 2019, at 12:40 PM, Gus Heck  wrote:
> 
> If you mean attach a debugger, solr is just like any other java program.
> Pass in the standard java options at start up to have it listen or connect
> as usual. The port is just a TCP port so ssh tunneling the debugger port
> can bridge the gap with a remote machine (or a vpn).
> 
> That said the prior thread posts makes it sound like we are looking for a
> case where the query parser or something above it is inappropriately eating
> an exception relating to too many terms.
> 
> Did 5.x impose a new or reduced limit there?
> 
> On Wed, Jan 2, 2019, 1:20 PM Kay Wrobel wrote:
>> Is there any way I can debug the parser? Especially, the edismax parser
>> which does *not* raise any exception but produces an empty parsedQuery?
>> Please, if anyone can help. I feel very lost and without guidance, and
>> Google search has not provided me with any help at all.
>> 
>>> On Dec 28, 2018, at 9:57 AM, Kay Wrobel  wrote:
>>> 
>>> Here are my log entries:
>>> 
>>> SOLR 7.x (non-working)
>>> 2018-12-28 15:36:32.786 INFO  (qtp1769193365-20) [   x:collection1]
>> o.a.s.c.S.Request [collection1]  webapp=/solr path=/select
>> params={q=ac6023*=tm_field_product^21.0=tm_title_field^8.0=all=10=xml=true}
>> hits=0 status=0 QTime=2
>>> 
>>> SOLR 4.x (working)
>>> INFO  - 2018-12-28 15:43:41.938; org.apache.solr.core.SolrCore;
>> [collection1] webapp=/solr path=/select
>> params={q=ac6023*=tm_field_product^21.0=tm_title_field^8.0=all=10=xml=true}
>> hits=32 status=0 QTime=8
>>> 
>>> EchoParams=all did not show anything different in the resulting XML from
>> SOLR 7.x.
>>> 
>>> 
>>> I found out something curious yesterday. When I try to force the
>> Standard query parser on SOLR 7.x using the same query, but adding
>> "defType=lucene" at the beginning, SOLR 7 raises a SolrException with this
>> message: "analyzer returned too many terms for multiTerm term: ac6023"
>> (full response: https://pastebin.com/ijdBj4GF)
>>> 
>>> Log entry for that request:
>>> 2018-12-28 15:50:58.804 ERROR (qtp1769193365-15) [   x:collection1]
>> o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: analyzer
>> returned too many terms for multiTerm term: ac6023
>>>   at
>> org.apache.solr.schema.TextField.analyzeMultiTerm(TextField.java:180)
>>>   at
>> org.apache.solr.parser.SolrQueryParserBase.analyzeIfMultitermTermText(SolrQueryParserBase.java:992)
>>>   at
>> org.apache.solr.parser.SolrQueryParserBase.getPrefixQuery(SolrQueryParserBase.java:1173)
>>>   at
>> org.apache.solr.parser.SolrQueryParserBase.handleBareTokenQuery(SolrQueryParserBase.java:781)
>>>   at org.apache.solr.parser.QueryParser.Term(QueryParser.java:421)
>>>   at org.apache.solr.parser.QueryParser.Clause(QueryParser.java:278)
>>>   at org.apache.solr.parser.QueryParser.Query(QueryParser.java:162)
>>>   at
>> org.apache.solr.parser.QueryParser.TopLevelQuery(QueryParser.java:131)
>&

Re: How to debug empty ParsedQuery from Edismax Query Parser

2019-01-02 Thread Gus Heck
If you mean attach a debugger, solr is just like any other java program.
Pass in the standard java options at start up to have it listen or connect
as usual. The port is just a TCP port so ssh tunneling the debugger port
can bridge the gap with a remote machine (or a vpn).

That said the prior thread posts makes it sound like we are looking for a
case where the query parser or something above it is inappropriately eating
an exception relating to too many terms.

Did 5.x impose a new or reduced limit there?

On Wed, Jan 2, 2019, 1:20 PM Kay Wrobel wrote:
> Is there any way I can debug the parser? Especially, the edismax parser
> which does *not* raise any exception but produces an empty parsedQuery?
> Please, if anyone can help. I feel very lost and without guidance, and
> Google search has not provided me with any help at all.
>
> > On Dec 28, 2018, at 9:57 AM, Kay Wrobel  wrote:
> >
> > Here are my log entries:
> >
> > SOLR 7.x (non-working)
> > 2018-12-28 15:36:32.786 INFO  (qtp1769193365-20) [   x:collection1]
> o.a.s.c.S.Request [collection1]  webapp=/solr path=/select
> params={q=ac6023*=tm_field_product^21.0=tm_title_field^8.0=all=10=xml=true}
> hits=0 status=0 QTime=2
> >
> > SOLR 4.x (working)
> > INFO  - 2018-12-28 15:43:41.938; org.apache.solr.core.SolrCore;
> [collection1] webapp=/solr path=/select
> params={q=ac6023*=tm_field_product^21.0=tm_title_field^8.0=all=10=xml=true}
> hits=32 status=0 QTime=8
> >
> > EchoParams=all did not show anything different in the resulting XML from
> SOLR 7.x.
> >
> >
> > I found out something curious yesterday. When I try to force the
> Standard query parser on SOLR 7.x using the same query, but adding
> "defType=lucene" at the beginning, SOLR 7 raises a SolrException with this
> message: "analyzer returned too many terms for multiTerm term: ac6023"
> (full response: https://pastebin.com/ijdBj4GF)
> >
> > Log entry for that request:
> > 2018-12-28 15:50:58.804 ERROR (qtp1769193365-15) [   x:collection1]
> o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: analyzer
> returned too many terms for multiTerm term: ac6023
> >at
> org.apache.solr.schema.TextField.analyzeMultiTerm(TextField.java:180)
> >at
> org.apache.solr.parser.SolrQueryParserBase.analyzeIfMultitermTermText(SolrQueryParserBase.java:992)
> >at
> org.apache.solr.parser.SolrQueryParserBase.getPrefixQuery(SolrQueryParserBase.java:1173)
> >at
> org.apache.solr.parser.SolrQueryParserBase.handleBareTokenQuery(SolrQueryParserBase.java:781)
> >at org.apache.solr.parser.QueryParser.Term(QueryParser.java:421)
> >at org.apache.solr.parser.QueryParser.Clause(QueryParser.java:278)
> >at org.apache.solr.parser.QueryParser.Query(QueryParser.java:162)
> >at
> org.apache.solr.parser.QueryParser.TopLevelQuery(QueryParser.java:131)
> >at
> org.apache.solr.parser.SolrQueryParserBase.parse(SolrQueryParserBase.java:254)
> >at
> org.apache.solr.search.LuceneQParser.parse(LuceneQParser.java:49)
> >at org.apache.solr.search.QParser.getQuery(QParser.java:173)
> >at
> org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:160)
> >at
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:279)
> >at
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:199)
> >at org.apache.solr.core.SolrCore.execute(SolrCore.java:2541)
> >at
> org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:709)
> >at
> org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:515)
> >at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:377)
> >at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:323)
> >at
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1634)
> >at
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
> >at
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:146)
> >at
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
> >at
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
> >at
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:257)
> >at
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
> >at
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(Scope

Re: How to debug empty ParsedQuery from Edismax Query Parser

2019-01-02 Thread Kay Wrobel
Is there any way I can debug the parser? Especially, the edismax parser which 
does *not* raise any exception but produces an empty parsedQuery? Please, if 
anyone can help. I feel very lost and without guidance, and Google search has 
not provided me with any help at all.

> On Dec 28, 2018, at 9:57 AM, Kay Wrobel  wrote:
> 
> Here are my log entries:
> 
> SOLR 7.x (non-working)
> 2018-12-28 15:36:32.786 INFO  (qtp1769193365-20) [   x:collection1] 
> o.a.s.c.S.Request [collection1]  webapp=/solr path=/select 
> params={q=ac6023*=tm_field_product^21.0=tm_title_field^8.0=all=10=xml=true}
>  hits=0 status=0 QTime=2
> 
> SOLR 4.x (working)
> INFO  - 2018-12-28 15:43:41.938; org.apache.solr.core.SolrCore; [collection1] 
> webapp=/solr path=/select 
> params={q=ac6023*=tm_field_product^21.0=tm_title_field^8.0=all=10=xml=true}
>  hits=32 status=0 QTime=8 
> 
> EchoParams=all did not show anything different in the resulting XML from SOLR 
> 7.x.
> 
> 
> I found out something curious yesterday. When I try to force the Standard 
> query parser on SOLR 7.x using the same query, but adding "defType=lucene" at 
> the beginning, SOLR 7 raises a SolrException with this message: "analyzer 
> returned too many terms for multiTerm term: ac6023" (full response: 
> https://pastebin.com/ijdBj4GF)
> 
> Log entry for that request:
> 2018-12-28 15:50:58.804 ERROR (qtp1769193365-15) [   x:collection1] 
> o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: analyzer 
> returned too many terms for multiTerm term: ac6023
>at 
> org.apache.solr.schema.TextField.analyzeMultiTerm(TextField.java:180)
>at 
> org.apache.solr.parser.SolrQueryParserBase.analyzeIfMultitermTermText(SolrQueryParserBase.java:992)
>at 
> org.apache.solr.parser.SolrQueryParserBase.getPrefixQuery(SolrQueryParserBase.java:1173)
>at 
> org.apache.solr.parser.SolrQueryParserBase.handleBareTokenQuery(SolrQueryParserBase.java:781)
>at org.apache.solr.parser.QueryParser.Term(QueryParser.java:421)
>at org.apache.solr.parser.QueryParser.Clause(QueryParser.java:278)
>at org.apache.solr.parser.QueryParser.Query(QueryParser.java:162)
>at 
> org.apache.solr.parser.QueryParser.TopLevelQuery(QueryParser.java:131)
>at 
> org.apache.solr.parser.SolrQueryParserBase.parse(SolrQueryParserBase.java:254)
>at org.apache.solr.search.LuceneQParser.parse(LuceneQParser.java:49)
>at org.apache.solr.search.QParser.getQuery(QParser.java:173)
>at 
> org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:160)
>at 
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:279)
>at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:199)
>at org.apache.solr.core.SolrCore.execute(SolrCore.java:2541)
>at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:709)
>at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:515)
>at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:377)
>at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:323)
>at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1634)
>at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
>at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:146)
>at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>at 
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:257)
>at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
>at 
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)
>at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1317)
>at 
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
>at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
>at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
>at 
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
>at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1219)
>at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
>at 
>

Re: How to debug empty ParsedQuery from Edismax Query Parser

2018-12-28 Thread Kay Wrobel
Here are my log entries:

SOLR 7.x (non-working)
2018-12-28 15:36:32.786 INFO  (qtp1769193365-20) [   x:collection1] 
o.a.s.c.S.Request [collection1]  webapp=/solr path=/select 
params={q=ac6023*=tm_field_product^21.0=tm_title_field^8.0=all=10=xml=true}
 hits=0 status=0 QTime=2

SOLR 4.x (working)
INFO  - 2018-12-28 15:43:41.938; org.apache.solr.core.SolrCore; [collection1] 
webapp=/solr path=/select 
params={q=ac6023*=tm_field_product^21.0=tm_title_field^8.0=all=10=xml=true}
 hits=32 status=0 QTime=8 

EchoParams=all did not show anything different in the resulting XML from SOLR 
7.x.


I found out something curious yesterday. When I try to force the Standard query 
parser on SOLR 7.x using the same query, but adding "defType=lucene" at the 
beginning, SOLR 7 raises a SolrException with this message: "analyzer returned 
too many terms for multiTerm term: ac6023" (full response: 
https://pastebin.com/ijdBj4GF)

Log entry for that request:
2018-12-28 15:50:58.804 ERROR (qtp1769193365-15) [   x:collection1] 
o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: analyzer 
returned too many terms for multiTerm term: ac6023
at org.apache.solr.schema.TextField.analyzeMultiTerm(TextField.java:180)
at 
org.apache.solr.parser.SolrQueryParserBase.analyzeIfMultitermTermText(SolrQueryParserBase.java:992)
at 
org.apache.solr.parser.SolrQueryParserBase.getPrefixQuery(SolrQueryParserBase.java:1173)
at 
org.apache.solr.parser.SolrQueryParserBase.handleBareTokenQuery(SolrQueryParserBase.java:781)
at org.apache.solr.parser.QueryParser.Term(QueryParser.java:421)
at org.apache.solr.parser.QueryParser.Clause(QueryParser.java:278)
at org.apache.solr.parser.QueryParser.Query(QueryParser.java:162)
at 
org.apache.solr.parser.QueryParser.TopLevelQuery(QueryParser.java:131)
at 
org.apache.solr.parser.SolrQueryParserBase.parse(SolrQueryParserBase.java:254)
at org.apache.solr.search.LuceneQParser.parse(LuceneQParser.java:49)
at org.apache.solr.search.QParser.getQuery(QParser.java:173)
at 
org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:160)
at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:279)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:199)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2541)
at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:709)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:515)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:377)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:323)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1634)
at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:146)
at 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
at 
org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:257)
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
at 
org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1317)
at 
org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
at 
org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1219)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
at 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)
at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
at 
org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
at org.eclipse.jetty.server.Server.handle(Server.java:531)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:352)
at 
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:260)
at 
org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(Abs

Re: How to debug empty ParsedQuery from Edismax Query Parser

2018-12-27 Thread Alexandre Rafalovitch
echoParams=all

May also be helpful to pinpoint differences in params from all sources,
including request handler defaults.

Regards,
Alex

On Thu, Dec 27, 2018, 8:25 PM Shawn Heisey wrote:
> On 12/27/2018 10:47 AM, Kay Wrobel wrote:
> > Now starting from SOLR version 5+, I receive zero (0) results back, but
> more importantly, the Query Parser produces an empty parsedQuery.
> >
> > Here is the same query issued to SOLR 7.6.0 (current version):
> > https://pastebin.com/XcNhfdUD <https://pastebin.com/XcNhfdUD>
> >
> > Notice how "parsedQuery" now shows "+()"; an empty query string.
>
> I can duplicate this result on a 7.5.0 example config by sending an
> edismax query with undefined parameters for df and qf. The other
> field-related parameters for edismax are also undefined.  The following
> URL parameters with the default example config will produce that parsed
> query:
>
> q=ac6023*=edismax===on
>
> When a query is made and Solr's logging configuration is at its default
> setting, Solr will log a line into its logfile containing all of the
> parameters in the query, both those provided on the URL and those set by
> Solr's configuration (solrconfig.xml).  Can you share this log line from
> both the version that works and the version that doesn't?
>
> This is the log line created when I ran the query mentioned above:
>
> 2018-12-27 23:03:23.199 INFO  (qtp315932542-23) [   x:baz]
> o.a.s.c.S.Request [baz]  webapp=/solr path=/select
> params={q=ac6023*=edismax===on} hits=0 status=0
> QTime=0
>
> What I'm thinking is that there is a difference in the configuration of
> the two servers or the actual query being sent is different.  Either
> way, there's something different.  The two log lines that I have asked
> for are likely to be different from each other in some way that will
> explain what you're seeing.
>
> Thanks,
> Shawn
>
>


Re: How to debug empty ParsedQuery from Edismax Query Parser

2018-12-27 Thread Shawn Heisey

On 12/27/2018 10:47 AM, Kay Wrobel wrote:

Now starting from SOLR version 5+, I receive zero (0) results back, but more 
importantly, the Query Parser produces an empty parsedQuery.

Here is the same query issued to SOLR 7.6.0 (current version):
https://pastebin.com/XcNhfdUD <https://pastebin.com/XcNhfdUD>

Notice how "parsedQuery" now shows "+()"; an empty query string.


I can duplicate this result on a 7.5.0 example config by sending an 
edismax query with undefined parameters for df and qf. The other 
field-related parameters for edismax are also undefined.  The following 
URL parameters with the default example config will produce that parsed 
query:


q=ac6023*=edismax===on

When a query is made and Solr's logging configuration is at its default 
setting, Solr will log a line into its logfile containing all of the 
parameters in the query, both those provided on the URL and those set by 
Solr's configuration (solrconfig.xml).  Can you share this log line from 
both the version that works and the version that doesn't?


This is the log line created when I ran the query mentioned above:

2018-12-27 23:03:23.199 INFO  (qtp315932542-23) [   x:baz] 
o.a.s.c.S.Request [baz]  webapp=/solr path=/select 
params={q=ac6023*=edismax===on} hits=0 status=0 
QTime=0


What I'm thinking is that there is a difference in the configuration of 
the two servers or the actual query being sent is different.  Either 
way, there's something different.  The two log lines that I have asked 
for are likely to be different from each other in some way that will 
explain what you're seeing.


Thanks,
Shawn



How to debug empty ParsedQuery from Edismax Query Parser

2018-12-27 Thread Kay Wrobel
Hi everyone.

I have the task of converting our old SOLR 4.10.2 instance to the current SOLR 
7.6.0 version. We're using SOLR as our Search API backend for a Drupal 7 
Commerce web site. One of the most basic queries is that a customer would enter 
a part number or a portion of a part number on our web site and then get a list 
of part numbers back. Under the hood, we are using the "search_api_solr" module 
for Drupal 7 which produces a rather large query request. I have simplified a 
sample request for the sake of this discussion.

On SOLR 4.10.2, when I issue the following to our core:
/select?qf=tm_field_product^21.0=tm_title_field^8.0=ac6023*=xml=10=true

I get 32 rows returned (out of 1.4M indexed documents). Here is a link to the 
response (edited to focus on debugging info):
https://pastebin.com/JHuFcbGG <https://pastebin.com/JHuFcbGG>

Notice how "parsedQuery" has a proper DisjunctionMaxQuery based on the two 
query fields.

Now starting from SOLR version 5+, I receive zero (0) results back, but more 
importantly, the Query Parser produces an empty parsedQuery.

Here is the same query issued to SOLR 7.6.0 (current version):
https://pastebin.com/XcNhfdUD <https://pastebin.com/XcNhfdUD>

Notice how "parsedQuery" now shows "+()"; an empty query string.

As I understand it, the wildcard is a perfectly legal character in a query and 
has the following meaning:
http://lucene.apache.org/solr/guide/7_6/the-standard-query-parser.html#wildcard-searches
 
<http://lucene.apache.org/solr/guide/7_6/the-standard-query-parser.html#wildcard-searches>

So why does this not work? I have installed SOLR 5.5.5 and 6.6.5 as well just 
to test when this behavior started happening, and it starts as early as SOLR 5. 
I've been searching Google for the past week now on the matter and cannot for 
the life of me find an answer to this issue. So I am turning to this mailing 
list for any advice and assistance.

Kind regards, and thanks.

Kay


Re: switch query parser and solr cloud

2018-09-13 Thread Dwane Hall
Afternoon all,

Just to add some closure to this topic in case anybody else stumbles across a 
similar problem: I've managed to resolve my issue by removing the switch query 
parser from the _appends_ component of the parameter set.

so the parameter set changes from this

 "set":{
"firstParams":{
"op":"AND",
"wt":"json",
"start":0,
"allResults":"false",
"fl":"FIELD_1,FIELD_2,SUMMARY_FIELD",
  "_appends_":{
"fq":"{!switch default=\"{!collapse field=SUMMARY_FIELD}\" 
case.true=*:* v=${allResults}}",
  },

to just a regular old filter query

 "set":{
"firstParams":{
"op":"AND",
"wt":"json",
"start":0,
"allResults":"false",
"fl":"FIELD_1,FIELD_2,SUMMARY_FIELD",
"fq":"{!switch default=\"{!collapse field=SUMMARY_FIELD}\" 
case.true=*:* v=${allResults}}",

Somewhat odd.
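For reference, the switch parser's resolution logic, as described in Hossman's post linked below, can be sketched as a toy function. `switch_qparser` is made up for illustration and the real parser supports more (e.g. a bare `case` param for empty values); the point is just how `v=${allResults}` picks between the collapse filter and the match-all query:

```python
def switch_qparser(local_params, request_params):
    # Toy model of the switch query parser: dereference ${param} in v,
    # then pick the matching case.<value> local param, falling back
    # to "default" when no case matches.
    v = local_params.get("v", "")
    if v.startswith("${") and v.endswith("}"):
        v = request_params.get(v[2:-1], "")
    return local_params.get("case." + v.strip(), local_params.get("default"))

params = {
    "default": "{!collapse field=SUMMARY_FIELD}",
    "case.true": "*:*",
    "v": "${allResults}",
}
print(switch_qparser(params, {"allResults": "true"}))   # *:*
print(switch_qparser(params, {"allResults": "false"}))  # {!collapse field=SUMMARY_FIELD}
```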

Thanks again to Erick and Shawn for taking the time to assist and talk this 
through.

Dwane

From: Dwane Hall 
Sent: Thursday, 13 September 2018 6:42 AM
To: Erick Erickson; solr-user@lucene.apache.org
Subject: Re: switch query parser and solr cloud

Thanks for the suggestions and responses, Erick and Shawn. Erick, I only return 
30 records irrespective of the query (not the entire payload); I removed some of 
my configuration settings for readability. The parameter "allResults" was a 
little misleading, I apologise for that, but I appreciate your input.

Shawn, thanks for your comments. Regarding the switch query parser, Hossman 
has a great description of its use and application here 
(https://lucidworks.com/2013/02/20/custom-solr-request-params/).  PTST is just 
our performance-testing environment and is not important in the context of the 
question, other than it being a multi-node Solr environment.  The server-side 
error was the null pointer, which is why I was having a few difficulties 
debugging it: there was not a lot of info to troubleshoot.  I'll keep playing 
and explore the client filter option for addressing this issue.

Thanks again for both of your input

Cheers,

Dwane

From: Erick Erickson 
Sent: Thursday, 13 September 2018 12:20 AM
To: solr-user
Subject: Re: switch query parser and solr cloud

You will run into significant problems if, when returning "all
results", you return large result sets. For regular queries I like to
limit the return to 100, although 1,000 is sometimes OK.

Millions will blow you out of the water, use CursorMark or Streaming
for very large result sets. CursorMark gets you a page at a time, but
efficiently and Streaming doesn't consume huge amounts of memory.

And assuming you could possibly return 1M rows, say, what would the
user do with it? Displaying in a browser is problematic for instance.

Best,
Erick
On Wed, Sep 12, 2018 at 5:54 AM Shawn Heisey  wrote:
>
> On 9/12/2018 5:47 AM, Dwane Hall wrote:
> > Good afternoon Solr brains trust I'm seeking some community advice if 
> > somebody can spare a minute from their busy schedules.
> >
> > I'm attempting to use the switch query parser to influence client search 
> > behaviour based on a client specified request parameter.
> >
> > Essentially I want the following to occur:
> >
> > -A user has the option to pass through an optional request parameter 
> > "allResults" to solr
> > -If "allResults" is true then return all matching query records by 
> > appending a filter query for all records (fq=*:*)
> > -If "allResults" is empty then apply a filter using the collapse query 
> > parser ({!collapse field=SUMMARY_FIELD})
>
> I'm looking at the documentation for the switch parser and I'm having
> difficulty figuring out what it actually does.
>
> This is the kind of thing that is better to handle in your client
> instead of asking Solr to do it for you.  You'd have to have your code
> construct the complex localparam for the switch parser ... it would be
> much easier to write code to insert your special collapse filter when it
> is required.
>
> > Everything works nicely until I move from a single node solr instance (DEV) 
> > to a clustered solr instance (PTST) in which I receive a null pointer 
> > exception from Solr which I'm having trouble picking apart.  I've 
> > co-located the solr documents using document routing which appears to be the 
> > only requirement for the collapse query parser's use.
>
> Some features break down when working with sharded indexes.  This is one
> of the reasons

Re: switch query parser and solr cloud

2018-09-12 Thread Dwane Hall
Thanks for the suggestions and responses, Erick and Shawn.  Erick, I only return 
30 records irrespective of the query (not the entire payload); I removed some of 
my configuration settings for readability. The parameter "allResults" was a 
little misleading; I apologise for that, but I appreciate your input.

Shawn, thanks for your comments. Regarding the switch query parser, Hossman 
has a great description of its use and application here 
(https://lucidworks.com/2013/02/20/custom-solr-request-params/).  PTST is just 
our performance testing environment and is not important in the context of the 
question other than it being a multi node solr environment.  The server side 
error was the null pointer which is why I was having a few difficulties 
debugging it as there was not a lot of info to troubleshoot.  I'll keep playing 
and explore the client filter option for addressing this issue.

Thanks again for both of your input

Cheers,

Dwane

From: Erick Erickson 
Sent: Thursday, 13 September 2018 12:20 AM
To: solr-user
Subject: Re: switch query parser and solr cloud

You will run into significant problems if, when returning "all
results", you return large result sets. For regular queries I like to
limit the return to 100, although 1,000 is sometimes OK.

Millions will blow you out of the water, use CursorMark or Streaming
for very large result sets. CursorMark gets you a page at a time, but
efficiently and Streaming doesn't consume huge amounts of memory.

And assuming you could possibly return 1M rows, say, what would the
user do with it? Displaying in a browser is problematic for instance.

Best,
Erick
On Wed, Sep 12, 2018 at 5:54 AM Shawn Heisey  wrote:
>
> On 9/12/2018 5:47 AM, Dwane Hall wrote:
> > Good afternoon Solr brains trust I'm seeking some community advice if 
> > somebody can spare a minute from their busy schedules.
> >
> > I'm attempting to use the switch query parser to influence client search 
> > behaviour based on a client specified request parameter.
> >
> > Essentially I want the following to occur:
> >
> > -A user has the option to pass through an optional request parameter 
> > "allResults" to solr
> > -If "allResults" is true then return all matching query records by 
> > appending a filter query for all records (fq=*:*)
> > -If "allResults" is empty then apply a filter using the collapse query 
> > parser ({!collapse field=SUMMARY_FIELD})
>
> I'm looking at the documentation for the switch parser and I'm having
> difficulty figuring out what it actually does.
>
> This is the kind of thing that is better to handle in your client
> instead of asking Solr to do it for you.  You'd have to have your code
> construct the complex localparam for the switch parser ... it would be
> much easier to write code to insert your special collapse filter when it
> is required.
>
> > Everything works nicely until I move from a single node solr instance (DEV) 
> > to a clustered solr instance (PTST) in which I receive a null pointer 
> > exception from Solr which I'm having trouble picking apart.  I've 
> > co-located the solr documents using document routing which appears to be the 
> > only requirement for the collapse query parser's use.
>
> Some features break down when working with sharded indexes.  This is one
> of the reasons that sharding should only be done when it is absolutely
> required.  A single-shard index tends to perform better anyway, unless
> it's really really huge.
>
> The error is a remote exception, from
> https://myserver:1234/solr/my_collection_ptst_shard2_replica_n2. Which
> suggests that maybe not all your documents are co-located on the same
> shard the way you think they are.  Is this a remote server/shard?  I am
> completely guessing here.  It's always possible that you've encountered
> a bug.  Does this one (not fixed) look like it might apply?
>
> https://issues.apache.org/jira/browse/SOLR-9104
>
> There should be a server-side error logged by the Solr instance running
> on myserver:1234 as well.  Have you looked at that?
>
> I do not know what PTST means.  Is that important for me to understand?
>
> Thanks,
> Shawn
>


Re: switch query parser and solr cloud

2018-09-12 Thread Erick Erickson
You will run into significant problems if, when returning "all
results", you return large result sets. For regular queries I like to
limit the return to 100, although 1,000 is sometimes OK.

Millions will blow you out of the water, use CursorMark or Streaming
for very large result sets. CursorMark gets you a page at a time, but
efficiently and Streaming doesn't consume huge amounts of memory.
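The cursorMark contract Erick mentions can be sketched without a live server: start with `cursorMark=*`, pass the returned `nextCursorMark` back on each request, and stop when the same mark comes back. The `fetch` callable below is a stand-in driven by fake pages, not a real Solr client.

```python
def page_through(fetch, rows=100):
    """Yield documents page by page using cursorMark semantics."""
    cursor = "*"
    while True:
        response = fetch({"q": "*:*", "sort": "id asc",
                          "rows": rows, "cursorMark": cursor})
        yield from response["response"]["docs"]
        next_cursor = response["nextCursorMark"]
        if next_cursor == cursor:   # same mark returned => no more results
            break
        cursor = next_cursor

# Fake two-page result set to exercise the loop without a server.
pages = {
    "*": {"response": {"docs": [{"id": 1}, {"id": 2}]},
          "nextCursorMark": "AoE"},
    "AoE": {"response": {"docs": [{"id": 3}]},
            "nextCursorMark": "AoE"},  # repeated mark ends the loop
}
docs = list(page_through(lambda p: pages[p["cursorMark"]]))
print(docs)  # [{'id': 1}, {'id': 2}, {'id': 3}]
```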

And assuming you could possibly return 1M rows, say, what would the
user do with it? Displaying in a browser is problematic for instance.

Best,
Erick
On Wed, Sep 12, 2018 at 5:54 AM Shawn Heisey  wrote:
>
> On 9/12/2018 5:47 AM, Dwane Hall wrote:
> > Good afternoon Solr brains trust I'm seeking some community advice if 
> > somebody can spare a minute from their busy schedules.
> >
> > I'm attempting to use the switch query parser to influence client search 
> > behaviour based on a client specified request parameter.
> >
> > Essentially I want the following to occur:
> >
> > -A user has the option to pass through an optional request parameter 
> > "allResults" to solr
> > -If "allResults" is true then return all matching query records by 
> > appending a filter query for all records (fq=*:*)
> > -If "allResults" is empty then apply a filter using the collapse query 
> > parser ({!collapse field=SUMMARY_FIELD})
>
> I'm looking at the documentation for the switch parser and I'm having
> difficulty figuring out what it actually does.
>
> This is the kind of thing that is better to handle in your client
> instead of asking Solr to do it for you.  You'd have to have your code
> construct the complex localparam for the switch parser ... it would be
> much easier to write code to insert your special collapse filter when it
> is required.
>
> > Everything works nicely until I move from a single node solr instance (DEV) 
> > to a clustered solr instance (PTST) in which I receive a null pointer 
> > exception from Solr which I'm having trouble picking apart.  I've 
> > co-located the solr documents using document routing which appears to be the 
> > only requirement for the collapse query parser's use.
>
> Some features break down when working with sharded indexes.  This is one
> of the reasons that sharding should only be done when it is absolutely
> required.  A single-shard index tends to perform better anyway, unless
> it's really really huge.
>
> The error is a remote exception, from
> https://myserver:1234/solr/my_collection_ptst_shard2_replica_n2. Which
> suggests that maybe not all your documents are co-located on the same
> shard the way you think they are.  Is this a remote server/shard?  I am
> completely guessing here.  It's always possible that you've encountered
> a bug.  Does this one (not fixed) look like it might apply?
>
> https://issues.apache.org/jira/browse/SOLR-9104
>
> There should be a server-side error logged by the Solr instance running
> on myserver:1234 as well.  Have you looked at that?
>
> I do not know what PTST means.  Is that important for me to understand?
>
> Thanks,
> Shawn
>


Re: switch query parser and solr cloud

2018-09-12 Thread Shawn Heisey

On 9/12/2018 5:47 AM, Dwane Hall wrote:

Good afternoon Solr brains trust I'm seeking some community advice if somebody 
can spare a minute from their busy schedules.

I'm attempting to use the switch query parser to influence client search 
behaviour based on a client specified request parameter.

Essentially I want the following to occur:

-A user has the option to pass through an optional request parameter 
"allResults" to solr
-If "allResults" is true then return all matching query records by appending a 
filter query for all records (fq=*:*)
-If "allResults" is empty then apply a filter using the collapse query parser 
({!collapse field=SUMMARY_FIELD})


I'm looking at the documentation for the switch parser and I'm having 
difficulty figuring out what it actually does.


This is the kind of thing that is better to handle in your client 
instead of asking Solr to do it for you.  You'd have to have your code 
construct the complex localparam for the switch parser ... it would be 
much easier to write code to insert your special collapse filter when it 
is required.
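Shawn's suggestion of handling this client-side can be sketched as below. The field and parameter names follow the thread; everything else (function shape, query string) is illustrative, not the poster's actual client code.

```python
def build_params(search_string, all_results=False):
    """Build Solr request params, choosing the filter in client code
    instead of delegating the decision to a {!switch} localparam."""
    params = {
        "q": "{!edismax q.op=AND}" + search_string,
        "rows": 30,
    }
    if not all_results:
        # collapse to one row per summary group unless the caller
        # explicitly asked for every matching record
        params["fq"] = "{!collapse field=SUMMARY_FIELD}"
    return params

print(build_params("dwane"))
print(build_params("dwane", all_results=True))
```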



Everything works nicely until I move from a single node solr instance (DEV) to 
a clustered solr instance (PTST) in which I receive a null pointer exception 
from Solr which I'm having trouble picking apart.  I've co-located the solr 
documents using document routing, which appears to be the only requirement for 
the collapse query parser's use.


Some features break down when working with sharded indexes.  This is one 
of the reasons that sharding should only be done when it is absolutely 
required.  A single-shard index tends to perform better anyway, unless 
it's really really huge.


The error is a remote exception, from 
https://myserver:1234/solr/my_collection_ptst_shard2_replica_n2. Which 
suggests that maybe not all your documents are co-located on the same 
shard the way you think they are.  Is this a remote server/shard?  I am 
completely guessing here.  It's always possible that you've encountered 
a bug.  Does this one (not fixed) look like it might apply?


https://issues.apache.org/jira/browse/SOLR-9104

There should be a server-side error logged by the Solr instance running 
on myserver:1234 as well.  Have you looked at that?


I do not know what PTST means.  Is that important for me to understand?

Thanks,
Shawn



switch query parser and solr cloud

2018-09-12 Thread Dwane Hall
Good afternoon Solr brains trust I'm seeking some community advice if somebody 
can spare a minute from their busy schedules.

I'm attempting to use the switch query parser to influence client search 
behaviour based on a client specified request parameter.

Essentially I want the following to occur:

-A user has the option to pass through an optional request parameter 
"allResults" to solr
-If "allResults" is true then return all matching query records by appending a 
filter query for all records (fq=*:*)
-If "allResults" is empty then apply a filter using the collapse query parser 
({!collapse field=SUMMARY_FIELD})

Environment
Solr 7.3.1 (1 solr node DEV, 4 solr nodes PTST)
4 shard collection

My Implementation
I'm using the switch query parser to choose client behaviour by appending a 
filter query to the user request very similar to what is documented in the solr 
reference guide here 
(https://lucene.apache.org/solr/guide/7_4/other-parsers.html#switch-query-parser)

The request uses the params api (pertinent line below is the _appends_ filter 
queries)
(useParams=firstParams,secondParams)

  "set":{
"firstParams":{
"op":"AND",
"wt":"json",
"start":0,
"allResults":"false",
"fl":"FIELD_1,FIELD_2,SUMMARY_FIELD",
  "_appends_":{
"fq":"{!switch default=\"{!collapse field=SUMMARY_FIELD}\" 
case.true=*:* v=${allResults}}",
  },
  "_invariants_":{
"deftype":"edismax",
"timeAllowed":2,
"rows":"30",
"echoParams":"none",
}
  }
   }

   "set":{
"secondParams":{
"df":"FIELD_1",
"q":"{!edismax v=${searchString} df=FIELD_1 q.op=${op}}",
  "_invariants_":{
"qf":"FIELD_1,FIELD_2,SUMMARY_FIELD",
}
  }
   }}

Everything works nicely until I move from a single node solr instance (DEV) to 
a clustered solr instance (PTST) in which I receive a null pointer exception 
from Solr which I'm having trouble picking apart.  I've co-located the solr 
documents using document routing, which appears to be the only requirement for 
the collapse query parser's use.

Does anyone know if the switch query parser has any limitations in a sharded 
solr cloud environment or can provide any possible troubleshooting advice?

Any community recommendations would be greatly appreciated

Solr stack trace
2018-09-12 12:16:12,918 4064160860 ERROR : [c:my_collection s:shard1 
r:core_node3 x:my_collection_ptst_shard1_replica_n1] 
org.apache.solr.common.SolrException : 
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error 
from server at https://myserver:1234/solr/my_collection_ptst_shard2_replica_n2: 
java.lang.NullPointerException
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:643)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:255)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:244)
at org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1219)
at 
org.apache.solr.handler.component.HttpShardHandler.lambda$submit$0(HttpShardHandler.java:172)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
at 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748

Thanks for taking the time to assist,

Dwane


Re: Type ahead functionality using complex phrase query parser

2018-08-16 Thread Gus Heck
Yes, that's a common strategy, and it's also fairly common to index two (or
more) versions of the field, with different tokenizations (or not
tokenized) if there is a need to perform different types of search on the
field. This duplication can be achieved either with <copyField/> in the
schema, CloneFieldUpdateRequestProcessor
<https://lucene.apache.org/solr/7_4_0//solr-core/org/apache/solr/update/processor/CloneFieldUpdateProcessorFactory.html>
in a custom update chain, or in the indexing/ETL software prior to sending
the data to solr depending on what you have available and what you need to
do.
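A minimal sketch of the copyField variant Gus describes (field and type names are placeholders, not from the original schema): keep the raw string value for type-ahead and copy it into an analyzed field for ordinary search.

```xml
<!-- Raw value for prefix/type-ahead matching -->
<field name="suggestion" type="string" indexed="true" stored="true"/>
<!-- Analyzed copy for full-text search -->
<field name="suggestion_txt" type="text_general" indexed="true" stored="false"/>
<copyField source="suggestion" dest="suggestion_txt"/>
```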

-Gus

On Wed, Aug 15, 2018 at 7:12 PM, Hanjan, Harinder <
harinder.han...@calgary.ca> wrote:

> Keeping the field as string so that no analysis is done on it has yielded
> promising results.
>
>  required="true" />
>
> I will test more tomorrow and report back.
>
> -Original Message-
> From: Hanjan, Harinder [mailto:harinder.han...@calgary.ca]
> Sent: Wednesday, August 15, 2018 5:01 PM
> To: solr-user@lucene.apache.org
> Subject: [EXT] Type ahead functionality using complex phrase query parser
>
> Hello!
>
> I can't get Solr to give the results I would expect, would appreciate if
> someone could point me in the right direction here.
>
> /select?q={!complexphrase}"gar*"
> shows me the following terms
>
> -garages
>
> -garburator
>
> -gardening
>
> -gardens
>
> -garage
>
> -garden
>
> -garbage
>
> -century gardens
>
> -community gardens
>
> I was not expecting to see the bottom two.
>
> --- schema.xml ---
>  required="true" />  positionIncrementGap="100"> 
>   
>   
>
> 
>
> --- query ---
> /select?q={!complexphrase}"gar*"
>
> --- solrconfig.xml ---
> 
>
>   explicit
>   10
>   suggestion
>
> 
>
> Thanks!
> Harinder
>
> 
> NOTICE -
> This communication is intended ONLY for the use of the person or entity
> named above and may contain information that is confidential or legally
> privileged. If you are not the intended recipient named above or a person
> responsible for delivering messages or communications to the intended
> recipient, YOU ARE HEREBY NOTIFIED that any use, distribution, or copying
> of this communication or any of the information contained in it is strictly
> prohibited. If you have received this communication in error, please notify
> us immediately by telephone and then destroy or delete this communication,
> or return it to us by mail if requested by us. The City of Calgary thanks
> you for your attention and co-operation.
>



-- 
http://www.the111shift.com


RE: Type ahead functionality using complex phrase query parser

2018-08-15 Thread Hanjan, Harinder
Keeping the field as string so that no analysis is done on it has yielded 
promising results.  



I will test more tomorrow and report back.
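The behaviour in the original post makes sense once you consider that on a tokenized field the wildcard is applied per term: "century gardens" is indexed as the tokens "century" and "gardens", and "gardens" starts with "gar". A string field compares the whole value. This is a deliberately simplified model of the matching, not Solr's actual analysis chain.

```python
def matches(value, prefix, tokenized):
    """Crude model: prefix-match each token (tokenized field) or the
    whole value (string field)."""
    terms = value.lower().split() if tokenized else [value.lower()]
    return any(t.startswith(prefix) for t in terms)

print(matches("century gardens", "gar", tokenized=True))   # True
print(matches("century gardens", "gar", tokenized=False))  # False
print(matches("garage", "gar", tokenized=False))           # True
```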

-Original Message-
From: Hanjan, Harinder [mailto:harinder.han...@calgary.ca] 
Sent: Wednesday, August 15, 2018 5:01 PM
To: solr-user@lucene.apache.org
Subject: [EXT] Type ahead functionality using complex phrase query parser

Hello!

I can't get Solr to give the results I would expect, would appreciate if 
someone could point me in the right direction here.

/select?q={!complexphrase}"gar*"
shows me the following terms

-garages

-garburator

-gardening

-gardens

-garage

-garden

-garbage

-century gardens

-community gardens

I was not expecting to see the bottom two.

--- schema.xml ---
  
  
  
   


--- query ---
/select?q={!complexphrase}"gar*"

--- solrconfig.xml ---

   
  explicit
  10
  suggestion
   


Thanks!
Harinder




Type ahead functionality using complex phrase query parser

2018-08-15 Thread Hanjan, Harinder
Hello!

I can't get Solr to give the results I would expect, would appreciate if 
someone could point me in the right direction here.

/select?q={!complexphrase}"gar*"
shows me the following terms

-garages

-garburator

-gardening

-gardens

-garage

-garden

-garbage

-century gardens

-community gardens

I was not expecting to see the bottom two.

--- schema.xml ---



  
  
   


--- query ---
/select?q={!complexphrase}"gar*"

--- solrconfig.xml ---

   
  explicit
  10
  suggestion
   


Thanks!
Harinder




Re: Solr Default query parser

2018-07-02 Thread Kamal Kishore Aggarwal
Thanks Jason and Shawn.

It's clear now.


Regards
Kamal


On Tue, Jun 26, 2018, 6:12 PM Jason Gerlowski  wrote:

> The "Standard Query Parser" _is_ the lucene query parser.  They're the
> same parser.  As Shawn pointed out above, they're also the default, so
> if you don't specify any defType, they will be used.  Though if you
> want to be explicit and specify it anyway, the value is defType=lucene
>
> Jason
> On Mon, Jun 25, 2018 at 1:05 PM Kamal Kishore Aggarwal
>  wrote:
> >
> > Hi Shawn,
> >
> > Thanks for the reply.
> >
> > If "lucene" is the default query parser, then how can we specify Standard
> > Query Parser(QP) in the query.
> >
> > Dismax QP can be specified by defType=dismax and Extended Dismax Qp by
> > defType=edismax, how about for declaration of Standard QP.
> >
> > Regards
> > Kamal
> >
> > On Wed, Jun 6, 2018 at 9:41 PM, Shawn Heisey 
> wrote:
> >
> > > On 6/6/2018 9:52 AM, Kamal Kishore Aggarwal wrote:
> > > >> What is the default query parser (QP) for solr.
> > > >>
> > > >> While I was reading about this, I came across two links which looks
> > > >> ambiguous to me. It's not clear to me whether Standard is the
> default
> > > QP or
> > > >> Lucene is the default QP or they are same. Below is the screenshot
> and
> > > >> links which are confusing me.
> > >
> > > The default query parser in Solr has the name "lucene".  This query
> > > parser, which is part of Solr, deals with Lucene query syntax.
> > >
> > > The most recent documentation states this clearly right after the table
> > > of contents:
> > >
> > >
> https://lucene.apache.org/solr/guide/7_3/the-standard-query-parser.html
> > >
> > > It is highly unlikely that the 6.6 documentation will receive any
> > > changes, unless serious errors are found in it.  The omission of this
> > > piece of information will not be seen as a serious error.
> > >
> > > Thanks,
> > > Shawn
> > >
> > >
>


Re: Solr Default query parser

2018-06-26 Thread Jason Gerlowski
The "Standard Query Parser" _is_ the lucene query parser.  They're the
same parser.  As Shawn pointed out above, they're also the default, so
if you don't specify any defType, they will be used.  Though if you
want to be explicit and specify it anyway, the value is defType=lucene

Jason
On Mon, Jun 25, 2018 at 1:05 PM Kamal Kishore Aggarwal
 wrote:
>
> Hi Shawn,
>
> Thanks for the reply.
>
> If "lucene" is the default query parser, then how can we specify Standard
> Query Parser(QP) in the query.
>
> Dismax QP can be specified by defType=dismax and Extended Dismax Qp by
> defType=edismax, how about for declaration of Standard QP.
>
> Regards
> Kamal
>
> On Wed, Jun 6, 2018 at 9:41 PM, Shawn Heisey  wrote:
>
> > On 6/6/2018 9:52 AM, Kamal Kishore Aggarwal wrote:
> > >> What is the default query parser (QP) for solr.
> > >>
> > >> While I was reading about this, I came across two links which looks
> > >> ambiguous to me. It's not clear to me whether Standard is the default
> > QP or
> > >> Lucene is the default QP or they are same. Below is the screenshot and
> > >> links which are confusing me.
> >
> > The default query parser in Solr has the name "lucene".  This query
> > parser, which is part of Solr, deals with Lucene query syntax.
> >
> > The most recent documentation states this clearly right after the table
> > of contents:
> >
> > https://lucene.apache.org/solr/guide/7_3/the-standard-query-parser.html
> >
> > It is highly unlikely that the 6.6 documentation will receive any
> > changes, unless serious errors are found in it.  The omission of this
> > piece of information will not be seen as a serious error.
> >
> > Thanks,
> > Shawn
> >
> >


Re: Solr Default query parser

2018-06-25 Thread Kamal Kishore Aggarwal
Hi Shawn,

Thanks for the reply.

If "lucene" is the default query parser, then how can we specify the Standard
Query Parser (QP) in the query?

The dismax QP can be specified by defType=dismax and the extended dismax QP by
defType=edismax; how do we declare the Standard QP?

Regards
Kamal

On Wed, Jun 6, 2018 at 9:41 PM, Shawn Heisey  wrote:

> On 6/6/2018 9:52 AM, Kamal Kishore Aggarwal wrote:
> >> What is the default query parser (QP) for solr.
> >>
> >> While I was reading about this, I came across two links which looks
> >> ambiguous to me. It's not clear to me whether Standard is the default
> QP or
> >> Lucene is the default QP or they are same. Below is the screenshot and
> >> links which are confusing me.
>
> The default query parser in Solr has the name "lucene".  This query
> parser, which is part of Solr, deals with Lucene query syntax.
>
> The most recent documentation states this clearly right after the table
> of contents:
>
> https://lucene.apache.org/solr/guide/7_3/the-standard-query-parser.html
>
> It is highly unlikely that the 6.6 documentation will receive any
> changes, unless serious errors are found in it.  The omission of this
> piece of information will not be seen as a serious error.
>
> Thanks,
> Shawn
>
>


Re: Sole Default query parser

2018-06-22 Thread Jason Gerlowski
Hi Kamal,

Sorry for the late reply.  If you're still unsure, the "lucene" query
parser is the default one.  The first ref-guide link you posted refers
to it almost ubiquitously as the "Standard Query Parser", but it's the
same thing as the lucene query parser.  (The page does say this, but
it's easy to miss "Solr’s default Query Parser is also known as the
lucene parser")
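Put differently: omitting defType falls back to the "lucene" (Standard) parser, and passing defType=lucene names it explicitly. The sketch below just builds the query strings; the core name and query are illustrative.

```python
from urllib.parse import urlencode

# The same query routed through each parser; None means "use the
# default", which is equivalent to defType=lucene.
for parser in (None, "lucene", "dismax", "edismax"):
    params = {"q": "title:solr"}
    if parser:
        params["defType"] = parser
    print("/solr/mycore/select?" + urlencode(params))
```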

Best,

Jason
On Wed, Jun 6, 2018 at 5:08 AM Kamal Kishore Aggarwal
 wrote:
>
> Hi Guys,
>
> What is the default query parser (QP) for solr.
>
> While I was reading about this, I came across two links which looks ambiguous 
> to me. It's not clear to me whether Standard is the default QP or Lucene is 
> the default QP or they are same. Below is the screenshot and links which are 
> confusing me.
>
> https://lucene.apache.org/solr/guide/6_6/the-standard-query-parser.html
>
> https://lucene.apache.org/solr/guide/6_6/common-query-parameters.html
>
> Please suggest. Thanks in advance.
>
>
> Regards
> Kamal Kishore


Re: Solr Default query parser

2018-06-06 Thread Shawn Heisey
On 6/6/2018 9:52 AM, Kamal Kishore Aggarwal wrote:
>> What is the default query parser (QP) for solr.
>>
>> While I was reading about this, I came across two links which looks
>> ambiguous to me. It's not clear to me whether Standard is the default QP or
>> Lucene is the default QP or they are same. Below is the screenshot and
>> links which are confusing me.

The default query parser in Solr has the name "lucene".  This query
parser, which is part of Solr, deals with Lucene query syntax.

The most recent documentation states this clearly right after the table
of contents:

https://lucene.apache.org/solr/guide/7_3/the-standard-query-parser.html

It is highly unlikely that the 6.6 documentation will receive any
changes, unless serious errors are found in it.  The omission of this
piece of information will not be seen as a serious error.

Thanks,
Shawn



Re: Solr Default query parser

2018-06-06 Thread Kamal Kishore Aggarwal
[Correcting the subject]

On Wed, Jun 6, 2018 at 2:37 PM, Kamal Kishore Aggarwal <
kkroyal@gmail.com> wrote:

> Hi Guys,
>
> What is the default query parser (QP) for solr.
>
> While I was reading about this, I came across two links which looks
> ambiguous to me. It's not clear to me whether Standard is the default QP or
> Lucene is the default QP or they are same. Below is the screenshot and
> links which are confusing me.
>
> https://lucene.apache.org/solr/guide/6_6/the-standard-query-parser.html
>
> https://lucene.apache.org/solr/guide/6_6/common-query-parameters.html
>
> Please suggest. Thanks in advance.
>
>
> Regards
> Kamal Kishore
>


Sole Default query parser

2018-06-06 Thread Kamal Kishore Aggarwal
Hi Guys,

What is the default query parser (QP) for solr.

While I was reading about this, I came across two links which look
ambiguous to me. It's not clear to me whether Standard is the default QP or
Lucene is the default QP or they are same. Below is the screenshot and
links which are confusing me.

https://lucene.apache.org/solr/guide/6_6/the-standard-query-parser.html

https://lucene.apache.org/solr/guide/6_6/common-query-parameters.html

Please suggest. Thanks in advance.


Regards
Kamal Kishore


Re: Block join query parser

2018-06-06 Thread Mikhail Khludnev
[child] has childFilter param. Also, mind about [subquery]
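A sketch of the request shape Mikhail is pointing at: the [child] doc transformer with a childFilter, so only matching children are nested under each parent. Field names, filters, and the collection path are illustrative, not from the original post.

```python
from urllib.parse import urlencode

# Block join query returning parents, with only the children that
# match the childFilter nested under each result.
params = {
    "q": "{!parent which='doc_type:parent'}productName:web",
    "fl": "*,[child parentFilter='doc_type:parent'"
          " childFilter='productName:web']",
}
url = "/solr/orders/select?" + urlencode(params)
print(url)
```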

On Wed, Jun 6, 2018 at 9:33 AM, Ryan Yacyshyn 
wrote:

> Hi all,
>
> I'm looking for a way to query nested documents that would return the
> parent documents along with its child documents nested under it, but only
> the child documents that match the query. The [child] doc transformer comes
> close, but it returns all child docs.
>
> I'm looking for something similar to ES' inner hits (
> https://www.elastic.co/guide/en/elasticsearch/reference/
> current/search-request-inner-hits.html
> ).
>
> Is this possible?
>
> Thanks,
> Ryan
>



-- 
Sincerely yours
Mikhail Khludnev


Block join query parser

2018-06-06 Thread Ryan Yacyshyn
Hi all,

I'm looking for a way to query nested documents that would return the
parent documents along with its child documents nested under it, but only
the child documents that match the query. The [child] doc transformer comes
close, but it returns all child docs.

I'm looking for something similar to ES' inner hits (
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-inner-hits.html
).

Is this possible?

Thanks,
Ryan


RE: change in the Standard Query Parser behavior when migrating from Solr 5 to 7.

2018-05-10 Thread Piyush Kumar Nayak
Shawn,
I've raised a bug for the issue
https://issues.apache.org/jira/browse/SOLR-12340

I've shared the schemas there, in case you wanna take a look.
Thanks for helping out.
 

-Original Message-
From: Shawn Heisey [mailto:apa...@elyograg.org] 
Sent: Thursday, May 10, 2018 3:41 AM
To: solr-user@lucene.apache.org
Subject: Re: change in the Standard Query Parser behavior when migrating from 
Solr 5 to 7.

On 5/9/2018 2:37 PM, Piyush Kumar Nayak wrote:
> Same here. "sow" restores the old behavior.

This might be a bug.  I'd like someone who has better understanding of the 
low-level internals to comment before assuming that it's a bug, though.  Sounds 
like sow=false (default as of 7.0) might be causing the 
autoGeneratePhraseQueries setting in the schema to be ignored.  Need to find 
out from an expert whether that is expected or wrong.

> The schema.xml in both Solr versions for me is the one that gets copied from 
> the default template folder to the collections's conf folder.
> On 7 though, looks like the schema changes file changes to "managed-schema".

The managed schema factory became default in 5.5.  The default filename for the 
managed schema is managed-schema.  Some of the configs included with Solr did 
already use the managed schema before it became default. Now all of the 
examples use the managed schema.

> On 7 Solr backs up the original schema.xml and creates a managed schema 
> config file.

That is a back-compat feature of the managed schema factory.  If a schema.xml 
file is found, it will be renamed and its contents will be copied to 
managed-schema, overwriting it if it exists already.

Thanks,
Shawn



Re: change in the Standard Query Parser behavior when migrating from Solr 5 to 7.

2018-05-09 Thread Shawn Heisey
On 5/9/2018 2:37 PM, Piyush Kumar Nayak wrote:
> Same here. "sow" restores the old behavior.

This might be a bug.  I'd like someone who has better understanding of
the low-level internals to comment before assuming that it's a bug,
though.  Sounds like sow=false (default as of 7.0) might be causing the
autoGeneratePhraseQueries setting in the schema to be ignored.  Need to
find out from an expert whether that is expected or wrong.

> The schema.xml in both Solr versions for me is the one that gets copied from 
> the default template folder to the collections's conf folder.
> On 7 though, looks like the schema changes file changes to "managed-schema".

The managed schema factory became default in 5.5.  The default filename
for the managed schema is managed-schema.  Some of the configs included
with Solr did already use the managed schema before it became default. 
Now all of the examples use the managed schema.

> On 7 Solr backs up the original schema.xml and creates a managed schema 
> config file.

That is a back-compat feature of the managed schema factory.  If a
schema.xml file is found, it will be renamed and its contents will be
copied to managed-schema, overwriting it if it exists already.

Thanks,
Shawn



RE: change in the Standard Query Parser behavior when migrating from Solr 5 to 7.

2018-05-09 Thread Piyush Kumar Nayak
Same here. "sow" restores the old behavior.
The schema.xml in both Solr versions for me is the one that gets copied from 
the default template folder to the collections's conf folder.
On 7 though, looks like the schema changes file changes to "managed-schema".

The fieldtype that corresponds to field "contents" is "text", and the 
definition of "text" field in 5 and the schema backup on 7 is the same. 
On 7 Solr backs up the original schema.xml and creates a managed schema config 
file.

I tried the analysis tab. Looks like all the classes (WT, SF ...) in 7 list a 
property (termFrequency = 1) that is missing in 5.

Lemme see if I can share the schemas.


-Original Message-
From: David Hastings [mailto:hastings.recurs...@gmail.com] 
Sent: Thursday, May 10, 2018 1:38 AM
To: solr-user@lucene.apache.org
Subject: Re: change in the Standard Query Parser behavior when migrating from 
Solr 5 to 7.

sow=true made 7 mimic 5.



On Wed, May 9, 2018 at 3:57 PM, Shawn Heisey <apa...@elyograg.org> wrote:

> On 5/9/2018 1:25 PM, David Hastings wrote:
> > https://pastebin.com/0QUseqrN
> >
> > here is mine for an example with the exact same behavior
>
> Can you try the query in the Analysis tab in the admin UI on both 
> versions and see which step in the analysis chain is the point at 
> which the two diverge from each other?
>
> I would still like to see the full schema, but I have an idea for a 
> troubleshooting step.  Can you add "sow=true" to the URL on version 7 
> and see if that makes any difference?  The query that the OP is using 
> doesn't have spaces, but there might be some kind of odd interaction 
> between sow and other settings.  I believe the default value for sow 
> changed to false in version 7.
>
> You also might try setting autoGeneratePhraseQueries to false on the 
> fieldType, which might cause version 5 to behave the same as 7.  But 
> be warned that this could make things work very differently than users 
> might expect.
>
> Thanks,
> Shawn
>


Re: change in the Standard Query Parser behavior when migrating from Solr 5 to 7.

2018-05-09 Thread David Hastings
sow=true made 7 mimic 5.



On Wed, May 9, 2018 at 3:57 PM, Shawn Heisey  wrote:

> On 5/9/2018 1:25 PM, David Hastings wrote:
> > https://pastebin.com/0QUseqrN
> >
> > here is mine for an example with the exact same behavior
>
> Can you try the query in the Analysis tab in the admin UI on both
> versions and see which step in the analysis chain is the point at which
> the two diverge from each other?
>
> I would still like to see the full schema, but I have an idea for a
> troubleshooting step.  Can you add "sow=true" to the URL on version 7
> and see if that makes any difference?  The query that the OP is using
> doesn't have spaces, but there might be some kind of odd interaction
> between sow and other settings.  I believe the default value for sow
> changed to false in version 7.
>
> You also might try setting autoGeneratePhraseQueries to false on the
> fieldType, which might cause version 5 to behave the same as 7.  But be
> warned that this could make things work very differently than users
> might expect.
>
> Thanks,
> Shawn
>


Re: change in the Standard Query Parser behavior when migrating from Solr 5 to 7.

2018-05-09 Thread Shawn Heisey
On 5/9/2018 1:25 PM, David Hastings wrote:
> https://pastebin.com/0QUseqrN
> 
> here is mine for an example with the exact same behavior

Can you try the query in the Analysis tab in the admin UI on both
versions and see which step in the analysis chain is the point at which
the two diverge from each other?

I would still like to see the full schema, but I have an idea for a
troubleshooting step.  Can you add "sow=true" to the URL on version 7
and see if that makes any difference?  The query that the OP is using
doesn't have spaces, but there might be some kind of odd interaction
between sow and other settings.  I believe the default value for sow
changed to false in version 7.

You also might try setting autoGeneratePhraseQueries to false on the
fieldType, which might cause version 5 to behave the same as 7.  But be
warned that this could make things work very differently than users
might expect.

Thanks,
Shawn
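To make Shawn's troubleshooting step concrete, here is a sketch of the Solr 7 request with sow=true added (the core name is hypothetical; the q/df values follow the thread):

```python
from urllib.parse import urlencode

# Re-run the query on Solr 7 with sow=true ("split on whitespace").
# The sow default changed to false in 7.0; setting it back to true
# reportedly restores the 5.x parsing behavior in this thread.
params = {
    "q": "test3",
    "df": "contents",
    "sow": "true",
    "debugQuery": "true",  # compare the "parsedquery" debug output
    "wt": "json",
}
url = "http://localhost:8983/solr/mycore/select?" + urlencode(params)
print(url)
```

Comparing the "parsedquery" debug output between the two versions, with and without sow=true, is the quickest way to see which setting is responsible.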


Re: change in the Standard Query Parser behavior when migrating from Solr 5 to 7.

2018-05-09 Thread David Hastings
I'd rather not, at least on my part, but in both cases I have:

   

and text as my default field, changed from text_general


On Wed, May 9, 2018 at 3:43 PM, Shawn Heisey  wrote:

> On 5/9/2018 1:25 PM, David Hastings wrote:
> > https://pastebin.com/0QUseqrN
>
> Can you provide the *full* schema for both versions?
>
> Thanks,
> Shawn
>
>


Re: change in the Standard Query Parser behavior when migrating from Solr 5 to 7.

2018-05-09 Thread Shawn Heisey
On 5/9/2018 1:25 PM, David Hastings wrote:
> https://pastebin.com/0QUseqrN

Can you provide the *full* schema for both versions?

Thanks,
Shawn



Re: change in the Standard Query Parser behavior when migrating from Solr 5 to 7.

2018-05-09 Thread David Hastings
https://pastebin.com/0QUseqrN

here is mine for an example with the exact same behavior

On Wed, May 9, 2018 at 3:14 PM, Shawn Heisey  wrote:

> On 5/9/2018 12:39 PM, Piyush Kumar Nayak wrote:
> > we have recently upgraded from Solr5 to Solr7. I'm running into a change
> of behavior that I cannot fathom.
> > For the term "test3" Solr7 splits the numeric and alphabetical
> components and does a simple term search while Solr 5 did a phrase search.
> 
> > Also, the fieldType name="text" in the schema.xml for the cores in both
> the versions of Solr are identical.
> > I'd appreciate any pointers that can help with explaining this change.
>
> There shouldn't be anything in Solr code that would cause this with an
> upgrade.  It's most likely going to be some difference in the schema.
> Where did you get the schemas that you're using?
>
> The default field appears to be "contents".  You will need to look at
> the schema version, the field definition for "contents", and the
> fieldType definition for whatever the "type" attribute in the field
> definition of "contents" points to.  If you can provide the full
> contents of both schemas, we will have an easier time looking for
> differences.  Use a paste website or a file sharing site for that.  If
> you try to attach the files to a mailing list message, they will most
> likely be removed by the mailing list software.
>
> Thanks,
> Shawn
>
>


Re: change in the Standard Query Parser behavior when migrating from Solr 5 to 7.

2018-05-09 Thread Shawn Heisey
On 5/9/2018 12:39 PM, Piyush Kumar Nayak wrote:
> we have recently upgraded from Solr5 to Solr7. I'm running into a change of 
> behavior that I cannot fathom.
> For the term "test3" Solr7 splits the numeric and alphabetical components and 
> does a simple term search while Solr 5 did a phrase search.

> Also, the fieldType name="text" in the schema.xml for the cores in both the 
> versions of Solr are identical.
> I'd appreciate any pointers that can help with explaining this change.

There shouldn't be anything in Solr code that would cause this with an
upgrade.  It's most likely going to be some difference in the schema. 
Where did you get the schemas that you're using?

The default field appears to be "contents".  You will need to look at
the schema version, the field definition for "contents", and the
fieldType definition for whatever the "type" attribute in the field
definition of "contents" points to.  If you can provide the full
contents of both schemas, we will have an easier time looking for
differences.  Use a paste website or a file sharing site for that.  If
you try to attach the files to a mailing list message, they will most
likely be removed by the mailing list software.

Thanks,
Shawn



Re: change in the Standard Query Parser behavior when migrating from Solr 5 to 7.

2018-05-09 Thread David Hastings
Strange, I have the exact same results. What's more interesting is that the
analyzer shows identical output for both 5 and 7, so it's definitely a change
in the LuceneQParser

On Wed, May 9, 2018 at 2:39 PM, Piyush Kumar Nayak  wrote:

> we have recently upgraded from Solr5 to Solr7. I'm running into a change
> of behavior that I cannot fathom.
> For the term "test3" Solr7 splits the numeric and alphabetical components
> and does a simple term search while Solr 5 did a phrase search.
> 
> 
> --
> lucene/solr-spec: 7.2.1
> http://localhost:8991/solr/solr4/select?q=test3=test;
> wt=json=true=true
>
> "debug":{
> "rawquerystring":"test3",
> "querystring":"test3",
> "parsedquery":"contents:test contents:3",
> "parsedquery_toString":"contents:test contents:3",
>
> 
> 
> --
> lucene/solr-spec 5.2.1
> http://localhost:8989/solr/solr4/select?q=test3=test;
> wt=json=true=true
>
> "debug":{
> "rawquerystring":"test3",
> "querystring":"test3",
> "parsedquery":"PhraseQuery(contents:\"test 3\")",
> "parsedquery_toString":"contents:\"test 3\"",
> 
> 
> --
>
>
> I've not been able to find any mention of in the release notes or user
> guide at:
> https://lucene.apache.org/solr/guide/7_3/major-changes-in-solr-7.html
> https://lucene.apache.org/solr/guide/7_3/major-changes-
> from-solr-5-to-solr-6.html
>
> Also, the fieldType name="text" in the schema.xml for the cores in both
> the versions of Solr are identical.
> I'd appreciate any pointers that can help with explaining this change.
>
> Thanks,
> Piyush.
>
>


change in the Standard Query Parser behavior when migrating from Solr 5 to 7.

2018-05-09 Thread Piyush Kumar Nayak
we have recently upgraded from Solr5 to Solr7. I'm running into a change of 
behavior that I cannot fathom.
For the term "test3" Solr7 splits the numeric and alphabetical components and 
does a simple term search while Solr 5 did a phrase search.
--
lucene/solr-spec: 7.2.1
http://localhost:8991/solr/solr4/select?q=test3=test=json=true=true

"debug":{
"rawquerystring":"test3",
"querystring":"test3",
"parsedquery":"contents:test contents:3",
"parsedquery_toString":"contents:test contents:3",

--
lucene/solr-spec 5.2.1
http://localhost:8989/solr/solr4/select?q=test3=test=json=true=true

"debug":{
"rawquerystring":"test3",
"querystring":"test3",
"parsedquery":"PhraseQuery(contents:\"test 3\")",
"parsedquery_toString":"contents:\"test 3\"",
--


I've not been able to find any mention of in the release notes or user guide at:
https://lucene.apache.org/solr/guide/7_3/major-changes-in-solr-7.html
https://lucene.apache.org/solr/guide/7_3/major-changes-from-solr-5-to-solr-6.html

Also, the fieldType name="text" in the schema.xml for the cores in both the 
versions of Solr are identical.
I'd appreciate any pointers that can help with explaining this change.

Thanks,
Piyush.



Re: Query parser problem, using fuzzy search

2018-02-01 Thread David Frese

Am 31.01.18 um 16:30 schrieb David Frese:

Am 29.01.18 um 18:05 schrieb Erick Erickson:

Try searching with lowercase the word and. Somehow you have to allow
the parser to distinguish the two.


Oh yeah, the biggest unsolved problem in the ~80 years history of 
programming languages... NOT ;-)



You _might_ be able to try "AND~2" (with quotes) to see if you can get
that through the parser. Kind of a hack, but


Well, the parser swallows that, but it's not a fuzzy search then anymore.


There's also a parameter (depending on the parser) about lowercasing
operators, so if and~2 doesn't work check that.


And if both appear?

Well, thanks for your ideas - of course you are not the one to blame.



If anybody runs into the same problem, I found a possibility:

field:\AND~1

will find documents with field values similar to "AND".



--
David Frese
+49 7071 70896 75

Active Group GmbH
Hechinger Str. 12/1, 72072 Tübingen
Registergericht: Amtsgericht Stuttgart, HRB 224404
Geschäftsführer: Dr. Michael Sperber
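David's workaround generalizes to a small escaping helper; a sketch that backslash-escapes query-parser special characters and reserved words before appending the fuzzy operator (the character set below is an approximation of the parser's specials, not its full grammar):

```python
import re

def fuzzy_term(field: str, term: str, distance: int = 2) -> str:
    """Build a fuzzy clause, escaping query-parser special characters
    and reserved words (AND/OR/NOT) so they match literally."""
    escaped = re.sub(r'([+\-&|!(){}\[\]^"~*?:\\/ ])', r'\\\1', term)
    if term.upper() in ("AND", "OR", "NOT"):
        escaped = "\\" + escaped  # e.g. field:\AND~2
    return f"{field}:{escaped}~{distance}"

print(fuzzy_term("field", "AND", 1))   # -> field:\AND~1
print(fuzzy_term("field", "my val"))   # -> field:my\ val~2
```

This keeps the fuzzy operator outside the escaped portion, which matches the observation earlier in the thread that quoting the whole term (field:"val"~2) silently drops the fuzziness.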


Re: Query parser problem, using fuzzy search

2018-01-31 Thread David Frese

Am 29.01.18 um 18:05 schrieb Erick Erickson:

Try searching with lowercase the word and. Somehow you have to allow
the parser to distinguish the two.


Oh yeah, the biggest unsolved problem in the ~80 years history of 
programming languages... NOT ;-)



You _might_ be able to try "AND~2" (with quotes) to see if you can get
that through the parser. Kind of a hack, but


Well, the parser swallows that, but it's not a fuzzy search then anymore.


There's also a parameter (depending on the parser) about lowercasing
operators, so if and~2 doesn't work check that.


And if both appear?

Well, thanks for your ideas - of course you are not the one to blame.



On Mon, Jan 29, 2018 at 8:32 AM, David Frese
 wrote:

Hello everybody,

how can I formulate a fuzzy query that works for an arbitrary string, resp.
is there a formal syntax definition somewhere?

I already found out by hand that

field:"val"~2

Is read by the parser, but the fuzziness seems to get lost. So I write

field:val~2

Now if val contain spaces and other special characters, I can escape them:

field:my\ val~2

But now I'm stuck with the term AND:

field:AND~2

Note that I do not want a boolean expression here, but I want to match the
string AND! But the parser complains:

"org.apache.solr.search.SyntaxError: Cannot parse 'field:AND~2': Encountered
\"  \"AND \"\" at line 1, column 4.\nWas expecting one of:\n
 ...\n\"(\" ...\n\"*\" ...\n ...\n
...\n ...\n ...\n  ...\n\"[\"
...\n\"{\" ...\n ...\n \"filter(\" ...\n ...\n
",




--
David Frese
+49 7071 70896 75

Active Group GmbH
Hechinger Str. 12/1, 72072 Tübingen
Registergericht: Amtsgericht Stuttgart, HRB 224404
Geschäftsführer: Dr. Michael Sperber


Re: Query parser problem, using fuzzy search

2018-01-29 Thread Erick Erickson
Try searching with lowercase the word and. Somehow you have to allow
the parser to distinguish the two.

You _might_ be able to try "AND~2" (with quotes) to see if you can get
that through the parser. Kind of a hack, but

There's also a parameter (depending on the parser) about lowercasing
operators, so if and~2 doesn't work check that.

On Mon, Jan 29, 2018 at 8:32 AM, David Frese
 wrote:
> Hello everybody,
>
> how can I formulate a fuzzy query that works for an arbitrary string, resp.
> is there a formal syntax definition somewhere?
>
> I already found by by hand, that
>
> field:"val"~2
>
> Is read by the parser, but the fuzzyness seems to get lost. So I write
>
> field:val~2
>
> Now if val contain spaces and other special characters, I can escape them:
>
> field:my\ val~2
>
> But now I'm stuck with the term AND:
>
> field:AND~2
>
> Note that I do not want a boolean expression here, but I want to match the
> string AND! But the parser complains:
>
> "org.apache.solr.search.SyntaxError: Cannot parse 'field:AND~2': Encountered
> \"  \"AND \"\" at line 1, column 4.\nWas expecting one of:\n
>  ...\n\"(\" ...\n\"*\" ...\n ...\n
> ...\n ...\n ...\n  ...\n\"[\"
> ...\n\"{\" ...\n ...\n \"filter(\" ...\n ...\n
> ",
>
>
> Thanks for any hints and help.
>
> --
> David Frese
> +49 7071 70896 75
>
> Active Group GmbH
> Hechinger Str. 12/1, 72072 Tübingen
> Registergericht: Amtsgericht Stuttgart, HRB 224404
> Geschäftsführer: Dr. Michael Sperber


Query parser problem, using fuzzy search

2018-01-29 Thread David Frese

Hello everybody,

how can I formulate a fuzzy query that works for an arbitrary string, 
resp. is there a formal syntax definition somewhere?


I already found out by hand that

field:"val"~2

Is read by the parser, but the fuzziness seems to get lost. So I write

field:val~2

Now if val contain spaces and other special characters, I can escape them:

field:my\ val~2

But now I'm stuck with the term AND:

field:AND~2

Note that I do not want a boolean expression here, but I want to match 
the string AND! But the parser complains:


"org.apache.solr.search.SyntaxError: Cannot parse 'field:AND~2': 
Encountered \"  \"AND \"\" at line 1, column 4.\nWas expecting one 
of:\n ...\n\"(\" ...\n\"*\" ...\n 
...\n ...\n ...\n ...\n 
 ...\n\"[\" ...\n\"{\" ...\n ...\n 
\"filter(\" ...\n ...\n",



Thanks for any hints and help.

--
David Frese
+49 7071 70896 75

Active Group GmbH
Hechinger Str. 12/1, 72072 Tübingen
Registergericht: Amtsgericht Stuttgart, HRB 224404
Geschäftsführer: Dr. Michael Sperber


Re: does the payload_check query parser have support for simple query parser operators?

2017-11-30 Thread John Anonymous
Ok, thanks.  Do you know if there are any plans to support special syntax
in the future?

On Thu, Nov 30, 2017 at 5:04 AM, Erik Hatcher <erik.hatc...@gmail.com>
wrote:

> No it doesn’t.   The payload parsers currently just simple tokenize with
> no special syntax supported.
>
>  Erik
>
> > On Nov 30, 2017, at 02:41, John Anonymous <orro...@gmail.com> wrote:
> >
> > I would like to use wildcards and fuzzy search with the payload_check
> query
> > parser. Are these supported?
> >
> > {!payload_check f=text payloads='NOUN'}apple~1
> >
> > {!payload_check f=text payloads='NOUN'}app*
> >
> > Thanks
>


Re: does the payload_check query parser have support for simple query parser operators?

2017-11-30 Thread Erik Hatcher
No, it doesn’t. The payload parsers currently just do simple tokenization, with
no special syntax supported.

 Erik

> On Nov 30, 2017, at 02:41, John Anonymous <orro...@gmail.com> wrote:
> 
> I would like to use wildcards and fuzzy search with the payload_check query
> parser. Are these supported?
> 
> {!payload_check f=text payloads='NOUN'}apple~1
> 
> {!payload_check f=text payloads='NOUN'}app*
> 
> Thanks


does the payload_check query parser have support for simple query parser operators?

2017-11-29 Thread John Anonymous
I would like to use wildcards and fuzzy search with the payload_check query
parser. Are these supported?

{!payload_check f=text payloads='NOUN'}apple~1

{!payload_check f=text payloads='NOUN'}app*

Thanks


LTR feature and proximity search with Block Join Parent query Parser

2017-10-19 Thread Dariusz Wojtas
Hi,
I am working on features and my main document ('type:entity') has child
documents, some of them contain addresses ('type:entityAddress').

My feature definition:
{
  "store": "store_myStore",
  "name": "scoreAddressCity",
  "class": "org.apache.solr.ltr.feature.SolrFeature",
  "params":{ "q": "+{!parent which='type:entity'
score='max'}type:entityAddress +{!parent which='type:entity'
score='max'}address.city:${searchedCity}" }
}

Two sample searches where I search for city 'Warszawa'.
I am passing the searched city name with as efi.searchedCity .
a) the address document contains value 'Warszawa' in field 'address.city'
The result feature score is 1.98

b) the address document contains value 'WarszawaRado' in field
'address.city'
The result score is 0.0

How to return a score that finds some similarities between 'Warszawa' and
'WarszawaRado' in search b)?

Best regards,
Dariusz Wojtas
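For context, the feature above is evaluated by passing the efi value along with the LTR reranking query; a sketch of such a request (the core name, model name, and reRankDocs value are hypothetical — only the store name and efi.searchedCity follow the message):

```python
from urllib.parse import urlencode

# Rerank with an LTR model, passing the searched city via efi.*, and
# ask for the extracted feature vector alongside each hit so the
# scoreAddressCity feature value can be inspected per document.
params = {
    "q": "type:entity",
    "rq": "{!ltr model=myModel reRankDocs=100 efi.searchedCity=Warszawa}",
    "fl": "id,score,[features store=store_myStore efi.searchedCity=Warszawa]",
    "wt": "json",
}
url = "http://localhost:8983/solr/mycore/select?" + urlencode(params)
print(url)
```

Inspecting the [features] output for both the 'Warszawa' and 'WarszawaRado' documents makes it easy to confirm whether the feature query itself matched, separately from the final reranked score.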


Re: Apache 4.9.1 - trouble trying to use complex phrase query parser.

2017-06-28 Thread Stefan Matheis
If you'd include the actual error message you get, it might be easier to try
and help.

-Stefan

On Jun 28, 2017 6:24 PM, "Michael Craven" <mcrav...@jhu.edu> wrote:

> Hi -
>
> I am trying to use the complex phrase query parser on my Drupal
> installation. Our core is Solr 4.9.1, so I thought it should be no problem.
> Search works fine when I use a local parameter to do a search of type
> lucene, dismax, or edismax, (a la {!lucene} etc.), but when I try to do a
> search of type complex phrase, I get an error. Does anyone know why that
> might be? Is this maybe a Drupal specific problem? We are running Drupal
> 7.56.
>
> Thanks
>
> -M


Apache 4.9.1 - trouble trying to use complex phrase query parser.

2017-06-28 Thread Michael Craven
Hi - 

I am trying to use the complex phrase query parser on my Drupal installation. 
Our core is Solr 4.9.1, so I thought it should be no problem. Search works fine 
when I use a local parameter to do a search of type lucene, dismax, or edismax, 
(a la {!lucene} etc.), but when I try to do a search of type complex phrase, I 
get an error. Does anyone know why that might be? Is this maybe a Drupal 
specific problem? We are running Drupal 7.56.

Thanks

-M

Re: Solr NLS custom query parser

2017-06-15 Thread aruninfo100
Hi Michael,

I have indexed the documents as follows: I used OpenNLP to extract named
entities and POS tags and indexed that data into the respective fields. From
what I have read (my understanding), for natural language search with Solr,
once you have the entities extracted, the next step is to create a custom
query parser that takes advantage of the entity fields.
I have referred the slides and talk-
https://www.slideshare.net/lucenerevolution/teofilie-natural-languagesearchinsolreurocon2011
<https://www.slideshare.net/lucenerevolution/teofilie-natural-languagesearchinsolreurocon2011>
  
to do the same.

Thanks and Regards,
Arun



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-NLS-custom-query-parser-tp4340511p4340679.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr NLS custom query parser

2017-06-15 Thread Michael Kuhlmann
Hi Arun,

your question is too generic. What do you mean by NLP search? What do
you expect to happen?

The short answer is: No, there is no such parser because the individual
requirements will vary a lot.

-Michael

Am 14.06.2017 um 16:32 schrieb aruninfo100:
> Hi,
>
> I am trying to configure NLP search with Solr. I am using OpenNLP for the
> same.I am able to index the documents and extract named entities and POS
> using OpenNLP-UIMA support and also by using a UIMA Update request processor
> chain.But I am not able to write a query parser for the same.Is there a
> query parser already written to satisfy the above features(nlp search).
>
> Thanks and Regards,
> Arun
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Solr-NLS-custom-query-parser-tp4340511.html
> Sent from the Solr - User mailing list archive at Nabble.com.




Solr NLS custom query parser

2017-06-14 Thread aruninfo100
Hi,

I am trying to configure NLP search with Solr. I am using OpenNLP for the
same. I am able to index the documents and extract named entities and POS
using OpenNLP-UIMA support, and also by using a UIMA update request processor
chain. But I am not able to write a query parser for the same. Is there a
query parser already written to satisfy the above features (NLP search)?

Thanks and Regards,
Arun



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-NLS-custom-query-parser-tp4340511.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Configure query parser to handle field name case-insensitive

2017-05-16 Thread Erick Erickson
Rick:

Easiest to _code_. There isn't any. And if you just toss the problem
over the fence to support then it's not a problem ;)

Best,
Erick

On Tue, May 16, 2017 at 9:04 AM, Rick Leir <rl...@leirtech.com> wrote:
> Björn
> You are not serious about (1) are you? Yikes!! Easiest for you if you do not 
> need to sit at the helpdesk. Easiest if the users stop using the system.
>
> My guess is that (2) is easiest if you have text entry boxes for each field, 
> and the user need not type in the field name. Cheers -- Rick
>
> On May 16, 2017 10:56:37 AM EDT, Erick Erickson <erickerick...@gmail.com> 
> wrote:
>>Yeah, your options (5) and (6) are well... definitely at the
>>bottom of _my_ list, I understand you included them for
>>completeness...
>>
>>as for (4) Oh, my aching head. Parsers give me a headache ;)
>>
>>Yes, (1) is the easiest.(2) and (3) mostly depend on where you're most
>>comfortable coding. If you intercept the query on the backend in Java
>>_very_ early in the process you are working with essentially the same
>>string as you would in JS on the front end so it's a tossup. You might
>>just be more comfortable writing JS on the client rather than Java and
>>getting it hooked in to Solr, really your choice.
>>
>>Best,
>>Erick
>>
>>2017-05-16 0:59 GMT-07:00 Peemöller, Björn
>><bjoern.peemoel...@berenberg.de>:
>>> Hi all,
>>>
>>> thank you for your replies!
>>>
>>> We do not directly expose the Solr API, but provide an endpoint in
>>our backend which acts as a proxy for a specific search handler. One
>>requirement in our application is to search for people using various
>>properties, e.g., first name, last name, description, date of birth.
>>For simplicity reasons, we want to provide only a single search input
>>and allow the user to narrow down its results using the query syntax,
>>e.g. "firstname:John".
>>>
>>> Based on your suggestions, I can see the following solutions for our
>>problem:
>>>
>>> 1) Train the users to denote fieldnames in lowercase - they need to
>>know the exact field names anyway.
>>> 2) Modify (i.e., lowercase) the search term in the backend (Java)
>>> 3) Modify (i.e., lowercase) the search term in the frontend (JS)
>>> 4) Modify the Solr query parser (provide a customized implementation)
>>> 5) Define *a lot* of field aliases
>>> 6) Define *a lot* of copy fields
>>>
>>> I assess these solutions to be ordered in decreasing quality, so I
>>think that we will start to improve with more user guidance.
>>>
>>> Thanks to all,
>>> Björn
>>>
>>> -Ursprüngliche Nachricht-
>>> Von: Rick Leir [mailto:rl...@leirtech.com]
>>> Gesendet: Montag, 15. Mai 2017 18:33
>>> An: solr-user@lucene.apache.org
>>> Betreff: Re: Configure query parser to handle field name
>>case-insensitive
>>>
>>> Björn
>>> Yes, at query time you could downcase the names. Not in Solr, but in
>>the front-end web app you have in front of Solr. It needs to be a bit
>>smart, so it can downcase the field names but not the query terms.
>>>
>>> I assume you do not expose Solr directly to the web.
>>>
>>> This downcasing might be easier to do in Javascript in the browser.
>>Particularly if the user never has to enter a field name.
>>>
>>> Another solution, this time inside Solr, is to provide copyfields for
>>ID, Id, and maybe iD. And for other fields that you mention in queries.
>>This will consume some memory, particularly for saved fields, so I
>>hesitate to even suggest it. Cheers - Rick
>>>
>>>
>>> On May 15, 2017 9:16:59 AM EDT, "Peemöller, Björn"
>><bjoern.peemoel...@berenberg.de> wrote:
>>>>Hi Rick,
>>>>
>>>>thank you for your reply! I really meant field *names*, since our
>>>>values are already processed by a lower case filter (both index and
>>>>query). However, our users are confused because they can search for
>>>>"id:1" but not for "ID:1". Furthermore, we employ the EDisMax query
>>>>parser, so then even get no error message.
>>>>
>>>>Therefore, I thought it may be sufficient to map all field names to
>>>>lower case at the query level so that I do not have to introduce
>>>>additional fields.
>>>>
>>>>Regards,
>>>>Björn
>>>>
>>>>-Ursprüngliche Nachricht-
>>>&g
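Options (2) and (3) from Björn's list boil down to a small preprocessing step; a sketch that lowercases field-name prefixes while leaving the search terms untouched (the field-name pattern and the colon convention are assumptions about the query syntax — quoted phrases containing colons would need extra handling):

```python
import re

# Lowercase "FieldName:" prefixes in a user query, leaving the terms
# themselves alone. Assumes field names match [A-Za-z_][A-Za-z0-9_]*
# and are immediately followed by ':'; does not special-case quoted
# phrases that happen to contain a colon.
FIELD_PREFIX = re.compile(r'\b([A-Za-z_][A-Za-z0-9_]*):')

def lowercase_field_names(query: str) -> str:
    return FIELD_PREFIX.sub(lambda m: m.group(1).lower() + ':', query)

print(lowercase_field_names('ID:1 AND Firstname:John'))
# -> id:1 AND firstname:John
```

The same regex works equally well in the JavaScript frontend or in the Java proxy; only the field-name prefix is rewritten, so operators like AND and the values themselves pass through unchanged.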

Re: Configure query parser to handle field name case-insensitive

2017-05-16 Thread Rick Leir
Björn
You are not serious about (1) are you? Yikes!! Easiest for you if you do not 
need to sit at the helpdesk. Easiest if the users stop using the system. 

My guess is that (2) is easiest if you have text entry boxes for each field, 
and the user need not type in the field name. Cheers -- Rick

On May 16, 2017 10:56:37 AM EDT, Erick Erickson <erickerick...@gmail.com> wrote:
>[...]

Re: Configure query parser to handle field name case-insensitive

2017-05-16 Thread Erick Erickson
Yeah, your options (5) and (6) are well... definitely at the
bottom of _my_ list, I understand you included them for
completeness...

as for (4) Oh, my aching head. Parsers give me a headache ;)

Yes, (1) is the easiest. (2) and (3) mostly depend on where you're most
comfortable coding. If you intercept the query on the backend in Java
_very_ early in the process you are working with essentially the same
string as you would in JS on the front end so it's a tossup. You might
just be more comfortable writing JS on the client rather than Java and
getting it hooked in to Solr, really your choice.

Best,
Erick

2017-05-16 0:59 GMT-07:00 Peemöller, Björn <bjoern.peemoel...@berenberg.de>:
> [...]

AW: Configure query parser to handle field name case-insensitive

2017-05-16 Thread Peemöller , Björn
Hi all,

thank you for your replies!

We do not directly expose the Solr API, but provide an endpoint in our backend 
which acts as a proxy for a specific search handler. One requirement in our 
application is to search for people using various properties, e.g., first name, 
last name, description, date of birth. For simplicity reasons, we want to 
provide only a single search input and allow the user to narrow down its 
results using the query syntax, e.g. "firstname:John".

Based on your suggestions, I can see the following solutions for our problem:

1) Train the users to denote field names in lowercase - they need to know the 
exact field names anyway.
2) Modify (i.e., lowercase) the search term in the backend (Java)
3) Modify (i.e., lowercase) the search term in the frontend (JS)
4) Modify the Solr query parser (provide a customized implementation)
5) Define *a lot* of field aliases 
6) Define *a lot* of copy fields

I consider these solutions to be ordered in decreasing quality, so I think we 
will start by providing more user guidance.

Thanks to all,
Björn

-----Original Message-----
From: Rick Leir [mailto:rl...@leirtech.com]
Sent: Monday, 15 May 2017 18:33
To: solr-user@lucene.apache.org
Subject: Re: Configure query parser to handle field name case-insensitive

[...]

Re: Configure query parser to handle field name case-insensitive

2017-05-15 Thread Rick Leir
Björn
Yes, at query time you could downcase the names. Not in Solr, but in the 
front-end web app you have in front of Solr. It needs to be a bit smart, so it 
can downcase the field names but not the query terms.

I assume you do not expose Solr directly to the web.

This downcasing might be easier to do in JavaScript in the browser. 
Particularly if the user never has to enter a field name.
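A minimal sketch of that idea (nothing here is an official Solr utility; the field-name pattern is an assumption): lowercase only the `name:` prefixes of a query string while leaving the search terms alone.

```typescript
// Hypothetical sketch: lowercase field-name prefixes ("Name:") in a user
// query, but not the terms after the ':'. A "field prefix" is assumed to be
// an identifier followed by ':' at the start of a clause (start of string,
// whitespace, '(', '+' or '-').
function lowercaseFieldNames(query: string): string {
  return query.replace(
    /(^|[\s(+-])([A-Za-z_][\w.]*):/g,
    (_m, pre: string, field: string) => pre + field.toLowerCase() + ":",
  );
}
```

Note this deliberately ignores harder cases (quoted phrases containing colons immediately after a clause boundary, range syntax, escaped characters), so it is a starting point, not a full query rewriter.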

Another solution, this time inside Solr, is to provide copyfields for ID, Id, 
and maybe iD. And for other fields that you mention in queries. This will 
consume some memory, particularly for saved fields, so I hesitate to even 
suggest it. Cheers - Rick
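For completeness, the copyfield idea would look roughly like this in schema.xml; the field name and type are illustrative, each case variant needs its own destination field, and marking the copies stored="false" limits the memory cost Rick mentions:

```xml
<!-- Illustrative sketch only: one destination field per case variant -->
<field name="ID" type="string" indexed="true" stored="false"/>
<copyField source="id" dest="ID"/>
```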


On May 15, 2017 9:16:59 AM EDT, "Peemöller, Björn" 
<bjoern.peemoel...@berenberg.de> wrote:
>[...]

Re: Configure query parser to handle field name case-insensitive

2017-05-15 Thread Erick Erickson
So do you have _users_ directly entering Solr queries? And are they
totally trusted to be
1> not malicious
2> already know your schema?

Because direct access to the Solr URL allows me to delete all your
data. Usually there are drop-downs or other UI "stuff" that allows you
to programmatically assign the field name.

Trying to get in there and parse an arbitrary query in a component is
doable but difficult.

As Geraint says, field aliasing will work, but you'd need to cover all
the possibilities. All uppercase to lowercase is easy, but camel case
etc. would lead to a lot of aliases.
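As a sketch of the aliasing approach: edismax supports per-field aliasing through `f.<alias>.qf` request parameters, so a handful of known case variants can be mapped back to the real field (the variants shown are illustrative):

```
q=ID:1&defType=edismax&f.ID.qf=id&f.Id.qf=id&f.iD.qf=id
```

These parameters could also live in the request handler's defaults in solrconfig.xml so clients need not send them, but as Erick notes, covering every case variant this way does not scale.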

Best,
Erick

2017-05-15 6:16 GMT-07:00 Peemöller, Björn <bjoern.peemoel...@berenberg.de>:
> Hi Rick,
>
> thank you for your reply! I really meant field *names*, since our values are 
> already processed by a lower case filter (both index and query). However, our 
> users are confused because they can search for "id:1" but not for "ID:1". 
> Furthermore, we employ the EDisMax query parser, so we do not even get an 
> error message.
>
> Therefore, I thought it may be sufficient to map all field names to lower 
> case at the query level so that I do not have to introduce additional fields.
>
> Regards,
> Björn
>
> -----Original Message-----
> From: Rick Leir [mailto:rl...@leirtech.com]
> Sent: Monday, 15 May 2017 13:48
> To: solr-user@lucene.apache.org
> Subject: Re: Configure query parser to handle field name case-insensitive
>
> Björn
> Field names or values? I assume values. Your analysis chain in schema.xml 
> probably downcases chars, if not then that could be your problem.
>
> Field _name_? Then you might have to copyfield the field to a new field with 
> the desired case. Avoid doing that if you can. Cheers -- Rick
>
> On May 15, 2017 5:48:09 AM EDT, "Peemöller, Björn" 
> <bjoern.peemoel...@berenberg.de> wrote:
>>Hi all,
>>
>>I'm fairly new at using Solr and I need to configure our instance to
>>accept field names in both uppercase and lowercase (they are defined as
>>lowercase in our configuration). Is there a simple way to achieve this?
>>
>>Thanks in advance,
>>Björn
>>
>>Björn Peemöller
>>IT & IT Operations
>>
>>BERENBERG
>>Joh. Berenberg, Gossler & Co. KG
>>Neuer Jungfernstieg 20
>>20354 Hamburg
>>
>>Telefon +49 40 350 60-8548
>>Telefax +49 40 350 60-900
>>E-Mail
>>bjoern.peemoel...@berenberg.de<mailto:bjoern.peemoel...@berenberg.de>
>>www.berenberg.de<http://www.berenberg.de/>
>>
>>Sitz: Hamburg - Amtsgericht Hamburg HRA 42659
>>
>>
>>Diese Nachricht einschliesslich etwa beigefuegter Anhaenge ist
>>vertraulich und kann dem Bank- und Datengeheimnis unterliegen oder
>>sonst rechtlich geschuetzte Daten und Informationen enthalten. Wenn Sie
>>nicht der richtige Adressat sind oder diese Nachricht irrtuemlich
>>erhalten haben, informieren Sie bitte sofort den Absender über die
>>Antwortfunktion. Anschliessend moechten Sie bitte diese Nachricht
>>einschliesslich etwa beigefuegter Anhaenge unverzueglich vollstaendig
>>loeschen. Das unerlaubte Kopieren oder Speichern dieser Nachricht
>>und/oder der ihr etwa beigefuegten Anhaenge sowie die unbefugte
>>Weitergabe der darin enthaltenen Daten und Informationen sind nicht
>>gestattet. Wir weisen darauf hin, dass rechtsverbindliche Erklaerungen
>>namens unseres Hauses grundsaetzlich der Unterschriften zweier
>>ausreichend bevollmaechtigter Vertreter unseres Hauses beduerfen. Wir
>>verschicken daher keine rechtsverbindlichen Erklaerungen per E-Mail an
>>Dritte. Demgemaess nehmen wir per E-Mail auch keine rechtsverbindlichen
>>Erklaerungen oder Auftraege von Dritten entgegen.
>>Sollten Sie Schwierigkeiten beim Oeffnen dieser E-Mail haben, wenden
>>Sie sich bitte an den Absender oder an i...@berenberg.de. Please refer
>>to http://www.berenberg.de/my_berenberg/disclaimer_e.html for our
>>confidentiality notice.
>
> --
> Sorry for being brief. Alternate email is rickleir at yahoo dot com
>
