Avoiding duplicate entry for a multivalued field

2020-10-28 Thread Srinivas Kashyap
Hello,

Say I have a schema field which is multivalued. Is there a way to maintain 
distinct values for that field even though I keep adding duplicate values 
through atomic updates via SolrJ?

Is there some property setting to keep only unique values in a multivalued 
field?

Thanks,
Srinivas
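
For reference: since Solr 7.3 there is an "add-distinct" atomic-update modifier 
that only adds values not already present in a multivalued field, which sounds 
like exactly this. A minimal sketch over the JSON update API (the core name 
"mycore", uniqueKey "id", and field "tags" are hypothetical; the same modifier 
works from SolrJ by setting a Map of {"add-distinct": value} as the field value):

curl http://localhost:8983/solr/mycore/update?commit=true \
  -H 'Content-Type: application/json' -d '
[{"id": "doc1",
  "tags": {"add-distinct": ["red", "blue"]}}]'

An alternative is to deduplicate at index time with 
solr.UniqFieldsUpdateProcessorFactory in an updateRequestProcessorChain; note 
that for atomic updates such a processor has to run after the distributed 
update processor so that it sees the merged document.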



Re: Simulate facet.exists for json query facets

2020-10-28 Thread Michael Gibney
Separately, and in parallel to Erick's question: indeed I'm not aware
of any way to do this currently, but I *can* imagine cases where this
would be useful. I have a sense this could be cleanly implemented as a
stat facet function
(https://lucene.apache.org/solr/guide/8_6/json-facet-api.html#stat-facet-functions),
e.g.:

curl http://localhost:8983/solr/portal/select -d \
"q=*:*\
&json.facet={
  tour: \"exists(+categoryId:6000 -categoryId:(6061 21493 8510))\"
}\
&rows=0"

The return value of the `exists` function could be boolean, which
would be semantically clearer than capping count to 1, as I gather
`facet.exists` does. For the same reason, implementing this as a
function would probably be better than adding this functionality to
the `query` facet type, which carries certain useful assumptions (the
meaning of the "count" attribute in the response, the ability to nest
stats and subfacets, etc.) ... just thinking out loud at the moment
...
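
For illustration, here is the sort of nesting the `query` facet type supports 
today, and that a bare boolean `exists()` presumably would not (the `price` 
stat field is hypothetical):

curl http://localhost:8983/solr/portal/select -d 'q=*:*&rows=0&json.facet={
  tour: {
    type: query,
    q: "+categoryId:6000 -categoryId:(6061 21493 8510)",
    facet: { avgPrice: "avg(price)" }
  }
}'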

On Wed, Oct 28, 2020 at 9:17 AM Erick Erickson  wrote:
>
> This really sounds like an XY problem. The whole point of facets is
> to count the number of documents that have a value in some
> number of buckets. So trying to stop your facet query as soon
> as it matches a hit for the first time seems like an odd thing to do.
>
> So what’s the “X”? In other words, what is the problem you’re trying
> to solve at a high level? Perhaps there’s a better way to figure this
> out.
>
> Best,
> Erick
>
> > On Oct 28, 2020, at 3:48 AM, michael dürr  wrote:
> >
> > Hi,
> >
> > I use json facets of type 'query'. As these queries are pretty slow and I'm
> > only interested in whether there is a match or not, I'd like to restrict
> > the query execution similar to the standard facetting (like with the
> > facet.exists parameter). My simplified query looks something like this (in
> > reality *:* may be replaced by a complex edismax query and multiple
> > subfacets similar to "tour" occur):
> >
> > curl http://localhost:8983/solr/portal/select -d \
> > "q=*:*\
> > &json.facet={
> >  tour:{
> >    type : query,
> >    q: \"+(+categoryId:6000 -categoryId:(6061 21493 8510))\"
> >  }
> > }\
> > &rows=0"
> >
> > Is there any possibility to modify my request to ensure that the facet
> > query stops as soon as it matches a hit for the first time?
> >
> > Thanks!
> > Michael
>


Re: Solr LockObtainFailedException and NPEs for CoreAdmin STATUS

2020-10-28 Thread ahubold
Hi,

after reading some Solr source code, I might have found the cause:

There was indeed a change in Solr 8.6 that leads to the NullPointerException
for the CoreAdmin STATUS request in CoreAdminOperation#getCoreStatus. The
instancePath is not retrieved from the ResourceLoader anymore, but from the
registered CoreDescriptor. See commit [1]. 

SolrCore.getInstancePath(SolrCore.java:333) throws an NPE because the
CoreContainer does not have a CoreDescriptor for the name, even though a
SolrCore is available in the CoreContainer under that name (retrieved some
lines above). This inconsistency is persistent: All STATUS requests keep
failing until Solr is restarted.

IIUC, the underlying problem is that CoreContainer#create does not correctly
handle concurrent requests to create the same core. There's a race condition
(see TODO comment [2]), and CoreContainer#createFromDescriptor may end up being
called a second time for the same core. The second call then fails to create an
IndexWriter (LockObtainFailedException), which causes a call to
SolrCores#removeCoreDescriptor [3]. This means the second call removes the
CoreDescriptor for the SolrCore created by the first call. That is the
inconsistency that causes the NPE in CoreAdminOperation#getCoreStatus.
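
If it helps to reproduce: firing two CoreAdmin CREATE requests for the same 
core name concurrently should occasionally trigger the race (a sketch; the 
core name "c1" and configset "_default" are placeholders, and the window is 
narrow, so it may take several attempts):

curl "http://localhost:8983/solr/admin/cores?action=CREATE&name=c1&configSet=_default" &
curl "http://localhost:8983/solr/admin/cores?action=CREATE&name=c1&configSet=_default" &
wait
# if the race was hit, STATUS now fails with the NPE until Solr is restarted
curl "http://localhost:8983/solr/admin/cores?action=STATUS"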

Does this sound reasonable?

I'll create a JIRA ticket tomorrow, if that's okay.

Thank you,
Andreas

[1]
https://github.com/apache/lucene-solr/commit/17ae79b0905b2bf8635c1b260b30807cae2f5463#diff-9652fe8353b7eff59cd6f128bb2699d88361e670b840ee5ca1018b1bc45584d1R324
[2]
https://github.com/apache/lucene-solr/blob/15241573d3c8da0db3dfd380d99e4efcfe500c2e/solr/core/src/java/org/apache/solr/core/CoreContainer.java#L1242
[3]
https://github.com/apache/lucene-solr/blob/15241573d3c8da0db3dfd380d99e4efcfe500c2e/solr/core/src/java/org/apache/solr/core/CoreContainer.java#L1407






How to remove special characters from suggestion in Solr

2020-10-28 Thread Abhay Kumar
Hello,

We are using below suggest component in our solr implementation.


  
<searchComponent name="suggest" class="solr.SuggestComponent">
   <lst name="suggester">
      <str name="name">analyzinginfixsuggester</str>
      <str name="lookupImpl">analyzinginfixlookupfactory</str>
      <str name="dictionaryImpl">documentdictionaryfactory</str>
      <str name="field">text_auto</str>
      <str name="suggestAnalyzerFieldType">prefix_text</str>
      <str name="buildOnStartup">true</str>
      <str name="buildOnCommit">true</str>
   </lst>
   <lst name="suggester">
      <str name="name">FreeTextSuggester</str>
      <str name="lookupImpl">FreeTextLookupFactory</str>
      <str name="dictionaryImpl">DocumentDictionaryFactory</str>
      <str name="field">text</str>
      <str name="ngrams">5</str>
      <str name="suggestFreeTextAnalyzerFieldType">text_general</str>
      <str name="buildOnStartup">true</str>
      <str name="buildOnCommit">true</str>
   </lst>
</searchComponent>

<!-- the /suggest request handler definition was stripped by the list archive -->

For one document we have a large amount of data, and while syncing it using the 
SolrNet library we get the exception below.

SuggestComponent Exception in building suggester index for: 
AnalyzingInfixSuggester
java.lang.IllegalArgumentException: Document contains at least one immense term 
in field="exacttext" (whose UTF8 encoding is longer than the max length 32766), 
all of which were skipped.  Please correct the analyzer to not produce such 
terms.  The prefix of the first immense term is: '[77, 101, 100, 105, 99, 97, 
108, 32, 108, 97, 117, 110, 99, 104, 32, 112, 97, 99, 107, 10, 65, 98, 105, 
114, 97, 116, 101, 114, 111, 110]...', original message: bytes can be at most 
32766 in length; got 95994

Please help to resolve this issue.

Any help with removing special characters from the suggestion results would 
also work.

Thanks.
Abhay
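
In case it helps: the usual way to avoid the "immense term" error is to make 
sure the analyzer of the dictionary field never emits tokens anywhere near 
Lucene's 32766-byte term limit; a pattern filter can strip special characters 
in the same chain. A sketch only: the type name, pattern, and max length below 
are illustrative, not taken from the configuration above.

<fieldType name="text_suggest" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- strip special characters from suggestion tokens -->
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="[^\p{L}\p{Nd} ]" replacement="" replace="all"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- drop over-long tokens so no term can approach the 32766-byte limit -->
    <filter class="solr.LengthFilterFactory" min="1" max="256"/>
  </analyzer>
</fieldType>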




Re: SOLR uses too much CPU and GC is also weird on Windows server

2020-10-28 Thread Erick Erickson
DocValues=true is usually only used for “primitive” types: strings, numerics, 
booleans and the like, specifically _not_ text-based fields.

I say “usually” because there's a special “SortableTextField” where it does 
make some sense to have a text-based field with docValues, but that's intended 
for relatively short fields, for example sorting on a title field. It's 
probably not something you're working with.
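
For context, a SortableTextField declaration looks roughly like this (the type 
name and analyzer chain are illustrative):

<fieldType name="text_sortable" class="solr.SortableTextField" docValues="true">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>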

There’s not much we can say from this distance I’m afraid. I think I’d focus on 
the memory requirements, maybe take a heap dump and see what’s using memory.

Did you restart Solr _after_ turning off indexing? I ask because that would 
help determine which side the problem is on, indexing or querying. It does 
sound like querying though.

As for docValues in general, if you want to be really brave, you can set 
uninvertible=false for all your fields where docValues=false. When you facet on 
such a field, you won’t get anything back. If you sort on such a field, you’ll 
get an error message back. That should test if somehow not having docValues is 
the root of your problem. Do this on a test system of course ;) I think this is 
a low-probability issue, but it’s a mystery anyway so...
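
Concretely, that combination is just two attributes on a field definition (the 
field name here is hypothetical):

<field name="category" type="string" indexed="true" stored="true"
       docValues="false" uninvertible="false"/>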

Updating shouldn’t be that much of a problem either, and if you still see high 
CPU with indexing turned off, that eliminates indexing as a candidate.

Is there any chance you changed your schema at all and didn’t delete your 
entire index and add all your documents back? There are a lot of ways things 
can go wrong if that’s the case. You had to reindex from scratch when you went 
to 8x from 6x, I’m wondering if during that process the schema changed without 
starting over. I’m grasping at straws here…

I’d also seriously consider going to 8.6.3. We only make point releases when 
there’s something serious. Looking through lucene/CHANGES.txt, there is one 
memory leak fix in 8.6.2. I’d expect a gradual buildup of heap if that were 
what you’re seeing, but you never know.

As for having docValues=false, that would cut down on the size of the index on 
disk and speed up indexing some, but in terms of memory usage or CPU usage when 
querying, unless the docValues structures are _needed_, they’re never read into 
OS RAM by MMapDirectory… The question really is whether you ever, intentionally 
or not, do “something” that would be more efficient with docValues. That’s 
where setting uninvertible=false whenever you set docValues=false makes sense, 
things will show up if your assumption that you don’t need docValues is false.

Best,
Erick


> On Oct 28, 2020, at 9:29 AM, Jaan Arjasepp  wrote:
> 
> Hi all,
> 
> Its me again. Anyway, I did a little research and we tried different things 
> and well, some questions I want to ask and some things that I found.
> 
> Well after monitoring my system with VirtualVM, I found that GC jumping is 
> from 0.5GB to 2.5GB and it has 4GB of memory for now, so it should not be an 
> issue anymore or what? But will observe it a bit as it might rise I guess a 
> bit.
> 
> Next thing we found or are thinking about is that writing on a disk might be 
> an issue, we turned off the indexing and some other stuff, but I would say, 
> it did not save much still.
> I also did go through all the schema fields, not that much really. They are 
> all docValues=true. Also I must say they are all automatically generated, so 
> no manual working there except one field, but this also has docValue=true. 
> Just curious, if the field is not a string/text, can it be docValue=false or 
> still better to have true? And as for uninversion, then we are not using much 
> facets nor other specific things in query, just simple queries. 
> 
> Though I must say we are updating documents quite a bunch, but usage of CPU 
> for being so high, not sure about that. Older version seemed not using CPU so 
> much...
> 
> I am a bit running out of ideas and hoping that this will continue to work, 
> but I dont like the CPU usage even over night, when nobody uses it. We will 
> try to figure out the issue here and I hope I can ask more questions when in 
> doubt or out of ideas. Also I must admit, solr is really new for me 
> personally.
> 
> Jaan
> 
> -Original Message-
> From: Walter Underwood  
> Sent: 27 October 2020 18:44
> To: solr-user@lucene.apache.org
> Subject: Re: SOLR uses too much CPU and GC is also weird on Windows server
> 
> That first graph shows a JVM that does not have enough heap for the program 
> it is running. Look at the bottom of the dips. That is the amount of memory 
> still in use after a full GC.
> 
> You want those dips to drop to about half of the available heap, so I’d 
> immediately increase that heap to 4G. That might not be enough, so you’ll 
> need to to watch that graph after the increase.
> 
> I’ve been using 8G heaps with Solr since version 1.2. We run this config with 
> Java 8 on over 100 machines. We do not do any faceting, which can take more 
> memory.
> 
> SOLR_HEAP=8g
> # Use G1 GC  -- wunder 2017-01-23
> # Settings from https://wiki.apache.org/solr/ShawnHeisey
> GC_TUNE=" \
> -XX:+UseG1GC \
> -XX:+ParallelRefProcEnabled \
> -XX:G1HeapRegionSize=8m \
> -XX:MaxGCPauseMillis=200 \
> -XX:+UseLargePages \
> -XX:+AggressiveOpts \
> "
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)

Re: SOLR uses too much CPU and GC is also weird on Windows server

2020-10-28 Thread Walter Underwood
Double the heap.

All that CPU is the GC trying to free up space.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)
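
Since this instance runs on Windows, the equivalent knob lives in solr.in.cmd 
rather than solr.in.sh; a sketch, assuming the default bin\solr.cmd startup and 
a 4g target per the graphs discussed:

set SOLR_HEAP=4g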

> On Oct 28, 2020, at 6:29 AM, Jaan Arjasepp  wrote:
> 
> Hi all,
> 
> Its me again. Anyway, I did a little research and we tried different things 
> and well, some questions I want to ask and some things that I found.
> 
> Well after monitoring my system with VirtualVM, I found that GC jumping is 
> from 0.5GB to 2.5GB and it has 4GB of memory for now, so it should not be an 
> issue anymore or what? But will observe it a bit as it might rise I guess a 
> bit.
> 
> Next thing we found or are thinking about is that writing on a disk might be 
> an issue, we turned off the indexing and some other stuff, but I would say, 
> it did not save much still.
> I also did go through all the schema fields, not that much really. They are 
> all docValues=true. Also I must say they are all automatically generated, so 
> no manual working there except one field, but this also has docValue=true. 
> Just curious, if the field is not a string/text, can it be docValue=false or 
> still better to have true? And as for uninversion, then we are not using much 
> facets nor other specific things in query, just simple queries. 
> 
> Though I must say we are updating documents quite a bunch, but usage of CPU 
> for being so high, not sure about that. Older version seemed not using CPU so 
> much...
> 
> I am a bit running out of ideas and hoping that this will continue to work, 
> but I dont like the CPU usage even over night, when nobody uses it. We will 
> try to figure out the issue here and I hope I can ask more questions when in 
> doubt or out of ideas. Also I must admit, solr is really new for me 
> personally.
> 
> Jaan
> 
> -Original Message-
> From: Walter Underwood  
> Sent: 27 October 2020 18:44
> To: solr-user@lucene.apache.org
> Subject: Re: SOLR uses too much CPU and GC is also weird on Windows server
> 
> That first graph shows a JVM that does not have enough heap for the program 
> it is running. Look at the bottom of the dips. That is the amount of memory 
> still in use after a full GC.
> 
> You want those dips to drop to about half of the available heap, so I’d 
> immediately increase that heap to 4G. That might not be enough, so you’ll 
> need to to watch that graph after the increase.
> 
> I’ve been using 8G heaps with Solr since version 1.2. We run this config with 
> Java 8 on over 100 machines. We do not do any faceting, which can take more 
> memory.
> 
> SOLR_HEAP=8g
> # Use G1 GC  -- wunder 2017-01-23
> # Settings from https://wiki.apache.org/solr/ShawnHeisey
> GC_TUNE=" \
> -XX:+UseG1GC \
> -XX:+ParallelRefProcEnabled \
> -XX:G1HeapRegionSize=8m \
> -XX:MaxGCPauseMillis=200 \
> -XX:+UseLargePages \
> -XX:+AggressiveOpts \
> "
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
> 
>> On Oct 27, 2020, at 12:48 AM, Jaan Arjasepp  wrote:
>> 
>> Hello,
>> 
>> We have been using SOLR for quite some time. We used 6.0 and now we did a 
>> little upgrade to our system and servers and we started to use 8.6.1.
>> We use it on a Windows Server 2019.
>> Java version is 11
>> Basically using it in a default setting, except giving SOLR 2G of heap. It 
>> used 512, but it ran out of memory and stopped responding. Not sure if it 
>> was the issue. When older version, it managed fine with 512MB.
>> SOLR is not in a cloud mode, but in solo mode as we use it internally and it 
>> does not have too many request nor indexing actually.
>> Document sizes are not big, I guess. We only use one core.
>> Document stats are here:
>> Num Docs: 3627341
>> Max Doc: 4981019
>> Heap Memory Usage: 434400
>> Deleted Docs: 1353678
>> Version: 15999036
>> Segment Count: 30
>> 
>> The size of index is 2.66GB
>> 
>> While making upgrade we had to modify one field and a bit of code that uses 
>> it. Thats basically it. It works.
>> If needed more information about background of the system, I am happy to 
>> help.
>> 
>> 
>> But now to the issue I am having.
>> If SOLR is started, at first 40-60 minutes it works just fine. CPU is not 
>> high, heap usage seem normal. All is good, but then suddenly, the heap usage 
>> goes crazy, going up and down, up and down and CPU rises to 50-60% of the 
>> usage. Also I noticed over the weekend, when there are no writing usage, the 
>> CPU remains low and decent. I can try it this weekend again to see if and 
>> how this works out.
>> Also it seems to me, that after 4-5 days of working like this, it stops 
>> responding, but needs to be confirmed with more heap also.
>> 
>> Heap memory usage via JMX and jconsole -> 
>> https://drive.google.com/file/d/1Zo3B_xFsrrt-WRaxW-0A0QMXDNscXYih/view
>> ?usp=sharing As you can see, it starts of normal, but then goes crazy 
>> and it has been like this over night.
>> 
>> This is overall monitoring graphs, as you can see CPU is working hard 
>> or hardly working. -> 
>> https://drive.google.com/file/d/1_Gtz-Bi7LUrj8UZvKfmNMr-8gF_lM2Ra/view
>> ?usp=sharing VM summary can be found here -> 
>> https://drive.google.com/file/d/1FvdCz0N5pFG1fmX_5OQ2855MVkaL048w/view
>> ?usp=sharing And finally to have better and quick overview of the SOLR 
>> executing parameters that I have -> 
>> https://drive.google.com/file/d/10VCtYDxflJcvb1aOoxt0u3Nb5JzTjrAI/view

Re: Tangent: old Solr versions

2020-10-28 Thread Walter Underwood
Chegg is running a 4.10.2 master/slave cluster for textbook search and several
other collections.

1. None of the features past 4.x are needed.
2. We depend on the extended edismax (SOLR-629).
3. Ain’t broke.

We are moving our Solr Cloud clusters to 8.x, even though there are no
features we need that aren’t in 6.6.2. Moving the Solr 4 cluster is way at
the bottom of the list.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Oct 28, 2020, at 5:37 AM, Mark H. Wood  wrote:
> 
> On Tue, Oct 27, 2020 at 04:25:54PM -0500, Mike Drob wrote:
>> Based on the questions that we've seen over the past month on this list,
>> there are still users with Solr on 6, 7, and 8. I suspect there are still
>> Solr 5 users out there too, although they don't appear to be asking for
>> help - likely they are in set it and forget it mode.
> 
> Oh, there are quite a few instances of Solr 4 out there as well.  Many
> of them will be moving to v7 or v8, probably starting in the next 6-12
> months.
> 
> -- 
> Mark H. Wood
> Lead Technology Analyst
> 
> University Library
> Indiana University - Purdue University Indianapolis
> 755 W. Michigan Street
> Indianapolis, IN 46202
> 317-274-0749
> www.ulib.iupui.edu



RE: SOLR uses too much CPU and GC is also weird on Windows server

2020-10-28 Thread Jaan Arjasepp
Hi all,

It's me again. I did a little research and we tried different things; here are 
some questions I want to ask and some things that I found.

After monitoring my system with VisualVM, I found that GC heap usage jumps 
between 0.5GB and 2.5GB, and the JVM now has 4GB of memory, so it should not 
be an issue anymore, or should it? I will keep observing it, as it might still 
rise a bit.

The next thing we found, or are thinking about, is that writing to disk might 
be an issue. We turned off indexing and some other things, but I would say it 
did not help much.
I also went through all the schema fields; there are not that many, really. 
They all have docValues=true. They are all automatically generated, so there 
was no manual work except for one field, and that one also has docValues=true. 
Just curious: if a field is not a string/text field, can it be docValues=false, 
or is it still better to have it true? As for uninversion, we are not using 
facets or other specific query features much, just simple queries.

I must say we are updating documents quite a lot, but I am not sure that 
explains such high CPU usage. The older version did not seem to use so much 
CPU...

I am running out of ideas and hoping that this will continue to work, but I 
don't like the CPU usage even overnight, when nobody uses the system. We will 
try to figure out the issue, and I hope I can ask more questions when in doubt 
or out of ideas. I must also admit that Solr is really new to me personally.

Jaan

-Original Message-
From: Walter Underwood  
Sent: 27 October 2020 18:44
To: solr-user@lucene.apache.org
Subject: Re: SOLR uses too much CPU and GC is also weird on Windows server

That first graph shows a JVM that does not have enough heap for the program it 
is running. Look at the bottom of the dips. That is the amount of memory still 
in use after a full GC.

You want those dips to drop to about half of the available heap, so I’d 
immediately increase that heap to 4G. That might not be enough, so you’ll need 
to to watch that graph after the increase.

I’ve been using 8G heaps with Solr since version 1.2. We run this config with 
Java 8 on over 100 machines. We do not do any faceting, which can take more 
memory.

SOLR_HEAP=8g
# Use G1 GC  -- wunder 2017-01-23
# Settings from https://wiki.apache.org/solr/ShawnHeisey
GC_TUNE=" \
-XX:+UseG1GC \
-XX:+ParallelRefProcEnabled \
-XX:G1HeapRegionSize=8m \
-XX:MaxGCPauseMillis=200 \
-XX:+UseLargePages \
-XX:+AggressiveOpts \
"
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Oct 27, 2020, at 12:48 AM, Jaan Arjasepp  wrote:
> 
> Hello,
> 
> We have been using SOLR for quite some time. We used 6.0 and now we did a 
> little upgrade to our system and servers and we started to use 8.6.1.
> We use it on a Windows Server 2019.
> Java version is 11
> Basically using it in a default setting, except giving SOLR 2G of heap. It 
> used 512, but it ran out of memory and stopped responding. Not sure if it was 
> the issue. When older version, it managed fine with 512MB.
> SOLR is not in a cloud mode, but in solo mode as we use it internally and it 
> does not have too many request nor indexing actually.
> Document sizes are not big, I guess. We only use one core.
> Document stats are here:
> Num Docs: 3627341
> Max Doc: 4981019
> Heap Memory Usage: 434400
> Deleted Docs: 1353678
> Version: 15999036
> Segment Count: 30
> 
> The size of index is 2.66GB
> 
> While making upgrade we had to modify one field and a bit of code that uses 
> it. Thats basically it. It works.
> If needed more information about background of the system, I am happy to help.
> 
> 
> But now to the issue I am having.
> If SOLR is started, at first 40-60 minutes it works just fine. CPU is not 
> high, heap usage seem normal. All is good, but then suddenly, the heap usage 
> goes crazy, going up and down, up and down and CPU rises to 50-60% of the 
> usage. Also I noticed over the weekend, when there are no writing usage, the 
> CPU remains low and decent. I can try it this weekend again to see if and how 
> this works out.
> Also it seems to me, that after 4-5 days of working like this, it stops 
> responding, but needs to be confirmed with more heap also.
> 
> Heap memory usage via JMX and jconsole -> 
> https://drive.google.com/file/d/1Zo3B_xFsrrt-WRaxW-0A0QMXDNscXYih/view
> ?usp=sharing As you can see, it starts of normal, but then goes crazy 
> and it has been like this over night.
> 
> This is overall monitoring graphs, as you can see CPU is working hard 
> or hardly working. -> 
> https://drive.google.com/file/d/1_Gtz-Bi7LUrj8UZvKfmNMr-8gF_lM2Ra/view
> ?usp=sharing VM summary can be found here -> 
> https://drive.google.com/file/d/1FvdCz0N5pFG1fmX_5OQ2855MVkaL048w/view
> ?usp=sharing And finally to have better and quick overview of the SOLR 
> executing parameters that I have -> 
> https://drive.google.com/file/d/10VCtYDxflJcvb1aOoxt0u3Nb5JzTjrAI/view
> 

Re: Simulate facet.exists for json query facets

2020-10-28 Thread Erick Erickson
This really sounds like an XY problem. The whole point of facets is
to count the number of documents that have a value in some
number of buckets. So trying to stop your facet query as soon
as it matches a hit for the first time seems like an odd thing to do.

So what’s the “X”? In other words, what is the problem you’re trying
to solve at a high level? Perhaps there’s a better way to figure this
out.

Best,
Erick

> On Oct 28, 2020, at 3:48 AM, michael dürr  wrote:
> 
> Hi,
> 
> I use json facets of type 'query'. As these queries are pretty slow and I'm
> only interested in whether there is a match or not, I'd like to restrict
> the query execution similar to the standard facetting (like with the
> facet.exists parameter). My simplified query looks something like this (in
> reality *:* may be replaced by a complex edismax query and multiple
> subfacets similar to "tour" occur):
> 
> curl http://localhost:8983/solr/portal/select -d \
> "q=*:*\
> &json.facet={
>  tour:{
>    type : query,
>    q: \"+(+categoryId:6000 -categoryId:(6061 21493 8510))\"
>  }
> }\
> &rows=0"
> 
> Is there any possibility to modify my request to ensure that the facet
> query stops as soon as it matches a hit for the first time?
> 
> Thanks!
> Michael



Tangent: old Solr versions

2020-10-28 Thread Mark H. Wood
On Tue, Oct 27, 2020 at 04:25:54PM -0500, Mike Drob wrote:
> Based on the questions that we've seen over the past month on this list,
> there are still users with Solr on 6, 7, and 8. I suspect there are still
> Solr 5 users out there too, although they don't appear to be asking for
> help - likely they are in set it and forget it mode.

Oh, there are quite a few instances of Solr 4 out there as well.  Many
of them will be moving to v7 or v8, probably starting in the next 6-12
months.

-- 
Mark H. Wood
Lead Technology Analyst

University Library
Indiana University - Purdue University Indianapolis
755 W. Michigan Street
Indianapolis, IN 46202
317-274-0749
www.ulib.iupui.edu




Simulate facet.exists for json query facets

2020-10-28 Thread michael dürr
Hi,

I use json facets of type 'query'. As these queries are pretty slow and I'm
only interested in whether there is a match or not, I'd like to restrict
the query execution in the same way standard faceting allows (with the
facet.exists parameter). My simplified query looks something like this (in
reality *:* may be replaced by a complex edismax query, and multiple
subfacets similar to "tour" may occur):

curl http://localhost:8983/solr/portal/select -d \
"q=*:*\
&json.facet={
  tour:{
    type : query,
    q: \"+(+categoryId:6000 -categoryId:(6061 21493 8510))\"
  }
}\
&rows=0"

Is there any possibility to modify my request to ensure that the facet
query stops as soon as it matches a hit for the first time?

Thanks!
Michael
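
One workaround in the meantime, since only existence matters: issue each facet 
query as its own cheap request with rows=0 and check numFound > 0. It does not 
terminate early either, but it skips the faceting machinery entirely. A sketch:

curl http://localhost:8983/solr/portal/select --data-urlencode \
  'q=+(+categoryId:6000 -categoryId:(6061 21493 8510))' -d 'rows=0'
# a match exists iff response.numFound > 0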