Master Slave Terminology

2020-06-16 Thread Kayak28
Hello, Community:

As GitHub and Python are replacing terminology related to slavery,
why don't we replace master-slave terminology for Solr as well?

https://developers.srad.jp/story/18/09/14/0935201/
https://developer-tech.com/news/2020/jun/15/github-replace-slavery-terms-master-whitelist/

-- 

Sincerely,
Kaya
github: https://github.com/28kayak


Re: Solr 7.6 optimize index size increase

2020-06-16 Thread Erick Erickson
It Depends (tm).

As of Solr 7.5, optimize is different. See: 
https://lucidworks.com/post/solr-and-optimizing-your-index-take-ii/

So, assuming you have _not_ specified maxSegments=1, any very large
segment (near 5G) that has _zero_ deleted documents won’t be merged.
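
For reference, a forced merge is kicked off through the update handler; a
minimal example (host and collection name are placeholders):

curl 'http://localhost:8983/solr/mycollection/update?optimize=true'
# or, the old-style single-segment optimize:
curl 'http://localhost:8983/solr/mycollection/update?optimize=true&maxSegments=1'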

So there are two scenarios:

1> What Walter mentioned. The optimize process runs out of disk space
 and leaves lots of crud around

2> your “older segments” are just max-sized segments with zero deletions.


All that said… do you have demonstrable performance improvements after
optimizing? The very name “optimize” is misleading; of course, who
wouldn’t want an optimized index? In earlier versions of Solr (i.e. 4x),
it made quite a difference. In more recent Solr releases, it’s not as
clear-cut. So before worrying about making optimize work, I’d recommend that
you do some performance tests on optimized and un-optimized indexes. 
If there are significant improvements, that’s one thing. Otherwise, it’s
a waste.

Best,
Erick

> On Jun 16, 2020, at 5:36 PM, Walter Underwood  wrote:
> 
> For a full forced merge (mistakenly named “optimize”), the worst case disk space
> is 3X the size of the index. It is common to need 2X the size of the index.
> 
> When I worked on Ultraseek Server 20+ years ago, it had the same merge behavior.
> I implemented a disk space check that would refuse to merge if there wasn’t enough
> free space. It would log an error and send an email to the admin.
> 
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
> 
>> On Jun 16, 2020, at 1:58 PM, David Hastings  
>> wrote:
>> 
>> I can't give you a 100% definitive answer, but I've experienced this, and what
>> "seemed" to happen to me was that the optimize would start, and that will
>> drive the size up threefold, and if you run out of disk space in the process
>> the optimize will quit, since it can't optimize, and leave the live index
>> pieces intact, so now you have the "current" index as well as the
>> "optimized" fragments.
>> 
>> I can't say for certain that's what you ran into, but we found that if you
>> get an expanding disk it will keep growing and prevent this from happening;
>> then the index will contract and the disk will shrink back to only what it
>> needs.  That saved me a lot of headaches, not needing to ever worry about disk
>> space.
>> 
>> On Tue, Jun 16, 2020 at 4:43 PM Raveendra Yerraguntla
>>  wrote:
>> 
>>> 
>>> When the optimize command is issued, the expectation after the optimization
>>> process completes is that the index size either decreases or at most remains
>>> the same. In a Solr 7.6 cluster with 50-plus shards, when the optimize command
>>> is issued, some of the shards' transient or older segment files are not
>>> deleted. This is happening randomly across all shards. When unnoticed, these
>>> transient files fill the disk. Currently it is handled through monitors,
>>> but the question is what is causing the transient/older files to remain there.
>>> Are there any specific race conditions which leave the older files
>>> undeleted?
>>> Any pointers around this will be helpful.
>>> TIA
> 



Re: Solr 7.6 optimize index size increase

2020-06-16 Thread Walter Underwood
For a full forced merge (mistakenly named “optimize”), the worst case disk space
is 3X the size of the index. It is common to need 2X the size of the index.

When I worked on Ultraseek Server 20+ years ago, it had the same merge behavior.
I implemented a disk space check that would refuse to merge if there wasn’t enough
free space. It would log an error and send an email to the admin.
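
A check along those lines can be scripted outside Solr too. A rough sketch, where
the index path, collection name, and the 3X threshold are all assumptions (GNU
du/df):

INDEX_DIR=/var/solr/data/mycollection/data/index
NEEDED=$(( $(du -sb "$INDEX_DIR" | cut -f1) * 3 ))
FREE=$(df --output=avail -B1 "$INDEX_DIR" | tail -1)
if [ "$FREE" -ge "$NEEDED" ]; then
  # enough headroom for the worst case, go ahead with the forced merge
  curl 'http://localhost:8983/solr/mycollection/update?optimize=true'
else
  echo "Not enough free disk space for a forced merge" >&2
fi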

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Jun 16, 2020, at 1:58 PM, David Hastings  
> wrote:
> 
> I can't give you a 100% definitive answer, but I've experienced this, and what
> "seemed" to happen to me was that the optimize would start, and that will
> drive the size up threefold, and if you run out of disk space in the process
> the optimize will quit, since it can't optimize, and leave the live index
> pieces intact, so now you have the "current" index as well as the
> "optimized" fragments.
> 
> I can't say for certain that's what you ran into, but we found that if you
> get an expanding disk it will keep growing and prevent this from happening;
> then the index will contract and the disk will shrink back to only what it
> needs.  That saved me a lot of headaches, not needing to ever worry about disk
> space.
> 
> On Tue, Jun 16, 2020 at 4:43 PM Raveendra Yerraguntla
>  wrote:
> 
>> 
>> When the optimize command is issued, the expectation after the optimization
>> process completes is that the index size either decreases or at most remains
>> the same. In a Solr 7.6 cluster with 50-plus shards, when the optimize command
>> is issued, some of the shards' transient or older segment files are not
>> deleted. This is happening randomly across all shards. When unnoticed, these
>> transient files fill the disk. Currently it is handled through monitors,
>> but the question is what is causing the transient/older files to remain there.
>> Are there any specific race conditions which leave the older files
>> undeleted?
>> Any pointers around this will be helpful.
>> TIA



Re: Solr 7.6 optimize index size increase

2020-06-16 Thread David Hastings
I can't give you a 100% definitive answer, but I've experienced this, and what
"seemed" to happen to me was that the optimize would start, and that will
drive the size up threefold, and if you run out of disk space in the process
the optimize will quit, since it can't optimize, and leave the live index
pieces intact, so now you have the "current" index as well as the
"optimized" fragments.

I can't say for certain that's what you ran into, but we found that if you
get an expanding disk it will keep growing and prevent this from happening;
then the index will contract and the disk will shrink back to only what it
needs.  That saved me a lot of headaches, not needing to ever worry about disk
space.

On Tue, Jun 16, 2020 at 4:43 PM Raveendra Yerraguntla
 wrote:

>
> When the optimize command is issued, the expectation after the optimization
> process completes is that the index size either decreases or at most remains
> the same. In a Solr 7.6 cluster with 50-plus shards, when the optimize command
> is issued, some of the shards' transient or older segment files are not
> deleted. This is happening randomly across all shards. When unnoticed, these
> transient files fill the disk. Currently it is handled through monitors,
> but the question is what is causing the transient/older files to remain there.
> Are there any specific race conditions which leave the older files
> undeleted?
> Any pointers around this will be helpful.
>  TIA


Re: Order of spellcheck suggestions

2020-06-16 Thread Thomas Corthals
Can anybody shed some light on this? If not, I'm going to report it as a
bug in JIRA.

Thomas

On Sat, 13 Jun 2020 at 13:37, Thomas Corthals  wrote:

> Hi
>
> I'm seeing different ordering on the spellcheck suggestions in cloud mode
> when using spellcheck.extendedResults=false vs.
> spellcheck.extendedResults=true.
>
> Solr 8.5.2 in cloud mode with 2 nodes, 1 collection with numShards = 2 &
> replicationFactor = 1, techproducts configset and example data:
>
> $ curl '
> http://localhost:8983/solr/techproducts/spell?q=power%20cort&spellcheck.extendedResults=false
> '
>
> "suggestion":["cord", "corp", "card"]}],
>
> $ curl '
> http://localhost:8983/solr/techproducts/spell?q=power%20cort&spellcheck.extendedResults=true
> '
>
> "suggestion":[{ "word":"corp", "freq":2}, { "word":"cord", "freq":1}, {
> "word":"card", "freq":4}]}],
>
> The correct order should be "corp" (LD: 1, freq: 2), "cord" (LD: 1, freq:
> 1) , "card" (LD: 2, freq: 4). In standalone mode, I get "corp", "cord",
> "card" with extendedResults true or false.
>
> The results are the same for the /spell and /browse request handlers in
> that configset. I've put all combinations side by side in this spreadsheet:
> https://docs.google.com/spreadsheets/d/1ym44TlbomXMCeoYpi_eOBmv6-mZHCZ0nhsVDB_dDavM/edit?usp=sharing
>
> Is it something in the configuration? Or a bug?
>
> Thomas
>


Solr 7.6 optimize index size increase

2020-06-16 Thread Raveendra Yerraguntla

When the optimize command is issued, the expectation after the optimization 
process completes is that the index size either decreases or at most remains the 
same. In a Solr 7.6 cluster with 50-plus shards, when the optimize command is issued, 
some of the shards' transient or older segment files are not deleted. This is 
happening randomly across all shards. When unnoticed, these transient files 
fill the disk. Currently it is handled through monitors, but the question is what 
is causing the transient/older files to remain there. Are there any specific race 
conditions which leave the older files undeleted?
Any pointers around this will be helpful.
 TIA

Re: Proxy Error when cluster went down

2020-06-16 Thread Vishal Vaibhav
So the entire cluster was down. I'm trying to bring it back node by node. I restarted
the first node. Solr comes up, but the add-replica command fails. I then
checked the clusterstatus API: it showed the shard in the active state, but no
core as active (i.e. all down), and one live node, which was the one that I
restarted. Also, this all connects to one ZooKeeper ensemble of 3 nodes.
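
For reference, this is roughly how I'm checking the status (default host/port,
"rules" is the collection):

curl 'http://localhost:8983/solr/admin/collections?action=CLUSTERSTATUS&collection=rules'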

On Tue, 16 Jun 2020 at 11:20 PM, Jörn Franke  wrote:

> Do you have another host with replica alive or are all replicas on the
> host that is down?
>
> Are all SolrCloud hosts in the same ZooKeeper?
>
> > On 16.06.2020 at 19:29, Vishal Vaibhav  wrote:
> >
> > Hi, thanks. My Solr is running in Kubernetes, so the hostname goes away when
> > the pod goes:
> > search-rules-solr-v1-2.search-rules-solr-v1.search-digital.svc.cluster.local
> > So in my case the pod with this host has gone, and the hostname
> > search-rules-solr-v1-2.search-rules-solr-v1.search-digital.svc.cluster.local
> > no longer exists. Shouldn't SolrCloud be aware of the fact that all
> > the replicas on that Solr host are down, and not proxy the request to
> > that node?
> >
> >> On Tue, 16 Jun 2020 at 5:06 PM, Shawn Heisey 
> wrote:
> >>
> >>> On 6/15/2020 9:04 PM, Vishal Vaibhav wrote:
> >>> I am running on solr 8.5. For some reason entire cluster went down.
> When
> >> i
> >>> am trying to bring up the nodes,its not coming up. My health check is
> >>> on "/solr/rules/admin/system". I tried forcing a leader election but it
> >>> dint help.
> >>> so when i run the following commands. Why is it trying to proxy when
> >> those
> >>> nodes are down. Am i missing something?
> >>
> >> 
> >>
> >>> java.net.UnknownHostException:
> >>>
> >>
> search-rules-solr-v1-2.search-rules-solr-v1.search-digital.svc.cluster.local:
> >>
> >> It is trying to proxy because it's SolrCloud.  SolrCloud has an internal
> >> load balancer that spreads queries across multiple replicas when
> >> possible.  Your cluster must be aware of multiple servers where the
> >> "rules" collection can be queried.
> >>
> >> The underlying problem behind this error message is that the following
> >> hostname is being looked up, and it doesn't exist:
> >>
> >>
> >>
> search-rules-solr-v1-2.search-rules-solr-v1.search-digital.svc.cluster.local
> >>
> >> This hostname is most likely coming from /etc/hosts on one of your
> >> systems when that system starts Solr and it registers with the cluster,
> >> and that /etc/hosts file is the ONLY place that the hostname exists, so
> >> when SolrCloud tries to forward the request to that server, it is
> failing.
> >>
> >> Thanks,
> >> Shawn
> >>
>


Re: Proxy Error when cluster went down

2020-06-16 Thread Jörn Franke
Do you have another host with replica alive or are all replicas on the host 
that is down?

Are all SolrCloud hosts in the same ZooKeeper?

> On 16.06.2020 at 19:29, Vishal Vaibhav  wrote:
> 
> Hi, thanks. My Solr is running in Kubernetes, so the hostname goes away when
> the pod goes:
> search-rules-solr-v1-2.search-rules-solr-v1.search-digital.svc.cluster.local
> So in my case the pod with this host has gone, and the hostname
> search-rules-solr-v1-2.search-rules-solr-v1.search-digital.svc.cluster.local
> no longer exists. Shouldn't SolrCloud be aware of the fact that all
> the replicas on that Solr host are down, and not proxy the request to
> that node?
> 
>> On Tue, 16 Jun 2020 at 5:06 PM, Shawn Heisey  wrote:
>> 
>>> On 6/15/2020 9:04 PM, Vishal Vaibhav wrote:
>>> I am running on solr 8.5. For some reason entire cluster went down. When
>> i
>>> am trying to bring up the nodes,its not coming up. My health check is
>>> on "/solr/rules/admin/system". I tried forcing a leader election but it
>>> dint help.
>>> so when i run the following commands. Why is it trying to proxy when
>> those
>>> nodes are down. Am i missing something?
>> 
>> 
>> 
>>> java.net.UnknownHostException:
>>> 
>> search-rules-solr-v1-2.search-rules-solr-v1.search-digital.svc.cluster.local:
>> 
>> It is trying to proxy because it's SolrCloud.  SolrCloud has an internal
>> load balancer that spreads queries across multiple replicas when
>> possible.  Your cluster must be aware of multiple servers where the
>> "rules" collection can be queried.
>> 
>> The underlying problem behind this error message is that the following
>> hostname is being looked up, and it doesn't exist:
>> 
>> 
>> search-rules-solr-v1-2.search-rules-solr-v1.search-digital.svc.cluster.local
>> 
>> This hostname is most likely coming from /etc/hosts on one of your
>> systems when that system starts Solr and it registers with the cluster,
>> and that /etc/hosts file is the ONLY place that the hostname exists, so
>> when SolrCloud tries to forward the request to that server, it is failing.
>> 
>> Thanks,
>> Shawn
>> 


Re: Proxy Error when cluster went down

2020-06-16 Thread Vishal Vaibhav
Hi, thanks. My Solr is running in Kubernetes, so the hostname goes away when
the pod goes:
search-rules-solr-v1-2.search-rules-solr-v1.search-digital.svc.cluster.local
So in my case the pod with this host has gone, and the hostname
search-rules-solr-v1-2.search-rules-solr-v1.search-digital.svc.cluster.local
no longer exists. Shouldn't SolrCloud be aware of the fact that all
the replicas on that Solr host are down, and not proxy the request to
that node?

On Tue, 16 Jun 2020 at 5:06 PM, Shawn Heisey  wrote:

> On 6/15/2020 9:04 PM, Vishal Vaibhav wrote:
> > I am running on solr 8.5. For some reason entire cluster went down. When
> i
> > am trying to bring up the nodes,its not coming up. My health check is
> > on "/solr/rules/admin/system". I tried forcing a leader election but it
> > dint help.
> > so when i run the following commands. Why is it trying to proxy when
> those
> > nodes are down. Am i missing something?
>
> 
>
> > java.net.UnknownHostException:
> >
> search-rules-solr-v1-2.search-rules-solr-v1.search-digital.svc.cluster.local:
>
> It is trying to proxy because it's SolrCloud.  SolrCloud has an internal
> load balancer that spreads queries across multiple replicas when
> possible.  Your cluster must be aware of multiple servers where the
> "rules" collection can be queried.
>
> The underlying problem behind this error message is that the following
> hostname is being looked up, and it doesn't exist:
>
>
> search-rules-solr-v1-2.search-rules-solr-v1.search-digital.svc.cluster.local
>
> This hostname is most likely coming from /etc/hosts on one of your
> systems when that system starts Solr and it registers with the cluster,
> and that /etc/hosts file is the ONLY place that the hostname exists, so
> when SolrCloud tries to forward the request to that server, it is failing.
>
> Thanks,
> Shawn
>


Re: How to determine why solr stops running?

2020-06-16 Thread David Hastings
Me personally, around 290GB; as much as we could shove into them.

On Tue, Jun 16, 2020 at 12:44 PM Erick Erickson 
wrote:

> How much physical RAM? A rule of thumb is that you should allocate no more
> than 25-50 percent of the total physical RAM to Solr. That's cumulative,
> i.e. the sum of the heap allocations across all your JVMs should be below
> that percentage. See Uwe Schindler's mmapdirectory blog...
>
> Shot in the dark...
>
> On Tue, Jun 16, 2020, 11:51 David Hastings 
> wrote:
>
> > To add to this, I generally have Solr start with this:
> > -Xms31000m -Xmx31000m
> >
> > and the only other thing that runs on them are MariaDB Galera cluster
> > nodes that are not in use (aside from replication).
> >
> > The 31GB is not an accident either; you don't want 32GB.
> >
> >
> > On Tue, Jun 16, 2020 at 11:26 AM Shawn Heisey 
> wrote:
> >
> > > On 6/11/2020 11:52 AM, Ryan W wrote:
> > > >> I will check "dmesg" first, to find out any hardware error message.
> > >
> > > 
> > >
> > > > [1521232.781801] Out of memory: Kill process 117529 (httpd) score 9
> or
> > > > sacrifice child
> > > > [1521232.782908] Killed process 117529 (httpd), UID 48,
> > > total-vm:675824kB,
> > > > anon-rss:181844kB, file-rss:0kB, shmem-rss:0kB
> > > >
> > > > Is this a relevant "Out of memory" message?  Does this suggest an OOM
> > > > situation is the culprit?
> > >
> > > Because this was in the "dmesg" output, it indicates that it is the
> > > operating system killing programs because the *system* doesn't have any
> > > memory left.  It wasn't Java that did this, and it wasn't Solr that was
> > > killed.  It very well could have been Solr that was killed at another
> > > time, though.
> > >
> > > The process that it killed this time is named httpd ... which is most
> > > likely the Apache webserver.  Because the UID is 48, this is probably
> an
> > > OS derived from Redhat, where the "apache" user has UID and GID 48 by
> > > default.  Apache with its default config can be VERY memory hungry when
> > > it gets busy.
> > >
> > > > -XX:InitialHeapSize=536870912 -XX:MaxHeapSize=536870912
> > >
> > > This says that you started Solr with the default 512MB heap.  Which is
> > > VERY VERY small.  The default is small so that Solr will start on
> > > virtually any hardware.  Almost every user must increase the heap size.
> > > And because the OS is killing processes, it is likely that the system
> > > does not have enough memory installed for what you have running on it.
> > >
> > > It is generally not a good idea to share the server hardware between
> > > Solr and other software, unless the system has a lot of spare
> resources,
> > > memory in particular.
> > >
> > > Thanks,
> > > Shawn
> > >
> >
>


Re: Facet Performance

2020-06-16 Thread Erick Erickson
OK, I see the disconnect... Necessary parts of the index are read from disk
lazily. So your newSearcher or firstSearcher query needs to do whatever
operation causes the relevant parts of the index to be read. In this case,
probably just facet on all the fields you care about. I'd add sorting too
if you sort on different fields.

The *:* query without facets or sorting does virtually nothing due to some
special handling...
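
Something like the following, fired once after the new searcher opens (or
registered as a newSearcher query in solrconfig.xml), should pull the relevant
docValues into memory. Host and collection name are placeholders; the field
names are from your examples:

curl 'http://localhost:8983/solr/mycollection/select?q=*:*&rows=0&facet=true&facet.field=D_DepartureAirport&facet.field=D_Destination&facet.limit=-1'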

On Tue, Jun 16, 2020, 10:48 James Bodkin 
wrote:

> I've been trying to build a query that I can use in newSearcher based off
> the information in your previous e-mail. I thought you meant to build a *:*
> query as per Query 1 in my previous e-mail but I'm still seeing the
> first-hit execution.
> Now I'm wondering if you meant to create a *:* query with each of the
> fields as part of the fl query parameters or a *:* query with each of the
> fields and values as part of the fq query parameters.
>
> At the moment I've been running these manually as I expected that I would
> see the first-execution penalty disappear by the time I got to query 4, as
> I thought this would replicate the actions of the newSearcher.
> Unfortunately we can't use the autowarm count that is available as part of
> the queryResultCache/filterCache due to the custom deployment mechanism we use
> to update our index.
>
> Kind Regards,
>
> James Bodkin
>
> On 16/06/2020, 15:30, "Erick Erickson"  wrote:
>
> Did you try the autowarming like I mentioned in my previous e-mail?
>
> > On Jun 16, 2020, at 10:18 AM, James Bodkin <
> james.bod...@loveholidays.com> wrote:
> >
> > We've changed the schema to enable docValues for these fields and
> this led to an improvement in the response time. We found a further
> improvement by also switching off indexed as these fields are used for
> faceting and filtering only.
> > Since those changes, we've found that the first-execution for
> queries is really noticeable. I thought this would be the filterCache based
> on what I saw in NewRelic however it is probably trying to read the
> docValues from disk. How can we use the autowarming to improve this?
> >
> > For example, I've run the following queries in sequence and each
> query has a first-execution penalty.
> >
> > Query 1:
> >
> > q=*:*
> > facet=true
> > facet.field=D_DepartureAirport
> > facet.field=D_Destination
> > facet.limit=-1
> > rows=0
> >
> > Query 2:
> >
> > q=*:*
> > fq=D_DepartureAirport:(2660)
> > facet=true
> > facet.field=D_Destination
> > facet.limit=-1
> > rows=0
> >
> > Query 3:
> >
> > q=*:*
> > fq=D_DepartureAirport:(2661)
> > facet=true
> > facet.field=D_Destination
> > facet.limit=-1
> > rows=0
> >
> > Query 4:
> >
> > q=*:*
> > fq=D_DepartureAirport:(2660+OR+2661)
> > facet=true
> > facet.field=D_Destination
> > facet.limit=-1
> > rows=0
> >
> > We've kept the field type as a string, as the value is mapped by
> application that accesses Solr. In the examples above, the values are
> mapped to airports and destinations.
> > Is it possible to prewarm the above queries without having to define
> all the potential filters manually in the auto warming?
> >
> > At the moment, we update and optimise our index in a different
> environment and then copy the index to our production instances by using a
> rolling deployment in Kubernetes.
> >
> > Kind Regards,
> >
> > James Bodkin
> >
> > On 12/06/2020, 18:58, "Erick Erickson" 
> wrote:
> >
>    I question whether filterCache has anything to do with it, I
> suspect what’s really happening is that first time you’re reading the
> relevant bits from disk into memory. And to double check you should have
> docValues enabled for all these fields. The “uninverting” process can be
> very expensive, and docValues bypasses that.
> >
> >As of Solr 7.6, you can define “uninvertible=true” to your
> field(Type) to “fail fast” if Solr needs to uninvert the field.
> >
> >But that’s an aside. In either case, my claim is that first-time
> execution does “something”, either reads the serialized docValues from disk
> or uninverts the file on Solr’s heap.
> >
> >You can have this autowarmed by any combination of
> >1> specifying an autowarm count on your queryResultCache. That’s
> hit or miss, as it replays the most recent N queries which may or may not
> contain the sorts. That said, specifying 10-20 for autowarm count is
> usually a good idea, assuming you’re not committing more than, say, every
> 30 seconds. I’d add the same to filterCache too.
> >
> >2> specifying a newSearcher or firstSearcher query in
> solrconfig.xml. The difference is that newSearcher is fired every time a
> commit happens, while firstSearcher is only fired when Solr starts, the
> theory being that there’s no cache autowarming available 

Re: Unified highlighter- unable to get results - can get results with original and termvector highlighters

2020-06-16 Thread Warren, David [USA]
David –

It’s fine to take this conversation back to the mailing list.  Thank you very 
much again for your suggestions.

I think you are correct.  It doesn’t appear necessary to set termOffsets, and 
it appears that the unified highlighter is using the TERM_VECTORS offset 
source if I don’t tell it to do otherwise.  When I run the query with 
hl.offsetSource=ANALYSIS, I get highlighted results returned.  When I run the 
query with hl.offsetSource=TERM_VECTORS or omit hl.offsetSource, I get the same 
result – no text returned in the highlighting section of the search result.

Thanks as well for the suggestion about moving clauses to fq and using a 
simpler query in hl.q.  That helps.
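
For anyone following along, the combination that works for me looks roughly like
this (host and core name are placeholders, and the query is simplified; the
highlighting parameters are the relevant part):

curl 'http://localhost:8983/solr/mycore/select?q=text:zelda+OR+il_title:zelda&fq=collection:xml_products&hl=true&hl.method=unified&hl.fl=text&hl.q=text:zelda+OR+il_title:zelda&hl.offsetSource=ANALYSIS'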

-Dave Warren


From: David Smiley 
Date: Tuesday, June 16, 2020 at 12:21 AM
To: David Warren 
Subject: Re: [External] Fwd: Unified highlighter- unable to get results - can 
get results with original and termvector highlighters

Hi Dave,

Thanks for providing more information.  Is it alright to take this conversation 
back to the list or is that query/debug info sensitive?

With default hl.weightMatches=true:
Try setting hl.q (instead of defaulting to q) and set it to be the 
highlightable portion of your query -- i.e. strip out all that boosting.  For 
example, (text:zelda OR il_title:zelda) AND collection:xml_products  Does that 
help?  I suggest this because I see QParser is Lucene.  If you used edismax 
with edismax's boosting params, then this QParser is able to tell the 
highlighter the primary part of the query without boosting.  For advanced cases 
like yours, that's perhaps not possible.  BTW the collection:xml_products part 
of the query looks to me like it would better belong as a filter query (fq 
param).

I don't believe it was _necessary_ to set termOffsets; that's merely a 
performance trade-off.  If you set hl.offsetSource=ANALYSIS (thus ignoring 
termOffsets), I believe you should get the same results.  Can you confirm it's 
the same or is it still different?  If different then I'll look closer; I have 
a theory on how it could be different.

p.s. I'm quite busy but will return to this at some point.

~ David


On Thu, Jun 11, 2020 at 4:01 PM Warren, David [USA] 
mailto:warren_da...@bah.com>> wrote:
David –

Thank you very much for the response to my Solr highlighting question.  Due to 
competing priorities, I wasn’t able to investigate this further before today.  
But, now that I have…
Based on your advice, I got the unified highlighter to work by setting 
hl.weightMatches=false.  The field I was highlighting wasn’t configured to 
store termOffsets, so I had to set termOffsets=true and re-index to get this to 
work.  I still don’t get any results with the unified highlighter when 
hl.weightMatches=true.

You asked about running with debug=query, Results of that are below.  Also, 
here’s the configuration for the il_title and text fields



debug
rawquerystring
"({!boost b=recip(ms(NOW/HOUR,il_pubdate),3.16e-11,1,1)}text:zelda AND 
collection: xml_products) OR {!boost b=2 v=\"il_title:zelda AND collection: 
xml_products\"}"
querystring
"({!boost b=recip(ms(NOW/HOUR,il_pubdate),3.16e-11,1,1)}text:zelda AND 
collection: xml_products) OR {!boost b=2 v=\"il_title:zelda AND collection: 
xml_products\"}"
parsedquery
"(+FunctionScoreQuery(FunctionScoreQuery(text:zelda, scored by 
boost(1.0/(3.16E-11*float(ms(const(159190200),date(il_pubdate)))+1.0 
+collection:xml_products) FunctionScoreQuery(FunctionScoreQuery(+il_title:zelda 
+collection:xml_products, scored by boost(const(2"
parsedquery_toString
"(+FunctionScoreQuery(text:zelda, scored by 
boost(1.0/(3.16E-11*float(ms(const(159190200),date(il_pubdate)))+1.0))) 
+collection:xml_products) FunctionScoreQuery(+il_title:zelda 
+collection:xml_products, scored by boost(const(2)))"
QParser
"LuceneQParser"

-Dave Warren

From: David Smiley mailto:david.w.smi...@gmail.com>>
Date: Saturday, May 30, 2020 at 11:24 PM
To: David Warren mailto:warren_da...@bah.com>>
Subject: [External] Fwd: Unified highlighter- unable to get results - can get 
results with original and termvector highlighters


-- Forwarded message -
From: David Smiley mailto:david.w.smi...@gmail.com>>
Date: Fri, May 22, 2020 at 11:43 AM
Subject: Re: Unified highlighter- unable to get results - can get results with 
original and termvector highlighters
To: solr-user mailto:solr-user@lucene.apache.org>>

Hello,

Did you get it to work eventually?

Try setting hl.weightMatches=false and see if that helps.  Whether this helps or 
not, I'd like to have a deeper understanding of the internal structure of the 
Query (not the original query string).  What query parser are you using?  If 
you pass debug=query to Solr then you'll get a parsed version of the query 
that would be helpful to me.

~ David


On Mon, May 11, 2020 at 10:46 AM Warren, David [USA] 
mailto:warren_da...@bah.com>> wrote:
I am running Solr 8.4 and am attempting to use its highlighting feature. It 
appears to work 

Re: How to determine why solr stops running?

2020-06-16 Thread Erick Erickson
How much physical RAM? A rule of thumb is that you should allocate no more
than 25-50 percent of the total physical RAM to Solr. That's cumulative,
i.e. the sum of the heap allocations across all your JVMs should be below
that percentage. See Uwe Schindler's mmapdirectory blog...
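
A quick way to sanity-check that on the box (plain Linux commands, nothing
Solr-specific):

# total physical RAM
free -h
# resident memory of every running JVM, to compare against that 25-50% budget
ps -eo rss,comm,args | grep '[j]ava'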

Shot in the dark...

On Tue, Jun 16, 2020, 11:51 David Hastings 
wrote:

> To add to this, I generally have Solr start with this:
> -Xms31000m -Xmx31000m
>
> and the only other thing that runs on them are MariaDB Galera cluster
> nodes that are not in use (aside from replication).
>
> The 31GB is not an accident either; you don't want 32GB.
>
>
> On Tue, Jun 16, 2020 at 11:26 AM Shawn Heisey  wrote:
>
> > On 6/11/2020 11:52 AM, Ryan W wrote:
> > >> I will check "dmesg" first, to find out any hardware error message.
> >
> > 
> >
> > > [1521232.781801] Out of memory: Kill process 117529 (httpd) score 9 or
> > > sacrifice child
> > > [1521232.782908] Killed process 117529 (httpd), UID 48,
> > total-vm:675824kB,
> > > anon-rss:181844kB, file-rss:0kB, shmem-rss:0kB
> > >
> > > Is this a relevant "Out of memory" message?  Does this suggest an OOM
> > > situation is the culprit?
> >
> > Because this was in the "dmesg" output, it indicates that it is the
> > operating system killing programs because the *system* doesn't have any
> > memory left.  It wasn't Java that did this, and it wasn't Solr that was
> > killed.  It very well could have been Solr that was killed at another
> > time, though.
> >
> > The process that it killed this time is named httpd ... which is most
> > likely the Apache webserver.  Because the UID is 48, this is probably an
> > OS derived from Redhat, where the "apache" user has UID and GID 48 by
> > default.  Apache with its default config can be VERY memory hungry when
> > it gets busy.
> >
> > > -XX:InitialHeapSize=536870912 -XX:MaxHeapSize=536870912
> >
> > This says that you started Solr with the default 512MB heap.  Which is
> > VERY VERY small.  The default is small so that Solr will start on
> > virtually any hardware.  Almost every user must increase the heap size.
> > And because the OS is killing processes, it is likely that the system
> > does not have enough memory installed for what you have running on it.
> >
> > It is generally not a good idea to share the server hardware between
> > Solr and other software, unless the system has a lot of spare resources,
> > memory in particular.
> >
> > Thanks,
> > Shawn
> >
>


Near-Realtime-Search, CommitWithin and AtomicUpdates

2020-06-16 Thread Mirko Sertic
Hi@all,
 
I'm using Solr 6.6 and trying to validate my setup for AtomicUpdates and
Near-Realtime-Search.
 
Some questions are bogging my mind, so maybe someone can give me a hint
to make things clearer.
 
I am posting regular updates to a collection using the UpdateHandler and
Solr Command Syntax, including updates and deletes. These changes are
committed using the commitWithin configuration every 30 seconds.
 
Now I want to use AtomicUpdates on MultiValue'd fields, so I post the
"add" commands for these fields only. Sometimes I have to post multiple
Solr commands affecting the same document, but within the same
commitWithin interval. The question is now, what is the final new value
of the field after the atomic update add operations? From my point of
view the final value should be the old value plus the newly added
values, which is committed to the index in the next commitWithin period.
So can I combine multiple AtomicUpdate commands affecting the same
document within the same commitWithin interval?
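
To make the question concrete, this is the kind of sequence I mean (collection,
field and id are made up; both requests fall inside the same 30-second
commitWithin window):

curl -X POST -H 'Content-Type: application/json' \
  'http://localhost:8983/solr/mycollection/update?commitWithin=30000' \
  -d '[{"id":"doc-1","tags":{"add":["red"]}}]'

curl -X POST -H 'Content-Type: application/json' \
  'http://localhost:8983/solr/mycollection/update?commitWithin=30000' \
  -d '[{"id":"doc-1","tags":{"add":["blue"]}}]'

# expectation: after the next commit, "tags" holds the old values plus "red" and "blue"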
 
Another thing that is bogging me: can I combine multiple AtomicUpdates
for the same document with CopyFields? Does Solr use some kind of
dirty-read or pending uncommitted changes to get the right value of the
source field, or is the source always the last committed value?
 
So in summary, does Solr AtomicUpdates use some kind of dirty-read
mechanism to do its "magic"?
 
Thanks in advance,
Mirko


Migration for total noob?

2020-06-16 Thread Hammer, Erich F
Disclaimer:  My background is Windows desktop and AD management and PowerShell. 
 I have no experience with Solr and only very limited experience with Java, so 
please be patient with me.  

I have inherited a Solr 7.2.1 setup (on Windows), and I'm trying to figure it 
out so that it can be migrated to a newer system and to Solr 8.5.2.  I feel 
like the documentation assumes an awful lot of prior knowledge that I'm clearly 
lacking, especially in how to upgrade versions of Solr.  As a "Windows guy" 
I'm used to binaries and configuration files in separate locations, and 
upgrading is generally an easy replacement of the binaries and (sometimes 
automated) adjustments to the config.  With Solr, it's all jumbled into the 
same folder structure, and I am trying to track down where all the important 
info is set.

The old setup appears to be a stand-alone system with 5 cores (some of which 
may be test/experiments) and what I believe are pretty small indexes and not 
using any configsets (although there are a few in there).  I compared the 
"Solr.in.cmd" files from the old to the default, new and adjusted as seemed 
fitting.  I was able to successfully start an empty Solr 8.5.2 and view the 
admin interface.

Then, I stopped the service on the old server (it's not a critical system) and 
copied the folders for the cores over to the new system.  When I start it up, 
one of the cores is running and I get errors on the other four.  Two each of:

Plugin init failure for [schema.xml] fieldType "textSuggest"   
Plugin init failure for [schema.xml] fieldType "textSpell"

I'm not having any luck finding information on how to resolve this.  Am I 
missing a plugin java library?  Where might I get it and/or load it?  Is there 
some config file I missed from some other location?

I appreciate any suggestions you can offer.

Erich



Re: Can't fetch table from cassandra through jdbc connection

2020-06-16 Thread Jason Gerlowski
The way I read the stack trace you provided, it looks like DIH is
running the query "select test_field from test_keyspace.test_table
limit 10", but the Cassandra jdbc driver is reporting that Cassandra
doesn't support some aspect of that query.  If I'm reading that right,
this seems like a question for the Cassandra folks who wrote that jdbc
driver instead of the Solr folks here.  Though maybe there's someone
here who happens to know.

The only thing I'd suggest to get more DIH logging would be to raise
the log levels for DIH classes, but from what you said above it sounds
like you already did that for the root logger and it didn't give you
anything that helped solve the issue.  So I'm stumped.
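
In case it helps, the level can also be raised for just the DIH package instead
of the root logger, e.g. via the Logging API (default port assumed):

curl 'http://localhost:8983/solr/admin/info/logging?set=org.apache.solr.handler.dataimport:TRACE'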

Good luck,

Jason

On Tue, Jun 16, 2020 at 6:05 AM Ирина Камалова  wrote:
>
> Could you please tell me how I can expand the log trace here?
> (If I try to do it through the Solr admin UI and set the root logger to ALL, it
> doesn't help me.)
>
>
> Best regards,
> Irina Kamalova
>
>
> On Mon, 15 Jun 2020 at 10:12, Ирина Камалова 
> wrote:
>
> > I’m using Solr 7.7.3 and latest Cassandra jdbc driver 1.3.5
> >
> > I get  *SQLFeatureNotSupportedException *
> >
> >
> > I see this error and have no idea what’s wrong (it isn't verbose enough: is the table
> > name or field wrong, could it not map the type, or does the driver not support it?)
> >
> >
> > Full Import failed:java.lang.RuntimeException: java.lang.RuntimeException: 
> > org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to 
> > execute query: select test_field from test_keyspace.test_table limit 10; 
> > Processing Document # 1
> > at 
> > org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:271)
> > at 
> > org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:424)
> > at 
> > org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:483)
> > at 
> > org.apache.solr.handler.dataimport.DataImporter.lambda$runAsync$0(DataImporter.java:466)
> > at java.lang.Thread.run(Thread.java:748)
> > Caused by: java.lang.RuntimeException: 
> > org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to 
> > execute query: select test_field from test_keyspace.test_table limit 10; 
> > Processing Document # 1
> > at 
> > org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:417)
> > at 
> > org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:330)
> > at 
> > org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:233)
> > ... 4 more
> > Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException: 
> > Unable to execute query: select test_field from test_keyspace.test_table 
> > limit 10; Processing Document # 1
> > at 
> > org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:69)
> > at 
> > org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:327)
> > at 
> > org.apache.solr.handler.dataimport.JdbcDataSource.createResultSetIterator(JdbcDataSource.java:288)
> > at 
> > org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:283)
> > at 
> > org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:52)
> > at 
> > org.apache.solr.handler.dataimport.SqlEntityProcessor.initQuery(SqlEntityProcessor.java:59)
> > at 
> > org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:73)
> > at 
> > org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:267)
> > at 
> > org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:476)
> > at 
> > org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:415)
> > ... 6 more
> > Caused by: java.sql.SQLFeatureNotSupportedException
> > at 
> > com.dbschema.CassandraConnection.createStatement(CassandraConnection.java:75)
> > at 
> > org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.createStatement(JdbcDataSource.java:342)
> > at 
> > org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:318)
> > ... 14 more
> >
> >
> >
> >
> > Best regards,
> > Irina Kamalova
> >


Re: How to determine why solr stops running?

2020-06-16 Thread David Hastings
To add to this, I generally have Solr start with this:
-Xms31000m -Xmx31000m

and the only other thing that runs on them are MariaDB Galera cluster
nodes that are not in use (aside from replication).

The 31GB is not an accident either; you don't want 32GB.
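
In solr.in.sh terms that's just the following (SOLR_HEAP is the stock variable
shipped with Solr; the value is our choice):

# keep the heap just under 32GB so the JVM can still use compressed oops
SOLR_HEAP="31g"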


On Tue, Jun 16, 2020 at 11:26 AM Shawn Heisey  wrote:

> On 6/11/2020 11:52 AM, Ryan W wrote:
> >> I will check "dmesg" first, to find out any hardware error message.
>
> 
>
> > [1521232.781801] Out of memory: Kill process 117529 (httpd) score 9 or
> > sacrifice child
> > [1521232.782908] Killed process 117529 (httpd), UID 48,
> total-vm:675824kB,
> > anon-rss:181844kB, file-rss:0kB, shmem-rss:0kB
> >
> > Is this a relevant "Out of memory" message?  Does this suggest an OOM
> > situation is the culprit?
>
> Because this was in the "dmesg" output, it indicates that it is the
> operating system killing programs because the *system* doesn't have any
> memory left.  It wasn't Java that did this, and it wasn't Solr that was
> killed.  It very well could have been Solr that was killed at another
> time, though.
>
> The process that it killed this time is named httpd ... which is most
> likely the Apache webserver.  Because the UID is 48, this is probably an
> OS derived from Redhat, where the "apache" user has UID and GID 48 by
> default.  Apache with its default config can be VERY memory hungry when
> it gets busy.
>
> > -XX:InitialHeapSize=536870912 -XX:MaxHeapSize=536870912
>
> This says that you started Solr with the default 512MB heap.  Which is
> VERY VERY small.  The default is small so that Solr will start on
> virtually any hardware.  Almost every user must increase the heap size.
> And because the OS is killing processes, it is likely that the system
> does not have enough memory installed for what you have running on it.
>
> It is generally not a good idea to share the server hardware between
> Solr and other software, unless the system has a lot of spare resources,
> memory in particular.
>
> Thanks,
> Shawn
>


Re: HTTP 401 when searching on alias in secured Solr

2020-06-16 Thread Jason Gerlowski
Just wanted to close the loop here: Isabelle filed SOLR-14569 for this
and eventually reported there that the problem seems specific to her
custom configuration which specifies a seemingly innocuous
element in solrconfig.xml.

See that jira for more detailed explanation (and hopefully a
resolution coming soon).

On Wed, Jun 10, 2020 at 4:01 PM Jan Høydahl  wrote:
>
> Please share your security.json file
>
> Jan Høydahl
>
> > On 10 Jun 2020 at 21:53, Isabelle Giguere 
> > wrote:
> >
> > Hi;
> >
> > I'm using Solr 8.5.0.  I have uploaded security.json to Zookeeper.  I can 
> > log in the Solr Admin UI.  I can create collections and aliases, and I can 
> > index documents in Solr.
> >
> > Collections : test1, test2
> > Alias: test (combines test1, test2)
> >
> > Indexed document "solr-word.pdf" in collection test1
> >
> > Searching on a collection works:
> > http://localhost:8983/solr/test1/select?q=*:*&wt=xml
> > 
> >
> > But searching on an alias results in HTTP 401
> > http://localhost:8983/solr/test/select?q=*:*&wt=xml
> >
> > Error from server at null: Expected mime type application/octet-stream but
> > got text/html. The HTML error page says:
> > HTTP ERROR 401 Authentication failed, Response code: 401
> > URI: /solr/test1_shard1_replica_n1/select
> > STATUS: 401
> > MESSAGE: Authentication failed, Response code: 401
> > SERVLET: default
> > 
> >
> > Even if https://issues.apache.org/jira/browse/SOLR-13510 is fixed in Solr 
> > 8.5.0, I did try to start Solr with -Dsolr.http1=true, and I set 
> > "forwardCredentials":true in security.json.
> >
> > Nothing works.  I just cannot use aliases when Solr is secured.
> >
> > Can anyone confirm if this may be a configuration issue, or if this could 
> > possibly be a bug ?
> >
> > Thank you;
> >
> > Isabelle Giguère
> > Computational Linguist & Java Developer
> > Linguiste informaticienne & développeur java
> >
> >


Re: How to determine why solr stops running?

2020-06-16 Thread Shawn Heisey

On 6/11/2020 11:52 AM, Ryan W wrote:

I will check "dmesg" first, to find out any hardware error message.





[1521232.781801] Out of memory: Kill process 117529 (httpd) score 9 or
sacrifice child
[1521232.782908] Killed process 117529 (httpd), UID 48, total-vm:675824kB,
anon-rss:181844kB, file-rss:0kB, shmem-rss:0kB

Is this a relevant "Out of memory" message?  Does this suggest an OOM
situation is the culprit?


Because this was in the "dmesg" output, it indicates that it is the 
operating system killing programs because the *system* doesn't have any 
memory left.  It wasn't Java that did this, and it wasn't Solr that was 
killed.  It very well could have been Solr that was killed at another 
time, though.
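
You can list everything the OOM killer has done since boot with something like
this (Linux; dmesg -T just adds readable timestamps):

dmesg -T | grep -Ei 'out of memory|killed process'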


The process that it killed this time is named httpd ... which is most 
likely the Apache webserver.  Because the UID is 48, this is probably an 
OS derived from Redhat, where the "apache" user has UID and GID 48 by 
default.  Apache with its default config can be VERY memory hungry when 
it gets busy.



-XX:InitialHeapSize=536870912 -XX:MaxHeapSize=536870912


This says that you started Solr with the default 512MB heap.  Which is 
VERY VERY small.  The default is small so that Solr will start on 
virtually any hardware.  Almost every user must increase the heap size. 
And because the OS is killing processes, it is likely that the system 
does not have enough memory installed for what you have running on it.


It is generally not a good idea to share the server hardware between 
Solr and other software, unless the system has a lot of spare resources, 
memory in particular.


Thanks,
Shawn


Re: Facet Performance

2020-06-16 Thread James Bodkin
I've been trying to build a query that I can use in newSearcher based off the 
information in your previous e-mail. I thought you meant to build a *:* query 
as per Query 1 in my previous e-mail but I'm still seeing the first-hit 
execution.
Now I'm wondering if you meant to create a *:* query with each of the fields as 
part of the fl query parameters or a *:* query with each of the fields and 
values as part of the fq query parameters.

At the moment I've been running these manually as I expected that I would see 
the first-execution penalty disappear by the time I got to query 4, as I 
thought this would replicate the actions of the newSearcher.
Unfortunately we can't use the autowarm count that is available as part of the 
queryResultCache/filterCache due to the custom deployment mechanism we use to update 
our index.

Kind Regards,

James Bodkin

On 16/06/2020, 15:30, "Erick Erickson"  wrote:

Did you try the autowarming like I mentioned in my previous e-mail?

> On Jun 16, 2020, at 10:18 AM, James Bodkin 
 wrote:
> 
> We've changed the schema to enable docValues for these fields and this 
led to an improvement in the response time. We found a further improvement by 
also switching off indexed as these fields are used for faceting and filtering 
only.
> Since those changes, we've found that the first-execution for queries is 
really noticeable. I thought this would be the filterCache based on what I saw 
in NewRelic however it is probably trying to read the docValues from disk. How 
can we use the autowarming to improve this?
> 
> For example, I've run the following queries in sequence and each query 
has a first-execution penalty.
> 
> Query 1:
> 
> q=*:*
> facet=true
> facet.field=D_DepartureAirport
> facet.field=D_Destination
> facet.limit=-1
> rows=0
> 
> Query 2:
> 
> q=*:*
> fq=D_DepartureAirport:(2660) 
> facet=true
> facet.field=D_Destination
> facet.limit=-1
> rows=0
> 
> Query 3:
> 
> q=*:*
> fq=D_DepartureAirport:(2661)
> facet=true
> facet.field=D_Destination
> facet.limit=-1
> rows=0
> 
> Query 4:
> 
> q=*:*
> fq=D_DepartureAirport:(2660+OR+2661)
> facet=true
> facet.field=D_Destination
> facet.limit=-1
> rows=0
> 
> We've kept the field type as a string, as the value is mapped by 
application that accesses Solr. In the examples above, the values are mapped to 
airports and destinations.
> Is it possible to prewarm the above queries without having to define all 
the potential filters manually in the auto warming?
> 
> At the moment, we update and optimise our index in a different 
environment and then copy the index to our production instances by using a 
rolling deployment in Kubernetes.
> 
> Kind Regards,
> 
> James Bodkin
> 
> On 12/06/2020, 18:58, "Erick Erickson"  wrote:
> 
>I question whether filterCache has anything to do with it, I suspect 
what’s really happening is that first time you’re reading the relevant bits 
from disk into memory. And to double check you should have docValues enabled for 
all these fields. The “uninverting” process can be very expensive, and 
docValues bypasses that.
> 
>As of Solr 7.6, you can define “uninvertible=true” to your field(Type) 
to “fail fast” if Solr needs to uninvert the field.
> 
>But that’s an aside. In either case, my claim is that first-time 
execution does “something”, either reads the serialized docValues from disk or 
uninverts the file on Solr’s heap.
> 
>You can have this autowarmed by any combination of
>1> specifying an autowarm count on your queryResultCache. That’s hit 
or miss, as it replays the most recent N queries which may or may not contain 
the sorts. That said, specifying 10-20 for autowarm count is usually a good 
idea, assuming you’re not committing more than, say, every 30 seconds. I’d add 
the same to filterCache too.
> 
>2> specifying a newSearcher or firstSearcher query in solrconfig.xml. 
The difference is that newSearcher is fired every time a commit happens, while 
firstSearcher is only fired when Solr starts, the theory being that there’s no 
cache autowarming available when Solr first powers up. Usually, people don’t 
bother with firstSearcher or just make it the same as newSearcher. Note that a 
query doesn’t have to be “real” at all. You can just add all the facet fields 
to a *:* query in a single go.
> 
>BTW, Trie fields will stay around for a long time even though 
deprecated. Or at least until we find something to replace them with that 
doesn’t have this penalty, so I’d feel pretty safe using those and they’ll be 
more efficient than strings.
> 
>Best,
>Erick
> 



Re: Solr cloud backup/restore not working

2020-06-16 Thread yaswanth kumar
I don't see anything related in the solr.log file for the same error. Not
sure if there is any other place where I can check for this.

Thanks,

On Tue, Jun 16, 2020 at 10:21 AM Shawn Heisey  wrote:

> On 6/12/2020 8:38 AM, yaswanth kumar wrote:
> > Using solr 8.2.0 and setup a cloud with 2 nodes. (2 replica's for each
> > collection)
> > Enabled basic authentication and gave all access to the admin user
> >
> > Now trying to use solr cloud backup/restore API, backup is working great,
> > but when trying to invoke restore API its throwing the below error
> 
> >  "msg":"ADDREPLICA failed to create replica",
> >  "trace":"org.apache.solr.common.SolrException: ADDREPLICA failed to
> > create replica\n\tat
> >
> org.apache.solr.client.solrj.SolrResponse.getException(SolrResponse.java:53)\n\tat
> >
> org.apache.solr.handler.admin.CollectionsHandler.invokeAction(CollectionsHandler.java:280)\n\tat
>
> The underlying cause of this exception is not recorded here.  Are there
> other entries in the Solr log with more detailed information from the
> ADDREPLICA attempt?
>
> Thanks,
> Shawn
>


-- 
Thanks & Regards,
Yaswanth Kumar Konathala.
yaswanth...@gmail.com


Re: Facet Performance

2020-06-16 Thread Erick Erickson
Did you try the autowarming like I mentioned in my previous e-mail?

> On Jun 16, 2020, at 10:18 AM, James Bodkin  
> wrote:
> 
> We've changed the schema to enable docValues for these fields and this led to 
> an improvement in the response time. We found a further improvement by also 
> switching off indexed as these fields are used for faceting and filtering 
> only.
> Since those changes, we've found that the first-execution for queries is 
> really noticeable. I thought this would be the filterCache based on what I 
> saw in NewRelic however it is probably trying to read the docValues from 
> disk. How can we use the autowarming to improve this?
> 
> For example, I've run the following queries in sequence and each query has a 
> first-execution penalty.
> 
> Query 1:
> 
> q=*:*
> facet=true
> facet.field=D_DepartureAirport
> facet.field=D_Destination
> facet.limit=-1
> rows=0
> 
> Query 2:
> 
> q=*:*
> fq=D_DepartureAirport:(2660) 
> facet=true
> facet.field=D_Destination
> facet.limit=-1
> rows=0
> 
> Query 3:
> 
> q=*:*
> fq=D_DepartureAirport:(2661)
> facet=true
> facet.field=D_Destination
> facet.limit=-1
> rows=0
> 
> Query 4:
> 
> q=*:*
> fq=D_DepartureAirport:(2660+OR+2661)
> facet=true
> facet.field=D_Destination
> facet.limit=-1
> rows=0
> 
> We've kept the field type as a string, as the value is mapped by application 
> that accesses Solr. In the examples above, the values are mapped to airports 
> and destinations.
> Is it possible to prewarm the above queries without having to define all the 
> potential filters manually in the auto warming?
> 
> At the moment, we update and optimise our index in a different environment 
> and then copy the index to our production instances by using a rolling 
> deployment in Kubernetes.
> 
> Kind Regards,
> 
> James Bodkin
> 
> On 12/06/2020, 18:58, "Erick Erickson"  wrote:
> 
>I question whether filterCache has anything to do with it, I suspect what’s 
> really happening is that first time you’re reading the relevant bits from 
> disk into memory. And to double check you should have docValues enabled for 
> all these fields. The “uninverting” process can be very expensive, and 
> docValues bypasses that.
> 
>As of Solr 7.6, you can define “uninvertible=true” to your field(Type) to 
> “fail fast” if Solr needs to uninvert the field.
> 
>But that’s an aside. In either case, my claim is that first-time execution 
> does “something”, either reads the serialized docValues from disk or 
> uninverts the file on Solr’s heap.
> 
>You can have this autowarmed by any combination of
>1> specifying an autowarm count on your queryResultCache. That’s hit or 
> miss, as it replays the most recent N queries which may or may not contain 
> the sorts. That said, specifying 10-20 for autowarm count is usually a good 
> idea, assuming you’re not committing more than, say, every 30 seconds. I’d 
> add the same to filterCache too.
> 
>2> specifying a newSearcher or firstSearcher query in solrconfig.xml. The 
> difference is that newSearcher is fired every time a commit happens, while 
> firstSearcher is only fired when Solr starts, the theory being that there’s 
> no cache autowarming available when Solr first powers up. Usually, people 
> don’t bother with firstSearcher or just make it the same as newSearcher. Note 
> that a query doesn’t have to be “real” at all. You can just add all the facet 
> fields to a *:* query in a single go.
> 
>BTW, Trie fields will stay around for a long time even though deprecated. 
> Or at least until we find something to replace them with that doesn’t have 
> this penalty, so I’d feel pretty safe using those and they’ll be more 
> efficient than strings.
> 
>Best,
>Erick
> 



Re: Index download speed while replicating is fixed at 5.1 in replication.html

2020-06-16 Thread Florin Babes
Hello,
The patch is to fix the display. It doesn't configure or limit the speed :)


On Tue, 16 Jun 2020 at 14:26, Shawn Heisey  wrote:

> On 6/14/2020 12:06 AM, Florin Babes wrote:
> > While checking ways to optimize the speed of replication I've noticed
> that
> > the index download speed is fixed at 5.1 in replication.html. There is a
> > reason for that? If not, I would like to submit a patch with the fix.
> > We are using solr 8.3.1.
>
> Looking at the replication.html file, the part that says "5.1 MB/s"
> appears to be purely display.  As far as I can tell, it's not
> configuring anything, and it's not gathering information from anywhere.
>
> So unless your solrconfig.xml is configuring a speed limit in the
> replication handler, I don't think there is one.
>
> I'm curious about exactly what you have in mind for a patch.
>
> Thanks,
> Shawn
>


Re: Solr cloud backup/restore not working

2020-06-16 Thread Shawn Heisey

On 6/12/2020 8:38 AM, yaswanth kumar wrote:

Using solr 8.2.0 and setup a cloud with 2 nodes. (2 replica's for each
collection)
Enabled basic authentication and gave all access to the admin user

Now trying to use solr cloud backup/restore API, backup is working great,
but when trying to invoke restore API its throwing the below error



 "msg":"ADDREPLICA failed to create replica",
 "trace":"org.apache.solr.common.SolrException: ADDREPLICA failed to
create replica\n\tat
org.apache.solr.client.solrj.SolrResponse.getException(SolrResponse.java:53)\n\tat
org.apache.solr.handler.admin.CollectionsHandler.invokeAction(CollectionsHandler.java:280)\n\tat


The underlying cause of this exception is not recorded here.  Are there 
other entries in the Solr log with more detailed information from the 
ADDREPLICA attempt?


Thanks,
Shawn


Re: Facet Performance

2020-06-16 Thread James Bodkin
We've changed the schema to enable docValues for these fields and this led to 
an improvement in the response time. We found a further improvement by also 
switching off indexed as these fields are used for faceting and filtering only.
Since those changes, we've found that the first-execution for queries is really 
noticeable. I thought this would be the filterCache based on what I saw in 
NewRelic however it is probably trying to read the docValues from disk. How can 
we use the autowarming to improve this?

For example, I've run the following queries in sequence and each query has a 
first-execution penalty.

Query 1:

q=*:*
facet=true
facet.field=D_DepartureAirport
facet.field=D_Destination
facet.limit=-1
rows=0

Query 2:

q=*:*
fq=D_DepartureAirport:(2660) 
facet=true
facet.field=D_Destination
facet.limit=-1
rows=0

Query 3:

q=*:*
fq=D_DepartureAirport:(2661)
facet=true
facet.field=D_Destination
facet.limit=-1
rows=0

Query 4:

q=*:*
fq=D_DepartureAirport:(2660+OR+2661)
facet=true
facet.field=D_Destination
facet.limit=-1
rows=0
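
For reference, here is Query 2 above as a SolrJ sketch, in case that is easier to read 
(the Solr URL and collection name are placeholders; the field names are the ones above):

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

public class FacetQuerySketch {
    public static void main(String[] args) throws Exception {
        // Base URL and collection name are placeholders.
        try (SolrClient client =
                 new HttpSolrClient.Builder("http://localhost:8983/solr/flights").build()) {
            SolrQuery q = new SolrQuery("*:*");
            q.addFilterQuery("D_DepartureAirport:(2660)"); // same fq as Query 2
            q.setFacet(true);
            q.addFacetField("D_Destination");
            q.setFacetLimit(-1); // return all facet values
            q.setRows(0);        // counts only, no documents
            QueryResponse rsp = client.query(q);
            rsp.getFacetField("D_Destination").getValues()
               .forEach(c -> System.out.println(c.getName() + " -> " + c.getCount()));
        }
    }
}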

We've kept the field type as a string, as the value is mapped by application 
that accesses Solr. In the examples above, the values are mapped to airports 
and destinations.
Is it possible to prewarm the above queries without having to define all the 
potential filters manually in the auto warming?

At the moment, we update and optimise our index in a different environment and 
then copy the index to our production instances by using a rolling deployment 
in Kubernetes.

Kind Regards,

James Bodkin

On 12/06/2020, 18:58, "Erick Erickson"  wrote:

I question whether filterCache has anything to do with it; I suspect what’s 
really happening is that the first time through you’re reading the relevant bits from disk 
into memory. And to double check, you should have docValues enabled for all these 
fields. The “uninverting” process can be very expensive, and docValues 
bypasses that.

As of Solr 7.6, you can set “uninvertible=false” on your field(Type) to 
“fail fast” if Solr would otherwise need to uninvert the field.

But that’s an aside. In either case, my claim is that first-time execution 
does “something”: it either reads the serialized docValues from disk or uninverts 
the field on Solr’s heap.

You can have this autowarmed by any combination of
1> specifying an autowarm count on your queryResultCache. That’s hit or 
miss, as it replays the most recent N queries which may or may not contain the 
sorts. That said, specifying 10-20 for autowarm count is usually a good idea, 
assuming you’re not committing more than, say, every 30 seconds. I’d add the 
same to filterCache too.

2> specifying a newSearcher or firstSearcher query in solrconfig.xml. The 
difference is that newSearcher is fired every time a commit happens, while 
firstSearcher is only fired when Solr starts, the theory being that there’s no 
cache autowarming available when Solr first powers up. Usually, people don’t 
bother with firstSearcher or just make it the same as newSearcher. Note that a 
query doesn’t have to be “real” at all. You can just add all the facet fields 
to a *:* query in a single go.

BTW, Trie fields will stay around for a long time even though deprecated. 
Or at least until we find something to replace them with that doesn’t have this 
penalty, so I’d feel pretty safe using those and they’ll be more efficient than 
strings.

Best,
Erick



Re: Solr cloud backup/restore not working

2020-06-16 Thread yaswanth kumar
Sure, I pasted it below from the Solr logfiles:

2020-06-16 14:06:27.000 INFO  (qtp1987693491-153) [c:test   ]
o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/collections
params={name=test&action=RESTORE&location=/opt/$
2020-06-16 14:06:27.001 ERROR (qtp1987693491-153) [c:test   ]
o.a.s.s.HttpSolrCall null:org.apache.solr.common.SolrException: ADDREPLICA
failed to create replica
at
org.apache.solr.client.solrj.SolrResponse.getException(SolrResponse.java:53)
at
org.apache.solr.handler.admin.CollectionsHandler.invokeAction(CollectionsHandler.java:280)
at
org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:252)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:199)
at
org.apache.solr.servlet.HttpSolrCall.handleAdmin(HttpSolrCall.java:820)
at
org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:786)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:546)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:423)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:350)
at
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1602)
at
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:540)
at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:146)
at
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
at
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
at
org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:257)
at
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1711)
at
org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)
at
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1347)
at
org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
at
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:480)
at
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1678)
at
org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
at
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1249)
at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
at
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:220)
at
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:152)
at
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
at
org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
at
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
at org.eclipse.jetty.server.Server.handle(Server.java:505)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:370)
at
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:267)
at
org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
at
org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103) at
org.eclipse.jetty.io.ssl.SslConnection$DecryptedEndPoint.onFillable(SslConnection.java:427)
at
org.eclipse.jetty.io.ssl.SslConnection.onFillable(SslConnection.java:321)
at
org.eclipse.jetty.io.ssl.SslConnection$2.succeeded(SslConnection.java:159)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
at
org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)
at
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
at
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
at
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
at
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
at
org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
at
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:781)
at
org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:917)
at java.base/java.lang.Thread.run(Thread.java:834)

Can you please review and let me know if I am missing something??
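
For reference, a SolrJ sketch of the same kind of restore call (collection name, backup 
name, location, ZooKeeper host and credentials are all placeholders; this is only a sketch 
of the Collections API request shown in the log, not necessarily how it is being issued):

import java.util.Collections;
import java.util.Optional;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.request.CollectionAdminRequest;
import org.apache.solr.client.solrj.response.CollectionAdminResponse;

public class RestoreSketch {
    public static void main(String[] args) throws Exception {
        try (CloudSolrClient client = new CloudSolrClient.Builder(
                Collections.singletonList("zk1:2181"), Optional.empty()).build()) {
            // restoreCollection(target collection, backup name)
            CollectionAdminRequest.Restore restore =
                CollectionAdminRequest.restoreCollection("test", "test");
            restore.setLocation("/opt/backups");                  // placeholder backup location
            restore.setBasicAuthCredentials("admin", "password"); // placeholders
            CollectionAdminResponse rsp = restore.process(client);
            System.out.println("restore status: " + rsp.getStatus());
        }
    }
}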

On Tue, Jun 16, 2020 at 3:15 AM Jörn Franke  wrote:

> Have you looked in the Solr logfiles?
>
> > On 16.06.2020 at 05:46, yaswanth kumar wrote:
> >
> > Can anyone here help on the posted question pls??
> >
> >> On Fri, Jun 12, 2020 at 10:38 AM yaswanth kumar 
> >> wrote:
> >>
> >> 

Re: getting different errors from complex phrase query

2020-06-16 Thread Erick Erickson
Check your “df” parameter in all your handlers in solrconfig.xml.

Second, add “debug=query” to the query and look at the parsed
return; you’ll probably see something qualified with “text:…”.
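
(A sketch of that check from SolrJ, in case it helps — the base URL and collection are
placeholders, and only the complexphrase clause of the original query is shown:)

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

public class DebugQuerySketch {
    public static void main(String[] args) throws Exception {
        try (HttpSolrClient client =
                 new HttpSolrClient.Builder("https://localhost:8983/solr/problem").build()) {
            SolrQuery q =
                new SolrQuery("{!complexphrase inOrder=true}all_text_txt_enus:\"by test*\"");
            q.set("debug", "query"); // ask Solr to echo the parsed query
            QueryResponse rsp = client.query(q);
            // shows which field each clause was actually run against
            System.out.println(rsp.getDebugMap().get("parsedquery"));
        }
    }
}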

Offhand, though, I don’t see where that’s happening in your query.

Wait, how are you submitting this? If the quote is getting to Solr
as an escaped character, Solr may not be treating it as a phrase,
and the ‘by’ is independent of the “test*”, in which case the “test*” 
would go against the default search field, which I claim you have
defined as “text” with a “df” parameter in your request handler.

Best,
Erick 

> On Jun 15, 2020, at 11:19 PM, Shawn Heisey  wrote:
> 
> On 6/15/2020 2:52 PM, Deepu wrote:
>> sample query is
>> "{!complexphrase inOrder=true}(all_text_txt_enus:\"by\\ test*\") AND
>> (({!terms f=product_id_l}959945,959959,959960,959961,959962,959963)
>> AND (date_created_at_rdt:[2020-04-07T01:23:09Z TO *} AND
>> date_created_at_rdt:{* TO 2020-04-07T01:24:57Z]))"
>> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error
>> from server  https://XX.XX.XX:8983/solr/problem: undefined field text
> 
> The error is "undefined field text".  How exactly that occurs with what you 
> have sent, I do not know.  There is something defined somewhere that refers 
> to a field named "text" and the field does not exist in that index.
> 
> Something that may be indirectly relevant:  Generally speaking, Solr only 
> supports one "localparams" in a query, and it must be the first text in the 
> query string.  You have two -- one starts with {!complexphrase and the other 
> starts with {!terms.
> 
> There are some special circumstances where multiples are allowed, but I do 
> not know which circumstances.  For the most part, more than one isn't allowed 
> or supported.  I am pretty sure that you can't use multiple query parsers in 
> one query string.
> 
> Thanks,
> Shawn



SolrCloud creates the data folder in solr.data.home instead of solr.core.name

2020-06-16 Thread Razvan-Daniel Mihai
Hello,

I run a kerberized SolrCloud (7.4.0) environment (shipped with Cloudera
6.3.2) and have the problem that the index files for all cores are created
directly under the same ${solr.solr.home} directory instead of under their
per-core ${solr.core.name} directories, and thus they are corrupt.

Cloudera uses the HDFSDirectoryFactory by default - but this is largely
unusable for large collections with billions of documents, so we switched
to storing the indexes locally (on /data/solr) a long time ago.

I should probably mention that we have another cluster (CDH6.3.1) which
doesn't have this problem.

Also it might be that there is a Sentry authorization problem (I'm still
investigating) but nevertheless it seems to me like Solr should never ever
use the same data folder for two different cores, so this might also be a
bug in Solr.

Any ideas of what is wrong here?

Thank you,
Razvan

PS:

Here is what it looks like on disk:

$ ls /data/solr/

index  niofs_test_shard1_replica_n1
prod_applogs_20200616_shard2_replica_n2  snapshot_metadata  tlog

[a6709018@cdcdhp10 ~]$ ls /data/solr/niofs_test_shard1_replica_n1/

core.properties


You can see that there are two core directories (niofs_test
and prod_applogs) and then there are all data-related folders on the same
level. The core folders contain only one file (core.properties).


Here is how Solr is started on one machine. (I removed some security
related properties from the listing):


/usr/lib/jvm/java-openjdk/bin/java -server -XX:NewRatio=3
-XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThreshold=8
-XX:+UseConcMarkSweepGC -XX:ConcGCThreads=4 -XX:ParallelGCThreads=4
-XX:+CMSScavengeBeforeRemark -XX:PretenureSizeThreshold=64m
-XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=50
-XX:CMSMaxAbortablePrecleanTime=6000 -XX:+CMSParallelRemarkEnabled
-XX:+ParallelRefProcEnabled -XX:-OmitStackTraceInFastThrow
-Xlog:gc*:file=/var/log/solr/solr_gc.log:time,uptime:filecount=9,filesize=2
-DzkClientTimeout=15000 -DzkHost=cdcdhp02.bigdatap.de.comdirect.com:2181,
cdcdhp20.bigdatap.de.comdirect.com:2181,
cdcdhp22.bigdatap.de.comdirect.com:2181/solr -Dsolr.log.dir=/var/log/solr
-Djetty.port=8985 -DSTOP.PORT=7985 -DSTOP.KEY=csearch
-Duser.timezone=GMT+0200
-Djetty.home=/opt/cloudera/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554/lib/solr/server
-Dsolr.solr.home=/data/solr -Dsolr.data.home=
-Dsolr.install.dir=/opt/cloudera/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554/lib/solr
-Dsolr.default.confdir=/opt/cloudera/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554/lib/solr/server/solr/configsets/_default/conf
-Xss256k -DwaitForZk=60 -Dsolr.host=cdcdhp10.bigdatap.de.comdirect.com
-DuseCachedStatsBetweenGetMBeanInfoCalls=true
-DdisableSolrFieldCacheMBeanEntryListJmx=true
-Dlog4j.configuration=file:///var/run/cloudera-scm-agent/process/518-solr-SOLR_SERVER/log4j.properties
-Dsolr.log=/var/log/solr -Dsolr.admin.port=8984
-Dsolr.max.connector.thread=1 -Dsolr.solr.home=/data/solr
-Dsolr.authorization.sentry.site=/var/run/cloudera-scm-agent/process/518-solr-SOLR_SERVER/sentry-conf/sentry-site.xml
-Dsolr.sentry.override.plugins=true -jar start.jar --module=https
--lib=/opt/cloudera/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554/lib/solr/server/solr-webapp/webapp/WEB-INF/lib/*
--lib=/var/run/cloudera-scm-agent/process/518-solr-SOLR_SERVER/hadoop-conf


Re: Dataimporter status

2020-06-16 Thread devashrid
Hi Shawn,

I am new to Solr and I have set up a cloud cluster of 1 shard and 3
collections on 2 servers. I am facing the same issue. I am using

CloudSolrClient client = new CloudSolrClient.Builder(zkUrls, Optional.empty()).build();

to create my client, and then I fire the import command using

client.request(queryRequest, collectionName);

However, I am not sure how to fire it at a particular core name
(collection_shard_replica). Could you please help me out with this?
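
For what it's worth, a minimal sketch of the pattern described above (ZooKeeper host,
collection name, core name and handler path are placeholders; the DIH handler is assumed
to be registered under /dataimport in solrconfig.xml):

import java.util.Collections;
import java.util.Optional;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.QueryRequest;
import org.apache.solr.common.params.ModifiableSolrParams;

public class DihRequestSketch {
    public static void main(String[] args) throws Exception {
        ModifiableSolrParams params = new ModifiableSolrParams();
        params.set("command", "full-import");
        QueryRequest req = new QueryRequest(params);
        req.setPath("/dataimport"); // assumes this handler name from solrconfig.xml

        // Collection-level request: CloudSolrClient routes it to one replica.
        try (CloudSolrClient cloud = new CloudSolrClient.Builder(
                Collections.singletonList("zk1:2181"), Optional.empty()).build()) {
            cloud.request(req, "myCollection");
        }

        // Core-level request: point a plain HttpSolrClient at that core's URL.
        try (HttpSolrClient core = new HttpSolrClient.Builder(
                "http://host1:8983/solr/myCollection_shard1_replica_n1").build()) {
            req.process(core);
        }
    }
}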

Thanks!
Devashri



--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Proxy Error when cluster went down

2020-06-16 Thread Shawn Heisey

On 6/15/2020 9:04 PM, Vishal Vaibhav wrote:

I am running on Solr 8.5. For some reason the entire cluster went down. When I
try to bring up the nodes, they are not coming up. My health check is
on "/solr/rules/admin/system". I tried forcing a leader election but it
didn't help.
So when I run the following commands, why is it trying to proxy when those
nodes are down? Am I missing something?





java.net.UnknownHostException:
search-rules-solr-v1-2.search-rules-solr-v1.search-digital.svc.cluster.local:


It is trying to proxy because it's SolrCloud.  SolrCloud has an internal 
load balancer that spreads queries across multiple replicas when 
possible.  Your cluster must be aware of multiple servers where the 
"rules" collection can be queried.


The underlying problem behind this error message is that the following 
hostname is being looked up, and it doesn't exist:


search-rules-solr-v1-2.search-rules-solr-v1.search-digital.svc.cluster.local

This hostname is most likely coming from /etc/hosts on one of your 
systems when that system starts Solr and it registers with the cluster, 
and that /etc/hosts file is the ONLY place that the hostname exists, so 
when SolrCloud tries to forward the request to that server, it is failing.
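
A quick way to confirm what the JVM on a given node can actually resolve is a one-line
lookup (the hostname here is just the one copied from the error message above):

import java.net.InetAddress;

public class HostnameCheck {
    public static void main(String[] args) throws Exception {
        String host =
            "search-rules-solr-v1-2.search-rules-solr-v1.search-digital.svc.cluster.local";
        // Throws java.net.UnknownHostException if this machine cannot resolve the name.
        System.out.println(InetAddress.getByName(host).getHostAddress());
    }
}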


Thanks,
Shawn


Re: getting different errors from complex phrase query

2020-06-16 Thread Shawn Heisey

On 6/15/2020 2:52 PM, Deepu wrote:

sample query is
"{!complexphrase inOrder=true}(all_text_txt_enus:\"by\\ test*\") AND
(({!terms f=product_id_l}959945,959959,959960,959961,959962,959963)
AND (date_created_at_rdt:[2020-04-07T01:23:09Z TO *} AND
date_created_at_rdt:{* TO 2020-04-07T01:24:57Z]))"

org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error
from server  https://XX.XX.XX:8983/solr/problem: undefined field text


The error is "undefined field text".  How exactly that occurs with what 
you have sent, I do not know.  There is something defined somewhere that 
refers to a field named "text" and the field does not exist in that index.


Something that may be indirectly relevant:  Generally speaking, Solr 
only supports one "localparams" in a query, and it must be the first 
text in the query string.  You have two -- one starts with 
{!complexphrase and the other starts with {!terms.


There are some special circumstances where multiples are allowed, but I 
do not know which circumstances.  For the most part, more than one isn't 
allowed or supported.  I am pretty sure that you can't use multiple 
query parsers in one query string.


Thanks,
Shawn


Re: eDismax query syntax question

2020-06-16 Thread Shawn Heisey

On 6/15/2020 8:01 AM, Webster Homer wrote:

Only the minus following the parenthesis is treated as a NOT.
Are parentheses special? They're not mentioned in the eDismax documentation.


Yes, parentheses are special to edismax.  They are used just like in 
math equations, to group and separate things or to override the default 
operator order.


https://lucene.apache.org/solr/guide/8_5/the-standard-query-parser.html#escaping-special-characters

The edismax parser supports a superset of what the standard (lucene) 
parser does, so they have the same special characters.
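
If the goal is to have characters like parentheses and the minus sign treated as literal
text rather than query syntax, one option (a sketch, assuming the query string is built
with SolrJ) is to escape the user input before appending it:

import org.apache.solr.client.solrj.util.ClientUtils;

public class EscapeSketch {
    public static void main(String[] args) {
        String userInput = "foo (bar) -baz";
        // Backslash-escapes lucene/edismax special characters, including whitespace.
        String escaped = ClientUtils.escapeQueryChars(userInput);
        System.out.println(escaped); // prints: foo\ \(bar\)\ \-baz
    }
}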


Thanks,
Shawn


Re: Index download speed while replicating is fixed at 5.1 in replication.html

2020-06-16 Thread Shawn Heisey

On 6/14/2020 12:06 AM, Florin Babes wrote:

While checking ways to optimize the speed of replication I've noticed that
the index download speed is fixed at 5.1 in replication.html. Is there a
reason for that? If not, I would like to submit a patch with the fix.
We are using solr 8.3.1.


Looking at the replication.html file, the part that says "5.1 MB/s" 
appears to be purely display.  As far as I can tell, it's not 
configuring anything, and it's not gathering information from anywhere.


So unless your solrconfig.xml is configuring a speed limit in the 
replication handler, I don't think there is one.


I'm curious about exactly what you have in mind for a patch.

Thanks,
Shawn


Re: Can't fetch table from cassandra through jdbc connection

2020-06-16 Thread Ирина Камалова
Could you please tell me how I can expand the log trace here?
(I tried doing it through the Solr admin UI by setting the root logger to ALL, but it
doesn't help.)


Best regards,
Irina Kamalova


On Mon, 15 Jun 2020 at 10:12, Ирина Камалова 
wrote:

> I’m using Solr 7.7.3 and the latest Cassandra JDBC driver, 1.3.5.
>
> I get *SQLFeatureNotSupportedException*.
>
>
> I see this error and have no idea what’s wrong (it is not verbose enough to tell whether the
> table name or a field is wrong, a type couldn’t be mapped, or the driver doesn’t support the operation).
>
>
> Full Import failed:java.lang.RuntimeException: java.lang.RuntimeException: 
> org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to 
> execute query: select test_field from test_keyspace.test_table limit 10; 
> Processing Document # 1
> at 
> org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:271)
> at 
> org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:424)
> at 
> org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:483)
> at 
> org.apache.solr.handler.dataimport.DataImporter.lambda$runAsync$0(DataImporter.java:466)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.RuntimeException: 
> org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to 
> execute query: select test_field from test_keyspace.test_table limit 10; 
> Processing Document # 1
> at 
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:417)
> at 
> org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:330)
> at 
> org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:233)
> ... 4 more
> Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException: 
> Unable to execute query: select test_field from test_keyspace.test_table 
> limit 10; Processing Document # 1
> at 
> org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:69)
> at 
> org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:327)
> at 
> org.apache.solr.handler.dataimport.JdbcDataSource.createResultSetIterator(JdbcDataSource.java:288)
> at 
> org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:283)
> at 
> org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:52)
> at 
> org.apache.solr.handler.dataimport.SqlEntityProcessor.initQuery(SqlEntityProcessor.java:59)
> at 
> org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:73)
> at 
> org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:267)
> at 
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:476)
> at 
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:415)
> ... 6 more
> Caused by: java.sql.SQLFeatureNotSupportedException
> at 
> com.dbschema.CassandraConnection.createStatement(CassandraConnection.java:75)
> at 
> org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.createStatement(JdbcDataSource.java:342)
> at 
> org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:318)
> ... 14 more
>
>
>
>
> Best regards,
> Irina Kamalova
>
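
The trace bottoms out in com.dbschema.CassandraConnection.createStatement, so a
standalone JDBC check along these lines can show whether the driver supports plain
Statement creation at all outside Solr (the JDBC URL format below is an assumption and
may need adjusting for your driver version):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class CassandraJdbcCheck {
    public static void main(String[] args) throws Exception {
        // Use the same JDBC URL as in the DIH dataSource config; this one is a placeholder.
        String url = "jdbc:cassandra://localhost:9042/test_keyspace";
        try (Connection conn = DriverManager.getConnection(url);
             Statement st = conn.createStatement(); // the call that throws in the trace above
             ResultSet rs = st.executeQuery(
                 "select test_field from test_keyspace.test_table limit 10")) {
            while (rs.next()) {
                System.out.println(rs.getString("test_field"));
            }
        }
    }
}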


Re: Solr cloud backup/restore not working

2020-06-16 Thread Jörn Franke
Have you looked in the Solr logfiles?

> On 16.06.2020 at 05:46, yaswanth kumar wrote:
> 
> Can anyone here help on the posted question pls??
> 
>> On Fri, Jun 12, 2020 at 10:38 AM yaswanth kumar 
>> wrote:
>> 
>> Using Solr 8.2.0 and set up a cloud with 2 nodes (2 replicas for each
>> collection).
>> Enabled basic authentication and gave all access to the admin user.
>> 
>> Now trying to use the SolrCloud backup/restore API. Backup is working great,
>> but when trying to invoke the restore API it's throwing the below error
>> 
>> {
>>  "responseHeader":{
>>"status":500,
>>"QTime":349},
>>  "Operation restore caused
>> exception:":"org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
>> ADDREPLICA failed to create replica",
>>  "exception":{
>>"msg":"ADDREPLICA failed to create replica",
>>"rspCode":500},
>>  "error":{
>>"metadata":[
>>  "error-class","org.apache.solr.common.SolrException",
>>  "root-error-class","org.apache.solr.common.SolrException"],
>>"msg":"ADDREPLICA failed to create replica",
>>"trace":"org.apache.solr.common.SolrException: ADDREPLICA failed to
>> create replica\n\tat
>> org.apache.solr.client.solrj.SolrResponse.getException(SolrResponse.java:53)\n\tat
>> org.apache.solr.handler.admin.CollectionsHandler.invokeAction(CollectionsHandler.java:280)\n\tat
>> org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:252)\n\tat
>> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:199)\n\tat
>> org.apache.solr.servlet.HttpSolrCall.handleAdmin(HttpSolrCall.java:820)\n\tat
>> org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:786)\n\tat
>> org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:546)\n\tat
>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:423)\n\tat
>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:350)\n\tat
>> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1602)\n\tat
>> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:540)\n\tat
>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:146)\n\tat
>> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)\n\tat
>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)\n\tat
>> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:257)\n\tat
>> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1711)\n\tat
>> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)\n\tat
>> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1347)\n\tat
>> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)\n\tat
>> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:480)\n\tat
>> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1678)\n\tat
>> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)\n\tat
>> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1249)\n\tat
>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)\n\tat
>> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:220)\n\tat
>> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:152)\n\tat
>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)\n\tat
>> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)\n\tat
>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)\n\tat
>> org.eclipse.jetty.server.Server.handle(Server.java:505)\n\tat
>> org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:370)\n\tat
>> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:267)\n\tat
>> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)\n\tat
>> org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)\n\tat
>> org.eclipse.jetty.io.ssl.SslConnection$DecryptedEndPoint.onFillable(SslConnection.java:427)\n\tat
>> org.eclipse.jetty.io.ssl.SslConnection.onFillable(SslConnection.java:321)\n\tat
>> org.eclipse.jetty.io.ssl.SslConnection$2.succeeded(SslConnection.java:159)\n\tat
>> org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)\n\tat
>> org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)\n\tat
>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)\n\tat
>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)\n\tat
>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)\n\tat
>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)\n\tat
>>