Re: CloudSolrClient (any version). Find the node your query has connected to.

2019-05-22 Thread Jan Høydahl
Try to add &shards.info=true to your request. It will return a section telling 
exactly what shards/replicas served that request with counts and all :)
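From SolrJ it is just an extra request parameter. A rough sketch (the collection name and ZK address are made up, and the CloudSolrClient builder signature varies a bit between versions):

    CloudSolrClient client = new CloudSolrClient.Builder(
        Collections.singletonList("zk1:2181"), Optional.empty()).build();
    SolrQuery q = new SolrQuery("*:*");
    q.set(ShardParams.SHARDS_INFO, true);           // same as adding &shards.info=true to the URL
    QueryResponse rsp = client.query("collection1", q);
    System.out.println(rsp.getResponse().get("shards.info"));   // per-shard/replica details with counts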

Jan Høydahl

> 22. mai 2019 kl. 21:17 skrev Erick Erickson :
> 
> You have to be a little careful here, one thing I learned relatively recently 
> is that there are in-memory structures that hold pointers to _all_ 
> un-searchable docs (i.e. no new searchers have been opened since the doc was 
> added/updated) to support real-time get. So if you’re indexing a _lot_ of 
> docs that internal structure can grow quite large….
> 
> FWIW, delete-by-query is painful. Each one has to lock all indexing on all 
> replicas while it completes. If you can use delete-by-id it’d be better.
> 
> Let’s back up a bit and look at _why_ your nodes go into recovery…. Leave the 
> replicas on if you can and look for “Leader Initiated Recovery” (not sure 
> that’s the exact phrase, but you’ll see something very like that). If that’s 
> the case, then one situation we’ve seen is that a request takes too long to 
> return from a follower. So the sequence looks like this:
> 
> - leader gets update
> - leader indexes locally _and_ forwards to follower
> - follower is busy (and the delete-by-query could be why) and takes too long 
> to respond so the request times out
> - leader says “hmmm, I don’t know what happened so I’ll tell the follower to 
> recover”.
> 
> Given your heavy update rate, there’ll be no chance for “peer sync” to fully 
> recover so it’ll go into full recovery. That can sometimes be fixed by simply 
> lengthening the timeout.
> 
> Otherwise also take a look at the logs and see if you can find a root cause 
> for the replica going into recovery and we should see if we can fix that.
> 
> I didn’t ask what versions of Solr you’re using, but in the 7x code line (7.3 
> IIRC) significant work was done to make recovery less likely.
> 
> Best,
> Erick
> 
>> On May 22, 2019, at 10:27 AM, Shawn Heisey  wrote:
>> 
>> On 5/22/2019 10:47 AM, Russell Taylor wrote:
>>> I will add that we have set commits to be only called by the loading 
>>> program. We have turned off soft and autoCommits in the solrconfig.xml.
>> 
>> Don't turn off autoCommit.  Regular hard commits, typically with 
>> openSearcher set to false so they don't interfere with change visibility, 
>> are extremely important for good Solr operation.  Without it, the 
>> transaction logs will grow out of control.  In addition to taking a lot of 
>> disk space, that will cause a Solr restart to happen VERY slowly.  Note that 
>> a hard commit with openSearcher set to false will be VERY fast -- doing them 
>> frequently is usually not a problem for performance.  Sample configs in 
>> recent Solr versions ship with autoCommit set to 15 seconds and openSearcher 
>> set to false.
>> 
>> Not using autoSoftCommit is a reasonable thing to do if you do not need that 
>> functionality ... but don't disable autoCommit.
>> 
>> Thanks,
>> Shawn
> 


Re: Ignore faceting for particular fields in solr using Solrconfig.xml

2019-05-22 Thread Bernd Fehling

Have a look at "invariants" for your requestHandler in solrconfig.xml.
It might be an option for you.

Regards
Bernd


On 22.05.19 at 22:23, RaviTeja wrote:

Hello Solr Expert,

How are you?

I am trying to ignore faceting for some of the fields. Can you please help me
ignore faceting using solrconfig.xml?
I tried, but I could only ignore faceting for all fields, which is not useful. I'm trying
to ignore only some specific fields.

Really Appreciate your help for the response!

Regards,
Ravi



Re: Does Solr support retrieve a string text and get its filename accordingly?

2019-05-22 Thread Jörn Franke
You can do much more than grep. I recommend getting a book on Solr and reading 
through it. Then you will have the full context and can see whether it is useful for 
you. 

> On 23.05.2019 at 07:44, luckydog xf wrote:
> 
> Hi, list,
> 
> A quick question: we have tons of Microsoft docx/PDF files (some PDFs
> are scanned copies), and we want to load them into Apache Solr, search for a
> few keywords contained in the files, and return the matching filenames.
> 
>   # It's the same thing as `grep -r KEYWORD /PATH/XXX` on a Linux system.
> 
>   Is it doable?
> 
>   Thanks,


Re: Does Solr support retrieve a string text and get its filename accordingly?

2019-05-22 Thread Mohomed Rimash
Hi

You can definitely do this. What you have to do is (a rough SolrJ sketch follows below):
1. Feed the docs/PDFs to Solr (Solr supports rich-format file import).
2. Index them with the corresponding analyzers (if it's just a string match, the default
is adequate; if you want phonetic and partial matches you have to add more
analyzers).
3. Create a query (edismax or dismax) with your keyword and apply it to the
index.
4. To try it out you can use the default GUI which comes with Solr.
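
A minimal SolrJ sketch of steps 1 and 3 (the core name, ids and field names are made up; note that
scanned, image-only PDFs need OCR first - Tika alone will not extract their text):

    import java.io.File;
    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.client.solrj.request.AbstractUpdateRequest;
    import org.apache.solr.client.solrj.request.ContentStreamUpdateRequest;
    import org.apache.solr.common.SolrDocument;

    HttpSolrClient client = new HttpSolrClient.Builder("http://localhost:8983/solr/docs").build();

    // 1. push a file through the extracting handler (Tika) and store its filename as a literal
    ContentStreamUpdateRequest up = new ContentStreamUpdateRequest("/update/extract");
    up.addFile(new File("/data/report.docx"), "application/octet-stream");
    up.setParam("literal.id", "report-1");
    up.setParam("literal.filename_s", "report.docx");
    up.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
    up.process(client);

    // 3. search the extracted text and return only the filenames
    SolrQuery q = new SolrQuery("keyword");          // or content:keyword, depending on the schema/df
    q.setFields("id", "filename_s");
    for (SolrDocument d : client.query(q).getResults()) {
      System.out.println(d.getFieldValue("filename_s"));
    }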

Regards
Rimash

On Thu, 23 May 2019 at 11:14, luckydog xf  wrote:

> Hi, list,
>
> A quick question: we have tons of Microsoft docx/PDF files (some PDFs
> are scanned copies), and we want to load them into Apache Solr, search for a
> few keywords contained in the files, and return the matching filenames.
>
> # It's the same thing as `grep -r KEYWORD /PATH/XXX` on a Linux system.
>
> Is it doable?
>
>Thanks,
>


Re: Unable to run solr | SolrCore Initialization Failures {{Core}}: {{error}}

2019-05-22 Thread Mohomed Rimash
Hi

Do you know which cores (the core names) are expected in the Solr instance
you are trying to use? If so, create those cores manually and try again.

Regards
Rimash

On Thu, 23 May 2019 at 11:07, Karthic Viswanathan <
karthic.viswan...@gmail.com> wrote:

>
> Hi,
>
> I am trying to install Solr on my Windows Server 2016 Standard edition.
> While the installation of Solr itself succeeds, I am not able to get it
> running.
>
> Every time after installation and starting the service, I see:
>
>  “SolrCore Initialization Failures {{Core}}: {{error}}”
>
>
>
> I am not sure what the error is since it is not very clear. Also, the log
> files are all empty. It has just a few warnings.  I have attached them for
> reference. Solr is a requirement for installing Sitecore CMS and I am not
> able to proceed any further.  Any help on this would be greatly
> appreciated.
>
>
> I have this same error with solr 7.2.1, 6.6.2.
> I tried running this with both nssm 2.4 and nssm 2.24 pre.
> I have jre 1.8.0_211 installed.
>
> --
> Regards,
> Karthic Viswanathan
>
>
> [image: solr.png]
>
>
>
> [image: log.png]
>
>
>
>


Does Solr support retrieve a string text and get its filename accordingly?

2019-05-22 Thread luckydog xf
Hi, list,

A quick question: we have tons of Microsoft docx/PDF files (some PDFs
are scanned copies), and we want to load them into Apache Solr, search for a
few keywords contained in the files, and return the matching filenames.

   # It's the same thing as `grep -r KEYWORD /PATH/XXX` on a Linux system.

   Is it doable?

   Thanks,


Unable to run solr | SolrCore Initialization Failures {{Core}}: {{error}}

2019-05-22 Thread Karthic Viswanathan
Hi,

I am trying to install Solr on my Windows Server 2016 Standard edition.
While the installation of Solr itself succeeds, I am not able to get it
running.

Every time after installation and starting the service, I see:

 “SolrCore Initialization Failures {{Core}}: {{error}}”



I am not sure what the error is since it is not very clear. Also, the log
files are all empty. It has just a few warnings.  I have attached them for
reference. Solr is a requirement for installing Sitecore CMS and I am not
able to proceed any further.  Any help on this would be greatly
appreciated.


I have this same error with solr 7.2.1, 6.6.2.
I tried running this with both nssm 2.4 and nssm 2.24 pre.
I have jre 1.8.0_211 installed.

-- 
Regards,
Karthic Viswanathan


[image: solr.png]



[image: log.png]


Re: Facet count incorrect

2019-05-22 Thread Erick Erickson
1> I strongly recommend you re-index into a new collection and switch to it 
with a collection alias rather than trying to re-index all the docs in place. Segment 
merging when the same field has dissimilar definitions is not guaranteed to do 
the right thing.

2> No. There are a few (very few) things that don’t require starting fresh. You can 
do some things like adding a lowercase filter, or adding or removing a field entirely, and 
the like. Even then you’ll go through a period of mixed-up results until the 
reindex is complete. But changing the type, changing from multiValued to 
singleValued or vice versa (particularly with docValues), etc. are all “fraught”.

My usual reply is “if you’re going to reindex everything anyway, why not just 
do it to a new collection and alias when you’re done?” It’s much safer.
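
For example, with SolrJ the final switch is a one-liner (hypothetical names: the application
queries the alias "products", and the freshly built collection is "products_v2"):

    // once "products_v2" is fully indexed and verified, repoint the alias atomically
    CollectionAdminRequest.createAlias("products", "products_v2").process(cloudSolrClient);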

Best,
Erick

> On May 22, 2019, at 3:06 PM, John Davis  wrote:
> 
> Hi there -
> Our facet counts are incorrect for a particular field and I suspect it is
> because we changed the type of the field from StrField to TextField. Two
> questions:
> 
> 1. If we do re-index all the documents in the index, would these counts get
> fixed?
> 2. Is there a "safe" way of changing field types that generally works?
> 
> *Old type:*
>   docValues="true" multiValued="true"/>
> 
> *New type:*
>   omitNorms="true" omitTermFreqAndPositions="true" indexed="true"
> stored="true" positionIncrementGap="100" sortMissingLast="true"
> multiValued="true">
> 
>  
>  
>
>  



Re: Ignore faceting for particular fields in solr using Solrconfig.xml

2019-05-22 Thread Erick Erickson
Just don’t ask for them. Or are you saying that users can specify arbitrary fields 
to facet on and you want to prevent certain fields from being possible?

No, there’s no good way to do that in solrconfig.xml. You could write a query 
component that stripped out certain fields from the facet.field parameter.
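
A rough sketch of such a component (the blocked field names are made up; you would register it
in solrconfig.xml as a searchComponent and add it to the request handler's first-components list):

import java.io.IOException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import org.apache.solr.common.params.FacetParams;
import org.apache.solr.common.params.ModifiableSolrParams;
import org.apache.solr.common.params.SolrParams;
import org.apache.solr.handler.component.ResponseBuilder;
import org.apache.solr.handler.component.SearchComponent;

public class FacetFieldFilter extends SearchComponent {
  // fields users should never be able to facet on (hypothetical names)
  private static final Set<String> BLOCKED =
      new HashSet<>(Arrays.asList("internal_notes_s", "cost_price_d"));

  @Override
  public void prepare(ResponseBuilder rb) throws IOException {
    SolrParams params = rb.req.getParams();
    String[] requested = params.getParams(FacetParams.FACET_FIELD);
    if (requested == null) return;
    ModifiableSolrParams filtered = new ModifiableSolrParams(params);
    filtered.remove(FacetParams.FACET_FIELD);
    for (String f : requested) {
      if (!BLOCKED.contains(f)) {
        filtered.add(FacetParams.FACET_FIELD, f);   // keep only the allowed facet fields
      }
    }
    rb.req.setParams(filtered);
  }

  @Override
  public void process(ResponseBuilder rb) throws IOException {
    // nothing to do here; the regular FacetComponent does the actual faceting
  }

  @Override
  public String getDescription() {
    return "Strips blocked fields from facet.field";
  }
}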

Likely the easiest would be to do that in the application I assume you have 
between Solr and your users.

Best,
Erick

> On May 22, 2019, at 1:23 PM, RaviTeja  wrote:
> 
> Hello Solr Expert,
> 
> How are you?
> 
> I am trying to ignore faceting for some of the fields. Can you please help me
> ignore faceting using solrconfig.xml?
> I tried, but I could only ignore faceting for all fields, which is not useful. I'm trying
> to ignore only some specific fields.
> 
> Really Appreciate your help for the response!
> 
> Regards,
> Ravi



Re: Is it possible to reconstruct non stored fields and turn those into stored fields

2019-05-22 Thread Erick Erickson
You might get some pointers from the Luke code….

All in all I’d focus on re-indexing somehow. Unless the original documents are 
just totally impossible to find again it’s probably easier.

Best,
Erick

> On May 22, 2019, at 3:30 PM, Shawn Heisey  wrote:
> 
> On 5/22/2019 3:51 PM, Pushkar Raste wrote:
>> Looks like giving Luke a shot is the answer. Can you point me to an example
>> of extracting the fields from the inverted index using Luke?
> 
> Luke is a GUI application that can view the Lucene index in considerable 
> detail.  To use Luke directly, you'd have to have somebody running it and 
> typing/copying what they find to some kind of system for indexing. It would 
> be a very manual process.
> 
> To do it programmatically, you would have to write code yourself using the 
> Lucene API.  I don't think we'd be able to point you at existing code.
> 
> Thanks,
> Shawn



Re: Is it possible to reconstruct non stored fields and turn those into stored fields

2019-05-22 Thread Shawn Heisey

On 5/22/2019 3:51 PM, Pushkar Raste wrote:

Looks like giving Luke a shot is the answer. Can you point me to an example
of extracting the fields from the inverted index using Luke?


Luke is a GUI application that can view the Lucene index in considerable 
detail.  To use Luke directly, you'd have to have somebody running it 
and typing/copying what they find to some kind of system for indexing. 
It would be a very manual process.


To do it programmatically, you would have to write code yourself using 
the Lucene API.  I don't think we'd be able to point you at existing code.


Thanks,
Shawn


Ignore faceting for particular fields in solr using Solrconfig.xml

2019-05-22 Thread RaviTeja
Hello Solr Expert,

How are you?

I am trying to ignore faceting for some of the fields. Can you please help me
ignore faceting using solrconfig.xml?
I tried, but I could only ignore faceting for all fields, which is not useful. I'm trying
to ignore only some specific fields.

Really Appreciate your help for the response!

Regards,
Ravi


Facet count incorrect

2019-05-22 Thread John Davis
Hi there -
Our facet counts are incorrect for a particular field and I suspect it is
because we changed the type of the field from StrField to TextField. Two
questions:

1. If we do re-index all the documents in the index, would these counts get
fixed?
2. Is there a "safe" way of changing field types that generally works?

*Old type:*
  

*New type:*
  

  
  

  


Re: Is it possible to reconstruct non stored fields and turn those into stored fields

2019-05-22 Thread Pushkar Raste
We have only a handful of fields that are stored and many (non-text) fields
which are neither stored nor have docValues :-(

Looks like giving Luke a shot is the answer. Can you point me to an example
of extracting the fields from the inverted index using Luke?

On Wed, May 22, 2019 at 11:52 AM Erick Erickson 
wrote:

> Well, if they’re all docValues or stored=true, sure. It’d be kind of
> slow.. The short form is “if you can specify fl=f1,f2,f3…. for all your
> fields and see all your values, then it’s easy if slow”.
>
> If that works _and_ you are on Solr 4.7+ cursorMark will help the “deep
> paging” issue.
>
> If they’re all docValues, you could use the /export handler to dump them
> all to a file and re-index that.
>
> If none of those are possible, you can do this but it’d be quite painful.
> Luke can reassemble a document (lossily for text fields, but in this case
> it’d be OK since they’re simple types) by examining the inverted index and
> pulling out the values. Painfully slow and you’d have to write custom code
> probably at the Lucene level to make it all work.
>
> Best,
> Erick
>
> > On May 22, 2019, at 8:11 AM, Pushkar Raste 
> wrote:
> >
> > I know this is a long shot. I am trying to move from Solr4 to Solr7.
> > Reindexing all the data from the source is difficult to do in a
> reasonable
> > time. All the fields are of basic types like int, long, float, double,
> > Boolean, date,  string.
> >
> > Since these fields don’t have analyzers, I was wondering if these fields
> > can be retrieved while iterating over the index while reading the documents.
> > --
> > — Pushkar Raste
>
> --
— Pushkar Raste


Slow ReadProcessor read fields Warnings - Ideas to investigate?

2019-05-22 Thread David Winter
Hello User Group,

we run Solr on HDFS and get a lot of the following warning:
Slow ReadProcessor read fields took 15093ms (threshold=1ms); ack: 
seqno: 3 reply: SUCCESS reply: SUCCESS reply: SUCCESS 
downstreamAckTimeNanos: 798309 flag: 0 flag: 0 flag: 0, targets: 
[DatanodeInfoWithStorage[xxx.xxx.xxx.xxx:50010,DS-xx,DISK], 
DatanodeInfoWithStorage[xxx.xxx.xxx.xxx:50010,DS-xx,DISK], 
DatanodeInfoWithStorage[xxx.xxx.xxx.xxx:50010,DS-xx,DISK]]

It started with the default threshold of 30 seconds. But even 10 seconds is 
still too much for a query, so we configured the warning threshold to 10 
seconds.
That resulted in a flood of warnings and uncovered slow HDFS read 
performance. The HDFS statistics look quite good and stable.

We are not sure how to investigate the reason and what we can improve to 
solve the issue.
Did anybody have similar issues? 

Mit freundlichen Grüßen / Kind regards

David Winter



Re: CloudSolrClient (any version). Find the node your query has connected to.

2019-05-22 Thread Erick Erickson
You have to be a little careful here, one thing I learned relatively recently 
is that there are in-memory structures that hold pointers to _all_ 
un-searchable docs (i.e. no new searchers have been opened since the doc was 
added/updated) to support real-time get. So if you’re indexing a _lot_ of docs 
that internal structure can grow quite large….

FWIW, delete-by-query is painful. Each one has to lock all indexing on all 
replicas while it completes. If you can use delete-by-id it’d be better.
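
Roughly, in SolrJ, the delete-by-query can become a query for the matching ids followed by
deleteById (collection and field names made up; for large result sets, page through with cursorMark):

    SolrQuery q = new SolrQuery("expire_dt:[* TO NOW-30DAYS]");   // whatever the delete-by-query was
    q.setFields("id");
    q.setRows(1000);
    List<String> ids = new ArrayList<>();
    for (SolrDocument d : client.query("collection1", q).getResults()) {
      ids.add((String) d.getFieldValue("id"));
    }
    if (!ids.isEmpty()) {
      client.deleteById("collection1", ids);    // blocks indexing far less than a delete-by-query
    }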

Let’s back up a bit and look at _why_ your nodes go into recovery…. Leave the 
replicas on if you can and look for “Leader Initiated Recovery” (not sure 
that’s the exact phrase, but you’ll see something very like that). If that’s 
the case, then one situation we’ve seen is that a request takes too long to 
return from a follower. So the sequence looks like this:

- leader gets update
- leader indexes locally _and_ forwards to follower
- follower is busy (and the delete-by-query could be why) and takes too long to 
respond so the request times out
- leader says “hmmm, I don’t know what happened so I’ll tell the follower to 
recover”.

Given your heavy update rate, there’ll be no chance for “peer sync” to fully 
recover so it’ll go into full recovery. That can sometimes be fixed by simply 
lengthening the timeout.

Otherwise also take a look at the logs and see if you can find a root cause for 
the replica going into recovery and we should see if we can fix that.

I didn’t ask what versions of Solr you’re using, but in the 7x code line (7.3 
IIRC) significant work was done to make recovery less likely.

Best,
Erick

> On May 22, 2019, at 10:27 AM, Shawn Heisey  wrote:
> 
> On 5/22/2019 10:47 AM, Russell Taylor wrote:
>> I will add that we have set commits to be only called by the loading 
>> program. We have turned off soft and autoCommits in the solrconfig.xml.
> 
> Don't turn off autoCommit.  Regular hard commits, typically with openSearcher 
> set to false so they don't interfere with change visibility, are extremely 
> important for good Solr operation.  Without it, the transaction logs will 
> grow out of control.  In addition to taking a lot of disk space, that will 
> cause a Solr restart to happen VERY slowly.  Note that a hard commit with 
> openSearcher set to false will be VERY fast -- doing them frequently is 
> usually not a problem for performance.  Sample configs in recent Solr 
> versions ship with autoCommit set to 15 seconds and openSearcher set to false.
> 
> Not using autoSoftCommit is a reasonable thing to do if you do not need that 
> functionality ... but don't disable autoCommit.
> 
> Thanks,
> Shawn



Re: CloudSolrClient (any version). Find the node your query has connected to.

2019-05-22 Thread Shawn Heisey

On 5/22/2019 10:47 AM, Russell Taylor wrote:

I will add that we have set commits to be only called by the loading program. 
We have turned off soft and autoCommits in the solrconfig.xml.


Don't turn off autoCommit.  Regular hard commits, typically with 
openSearcher set to false so they don't interfere with change 
visibility, are extremely important for good Solr operation.  Without 
it, the transaction logs will grow out of control.  In addition to 
taking a lot of disk space, that will cause a Solr restart to happen 
VERY slowly.  Note that a hard commit with openSearcher set to false 
will be VERY fast -- doing them frequently is usually not a problem for 
performance.  Sample configs in recent Solr versions ship with 
autoCommit set to 15 seconds and openSearcher set to false.


Not using autoSoftCommit is a reasonable thing to do if you do not need 
that functionality ... but don't disable autoCommit.


Thanks,
Shawn


RE: CloudSolrClient (any version). Find the node your query has connected to.

2019-05-22 Thread Russell Taylor
Thanks Eric,
I will add that we have set commits to be only called by the loading program. 
We have turned off soft and autoCommits in the solrconfig.xml.
This is so when we upload, we move from one list of docs to the new list in one 
atomic operation (delete, add and then commit).

I'll also add: This index holds 500,000,000 docs and under heavy uploading we 
get the nodes going into recovery. I'm presuming it's down to the commits being 
too far apart and causing the replication nodes to falter. This heavy upload is 
a small window of time and to get around this issue, I remove the replicas 
during this period and then add them back afterwards. The new recovery mode 
issue looks like it was down to heavy upload but outside the designated period.

So the most likely scenario is that I've created the issue with my tweaking, 
hope you can point me in the right direction.



<autoCommit>
   <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
   <openSearcher>false</openSearcher>
</autoCommit>

<autoSoftCommit>
   <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
</autoSoftCommit>

Regards

Russell Taylor



-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: 22 May 2019 16:45
To: solr-user@lucene.apache.org
Subject: Re: CloudSolrClient (any version). Find the node your query has 
connected to.

WARNING - External email from lucene.apache.org

OK, now we’re cooking with oil.

First, nodes in recovery shouldn’t make any difference to a query. They should 
not serve any part of a query so I think/hope that’s a red herring. At worst a 
node in recovery should pass the query on to another replica that is _not_ 
recovering.

When you’re looking at this, be aware that as long as _Solr_ is up and running 
on a node, it’ll accept queries. For simplicity let's say Solr1 hosts _only_ 
collection1_shard1_replica1 (cs1r1).

Now you fire a query at Solr1. It has the topology from ZooKeeper as well as 
its own internal knowledge of hosted replicas. For a top-level query it should 
send sub-queries out only to healthy replicas, bypassing its own recovering 
replica.

Let’s claim you fire the query at Solr2. First if there’s been time to 
propagate the down state of cs1r1 to ZooKeeper and Solr2 has the state, it 
shouldn’t even send a subrequest to cs1r1.

Now let’s say Solr2 hasn’t gotten the message yet and does send a query to 
cs1r1. cs1r1 should know its state is recovering and either return an error to 
Solr2 (which will pick a new replica to send that subrequest to) or forward it 
on to another healthy replica, I’m not quite sure which. In any case it should 
_not_ service the request from cs1r1.

If you do prove that a node serving requests that is really in recovery, that’s 
a fairly serious bug and we need to know lots of details.


Second, even if you did have the URL Solr sends the query to it wouldn’t help. 
Once a Solr node receives a query, it does its _own_ round robin for a 
subrequest to one replica of each shard, get’s the replies back then goes back 
out to the same replica for the final documents. So you still wouldn’t know 
what replica served the queries.

The fact that you say things come back into sync after commit points to 
autocommit times. I’m assuming you have an autocommit setting that opens a new 
searcher (openSearcher set to true in the “autocommit” section or any positive time 
in the autoSoftCommit section of solrconfig.xml). These commit points will fire 
at different wall-clock time, resulting in replicas temporarily having 
different searchable documents. BTW, the same thing applies if you send 
“commitWithin” in a SolrJ cloudSolrClient.add command…

Anyway, if you just fire a query at a specific replica and add &distrib=false, 
the replica will bring back only documents from that replica. We’re talking the 
replica, so part of the URL will be the complete replica name like 
"…./solr/collection1_shard1_replica_n1/query?q=*:*&distrib=false”

A very quick test would be, when you have a replica in recovery, stop indexing 
and wait for your autocommit interval to expire (one that opens a new searcher) 
or issue a commit to the collection. My bet/hope is that your counts will be 
just fine. You can use the &distrib=false parameter to query each replica of 
the relevant shard directly…

Best,
Erick

> On May 22, 2019, at 8:09 AM, Russell Taylor  wrote:
>
> Hi Erick,
> Every time any of the replication nodes goes into recovery mode we start 
> seeing queries which don't match the correct count. I'm being told zookeeper 
> will give me the correct node (Not one in recovery), but I want to prove it 
> as the query issue only comes up when any of the nodes are in recovery mode. 
> The application loading the data shows the correct counts and after 
> committing we check the results and they look correct.
>
> If I can get the URL I can prove that the problem is due to doing the query 
> against a node in recovery mode.
>
> I hope that explains the problem, thanks for your time.
>
> Regards
>
> Russell Taylor
>
>
>
> -Original Message-
> From: Erick Erickson [mailto:erickerick..

stats and facet is not working as expected after upgrade 5.3 to 6.0

2019-05-22 Thread ilango dhandapani
I upgraded my AD SolrCloud environment from Solr 5.3 to 6.0 and everything
worked fine.
But when I did QA after the upgrade, stats and facet queries were not working
as expected.

when I run
q=APPL_TOKN_ID_s:testApplication2&facet.limit=100&facet.field=CTNT_FILE_PATH_NM_s&fl=CTNT_FILE_PATH_NM_s&start=0&facet.mincount=1&facet=true

I am not getting any results in the facet fields column. But it is a valid field
and the same query works fine in my AD environment. Not sure what mistake I made.












--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Schema API Version 2 - 7.6.0

2019-05-22 Thread Joe Obernberger

Hi - according to the documentation here:

https://lucene.apache.org/solr/guide/7_6/schema-api.html

The V2 API is located at api/cores/collection/schema
However the documentation here:
https://lucene.apache.org/solr/guide/7_6/v2-api.html

has it at api/c/collection/schema

I believe the latter is correct - true?  Thank you!

-Joe Obernberger




Re: Cluster with no overseer?

2019-05-22 Thread Erick Erickson
110 isn’t all that many, well within the normal range _assuming_ that they are 
being processed…. When you restart Solr, every state change operation writes an 
operation to the work queue which can mount up.

Perhaps you’re hitting: https://issues.apache.org/jira/browse/SOLR-13416?

In which case restarting ZK should fix it since all the work items are 
ephemeral and will go away if ZK restarts. Of course shut them all down rather 
than doing a rolling restart.

You shouldn’t need to clean anything in ZK associated with this since those are 
are all (I’m pretty sure) “ephemeral nodes” and should just disappear when ZK 
shuts down.

If you’re feeling really brave, you could try to use "bin/solr zk rm” to nuke 
the ephemeral nodes and try the rolling restart of Solr nodes, but only as a 
last resort IMO.

Best,
Erick

> On May 22, 2019, at 9:11 AM, Walter Underwood  wrote:
> 
> The ZK ensemble appears to be OK. It is the Solr-related stuff that is 
> borked. There are 110 items in /overseer/collection-queue-work/, which 
> doesn’t seem healthy.
> 
> If it is really hosed, I’ll shut down all the nodes, clean out the files in 
> Zookeeper and start over.
> 
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
> 
>> On May 22, 2019, at 8:53 AM, Erick Erickson  wrote:
>> 
>> Good luck, this kind of assumes that your ZK ensemble is healthy of course...
>> 
>>> On May 22, 2019, at 8:23 AM, Walter Underwood  wrote:
>>> 
>>> Thanks, we’ll try that. Bouncing one Solr node doesn’t fix it, because we 
>>> did a rolling restart yesterday.
>>> 
>>> wunder
>>> Walter Underwood
>>> wun...@wunderwood.org
>>> http://observer.wunderwood.org/  (my blog)
>>> 
 On May 22, 2019, at 8:21 AM, Erick Erickson  
 wrote:
 
 Walter:
 
 I have no idea what the root cause is here, this really shouldn’t happen. 
 But the Overseer role (and I’m assuming you’re talking Solr’s Overseer) is 
 assigned similarly to a shard leader, the same election process happens. 
 All the election nodes are ephemeral ZK nodes.
 
 Solr’s Overseer is _not_ fixed to a particular Solr node, although you can 
 assign a preferred role of Overseer in those (rare) cases where there are 
 so many state changes for ZooKeeper that it’s advisable for them to run on 
 a dedicated machine.
 
 Overseer assignment is automatic. This should work;
 1> shut everything down, Solr and Zookeeper
 2> start your ZooKeepers and let them all get in sync with each other
 3> start your Solr nodes. It might take 3 minutes or more to bring up the 
 first Solr node, there’s up to a 180 second delay if leaders are not 
 findable easily.
 
 That should cause Solr to elect an overseer, probably the first Solr node 
 to come up.
 
 It _might_ work to bounce just one Solr node, seeing the Overseer election 
 queue empty it may elect itself. That said, the overseer election queue 
 won’t contain the rest of the Solr nodes like it should, so if that works 
 you should probably bounce the rest of the Solr servers one by one to 
 restore the proper election queue process.
 
 Not a fix for the root cause of course, but should get things operating 
 again. I’ll add that I haven’t seen this happen in the field to my 
 recollection, if at all.
 
 Best,
 Erick
 
> On May 21, 2019, at 9:04 PM, Will Martin  wrote:
> 
> Worked with Fusion and Zookeeper at GSA for 18 months: admin role.
> 
> Before blowing it away, you could try:
> 
> - id a candidate node, with a snapshot you just might think is old enough
> to be robust.
> - clean data for zk nodes otherwise.
> - bring up the chosen node and wait for it to settle[wish i could remember
> why i called what i saw that]
> - bring up other nodes 1 at a time.  let each one fully sync to follower 
> of
> the new leader.
> - they should each in turn request the snapshot from the lead. then you
> have
> 
> : align your collections with the ensemble. and for the life of me i can't
> remember there being anything particularly tricky about that with fusion ,
> which means I can't remember what I did... or have it doc'd at home. ;-)
> 
> 
> Will Martin
> DEVOPS ENGINEER
> 540.454.9565
> 
> 8609 WESTWOOD CENTER DR, SUITE 475
> VIENNA, VA 22182
> geturgently.com
> 
> 
> On Tue, May 21, 2019 at 11:40 PM Walter Underwood 
> wrote:
> 
>> Yes, please. I have the logs from each of the Zookeepers.
>> 
>> We are running 3.4.12.
>> 
>> wunder
>> Walter Underwood
>> wun...@wunderwood.org
>> http://observer.wunderwood.org/  (my blog)
>> 
>>> On May 21, 2019, at 6:49 PM, Will Martin  wrote:
>>> 
>>> Walter. Can I cross-post to zk-dev?
>>> 
>>> 
>>> 
>>> Will Martin
>>> DEVOPS ENGINEER
>

Re: Cluster with no overseer?

2019-05-22 Thread Walter Underwood
The ZK ensemble appears to be OK. It is the Solr-related stuff that is borked. 
There are 110 items in /overseer/collection-queue-work/, which doesn’t seem 
healthy.

If it is really hosed, I’ll shut down all the nodes, clean out the files in 
Zookeeper and start over.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On May 22, 2019, at 8:53 AM, Erick Erickson  wrote:
> 
> Good luck, this kind of assumes that your ZK ensemble is healthy of course...
> 
>> On May 22, 2019, at 8:23 AM, Walter Underwood  wrote:
>> 
>> Thanks, we’ll try that. Bouncing one Solr node doesn’t fix it, because we 
>> did a rolling restart yesterday.
>> 
>> wunder
>> Walter Underwood
>> wun...@wunderwood.org
>> http://observer.wunderwood.org/  (my blog)
>> 
>>> On May 22, 2019, at 8:21 AM, Erick Erickson  wrote:
>>> 
>>> Walter:
>>> 
>>> I have no idea what the root cause is here, this really shouldn’t happen. 
>>> But the Overseer role (and I’m assuming you’re talking Solr’s Overseer) is 
>>> assigned similarly to a shard leader, the same election process happens. 
>>> All the election nodes are ephemeral ZK nodes.
>>> 
>>> Solr’s Overseer is _not_ fixed to a particular Solr node, although you can 
>>> assign a preferred role of Overseer in those (rare) cases where there are 
>>> so many state changes for ZooKeeper that it’s advisable for them to run on 
>>> a dedicated machine.
>>> 
>>> Overseer assignment is automatic. This should work;
>>> 1> shut everything down, Solr and Zookeeper
>>> 2> start your ZooKeepers and let them all get in sync with each other
>>> 3> start your Solr nodes. It might take 3 minutes or more to bring up the 
>>> first Solr node, there’s up to a 180 second delay if leaders are not 
>>> findable easily.
>>> 
>>> That should cause Solr to elect an overseer, probably the first Solr node 
>>> to come up.
>>> 
>>> It _might_ work to bounce just one Solr node, seeing the Overseer election 
>>> queue empty it may elect itself. That said, the overseer election queue 
>>> won’t contain the rest of the Solr nodes like it should, so if that works 
>>> you should probably bounce the rest of the Solr servers one by one to 
>>> restore the proper election queue process.
>>> 
>>> Not a fix for the root cause of course, but should get things operating 
>>> again. I’ll add that I haven’t seen this happen in the field to my 
>>> recollection, if at all.
>>> 
>>> Best,
>>> Erick
>>> 
 On May 21, 2019, at 9:04 PM, Will Martin  wrote:
 
 Worked with Fusion and Zookeeper at GSA for 18 months: admin role.
 
 Before blowing it away, you could try:
 
 - id a candidate node, with a snapshot you just might think is old enough
 to be robust.
 - clean data for zk nodes otherwise.
 - bring up the chosen node and wait for it to settle[wish i could remember
 why i called what i saw that]
 - bring up other nodes 1 at a time.  let each one fully sync to follower of
 the new leader.
 - they should each in turn request the snapshot from the lead. then you
 have
 
 : align your collections with the ensemble. and for the life of me i can't
 remember there being anything particularly tricky about that with fusion ,
 which means I can't remember what I did... or have it doc'd at home. ;-)
 
 
 Will Martin
 DEVOPS ENGINEER
 540.454.9565
 
 8609 WESTWOOD CENTER DR, SUITE 475
 VIENNA, VA 22182
 geturgently.com
 
 
 On Tue, May 21, 2019 at 11:40 PM Walter Underwood 
 wrote:
 
> Yes, please. I have the logs from each of the Zookeepers.
> 
> We are running 3.4.12.
> 
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
> 
>> On May 21, 2019, at 6:49 PM, Will Martin  wrote:
>> 
>> Walter. Can I cross-post to zk-dev?
>> 
>> 
>> 
>> Will Martin
>> DEVOPS ENGINEER
>> 540.454.9565
>> 
>> 
>> 
>> 8609 WESTWOOD CENTER DR, SUITE 475
>> VIENNA, VA 22182
>> geturgently.com 
>> 
>> 
>> 
>> 
>>> On May 21, 2019, at 9:26 PM, Will Martin  wmar...@urgent.ly>> wrote:
>>> 
>>> +1
>>> 
>>> Will Martin
>>> DEVOPS ENGINEER
>>> 540.454.9565
>>> 
>>> 8609 WESTWOOD CENTER DR, SUITE 475
>>> VIENNA, VA 22182
>>> geturgently.com 
>>> 
>>> 
>>> On Tue, May 21, 2019 at 7:39 PM Walter Underwood  > wrote:
>>> ADDROLE times out after 180 seconds. This seems to be an unrecoverable
> state for the cluster, so that is a pretty serious bug.
>>> 
>>> wunder
>>> Walter Underwood
>>> wun...@wunderwood.org 
>>> http://observer.wunderwood.org/   (my
> blog)
>>> 
 On May 21, 2019, at 4:10 PM, Walter Underwood  

Slow soft-commit

2019-05-22 Thread André Widhani
Hi everyone,

I need some advice how to debug slow soft commits.

We use Solr for searches in a DAM system and in similar setups, soft
commits take about one to two seconds, in this case nearly ten seconds.
Solr runs on a dedicated VM with eight cores and 64 GB RAM (16G heap),
which is common scenario with our software and the index holds about 20
million documents. Queries are as fast as expected.

This is Solr 7.5.0, stand-alone, auto hard-commit set to 60 seconds, no
explicit soft-commits but documents added with commitWithin=5000 or 1000
depending on the use case. No warm-up queries, caches set to zero.

I enabled infostream and debug logging. Here is a little test case where I
stopped any other requests to Solr and just manually added a single
document and then posted a soft commit request.

2019-05-22 17:19:42.160 INFO  (qtp26728049-20) o.a.s.u.DirectUpdateHandler2
start
commit{_version_=1634245942610755584,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=true,prepareCommit=false}
2019-05-22 17:19:42.160 DEBUG (qtp26728049-20) o.a.s.u.UpdateLog TLOG:
preSoftCommit: prevMap=1930097580 new map=1023061476
2019-05-22 17:19:42.160 DEBUG (qtp26728049-20)
o.a.s.c.CachingDirectoryFactory Reusing cached directory:
CachedDir<>
2019-05-22 17:19:42.160 DEBUG (qtp26728049-20)
o.a.s.c.CachingDirectoryFactory Releasing directory:
/solr/core-tex73oy02hnxgx1dqc14p5o-index1 0 false
2019-05-22 17:19:42.160 INFO  (qtp26728049-20) o.a.s.u.LoggingInfoStream
[DW][qtp26728049-20]: anyChanges? numDocsInRam=1 deletes=true
hasTickets:false pendingChangesInFullFlush: false
2019-05-22 17:19:42.160 INFO  (qtp26728049-20) o.a.s.u.LoggingInfoStream
[IW][qtp26728049-20]: nrtIsCurrent: infoVersion matches: false; DW changes:
true; BD changes: false
2019-05-22 17:19:42.160 INFO  (qtp26728049-20) o.a.s.u.LoggingInfoStream
[IW][qtp26728049-20]: flush at getReader
... a lot of things logged here (omitted) that happen within milli-seconds
...
2019-05-22 17:19:42.168 INFO  (qtp26728049-20) o.a.s.u.LoggingInfoStream
[IW][qtp26728049-20]: getReader took 8 msec
2019-05-22 17:19:47.499 INFO  (qtp26728049-20) o.a.s.s.SolrIndexSearcher
Opening [Searcher@6211242f[core-tex73oy02hnxgx1dqc14p5o-index1] main]
2019-05-22 17:19:47.499 DEBUG (qtp26728049-20)
o.a.s.c.CachingDirectoryFactory incRef'ed:
CachedDir<>
2019-05-22 17:19:50.233 DEBUG (qtp26728049-20) o.a.s.s.SolrIndexSearcher
Closing [Searcher@78d9785[core-tex73oy02hnxgx1dqc14p5o-index1] realtime]
2019-05-22 17:19:50.233 INFO  (qtp26728049-20) o.a.s.u.LoggingInfoStream
[IW][qtp26728049-20]: decRefDeleter for NRT reader version=10560246
segments=_22lfz(7.5.0):c1033782/56603:delGen=2772
_22lfp(7.5.0):c1025574/39055:delGen=2113
_22lfr(7.5.0):c759249/32191:delGen=1386
_26q49(7.5.0):c923418/29825:delGen=958
_22lfx(7.5.0):c684064/30952:delGen=1098
_22lfv(7.5.0):c856317/78777:delGen=928
_22lg1(7.5.0):c1062384/188447:delGen=1750
_22lg0(7.5.0):c561881/1480:delGen=386
_22lg5(7.5.0):c1104218/1004:delGen=139 _22lgh(7.5.0):c1156482/46:delGen=33
_22lgf(7.5.0):c626273/27:delGen=19 _22lfy(7.5.0):c697224/6:delGen=6
_22lgd(7.5.0):c399283/6:delGen=3 _22lgg(7.5.0):c482373/3:delGen=3
_22lg4(7.5.0):c656746/2:delGen=2 _22lgc(7.5.0):c664274/3:delGen=3
_22lg7(7.5.0):c703377 _22lfu(7.5.0):c700340/3:delGen=3
_22lg2(7.5.0):c743334 _22lg6(7.5.0):c1091387/44659:delGen=946
_22lfs(7.5.0):c845018 _22lg8(7.5.0):c675649 _22lg9(7.5.0):c686292
_22lft(7.5.0):c636751/1004:delGen=300 _22lgb(7.5.0):c664531
_22lga(7.5.0):c647696/1:delGen=1 _22lge(7.5.0):c659794
_22lfw(7.5.0):c568537/1:delGen=1 _22lg3(7.5.0):c837568/1426:delGen=423
_26r20(7.5.0):c63899/10456:delGen=257 _273qn(7.5.0):c39076/8075:delGen=323
_27q6g(7.5.0):c40830/8111:delGen=195 _27aii(7.5.0):c30182/6777:delGen=287
_27tkq(7.5.0):c57620/6234:delGen=162 _27jrm(7.5.0):c33298/4797:delGen=202
_280zq(7.5.0):c60476/2341:delGen=173 _28h7x(7.5.0):c48570/453:delGen=35
_28c75(7.5.0):c29536/1088:delGen=47 _28h71(7.5.0):c1191/138:delGen=5
_28idl(7.5.0):c782/57:delGen=6 _28ihc(7.5.0):c5398/348:delGen=9
_28ig3(7.5.0):c1118/302:delGen=4 _28j1s(7.5.0):c917/269:delGen=2
_28iu9(7.5.0):c758/129:delGen=4 _28j23(7.5.0):c567/70:delGen=2
_28j3j(7.5.0):c802/11:delGen=1 _28j3u(7.5.0):c697/11:delGen=2
_28iz5(7.5.0):c858/116:delGen=4 _28j2n(7.5.0):c566/78:delGen=1
_28j3t(7.5.0):C20/11:delGen=1 _28j3v(7.5.0):C13/8:delGen=1 _28j3y(7.5.0):C1
2019-05-22 17:19:50.234 DEBUG (qtp26728049-20)
o.a.s.c.CachingDirectoryFactory Releasing directory:
/solr/core-tex73oy02hnxgx1dqc14p5o-index1/index 3 false
2019-05-22 17:19:50.234 INFO  (qtp26728049-20) o.a.s.u.DirectUpdateHandler2
end_commit_flush
2019-05-22 17:19:50.234 DEBUG
(searcherExecutor-10-thread-1-processing-x:core-tex73oy02hnxgx1dqc14p5o-index1)
o.a.s.s.SolrIndexSearcher autowarming
[Searcher@6211242f[core-tex73oy02hnxgx1dqc14p5o-index1]
main{ExitableDirectoryReader(UninvertingDirectoryReader(Uninverting(_22lfz(7.5.0):c1033782/56603:delGen=2772)
...  ...Uninverting(_28j3z(7.5.0):C1)))}] from
[Searcher@680b1764[core-tex73oy0

Re: CloudSolrClient (any version). Find the node your query has connected to.

2019-05-22 Thread Erick Erickson
OK, now we’re cooking with oil.

First, nodes in recovery shouldn’t make any difference to a query. They should 
not serve any part of a query so I think/hope that’s a red herring. At worst a 
node in recovery should pass the query on to another replica that is _not_ 
recovering.

When you’re looking at this, be aware that as long as _Solr_ is up and running 
on a node, it’ll accept queries. For simplicity let's say Solr1 hosts _only_ 
collection1_shard1_replica1 (cs1r1).

Now you fire a query at Solr1. It has the topology from ZooKeeper as well as 
its own internal knowledge of hosted replicas. For a top-level query it should 
send sub-queries out only to healthy replicas, bypassing its own recovering 
replica.

Let’s claim you fire the query at Solr2. First if there’s been time to 
propagate the down state of cs1r1 to ZooKeeper and Solr2 has the state, it 
shouldn’t even send a subrequest to cs1r1.

Now let’s say Solr2 hasn’t gotten the message yet and does send a query to 
cs1r1. cs1r1 should know its state is recovering and either return an error to 
Solr2 (which will pick a new replica to send that subrequest to) or forward it 
on to another healthy replica, I’m not quite sure which. In any case it should 
_not_ service the request from cs1r1.

If you do prove that a node serving requests that is really in recovery, that’s 
a fairly serious bug and we need to know lots of details.


Second, even if you did have the URL Solr sends the query to, it wouldn’t help. 
Once a Solr node receives a query, it does its _own_ round robin for a 
subrequest to one replica of each shard, gets the replies back then goes back 
out to the same replica for the final documents. So you still wouldn’t know 
what replica served the queries.

The fact that you say things come back into sync after commit points to 
autocommit times. I’m assuming you have an autocommit setting that opens a new 
searcher (openSearcher set to true in the “autocommit” section or any positive time 
in the autoSoftCommit section of solrconfig.xml). These commit points will fire 
at different wall-clock time, resulting in replicas temporarily having 
different searchable documents. BTW, the same thing applies if you send 
“commitWithin” in a SolrJ cloudSolrClient.add command…
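
For reference, that is just the third argument of add (a sketch):

    // 5000 = commitWithin in ms; every replica makes the doc visible within ~5 seconds,
    // but each one opens its new searcher at a slightly different wall-clock moment
    cloudSolrClient.add("collection1", doc, 5000);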

Anyway, if you just fire a query at a specific replica and add &distrib=false, 
the replica will bring back only documents from that replica. We’re talking the 
replica, so part of the URL will be the complete replica name like 
"…./solr/collection1_shard1_replica_n1/query?q=*:*&distrib=false”

A very quick test would be, when you have a replica in recovery, stop indexing 
and wait for your autocommit interval to expire (one that opens a new searcher) 
or issue a commit to the collection. My bet/hope is that your counts will be 
just fine. You can use the &distrib=false parameter to query each replica of 
the relevant shard directly…

Best,
Erick

> On May 22, 2019, at 8:09 AM, Russell Taylor  wrote:
> 
> Hi Erick,
> Every time any of the replication nodes goes into recovery mode we start 
> seeing queries which don't match the correct count. I'm being told zookeeper 
> will give me the correct node (Not one in recovery), but I want to prove it 
> as the query issue only comes up when any of the nodes are in recovery mode. 
> The application loading the data shows the correct counts and after 
> committing we check the results and they look correct.
> 
> If I can get the URL I can prove that the problem is due to doing the query 
> against a node in recovery mode.
> 
> I hope that explains the problem, thanks for your time.
> 
> Regards
> 
> Russell Taylor
> 
> 
> 
> -Original Message-
> From: Erick Erickson [mailto:erickerick...@gmail.com]
> Sent: 22 May 2019 15:50
> To: solr-user@lucene.apache.org
> Subject: Re: CloudSolrClient (any version). Find the node your query has 
> connected to.
> 
> WARNING - External email from lucene.apache.org
> 
> Why do you want to know? You’ve asked how to do X without telling us what 
> problem Y you’re trying to solve (the XY problem) and frequently that leads 
> to a lot of wasted time…..
> 
> Under the covers CloudSolrClient uses a pretty simple round-robin load 
> balancer to pick a Solr node to send the query to so “it depends”…..
> 
>> On May 22, 2019, at 5:51 AM, Jörn Franke  wrote:
>> 
>> You have to provide the addresses of the zookeeper ensemble - it will figure 
>> it out on its own based on information in Zookeeper.
>> 
>>> Am 22.05.2019 um 14:38 schrieb Russell Taylor :
>>> 
>>> Hi,
>>> Using CloudSolrClient, how do I find the node (I have 3 nodes for this 
>>> collection on our 6 node cluster) the query has connected to.
>>> I'm hoping to get the full URL if possible.
>>> 
>>> 
>>> Regards
>>> 
>>> Russell Taylor
>>> 
>>> 
>>> 
>>> 
>>> 
>>> This message may contain confidential information and is intended for 
>>> specific recipients unless explicitly noted otherwise. If you have reason 
>>> to believe you

Re: Cluster with no overseer?

2019-05-22 Thread Erick Erickson
Good luck, this kind of assumes that your ZK ensemble is healthy of course...

> On May 22, 2019, at 8:23 AM, Walter Underwood  wrote:
> 
> Thanks, we’ll try that. Bouncing one Solr node doesn’t fix it, because we did 
> a rolling restart yesterday.
> 
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
> 
>> On May 22, 2019, at 8:21 AM, Erick Erickson  wrote:
>> 
>> Walter:
>> 
>> I have no idea what the root cause is here, this really shouldn’t happen. 
>> But the Overseer role (and I’m assuming you’re talking Solr’s Overseer) is 
>> assigned similarly to a shard leader, the same election process happens. All 
>> the election nodes are ephemeral ZK nodes.
>> 
>> Solr’s Overseer is _not_ fixed to a particular Solr node, although you can 
>> assign a preferred role of Overseer in those (rare) cases where there are so 
>> many state changes for ZooKeeper that it’s advisable for them to run on a 
>> dedicated machine.
>> 
>> Overseer assignment is automatic. This should work;
>> 1> shut everything down, Solr and Zookeeper
>> 2> start your ZooKeepers and let them all get in sync with each other
>> 3> start your Solr nodes. It might take 3 minutes or more to bring up the 
>> first Solr node, there’s up to a 180 second delay if leaders are not 
>> findable easily.
>> 
>> That should cause Solr to elect an overseer, probably the first Solr node to 
>> come up.
>> 
>> It _might_ work to bounce just one Solr node, seeing the Overseer election 
>> queue empty it may elect itself. That said, the overseer election queue 
>> won’t contain the rest of the Solr nodes like it should, so if that works 
>> you should probably bounce the rest of the Solr servers one by one to 
>> restore the proper election queue process.
>> 
>> Not a fix for the root cause of course, but should get things operating 
>> again. I’ll add that I haven’t seen this happen in the field to my 
>> recollection, if at all.
>> 
>> Best,
>> Erick
>> 
>>> On May 21, 2019, at 9:04 PM, Will Martin  wrote:
>>> 
>>> Worked with Fusion and Zookeeper at GSA for 18 months: admin role.
>>> 
>>> Before blowing it away, you could try:
>>> 
>>> - id a candidate node, with a snapshot you just might think is old enough
>>> to be robust.
>>> - clean data for zk nodes otherwise.
>>> - bring up the chosen node and wait for it to settle[wish i could remember
>>> why i called what i saw that]
>>> - bring up other nodes 1 at a time.  let each one fully sync to follower of
>>> the new leader.
>>> - they should each in turn request the snapshot from the lead. then you
>>> have
>>> 
>>> : align your collections with the ensemble. and for the life of me i can't
>>> remember there being anything particularly tricky about that with fusion ,
>>> which means I can't remember what I did... or have it doc'd at home. ;-)
>>> 
>>> 
>>> Will Martin
>>> DEVOPS ENGINEER
>>> 540.454.9565
>>> 
>>> 8609 WESTWOOD CENTER DR, SUITE 475
>>> VIENNA, VA 22182
>>> geturgently.com
>>> 
>>> 
>>> On Tue, May 21, 2019 at 11:40 PM Walter Underwood 
>>> wrote:
>>> 
 Yes, please. I have the logs from each of the Zookeepers.
 
 We are running 3.4.12.
 
 wunder
 Walter Underwood
 wun...@wunderwood.org
 http://observer.wunderwood.org/  (my blog)
 
> On May 21, 2019, at 6:49 PM, Will Martin  wrote:
> 
> Walter. Can I cross-post to zk-dev?
> 
> 
> 
> Will Martin
> DEVOPS ENGINEER
> 540.454.9565
> 
> 
> 
> 8609 WESTWOOD CENTER DR, SUITE 475
> VIENNA, VA 22182
> geturgently.com 
> 
> 
> 
> 
>> On May 21, 2019, at 9:26 PM, Will Martin >>> wmar...@urgent.ly>> wrote:
>> 
>> +1
>> 
>> Will Martin
>> DEVOPS ENGINEER
>> 540.454.9565
>> 
>> 8609 WESTWOOD CENTER DR, SUITE 475
>> VIENNA, VA 22182
>> geturgently.com 
>> 
>> 
>> On Tue, May 21, 2019 at 7:39 PM Walter Underwood >>> > wrote:
>> ADDROLE times out after 180 seconds. This seems to be an unrecoverable
 state for the cluster, so that is a pretty serious bug.
>> 
>> wunder
>> Walter Underwood
>> wun...@wunderwood.org 
>> http://observer.wunderwood.org/   (my
 blog)
>> 
>>> On May 21, 2019, at 4:10 PM, Walter Underwood >>> > wrote:
>>> 
>>> We have a 6.6.2 cluster in prod that appears to have no overseer. In
 /overseer_elect on ZK, there is an election folder, but no leader document.
 An OVERSEERSTATUS request fails with a timeout.
>>> 
>>> I’m going to try ADDROLE, but I’d be delighted to hear any other
 ideas. We’ve diverted all the traffic to the backing cluster, so we can
 blow this one away and rebuild.
>>> 
>>> Looking at the Zookeeper logs, I see a few instances of network
 failures across all 

Re: Is it possible to reconstruct non stored fields and turn those into stored fields

2019-05-22 Thread Erick Erickson
Well, if they’re all docValues or stored=true, sure. It’d be kind of slow. The 
short form is “if you can specify fl=f1,f2,f3…. for all your fields and see all 
your values, then it’s easy if slow”.

If that works _and_ you are on Solr 4.7+ cursorMark will help the “deep paging” 
issue.
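
A rough SolrJ sketch of that cursorMark loop (collection and field names are made up):

    SolrQuery q = new SolrQuery("*:*");
    q.setFields("id", "price_l", "created_dt");          // the full fl list of your fields
    q.setRows(1000);
    q.addSort(SolrQuery.SortClause.asc("id"));            // cursorMark needs a sort on the uniqueKey
    String cursor = CursorMarkParams.CURSOR_MARK_START;
    while (true) {
      q.set(CursorMarkParams.CURSOR_MARK_PARAM, cursor);
      QueryResponse rsp = oldClient.query("old_collection", q);
      for (SolrDocument d : rsp.getResults()) {
        // turn d into a SolrInputDocument and add it to the new collection here
      }
      String next = rsp.getNextCursorMark();
      if (cursor.equals(next)) break;                     // no more documents
      cursor = next;
    }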

If they’re all docValues, you could use the /export handler to dump them all to 
a file and re-index that.

If none of those are possible, you can do this but it’d be quite painful. Luke 
can reassemble a document (lossily for text fields, but in this case it’d be OK 
since they’re simple types) by examining the inverted index and pulling out the 
values. Painfully slow and you’d have to write custom code probably at the 
Lucene level to make it all work.

Best,
Erick

> On May 22, 2019, at 8:11 AM, Pushkar Raste  wrote:
> 
> I know this is a long shot. I am trying to move from Solr4 to Solr7.
> Reindexing all the data from the source is difficult to do in a reasonable
> time. All the fields are of basic types like int, long, float, double,
> Boolean, date,  string.
> 
> Since these fields don’t have analyzers, I was wondering if these fields
> can be retrieved while iterating over the index while reading the documents.
> -- 
> — Pushkar Raste



Re: Cluster with no overseer?

2019-05-22 Thread Walter Underwood
Thanks, we’ll try that. Bouncing one Solr node doesn’t fix it, because we did a 
rolling restart yesterday.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On May 22, 2019, at 8:21 AM, Erick Erickson  wrote:
> 
> Walter:
> 
> I have no idea what the root cause is here, this really shouldn’t happen. But 
> the Overseer role (and I’m assuming you’re talking Solr’s Overseer) is 
> assigned similarly to a shard leader, the same election process happens. All 
> the election nodes are ephemeral ZK nodes.
> 
> Solr’s Overseer is _not_ fixed to a particular Solr node, although you can 
> assign a preferred role of Overseer in those (rare) cases where there are so 
> many state changes for ZooKeeper that it’s advisable for them to run on a 
> dedicated machine.
> 
> Overseer assignment is automatic. This should work;
> 1> shut everything down, Solr and Zookeeper
> 2> start your ZooKeepers and let them all get in sync with each other
> 3> start your Solr nodes. It might take 3 minutes or more to bring up the 
> first Solr node, there’s up to a 180 second delay if leaders are not findable 
> easily.
> 
> That should cause Solr to elect an overseer, probably the first Solr node to 
> come up.
> 
> It _might_ work to bounce just one Solr node, seeing the Overseer election 
> queue empty it may elect itself. That said, the overseer election queue won’t 
> contain the rest of the Solr nodes like it should, so if that works you 
> should probably bounce the rest of the Solr servers one by one to restore the 
> proper election queue process.
> 
> Not a fix for the root cause of course, but should get things operating 
> again. I’ll add that I haven’t seen this happen in the field to my 
> recollection, if at all.
> 
> Best,
> Erick
> 
>> On May 21, 2019, at 9:04 PM, Will Martin  wrote:
>> 
>> Worked with Fusion and Zookeeper at GSA for 18 months: admin role.
>> 
>> Before blowing it away, you could try:
>> 
>> - id a candidate node, with a snapshot you just might think is old enough
>> to be robust.
>> - clean data for zk nodes otherwise.
>> - bring up the chosen node and wait for it to settle[wish i could remember
>> why i called what i saw that]
>> - bring up other nodes 1 at a time.  let each one fully sync to follower of
>> the new leader.
>> - they should each in turn request the snapshot from the lead. then you
>> have
>> 
>> : align your collections with the ensemble. and for the life of me i can't
>> remember there being anything particularly tricky about that with fusion ,
>> which means I can't remember what I did... or have it doc'd at home. ;-)
>> 
>> 
>> Will Martin
>> DEVOPS ENGINEER
>> 540.454.9565
>> 
>> 8609 WESTWOOD CENTER DR, SUITE 475
>> VIENNA, VA 22182
>> geturgently.com
>> 
>> 
>> On Tue, May 21, 2019 at 11:40 PM Walter Underwood 
>> wrote:
>> 
>>> Yes, please. I have the logs from each of the Zookeepers.
>>> 
>>> We are running 3.4.12.
>>> 
>>> wunder
>>> Walter Underwood
>>> wun...@wunderwood.org
>>> http://observer.wunderwood.org/  (my blog)
>>> 
 On May 21, 2019, at 6:49 PM, Will Martin  wrote:
 
 Walter. Can I cross-post to zk-dev?
 
 
 
 Will Martin
 DEVOPS ENGINEER
 540.454.9565
 
 
 
 8609 WESTWOOD CENTER DR, SUITE 475
 VIENNA, VA 22182
 geturgently.com 
 
 
 
 
> On May 21, 2019, at 9:26 PM, Will Martin >> wmar...@urgent.ly>> wrote:
> 
> +1
> 
> Will Martin
> DEVOPS ENGINEER
> 540.454.9565
> 
> 8609 WESTWOOD CENTER DR, SUITE 475
> VIENNA, VA 22182
> geturgently.com 
> 
> 
> On Tue, May 21, 2019 at 7:39 PM Walter Underwood >> > wrote:
> ADDROLE times out after 180 seconds. This seems to be an unrecoverable
>>> state for the cluster, so that is a pretty serious bug.
> 
> wunder
> Walter Underwood
> wun...@wunderwood.org 
> http://observer.wunderwood.org/   (my
>>> blog)
> 
>> On May 21, 2019, at 4:10 PM, Walter Underwood >> > wrote:
>> 
>> We have a 6.6.2 cluster in prod that appears to have no overseer. In
>>> /overseer_elect on ZK, there is an election folder, but no leader document.
>>> An OVERSEERSTATUS request fails with a timeout.
>> 
>> I’m going to try ADDROLE, but I’d be delighted to hear any other
>>> ideas. We’ve diverted all the traffic to the backing cluster, so we can
>>> blow this one away and rebuild.
>> 
>> Looking at the Zookeeper logs, I see a few instances of network
>>> failures across all three nodes.
>> 
>> wunder
>> Walter Underwood
>> wun...@wunderwood.org 
>> http://observer.wunderwood.org/ 
>>> (my blog)
>> 
> 
 
>>> 
>>> 
> 



Re: Cluster with no overseer?

2019-05-22 Thread Erick Erickson
Walter:

I have no idea what the root cause is here, this really shouldn’t happen. But 
the Overseer role (and I’m assuming you’re talking Solr’s Overseer) is assigned 
similarly to a shard leader, the same election process happens. All the 
election nodes are ephemeral ZK nodes.

Solr’s Overseer is _not_ fixed to a particular Solr node, although you can 
assign a preferred role of Overseer in those (rare) cases where there are so 
many state changes for ZooKeeper that it’s advisable for them to run on a 
dedicated machine.

Overseer assignment is automatic. This should work;
1> shut everything down, Solr and Zookeeper
2> start your ZooKeepers and let them all get in sync with each other
3> start your Solr nodes. It might take 3 minutes or more to bring up the first 
Solr node, there’s up to a 180 second delay if leaders are not findable easily.

That should cause Solr to elect an overseer, probably the first Solr node to 
come up.

It _might_ work to bounce just one Solr node; seeing the Overseer election 
queue empty, it may elect itself. That said, the Overseer election queue won’t 
contain the rest of the Solr nodes like it should, so if that works you should 
probably bounce the rest of the Solr servers one by one to restore the proper 
election queue process.

Not a fix for the root cause of course, but should get things operating again. 
I’ll add that I haven’t seen this happen in the field to my recollection, if at 
all.
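
If it helps to script that check, here is a minimal SolrJ sketch (untested; the ZooKeeper address and node name are placeholders, and it assumes the 6.x/7.x CloudSolrClient builder and the CollectionAdminRequest static helpers). OVERSEERSTATUS reports the current overseer under the "leader" key, and ADDROLE can nominate a preferred overseer node:

import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.request.CollectionAdminRequest;
import org.apache.solr.client.solrj.response.CollectionAdminResponse;

public class OverseerCheck {
    public static void main(String[] args) throws Exception {
        String zkHost = "zk1:2181,zk2:2181,zk3:2181"; // placeholder ZooKeeper ensemble
        try (CloudSolrClient client = new CloudSolrClient.Builder().withZkHost(zkHost).build()) {
            // OVERSEERSTATUS: the response carries the current overseer node name
            CollectionAdminResponse status = CollectionAdminRequest.getOverseerStatus().process(client);
            System.out.println("Current overseer: " + status.getResponse().get("leader"));

            // ADDROLE: nominate a preferred overseer (placeholder node name)
            CollectionAdminRequest.addRole("host1:8983_solr", "overseer").process(client);
        }
    }
}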

Best,
Erick

> On May 21, 2019, at 9:04 PM, Will Martin  wrote:
> 
> Worked with Fusion and Zookeeper at GSA for 18 months: admin role.
> 
> Before blowing it away, you could try:
> 
> - id a candidate node, with a snapshot you just might think is old enough
> to be robust.
> - clean data for zk nodes otherwise.
> - bring up the chosen node and wait for it to settle[wish i could remember
> why i called what i saw that]
> - bring up other nodes 1 at a time.  let each one fully sync to follower of
> the new leader.
> - they should each in turn request the snapshot from the lead. then you
> have
> 
> : align your collections with the ensemble. and for the life of me i can't
> remember there being anything particularly tricky about that with fusion ,
> which means I can't remember what I did... or have it doc'd at home. ;-)
> 
> 
> Will Martin
> DEVOPS ENGINEER
> 540.454.9565
> 
> 8609 WESTWOOD CENTER DR, SUITE 475
> VIENNA, VA 22182
> geturgently.com
> 
> 
> On Tue, May 21, 2019 at 11:40 PM Walter Underwood 
> wrote:
> 
>> Yes, please. I have the logs from each of the Zookeepers.
>> 
>> We are running 3.4.12.
>> 
>> wunder
>> Walter Underwood
>> wun...@wunderwood.org
>> http://observer.wunderwood.org/  (my blog)
>> 
>>> On May 21, 2019, at 6:49 PM, Will Martin  wrote:
>>> 
>>> Walter. Can I cross-post to zk-dev?
>>> 
>>> 
>>> 
>>> Will Martin
>>> DEVOPS ENGINEER
>>> 540.454.9565
>>> 
>>> 
>>> 
>>> 8609 WESTWOOD CENTER DR, SUITE 475
>>> VIENNA, VA 22182
>>> geturgently.com 
>>> 
>>> 
>>> 
>>> 
 On May 21, 2019, at 9:26 PM, Will Martin > wmar...@urgent.ly>> wrote:
 
 +1
 
 Will Martin
 DEVOPS ENGINEER
 540.454.9565
 
 8609 WESTWOOD CENTER DR, SUITE 475
 VIENNA, VA 22182
 geturgently.com 
 
 
 On Tue, May 21, 2019 at 7:39 PM Walter Underwood > > wrote:
 ADDROLE times out after 180 seconds. This seems to be an unrecoverable
>> state for the cluster, so that is a pretty serious bug.
 
 wunder
 Walter Underwood
 wun...@wunderwood.org 
 http://observer.wunderwood.org/   (my
>> blog)
 
> On May 21, 2019, at 4:10 PM, Walter Underwood > > wrote:
> 
> We have a 6.6.2 cluster in prod that appears to have no overseer. In
>> /overseer_elect on ZK, there is an election folder, but no leader document.
>> An OVERSEERSTATUS request fails with a timeout.
> 
> I’m going to try ADDROLE, but I’d be delighted to hear any other
>> ideas. We’ve diverted all the traffic to the backing cluster, so we can
>> blow this one away and rebuild.
> 
> Looking at the Zookeeper logs, I see a few instances of network
>> failures across all three nodes.
> 
> wunder
> Walter Underwood
> wun...@wunderwood.org 
> http://observer.wunderwood.org/ 
>> (my blog)
> 
 
>>> 
>> 
>> 



RE: CloudSolrClient (any version). Find the node your query has connected to.

2019-05-22 Thread Russell Taylor
Hi Erick,
Every time any of the replica nodes goes into recovery mode we start 
seeing queries which don't return the correct count. I'm being told ZooKeeper 
will give me the correct node (not one in recovery), but I want to prove it, as 
the query issue only comes up when any of the nodes are in recovery mode. The 
application loading the data shows the correct counts, and after committing we 
check the results and they look correct.

If I can get the URL I can prove that the problem is due to doing the query 
against a node in recovery mode.

I hope that explains the problem, thanks for your time.
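
For reference, a minimal SolrJ sketch of one way to capture this (untested; the ZooKeeper address, collection and query are placeholders): adding shards.info=true to the request makes the response list each shard/replica URL that served it, along with its hit count:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.util.NamedList;

public class ShardsInfoCheck {
    public static void main(String[] args) throws Exception {
        String zkHost = "zk1:2181,zk2:2181,zk3:2181";        // placeholder ensemble
        try (CloudSolrClient client = new CloudSolrClient.Builder().withZkHost(zkHost).build()) {
            client.setDefaultCollection("myCollection");      // placeholder collection

            SolrQuery q = new SolrQuery("*:*");
            q.set("shards.info", true);   // ask Solr to report which replicas answered

            QueryResponse rsp = client.query(q);
            // The shards.info section is keyed by the URL of each replica that served the request
            NamedList<?> shardsInfo = (NamedList<?>) rsp.getResponse().get("shards.info");
            if (shardsInfo != null) {
                shardsInfo.forEach(entry ->
                        System.out.println("Served by: " + entry.getKey() + " -> " + entry.getValue()));
            }
        }
    }
}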

Regards

Russell Taylor



-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: 22 May 2019 15:50
To: solr-user@lucene.apache.org
Subject: Re: CloudSolrClient (any version). Find the node your query has 
connected to.

WARNING - External email from lucene.apache.org

Why do you want to know? You’ve asked how to do X without telling us what 
problem Y you’re trying to solve (the XY problem), and frequently that leads to 
a lot of wasted time…..

Under the covers CloudSolrClient uses a pretty simple round-robin load balancer 
to pick a Solr node to send the query to so “it depends”…..

> On May 22, 2019, at 5:51 AM, Jörn Franke  wrote:
>
> You have to provide the addresses of the zookeeper ensemble - it will figure 
> it out on its own based on information in Zookeeper.
>
>> Am 22.05.2019 um 14:38 schrieb Russell Taylor :
>>
>> Hi,
>> Using CloudSolrClient, how do I find the node (I have 3 nodes for this 
>> collection on our 6 node cluster) the query has connected to.
>> I'm hoping to get the full URL if possible.
>>
>>
>> Regards
>>
>> Russell Taylor
>>
>>
>>
>> 
>>






Is it possible to reconstruct non stored fields and turn those into stored fields

2019-05-22 Thread Pushkar Raste
I know this is a long shot. I am trying to move from Solr4 to Solr7.
Reindexing all the data from the source is difficult to do in a reasonable
time. All the fields are of basic types like int, long, float, double,
boolean, date, and string.

Since these fields don’t have analyzers, I was wondering if these fields
can be recovered by iterating over the index and reading the values back per document.
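
If those fields happen to have docValues (or are stored), something along these lines might work; a rough Lucene 7 sketch (index path and field name are placeholders) that walks the segments and dumps a numeric docValues field per document. String fields would need SortedDocValues/SortedSetDocValues instead, and deleted documents are not filtered here:

import java.nio.file.Paths;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.index.NumericDocValues;
import org.apache.lucene.store.FSDirectory;

public class DumpDocValues {
    public static void main(String[] args) throws Exception {
        // Placeholder index path and field name
        try (DirectoryReader reader = DirectoryReader.open(FSDirectory.open(Paths.get("/path/to/index")))) {
            for (LeafReaderContext leaf : reader.leaves()) {
                NumericDocValues dv = leaf.reader().getNumericDocValues("myLongField");
                if (dv == null) {
                    continue; // no docValues for this field in this segment
                }
                for (int doc = 0; doc < leaf.reader().maxDoc(); doc++) {
                    // Lucene 7+ docValues are iterators; advanceExact says whether this doc has a value
                    if (dv.advanceExact(doc)) {
                        System.out.println((leaf.docBase + doc) + " -> " + dv.longValue());
                    }
                }
            }
        }
    }
}
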
-- 
— Pushkar Raste


Re: Usage of docValuesFormat

2019-05-22 Thread Erick Erickson



> On May 22, 2019, at 12:51 AM, vishal patel  
> wrote:
> 
> We enabled the DocValues on some schema fields for sorting and faceting query 
> result.
> Is it necessary to add docValuesFormat for faster query process?

Only if you sort/facet or group. And queries won’t necessarily be faster after 
warmup. If you don’t have docValues enabled, the “uninverted” structure is 
created on the Java heap at query time. If you do have docValues enabled, the 
“uninverted” structure is serialized to disk at index time, and just read at 
query time in to the OS memory cache, NOT the Java heap.

If you don’t sort/facet or group, docValues do you no good at all.

We strongly recommend that if you sort/group or facet you use docValues. In 
fact, there’s a flag you can set in 8x that will throw an error if you do those 
operations on a field that does _not_ have docValues set.

docValues have no effect on the parts of the query that don’t group, facet, or 
sort.
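
For context, a small SolrJ sketch (field names are placeholders) of the operations in question; only the sort and the facet below would read docValues:

import org.apache.solr.client.solrj.SolrQuery;

public class DocValuesQueryExample {
    // Only the sort and the facet touch docValues; the main query and row retrieval do not
    static SolrQuery build() {
        SolrQuery q = new SolrQuery("status:active");        // query clause: no docValues involved
        q.setSort("created_date", SolrQuery.ORDER.desc);     // sorting: reads docValues
        q.setFacet(true);
        q.addFacetField("category");                         // faceting: reads docValues
        q.setRows(10);                                       // fetching stored fields: no docValues
        return q;
    }
}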

> Which one is better? docValuesFormat="Memory" or docValuesFormat="Disk"?

The docValuesFormat is obsolete and no longer supported at all as of 8.0.

> Note: Our indexed data size are high in one collection and different sort and 
> faceting queries are executed within a second.
> 
> Sent from Outlook

Best,
Erick

Re: Unable to upgrade Lucene 6.x index using IndexUpgrader

2019-05-22 Thread Erick Erickson
Anticipating your next question “why can’t you use an index created 2 or more 
versions ago”, these two quotes were helpful for me to get my head around it. 
It’s _always_ been the case that going from X to X+2 has been unsupported, it’s 
just that the failures wouldn’t necessarily be obvious. Lucene tries mightily 
to have version X handle X-1 created indexes, but trying to go back farther 
would lead to horribly complex code if it could be done at all.

Oh, and IndexUpgraderTool (IUT) is fairly horrible in that it really does an 
optimize down to one segment with all the problems that entails, see: 
https://lucidworks.com/2017/10/13/segment-merging-deleted-documents-optimize-may-bad/.
 This discussion is about “optimize”, which is what the IUT does, and that’s 
still true for Solr 7.5+

From Robert Muir:
“I think the key issue here is Lucene is an index not a database. Because it is 
a lossy index and does not retain all of the user's data, its not possible to 
safely migrate some things automagically. In the norms case IndexWriter needs 
to re-analyze the text ("re-index") and compute stats to get back the value, so 
it can be re-encoded. The function is y = f(x) and if x is not available its 
not possible, so lucene can't do it.”

From Mike McCandless:
“This really is the difference between an index and a database: we do not 
store, precisely, the original documents.  We store an efficient 
derived/computed index from them.  Yes, Solr/ES can add database-like behavior 
where they hold the true original source of the document and use that to 
rebuild Lucene indices over time.  But Lucene really is just a "search index" 
and we need to be free to make important improvements with time.”

Best,
Erick

> On May 22, 2019, at 3:48 AM, Jan Høydahl  wrote:
> 
> Same as I answered in SOLR-13487:
> 
> Note that you'll probably need to re-index from scratch due to changes in 
> 8.0, see 
> https://builds.apache.org/view/L/view/Lucene/job/Solr-reference-guide-8.0/javadoc/indexupgrader-tool.html
>  
> 
>  :
> If you are currently using a release two or more major versions older, such 
> as moving from Solr 6x to Solr 8x, you will need to reindex your content.
> 
> See your error message. Does not matter if you upgraded from 6-7, Lucene 8 
> will still complain since it sees that the index was first created with 6.x:
> 
>> This index was initially created with Lucene 6.x while the current version
>> is 8.1.0 and Lucene only supports reading the current and previous major
>> versions.. This version of Lucene only supports indexes created with
>> release 7.0 and later.
> 
> 
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
> 
>> 22. mai 2019 kl. 11:36 skrev Henrik B A :
>> 
>> I'm trying to upgrade a index from Lucene 6.x to 7.x, and then to 8.x,
>> using IndexUpgrader [1].  But it never successfully upgrades to 7, and I
>> cannot figure out why.
>> 
>> I've also tried using CheckIndex [2] with the -exorcise option to fix the
>> index first, but that doesn't help.
>> 
>> Any ideas?  I've added console output at the end of this mail.
>> 
>> Cheers,
>> Henrik
>> 
>> [1] https://lucene.apache.org/solr/guide/7_7/indexupgrader-tool.html
>> [2]
>> https://lucene.apache.org/core/7_1_0/core/org/apache/lucene/index/CheckIndex.html#main-java.lang.String:A-
>> 
>> 
>> 
>> user@dev-appsolr2535:/home/user> java -cp
>> '/opt/common/apps/apache-solr-7.7.1/server/solr-webapp/webapp/WEB-INF/lib/*'
>> org.apache.lucene.index.IndexUpgrader -delete-prior-commits -verbose
>> /opt/user/solr5-prog-fhh-storage-solr/solr/prog-fhh-storage-solr_shard4_replica_n259/data/index
>> IFD 0 [2019-05-22T09:24:14.317Z; main]: init: current segments file is
>> "segments_505t";
>> deletionPolicy=org.apache.lucene.index.KeepOnlyLastCommitDeletionPolicy@29774679
>> IFD 0 [2019-05-22T09:24:14.336Z; main]: init: load commit "segments_505t"
>> IFD 0 [2019-05-22T09:24:14.346Z; main]: init: seg=_11nw3 set
>> nextWriteFieldInfosGen=134 vs current=1
>> IFD 0 [2019-05-22T09:24:14.347Z; main]: init: seg=_11nw3 set
>> nextWriteDocValuesGen=134 vs current=1
>> IFD 0 [2019-05-22T09:24:14.358Z; main]: now checkpoint
>> "_11nw3(7.7.1):C16557/1395:[diagnostics={os=Linux,
>> java.vendor=AdoptOpenJDK, java.version=11.0.1, java.vm.version=11.0.1+13,
>> lucene.version=7.7.1, mergeMaxNumSegments=-1, os.arch=amd64,
>> java.runtime.version=11.0.1+13, source=merge, mergeFactor=10,
>> os.version=3.16.0-5-amd64,
>> timestamp=1558499119537}]:[attributes={Lucene50StoredFieldsFormat.mode=BEST_SPEED}]:delGen=133
>> _11o9p(7.7.1):c798:[diagnostics={os=Linux, java.vendor=AdoptOpenJDK,
>> java.version=11.0.1, java.vm.version=11.0.1+13, lucene.version=7.7.1,
>> mergeMaxNumSegments=-1, os.arch=amd64, java.runtime.version=11.0.1+13,
>> source=merge, mergeFactor=10, os.version=3.16.0-5-amd64,
>> timestamp=1558507559781}]:[attributes={Lucene50StoredFieldsFor

Re: CloudSolrClient (any version). Find the node your query has connected to.

2019-05-22 Thread Erick Erickson
Why do you want to know? You’ve asked how to do X without telling us what 
problem Y you’re trying to solve (the XY problem), and frequently that leads to 
a lot of wasted time…..

Under the covers CloudSolrClient uses a pretty simple round-robin load balancer 
to pick a Solr node to send the query to so “it depends”…..

> On May 22, 2019, at 5:51 AM, Jörn Franke  wrote:
> 
> You have to provide the addresses of the zookeeper ensemble - it will figure 
> it out on its own based on information in Zookeeper.
> 
>> Am 22.05.2019 um 14:38 schrieb Russell Taylor :
>> 
>> Hi,
>> Using CloudSolrClient, how do I find the node (I have 3 nodes for this 
>> collection on our 6 node cluster) the query has connected to.
>> I'm hoping to get the full URL if possible.
>> 
>> 
>> Regards
>> 
>> Russell Taylor
>> 
>> 
>> 
>> 
>> 



Re: CloudSolrClient (any version). Find the node your query has connected to.

2019-05-22 Thread Jörn Franke
You have to provide the addresses of the zookeeper ensemble - it will figure it 
out on its own based on information in Zookeeper.
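
A minimal sketch of that with the SolrJ 7.x builder (ZooKeeper addresses and collection name are placeholders); the client discovers the live Solr nodes from the cluster state kept in ZooKeeper:

import java.util.Arrays;
import java.util.Optional;
import org.apache.solr.client.solrj.impl.CloudSolrClient;

public class CloudClientExample {
    public static void main(String[] args) throws Exception {
        try (CloudSolrClient client = new CloudSolrClient.Builder(
                Arrays.asList("zk1:2181", "zk2:2181", "zk3:2181"), // placeholder ensemble
                Optional.empty())                                  // no chroot
                .build()) {
            client.setDefaultCollection("myCollection");           // placeholder collection
            System.out.println("Ping status: " + client.ping().getStatus());
        }
    }
}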

> Am 22.05.2019 um 14:38 schrieb Russell Taylor :
> 
> Hi,
> Using CloudSolrClient, how do I find the node (I have 3 nodes for this 
> collection on our 6 node cluster) the query has connected to.
> I'm hoping to get the full URL if possible.
> 
> 
> Regards
> 
> Russell Taylor
> 
> 
> 
> 
> 


CloudSolrClient (any version). Find the node your query has connected to.

2019-05-22 Thread Russell Taylor
Hi,
Using CloudSolrClient, how do I find the node (I have 3 nodes for this 
collection on our 6 node cluster) the query has connected to.
I'm hoping to get the full URL if possible.


Regards

Russell Taylor







Re: Graph query extremely slow

2019-05-22 Thread Toke Eskildsen
On Wed, 2019-05-15 at 21:37 -0400, Rahul Goswami wrote:
> fq={!graph from=from_field to=to_field returnRoot=false}
> 
> Executing _only_ the graph filter query takes about 64.5 seconds. The
> total number of documents from this filter query is a little over 1
> million.

I tried building an index in Solr 7.6 with 4M simple records with every
4th record having a from_field and a to_field, each containing a random
number from 0-65535 as a String.
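
A rough SolrJ sketch of how documents of that shape might be generated (the URL is a placeholder and batching is omitted; this is just an illustration, not necessarily how the index above was built):

import java.util.Random;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class BuildGraphTestIndex {
    public static void main(String[] args) throws Exception {
        Random rnd = new Random(42);
        try (HttpSolrClient client = new HttpSolrClient.Builder(
                "http://localhost:8983/solr/gettingstarted").build()) {
            for (int i = 0; i < 4_000_000; i++) {
                SolrInputDocument doc = new SolrInputDocument();
                doc.addField("id", Integer.toString(i));
                if (i % 4 == 0) { // every 4th record gets the graph fields
                    doc.addField("from_field", Integer.toString(rnd.nextInt(65536)));
                    doc.addField("to_field", Integer.toString(rnd.nextInt(65536)));
                }
                client.add(doc); // batching omitted for brevity
            }
            client.commit();
        }
    }
}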


Asking for the first 10 results:

time curl -s '
http://localhost:8983/solr/gettingstarted/select?rows=10&q={!graph+from=from_field+to=to_field+returnRoot=true}+from_field:*'
 | jq .response.numFound
1000000

real    0m0.018s
user    0m0.011s
sys     0m0.005s


Asking for 1M results (ignoring that export or streaming should be used
for exports of that size):

time curl -s '
http://localhost:8983/solr/gettingstarted/select?rows=1000000&q={!graph+from=from_field+to=to_field+returnRoot=true}+from_field:*'
 | jq .response.numFound
1000000

real    0m10.101s
user    0m3.344s
sys     0m0.419s

> Is this performance expected out of graph query ?

As the sanity check above shows, there is a huge difference between
evaluating a graph query (any query really) and asking for 1M results
to be returned. With that in mind, what do you set rows to? 


- Toke Eskildsen, Royal Danish Library




Re: Solr8.0.0 Performance Test

2019-05-22 Thread Kayak28
Hello, Shawn, Toke Eskildsen and Solr Community:

It might be too late to share, but the URL below is what I had tried to
share as an attachment.
Again, Solr 8.0.0 is somehow better, but I am wondering whether it might be
too much better?

https://docs.google.com/spreadsheets/d/e/2PACX-1vRSvs_kF0rPtJNyXCw7Pdl9-BQ9OMFQOOa8FAiV7NcfFDaZtAuW5CMfUR2GtWSwwIcH5mEESbs3mEos/pubhtml?gid=56815742&single=true


> Yes. Whether it is noticeable is another question. For small indexes or
for complex queries, the difference will be hard to measure. I mentioned it
primarily as a general comment. With your 2M docs / 8GB index, I highly
doubt you will see any difference.
I see. I would like to give it a shot.

> that's the facet.method parameter: fc vs. fcs
Even though they might not differ much from each other, it might be
interesting to test their performance.
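
A small SolrJ sketch of how the two could be compared (the facet field is a placeholder); facet.method is just a request parameter, so the same query can be sent both ways and timed:

import org.apache.solr.client.solrj.SolrQuery;

public class FacetMethodComparison {
    // Build the same facet query with either facet.method=fc or facet.method=fcs
    static SolrQuery facetQuery(String method) {
        SolrQuery q = new SolrQuery("*:*");
        q.setFacet(true);
        q.addFacetField("category");   // placeholder facet field
        q.set("facet.method", method); // "fc" = field cache/docValues over the whole index, "fcs" = per segment
        q.setRows(0);                  // only the facet counts matter here
        return q;
    }
}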

Sincerely,

Kaya Ota

2019年5月21日(火) 19:13 Toke Eskildsen :

> Kayak28  wrote:
> > For the next opportunity to share table-formatted data,
> > what is the best way to share data with all of you?
>
> There's no hard recommendations, so anything where you can just click on a
> link and see the data: Google Docs, GitHub Gists, Pastebin...
>
> > When we request a facet query (like below) to 2 Solrs:
> > one with multiple segments and the other is only one
> > segment, we will have different performance result?
>
> Yes. Whether it is noticeable is another question. For small indexes or
> for complex queries, the difference will be hard to measure. I mentioned it
> primarily as a general comment. With your 2M docs / 8GB index, I highly
> doubt you will see any difference.
>
> > Can we configure up-front faceting or one-the-fly facet method?
>
> Yep, that's the facet.method parameter: fc vs. fcs
>
> https://lucene.apache.org/solr/guide/7_7/faceting.html#field-value-faceting-parameters
>
> - Toke Eskildsen, Royal Danish Library
>


Re: Unable to upgrade Lucene 6.x index using IndexUpgrader

2019-05-22 Thread Jan Høydahl
Same as I answered in SOLR-13487:

Note that you'll probably need to re-index from scratch due to changes in 8.0, 
see 
https://builds.apache.org/view/L/view/Lucene/job/Solr-reference-guide-8.0/javadoc/indexupgrader-tool.html
 

 :
If you are currently using a release two or more major versions older, such as 
moving from Solr 6x to Solr 8x, you will need to reindex your content.

See your error message. Does not matter if you upgraded from 6-7, Lucene 8 will 
still complain since it sees that the index was first created with 6.x:

> This index was initially created with Lucene 6.x while the current version
> is 8.1.0 and Lucene only supports reading the current and previous major
> versions.. This version of Lucene only supports indexes created with
> release 7.0 and later.


--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

> 22. mai 2019 kl. 11:36 skrev Henrik B A :
> 
> I'm trying to upgrade a index from Lucene 6.x to 7.x, and then to 8.x,
> using IndexUpgrader [1].  But it never successfully upgrades to 7, and I
> cannot figure out why.
> 
> I've also tried using CheckIndex [2] with the -exorcise option to fix the
> index first, but that doesn't help.
> 
> Any ideas?  I've added console output at the end of this mail.
> 
> Cheers,
> Henrik
> 
> [1] https://lucene.apache.org/solr/guide/7_7/indexupgrader-tool.html
> [2]
> https://lucene.apache.org/core/7_1_0/core/org/apache/lucene/index/CheckIndex.html#main-java.lang.String:A-
> 
> 
> 
> user@dev-appsolr2535:/home/user> java -cp
> '/opt/common/apps/apache-solr-7.7.1/server/solr-webapp/webapp/WEB-INF/lib/*'
> org.apache.lucene.index.IndexUpgrader -delete-prior-commits -verbose
> /opt/user/solr5-prog-fhh-storage-solr/solr/prog-fhh-storage-solr_shard4_replica_n259/data/index
> IFD 0 [2019-05-22T09:24:14.317Z; main]: init: current segments file is
> "segments_505t";
> deletionPolicy=org.apache.lucene.index.KeepOnlyLastCommitDeletionPolicy@29774679
> IFD 0 [2019-05-22T09:24:14.336Z; main]: init: load commit "segments_505t"
> IFD 0 [2019-05-22T09:24:14.346Z; main]: init: seg=_11nw3 set
> nextWriteFieldInfosGen=134 vs current=1
> IFD 0 [2019-05-22T09:24:14.347Z; main]: init: seg=_11nw3 set
> nextWriteDocValuesGen=134 vs current=1
> IFD 0 [2019-05-22T09:24:14.358Z; main]: now checkpoint
> "_11nw3(7.7.1):C16557/1395:[diagnostics={os=Linux,
> java.vendor=AdoptOpenJDK, java.version=11.0.1, java.vm.version=11.0.1+13,
> lucene.version=7.7.1, mergeMaxNumSegments=-1, os.arch=amd64,
> java.runtime.version=11.0.1+13, source=merge, mergeFactor=10,
> os.version=3.16.0-5-amd64,
> timestamp=1558499119537}]:[attributes={Lucene50StoredFieldsFormat.mode=BEST_SPEED}]:delGen=133
> _11o9p(7.7.1):c798:[diagnostics={os=Linux, java.vendor=AdoptOpenJDK,
> java.version=11.0.1, java.vm.version=11.0.1+13, lucene.version=7.7.1,
> mergeMaxNumSegments=-1, os.arch=amd64, java.runtime.version=11.0.1+13,
> source=merge, mergeFactor=10, os.version=3.16.0-5-amd64,
> timestamp=1558507559781}]:[attributes={Lucene50StoredFieldsFormat.mode=BEST_SPEED}]
> _11o9z(7.7.1):c797:[diagnostics={os=Linux, java.vendor=AdoptOpenJDK,
> java.version=11.0.1, java.vm.version=11.0.1+13, lucene.version=7.7.1,
> mergeMaxNumSegments=-1, os.arch=amd64, java.runtime.version=11.0.1+13,
> source=merge, mergeFactor=10, os.version=3.16.0-5-amd64,
> timestamp=1558507739341}]:[attributes={Lucene50StoredFieldsFormat.mode=BEST_SPEED}]
> _11oa0(7.7.1):C1:[diagnostics={java.runtime.version=11.0.1+13,
> java.vendor=AdoptOpenJDK, java.version=11.0.1, java.vm.version=11.0.1+13,
> lucene.version=7.7.1, os=Linux, os.arch=amd64, os.version=3.16.0-5-amd64,
> source=flush,
> timestamp=1558507750723}]:[attributes={Lucene50StoredFieldsFormat.mode=BEST_SPEED}]
> _11oa1(7.7.1):C2:[diagnostics={java.runtime.version=11.0.1+13,
> java.vendor=AdoptOpenJDK, java.version=11.0.1, java.vm.version=11.0.1+13,
> lucene.version=7.7.1, os=Linux, os.arch=amd64, os.version=3.16.0-5-amd64,
> source=flush,
> timestamp=1558507812631}]:[attributes={Lucene50StoredFieldsFormat.mode=BEST_SPEED}]"
> [5 segments ; isCommit = false]
> IFD 0 [2019-05-22T09:24:14.358Z; main]: 11 msec to checkpoint
> IW 0 [2019-05-22T09:24:14.359Z; main]: init: create=false reader=null
> IW 0 [2019-05-22T09:24:14.362Z; main]:
> dir=MMapDirectory@/opt/user/solr5-prog-fhh-storage-solr/solr/prog-fhh-storage-solr_shard4_replica_n259/data/index
> lockFactory=org.apache.lucene.store.NativeFSLockFactory@14bf9759
> index=_11nw3(7.7.1):C16557/1395:[diagnostics={os=Linux,
> java.vendor=AdoptOpenJDK, java.version=11.0.1, java.vm.version=11.0.1+13,
> lucene.version=7.7.1, mergeMaxNumSegments=-1, os.arch=amd64,
> java.runtime.version=11.0.1+13, source=merge, mergeFactor=10,
> os.version=3.16.0-5-amd64,
> timestamp=1558499119537}]:[attributes={Lucene50StoredFieldsFormat.mode=BEST_SPEED}]:delGen=133
> _11o9p(7.7.1):c798:[diagnostics={os=Linux, java.vendor=AdoptOpenJDK,

alias read access impossible for anyone other than admin?

2019-05-22 Thread Sotiris Fragkiskos
Hi everyone!
I've been trying unsuccessfully to read an alias to a collection with a
curl command.
The command only works when I put in the admin credentials, even though the
user I want access for also has the required role.
Is this perhaps built-in, or should anyone be able to access an alias from
the API?

The command I'm using is:
curl http://<user>:<password>@<host>:<port>/solr/<alias>/select?q=<field>:<value>
This fails for the user but succeeds for the admin.
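
The same request from SolrJ would look roughly like this sketch (base URL, credentials and query are placeholders; the alias name matches the security.json below), attaching the non-admin user's basic-auth credentials to the request:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.QueryRequest;
import org.apache.solr.client.solrj.response.QueryResponse;

public class AliasQueryAsUser {
    public static void main(String[] args) throws Exception {
        try (HttpSolrClient client = new HttpSolrClient.Builder("http://localhost:8983/solr").build()) {
            QueryRequest req = new QueryRequest(new SolrQuery("field:value")); // placeholder query
            req.setBasicAuthCredentials("user", "password");                   // the non-admin user
            QueryResponse rsp = req.process(client, "sCollAlias");             // query the alias
            System.out.println("numFound: " + rsp.getResults().getNumFound());
        }
    }
}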

My minimum working example of security.json follows.
Many thanks!

{
  "authentication":{
"blockUnknown":true,
"class":"solr.BasicAuthPlugin",
"credentials":{
  "admin":"blahblahblah",
  "user":"blahblah"},
"":{"v":13}},
  "authorization":{
"class":"solr.RuleBasedAuthorizationPlugin",
"permissions":[
  {
"name":"all",
"role":"admin",
"index":1},
  {
"name":"readColl",
"collection":"Coll",
"path":"/select/*",
"role":"readColl",
"index":2},
  {
"name":"readSCollAlias",
"collection":"sCollAlias",
"path":"/select/*",
"role":"readSCollAlias",
"index":3}],
"user-role":{
  "admin":[
"admin",
"readSCollAlias"],
  "user":["readSCollAlias"]},
"":{"v":21}}}


Unable to upgrade Lucene 6.x index using IndexUpgrader

2019-05-22 Thread Henrik B A
I'm trying to upgrade a index from Lucene 6.x to 7.x, and then to 8.x,
using IndexUpgrader [1].  But it never successfully upgrades to 7, and I
cannot figure out why.

I've also tried using CheckIndex [2] with the -exorcise option to fix the
index first, but that doesn't help.

Any ideas?  I've added console output at the end of this mail.

Cheers,
Henrik

[1] https://lucene.apache.org/solr/guide/7_7/indexupgrader-tool.html
[2]
https://lucene.apache.org/core/7_1_0/core/org/apache/lucene/index/CheckIndex.html#main-java.lang.String:A-



user@dev-appsolr2535:/home/user> java -cp
'/opt/common/apps/apache-solr-7.7.1/server/solr-webapp/webapp/WEB-INF/lib/*'
org.apache.lucene.index.IndexUpgrader -delete-prior-commits -verbose
/opt/user/solr5-prog-fhh-storage-solr/solr/prog-fhh-storage-solr_shard4_replica_n259/data/index
IFD 0 [2019-05-22T09:24:14.317Z; main]: init: current segments file is
"segments_505t";
deletionPolicy=org.apache.lucene.index.KeepOnlyLastCommitDeletionPolicy@29774679
IFD 0 [2019-05-22T09:24:14.336Z; main]: init: load commit "segments_505t"
IFD 0 [2019-05-22T09:24:14.346Z; main]: init: seg=_11nw3 set
nextWriteFieldInfosGen=134 vs current=1
IFD 0 [2019-05-22T09:24:14.347Z; main]: init: seg=_11nw3 set
nextWriteDocValuesGen=134 vs current=1
IFD 0 [2019-05-22T09:24:14.358Z; main]: now checkpoint
"_11nw3(7.7.1):C16557/1395:[diagnostics={os=Linux,
java.vendor=AdoptOpenJDK, java.version=11.0.1, java.vm.version=11.0.1+13,
lucene.version=7.7.1, mergeMaxNumSegments=-1, os.arch=amd64,
java.runtime.version=11.0.1+13, source=merge, mergeFactor=10,
os.version=3.16.0-5-amd64,
timestamp=1558499119537}]:[attributes={Lucene50StoredFieldsFormat.mode=BEST_SPEED}]:delGen=133
_11o9p(7.7.1):c798:[diagnostics={os=Linux, java.vendor=AdoptOpenJDK,
java.version=11.0.1, java.vm.version=11.0.1+13, lucene.version=7.7.1,
mergeMaxNumSegments=-1, os.arch=amd64, java.runtime.version=11.0.1+13,
source=merge, mergeFactor=10, os.version=3.16.0-5-amd64,
timestamp=1558507559781}]:[attributes={Lucene50StoredFieldsFormat.mode=BEST_SPEED}]
_11o9z(7.7.1):c797:[diagnostics={os=Linux, java.vendor=AdoptOpenJDK,
java.version=11.0.1, java.vm.version=11.0.1+13, lucene.version=7.7.1,
mergeMaxNumSegments=-1, os.arch=amd64, java.runtime.version=11.0.1+13,
source=merge, mergeFactor=10, os.version=3.16.0-5-amd64,
timestamp=1558507739341}]:[attributes={Lucene50StoredFieldsFormat.mode=BEST_SPEED}]
_11oa0(7.7.1):C1:[diagnostics={java.runtime.version=11.0.1+13,
java.vendor=AdoptOpenJDK, java.version=11.0.1, java.vm.version=11.0.1+13,
lucene.version=7.7.1, os=Linux, os.arch=amd64, os.version=3.16.0-5-amd64,
source=flush,
timestamp=1558507750723}]:[attributes={Lucene50StoredFieldsFormat.mode=BEST_SPEED}]
_11oa1(7.7.1):C2:[diagnostics={java.runtime.version=11.0.1+13,
java.vendor=AdoptOpenJDK, java.version=11.0.1, java.vm.version=11.0.1+13,
lucene.version=7.7.1, os=Linux, os.arch=amd64, os.version=3.16.0-5-amd64,
source=flush,
timestamp=1558507812631}]:[attributes={Lucene50StoredFieldsFormat.mode=BEST_SPEED}]"
[5 segments ; isCommit = false]
IFD 0 [2019-05-22T09:24:14.358Z; main]: 11 msec to checkpoint
IW 0 [2019-05-22T09:24:14.359Z; main]: init: create=false reader=null
IW 0 [2019-05-22T09:24:14.362Z; main]:
dir=MMapDirectory@/opt/user/solr5-prog-fhh-storage-solr/solr/prog-fhh-storage-solr_shard4_replica_n259/data/index
lockFactory=org.apache.lucene.store.NativeFSLockFactory@14bf9759
index=_11nw3(7.7.1):C16557/1395:[diagnostics={os=Linux,
java.vendor=AdoptOpenJDK, java.version=11.0.1, java.vm.version=11.0.1+13,
lucene.version=7.7.1, mergeMaxNumSegments=-1, os.arch=amd64,
java.runtime.version=11.0.1+13, source=merge, mergeFactor=10,
os.version=3.16.0-5-amd64,
timestamp=1558499119537}]:[attributes={Lucene50StoredFieldsFormat.mode=BEST_SPEED}]:delGen=133
_11o9p(7.7.1):c798:[diagnostics={os=Linux, java.vendor=AdoptOpenJDK,
java.version=11.0.1, java.vm.version=11.0.1+13, lucene.version=7.7.1,
mergeMaxNumSegments=-1, os.arch=amd64, java.runtime.version=11.0.1+13,
source=merge, mergeFactor=10, os.version=3.16.0-5-amd64,
timestamp=1558507559781}]:[attributes={Lucene50StoredFieldsFormat.mode=BEST_SPEED}]
_11o9z(7.7.1):c797:[diagnostics={os=Linux, java.vendor=AdoptOpenJDK,
java.version=11.0.1, java.vm.version=11.0.1+13, lucene.version=7.7.1,
mergeMaxNumSegments=-1, os.arch=amd64, java.runtime.version=11.0.1+13,
source=merge, mergeFactor=10, os.version=3.16.0-5-amd64,
timestamp=1558507739341}]:[attributes={Lucene50StoredFieldsFormat.mode=BEST_SPEED}]
_11oa0(7.7.1):C1:[diagnostics={java.runtime.version=11.0.1+13,
java.vendor=AdoptOpenJDK, java.version=11.0.1, java.vm.version=11.0.1+13,
lucene.version=7.7.1, os=Linux, os.arch=amd64, os.version=3.16.0-5-amd64,
source=flush,
timestamp=1558507750723}]:[attributes={Lucene50StoredFieldsFormat.mode=BEST_SPEED}]
_11oa1(7.7.1):C2:[diagnostics={java.runtime.version=11.0.1+13,
java.vendor=AdoptOpenJDK, java.version=11.0.1, java.vm.version=11.0.1+13,
lucene.version=7.7.1, os=Linux, os.arch=amd64, os.version=3.16.0-5-amd64,
sour

Usage of docValuesFormat

2019-05-22 Thread vishal patel
We enabled docValues on some schema fields for sorting and faceting query results.
Is it necessary to add docValuesFormat for faster query processing?
Which one is better? docValuesFormat="Memory" or docValuesFormat="Disk"?

Note: Our indexed data size is high in one collection, and different sort and 
faceting queries are executed within a second.

Sent from Outlook


CloudSolrClient Session

2019-05-22 Thread Rainman Sián
Hello all,

I'm writing a Solr solution for a quite big project. To do so, I wrote an
OSGi service that provides the add/delete/query functionalities to Solr.

It works just fine, but we have a kind of session issue, along with logs
growing in size on the server, because the application server throws this
exception:

java.lang.NoClassDefFoundError: org/apache/zookeeper/proto/SetWatches
   at
org.apache.zookeeper.ClientCnxn$SendThread.primeConnection(ClientCnxn.java:926)
[com.servicenow.aem.servicenow-www-services:1.0.0.SNAPSHOT]
   at
org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:363)
[com.servicenow.aem.servicenow-www-services:1.0.0.SNAPSHOT]
   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1141)
[com.servicenow.aem.servicenow-www-services:1.0.0.SNAPSHOT]
20.05.2019 16:46:56.463 *INFO* [OsgiInstallerImpl-SendThread(X.X.X.X:2181)]
org.apache.zookeeper.ClientCnxn Socket connection established to
X.X.X.X/X.X.X.X:2181, initiating session
20.05.2019 16:46:56.463 *WARN* [OsgiInstallerImpl-SendThread(X.X.X.X:2181)]
org.apache.zookeeper.ClientCnxn Session 0x1003941f01c0a87 for server
X.X.X.X/X.X.X.X:2181, unexpected error, closing socket connection and
attempting reconnect

* X.X.X.X stands in for the valid public IP we have

I have googled for this exception without much luck; I can't figure out what
the root cause of this exception is. I have to mention that this doesn't
always happen, but when it starts it just doesn't stop until we restart
the server. Also, all the Solr functionality keeps working just fine; it's just
that some server functionalities stop working, and the amount of logs it
generates is the issue.

Here is the code that I have. As you can see, I only have one instance of
SolrClient, and I don't close the session so I don't have to create a new
SolrClient every time (at least that is the idea).


@Component(...)
@Service(...)
public class SolrAPIServiceImpl implements SolrAPIService {

    private CloudSolrClient solrClient;

    @Reference
    private SolrConfigurationService solrConfigurationService;

    @Activate
    @Modified
    protected void activate(final ComponentContext context) {
        solrClient = new CloudSolrClient.Builder(
                Arrays.asList(solrConfigurationService.getZookeeperHost().split(",")),
                Optional.empty()).build();
        solrClient.setZkConnectTimeout(solrConfigurationService.getConnectionTimeoutMs());
        solrClient.setZkClientTimeout(solrConfigurationService.getClientTimeoutMs());
        solrClient.setDefaultCollection(solrConfigurationService.getCollectionName());
    }

    public void addOrUpdate(List docs) throws Exception {
        if (docs.size() > 0) {
            solrClient.addBeans(docs, solrConfigurationService.getCommitWithinMs());
        }
    }

    public void addOrUpdate(SchemaObject doc) throws Exception {
        solrClient.addBean(doc, solrConfigurationService.getCommitWithinMs());
    }

    public void deleteAll() throws Exception {
        solrClient.deleteByQuery("*:*", solrConfigurationService.getCommitWithinMs());
    }

    public void deleteById(String id) throws Exception {
        solrClient.deleteById(id, solrConfigurationService.getCommitWithinMs());
    }

    public void delete(ServiceNowSchemaObject doc) throws Exception {
        deleteById(doc.id);
    }

    public void deleteByQuery(String query) throws Exception {
        solrClient.deleteByQuery(query, solrConfigurationService.getCommitWithinMs());
    }
}