Backup vs. Snapshot API for Solr

2019-11-21 Thread Kayak28
Hello, Community Members:

I have tested the behaviors of the Backup API and the Snapshot API, which are
described at the URL below.
https://lucene.apache.org/solr/guide/7_4/making-and-restoring-backups.html#making-and-restoring-backups

From observing the behavior of the Backup API, I now understand the following:
- Solr's backup simply means making a copy of the full-sized index.
- Solr's restore means making another copy of the full-sized index from a
backup directory and pointing to that copy as the live index.
- The Backup / Restore APIs belong to the Replication Handler.
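
Concretely, the Replication Handler calls I tested look like the following
(the core name, backup location, and backup name are placeholders):

curl "http://localhost:8983/solr/mycore/replication?command=backup&location=/backups&name=mybackup"
curl "http://localhost:8983/solr/mycore/replication?command=restore&location=/backups&name=mybackup"
curl "http://localhost:8983/solr/mycore/replication?command=restorestatus"

The backup command writes a full copy of the index to
/backups/snapshot.mybackup; restore copies it back and switches the core over
to the restored index, and restorestatus reports the progress.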

Also, I understand the following about the Snapshot API:
- Solr can take a snapshot at any time (i.e. it does not matter whether it is
after a commit, backup, or restore).
- snapshot_N contains the name of the snapshot (commitName) and the
current index path.
- The N in snapshot_N is the same number as in segments_N.

By observing the Snapshot API behavior, I have come to believe it is
impossible to "backup" or "restore" Solr's index with snapshots alone.
So, my questions are:
- What is the difference between the Backup API and the Snapshot API?
   The Solr Guide above says "The snapshot functionality is different
from the backup functionality as the index files aren't copied anywhere." But
then how does the Snapshot API help me make a backup?

- Or, more basically, when should I use the Snapshot API?

- What is the rule-of-thumb procedure for operating Solr's backup system?

Sincerely,
Kaya Ota


Nested SubQuery

2019-11-21 Thread Ravi Dhanwate
Hi All, 

We are using Solr 8.2 and want to know if we can have nested subqueries.

For example, I have the following types of documents in my collection. Different
document types are stored together. (We are not using nested documents, to avoid
indexing overhead.)

{
  "id": 111,
  "docType": "HEADER",
  "value": "Header Document"
}

{
  "id": 222,
  "headerId": 111,
  "docType": "LINE",
  "value": "Line Document"
}

{
  "id": 333,
  "lineId": 222,
  "docType": "SHIPMENT",
  "value": "Shipment Document"
}

With a subquery in the fashion below, I can get headers and lines, with the
lines as nested documents.

http://localhost:9881/solr/order-coll-scmdt/select?q=id:111&fq=doc_type:HEADER&fl=id,docType,linessub:[subquery]
&linessub.q={!terms f=headerId
v=$row.id}&linessub.fq=docType:LINE&linessub.fl=id,doc_type

We want a further level of nesting where we can also find shipments for each
line; however, this type of query does not work.

http://localhost:9881/solr/order-coll-scmdt/select?q=id:111&fq=doc_type:HEADER&fl=id,docType,linessub:[subquery]
&linessub.q={!terms f=headerId
v=$row.id}&linessub.fq=docType:LINE&linessub.fl=id,doc_type,shipments:[subquery]
&shipments.q={!terms f=lineId v=$row.id}&shipments.fq=doc_type:SHIPMENT

Are there any limitations on nested subqueries? 

Any help on this is appreciated.

Thanks
Ravi

Fetch parent and child document in solr 8.2

2019-11-21 Thread Jigar Gajjar
Hello,



I am trying to fetch parent and child documents together in one Solr query.
I was able to do that in Solr 7.4, but the same query does not work in Solr 8.2.

Are there any major changes in the way that we are fetching children?



My requirement is to fetch parent and children both in one call.



I am trying



http://localhost:8983/solr/demo/select?fl=*,[child]&q={!parent
which="cat_s:sci-fi
AND pubyear_i:1992"}
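
One thing worth checking (this is an assumption of mine, not verified): in 8.x
the [child] transformer seems to require an explicit parentFilter when the
schema does not define a _nest_path_ field, e.g. something like:

http://localhost:8983/solr/demo/select?q={!parent
which="cat_s:sci-fi AND pubyear_i:1992"}&fl=*,[child
parentFilter="cat_s:sci-fi AND pubyear_i:1992"]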



what are the ways to retrieve parent child as nested documents?



We need to start working on it very soon, any help will be appreciated.

-- 
Thanks
Jigar Gajjar


Re: Possible bug in cluster status - > solr 8.3

2019-11-21 Thread Andrzej Białecki
AFAIK these collection properties are not tracked that faithfully and can get 
out of sync, mostly because they are used only during collection CREATE and 
BACKUP / RESTORE and not during other collection operations or during searching 
/ indexing. SPLITSHARD doesn’t trust them, instead it checks the actual counts 
of existing replicas.

These out-of-sync counts may actually cause problems in BACKUP / RESTORE, which 
is worth checking.

There are also conceptual issues here, eg. “replicationFactor” becomes 
meaningless as soon as we have different counts of NRT / TLOG / PULL replicas.
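
For reference, a sketch of the operations in question (collection and shard
names are placeholders):

# add a tlog replica - the call that doesn't update the tlogReplicas count
curl "http://localhost:8983/solr/admin/collections?action=ADDREPLICA&collection=mycollection&shard=shard1&type=tlog"
# re-read the collection state afterwards
curl "http://localhost:8983/solr/admin/collections?action=CLUSTERSTATUS&collection=mycollection"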

> On 21 Nov 2019, at 13:40, Jason Gerlowski  wrote:
> 
> It seems like an issue to me.  Can you open a JIRA with these details?
> 
> On Fri, Nov 15, 2019 at 10:51 AM Jacek Kikiewicz  wrote:
>> 
>> I found an interesting situation: I've created a collection with only one
>> replica.
>> Then I scaled the solr-cloud cluster and ran an 'addreplica' call to add 2 more.
>> So I have a collection with 3 tlog replicas, cluster status page shows
>> them but shows also this:
>>  "core_node2":{
>>"core":"EDITED_NAME_shard1_replica_t1",
>>"base_url":"http://EDITED_NODE:8983/solr;,
>>"node_name":"EDITED_NODE:8983_solr",
>>"state":"active",
>>"type":"TLOG",
>>"force_set_state":"false",
>>"leader":"true"},
>>  "core_node5":{
>>"core":"EDITED_NAME_shard1_replica_t3",
>>"base_url":"http://EDITED_NODE:8983/solr;,
>>"node_name":"EDITED_NODE:8983_solr",
>>"state":"active",
>>"type":"TLOG",
>>"force_set_state":"false"},
>>  "core_node6":{
>>"core":"EDITED_NAME_shard1_replica_t4",
>>"base_url":"http://EDITED_NODE:8983/solr;,
>>"node_name":"EDITED_NODE:8983_solr",
>>"state":"active",
>>"type":"TLOG",
>>"force_set_state":"false",
>>"router":{"name":"compositeId"},
>>"maxShardsPerNode":"1",
>>"autoAddReplicas":"false",
>>"nrtReplicas":"1",
>>"tlogReplicas":"1",
>>"znodeVersion":11,
>> 
>> 
>> As you can see I have 3 replicas but then I have also: "tlogReplicas":"1"
>> 
>> If I create collection with tlogReplicas=3 then cluster status shows
>> "tlogReplicas":"3"
>> Is that a bug, or does it somehow 'work as it should'?
>> 
>> Regards,
>> Jacek
> 



Re: Solr process takes several minutes before accepting commands after restart

2019-11-21 Thread Dave
https://doc.sitecore.com/developers/90/platform-administration-and-architecture/en/using-solr-auto-suggest.html


That link is in case you need more references. Set all parameters yourself;
don't rely on the defaults.
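
A minimal sketch of a suggester in solrconfig.xml with both build flags set
explicitly (the component, field, and dictionary names are placeholders):

<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">mySuggester</str>
    <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">title</str>
    <str name="suggestAnalyzerFieldType">text_general</str>
    <!-- set these explicitly so a restart never triggers a full rebuild -->
    <str name="buildOnStartup">false</str>
    <str name="buildOnCommit">false</str>
  </lst>
</searchComponent>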

> On Nov 21, 2019, at 3:41 PM, Dave  wrote:
> 
> https://lucidworks.com/post/solr-suggester/
> 
> You must set buildOnStartup to false; the default is true. Try it
> 
>> On Nov 21, 2019, at 3:21 PM, Koen De Groote  
>> wrote:
>> 
>> Erick:
>> 
>> No suggesters. There is 1 spellchecker for
>> 
>> text_general
>> 
>> But no buildOnCommit or buildOnStartup setting mentioned anywhere.
>> 
>> That being said, at the point in time at which this occurs, the database is
>> guaranteed to be empty, as the data folders had previously been deleted and
>> recreated empty. Then the docker container is restarted and this behavior
>> is observed.
>> 
>> Long shot, but even if Solr is getting data from zookeeper about file
>> locations and checking for the existence of those files... that should be
>> pretty fast, I'd think.
>> 
>> This is really disturbing. I know what to expect when recovering now, but
>> someone doing this on a live environment that has to be up again ASAP is
>> probably going to be sweating bullets.
>> 
>> 
>> On Thu, Nov 21, 2019 at 2:45 PM Erick Erickson 
>> wrote:
>> 
>>> Koen:
>>> 
>>> Do you have any spellcheckers or suggesters defined with buildOnCommit or
>>> buildOnStartup set to “true”? Depending on the implementation, this may
>>> have to read the stored data for the field used in the
>>> suggester/spellchecker from _every_ document in your collection, which can
>>> take many minutes. Even if your implementation in your config is file-based
>>> it can still take a while.
>>> 
>>> Shot in the dark….
>>> 
>>> Erick
>>> 
 On Nov 21, 2019, at 4:03 AM, Koen De Groote 
>>> wrote:
 
 The log files showed a startup, printing of all the config options that
 had been set, 1 or 2 commands that got executed, and then nothing.
 
 The curl request did not show up in the log files until after that
 period where Solr was unresponsive.
 
 A service mesh? I don't think so. It's in a Docker container, but that
 shouldn't be a problem; it usually never is.
 
 
 On Wed, Nov 20, 2019 at 10:42 AM Jörn Franke 
>>> wrote:
 
> Have you checked the log files of Solr?
> 
> 
> Do you have a service mesh in-between? Could it be something at the
> network layer/container orchestration  that is blocking requests for
>>> some
> minutes?
> 
>> Am 20.11.2019 um 10:32 schrieb Koen De Groote <
> koen.degro...@limecraft.com>:
>> 
>> Hello
>> 
>> I was testing some backup/restore scenarios.
>> 
>> 1 of them is Solr7.6 in a docker container(7.6.0-slim), set up as
>> SolrCloud, with zookeeper.
>> 
>> The steps are as follows:
>> 
>> 1. Manually delete the data folder.
>> 2. Restart the container. The process is now in error mode, complaining
>> that it cannot find the cores.
>> 3. Fix the install, meaning create new data folders, which are empty at
>> this point.
>> 4. Restart the container again, to pick up the empty folders and not be
> in
>> error anymore.
>> 5. Perform the restore
>> 6. Check if everything is available again
>> 
>> The problem is between step 4 and 5. After step 4, it takes several
> minutes
>> before solr actually responds to curl commands.
>> 
>> Once responsive, the restore happened just fine. But it's very
>>> stressful
> in
>> a situation where you have to restore a production environment and the
>> process just doesn't respond for 5-10 minutes.
>> 
>> We're talking about 20GB of data here, so not very much, but not little
>> either.
>> 
>> Is it normal that it takes so long before solr responds? If not, what
>> should I look at in order to find the cause?
>> 
>> I have asked this before recently, though the wording was confusing.
>>> This
>> should be clearer.
>> 
>> Kind regards,
>> Koen De Groote
> 
>>> 
>>> 


Re: Solr process takes several minutes before accepting commands after restart

2019-11-21 Thread Dave
https://lucidworks.com/post/solr-suggester/

You must set buildOnStartup to false; the default is true. Try it

> On Nov 21, 2019, at 3:21 PM, Koen De Groote  
> wrote:
> 
> Erick:
> 
> No suggesters. There is 1 spellchecker for
> 
> text_general
> 
> But no buildOnCommit or buildOnStartup setting mentioned anywhere.
> 
> That being said, at the point in time at which this occurs, the database is
> guaranteed to be empty, as the data folders had previously been deleted and
> recreated empty. Then the docker container is restarted and this behavior
> is observed.
> 
> Long shot, but even if Solr is getting data from zookeeper about file
> locations and checking for the existence of those files... that should be
> pretty fast, I'd think.
> 
> This is really disturbing. I know what to expect when recovering now, but
> someone doing this on a live environment that has to be up again ASAP is
> probably going to be sweating bullets.
> 
> 
> On Thu, Nov 21, 2019 at 2:45 PM Erick Erickson 
> wrote:
> 
>> Koen:
>> 
>> Do you have any spellcheckers or suggesters defined with buildOnCommit or
>> buildOnStartup set to “true”? Depending on the implementation, this may
>> have to read the stored data for the field used in the
>> suggester/spellchecker from _every_ document in your collection, which can
>> take many minutes. Even if your implementation in your config is file-based
>> it can still take a while.
>> 
>> Shot in the dark….
>> 
>> Erick
>> 
>>> On Nov 21, 2019, at 4:03 AM, Koen De Groote 
>> wrote:
>>> 
>>> The log files showed a startup, printing of all the config options that
>>> had been set, 1 or 2 commands that got executed, and then nothing.
>>> 
>>> The curl request did not show up in the log files until after that
>>> period where Solr was unresponsive.
>>> 
>>> A service mesh? I don't think so. It's in a Docker container, but that
>>> shouldn't be a problem; it usually never is.
>>> 
>>> 
>>> On Wed, Nov 20, 2019 at 10:42 AM Jörn Franke 
>> wrote:
>>> 
 Have you checked the log files of Solr?
 
 
 Do you have a service mesh in-between? Could it be something at the
 network layer/container orchestration  that is blocking requests for
>> some
 minutes?
 
> Am 20.11.2019 um 10:32 schrieb Koen De Groote <
 koen.degro...@limecraft.com>:
> 
> Hello
> 
> I was testing some backup/restore scenarios.
> 
> 1 of them is Solr7.6 in a docker container(7.6.0-slim), set up as
> SolrCloud, with zookeeper.
> 
> The steps are as follows:
> 
> 1. Manually delete the data folder.
> 2. Restart the container. The process is now in error mode, complaining
> that it cannot find the cores.
> 3. Fix the install, meaning create new data folders, which are empty at
> this point.
> 4. Restart the container again, to pick up the empty folders and not be
 in
> error anymore.
> 5. Perform the restore
> 6. Check if everything is available again
> 
> The problem is between step 4 and 5. After step 4, it takes several
 minutes
> before solr actually responds to curl commands.
> 
> Once responsive, the restore happened just fine. But it's very
>> stressful
 in
> a situation where you have to restore a production environment and the
> process just doesn't respond for 5-10 minutes.
> 
> We're talking about 20GB of data here, so not very much, but not little
> either.
> 
> Is it normal that it takes so long before solr responds? If not, what
> should I look at in order to find the cause?
> 
> I have asked this before recently, though the wording was confusing.
>> This
> should be clearer.
> 
> Kind regards,
> Koen De Groote
 
>> 
>> 


Re: Solr process takes several minutes before accepting commands after restart

2019-11-21 Thread Koen De Groote
Erick:

No suggesters. There is 1 spellchecker for

text_general

But no buildOnCommit or buildOnStartup setting mentioned anywhere.

That being said, at the point in time at which this occurs, the database is
guaranteed to be empty, as the data folders had previously been deleted and
recreated empty. Then the docker container is restarted and this behavior
is observed.

Long shot, but even if Solr is getting data from zookeeper about file
locations and checking for the existence of those files... that should be
pretty fast, I'd think.

This is really disturbing. I know what to expect when recovering now, but
someone doing this on a live environment that has to be up again ASAP is
probably going to be sweating bullets.


On Thu, Nov 21, 2019 at 2:45 PM Erick Erickson 
wrote:

> Koen:
>
> Do you have any spellcheckers or suggesters defined with buildOnCommit or
> buildOnStartup set to “true”? Depending on the implementation, this may
> have to read the stored data for the field used in the
> suggester/spellchecker from _every_ document in your collection, which can
> take many minutes. Even if your implementation in your config is file-based
> it can still take a while.
>
> Shot in the dark….
>
> Erick
>
> > On Nov 21, 2019, at 4:03 AM, Koen De Groote 
> wrote:
> >
> > The log files showed a startup, printing of all the config options that
> > had been set, 1 or 2 commands that got executed, and then nothing.
> >
> > The curl request did not show up in the log files until after that
> > period where Solr was unresponsive.
> >
> > A service mesh? I don't think so. It's in a Docker container, but that
> > shouldn't be a problem; it usually never is.
> >
> >
> > On Wed, Nov 20, 2019 at 10:42 AM Jörn Franke 
> wrote:
> >
> >> Have you checked the log files of Solr?
> >>
> >>
> >> Do you have a service mesh in-between? Could it be something at the
> >> network layer/container orchestration  that is blocking requests for
> some
> >> minutes?
> >>
> >>> Am 20.11.2019 um 10:32 schrieb Koen De Groote <
> >> koen.degro...@limecraft.com>:
> >>>
> >>> Hello
> >>>
> >>> I was testing some backup/restore scenarios.
> >>>
> >>> 1 of them is Solr7.6 in a docker container(7.6.0-slim), set up as
> >>> SolrCloud, with zookeeper.
> >>>
> >>> The steps are as follows:
> >>>
> >>> 1. Manually delete the data folder.
> >>> 2. Restart the container. The process is now in error mode, complaining
> >>> that it cannot find the cores.
> >>> 3. Fix the install, meaning create new data folders, which are empty at
> >>> this point.
> >>> 4. Restart the container again, to pick up the empty folders and not be
> >> in
> >>> error anymore.
> >>> 5. Perform the restore
> >>> 6. Check if everything is available again
> >>>
> >>> The problem is between step 4 and 5. After step 4, it takes several
> >> minutes
> >>> before solr actually responds to curl commands.
> >>>
> >>> Once responsive, the restore happened just fine. But it's very
> stressful
> >> in
> >>> a situation where you have to restore a production environment and the
> >>> process just doesn't respond for 5-10 minutes.
> >>>
> >>> We're talking about 20GB of data here, so not very much, but not little
> >>> either.
> >>>
> >>> Is it normal that it takes so long before solr responds? If not, what
> >>> should I look at in order to find the cause?
> >>>
> >>> I have asked this before recently, though the wording was confusing.
> This
> >>> should be clearer.
> >>>
> >>> Kind regards,
> >>> Koen De Groote
> >>
>
>


Re: Solr 8.2 indexing issues

2019-11-21 Thread Jörn Franke
You are switching across 2 major versions. You probably need to delete the
collections fully (including the index data on disk, not only via the delete
command) and reindex.
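
A sketch of what a full clean-up could look like (collection and config names
are placeholders; checking the disk is the part the delete command alone does
not guarantee):

# drop the collection via the Collections API
curl "http://localhost:8983/solr/admin/collections?action=DELETE&name=mycollection"
# verify no leftover index directories remain under the data dir of each node,
# then recreate the collection and reindex from the source system
curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=2&replicationFactor=2&collection.configName=myconfig"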

> Am 12.11.2019 um 21:42 schrieb Sujatha Arun :
> 
> We recently migrated from 6.6.2 to 8.2. We are seeing issues with indexing
> where the leader and the replica document counts do not match. We get
> different results every time we do a *:* search.
> 
> The only issue we see in the logs is Jira issue SOLR-13293
> 
> Has anybody seen similar issues?
> 
> Thanks


Re: Solr 8.2 indexing issues

2019-11-21 Thread Rahul Goswami
Hi Sujatha,

How did you upgrade your cluster? Did you restart each node in the cluster
one by one after upgrade (while other nodes were running on 6.6.2) or did
you bring down the entire cluster and bring up one upgraded node at a time?

Thanks,
Rahul

On Thu, Nov 14, 2019 at 7:03 AM Paras Lehana 
wrote:

> Hi Sujatha,
>
> Apologies that I am not addressing your bug directly but have you tried 8.3
>  that has just been
> released?
>
> On Wed, 13 Nov 2019 at 02:12, Sujatha Arun  wrote:
>
> > We recently migrated from 6.6.2 to 8.2. We are seeing issues with
> indexing
> > where the leader and the replica document counts do not match. We get
> > different results every time we do a *:* search.
> >
> > The only issue we see in the logs is Jira issue SOLR-13293
> >
> > Has anybody seen similar issues?
> >
> > Thanks
> >
>
>
> --
> --
> Regards,
>
> *Paras Lehana* [65871]
> Development Engineer, Auto-Suggest,
> IndiaMART Intermesh Ltd.
>
> 8th Floor, Tower A, Advant-Navis Business Park, Sector 142,
> Noida, UP, IN - 201303
>
> Mob.: +91-9560911996
> Work: 01203916600 | Extn:  *8173*
>
>


Re: Possible data corruption in JavaBinCodec in Solr 8.3 during distributed update?

2019-11-21 Thread Colvin Cowie
> the difference is because the _default config has the dynamic schema
> building in it, which I assume is pushing it down a different code path.

Also to add to that, I assumed initially that this just meant that it was
working because the corrupted field names would just cause it to create a
field with the dodgy name (since that's the idea for the dynamic schema),
but checking the documents on retrieval showed they all had the right field
names...
So I assume it's a result of going into a different branch of code instead.


On an unrelated matter, I saw this in the logs when running with embedded
zookeeper... I don't think I've seen it mentioned anywhere else, so I will
raise an issue for it:

2019-11-21 17:25:14.292 INFO  (main) [   ] o.a.s.c.SolrZkServer STARTING EMBEDDED STANDALONE ZOOKEEPER SERVER at port 9983
2019-11-21 17:25:14.792 INFO  (main) [   ] o.a.s.c.ZkContainer Zookeeper client=localhost:9983
2019-11-21 17:25:18.833 WARN  (Thread-13) [   ] o.a.z.s.a.AdminServerFactory Unable to load jetty, not starting JettyAdminServer => java.lang.NoClassDefFoundError: org/eclipse/jetty/server/Connector
  at java.lang.Class.forName0(Native Method)
java.lang.NoClassDefFoundError: org/eclipse/jetty/server/Connector
  at java.lang.Class.forName0(Native Method) ~[?:1.8.0_191]
  at java.lang.Class.forName(Class.java:264) ~[?:1.8.0_191]
  at org.apache.zookeeper.server.admin.AdminServerFactory.createAdminServer(AdminServerFactory.java:43) ~[?:?]
  at org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:136) ~[?:?]
  at org.apache.solr.cloud.SolrZkServer$1.run(SolrZkServer.java:121) ~[?:?]
Caused by: java.lang.ClassNotFoundException: org.eclipse.jetty.server.Connector
  at org.eclipse.jetty.webapp.WebAppClassLoader.loadClass(WebAppClassLoader.java:577) ~[jetty-webapp-9.4.19.v20190610.jar:9.4.19.v20190610]
  at java.lang.ClassLoader.loadClass(ClassLoader.java:357) ~[?:1.8.0_191]
  ... 5 more
2019-11-21 17:25:19.365 INFO  (main) [   ] o.a.s.c.c.ConnectionManager Waiting for client to connect to ZooKeeper
2019-11-21 17:25:19.396 INFO  (zkConnectionManagerCallback-7-thread-1) [   ] o.a.s.c.c.ConnectionManager zkClient has connected
2019-11-21 17:25:19.396 INFO  (main) [   ] o.a.s.c.c.ConnectionManager Client is connected to ZooKeeper


On Thu, 21 Nov 2019 at 17:30, Colvin Cowie 
wrote:

> I've been a bit snowed under, but I've found the difference is because the
> _default config has the dynamic schema building in it, which I assume is
> pushing it down a different code path.
>
>default="${update.autoCreateFields:true}"
>
>  
> processor="uuid,remove-blank,field-name-mutating,parse-boolean,parse-long,parse-double,parse-date,add-schema-fields">
>
> I'm using the vanilla Solr 8.3.0 binary (8.3.0
> 2aa586909b911e66e1d8863aa89f173d69f86cd2 - ishan - 2019-10-25 23:15:22) with
> Eclipse OpenJ9 VM 1.8.0_232 openj9-0.17.0,
> and I've checked with Oracle Corporation Java HotSpot(TM) 64-Bit Server VM
> 1.8.0_191 25.191-b12 as well.
>
> I've put a testcase and configsets in Google Drive:
> https://drive.google.com/open?id=1ibKNWvowT8cXTwSa3bcTwKYLSRNur86U
> The configsets are a copy of the _default configset, except the "problem"
> configset has autoCreateFields set to false.
> I created a collection with 4 shards, replication factor 1 for each
> configset. The test case reliably fails on the "problem" collection and
> reliably passes against the "no_problem" collection.
>
> The test (well it's not actually a @Test but still) has static data
> (though it was originally generated randomly). The data is a bit mad... but
> it was easier to reproduce the problem reliably with this data, than with
> the normal documents we use in our product.
> Each document has a different (dynamically named) field to index data
> into, but it's the same data in each field.
> The problem only appears (or probably is just more likely to appear?) when
> the field names in the request are of different lengths.
> The length / value of the data doesn't appear to matter. Or is less
> impactful than variations in the field names.
> If you run the test 10 times you will see a variety of different errors,
> i.e. it's not the same error every time.
> I've included some examples of the errors in the Drive folder. One of the
> most fundamental (and probably points at the root cause) is this:
>
> 2019-11-21 17:02:53.720 ERROR
> (updateExecutor-3-thread-6-processing-x:problem_collection_shard2_replica_n2
> r:core_node5 null n:10.0.75.1:8983_solr c:problem_collection s:shard2)
> [c:problem_collection s:shard2 r:core_node5
> x:problem_collection_shard2_replica_n2]
> o.a.s.u.ErrorReportingConcurrentUpdateSolrClient Error when calling
> SolrCmdDistributor$Req: cmd=add{,id=(null)}; node=ForwardNode:
> http://10.0.75.1:8983/solr/problem_collection_shard3_replica_n4/ to
> 

Re: Possible data corruption in JavaBinCodec in Solr 8.3 during distributed update?

2019-11-21 Thread Colvin Cowie
I've been a bit snowed under, but I've found the difference is because the
_default config has the dynamic schema building in it, which I assume is
pushing it down a different code path.

<updateRequestProcessorChain name="add-unknown-fields-to-the-schema"
 default="${update.autoCreateFields:true}"
 processor="uuid,remove-blank,field-name-mutating,parse-boolean,parse-long,parse-double,parse-date,add-schema-fields">

I'm using the vanilla Solr 8.3.0 binary (8.3.0
2aa586909b911e66e1d8863aa89f173d69f86cd2 - ishan - 2019-10-25 23:15:22) with
Eclipse OpenJ9 VM 1.8.0_232 openj9-0.17.0,
and I've checked with Oracle Corporation Java HotSpot(TM) 64-Bit Server VM
1.8.0_191 25.191-b12 as well.

I've put a testcase and configsets in Google Drive:
https://drive.google.com/open?id=1ibKNWvowT8cXTwSa3bcTwKYLSRNur86U
The configsets are a copy of the _default configset, except the "problem"
configset has autoCreateFields set to false.
I created a collection with 4 shards, replication factor 1 for each
configset. The test case reliably fails on the "problem" collection and
reliably passes against the "no_problem" collection.
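
For reference, the collections were created along these lines (collection
names as in the errors below; the configName values are assumptions):

curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=problem_collection&numShards=4&replicationFactor=1&collection.configName=problem"
curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=no_problem_collection&numShards=4&replicationFactor=1&collection.configName=no_problem"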

The test (well it's not actually a @Test but still) has static data (though
it was originally generated randomly). The data is a bit mad... but it was
easier to reproduce the problem reliably with this data, than with the
normal documents we use in our product.
Each document has a different (dynamically named) field to index data into,
but it's the same data in each field.
The problem only appears (or probably is just more likely to appear?) when
the field names in the request are of different lengths.
The length / value of the data doesn't appear to matter. Or is less
impactful than variations in the field names.
If you run the test 10 times you will see a variety of different errors,
i.e. it's not the same error every time.
I've included some examples of the errors in the Drive folder. One of the
most fundamental (and probably points at the root cause) is this:

2019-11-21 17:02:53.720 ERROR
(updateExecutor-3-thread-6-processing-x:problem_collection_shard2_replica_n2
r:core_node5 null n:10.0.75.1:8983_solr c:problem_collection s:shard2)
[c:problem_collection s:shard2 r:core_node5
x:problem_collection_shard2_replica_n2]
o.a.s.u.ErrorReportingConcurrentUpdateSolrClient Error when calling
SolrCmdDistributor$Req: cmd=add{,id=(null)}; node=ForwardNode:
http://10.0.75.1:8983/solr/problem_collection_shard3_replica_n4/ to
http://10.0.75.1:8983/solr/problem_collection_shard3_replica_n4/ =>
java.lang.StringIndexOutOfBoundsException
  at java.lang.String.<init>(String.java:668)
java.lang.StringIndexOutOfBoundsException: null
  at java.lang.String.<init>(String.java:668) ~[?:1.8.0_232]
  at org.noggit.CharArr.toString(CharArr.java:182) ~[?:?]
  at org.apache.solr.common.util.JavaBinCodec.lambda$getStringProvider$1(JavaBinCodec.java:966) ~[?:?]
  at org.apache.solr.common.util.JavaBinCodec$$Lambda$668..apply(Unknown Source) ~[?:?]
  at org.apache.solr.common.util.ByteArrayUtf8CharSequence._getStr(ByteArrayUtf8CharSequence.java:156) ~[?:?]
  at org.apache.solr.common.util.ByteArrayUtf8CharSequence.toString(ByteArrayUtf8CharSequence.java:235) ~[?:?]
  at org.apache.solr.common.util.ByteArrayUtf8CharSequence.convertCharSeq(ByteArrayUtf8CharSequence.java:215) ~[?:?]
  at org.apache.solr.common.SolrInputField.getValue(SolrInputField.java:128) ~[?:?]
  at org.apache.solr.common.SolrInputDocument.lambda$writeMap$0(SolrInputDocument.java:55) ~[?:?]
  at org.apache.solr.common.SolrInputDocument$$Lambda$743.2774E7B0.accept(Unknown Source) ~[?:?]
  at java.util.LinkedHashMap.forEach(LinkedHashMap.java:684) ~[?:1.8.0_232]
  at org.apache.solr.common.SolrInputDocument.writeMap(SolrInputDocument.java:59) ~[?:?]
  at org.apache.solr.common.util.JavaBinCodec.writeSolrInputDocument(JavaBinCodec.java:658) ~[?:?]
  at org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:383) ~[?:?]
  at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:253) ~[?:?]
  at org.apache.solr.common.util.JavaBinCodec.writeMapEntry(JavaBinCodec.java:813) ~[?:?]
  at org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:411) ~[?:?]
  at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:253) ~[?:?]
  at org.apache.solr.common.util.JavaBinCodec.writeIterator(JavaBinCodec.java:750) ~[?:?]
  at org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:395) ~[?:?]
  at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:253) ~[?:?]
  at org.apache.solr.common.util.JavaBinCodec.writeNamedList(JavaBinCodec.java:248) ~[?:?]
  at org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:355) ~[?:?]
  at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:253) ~[?:?]
  at org.apache.solr.common.util.JavaBinCodec.marshal(JavaBinCodec.java:167) ~[?:?]
  at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.marshal(JavaBinUpdateRequestCodec.java:102) ~[?:?]
  at

Re: fq pfloat_field:* returns no documents, tfloat:* does

2019-11-21 Thread Shawn Heisey

On 11/21/2019 7:48 AM, Webster Homer wrote:

Thank you. Why don't point fields get loaded by the Schema Browser's "Load Term 
Info" button?


From what I've seen in past discussions, Point-based fields are
missing Term data.  There's literally nothing to load.


https://issues.apache.org/jira/browse/SOLR-13757
https://issues.apache.org/jira/browse/SOLR-10829

This is listed definitively as the reason that such fields can't be used 
for uniqueKey.


Other things supported by evidence: point fields are very slow for
single-value lookups and really fast at range queries.  I do not know if
these are related to the missing Term data.


You would need to talk to an expert on Lucene to find out why all other 
numeric field types were deprecated when points were brought into existence.


Thanks,
Shawn


Re: Highlighting on typing in search box

2019-11-21 Thread rhys J
Thank you both! I've got a basic autocomplete working right now, and I'm
working on making it smart about which core it searches.

On Thu, Nov 21, 2019 at 11:43 AM Jörn Franke  wrote:

> It sounds like you are looking for a suggester.
>
> You can use the suggester of Solr.
>
> For the visualization part: Angular has a suggestion box that can ingest
> the results from Solr.
>
> > Am 21.11.2019 um 16:42 schrieb rhys J :
> >
> > Are there any recommended APIs or code examples of using Solr and then
> > highlighting results below the search box?
> >
> > I'm trying to implement a search box that will search solr as the user
> > types, if that makes sense?
> >
> > Thanks,
> >
> > Rhys
>


Fetching Parent-Child documents - Solr 8.2

2019-11-21 Thread Ali,Sherif
Hello,

I am trying to fetch parent and child documents together in one Solr query. I
was able to do that in Solr 7.4, but the same query does not work in Solr 8.2.

Are there any major changes in the way that we are fetching children?



My requirement is to fetch parent and children both in one call.



I am trying



http://localhost:8983/solr/demo/select?fl=*,[child]&q={!parent
 which="cat_s:sci-fi AND pubyear_i:1992"}



what are the ways to retrieve parent child as nested documents?



We need to start working on it very soon, any help will be appreciated.


Sincerely,
Sherif
--
Sherif Ali
OCLC · Senior Software Engineer
6565 Kilgour Place, Dublin, Ohio USA 43017
C +1-614-764-6077
OCLC.org · 
Facebook · 
Twitter · YouTube



Re: Fetch parent and child document in solr 8.2

2019-11-21 Thread Gajjar, Jigar


Thanks,
Jigar Gajjar
OCLC · Senior Software  Engineer
6565 Kilgour Place, Dublin, OH, USA, 43017
 M +1-408-334-6379
OCLC.org · 
Blog · 
Facebook · 
Twitter · YouTube


From: "Gajjar, Jigar" 
Date: Thursday, November 21, 2019 at 11:28 AM
To: "solr-user@lucene.apache.org" 
Subject: Fetch parent and child document in solr 8.2

 Hello,

I am trying to fetch parent and child documents together in one Solr query. I
was able to do that in Solr 7.4, but the same query does not work in Solr 8.2.
Are there any major changes in the way that we are fetching children?

My requirement is to fetch parent and children both in one call.

I am trying

http://localhost:8983/solr/demo/select?fl=*,[child]&q={!parent
which="cat_s:sci-fi AND pubyear_i:1992"}

what are the ways to retrieve parent child as nested documents?

We need to start working on it very soon, any help will be appreciated.



Thanks,
Jigar Gajjar
OCLC · Senior Software  Engineer
6565 Kilgour Place, Dublin, OH, USA, 43017
 M +1-408-334-6379
OCLC.org · 
Blog · 
Facebook · 
Twitter · YouTube



Re: Highlighting on typing in search box

2019-11-21 Thread Jörn Franke
It sounds like you are looking for a suggester.

You can use the suggester of Solr.

For the visualization part: Angular has a suggestion box that can ingest the 
results from Solr.
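
A sketch of the request the search box would fire on each keystroke, assuming
a suggest handler and dictionary configured with these (placeholder) names:

curl "http://localhost:8983/solr/mycore/suggest?suggest=true&suggest.dictionary=mySuggester&suggest.q=elec"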

> Am 21.11.2019 um 16:42 schrieb rhys J :
> 
> Are there any recommended APIs or code examples of using Solr and then
> highlighting results below the search box?
> 
> I'm trying to implement a search box that will search solr as the user
> types, if that makes sense?
> 
> Thanks,
> 
> Rhys


Fetch parent and child document in solr 8.2

2019-11-21 Thread Gajjar, Jigar
 Hello,

I am trying to fetch parent and child documents together in one Solr query. I
was able to do that in Solr 7.4, but the same query does not work in Solr 8.2.
Are there any major changes in the way that we are fetching children?

My requirement is to fetch parent and children both in one call.

I am trying

http://localhost:8983/solr/demo/select?fl=*,[child]&q={!parent
which="cat_s:sci-fi AND pubyear_i:1992"}

what are the ways to retrieve parent child as nested documents?

We need to start working on it very soon, any help will be appreciated.



Thanks,
Jigar Gajjar
OCLC · Senior Software  Engineer
6565 Kilgour Place, Dublin, OH, USA, 43017
 M +1-408-334-6379
OCLC.org · 
Blog · 
Facebook · 
Twitter · YouTube



Re: Highlighting on typing in search box

2019-11-21 Thread David Hastings
you can modify the result in this SO question to fit your needs:

https://stackoverflow.com/questions/16742610/retrieve-results-from-solr-using-jquery-calls

On Thu, Nov 21, 2019 at 10:42 AM rhys J  wrote:

> Are there any recommended APIs or code examples of using Solr and then
> highlighting results below the search box?
>
> I'm trying to implement a search box that will search solr as the user
> types, if that makes sense?
>
> Thanks,
>
> Rhys
>


Highlighting on typing in search box

2019-11-21 Thread rhys J
Are there any recommended APIs or code examples of using Solr and then
highlighting results below the search box?

I'm trying to implement a search box that will search solr as the user
types, if that makes sense?

Thanks,

Rhys


Re: exact matches on a join

2019-11-21 Thread rhys J
On Thu, Nov 21, 2019 at 8:04 AM Jason Gerlowski 
wrote:

> Are these fields "string" or "text" fields?
>
> Text fields receive analysis that splits them into a series of terms.
> That's why the query "Freeman" matches the document "A-1 Freeman".
> "A-1 Freeman" gets split up into multiple terms, and the "Freeman"
> query matches one of those terms.  Text fields are what you use when
> you want matches to have some wiggle room based on your analyzers.
>
> String fields are much more geared towards exact matches.  No analysis
> is done, so a query for "Freeman" would only match docs who have that
> value identically.
>
>
Thanks, this was the conclusion I came to too. When I asked, they decided
that those matches were acceptable, and to keep the field a textField.

Rhys


RE: fq pfloat_field:* returns no documents, tfloat:* does

2019-11-21 Thread Webster Homer
Thank you. Why don't point fields get loaded by the Schema Browser's "Load Term 
Info" button?


-Original Message-
From: Tomás Fernández Löbbe 
Sent: Wednesday, November 20, 2019 4:38 PM
To: solr-user@lucene.apache.org
Subject: Re: fq pfloat_field:* returns no documents, tfloat:* does

Hi Webster,
> The fq  facet_melting_point:*
"Point" numeric fields don't support that syntax currently, and the way to 
retrieve "docs with any value in field foo" is "foo:[* TO *]". See
https://issues.apache.org/jira/browse/SOLR-11746
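
For example (the core name is a placeholder; the brackets and spaces need
URL-encoding in curl):

# returns 0 docs for a Point field:
curl "http://localhost:8983/solr/mycore/select?q=*:*&fq=facet_melting_point:*"
# works for Point fields:
curl "http://localhost:8983/solr/mycore/select?q=*:*&fq=facet_melting_point:%5B*%20TO%20*%5D"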


On Wed, Nov 20, 2019 at 2:21 PM Webster Homer < 
webster.ho...@milliporesigma.com> wrote:

> The fq facet_melting_point:*
> returns 0 rows. However, the field clearly has data in it, so why does
> this query return no rows when there is data?
>
> I am trying to update our solr schemas to use the point fields instead
> of the trie fields.
>
> We have a number of pfloat fields. These fields are indexed and I can
> facet on them
>
> This is a typical definition:
> <field name="facet_melting_point" type="pfloat" indexed="true"
> stored="true" required="false" multiValued="true" docValues="true"/>
>
> Another odd behavior is that when I use the Schema Browser the "Load
> Term Info" loads no data.
>
> I am using Solr 7.2
>


Re: Solr process takes several minutes before accepting commands after restart

2019-11-21 Thread Erick Erickson
Koen:

Do you have any spellcheckers or suggesters defined with buildOnCommit or 
buildOnStartup set to “true”? Depending on the implementation, this may have to 
read the stored data for the field used in the suggester/spellchecker from 
_every_ document in your collection, which can take many minutes. Even if your 
implementation in your config is file-based it can still take a while.

Shot in the dark….

Erick

> On Nov 21, 2019, at 4:03 AM, Koen De Groote  
> wrote:
> 
> The log files showed a startup, printing of all the config options that
> had been set, 1 or 2 commands that got executed, and then nothing.
> 
> The curl request did not show up in the log files until after that
> period where Solr was unresponsive.
> 
> A service mesh? I don't think so. It's in a Docker container, but that
> shouldn't be a problem; it usually never is.
> 
> 
> On Wed, Nov 20, 2019 at 10:42 AM Jörn Franke  wrote:
> 
>> Have you checked the log files of Solr?
>> 
>> 
>> Do you have a service mesh in-between? Could it be something at the
>> network layer/container orchestration  that is blocking requests for some
>> minutes?
>> 
>>> Am 20.11.2019 um 10:32 schrieb Koen De Groote <
>> koen.degro...@limecraft.com>:
>>> 
>>> Hello
>>> 
>>> I was testing some backup/restore scenarios.
>>> 
>>> 1 of them is Solr7.6 in a docker container(7.6.0-slim), set up as
>>> SolrCloud, with zookeeper.
>>> 
>>> The steps are as follows:
>>> 
>>> 1. Manually delete the data folder.
>>> 2. Restart the container. The process is now in error mode, complaining
>>> that it cannot find the cores.
>>> 3. Fix the install, meaning create new data folders, which are empty at
>>> this point.
>>> 4. Restart the container again, to pick up the empty folders and not be
>> in
>>> error anymore.
>>> 5. Perform the restore
>>> 6. Check if everything is available again
>>> 
>>> The problem is between step 4 and 5. After step 4, it takes several
>> minutes
>>> before solr actually responds to curl commands.
>>> 
>>> Once responsive, the restore happened just fine. But it's very stressful
>> in
>>> a situation where you have to restore a production environment and the
>>> process just doesn't respond for 5-10 minutes.
>>> 
>>> We're talking about 20GB of data here, so not very much, but not little
>>> either.
>>> 
>>> Is it normal that it takes so long before solr responds? If not, what
>>> should I look at in order to find the cause?
>>> 
>>> I have asked this before recently, though the wording was confusing. This
>>> should be clearer.
>>> 
>>> Kind regards,
>>> Koen De Groote
>> 



Re: Possible data corruption in JavaBinCodec in Solr 8.3 during distributed update?

2019-11-21 Thread Jason Gerlowski
Very curious what the config change related to reproducing this looks
like.  Maybe it's something that is worth adding
test-randomization around?  Just thinking aloud.


Re: exact matches on a join

2019-11-21 Thread Jason Gerlowski
Are these fields "string" or "text" fields?

Text fields receive analysis that splits them into a series of terms.
That's why the query "Freeman" matches the document "A-1 Freeman".
"A-1 Freeman" gets split up into multiple terms, and the "Freeman"
query matches one of those terms.  Text fields are what you use when
you want matches to have some wiggle room based on your analyzers.

String fields are much more geared towards exact matches.  No analysis
is done, so a query for "Freeman" would only match docs who have that
value identically.
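
A sketch of how to keep both behaviors side by side in the schema, via a
copyField (the field and type names here are illustrative):

<field name="report_as" type="text_general" indexed="true" stored="true"/>
<field name="report_as_exact" type="string" indexed="true" stored="false"/>
<copyField source="report_as" dest="report_as_exact"/>

Then report_as:Freeman gives the analyzed, loose matching, while
report_as_exact:"A-1 Freeman" matches only that exact value.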

Jason

On Tue, Nov 19, 2019 at 2:44 PM rhys J  wrote:
>
> I am trying to do a join, which I have working properly on 2 cores.
>
> One core has report_as, and the other core has debt_id.
>
> If I enter report_as:"Freeman", I expect to get 272 results. But I get
> 557.
>
> When I do a database search on the matched fields, it shows me that
> report_as: "Freeman" is matching also on 'A-1 Freeman'.
>
> I have tried boosting the score as report_as: "Freeman"^2, but I get the
> same results from the API, and from the browser itself.
>
> Here is my query:
>
> {
>   "responseHeader":{
> "status":0,
> "QTime":5,
> "params":{
>   "q":"( * )",
>   "indent":"on",
>   "fl":"debt_id, score",
>   "cursorMark":"*",
>   "sort":"score desc, id desc",
>   "fq":"{!join from=debtor_id to=debt_id fromIndex=dbtr}(
> report_as:\"Freeman\"^2)",
>   "rows":"1000"}},
>   "response":{"numFound":557,"start":0,"maxScore":1.0,"docs":[
>   {
> "debt_id":"485435",
> "score":1.0},
>   {
> "debt_id":"485435",
> "score":1.0},
>   {
> "debt_id":"482795",
> "score":1.0},
>   {
> "debt_id":"482795",
> "score":1.0},
>   {
> "debt_id":"482794",
> "score":1.0},
>   {
> "debt_id":"482794",
> "score":1.0},
>   {
> "debt_id":"482794",
> "score":1.0},
>
> SKIP
>
>
>
> {
> "debt_id":"396925",
> "score":1.0},
>   {
> "debt_id":"396925",
> "score":1.0},
>   {
> "debt_id":"396925",
> "score":1.0},
>   {
> "debt_id":"396925",
> "score":1.0},
>   {
> "debt_id":"396925",
> "score":1.0},
>   {
> "debt_id":"396925",
> "score":1.0},
>   {
> "debt_id":"396925",
> "score":1.0},
>   {
> "debt_id":"396925",
> "score":1.0},
>   {
> "debt_id":"396925",
> "score":1.0},
>   {
> "debt_id":"396925",
> "score":1.0},
>   {
> "debt_id":"396925",
>
>
> These ones are the correct matches that I can verify with the
> database, but their scores are the same as the ones matching on
> 'A1-Freeman'
>
> Is my scoring set up wrong?
>
> Thanks,
>
> Rhys


Re: Possible bug in cluster status - > solr 8.3

2019-11-21 Thread Jason Gerlowski
It seems like an issue to me.  Can you open a JIRA with these details?

On Fri, Nov 15, 2019 at 10:51 AM Jacek Kikiewicz  wrote:
>
> I found an interesting situation: I've created a collection with only one
> replica.
> Then I scaled the solr-cloud cluster and ran an 'addreplica' call to add 2 more.
> So I have a collection with 3 tlog replicas, cluster status page shows
> them but shows also this:
>   "core_node2":{
> "core":"EDITED_NAME_shard1_replica_t1",
> "base_url":"http://EDITED_NODE:8983/solr;,
> "node_name":"EDITED_NODE:8983_solr",
> "state":"active",
> "type":"TLOG",
> "force_set_state":"false",
> "leader":"true"},
>   "core_node5":{
> "core":"EDITED_NAME_shard1_replica_t3",
> "base_url":"http://EDITED_NODE:8983/solr;,
> "node_name":"EDITED_NODE:8983_solr",
> "state":"active",
> "type":"TLOG",
> "force_set_state":"false"},
>   "core_node6":{
> "core":"EDITED_NAME_shard1_replica_t4",
> "base_url":"http://EDITED_NODE:8983/solr;,
> "node_name":"EDITED_NODE:8983_solr",
> "state":"active",
> "type":"TLOG",
> "force_set_state":"false",
> "router":{"name":"compositeId"},
> "maxShardsPerNode":"1",
> "autoAddReplicas":"false",
> "nrtReplicas":"1",
> "tlogReplicas":"1",
> "znodeVersion":11,
>
>
> As you can see I have 3 replicas but then I have also: "tlogReplicas":"1"
>
> If I create collection with tlogReplicas=3 then cluster status shows
> "tlogReplicas":"3"
> Is that a bug, or does it somehow 'work as it should'?
>
> Regards,
> Jacek


Re: About Snapshot API and Backup for Solr Index

2019-11-21 Thread Kayak28
I was not clear in the last email.
I mean "For me, it is impossible to "backup" or "restore" Solr's index by
taking a snapshot."

If I confused you, I am sorry about that.

Sincerely,
Kaya Ota

On Thu, Nov 21, 2019 at 19:50, Kayak28 wrote:

> Hello, Community Members:
>
> I am using Solr 7.7.4
> I have a question about a Snapshot API.
>
> https://lucene.apache.org/solr/guide/7_4/making-and-restoring-backups.html#create-snapshot-api
>
> I have tested the basics of the snapshot APIs: create snapshot, list
> snapshot, and delete snapshot.
>
> As far as I know, when I do:
> - create a snapshot: Solr creates a binary file (snapshot_N, where N is
> identical to the N in segments_N) that contains the path of the index.
> - the file is created under the data/snapshot_metadata directory.
>
> - list snapshots: returns JSON containing all the snapshot data, which shows
> the segment generation and the path to the index.
> - delete snapshot: deletes the snapshot's data (the snapshot_N entry).
>
> For me, it is impossible to "backup" or "restore" Solr's index.
>
> So, my questions are:
>
> - How are the snapshot APIs related to "backup" or "restore"?
> - or, more basically, when should I use the snapshot API?
> - Is there any way to make a "backup" without consuming double the size of
> the index? (I am asking because if I use the Backup API, it will copy the
> entire index.)
> - what is the cheapest way to make a backup for Solr?
>
> If you can help me with any of these questions or give me any clue,
> I will really appreciate it.
>
> Sincerely,
> Kaya Ota
>
>
>
>
>
>
>


About Snapshot API and Backup for Solr Index

2019-11-21 Thread Kayak28
Hello, Community Members:

I am using Solr 7.7.4
I have a question about a Snapshot API.
https://lucene.apache.org/solr/guide/7_4/making-and-restoring-backups.html#create-snapshot-api

I have tested the basics of the snapshot APIs: create snapshot, list snapshot,
and delete snapshot.

As far as I know, when I do:
- create a snapshot: Solr creates a binary file (snapshot_N, where N is
identical to the N in segments_N) that contains the path of the index.
- the file is created under the data/snapshot_metadata directory.

- list snapshots: returns JSON containing all the snapshot data, which shows
the segment generation and the path to the index.
- delete snapshot: deletes the snapshot's data (the snapshot_N entry).
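
For reference, the calls I tested look like this (core name and snapshot name
are placeholders):

curl "http://localhost:8983/solr/admin/cores?action=CREATESNAPSHOT&core=mycore&commitName=snap1"
curl "http://localhost:8983/solr/admin/cores?action=LISTSNAPSHOTS&core=mycore"
curl "http://localhost:8983/solr/admin/cores?action=DELETESNAPSHOT&core=mycore&commitName=snap1"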

For me, it is impossible to "backup" or "restore" Solr's index.

So, my questions are:

- How are the snapshot APIs related to "backup" or "restore"?
- or, more basically, when should I use the snapshot API?
- Is there any way to make a "backup" without consuming double the size of
the index? (I am asking because if I use the Backup API, it will copy the
entire index.)
- what is the cheapest way to make a backup for Solr?

If you can help me with any of these questions or give me any clue,
I will really appreciate it.

Sincerely,
Kaya Ota


Re: Metrics avgRequestsPerSecond and avgRequestsPerSecond from documentation gone?

2019-11-21 Thread Koen De Groote
Thanks for that.

On Wed, Nov 20, 2019 at 4:48 PM Andrzej Białecki  wrote:

> Hi,
>
> Yes, the documentation needs to be fixed, these attributes have been
> removed or replaced.
>
> * avgRequestsPerSecond -> requestTimes:meanRate. Please note that this is
> a non-decaying simple average based on the total wall clock time elapsed
> since the handler was started until NOW, and the total number of requests
> the handler processed in this time.
>
> * avgTimePerRequest =  totalTime / requests (in nano-seconds). Please note
> that the “totalTime” metric represents the aggregated elapsed time when the
> handler was processing requests (ie. not including all other elapsed time
> when the handler was just sitting idle). Perhaps a better name for this
> metric would be “totalProcessingTime”.
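>
> For example, a sketch of pulling the raw numbers from the Metrics API (the
> handler and core here are placeholders):
>
> curl "http://localhost:8983/solr/admin/metrics?group=core&prefix=QUERY./select.requestTimes"
> curl "http://localhost:8983/solr/admin/metrics?group=core&prefix=QUERY./select.requests"
> curl "http://localhost:8983/solr/admin/metrics?group=core&prefix=QUERY./select.totalTime"
>
> avgTimePerRequest in milliseconds would then be totalTime / requests /
> 1,000,000, since totalTime is reported in nanoseconds.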
>
> > On 19 Nov 2019, at 17:35, Koen De Groote 
> wrote:
> >
> > Greetings,
> >
> > I'm using Solr 7.6 and have enabled JMX metrics.
> >
> > I ran into this page:
> >
> https://lucene.apache.org/solr/guide/7_6/performance-statistics-reference.html#commonly-used-stats-for-request-handlers
> >
> > Which mentions "avgRequestsPerSecond" and "avgTimePerRequest" and some
> > other attributes, which do not exist anymore in this version. I have an
> > older version(4) I spun up to have a look and they do exist in that
> version.
> >
> > When getting info on a QUERY or UPDATE bean with name `requestTimes`, I
> get
> > this:
> >
> > # attributes
> >  %0   - 50thPercentile (double, r)
> >  %1   - 75thPercentile (double, r)
> >  %2   - 95thPercentile (double, r)
> >  %3   - 98thPercentile (double, r)
> >  %4   - 999thPercentile (double, r)
> >  %5   - 99thPercentile (double, r)
> >  %6   - Count (long, r)
> >  %7   - DurationUnit (java.lang.String, r)
> >  %8   - FifteenMinuteRate (double, r)
> >  %9   - FiveMinuteRate (double, r)
> >  %10  - Max (double, r)
> >  %11  - Mean (double, r)
> >  %12  - MeanRate (double, r)
> >  %13  - Min (double, r)
> >  %14  - OneMinuteRate (double, r)
> >  %15  - RateUnit (java.lang.String, r)
> >  %16  - StdDev (double, r)
> >  %17  - _instanceTag (java.lang.String, r)
> > # operations
> >  %0   - javax.management.ObjectName objectName()
> >  %1   - [J values()
> > #there's no notifications
> >
> > And it seems that none of the current values are actually a proper
> > replacement for the functionality these values used to offer.
> >
> > How shall I go about getting this info now? Do I need to combine several
> > other metrics?
> >
> > For completeness sake, my solr.xml, where I enabled JMX, is just the
> > default example from the documentation, with JMX added:
> >
> > <solr>
> >
> >   <solrcloud>
> >     <str name="host">${host:}</str>
> >     <int name="hostPort">${jetty.port:8983}</int>
> >     <str name="hostContext">${hostContext:solr}</str>
> >     <int name="zkClientTimeout">${zkClientTimeout:15000}</int>
> >     <bool name="genericCoreNodeNames">${genericCoreNodeNames:true}</bool>
> >   </solrcloud>
> >
> >   <shardHandlerFactory name="shardHandlerFactory"
> >       class="HttpShardHandlerFactory">
> >     <int name="socketTimeout">${socketTimeout:0}</int>
> >     <int name="connTimeout">${connTimeout:0}</int>
> >   </shardHandlerFactory>
> >
> >   <metrics>
> >     <hiddenSysProps>
> >       <str>javax.net.ssl.keyStorePassword</str>
> >       <str>javax.net.ssl.trustStorePassword</str>
> >       <str>basicauth</str>
> >       <str>zkDigestPassword</str>
> >       <str>zkDigestReadonlyPassword</str>
> >     </hiddenSysProps>
> >     <reporter name="jmx"
> >         class="org.apache.solr.metrics.reporters.SolrJmxReporter">
> >       <str name="rootName">very_obvious_name_for_easy_reading_${jetty.port:8983}</str>
> >     </reporter>
> >   </metrics>
> >
> > </solr>
> >
> >
> > Kind regards,
> > Koen De Groote
>
>


Re: Solr process takes several minutes before accepting commands after restart

2019-11-21 Thread Koen De Groote
The log files showed a startup, printing of all the config options that
had been set, 1 or 2 commands that got executed, and then nothing.

The curl request did not show up in the log files until after that
period where Solr was unresponsive.

A service mesh? I don't think so. It's in a Docker container, but that
shouldn't be a problem; it usually never is.


On Wed, Nov 20, 2019 at 10:42 AM Jörn Franke  wrote:

> Have you checked the log files of Solr?
>
>
> Do you have a service mesh in-between? Could it be something at the
> network layer/container orchestration  that is blocking requests for some
> minutes?
>
> > Am 20.11.2019 um 10:32 schrieb Koen De Groote <
> koen.degro...@limecraft.com>:
> >
> > Hello
> >
> > I was testing some backup/restore scenarios.
> >
> > 1 of them is Solr7.6 in a docker container(7.6.0-slim), set up as
> > SolrCloud, with zookeeper.
> >
> > The steps are as follows:
> >
> > 1. Manually delete the data folder.
> > 2. Restart the container. The process is now in error mode, complaining
> > that it cannot find the cores.
> > 3. Fix the install, meaning create new data folders, which are empty at
> > this point.
> > 4. Restart the container again, to pick up the empty folders and not be
> in
> > error anymore.
> > 5. Perform the restore
> > 6. Check if everything is available again
> >
> > The problem is between step 4 and 5. After step 4, it takes several
> minutes
> > before solr actually responds to curl commands.
> >
> > Once responsive, the restore happened just fine. But it's very stressful
> in
> > a situation where you have to restore a production environment and the
> > process just doesn't respond for 5-10 minutes.
> >
> > We're talking about 20GB of data here, so not very much, but not little
> > either.
> >
> > Is it normal that it takes so long before solr responds? If not, what
> > should I look at in order to find the cause?
> >
> > I have asked this before recently, though the wording was confusing. This
> > should be clearer.
> >
> > Kind regards,
> > Koen De Groote
>