Re: solrcloud Auto-commit doesn't seem reliable

2018-02-09 Thread Shawn Heisey

On 2/9/2018 8:44 AM, Webster Homer wrote:

I look at the latest timestamp on a record in the collection and see that
it is over 24 hours old.

I send a commit to the collection, and then see that the core is now
current, and the segments are fewer. The commit worked

This is the setting in solrconfig.xml
<autoCommit>
  <maxTime>${solr.autoCommit.maxTime:6}</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>


As recommended, you have openSearcher set to false.

This means that these commits are NEVER going to make changes visible.

Don't go and change openSearcher to true.  It is STRONGLY recommended to 
have openSearcher=false in your autoCommit settings.  The reason for 
this configuration is that it prevents the transaction log from growing 
out of control.  With openSearcher=false, those commits will be very 
fast.  This is because it's opening the searcher that's slow, not the 
process of writing data to disk.


Here's the recommended reading on the subject:

https://lucidworks.com/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

For change visibility, configure autoSoftCommit, probably with a 
different interval than you have for autoCommit.  I would recommend a 
longer interval.  Or include the commitWithin parameter on at least some 
of your update requests.  Or send explicit commit requests, preferably 
as soft commits.
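
As a sketch, the two settings side by side in solrconfig.xml might look
like this (the intervals are illustrative, not recommendations):

```xml
<!-- Hard commits: flush indexed data to disk and truncate the tlog,
     but don't open a new searcher (no visibility change). -->
<autoCommit>
  <maxTime>${solr.autoCommit.maxTime:60000}</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>

<!-- Soft commits: make changes visible; a longer interval means less
     cache invalidation and autowarming work. -->
<autoSoftCommit>
  <maxTime>${solr.autoSoftCommit.maxTime:600000}</maxTime>
</autoSoftCommit>
```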


Thanks,
Shawn


Re: solrcloud Auto-commit doesn't seem reliable

2018-02-09 Thread Webster Homer
We do have autoSoftcommit set to 3 seconds. It is NOT the visibility of
the records that is my primary concern. What I am concerned about is the
accumulation of uncommitted tlog files and the larger number of deleted
documents.

I am VERY familiar with the Solr documentation on this.

On Fri, Feb 9, 2018 at 10:08 AM, Shawn Heisey  wrote:


-- 


This message and any attachment are confidential and may be privileged or 
otherwise protected from disclosure. If you are not the intended recipient, 
you must not copy this message or attachment or disclose the contents to 
any other person. If you have received this transmission in error, please 
notify the sender immediately and delete the message and any attachment 
from your system. Merck KGaA, Darmstadt, Germany and any of its 
subsidiaries do not accept liability for any omissions or errors in this 
message which may arise as a result of E-Mail-transmission or for damages 
resulting from any unauthorized changes of the content of this message and 
any attachment thereto. Merck KGaA, Darmstadt, Germany and any of its 
subsidiaries do not guarantee that this message is free of viruses and does 
not accept liability for any damages caused by any virus transmitted 
therewith.

Click http://www.emdgroup.com/disclaimer to access the German, French, 
Spanish and Portuguese versions of this disclaimer.


Re: solrcloud Auto-commit doesn't seem reliable

2018-02-09 Thread Webster Homer
A little more background: our production SolrClouds are populated via
CDCR. CDCR does not replicate commits; commits to the target clouds
happen via the autoCommit settings.

We see relevancy scores get inconsistent when there are too many
deletes, which seems to happen when hard commits don't happen.

On Fri, Feb 9, 2018 at 10:25 AM, Webster Homer 
wrote:

> I we do have autoSoftcommit set to 3 seconds. It is NOT the visibility of
> the records that is my primary concern. I am concerned about is the
> accumulation of uncommitted tlog files and the larger number of deleted
> documents.
>
> I am VERY familiar with the Solr documentation on this.
>
> On Fri, Feb 9, 2018 at 10:08 AM, Shawn Heisey  wrote:
>
>> On 2/9/2018 8:44 AM, Webster Homer wrote:
>>
>>> I look at the latest timestamp on a record in the collection and see that
>>> it is over 24 hours old.
>>>
>>> I send a commit to the collection, and then see that the core is now
>>> current, and the segments are fewer. The commit worked
>>>
>>> This is the setting in solrconfig.xml
>>>  ${solr.autoCommit.maxTime:6} <
>>> openSearcher>false 
>>>
>>
>> As recommended, you have openSearcher set to false.
>>
>> This means that these commits are NEVER going to make changes visible.
>>
>> Don't go and change openSearcher to true.  It is STRONGLY recommended to
>> have openSearcher=false in your autoCommit settings.  The reason for this
>> configuration is that it prevents the transaction log from growing out of
>> control.  With openSearcher=false, those commits will be very fast.  This
>> is because it's opening the searcher that's slow, not the process of
>> writing data to disk.
>>
>> Here's the recommended reading on the subject:
>>
>> https://lucidworks.com/understanding-transaction-logs-softco
>> mmit-and-commit-in-sorlcloud/
>>
>> For change visibility, configure autoSoftCommit, probably with a
>> different interval than you have for autoCommit.  I would recommend a
>> longer interval.  Or include the commitWithin parameter on at least some of
>> your update requests.  Or send explicit commit requests, preferably as soft
>> commits.
>>
>> Thanks,
>> Shawn
>>
>
>



Re: solrcloud Auto-commit doesn't seem reliable

2018-02-09 Thread Erick Erickson
Do you by any chance have buffering turned on for CDCR? That parameter
is misleading: if true, tlogs will accumulate forever. The blanket
recommendation is becoming "turn buffering off and leave it off"; its
original purpose has really been superseded by bootstrapping.
Buffering was there for maintenance windows before bootstrapping was
put in place. However, even if this is an issue, it shouldn't be
affecting your commits.
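
If buffering is enabled, it can be toggled at runtime through the CDCR
request handler; a sketch of the relevant calls (collection name
hypothetical):

```
/solr/<collection>/cdcr?action=DISABLEBUFFER   <- turn buffering off
/solr/<collection>/cdcr?action=STATUS          <- reports the current "buffer" state
```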

This is puzzling. CDCR is just using tlogs as a queueing mechanism.
Docs are sent from the source to the target cluster, but once received
they're _supposed_ to be just like any other doc that's indexed, i.e.
when the first one is received it should trigger the start of the
autocommit intervals.

Is there any possibility that:
1> your config isn't correct? I'm guessing this is just a typo:
"autoSoftcommit", but worth checking. Besides, that's not your main
concern anyway.
2> your startup parameters override your solrconfig settings for the
hard commit interval and set it to -1 or something? When you start Solr
you should see the result of all the various ways you can set these
intervals "rolled up"; look for messages like:

INFO  org.apache.solr.update.CommitTracker; Hard AutoCommit: if
uncommited for 15000ms;
INFO  org.apache.solr.update.CommitTracker; Soft AutoCommit: disabled
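
As an aside, those lines can be sanity-checked programmatically; here's
a hypothetical helper (the log format is assumed from the samples
above, including Solr's "uncommited" spelling):

```python
import re

def parse_commit_tracker(lines):
    """Pull the effective autocommit intervals out of Solr's
    CommitTracker startup log lines."""
    settings = {}
    for line in lines:
        m = re.search(r"(Hard|Soft) AutoCommit: (.+)", line)
        if not m:
            continue
        kind, rest = m.groups()
        if "disabled" in rest:
            settings[kind] = None  # that commit type is turned off
        else:
            # "uncommited" is spelled that way in the actual log message
            t = re.search(r"uncommited for (\d+)ms", rest)
            settings[kind] = int(t.group(1)) if t else rest.strip()
    return settings

log = [
    "INFO  org.apache.solr.update.CommitTracker; Hard AutoCommit: if uncommited for 15000ms;",
    "INFO  org.apache.solr.update.CommitTracker; Soft AutoCommit: disabled",
]
print(parse_commit_tracker(log))  # {'Hard': 15000, 'Soft': None}
```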


There's some chance that the admin console is misleading, but you've
seen the behavior change when you commit, so that's unlikely to be the
root of your issue either; I mention it in passing.

bq: We see relevancy scores get inconsistent when there are too many
deletes which seems to happen when hard commits don't happen

Right, the hard commit will trigger merging, which in turn removes the
terms associated with deleted documents in the segments that are
merged, which in turn changes your TF/IDF stats. So this is another
piece of evidence that you're getting unexpected behavior. And your
tlogs will accumulate forever too.

This is the first time I've ever heard of this problem, so I'm still
thinking that this is something odd about your setup, but what it is
escapes me from what you've said so far.

I want to check one other thing: You say you've seen this behavior in
4.10. CDCR wasn't introduced until considerably later, so what was the
scenario in the 4.10 case? Is my CDCR tangent just a red herring?

Best,
Erick


On Fri, Feb 9, 2018 at 8:29 AM, Webster Homer  wrote:

Re: solrcloud Auto-commit doesn't seem reliable

2018-02-09 Thread Shawn Heisey

On 2/9/2018 9:29 AM, Webster Homer wrote:

A little more background. Our production Solrclouds are populated via CDCR,
CDCR does not replicate commits, Commits to the target clouds happen via
autoCommit settings

We see relvancy scores get inconsistent when there are too many deletes
which seems to happen when hard commits don't happen.

On Fri, Feb 9, 2018 at 10:25 AM, Webster Homer 
wrote:


I we do have autoSoftcommit set to 3 seconds. It is NOT the visibility of
the records that is my primary concern. I am concerned about is the
accumulation of uncommitted tlog files and the larger number of deleted
documents.


For the deleted documents:  Have you ever done an optimize on the 
collection?  If so, you're going to need to re-do the optimize regularly 
to keep deleted documents from growing out of control.  See this issue 
for a very technical discussion about it:


https://issues.apache.org/jira/browse/LUCENE-7976

Deleted documents probably aren't really related to what we've been 
discussing.  That shouldn't really be strongly affected by commit settings.
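
For reference, an optimize (forced merge) can be issued against a
collection like this (a sketch; collection name hypothetical, and
whether to run it at all is the real question, per the issue above):

```
/solr/<collection>/update?optimize=true&maxSegments=1
```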


-

A 3 second autoSoftCommit is VERY aggressive.   If your soft commits are 
taking longer than 3 seconds to complete, which is often what happens, 
then that will lead to problems.  I wouldn't expect it to cause the 
kinds of problems you describe, though.  It would manifest as Solr 
working too hard, logging warnings or errors, and changes taking too 
long to show up.


Assuming that the config for autoSoftCommit doesn't have the typo that 
Erick mentioned.




I have never used CDCR, so I know very little about it.  But I have seen 
reports on this mailing list saying that transaction logs never get 
deleted when CDCR is configured.


Below is a link to a mailing list discussion related to CDCR not 
deleting transaction logs.  Looks like for it to work right a buffer 
needs to be disabled, and there may also be problems caused by not 
having a complete zkHost string in the CDCR config:


http://lucene.472066.n3.nabble.com/CDCR-how-to-deal-with-the-transaction-log-files-td4345062.html

Erick also mentioned this.

Thanks,
Shawn


Re: solrcloud Auto-commit doesn't seem reliable

2018-02-12 Thread Webster Homer
Erick, I am aware of the CDCR buffering problem causing tlog retention;
we always turn buffering off in our CDCR configurations.

My post was precipitated by seeing that we had uncommitted data in
collections > 24 hours after it was loaded. The collections I was looking
at are in our development environment, where we do not use CDCR. However
I'm pretty sure that I've seen situations in production where commits were
also long overdue.

The "autoSoftcommit" was a typo. The soft commit logic seems to be
fine; I don't see an issue with data visibility. But if 3 seconds is
aggressive, what would be a good value for soft commit? We have a
couple of collections that are updated every minute, although most of
them are updated much less frequently.

My reason for raising this commit issue is that we see problems with
the relevancy of SolrCloud searches and the NRT replica type. Sometimes
the results flip, where the best hit varies by which replica serviced
the search. This is hard to explain to management. Doing an optimize
does address the problem for a while, but I try to avoid optimizing for
the reasons you and Shawn list. If a commit doesn't happen, how would
there ever be an index merge that would remove the deleted documents?

The problem with deletes and relevancy doesn't seem to occur when we
use TLOG replicas, probably because they don't do their own indexing
but get copies from their leader. We are testing them now; eventually
we may abandon the use of NRT replicas for most of our collections.

I am quite concerned about this commit issue. What kinds of things would
influence whether a commit occurs? One commonality for our systems is that
they are hosted in a Google cloud. We have a number of collections that
share configurations, but others that do not. I think commits do happen,
but I don't trust that autoCommit is reliable. What can we do to make it
reliable?

Most of our collections are reindexed weekly with partial updates applied
daily, that at least is what happens in production, our development clouds
are not as regular.

Our solr startup script sets the following values:
-Dsolr.autoCommit.maxDocs=35000
-Dsolr.autoCommit.maxTime=6
-Dsolr.autoSoftCommit.maxTime=3000

I don't think we reference solr.autoCommit.maxDocs in our
solrconfig.xml files.

Here are our settings for autoCommit and autoSoftCommit. We had a lot
of issues with missing commits when we didn't set
solr.autoCommit.maxTime:

<autoCommit>
  <maxTime>${solr.autoCommit.maxTime:6}</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>

<autoSoftCommit>
  <maxTime>${solr.autoSoftCommit.maxTime:5000}</maxTime>
</autoSoftCommit>



On Fri, Feb 9, 2018 at 3:49 PM, Shawn Heisey  wrote:



Re: solrcloud Auto-commit doesn't seem reliable

2018-02-12 Thread Erick Erickson
bq: But if 3 seconds is aggressive what would be a  good value for soft commit?

The usual answer is "as long as you can stand". All top-level caches
are invalidated, autowarming is done, etc. on each soft commit. That
can be a lot of work, and if your users are comfortable with docs not
showing up for, say, 10 minutes, then use 10 minutes. As always, "it
depends"; the point is not to do unnecessary work if possible.

bq: If a commit doesn't happen how would there ever be an index merge
that would remove the deleted documents.

Right, it wouldn't. It's a little more subtle than that, though.
Segments on various replicas will contain different docs, thus the
term/doc statistics can be a bit different between multiple replicas.
None of the stats will change until the commit, though. You might try
turning on distributed doc/term stats, though.

Your comments about PULL or TLOG replicas are well taken. However, even
those won't be absolutely in sync, since they'll replicate from the
master at slightly different times and _could_ get slightly different
segments _if_ there's indexing going on. But let's say you stop
indexing. After the next poll interval all the replicas will have
identical characteristics and will score the docs the same.

I don't have any significant wisdom to offer here, except that this is
really the first time I've heard of this behavior. About all I can
imagine is that _somehow_ the soft commit interval is -1. When you say
you "issue a commit", I'm assuming it's via
collection/update?commit=true or some such, which issues a hard commit
with openSearcher=true. And it's on a _collection_ basis, right?
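
For reference, the explicit commit variants look like this (collection
name hypothetical; softCommit=true requests a soft commit instead of a
hard one):

```
/solr/<collection>/update?commit=true       <- hard commit, openSearcher=true
/solr/<collection>/update?softCommit=true   <- soft commit only
```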

Sorry I can't be more help
Erick




On Mon, Feb 12, 2018 at 10:44 AM, Webster Homer  wrote:

Re: solrcloud Auto-commit doesn't seem reliable

2018-02-16 Thread Webster Homer
I meant to get back to this sooner.

When I say I issued a commit, I do issue it as
collection/update?commit=true.

The soft commit interval is set to 3000, but I don't have a problem with
soft commits ( I think). I was responding

I am concerned that some hard commits don't seem to happen, though I
think many commits do occur. I'd like suggestions on how to diagnose
this, and perhaps an idea of where to look. Typically I believe that
issues like this come from our configuration.

Our indexing job is pretty simple: we send blocks of JSON to
/update/json. We either re-index the whole collection or just apply
updates. Typically we reindex the data once a week and delete any
records that are older than the last full index. This does lead to a
fair number of deleted records in the index, especially if commits
fail. Most of our collections are not large, between 2 and 3 million
records.
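
A minimal sketch of that kind of update request (host, collection, and
document are hypothetical; commitWithin is one way to bound how long
data can sit uncommitted):

```
curl -X POST -H 'Content-Type: application/json' \
  'http://localhost:8983/solr/<collection>/update?commitWithin=60000' \
  -d '[{"id": "1", "last_indexed": "2018-02-16T00:00:00Z"}]'
```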

The collections are hosted in Google Cloud.

On Mon, Feb 12, 2018 at 5:00 PM, Erick Erickson 
wrote:


Re: solrcloud Auto-commit doesn't seem reliable

2018-03-21 Thread Elaine Cario
I'm just catching up on reading Solr emails, so forgive me for being
late to this dance.

I've just gone through a project to enable CDCR on our Solr, and I also
experienced a small period of time where the commits on the source server
just seemed to stop.  This was during a period of intense experimentation
where I was mucking around with configurations, turning CDCR on/off, etc.
At some point the commits stopped occurring, and it drove me nuts for a
couple of days - tried everything - restarting Solr, reloading, turned
buffering on, turned buffering off, etc.  I finally threw up my hands and
rebooted the server out of desperation (it was a physical Linux box).
Commits worked fine after that. I don't know what caused the commits to
stop, or why rebooting (and not just restarting Solr) caused them to
start working again.

Wondering if you ever found a solution to your situation?



On Fri, Feb 16, 2018 at 2:44 PM, Webster Homer 
wrote:

> I meant to get back to this sooner.
>
> When I say I issued a commit I do issue it as collection/update?commit=true
>
> The soft commit interval is set to 3000, but I don't have a problem with
> soft commits ( I think). I was responding
>
> I am concerned that some hard commits don't seem to happen, but I think
> many commits do occur. I'd like suggestions on how to diagnose this, and
> perhaps an idea of where to look. Typically I believe that issues like this
> are from our configuration.
>
> Our indexing job is pretty simple, we send blocks of JSON to
> /update/json. We have either re-index the whole collection, or
> just apply updates. Typically we reindex the data once a week and delete
> any records that are older than the last full index. This does lead to a
> fair number of deleted records in the index especially if commits fail.
> Most of our collections are not large between 2 and 3 million records.
>
> The collections are hosted in google cloud
>
> On Mon, Feb 12, 2018 at 5:00 PM, Erick Erickson 
> wrote:
>
> > bq: But if 3 seconds is aggressive what would be a  good value for soft
> > commit?
> >
> > The usual answer is "as long as you can stand". All top-level caches are
> > invalidated, autowarming is done etc. on each soft commit. That can be a
> > lot of
> > work and if your users are comfortable with docs not showing up for,
> > say, 10 minutes
> > then use 10 minutes. As always "it depends" here, the point is not to
> > do unnecessary
> > work if possible.
> >
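[Editor's note: Erick's "as long as you can stand" advice maps onto solrconfig.xml roughly like this - a sketch with illustrative intervals (60-second hard commits that never open a searcher, 10-minute soft commits for visibility), following the settings quoted earlier in the thread.]

```xml
<autoCommit>
  <maxTime>${solr.autoCommit.maxTime:60000}</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>
<autoSoftCommit>
  <maxTime>${solr.autoSoftCommit.maxTime:600000}</maxTime>
</autoSoftCommit>
```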
> > bq: If a commit doesn't happen how would there ever be an index merge
> > that would remove the deleted documents.
> >
> > Right, it wouldn't. It's a little more subtle than that though.
> > Segments on various
> > replicas will contain different docs, thus the term/doc statistics can be
> > a bit
> > different between multiple replicas. None of the stats will change
> > until the commit
> > though. You might try turning on distributed doc/term stats, though.
> >
> > Your comments about PULL or TLOG replicas are well taken. However, even
> > those
> > won't be absolutely in sync since they'll replicate from the master at
> > slightly
> > different times and _could_ get slightly different segments _if_
> > there's indexing
> > going on. But let's say you stop indexing. After the next poll
> > interval all the replicas
> > will have identical characteristics and will score the docs the same.
> >
> > I don't have any significant wisdom to offer here, except this is really
> the
> > first time I've heard of this behavior. About all I can imagine is
> > that _somehow_
> > the soft commit interval is -1. When you say you "issue a commit" I'm
> > assuming
> > it's via collection/update?commit=true or some such which issues a
> > hard
> > commit with openSearcher=true. And it's on a _collection_ basis, right?
> >
> > Sorry I can't be more help
> > Erick
> >
> >
> >
> >
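[Editor's note: for reference, the two explicit-commit styles discussed in this thread differ only in a query parameter. A minimal sketch; base URL and collection name are placeholders.]

```python
def commit_url(base: str, collection: str, soft: bool = False) -> str:
    """Explicit commit request against a whole collection.

    commit=true forces a hard commit that also opens a new searcher;
    softCommit=true only reopens the searcher (no flush to disk), which
    is the cheaper way to make documents visible."""
    param = "softCommit=true" if soft else "commit=true"
    return f"{base}/{collection}/update?{param}"
```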
> > On Mon, Feb 12, 2018 at 10:44 AM, Webster Homer 
> > wrote:
> > > Erick, I am aware of the CDCR buffering problem causing tlog retention,
> > we
> > > always turn buffering off in our cdcr configurations.
> > >
> > > My post was precipitated by seeing that we had uncommitted data in
> > > collections > 24 hours after it was loaded. The collections I was
> looking
> > > at are in our development environment, where we do not use CDCR.
> However
> > > I'm pretty sure that I've seen situations in production where commits
> > were
> > > also long overdue.
> > >
> > > the "autoSoftcommit" was a typo. The soft commit logic seems to be
> fine,
> > I
> > > don't see an issue with data visibility. But if 3 seconds is aggressive
> > > what would be a  good value for soft commit? We have a couple of
> > > collections that are updated every minute although most of them are
> > updated
> > > much less frequently.
> > >
> > > My reason for raising this commit issue is that we see problems with
> the
> > > relevancy of solrcloud searches, and the NRT replica type. Sometimes
> the
> > > results flip where the best hit varies by what replica serviced th

Re: solrcloud Auto-commit doesn't seem reliable

2018-03-23 Thread Amrit Sarkar
Elaine,

When you say commits are not working, do you mean the Solr logs are not
printing "commit" messages, or that documents are not appearing when you search?

Amrit Sarkar
Search Engineer
Lucidworks, Inc.
415-589-9269
www.lucidworks.com
Twitter http://twitter.com/lucidworks
LinkedIn: https://www.linkedin.com/in/sarkaramrit2
Medium: https://medium.com/@sarkaramrit2

On Thu, Mar 22, 2018 at 4:05 AM, Elaine Cario  wrote:

> I'm just catching up on reading solr emails, so forgive me for being late
> to this dance
>
> I've just gone through a project to enable CDCR on our Solr, and I also
> experienced a small period of time where the commits on the source server
> just seemed to stop.  This was during a period of intense experimentation
> where I was mucking around with configurations, turning CDCR on/off, etc.
> At some point the commits stopped occurring, and it drove me nuts for a
> couple of days - tried everything - restarting Solr, reloading, turned
> buffering on, turned buffering off, etc.  I finally threw up my hands and
> rebooted the server out of desperation (it was a physical Linux box).
> Commits worked fine after that.  I don't know what caused the commits to
> stop, and why re-booting (and not just restarting Solr) caused them to work
> fine.
>
> Wondering if you ever found a solution to your situation?

Re: solrcloud Auto-commit doesn't seem reliable

2018-03-23 Thread Webster Homer
It's been a while since I had time to look further into this. I'll have to
go back through the logs, which I'll need an admin to retrieve.

On Fri, Mar 23, 2018 at 8:45 AM, Amrit Sarkar 
wrote:

> Elaino,
>
> When you say commits not working, the solr logs not printing "commit"
> messages? or documents are not appearing when we search.