Re: replication problems with solr4.1

2013-02-14 Thread Amit Nithian
I may be missing something but let me go back to your original statements:
1) You build the index once per week from scratch
2) You replicate this from master to slave.

My understanding of the way replication works is that it's meant to send
along only files that are new. If any files with the same name on the master
and slave have different sizes, this is treated as a "corruption" of sorts,
and the slave downloads the full index into a new index.xxx directory. This,
I think, explains your index.xxx issue, although I'm not sure why the old
index/ directory isn't being deleted. This is why I was asking about OS
details, file system details, etc. (perhaps something else is locking that
directory and preventing Java from deleting it?).
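
As a rough illustration of the comparison described above (not Solr's actual
code): the function name plan_fetch, the name-to-size dictionaries, and the
file names and sizes are all made up for this sketch.

```python
def plan_fetch(master_files, slave_files):
    """Decide what a slave would download, per the rule described above.

    Both arguments map file name -> size in bytes (hypothetical inputs).
    Returns a tuple (full_copy, files_to_download).
    """
    # A file present on both sides with a different size is treated as
    # a mismatch, which forces a full download into a fresh directory.
    mismatch = any(
        name in slave_files and slave_files[name] != size
        for name, size in master_files.items()
    )
    if mismatch:
        return True, sorted(master_files)
    # Otherwise only files the slave does not have yet are fetched.
    return False, sorted(n for n in master_files if n not in slave_files)


# Incremental case: one new segment file and a new segments_N file.
master = {"_0.fdt": 100, "_1.fdt": 50, "segments_3b": 1}
slave = {"_0.fdt": 100, "segments_3a": 1}
print(plan_fetch(master, slave))  # (False, ['_1.fdt', 'segments_3b'])

# Rebuilt-from-scratch case: same file name, different size.
print(plan_fetch({"_0.fdt": 999}, {"_0.fdt": 100}))  # (True, ['_0.fdt'])
```

Under this reading, an index rebuilt from scratch reuses file names with
different sizes, so it would always take the full-download path.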

The second issue is the index generation, which is governed by commits and
can be read from the last few characters of the segments_XX file name. When
the slave downloads the index and copies the new files, it does a commit to
force a new searcher, which is why the slave generation will be +1 from the
master.
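
A small sketch of reading the generation from a segments_XX file name. It
relies on Lucene writing the suffix in base 36; the helper name
generation_from_segments_name is made up for this example.

```python
def generation_from_segments_name(file_name):
    """Return the commit generation encoded in a segments_N file name."""
    prefix = "segments_"
    if not file_name.startswith(prefix):
        raise ValueError("not a segments_N file: " + file_name)
    # The suffix is the generation in base 36 (digits 0-9, then a-z).
    return int(file_name[len(prefix):], 36)


# Generations 118, 119 and 120 from this thread correspond to these names.
for name in ("segments_3a", "segments_3b", "segments_3c"):
    print(name, generation_from_segments_name(name))
```

So a master at generation 119 would show segments_3b, and a slave that
commits once after the download would show segments_3c.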

The index "version" is a timestamp, and it may be the case that the version
represents the point in time when the index was downloaded to the slave? In
general, these details shouldn't matter, because replication is only
triggered if the master's version > slave's version, assuming the clocks
that all servers use are synced to some common clock.
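
To make this concrete, here is a sketch using the numbers Bernd reported
earlier in this thread (master generation 119, slave generation 120). It
assumes the version is milliseconds since the Unix epoch and that the
trigger is a plain "master newer than slave" comparison as described above;
the function names are made up, and this is not Solr's code.

```python
from datetime import datetime, timezone


def version_to_utc(version_ms):
    """Interpret an index version as milliseconds since the Unix epoch."""
    return datetime.fromtimestamp(version_ms / 1000, tz=timezone.utc)


def should_replicate(master_version, slave_version):
    """Rule as described above: fetch only if the master is newer."""
    return master_version > slave_version


master_version = 1360595446556  # master, generation 119
slave_version = 1360595564333   # slave after replication, generation 120

print(version_to_utc(master_version))
print(version_to_utc(slave_version))
print((slave_version - master_version) / 1000, "seconds apart")
print(should_replicate(master_version, slave_version))
```

The slave's version is about two minutes later than the master's, which
would fit a commit made on the slave after the download finished.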

One caveat to my answer: I have yet to try 4.1, as it is next on my TODO
list, so maybe I'll run into the same problem :-). I wanted to provide some
info anyway, as I just recently dug through the replication code to
understand it better myself.

Cheers
Amit


On Wed, Feb 13, 2013 at 11:57 PM, Bernd Fehling <
bernd.fehl...@uni-bielefeld.de> wrote:

> OK, then index generation and index version can't be counted on when it comes
> to verifying that the master and slave indexes are in sync.
>
> What else is possible?
>
> The strange thing is if master is 2 or more generations ahead of slave
> then it works!
> With your logic the slave must _always_ be one generation ahead of the
> master,
> because the slave replicates from master and then does an additional commit
> to recognize the changes on the slave.
> This implies that the slave acts as follows:
> - if the master is one generation ahead then do an additional commit
> - if the master is 2 or more generations ahead then do _no_ commit
> OR
> - if the master is 2 or more generations ahead then do a commit but don't
>   change generation and version of index
>
> Can this be true?
>
> I would say "not really".
>
> Regards
> Bernd
>
>
> Am 13.02.2013 20:38, schrieb Amit Nithian:
> > Okay so then that should explain the generation difference of 1 between the
> > master and slave
> >
> >
> > On Wed, Feb 13, 2013 at 10:26 AM, Mark Miller wrote:
> >
> >>
> >> On Feb 13, 2013, at 1:17 PM, Amit Nithian  wrote:
> >>
> >>> doesn't it do a commit to force solr to recognize the changes?
> >>
> >> yes.
> >>
> >> - Mark
> >>
> >
>


Re: replication problems with solr4.1

2013-02-13 Thread Bernd Fehling
OK, then index generation and index version can't be counted on when it comes
to verifying that the master and slave indexes are in sync.

What else is possible?

The strange thing is if master is 2 or more generations ahead of slave
then it works!
With your logic the slave must _always_ be one generation ahead of the master,
because the slave replicates from master and then does an additional commit
to recognize the changes on the slave.
This implies that the slave acts as follows:
- if the master is one generation ahead then do an additional commit
- if the master is 2 or more generations ahead then do _no_ commit
OR
- if the master is 2 or more generations ahead then do a commit but don't
  change generation and version of index

Can this be true?

I would say "not really".

Regards
Bernd


Am 13.02.2013 20:38, schrieb Amit Nithian:
> Okay so then that should explain the generation difference of 1 between the
> master and slave
> 
> 
> On Wed, Feb 13, 2013 at 10:26 AM, Mark Miller  wrote:
> 
>>
>> On Feb 13, 2013, at 1:17 PM, Amit Nithian  wrote:
>>
>>> doesn't it do a commit to force solr to recognize the changes?
>>
>> yes.
>>
>> - Mark
>>
> 


Re: replication problems with solr4.1

2013-02-13 Thread Amit Nithian
Okay so then that should explain the generation difference of 1 between the
master and slave


On Wed, Feb 13, 2013 at 10:26 AM, Mark Miller  wrote:

>
> On Feb 13, 2013, at 1:17 PM, Amit Nithian  wrote:
>
> > doesn't it do a commit to force solr to recognize the changes?
>
> yes.
>
> - Mark
>


Re: replication problems with solr4.1

2013-02-13 Thread Mark Miller

On Feb 13, 2013, at 1:17 PM, Amit Nithian  wrote:

> doesn't it do a commit to force solr to recognize the changes?

yes.

- Mark


Re: replication problems with solr4.1

2013-02-13 Thread Amit Nithian
So just a hunch... but when the slave downloads the data from the master,
doesn't it do a commit to force Solr to recognize the changes? In so doing,
wouldn't that increase the generation number? In theory it shouldn't matter,
because replication looks for files that differ to determine whether to do
a full download or a partial replication. In the event of a full
replication (an optimize would cause this), I think the replication handler
considers this a "corruption" and forces a full download into an index.xxx
folder, with index.properties pointing at this folder to tell Solr that
this is the new index directory. Since you mentioned you rebuild the index
from scratch once per week, I'd expect to see the behavior you are
describing.
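
For checking a slave's data directory by hand, a sketch along these lines
may help. It assumes index.properties is a plain key=value file whose
"index" key names the active directory (an assumption worth verifying
against your own file); the function names are made up for this example.

```python
import os


def active_index_dir(data_dir):
    """Return the index directory named in index.properties, else 'index'."""
    props_path = os.path.join(data_dir, "index.properties")
    if not os.path.exists(props_path):
        return "index"
    with open(props_path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            # Skip blank lines and properties-file comments.
            if not line or line.startswith(("#", "!")):
                continue
            key, sep, value = line.partition("=")
            # Assumed key name: "index".
            if sep and key.strip() == "index":
                return value.strip()
    return "index"


def stale_index_dirs(data_dir):
    """List index* directories that are not the active one."""
    active = active_index_dir(data_dir)
    return sorted(
        name for name in os.listdir(data_dir)
        if name.startswith("index")
        and os.path.isdir(os.path.join(data_dir, name))
        and name != active
    )
```

Calling stale_index_dirs on the slave's data directory would list a
leftover index/ directory if index.properties points somewhere else.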

I remember debugging the code to find out how replication works in 4.0
because of a bug that was fixed in 4.1 but I haven't read through the 4.1
code to see how much (if any) has changed from this logic.

In short, I don't know why you'd have the old "index/" directory there.
That seems either like a bug or like something was locking that directory in
the filesystem, preventing it from being removed. What OS are you using, and
is the index/ directory stored on a local file system or on NFS?

HTH
Amit


On Tue, Feb 12, 2013 at 2:26 AM, Bernd Fehling <
bernd.fehl...@uni-bielefeld.de> wrote:

>
> Now this is strange: the index generation and index version
> are changing with replication.
>
> e.g. master has index generation 118 index version 136059533234
> and  slave  has index generation 118 index version 136059533234
> so both are the same.
>
> Now add one doc to master with commit.
> master has index generation 119 index version 1360595446556
>
> Next replicate master to slave. The result is:
> master has index generation 119 index version 1360595446556
> slave  has index generation 120 index version 1360595564333
>
> I have not seen this before.
> I thought replication is just taking over the index from master to slave,
> more like a sync?
>
>
>
>
> Am 11.02.2013 09:29, schrieb Bernd Fehling:
> > Hi list,
> >
> > after upgrading from solr4.0 to solr4.1 and running it for two weeks now
> > it turns out that replication has problems and unpredictable results.
> > My installation is a single index: 41 million docs / 115 GB index size /
> > 1 master / 3 slaves.
> > - the master builds a new index from scratch once a week
> > - a replication is started manually with Solr admin GUI
> >
> > What I see is one of these cases:
> > - after a replication a new searcher is opened on index.xxx directory and
> >   the old data/index/ directory is never deleted and besides the file
> >   replication.properties there is also a file index.properties
> > OR
> > - the replication takes place and everything looks fine, but when opening
> >   the admin GUI the statistics report
> > Last Modified: a day ago
> > Num Docs: 42262349
> > Max Doc:  42262349
> > Deleted Docs:  0
> > Version:  45174
> > Segment Count: 1
> >
> >         Version        Gen  Size
> > Master: 1360483635404  112  116.5 GB
> > Slave:  1360483806741  113  116.5 GB
> >
> >
> > In the first case, why is the replication doing that???
> > It is an offline slave, no search activity, just there for backup!
> >
> >
> > In the second case, why is the version and generation different right after
> > full replication?
> >
> >
> > Any thoughts on this?
> >
> >
> > - Bernd
> >
>
> --
> *
> Bernd Fehling            Bielefeld University Library
> Dipl.-Inform. (FH)       LibTec - Library Technology
> Universitätsstr. 25      and Knowledge Management
> 33615 Bielefeld
> Tel. +49 521 106-4060    bernd.fehling(at)uni-bielefeld.de
>
> BASE - Bielefeld Academic Search Engine - www.base-search.net
> *
>


Re: replication problems with solr4.1

2013-02-12 Thread Bernd Fehling

Now this is strange: the index generation and index version
are changing with replication.

e.g. master has index generation 118 index version 136059533234
and  slave  has index generation 118 index version 136059533234
so both are the same.

Now add one doc to master with commit.
master has index generation 119 index version 1360595446556

Next replicate master to slave. The result is:
master has index generation 119 index version 1360595446556
slave  has index generation 120 index version 1360595564333

I have not seen this before.
I thought replication is just taking over the index from master to slave,
more like a sync?




Am 11.02.2013 09:29, schrieb Bernd Fehling:
> Hi list,
> 
> after upgrading from solr4.0 to solr4.1 and running it for two weeks now
> it turns out that replication has problems and unpredictable results.
> My installation is a single index: 41 million docs / 115 GB index size /
> 1 master / 3 slaves.
> - the master builds a new index from scratch once a week
> - a replication is started manually with Solr admin GUI
> 
> What I see is one of these cases:
> - after a replication a new searcher is opened on index.xxx directory and
>   the old data/index/ directory is never deleted and besides the file
>   replication.properties there is also a file index.properties
> OR
> - the replication takes place and everything looks fine, but when opening
>   the admin GUI the statistics report
> Last Modified: a day ago
> Num Docs: 42262349
> Max Doc:  42262349
> Deleted Docs:  0
> Version:  45174
> Segment Count: 1
> 
>         Version        Gen  Size
> Master: 1360483635404  112  116.5 GB
> Slave:  1360483806741  113  116.5 GB
> 
> 
> In the first case, why is the replication doing that???
> It is an offline slave, no search activity, just there for backup!
> 
> 
> In the second case, why is the version and generation different right after
> full replication?
> 
> 
> Any thoughts on this?
> 
> 
> - Bernd
> 

-- 
*
Bernd Fehling            Bielefeld University Library
Dipl.-Inform. (FH)       LibTec - Library Technology
Universitätsstr. 25      and Knowledge Management
33615 Bielefeld
Tel. +49 521 106-4060    bernd.fehling(at)uni-bielefeld.de

BASE - Bielefeld Academic Search Engine - www.base-search.net
*


replication problems with solr4.1

2013-02-11 Thread Bernd Fehling
Hi list,

after upgrading from solr4.0 to solr4.1 and running it for two weeks now
it turns out that replication has problems and unpredictable results.
My installation is a single index: 41 million docs / 115 GB index size /
1 master / 3 slaves.
- the master builds a new index from scratch once a week
- a replication is started manually with Solr admin GUI

What I see is one of these cases:
- after a replication a new searcher is opened on index.xxx directory and
  the old data/index/ directory is never deleted and besides the file
  replication.properties there is also a file index.properties
OR
- the replication takes place and everything looks fine, but when opening
  the admin GUI the statistics report
Last Modified: a day ago
Num Docs: 42262349
Max Doc:  42262349
Deleted Docs:  0
Version:  45174
Segment Count: 1

        Version        Gen  Size
Master: 1360483635404  112  116.5 GB
Slave:  1360483806741  113  116.5 GB


In the first case, why is the replication doing that???
It is an offline slave, no search activity, just there for backup!


In the second case, why is the version and generation different right after
full replication?


Any thoughts on this?


- Bernd