Re: Two instances of solr - the same datadir?

Roman Chyla Wed, 03 Jul 2013 23:20:09 -0700

I have spent a lot of time in the past day playing with this setup, and
finally made it work. Here are a few bits of interest:


- solr v4.0
- linux, java7, local filesystem
- big index, 1 RW instance + 2 RO instances (sharing the same index)


a lock is acquired when solr is writing data - if you happen to be starting
your RO instance at that moment and it is using the 'native' lock, startup
will fail. However, with the RW instance on the 'native' lock and the 2 RO
instances on the 'single' lock, the RO instances can start, but they will
eventually get into trouble too - our index is so big that when core
RELOAD is called while indexing is under way, the RO instances time out.
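
For reference, the lock combination above could be driven from one shared
solrconfig.xml roughly like this. It is only a sketch: it assumes the 4.x
layout where lockType sits inside indexConfig, and solr.lock.type is an
illustrative property name, not something taken from my actual config.

```xml
<indexConfig>
  <!-- defaults to 'native' (the RW instance); the property name
       solr.lock.type is hypothetical, pick whatever fits your setup -->
  <lockType>${solr.lock.type:native}</lockType>
</indexConfig>
```

The RO instances would then be started with something like
java -Dsolr.lock.type=single ..... while the RW instance keeps the default.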

core reload, when using the 'native' lock, seems to work fine - if you were
lucky and all instances managed to start - HOWEVER, the core is
unresponsive until fully loaded (makes sense), which is actually
terrible - your search is gone for seconds/minutes

the best setup is as described in my original post - RO instances MUST NOT
commit anything, nor use reload (because during reload solr tries to
acquire the lock). Instead, they should just reopen the searcher - i repeat:
you should make sure that nothing is ever going to write on the RO
instance. And because there is no public api for reopening the searcher, I
wrote a simple handler which just calls:

req.getCore().getSearcher(true, false, null, false);

when called, the RO instances continue to handle requests using the old
searcher while the new one warms in the background; once ready, the new
searcher takes over [to repeat: i am triggering this refresh from the RW
instance, it does 'curl http://foo/solr/myhandler?command=reopenSearcher']
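
The hand-over described above (old searcher keeps serving, new one warms in
the background, then takes over) can be modelled in plain Java. This is a toy
sketch of the idea only, not Solr code: SearcherSwapSketch, Searcher and
reopen are made-up names, and the warmup callback stands in for whatever
cache warming you have configured.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicReference;

// Toy model only: none of these names are Solr classes.
class SearcherSwapSketch {

    // Stand-in for an index searcher: a fixed view of one index generation.
    static final class Searcher {
        final long generation;

        Searcher(long generation) {
            this.generation = generation;
        }

        String search(String query) {
            return "gen" + generation + ":" + query;
        }
    }

    // The searcher that is currently published to request threads.
    private final AtomicReference<Searcher> current =
            new AtomicReference<>(new Searcher(1));

    // Single background thread, so only one warmup runs at a time.
    private final ExecutorService warmer = Executors.newSingleThreadExecutor();

    // Queries always go to whichever searcher is published right now.
    String search(String query) {
        return current.get().search(query);
    }

    // Open and warm a new searcher off the request thread, then publish it.
    Future<?> reopen(long newGeneration, Runnable warmup) {
        return warmer.submit(() -> {
            Searcher fresh = new Searcher(newGeneration);
            warmup.run();        // old searcher keeps serving during this
            current.set(fresh);  // atomic swap: the new searcher takes over
        });
    }

    void shutdown() {
        warmer.shutdown();
    }
}
```

Calling reopen(2, warmup) returns immediately; search() keeps answering from
generation 1 until the warmup finishes, and from generation 2 afterwards.
That is the behaviour that makes reopening the searcher so much nicer than a
full core RELOAD, where the core is unresponsive until loaded.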


the bad thing: when the RO instance dies (e.g. OOM error) while the RW is
in the middle of writing data, you can't restart the RO instance (unless you
use the 'single' lock or some other lock type)

HTH,

  roman




On Tue, Jul 2, 2013 at 5:35 PM, Michael Della Bitta <
michael.della.bi...@appinions.com> wrote:

> Wouldn't it be better to do a RELOAD?
>
> http://wiki.apache.org/solr/CoreAdmin#RELOAD
>
> Michael Della Bitta
>
> Applications Developer
>
> o: +1 646 532 3062  | c: +1 917 477 7906
>
> appinions inc.
>
> “The Science of Influence Marketing”
>
> 18 East 41st Street
>
> New York, NY 10017
>
> t: @appinions <https://twitter.com/Appinions> | g+:
> plus.google.com/appinions
> w: appinions.com <http://www.appinions.com/>
>
>
> On Tue, Jul 2, 2013 at 5:05 PM, Peter Sturge <peter.stu...@gmail.com>
> wrote:
>
> > The RO instance commit isn't (or shouldn't be) doing any real writing,
> just
> > an empty commit to force new searchers, autowarm/refresh caches etc.
> > Admittedly, we do all this on 3.6, so 4.0 could have different behaviour
> in
> > this area.
> > As long as you don't have autocommit in solrconfig.xml, there wouldn't be
> > any commits 'behind the scenes' (we do all our commits via a local solrj
> > client so it can be fully managed).
> > The only caveat might be NRT/soft commits, but I'm not too familiar with
> > this in 4.0.
> > In any case, your RO instance must be getting updated somehow, otherwise
> > how would it know your write instance made any changes?
> > Perhaps your write instance notifies the RO instance externally from
> Solr?
> > (a perfectly valid approach, and one that would allow a 'single' lock to
> > work without contention)
> >
> >
> >
> > On Tue, Jul 2, 2013 at 7:59 PM, Roman Chyla <roman.ch...@gmail.com>
> wrote:
> >
> > > Interesting, we are running 4.0 - and solr will refuse the start (or
> > > reload) the core. But from looking at the code I am not seeing it is
> > doing
> > > any writing - but I should digg more...
> > >
> > > Are you sure it needs to do writing? Because I am not calling commits,
> in
> > > fact I have deactivated *all* components that write into index, so
> unless
> > > there is something deep inside, which automatically calls the commit,
> it
> > > should never happen.
> > >
> > > roman
> > >
> > >
> > > On Tue, Jul 2, 2013 at 2:54 PM, Peter Sturge <peter.stu...@gmail.com>
> > > wrote:
> > >
> > > > Hmmm, single lock sounds dangerous. It probably works ok because
> you've
> > > > been [un]lucky.
> > > > For example, even with a RO instance, you still need to do a commit
> in
> > > > order to reload caches/changes from the other instance.
> > > > What happens if this commit gets called in the middle of the other
> > > > instance's commit? I've not tested this scenario, but it's very
> > possible
> > > > with a 'single' lock the results are indeterminate.
> > > > If the 'single' lock mechanism is making assumptions e.g. no other
> > > process
> > > > will interfere, and then one does, the Lucene index could very well
> get
> > > > corrupted.
> > > >
> > > > For the error you're seeing using 'native', we use native lockType
> for
> > > both
> > > > write and RO instances, and it works fine - no contention.
> > > > Which version of Solr are you using? Perhaps there's been a change in
> > > > behaviour?
> > > >
> > > > Peter
> > > >
> > > >
> > > > On Tue, Jul 2, 2013 at 7:30 PM, Roman Chyla <roman.ch...@gmail.com>
> > > wrote:
> > > >
> > > > > as i discovered, it is not good to use 'native' locktype in this
> > > > scenario,
> > > > > actually there is a note in the solrconfig.xml which says the same
> > > > >
> > > > > when a core is reloaded and solr tries to grab lock, it will fail -
> > > even
> > > > if
> > > > > the instance is configured to be read-only, so i am using 'single'
> > lock
> > > > for
> > > > > the readers and 'native' for the writer, which seems to work OK
> > > > >
> > > > > roman
> > > > >
> > > > >
> > > > > On Fri, Jun 7, 2013 at 9:05 PM, Roman Chyla <roman.ch...@gmail.com
> >
> > > > wrote:
> > > > >
> > > > > > I have auto commit after 40k RECs/1800secs. But I only tested
> with
> > > > manual
> > > > > > commit, but I don't see why it should work differently.
> > > > > > Roman
> > > > > > On 7 Jun 2013 20:52, "Tim Vaillancourt" <t...@elementspace.com>
> > > wrote:
> > > > > >
> > > > > >> If it makes you feel better, I also considered this approach
> when
> > I
> > > > was
> > > > > in
> > > > > >> the same situation with a separate indexer and searcher on one
> > > > Physical
> > > > > >> linux machine.
> > > > > >>
> > > > > >> My main concern was "re-using" the FS cache between both
> > instances -
> > > > If
> > > > > I
> > > > > >> replicated to myself there would be two independent copies of
> the
> > > > index,
> > > > > >> FS-cached separately.
> > > > > >>
> > > > > >> I like the suggestion of using autoCommit to reload the index.
> If
> > > I'm
> > > > > >> reading that right, you'd set an autoCommit on 'zero docs
> > changing',
> > > > or
> > > > > >> just 'every N seconds'? Did that work?
> > > > > >>
> > > > > >> Best of luck!
> > > > > >>
> > > > > >> Tim
> > > > > >>
> > > > > >>
> > > > > >> On 5 June 2013 10:19, Roman Chyla <roman.ch...@gmail.com>
> wrote:
> > > > > >>
> > > > > >> > So here it is for a record how I am solving it right now:
> > > > > >> >
> > > > > >> > Write-master is started with:
> -Dmontysolr.warming.enabled=false
> > > > > >> > -Dmontysolr.write.master=true -Dmontysolr.read.master=
> > > > > >> > http://localhost:5005
> > > > > >> > Read-master is started with: -Dmontysolr.warming.enabled=true
> > > > > >> > -Dmontysolr.write.master=false
> > > > > >> >
> > > > > >> >
> > > > > >> > solrconfig.xml changes:
> > > > > >> >
> > > > > >> > 1. all index changing components have this bit,
> > > > > >> > enable="${montysolr.master:true}" - ie.
> > > > > >> >
> > > > > >> > <updateHandler class="solr.DirectUpdateHandler2"
> > > > > >> >                  enable="${montysolr.master:true}">
> > > > > >> >
> > > > > >> > 2. for cache warming de/activation
> > > > > >> >
> > > > > >> > <listener event="newSearcher"
> > > > > >> >       class="solr.QuerySenderListener"
> > > > > >> >       enable="${montysolr.enable.warming:true}">...
> > > > > >> >
> > > > > >> > 3. to trigger refresh of the read-only-master (from
> > write-master):
> > > > > >> >
> > > > > >> >     <listener event="postCommit"
> > > > > >> >       class="solr.RunExecutableListener"
> > > > > >> >       enable="${montysolr.master:true}">
> > > > > >> >       <str name="exe">curl</str>
> > > > > >> >       <str name="dir">.</str>
> > > > > >> >       <bool name="wait">false</bool>
> > > > > >> >       <arr name="args"> <str>${montysolr.read.master:http://localhost}/solr/admin/cores?wt=json&amp;action=RELOAD&amp;core=collection1</str></arr>
> > > > > >> >     </listener>
> > > > > >> >
> > > > > >> > This works, I still don't like the reload of the whole core,
> but
> > > it
> > > > > >> seems
> > > > > >> > like the easiest thing to do now.
> > > > > >> >
> > > > > >> > -- roman
> > > > > >> >
> > > > > >> >
> > > > > >> > On Wed, Jun 5, 2013 at 12:07 PM, Roman Chyla <
> > > roman.ch...@gmail.com
> > > > >
> > > > > >> > wrote:
> > > > > >> >
> > > > > >> > > Hi Peter,
> > > > > >> > >
> > > > > >> > > Thank you, I am glad to read that this usecase is not alien.
> > > > > >> > >
> > > > > >> > > I'd like to make the second instance (searcher) completely
> > > > > read-only,
> > > > > >> so
> > > > > >> > I
> > > > > >> > > have disabled all the components that can write.
> > > > > >> > >
> > > > > >> > > (being lazy ;)) I'll probably use
> > > > > >> > > http://wiki.apache.org/solr/CollectionDistribution to call
> > the
> > > > curl
> > > > > >> > after
> > > > > >> > > commit, or write some IndexReaderFactory that checks for
> > changes
> > > > > >> > >
> > > > > >> > > The problem with calling the 'core reload' - is that it
> seems
> > > lots
> > > > > of
> > > > > >> > work
> > > > > >> > > for just opening a new searcher, eeekkk...somewhere I read
> > that
> > > it
> > > > > is
> > > > > >> > cheap
> > > > > >> > > to reload a core, but re-opening the index searches must be
> > > > > definitely
> > > > > >> > > cheaper...
> > > > > >> > >
> > > > > >> > > roman
> > > > > >> > >
> > > > > >> > >
> > > > > >> > > On Wed, Jun 5, 2013 at 4:03 AM, Peter Sturge <
> > > > > peter.stu...@gmail.com
> > > > > >> > >wrote:
> > > > > >> > >
> > > > > >> > >> Hi,
> > > > > >> > >> We use this very same scenario to great effect - 2
> instances
> > > > using
> > > > > >> the
> > > > > >> > >> same
> > > > > >> > >> dataDir with many cores - 1 is a writer (no caching), the
> > other
> > > > is
> > > > > a
> > > > > >> > >> searcher (lots of caching).
> > > > > >> > >> To get the searcher to see the index changes from the
> writer,
> > > you
> > > > > >> need
> > > > > >> > the
> > > > > >> > >> searcher to do an empty commit - i.e. you invoke a commit
> > with
> > > 0
> > > > > >> > >> documents.
> > > > > >> > >> This will refresh the caches (including autowarming),
> > [re]build
> > > > the
> > > > > >> > >> relevant searchers etc. and make any index changes visible
> to
> > > the
> > > > > RO
> > > > > >> > >> instance.
> > > > > >> > >> Also, make sure to use <lockType>native</lockType> in
> > > > > solrconfig.xml
> > > > > >> to
> > > > > >> > >> ensure the two instances don't try to commit at the same
> > time.
> > > > > >> > >> There are several ways to trigger a commit:
> > > > > >> > >> Call commit() periodically within your own code.
> > > > > >> > >> Use autoCommit in solrconfig.xml.
> > > > > >> > >> Use an RPC/IPC mechanism between the 2 instance processes
> to
> > > tell
> > > > > the
> > > > > >> > >> searcher the index has changed, then call commit when
> called
> > > > (more
> > > > > >> > complex
> > > > > >> > >> coding, but good if the index changes on an ad-hoc basis).
> > > > > >> > >> Note, doing things this way isn't really suitable for an
> NRT
> > > > > >> > environment.
> > > > > >> > >>
> > > > > >> > >> HTH,
> > > > > >> > >> Peter
> > > > > >> > >>
> > > > > >> > >>
> > > > > >> > >>
> > > > > >> > >> On Tue, Jun 4, 2013 at 11:23 PM, Roman Chyla <
> > > > > roman.ch...@gmail.com>
> > > > > >> > >> wrote:
> > > > > >> > >>
> > > > > >> > >> > Replication is fine, I am going to use it, but I wanted
> it
> > > for
> > > > > >> > instances
> > > > > >> > >> > *distributed* across several (physical) machines - but
> > here I
> > > > > have
> > > > > >> one
> > > > > >> > >> > physical machine, it has many cores. I want to run 2
> > > instances
> > > > of
> > > > > >> solr
> > > > > >> > >> > because I think it has these benefits:
> > > > > >> > >> >
> > > > > >> > >> > 1) I can give less RAM to the writer (4GB), and use more
> > RAM
> > > > for
> > > > > >> the
> > > > > >> > >> > searcher (28GB)
> > > > > >> > >> > 2) I can deactivate warming for the writer and keep it
> for
> > > the
> > > > > >> > searcher
> > > > > >> > >> > (this considerably speeds up indexing - each time we
> > commit,
> > > > the
> > > > > >> > server
> > > > > >> > >> is
> > > > > >> > >> > rebuilding a citation network of 80M edges)
> > > > > >> > >> > 3) saving disk space and better OS caching (OS should be
> > able
> > > > to
> > > > > >> use
> > > > > >> > >> more
> > > > > >> > >> > RAM for the caching, which should result in faster
> > > operations -
> > > > > the
> > > > > >> > two
> > > > > >> > >> > processes are accessing the same index)
> > > > > >> > >> >
> > > > > >> > >> > Maybe I should just forget it and go with the
> replication,
> > > but
> > > > it
> > > > > >> > >> doesn't
> > > > > >> > >> > 'feel right' IFF it is on the same physical machine. And
> > > Lucene
> > > > > >> > >> > specifically has a method for discovering changes and
> > > > re-opening
> > > > > >> the
> > > > > >> > >> index
> > > > > >> > >> > (DirectoryReader.openIfChanged)
> > > > > >> > >> >
> > > > > >> > >> > Am I not seeing something?
> > > > > >> > >> >
> > > > > >> > >> > roman
> > > > > >> > >> >
> > > > > >> > >> >
> > > > > >> > >> >
> > > > > >> > >> > On Tue, Jun 4, 2013 at 5:30 PM, Jason Hellman <
> > > > > >> > >> > jhell...@innoventsolutions.com> wrote:
> > > > > >> > >> >
> > > > > >> > >> > > Roman,
> > > > > >> > >> > >
> > > > > >> > >> > > Could you be more specific as to why replication
> doesn't
> > > meet
> > > > > >> your
> > > > > >> > >> > > requirements?  It was geared explicitly for this
> purpose,
> > > > > >> including
> > > > > >> > >> the
> > > > > >> > >> > > automatic discovery of changes to the data on the index
> > > > master.
> > > > > >> > >> > >
> > > > > >> > >> > > Jason
> > > > > >> > >> > >
> > > > > >> > >> > > On Jun 4, 2013, at 1:50 PM, Roman Chyla <
> > > > roman.ch...@gmail.com
> > > > > >
> > > > > >> > >> wrote:
> > > > > >> > >> > >
> > > > > >> > >> > > > OK, so I have verified the two instances can run
> > > alongside,
> > > > > >> > sharing
> > > > > >> > >> the
> > > > > >> > >> > > > same datadir
> > > > > >> > >> > > >
> > > > > >> > >> > > > All update handlers are unaccessible in the read-only
> > > > master
> > > > > >> > >> > > >
> > > > > >> > >> > > > <updateHandler class="solr.DirectUpdateHandler2"
> > > > > >> > >> > > >                 enable="${solr.can.write:true}">
> > > > > >> > >> > > >
> > > > > >> > >> > > > java -Dsolr.can.write=false .....
> > > > > >> > >> > > >
> > > > > >> > >> > > > And I can reload the index manually:
> > > > > >> > >> > > >
> > > > > >> > >> > > > curl "http://localhost:5005/solr/admin/cores?wt=json&action=RELOAD&core=collection1"
> > > > > >> > >> > > >
> > > > > >> > >> > > > But this is not an ideal solution; I'd like for the
> > > > read-only
> > > > > >> > >> server to
> > > > > >> > >> > > > discover index changes on its own. Any pointers?
> > > > > >> > >> > > >
> > > > > >> > >> > > > Thanks,
> > > > > >> > >> > > >
> > > > > >> > >> > > >  roman
> > > > > >> > >> > > >
> > > > > >> > >> > > >
> > > > > >> > >> > > > On Tue, Jun 4, 2013 at 2:01 PM, Roman Chyla <
> > > > > >> > roman.ch...@gmail.com>
> > > > > >> > >> > > wrote:
> > > > > >> > >> > > >
> > > > > >> > >> > > >> Hello,
> > > > > >> > >> > > >>
> > > > > >> > >> > > >> I need your expert advice. I am thinking about
> running
> > > two
> > > > > >> > >> instances
> > > > > >> > >> > of
> > > > > >> > >> > > >> solr that share the same datadirectory. The *reason*
> > > > being:
> > > > > >> > >> indexing
> > > > > >> > >> > > >> instance is constantly building cache after every
> > commit
> > > > (we
> > > > > >> > have a
> > > > > >> > >> > big
> > > > > >> > >> > > >> cache) and this slows it down. But indexing doesn't
> > need
> > > > > much
> > > > > >> > RAM,
> > > > > >> > >> > only
> > > > > >> > >> > > the
> > > > > >> > >> > > >> search does (and server has lots of CPUs)
> > > > > >> > >> > > >>
> > > > > >> > >> > > >> So, it is like having two solr instances
> > > > > >> > >> > > >>
> > > > > >> > >> > > >> 1. solr-indexing-master
> > > > > >> > >> > > >> 2. solr-read-only-master
> > > > > >> > >> > > >>
> > > > > >> > >> > > >> In the solrconfig.xml I can disable update
> components,
> > > It
> > > > > >> should
> > > > > >> > be
> > > > > >> > >> > > fine.
> > > > > >> > >> > > >> However, I don't know how to 'trigger' index
> > re-opening
> > > on
> > > > > (2)
> > > > > >> > >> after
> > > > > >> > >> > the
> > > > > >> > >> > > >> commit happens on (1).
> > > > > >> > >> > > >>
> > > > > >> > >> > > >> Ideally, the second instance could monitor the disk
> > and
> > > > > >> re-open
> > > > > >> > >> disk
> > > > > >> > >> > > after
> > > > > >> > >> > > >> new files appear there. Do I have to implement
> custom
> > > > > >> > >> > > IndexReaderFactory?
> > > > > >> > >> > > >> Or something else?
> > > > > >> > >> > > >>
> > > > > >> > >> > > >> Please note: I know about the replication, this
> > usecase
> > > is
> > > > > >> IMHO
> > > > > >> > >> > slightly
> > > > > >> > >> > > >> different - in fact, write-only-master (1) is also a
> > > > > >> replication
> > > > > >> > >> > master
> > > > > >> > >> > > >>
> > > > > >> > >> > > >> Googling turned out only this
> > > > > >> > >> > > >> http://comments.gmane.org/gmane.comp.jakarta.lucene.solr.user/71912 -
> > > > > >> > >> > > >> no pointers there.
> > > > > >> > >> > > >>
> > > > > >> > >> > > >> But If I am approaching the problem wrongly, please
> > > don't
> > > > > >> > hesitate
> > > > > >> > >> to
> > > > > >> > >> > > >> 're-educate' me :)
> > > > > >> > >> > > >>
> > > > > >> > >> > > >> Thanks!
> > > > > >> > >> > > >>
> > > > > >> > >> > > >>  roman
> > > > > >> > >> > > >>
> > > > > >> > >> > >
> > > > > >> > >> > >
> > > > > >> > >> >
> > > > > >> > >>
> > > > > >> > >
> > > > > >> > >
> > > > > >> >
> > > > > >>
> > > > > >
> > > > >
> > > >
> > >
> >
>
