Right, autoCommit (in solrconfig.xml) will
1> close the current Lucene segment and open a new one
2> close the tlog and start a new one.

Those actions happen whether openSearcher is true or false.
If (and only if) openSearcher=true, the committed documents become
immediately visible to queries.

So when openSearcher=false, it's up to you to issue either a soft commit
(or a hard commit with openSearcher=true) at some point for the docs to
become visible.
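
For illustration, here is a minimal sketch of the two explicit commits a
client could post to the collection's /update handler, assuming you use
the XML update message format. Each element is a separate request body;
send one or the other:

```xml
<!-- Option A: soft commit. Opens a new searcher so recently indexed
     docs become visible, without forcing a flush to stable storage. -->
<commit softCommit="true"/>

<!-- Option B: hard commit. When sent explicitly like this, it opens a
     new searcher by default, so the docs become visible as well. -->
<commit/>
```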

bq: Does it mean, let me say, that when openSearcher=false we have implicit
commit done by solrCloud <autoCommit> not visible to world and explicit
commit done by clients visible to world?

Exactly. Now, this all assumes that you want all your recent indexing
to be visible at once. If you don't care whether documents become
visible while you're indexing but before the whole thing is done,
then:
1> set autoCommit with openSearcher=false to some fairly short
interval, say 1 minute.
2> set autoSoftCommit to some longer interval (say 5 minutes).

Now you don't have to do anything at all. Don't commit from the
client. Just wait 5 minutes after the indexing is done before
expecting to see _all_ the docs from your indexing run.
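
As a sketch, the two settings above might look like this inside the
<updateHandler> section of solrconfig.xml. The intervals are just the
example values from this mail (maxTime is in milliseconds), so treat
them as a starting point rather than a recommendation for your load:

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Hard commit every 60 s: flushes segments and rolls over the tlog,
       but does not open a new searcher, so nothing becomes visible. -->
  <autoCommit>
    <maxTime>60000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>

  <!-- Soft commit every 5 min: makes recently indexed docs visible. -->
  <autoSoftCommit>
    <maxTime>300000</maxTime>
  </autoSoftCommit>

  <!-- Keep your existing <updateLog> element in this section as well. -->
</updateHandler>
```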

Do note one quirk though. Let's say you're doing autoCommits with
openSearcher=false. If you restart Solr, those changes _will_
become visible, because a new searcher is opened on startup.

Best,
Erick

On Tue, May 26, 2015 at 9:33 AM, Vincenzo D'Amore <v.dam...@gmail.com> wrote:
> Thanks Erick for your willingness and patience,
>
> If I understood correctly, when autoCommit has openSearcher=true, all new
> documents become available for search at the first commit (soft or hard).
> But when openSearcher=false, the commit will flush recent index changes to
> stable storage, but does not cause a new searcher to be opened to make
> those changes visible
> <https://cwiki.apache.org/confluence/display/solr/UpdateHandlers+in+SolrConfig#UpdateHandlersinSolrConfig-autoCommit>
> .
>
> So it is not clear what this stable storage is, where it is, and when the
> new documents will be visible.
> Only when my code commits at the very end of the indexing process?
>
> Does it mean, let me say, that when openSearcher=false we have implicit
> commit done by solrCloud <autoCommit> not visible to world and explicit
> commit done by clients visible to world?
>
>
>
>
> On Tue, May 26, 2015 at 2:55 AM, Erick Erickson <erickerick...@gmail.com>
> wrote:
>
>> The design is that the latest successfully flushed tlog file is kept
>> for "peer sync" in SolrCloud mode. When a replica comes up, there's a
>> chance that it's not very many docs behind. So, if possible, some of
>> the docs are taken from the leader's tlog and replayed to the follower
>> that's just been started. If the follower is too far out of sync, a
>> full old-style replication is done. So there will always be a tlog
>> file (and occasionally more than one if they're very small) kept
>> around, even on successful commit. It doesn't matter if you have
>> leaders and replicas or not, that's still the process that's followed.
>>
>> Please re-read the link I sent earlier. There's absolutely no reason
>> your tlog files have to be so big! Really, set your autoCommit to, say,
>> 15 seconds and 100000 docs and set openSearcher=false in your
>> solrconfig.xml file, and the tlog file that's kept around will be much
>> smaller and still be available for "peer sync".
>>
>> And if you really don't care about tlogs at all, just take this bit
>> out of your solrconfig.xml:
>>
>>     <updateLog>
>>       <str name="dir">${solr.ulog.dir:}</str>
>>       <int name="">${solr.ulog.numVersionBuckets:256}</int>
>>     </updateLog>
>>
>>
>>
>> Best,
>> Erick
>>
>> On Mon, May 25, 2015 at 4:40 PM, Vincenzo D'Amore <v.dam...@gmail.com>
>> wrote:
>> > Hi Erick,
>> >
>> > I have run my indexing code a few times, and this is the behaviour I
>> > observed:
>> >
>> > When an indexing process starts, even if one or more tlog files exist, a
>> > new tlog file is created and all the new documents are stored there.
>> > When the indexing process ends and does a hard commit, the older tlog
>> > files are removed but the new one (the latest) remains.
>> >
>> > As far as I can see, since my indexing process loads a few million
>> > documents every time, at the end of the process the latest tlog file
>> > persists with all these documents in it.
>> > So I have such big tlog files. Now the question is, why does the latest
>> > tlog file persist even though the code has done a hard commit?
>> > When a hard commit is done successfully, why should we keep the latest
>> > tlog file?
>> >
>> >
>> >
>> > On Mon, May 25, 2015 at 7:24 PM, Erick Erickson <erickerick...@gmail.com>
>> > wrote:
>> >
>> >> OK, assuming you're not doing any commits at all until the very end,
>> >> then the tlog contains all the docs for the _entire_ run. The article
>> >> really doesn't care whether the commits come from the solrconfig.xml
>> >> or SolrJ client or curl. The tlog simply is not truncated until a hard
>> >> commit happens, no matter where it comes from.
>> >>
>> >> So here's what I'd do:
>> >> 1> set autoCommit in your solrconfig.xml with openSearcher=false to
>> >> fire every minute. Then the problem will probably go away.
>> >> or
>> >> 2> periodically issue a hard commit (openSearcher=false) from the
>> >> client.
>> >>
>> >> Of the two, I _strongly_ recommend <1> as it's more graceful when
>> >> there are multiple clients.
>> >>
>> >> Best,
>> >> Erick
>> >>
>> >> On Mon, May 25, 2015 at 4:45 AM, Vincenzo D'Amore <v.dam...@gmail.com>
>> >> wrote:
>> >> > Hi Erick, thanks for your support.
>> >> >
>> >> > Reading the post, I realised that the autoCommit configuration does
>> >> > not apply to my scenario; we currently don't have autoCommit in our
>> >> > solrconfig.xml.
>> >> >
>> >> > We need the docs to be searchable only after the indexing process, so
>> >> > all the documents are committed only at the end of the indexing process.
>> >> >
>> >> > Now I don't understand why the tlog files are so big, given that we do
>> >> > a hard commit at the end of every indexing run.
>> >> >
>> >> >
>> >> >
>> >> >
>> >> > On Sun, May 24, 2015 at 5:49 PM, Erick Erickson <erickerick...@gmail.com>
>> >> > wrote:
>> >> >
>> >> >> Vincenzo:
>> >> >>
>> >> >> Here's perhaps more than you want to know about hard commits, soft
>> >> >> commits and transaction logs:
>> >> >>
>> >> >> http://lucidworks.com/blog/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
>> >> >>
>> >> >> Best,
>> >> >> Erick
>> >> >>
>> >> >> On Sun, May 24, 2015 at 12:04 AM, Vincenzo D'Amore <v.dam...@gmail.com>
>> >> >> wrote:
>> >> >> > Thanks Shawn for your prompt support.
>> >> >> >
>> >> >> > Best regards,
>> >> >> > Vincenzo
>> >> >> >
>> >> >> > On Sun, May 24, 2015 at 6:45 AM, Shawn Heisey <apa...@elyograg.org>
>> >> >> > wrote:
>> >> >> >
>> >> >> >> On 5/23/2015 9:41 PM, Vincenzo D'Amore wrote:
>> >> >> >> > Thanks Shawn,
>> >> >> >> >
>> >> >> >> > maybe this is a silly question, but I looked around and didn't
>> >> >> >> > find an answer...
>> >> >> >> > Well, could I update solrconfig.xml for the collection while the
>> >> >> >> > instances are running, or should I restart the cluster/reload
>> >> >> >> > the cores?
>> >> >> >>
>> >> >> >> You can upload a new config to zookeeper with the zkcli program
>> >> >> >> while Solr is running, and nothing will change, at least not
>> >> >> >> immediately. The new config will take effect when you reload the
>> >> >> >> collection or restart all the Solr instances.
>> >> >> >>
>> >> >> >> Thanks,
>> >> >> >> Shawn
>> >> >> >>
>> >> >> >>
>> >> >>
>> >> >
>> >> >
>> >> >
>> >> > --
>> >> > Vincenzo D'Amore
>> >> > email: v.dam...@gmail.com
>> >> > skype: free.dev
>> >> > mobile: +39 349 8513251
>> >>
>> >
>> >
>> >
>>
>
>
>
