Re: SOLR nrt read writes

2015-07-21 Thread Alessandro Benedetti
>
> Could this be due to caching? I have tried to disable all in my solrconfig.


If you mean the Solr caches: no.
Solr caches live only as long as their searcher.
So a new searcher means new caches (possibly warmed with updated results).

If you mean your application caching or browser caching, you should verify that;
I assume you have control over it.

Cheers


-- 
--

Benedetti Alessandro
Visiting card - http://about.me/alessandro_benedetti
Blog - http://alexbenedetti.blogspot.co.uk

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England


Re: SOLR nrt read writes

2015-07-21 Thread Upayavira
Bhawna,

I think you need to reconcile yourself to the fact that what you want to
achieve is not going to be possible.

Solr (and Lucene underneath it) is HEAVILY optimised for high read/low
write situations, and that leads to some latency in content reaching the
index. If you wanted to change this, you'd have to get into some heavy
Java/Lucene coding, as I believe Twitter have done on Lucene itself.

Rather than attempting to change this, I'd say you need to work
out a way in your UI to handle this situation. E.g. have a "refresh on
stale results" button, or "not seeing your data, try here". Or, if a
user submits data, then wants to search for it in the same session, have
your UI enforce a minimum 10s delay before it sends a request to Solr,
or something like that. Efforts to solve this at the Solr end, without
spending substantial sums and effort on it, will be futile as it isn't
what Solr/Lucene are designed for.

Upayavira



Re: SOLR nrt read writes

2015-07-20 Thread Bhawna Asnani
Thanks, I tried turning off auto softCommits but that didn't help much. I am still
seeing stale results every now and then. Also, load on the server is very light. We
are running this just on a test server with one or two users. I don't see any
warnings in the logs while doing softCommits, and they say a new searcher was
successfully opened and registered as the main searcher. Could this be due to
caching? I have tried disabling all caches in my solrconfig.



Re: SOLR nrt read writes

2015-07-20 Thread Shawn Heisey
On 7/20/2015 9:29 AM, Bhawna Asnani wrote:
> Thanks for your suggestions. The requirement is still the same , to be
> able to make a change to some solr documents and be able to see it on
> subsequent search/facet calls.
> I am using softCommit with waitSearcher=true.
>
> Also I am sending reads/writes to a single solr node only.
> I have tried disabling caches and warmup time in logs is '0' but every
> once in a while I do get the document just updated with stale data.
>
> I went through lucene documentation and it seems opening the
> IndexReader with the IndexWriter should make the changes visible to
> the reader.
>
> I checked solr logs no errors. I see this in logs each time
> 'Registered new searcher Searcher@x' even before searches that had
> the stale document. 
>
> I have attached my solrconfig.xml for reference.

Your attachment made it through the mailing list processing.  Most
don't, I'm surprised.  Some thoughts:

maxBooleanClauses has been set to 40.  This is a lot.  If you
actually need a setting that high, then you are sending some MASSIVE
queries, which probably means that your Solr install is exceptionally
busy running those queries.

If the server is fairly busy, then you should increase maxTime on
autoCommit.  I use a value of five minutes (300000 ms) ... and my server is
NOT very busy most of the time.  A commit with openSearcher set to false
is relatively fast, but it still has somewhat heavy CPU, memory, and
disk I/O resource requirements.

You have autoSoftCommit set to happen after five seconds.  If updates
happen frequently or run for very long, this is potentially a LOT of
committing and opening new searchers.  I guess it's better than trying
for one second, but anything more frequent than once a minute is likely
to get you into trouble unless the system load is extremely light ...
but as already discussed, your system load is probably not light.

For the kind of Near Real Time setup you have mentioned, where you want
to do one or more updates, commit, and then query for the changes, you
probably should completely remove autoSoftCommit from the config and
*only* open new searchers with explicit soft commits.  Let autoCommit
(with a maxTime of 1 to 5 minutes) handle durability concerns.
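
The explicit soft commit described above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the base URL and the collection name `inventory` are hypothetical and do not come from the attached config. Only the `softCommit` and `waitSearcher` update parameters, and the fact that autoCommit's maxTime is given in milliseconds, are standard Solr behaviour.

```python
from urllib.parse import urlencode


def soft_commit_url(base_url, collection):
    """Build the URL of an explicit soft commit that waits for the new searcher.

    base_url and collection are placeholders (assumptions), e.g.
    "http://localhost:8983/solr" and "inventory".
    """
    params = {"softCommit": "true", "waitSearcher": "true"}
    return f"{base_url.rstrip('/')}/{collection}/update?{urlencode(params)}"


def minutes_to_ms(minutes):
    """Convert minutes to the milliseconds expected by autoCommit's maxTime."""
    return int(minutes * 60 * 1000)


if __name__ == "__main__":
    print(soft_commit_url("http://localhost:8983/solr", "inventory"))
    print(minutes_to_ms(5))
```

The application would send a GET or POST to that URL after its batch of updates, and only then run the search; with autoSoftCommit removed, this is the only thing that opens a new searcher.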

A lot of pieces in your config file are set to depend on java system
properties just like the example does, but since we do not know what
system properties have been set, we can't tell for sure what those parts
of the config are doing.

Thanks,
Shawn



Re: SOLR nrt read writes

2015-07-20 Thread Bhawna Asnani
Hi,

Thanks for your suggestions. The requirement is still the same: to be able
to make a change to some Solr documents and be able to see it on subsequent
search/facet calls.
I am using softCommit with waitSearcher=true.

Also, I am sending reads/writes to a single Solr node only.
I have tried disabling caches, and the warmup time in the logs is '0', but every
once in a while the document I just updated comes back with stale data.

I went through the Lucene documentation, and it seems opening the IndexReader
with the IndexWriter should make the changes visible to the reader.

I checked the Solr logs and found no errors. Each time, I see 'Registered new
searcher Searcher@x' in the logs, even before searches that returned the stale
document.

I have attached my solrconfig.xml for reference.
Thanks.


Re: SOLR nrt read writes

2015-07-15 Thread Erick Erickson
bq: The admin can also do some updates on the items and they need to see the
updates almost real time.

Why not give the admin control over commits and default the other commits to
something reasonable? So make your defaults, say, 15 seconds (or 30 seconds
or longer). If the admin really needs the search to be absolutely up to
date, they can hit the "commit" button, perhaps with a little tooltip saying
"the index is up to date as of N seconds ago; press this button
to see absolutely all changes in real time".
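
A minimal Python sketch of the bookkeeping behind such a tooltip, assuming the admin application itself records when it last issued a commit. The class and method names are hypothetical; nothing here is a Solr API.

```python
import time


class CommitFreshness:
    """Tracks how long ago the application last committed (hypothetical helper)."""

    def __init__(self, clock=time.monotonic):
        # The clock is injectable so the behaviour can be checked without waiting.
        self._clock = clock
        self._last_commit = None

    def mark_committed(self):
        """Call this right after a commit request has returned successfully."""
        self._last_commit = self._clock()

    def tooltip(self):
        """Return the text to show next to the "commit" button."""
        if self._last_commit is None:
            return "No commit recorded yet; press this button to see all changes."
        age = int(self._clock() - self._last_commit)
        return (f"The index is up to date as of {age} seconds ago; "
                "press this button to see all changes.")
```

The button handler would issue the commit, call `mark_committed()`, and re-run the search.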

That will quickly train the admins to use that button as necessary when they
really _do_ need absolutely up-to-date data. My prediction: they'll issue these
quite rarely. 9 times out of 10, this kind of requirement is based on faulty
assumptions and/or not understanding the workflow. That said, it may genuinely
be a requirement. But at least ask the question.

Best,
Erick


Re: SOLR nrt read writes

2015-07-15 Thread Bhawna Asnani
We are building an admin for our inventory. Using solr's faceting,
searching and stats functionality it provides different ways an admin can
look at the inventory.
The admin can also do some updates on the items and they need to see the
updates almost real time.

Our public facing website is already built using solr so we already have
the api in place to work with solr.
We were hoping we could set up a Solr instance just for admin (low traffic and
low latency) and build the functionality.

Thanks for your suggestions.



Re: SOLR nrt read writes

2015-07-15 Thread Daniel Collins
Just to re-iterate Charles' response with an example, we have a system
which needs to be as Near RT as we can make it.  So we have application
level commitWithin set to 250ms.  Yes, we have to turn off a lot of caching,
auto-warming, etc, but it was necessary to make the index as real time as
we needed it to be.  Now we have the benefit of being able to throw a lot
of hardware, RAM and SSDs at this in order to get any kind of sane search
latency.
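
As a rough sketch of what an application-level commit window can look like, the Python below builds an update request that carries Solr's `commitWithin` parameter (in milliseconds). The base URL, collection name and document fields are assumptions made for illustration, not details of the system described above.

```python
import json
from urllib.parse import urlencode


def build_update_request(base_url, collection, docs, commit_within_ms=250):
    """Return (url, body, headers) for a JSON update using commitWithin.

    base_url, collection and the document fields are placeholders.
    """
    query = urlencode({"commitWithin": commit_within_ms})
    url = f"{base_url.rstrip('/')}/{collection}/update?{query}"
    body = json.dumps(docs).encode("utf-8")  # a JSON array of documents
    headers = {"Content-Type": "application/json"}
    return url, body, headers


if __name__ == "__main__":
    url, body, headers = build_update_request(
        "http://localhost:8983/solr", "inventory",
        [{"id": "item-1", "quantity": 7}],
    )
    print(url)
```

The tuple can be passed to any HTTP client as a POST; Solr then makes sure a commit happens within the given window rather than on every request.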

We have the luxury of being able to afford that, but it comes with other
problems because we have an index that is changing so fast (replicating to
other nodes in the cloud becomes tricky, peer sync fails most of the time,
etc.)

What is your use case that requires this level of real-time access?

On 15 July 2015 at 13:59, Reitzel, Charles 
wrote:

> And, to answer your other question, yes, you can turn off auto-warming.
> If your instance is dedicated to this client task, it may serve no purpose
> or be actually counter-productive.
>
> In the past, I worked on a Solr-based application that committed
> frequently under application control (vs. auto commit) and we turned off
> all auto-warming and most of the caching.
>
> There is scant documentation in the new Solr reference (cwiki.apache.org),
> but the old docs cover this well and appear current enough:
> https://wiki.apache.org/solr/SolrCaching
>
> Just a thought: would true be helpful
> here?
>
> Also, since you have just inserted the documents, it sounds like you
> probably could search by ID ...
>
> -----Original Message-----
> From: Shawn Heisey [mailto:apa...@elyograg.org]
> Sent: Tuesday, July 14, 2015 6:04 PM
> To: solr-user@lucene.apache.org
> Subject: Re: SOLR nrt read writes
>
> On 7/14/2015 12:19 PM, Bhawna Asnani wrote:
> > I have a use case where we have to write data into solr and
> > immediately read it back.
> > The read is not get by Id but a search call.
> >
> > I am doing a softCommit after every such write which needs to be
> > visible immediately.
> > However sometimes the changes are not visible immediately.
> >
> > We have a solr cloud but I have also tried sending reads, writes and
> > commits to cloud leader only and still there is some latency.
> >
> > Has anybody tried to use solr this way?
>
> Don't ignore what Erick has said just because you're getting this reply
> from someone else.  That advice is correct.  My intent here is to provide
> more detail.
>
> Since you are not doing a retrieval by id (uniqueKey field), you can't use
> the Realtime Get handler.  That handler would get the latest version of a
> doc, whether it has been committed or not.  The transaction logs
> (configured with updateLog in solrconfig.xml) are used to retrieve
> uncommitted information.  Can you change your retrieval so it's by id
> rather than a search query?  If you can, that might solve this for you.
>
> Normally, if you do a commit operation with openSearcher=true and
> waitSearcher=true, control of the program will not be returned until that
> commit is completely done ... but as Erick said, if you are doing a LOT of
> commits very quickly, you're probably going to exceed maxWarmingSearchers,
> and in that scenario, you cannot rely on using the commit operation as a
> blocker for your retrieval attempt.
>
> In order to have any hope of getting what you want with your current
> methods, your commit frequency must be low enough that each commit has time
> to finish before the next one begins.  I personally would not do commits
> more often than once a minute.  Commits on my larger index shards are known
> to take up to ten seconds when the index is quiet, and even more if the
> index is busy.  There are ways to make commits happen faster, but it often
> involves disabling features that you might want to leave enabled.
>
> Thanks,
> Shawn


RE: SOLR nrt read writes

2015-07-15 Thread Reitzel, Charles
And, to answer your other question, yes, you can turn off auto-warming.  If
your instance is dedicated to this client task, it may serve no purpose or be 
actually counter-productive.

In the past, I worked on a Solr-based application that committed frequently 
under application control (vs. auto commit) and we turned off all auto-warming 
and most of the caching.

There is scant documentation in the new Solr reference (cwiki.apache.org), but 
the old docs cover this well and appear current enough: 
https://wiki.apache.org/solr/SolrCaching

Just a thought: would true be helpful here?

Also, since you have just inserted the documents, it sounds like you probably 
could search by ID ...

-----Original Message-----
From: Shawn Heisey [mailto:apa...@elyograg.org] 
Sent: Tuesday, July 14, 2015 6:04 PM
To: solr-user@lucene.apache.org
Subject: Re: SOLR nrt read writes

On 7/14/2015 12:19 PM, Bhawna Asnani wrote:
> I have a use case where we have to write data into solr and 
> immediately read it back.
> The read is not get by Id but a search call.
>
> I am doing a softCommit after every such write which needs to be 
> visible immediately.
> However sometimes the changes are not visible immediately.
>
> We have a solr cloud but I have also tried sending reads, writes and 
> commits to cloud leader only and still there is some latency.
>
> Has anybody tried to use solr this way?

Don't ignore what Erick has said just because you're getting this reply from 
someone else.  That advice is correct.  My intent here is to provide more 
detail.

Since you are not doing a retrieval by id (uniqueKey field), you can't use the 
Realtime Get handler.  That handler would get the latest version of a doc, 
whether it has been committed or not.  The transaction logs (configured with 
updateLog in solrconfig.xml) are used to retrieve uncommitted information.  Can 
you change your retrieval so it's by id rather than a search query?  If you 
can, that might solve this for you.

Normally, if you do a commit operation with openSearcher=true and 
waitSearcher=true, control of the program will not be returned until that 
commit is completely done ... but as Erick said, if you are doing a LOT of 
commits very quickly, you're probably going to exceed maxWarmingSearchers, and 
in that scenario, you cannot rely on using the commit operation as a blocker 
for your retrieval attempt.

In order to have any hope of getting what you want with your current methods, 
your commit frequency must be low enough that each commit has time to finish 
before the next one begins.  I personally would not do commits more often than 
once a minute.  Commits on my larger index shards are known to take up to ten 
seconds when the index is quiet, and even more if the index is busy.  There are 
ways to make commits happen faster, but it often involves disabling features 
that you might want to leave enabled.

Thanks,
Shawn




Re: SOLR nrt read writes

2015-07-14 Thread Shawn Heisey
On 7/14/2015 12:19 PM, Bhawna Asnani wrote:
> I have a use case where we have to write data into solr and immediately
> read it back.
> The read is not get by Id but a search call.
>
> I am doing a softCommit after every such write which needs to be visible
> immediately.
> However sometimes the changes are not visible immediately.
>
> We have a solr cloud but I have also tried sending reads, writes and
> commits to cloud leader only and still there is some latency.
>
> Has anybody tried to use solr this way?

Don't ignore what Erick has said just because you're getting this reply
from someone else.  That advice is correct.  My intent here is to
provide more detail.

Since you are not doing a retrieval by id (uniqueKey field), you can't
use the Realtime Get handler.  That handler would get the latest version
of a doc, whether it has been committed or not.  The transaction logs
(configured with updateLog in solrconfig.xml) are used to retrieve
uncommitted information.  Can you change your retrieval so it's by id
rather than a search query?  If you can, that might solve this for you.
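
As a rough illustration of the by-id retrieval described above, here is a
minimal Python sketch that only builds the request URL for the Realtime Get
handler. The base URL, the collection name "mycollection", and the helper
name realtime_get_url are assumptions made for illustration; /get is the
path the handler is commonly registered under in solrconfig.xml, so check
your own config before relying on it.

```python
from urllib.parse import urlencode, quote


def realtime_get_url(base_url, collection, doc_id):
    """Build a Realtime Get URL for a single uniqueKey value.

    Hypothetical helper: base_url and collection are placeholders,
    and the handler is assumed to be registered at /get.
    """
    params = urlencode({"id": doc_id, "wt": "json"})
    return f"{base_url.rstrip('/')}/{quote(collection)}/get?{params}"


if __name__ == "__main__":
    # Assumed local test server and collection name.
    print(realtime_get_url("http://localhost:8983/solr", "mycollection", "doc-42"))
```

Sending a GET to that URL is left out on purpose so the sketch runs without
a Solr server.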

Normally, if you do a commit operation with openSearcher=true and
waitSearcher=true, control of the program will not be returned until
that commit is completely done ... but as Erick said, if you are doing a
LOT of commits very quickly, you're probably going to exceed
maxWarmingSearchers, and in that scenario, you cannot rely on using the
commit operation as a blocker for your retrieval attempt.
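
For reference, a blocking soft commit of the kind discussed above could be
requested roughly as follows. This Python sketch only assembles the update
URL; the base URL, collection name, and the helper name soft_commit_url are
assumptions for illustration. The commit, softCommit, waitSearcher and
openSearcher request parameters are the ones named in this thread.

```python
from urllib.parse import urlencode


def soft_commit_url(base_url, collection):
    """Build an update URL that asks for a soft commit and waits for the searcher.

    Hypothetical helper: base_url and collection are placeholders.
    """
    params = urlencode({
        "commit": "true",
        "softCommit": "true",     # soft commit rather than a hard commit
        "waitSearcher": "true",   # block until the new searcher is registered
        "openSearcher": "true",   # make the changes visible to searches
    })
    return f"{base_url.rstrip('/')}/{collection}/update?{params}"


if __name__ == "__main__":
    print(soft_commit_url("http://localhost:8983/solr", "mycollection"))
```

Even with these parameters, the caveat above still applies: if commits
arrive faster than searchers can open, the call may not act as a reliable
barrier.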

In order to have any hope of getting what you want with your current
methods, your commit frequency must be low enough that each commit has
time to finish before the next one begins.  I personally would not do
commits more often than once a minute.  Commits on my larger index
shards are known to take up to ten seconds when the index is quiet, and
even more if the index is busy.  There are ways to make commits happen
faster, but it often involves disabling features that you might want to
leave enabled.
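
One way to keep commits from piling up on the client side is to refuse a
new commit until some minimum interval has passed since the previous one
finished. The Python sketch below is only an illustration of that idea: the
class name CommitThrottle, the 60-second default, and the do_commit
callable are all assumptions, and nothing here is part of Solr itself.

```python
import time


class CommitThrottle:
    """Allow a commit only if min_interval_s has passed since the last one ended."""

    def __init__(self, min_interval_s=60.0, clock=time.monotonic):
        self.min_interval_s = min_interval_s
        self.clock = clock          # injectable so the logic can be tested
        self._last_finished = None  # time the previous commit completed

    def try_commit(self, do_commit):
        """Run do_commit() if allowed; return True if it ran, False if skipped."""
        now = self.clock()
        if (self._last_finished is not None
                and now - self._last_finished < self.min_interval_s):
            return False
        do_commit()  # hypothetical callable that issues the blocking commit
        # Measure the interval from completion, so a slow commit is never
        # immediately followed by another one.
        self._last_finished = self.clock()
        return True
```

A skipped commit means the caller has to retry later or accept that the
change becomes visible with the next allowed commit.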

Thanks,
Shawn



Re: SOLR nrt read writes

2015-07-14 Thread Erick Erickson
Ahh, good point about setting waitSearcher=true, I should have thought of that.
Although the default is "true", so unless you're doing something different
that should be set already.

Look at your Solr logs and see if you find messages about "too many warming
searchers" or some such. In that case I _think_ you'll return without opening
a new searcher and that might be your problem.

The retry loop would just be running your query and seeing if the returned
doc had the most recent update in it, sleeping a bit and trying again. But
setting waitSearcher=true should do this if you don't exceed max warming
searchers.
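
A minimal Python sketch of such a retry loop, under stated assumptions:
run_query and is_fresh are hypothetical callables supplied by the
application (the first would issue the search, the second would check
whether the returned doc carries the latest update), and the attempt count
and delay are arbitrary placeholder values.

```python
import time


def wait_until_visible(run_query, is_fresh, attempts=10, delay_s=0.2,
                       sleep=time.sleep):
    """Re-run a query until the result reflects the latest update.

    run_query: callable returning the current search result (hypothetical;
               in practice it would send the search request to Solr).
    is_fresh:  callable returning True if the result contains the update.
    Returns the fresh result, or None if it never showed up.
    """
    for attempt in range(attempts):
        result = run_query()
        if is_fresh(result):
            return result
        if attempt < attempts - 1:
            sleep(delay_s)  # give the new searcher time to open
    return None
```

The caller decides what to do when None comes back, for example showing
the previous value or reporting that the update is still pending.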

Best,
Erick

On Tue, Jul 14, 2015 at 12:56 PM, Bhawna Asnani  wrote:
> Thanks.
>
> Load is really not a concern. We will be using it only for a handful of
> admin users and we are OK dedicating a Solr server to just this use case.
> If I have to write a loop to check whether the updates are written and the
> searcher has picked them up, what would that call look like?
>
> Can I set waitSearcher=true or turn off cache autowarming? Anything that
> makes sure the updates are visible before search?
>
> Thanks once again.
>
> On Tue, Jul 14, 2015 at 3:09 PM, Erick Erickson 
> wrote:
>
>> bq: I have a use case where we have to write data into solr and immediately
>> read it back.
>>
>> This is simply not going to work with frequent updates. Solr
>> promises the "Near" in NRT, not "real time".
>>
>> If nothing else, you may fire the query before autowarming has completed.
>> In that case you'll sometimes get the doc and sometimes not, because the
>> search runs against the old version of the index from before the update.
>> And if you fire soft commits rapidly, you'll exceed maxWarmingSearchers
>> and get warnings in the log, and then the new docs will not be returned
>> either.
>>
>> You'll have to revisit this requirement. Either you'll have to build in a
>> retry loop, add some other kind of delay, or change the requirement.
>>
>> And under heavy indexing load this will not be performant.
>>
>> Best,
>> Erick
>>
>>
>> On Tue, Jul 14, 2015 at 11:19 AM, Bhawna Asnani 
>> wrote:
>> > Hi,
>> > I have a use case where we have to write data into solr and immediately
>> > read it back.
>> > The read is not get by Id but a search call.
>> >
>> > I am doing a softCommit after every such write which needs to be visible
>> > immediately.
>> > However sometimes the changes are not visible immediately.
>> >
>> > We have a solr cloud but I have also tried sending reads, writes and
>> > commits to cloud leader only and still there is some latency.
>> >
>> > Has anybody tried to use solr this way?
>> >
>> > -Bhawna
>>


Re: SOLR nrt read writes

2015-07-14 Thread Bhawna Asnani
Thanks.

Load is really not a concern. We will be using it only for a handful of
admin users and we are OK dedicating a Solr server to just this use case.
If I have to write a loop to check whether the updates are written and the
searcher has picked them up, what would that call look like?

Can I set waitSearcher=true or turn off cache autowarming? Anything that
makes sure the updates are visible before search?

Thanks once again.

On Tue, Jul 14, 2015 at 3:09 PM, Erick Erickson 
wrote:

> bq: I have a use case where we have to write data into solr and immediately
> read it back.
>
> This is simply not going to work with frequent updates. Solr
> promises the "Near" in NRT, not "real time".
>
> If nothing else, you may fire the query before autowarming has completed.
> In that case you'll sometimes get the doc and sometimes not, because the
> search runs against the old version of the index from before the update.
> And if you fire soft commits rapidly, you'll exceed maxWarmingSearchers
> and get warnings in the log, and then the new docs will not be returned
> either.
>
> You'll have to revisit this requirement. Either you'll have to build in a
> retry loop, add some other kind of delay, or change the requirement.
>
> And under heavy indexing load this will not be performant.
>
> Best,
> Erick
>
>
> On Tue, Jul 14, 2015 at 11:19 AM, Bhawna Asnani 
> wrote:
> > Hi,
> > I have a use case where we have to write data into solr and immediately
> > read it back.
> > The read is not get by Id but a search call.
> >
> > I am doing a softCommit after every such write which needs to be visible
> > immediately.
> > However sometimes the changes are not visible immediately.
> >
> > We have a solr cloud but I have also tried sending reads, writes and
> > commits to cloud leader only and still there is some latency.
> >
> > Has anybody tried to use solr this way?
> >
> > -Bhawna
>


Re: SOLR nrt read writes

2015-07-14 Thread Erick Erickson
bq: I have a use case where we have to write data into solr and immediately
read it back.

This is simply not going to work with frequent updates. Solr
promises the "Near" in NRT, not "real time".

If nothing else, you may fire the query before autowarming has completed.
In that case you'll sometimes get the doc and sometimes not, because the
search runs against the old version of the index from before the update.
And if you fire soft commits rapidly, you'll exceed maxWarmingSearchers
and get warnings in the log, and then the new docs will not be returned
either.

You'll have to revisit this requirement. Either you'll have to build in a
retry loop, add some other kind of delay, or change the requirement.

And under heavy indexing load this will not be performant.

Best,
Erick


On Tue, Jul 14, 2015 at 11:19 AM, Bhawna Asnani  wrote:
> Hi,
> I have a use case where we have to write data into solr and immediately
> read it back.
> The read is not get by Id but a search call.
>
> I am doing a softCommit after every such write which needs to be visible
> immediately.
> However sometimes the changes are not visible immediately.
>
> We have a solr cloud but I have also tried sending reads, writes and
> commits to cloud leader only and still there is some latency.
>
> Has anybody tried to use solr this way?
>
> -Bhawna


SOLR nrt read writes

2015-07-14 Thread Bhawna Asnani
Hi,
I have a use case where we have to write data into Solr and immediately
read it back.
The read is not a get by id but a search call.

I am doing a softCommit after every such write, which needs to be visible
immediately.
However, sometimes the changes are not visible immediately.

We have a SolrCloud setup, but I have also tried sending reads, writes and
commits to the cloud leader only, and still there is some latency.

Has anybody tried to use solr this way?

-Bhawna