Re: almost realtime updates with replication

2009-02-16 Thread Noble Paul നോബിള്‍ नोब्ळ्
Yes, it does. It just blindly creates hard links irrespective of whether a
document was added or not. But no snappull will happen, because there is
no new file to download.
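
To make the point above concrete, here is a minimal Python sketch of why an "empty" snapshot costs almost nothing. It is a toy model, not the actual scripts: `take_snapshot` and `files_to_fetch` are hypothetical names, and the name-based comparison is only a stand-in for the real transfer step (the Solr distribution scripts rely on rsync).

```python
import os
import tempfile


def take_snapshot(index_dir, snapshot_dir):
    """Hard-link every file of index_dir into a new snapshot_dir (no data is copied)."""
    os.makedirs(snapshot_dir)
    for name in sorted(os.listdir(index_dir)):
        source = os.path.join(index_dir, name)
        if os.path.isfile(source):
            os.link(source, os.path.join(snapshot_dir, name))


def files_to_fetch(snapshot_dir, files_on_slave):
    """Hypothetical stand-in for the transfer step: names the slave does not have yet."""
    return sorted(set(os.listdir(snapshot_dir)) - set(files_on_slave))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        index = os.path.join(root, "index")
        os.makedirs(index)
        with open(os.path.join(index, "_0.cfs"), "w") as handle:
            handle.write("segment data")

        first = os.path.join(root, "snapshot.1")
        second = os.path.join(root, "snapshot.2")
        take_snapshot(index, first)
        take_snapshot(index, second)  # no document was added in between

        pulled = os.listdir(first)  # the slave already holds the first snapshot
        print(files_to_fetch(second, pulled))  # prints []
```

Both snapshots point at the same underlying file, so the second one adds directory entries but no index data, and there is nothing new for the slave to download.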

On Mon, Feb 16, 2009 at 7:40 PM, sunnyfr  wrote:
>
> Hi Noble,
>
> OK, so I don't really mind if it misses one; as long as it gets the last one,
> that's fine. I was also wondering whether a snapshot is created even if no
> document has been updated?
>
> Thanks a lot Noble,
> Wish you a very nice day,
>



-- 
--Noble Paul


Re: almost realtime updates with replication

2009-02-16 Thread sunnyfr

Hi Noble,

OK, so I don't really mind if it misses one; as long as it gets the last one,
that's fine. I was also wondering whether a snapshot is created even if no
document has been updated?

Thanks a lot Noble,
Wish you a very nice day,


Noble Paul നോബിള്‍  नोब्ळ् wrote:
> 
> I guess it should not be a problem.
> --Noble

-- 
View this message in context: 
http://www.nabble.com/almost-realtime-updates-with-replication-tp12276614p22037977.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: almost realtime updates with replication

2009-02-16 Thread Noble Paul നോബിള്‍ नोब्ळ्
I guess it should not be a problem.
--Noble

On Mon, Feb 16, 2009 at 3:28 PM, sunnyfr  wrote:
>
> Hi Hoss,
>
> Is it a problem if the snappuller misses one snapshot before the last one?
>
> Cheers,
> Have a nice day,
>



-- 
--Noble Paul


Re: almost realtime updates with replication

2009-02-16 Thread sunnyfr

Hi Hoss,

Is it a problem if the snappuller misses one snapshot before the last one?

Cheers,
Have a nice day,


hossman wrote:
> 
> there is no reason why a commit has to trigger a snapshot, that happens
> only if you configure a postCommit hook to do so in your solrconfig.xml
> 
> you can absolutely commit every 5 seconds, but have a separate cron task
> that runs snapshooter every 5 minutes -- you could even continue to run
> snapshooter on every commit, and get a new snapshot every 5 seconds, but
> only run snappuller on your slave machines every 5 minutes (the
> snapshots are hardlinks and don't take up a lot of space, and snappuller
> only needs to fetch the most recent snapshot)
> 
> your idea of querying the master directly for these queries seems
> perfectly fine to me ... just make sure the auto warm count on the caches
> on your master is very tiny so the new searchers are ready quickly after
> each commit.
> 
> 
> 
> 
> -Hoss
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/almost-realtime-updates-with-replication-tp12276614p22034406.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: almost realtime updates with replication

2007-08-22 Thread Chris Hostetter
:
: There are a couple queries that we would like to run almost realtime so
: I would like to have it so our client sends an update on every new
: document and then have solr configured to do an autocommit every 5-10
: seconds.
:
: reading the Wiki, it seems like this isn't possible because of the
: strain of snapshotting and pulling to the slaves at such a high rate.
: What I was thinking was for these few queries to just query the master
: and the rest can query the slave with the not realtime data, although
: I'm assuming this wouldn't work either: since a snapshot is
: created on every commit, we would still impact the performance too much?

there is no reason why a commit has to trigger a snapshot, that happens
only if you configure a postCommit hook to do so in your solrconfig.xml

you can absolutely commit every 5 seconds, but have a separate cron task
that runs snapshooter every 5 minutes -- you could even continue to run
snapshooter on every commit, and get a new snapshot every 5 seconds, but
only run snappuller on your slave machines every 5 minutes (the
snapshots are hardlinks and don't take up a lot of space, and snappuller
only needs to fetch the most recent snapshot)
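
As a rough illustration of "only needs to fetch the most recent snapshot", the sketch below picks the newest snapshot out of a directory listing. It assumes snapshot directories are named `snapshot.` followed by a 14-digit `yyyymmddHHMMSS` timestamp; the function name `latest_snapshot` and the sample listing are made up for the example, and this is not how the snappuller script itself is written.

```python
import re

# Assumed naming scheme: "snapshot." followed by a 14-digit yyyymmddHHMMSS timestamp.
SNAPSHOT_NAME = re.compile(r"^snapshot\.(\d{14})$")


def latest_snapshot(names):
    """Return the newest snapshot name in names, or None if there is none."""
    snapshots = [name for name in names if SNAPSHOT_NAME.match(name)]
    # Fixed-width timestamps sort chronologically as plain strings.
    return max(snapshots) if snapshots else None


if __name__ == "__main__":
    on_master = [
        "index",
        "snapshot.20070822120000",
        "snapshot.20070822120005",
        "snapshot.20070822120010",
    ]
    print(latest_snapshot(on_master))  # prints snapshot.20070822120010
```

With a snapshot taken every 5 seconds and a pull every 5 minutes, the intermediate snapshots are simply skipped; only the last name in the sorted order matters to the slave.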

your idea of querying the master directly for these queries seems
perfectly fine to me ... just make sure the auto warm count on the caches
on your master is very tiny so the new searchers are ready quickly after
each commit.




-Hoss



Re: almost realtime updates with replication

2007-08-22 Thread Walter Underwood
At Infoseek, we ran a separate search index with today's updates
and merged that in once each day. It requires a little bit of
federated search to prefer the new content over the big index,
but the daily index can be very nimble for update.

wunder
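
The two-index approach described above can be sketched roughly as follows. This is a toy Python model under stated assumptions, not Infoseek's or Solr's implementation: each index is a plain dict from document id to text, scoring is a simple term count, and the names `search` and `federated_search` are invented for the example.

```python
def score(text, terms):
    """Toy relevance score: how often the query terms occur in the text."""
    words = text.lower().split()
    return sum(words.count(term.lower()) for term in terms)


def search(index, terms):
    """Return {doc_id: score} for every document that matches at least one term."""
    hits = {}
    for doc_id, text in index.items():
        value = score(text, terms)
        if value > 0:
            hits[doc_id] = value
    return hits


def federated_search(terms, daily_index, main_index, limit=10):
    """Query both indexes and prefer the daily copy of any document found in both."""
    main_hits = search(main_index, terms)
    daily_hits = search(daily_index, terms)
    # A document present in the daily index supersedes its copy in the big
    # index, even when the updated version no longer matches the query.
    merged = {doc: s for doc, s in main_hits.items() if doc not in daily_index}
    merged.update(daily_hits)
    ranked = sorted(merged.items(), key=lambda item: (-item[1], item[0]))
    return [doc for doc, _ in ranked[:limit]]


if __name__ == "__main__":
    main_index = {"a": "solr replication guide", "b": "old solr news"}
    daily_index = {"b": "cooking recipes", "c": "solr solr release"}
    print(federated_search(["solr"], daily_index, main_index))  # prints ['c', 'a']
```

Document "b" drops out because today's version replaced the stale copy in the big index, which is the "prefer the new content" step; the small daily index is the only one that has to absorb frequent updates.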

On 8/22/07 7:58 AM, "mike topper" <[EMAIL PROTECTED]> wrote:

> Hello,
> 
> Currently in our application we are using the master/slave setup and
> have a batch update/commit about every 5 minutes.
> 
> There are a couple queries that we would like to run almost realtime so
> I would like to have it so our client sends an update on every new
> document and then have solr configured to do an autocommit every 5-10
> seconds.
> 
> reading the Wiki, it seems like this isn't possible because of the
> strain of snapshotting and pulling to the slaves at such a high rate.
> What I was thinking was for these few queries to just query the master
> and the rest can query the slave with the not realtime data, although
> I'm assuming this wouldn't work either: since a snapshot is
> created on every commit, we would still impact the performance too much?
> 
> anyone have any suggestions?  If I set autowarmCount=0 would I be
> able to pull to the slave faster than every couple of minutes (say,
> every 10 seconds)?
> 
> what if I take out the postcommit hook on the master and just have the
> snapshooter run on a cron every 5 minutes?
> 
> -Mike
> 
> 



almost realtime updates with replication

2007-08-22 Thread mike topper

Hello,

Currently in our application we are using the master/slave setup and 
have a batch update/commit about every 5 minutes.


There are a couple queries that we would like to run almost realtime so 
I would like to have it so our client sends an update on every new 
document and then have solr configured to do an autocommit every 5-10 
seconds.


reading the Wiki, it seems like this isn't possible because of the 
strain of snapshotting and pulling to the slaves at such a high rate.  
What I was thinking was for these few queries to just query the master 
and the rest can query the slave with the not realtime data, although 
I'm assuming this wouldn't work either: since a snapshot is
created on every commit, we would still impact the performance too much?


anyone have any suggestions?  If I set autowarmCount=0 would I be
able to pull to the slave faster than every couple of minutes (say,
every 10 seconds)?


what if I take out the postcommit hook on the master and just have the 
snapshooter run on a cron every 5 minutes?


-Mike