Re: almost realtime updates with replication
Yes, it does. It blindly creates the hard links whether or not a document has been added. But no snappull will happen, because there is no new file to download.

On Mon, Feb 16, 2009 at 7:40 PM, sunnyfr wrote:
> Hi Noble,
>
> OK, so I don't really mind if it misses one; as long as it gets the last one, that's fine.
> I was also wondering: is a snapshot created even if no document has been updated?
>
> Thanks a lot Noble,
> Wish you a very nice day,

--
--Noble Paul
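A small sketch may help show why hard-linked snapshots are cheap. The Python below is not Solr's snapshooter (which is a shell script). The file name `segments_1`, the directory layout, and the helper names are assumptions made for illustration. It only shows that a hard link shares the inode of the original file, so a snapshot costs a directory entry rather than a second copy of the data.

```python
import os
import tempfile


def hardlink_snapshot(index_dir, snapshot_dir):
    """Hard-link every regular file in index_dir into a new snapshot_dir.

    Illustrative only: this mimics the hard-link idea, not the real script.
    """
    os.makedirs(snapshot_dir)
    linked = []
    for name in sorted(os.listdir(index_dir)):
        src = os.path.join(index_dir, name)
        if os.path.isfile(src):
            # os.link creates a second directory entry for the same inode.
            os.link(src, os.path.join(snapshot_dir, name))
            linked.append(name)
    return linked


def demo():
    """Build a throwaway index, snapshot it, and inspect the link."""
    with tempfile.TemporaryDirectory() as root:
        index_dir = os.path.join(root, "index")
        os.makedirs(index_dir)
        # "segments_1" is a hypothetical file name used for illustration.
        with open(os.path.join(index_dir, "segments_1"), "wb") as f:
            f.write(b"x" * 1024)
        snap_dir = os.path.join(root, "snapshot.20090216194000")
        linked = hardlink_snapshot(index_dir, snap_dir)
        original = os.stat(os.path.join(index_dir, "segments_1"))
        snapshot = os.stat(os.path.join(snap_dir, "segments_1"))
        return linked, original.st_ino == snapshot.st_ino, snapshot.st_nlink
```

Calling `demo()` returns the linked file names, whether the two paths share an inode, and the link count of the snapshot file. This assumes a filesystem that supports hard links.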
Re: almost realtime updates with replication
Hi Noble,

OK, so I don't really mind if it misses one; as long as it gets the last one, that's fine.
I was also wondering: is a snapshot created even if no document has been updated?

Thanks a lot Noble,
Wish you a very nice day,

Noble Paul നോബിള്‍ नोब्ळ् wrote:
> I guess it should not be a problem.
> --Noble

--
View this message in context: http://www.nabble.com/almost-realtime-updates-with-replication-tp12276614p22037977.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: almost realtime updates with replication
I guess it should not be a problem.
--Noble

On Mon, Feb 16, 2009 at 3:28 PM, sunnyfr wrote:
> Hi Hoss,
>
> Is it a problem if the snappuller misses one snapshot before the last one?
>
> Cheers,
> Have a nice day,

--
--Noble Paul
Re: almost realtime updates with replication
Hi Hoss,

Is it a problem if the snappuller misses one snapshot before the last one?

Cheers,
Have a nice day,

hossman wrote:
> You can absolutely commit every 5 seconds but have a separate cron task
> that runs snapshooter every 5 minutes. You could even continue to run
> snapshooter on every commit and get a new snapshot every 5 seconds, but
> only run snappuller on your slave machines every 5 minutes (the
> snapshots are hard links and don't take up a lot of space, and snappuller
> only needs to fetch the most recent snapshot).

--
View this message in context: http://www.nabble.com/almost-realtime-updates-with-replication-tp12276614p22034406.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: almost realtime updates with replication
: There are a couple of queries that we would like to run almost in real time,
: so I would like our client to send an update on every new document and
: have Solr configured to autocommit every 5-10 seconds.
:
: Reading the wiki, it seems this isn't possible because of the strain of
: snapshotting and pulling to the slaves at such a high rate. What I was
: thinking was to have these few queries go to the master, and the rest go
: to the slaves with the not-quite-realtime data. I'm assuming this wouldn't
: work either, though: since a snapshot is created on every commit, wouldn't
: we still hurt performance too much?

There is no reason why a commit has to trigger a snapshot. That happens only if you configure a postCommit hook to do so in your solrconfig.xml.

You can absolutely commit every 5 seconds but have a separate cron task that runs snapshooter every 5 minutes. You could even continue to run snapshooter on every commit and get a new snapshot every 5 seconds, but only run snappuller on your slave machines every 5 minutes (the snapshots are hard links and don't take up a lot of space, and snappuller only needs to fetch the most recent snapshot).

Your idea of querying the master directly for these queries seems perfectly fine to me. Just make sure the autowarm count on the caches on your master is very small, so the new searchers are ready quickly after each commit.

-Hoss
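To picture why a slave that pulls every 5 minutes can safely skip the snapshots taken in between, here is a minimal Python sketch. It assumes snapshot directories are named `snapshot.` followed by a sortable timestamp, so that string order matches time order. That naming and the helper `latest_snapshot` are assumptions of this sketch, not Solr's own code.

```python
def latest_snapshot(names, already_installed=None):
    """Return the newest snapshot name, or None if there is nothing newer.

    Hypothetical helper: assumes names such as 'snapshot.YYYYMMDDHHMMSS',
    whose lexical order is also their chronological order.
    """
    snapshots = sorted(n for n in names if n.startswith("snapshot."))
    if not snapshots:
        return None
    newest = snapshots[-1]
    # Nothing to pull if the slave already has this snapshot (or a newer one).
    if already_installed is not None and newest <= already_installed:
        return None
    return newest


# Snapshots taken 5 seconds apart on the master, plus the live index dir.
on_master = [
    "index",
    "snapshot.20090216152800",
    "snapshot.20090216152805",
    "snapshot.20090216152810",
]
to_pull = latest_snapshot(on_master)  # "snapshot.20090216152810"
```

Only the last snapshot is selected. The two earlier ones are never transferred, and a second call with `already_installed=to_pull` returns `None`, which corresponds to the case where no new pull is needed.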
Re: almost realtime updates with replication
At Infoseek, we ran a separate search index with today's updates and merged it into the main index once each day. It requires a little bit of federated search to prefer the new content over the big index, but the daily index can be very nimble for updates.

wunder

On 8/22/07 7:58 AM, "mike topper" <[EMAIL PROTECTED]> wrote:
> Currently in our application we are using the master/slave setup and
> have a batch update/commit about every 5 minutes.
>
> There are a couple of queries that we would like to run almost in real time,
> so I would like our client to send an update on every new document and
> have Solr configured to autocommit every 5-10 seconds.
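The "little bit of federated search" could look roughly like the Python sketch below. It is not Infoseek's or Solr's implementation. The function name `merge_results`, the `(doc_id, score)` hit format, and the assumption that scores from the two indexes are directly comparable are all simplifications made for illustration.

```python
def merge_results(daily_hits, main_hits, limit=10):
    """Merge hits from a small daily index and a large main index.

    Each hit is a (doc_id, score) tuple. When a document appears in both
    indexes, the daily version wins because it is the newer copy.
    """
    merged = dict(main_hits)
    merged.update(dict(daily_hits))  # daily entries override main entries
    # Highest score first; ties broken by doc_id for a stable order.
    ranked = sorted(merged.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:limit]


daily = [("a", 0.9), ("c", 0.4)]
main = [("a", 0.5), ("b", 0.7)]
results = merge_results(daily, main)
# results == [("a", 0.9), ("b", 0.7), ("c", 0.4)]
```

Document `a` exists in both indexes, and the merged list carries its daily score. In a real system the two score scales would probably need normalizing before they are compared.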
almost realtime updates with replication
Hello,

Currently in our application we are using the master/slave setup and have a batch update/commit about every 5 minutes.

There are a couple of queries that we would like to run almost in real time, so I would like our client to send an update on every new document and have Solr configured to autocommit every 5-10 seconds.

Reading the wiki, it seems this isn't possible because of the strain of snapshotting and pulling to the slaves at such a high rate. What I was thinking was to have these few queries go to the master, and the rest go to the slaves with the not-quite-realtime data. I'm assuming this wouldn't work either, though: since a snapshot is created on every commit, wouldn't we still hurt performance too much?

Does anyone have any suggestions? If I set autowarmCount=0, would I be able to pull to the slave more often than every couple of minutes (say, every 10 seconds)?

What if I take out the postCommit hook on the master and just have the snapshooter run from cron every 5 minutes?

-Mike