from:"Yves Dorfsman"

Re: Why ES node starts recovering all the data from other nodes after reboot?

2014-11-22 Thread Yves Dorfsman

On 2014-11-22 09:35, Otis Gospodnetic wrote:
 Hi Konstantin,
 
 Check out http://gibrown.com/2014/11/19/elasticsearch-the-broken-bits/
 

Good writing! Thanks.

I wonder if there's any drawback to cutting indices into smaller (tiny?) shards?

My thinking is this: we don't really change data in our bigger indices, we
just keep adding to them, so ultimately, as we rebuild nodes, they should all
have the same version of the old shards, which should make restarts, and even
rebuilds from backups, much faster.

-- 
http://yves.zioup.com
gpg: 4096R/32B0F416

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/5470DE32.5070902%40zioup.com.
For more options, visit https://groups.google.com/d/optout.

Re: Why ES node starts recovering all the data from other nodes after reboot?

2014-11-21 Thread Yves Dorfsman

Thanks, Nikolas.

Is this true on version 0.90, or only on 1.x?
I've had nodes die and restart, and they did copy everything!

On 2014-11-20 22:02, Nikolas Everett wrote:
The thing is that this is a disk-level operation. It pretty much rsyncs the
files from the current primary shard to the node when it comes back online.
This would be OK if the replica shards matched the primary, but that is
normally the case only if the shard was moved to the node after it was mostly
complete and has had only a few writes since. Normally shards don't match
each other because the way the index is maintained is nondeterministic.

The translog replay is only used as a catch up after the rsync-like step.
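
As a rough illustration of the two phases described above (a file-level copy followed by a translog replay), here is a toy Python model. It is a sketch under stated assumptions, not Elasticsearch's recovery code: the function names, the segment file names, and the use of a SHA-1 content hash to compare files are all made up for the example.

```python
import hashlib


def file_digest(data: bytes) -> str:
    """Content hash used to decide whether two segment files are identical."""
    return hashlib.sha1(data).hexdigest()


def plan_recovery(primary: dict, replica: dict):
    """Compare segment files (name -> bytes) on the primary and the replica.

    Returns three sorted lists: files to copy from the primary, files the
    replica can reuse as they are, and stale replica files to delete.
    """
    to_copy = sorted(
        name
        for name, data in primary.items()
        if name not in replica or file_digest(replica[name]) != file_digest(data)
    )
    reusable = sorted(name for name in primary if name not in to_copy)
    to_delete = sorted(name for name in replica if name not in primary)
    return to_copy, reusable, to_delete


def replay_translog(docs: dict, translog: list) -> dict:
    """Apply (op, doc_id, source) entries recorded while the file copy ran."""
    result = dict(docs)
    for op, doc_id, source in translog:
        if op == "index":
            result[doc_id] = source
        elif op == "delete":
            result.pop(doc_id, None)
    return result
```

In this model, a replica whose segment files were written independently of the primary (different names or different bytes) ends up with almost everything in the "to copy" list, which matches the full-copy behaviour reported in this thread.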

This is something that is being worked on. It's certainly my biggest complaint
about Elasticsearch, but I'm confident that it'll get better.

Nik

On Nov 20, 2014 11:11 PM, Mark Walkom markwal...@gmail.com wrote:

It will enter recovery where it syncs at the segment level from the
current primary, then the translog gets shipped over and (re)played, which
brings it all up to date.

On 21 November 2014 14:51, Yves Dorfsman y...@zioup.com wrote:

If you do disable allocation before you reboot a node and a client
writes to a shard that had a replica on that node, does the entire
replica get copied when the node comes up? Or does it just get
updated?

On Thursday, 20 November 2014 19:52:26 UTC-7, Mark Walkom wrote:

You should disable allocation before you reboot, that will save a
lot of shard shuffling -

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/setup-upgrade.html#rolling-upgrades
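
A minimal sketch of what "disable allocation before you reboot" could look like, assuming the 1.x cluster setting `cluster.routing.allocation.enable` applied through `PUT /_cluster/settings` (0.90 releases use a differently named setting, so check the guide linked above for your version). The helper names are hypothetical, and the code only builds the requests; it does not send them.

```python
import json

SETTINGS_PATH = "/_cluster/settings"
ALLOWED = {"all", "primaries", "new_primaries", "none"}


def allocation_settings(enable: str) -> str:
    """JSON body that sets shard allocation cluster-wide (transient setting)."""
    if enable not in ALLOWED:
        raise ValueError("unknown allocation mode: %r" % enable)
    body = {"transient": {"cluster.routing.allocation.enable": enable}}
    return json.dumps(body, sort_keys=True)


def rolling_restart_plan():
    """Ordered steps for restarting one node without shard shuffling."""
    return [
        ("PUT", SETTINGS_PATH, allocation_settings("none")),
        ("NOTE", "restart the node and wait for it to rejoin the cluster", ""),
        ("PUT", SETTINGS_PATH, allocation_settings("all")),
    ]
```

Each `PUT` tuple can be sent with any HTTP client. Using a transient setting is a choice made for this sketch, so that a full cluster restart does not leave allocation disabled.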

On 21 November 2014 13:48, Konstantin Erman kon...@gmail.com
wrote:

I work on an experimental cluster of ES nodes running on
Windows Server machines. Once in a while we have a need to
reboot machines. The initial state - cluster is green and well
balanced. One machine is gracefully taken offline and then
after necessary service is performed it comes back online. All
the hardware and file system content is intact. As soon as ES
service starts on that machine, it assumes that there is no
usable data locally and recovers as much data as it deems
necessary for balancing from other nodes.

This behavior puzzles me, because most of the data shards
stored on that machine file system can be reused as they are.
Cluster stores logs, so all indices except those for
the current day never ever change until they get deleted.
Can't ES node detect that it has perfect copies of some
(actually most) of the shards and instead of copying them over
just mark them as up to date?

I suspect I don't know about some step to enable this behavior
and I'm looking to enable it. Any advice?

Thank you!
Konstantin


adding a new node: how to prime the data

2014-11-20 Thread Yves Dorfsman

We upgrade our clusters by adding new nodes, increasing the number of
replicas on the indices, letting the new node catch up, then excluding the old
node and reducing the number of replicas on the indices.
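
The procedure above can be sketched as an ordered list of settings requests. This assumes the `index.number_of_replicas` index setting and the `cluster.routing.allocation.exclude._name` cluster setting. The index name, node name and helper functions are hypothetical, and nothing is actually sent to a cluster.

```python
import json


def replica_update(index: str, replicas: int):
    """Request that changes the replica count of one index."""
    body = {"index": {"number_of_replicas": replicas}}
    return ("PUT", "/%s/_settings" % index, json.dumps(body))


def exclude_node(node_name: str):
    """Request that moves all shards away from the named node."""
    body = {"transient": {"cluster.routing.allocation.exclude._name": node_name}}
    return ("PUT", "/_cluster/settings", json.dumps(body))


def node_swap_plan(index: str, current_replicas: int, old_node: str):
    """Steps to run once the new node has joined the cluster."""
    return [
        replica_update(index, current_replicas + 1),  # new node receives a copy
        exclude_node(old_node),                       # drain the old node
        replica_update(index, current_replicas),      # back to the usual count
    ]
```

For example, `node_swap_plan("logs", 1, "old-node-1")` yields three requests. In practice you would wait for the cluster to return to green between steps; that wait is left out of the sketch.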

One cluster has a large index for which this operation takes hours. We
tried to copy data from an existing node, but it copies everything
regardless (I suspect it has no way to know what's new or not?). We do
plan to split that index into smaller shards, but in the meantime we are
wondering if there is a better way of doing this.

Thanks.

---
http://yves.zioup.com
gpg: 4096R/32B0F416 


Re: Why ES node starts recovering all the data from other nodes after reboot?

2014-11-20 Thread Yves Dorfsman

If you do disable allocation before you reboot a node and a client writes
to a shard that had a replica on that node, does the entire replica get
copied when the node comes up? Or does it just get updated?

On Thursday, 20 November 2014 19:52:26 UTC-7, Mark Walkom wrote:

You should disable allocation before you reboot, that will save a lot of
shard shuffling -
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/setup-upgrade.html#rolling-upgrades

On 21 November 2014 13:48, Konstantin Erman kon...@gmail.com wrote:

I work on an experimental cluster of ES nodes running on Windows Server
machines. Once in a while we have a need to reboot machines. The initial
state - cluster is green and well balanced. One machine is gracefully taken
offline and then after necessary service is performed it comes back online.
All the hardware and file system content is intact. As soon as ES service
starts on that machine, it assumes that there is no usable data locally and
recovers as much data as it deems necessary for balancing from other nodes.

This behavior puzzles me, because most of the data shards stored on that
machine file system can be reused as they are. Cluster stores logs, so all
indices except those for the current day never ever change until they get
deleted. Can't ES node detect that it has perfect copies of some (actually
most) of the shards and instead of copying them over just mark them as up
to date?

I suspect I don't know about some step to enable this behavior and I'm
looking to enable it. Any advice?

Thank you!
Konstantin


Re: priming data for a new node

2014-11-20 Thread Yves Dorfsman


So if a shard has been updated since the data copy, will it copy the entire 
shard, or just update it?

On Wednesday, 19 November 2014 23:34:01 UTC-7, Mark Walkom wrote:

 It doesn't copy everything, only what it needs to balance the shards.

 On 20 November 2014 17:20, Yves Dorfsman yv...@zioup.com wrote:

 When adding a new node to a cluster, is there a way to prevent it from having
 to copy all the data from the other nodes?

 We tried to copy the data on disk from an existing node (one that had all the
 data for the given indices), but it still copied everything. Is there a way to
 make it update what is new only?

 Thanks.

 --
 http://yves.zioup.com
 gpg: 4096R/32B0F416





upgrading from 0.90.7 to 1.4. Gotchas?

2014-11-19 Thread Yves Dorfsman

Are there any precautions to take before upgrading from 0.90 to 1.4?

Different data types?
Different API calls?
etc...

And, what is the best way to upgrade? Can we just add a node at the newer 
version and let it pull the data?

Thanks.

http://yves.zioup.com
gpg: 4096R/32B0F416 


priming data for a new node

2014-11-19 Thread Yves Dorfsman

When adding a new node to a cluster, is there a way to prevent it from having
to copy all the data from the other nodes?

We tried to copy the data on disk from an existing node (one that had all the
data for the given indices), but it still copied everything. Is there a way to
make it update what is new only?

Thanks.

-- 
http://yves.zioup.com
gpg: 4096R/32B0F416


Is it possible to isolate search queries to a single node

2014-03-10 Thread Yves Dorfsman

I have a job that makes heavy use of ES, to the point that it affects the
cluster. Is it possible to:

  - add a replica
  - force the extra replica to a specific node
  - isolate some of the queries to that particular node?
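
For the last bullet, one possible approach is the search `preference` parameter with the `_only_node:<node id>` value, assuming it is available in the version in use. The sketch below only builds the search path. The index name and node id are hypothetical, and it does not address pinning a replica to a node.

```python
from urllib.parse import urlencode


def search_path(index: str, node_id: str = None) -> str:
    """Search path for an index, optionally restricted to a single node."""
    path = "/%s/_search" % index
    if node_id is None:
        return path
    # Only shard copies held on this node are asked to run the search.
    return path + "?" + urlencode({"preference": "_only_node:%s" % node_id})
```

For example, `search_path("logs", "abc123")` returns `/logs/_search?preference=_only_node%3Aabc123`. The chosen node would need to hold a copy of every shard of the index for such a query to cover all the data.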

Thanks.


Can a replica be updated with the deltas only?

2014-03-10 Thread Yves Dorfsman

When I shut down a node that holds a replica while updates are happening to
the rest of the cluster, then restart this node, it seems that the entire
replica is being copied again to that node.

Is there a way to make ES just update that node with the updates that 
happened while it was down?


Thanks.


