Marvel creating disk usage imbalance

2014-11-11 Thread Duncan Innes
I now know that Marvel creates a lot of data per day of monitoring - in our 
case around 1 GB.

What I'm just starting to get my head around is the imbalance of disk usage 
that this caused on my 5-node cluster.

I've now removed Marvel and deleted its indexes (great tool, but I 
don't have the disk space to spare on this proof of concept) and my disk 
usage for the 12 months of rsyslog data has equalised across all the nodes 
in my cluster.  When the Marvel data was sitting there, not only was I 
using far too much disk space, but I was also seeing significant 
differences between nodes.  At least one node would be using nearly all of 
its 32 GB, while other nodes would sit at half that or even less.  Is there 
something intrinsically different about Marvel's indexes that makes them 
prone to such wild differences?

Thanks

Duncan

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/7c7d7fb3-a704-4ea5-a74d-efa01f1fa11d%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Marvel creating disk usage imbalance

2014-11-11 Thread Duncan Innes
Interesting - I thought I'd narrowed it down to Marvel.  I had big 
imbalances with Marvel running, now it all seems flat (although to be fair, 
the disk usage has dropped to around 5 GB used in a 32 GB partition, so 
there's a large amount of free space).

Same as you though - I could do nothing to rebalance the usage.  We've 
built the cluster so that nodes can be rebuilt and rejoin the cluster if 
absolutely necessary.  Even doing that didn't affect the balance. 

Will keep an eye on those issues.  Looks like there's still a chance of 
disk imbalance, though, while all disks are well below the watermark.  
At least the issue looks to be addressed once disk use rises above 
the high watermark.
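
To put a number on the imbalance described here, a rough sketch follows. It takes per-node used and total bytes, reports the spread between the fullest and emptiest node, and lists nodes at or above a high watermark. The node names and sizes are illustrative, not measurements from this cluster, the function name is hypothetical, and the 90% threshold is only a parameter default chosen for the example; check what your Elasticsearch version uses for `cluster.routing.allocation.disk.watermark.high`.

```python
from typing import Dict, List, Tuple

GB = 1024 ** 3


def disk_usage_report(
    nodes: Dict[str, Tuple[int, int]], high_watermark: float = 0.90
) -> Tuple[Dict[str, float], float, List[str]]:
    """nodes maps a node name to (used_bytes, total_bytes).

    Returns the used fraction per node, the spread between the fullest
    and emptiest node, and the names of nodes at or above high_watermark.
    """
    ratios = {name: used / total for name, (used, total) in nodes.items()}
    spread = max(ratios.values()) - min(ratios.values())
    over = sorted(name for name, ratio in ratios.items() if ratio >= high_watermark)
    return ratios, spread, over


# Illustrative figures only (hypothetical nodes with 32 GB partitions).
example = {
    "node1": (30 * GB, 32 * GB),
    "node2": (16 * GB, 32 * GB),
    "node3": (12 * GB, 32 * GB),
}
ratios, spread, over = disk_usage_report(example)
print(f"spread between fullest and emptiest node: {spread:.1%}")
print("nodes at or above the high watermark:", over)
```

A large spread with an empty `over` list corresponds to the situation above: visibly unbalanced disks, but nothing has crossed the threshold yet.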

Cheers

D

On Tuesday, 11 November 2014 13:26:01 UTC, Michael Hart wrote:

 I think it's related to this: 
 https://github.com/elasticsearch/elasticsearch/pull/8270 which I believe 
 was released with 1.4.

 We see the same thing, with hot spots on some nodes. You can poke the 
 cluster to rebalance itself using 
 curl -XPOST localhost:9200/_cluster/reroute (#8270 should remove the need 
 for that permanently). That doesn't always sort it out, and this issue (
 https://github.com/elasticsearch/elasticsearch/issues/8149) is our 
 primary problem.
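
For anyone scripting the "poke", the same empty POST to `/_cluster/reroute` can be built from Python's standard library. This is only a sketch: the helper name is made up, the host and port are the defaults from the curl command above, and the request is built but not sent.

```python
import urllib.request


def build_reroute_request(host: str = "localhost", port: int = 9200) -> urllib.request.Request:
    """Build (but do not send) an empty POST to the cluster reroute endpoint."""
    url = f"http://{host}:{port}/_cluster/reroute"
    # An empty body mirrors `curl -XPOST` with no payload.
    return urllib.request.Request(url, data=b"", method="POST")


req = build_reroute_request()
print(req.get_method(), req.full_url)

# To send it against a live cluster, something like:
# with urllib.request.urlopen(req, timeout=10) as resp:
#     print(resp.status)
```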

 AFAIK it's not just Marvel; any index can get into this situation. 
 Right now I have a few nodes with 1 TB of free disk and others with 400 GB, 
 and Marvel is in another cluster entirely.

 cheers
 mike



Re: ES and Java 8. Does it worth the effort ?

2014-10-31 Thread Duncan Innes
We upgraded to Java 8 OpenJDK a few weeks ago - no problems seen so far.

D

On Friday, 31 October 2014 08:39:39 UTC, Jörg Prante wrote:

 For JVM support, see


 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/setup.html

 We recommend installing the Java 8 update 20 or later, or Java 7 update 
 55 or later
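
The recommendation quoted above can be turned into a small check. This is a sketch under assumptions: it expects the legacy `1.<major>.0_<update>` version string format, the function name is made up, and any Java major the recommendation does not mention is simply reported as not meeting it.

```python
import re

# Minimum update level per Java major version, taken from the
# recommendation quoted above (Java 7 update 55, Java 8 update 20).
MINIMUM_UPDATE = {7: 55, 8: 20}


def meets_recommendation(version: str) -> bool:
    """Check a legacy-style version string such as '1.7.0_55'."""
    match = re.fullmatch(r"1\.(\d+)\.0_(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    major, update = int(match.group(1)), int(match.group(2))
    if major not in MINIMUM_UPDATE:
        # Not covered by the quoted recommendation.
        return False
    return update >= MINIMUM_UPDATE[major]


for candidate in ("1.7.0_51", "1.7.0_55", "1.8.0_20"):
    print(candidate, meets_recommendation(candidate))
```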

 Jörg

 On Fri, Oct 31, 2014 at 8:19 AM, Vaidik Kapoor kapoor...@gmail.com wrote:

 Would love to hear from Elasticsearch about plans on moving to Java 8.

 Vaidik Kapoor
 vaidikkapoor.info

 On 31 October 2014 03:08, joerg...@gmail.com wrote:

 Of course Java 8 is worth the effort.

 Some highlights:

 - no more permgen OOMs
 - improved concurrency implementations: 
 http://docs.oracle.com/javase/8/docs/technotes/guides/concurrency/changes8.html
 - faster hash maps (20%)
 - lambda expressions are faster than inner classes

 Some concurrency classes like LongAdder are already included in ES.

 The Java 8 JVM also brings G1 GC to its full potential. G1 GC is not faster 
 than CMS GC, but it scales much better over multiple cores and reduces 
 stop-the-world pauses to milliseconds.

 To exploit the full advantage of Java 8, ES would need a large overhaul 
 by rewriting inner classes to lambda style, streams, fork/join pool etc. As 
 long as Lucene does not switch to Java 8, the benefit is only partial.

 Jörg




 On Thu, Oct 30, 2014 at 9:01 PM, Georgi Ivanov georgi@gmail.com wrote:

 Hi,
 I wonder if I should start using Java 8 with my ES cluster.

 Are there any benefits to using Java 8?
 For example: faster GC, a faster Java itself, anything else ES would 
 benefit from in Java 8, etc.


 Please share your experience.

 Georgi



Refusal to recover after node rebuild

2014-10-10 Thread Duncan Innes
Hi,

I've got a proof of concept cluster with 5 nodes.  Several months of rsyslog 
data is in there with 2 replicas per index.

I then decided to rebuild 2 nodes simultaneously.  No problem.  The cluster 
reallocated as expected and each of the remaining 3 nodes stored all of the 
indexes and replicas in full.  Once the cluster had finished this 
reallocation, I decided to rebuild another 2 nodes simultaneously (without 
waiting for the first 2 to come back).  This, after all, would leave 1 node 
storing all of the data.

Unfortunately that's where things started to unravel.  The initial 2 nodes 
have come back online and joined the cluster.  But now my cluster reports 
that every shard is unassigned and there doesn't seem to be any process 
running to reallocate.

What I don't understand is that the cluster was fully balanced at 3 nodes 
and 2 replicas per index.  Does taking a node out in this instance cause a 
problem?  My data is still sitting on the node that hasn't been rebuilt, 
but I can't get it to reallocate onto the other nodes.

It's only a proof of concept, so data loss isn't the issue here.  It's 
understanding why this happened and figuring out if I did anything 
inherently wrong.
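
The arithmetic behind the question can be sketched as follows. It relies on two facts: an index with 2 replicas has 3 copies of every shard (1 primary plus 2 replicas), and Elasticsearch does not place two copies of the same shard on one node. The function name is made up for illustration, and the sketch ignores every other allocation, discovery and recovery setting.

```python
from typing import Tuple


def assignable_copies(replicas: int, data_nodes: int) -> Tuple[int, int]:
    """Return (assigned, unassigned) copies per shard.

    Each shard has 1 primary plus `replicas` replica copies, and a node
    can hold at most one copy of a given shard.
    """
    copies_wanted = 1 + replicas
    assigned = min(copies_wanted, data_nodes)
    return assigned, copies_wanted - assigned


for node_count in (5, 3, 1):
    assigned, unassigned = assignable_copies(replicas=2, data_nodes=node_count)
    print(f"{node_count} nodes: {assigned} assigned, {unassigned} unassigned per shard")
```

With 3 nodes every copy has a home, which matches the fully balanced state described above. With 1 node, two replica copies per shard have nowhere to go, so this arithmetic on its own accounts for unassigned replicas, not for unassigned primaries.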

Cheers

Duncan



Re: Disk changes forced resync

2014-05-12 Thread Duncan Innes
Not doing any monitoring yet - this is my dev cluster running on 3 
workstations.

I thought I was quick enough that the rebalance wouldn't have marched ahead 
and changed much - clearly my admin skills need sharpening!

Is there a way to get the cluster to avoid rebalancing when a node is 
removed from the cluster?  I wouldn't want a cluster rebalance starting 
just because I'm patching the OS and need a reboot.
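
One approach, assuming a version that supports the `cluster.routing.allocation.enable` setting, is to switch shard allocation off before the reboot and back on afterwards with a transient update sent as a PUT to `/_cluster/settings`. The sketch below only builds the request bodies; the helper name is made up and nothing is sent to a cluster.

```python
import json

# Values accepted by cluster.routing.allocation.enable.
ALLOWED = {"all", "primaries", "new_primaries", "none"}


def allocation_settings_body(enable: str) -> str:
    """Build the JSON body for a transient cluster settings update."""
    if enable not in ALLOWED:
        raise ValueError(f"enable must be one of {sorted(ALLOWED)}")
    return json.dumps({"transient": {"cluster.routing.allocation.enable": enable}})


before_reboot = allocation_settings_body("none")  # stop shards being reallocated
after_reboot = allocation_settings_body("all")    # restore normal allocation
print(before_reboot)
print(after_reboot)
```

The idea is to send `before_reboot` ahead of stopping the node and `after_reboot` once it has rejoined. Transient settings do not survive a full cluster restart.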

Thanks

Duncan



Disk changes forced resync

2014-05-09 Thread Duncan Innes
Apologies if this is a silly question.

I recently changed the disk layout on one of my ES nodes to put 
/var/lib/elasticsearch on its own disk partition.  Around 100 GB of data was 
set to one side, a new disk was created, then the data was rsync'd to its new 
partition.

As far as I can be certain, everything was the same before and after the 
operation - just that the data now sat on its own partition.  Ownership, 
permissions, ACLs and SELinux contexts were all synced.
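
A check along these lines can back up the "everything was the same" claim before restarting the node. It is a rough sketch: the function name is made up, it compares only owner, group, permission bits and file size, and it does not look at ACLs, SELinux contexts or file contents. It also says nothing about why Elasticsearch chose to discard the data.

```python
import os
import stat
import tempfile
from typing import List, Tuple


def metadata_differences(src: str, dst: str) -> List[Tuple[str, str]]:
    """List relative paths under src that are missing in dst, or whose
    owner, group, permission bits or (for regular files) size differ."""
    diffs = []
    for root, dirs, files in os.walk(src):
        for name in dirs + files:
            src_path = os.path.join(root, name)
            rel = os.path.relpath(src_path, src)
            dst_path = os.path.join(dst, rel)
            if not os.path.lexists(dst_path):
                diffs.append((rel, "missing"))
                continue
            a, b = os.lstat(src_path), os.lstat(dst_path)
            a_meta = (a.st_uid, a.st_gid, stat.S_IMODE(a.st_mode))
            b_meta = (b.st_uid, b.st_gid, stat.S_IMODE(b.st_mode))
            if a_meta != b_meta:
                diffs.append((rel, "owner/mode"))
            elif stat.S_ISREG(a.st_mode) and a.st_size != b.st_size:
                diffs.append((rel, "size"))
    return sorted(diffs)


# Small self-contained demo using temporary directories.
with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
    for base in (src, dst):
        with open(os.path.join(base, "segments"), "w") as fh:
            fh.write("data")
    os.chmod(os.path.join(src, "segments"), 0o644)
    os.chmod(os.path.join(dst, "segments"), 0o600)
    demo = metadata_differences(src, dst)
    print(demo)
```

In the situation above, `src` would be the set-aside copy and `dst` the new partition mounted at /var/lib/elasticsearch.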

When I started elasticsearch back up again, however, the 100 GB in the 
partition vanished and the node started to rebuild itself from scratch, 
copying data in from all the other nodes.

The time between stopping and restarting elasticsearch was around 15 
minutes.  I expected the data that I'd put back in place to be used first 
as most of it is historical and won't have changed.

Did I do something wrong in my procedure, or am I just expecting the wrong 
thing?

I'm hoping that doing disk maintenance like this in a production system 
doesn't trigger such a rebuild as my prod systems will have significantly 
more data.

Many thanks

Duncan

RHEL 6.5 x86_64
java-1.7.0-oracle-1.7.0.51-1jpp.1.el6_5.x86_64
elasticsearch-1.1.0-1.noarch
