[jira] [Commented] (SOLR-3685) solrcloud crashes on startup due to excessive memory consumption
[ https://issues.apache.org/jira/browse/SOLR-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13430400#comment-13430400 ]

Mark Miller commented on SOLR-3685:
-----------------------------------

I was off a bit - even a non-graceful shutdown should not cause this. If you are not indexing when you shut down, at worst nodes should sync - not replicate.

In my testing, I could easily reproduce this though - replication recoveries when it should be a sync. Yonik recently committed a fix for this on trunk.

solrcloud crashes on startup due to excessive memory consumption
----------------------------------------------------------------

Key: SOLR-3685
URL: https://issues.apache.org/jira/browse/SOLR-3685
Project: Solr
Issue Type: Bug
Components: replication (java), SolrCloud
Affects Versions: 4.0-ALPHA
Environment: Debian GNU/Linux Squeeze 64bit
             Solr 5.0-SNAPSHOT 1365667M - markus - 2012-07-25 19:09:43
Reporter: Markus Jelsma
Priority: Critical
Fix For: 4.1
Attachments: info.log

There's a serious problem with restarting nodes, not cleaning old or unused index directories and sudden replication and Java being killed by the OS due to excessive memory allocation. Since SOLR-1781 was fixed index directories get cleaned up when a node is being restarted cleanly, however, old or unused index directories still pile up if Solr crashes or is being killed by the OS, happening here.

We have a six-node 64-bit Linux test cluster with each node having two shards. There's 512MB RAM available and no swap. Each index is roughly 27MB so about 50MB per node, this fits easily and works fine. However, if a node is being restarted, Solr will consistently crash because it immediately eats up all RAM. If swap is enabled Solr will eat an additional few 100MB's right after start up.

This cannot be solved by restarting Solr, it will just crash again and leave index directories in place until the disk is full. The only way I can restart a node safely is to delete the index directories and have it replicate from another node. If I then restart the node it will crash almost consistently.

I'll attach a log of one of the nodes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
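The issue description mentions old or unused index directories piling up after crashes. As a rough illustration only, the sketch below lists index-like directories in a data directory other than the one currently in use. The function name `stale_index_dirs`, the `index.<timestamp>` naming, and the way the active directory is passed in are assumptions made for this example, not Solr's cleanup code.

```python
import tempfile
from pathlib import Path


def stale_index_dirs(data_dir, active="index"):
    """Return names of index-like directories under data_dir other than the active one.

    Hypothetical helper: the "index" prefix and the `active` argument are
    assumptions for this sketch.
    """
    root = Path(data_dir)
    return sorted(
        p.name
        for p in root.iterdir()
        if p.is_dir() and p.name.startswith("index") and p.name != active
    )


# Demo against a throwaway directory with made-up directory names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("index", "index.20120725190943", "index.20120730094836", "tlog"):
        (Path(tmp) / name).mkdir()
    print(stale_index_dirs(tmp, active="index.20120730094836"))
```

A listing like this only reports candidates; whether a directory is really safe to delete depends on what the running core has open.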
[jira] [Commented] (SOLR-3685) solrcloud crashes on startup due to excessive memory consumption
[ https://issues.apache.org/jira/browse/SOLR-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13430438#comment-13430438 ]

Markus Jelsma commented on SOLR-3685:
-------------------------------------

When exactly? Do you have an issue?
[jira] [Commented] (SOLR-3685) solrcloud crashes on startup due to excessive memory consumption
[ https://issues.apache.org/jira/browse/SOLR-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13430448#comment-13430448 ]

Mark Miller commented on SOLR-3685:
-----------------------------------

It was tagged to this issue number:

+* SOLR-3685: Solr Cloud sometimes skipped peersync attempt and replicated instead due
+  to tlog flags not being cleared when no updates were buffered during a previous
+  replication. (Markus Jelsma, Mark Miller, yonik)
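The CHANGES entry for this fix describes a flag that stays set when a replication finishes without any buffered updates. The toy model below is one way to picture that. Everything in it (`ToyTlog`, `replication_flag`, `recovery_on_restart`) is hypothetical and written for this illustration; it is not Solr's update log or recovery code.

```python
class ToyTlog:
    """Toy stand-in for a transaction log; not Solr's implementation."""

    def __init__(self, clear_flag_without_updates):
        # False models the behaviour described in the entry, True the fixed one.
        self.clear_flag_without_updates = clear_flag_without_updates
        self.replication_flag = False
        self.buffered = []

    def begin_replication(self):
        self.replication_flag = True

    def buffer_update(self, doc):
        self.buffered.append(doc)

    def end_replication(self):
        # Assumed bug shape: the flag is only cleared if something was buffered.
        if self.buffered or self.clear_flag_without_updates:
            self.replication_flag = False
        self.buffered = []


def recovery_on_restart(tlog):
    """A leftover flag looks like an unfinished replication, so replicate again."""
    return "replicate" if tlog.replication_flag else "peersync"


def restart_after_idle_replication(fixed):
    tlog = ToyTlog(clear_flag_without_updates=fixed)
    tlog.begin_replication()
    tlog.end_replication()  # no updates arrived while replicating
    return recovery_on_restart(tlog)


print(restart_after_idle_replication(fixed=False))  # replicate
print(restart_after_idle_replication(fixed=True))   # peersync
```

In this model an idle cluster still ends up replicating on restart, which matches the symptom reported in this issue.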
[jira] [Commented] (SOLR-3685) solrcloud crashes on startup due to excessive memory consumption
[ https://issues.apache.org/jira/browse/SOLR-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13430450#comment-13430450 ]

Mark Miller commented on SOLR-3685:
-----------------------------------

I think we still need to make an issue for cleaning up replication directories on non-graceful shutdown.

I'll rename this issue to match the recovery issue. And we can create a new issue for the memory thing (I tried to spot that locally, but have not managed to yet).
[jira] [Commented] (SOLR-3685) solrcloud crashes on startup due to excessive memory consumption
[ https://issues.apache.org/jira/browse/SOLR-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13429132#comment-13429132 ]

Markus Jelsma commented on SOLR-3685:
-------------------------------------

Each node has two cores and allows only one warming searcher at any time. The problem is triggered on start up after a graceful shutdown as well as after a hard power off. I've seen it happen not only when the whole cluster is restarted (I don't think I've ever done that) but also when just one node of the 6 shard 2 replica test cluster is restarted.

The attached log is of one node being restarted out of the whole cluster.

Could the off-heap RAM be part of data being sent over the wire?

We've worked around the problem for now by getting more RAM.
[jira] [Commented] (SOLR-3685) solrcloud crashes on startup due to excessive memory consumption
[ https://issues.apache.org/jira/browse/SOLR-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13428255#comment-13428255 ]

Mark Miller commented on SOLR-3685:
-----------------------------------

Is it 2 or 3 cores you have?

One thing is that it won't be just one extra searcher and index - it will be that times the number of cores. All of them will attempt to recover at the same time. So you will see a bump in RAM requirements. You are talking about off-heap RAM though - I don't think SolrCloud will have much to do with that.

Looking at your logs, it appears that you are replicating because the transaction logs look suspect - probably because of a hard power down. If you shut down gracefully, you would get a peer sync instead, which should determine you are up to date.

The comment for the path you are taking says:

{quote}
// last operation at the time of startup had the GAP flag set...
// this means we were previously doing a full index replication
// that probably didn't complete and buffering updates in the meantime.
{quote}
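To put rough numbers on the "times the number of cores" point, here is a back-of-the-envelope sketch. The function `transient_index_mb` is hypothetical, and the assumption that a replicating core briefly holds two index copies (the old one plus the freshly downloaded one) is ours, not something stated in the thread. The 2 cores and 27MB figures come from the issue description.

```python
def transient_index_mb(cores, index_mb, replicating):
    """Rough index data (MB) a node touches while all cores recover at once.

    Assumption for this sketch: a replicating core holds two index copies,
    a core that only peer-syncs holds one.
    """
    copies = 2 if replicating else 1
    return cores * index_mb * copies


print(transient_index_mb(cores=2, index_mb=27, replicating=True))   # 108
print(transient_index_mb(cores=2, index_mb=27, replicating=False))  # 54
```

Even the doubled figure is far below the memory consumption reported here, which is consistent with the remark that SolrCloud recovery alone does not explain the off-heap usage.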
[jira] [Commented] (SOLR-3685) solrcloud crashes on startup due to excessive memory consumption
[ https://issues.apache.org/jira/browse/SOLR-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13428310#comment-13428310 ]

Mark Miller commented on SOLR-3685:
-----------------------------------

bq. Looking at your logs, it appears that you are replicating because the transaction logs look suspect - probably because of a hard power down. If you shutdown gracefully, you would get a peer sync instead which should determine you are up to date.

Alright, I just saw a similar thing happen even when shutting everyone down gracefully.

I think it's likely our kind of unorderly cluster shutdown. If you shut down all the nodes at once, depending on some timing differences, some recoveries may trigger as the leader goes down. Then the replica would go down.

In my case though, I was working with a 3 shard 2 replica cluster - so I don't think that was likely the issue. If one node goes down, there is no one to recover from.

We need to investigate a bit more.
[jira] [Commented] (SOLR-3685) solrcloud crashes on startup due to excessive memory consumption
[ https://issues.apache.org/jira/browse/SOLR-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13424785#comment-13424785 ]

Markus Jelsma commented on SOLR-3685:
-------------------------------------

Hi,

1. Yes, but we allow only one searcher to be warmed at a time. This resource usage also belongs to the Java heap, so it cannot cause 5x as much memory as the heap to be allocated.
2. Yes, I'll open a new issue and refer to this.
3. Well, in some logs I clearly see a core attempting to download, and judging from the multiple index directories it's true. I am very sure no updates have been added to the cluster for a long time, yet it still attempts to recover. Below is a core recovering.

{code}
2012-07-30 09:48:36,970 INFO [solr.cloud.ZkController] - [main] - : We are http://nl2.index.openindex.io:8080/solr/openindex_a/ and leader is http://nl1.index.openindex.io:8080/solr/openindex_a/
2012-07-30 09:48:36,970 INFO [solr.cloud.ZkController] - [main] - : No LogReplay needed for core=openindex_a baseURL=http://nl2.index.openindex.io:8080/solr
2012-07-30 09:48:36,970 INFO [solr.cloud.ZkController] - [main] - : Core needs to recover:openindex_a
{code}

Something noteworthy may be that for some reason the index versions of all cores and their replicas don't match. After a restart the generation of a core is also different, while it shouldn't have changed. The size in bytes is also slightly different (~20 bytes).

The main concern is that Solr consumes 5x the allocated heap space in RESident memory. Caches and such are in the heap, and the MMapped index dir should be in VIRTual memory and not cause the kernel to kill the process. I'm not yet sure what's going on here. Also, according to Uwe, virtual memory should not be more than 2-3 times the index size. In our case we see ~800MB virtual memory for two 26MB cores right after start up.

We have only allocated 98MB to the heap for now and this is enough for such a small index.
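The resident versus virtual comparison above can be checked on Linux from `/proc/<pid>/status`, which reports `VmSize` (virtual) and `VmRSS` (resident) in kB. The sketch below parses that text and computes the two ratios discussed in this comment. The function names and the `SAMPLE` text are made up for illustration; the sample values were picked to echo the figures quoted in the thread (about 800MB virtual, a 98MB heap, two 26MB cores) and are not measurements.

```python
import re


def parse_status_kb(text):
    """Extract VmSize and VmRSS (in kB) from /proc/<pid>/status-style text."""
    fields = {}
    for key in ("VmSize", "VmRSS"):
        match = re.search(rf"^{key}:\s+(\d+)\s+kB", text, re.MULTILINE)
        if match:
            fields[key] = int(match.group(1))
    return fields


def memory_report(status_text, heap_mb, index_mb):
    """Ratios of virtual memory to index size and resident memory to heap size."""
    kb = parse_status_kb(status_text)
    virt_mb = kb["VmSize"] / 1024
    res_mb = kb["VmRSS"] / 1024
    return {
        "virt_over_index": round(virt_mb / index_mb, 1),
        "res_over_heap": round(res_mb / heap_mb, 1),
    }


# Invented sample text, not taken from the attached log.
SAMPLE = """Name:\tjava
VmSize:\t  819200 kB
VmRSS:\t  501760 kB
"""

print(memory_report(SAMPLE, heap_mb=98, index_mb=52))
```

On a live node the text would come from reading `/proc/<pid>/status` for the Solr JVM's process id instead of `SAMPLE`.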
[jira] [Commented] (SOLR-3685) solrcloud crashes on startup due to excessive memory consumption
[ https://issues.apache.org/jira/browse/SOLR-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13423782#comment-13423782 ]

Markus Jelsma commented on SOLR-3685:
-------------------------------------

I forgot to add that it doesn't matter whether updates are sent to the cluster. A node will start to replicate on startup even when it's up to date, and subsequently crash.
[jira] [Commented] (SOLR-3685) solrcloud crashes on startup due to excessive memory consumption
[ https://issues.apache.org/jira/browse/SOLR-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13423785#comment-13423785 ]

Uwe Schindler commented on SOLR-3685:
-------------------------------------

How much heap do you assign to Solr's Java process (-Xmx)? 512 MB of physical RAM is very little. The Jetty default is, as far as I remember, larger. If the OS kills processes with its OOM process killer, there is not much we can do, as those processes are killed with a hard SIGKILL (-9), not SIGTERM.
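The SIGKILL versus SIGTERM distinction is the reason a process gets no chance to clean up when the OOM killer picks it. A small POSIX-only sketch, assuming it runs in the main thread of a Python process: a handler can be installed for SIGTERM, but the operating system refuses one for SIGKILL. The helper name `can_install_handler` is made up for this example.

```python
import signal


def can_install_handler(signum):
    """Return True if this process may install a handler for signum."""
    try:
        previous = signal.signal(signum, lambda *_: None)
    except OSError:
        # The OS rejects handlers for signals that cannot be caught.
        return False
    if previous is not None:
        signal.signal(signum, previous)  # put the original handler back
    return True


print(can_install_handler(signal.SIGTERM))
print(can_install_handler(signal.SIGKILL))
```

The same limitation applies to any cleanup that depends on the process being told it is about to stop, such as removing leftover index directories at shutdown.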
[jira] [Commented] (SOLR-3685) solrcloud crashes on startup due to excessive memory consumption
[ https://issues.apache.org/jira/browse/SOLR-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13423786#comment-13423786 ]

Markus Jelsma commented on SOLR-3685:
-------------------------------------

I should have added this. I allocate just 98MB to the heap and 32MB to the permgen, so there is just 130MB allocated.
[jira] [Commented] (SOLR-3685) solrcloud crashes on startup due to excessive memory consumption
[ https://issues.apache.org/jira/browse/SOLR-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13423787#comment-13423787 ]

Uwe Schindler commented on SOLR-3685:
-------------------------------------

Is it a 32-bit or 64-bit JVM?
[jira] [Commented] (SOLR-3685) solrcloud crashes on startup due to excessive memory consumption
[ https://issues.apache.org/jira/browse/SOLR-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13423794#comment-13423794 ] Markus Jelsma commented on SOLR-3685:
Java 1.6.0-26 64-bit, just like Linux. I should also note that I made an error in the configuration. I thought I had reduced the DocumentCache size to 64, but the node I was testing on had a size of 1024 configured and redistributed that config over the cluster via config bootstrap. This still leaves the problem that Solr itself should run out of memory, not the OS, since the cache is part of the heap. It should also clean up old index directories. So this issue may consist of multiple problems.
[jira] [Commented] (SOLR-3685) solrcloud crashes on startup due to excessive memory consumption
[ https://issues.apache.org/jira/browse/SOLR-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13423796#comment-13423796 ] Uwe Schindler commented on SOLR-3685:
OK, I wanted to come back: from what I see, 96 MB of heap is very little for Solr. Tests are running with -Xmx512. But regarding memory consumption (Java heap OOMs), Mark Miller might know better. Solr will not use all available RAM, though: as you are on 64-bit Java, Solr defaults to MMapDirectory. I recommend reading: http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
It will allocate from the system only the heap plus what Java itself needs. Everything else is only allocated as address space to directly access the file system cache. So the real memory usage of Solr is not what top reports in the VIRT column but what it reports in the RES column (resident memory). VIRT can be much higher (multiples of system RAM) with MMapDirectory, as it only shows the virtual address space allocated. This cannot cause the kernel OOM killer to become active and kill processes; if that happens, you have too little RAM for the kernel, Solr, and tools, sorry.
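The VIRT/RES distinction described above can also be read directly from the Linux /proc filesystem, where VmSize corresponds to what top shows as VIRT and VmRSS to RES. The following is a minimal sketch, not part of the original thread; the class name ProcMemory and its parse helper are hypothetical, and it assumes a Linux host where /proc/<pid>/status is readable.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not from Solr): reads virtual and resident size of a process.
final class ProcMemory {

    /** Extracts VmSize and VmRSS (both in kB) from the lines of /proc/<pid>/status. */
    static Map<String, Long> parse(List<String> statusLines) {
        Map<String, Long> result = new LinkedHashMap<>();
        for (String line : statusLines) {
            if (line.startsWith("VmSize:") || line.startsWith("VmRSS:")) {
                // A line looks like "VmRSS:    524288 kB"; split on whitespace.
                String[] parts = line.trim().split("\\s+");
                String key = parts[0].substring(0, parts[0].length() - 1); // drop ':'
                result.put(key, Long.parseLong(parts[1]));
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Pass the PID of the Solr JVM as the first argument; defaults to this JVM.
        String pid = args.length > 0 ? args[0] : "self";
        Path status = Paths.get("/proc", pid, "status");
        Map<String, Long> mem = parse(Files.readAllLines(status, StandardCharsets.UTF_8));
        System.out.println("VIRT (VmSize) kB: " + mem.get("VmSize"));
        System.out.println("RES  (VmRSS)  kB: " + mem.get("VmRSS"));
    }
}
```

Running it against the Solr PID shortly after startup would show whether it is the resident size, rather than only the address space, that is growing.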
[jira] [Commented] (SOLR-3685) solrcloud crashes on startup due to excessive memory consumption
[ https://issues.apache.org/jira/browse/SOLR-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13423801#comment-13423801 ] Markus Jelsma commented on SOLR-3685:
Hi - I don't look at virtual memory but at RESident memory. My Solr install here will eat up to 512MB of RESIDENT MEMORY and is then killed by the OS. The virtual memory will then be almost 800MB, while both indexes are just 27MB in size. That sounds like a lot of VIRT and RES for a tiny index and a tiny heap. Also, Solr will run fine and fast with just 100MB of memory; the index is still very small. Thanks
[jira] [Commented] (SOLR-3685) solrcloud crashes on startup due to excessive memory consumption
[ https://issues.apache.org/jira/browse/SOLR-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13423821#comment-13423821 ] Uwe Schindler commented on SOLR-3685:
I have no idea what the libraries bundled with Solr do outside the heap, but as your problem seems to be related to cloud, it might be another thing in JVMs: direct memory (allocated by ByteBuffer.allocateDirect()). By default the JVM allows up to the heap size to be allocated in this space external to the heap, so your -Xmx is only half of the truth. Solr by itself does not use direct memory (only mmapped memory, but that is not resident), but I am not sure about ZooKeeper and all that cloud stuff (and maybe plugins like Tika extraction). You can limit direct memory with: -XX:MaxDirectMemorySize=size
The VIRT column can additionally contain 2-3 times your index size, depending on pending commits, merges, ... Please report back what this changes!
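To check whether direct memory is actually what is growing, the JVM's own buffer pool statistics can be queried. The sketch below is an illustration, not something used in this thread: it relies on java.lang.management.BufferPoolMXBean, which requires Java 7 or later (the reporter runs Java 1.6, where it is not available), and it assumes the HotSpot pool names "direct" and "mapped". The class name BufferPools is hypothetical.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

// Hypothetical helper: reports off-heap buffer pool usage of the current JVM.
final class BufferPools {

    /** Returns the bytes used by the named buffer pool, or -1 if no such pool exists. */
    static long memoryUsed(String poolName) {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if (pool.getName().equals(poolName)) {
                return pool.getMemoryUsed();
            }
        }
        return -1L;
    }

    public static void main(String[] args) {
        long before = memoryUsed("direct");
        // Allocate 1 MiB outside the Java heap; this counts against
        // -XX:MaxDirectMemorySize, not against -Xmx.
        ByteBuffer buf = ByteBuffer.allocateDirect(1 << 20);
        buf.put(0, (byte) 1); // touch the buffer so it stays reachable
        long after = memoryUsed("direct");
        System.out.println("direct pool before: " + before + " bytes, after: " + after + " bytes");
        System.out.println("mapped pool: " + memoryUsed("mapped") + " bytes");
    }
}
```

If the "direct" pool stays small while RES keeps climbing, direct buffers are probably not the cause, and lowering -XX:MaxDirectMemorySize would not be expected to help.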
[jira] [Commented] (SOLR-3685) solrcloud crashes on startup due to excessive memory consumption
[ https://issues.apache.org/jira/browse/SOLR-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13423841#comment-13423841 ] Markus Jelsma commented on SOLR-3685:
OK, I have increased my DocumentCache again to reproduce the problem and changed -XX:MaxDirectMemorySize from 100m to 10m, but RES is still climbing at the same rate as before, so no change. We don't use Tika, only ZooKeeper. About virtual memory: that also climbs to ~800MB, which is many times more than the index size. There are no pending commits or merges right after startup. There may be some process related to cloud replication that eats the RAM. Thanks
[jira] [Commented] (SOLR-3685) solrcloud crashes on startup due to excessive memory consumption
[ https://issues.apache.org/jira/browse/SOLR-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13424026#comment-13424026 ] Mark Miller commented on SOLR-3685:
It seems there are perhaps two or three issues here.
1. The resource usage. This may just be because replication causes two searchers to be open at the same time briefly? I really don't have any guesses at the moment.
2. On a non-graceful shutdown, old index dirs may end up left behind. We could look at cleaning them up on startup, but that should be its own issue.
3. You claim you are replicating on startup even though the shards should be in sync. You should not be replicating in that case.
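The startup cleanup suggested in point 2 could look roughly like the sketch below. This is not Solr's implementation: the class StaleIndexDirs is hypothetical, and it assumes that index directories live side by side in one data directory and are named either "index" or "index.<suffix>", with the caller already knowing which one is live. It only lists candidates and deletes nothing.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: finds leftover index directories next to the live one.
final class StaleIndexDirs {

    /** Returns the sorted names of index-like directories in dataDir other than liveDirName. */
    static List<String> findStale(Path dataDir, String liveDirName) {
        List<String> stale = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dataDir)) {
            for (Path entry : entries) {
                String name = entry.getFileName().toString();
                // Assumed naming convention: "index" or "index.<suffix>".
                boolean looksLikeIndex = name.equals("index") || name.startsWith("index.");
                // The directory check skips plain files that share the prefix.
                if (looksLikeIndex && Files.isDirectory(entry) && !name.equals(liveDirName)) {
                    stale.add(name);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Collections.sort(stale);
        return stale;
    }

    public static void main(String[] args) {
        // Usage: java StaleIndexDirs.java <dataDir> <liveIndexDirName>
        for (String name : findStale(Paths.get(args[0]), args[1])) {
            System.out.println("candidate for cleanup: " + name);
        }
    }
}
```

Listing first and deleting only after confirming which directory is live keeps such a sketch safe to run on a node that crashed in the middle of a replication.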