Thank you Mostafa.
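For reference, a minimal sketch of what the setting mentioned below might look like. worker.heap.memory.mb is a supervisor-side entry in storm.yaml (e.g. "worker.heap.memory.mb: 4096"); the per-topology knob that appears to pair with it in Storm 1.x is topology.worker.max.heap.size.mb. Both the key choice and the figure here are illustrative only, not a recommendation:

    // Sketch only (Storm 1.x, org.apache.storm.Config); 4096 MB is an
    // illustrative value. This caps the heap the resource-aware scheduler
    // assigns to each worker of this topology; the cluster-wide default
    // comes from worker.heap.memory.mb in each supervisor's storm.yaml.
    Config stormConfig = new Config();
    stormConfig.put(Config.TOPOLOGY_WORKER_MAX_HEAP_SIZE_MB, 4096.0);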
On Tue, Feb 7, 2017 at 2:25 PM, Mostafa Gomaa <[email protected]> wrote:

> I had a similar issue and I solved it by setting this option
> worker.heap.memory.mb.
>
> On Feb 7, 2017 10:45 AM, "Navin Ipe" <[email protected]> wrote:
>
>> Hi,
>>
>> Even though I ran the topology on a server with 30GB of RAM, it still
>> crashed. I had set
>>
>>     stormConfig.put(Config.TOPOLOGY_WORKER_CHILDOPTS, "-Xmx" + "15g");
>>
>> But still, when I look at the workers in htop, their virtual memory is
>> shown as 15G, yet towards the right of the screen, under the command
>> column, it shows "java -Xmx2048m" and a few other options. I assume this
>> is the command Storm used to start the worker.
>>
>> So how come my memory setting isn't being used by the worker? Why is it
>> still using 2GB instead of 15GB?
>> Also, out of the 30GB, 25GB was getting used. How could that happen when
>> I have only 4 slots and 4 workers running? The exact same topology was
>> taking up just 5GB on a system with 10GB of RAM, where I had configured
>> -Xmx to "2g".
>>
>> Could you help me understand this?
>>
>> On Mon, Feb 6, 2017 at 2:29 PM, Navin Ipe <[email protected]> wrote:
>>
>>> Thank you. I've been monitoring it via JConsole, and this is what I see:
>>>
>>> Supervisor used memory: 61MB
>>> Supervisor committed memory: 171MB
>>> Supervisor max memory: 239.1MB
>>>
>>> Nimbus used memory: 44.3MB
>>> Nimbus committed memory: 169.3MB
>>> Nimbus max memory: 954.7MB
>>>
>>> Zookeeper used memory: 224MB
>>> Zookeeper committed memory: 529MB
>>> Zookeeper max memory: 1.9GB
>>>
>>> Worker used memory: 941MB
>>> Worker committed memory: 1.4GB
>>> Worker max memory: 1.9GB
>>>
>>> So by the look of it, even if the worker memory is managed and kept low,
>>> the supervisor can crash because of low memory. The solution therefore
>>> appears to be to increase the supervisor memory in storm.yaml, use more
>>> RAM and add swap space.
>>>
>>> If you have any other opinions, please let me know.
>>>
>>> On Sun, Feb 5, 2017 at 7:10 PM, Andrea Gazzarini <[email protected]> wrote:
>>>
>>>> Hi Navin,
>>>> I think this line is a good starting point for your analysis:
>>>>
>>>>     "There is insufficient memory for the Java Runtime Environment to
>>>>     continue."
>>>>
>>>> I don't believe this scenario is caught by the JVM as a checked
>>>> exception: in my opinion it belongs to the "Error" class, and that
>>>> would explain why the catch block is never reached.
>>>> In addition, your assumption could also be right: the code that raises
>>>> the error could be anywhere in the worker, not necessarily within your
>>>> class. This is because memory errors, unlike ordinary exceptions, don't
>>>> have a deterministic point of failure; they depend on the state of the
>>>> system at a given moment.
>>>>
>>>> Please tell us a bit more about (or investigate yourself) your
>>>> architecture, nodes, hardware resources and anything else that can help
>>>> us understand your context. Tools like JVisualVM, JConsole and the
>>>> Storm UI are precious friends in these situations.
>>>>
>>>> Best,
>>>> Andrea
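To illustrate Andrea's point with a sketch (the method shape and logger call mirror Navin's bolt quoted below; only the catch clauses matter): java.lang.OutOfMemoryError extends Error, not Exception, so a catch (Exception ex) block never sees it, and the native os::commit_memory failure in the logs aborts the JVM outright rather than throwing anything catchable.

    // java.lang.Throwable
    //   +-- java.lang.Exception  <- matched by catch (Exception ex)
    //   +-- java.lang.Error      <- OutOfMemoryError lives here, so it is NOT matched
    @Override
    public void execute(Tuple tuple) {
        try {
            // ...code that emits tuples...
        } catch (Exception ex) {
            // Never reached for an OutOfMemoryError, which is why the string
            // "The exception" never shows up in the logs.
            logger.info("The exception {}, {}", ex.getCause(), ex.getMessage());
        } catch (OutOfMemoryError oom) {
            // This WOULD catch a Java-heap OOM, but a native mmap failure
            // (errno=12) makes the JVM write hs_err_pid*.log and abort before
            // any catch block gets a chance to run.
        }
    }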
>>>> On 05/02/17 12:53, Navin Ipe wrote:
>>>>
>>>> Hi,
>>>> I have a bolt which sometimes emits around 15000 tuples, and sometimes
>>>> more than 20000. I think when this happens there's a memory issue and
>>>> the workers get restarted. This is what worker.log.err contains:
>>>>
>>>> Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x00000000f1000000, 62914560, 0) failed; error='Cannot allocate memory' (errno=12)
>>>> # There is insufficient memory for the Java Runtime Environment to continue.
>>>> # Native memory allocation (mmap) failed to map 62914560 bytes for committing reserved memory.
>>>> # An error report file with more information is saved as:
>>>> # /home/storm/apache-storm-1.0.0/storm-local/workers/6a1a70ad-d094-437a-a9c5-e837fc1b3535/hs_err_pid2766.log
>>>>
>>>> The odd part is that in all my bolts I have:
>>>>
>>>> @Override
>>>> public void execute(Tuple tuple) {
>>>>     try {
>>>>         // ..some code, including the code that emits tuples
>>>>     } catch (Exception ex) {
>>>>         logger.info("The exception {}, {}", ex.getCause(), ex.getMessage());
>>>>     }
>>>> }
>>>>
>>>> But in the logs I never see the string "The exception". worker.log, however, shows:
>>>>
>>>> 2017-02-05 09:14:01.320 STDERR [INFO] Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x00000000e6f80000, 37748736, 0) failed; error='Cannot allocate memory' (errno=12)
>>>> 2017-02-05 09:14:01.320 STDERR [INFO] #
>>>> 2017-02-05 09:14:01.330 STDERR [INFO] # There is insufficient memory for the Java Runtime Environment to continue.
>>>> 2017-02-05 09:14:01.330 STDERR [INFO] # Native memory allocation (mmap) failed to map 37748736 bytes for committing reserved memory.
>>>> 2017-02-05 09:14:01.331 STDERR [INFO] # An error report file with more information is saved as:
>>>> 2017-02-05 09:14:01.331 STDERR [INFO] # /home/storm/apache-storm-1.0.0/storm-local/workers/2685b445-c4a9-4f7e-94e1-1ce3fe13de47/hs_err_pid3022.log
>>>> 2017-02-05 09:14:06.904 o.a.s.d.worker [INFO] Launching worker for HydraCellGen-138-1486283223 on 3fc3c05e-9769-4033-bf7d-df609d6c4963:6701 with id 575bd7ed-a3fc-4f7f-a7d0-cdd4054c9fc5 and conf {"topology.builtin.metrics.bucket.size.secs" 60, "nimbus.childopts" "-Xmx1024m", ... etc.
>>>>
>>>> These are the settings I'm using for the topology:
>>>>
>>>> Config stormConfig = new Config();
>>>> stormConfig.setNumWorkers(20);
>>>> stormConfig.setNumAckers(20);
>>>> stormConfig.put(Config.TOPOLOGY_DEBUG, false);
>>>> stormConfig.put(Config.TOPOLOGY_TRANSFER_BUFFER_SIZE, 1024);
>>>> stormConfig.put(Config.TOPOLOGY_EXECUTOR_RECEIVE_BUFFER_SIZE, 65536);
>>>> stormConfig.put(Config.TOPOLOGY_EXECUTOR_SEND_BUFFER_SIZE, 65536);
>>>> stormConfig.put(Config.TOPOLOGY_MAX_SPOUT_PENDING, 2);
>>>> stormConfig.put(Config.TOPOLOGY_MESSAGE_TIMEOUT_SECS, 2200);
>>>> stormConfig.put(Config.STORM_ZOOKEEPER_SERVERS, Arrays.asList(new String[]{"localhost"}));
>>>> stormConfig.put(Config.TOPOLOGY_WORKER_CHILDOPTS, "-Xmx" + "2g");
>>>>
>>>> So am I right in assuming the exception is not thrown in my code but
>>>> somewhere else in the worker process? Do such errors happen when the
>>>> worker can't keep up with the number of tuples in its queue?
>>>> What can I do to avoid this problem?
>>>>
>>>> --
>>>> Regards,
>>>> Navin
>>>
>>> --
>>> Regards,
>>> Navin
>>
>> --
>> Regards,
>> Navin

--
Regards,
Navin
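On the unanswered -Xmx question above: as far as I can tell, the supervisor builds each worker's java command line from the cluster-wide worker.childopts plus the topology's topology.worker.childopts, so both -Xmx flags can end up on the same command (HotSpot takes the last one) and htop's command column truncates it; ps can show the full line. The arithmetic also matters: 4 workers that are each allowed a 15 GB heap add up to 60 GB of potential heap on a 30 GB machine, before counting off-heap buffers and metaspace, which is exactly where native allocations start failing with errno=12. A sketch with illustrative numbers only:

    // Sketch only: keep numWorkers x max heap below physical RAM, leaving
    // headroom for off-heap buffers, metaspace, and the other daemons.
    // e.g. 4 workers x 6 GB = 24 GB of maximum heap on a 30 GB machine.
    Config stormConfig = new Config();
    stormConfig.setNumWorkers(4);
    stormConfig.put(Config.TOPOLOGY_WORKER_CHILDOPTS, "-Xms6g -Xmx6g");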
