Re: Ignite 2.7 Errors
Philip, if you can tolerate such huge JVM pauses, then there is no point in investigating the stack traces any further. Just increase the systemWorkerBlockedTimeout parameter of IgniteConfiguration appropriately, as described in https://apacheignite.readme.io/docs/critical-failures-handling#section-critical-workers-health-check, and Ignite 2.7 won't report these failures.

Thu, Mar 21, 2019 at 20:42, Philip Wu:
> Hello, Andrey -
>
> To your 2nd question: on Ignite 2.5 we had 15+ minute JVM pauses as well,
> but no IgniteException; it was working fine.
>
> 2019-03-15 19:08:46,088 WARNING [ (jvm-pause-detector-worker)] Possible too
> long JVM pause: 1001113 milliseconds.
>
> 2019-03-15 19:08:46,280 INFO [IgniteKernal%XXXGrid
> (grid-timeout-worker-#71%XXXGrid%)]
> Metrics for local node (to disable set 'metricsLogFrequency' to 0)
>   ^-- Node [id=1dc0de55, name=EnfusionGrid, uptime=01:49:39.992]
>   ^-- H/N/C [hosts=1, nodes=1, CPUs=32]
>   ^-- CPU [cur=100%, avg=39.77%, GC=1042.83%]
>   ^-- PageMemory [pages=2300496]
>   ^-- Heap [used=295123MB, free=14.48%, comm=345088MB]
>   ^-- Non heap [used=442MB, free=-1%, comm=463MB]
>   ^-- Outbound messages queue [size=0]
>   ^-- Public thread pool [active=0, idle=0, qSize=0]
>   ^-- System thread pool [active=0, idle=1, qSize=0]
> 2019-03-15 19:08:46,280 INFO [IgniteKernal%XXXGrid
> (grid-timeout-worker-#71%XXXGrid%)] FreeList [name=XXXGrid, buckets=256,
> dataPages=1, reusePages=0]
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/

--
Best regards,
Andrey Kuznetsov.
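For reference, the timeout can also be raised programmatically on IgniteConfiguration. A minimal sketch; the 20-minute value is an arbitrary example and should be chosen larger than the worst observed pause:

```java
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;

public class TimeoutConfig {
    public static void main(String[] args) {
        IgniteConfiguration cfg = new IgniteConfiguration();

        // Allow system-critical workers to stay unresponsive for up to
        // 20 minutes (value is in milliseconds) before a
        // SYSTEM_WORKER_BLOCKED failure is reported.
        cfg.setSystemWorkerBlockedTimeout(20 * 60 * 1000L);

        Ignition.start(cfg);
    }
}
```

The same property can be set in Spring XML configuration if that is how the node is configured.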
Re: Ignite 2.7 Errors
Sorry, my mistake. I meant the last message you provided, but it doesn't contain a stack trace, only brief thread information. Anyway, 15-minute-long JVM pauses are suspicious. Do you see the same message on 2.5 or 2.6?

Best regards,
Andrey Kuznetsov.

Thu, Mar 21, 2019, 19:49 Philip Wu p...@enfusionsystems.com:
> Before that, there was:
>
> 2019-03-20 22:28:45,028 WARNING [G (tcp-disco-msg-worker-#2%XXXGrid%)]
> Thread [name="grid-nio-worker-tcp-comm-1-#73%XXXGrid%", id=415,
> state=RUNNABLE, blockCnt=0, waitCnt=0]
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
Re: Ignite 2.7 Errors
Hi, Philip! There should be a stack trace of the blocked worker itself in the log, at WARN level, before the message you cite but after "Blocked system-critical thread has been detected." Could you please share that trace? It would help us understand the failure cause.

Best regards,
Andrey Kuznetsov.

Thu, Mar 21, 2019, 18:34 Ilya Kasnacheev ilya.kasnach...@gmail.com:
> Hello!
>
> With the NoOp handler this should be a purely cosmetic message.
>
> Regards,
> --
> Ilya Kasnacheev
>
> Thu, Mar 21, 2019 at 18:15, Philip Wu:
>
>> Thanks, Ilya!
>>
>> Actually it happened in the PROD system again last night ... even with
>> NoOpFailureHandler.
>>
>> I am rolling back to Ignite 2.5 or 2.6 for now. Thanks!
>>
>> 2019-03-20 22:28:45,044 SEVERE [ (tcp-disco-msg-worker-#2%XXXGrid%)]
>> Critical system error detected. Will be handled accordingly to configured
>> handler [hnd=NoOpFailureHandler [super=AbstractFailureHandler
>> [ignoredFailureTypes=[SYSTEM_WORKER_BLOCKED]]], failureCtx=FailureContext
>> [type=SYSTEM_WORKER_BLOCKED, err=class o.a.i.IgniteException: GridWorker
>> [name=grid-nio-worker-tcp-comm-1, igniteInstanceName=XXXGrid,
>> finished=false, heartbeatTs=1553137996031]]]: class
>> org.apache.ignite.IgniteException: GridWorker
>> [name=grid-nio-worker-tcp-comm-1, igniteInstanceName=XXXGrid,
>> finished=false, heartbeatTs=1553137996031]
>>     at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1831)
>>     at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1826)
>>     at org.apache.ignite.internal.worker.WorkersRegistry.onIdle(WorkersRegistry.java:233)
>>     at org.apache.ignite.internal.util.worker.GridWorker.onIdle(GridWorker.java:297)
>>     at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.lambda$new$0(ServerImpl.java:2663)
>>     at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorker.body(ServerImpl.java:7181)
>>     at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:2700)
>>     at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
>>     at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerThread.body(ServerImpl.java:7119)
>>     at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
>>
>> --
>> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
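For completeness, the NoOpFailureHandler visible in the log above is installed on the node configuration. A sketch, assuming programmatic (non-Spring) configuration:

```java
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.failure.NoOpFailureHandler;

public class NoOpHandlerConfig {
    public static void main(String[] args) {
        IgniteConfiguration cfg = new IgniteConfiguration();

        // Log critical failures (such as SYSTEM_WORKER_BLOCKED) but take
        // no action; the node keeps running, so the SEVERE message in the
        // log above is cosmetic, as Ilya notes.
        cfg.setFailureHandler(new NoOpFailureHandler());

        Ignition.start(cfg);
    }
}
```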
Re: IgniteDataStreamer.flush() returns before all futures are completed
Thanks, David. I've created a ticket [1] to update the javadoc for flush(). The behavior you describe looks rather natural: flush() should not track all previously completed add operations. So one possible way to ensure all data were added is to check that flush() finishes without an exception, and then check that get() doesn't throw for each addData() future (this part won't block).

[1] https://issues.apache.org/jira/browse/IGNITE-8406

2018-04-26 20:47 GMT+03:00 David Harvey <dhar...@jobcase.com>:
> "Could you please refine what kind of flaw do you suspect?"
>
> I think the basic flaw is that it is unclear how to write a simple task to
> stream in some data, and then confirm that all of that data was
> successfully stored before taking an action to record that. This may be
> simply a documentation issue for flush(), where I can't tell what the
> intended design would be for such code.
>
> We ran into this issue because we assumed that we needed to test the
> status of all of the futures returned by addData, and we found that the
> listener was not always called before flush() returned.
>
> As I dig deeper into the code, I see an attempt to cause flush() to throw
> an exception if any exception was thrown on the server for *any* prior
> record. If that is the intent (which is not stated but would be
> convenient), then I think there is a race:
>
> - DataStreamerImpl.flush() calls enterBusy() while activeFuts is non-empty.
>   This seems to be the last test of "cancelled". If there were failed
>   requests before this, flush() would throw.
> - Before doFlush() looks at activeFuts, activeFuts becomes empty, because
>   the requests failed.
> - flush() returns without throwing an exception.
>
> -DH

--
Best regards,
Andrey Kuznetsov.
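The check described above can be sketched as follows. This is a usage sketch, not code from the thread; it assumes an already started Ignite instance and a cache named "mycache":

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteDataStreamer;
import org.apache.ignite.lang.IgniteFuture;

public class StreamerCheck {
    static void loadAndVerify(Ignite ignite) {
        List<IgniteFuture<?>> futs = new ArrayList<>();

        try (IgniteDataStreamer<Long, Long> streamer = ignite.dataStreamer("mycache")) {
            for (long k = 0; k < 100; k++)
                futs.add(streamer.addData(k, k));

            // Throws if a failure has already been detected.
            streamer.flush();
        }

        // After flush() the futures are done, so get() won't block here;
        // it throws if the corresponding batch failed on the server side.
        for (IgniteFuture<?> fut : futs)
            fut.get();
    }
}
```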
Re: IgniteDataStreamer.flush() returns before all futures are completed
Hi, David. As far as I can see from the streamer implementation, a flush() call:

1. won't return until all futures returned by addData() have (successfully) finished;
2. will throw an exception if at least one future has failed.

Could you please clarify what kind of flaw you suspect? As for the callback added by listen(), it is called right _after_ the future reaches the 'done' state, so flush() completion does not guarantee the callback's apply() has completed.

2018-04-26 16:37 GMT+03:00 David Harvey <dhar...@jobcase.com>:
> Thanks, Yakov. I think this is more subtle.
>
> Our loading via IgniteDataStreamer is idempotent, but this depends on
> being certain that a batch of work has successfully completed. It is
> *not* sufficient for us to listen to the futures returned by addData,
> then to call flush(), and then to record success if there have been no
> exceptions. We must wait until get() is called on every future before
> recording that the operation was successful. The fact that the future is
> done is not sufficient; we need to know that it is done and there is no
> exception. We can call flush() and then do a future.get() on the incomplete
> futures, but not as an assert. (It is valid to assert that fut.isDone(),
> but that is not sufficient.)
>
> Based on my current understanding, I think this is a flaw in Ignite, even
> if the fix might only be to clarify the comments for flush() to make this
> behavior clear.

--
Best regards,
Andrey Kuznetsov.
Re: Data Streamer not flushing data to cache
Indeed, the only reliable way is flush()/close(). A nonzero automatic flush frequency doesn't provide the same guarantee.

2018-03-31 21:11 GMT+03:00 begineer <redni...@gmail.com>:
> One more query: would it never flush the data if nothing more is added to
> the streamer and the current size is less than the buffer size?
> What is the default time? I can see only the flush frequency.
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/

--
Best regards,
Andrey Kuznetsov.
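The automatic flush frequency mentioned above is set on the streamer itself. A sketch under the same assumptions as the thread (cache named "mycache"; the 1-second value is illustrative); note that it only bounds how long entries sit in the buffer, while close()/flush() remain the hard guarantee:

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteDataStreamer;

public class AutoFlushExample {
    static void configure(Ignite ignite) {
        try (IgniteDataStreamer<Long, Long> streamer = ignite.dataStreamer("mycache")) {
            // Flush buffered entries roughly every second (milliseconds).
            // 0 (the default) disables time-based flushing: data leaves the
            // buffer only when it fills up, or on explicit flush()/close().
            streamer.autoFlushFrequency(1_000);

            streamer.addData(1L, 1L);
        } // close() performs the final, guaranteed flush.
    }
}
```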
Re: Data Streamer not flushing data to cache
Hello! The simplest way to ensure your data have made it to the cache is to use IgniteDataStreamer in a try-with-resources block. In some rare scenarios it can make sense to call flush() or close() on the streamer instance directly.

2018-03-31 12:38 GMT+03:00 begineer <redni...@gmail.com>:
> Hi, this must be something very simple. I am adding 100 items to the data
> streamer, but it is not flushing items to the cache. Is there a setting
> which enables it? Cache size is zero. Am I doing something wrong?
>
> public class DataStreamerExample {
>     public static void main(String[] args) throws InterruptedException {
>         Ignite ignite = Ignition.start("examples/config/example-ignite.xml");
>         CacheConfiguration<Long, Long> config = new CacheConfiguration<>("mycache");
>         IgniteCache<Long, Long> cache = ignite.getOrCreateCache(config);
>         IgniteDataStreamer<Long, Long> streamer = ignite.dataStreamer("mycache");
>         LongStream.range(1, 100).forEach(l -> {
>             System.out.println("Adding to streamer " + l);
>             streamer.addData(l, l);
>         });
>         System.out.println(streamer.perNodeBufferSize());
>         System.out.println("Cache size : " + cache.size(CachePeekMode.ALL));
>         cache.query(new ScanQuery<>()).getAll().stream().forEach(entry -> {
>             System.out.println("cache Entry: " + entry.getKey() + " " + entry.getValue());
>         });
>     }
> }
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/

--
Best regards,
Andrey Kuznetsov.
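A corrected version of the quoted example, using try-with-resources as suggested so the streamer is closed (and therefore flushed) before the cache is read. Cache name, config path, and key range are taken from the original snippet:

```java
import java.util.stream.LongStream;

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.IgniteDataStreamer;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.CachePeekMode;
import org.apache.ignite.configuration.CacheConfiguration;

public class DataStreamerExampleFixed {
    public static void main(String[] args) {
        Ignite ignite = Ignition.start("examples/config/example-ignite.xml");

        CacheConfiguration<Long, Long> config = new CacheConfiguration<>("mycache");
        IgniteCache<Long, Long> cache = ignite.getOrCreateCache(config);

        // try-with-resources closes the streamer, which flushes any
        // remaining buffered entries into the cache before we read it.
        try (IgniteDataStreamer<Long, Long> streamer = ignite.dataStreamer("mycache")) {
            LongStream.range(1, 100).forEach(l -> streamer.addData(l, l));
        }

        // The cache is now guaranteed to contain the streamed entries.
        System.out.println("Cache size: " + cache.size(CachePeekMode.ALL));
    }
}
```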