Question on Ignite Persistence:
On a deployed 3-node Ignite cluster, I see one node being dropped from the 
cluster because it hits long GC pauses. Worse, when this node leaves the 
cluster, a rebalance is initiated (and re-initiated when the node rejoins).
Note: the data this Ignite cluster holds is fully transactional; we cannot 
tolerate data loss.
From the logs:

[14:32:01,643][INFO][wal-file-archiver%null-#44][FsyncModeFileWriteAheadLogManager] Copied file [src=/data2/data/wal/node00-8d707f27-d022-4237-85cf-28c36828a0a3/0000000000000006.wal, dst=/data2/data/wal/archive/node00-8d707f27-d022-4237-85cf-28c36828a0a3/0000000000000306.wal]

[14:32:02,830][INFO][wal-file-archiver%null-#44][FsyncModeFileWriteAheadLogManager] Starting to copy WAL segment [absIdx=307, segIdx=7, origFile=/data2/data/wal/node00-8d707f27-d022-4237-85cf-28c36828a0a3/0000000000000007.wal, dstFile=/data2/data/wal/archive/node00-8d707f27-d022-4237-85cf-28c36828a0a3/0000000000000307.wal]

[14:32:17,999][WARNING][jvm-pause-detector-worker][IgniteKernal] Possible too long JVM pause: 15044 milliseconds.


It is clear from the logs that WAL segment writes (FSYNC mode, in this case) 
consistently precede the GC pauses.

Questions:
The only advantage of FSYNC over LOG_ONLY seems to be surviving OS-level 
crashes. With a journaled filesystem like ext4, do I really need FSYNC? Can't 
I get away with LOG_ONLY?
If not, how do I minimise the performance bottleneck of FSYNC?
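
For reference, a minimal sketch of the two options I am weighing (assuming the 
Ignite 2.x DataStorageConfiguration API; the fsync-delay value is illustrative, 
not a recommendation). My understanding is that ext4's journal only guarantees 
filesystem metadata consistency, not data still sitting in the OS page cache, 
which is why I'm unsure LOG_ONLY is safe for us:

import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.configuration.WALMode;

public class WalModeSketch {
    public static void main(String[] args) {
        DataStorageConfiguration storageCfg = new DataStorageConfiguration();

        // Native persistence on the default data region (as in our cluster).
        storageCfg.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);

        // Option A: LOG_ONLY -- commits are flushed to the OS page cache only,
        // so a process crash loses nothing, but an OS crash / power loss can
        // drop the last few committed transactions.
        storageCfg.setWalMode(WALMode.LOG_ONLY);

        // Option B: stay on FSYNC but let concurrent commits share one fsync
        // by delaying each sync slightly (1 ms here, purely illustrative):
        // storageCfg.setWalMode(WALMode.FSYNC);
        // storageCfg.setWalFsyncDelayNanos(1_000_000L);

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setDataStorageConfiguration(storageCfg);

        Ignition.start(cfg);
    }
}

If Option A is ruled out, is tuning walFsyncDelayNanos the right lever for 
Option B, or is there a better approach?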
