Hello everyone,

I ran some tests following Hamed Zahedifar's suggestion about -XX:+AlwaysPreTouch and the Red Hat thread he pointed to. For now, I simply removed -XX:+AlwaysPreTouch from the JVM startup command.

The restart is now dramatically faster than before. With the same amount of data, around 4 GB, the node starts in less than 15 seconds, whereas with the -XX:+AlwaysPreTouch flag it did not finish starting up even after 20 minutes.

I am completely satisfied with the performance now. Thanks everyone for the support, and especially Hamed for pointing out the flag's behaviour.
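In case it helps anyone hitting the same problem, here is a minimal sketch for checking whether a running JVM was started with -XX:+AlwaysPreTouch. It is not part of my actual setup: the class name PreTouchCheck and the helper hasPreTouch are made up for illustration, and I am assuming that the last occurrence of the flag on the command line is the one that counts. It only uses the standard java.lang.management API.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

final class PreTouchCheck {

    // Hypothetical helper: returns true if the given JVM arguments enable
    // AlwaysPreTouch. Assumption: when the flag appears several times,
    // the last occurrence wins.
    public static boolean hasPreTouch(List<String> jvmArgs) {
        boolean enabled = false;
        for (String arg : jvmArgs) {
            if (arg.equals("-XX:+AlwaysPreTouch")) {
                enabled = true;
            } else if (arg.equals("-XX:-AlwaysPreTouch")) {
                enabled = false;
            }
        }
        return enabled;
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM itself (not to the main class).
        List<String> jvmArgs =
            ManagementFactory.getRuntimeMXBean().getInputArguments();
        long maxHeapMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);

        System.out.println("Max heap (MB): " + maxHeapMb);
        System.out.println("AlwaysPreTouch on command line: "
            + hasPreTouch(jvmArgs));
    }
}
```

This only inspects the command line, so a flag set through another mechanism (an options file, for instance) may not show up here.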
Cheers
Gianluca

On Tue, Oct 2, 2018 at 10:31, Gianluca Bonetti <gianluca.bone...@gmail.com> wrote:

> Hello everyone
>
> This is my first question to the mailing list, which I have been
> following for some time to get hints about using Ignite.
> Until now I have used it in other software projects, and Ignite always
> rocked and made the difference, hence I literally love it :)
>
> Now I am having trouble restarting an Apache Ignite instance in a new
> product we are developing and testing.
> Previously, I developed with Apache Ignite using a custom loader from a
> database, but this time we wanted to go with a "cache centric" approach
> and use only Ignite Persistence, as there is no need to integrate with
> databases or JDBC tools.
> So the Ignite instance is the main and only storage.
>
> The software is a monitoring platform, which receives small chunks of
> data (more or less 500 bytes each) and stores them in different caches,
> depending on the source address.
> The number of incoming data packets is really low as we are only in
> testing, let's say around 100 packets per minute.
> The software is running in a testing environment, so only one server is
> deployed at the moment.
>
> The software can run for weeks with no problem; the caches get bigger
> and bigger and everything runs fine and fast.
> Then, if we restart the software, it takes ages to restart, and most of
> the time it never completes the initial restart of Ignite.
> So we have to delete the persistence storage files to be able to start
> again.
> As we are only in testing, we can still live with it.
>
> We get just one message in the logs: "Ignite node stopped in the middle
> of checkpoint. Will restore memory state and finish checkpoint on node
> start."
> The client instances connecting to Ignite get the log:
> "org.apache.ignite.logger.java.JavaLogger.info Join cluster while
> cluster state transition is in progress, waiting when transition
> finish."
> But it never finishes.
>
> Speaking of sizes, when running tests with no interruption, the cache
> grew up to 50 GB, with no degradation in performance and no data loss.
> The restart issues already appear once the cache grows to ~4 GB.
> The other software I developed using Ignite, with a custom database
> loader, never had problems with large caches in memory.
>
> The testing server is a dedicated Linux machine with an 8-core Xeon
> processor, 64 GB RAM, and SATA disks on software mdraid.
> The JVM is OpenJDK 8, started with "-server -Xms24g -Xmx24g
> -XX:MaxMetaspaceSize=1g -XX:+AlwaysPreTouch -XX:+UseG1GC
> -XX:+ScavengeBeforeFullGC -XX:+DisableExplicitGC -XX:+AggressiveOpts"
>
> For starting the Ignite instance, I am one of those (the last?) who
> prefer Java code over XML files.
> I recently switched off PeerClassLoading and added the
> BinaryTypeConfiguration, which I hadn't specified previously, but it
> didn't help.
>
> public static final Ignite newInstance(List<String> remotes) {
>     DataStorageConfiguration storage = new DataStorageConfiguration();
>     DataRegionConfiguration region =
>         storage.getDefaultDataRegionConfiguration();
>     BinaryConfiguration binary = new BinaryConfiguration();
>     TcpDiscoveryVmIpFinder finder = new TcpDiscoveryVmIpFinder();
>     TcpDiscoverySpi discovery = new TcpDiscoverySpi();
>     IgniteConfiguration config = new IgniteConfiguration();
>
>     // Persistence: data, WAL and WAL archive directories
>     storage.setStoragePath("/home/ignite/data");
>     storage.setWalPath("/home/ignite/wal");
>     storage.setWalArchivePath("/home/ignite/archive");
>
>     // Default data region: persistent, fixed at 16 GB
>     region.setPersistenceEnabled(true);
>     region.setInitialSize(16L * 1024 * 1024 * 1024);
>     region.setMaxSize(16L * 1024 * 1024 * 1024);
>
>     binary.setCompactFooter(false);
>     binary.setTypeConfigurations(Arrays.asList(
>         new BinaryTypeConfiguration(Datum.class.getCanonicalName())));
>
>     // Static IP discovery
>     finder.setAddresses(remotes);
>     discovery.setIpFinder(finder);
>
>     config.setDataStorageConfiguration(storage);
>     config.setBinaryConfiguration(binary);
>     config.setPeerClassLoadingEnabled(false);
>     config.setDiscoverySpi(discovery);
>     config.setClientMode(false);
>
>     // Start the server node and activate the cluster
>     Ignite ignite = Ignition.start(config);
>     ignite.cluster().active(true);
>     return ignite;
> }
>
> Datum is a small POJO class with nearly 100 fields, and should hold
> less than 500 bytes of data.
> Then there are nearly 200 caches in use, all containing Datum objects
> (at least for now).
>
> I am quite sure I am missing something when starting the instance, but
> I cannot understand what.
>
> Is there a way to inspect the progress of the checkpoint at startup?
> I cannot do anything with Ignite Visor, as it will not connect until
> the cluster activation finishes.
>
> If you have any suggestions, let me know.
>
> Thank you very much!
> Best regards
> Gianluca