Hi Mike,
You are right, we have the special NIO.2 filesystem that makes fsync a no-op in 90% of all cases. This works fine with Lucene, but Solr does not use the virtual filesystem: it just copies the path name of the temp directory as a string and puts it into the default directory factory through its solrconfig.xml file. So there is no way to capture fsyncs, because Solr uses the plain default filesystem. We should work on a solution for this, as it may speed up tests dramatically.

In the meantime I did "apt install eatmydata" (http://manpages.ubuntu.com/manpages/bionic/man1/eatmydata.1.html). This makes it easy to hide all fsyncs. We can just add this to the Jenkins config for new jobs in the job environment plugin, so no Jenkins job fsyncs: LD_PRELOAD=libeatmydata.so

This trick may be interesting for others, too. Steve Rowe?

To test the difference, I will now run the Jenkins server for a day, measure the number of reads/writes from the SMART output, and then enable this for the Linux jobs (it's easy in the Groovy file that selects the random JVM). The VMs for Windows, Mac, and Solaris already have their virtual disks configured to ignore any device syncs.

Uwe

-----
Uwe Schindler
Achterdiek 19, D-28357 Bremen
https://www.thetaphi.de
eMail: u...@thetaphi.de

From: Michael McCandless <luc...@mikemccandless.com>
Sent: Saturday, August 31, 2019 1:32 PM
To: Lucene/Solr dev <dev@lucene.apache.org>
Subject: Re: NVMe - SSD shredding due to Lucene :-)

Nice to know :) Thanks for upgrading Uwe.

I thought we randomly disable fsync in tests just to protect our precious SSDs?
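For reference, the eatmydata trick can be used either through its wrapper script or by preloading the library directly, as suggested above for the Jenkins job environment. A minimal sketch (the `ant test` command and the exact `.so` name/path are assumptions; the library location varies by distribution):

```shell
# Option 1: wrap a single command. The eatmydata helper script preloads
# the library, turning fsync/fdatasync/msync/sync into no-ops for the
# whole process tree of that command.
eatmydata ant test

# Option 2: set LD_PRELOAD directly (e.g. in the Jenkins job environment
# plugin), so every process started in that environment inherits it.
export LD_PRELOAD=libeatmydata.so
ant test
```

Note this disables durability guarantees for everything run under it, which is exactly what we want for throwaway test workspaces, but it should never leak into jobs that write data we care about.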
Mike McCandless
http://blog.mikemccandless.com

On Sat, Aug 31, 2019 at 6:20 AM Uwe Schindler <u...@thetaphi.de> wrote:

Hi all,

I just wanted to inform you that I asked the provider of the Policeman Jenkins server to replace the first of two NVMe SSDs, because it failed with fatal warnings due to too many writes and no more spare sectors:

> root@serv1 ~ # nvme smart-log /dev/nvme0
> Smart Log for NVME device:nvme0 namespace-id:ffffffff
> critical_warning                    : 0x1
> temperature                         : 76 C
> available_spare                     : 2%
> available_spare_threshold           : 10%
> percentage_used                     : 67%
> data_units_read                     : 62,129,054
> data_units_written                  : 648,788,135
> host_read_commands                  : 6,426,997,226
> host_write_commands                 : 5,582,107,803
> controller_busy_time                : 86,754
> power_cycles                        : 21
> power_on_hours                      : 20,252
> unsafe_shutdowns                    : 16
> media_errors                        : 0
> num_err_log_entries                 : 0
> Warning Temperature Time            : 7855
> Critical Composite Temperature Time : 0
> Temperature Sensor 1                : 76 C
> Thermal Management T1 Trans Count   : 0
> Thermal Management T2 Trans Count   : 0
> Thermal Management T1 Total Time    : 0
> Thermal Management T2 Total Time    : 0

The second one looks a bit better, but will be changed later, too. I have no idea what a data unit is (512 bytes, 2048 bytes, ... - I think one LBA). So we are really shredding SSDs with Lucene tests 😊

Uwe

P.S.: The replacement is currently going on...

-----
Uwe Schindler
Achterdiek 19, D-28357 Bremen
https://www.thetaphi.de
eMail: u...@thetaphi.de
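On the "what is a data unit" question: per the NVM Express base specification, the SMART data_units_read/written counters are reported in thousands of 512-byte units (independent of the LBA size). A quick back-of-the-envelope conversion of the counter above (the counter value is from the smart-log output; the 1000 × 512 factor is the spec's definition):

```shell
# Convert the NVMe SMART counter into bytes and TiB.
# One "data unit" = 1000 * 512 bytes per the NVMe base spec.
units_written=648788135
bytes_written=$(( units_written * 1000 * 512 ))
tib_written=$(( bytes_written / (1024 * 1024 * 1024 * 1024) ))
echo "written: ${bytes_written} bytes (~${tib_written} TiB)"
# → written: 332179525120000 bytes (~302 TiB)
```

So that single drive has absorbed roughly 300 TiB of writes, which makes the worn-out spare sectors unsurprising.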