Just for reference, I tried to measure the disk performance of the host I am
using (a 4-year-old laptop):

dd if=libjvm-stripped.so of=/dev/null iflag=direct ibs=512   # default dd block size
dd: error reading 'libjvm-stripped.so': Invalid argument
33534+1 records in
33534+1 records out
17169688 bytes (17 MB, 16 MiB) copied, 3.45808 s, 5.0 MB/s

dd if=r/libjvm-stripped.so of=/dev/null iflag=direct ibs=64K
dd: error reading 'libjvm-stripped.so': Invalid argument
261+1 records in
33534+1 records out
17169688 bytes (17 MB, 16 MiB) copied, 0.0453535 s, 379 MB/s

As you can see, with the default block size of 512 bytes dd reads the data at
a speed similar to what I observed when testing the OSv Java image with mfs.
On the other hand, the same .so file (17 MB) is read in ~45 ms with a block
size of 64K.
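If anyone wants to reproduce the measurement without dd, below is a minimal
host-side C++ sketch of the same test (my assumptions: Linux, g++ which
defines _GNU_SOURCE so O_DIRECT is visible, and the libjvm-stripped.so file
from above; this is not OSv code, just an illustration of how the read block
size drives throughput):

#include <fcntl.h>
#include <unistd.h>
#include <cstdio>
#include <cstdlib>
#include <chrono>

int main(int argc, char** argv) {
    const char* path = argc > 1 ? argv[1] : "libjvm-stripped.so";
    // Block size to read with, e.g. 512 or 65536 (64K).
    size_t block = argc > 2 ? strtoull(argv[2], nullptr, 0) : 512;

    int fd = open(path, O_RDONLY | O_DIRECT);  // bypass the page cache, like iflag=direct
    if (fd < 0) { perror("open"); return 1; }

    void* buf = nullptr;
    if (posix_memalign(&buf, 4096, block) != 0) {  // O_DIRECT needs an aligned buffer
        return 1;
    }

    size_t total = 0;
    auto start = std::chrono::steady_clock::now();
    ssize_t n;
    // The final partial block may fail under O_DIRECT because of its size
    // restrictions, which is presumably the "Invalid argument" dd printed above.
    while ((n = read(fd, buf, block)) > 0) {
        total += n;
    }
    auto end = std::chrono::steady_clock::now();

    double secs = std::chrono::duration<double>(end - start).count();
    printf("%zu bytes copied, %.4f s, %.1f MB/s\n", total, secs, total / secs / 1e6);

    free(buf);
    close(fd);
    return 0;
}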
Waldek

On Sunday, November 19, 2017 at 10:52:18 AM UTC-5, Waldek Kozaczuk wrote:
>
> For reference I am including the link to my mfs branch -
> https://github.com/wkozaczuk/osv/tree/readonly_mfs - that includes all my
> changes.
>
> Also, I forgot to mention that I discovered that mfs does not support
> links to directories. I had to manually tweak the java image.
>
> On Sunday, November 19, 2017 at 10:45:19 AM UTC-5, Waldek Kozaczuk wrote:
>>
>> I have applied the patch produced by James Root. I had to manually tweak
>> the Makefile and change gen-mfs-img.py to make it reuse manifest_common.py.
>>
>> Overall I think it works well and as designed. BTW, I can send a new MFS
>> patch, which would be the original one plus my changes explained above.
>>
>> However, I ran various tests to benchmark it, and the results were not as
>> great as I expected, especially for bigger images like Java. On the other
>> hand, it is much faster than ZFS for small images. Please see the numbers
>> below.
>>
>> In essence, I tested a simple minimal Java "hello world" (./scripts/build
>> image=openjdk9-java-base,java-example fs=<fs>) and a native one
>> (./scripts/build image=native-example fs=<fs>). I also added a stopwatch
>> around the Java app main (modules/java-base/java.cc). I assumed that
>> loading and executing it (under 10K) would be negligible. Also, I did not
>> measure the whole run by doing "time ./scripts/run ..." as this would
>> include loading and starting QEMU, which is beside what we need to
>> measure.
>>
>> As you can see, both the native ramfs and mfs images boot in 100-110 ms,
>> whereas zfs adds ~1,400 ms (it seems my machine is slow, as others report
>> similar small images with ZFS booting in 600-700 ms).
>>
>> Java, on the other hand, was different. The fastest was ramfs, second
>> zfs, and slowest mfs. In both the ramfs and zfs tests the Java app was
>> able to run in ~200 ms (which probably includes loading libjvm.so (17 MB)
>> and other files, initializing the JVM, compiling bytecode to native code
>> and executing, which makes 200 ms pretty impressive). In the mfs case, as
>> you can see, the boot time is very fast since the fs is mounted rapidly,
>> but the Java app execution takes over 3 seconds :-(
>>
>> I added more stopwatches in the mfs code around mfsvnops.cc/mfs_read() to
>> see where the time gets spent. In the case of the Java example it loads
>> 15 files/inodes, with 2 bulky ones.
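>>
>> The instrumentation is nothing fancy; roughly something like the sketch
>> below wrapped around the read path (illustrative only - the real
>> mfs_read() signature and inode fields in the patch may differ):
>>
>> // Sketch: per-inode call counts and accumulated time for mfs_read().
>> #include <chrono>
>> #include <unordered_map>
>> #include <cstdint>
>> #include <cstdio>
>>
>> struct read_stats {
>>     uint64_t calls  = 0;
>>     double   millis = 0;
>> };
>>
>> // Keyed by inode number; dumped at the end of the run.
>> static std::unordered_map<uint64_t, read_stats> mfs_read_stats;
>>
>> // Inside mfs_read(), wrap the existing logic (bread() + uiomove()) with:
>> //
>> //   auto t0 = std::chrono::steady_clock::now();
>> //   ... existing read logic ...
>> //   auto t1 = std::chrono::steady_clock::now();
>> //   auto& s = mfs_read_stats[inode_number];
>> //   s.calls++;
>> //   s.millis += std::chrono::duration<double, std::milli>(t1 - t0).count();
>>
>> static void dump_mfs_read_stats() {
>>     for (auto& kv : mfs_read_stats) {
>>         printf("%lu of inode %lu, %.2f ms\n",
>>                (unsigned long)kv.second.calls, (unsigned long)kv.first,
>>                kv.second.millis);
>>     }
>> }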
>> Please see the count of mfs_read calls per inode:
>>
>> 5 of inode 110
>> 51 of inode 124
>> 28 of inode 125
>> 32 of inode 127
>> *630 of inode 128   // 17MB of /usr/lib/jvm/java/lib/server/libjvm.so*
>> *2457 of inode 131  // 8MB of /usr/lib/jvm/java/lib/modules (Java 9 replacement of rt.jar)*
>> 17 of inode 133
>> 31 of inode 134
>> 34 of inode 135
>> 1 of inode 32
>> 35 of inode 5
>> 1 of inode 55
>> 24 of inode 87
>> 19 of inode 88
>> 53 of inode 9
>>
>> Obviously, with this much printing to the console the timings get
>> affected; the entire Java run takes ~9,000 ms, of which 5,586.47 ms is
>> spent in mfs_read.
>>
>> As I understand it, during execution of the Java mfs image dlopen loads
>> the libjvm.so file, which involves calling mfs_read (4096 bytes at a
>> time), where each call involves 8 calls to bread. So it looks like the
>> system ends up reading 8 times 2457 blocks of 512 bytes (the math does
>> not seem to work out, as that means only ~9,828 KB of libjvm.so gets
>> loaded - shouldn't dlopen read all 17 MB of libjvm.so?). Please see the
>> attached java_mfs2.log, which has all the stopwatches (inode 131 -
>> libjvm.so).
>>
>> Is there a way to speed it up? Is loading 512 bytes (a single block) at a
>> time the only way to load data from disk? Why does the ramfs image load
>> all its files in real mode in under 200 ms - see the "disk read (real
>> mode): 148.11ms, (+148.11ms)" line in the bootchart of the ramfs Java
>> benchmark - while mfs takes over 3 seconds to load less data? Is it
>> because it goes through all the layers?
>>
>> Would it help if the entire file was loaded into memory at once on the
>> first read and then eventually released? Is it possible to load content
>> from disk in chunks bigger than 512 bytes?
>>
>> I also wonder whether it would help if information about the inodes, in
>> the form of some kind of map (file path -> inode), was preloaded when
>> mounting mfs.
>>
>> Any ideas?
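>>
>> To illustrate the "bigger chunks" question above: instead of issuing one
>> bread() per 512-byte block, the read path could fetch a whole run of
>> consecutive blocks with a single device request, roughly like the sketch
>> below (bread_n() is a hypothetical helper, not an existing OSv API, and
>> the real buffer-cache/strategy interface may look different):
>>
>> // Sketch: read a run of consecutive 512-byte blocks in one request
>> // instead of 8 separate bread() calls per 4 KB mfs_read().
>> #include <cstdint>
>> #include <cstring>
>>
>> struct device;
>>
>> static const uint64_t MFS_BLOCK_SIZE = 512;
>>
>> // Hypothetical helper: read 'count' consecutive blocks starting at
>> // 'first_block' from the underlying device into 'buf'.
>> int bread_n(struct device* dev, uint64_t first_block, uint64_t count, void* buf);
>>
>> // Copy 'len' bytes (len > 0) of file data starting at byte 'offset';
>> // 'data_block' is the first on-disk block of the file's data.
>> static int mfs_read_chunk(struct device* dev, uint64_t data_block,
>>                           uint64_t offset, void* dst, uint64_t len)
>> {
>>     uint64_t first = data_block + offset / MFS_BLOCK_SIZE;
>>     uint64_t last  = data_block + (offset + len - 1) / MFS_BLOCK_SIZE;
>>     uint64_t count = last - first + 1;
>>
>>     // One device request for the whole run instead of 'count' requests.
>>     char* tmp = new char[count * MFS_BLOCK_SIZE];
>>     int error = bread_n(dev, first, count, tmp);
>>     if (!error) {
>>         memcpy(dst, tmp + offset % MFS_BLOCK_SIZE, len);
>>     }
>>     delete[] tmp;
>>     return error;
>> }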
>>
>> Waldek
>>
>> PS1. Also, BTW, there is some problem with building ramfs images. Every
>> time one changes lzkernel_base one needs to do a clean build; otherwise
>> the produced image crashes. To reproduce: build a zfs image, then change
>> lzkernel_base and build a ramfs image.
>>
>> PS2.
>>
>> *NATIVE*
>> *zfs*
>> disk read (real mode): 26.26ms, (+26.26ms)
>> .init functions: 48.96ms, (+22.70ms)
>> SMP launched: 49.60ms, (+0.63ms)
>> VFS initialized: 51.90ms, (+2.30ms)
>> Network initialized: 52.16ms, (+0.27ms)
>> pvpanic done: 52.27ms, (+0.11ms)
>> pci enumerated: 56.99ms, (+4.71ms)
>> drivers probe: 56.99ms, (+0.00ms)
>> drivers loaded: 102.13ms, (+45.14ms)
>> ZFS mounted: 1449.70ms, (+1347.57ms)
>> Total time: 1455.74ms, (+6.04ms)
>> Hello from C code
>>
>> *ramfs*
>> disk read (real mode): 25.49ms, (+25.49ms)
>> .init functions: 46.62ms, (+21.13ms)
>> SMP launched: 47.37ms, (+0.75ms)
>> VFS initialized: 49.50ms, (+2.13ms)
>> Network initialized: 49.87ms, (+0.37ms)
>> pvpanic done: 49.99ms, (+0.11ms)
>> pci enumerated: 54.41ms, (+4.42ms)
>> drivers probe: 54.41ms, (+0.00ms)
>> drivers loaded: 97.66ms, (+43.25ms)
>> ZFS mounted: 100.99ms, (+3.34ms)
>> Total time: 106.63ms, (+5.64ms)
>> Hello from C code
>>
>> *mfs*
>> disk read (real mode): 26.36ms, (+26.36ms)
>> .init functions: 48.97ms, (+22.61ms)
>> SMP launched: 49.64ms, (+0.67ms)
>> VFS initialized: 51.63ms, (+1.99ms)
>> Network initialized: 52.00ms, (+0.37ms)
>> pvpanic done: 52.12ms, (+0.12ms)
>> pci enumerated: 56.52ms, (+4.40ms)
>> drivers probe: 56.52ms, (+0.00ms)
>> drivers loaded: 101.56ms, (+45.04ms)
>> ZFS mounted: 104.72ms, (+3.16ms)
>> Total time: 110.17ms, (+5.45ms)
>> Hello from C code
>>
>> *JAVA 9*
>> *zfs*
>> disk read (real mode): 27.13ms, (+27.13ms)
>> .init functions: 50.21ms, (+23.07ms)
>> SMP launched: 50.81ms, (+0.60ms)
>> VFS initialized: 52.87ms, (+2.07ms)
>> Network initialized: 53.12ms, (+0.24ms)
>> pvpanic done: 53.24ms, (+0.12ms)
>> pci enumerated: 57.59ms, (+4.35ms)
>> drivers probe: 57.59ms, (+0.00ms)
>> drivers loaded: 100.55ms, (+42.96ms)
>> ZFS mounted: 1479.34ms, (+1378.79ms)
>> *Total time: 1484.99ms, (+5.65ms)*
>> java.so: Starting JVM app using: io/osv/nonisolated/RunNonIsolatedJvmApp
>> java.so: Setting Java system classloader to NonIsolatingOsvSystemClassLoader
>> Hello, World!
>> *java main took 201.02ms*
>>
>> *ramfs*
>> disk read (real mode): 148.11ms, (+148.11ms)
>> .init functions: 306.51ms, (+158.40ms)
>> SMP launched: 307.87ms, (+1.37ms)
>> VFS initialized: 322.56ms, (+14.68ms)
>> Network initialized: 323.10ms, (+0.54ms)
>> pvpanic done: 323.31ms, (+0.20ms)
>> pci enumerated: 333.27ms, (+9.96ms)
>> drivers probe: 333.27ms, (+0.01ms)
>> drivers loaded: 423.81ms, (+90.53ms)
>> ZFS mounted: 430.88ms, (+7.07ms)
>> *Total time: 442.40ms, (+11.52ms)*
>> java.so: Starting JVM app using: io/osv/nonisolated/RunNonIsolatedJvmApp
>> java.so: Setting Java system classloader to NonIsolatingOsvSystemClassLoader
>> random: device unblocked.
>> Hello, World!
>> *java main took 216.36ms*
>>
>> *mfs*
>> disk read (real mode): 24.60ms, (+24.60ms)
>> .init functions: 47.69ms, (+23.09ms)
>> SMP launched: 48.40ms, (+0.72ms)
>> VFS initialized: 50.49ms, (+2.09ms)
>> Network initialized: 50.82ms, (+0.33ms)
>> pvpanic done: 50.94ms, (+0.11ms)
>> pci enumerated: 55.60ms, (+4.66ms)
>> drivers probe: 55.60ms, (+0.00ms)
>> drivers loaded: 114.12ms, (+58.52ms)
>> ZFS mounted: 117.55ms, (+3.43ms)
>> *Total time: 123.21ms, (+5.67ms)*
>> java.so: Starting JVM app using: io/osv/nonisolated/RunNonIsolatedJvmApp
>> random: device unblocked.
>> java.so: Setting Java system classloader to NonIsolatingOsvSystemClassLoader
>> Hello, World!
>> *java main took 3386.41ms*
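>>
>> PS3. To make the path -> inode map idea from above more concrete: the
>> intent would be to walk the directory tree once at mount time and cache
>> the lookups in memory, roughly as below (purely illustrative -
>> mfs_walk_tree() and the traversal are hypothetical, not part of the
>> actual mfs patch):
>>
>> // Sketch: cache of full path -> inode number, filled once when mfs is
>> // mounted, so later lookups do not have to go to the disk.
>> #include <string>
>> #include <unordered_map>
>> #include <cstdint>
>>
>> class mfs_lookup_cache {
>> public:
>>     void insert(const std::string& path, uint64_t inode_no) {
>>         _map.emplace(path, inode_no);
>>     }
>>     // Returns true and fills 'inode_no' if the path is known.
>>     bool find(const std::string& path, uint64_t& inode_no) const {
>>         auto it = _map.find(path);
>>         if (it == _map.end()) {
>>             return false;
>>         }
>>         inode_no = it->second;
>>         return true;
>>     }
>> private:
>>     std::unordered_map<std::string, uint64_t> _map;
>> };
>>
>> // At mount time (mfs_walk_tree() is hypothetical):
>> //
>> //   mfs_lookup_cache cache;
>> //   mfs_walk_tree(root_inode, [&](const std::string& path, uint64_t ino) {
>> //       cache.insert(path, ino);
>> //   });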
>>
>> On Wednesday, November 15, 2017 at 4:46:57 AM UTC-5, Nadav Har'El wrote:
>>>
>>> On Wed, Nov 15, 2017 at 5:21 AM, Waldek Kozaczuk <jwkoz...@gmail.com> wrote:
>>>
>>>> A better solution could be to not have a read-write ramdisk at all,
>>>> but rather a simple read-only filesystem on the disk image where we
>>>> only read file blocks as they are needed, and also cache them somehow
>>>> for the next reads.
>>>>
>>>>> This is what the proposed patches were about
>>>>> https://github.com/cloudius-systems/osv/pull/646 (which you already
>>>>> found). I reviewed those patches in this thread:
>>>>> https://groups.google.com/forum/#!searchin/osv-dev/readonly/osv-dev/DYSH4_NqyMg/YFJxM9NKBgAJ
>>>>>
>>>>> James Root came back in that thread with a new patch, but sadly it
>>>>> appears that nobody ever reviewed it, and James never complained about
>>>>> this code being forgotten. At this point I don't really remember if
>>>>> there was anything wrong with James's last patch - it might be in good
>>>>> working order and you can try it (and patch the build script to use it
>>>>> instead of ramfs).
>>>>>
>>>>> If you're interested in reviving it, it might be very close to being
>>>>> complete!
>>>>
>>>> I am definitely interested and I will give it a try. Is there any
>>>> existing performance test suite that I could use to test its
>>>> performance? What would be the things to measure (speed, etc.?) and
>>>> observe (memory utilization?)?
>>>
>>> No, I don't think we have any organized test suite.
>>>
>>> I'd try to observe the boot time of a large program (e.g., some Java
>>> application), the memory it needs (e.g., you can run "scripts/run.py
>>> -m1000M" and reduce the size until you can't reduce it any more), and
>>> also check the amount of disk IO (perhaps via iostat on the host or some
>>> internal OSv tracepoint count) it does to run some workload.
>>>
>>>> Also, even though most of the files in stateless apps do not change,
>>>> there is always a need to write to some log files. I guess for that I
>>>> could use the ramfs, but probably improve its write implementation by
>>>> using some kind of linked list (std::deque) of fixed-size memory blocks
>>>> to store data so we do not constantly free/malloc, per Justin's
>>>> recommendations. Do you have any other recommendations?
>>>
>>> Yes, you can have the root filesystem ("/") on some read-only
>>> filesystem, but additionally mount a read-write filesystem, like ramfs,
>>> on /tmp or /var/log or whatever (of course, it's questionable why you'd
>>> want to write there if it's not persistent, but it will be easier if
>>> this is allowed for applications that can't be changed to log over the
>>> network, etc.).
>>>
>>> It should indeed be fairly easy to fix the ramfs append - see
>>> https://github.com/cloudius-systems/osv/issues/884
>>>
>>> https://github.com/cloudius-systems/osv/issues/739 is also about a
>>> different /tmp filesystem.
>>>
>>> Nadav.
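
Regarding the std::deque of fixed-size memory blocks idea for the ramfs write
path mentioned in the quoted thread above, this is roughly the kind of
structure I have in mind (a sketch only, assuming nothing about the actual
ramfs node layout; if I read the current code right, ramfs keeps one
contiguous buffer per file, which is the free/malloc churn referred to above):

#include <deque>
#include <memory>
#include <algorithm>
#include <cstring>
#include <cstddef>

// Chunk size is arbitrary here; in practice it could match the page size.
static const size_t CHUNK_SIZE = 4096;

class chunked_file_data {
public:
    // Append 'len' bytes; only the last (partial) chunk is ever touched, so
    // the data already written is never copied or reallocated.
    void append(const void* src, size_t len) {
        const char* p = static_cast<const char*>(src);
        while (len > 0) {
            size_t used = _size % CHUNK_SIZE;
            if (used == 0) {
                _chunks.push_back(std::unique_ptr<char[]>(new char[CHUNK_SIZE]));
            }
            size_t n = std::min(len, CHUNK_SIZE - used);
            memcpy(_chunks.back().get() + used, p, n);
            p += n;
            len -= n;
            _size += n;
        }
    }

    // Copy up to 'len' bytes starting at 'offset' into 'dst'; returns the
    // number of bytes actually copied.
    size_t read(size_t offset, void* dst, size_t len) const {
        if (offset >= _size) return 0;
        len = std::min(len, _size - offset);
        char* out = static_cast<char*>(dst);
        size_t copied = 0;
        while (copied < len) {
            size_t pos = offset + copied;
            const char* chunk = _chunks[pos / CHUNK_SIZE].get();
            size_t in_chunk = pos % CHUNK_SIZE;
            size_t n = std::min(len - copied, CHUNK_SIZE - in_chunk);
            memcpy(out + copied, chunk + in_chunk, n);
            copied += n;
        }
        return copied;
    }

    size_t size() const { return _size; }

private:
    std::deque<std::unique_ptr<char[]>> _chunks;
    size_t _size = 0;
};

With something like this, an append to a log file only ever allocates one new
chunk at a time, and reads index straight into the right chunk instead of
forcing the whole file buffer to be reallocated and copied.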