This is a small and simple patch, but it has taken me almost 2 months of
researching and understanding the problem to find the right solution. It
involved reading the ARMv8 programmer's guide, posting questions to ARM
forums, and debugging the problem mostly by trial and error, as partly
documented in issue #1100. Special credit goes to Claudio Fontana, who
helped me tremendously by explaining and suggesting many valuable ideas.

As issue #1100 explains, OSv would crash occasionally or quite
repeatedly, depending on the application, due to an unexpected Unknown
Reason class synchronous exception (EC=0). This would never happen in
emulated mode (QEMU with TCG) but quite frequently on real ARM hardware
like RPI 4 on QEMU with KVM or Firecracker. Per ARM documentation -
https://developer.arm.com/docs/ddi0595/h/aarch64-system-registers/esr_el1#ISS_exceptionswithanunknownreason
- there are many potential causes of an EC=0 exception, including
"attempted execution of an instruction bit pattern that has no allocated
instruction", which means trying to execute garbage.
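
As a rough illustration of what "EC=0" refers to (not part of this
patch): in the ESR_ELx layout the exception class occupies bits [31:26],
the instruction-length flag is bit 25 and the syndrome is bits [24:0].
The helper names below are hypothetical and do not exist in OSv.

```cpp
#include <cstdint>

// Fields of an ESR_ELx value: EC = bits [31:26], IL = bit 25,
// ISS = bits [24:0].
struct esr_fields {
    unsigned ec;   // exception class
    bool il;       // true for a 32-bit trapped instruction
    uint32_t iss;  // instruction specific syndrome
};

// Hypothetical helper (not an OSv function): split a raw ESR value.
constexpr esr_fields decode_esr(uint64_t esr) {
    return { static_cast<unsigned>((esr >> 26) & 0x3f),
             ((esr >> 25) & 1) != 0,
             static_cast<uint32_t>(esr & 0x1ffffff) };
}

// EC == 0 is the "Unknown reason" class discussed in this message.
constexpr bool is_unknown_reason(uint64_t esr) {
    return decode_esr(esr).ec == 0;
}
```

A fault handler could call is_unknown_reason() on the saved ESR value to
tell this case apart from, say, a data abort.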

None of those potential causes, which I quite meticulously researched,
examined and partly discussed with Claudio, seemed to apply or make
much sense in the OSv context. Until one of them did, when I stumbled
across this article -
https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/caches-and-self-modifying-code
- about "self-modifying code". Initially this article seemed to apply
only to JIT-type scenarios, but then I noticed this small-font annotation:
"A more common (though less obvious) example is that of an operating
system kernel: from the point of view of the processor, some code in the
system is modifying some other code in the system every time a process
is swapped in or out." It made me think that what the OSv dynamic linker
does is quite similar.

Then I eventually found this paragraph in ARMv8 programmer's guide
in chapter 11.5 "Cache maintenance":
"It is sometimes necessary for software to clean or invalidate a cache.
This might be required when the contents of external memory have been
changed and it is necessary to remove stale data from the cache. It can
also be required after MMU-related activity such as changing access
permissions, cache policies, or virtual to Physical Address mappings, or
when I and D-caches must be synchronized for dynamically generated code
such as JIT-compilers and dynamic library loaders."

In essence, the aarch64 architecture (Modified Harvard) defines separate
instruction and data caches - I-cache and D-cache. Therefore it is sometimes
necessary to synchronize the two caches by cleaning the D-cache
and invalidating the I-cache after loading code into memory,
which is exactly what the article about self-modifying code explains.
How does it apply to OSv? Well, the OSv dynamic linker, being part of the
kernel (code A), loads application code (B) into memory. This by itself
does not mean OSv modifies its own kernel code; however, it dynamically
loads other code and executes it in the same memory space.
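
A minimal sketch of that "A loads B" pattern, assuming a GCC or Clang
toolchain: code bytes are written as ordinary data and the range is then
synchronized before anything may execute it. load_code() is a
hypothetical helper for illustration, not an OSv function.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: copy machine code (handled as plain data) into
// dst, then make the range visible to instruction fetch.
inline void load_code(void* dst, const void* src, std::size_t size) {
    // The stores below go through the data cache.
    std::memcpy(dst, src, size);

    // Clean the D-cache and invalidate the I-cache for the range.
    // The builtin is a no-op on targets that need no such maintenance.
    char* begin = static_cast<char*>(dst);
    __builtin___clear_cache(begin, begin + size);
}
```

Only after load_code() returns would it be safe to jump into dst on a
CPU with non-coherent instruction and data caches.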

To make a long story short, this patch modifies a critical part
of the OSv memory management code - populate_vma() - which gets called any
time a portion of a vma (a page) is filled, either on a page fault or eagerly.
It changes populate_vma() to synchronize the data and instruction caches
with each other if the vma's permissions mark it as executable -
in essence, any time code is loaded into memory.
To achieve this, it delegates to an obscure built-in - __clear_cache().
This logic is a no-op in the x86-64 port, as that architecture has strong
coherency between instruction and data caches, so nothing special needs
to be done in this case.
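
The control flow described above can be sketched in isolation as
follows. This is a simplified stand-in, not the OSv code: the permission
values, populate_sketch() and the counting stub are assumptions made for
illustration, and the real populate_vma() does the actual page mapping.

```cpp
#include <cstddef>

// Permission bits: the names mirror OSv's, the values are assumed here.
enum : unsigned { perm_read = 1, perm_write = 2, perm_exec = 4 };

// Counts calls so the behaviour can be observed in a test.
static int sync_calls = 0;

// Stand-in for synchronize_cpu_caches() from this patch.
static void synchronize_cpu_caches_stub(void* v, std::size_t size) {
    char* begin = static_cast<char*>(v);
    __builtin___clear_cache(begin, begin + size);
    ++sync_calls;
}

// Hypothetical, heavily simplified stand-in for populate_vma().
static std::size_t populate_sketch(unsigned perm, void* v, std::size_t size) {
    // ... the real code maps and fills the pages here ...

    // Only executable mappings need I-cache/D-cache synchronization.
    if (perm & perm_exec) {
        synchronize_cpu_caches_stub(v, size);
    }
    return size;
}
```

Populating a read-write range leaves sync_calls untouched, while a
read-exec range increments it once per call.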

Fixes #1100

Signed-off-by: Waldemar Kozaczuk <jwkozac...@gmail.com>
---
 arch/aarch64/mmu.cc | 34 ++++++++++++++++++++++++++++++++++
 arch/x64/mmu.cc     |  4 ++++
 core/mmu.cc         |  7 +++++++
 include/osv/mmu.hh  |  2 ++
 4 files changed, 47 insertions(+)

diff --git a/arch/aarch64/mmu.cc b/arch/aarch64/mmu.cc
index dd8ef850..8fd71b51 100644
--- a/arch/aarch64/mmu.cc
+++ b/arch/aarch64/mmu.cc
@@ -97,4 +97,38 @@ bool is_page_fault_write_exclusive(unsigned int esr) {
 bool fast_sigsegv_check(uintptr_t addr, exception_frame* ef) {
     return false;
 }
+
+void synchronize_cpu_caches(void *v, size_t size) {
+    // The aarch64 qualifies as Modified Harvard architecture and defines separate
+    // cpu instruction and data caches - I-cache and D-cache. Therefore it is necessary
+    // to synchronize both caches by cleaning data cache and invalidating instruction
+    // cache after loading code into memory before letting it be executed.
+    // For more details of why and when it is necessary please read this excellent article -
+    // https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/caches-and-self-modifying-code
+    // or this paper - https://hal.inria.fr/hal-02509910/document.
+    //
+    // So when OSv dynamic linker, being part of the kernel code, loads pages
+    // of executable sections of ELF segments into memory, we need to clean D-cache
+    // in order to push code (as data) into next cache level (L2) and invalidate
+    // the I-cache right before it gets executed.
+    //
+    // In order to achieve the above we delegate to the __clear_cache builtin.
+    // The __clear_cache does following in terms of ARM64 assembly:
+    //
+    // For each D-cache line in the range (v, v + size):
+    //   DC CVAU, Xn ; Clean data cache by virtual address (VA) to PoU
+    // DSB ISH       ; Ensure visibility of the data cleaned from cache
+    // For each I-cache line in the range (v, v + size):
+    //   IC IVAU, Xn ; Invalidate instruction cache by VA to PoU
+    // DSB ISH       ; Ensure completion of the invalidations
+    // ISB           ; Synchronize the fetched instruction stream
+    //
+    // Please note that both DC CVAU and IC IVAU are broadcast to all cores in the
+    // same Inner Shareability domain (which all OSv memory is mapped as) so that all
+    // caches in all cores should eventually see and execute same code.
+    //
+    // For more details about what this built-in does, please read this gcc documentation -
+    // https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html
+    __builtin___clear_cache((char*)v, (char*)v + size);
+}
 }
diff --git a/arch/x64/mmu.cc b/arch/x64/mmu.cc
index 24da5caa..1af268c0 100644
--- a/arch/x64/mmu.cc
+++ b/arch/x64/mmu.cc
@@ -191,4 +191,8 @@ bool fast_sigsegv_check(uintptr_t addr, exception_frame* ef)
 
     return false;
 }
+
+// The x86_64 maintains strong coherency between its data and instruction
+// caches in hardware. Therefore we do not need to do anything as they are always in sync.
+void synchronize_cpu_caches(void *v, size_t size) {}
 }
diff --git a/core/mmu.cc b/core/mmu.cc
index ff3fab47..37a1c60b 100644
--- a/core/mmu.cc
+++ b/core/mmu.cc
@@ -1206,6 +1206,13 @@ ulong populate_vma(vma *vma, void *v, size_t size, bool write = false)
         vma->operate_range(populate_small<Account>(map, vma->perm(), write, vma->map_dirty()), v, size) :
         vma->operate_range(populate<Account>(map, vma->perm(), write, vma->map_dirty()), v, size);
 
+    // On some architectures, the cpu data and instruction caches are separate (non-unified)
+    // and therefore it might be necessary to synchronize data cache with instruction cache
+    // after populating vma with executable code.
+    if (vma->perm() & perm_exec) {
+        synchronize_cpu_caches(v, size);
+    }
+
     return total;
 }
 
diff --git a/include/osv/mmu.hh b/include/osv/mmu.hh
index 1830048c..12fcb8a4 100644
--- a/include/osv/mmu.hh
+++ b/include/osv/mmu.hh
@@ -319,6 +319,8 @@ std::string procfs_maps();
 
 unsigned long all_vmas_size();
 
+// Synchronize cpu data and instruction caches for specified area of virtual memory
+void synchronize_cpu_caches(void *v, size_t size);
 }
 
 #endif /* MMU_HH */
-- 
2.29.2
