Hello,

I'd like to gather some feedback on how to best tackle kern/53124.

The problem there is that FFS triggers a pathological case. For each
block, the I/O transfer maps the block into the kernel pmap and then
unmaps it again, so that the data can be copied into user memory. This
triggers TLB shootdown IPIs for each FS block, sent to all CPUs which
happen to be idle or are otherwise running on the kernel pmap. On systems
with many idle CPUs, these TLB shootdowns cause a lot of synchronization
overhead.

I see three possible ways to fix this particular case:
1. make it possible to lazily invalidate the TLB for the kernel pmap too,
i.e. make pmap_deactivate()/pmap_reactivate() work with the kernel pmap;
this avoids sending the TLB shootdown IPIs to idle CPUs
2. make the idle lwp run in its own pmap and address space; a variant of
#1 that avoids changing the invariant about the kernel map always being
loaded
3. change the UVM code to not do this mapping via the kernel map, so
pmap_update() and the IPIs are not triggered in the first place
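
To make the difference between the current behaviour and #1 concrete, here
is a rough user-space model, not kernel code. Everything in it is made up
for illustration: struct cpu_model, idle_enter(), idle_exit(),
kernel_unmap(), simulate() and the NCPU value are hypothetical and do not
correspond to real pmap or UVM interfaces. It only counts events: one
kernel-map unmap per FS block, an IPI to every other CPU still marked as
using the kernel pmap, and a single deferred flush when a lazily
deactivated CPU leaves idle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NCPU 8	/* assumed CPU count for this model */

/* Toy per-CPU state; hypothetical, not a NetBSD structure. */
struct cpu_model {
	bool kpmap_active;	/* CPU is a shootdown target for the kernel pmap */
	bool tlb_stale;		/* an invalidation was deferred while idle */
	unsigned ipis;		/* shootdown IPIs received */
	unsigned flushes;	/* deferred flushes done when leaving idle */
};

/* Entering idle: in lazy mode, drop out of the shootdown target set. */
static void
idle_enter(struct cpu_model *c, bool lazy)
{
	if (lazy)
		c->kpmap_active = false;
}

/* Leaving idle: rejoin the target set and catch up on what was missed. */
static void
idle_exit(struct cpu_model *c)
{
	c->kpmap_active = true;
	if (c->tlb_stale) {
		c->tlb_stale = false;
		c->flushes++;
	}
}

/* One kernel-map unmap on CPU 'self': IPI active CPUs, mark the rest stale. */
static void
kernel_unmap(struct cpu_model cpus[], int self)
{
	assert(self >= 0 && self < NCPU);
	for (int i = 0; i < NCPU; i++) {
		if (i == self)
			continue;
		if (cpus[i].kpmap_active)
			cpus[i].ipis++;
		else
			cpus[i].tlb_stale = true;
	}
}

/*
 * CPU 0 transfers 'nblocks' FS blocks while all other CPUs sit idle.
 * Returns the total number of IPIs; stores the number of deferred
 * flushes in *flushes_out if it is not NULL.
 */
unsigned
simulate(bool lazy, unsigned nblocks, unsigned *flushes_out)
{
	struct cpu_model cpus[NCPU] = {{0}};
	unsigned ipis = 0, flushes = 0;

	for (int i = 0; i < NCPU; i++)
		cpus[i].kpmap_active = true;
	for (int i = 1; i < NCPU; i++)
		idle_enter(&cpus[i], lazy);
	for (unsigned b = 0; b < nblocks; b++)
		kernel_unmap(cpus, 0);
	for (int i = 1; i < NCPU; i++)
		idle_exit(&cpus[i]);
	for (int i = 0; i < NCPU; i++) {
		ipis += cpus[i].ipis;
		flushes += cpus[i].flushes;
	}
	if (flushes_out != NULL)
		*flushes_out = flushes;
	return ipis;
}
```

In this model, simulate(false, 1000, &f) gives 7000 IPIs and no deferred
flushes, while simulate(true, 1000, &f) gives 0 IPIs and 7 deferred
flushes, one per idle CPU. It says nothing about whether deferring the
invalidation is actually safe for the kernel pmap.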

I reckon #2 would add a lot of overhead to the idle routine, since it
would require at least an address space switch. That switch might be
fairly cheap on architectures with address space IDs (which avoid TLB
flushes), but it is still much more work than what the idle entry/exit
does now.

Variants of the problems with #3 have been discussed on and off over the
years, as I recall, but there is no resolution yet, and I'm not aware of
anyone actively working on this. I understand this would be Hard, with
nobody currently having the time and courage.

This leaves #1 as a short-term practical solution. Does anyone foresee any
particular problems with this? Does it have a chance to fly? Any other
ideas?

Jaromir
