On Tuesday, December 5, 2017 at 12:31:52 PM UTC+1, Mark Price wrote:
>
> Hi Gil,
> thanks for the response.
>
> I'm fairly sure that interaction with the page-cache is one of the 
> problems. When this is occurring, the free-mem is already hovering around 
> vm.min_free_kbytes, and the mapped files are a significant fraction of 
> system memory. From what I can see, each process that maps a file will get 
> its own copy in the page-cache (kernel shared pages doesn't seem to apply 
> to the page-cache, and is otherwise disabled on the test machine), so we 
> probably have approaching total system memory in use by cached pages.
>

That (each process having its own copy) is surprising to me. Unless the 
mapping is such that private copies are required, I'd expect the processes 
to share the page cache entries.
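Whether page-cache entries are shared comes down to the map mode: in Java, `FileChannel.map(MapMode.READ_WRITE, ...)` corresponds to `MAP_SHARED` (writes land on the shared page-cache page), while `MapMode.PRIVATE` corresponds to `MAP_PRIVATE` (copy-on-write, process-local). A minimal sketch illustrating the difference (the temp file and byte values are just for demonstration):

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MapModes {
    public static void main(String[] args) throws IOException {
        Path f = Files.createTempFile("mapmode", ".dat");
        Files.write(f, new byte[]{0});
        try (FileChannel ch = FileChannel.open(f,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // MapMode.PRIVATE ~ MAP_PRIVATE: copy-on-write, the store below
            // goes to a process-local copy, never to the file/page cache.
            MappedByteBuffer priv = ch.map(FileChannel.MapMode.PRIVATE, 0, 1);
            priv.put(0, (byte) 42);
            System.out.println("file byte after PRIVATE write: "
                    + Files.readAllBytes(f)[0]);

            // MapMode.READ_WRITE ~ MAP_SHARED: the store hits the shared
            // page-cache page, so other mappings/readers see it.
            MappedByteBuffer shared = ch.map(FileChannel.MapMode.READ_WRITE, 0, 1);
            shared.put(0, (byte) 42);
            System.out.println("file byte after READ_WRITE write: "
                    + ch.map(FileChannel.MapMode.READ_ONLY, 0, 1).get(0));
        } finally {
            Files.deleteIfExists(f);
        }
    }
}
```

So unless the writers are mapping with `PRIVATE` (or the kernel is forced into private copies some other way), multiple processes mapping the same file with `READ_WRITE` should be backed by the same page-cache pages, not per-process copies.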
 

>
> I had thought that pages that were last written-to before the 
> vm.dirty_expire_centisecs threshold would be written out to disk by the 
> flusher threads, but I read on lkml that the access times are maintained on 
> a per-inode basis, rather than per-page. If this is the case, then the 
> system in question is making it very difficult for the page-cache to work 
> efficiently.
>
> The system makes use of a "pre-toucher" thread to try to handle 
> page-faults ahead of the thread that is trying to write application data to 
> the mapped pages. However, it seems that it is not always successful, so I 
> need to spend a bit of time trying to figure out why that is not working. 
> It's possible that there is just too much memory pressure, and the OS is 
> swapping out pages that have been loaded by the pre-toucher before the 
> application gets to them.
>

Is your pre-toucher thread a Java thread doing its pre-touching using 
mapped i/o in the same process? If so, then the pre-toucher thread itself 
will be a high TTSP causer. The trick is to do the pre-touch in a thread 
that is already at a safepoint (e.g. do your pre-touch using mapped i/o 
from within a JNI call, use another process, or do the retouch with 
non-mapped i/o).
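A sketch of the "retouch with non-mapped i/o" option: fault upcoming pages into the page cache with explicit positional reads, which block inside a native call where the thread is already safepoint-safe. The class name, page size, and file layout here are illustrative assumptions, not the original system's code:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class PreToucher {
    static final int PAGE = 4096; // assumed page size; query the OS in real code

    // Touch one byte per page via ordinary positional reads. The read()
    // syscall pulls each page into the page cache, so the later mapped
    // access by the writer thread takes only a minor (in-memory) fault.
    static void preTouch(FileChannel ch, long fromOffset, long bytes)
            throws IOException {
        ByteBuffer one = ByteBuffer.allocateDirect(1);
        for (long pos = fromOffset; pos < fromOffset + bytes; pos += PAGE) {
            one.clear();
            ch.read(one, pos);
        }
    }

    public static void main(String[] args) throws IOException {
        Path f = Files.createTempFile("pretouch", ".dat");
        try (FileChannel ch = FileChannel.open(f,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            ch.write(ByteBuffer.allocate(8 * PAGE), 0); // stand-in for a journal segment
            preTouch(ch, 0, 8 * PAGE);
            System.out.println("pre-touched " + (8 * PAGE / PAGE) + " pages");
        } finally {
            Files.deleteIfExists(f);
        }
    }
}
```

This only warms the page cache: the writer's first mapped store to each page still takes a write-protect fault (and, per the trace below, the file-time update can still contend on the inode lock), but it no longer has to wait on disk i/o while holding up a safepoint.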
 

>
>
> Cheers,
>
> Mark
>
> On Tuesday, 5 December 2017 10:53:17 UTC, Gil Tene wrote:
>>
>> Page faults in mapped file i/o and counted loops are certainly two common 
>> causes of long TTSP. But there are many other paths that *could* cause it 
>> as well in HotSpot. Without catching it and looking at the stack trace, 
>> it's hard to know which ones to blame. Once you knock out one cause, you'll 
>> see if there is another.
>>
>> In the specific stack trace you showed [assuming that trace was taken 
>> during a long TTSP], mapped file i/o is the most likely culprit. Your trace 
>> seems to be around making the page writable for the first time and 
>> updating the file time (which takes a lock), but even without needing the 
>> lock, the fault itself could end up waiting for the i/o to complete (read 
>> page from disk), and that (when Murphy pays you a visit) can end up waiting 
>> behind 100s of other i/o operations (e.g. when your i/o happens at the same 
>> time the kernel decided to flush some dirty pages in the cache), leading to 
>> TTSPs in the 100s of msec.
>>
>> As I'm sure you already know, one simple way to get around mapped file 
>> related TTSP is to not use mapped files. Explicit random i/o calls are 
>> always done while at a safepoint, so they can't cause high TTSPs.
>>
>> On Tuesday, December 5, 2017 at 10:30:57 AM UTC+1, Mark Price wrote:
>>>
>>> Hi Aleksey,
>>> thanks for the response. The I/O is definitely one problem, but I was 
>>> trying to figure out whether it was contributing to the long TTSP times, or 
>>> whether I might have some code that was misbehaving (e.g. NonCountedLoops).
>>>
>>> Your response aligns with my guesswork, so hopefully I just have the one 
>>> problem to solve ;)
>>>
>>>
>>>
>>> Cheers,
>>>
>>> Mark
>>>
>>> On Tuesday, 5 December 2017 09:24:33 UTC, Aleksey Shipilev wrote:
>>>>
>>>> On 12/05/2017 09:26 AM, Mark Price wrote: 
>>>> > I'm investigating some long time-to-safepoint pauses in 
>>>> oracle/openjdk. The application in question 
>>>> > is also suffering from some fairly nasty I/O problems where 
>>>> latency-sensitive threads are being 
>>>> > descheduled in uninterruptible sleep state due to needing a 
>>>> file-system lock. 
>>>> > 
>>>> > My question: can the JVM detect that a thread is in 
>>>> signal/interrupt-handler code and thus treat it 
>>>> > as though it is at a safepoint (as I believe happens when a thread is 
>>>> in native code via a JNI call)? 
>>>> > 
>>>> > For instance, given the stack trace below, will the JVM need to wait 
>>>> for the thread to be scheduled 
>>>> > back on to CPU in order to come to a safepoint, or will it be treated 
>>>> as "in-native"? 
>>>> > 
>>>> >         7fff81714cd9 __schedule ([kernel.kallsyms]) 
>>>> >         7fff817151e5 schedule ([kernel.kallsyms]) 
>>>> >         7fff81717a4b rwsem_down_write_failed ([kernel.kallsyms]) 
>>>> >         7fff813556e7 call_rwsem_down_write_failed ([kernel.kallsyms]) 
>>>> >         7fff817172ad down_write ([kernel.kallsyms]) 
>>>> >         7fffa0403dcf xfs_ilock ([kernel.kallsyms]) 
>>>> >         7fffa04018fe xfs_vn_update_time ([kernel.kallsyms]) 
>>>> >         7fff8122cc5d file_update_time ([kernel.kallsyms]) 
>>>> >         7fffa03f7183 xfs_filemap_page_mkwrite ([kernel.kallsyms]) 
>>>> >         7fff811ba935 do_page_mkwrite ([kernel.kallsyms]) 
>>>> >         7fff811bda74 handle_pte_fault ([kernel.kallsyms]) 
>>>> >         7fff811c041b handle_mm_fault ([kernel.kallsyms]) 
>>>> >         7fff8106adbe __do_page_fault ([kernel.kallsyms]) 
>>>> >         7fff8106b0c0 do_page_fault ([kernel.kallsyms]) 
>>>> >         7fff8171af48 page_fault ([kernel.kallsyms]) 
>>>> >         ---- java stack trace ends here ---- 
>>>>
>>>> I am pretty sure an out-of-band page fault in a Java thread does not 
>>>> yield a safepoint, at least because safepoint polls happen at given 
>>>> locations in the generated code: we need the pointer map as part of 
>>>> the machine state, and that is generated by HotSpot (only) around the 
>>>> safepoint polls. Page faulting on random read/write insns does not 
>>>> have that luxury. Even if the JVM had intercepted that fault, there 
>>>> is not enough metadata to work on. 
>>>>
>>>> The stack trace above seems to say you page-faulted and the fault 
>>>> incurred disk I/O? That is swapping, I think, and all performance 
>>>> bets are off at that point. 
>>>>
>>>> Thanks, 
>>>> -Aleksey 
>>>>
>>>>
