On 9/15/2018 7:09 AM, Duy Nguyen wrote:
On Sat, Sep 15, 2018 at 01:07:46PM +0200, Duy Nguyen wrote:
12:50:00.084237 read-cache.c:1721       start loading index
12:50:00.119941 read-cache.c:1943       performance: 0.034778758 s: loaded all extensions (1667075 bytes)
12:50:00.185352 read-cache.c:2029       performance: 0.100152079 s: loaded 367110 entries
12:50:00.189683 read-cache.c:2126       performance: 0.104566615 s: finished scanning all entries
12:50:00.217900 read-cache.c:2029       performance: 0.082309193 s: loaded 367110 entries
12:50:00.259969 read-cache.c:2029       performance: 0.070257130 s: loaded 367108 entries
12:50:00.263662 read-cache.c:2278       performance: 0.179344458 s: read cache .git/index

The previous mail wraps these lines and makes them a bit hard to read. Corrected now.

--
Duy


Interesting! Clearly the data shape makes a big difference here. I had run a similar test, but in my case the extensions thread actually finished last (and its cost is what drove me to move it onto a separate thread that starts first).

Purpose                         First   Last    Duration
load_index_extensions_thread    719.40  968.50  249.10
load_cache_entries_thread       718.89  738.65  19.76
load_cache_entries_thread       730.39  753.83  23.43
load_cache_entries_thread       741.23  751.23  10.00
load_cache_entries_thread       751.93  780.88  28.95
load_cache_entries_thread       763.60  791.31  27.72
load_cache_entries_thread       773.46  783.46  10.00
load_cache_entries_thread       783.96  794.28  10.32
load_cache_entries_thread       795.61  805.52  9.91
load_cache_entries_thread       805.99  827.21  21.22
load_cache_entries_thread       816.85  826.85  10.00
load_cache_entries_thread       827.03  837.96  10.93

In my tests, the scanning thread clearly delayed the later ce threads, but given that the extensions thread was so slow, it didn't impact the overall time nearly as much as in your case.

I completely agree that the optimal solution would be to go back to my original patch/design. It eliminates the overhead of the scanning thread entirely and allows all threads to start at the same time. This would ensure the best performance whether the extensions were the longest thread or the cache entry threads took the longest.
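To illustrate why a table of entry offsets removes the need for a scanning pass, here is a minimal sketch in C. It is not the code from either patch: the struct and function names (entry_block, thread_work, assign_blocks) are hypothetical, and it assumes the table simply records, for each block of cache entries, a byte offset and an entry count. Given such a table, every thread's starting offset is known before any entry is parsed, so all threads can be started at once.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for one offset-table record: where a block of
 * cache entries starts in the index file and how many entries it holds.
 */
struct entry_block {
	size_t offset;	/* byte offset of the first entry in the block */
	size_t nr;	/* number of entries in the block */
};

/* Work handed to one loader thread: a contiguous run of blocks. */
struct thread_work {
	size_t first_block;	/* index of the first block in the table */
	size_t nr_blocks;	/* how many blocks this thread loads */
	size_t start_offset;	/* byte offset where this thread starts */
	size_t nr_entries;	/* total entries this thread loads */
};

/*
 * Split nr_blocks blocks across at most nr_threads workers, as evenly
 * as possible by block count. Each start_offset is copied straight from
 * the table, so no prior scan of the entries is needed.
 * Returns the number of workers actually used.
 */
static size_t assign_blocks(const struct entry_block *blocks, size_t nr_blocks,
			    struct thread_work *work, size_t nr_threads)
{
	size_t used = nr_threads < nr_blocks ? nr_threads : nr_blocks;
	size_t base, extra, next = 0, t, i;

	if (!used)
		return 0;
	assert(blocks != NULL && work != NULL);

	base = nr_blocks / used;
	extra = nr_blocks % used;	/* the first 'extra' workers get one more block */

	for (t = 0; t < used; t++) {
		size_t n = base + (t < extra ? 1 : 0);

		work[t].first_block = next;
		work[t].nr_blocks = n;
		work[t].start_offset = blocks[next].offset;
		work[t].nr_entries = 0;
		for (i = 0; i < n; i++)
			work[t].nr_entries += blocks[next + i].nr;
		next += n;
	}
	return used;
}
```

In a real loader, each thread_work would be passed to its own thread right after this call, alongside the extensions thread. Without the table, the start of the second and later ranges can only be found by walking the entries that come before them, which is the work the scanning thread does today.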

I ran out of time and energy last year, so I dropped it to work on other tasks. I appreciate your offer of help. Perhaps between the two of us we could successfully get it through the mailing list this time. :-) Let me go back and see what it would take to combine the current EOIE patch with the older IEOT patch.

I'm also intrigued by your observation that overcommitting the CPU actually results in time savings. I hadn't tested that. It looks like that could have a positive impact on the overall time and warrant a change to the default nr_threads logic.
