[GitHub] couchdb-couch issue #185: 3061 adaptive header search
Github user kocolosk commented on the issue: https://github.com/apache/couchdb-couch/pull/185 Wow, great stuff @jaydoane --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
Github user davisp commented on the issue: https://github.com/apache/couchdb-couch/pull/185 Good work and persistence, @jaydoane! ---
Github user davisp commented on the issue: https://github.com/apache/couchdb-couch/pull/185 +1 ---
Github user davisp commented on the issue: https://github.com/apache/couchdb-couch/pull/185 Yep, +1 to squashing with that commit message ---
Github user jaydoane commented on the issue: https://github.com/apache/couchdb-couch/pull/185 @davisp, formatting fixed in latest commit. Here's a proposed commit message:

> Current behavior attempts to read a header at each block, starting at the eof and working backwards one block at a time. Deeply buried headers can take a long time to find, depending on the depth of the header, server load, etc.
>
> This commit changes the behavior so that if the last block in the file does not contain a header, it switches to using a "vectored" approach where multiple candidate header prefixes (of 5 bytes) are read in a single operation, greatly speeding up the search. On a modern Linux system with SSD, we see improvements up to 15x.

If this looks ok to you, should I rebase as well as squash? ---
Github user davisp commented on the issue: https://github.com/apache/couchdb-couch/pull/185 Oh, and we'll want to squash this down into a single commit that has a good explanation of the change and the basics of how it works. ---
Github user theburge commented on the issue: https://github.com/apache/couchdb-couch/pull/185 Thanks for doing the additional investigation @jaydoane. I'm glad the performance remains consistently improved. From memory, I've observed deeply buried headers frequently in MT, although admittedly not lots at once (usually a DB or two). However, they're not only on .meta files if memory serves, but also on view files (and I'm fairly sure on primary database files, although IIRC usually in correlation with some other odd behaviour -- so perhaps it shouldn't be considered _normal_ but it's certainly possible). In any case, it's reassuring that the vectored approach is only tried if the basic approach fails, given the striking difference in page-in rate. ---
Github user jaydoane commented on the issue: https://github.com/apache/couchdb-couch/pull/185 The following image adds vmpage_io.memory.in and vmpage_io.memory.out to the experiment. All experiments were searching 4 files in parallel. The first starts around 23:48, using the current algorithm, and it is followed by runs around 23:57, 23:59 and 00:03 using vectored reads. https://cloud.githubusercontent.com/assets/51209/18733383/5a83b6e8-801f-11e6-84f1-3a4fc074f9e1.png

The most notable aspect of the graph to me is the consistently high vmpage_io.memory.in for the vectored read. Just eyeballing the graphs, it looks like the areas under the vmpage_io.memory.in curves are similar for both algorithms, which I think is what @theburge was expecting to see.

As for a more realistic MT scenario, I want to clarify something. It's my understanding that under normal circumstances when opening a couch file, the header is found at the end of the file. In such cases, the existing algorithm will be used (since it's been micro-optimized for this case by reading the entire remainder of the block in a single disk read operation). Only when the existing algorithm fails to find a header do we employ the vectored read algorithm. The only scenario I know of for which we have deeply buried headers is that of .compact.meta files, and the number of those is presumably limited to the number of simultaneous compactions that occur at any time. My understanding is that concurrency is governed by smoosh, and typical numbers are on the order of 10. If all of those assumptions are true, a realistic scenario probably wouldn't have more than a handful of vectored searches happening at one time on any given node, so my test case of 4 is not terribly unrealistic.

That said, the image below shows a series of 3 experiments using 8 parallel searches; the first with the current algorithm, and the other 2 using vectored reads. The main thing to note is that the speed improvement drops to "only" 4x the current algorithm. https://cloud.githubusercontent.com/assets/51209/18734223/076489da-8027-11e6-9517-844e741cd40b.png

@davisp, I'm all for getting this wrapped up. What are some final tweaks you had in mind? Clearly, it should be squashed into a single commit. Are there other problems you'd like to see addressed? ---
Github user davisp commented on the issue: https://github.com/apache/couchdb-couch/pull/185 That seems fairly convincing: just all-around better than the default implementation. The only real worry I had was RAM usage with the vectored reads, but that appears to be a non-issue (and even an improvement). If you want to try the tests that @theburge suggested that'd be fine, but I'd also be willing to start making final tweaks to get this merged. ---
Github user theburge commented on the issue: https://github.com/apache/couchdb-couch/pull/185 @jaydoane Thanks for running and documenting those experiments, I think those results look very promising. Reducing time taken and user-space memory usage can only be good. :)

FWIW, I'm not convinced major faults is the metric of interest for identifying file reads on a newly opened file in an empty-cache scenario. (My thinking is that it shows faults for explicitly mapped ranges that are only available on disk, whereas page-ins show pages being pulled in for whatever reason via the VM itself. For laughs and demonstration purposes, run `sar -B 1` and try some experiments, like dropping the cache and loading a `man` page. You'll typically see the page-in rate exceed the major fault rate, once you do the conversion from pages/s to kB/s. In truth, it's probably most comprehensive to look at both fault types and page-in/page-out rates, but that gets trickier.)

The big surprise here for me is the difference in physical disk operations, since my understanding is that the VM would fill complete OS pages to satisfy reads, so I can't see why reading N bytes on page boundaries would be much different (from an OS perspective) to reading 4kB. I'd love an explanation for that (although the user-space benefits are clear to see and very positive), just to be sure it's not some other effect.

My final suggestion would be to evaluate a more serious MT-esque scenario: how does the performance improvement of the new algorithm vary with lots (10s to 100s) of files opening concurrently? (And maybe with higher background CPU/disk utilisation, if you can set up the experiment.) Remember that that's the scenario where we see this as most relevant, and timeouts there are a big issue (file opening times out, compaction keels over, file handle remains, etc.), so if there's little or no wall-clock time improvement in a highly concurrent setting, that'd be a shame and undermine some of the practical benefit. Obviously, I'd expect only positive things compared to the status quo based on the evaluation above, but performance expectations in varying scenarios don't always comply with common sense... :) ---
Github user jaydoane commented on the issue: https://github.com/apache/couchdb-couch/pull/185 @davisp, @theburge, et al, can you guys take a look? ---
Github user jaydoane commented on the issue: https://github.com/apache/couchdb-couch/pull/185 I have simplified the pread/2-based header search, and hopefully addressed most of the issues raised against the previous implementations. For a simple speed comparison using the same file as before, the new algorithm was measured to be about 13x faster than the current algorithm. When 4 such files are searched in parallel, it's only about 7x faster. Nevertheless, it uses many fewer disk reads, imposes less load, and even seems to use less memory, as shown in the graphs below. It also seems like the file:advise/4 …

The first image shows a single search using the default algorithm starting around 19:27 and ending around 19:35, followed by a single search using the pread/2 algorithm from around 19:43 to 19:45.

![image](https://cloud.githubusercontent.com/assets/51209/18415081/864ef900-779a-11e6-8ee8-ecb7cf4871f2.png) ![image](https://cloud.githubusercontent.com/assets/51209/18415083/8fbe6714-779a-11e6-8502-9670c8050752.png) ---
Github user davisp commented on the issue: https://github.com/apache/couchdb-couch/pull/185 @jaydoane pointed out that it's a WIP commit currently so I stopped iterating style points until he's finished. Also, reading through this, it's getting complicated pretty quickly between spans and exponents and things. I'd drop all of that and make a much simpler first draft that just reads the last N header positions. Once that's working we can add tweaks to adjust the number of locations that are read dynamically if we find it to be an issue. Also, the simplification should make the implementation quite easy. Something like:

```erlang
find_header(Fd, Size) ->
    Start = (Size div ?BLOCK_SIZE) * ?BLOCK_SIZE,
    Locs = lists:foldl(fun(_, [Last | _] = Acc) ->
        [Last - ?BLOCK_SIZE | Acc]
    end, [Start], lists:seq(1, 1024)),
    LocLens = [{Loc, 5} || Loc <- Locs, Loc >= 0],
    {ok, Bits} = file:pread(Fd#fd.fd, LocLens),
    try
        lists:foreach(fun(Data) ->
            %% if Data is a valid header:
            %%     throw({found, Header})
            ok
        end, Bits)
    of
        ok -> find_header(Fd, hd(Locs))
    catch
        throw:{found, Header} -> {ok, Header}
    end.
```

Or something to that effect. ---
Github user jaydoane commented on the issue: https://github.com/apache/couchdb-couch/pull/185 @davisp, thanks for the explanation. You were of course correct that I was confusing `pread/3` with `pread/2`. I think it makes sense to look at an alternate version of header search which, like you suggest, uses `pread/2` to look for header blocks en masse. Do you think we should hold off on this PR until I complete those additional tests? I'm also curious about the style points you alluded to before. Can you enumerate them? ---
Github user theburge commented on the issue: https://github.com/apache/couchdb-couch/pull/185 @jaydoane: One additional thought I had: the code takes good care of the page cache with `posix_fadvise`, but perhaps not quite such good care of the BEAM's memory usage. Have you measured system memory usage when there's a lot of trawling for headers? I think it's worth considering triggering an explicit garbage collection when the code finishes backtracking for headers, since it might have read a number of large, off-heap binaries. ---
Github user davisp commented on the issue: https://github.com/apache/couchdb-couch/pull/185 The implementation looks better. There are a few style issues I'll not nitpick on till later. I only meant "wrong" insofar as it was a common recursion error: pulling out an iteration into a separate function for no good reason.

To clarify the `file:pread/2` comment, I think you're confusing `file:pread/2` with `file:pread/3`, which is totally understandable. I've collectively spent hours over the years poring over the file module and everything beneath it down to the kernel, so I was probably a bit terse on that. Currently in couch_file we use `file:pread/3` quite a lot. This call basically ends up being a direct translation to the POSIX `pread` call; that is, it's "read N bytes at position X". The `file:pread/2` call is similar, except that in one call through the Erlang IO layers you can say "read `[{N bytes, at X pos} | ...]`" to grab a whole bunch of bytes at different locations. I used to be under the impression that this translated to a POSIX `readv` call, but now that I spelunk through the VM code it appears to just be calling pread multiple times (although at the C driver level, so it should be noticeably faster than using `file:pread/3` multiple times).

But in the end it'll depend on drives and various bits, since our headers are only 4KiB apart. SSDs and the like have page reads in that range, which means we may or may not be saving on IO bandwidth. I dunno when the hardware/kernel drops unwanted bytes. However, under load we should at least avoid allocating larger amounts of RAM when searching files for headers, as it'd be at most in the kernel. Your use of fadvise would also likely help, though for that we'd want to investigate the note about page offsets and the like to make sure those calls are effective. ---
Github user jaydoane commented on the issue: https://github.com/apache/couchdb-couch/pull/185 Thank you @davisp for the thorough review! I will address your points in turn:

> the timing numbers reported are in microseconds so I'm not super convinced on them. What size of file are you testing against and how far back is the header

You are correct that my previous example file `foo.couch` was tiny, and the timing numbers were of course too small to be useful. I've updated the sample benchmarks with one of the real .couch.meta files I've been using, and it shows a 17x speed improvement. Other files with deeply buried headers also show a consistently large speedup.

> you've added two exported functions for testing which really shouldn't be in the final PR

I agree. I left those for ease of testing, but they are now removed. Hmm, actually, it looks like they may need to be exposed after all, based on the test failures I'm now seeing...

> The way the recursion works having the first iteration in find_last_header before moving to find_header_in_chunks seems wrong. That recursion should just be a single loop that expands its read-ahead starting with the initial values in find_last_header.

"Wrong" might be a bit strong (since it actually worked), but it was certainly _inelegant_. In any case, commit ce91605 eliminated that special case by setting the start position past the eof so it could essentially use the same algorithm for the whole file.

> the default appears to devolve into nearly exactly the same behavior as the current implementation

This was a deliberate decision in order to incur the least possible risk. I assumed we could roll this code out without too much worry and turn it on for selective clusters via configuration. Once we were satisfied that it behaved well in isolation, the defaults could easily be changed to give everyone the adaptive behavior by default.

> it might be beneficial to look at using file:pread/2 which does vectored reads

We're already using `file:pread/2`, but it sounds like you're suggesting a somewhat different algorithm. I'm going to mull your last paragraph over as a possible further refinement, but for now, I think I've addressed your objections to the existing code. ---
Github user davisp commented on the issue: https://github.com/apache/couchdb-couch/pull/185 I like the idea here but I think we need to rethink the implementation a bit.

First, the timing numbers reported are in microseconds so I'm not super convinced by them. What size of file are you testing against, and how far back is the header? I'd almost wager it's only one or two blocks back and the timing is fairly random. However, I know you had a big file we found in production where the adaptive algorithm probably fared much better. Obviously we can't give that out, but we can fake the data fairly easily by creating files with successively more data after the last header. I'd like to see a test/benchmark of that nature before we get too lost in the weeds.

On the implementation side, this is a fine proof of concept but needs a bit of cleaning up. First, you've added two exported functions for testing which really shouldn't be in the final PR. Also, the way the recursion works, having the first iteration in find_last_header before moving to find_header_in_chunks seems wrong. That recursion should just be a single loop that expands its read-ahead starting with the initial values in find_last_header. Second, the default appears to devolve into nearly exactly the same behavior as the current implementation, which makes the timing numbers surprising. It seems a bit odd to write an improvement that needs to be configured to be effective. Given that people are going to get hit by this unexpectedly, it seems like a fairly obscure expert option to know when and how to change the config when a database or view times out on open.

Reading this, I also realized that it might be beneficial to look at using file:pread/2 which does vectored reads. Using that we can read five bytes at the last N header offsets and then only read the exact header size when checking each spot with the header flag set. Using pread/2 here would allow us to scan larger chunks of file at the cost of having a second read for every header flag that's set. Assuming that headers are relatively rare events, it seems like preferring to seek backwards without reading the entire file into RAM would be faster than attempting to avoid any follow-on reads (given that we stop at the first valid header and don't expect many invalid headers). However, this is obviously something that needs to be tested before we can be sure. ---
Github user jaydoane commented on the issue: https://github.com/apache/couchdb-couch/pull/185 @theburge, @kocolosk, @davisp, @eiri: can I get some feedback on this? ---
Github user jaydoane commented on the issue: https://github.com/apache/couchdb-couch/pull/185 I've been testing with this repl command:

```erlang
_time_find_header = fun(File, Algorithm) when is_list(File) ->
    FileSize = filelib:file_size(File),
    {ok, Fd} = file:open(File, [raw, read, binary]),
    {Time, Result} = case Algorithm of
        current ->
            timer:tc(couch_file, find_header, [Fd, FileSize div 4096]);
        default ->
            timer:tc(couch_file, find_last_header, [Fd, FileSize]);
        adaptive ->
            ok = config:set_integer("couchdb", "chunk_max_size", 4096*4096),
            ok = config:set_integer("couchdb", "chunk_exponent_base", 2),
            Res = timer:tc(couch_file, find_last_header, [Fd, FileSize]),
            ok = config:delete("couchdb", "chunk_max_size"),
            ok = config:delete("couchdb", "chunk_exponent_base"),
            Res
    end,
    ok = file:close(Fd),
    case Result of
        {ok, HeaderBin} -> {Time, binary_to_term(HeaderBin)};
        _ -> {Time, Result}
    end
end.
```

and use it like so:

```erlang
(node1@127.0.0.1)78> _time_find_header("/Users/jay/proj/ibm/sample-data/foo.couch", current).
{255,
 {db_header,6,0,0,nil,nil,nil,0,nil,nil,1000,
     <<"1ca5e7003114d27fa1dcfff52b10163f">>,
     [{'node1@127.0.0.1',0}],
     0}}
(node1@127.0.0.1)79> _time_find_header("/Users/jay/proj/ibm/sample-data/foo.couch", default).
{119,
 {db_header,6,0,0,nil,nil,nil,0,nil,nil,1000,
     <<"1ca5e7003114d27fa1dcfff52b10163f">>,
     [{'node1@127.0.0.1',0}],
     0}}
(node1@127.0.0.1)80> _time_find_header("/Users/jay/proj/ibm/sample-data/foo.couch", adaptive).
{138,
 {db_header,6,0,0,nil,nil,nil,0,nil,nil,1000,
     <<"1ca5e7003114d27fa1dcfff52b10163f">>,
     [{'node1@127.0.0.1',0}],
     0}}
```
---