[GitHub] couchdb-couch issue #185: 3061 adaptive header search

2016-10-05 Thread kocolosk
Github user kocolosk commented on the issue:

https://github.com/apache/couchdb-couch/pull/185
  
Wow, great stuff @jaydoane


---


[GitHub] couchdb-couch issue #185: 3061 adaptive header search

2016-10-04 Thread davisp
Github user davisp commented on the issue:

https://github.com/apache/couchdb-couch/pull/185
  
Good work and persistence, @jaydoane!


---


[GitHub] couchdb-couch issue #185: 3061 adaptive header search

2016-10-04 Thread davisp
Github user davisp commented on the issue:

https://github.com/apache/couchdb-couch/pull/185
  
+1


---


[GitHub] couchdb-couch issue #185: 3061 adaptive header search

2016-10-04 Thread davisp
Github user davisp commented on the issue:

https://github.com/apache/couchdb-couch/pull/185
  
Yep, +1 to squashing with that commit message


---


[GitHub] couchdb-couch issue #185: 3061 adaptive header search

2016-10-04 Thread jaydoane
Github user jaydoane commented on the issue:

https://github.com/apache/couchdb-couch/pull/185
  
@davisp, formatting fixed in latest commit.

Here's a proposed commit message:

> Current behavior attempts to read a header at each block, starting at the 
eof and working backwards one block at a time. Deeply buried headers can take a 
long time to find, depending on the depth of the header, server load, etc.
> 
> This commit changes the behavior so that if the last block in the file 
does not contain a header, it switches to using a "vectored" approach where 
multiple candidate header prefixes (of 5 bytes) are read in a single operation, 
greatly speeding up the search. On a modern Linux system with SSD, we see 
improvements up to 15x.

If this looks ok to you, should I rebase as well as squash?
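For illustration, here's a minimal sketch of the vectored probe, assuming (as elsewhere in couch_file) that a header block begins with a `<<1>>` flag byte followed by a 4-byte header length, which is where the 5-byte prefix comes from; the function names are illustrative, not this PR's actual exports:

```erlang
-define(BLOCK_SIZE, 4096).
-define(PREFIX_SIZE, 5).

%% Probe up to Count block-aligned offsets in one file:pread/2 call,
%% newest first, and recurse further back if none holds a header.
find_header_vectored(Fd, Block, Count) when Block >= 0 ->
    Blocks = lists:seq(Block, max(0, Block - Count + 1), -1),
    LocLens = [{B * ?BLOCK_SIZE, ?PREFIX_SIZE} || B <- Blocks],
    {ok, Prefixes} = file:pread(Fd, LocLens),
    case scan_prefixes(Blocks, Prefixes) of
        {ok, HeaderBlock} -> {ok, HeaderBlock};
        not_found -> find_header_vectored(Fd, Block - Count, Count)
    end;
find_header_vectored(_Fd, _Block, _Count) ->
    no_valid_header.

%% A match on the assumed <<1, Len:32>> prefix marks a candidate; real
%% code would follow up by reading and verifying the full header there.
scan_prefixes([Block | _], [<<1, _Len:32/integer>> | _]) ->
    {ok, Block};
scan_prefixes([_ | Blocks], [_ | Prefixes]) ->
    scan_prefixes(Blocks, Prefixes);
scan_prefixes([], []) ->
    not_found.
```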


---


[GitHub] couchdb-couch issue #185: 3061 adaptive header search

2016-10-04 Thread davisp
Github user davisp commented on the issue:

https://github.com/apache/couchdb-couch/pull/185
  
Oh, and we'll want to squash this down into a single commit that has a good 
explanation of the change and the basics of how it works.


---


[GitHub] couchdb-couch issue #185: 3061 adaptive header search

2016-09-22 Thread theburge
Github user theburge commented on the issue:

https://github.com/apache/couchdb-couch/pull/185
  
Thanks for doing the additional investigation @jaydoane. I'm glad the 
performance remains consistently improved.

From memory, I've observed deeply buried headers frequently in MT, although 
admittedly not lots at once (usually a DB or two). However, they're not only on 
.meta files if memory serves, but also on view files (and I'm fairly sure on 
primary database files, although IIRC usually in correlation with some other 
odd behaviour -- so perhaps it shouldn't be considered _normal_ but it's 
certainly possible).

In any case, it's reassuring that the vectored approach is only tried if 
the basic approach fails, given the striking difference in page-in rate.


---


[GitHub] couchdb-couch issue #185: 3061 adaptive header search

2016-09-21 Thread jaydoane
Github user jaydoane commented on the issue:

https://github.com/apache/couchdb-couch/pull/185
  
The following image adds vmpage_io.memory.in and vmpage_io.memory.out to the experiment. All experiments were searching 4 files in parallel. The first run starts around 23:48, using the current algorithm, and it is followed by runs around 23:57, 23:59 and 00:03 using vectored reads.

![image](https://cloud.githubusercontent.com/assets/51209/18733383/5a83b6e8-801f-11e6-84f1-3a4fc074f9e1.png)

The most notable aspect of the graph to me is the consistently high vmpage_io.memory.in for the vectored read. Just eyeballing the graphs, it looks like the areas under the vmpage_io.memory.in curves are similar for both algorithms, which I think is what @theburge was expecting to see.

As for a more realistic MT scenario, I want to clarify something. It's my 
understanding that under normal circumstances when opening a couch file, the 
header is found at the end of the file. In such cases, the existing algorithm 
will be used (since it's been micro-optimized for this case by reading the 
entire remainder of the block in a single disk read operation). Only when the 
existing algorithm fails to find a header do we employ the vectored read 
algorithm.
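A rough sketch of that fast path, under the same assumed block layout and with illustrative names (not the module's actual functions): the final block is read once, in full, and only a non-header result sends us to the vectored search:

```erlang
-define(BLOCK_SIZE, 4096).

%% Read the whole final block in a single operation and check its flag
%% byte; a header block is assumed to start with <<1>>.
last_block_header(Fd, FileSize) ->
    Pos = ((FileSize - 1) div ?BLOCK_SIZE) * ?BLOCK_SIZE,
    case file:pread(Fd, Pos, ?BLOCK_SIZE) of
        {ok, <<1, Rest/binary>>} -> {ok, Rest};
        _ -> no_header  % caller falls back to the vectored search
    end.
```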

The only scenario I know of for which we have deeply buried headers is that 
of .compact.meta files, and the number of those presumably is limited to the 
number of simultaneous compactions that occur at any time. My understanding is 
that concurrency is governed by smoosh, and typical numbers are on the order of 
10. If all of those assumptions are true, a realistic scenario probably 
wouldn't have more than a handful of vectored searches happening at one time on 
any given node, and so my test case of 4 is not terribly unrealistic. 

That said, the image below shows a series of 3 experiments using 8 parallel 
searches; the first with the current algorithm, and the other 2 using vectored 
reads. The main thing to note is that the speed improvement drops to "only" 4x 
the current algorithm.

![image](https://cloud.githubusercontent.com/assets/51209/18734223/076489da-8027-11e6-9517-844e741cd40b.png)

@davisp, I'm all for getting this wrapped up. What are some final tweaks 
you had in mind? Clearly, it should be squashed into a single commit. Are there 
other problems you'd like to see addressed?


---


[GitHub] couchdb-couch issue #185: 3061 adaptive header search

2016-09-16 Thread davisp
Github user davisp commented on the issue:

https://github.com/apache/couchdb-couch/pull/185
  
That seems convincingly better all around than the default implementation. The only real worry I had was RAM usage with the vectored reads, but that appears to be a non-issue (and even an improvement).

If you want to try the tests that @theburge suggested that'd be fine but 
I'd also be willing to start making final tweaks to get this merged.


---


[GitHub] couchdb-couch issue #185: 3061 adaptive header search

2016-09-16 Thread theburge
Github user theburge commented on the issue:

https://github.com/apache/couchdb-couch/pull/185
  
@jaydoane Thanks for running and documenting those experiments; I think those results look very promising. Reducing time taken and user-space memory usage can only be good. :)

FWIW, I'm not convinced major faults are the metric of interest for identifying file reads on a newly opened file in an empty-cache scenario. (My thinking is that they show faults for explicitly mapped ranges that are only available on-disk, whereas page-ins show pages being pulled in for whatever reason via the VM itself. For laughs and demonstration purposes, run `sar -B 1` and try some experiments, like dropping cache and loading a `man` page. You'll typically see the page-in rate exceed the major fault rate once you do the conversion from pages/s to kB/s. In truth, it's probably most comprehensive to look at both fault types and page-in/page-out rates, but that gets trickier.)

The big surprise here for me is the difference in physical disk operations, 
since my understanding would be that the VM would fill complete OS pages to 
satisfy reads, so I can't see why reading N bytes on page boundaries would be 
much different (from an OS perspective) to reading 4kB. I'd love an explanation 
for that (although the user-space benefits are clear to see and very positive), 
just to be sure it's not some other effect.

My final suggestion would be to evaluate a more serious MT-esque scenario: 
how does the performance improvement of the new algorithm vary with lots (10s to 100s) of files opening concurrently? (And maybe with higher background CPU/disk utilisation, if you can set up the experiment.) Remember that that's 
the scenario where we see this as most relevant, and timeouts there are a big 
issue (file opening times out, compaction keels over, file handle remains, 
etc.), so if there's little or no wall clock time improvement in a highly 
concurrent setting, that'd be a shame and undermine some of the practical 
benefit. Obviously, I'd expect only positive things compared to the status quo 
based on the evaluation above, but performance expectations in varying 
scenarios don't always comply with common sense... :) 


---


[GitHub] couchdb-couch issue #185: 3061 adaptive header search

2016-09-15 Thread jaydoane
Github user jaydoane commented on the issue:

https://github.com/apache/couchdb-couch/pull/185
  
@davisp, @theburge, et al, can you guys take a look?


---


[GitHub] couchdb-couch issue #185: 3061 adaptive header search

2016-09-10 Thread jaydoane
Github user jaydoane commented on the issue:

https://github.com/apache/couchdb-couch/pull/185
  
I have simplified the pread/2-based header search, and hopefully addressed most of the issues raised against the previous implementations. For a simple speed comparison using the same file as before, the new algorithm was measured to be about 13x faster than the current algorithm. When 4 such files are searched in parallel, it's only about 7x faster. Nevertheless, it uses many fewer disk reads, imposes less load, and even seems to use less memory, as shown in the graphs below. It also seems like the file:advise/4

The first image shows a single search using the default algorithm, starting around 19:27 and ending around 19:35, followed by a single search using the pread/2 algorithm from around 19:43 to 19:45.

![image](https://cloud.githubusercontent.com/assets/51209/18415081/864ef900-779a-11e6-8ee8-ecb7cf4871f2.png)


![image](https://cloud.githubusercontent.com/assets/51209/18415083/8fbe6714-779a-11e6-8502-9670c8050752.png)




---


[GitHub] couchdb-couch issue #185: 3061 adaptive header search

2016-08-23 Thread davisp
Github user davisp commented on the issue:

https://github.com/apache/couchdb-couch/pull/185
  
@jaydoane pointed out that it's a WIP commit currently, so I stopped iterating style points until he's finished.

Also, reading through this, it's getting complicated pretty quickly between spans and exponents and things. I'd drop all of that and make a much simpler first draft that just reads the last N header positions. Once that's working we can add tweaks to dynamically adjust the number of locations that are read, if we find it to be an issue.

Also, the simplification should make the implementation quite easy. 
Something like:

    find_header(Fd, Size) ->
        Start = (Size div ?BLOCK_SIZE) * ?BLOCK_SIZE,
        %% Build the last 1024 block-aligned offsets (head is the oldest)
        Locs0 = lists:foldl(fun(_, [Last | _] = Acc) ->
            [Last - ?BLOCK_SIZE | Acc]
        end, [Start], lists:seq(1, 1024)),
        Locs = [Loc || Loc <- Locs0, Loc >= 0],
        LocLens = [{Loc, 5} || Loc <- Locs],
        {ok, Bits} = file:pread(Fd#fd.fd, LocLens),
        try
            %% Check newest first; is_valid_header/1 stands in for the real
            %% prefix check plus the follow-up read of the full header
            lists:foreach(fun(Data) ->
                case is_valid_header(Data) of
                    true -> throw({found, Data});
                    false -> ok
                end
            end, lists:reverse(Bits))
        of
            ok when hd(Locs) > 0 -> find_header(Fd, hd(Locs));
            ok -> no_valid_header
        catch throw:{found, Header} ->
            {ok, Header}
        end.


Or something to that effect.


---


[GitHub] couchdb-couch issue #185: 3061 adaptive header search

2016-07-20 Thread jaydoane
Github user jaydoane commented on the issue:

https://github.com/apache/couchdb-couch/pull/185
  
@davisp, thanks for the explanation. You were of course correct that I was 
confusing `pread/3` with `pread/2`. I think it makes sense to look at an 
alternate version of header search which, like you suggest, uses `pread/2` to 
look for header blocks en masse. Do you think we should hold off on this PR 
until I complete those additional tests? I'm also curious about the style 
points you alluded to before. Can you enumerate them?


---


[GitHub] couchdb-couch issue #185: 3061 adaptive header search

2016-07-20 Thread theburge
Github user theburge commented on the issue:

https://github.com/apache/couchdb-couch/pull/185
  
@jaydoane: One additional thought I had was that it's taking good care of 
the page cache with `posix_fadvise`, but perhaps not quite such good care of 
the BEAM's memory usage. Have you measured system memory usage when there's a 
lot of trawling for headers?

I think it's worth considering an explicit garbage collection when the code finishes backtracking for headers, since it might have read a number of large, off-heap binaries.
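A minimal sketch of that idea (a hypothetical wrapper, not code from this PR): run the search, then collect, so references to refc binaries pulled in while backtracking are dropped promptly:

```erlang
%% Wrap a binary-heavy search and force a collection afterwards so the
%% process releases its references to large off-heap binaries.
search_then_gc(SearchFun) ->
    Result = SearchFun(),
    true = erlang:garbage_collect(),
    Result.
```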


---


[GitHub] couchdb-couch issue #185: 3061 adaptive header search

2016-07-19 Thread davisp
Github user davisp commented on the issue:

https://github.com/apache/couchdb-couch/pull/185
  
The implementation looks better. There are a few style issues I'll hold off nitpicking until later.

I only meant "wrong" insofar as it was a common recursion error: pulling an iteration out into a separate function for no good reason.

To clarify the `file:pread/2` comment, I think you're confusing `file:pread/2` with `file:pread/3`, which is totally understandable. I've collectively spent hours over the years poring over the file module and everything beneath it down to the kernel, so I was probably a bit terse on that.

Currently in couch_file we use `file:pread/3` quite a lot. This call basically ends up being a direct translation to the POSIX `pread` call. That is, it's "read N bytes at position X". The `file:pread/2` call is similar, except that in one call through the Erlang IO layers you can say "read `[{N bytes, at X pos} | ...]`" to grab a whole bunch of bytes at different locations. I used to be under the impression that this translated to a POSIX `readv` call, but now that I've spelunked through the VM code it appears to just be calling pread multiple times (although at the C driver level, so it should be noticeably faster than using `file:pread/3` multiple times).
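Concretely, the two call shapes look like this (offsets here are arbitrary examples):

```erlang
%% file:pread/3 -- one positional read: "read 5 bytes at offset 4096"
{ok, Bin} = file:pread(Fd, 4096, 5),

%% file:pread/2 -- many positional reads in a single call; results come
%% back in request order, with eof for any range past the end of file
{ok, [P0, P1, P2]} = file:pread(Fd, [{0, 5}, {4096, 5}, {8192, 5}]).
```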

But in the end it'll depend on drives and various bits, since our headers are only 4KiB apart. SSDs and the like have page reads in that range, which means we may or may not be saving on IO bandwidth; I dunno when the hardware/kernel drops unwanted bytes. However, under load we should at least avoid allocating large amounts of RAM when searching files for headers, as the buffering would live at most in the kernel. Your use of fadvise would also likely help, though for that we'd want to investigate the note about page offsets and the like to make sure those calls are effective.


---


[GitHub] couchdb-couch issue #185: 3061 adaptive header search

2016-07-19 Thread jaydoane
Github user jaydoane commented on the issue:

https://github.com/apache/couchdb-couch/pull/185
  
Thank you @davisp for the thorough review! I will address your points in 
turn:

> the timing numbers reported are in microseconds so I'm not super 
convinced on them. What size of file are you testing against and how far back 
is the header

You are correct that my previous example file `foo.couch` was tiny, and the 
timing numbers were of course too small to be useful. I've updated the sample 
benchmarks with one of the real .couch.meta files I've been using, and it shows 
a 17x speed improvement. Other files with deeply buried headers also show a 
consistently large speedup.

> you've added two exported functions for testing which really shouldn't be 
in the final PR

I agree. I left those for ease of testing, but they are now removed. Hmm, actually, it looks like they may need to stay exposed after all, based on the test failures I'm now seeing...

> The way the recursion works having the first iteration in 
find_last_header before moving to find_header_in_chunks seems wrong. That 
recursion should just be a single loop that expands its read-ahead starting 
with the initial values in find_last_header.

"Wrong" might be a bit strong (since it actually worked) but it was 
certainly _inelegant_. In any case, commit ce91605 eliminated that special case 
by setting the start position past the eof so it could essentially use the same 
algorithm for the whole file.

> the default appears to devolve into nearly exactly the same behavior as 
the current implementation

This was a deliberate decision in order to incur the least possible risk. I assumed we could roll this code out without too much worry and turn it on for selected clusters via configuration. Once we were satisfied that it behaved well in isolation, the defaults could easily be changed to give everyone the adaptive behavior.

>  it might be beneficial to look at using file:pread/2 which does vectored 
reads

We're already using `file:pread/2`, but it sounds like you're suggesting a 
somewhat different algorithm. I'm going to mull your last paragraph over as a 
possible further refinement, but for now, I think I've addressed your 
objections to the existing code.


---


[GitHub] couchdb-couch issue #185: 3061 adaptive header search

2016-07-19 Thread davisp
Github user davisp commented on the issue:

https://github.com/apache/couchdb-couch/pull/185
  
I like the idea here but I think we need to rethink the implementation a 
bit. First, the timing numbers reported are in microseconds so I'm not super 
convinced on them. What size of file are you testing against and how far back 
is the header? I'd almost wager it's only one or two blocks back and the timing is fairly random.

However I know you had a big file we found in production where the adaptive 
algorithm probably fared much better. Obviously we can't give that out but we 
can fake the data by creating files with successively more data after the last 
header fairly easily. I'd like to see a test/benchmark of that nature before we 
get too lost in the weeds.
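For instance, a hedged fixture sketch (this writes the 4 KiB block layout with the header flag byte directly rather than producing a real .couch file, and the fake_header term is made up, so it's only useful for exercising the search itself):

```erlang
%% Write a fake "header" block at offset 0, then bury it under N
%% zero-filled data blocks so a search has to walk back from the eof.
make_buried_header_file(Path, N) ->
    Header = term_to_binary({fake_header, erlang:system_time()}),
    Prefix = <<1, (byte_size(Header)):32/integer, Header/binary>>,
    Pad = binary:copy(<<0>>, 4096 - byte_size(Prefix)),
    Junk = binary:copy(<<0:4096/unit:8>>, N),
    file:write_file(Path, [Prefix, Pad, Junk]).
```

Varying N then gives files with successively more data after the last header.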

On the implementation side, this is a fine proof of concept but needs a bit 
of cleaning up. First, you've added two exported functions for testing which 
really shouldn't be in the final PR. The way the recursion works having the 
first iteration in find_last_header before moving to find_header_in_chunks 
seems wrong. That recursion should just be a single loop that expands its 
read-ahead starting with the initial values in find_last_header.

Second, the default appears to devolve into nearly exactly the same behavior as the current implementation, which makes the timing numbers surprising. It seems a bit odd to write an improvement that needs to be configured to be effective. Given that people are going to get hit by this unexpectedly, knowing when and how to change the config when a database or view times out on open seems like a fairly obscure expert option.

Reading this I also realized that it might be beneficial to look at using file:pread/2 which does vectored reads. Using that, we can read five bytes at the last N header offsets and then only read the exact header size when checking each spot with the header flag set. Using pread/2 here would allow us to scan larger chunks of the file at the cost of a second read for every header flag that's set. Assuming that headers are relatively rare events, it seems like preferring to seek backwards without reading the entire file into RAM would be faster than attempting to avoid any follow-on reads (given that we stop at the first valid header and don't expect many invalid headers). However, this is obviously something that needs to be tested before we can be sure.
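For the follow-up read, something like this sketch could work (assuming a `<<1, Len:32>>` header-block prefix; block-prefix stripping and checksum verification of the header term are elided, and load_header_at is an illustrative name):

```erlang
%% Once a 5-byte probe at Pos matches the header flag, read the
%% advertised number of bytes in a second, exact-size pread.
load_header_at(Fd, Pos) ->
    case file:pread(Fd, Pos, 5) of
        {ok, <<1, HeaderLen:32/integer>>} ->
            file:pread(Fd, Pos + 5, HeaderLen);
        _ ->
            no_header
    end.
```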


---


[GitHub] couchdb-couch issue #185: 3061 adaptive header search

2016-07-18 Thread jaydoane
Github user jaydoane commented on the issue:

https://github.com/apache/couchdb-couch/pull/185
  
@theburge, @kocolosk, @davisp, @eiri: can I get some feedback on this?


---


[GitHub] couchdb-couch issue #185: 3061 adaptive header search

2016-07-15 Thread jaydoane
Github user jaydoane commented on the issue:

https://github.com/apache/couchdb-couch/pull/185
  
I've been testing with this repl command:
```erlang
_time_find_header = fun(File, Algorithm) when is_list(File) ->
    FileSize = filelib:file_size(File),
    {ok, Fd} = file:open(File, [raw, read, binary]),
    {Time, Result} = case Algorithm of
        current ->
            timer:tc(couch_file, find_header, [Fd, FileSize div 4096]);
        default ->
            timer:tc(couch_file, find_last_header, [Fd, FileSize]);
        adaptive ->
            ok = config:set_integer("couchdb", "chunk_max_size", 4096*4096),
            ok = config:set_integer("couchdb", "chunk_exponent_base", 2),
            Res = timer:tc(couch_file, find_last_header, [Fd, FileSize]),
            ok = config:delete("couchdb", "chunk_max_size"),
            ok = config:delete("couchdb", "chunk_exponent_base"),
            Res
    end,
    ok = file:close(Fd),
    case Result of
        {ok, HeaderBin} ->
            {Time, binary_to_term(HeaderBin)};
        _ ->
            {Time, Result}
    end
end.
```
and use it like so:
```erlang
(node1@127.0.0.1)78> _time_find_header("/Users/jay/proj/ibm/sample-data/foo.couch", current).
{255,
 {db_header,6,0,0,nil,nil,nil,0,nil,nil,1000,
            <<"1ca5e7003114d27fa1dcfff52b10163f">>,
            [{'node1@127.0.0.1',0}],
            0}}
(node1@127.0.0.1)79> _time_find_header("/Users/jay/proj/ibm/sample-data/foo.couch", default).
{119,
 {db_header,6,0,0,nil,nil,nil,0,nil,nil,1000,
            <<"1ca5e7003114d27fa1dcfff52b10163f">>,
            [{'node1@127.0.0.1',0}],
            0}}
(node1@127.0.0.1)80> _time_find_header("/Users/jay/proj/ibm/sample-data/foo.couch", adaptive).
{138,
 {db_header,6,0,0,nil,nil,nil,0,nil,nil,1000,
            <<"1ca5e7003114d27fa1dcfff52b10163f">>,
            [{'node1@127.0.0.1',0}],
            0}}


---