On Friday, 11 May 2018 at 13:28:58 UTC, Steven Schveighoffer
wrote:
[...]
I do get the point of having to go outside the cache. I'll look
and see if maybe specifying a 1000 line context helps ;)
Update: nope, still pretty much the same.
I'm sure someone will find some good show off program.
The amount of work done per byte though has to be minimal to
actually see anything.
Right, this is another part of the problem -- if copying is so
rare compared to the other operations, then the difference is
going to be lost in the noise.
What I have learned here is:
1. Ring buffers are really cool (I still love how they work) and
perform as well as normal buffers
2. The use cases are much smaller than I thought
3. In most real-world applications, they are a wash, and not
worth the OS tricks needed to use them.
Now I need to learn all about ring buffers. Do you have any good
starting points?
4. iopipe makes testing with a different kind of buffer really
easy, which was one of my original goals. So I'm glad that
works!
That satisfying feeling when the code works exactly the way you
wanted it to!
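
For anyone else who is new to ring buffers, here is a minimal sketch of the basic idea in Python. It is not iopipe's implementation: the class and method names are made up for illustration, and it handles wrap-around by splitting copies in two, whereas the "OS tricks" mentioned above are presumably (my assumption) virtual-memory mappings that make the wrapped data look contiguous.

```python
class RingBuffer:
    """Fixed-capacity byte ring buffer (illustrative only, not iopipe's)."""

    def __init__(self, capacity):
        self._buf = bytearray(capacity)
        self._cap = capacity
        self._head = 0  # index of the oldest unread byte
        self._size = 0  # number of unread bytes currently stored

    def write(self, data):
        """Store as much of data as fits; return the number of bytes stored."""
        n = min(len(data), self._cap - self._size)
        tail = (self._head + self._size) % self._cap
        first = min(n, self._cap - tail)  # bytes that fit before the physical end
        self._buf[tail:tail + first] = data[:first]
        self._buf[0:n - first] = data[first:n]  # the rest wraps to the start
        self._size += n
        return n

    def read(self, n):
        """Remove and return up to n bytes, oldest first."""
        n = min(n, self._size)
        first = min(n, self._cap - self._head)
        out = bytes(self._buf[self._head:self._head + first])
        out += bytes(self._buf[0:n - first])  # wrapped part, if any
        self._head = (self._head + n) % self._cap
        self._size -= n
        return out
```

Note that a read spanning the wrap point has to stitch two slices together. Avoiding that kind of copy is the whole appeal of the mapped version, and as noted above it only pays off when copying is a noticeable share of the work.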
I'm going to (obviously) leave them there, hoping that someone
finds a good use case, but I can say that my extreme excitement
at getting it to work was depressed quite a bit when I found it
didn't really gain much in terms of performance for the use
cases I have been doing.
I'm sure someone will find a place where it's useful.
However, this example *does* show the power of iopipe -- it
handles all flavors of Unicode with one template function, is
quite straightforward (though I want to abstract the line
tracking code, that stuff is really tricky to get right). Oh,
and it's roughly 10x faster than grep, and a bunch faster
than fgrep, at least on my machine ;) I'm tempted to add
regex processing to see if it still beats grep.
Should be mostly trivial, in fact. I mean, back in our first
designs for iopipe I already wanted regex to work with it.
Basically: once a match has started, extend the window until it
either completes or fails, then release everything up to the
next potential starting point.
I'm thinking it's even simpler than that. All matches are dead
on a line break (it's how grep normally works), so you simply
have to parse the lines and run the regex on each one. What I
don't know is how much it costs the regex to start up and run on
an individual line.
One thing I could do to amortize that cost is keep 2N lines in the
buffer, and run the regex on a whole context's worth of lines,
then dump them all.
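
To make the two options concrete, here is a rough sketch in Python rather than D. The function names and the `batch_size` parameter are hypothetical, and none of this uses iopipe or std.regex. It assumes each line is passed in without its trailing newline and that the pattern can never match a newline character, so every match falls inside a single line.

```python
import bisect
import re


def grep_per_line(lines, pattern):
    """Run the regex once per line; return (line_number, line) for each hit."""
    rx = re.compile(pattern)
    return [(n, line) for n, line in enumerate(lines, start=1) if rx.search(line)]


def grep_batched(lines, pattern, batch_size):
    """Run the regex once per batch of lines, then map matches back to lines."""
    # MULTILINE keeps ^ and $ meaning "start/end of a line" in the joined text.
    rx = re.compile(pattern, re.MULTILINE)
    hits = []
    for start in range(0, len(lines), batch_size):
        batch = lines[start:start + batch_size]
        text = "\n".join(batch)
        # offsets[i] is the position in text where batch[i] begins.
        offsets = []
        pos = 0
        for line in batch:
            offsets.append(pos)
            pos += len(line) + 1  # +1 for the joining newline
        last = -1
        for m in rx.finditer(text):
            idx = bisect.bisect_right(offsets, m.start()) - 1
            if idx != last:  # report each matching line only once
                hits.append((start + idx + 1, batch[idx]))
                last = idx
    return hits
```

Both functions should return the same hits under those assumptions. Whether batching is actually faster depends on how expensive the per-call regex startup is, which is exactly the open question here and would need measuring.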
iopipe is looking like a great library!
I don't get why grep is so bad at this, since it is supposedly
doing the matching without line boundaries. I was actually
quite shocked when iopipe was that much faster -- even when I'm
not asking grep to print out line numbers (so it doesn't
actually ever really have to keep track of lines).
-Steve
That reminds me of this great blog post detailing grep's
performance:
http://ridiculousfish.com/blog/posts/old-age-and-treachery.html
Also, one of the original authors of grep wrote about its
performance optimizations, for anyone interested:
https://lists.freebsd.org/pipermail/freebsd-current/2010-August/019310.html