Re: vibe.d benchmarks

2016-01-08 Thread Nikolay via Digitalmars-d

On Friday, 8 January 2016 at 04:02:39 UTC, Etienne Cimon wrote:


Isn't it possible that those cache misses will be irrelevant once 
the requests actually do something? When a lot of different 
requests are competing for cache lines, I'd assume things get 
shuffled around enough to change these readings.


I believe the cache-miss problem is related to the old vibe.d 
version. There were too many context switches. Now vibe.d uses the 
SO_REUSEPORT socket option, which reduces the context switch count 
radically.
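
For illustration, a minimal std.socket sketch of the SO_REUSEPORT 
pattern described above (Phobos's SocketOption enum does not, to my 
knowledge, expose SO_REUSEPORT directly, so the value 15 below is the 
Linux-specific constant and an assumption about the target platform):

import std.socket;

void main()
{
    // Each worker process/thread opens its own listening socket with
    // SO_REUSEPORT set; the kernel then spreads incoming connections
    // across the listeners instead of waking every worker, which is
    // what cuts down on context switches.
    enum SO_REUSEPORT = 15; // Linux-specific constant (assumption)

    auto listener = new TcpSocket();
    listener.setOption(SocketOptionLevel.SOCKET, SocketOption.REUSEADDR, 1);
    listener.setOption(SocketOptionLevel.SOCKET, cast(SocketOption) SO_REUSEPORT, 1);
    listener.bind(new InternetAddress("127.0.0.1", 8080));
    listener.listen(128);

    // ... accept() and serve connections as usual ...
}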


Re: vibe.d benchmarks

2016-01-07 Thread Etienne Cimon via Digitalmars-d

On Wednesday, 6 January 2016 at 08:21:00 UTC, Atila Neves wrote:

On Tuesday, 5 January 2016 at 13:09:55 UTC, Etienne Cimon wrote:

On Tuesday, 5 January 2016 at 10:11:36 UTC, Atila Neves wrote:

[...]


The Rust mio library doesn't seem to be doing any black magic. 
I wonder how libasync could be optimized to match it.


No black magic, it's a thin wrapper over epoll. But it was 
faster than boost::asio and vibe.d the last time I measured.


Atila


You tested D+mio, but the equivalent would probably be D+libasync, 
as it is also a standalone library and a thin wrapper around epoll.


Re: vibe.d benchmarks

2016-01-07 Thread Etienne Cimon via Digitalmars-d

On Wednesday, 6 January 2016 at 08:24:10 UTC, Atila Neves wrote:

On Tuesday, 5 January 2016 at 14:15:18 UTC, rsw0x wrote:
On Tuesday, 5 January 2016 at 13:09:55 UTC, Etienne Cimon 
wrote:

On Tuesday, 5 January 2016 at 10:11:36 UTC, Atila Neves wrote:

[...]


The Rust mio library doesn't seem to be doing any black 
magic. I wonder how libasync could be optimized to match it.


Have you used perf (or similar) to attempt to find bottlenecks 
yet?


Extensively. I optimised my D code as much as I know how to. 
And that's the same code that gets driven by vibe.d, 
boost::asio and mio.


Nothing stands out anymore in perf. The only main difference I 
can see is that the vibe.d version has far more cache misses. I 
used perf to try and figure out where those came from and 
included them in the email I sent to Soenke.


Perf is a bit hard to understand if you've never used it 
before, but it's also very powerful.


Oh, I know. :)

Atila


Isn't it possible that those cache misses will be irrelevant once the 
requests actually do something? When a lot of different requests are 
competing for cache lines, I'd assume things get shuffled around enough 
to change these readings.


Re: vibe.d benchmarks

2016-01-06 Thread Atila Neves via Digitalmars-d

On Tuesday, 5 January 2016 at 13:09:55 UTC, Etienne Cimon wrote:

On Tuesday, 5 January 2016 at 10:11:36 UTC, Atila Neves wrote:

[...]


The Rust mio library doesn't seem to be doing any black magic. 
I wonder how libasync could be optimized to match it.


No black magic, it's a thin wrapper over epoll. But it was faster 
than boost::asio and vibe.d the last time I measured.


Atila


Re: vibe.d benchmarks

2016-01-06 Thread Atila Neves via Digitalmars-d

On Tuesday, 5 January 2016 at 14:15:18 UTC, rsw0x wrote:

On Tuesday, 5 January 2016 at 13:09:55 UTC, Etienne Cimon wrote:

On Tuesday, 5 January 2016 at 10:11:36 UTC, Atila Neves wrote:
On Thursday, 31 December 2015 at 08:23:26 UTC, Laeeth Isharc 
wrote:

 [...]


vibe.d _was_ faster than Go. I redid the measurements 
recently once I wrote an MQTT broker in Rust, and it was 
losing to boost::asio, Rust's mio, Go, and Java. I told 
Soenke about it.


I know it's vibe.d and not my code because after I got the 
disappointing results I wrote bindings from both boost::asio 
and mio to my D code and the winner of the benchmarks shifted 
to the D/mio combo (previously it was Rust - I figured the 
library was the cause and not the language and I was right).


I'd've put up new benchmarks already, I'm only waiting so I 
can show vibe.d in a good light.


Atila


The Rust mio library doesn't seem to be doing any black magic. 
I wonder how libasync could be optimized to match it.


Have you used perf (or similar) to attempt to find bottlenecks 
yet?


Extensively. I optimised my D code as much as I know how to. And 
that's the same code that gets driven by vibe.d, boost::asio and 
mio.


Nothing stands out anymore in perf. The only main difference I 
can see is that the vibe.d version has far more cache misses. I 
used perf to try and figure out where those came from and 
included them in the email I sent to Soenke.


Perf is a bit hard to understand if you've never used it 
before, but it's also very powerful.


Oh, I know. :)

Atila


Re: vibe.d benchmarks

2016-01-05 Thread Etienne via Digitalmars-d

On Tuesday, 5 January 2016 at 14:45:18 UTC, Nikolay wrote:

On Tuesday, 5 January 2016 at 14:15:18 UTC, rsw0x wrote:


Have you used perf (or similar) to attempt to find bottlenecks 
yet?




I used perf and wrote my result here: 
http://forum.rejectedsoftware.com/groups/rejectedsoftware.vibed/thread/1670/?page=2


As Sönke Ludwig said, direct epoll usage can give more than a 200% 
improvement over libevent.


libasync is the result of an attempt to use epoll directly


Re: vibe.d benchmarks

2016-01-05 Thread Nikolay via Digitalmars-d

On Tuesday, 5 January 2016 at 14:15:18 UTC, rsw0x wrote:


Have you used perf (or similar) to attempt to find bottlenecks 
yet?




I used perf and wrote my result here: 
http://forum.rejectedsoftware.com/groups/rejectedsoftware.vibed/thread/1670/?page=2


As Sönke Ludwig said, direct epoll usage can give more than a 200% 
improvement over libevent.


Re: vibe.d benchmarks

2016-01-05 Thread rsw0x via Digitalmars-d

On Tuesday, 5 January 2016 at 13:09:55 UTC, Etienne Cimon wrote:

On Tuesday, 5 January 2016 at 10:11:36 UTC, Atila Neves wrote:
On Thursday, 31 December 2015 at 08:23:26 UTC, Laeeth Isharc 
wrote:

 [...]


vibe.d _was_ faster than Go. I redid the measurements recently 
once I wrote an MQTT broker in Rust, and it was losing to 
boost::asio, Rust's mio, Go, and Java. I told Soenke about it.


I know it's vibe.d and not my code because after I got the 
disappointing results I wrote bindings from both boost::asio 
and mio to my D code and the winner of the benchmarks shifted 
to the D/mio combo (previously it was Rust - I figured the 
library was the cause and not the language and I was right).


I'd've put up new benchmarks already, I'm only waiting so I 
can show vibe.d in a good light.


Atila


The Rust mio library doesn't seem to be doing any black magic. 
I wonder how libasync could be optimized to match it.


Have you used perf (or similar) to attempt to find bottlenecks yet?

If you use Linux and LDC or GDC, I found it worked fine for my 
needs. Just compile with optimizations and frame pointers 
(-fno-omit-frame-pointer for GDC, -disable-fp-elim for LDC) or 
DWARF debug symbols. I can't remember which generates a better 
call stack right now, so it's probably worth playing around with 
the --call-graph flag (fp or dwarf).


Perf is a bit hard to understand if you've never used it before, 
but it's also very powerful.


Bye.


Re: vibe.d benchmarks

2016-01-05 Thread Etienne Cimon via Digitalmars-d

On Tuesday, 5 January 2016 at 10:11:36 UTC, Atila Neves wrote:
On Thursday, 31 December 2015 at 08:23:26 UTC, Laeeth Isharc 
wrote:

 [...]


vibe.d _was_ faster than Go. I redid the measurements recently 
once I wrote an MQTT broker in Rust, and it was losing to 
boost::asio, Rust's mio, Go, and Java. I told Soenke about it.


I know it's vibe.d and not my code because after I got the 
disappointing results I wrote bindings from both boost::asio 
and mio to my D code and the winner of the benchmarks shifted 
to the D/mio combo (previously it was Rust - I figured the 
library was the cause and not the language and I was right).


I'd've put up new benchmarks already, I'm only waiting so I can 
show vibe.d in a good light.


Atila


The Rust mio library doesn't seem to be doing any black magic. I 
wonder how libasync could be optimized to match it.


Re: vibe.d benchmarks

2016-01-05 Thread Atila Neves via Digitalmars-d
On Thursday, 31 December 2015 at 08:23:26 UTC, Laeeth Isharc 
wrote:

On Wednesday, 30 December 2015 at 20:32:08 UTC, yawniek wrote:

Sönke is already on it.

http://forum.rejectedsoftware.com/groups/rejectedsoftware.vibed/post/29110


i guess it's not enough, there are still things that make 
vibe.d slow.


i quickly tried
https://github.com/nanoant/WebFrameworkBenchmark.git
which is a really simple benchmark, but it gives an idea of the 
general overhead.


single core results against go-fasthttp with GOMAXPROCS=1 and 
vibe distribution disabled on a c4.2xlarge ec2 instance 
(archlinux):


vibe.d 0.7.23 with ldc
Requests/sec:  52102.06

vibe.d 0.7.26 with dmd
Requests/sec:  44438.47

vibe.d 0.7.26 with ldc
Requests/sec:  53996.62

go-fasthttp:
Requests/sec: 152573.32

go:
Requests/sec:  62310.04

it's sad.

i am aware that go-fasthttp is a very simplistic, stripped-down 
webserver and vibe is almost a full-blown framework. still, it 
should be D and vibe.d's USP to be faster than the fastest in the 
world, not limping around at the end of the charts.


Isn't there a decent chance the bottleneck is vibe.d's JSON 
implementation rather than the framework as such? We know 
from Atila's MQTT project that vibe.d can be significantly 
faster than Go, and we also know that its JSON implementation 
isn't that fast. Replacing it with FastJSON might be interesting.

 Sadly I don't have time to do that myself.


vibe.d _was_ faster than Go. I redid the measurements recently 
once I wrote an MQTT broker in Rust, and it was losing to 
boost::asio, Rust's mio, Go, and Java. I told Soenke about it.


I know it's vibe.d and not my code because after I got the 
disappointing results I wrote bindings from both boost::asio and 
mio to my D code and the winner of the benchmarks shifted to the 
D/mio combo (previously it was Rust - I figured the library was 
the cause and not the language and I was right).


I'd've put up new benchmarks already, I'm only waiting so I can 
show vibe.d in a good light.


Atila




Re: vibe.d benchmarks

2016-01-04 Thread Sönke Ludwig via Digitalmars-d

On 30.12.2015 at 21:32, yawniek wrote:

Sönke is already on it.

http://forum.rejectedsoftware.com/groups/rejectedsoftware.vibed/post/29110



i guess it's not enough, there are still things that make vibe.d slow.

i quickly tried
https://github.com/nanoant/WebFrameworkBenchmark.git
which is a really simple benchmark, but it gives an idea of the general
overhead.

single core results against go-fasthttp with GOMAXPROCS=1 and vibe
distribution disabled on a c4.2xlarge ec2 instance (archlinux):

(...)
it's sad.



Can you try with the latest GIT master? There are some important 
optimizations which are not in 0.7.26 (which has at least one 
performance regression).




Re: vibe.d benchmarks

2016-01-04 Thread Etienne Cimon via Digitalmars-d

On Monday, 4 January 2016 at 10:32:41 UTC, Daniel Kozak wrote:

On Sat, 02 Jan 2016 03:00:19, Etienne Cimon via Digitalmars-d wrote:



On Friday, 1 January 2016 at 11:38:53 UTC, Daniel Kozak wrote:
> On Thursday, 31 December 2015 at 18:23:17 UTC, Etienne Cimon 
> wrote:

>> [...]
>
> ?

With libasync, you can run multiple instances of your vibe.d 
server and the linux kernel will round robin the incoming 
connections.


Yes, but I'm talking about one instance of vibe.d with multiple 
workerThreads, which performs really badly with libasync.


Yes, I will investigate this.


Re: vibe.d benchmarks

2016-01-04 Thread Daniel Kozak via Digitalmars-d
On Mon, 4 Jan 2016 08:37:10 +0100, Sönke Ludwig via Digitalmars-d wrote:

> On 31.12.2015 at 13:44, Daniel Kozak via Digitalmars-d wrote:
> >
> > vibe.d (probably) has a bug: it uses threadsPerCPU instead of coresPerCPU
> > in setupWorkerThreads; here is a commit which makes it possible to
> > set it up by hand.
> >
> > https://github.com/rejectedsoftware/vibe.d/commit/f946c3a840eab4ef5f7b98906a6eb143509e1447
> >
> > (I just modified the vibe.d code to use all my 4 cores and it helps a lot)
> >
> > To use more threads, it must be set up with the distribute option:
> >
> > settings.options |= HTTPServerOption.distribute;
> > //setupWorkerThreads(4); // works with master
> > listenHTTP(settings, &hello);  
> 
> For me, threadsPerCPU correctly yields the number of logical cores
> (i.e. coresPerCPU * 2 for hyper threading enabled CPUs), which is
> usually the optimal number of threads*. What numbers did you
> get/expect?
> 

On my AMD FX4100 (4 cores) and my AMD A10-7850K (4 cores) it
returns 1.

> One actual issue could be that, judging by the name, these functions 
> would yield the wrong numbers for multi-processor systems. I didn't
> try that so far. Do we have a function in Phobos/Druntime to get the
> number of processors?
> 
> * Granted, HT won't help for pure I/O payloads, but worker threads
> are primarily meant for computational tasks.




Re: vibe.d benchmarks

2016-01-04 Thread Daniel Kozak via Digitalmars-d
On Sat, 02 Jan 2016 03:00:19, Etienne Cimon via Digitalmars-d wrote:

> On Friday, 1 January 2016 at 11:38:53 UTC, Daniel Kozak wrote:
> > On Thursday, 31 December 2015 at 18:23:17 UTC, Etienne Cimon 
> > wrote:  
> >> On Thursday, 31 December 2015 at 13:29:49 UTC, Daniel Kozak 
> >> wrote:  
> >>> On Thursday, 31 December 2015 at 12:09:30 UTC, Etienne Cimon 
> >>> wrote:  
>  [...]  
> >>>
> >>> When I use HTTPServerOption.distribute with libevent I get 
> >>> better performance, but with libasync it drops from 2 
> >>> req/s to 80 req/s. So maybe there is some other performance problem
> >>
> >> I launch libasync programs as multiple processes, a bit like 
> >> postgresql. The TCP listening is done with REUSEADDR, so the 
> >> kernel can distribute it and it scales linearly without any 
> >> fear of contention on the GC. My globals go in redis or 
> >> databases  
> >
> > ?  
> 
> With libasync, you can run multiple instances of your vibe.d 
> server and the linux kernel will round robin the incoming 
> connections.

Yes, but I'm talking about one instance of vibe.d with multiple
workerThreads, which performs really badly with libasync.



Re: vibe.d benchmarks

2016-01-03 Thread Sönke Ludwig via Digitalmars-d

On 31.12.2015 at 13:44, Daniel Kozak via Digitalmars-d wrote:


vibe.d (probably) has a bug: it uses threadsPerCPU instead of coresPerCPU in
setupWorkerThreads; here is a commit which makes it possible to set it up by
hand.

https://github.com/rejectedsoftware/vibe.d/commit/f946c3a840eab4ef5f7b98906a6eb143509e1447

(I just modified the vibe.d code to use all my 4 cores and it helps a lot)

To use more threads, it must be set up with the distribute option:

settings.options |= HTTPServerOption.distribute;
//setupWorkerThreads(4); // works with master
listenHTTP(settings, &hello);


For me, threadsPerCPU correctly yields the number of logical cores (i.e. 
coresPerCPU * 2 for hyper threading enabled CPUs), which is usually the 
optimal number of threads*. What numbers did you get/expect?


One actual issue could be that, judging by the name, these functions 
would yield the wrong numbers for multi-processor systems. I haven't 
tried that so far. Do we have a function in Phobos/Druntime to get the 
number of processors?


* Granted, HT won't help for pure I/O payloads, but worker threads are 
primarily meant for computational tasks.
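
For reference, a small sketch of the relevant druntime/Phobos calls 
(assuming core.cpuid and std.parallelism behave as documented; the 
actual figures of course depend on the machine):

import core.cpuid : coresPerCPU, threadsPerCPU;
import std.parallelism : totalCPUs;
import std.stdio : writefln;

void main()
{
    // core.cpuid derives these counts from CPUID; as reported in this
    // thread, it can return 1 on some AMD parts.
    writefln("coresPerCPU:   %s", coresPerCPU());
    writefln("threadsPerCPU: %s", threadsPerCPU());

    // std.parallelism.totalCPUs asks the OS for the number of logical
    // CPUs, which also covers multi-processor systems.
    writefln("totalCPUs:     %s", totalCPUs);
}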


Re: vibe.d benchmarks

2016-01-03 Thread Sönke Ludwig via Digitalmars-d

On 04.01.2016 at 04:27, Etienne Cimon wrote:

On Sunday, 3 January 2016 at 22:16:08 UTC, Nick B wrote:

can someone tell me what changes need to be committed, so that we have
a chance at getting some decent (or even average) benchmark numbers?


Considering that the best benchmarks are from tools that have all the C
calls inlined, I think the best optimizations would be in pragma(inline,
true), even doing inlining for fiber context changes.


Fiber context changes are not a significant influence. I created a 
proof-of-concept HTTP server based on vanilla OS calls a while ago and 
got almost no slowdown compared to using only callbacks. The performance 
level was around 200% of current vibe.d.
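
For readers unfamiliar with the mechanism, a minimal druntime sketch of 
what a fiber context switch is (illustrative only; vibe.d's tasks are 
built on top of core.thread.Fiber):

import core.thread : Fiber;
import std.stdio : writeln;

void main()
{
    // A fiber switch is a userspace stack swap via Fiber.call() and
    // Fiber.yield(); this is the cost being discussed above.
    auto fib = new Fiber({
        writeln("step 1");
        Fiber.yield();
        writeln("step 2");
    });

    fib.call(); // runs until the first yield
    fib.call(); // resumes and finishes
}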


Having said that, the latest version (0.7.27-alpha.3) contains some 
important performance optimizations over 0.7.26 and should be used for 
comparisons. 0.7.26 also had a performance regression related to allocators.


Re: vibe.d benchmarks

2016-01-03 Thread Etienne Cimon via Digitalmars-d

On Sunday, 3 January 2016 at 22:16:08 UTC, Nick B wrote:
can someone tell me what changes need to be committed, so that 
we have a chance at getting some decent (or even average) 
benchmark numbers?


Considering that the best benchmarks are from tools that have all 
the C calls inlined, I think the best optimizations would be in 
pragma(inline, true), even doing inlining for fiber context 
changes.
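
As a small illustration of the kind of hint being suggested (a sketch 
only; whether it actually helps vibe.d's hot path would need measuring), 
D's inlining pragma looks like this:

// Force-inline a tiny hot-path helper; with `true` the compiler emits an
// error if it cannot honour the request, so regressions stay visible.
pragma(inline, true)
size_t alignUp(size_t value, size_t alignment)
{
    // Round `value` up to the next multiple of `alignment` (a power of two).
    return (value + alignment - 1) & ~(alignment - 1);
}

unittest
{
    assert(alignUp(13, 8) == 16);
}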


Re: vibe.d benchmarks

2016-01-03 Thread Nick B via Digitalmars-d

On Thursday, 31 December 2015 at 12:44:37 UTC, Daniel Kozak wrote:

On Thu, 31 Dec 2015 12:26:12, yawniek via Digitalmars-d wrote:





obvious typo, and thanks for investigating, etienne.

@daniel: i got similar results over the network.
i want to redo them with a more optimized setup though. my wrk
server was too weak.

the local results are still relevant, as it's a common setup to 
have nginx distribute to a few vibe instances locally.


One thing I forgot to mention: I had to modify a few things.

vibe.d (probably) has a bug: it uses threadsPerCPU instead of 
coresPerCPU in setupWorkerThreads; here is a commit which makes 
it possible to set it up by hand.


https://github.com/rejectedsoftware/vibe.d/commit/f946c3a840eab4ef5f7b98906a6eb143509e1447

(I just modified the vibe.d code to use all my 4 cores and it 
helps a lot)


can someone tell me what changes need to be committed, so that we 
have a chance at getting some decent (or even average) benchmark 
numbers?





Re: vibe.d benchmarks

2016-01-02 Thread Etienne Cimon via Digitalmars-d
On Saturday, 2 January 2016 at 10:05:56 UTC, Sebastiaan Koppe 
wrote:
That is nice. Didn't know that. That would enable 
zero-downtime updates, right?

Yes, although you might still break existing connections unless 
you can make the previous process wait for the existing 
connections to close after killing it.


I use Docker a lot, so normally I run a proxy container in front 
of the app containers and have it handle SSL and virtual host 
routing.
I haven't needed to migrate out of my Linux server yet (12c/24t, 
128 GB), but when I do, I'll just add another one and go for DNS 
round robin. I use Cloudflare currently, and in practice you can 
add/remove A records and it'll round-robin through them.


If your server application is capable of running as multiple 
instances, it's only a matter of having the database/cache 
servers accessible from another server, and you've got very 
efficient load balancing that doesn't require any proxies.


Re: vibe.d benchmarks

2016-01-02 Thread Sebastiaan Koppe via Digitalmars-d

On Saturday, 2 January 2016 at 03:00:19 UTC, Etienne Cimon wrote:
With libasync, you can run multiple instances of your vibe.d 
server and the linux kernel will round robin the incoming 
connections.


That is nice. Didn't know that. That would enable 
zero-downtime updates, right?


I use Docker a lot, so normally I run a proxy container in front 
of the app containers and have it handle SSL and virtual host 
routing.


Re: vibe.d benchmarks

2016-01-01 Thread Etienne Cimon via Digitalmars-d

On Friday, 1 January 2016 at 11:38:53 UTC, Daniel Kozak wrote:
On Thursday, 31 December 2015 at 18:23:17 UTC, Etienne Cimon 
wrote:
On Thursday, 31 December 2015 at 13:29:49 UTC, Daniel Kozak 
wrote:
On Thursday, 31 December 2015 at 12:09:30 UTC, Etienne Cimon 
wrote:

[...]


When I use HTTPServerOption.distribute with libevent I get 
better performance, but with libasync it drops from 2 
req/s to 80 req/s. So maybe there is some other performance problem.


I launch libasync programs as multiple processes, a bit like 
postgresql. The TCP listening is done with REUSEADDR, so the 
kernel can distribute it and it scales linearly without any 
fear of contention on the GC. My globals go in redis or 
databases


?


With libasync, you can run multiple instances of your vibe.d 
server and the Linux kernel will round-robin the incoming 
connections.


Re: vibe.d benchmarks

2016-01-01 Thread Daniel Kozak via Digitalmars-d
On Thursday, 31 December 2015 at 18:23:17 UTC, Etienne Cimon 
wrote:
On Thursday, 31 December 2015 at 13:29:49 UTC, Daniel Kozak 
wrote:
On Thursday, 31 December 2015 at 12:09:30 UTC, Etienne Cimon 
wrote:

[...]


When I use HTTPServerOption.distribute with libevent I get 
better performance, but with libasync it drops from 2 req/s 
to 80 req/s. So maybe there is some other performance problem.


I launch libasync programs as multiple processes, a bit like 
postgresql. The TCP listening is done with REUSEADDR, so the 
kernel can distribute it and it scales linearly without any 
fear of contention on the GC. My globals go in redis or 
databases


?


Re: vibe.d benchmarks

2015-12-31 Thread Etienne Cimon via Digitalmars-d

On Thursday, 31 December 2015 at 13:29:49 UTC, Daniel Kozak wrote:
On Thursday, 31 December 2015 at 12:09:30 UTC, Etienne Cimon 
wrote:


That would be the other way around. TCP_NODELAY is not enabled 
in the local connection, which makes a ~20-30ms difference per 
request on keep-alive connections and is the bottleneck in 
this case. Enabling it makes the library competitive in these 
benchmarks.


When I use HTTPServerOption.distribute with libevent I get 
better performance, but with libasync it drops from 2 req/s 
to 80 req/s. So maybe there is some other performance problem.


I launch libasync programs as multiple processes, a bit like 
PostgreSQL. The TCP listening is done with REUSEADDR, so the 
kernel can distribute it and it scales linearly without any fear 
of contention on the GC. My globals go in Redis or databases.


Re: vibe.d benchmarks

2015-12-31 Thread Ola Fosheim Grøstad via Digitalmars-d

On Thursday, 31 December 2015 at 15:51:50 UTC, yawniek wrote:
it's actually pretty realistic, one point of having a fast 
webserver is that you can save on resources.

you get a cheap box and have everything there. very common.


It does not scale. If you can do it, then you don't really have a 
real need for the throughput in the first place...




Re: vibe.d benchmarks

2015-12-31 Thread yawniek via Digitalmars-d
On Thursday, 31 December 2015 at 15:35:45 UTC, Ola Fosheim 
Grøstad wrote:
I don't know how the benchmarks are set up, but I would assume 
that they don't use a local socket. I wonder if they run the 
database on the same machine, maybe they do, but that's not 
realistic, so they really should not.


it's actually pretty realistic, one point of having a fast 
webserver is that you can save on resources.

you get a cheap box and have everything there. very common.


Re: vibe.d benchmarks

2015-12-31 Thread Ola Fosheim Grøstad via Digitalmars-d
On Thursday, 31 December 2015 at 12:09:30 UTC, Etienne Cimon 
wrote:
That would be the other way around. TCP_NODELAY is not enabled 
in the local connection, which makes a ~20-30ms difference per 
request on keep-alive connections and is the bottleneck in this 
case. Enabling it makes the library competitive in these 
benchmarks.


I don't know how the benchmarks are set up, but I would assume 
that they don't use a local socket. I wonder if they run the 
database on the same machine, maybe they do, but that's not 
realistic, so they really should not.




Re: vibe.d benchmarks

2015-12-31 Thread Daniel Kozak via Digitalmars-d
On Thursday, 31 December 2015 at 12:09:30 UTC, Etienne Cimon 
wrote:


That would be the other way around. TCP_NODELAY is not enabled 
in the local connection, which makes a ~20-30ms difference per 
request on keep-alive connections and is the bottleneck in this 
case. Enabling it makes the library competitive in these 
benchmarks.


When I use HTTPServerOption.distribute with libevent I get better 
performance, but with libasync it drops from 2 req/s to 80 
req/s. So maybe there is some other performance problem.


Re: vibe.d benchmarks

2015-12-31 Thread Daniel Kozak via Digitalmars-d
On Thu, 31 Dec 2015 12:26:12, yawniek via Digitalmars-d wrote:

> On Thursday, 31 December 2015 at 12:09:30 UTC, Etienne Cimon 
> wrote:
> > On Thursday, 31 December 2015 at 08:51:31 UTC, yawniek wrote:  
> >> the libasync problem seems to be because of TCP_NODELAY 
> >> not being deactivated for local connections.
> >
> > That would be the other way around. TCP_NODELAY is not enabled 
> > in the local connection, which makes a ~20-30ms difference per 
> > request on keep-alive connections and is the bottleneck in this 
> > case. Enabling it makes the library competitive in these 
> > benchmarks.  
> 
> obvious typo, and thanks for investigating, etienne.
> 
> @daniel: i got similar results over the network.
> i want to redo them with a more optimized setup though. my wrk 
> server was too weak.
> 
> the local results are still relevant, as it's a common setup to 
> have nginx distribute to a few vibe instances locally.

One thing I forgot to mention: I had to modify a few things.

vibe.d (probably) has a bug: it uses threadsPerCPU instead of coresPerCPU in
setupWorkerThreads; here is a commit which makes it possible to set it up by
hand.

https://github.com/rejectedsoftware/vibe.d/commit/f946c3a840eab4ef5f7b98906a6eb143509e1447

(I just modified the vibe.d code to use all my 4 cores and it helps a lot)

To use more threads, it must be set up with the distribute option:

settings.options |= HTTPServerOption.distribute;
//setupWorkerThreads(4); // works with master
listenHTTP(settings, &hello);
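
For context, a minimal sketch of a complete vibe.d "hello" server using 
these settings (assuming the vibe.d 0.7.x API and the default 
VibeDefaultMain entry point; setupWorkerThreads was only available on 
git master at the time, per the commit above):

import vibe.d;

void hello(HTTPServerRequest req, HTTPServerResponse res)
{
    res.writeBody("Hello, World!");
}

shared static this()
{
    // setupWorkerThreads(4); // git master only at the time

    auto settings = new HTTPServerSettings;
    settings.port = 8080;
    // Dispatch incoming requests across the worker threads instead of
    // handling everything on the main event loop thread.
    settings.options |= HTTPServerOption.distribute;

    listenHTTP(settings, &hello);
}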





Re: vibe.d benchmarks

2015-12-31 Thread yawniek via Digitalmars-d
On Thursday, 31 December 2015 at 12:09:30 UTC, Etienne Cimon 
wrote:

On Thursday, 31 December 2015 at 08:51:31 UTC, yawniek wrote:
the libasync problem seems to be because of TCP_NODELAY 
not being deactivated for local connections.


That would be the other way around. TCP_NODELAY is not enabled 
in the local connection, which makes a ~20-30ms difference per 
request on keep-alive connections and is the bottleneck in this 
case. Enabling it makes the library competitive in these 
benchmarks.


obvious typo, and thanks for investigating, etienne.

@daniel: i got similar results over the network.
i want to redo them with a more optimized setup though. my wrk 
server was too weak.


the local results are still relevant, as it's a common setup to 
have nginx distribute to a few vibe instances locally.


Re: vibe.d benchmarks

2015-12-31 Thread Etienne Cimon via Digitalmars-d

On Thursday, 31 December 2015 at 08:51:31 UTC, yawniek wrote:
On Thursday, 31 December 2015 at 08:23:26 UTC, Laeeth Isharc 
wrote:
Isn't there a decent chance the bottleneck is vibe.d's JSON 
implementation rather than the framework as such? We know 
from Atila's MQTT project that vibe.d can be significantly 
faster than Go, and we also know that its JSON implementation 
isn't that fast. Replacing it with FastJSON might be interesting.

 Sadly I don't have time to do that myself.


this is not the same benchmark discussed elsewhere, this one is 
a simple echo thing.
no json. it just shows that there is some overhead on 
various layers.

so its testimony is very limited.

from a slightly more distant view you can thus argue that 50k 
rps vs 150k rps basically just means that the framework will 
most probably not be your bottleneck.
nonetheless, getting ahead in the benchmarks would help to 
attract people who are then pleasantly surprised by how easy it 
is to make full-blown services with vibe.


the libasync problem seems to be because of TCP_NODELAY 
not being deactivated for local connections.


That would be the other way around. TCP_NODELAY is not enabled in 
the local connection, which makes a ~20-30ms difference per 
request on keep-alive connections and is the bottleneck in this 
case. Enabling it makes the library competitive in these 
benchmarks.
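
For illustration, a minimal std.socket sketch of the option in question 
(an assumption-level example, not the actual libasync code path; a real 
server would set it on every accepted connection):

import std.socket;

void main()
{
    auto listener = new TcpSocket();
    listener.setOption(SocketOptionLevel.SOCKET, SocketOption.REUSEADDR, 1);
    listener.bind(new InternetAddress("127.0.0.1", 8080));
    listener.listen(64);

    auto conn = listener.accept();

    // Disable Nagle's algorithm on the accepted connection. Without this,
    // small keep-alive responses can sit in the send buffer waiting to be
    // coalesced, adding tens of milliseconds of latency per request.
    conn.setOption(SocketOptionLevel.TCP, SocketOption.TCP_NODELAY, 1);

    // ... read the request from `conn` and write the response ...
}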


Re: vibe.d benchmarks

2015-12-31 Thread Daniel Kozak via Digitalmars-d

On Wednesday, 30 December 2015 at 20:32:08 UTC, yawniek wrote:

Sönke is already on it.

http://forum.rejectedsoftware.com/groups/rejectedsoftware.vibed/post/29110


i guess it's not enough, there are still things that make vibe.d 
slow.


i quickly tried
https://github.com/nanoant/WebFrameworkBenchmark.git
which is a really simple benchmark, but it gives an idea of the 
general overhead.


single core results against go-fasthttp with GOMAXPROCS=1 and 
vibe distribution disabled on a c4.2xlarge ec2 instance 
(archlinux):


vibe.d 0.7.23 with ldc
Requests/sec:  52102.06

vibe.d 0.7.26 with dmd
Requests/sec:  44438.47

vibe.d 0.7.26 with ldc
Requests/sec:  53996.62

go-fasthttp:
Requests/sec: 152573.32

go:
Requests/sec:  62310.04

it's sad.

i am aware that go-fasthttp is a very simplistic, stripped-down 
webserver and vibe is almost a full-blown framework. still, it 
should be D and vibe.d's USP to be faster than the fastest in 
the world, not limping around at the end of the charts.


My results from siege (just returning a "Hello World" page, same 
as WebFrameworkBenchmark):


siege -c 20 -q -b -t30S http://127.0.0.1:8080

vibed: --combined -b release-nobounds --compiler=ldmd

Transactions: 968269 hits
Availability: 100.00 %
Elapsed time:  29.10 secs
Data transferred:  12.00 MB
Response time:  0.00 secs
Transaction rate:   33273.85 trans/sec
Throughput: 0.41 MB/sec
Concurrency:   19.62
Successful transactions:  968269
Failed transactions:   0
Longest transaction:0.04
Shortest transaction:   0.00

vibed(one thread):

Transactions: 767815 hits
Availability: 100.00 %
Elapsed time:  29.94 secs
Data transferred:   9.52 MB
Response time:  0.00 secs
Transaction rate:   25645.12 trans/sec
Throughput: 0.32 MB/sec
Concurrency:   19.66
Successful transactions:  767815
Failed transactions:   0
Longest transaction:0.02
Shortest transaction:   0.00


GOMAXPROCS=4 go run hello.go

Transactions: 765301 hits
Availability: 100.00 %
Elapsed time:  29.52 secs
Data transferred:   8.03 MB
Response time:  0.00 secs
Transaction rate:   25924.83 trans/sec
Throughput: 0.27 MB/sec
Concurrency:   19.68
Successful transactions:  765301
Failed transactions:   0
Longest transaction:0.02
Shortest transaction:   0.00

GOMAXPROCS=1 go run hello.go

Transactions: 478991 hits
Availability: 100.00 %
Elapsed time:  29.47 secs
Data transferred:   5.02 MB
Response time:  0.00 secs
Transaction rate:   16253.51 trans/sec
Throughput: 0.17 MB/sec
Concurrency:   19.75
Successful transactions:  478992
Failed transactions:   0
Longest transaction:0.02
Shortest transaction:   0.00

UnderTow (4 cores):

Transactions: 965835 hits
Availability: 100.00 %
Elapsed time:  29.41 secs
Data transferred:  10.13 MB
Response time:  0.00 secs
Transaction rate:   32840.36 trans/sec
Throughput: 0.34 MB/sec
Concurrency:   19.57
Successful transactions:  965836
Failed transactions:   0
Longest transaction:0.01
Shortest transaction:   0.00

Kore.io (4 workers)

Transactions:   2043 hits
Availability: 100.00 %
Elapsed time:  29.61 secs
Data transferred:   0.02 MB
Response time:  0.29 secs
Transaction rate:  69.00 trans/sec
Throughput: 0.00 MB/sec
Concurrency:   19.96
Successful transactions:2043
Failed transactions:   0
Longest transaction:0.55
Shortest transaction:   0.00


So it seems vibed has the best results :)


Re: vibe.d benchmarks

2015-12-31 Thread Ola Fosheim Grøstad via Digitalmars-d

On Thursday, 31 December 2015 at 08:51:31 UTC, yawniek wrote:
from a slightly more distant view you can thus argue that 50k 
rps vs 150k rps basically just means that the framework will 
most probably not be your bottleneck.


Go scores 0.5ms latency, vibe.d scores 14.7ms latency. That's a 
big difference that actually matters.


Dart + MongoDB also does very well in the multiple request tests. 
17300 requests versus Python + MySQL at 8800.


nonetheless, getting ahead in the benchmarks would help to 
attract people who are then pleasantly surprised by how easy it 
is to make full-blown services with vibe.


It also matters for people who pick a framework. Although the 
benchmark isn't great as a general benchmark, it says something 
about:


1. Whether you can stick to the framework even when you need 
better performance, which is why the overhead versus raw platform 
speed is interesting.


2. That the framework has been engineered using performance 
measurements.


It is more useful for writing dynamic web services with simple 
requests rather than regular web servers though.




Re: vibe.d benchmarks

2015-12-31 Thread yawniek via Digitalmars-d
On Thursday, 31 December 2015 at 08:23:26 UTC, Laeeth Isharc 
wrote:
Isn't there a decent chance the bottleneck is vibe.d's JSON 
implementation rather than the framework as such? We know 
from Atila's MQTT project that vibe.d can be significantly 
faster than Go, and we also know that its JSON implementation 
isn't that fast. Replacing it with FastJSON might be interesting.

 Sadly I don't have time to do that myself.


this is not the same benchmark discussed elsewhere, this one is a 
simple echo thing.
no json. it just shows that there is some overhead on 
various layers.

so its testimony is very limited.

from a slightly more distant view you can thus argue that 50k rps 
vs 150k rps basically just means that the framework will most 
probably not be your bottleneck.
nonetheless, getting ahead in the benchmarks would help to 
attract people who are then pleasantly surprised by how easy it 
is to make full-blown services with vibe.


the libasync problem seems to be because of TCP_NODELAY not 
being deactivated for local connections.







Re: vibe.d benchmarks

2015-12-31 Thread Laeeth Isharc via Digitalmars-d

On Wednesday, 30 December 2015 at 20:32:08 UTC, yawniek wrote:

Sönke is already on it.

http://forum.rejectedsoftware.com/groups/rejectedsoftware.vibed/post/29110


i guess its not enough, there are still things that make vibe.d 
slow.


i quickly tried
https://github.com/nanoant/WebFrameworkBenchmark.git
which is really a very simple benchmark but it shows about the 
general overhead.


single core results against go-fasthttp with GOMAXPROCS=1 and 
vibe distribution disabled on a c4.2xlarge ec2 instance 
(archlinux):


vibe.d 0.7.23 with ldc
Requests/sec:  52102.06

vibe.d 0.7.26 with dmd
Requests/sec:  44438.47

vibe.d 0.7.26 with ldc
Requests/sec:  53996.62

go-fasthttp:
Requests/sec: 152573.32

go:
Requests/sec:  62310.04

its sad.

i am aware that go-fasthttp is a very simplistic, stripped down 
webserver and vibe is almost a full blown framework. still it 
should be D and vibe.d's USP to be faster than the fastest in 
the world and not limping around at the end of the charts.


Isn't there a decent chance the bottleneck is vibe.d's JSON 
implementation rather than the framework as such? We know from 
Atila's MQTT project that vibe.d can be significantly faster than 
Go, and we also know that its JSON implementation isn't that 
fast. Replacing it with FastJSON might be interesting. Sadly I 
don't have time to do that myself.
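
For reference, the JSON path in question is roughly the following (a 
minimal sketch of a TechEmpower-style "json" handler using vibe.data.json; 
the handler name and route are illustrative, and the vibe.d 0.7.x API with 
the default VibeDefaultMain entry point is assumed):

import vibe.d;

// Builds a small JSON object with vibe.d's own Json type and serializes
// it into the response; this is the code path whose speed is questioned.
void jsonHandler(HTTPServerRequest req, HTTPServerResponse res)
{
    Json msg = Json.emptyObject;
    msg["message"] = "Hello, World!";
    res.writeJsonBody(msg);
}

shared static this()
{
    auto settings = new HTTPServerSettings;
    settings.port = 8080;

    auto router = new URLRouter;
    router.get("/json", &jsonHandler);
    listenHTTP(settings, router);
}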




Re: vibe.d benchmarks

2015-12-30 Thread Daniel Kozak via Digitalmars-d
On Wed, 30 Dec 2015 21:09:37, yawniek via Digitalmars-d wrote:

> On Wednesday, 30 December 2015 at 20:38:58 UTC, Daniel Kozak 
> wrote:
> > On Wed, 30 Dec 2015 20:32:08, yawniek via Digitalmars-d wrote:
> >
> > Which async library do you use for vibe.d? libevent? libev? or 
> > libasync? Which compilation switches did you use?
> >
> > Without this info it says nothing about vibe.d's performance :)  
> 
> the numbers above are libevent in release mode, as per original 
> configuration.
> 
> for libasync there is a problem, so it's stuck at 2.4 rps. etcimon 
> is currently investigating that.
> 

Thanks, it is weird; I use libasync and get quite good performance.
Probably some regression (which version of libasync?)




Re: vibe.d benchmarks

2015-12-30 Thread yawniek via Digitalmars-d
On Wednesday, 30 December 2015 at 20:38:58 UTC, Daniel Kozak 
wrote:

On Wed, 30 Dec 2015 20:32:08, yawniek via Digitalmars-d wrote:

Which async library do you use for vibe.d? libevent? libev? or 
libasync? Which compilation switches did you use?


Without this info it says nothing about vibe.d's performance :)


the numbers above are libevent in release mode, as per original 
configuration.


for libasync there is a problem, so it's stuck at 2.4 rps. etcimon 
is currently investigating that.







Re: vibe.d benchmarks

2015-12-30 Thread Daniel Kozak via Digitalmars-d
On Wed, 30 Dec 2015 20:32:08, yawniek via Digitalmars-d wrote:

> >>> Sönke is already on it.
> >>>
> >>> http://forum.rejectedsoftware.com/groups/rejectedsoftware.vibed/post/29110
> >>>   
> 
> i guess it's not enough, there are still things that make vibe.d 
> slow.
> 
> i quickly tried
> https://github.com/nanoant/WebFrameworkBenchmark.git
> which is a really simple benchmark, but it gives an idea of the 
> general overhead.
> 
> single core results against go-fasthttp with GOMAXPROCS=1 and 
> vibe distribution disabled on a c4.2xlarge ec2 instance 
> (archlinux):
> 
> vibe.d 0.7.23 with ldc
> Requests/sec:  52102.06
> 
> vibe.d 0.7.26 with dmd
> Requests/sec:  44438.47
> 
> vibe.d 0.7.26 with ldc
> Requests/sec:  53996.62
> 
> go-fasthttp:
> Requests/sec: 152573.32
> 
> go:
> Requests/sec:  62310.04
> 
> it's sad.
> 
> i am aware that go-fasthttp is a very simplistic, stripped-down 
> webserver and vibe is almost a full-blown framework. still, it 
> should be D and vibe.d's USP to be faster than the fastest in the 
> world, not limping around at the end of the charts.
> 
> 

Which async library do you use for vibe.d? libevent? libev? or libasync?
Which compilation switches did you use?

Without this info it says nothing about vibe.d's performance :)



Re: vibe.d benchmarks

2015-12-30 Thread yawniek via Digitalmars-d

Sönke is already on it.

http://forum.rejectedsoftware.com/groups/rejectedsoftware.vibed/post/29110


i guess it's not enough, there are still things that make vibe.d 
slow.


i quickly tried
https://github.com/nanoant/WebFrameworkBenchmark.git
which is a really simple benchmark, but it gives an idea of the 
general overhead.


single core results against go-fasthttp with GOMAXPROCS=1 and 
vibe distribution disabled on a c4.2xlarge ec2 instance 
(archlinux):


vibe.d 0.7.23 with ldc
Requests/sec:  52102.06

vibe.d 0.7.26 with dmd
Requests/sec:  44438.47

vibe.d 0.7.26 with ldc
Requests/sec:  53996.62

go-fasthttp:
Requests/sec: 152573.32

go:
Requests/sec:  62310.04

it's sad.

i am aware that go-fasthttp is a very simplistic, stripped-down 
webserver and vibe is almost a full-blown framework. still, it 
should be D and vibe.d's USP to be faster than the fastest in the 
world, not limping around at the end of the charts.





Re: vibe.d benchmarks

2015-12-29 Thread Charles via Digitalmars-d

On Tuesday, 29 December 2015 at 22:49:36 UTC, Nick B wrote:

On Monday, 28 December 2015 at 13:10:59 UTC, Charles wrote:
On Monday, 28 December 2015 at 12:24:17 UTC, Ola Fosheim 
Grøstad wrote:

https://www.techempower.com/benchmarks/

The entries for vibe.d are either doing very poorly or fail 
to complete. Maybe someone should look into this?


Sönke is already on it.

http://forum.rejectedsoftware.com/groups/rejectedsoftware.vibed/post/29110


Correct me if I am wrong here, but as far as I can tell there are 
no independent benchmarks showing the performance (superior or good 
enough) of D versus Go, or against just about any other 
language, either?


https://www.techempower.com/benchmarks/#section=data-r11&hw=peak&test=json&l=cnc&f=zik0vz-zik0zj-zik0zj-zik0zj-hra0hr


The last time the official benchmark was run was over a month 
before Sönke's PR.


Re: vibe.d benchmarks

2015-12-29 Thread Nick B via Digitalmars-d

On Monday, 28 December 2015 at 13:10:59 UTC, Charles wrote:
On Monday, 28 December 2015 at 12:24:17 UTC, Ola Fosheim 
Grøstad wrote:

https://www.techempower.com/benchmarks/

The entries for vibe.d are either doing very poorly or fail to 
complete. Maybe someone should look into this?


Sönke is already on it.

http://forum.rejectedsoftware.com/groups/rejectedsoftware.vibed/post/29110


Correct me if I am wrong here, but as far as I can tell there are no 
independent benchmarks showing the performance (superior or good 
enough) of D versus Go, or against just about any other language, 
either?


https://www.techempower.com/benchmarks/#section=data-r11&hw=peak&test=json&l=cnc&f=zik0vz-zik0zj-zik0zj-zik0zj-hra0hr


Re: vibe.d benchmarks

2015-12-28 Thread Charles via Digitalmars-d
On Monday, 28 December 2015 at 12:24:17 UTC, Ola Fosheim Grøstad 
wrote:

https://www.techempower.com/benchmarks/

The entries for vibe.d are either doing very poorly or fail to 
complete. Maybe someone should look into this?


Sönke is already on it.

http://forum.rejectedsoftware.com/groups/rejectedsoftware.vibed/post/29110


vibe.d benchmarks

2015-12-28 Thread Ola Fosheim Grøstad via Digitalmars-d

https://www.techempower.com/benchmarks/

The entries for vibe.d are either doing very poorly or fail to 
complete. Maybe someone should look into this?