Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-19 Thread Tobias Pankrath

On Thursday, 9 January 2014 at 10:14:08 UTC, Benjamin Thaut wrote:
If requested I can make a list with all language features / 
decisions so far that prevent the implementation of a state of 
the art GC.


At least I am interested in your observations.


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-19 Thread Benjamin Thaut

On 09.01.2014 08:07, Walter Bright wrote:


The point is, no matter how slow the GC is relative to malloc, not
allocating is faster than allocating, and a GC can greatly reduce the
amount of alloc/copy going on.



The question should be whether D is going to stay with a GC and, if so, when 
we will actually get proper GC support so that a state of the art GC can be 
implemented. Or whether we are going to replace the GC with ARC.


This is a really important topic which shouldn't wait until the language 
is 20 years old. I have been using D for almost 3 years now, and the more 
I learn about garbage collectors and about D, the more obvious it becomes 
that D does not properly support garbage collection and that it will require 
quite some effort and spec changes to do so. And in all the time I have used 
D, nothing has changed about the garbage collector. The only thing that 
happened was the RtInfo template in object.d. But it still isn't used and 
only solves a small portion of the precise scanning problem. In my 
opinion D was designed with language features in mind that need a GC, 
but D was not designed to actually support a GC. And this needs to change.


If requested I can make a list with all language features / decisions so 
far that prevent the implementation of a state of the art GC.


--
Kind Regards
Benjamin Thaut


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-19 Thread Ola Fosheim Grøstad

On Thursday, 9 January 2014 at 09:58:24 UTC, Walter Bright wrote:
Please explain how this can work passing both string literals 
and allocated strings to cat().


By having your own string allocator that tests for membership 
when you free (if you allow free and foreign strings in your cat)?


How do you return a string that is the path part of a 
path/filename? (The terminating 0 is not a problem solved by 
creating your own allocator.)


If you discard the original you split at '/'. If you use your own 
stringallocator you don't have to worry about free... You either 
let the garbage remain until the pool is released or have a 
separate allocation structure that allows internal splits (no 
private size info before first char).
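A minimal sketch of the "split at '/'" idea, written in C++ for illustration. StringPool, intern and split_at_last_slash are hypothetical names that do not come from the thread. The sketch assumes the string lives in a pool you own (so it is writable and never freed individually) and that destroying the combined path is acceptable; it would not work on a string literal, which must not be modified.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical fixed-size string pool: strings are copied in once and
// never freed individually; the whole pool is released at once.
struct StringPool {
    char buf[256];
    std::size_t used = 0;

    // Copies s (including its terminating 0) into the pool.
    // Returns nullptr when the pool is full.
    char* intern(const char* s) {
        std::size_t n = std::strlen(s) + 1;
        if (n > sizeof buf - used) return nullptr;
        char* p = buf + used;
        std::memcpy(p, s, n);
        used += n;
        return p;
    }
};

// Splits "dir/file" in place at the last '/': the separator is
// overwritten with 0, so `path` becomes the directory part and the
// returned pointer is the file part. Returns nullptr if there is no '/'.
// The original combined string is destroyed in the process.
char* split_at_last_slash(char* path) {
    assert(path != nullptr);
    char* slash = std::strrchr(path, '/');
    if (slash == nullptr) return nullptr;
    *slash = '\0';
    return slash + 1;
}
```

Both halves point into the same pool storage, so no extra allocation or copy is made for the split itself; the one copy happens when the string enters the pool.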







Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-19 Thread Ola Fosheim Grøstad
And, if it isn't in D already I would very much like to have a 
weak pointer type that will be set to null if the object is only 
pointed to by weak pointers.


It is a PITA to have objects die and get them out of a bunch of 
event-queues etc.
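For comparison, here is a rough sketch of the same wish expressed with C++'s std::weak_ptr, which is reference-counted rather than GC-based. Enemy and EventQueue are hypothetical names used only for illustration. Note that a std::weak_ptr is not literally set to null; instead lock() returns an empty pointer once the last owning std::shared_ptr is gone.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical game object that receives events.
struct Enemy {
    int hits = 0;
};

// An event queue that holds only weak references to its targets, so it
// never keeps an object alive on its own.
struct EventQueue {
    std::vector<std::weak_ptr<Enemy>> targets;

    void subscribe(const std::shared_ptr<Enemy>& e) {
        assert(e != nullptr);
        targets.push_back(e);
    }

    // Delivers one event to every live target and drops dead entries.
    // Returns the number of targets that were still alive.
    std::size_t dispatch() {
        std::size_t alive = 0;
        std::vector<std::weak_ptr<Enemy>> kept;
        for (auto& w : targets) {
            if (auto e = w.lock()) {  // empty once the last owner is gone
                ++e->hits;
                ++alive;
                kept.push_back(w);
            }
        }
        targets.swap(kept);
        return alive;
    }
};
```

With this arrangement the object's owner simply drops its reference; the queue cleans itself up on the next dispatch instead of having to be told about the death.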


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-10 Thread Paulo Pinto
On Thursday, 9 January 2014 at 23:02:57 UTC, Joseph Rushton 
Wakeling wrote:

On 08/01/14 21:22, Paulo Pinto wrote:
As I shared a few times here, it was Oberon which opened my 
eyes
to GC enabled systems programming languages, around 1996, 
maybe.


What was the GC design for Oberon, and how does that relate to 
what's in D (and what's in other GC'd languages)?


The original Oberon used a simple mark and sweep collector, 
initially implemented in Assembly. In later versions it was coded 
in Oberon itself.


Original 1992/2005 edition
http://www.inf.ethz.ch/personal/wirth/ProjectOberon1992.pdf

2013 edition with images of the workstations where Oberon ran
http://www.inf.ethz.ch/personal/wirth/ProjectOberon/PO.System.pdf

EthOS used a mark and sweep GC with support for weak pointers and 
finalization. It ran when the system was idle or when not enough 
memory was available.


http://research.microsoft.com/en-us/um/people/cszypers/books/insight-ethos.pdf

Active Oberon implementation used a mark and sweep with 
finalization support.


http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.85.5753&rep=rep1&type=pdf

Modula-3 used a compacting GC initially, with an optional 
background one.


https://modula3.elegosoft.com/cm3/doc/help/bib.html
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.36.6890

Cedar used a concurrent reference-counting collector, coupled 
with a mark and sweep one for cycle removals, with finalization 
support


http://www.textfiles.com/bitsavers/pdf/xerox/parc/techReports/CSL-84-7_On_Adding_Garbage_Collection_and_Runtime_Types_to_a_Strongly-Typed_Statically-Checked_Concurrent_Language.pdf

The features are quite similar to D:

- GC
- Allocation of data structures statically in global memory and 
stack

- Escape hatches to allocate memory manually when needed

I cannot say if they also allow for interior pointers like D does.

However, the main point about Oberon and the other languages wasn't 
only technical, but human. Funnily enough, that is also the topic of 
Andrew Koenig's latest post:


http://www.drdobbs.com/cpp/social-processes-and-the-design-of-progr/240165221

The people designing such systems believed that it was possible 
to write from the ground up a workstation operating system in a 
GC enabled systems programming language, with minimal Assembly.


They did succeed and built workstations that were usable for 
normal office work, which were then used at ETHZ, Xerox and 
Olivetti for some time.


For games, some more effort would be required, I do acknowledge 
that.


However, the world at large ignored these efforts. As Andrew 
nicely puts it in his article, many times the social barrier is 
higher than the technical one.


For many developers, hearing the words GC, safe coding or bounds 
checking is enough to make them run away as fast as they can.


--
Paulo


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-10 Thread Paulo Pinto
On Thursday, 9 January 2014 at 22:02:48 UTC, Ola Fosheim Grøstad 
wrote:

On Thursday, 9 January 2014 at 19:16:10 UTC, Paulo Pinto wrote:
Every time I see such discussions, it reminds me when I 
started coding in the mid-80s and the heresy of using 
languages like Pascal and C dialects for microcomputers, 
instead of coding everything in Assembly or Forth


If you insist on bringing up heresy...

Motorola 680xx is pretty nice compared to x86, although the 
AMD64bit mode is better than it was. 680xx feels almost like C, 
just better ;9, I think only MIPS is close in programmer 
friendliness.  Forth is nice too, very minimalistic and quite 
powerful for the simplistic implementation. I had a Forth64 
module for my C64 to dabble with, a bit hard to create more 
than toy programs in Forth... Postscript is pretty close 
actually, and clean. But Forth is dense, so dense that you 
don't edit text files, you edit text screens...  But don't diss 
assembly, try to get more than 8 sprites and move sprites into 
the screen border without assembly, can't be done! High level 
languages, my ass, BASIC can't do that!


But hey, I am not arguing in favour of Forth and C (although I 
would argue in favour of 680xx and MIPS). I am arguing in 
favour of smart compilers that allow you to go low level at the 
top of the call stack where it matters (inner loops) without 
having to resort to a different language. D is close to that, 
so it is a promising candidate.


And... I actually think D is too lax in some areas. I don't 
think you should be allowed to call C/C++ without nailing down 
the pre/postconditions, basically describing what happens in 
terms of optimization constraints. I also want the programmer 
to be able to assert facts that the compiler fails to prove so 
that it can be used for optimization. Basically the ability to 
guide the optimizer so you don't have to resort to low level 
coding. I also think giving access to malloc is a bad idea. :-P


And well, I am not new to GC, I have actually used Simula quite 
a bit in classes/teaching newbies. Simula incidentally has 
exactly the same Garbage Collector that D has AFAIK.  I 
remember we had a 1970s internal memo describing the garbage 
collector of Simula on the curriculum of the compiler course... 
So that is very old news.


Actually Simula kinda has the same kind of string type 
representation that D has too. And OO. And it has coroutines… 
While it doesn't have templates, it does actually have name 
parameters that have textual substitution semantics (in addition 
to ref and value). Now I also kinda like that it has :- for 
reference assignment and := for value assignment, but I 
didn't like it back then.


45 years later D merges Simula semantics with C (and some more 
stuff). And that is an interesting thing, of course.


But hey, no point in pretending that other people don't know 
what programming a GC high level language entails. If I want 
low latency, I go to C/C++ and hopefully D. If I want high 
level productivity I use whatever fits the bill… all GC 
languages. But I don't think D should be the first option in 
any non-speed area yet, so the GC is of limited use for now 
IMO. (In clusters you might want that though, speed+convenience 
but no need for low latency.)


I think D could pick up more good stuff from Python, like the 
array closures that allow you to succinctly transform arrays. 
Makes large portions of Python's standard library pointless.


What I really like about D is that the front end code appears 
to be quite readable. Take a look at clang and you will see the 
difference. So, I guess anyone with C++ knowledge has the 
opportunity to tune both syntax and semantics to their own 
liking and share it with others. That's pretty sweet (I'd like 
to try that one day).



Sorry if I hit a nerve; one never knows the experience of other 
people on the Internet.


It is just that in the enterprise world I have been part of 
projects that ported C and C++ based servers to JVM/.NET ones, 
always with comparable performance.


I do acknowledge that in game programming it might be different; 
however, even AAA studios do play with GC systems nowadays, even if 
they have some issues optimizing their behavior.


For example, The Witcher 2 for the XBox 360.

http://www.makinggames.de/index.php/magazin/2155_porting_the_witcher_2_on_xbox_360

--
Paulo


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-10 Thread Atila Neves

On Thursday, 9 January 2014 at 15:37:11 UTC, Jesse Phillips wrote:

On Thursday, 9 January 2014 at 00:37:27 UTC, Atila Neves wrote:
Thanks. Not many votes though given all the downvotes. The 
comments manage to be even worse than on my first blog post.


For some reason they all assume I don't know C++ even though I 
know it way better than D, not to mention that they nearly all 
miss the point altogether. Sigh.


I wonder if someone who knows C++ is going to help you out 
and improve your code, much like others did with the other 
languages you used.


I know C++. It's not that I can't finish it, it's that I can't be
bothered to. That's the whole point of the post.

Atila


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-10 Thread Jesse Phillips

On Friday, 10 January 2014 at 11:43:05 UTC, Atila Neves wrote:
On Thursday, 9 January 2014 at 15:37:11 UTC, Jesse Phillips 
wrote:

On Thursday, 9 January 2014 at 00:37:27 UTC, Atila Neves wrote:
Thanks. Not many votes though given all the downvotes. The 
comments manage to be even worse than on my first blog post.


For some reason they all assume I don't know C++ even though 
I know it way better than D, not to mention that they nearly 
all miss the point altogether. Sigh.


I wonder if someone who knows C++ is going to help you out 
and improve your code, much like others did with the other 
languages you used.


I know C++. It's not that I can't finish it, it's that I can't 
be bothered to. That's the whole point of the post.

Atila


I know, that doesn't mean someone can't come in and fix what they 
see wrong with it. C++ programmers have less reason to prove 
their language, but I think most are in denial that their 
language is difficult and that it is a problem.


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-10 Thread Atila Neves
I wonder if someone who knows C++ is going to help you out 
and improve your code, much like others did with the other 
languages you used.


I know C++. It's not that I can't finish it, it's that I can't 
be bothered to. That's the whole point of the post.

Atila


I know, that doesn't mean someone can't come in and fix what 
they see wrong with it. C++ programmers have less reason to 
prove their language, but I think most are in denial that their 
language is difficult and that it is a problem.


Ah right, I misunderstood what you meant. The denial is real 
and I think the comments on reddit are proof of that. Who knows, 
maybe I'll do it myself.


The weirdest part of it for me is that my (broken but working) 
C++ implementation didn't even do badly performance-wise and 
people still complained.


Atila


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-10 Thread Atila Neves


It does not help that C and C++ are currently the only portable 
languages across mainstream OS vendors.


Currently I am using C++ for my Android hobby development, not 
because I don't like Java, but because it is the only common 
language across all mobile SDKs.


I feel your pain. If I were to do a cross-platform app I'd 
probably do the same. At least the Android NDK has new gcc 
versions to use for C++11. I assume the same is true for iOS.


Atila


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-10 Thread Paulo Pinto

On 10.01.2014 17:21, Jesse Phillips wrote:

On Friday, 10 January 2014 at 11:43:05 UTC, Atila Neves wrote:

On Thursday, 9 January 2014 at 15:37:11 UTC, Jesse Phillips wrote:

On Thursday, 9 January 2014 at 00:37:27 UTC, Atila Neves wrote:

Thanks. Not many votes though given all the downvotes. The comments
manage to be even worse than on my first blog post.

For some reason they all assume I don't know C++ even though I know
it way better than D, not to mention that they nearly all miss the
point altogether. Sigh.


I wonder if someone who knows C++ is going to help you out and
improve your code, much like others did with the other languages you
used.


I know C++. It's not that I can't finish it, it's that I can't be
bothered to. That's the whole point of the post.

Atila


I know, that doesn't mean someone can't come in and fix what they see
wrong with it. C++ programmers have less reason to prove their language,
but I think most are in denial that their language is difficult and that
it is a problem.



It does not help that C and C++ are currently the only portable 
languages across mainstream OS vendors.


Currently I am using C++ for my Android hobby development, not because I 
don't like Java, but because it is the only common language across all 
mobile SDKs.


--
Paulo


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-10 Thread Jacob Carlborg

On 2014-01-10 19:52, Atila Neves wrote:


I feel your pain. If I were to do a cross-platform app I'd probably do
the same. At least the Android NDK has new gcc versions to use for
C++11. I assume the same is true for iOS.


Yeah, iOS uses LLVM so that means C++11 as well.

--
/Jacob Carlborg


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-09 Thread Paulo Pinto

On Thursday, 9 January 2014 at 07:07:29 UTC, Walter Bright wrote:

On 1/8/2014 10:11 PM, Manu wrote:
On 9 January 2014 13:08, Walter Bright 
newshou...@digitalmars.com wrote:

The reason that Java does excessive amounts of allocation is 
because Java doesn't have value types, not because Java has a 
GC.


That might change if IBM's extensions ever land in Java.

http://www.slideshare.net/rsciampacone/javaone-2013-introduction-to-packedobjects

Video presentation available here,
http://www.parleys.com/play/52504e5ee4b0a43ac121240b

Walter is right regarding D. All other GC enabled systems 
programming languages do have value objects and don't require 
everything to be on the heap.


So the pressure on the GC to reclaim memory is not as high as in Java 
and similar systems.


--
Paulo



Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-09 Thread Ola Fosheim Grøstad

On Thursday, 9 January 2014 at 07:07:29 UTC, Walter Bright wrote:
and it works without copying in D, it just returns s1. In C, I 
gotta copy, ALWAYS.


Only if you write libraries, in an application you can set your 
own policies (invariants).


(C's strings being 0 terminated also forces much extra copying, 
but that's another topic.)


Not if you have your own allocator and split chopped strings (you 
can just overwrite the boundary character).


The point is, no matter how slow the GC is relative to malloc, 
not allocating is faster than allocating, and a GC can greatly 
reduce the amount of alloc/copy going on.


But since malloc/free is tedious, C programmers tend to avoid it 
by embedding objects in large structs and putting a variable-sized 
object at the end of it... Or they have their own pool (possibly on 
the stack, at the location where it should be released).
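A minimal sketch of the "own pool on the stack" approach, written in C++ for illustration. Arena, alloc and used are hypothetical names, and the sketch assumes the caller passes a buffer that is itself aligned for the largest alignment it will request (for example one declared with alignas(std::max_align_t)).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical bump allocator over a caller-provided buffer. Nothing is
// freed individually; everything is released when the buffer goes away,
// for example when the enclosing stack frame returns.
class Arena {
public:
    Arena(unsigned char* buf, std::size_t size)
        : buf_(buf), size_(size), used_(0) {}

    // Returns n bytes whose offset in the buffer is a multiple of align
    // (a power of two), or nullptr when the buffer is exhausted.
    void* alloc(std::size_t n,
                std::size_t align = alignof(std::max_align_t)) {
        assert(align != 0 && (align & (align - 1)) == 0);
        std::size_t start = (used_ + align - 1) & ~(align - 1);
        if (start > size_ || n > size_ - start) return nullptr;
        used_ = start + n;
        return buf_ + start;
    }

    std::size_t used() const { return used_; }

private:
    unsigned char* buf_;
    std::size_t size_;
    std::size_t used_;
};
```

Declaring the buffer as a local array next to the Arena means the release point is simply the end of that scope, which is the "location where it should be released" mentioned above; no per-object free is ever needed.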





Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-09 Thread Paulo Pinto
On Thursday, 9 January 2014 at 08:40:30 UTC, Ola Fosheim Grøstad 
wrote:
On Thursday, 9 January 2014 at 07:07:29 UTC, Walter Bright 
wrote:
and it works without copying in D, it just returns s1. In C, I 
gotta copy, ALWAYS.


Only if you write libraries, in an application you can set your 
own policies (invariants).


(C's strings being 0 terminated also forces much extra 
copying, but that's another topic.)


Not if you have your own allocator and split chopped strings 
(you can just overwrite the boundary character).


The point is, no matter how slow the GC is relative to malloc, 
not allocating is faster than allocating, and a GC can greatly 
reduce the amount of alloc/copy going on.


But since malloc/free is tedious, C programmers tend to avoid it 
by embedding objects in large structs and putting a variable-sized 
object at the end of it... Or they have their own pool (possibly on 
the stack, at the location where it should be released).


I have only seen those things work in small AAA class teams.


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-09 Thread Ola Fosheim Grøstad

On Thursday, 9 January 2014 at 09:10:07 UTC, Paulo Pinto wrote:

I have only seen those things work in small AAA class teams.


But you have probably seen C programs allocate a bunch of 
different small structs with a single malloc where it is known 
that they will be freed in the same location? A compiler needs 
whole program analysis to do the same.


So yes, C programs will have fewer allocs if the programmer cares.
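A rough sketch of that single-malloc technique, written in C++ for illustration. Header, Payload, Message, payload_offset, make_message and free_message are all hypothetical names. The sketch assumes both types are trivially destructible, so releasing the block with one free is sufficient.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>

// Two hypothetical small structs that always live and die together.
struct Header  { int id; };
struct Payload { double values[4]; };

struct Message { Header* header; Payload* payload; };

// Offset of the Payload inside the shared block: sizeof(Header) rounded
// up to the alignment that Payload requires.
constexpr std::size_t payload_offset() {
    return (sizeof(Header) + alignof(Payload) - 1)
           / alignof(Payload) * alignof(Payload);
}

// One malloc serves both objects. Returns null pointers on failure.
Message make_message(int id) {
    void* block = std::malloc(payload_offset() + sizeof(Payload));
    if (block == nullptr) return Message{nullptr, nullptr};
    unsigned char* bytes = static_cast<unsigned char*>(block);
    Header* h = new (bytes) Header{id};
    Payload* p = new (bytes + payload_offset()) Payload{};
    assert(reinterpret_cast<std::uintptr_t>(p) % alignof(Payload) == 0);
    return Message{h, p};
}

// One free releases both objects, because header points at the start of
// the block that malloc returned.
void free_message(Message& m) {
    std::free(m.header);
    m = Message{nullptr, nullptr};
}
```

The programmer knows the two objects share a lifetime and encodes that by hand; a compiler would have to prove the same fact before it could merge the allocations automatically.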


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-09 Thread Paulo Pinto
On Thursday, 9 January 2014 at 09:38:31 UTC, Ola Fosheim Grøstad 
wrote:

On Thursday, 9 January 2014 at 09:10:07 UTC, Paulo Pinto wrote:

I have only seen those things work in small AAA class teams.


But you have probably seen C programs allocate a bunch of 
different small structs with a single malloc where it is known 
that they will be freed in the same location? A compiler needs 
whole program analysis to do the same.


So yes, C programs will have fewer allocs if the programmer 
cares.


Yes, I did.

Not much different than memory pools in Turbo Pascal and 
Objective-C for that matter.


And even more strange things, where the whole memory gets 
allocated at start, then some handles are used with mysterious 
macros to convert back and forth to real pointers.


I have also seen lots of other storage tricks that easily get out 
of control when the team grows over a certain size, or when 
management decides to outsource part of the development or to 
lower the expected skill set of new team members.


Then you watch the older guys playing fire brigade to track down 
issues of release X.Y.Z at customer site, almost every week.



--
Paulo


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-09 Thread Walter Bright
On 1/9/2014 1:38 AM, Ola Fosheim Grøstad 
ola.fosheim.grostad+dl...@gmail.com wrote:

On Thursday, 9 January 2014 at 09:10:07 UTC, Paulo Pinto wrote:

I have only seen those things work in small AAA class teams.


But you have probably seen c programs allocate a bunch of different small
structs with a single malloc where it is known that they will be freed in the
same location? A compiler needs whole program analysis to do the same.

So yes, c programs will have fewer allocs if the programmer cared.


A GC does not prevent such techniques.


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-09 Thread Walter Bright
On 1/9/2014 12:40 AM, Ola Fosheim Grøstad 
ola.fosheim.grostad+dl...@gmail.com wrote:

On Thursday, 9 January 2014 at 07:07:29 UTC, Walter Bright wrote:

and it works without copying in D, it just returns s1. In C, I gotta copy,
ALWAYS.


Only if you write libraries, in an application you can set your own policies
(invariants).


Please explain how this can work passing both string literals and allocated 
strings to cat().




(C's strings being 0 terminated also forces much extra copying, but that's
another topic.)


Not if you have your own allocator and split chopped strings (you can just
overwrite the boundary character).


How do you return a string that is the path part of a path/filename? (The 
terminating 0 is not a problem solved by creating your own allocator.)




Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-09 Thread Ola Fosheim Grøstad

On Thursday, 9 January 2014 at 09:55:42 UTC, Walter Bright wrote:

A GC does not prevent such techniques.


No, but programmers gravitate towards less work... If alloc is 
transparent and free is hidden... You gain a lot from not being 
explicit, but you get more allocations overall.




Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-09 Thread Ola Fosheim Grøstad

On Thursday, 9 January 2014 at 10:14:08 UTC, Benjamin Thaut wrote:
If requested I can make a list with all language features / 
decisions so far that prevent the implementation of a state of 
the art GC.


I am also interested in this, so that I can avoid those 
constructs.


I am in general in agreement with you. I think regular ownership 
combined with a segmented GC that only scans pointers to a 
signified GC type would not be such a big deal and could be a 
real bonus. With whole program analysis you could then reject a 
lot of the branches you otherwise have to follow and you would 
not have to stop threads that cannot touch those GC types. Of 
course, you would then avoid using generic pointers. So, you 
might not need an advanced GC, just partition the GC scan better.


Scanning stacks could be really fast if you know the call order 
of stack frames (and you have that opportunity with whole program 
analysis): e.g.: top frame is a(), but only b() and c() can call 
a() and b() and c() have same stack frame size and cannot hold 
pointers to GC objects => skip over a() and b/c() in one go.


It doesn't matter much if the GC takes even 20% of your 
efficiency away, as long as it doesn't lock you down for more 
than 1-2 milliseconds: that's 4 million cycles for a single 
core. If you need 25 cycles per pointer you can scan 80.000 
pointers per core. So if the search space can be partitioned in a 
way that makes that possible by not following all pointers, then 
the GC would be fine. 100.000 cache lines = 3.2MB which is not 
too horrible either.
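The arithmetic above can be checked with a small sketch, written in C++ for illustration. The function names are hypothetical, and the 2 GHz clock and the cache line sizes are assumptions that the post does not state; they are chosen because they reproduce its figures.

```cpp
#include <cassert>
#include <cstdint>

// Cycles available during a pause of pause_us microseconds on one core
// clocked at clock_mhz MHz (MHz times microseconds gives cycles).
inline std::uint64_t cycles_in_pause(std::uint64_t clock_mhz,
                                     std::uint64_t pause_us) {
    return clock_mhz * pause_us;
}

// Number of pointers that fit in the cycle budget at a fixed cost each.
inline std::uint64_t pointers_scannable(std::uint64_t cycles,
                                        std::uint64_t cycles_per_pointer) {
    assert(cycles_per_pointer != 0);
    return cycles / cycles_per_pointer;
}

// Memory touched when every scanned pointer hits a distinct cache line.
inline std::uint64_t bytes_touched(std::uint64_t cache_lines,
                                   std::uint64_t line_bytes) {
    return cache_lines * line_bytes;
}
```

Under the 2 GHz assumption, a 2 ms pause gives 2000 * 2000 = 4,000,000 cycles, which at 25 cycles per pointer is 160,000 pointers; the 80.000 figure corresponds to the 1 ms end of the range (2,000,000 cycles). Likewise, 100,000 cache lines come to 3.2 MB only with 32-byte lines; with 64-byte lines the same count is 6.4 MB.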


I'd rather have 1000% less efficiency in the GC by having 
frequent GC calls than 400% more latency less frequently.


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-09 Thread Paulo Pinto
On Thursday, 9 January 2014 at 13:44:10 UTC, Ola Fosheim Grøstad 
wrote:
On Thursday, 9 January 2014 at 10:14:08 UTC, Benjamin Thaut 
wrote:
If requested I can make a list with all language features / 
decisions so far that prevent the implementation of a state of 
the art GC.


I am also interested in this, so that I can avoid those 
constructs.


I am in general in agreement with you. I think regular 
ownership combined with a segmented GC that only scans pointers 
to a signified GC type would not be such a big deal and could 
be a real bonus. With whole program analysis you could then 
reject a lot of the branches you otherwise have to follow and 
you would not have to stop threads that cannot touch those GC 
types. Of course, you would then avoid using generic pointers. 
So, you might not need an advanced GC, just partition the GC 
scan better.


Scanning stacks could be really fast if you know the call order 
of stack frames (and you have that opportunity with whole 
program analysis): e.g.: top frame is a(), but only b() and c() 
can call a() and b() and c() have same stack frame size and 
cannot hold pointers to GC objects => skip over a() and b/c() in 
one go.


It doesn't matter much if the GC takes even 20% of your 
efficiency away, as long as it doesn't lock you down for more 
than 1-2 milliseconds: that's 4 million cycles for a single 
core. If you need 25 cycles per pointer you can scan 80.000 
pointers per core. So if the search space can be partitioned in 
a way that makes that possible by not following all pointers, 
then the GC would be fine. 100.000 cache lines = 3.2MB which is 
not too horrible either.


I'd rather have 1000% less efficiency in the GC by having 
frequent GC calls than 400% more latency less frequently.


That could possibly be achieved with a generational parallel GC.


--
Paulo


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-09 Thread Ola Fosheim Grøstad

On Thursday, 9 January 2014 at 13:51:09 UTC, Paulo Pinto wrote:

That could possibly be achieved with a generational parallel GC.


Isn't the basic assumption in a generational GC that most freed 
objects have a short life span and were allocated since the last 
collection?  Was there some assumption about the majority of 
inter-object pointers being within the same generation, too? So 
that you partition the objects in train carts and only have few 
pointers going between carts? I haven't looked at the original 
paper in a long time...


Anyway, if that is the assumption then it is generally not true 
for programs that are written for real time. Temporary objects 
are then allocated in pools or on the stack. Objects that are 
free'd tend to come from timers, events or because they have a 
lifespan (like enemies in a computer game).


I also dislike the idea of the GC locking cores down when it 
doesn't have to, so I don't think parallel is particularly 
useful. It will just put more pressure on the memory bus. I think 
it is sufficient to have a simple GC that only scans disjoint 
subsets (for that kind of application), so yes partitioned by 
type, or better: by reachability, but not by generation.


If the GC behaviour is predictable then the application can be 
designed to not trigger bad behaviour from the get go.


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-09 Thread Paulo Pinto
On Thursday, 9 January 2014 at 14:19:41 UTC, Ola Fosheim Grøstad 
wrote:

On Thursday, 9 January 2014 at 13:51:09 UTC, Paulo Pinto wrote:
That could possibly be achieved with a generational parallel 
GC.


Isn't the basic assumption in a generational GC that most 
freed objects have a short life span and were allocated since the 
last collection?  Was there some assumption about the majority 
of inter-object pointers being within the same generation, too? 
So that you partition the objects in train carts and only 
have few pointers going between carts? I haven't looked at the 
original paper in a long time...


That was just a suggestion. There are plenty of incremental GC 
algorithms to choose from.




Anyway, if that is the assumption then it is generally not true 
for programs that are written for real time. Temporary objects 
are then allocated in pools or on the stack. Objects that are 
free'd tend to come from timers, events or because they have a 
lifespan (like enemies in a computer game).


There are real time GCs controlling missile tracking systems.

Personally I find them a bit more real time than computer games.

In a game you might miss a few rendering frames; a GC induced
delay on a missile tracking system might turn out a bit ugly.



I also dislike the idea of the GC locking cores down when it 
doesn't have to, so I don't think parallel is particularly 
useful. It will just put more pressure on the memory bus. I 
think it is sufficient to have a simple GC that only scans 
disjoint subsets (for that kind of application), so yes 
partitioned by type, or better: by reachability, but not by 
generation.


If the GC behaviour is predictable then the application can be 
designed to not trigger bad behaviour from the get go.



Sure, the GC usage should not hinder the application's 
performance.


However, unless you target systems without an OS, you'll have 
the OS doing whatever it wants with the existing cores anyway.


I never saw much control besides setting affinities.

--
Paulo


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-09 Thread Benjamin Thaut
On 09.01.2014 15:28, Ola Fosheim Grøstad 
ola.fosheim.grostad+dl...@gmail.com wrote:

And, if it isn't in D already I would very much like to have a weak
pointer type that will be set to null if the object is only pointed to
by weak pointers.

It is a PITA to have objects die and get them out of a bunch of
event-queues etc.


Didn't Phobos get such a weak pointer type lately? I at least saw an 
implementation on the newsgroup very recently.


It used core.memory.setAttr to store information in objects. Then you 
can overwrite the collectHandler in core.runtime to null the weak 
references upon destruction.


--
Kind Regards
Benjamin Thaut


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-09 Thread Ola Fosheim Grøstad

On Thursday, 9 January 2014 at 14:40:16 UTC, Paulo Pinto wrote:

In a game you might miss a few rendering frames; a GC induced
delay on a missile tracking system might turn out a bit ugly.


You have GC in games, but you limit it to a small set of objects 
(5?)

So you can have real time with GC with an upper-bound.

Putting everything under GC is probably not a future proof 
concept, since memory capacity most likely will increase faster 
than CPU speed for technical reasons.


However, unless you target systems without an OS, you'll have 
the OS doing whatever it wants with the existing cores anyway.


Yes, but you don't blame the application if the scheduler isn't 
real time friendly. Linux has been kind of bad, because 
distributions have been focused on servers. But you find real 
time friendly schedulers too.


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-09 Thread Klaim - Joël Lamotte
On Thu, Jan 9, 2014 at 7:11 AM, Manu turkey...@gmail.com wrote:

 You're making a keen assumption here that C programmers use STL. And no
 sane programmer that I've ever worked with uses STL precisely for this
 reason :P


I think this sentence is misleading. I've made high-performance applications
with no copying using the STL. Your sane programmers are just people who
don't want to learn it.
Sane programmers make sure they know the strengths and pitfalls of their
tools. They don't avoid tools because they make incorrect assumptions, like
you are doing here.
Also, this has nothing to do with the STL.


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-09 Thread Paulo Pinto
On Thursday, 9 January 2014 at 14:57:31 UTC, Ola Fosheim Grøstad 
wrote:

On Thursday, 9 January 2014 at 14:40:16 UTC, Paulo Pinto wrote:

On a game you might miss a few rendering frames, a GC induced
delay on a missile tracking system might turn out a bit ugly.


You have GC in games, but you limit it to a small set of 
objects (5?)

So you can have real time with GC with an upper-bound.

Putting everything under GC is probably not a future proof 
concept, since memory capacity most likely will increase faster 
than CPU speed for technical reasons.




Sure. As I mentioned in another thread, the other GC enabled 
system programming languages I know, also allow for static, 
global and stack allocation.


And you also have an escape hatch to do manual memory management 
if you really have to.


Namely Oberon(-2), Component Pascal, Active Oberon, Modula-3, 
Sing# and Cedar. Just in case you feel like looking any of them 
up.


While those ended up never being adopted by the industry at 
large, we can draw lessons from the experience of their users. 
Positive features and related flaws.


Currently I am digging up the Mesa/Cedar reports from Xerox PARC.

I think D already has the necessary features, their performance 
just needs to be improved.


--
Paulo


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-09 Thread Jesse Phillips

On Thursday, 9 January 2014 at 00:37:27 UTC, Atila Neves wrote:
Thanks. Not many votes though given all the downvotes. The 
comments manage to be even worse than on my first blog post.


For some reason they all assume I don't know C++ even though I 
know it way better than D, not to mention that they nearly all 
miss the point altogether. Sigh.


I wonder if someone who knows C++ is going to help you out and 
improve your code, much like others did with the other languages 
you used.


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-09 Thread H. S. Teoh
On Thu, Jan 09, 2014 at 08:40:29AM +, digitalmars-d-boun...@puremagic.com 
wrote:
 On Thursday, 9 January 2014 at 07:07:29 UTC, Walter Bright wrote:
 and it works without copying in D, it just returns s1. In C, I
 gotta copy, ALWAYS.
 
 Only if you write libraries, in an application you can set your own
 policies (invariants).

Yes, programming by convention, which falls flat as soon as you have a
large team on the project, and people don't know your conventions
(you'll be surprised how many seasoned programmers will just walk all
over your code writing what they're used to writing, with no thought to
read the code first and figure out how their code might fit in with the
rest). I see lots of this at my job, and it inevitably leads to
problems, because in C, people just *expect* the usual copying
conventions. Sure, if you're a one-man project, then you can remove some
of this copying, but rest assured that in a team project things will go
haywire, and inevitably you'll end up dictating that everyone must copy
everything because that's the only way to guarantee module X, which is
written by team B, doesn't do something screwy with our data.


 (C's strings being 0 terminated also forces much extra copying,
 but that's another topic.)
 
 Not if you have your own allocator and split chopped strings (you
 can just overwrite the boundary character).

You can't do this if the caller still wishes to retain the original
string.
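
A minimal C sketch of the trick being discussed and of why it fails when the caller wants to keep the original. The function name `split_in_place` is hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Splits buf at the first occurrence of sep by overwriting sep with '\0'.
   Returns a pointer to the second half, or NULL if sep is absent.
   buf must be writable: passing a string literal is undefined behaviour. */
char *split_in_place(char *buf, char sep) {
    assert(buf != NULL && sep != '\0');
    char *p = strchr(buf, sep);
    if (p == NULL) return NULL;
    *p = '\0';          /* the original, unsplit string is gone after this */
    return p + 1;
}
```

After `split_in_place(line, '=')` on a buffer holding `key=value`, `line` reads as `key` and the returned pointer as `value`; anyone still holding `line` and expecting the full text now sees a truncated string. That is the copy-or-modify decision the reply refers to.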


 The point is, no matter how slow the GC is relative to malloc, not
 allocating is faster than allocating, and a GC can greatly reduce
 the amount of alloc/copy going on.
 
 But since malloc/free is tedious c-programmers tend to avoid it by
 embedding objects in large structs and put a variable sized object
 at the end of it... Or have their own pool (possibly on the stack at
 the location where it should be released).
[...]

One thing I miss in D is a nice way to allocate structs with a
variable-length static array at the end. GCC supports this, probably
as an extension (I don't remember if the C standard specifies this). I
know I can just manually allocate this via core.gc and casts, but a
built-in solution would be really nice.
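
On the C side of that question: flexible array members have been part of standard C since C99; the older zero-length form (`char data[0]`) is the GCC extension. A minimal sketch in C, with the hypothetical names `Packet` and `packet_new`:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    size_t len;
    char   data[];   /* C99 flexible array member: must be the last field */
} Packet;

/* One malloc covers both the header and the payload.
   The caller releases the whole thing with a single free(). */
Packet *packet_new(const char *bytes, size_t len) {
    assert(bytes != NULL || len == 0);
    Packet *p = malloc(sizeof *p + len);
    if (p == NULL) return NULL;
    p->len = len;
    if (len > 0) memcpy(p->data, bytes, len);
    return p;
}
```

`sizeof(Packet)` does not include the trailing array, which is why the payload length is added explicitly in the allocation.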


T

-- 
Sometimes the best solution to morale problems is just to fire all of the 
unhappy people. -- despair.com


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-09 Thread bearophile

H. S. Teoh:


One thing I miss in D is a nice way to allocate structs with a
variable-length static array at the end. GCC supports this,
probably as an extension (I don't remember if the C standard
specifies this). I know I can just manually allocate this via
core.gc and casts, but a built-in solution would be really nice.


Since dmd 2.065 D supports this very well (it was supported in 
the past too, but less well). See:

http://rosettacode.org/wiki/Sokoban#Faster_Version

Bye,
bearophile


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-09 Thread H. S. Teoh
On Thu, Jan 09, 2014 at 09:49:15AM +, Paulo Pinto wrote:
 On Thursday, 9 January 2014 at 09:38:31 UTC, Ola Fosheim Grøstad
 wrote:
 On Thursday, 9 January 2014 at 09:10:07 UTC, Paulo Pinto wrote:
 I have only seen those things work in small AAA class teams.
 
 But you have probably seen c programs allocate a bunch of
 different small structs with a single malloc where it is known
 that they will be freed in the same location? A compiler needs
 whole program analysis to do the same.
 
 So yes, c programs will have fewer allocs if the programmer cared.
 
 Yes, I did.
 
 Not much different than memory pools in Turbo Pascal and Objective-C
 for that matter.
 
 And even more strange things, where the whole memory gets allocated
 at start, then some handles are used with mysterious macros to
 convert back and forth to real pointers.
 
 I have also seen lots of other storage tricks that go easily out of
 control when the team either grows over a certain size, or
 management decides to outsource part of the development or lowering
 the expected skill set of new team members.
 
 Then you watch the older guys playing fire brigade to track down
 issues of release X.Y.Z at customer site, almost every week.
[...]

Exactly!! All these tricks are possible in C, but that's what they
essentially are: tricks, hacks around the language. You can only keep it
up with a small, dedicated core team. As soon as the PTBs decide to hire
new grads and move people around, you're screwed, 'cos the old guy who
was in charge of the tricky macros is no longer on the team, and nobody
else understands how the macros work, and the new guys are under
pressure to show contribution, so they barge in making assumptions about
how things work -- which usually means naïve C semantics, lots of
strcpy's, direct pointer arithmetic, I don't use these weird macros 'cos
I don't understand what they do. Result: fire brigade. :-)

This is why compiler-enforced type attributes ultimately trump any kind
of coding convention. They force everyone to do the Right Thing. This is
why strings (arrays) with built-in length are better, because they allow
slicing without needing to decide whether you should copy or modify
in-place (*someone* will inevitably get it wrong).

C's superiority is keyed on the programmer being perfect -- the
philosophy of the language is to trust the programmer, to believe that
the programmer knows what he's doing. Theoretically speaking, this is a
good thing, because the compiler won't stand in your way and annoy you
when you're trying to do something clever. (This is also what made me
like C in the first place -- I was 19 at the time, so it figures. :-P)
Unfortunately, in practice, humans are fallible -- very much fallible
and error-prone -- so this philosophy only leads to pain and more pain.
With a single-person project you can still somewhat maintain some
semblance of order. But when you have a team of 15+ programmers (at my
job we have up to 50), then it's total chaos, and you start to code by
paranoia, i.e., assume everyone else will screw up and add every
possible safeguard you can think of in your part of the code, so that
when things go wrong it's not your fault. Which means every string
modification requires copying, which means performance is out the
window. It means adding layers of indirection to shield your code from
the outside world. Which means even more pointers to work with, which in
turn means you start getting into pointer management problems, and start
needing reference counting (which, as I described in an earlier post,
people *still* screw up). At some point, you start wishing C had a GC to
clean up the mess.


T

-- 
Public parking: euphemism for paid parking. -- Flora


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-09 Thread Walter Bright
On 1/9/2014 3:40 AM, Ola Fosheim Grøstad 
ola.fosheim.grostad+dl...@gmail.com wrote:

On Thursday, 9 January 2014 at 09:55:42 UTC, Walter Bright wrote:

A GC does not prevent such techniques.


No, but programmers gravitate towards less work... If alloc is transparent and
free is hidden... You gain a lot from not being explicit, but you get more
allocations overall.



GC doesn't even make those techniques harder.

I can't see any merit to the idea that GC makes for excessive allocation.


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-09 Thread Walter Bright
On 1/9/2014 2:46 AM, Ola Fosheim Grøstad 
ola.fosheim.grostad+dl...@gmail.com wrote:

On Thursday, 9 January 2014 at 09:58:24 UTC, Walter Bright wrote:

Please explain how this can work passing both string literals and allocated
strings to cat().


By having your own string allocator that tests for membership when you free (if
you allow free and foreign strings in your cat)?


How does that work when you pass it "hello"? allocated with malloc()? basically 
any data that has mixed ancestry?


Note that your code doesn't always have control over this - you may have written 
a library intended to be used by others, or you may be calling a library written 
by others.




How do you return a string that is the path part of a path/filename? (The
terminating 0 is not a problem solved by creating your own allocator.)

If you discard the original you split at '/'.


That doesn't work if you pass a string literal, or if you are not the owner of 
the data.
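
For comparison, a minimal C sketch of the length-carrying approach, which leaves the input untouched and therefore also works on string literals. The `Slice` type and `dir_part` function are hypothetical names, not part of any library mentioned in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *ptr;   /* points into the caller's string, not a copy */
    size_t      len;   /* number of bytes; the slice is NOT 0-terminated */
} Slice;

/* Returns the directory part of path (everything before the last '/'),
   or an empty slice if there is no '/'. Nothing is copied or modified. */
Slice dir_part(const char *path) {
    assert(path != NULL);
    const char *slash = strrchr(path, '/');
    Slice s = { path, slash ? (size_t)(slash - path) : 0 };
    return s;
}
```

The slice can be printed with `printf("%.*s", (int)s.len, s.ptr)`, but it cannot be handed to APIs that expect a terminating 0, and it is only valid while the original string is alive. In C that lifetime has to be tracked by hand; keeping the referenced data alive is the part a GC takes over.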




If you use your own
string allocator you don't have to worry about free... You either let the garbage
remain until the pool is released or have a separate allocation structure that
allows internal splits (no private size info before first char).


That doesn't work if you're passing strings with mixed ancestry.



Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-09 Thread Benjamin Thaut

Am 09.01.2014 11:36, schrieb Tobias Pankrath:

On Thursday, 9 January 2014 at 10:14:08 UTC, Benjamin Thaut wrote:

If requested I can make a list with all language features / decisions
so far that prevent the implementation of a state of the art GC.


At least I am interested in your observations.


Ok I will put together a list. But as I'm currently swamped with end of 
semester stuff, you shouldn't expect it within the next 3 weeks. I will 
post it on my blog (www.benjamin-thaut.de) and I will post it in the 
D.announce newsgroup.


Kind Regards
Benjamin Thaut


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-09 Thread Ola Fosheim Grøstad

On Thursday, 9 January 2014 at 17:15:46 UTC, Walter Bright wrote:
How does that work when you pass it "hello"? allocated with 
malloc()? basically any data that has mixed ancestry?


Why would you do that? You would have to overload cat then.

Note that your code doesn't always have control over this - you 
may have written a library intended to be used by others, or 
you may be calling a library written by others.


The typical C (and the old C++) way has been to roll your own to 
get what you want and only use very focused libraries (like zlib, 
fft etc), or only use one big framework that defines all its own 
stuff in an efficient and uniform manner with its own systems 
(Qt etc).


But it becomes tedious when using more than one framework.


That doesn't work if you're passing strings with mixed ancestry.


Well, you have to decide if you want to roll your own, use a 
framework or use the old C way.


The point is more: you can make your own and make it 
C-compatible, and reasonably efficient.


Usually there are different representations that are more or less 
efficient or convenient based on what you want to do. Even for 
strings.  For instance, you can have a high speed ascii MSB  
string representation that is 64 bit aligned and that sorts fine 
using 64 bit uint, and which is 0 terminated (padded to the 8 
byte-aligned boundary).
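
One possible reading of that representation, sketched in C for a single 8-byte word: the first character goes into the most significant byte and the rest is zero-padded, so an unsigned integer comparison gives the same order as a bytewise string comparison. The name `pack_key` is hypothetical, and truncating longer input to 8 bytes is an assumption of this sketch; a full string would be an array of such words.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Packs up to 8 bytes of s into a uint64_t, first character in the most
   significant byte, zero-padded on the right. Longer input is truncated. */
uint64_t pack_key(const char *s) {
    assert(s != NULL);
    uint64_t key = 0;
    size_t n = strlen(s);
    if (n > 8) n = 8;
    for (size_t i = 0; i < 8; i++) {
        unsigned char c = (i < n) ? (unsigned char)s[i] : 0;
        key = (key << 8) | c;
    }
    return key;
}
```

With this layout `pack_key("ab") < pack_key("abc") < pack_key("abd")`, so short keys can be sorted with plain integer comparisons.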


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-09 Thread Ola Fosheim Grøstad

On Thursday, 9 January 2014 at 17:17:53 UTC, Walter Bright wrote:

GC doesn't even make those techniques harder.

I can't see any merit to the idea that GC makes for excessive 
allocation.


People do what they are accustomed to and what is easy. Library 
writers are more likely to do allocation for you if they can 
forget about ownership.


I am more likely to use several single object new calls in C++, 
and more likely to do a shared malloc in C. C++ supports RAII, C 
doesn't. A shared malloc is a cheap version of RAII.
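
A minimal C sketch of what such a shared malloc can look like: two arrays of different struct types carved out of one allocation and released with one free. The types `Node`, `Edge`, `Graph` and the functions `graph_init` and `graph_destroy` are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { int id; double weight; } Node;   /* hypothetical types */
typedef struct { int from, to; } Edge;

/* The Edge array is placed right after the Node array, so Node must be
   at least as strictly aligned as Edge (checked at compile time, C11). */
_Static_assert(_Alignof(Node) >= _Alignof(Edge),
               "Edge array must not need stricter alignment than Node");

typedef struct {
    Node *nodes;
    Edge *edges;
    void *block;   /* the single allocation backing both arrays */
} Graph;

/* Allocates n nodes and m edges in one block. Returns 0 on success,
   -1 if the allocation fails. */
int graph_init(Graph *g, size_t n, size_t m) {
    assert(g != NULL);
    g->block = malloc(n * sizeof(Node) + m * sizeof(Edge));
    if (g->block == NULL) return -1;
    g->nodes = (Node *)g->block;
    g->edges = (Edge *)((char *)g->block + n * sizeof(Node));
    return 0;
}

/* One free releases everything that was allocated together. */
void graph_destroy(Graph *g) {
    assert(g != NULL);
    free(g->block);
    g->block = NULL;
    g->nodes = NULL;
    g->edges = NULL;
}
```

The single `free` in `graph_destroy` is what makes this resemble scope-based cleanup: everything with the same lifetime disappears at one known point.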


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-09 Thread Walter Bright
On 1/9/2014 10:18 AM, Ola Fosheim Grøstad 
ola.fosheim.grostad+dl...@gmail.com wrote:

On Thursday, 9 January 2014 at 17:15:46 UTC, Walter Bright wrote:

How does that work when you pass it "hello"? allocated with malloc()?
basically any data that has mixed ancestry?


Why would you do that? You would have to overload cat then.


So you agree that it won't work.

BTW, it happens all the time when dealing with strings. For example, dealing 
with filenames, file extensions, and paths. Components can come from the command 
line, string literals, malloc, slices, etc., all mixed up together.


Overloading doesn't work because a string literal and a string allocated by 
something else have the same type.




That doesn't work if you're passing strings with mixed ancestry.


Well, you have to decide if you want to roll your own, use a framework or use
the old C way.

The point is more: you can make your own and make it C-compatible, and
reasonably efficient.


My point is you can't avoid making the extra copies without GC in any reasonable 
way.
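
A minimal C sketch of the copy that becomes unavoidable: because a caller cannot tell a literal from a malloc'ed string, a `cat` with a simple ownership rule has to return fresh memory every time, even when one input is empty and returning the other unchanged would have been enough. The signature below is an assumption for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Always returns a fresh malloc'ed string, even when one input is empty,
   so the caller can unconditionally free() the result.
   Returns NULL on allocation failure. Inputs are never modified or freed. */
char *cat(const char *s1, const char *s2) {
    assert(s1 != NULL && s2 != NULL);
    size_t n1 = strlen(s1), n2 = strlen(s2);
    char *out = malloc(n1 + n2 + 1);
    if (out == NULL) return NULL;
    memcpy(out, s1, n1);
    memcpy(out + n1, s2, n2 + 1);   /* also copies the terminating '\0' */
    return out;
}
```

Returning `s1` itself when `s2` is empty would save an allocation, but then the caller could no longer know whether the result may be passed to `free()`.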




Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-09 Thread Sean Kelly

On Wednesday, 8 January 2014 at 19:17:08 UTC, H. S. Teoh wrote:

On Wed, Jan 08, 2014 at 11:35:19AM +, Atila Neves wrote:

http://atilanevesoncode.wordpress.com/2014/01/08/adding-java-and-c-to-the-mqtt-benchmarks-or-how-i-learned-to-stop-worrying-and-love-the-garbage-collector/


I have to say, this is also my experience with C++ after I learnt D.
Writing C++ is just so painful, so time-consuming, and so not rewarding
for the amount of effort you put into it, that I just can't bring myself
to write C++ anymore when I have the choice. And manual memory
management is a big part of that time sink. Which is why I believe that
a lot of the GC-phobia among the C/C++ folk is misplaced.  I can
sympathise, though, because coming from a C/C++ background myself, I was
highly skeptical of GC'd languages, and didn't find it to be a
particularly appealing aspect of D when I first started learning it.


But as I learned D, I eventually got used to having the GC around, and
discovered that it not only reduced the number of memory bugs
dramatically, it also increased my productivity dramatically: I never
realized just how much time and effort it took to write code with manual
memory management: you constantly have to think about how exactly you're
going to be storing your objects, who it's going to get passed to, how
to decide who's responsible for freeing it, what's the best strategy for
deciding who allocates and who frees. These considerations permeate
every aspect of your code, because you need to know whether to
pass/return an object* to someone, and whether this pointer implies
transfer of ownership or not, since that determines who's responsible to
free it, etc. Even with C++'s smart pointers, you still have to decide
which one to use, and what pitfalls are associated with them (beware of
cycles with refcounted pointers, passing auto_ptr to somebody might
invalidate it after they return, etc.). It's like income tax: on just
about every line of code you write, you have to pay the memory
management tax of extra mental overhead and time spent fixing pointer
bugs in order to not get the IRS (Invalid Reference Segfault :P)
knocking on your shell prompt.


This is what initially drew me to D from C++.  Having a GC is a
huge productivity gain.

Manual memory management is a LOT of effort, and to be quite 
honest, unless you're writing an AAA 3D game engine, you don't 
*need* that last 5% performance improvement that manual memory 
management *might* give you. That is, if you get it right. 
Which most C/C++ coders don't.


The other common case is server apps, since unpredictable delays
can be quite undesirable as well.  Java seems to mostly get
around this by having very mature and capable GCs despite having
a standard library that wants you to churn through memory like
pies at an eating contest.  The best you can do with D so far is
mostly to just not allocate whenever possible, by slicing strings
and such, since scanning can still be costly.  I think there's
still some work to do here, despite loving the GC as a general
feature.


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-09 Thread Paulo Pinto

Am 09.01.2014 19:34, schrieb Walter Bright:

On 1/9/2014 10:18 AM, Ola Fosheim Grøstad
ola.fosheim.grostad+dl...@gmail.com wrote:

On Thursday, 9 January 2014 at 17:15:46 UTC, Walter Bright wrote:

How does that work when you pass it "hello"? allocated with malloc()?
basically any data that has mixed ancestry?


Why would you do that? You would have to overload cat then.


So you agree that it won't work.

BTW, it happens all the time when dealing with strings. For example,
dealing with filenames, file extensions, and paths. Components can come
from the command line, string literals, malloc, slices, etc., all mixed
up together.

Overloading doesn't work because a string literal and a string allocated
by something else have the same type.



That doesn't work if you're passing strings with mixed ancestry.


Well, you have to decide if you want to roll your own, use a framework
or use
the old C way.

The point is more: you can make your own and make it C-compatible, and
reasonably efficient.


My point is you can't avoid making the extra copies without GC in any
reasonable way.



Every time I see such discussions, it reminds me when I started coding 
in the mid-80s and the heresy of using languages like Pascal and C 
dialects for microcomputers, instead of coding everything in Assembly or 
Forth.


:)

--
Paulo


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-09 Thread Ola Fosheim Grøstad

On Thursday, 9 January 2014 at 18:34:58 UTC, Walter Bright wrote:

On 1/9/2014 10:18 AM, Ola Fosheim Grøstad

Why would you do that? You would have to overload cat then.


So you agree that it won't work.


It will work for string literals or for malloc'ed strings, but 
not for both using the same function unless you start to depend 
on the data sections used for literals (memory range testing). 
Which is a dirty tool-dependent hack.


Overloading doesn't work because a string literal and a string 
allocated by something else have the same type.


Not if you return your own type, but have the same structure? You 
return a struct, containing a variable-sized array of char, and 
overload on that?


But I see your point regarding literal/malloc, const char* and 
char* is a shady area, you can basically get anything cast to 
const char*.


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-09 Thread H. S. Teoh
On Thu, Jan 09, 2014 at 07:08:42PM +, digitalmars-d-boun...@puremagic.com 
wrote:
 On Thursday, 9 January 2014 at 18:34:58 UTC, Walter Bright wrote:
 On 1/9/2014 10:18 AM, Ola Fosheim Grøstad
 Why would you do that? You would have to overload cat then.
 
 So you agree that it won't work.
 
 It will work for string literals or for malloc'ed strings, but not
 for both using the same function unless you start to depend on the
 data sections used for literals (memory range testing). Which is a
 dirty tool-dependent hack.
 
 Overloading doesn't work because a string literal and a string
 allocated by something else have the same type.
 
 Not if you return your own type, but have the same structure? You
 return a struct, containing a variable-sized array of char, and
 overload on that?
 
 But I see your point regarding literal/malloc, const char* and char*
 is a shady area, you can basically get anything cast to const char*.

And since it is C, people expect to pass char* and const char* around.
So most likely what will happen is that if there's any way at all to get
a char* or const char* out of your opaque struct, they will do it, and
then pass it to strcat, strlen, and who knows what else. You can't
really stop this except by convention, because the language doesn't
enforce the encapsulation, and making it truly opaque (via void* with
PIMPL) will require an extra layer of indirection and make it unusable
with commonly-expected C APIs like printf.

But we all know what happens with programming by convention when the
team grows bigger -- old people who know the Right Way of doing things
leave, and new people come in ignorant of how things are Supposed To Be,
falling back to const char*, so the code quickly degenerates into a
horrible mess of mixed conventions and memory leaks / pointer bugs
everywhere. Then you start strdup'ing everything Just In Case. Which was
Walter's original point.


T

-- 
By understanding a machine-oriented language, the programmer will tend to use a 
much more efficient method; it is much closer to reality. -- D. Knuth


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-09 Thread H. S. Teoh
On Thu, Jan 09, 2014 at 08:16:12PM +0100, Paulo Pinto wrote:
 Am 09.01.2014 19:34, schrieb Walter Bright:
 On 1/9/2014 10:18 AM, Ola Fosheim Grøstad
 ola.fosheim.grostad+dl...@gmail.com wrote:
 On Thursday, 9 January 2014 at 17:15:46 UTC, Walter Bright wrote:
 How does that work when you pass it "hello"? allocated with
 malloc()?  basically any data that has mixed ancestry?
 
 Why would you do that? You would have to overload cat then.
 
 So you agree that it won't work.
 
 BTW, it happens all the time when dealing with strings. For example,
 dealing with filenames, file extensions, and paths. Components can
 come from the command line, string literals, malloc, slices, etc.,
 all mixed up together.
 
 Overloading doesn't work because a string literal and a string
 allocated by something else have the same type.
 
 
 That doesn't work if you're passing strings with mixed ancestry.
 
 Well, you have to decide if you want to roll your own, use a
 framework or use the old C way.
 
 The point is more: you can make your own and make it C-compatible,
 and reasonably efficient.
 
 My point is you can't avoid making the extra copies without GC in any
 reasonable way.
 
 
 Every time I see such discussions, it reminds me when I started
 coding in the mid-80s and the heresy of using languages like Pascal
 and C dialects for microcomputers, instead of coding everything in
 Assembly or Forth.
 
 :)
[...]

Ah, the good ole 80's. I remember I was strongly pro-assembly in those
days. Back then compiler / interpreter technology was still rather
young, and the little that I saw of it didn't leave a good impression,
so I regarded all high-level languages with suspicion. :) Especially
languages that sport nice string operators, since back then many
language implementations had rather naïve string implementations, which
are really slow and inefficient.


T

-- 
Always remember that you are unique. Just like everybody else. -- despair.com


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-09 Thread H. S. Teoh
On Thu, Jan 09, 2014 at 07:01:59PM +, Sean Kelly wrote:
 On Wednesday, 8 January 2014 at 19:17:08 UTC, H. S. Teoh wrote:
[...]
 Manual memory management is a LOT of effort, and to be quite
 honest, unless you're writing an AAA 3D game engine, you don't
 *need* that last 5% performance improvement that manual memory
 management *might* give you. That is, if you get it right. Which
 most C/C++ coders don't.
 
 The other common case is server apps, since unpredictable delays
 can be quite undesirable as well.  Java seems to mostly get
 around this by having very mature and capable GCs despite having
 a standard library that wants you to churn through memory like
 pies at an eating contest.  The best you can do with D so far is
 mostly to just not allocate whenever possible, by slicing strings
 and such, since scanning can still be costly.  I think there's
 still some work to do here, despite loving the GC as a general
 feature.

I think we all agree that D's GC in its current state needs a lot of
improvement. While I have come to accept GCs as a good thing, that
doesn't mean that D's current GC is *that* good. Yet. I wish I had the
know-how (and the time!) to improve D's GC, because if D can get a GC
that's on par with Java's, then D can totally beat Java flat, since the
existence of value types greatly reduces the memory pressure on the GC,
so the GC will have much less work to do compared to an equivalent Java
program.

OTOH, even with D's suboptimal GC, I'm already seeing great productivity
gains at only a low cost, so that's a big thumbs up for GC's. And the
nice thing about being able to call malloc from D (which you can't in
Java) means you can still do manual memory management in critical code
sections when you need to squeeze out some extra performance.


T

-- 
Turning your clock 15 minutes ahead won't cure lateness---you're just making 
time go faster!


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-09 Thread Paulo Pinto

Am 09.01.2014 20:40, schrieb H. S. Teoh:

On Thu, Jan 09, 2014 at 07:01:59PM +, Sean Kelly wrote:

On Wednesday, 8 January 2014 at 19:17:08 UTC, H. S. Teoh wrote:

[...]

Manual memory management is a LOT of effort, and to be quite
honest, unless you're writing an AAA 3D game engine, you don't
*need* that last 5% performance improvement that manual memory
management *might* give you. That is, if you get it right. Which
most C/C++ coders don't.


The other common case is server apps, since unpredictable delays
can be quite undesirable as well.  Java seems to mostly get
around this by having very mature and capable GCs despite having
a standard library that wants you to churn through memory like
pies at an eating contest.  The best you can do with D so far is
mostly to just not allocate whenever possible, by slicing strings
and such, since scanning can still be costly.  I think there's
still some work to do here, despite loving the GC as a general
feature.


I think we all agree that D's GC in its current state needs a lot of
improvement. While I have come to accept GCs as a good thing, that
doesn't mean that D's current GC is *that* good. Yet. I wish I had the
know-how (and the time!) to improve D's GC, because if D can get a GC
that's on par with Java's, then D can totally beat Java flat, since the
existence of value types greatly reduces the memory pressure on the GC,
so the GC will have much less work to do compared to an equivalent Java
program.

OTOH, even with D's suboptimal GC, I'm already seeing great productivity
gains at only a low cost, so that's a big thumbs up for GC's. And the
nice thing about being able to call malloc from D (which you can't in
Java) means you can still do manual memory management in critical code
sections when you need to squeeze out some extra performance.


T



Well, there are a few options to call malloc from Java:

- Write your own JNI wrapper
- Use Java Native Access
- Use Java Native Runtime
- Use NIO Buffers
- Use sun.misc.Unsafe.allocateMemory (sun.misc.Unsafe is planned to 
become a public API)


--
Paulo


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-09 Thread qznc

On Thursday, 9 January 2014 at 19:41:43 UTC, H. S. Teoh wrote:

because if D can get a GC
that's on par with Java's, then D can totally beat Java flat, 
since the
existence of value types greatly reduces the memory pressure on 
the GC,
so the GC will have much less work to do compared to an 
equivalent Java

program.


Java will probably gain (something like) value types at some 
point. Search for "packed objects"; they provide gains similar to 
value types.


Hopefully, D gets a better GC first.


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-09 Thread Ola Fosheim Grøstad

On Thursday, 9 January 2014 at 19:16:10 UTC, Paulo Pinto wrote:
Every time I see such discussions, it reminds me when I started 
coding in the mid-80s and the heresy of using languages like 
Pascal and C dialects for microcomputers, instead of coding 
everything in Assembly or Forth


If you insist on bringing up heresy...

Motorola 680xx is pretty nice compared to x86, although the 
AMD64bit mode is better than it was. 680xx feels almost like C, 
just better ;9, I think only MIPS is close in programmer 
friendliness.  Forth is nice too, very minimalistic and quite 
powerful for the simplistic implementation. I had a Forth64 
module for my C64 to dabble with, a bit hard to create more than 
toy programs in Forth... Postscript is pretty close actually, and 
clean. But Forth is dense, so dense that you don't edit text 
files, you edit text screens...  But don't diss assembly, try to 
get more than 8 sprites and move sprites into the screen border 
without assembly, can't be done! High level languages, my ass, 
BASIC can't do that!


But hey, I am not arguing in favour of Forth and C (although I 
would argue in favour of 680xx and MIPS). I am arguing in favour 
of smart compilers that allow you to go low level at the top of 
the call stack where it matters (inner loops) without having to 
resort to a different language. D is close to that, so it is a 
promising candidate.


And... I actually think D is too lax in some areas. I don't think 
you should be allowed to call C/C++ without nailing down the 
pre/postconditions, basically describing what happens in terms of 
optimization constraints. I also want the programmer to be able 
to assert facts that the compiler fails to prove so that they can be 
used for optimization. Basically the ability to guide the 
optimizer so you don't have to resort to low level coding. I also 
think giving access to malloc is a bad idea. :-P


And well, I am not new to GC, I have actually used Simula quite a 
bit in classes/teaching newbies. Simula incidentally has exactly 
the same Garbage Collector that D has AFAIK.  I remember we had a 
1970s internal memo describing the garbage collector of Simula on 
the curriculum of the compiler course... So that is very old 
news.


Actually Simula kinda has the same kind of string type 
representation that D has too. And OO. And it has coroutines… 
While it doesn't have templates, it does actually have name 
parameters that have textual substitution semantics (in addition 
to ref and value). Now I also kinda like that it has :- for 
reference assignment and := for value assignment, but I didn't 
like it back then.


45 years later D merges Simula semantics with C (and some more 
stuff). And that is an interesting thing, of course.


But hey, no point in pretending that other people don't know what 
programming a GC high level language entails. If I want low 
latency, I go to C/C++ and hopefully D. If I want high level 
productivity I use whatever fits the bill… all GC languages. But 
I don't think D should be the first option in any non-speed area 
yet, so the GC is of limited use for now IMO. (In clusters you 
might want that though, speed+convenience but no need for low 
latency.)


I think D could pick up more good stuff from Python, like the 
array closures that allow you to succinctly transform arrays. 
Makes large portions of Python's standard library pointless.


What I really like about D is that the front end code appears to 
be quite readable. Take a look at clang and you will see the 
difference. So, I guess anyone with C++ knowledge has the 
opportunity to tune both syntax and semantics to their own liking 
and share it with others. That's pretty sweet (I'd like to try 
that one day).


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-09 Thread deadalnix

On Thursday, 9 January 2014 at 22:02:48 UTC, Ola Fosheim Grøstad
wrote:
What I really like about D is that the front end code appears 
to be quite readable. Take a look at clang and you will see the 
difference. So, I guess anyone with C++ knowledge has the 
opportunity to tune both syntax and semantics to their own 
liking and share it with others. That's pretty sweet (I'd like 
to try that one day).


This definitively convinced me that you must be very high on
drugs.


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-09 Thread Ola Fosheim Grøstad

On Thursday, 9 January 2014 at 22:15:18 UTC, deadalnix wrote:

This definitively convinced me that you must be very high on
drugs.


Why is that? I have browsed the repositories and had no problems 
figuring out what was going on from what I read. I don't 
understand all the interdependencies of course, but making small 
changes should not be a big deal from what I've seen.




Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-09 Thread Brian Rogoff

On Thursday, 9 January 2014 at 21:35:45 UTC, qznc wrote:

On Thursday, 9 January 2014 at 19:41:43 UTC, H. S. Teoh wrote:

because if D can get a GC that's on par with Java's, then D can 
totally beat Java flat, since the existence of value types 
greatly reduces the memory pressure on the GC, so the GC will 
have much less work to do compared to an equivalent Java program.


Java will probably gain (something like) value types at some 
point. Google for packed objects, it provides similar gains 
as value types.


Hopefully, D gets a better GC first.


What's the status of all that? There were interesting talks at 
DConf 2013 about precise and concurrent GCs, and it seemed that 
work was going on to fold all that into the compilers, and that 
Walter/Andrei were ready to make changes to the spec and runtime 
if needed to support precise GC. All very encouraging.


Will DMD have a precise GC by the next DConf?

-- Brian



Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-09 Thread Joseph Rushton Wakeling

On 08/01/14 21:22, Paulo Pinto wrote:

As I shared a few times here, it was Oberon which opened my eyes
to GC enabled systems programming languages, around 1996, maybe.


What was the GC design for Oberon, and how does that relate to what's in D (and 
what's in other GC'd languages)?




Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-09 Thread H. S. Teoh
On Thu, Jan 09, 2014 at 10:51:22PM +, Brian Rogoff wrote:
 On Thursday, 9 January 2014 at 21:35:45 UTC, qznc wrote:
 On Thursday, 9 January 2014 at 19:41:43 UTC, H. S. Teoh wrote:
 because if D can get a GC that's on par with Java's, then D can
 totally beat Java flat, since the existence of value types greatly
 reduces the memory pressure on the GC, so the GC will have much less
 work to do compared to an equivalent Java program.
 
 Java will probably gain (something like) value types at some
 point. Google for packed objects, it provides similar gains as
 value types.
 
 Hopefully, D gets a better GC first.
 
 What's the status of all that? There were interesting talks at DConf
 2013 about precise and concurrent GCs, and it seemed that work was
 going on to fold all that into the compilers, and that Walter/Andrei
 were ready to make changes to the spec and runtime if needed to
 support precise GC. All very encouraging.
 
 Will DMD have a precise GC by the next DConf?
[...]

Has *anything* been done on the GC at all since the previous DConf? Not
trying to be provocative, just genuinely curious if anything has been
happening on that front, since I don't remember seeing any commits in
that area all year.


T

-- 
I'm running Windows '98. Yes. My computer isn't working now. Yes, you 
already said that. -- User-Friendly


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-09 Thread Walter Bright

On 1/9/2014 3:29 PM, H. S. Teoh wrote:

Has *anything* been done on the GC at all since the previous DConf? Not
trying to be provocative, just genuinely curious if anything has been
happening on that front, since I don't remember seeing any commits in
that area all year.


Not much.



Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-08 Thread Atila Neves

http://atilanevesoncode.wordpress.com/2014/01/08/adding-java-and-c-to-the-mqtt-benchmarks-or-how-i-learned-to-stop-worrying-and-love-the-garbage-collector/


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-08 Thread bearophile

Atila Neves:


http://atilanevesoncode.wordpress.com/2014/01/08/adding-java-and-c-to-the-mqtt-benchmarks-or-how-i-learned-to-stop-worrying-and-love-the-garbage-collector/


In this file:
https://github.com/atilaneves/mqtt/blob/master/mqttd/factory.d

Instead of code:

switch(fixedHeader.type) {
case MqttType.CONNECT:
    return cereal.value!MqttConnect(fixedHeader);
case MqttType.CONNACK:


Perhaps you want code as:

final switch(fixedHeader.type) with (MqttType) {
case CONNECT:
    return cereal.value!MqttConnect(fixedHeader);
case CONNACK:
    ...


Or even (modifying the enum):

final switch(fixedHeader.type) with (MqttType) {
case connect:
    return cereal.value!MqttConnect(fixedHeader);
case connack:
    ...


Bye,
bearophile


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-08 Thread Paulo Pinto

On Wednesday, 8 January 2014 at 11:35:21 UTC, Atila Neves wrote:

http://atilanevesoncode.wordpress.com/2014/01/08/adding-java-and-c-to-the-mqtt-benchmarks-or-how-i-learned-to-stop-worrying-and-love-the-garbage-collector/


Thanks for sharing your experience.

It goes with my experience moving enterprise server code from C++ 
to JVM/.NET land.


What people forget about C++ smart pointers vs 
Objective-C/Rust/ParaSail ones is that without compiler support, 
you just spend too much time doing the said operations.


Over the holidays I spent some time researching about the 
Mesa/Cedar system developed at Xerox PARC. Cedar was already a GC 
enabled systems programming language, strongly typed.


Quite remarkable what the system could do as a GUI desktop 
workstation in the early 80's and we are still fighting in 2014 
to get GC enabled systems programming languages accepted in the 
mainstream.


--
Paulo


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-08 Thread Atila Neves
Thanks. I didn't think of using with, possibly because I've never 
used it before. It's one of those cool little features that I 
liked when I read about it but never remember about later.


I didn't use final switch on purpose; I normally would, but I 
didn't implement all the possible MQTT message types. If I ever 
do, it'll definitely be a final switch.


Atila

On Wednesday, 8 January 2014 at 12:35:02 UTC, bearophile wrote:

Atila Neves:


http://atilanevesoncode.wordpress.com/2014/01/08/adding-java-and-c-to-the-mqtt-benchmarks-or-how-i-learned-to-stop-worrying-and-love-the-garbage-collector/


In this file:
https://github.com/atilaneves/mqtt/blob/master/mqttd/factory.d

Instead of code:

switch(fixedHeader.type) {
case MqttType.CONNECT:
    return cereal.value!MqttConnect(fixedHeader);
case MqttType.CONNACK:


Perhaps you want code as:

final switch(fixedHeader.type) with (MqttType) {
case CONNECT:
    return cereal.value!MqttConnect(fixedHeader);
case CONNACK:
    ...


Or even (modifying the enum):

final switch(fixedHeader.type) with (MqttType) {
case connect:
    return cereal.value!MqttConnect(fixedHeader);
case connack:
    ...


Bye,
bearophile




Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-08 Thread bearophile

Atila Neves:


http://atilanevesoncode.wordpress.com/2014/01/08/adding-java-and-c-to-the-mqtt-benchmarks-or-how-i-learned-to-stop-worrying-and-love-the-garbage-collector/


Going to Reddit?

Bye,
bearophile


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-08 Thread Atila Neves
I don't know if I have enough rep for it, I'd appreciate it if 
someone who does posts it there.


On Wednesday, 8 January 2014 at 18:24:00 UTC, bearophile wrote:

Atila Neves:


http://atilanevesoncode.wordpress.com/2014/01/08/adding-java-and-c-to-the-mqtt-benchmarks-or-how-i-learned-to-stop-worrying-and-love-the-garbage-collector/


Going to Reddit?

Bye,
bearophile




Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-08 Thread Paulo Pinto

Am 08.01.2014 19:31, schrieb Atila Neves:

I don't know if I have enough rep for it, I'd appreciate it if someone
who does posts it there.

On Wednesday, 8 January 2014 at 18:24:00 UTC, bearophile wrote:

Atila Neves:


http://atilanevesoncode.wordpress.com/2014/01/08/adding-java-and-c-to-the-mqtt-benchmarks-or-how-i-learned-to-stop-worrying-and-love-the-garbage-collector/



Going to Reddit?

Bye,
bearophile




Done

http://www.reddit.com/r/programming/comments/1uqabe/adding_java_and_c_to_the_mqtt_benchmarks_or_how_i/

http://www.reddit.com/r/d_language/comments/1uqa4d/adding_java_and_c_to_the_mqtt_benchmarks_or_how_i/

--
Paulo


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-08 Thread H. S. Teoh
On Wed, Jan 08, 2014 at 11:35:19AM +, Atila Neves wrote:
 http://atilanevesoncode.wordpress.com/2014/01/08/adding-java-and-c-to-the-mqtt-benchmarks-or-how-i-learned-to-stop-worrying-and-love-the-garbage-collector/

I have to say, this is also my experience with C++ after I learnt D.
Writing C++ is just so painful, so time-consuming, and so not rewarding
for the amount of effort you put into it, that I just can't bring myself
to write C++ anymore when I have the choice. And manual memory
management is a big part of that time sink. Which is why I believe that
a lot of the GC-phobia among the C/C++ folk is misplaced.  I can
sympathise, though, because coming from a C/C++ background myself, I was
highly skeptical of GC'd languages, and didn't find it to be a
particularly appealing aspect of D when I first started learning it.

But as I learned D, I eventually got used to having the GC around, and
discovered that not only did it reduce the number of memory bugs
dramatically, it also increased my productivity dramatically: I never
realized just how much time and effort it took to write code with manual
memory management: you constantly have to think about how exactly you're
going to be storing your objects, who it's going to get passed to, how
to decide who's responsible for freeing it, what's the best strategy for
deciding who allocates and who frees. These considerations permeate
every aspect of your code, because you need to know whether to
pass/return an object* to someone, and whether this pointer implies
transfer of ownership or not, since that determines who's responsible to
free it, etc. Even with C++'s smart pointers, you still have to decide
which one to use, and what pitfalls are associated with them (beware of
cycles with refcounted pointers, passing auto_ptr to somebody might
invalidate it after they return, etc.). It's like income tax: on just
about every line of code you write, you have to pay the memory
management tax of extra mental overhead and time spent fixing pointer
bugs in order to not get the IRS (Invalid Reference Segfault :P)
knocking on your shell prompt.

Manual memory management is a LOT of effort, and to be quite honest,
unless you're writing an AAA 3D game engine, you don't *need* that last
5% performance improvement that manual memory management *might* give
you. That is, if you get it right. Which most C/C++ coders don't.

Case in point: recently at work I had the dubious pleasure of
encountering some C code with a particularly pathological memory
mismanagement bug.  To give a bit of context: in the past, this part of
the code used to be completely manually-managed with malloc's and free's
everywhere. Just like most C code that implements business logic, it
worked well when the original people who wrote it maintained it. But
life happens, and people leave and new people come, so over time, the
code degenerated into a sad mess riddled with memory leaks and pointer
bugs everywhere. So the team lead finally put his foot down, and
replaced much of that old code with a ref-counted infrastructure. (This
being C, installing a GC was too much work; plus, GC-phobia is pretty
strong in these parts.) After all, ref-counting is the silver bullet to
cure manual memory management troubles, right? Well...

Fast-forward a couple o' years, and here I am, helping a coworker figure
out why the code was crashing. Long story short, we eventually found
that it was keeping a ref-counted container that contains two (or more)
ref-counted objects, each of which represented an async task spawned by
the parent process. The idea behind this code was to run multiple
computations on the same data, and we will use the results from whoever
finishes first. The remaining task(s) will simply be terminated. So
*somebody*, noting that we had a ref-counted system, decided to take
advantage of that fact by setting it up so that when a task finishes, it
will destroy the sub-object it's associated with, and the dtor of this
object (which will be automatically invoked by the ref-counting system)
will then walk the container and destruct every other object, which in
turn will terminate their associated tasks. Anybody spot the problem
yet? The reasoning (as far as I can reconstruct it, anyway), goes: In
order for the dtor to destruct the remaining tasks, we just have to
decrement the refcount on the container object; since there should only
be 1 reference to it, this will cause it to dip to 0, and then the
container's dtor will take care of cleaning up all the other tasks. But
in order for the task, when it finishes, to trigger the dtor of its
associated sub-object, the refcount of the sub-object must be 1,
otherwise the dtor won't trigger and we'll get stuck. So either the
container's reference to the sub-object shouldn't be counted, or the
task's reference to the sub-object shouldn't be counted. ... And it
just goes downhill from there.
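
The ownership puzzle in this story can be sketched in C++ with std::shared_ptr and std::weak_ptr. This is only an illustration, not the code from work: the names Race, Task, make_race and finish are made up, and it assumes the simplest resolution, namely that the container owns the tasks while each task keeps an uncounted back-reference to its container.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Race;  // hypothetical container of competing tasks

struct Task {
    std::weak_ptr<Race> race;  // non-owning back-reference: not counted
    bool finished = false;
    bool cancelled = false;
};

struct Race {
    std::vector<std::shared_ptr<Task>> tasks;  // owning (counted) references
};

// Build a race with n tasks, each pointing back at its container.
std::shared_ptr<Race> make_race(int n) {
    assert(n > 0);
    auto race = std::make_shared<Race>();
    for (int i = 0; i < n; ++i) {
        auto task = std::make_shared<Task>();
        task->race = race;  // weak: does not bump the container's refcount
        race->tasks.push_back(task);
    }
    return race;
}

// Called when `winner` completes: mark every other task as cancelled.
void finish(Task& winner) {
    winner.finished = true;
    if (auto race = winner.race.lock()) {  // container may already be gone
        for (auto& other : race->tasks)
            if (other.get() != &winner) other->cancelled = true;
    }
}
```

Because only one direction of the container/task relationship is counted, dropping the last external reference to the Race frees everything, and a task that outlives its container simply finds the back-reference expired. Getting this right still means deciding by hand which edge is the owning one, which is exactly the kind of bookkeeping the story is about.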

So much for refcounting solving memory-management woes. I'm becoming
more and 

Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-08 Thread Andrei Alexandrescu

On 1/8/14 3:35 AM, Atila Neves wrote:

http://atilanevesoncode.wordpress.com/2014/01/08/adding-java-and-c-to-the-mqtt-benchmarks-or-how-i-learned-to-stop-worrying-and-love-the-garbage-collector/


http://www.reddit.com/r/programming/comments/1uqabe/adding_java_and_c_to_the_mqtt_benchmarks_or_how_i/?already_submitted=true

Andrei



Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-08 Thread Andrei Alexandrescu

On 1/8/14 11:15 AM, H. S. Teoh wrote:

On Wed, Jan 08, 2014 at 11:35:19AM +, Atila Neves wrote:

http://atilanevesoncode.wordpress.com/2014/01/08/adding-java-and-c-to-the-mqtt-benchmarks-or-how-i-learned-to-stop-worrying-and-love-the-garbage-collector/



[snip]

You may want to paste all that as a reddit comment.

Andrei



Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-08 Thread Benjamin Thaut

Am 08.01.2014 20:15, schrieb H. S. Teoh:

Manual memory management is a LOT of effort, and to be quite honest,
unless you're writing an AAA 3D game engine, you don't *need* that last
5% performance improvement that manual memory management *might* give
you. That is, if you get it right. Which most C/C++ coders don't.



The problem is that with the current D-GC it's not 5%. It's 300%.  See: 
http://3d.benjamin-thaut.de/?p=20
And people who are currently using C++ use C++ for a reason. And usually 
this reason is performance. As long as D remains with its current GC 
people will refuse to switch, given the 300% speed impact.
Additionally, programming with a GC often leads to a lot more allocations, 
and to programmers being unaware of all those allocations and the 
possibility that those allocations slow down the program and might even 
thrash the cache. Programmers who properly learned manual memory 
management are often more aware of what's happening in the background and 
how to optimize algorithms for memory usage, which can lead to 
astonishing performance improvements on modern hardware.


Also a GC is for automatic memory management. But memory is just a 
resource. And there are a lot of other resources than just memory. Having a 
GC does not free you from doing other manual resource management, which 
still can be annoying and can create the exact same issues as with 
manual memory management. Having a large C# codebase where almost 
everything implements the IDisposable interface doesn't really improve 
the situation. It would be a lot better if GCs would focus on automatic 
resource management in general, so the user is freed of all such tedious 
tasks, and not just a portion of them.
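
For comparison, this is roughly what scope-bound release of a non-memory resource looks like in C++. It is a minimal sketch under assumptions: the class name ScopedResource and the callback-based design are invented for illustration and do not come from any library mentioned in this thread.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Runs a release action exactly once, when the owning scope ends.
class ScopedResource {
public:
    explicit ScopedResource(std::function<void()> release)
        : release_(std::move(release)) {
        assert(release_ && "a release action is required");
    }

    ~ScopedResource() {
        if (release_) release_();  // deterministic: runs at scope exit
    }

    // Copying would release twice, so only moves are allowed.
    ScopedResource(const ScopedResource&) = delete;
    ScopedResource& operator=(const ScopedResource&) = delete;

    ScopedResource(ScopedResource&& other)
        : release_(std::move(other.release_)) {
        other.release_ = nullptr;  // moved-from object must not release
    }

private:
    std::function<void()> release_;
};
```

The release action could close a file, unlock a mutex or return a socket to a pool. The point is that the release is tied to scope rather than to a collector's schedule, which is the part a memory-only GC leaves to the programmer.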


Additionally, switching away from C++ is also not an option because of 
other reasons. For example cross-platform compatibility. I don't know 
any language other than C/C++ which would actually work on all 
platforms we (my team at work) currently develop for. Not even D 
(mostly because of missing ports of druntime / phobos. Maybe even a 
missing hardware architecture.)


But I fully agree that if you do some non-performance-critical business 
logic or application logic it's a lot more productive to use a garbage 
collected language. Unfortunately C# and Java are doing a far better job 
than D here, mostly because of better tooling and more mature libraries.


Kind Regards
Benjamin Thaut




Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-08 Thread Paulo Pinto

Am 08.01.2014 20:15, schrieb H. S. Teoh:

On Wed, Jan 08, 2014 at 11:35:19AM +, Atila Neves wrote:

http://atilanevesoncode.wordpress.com/2014/01/08/adding-java-and-c-to-the-mqtt-benchmarks-or-how-i-learned-to-stop-worrying-and-love-the-garbage-collector/




[snip]

Thanks very much for sharing your experience.

As I shared a few times here, it was Oberon which opened my eyes
to GC enabled systems programming languages, around 1996, maybe.

After that I was curious to learn about the other descendants of
Oberon and Modula-3. Sadly none of them got an uptake outside ETHZ
and Olivetti, except maybe for Modula-3's influence on C#.

While researching for my Oberon article, I have discovered the Cedar
programming language, developed at Xerox PARC as part of their Mesa
system.

A strongly typed systems programming language with GC, as well as manual
memory management, modules and functional programming features done in
1981.

My initial thought was: what would today's systems look like if Xerox had
better connections to the outside world instead of AT&T.

--
Paulo




Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-08 Thread H. S. Teoh
On Wed, Jan 08, 2014 at 09:23:48PM +0100, Benjamin Thaut wrote:
 Am 08.01.2014 20:15, schrieb H. S. Teoh:
 Manual memory management is a LOT of effort, and to be quite honest,
 unless you're writing an AAA 3D game engine, you don't *need* that
 last 5% performance improvement that manual memory management *might*
 give you. That is, if you get it right. Which most C/C++ coders
 don't.
 
 
 The problem is that with the current D-GC it's not 5%. It's 300%.
 See: http://3d.benjamin-thaut.de/?p=20

Well, your experience was based on writing a 3D game engine. :) I didn't
claim that GCs are best for that scenario. How many of us write 3D game
engines for a living?


 And people who are currently using C++ use C++ for a reason. And
 usually this reason is performance. As long as D remains with its
 current GC people will refuse to switch, given the 300% speed
 impact.

I think your view is skewed by your bad experience with doing 3D in D.
I've ported (well, more like re-written) compute-intensive code from
C/C++ to D before, and my experience has been that the D version is
either on par, or performs even better. Definitely nowhere near the 300%
slowdown you quote. (Not to mention the 50% reduction in development
time compared with writing it in C/C++!) Like I said, if you're doing
something that *needs* to squeeze every last bit of performance out of
the machine, then the GC may not be for you.

In fact, from what I hear, most people doing 3D engine work don't even
*use* memory allocation in the core engine -- everything is preallocated
so no allocation / free (not even malloc/free) is done at all. You never
know if a particular system's malloc/free relies on linear free lists,
which may cause O(n) worst-case performance -- something you definitely
want to avoid if you have only 20ms to render the next frame. If so,
then it's no wonder you see a 300% slowdown if you start using the GC
inside of the 3D engine.
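
A minimal sketch of what "everything is preallocated" can mean in practice, assuming a fixed upper bound on live objects. The Particle and ParticlePool names are hypothetical and not taken from any engine: all storage is reserved once, and acquiring or releasing a slot is a constant-time push or pop on a list of free indices, so no malloc/free or GC allocation happens per frame.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Particle { float x, y, z, life; };

class ParticlePool {
public:
    // All memory is reserved up front; nothing below allocates again.
    explicit ParticlePool(std::size_t capacity) : slots_(capacity) {
        free_.reserve(capacity);
        for (std::size_t i = capacity; i > 0; --i) free_.push_back(i - 1);
    }

    // O(1): hand out a free slot, or nullptr if the pool is exhausted.
    Particle* acquire() {
        if (free_.empty()) return nullptr;
        std::size_t index = free_.back();
        free_.pop_back();
        return &slots_[index];
    }

    // O(1): give a slot back. The pointer must come from this pool
    // and must not be released twice.
    void release(Particle* p) {
        assert(p >= slots_.data() && p < slots_.data() + slots_.size());
        free_.push_back(static_cast<std::size_t>(p - slots_.data()));
    }

    std::size_t available() const { return free_.size(); }

private:
    std::vector<Particle> slots_;    // fixed storage for every particle
    std::vector<std::size_t> free_;  // indices of currently unused slots
};
```

The trade-off is a hard capacity limit: when the pool is empty the caller has to drop the request or recycle an old object, instead of paying for an allocation with unknown latency.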


 Additionally, programming with a GC often leads to a lot more
 allocations, and to programmers being unaware of all those allocations
 and the possibility that those allocations slow down the program and
 might even thrash the cache. Programmers who properly learned manual
 memory management are often more aware of what's happening in the
 background and how to optimize algorithms for memory usage, which can
 lead to astonishing performance improvements on modern hardware.

But the same programmers who don't know how to allocate properly on a
GC'd language will also write poorly-performing malloc/free code.
Freeing the root of a large tree structure can potentially run with no
fixed upper bound on time if the dtor recursively frees all child nodes,
so it's not that much better than a GC collection cycle. People who know
to avoid doing that will also know to write GC'd code in a way that
doesn't cause bad GC performance.
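
To illustrate the tree case, here is a small C++ sketch (Node, build and destroy are made-up names). Freeing a tree costs time proportional to the number of nodes whichever way it is written. The version below uses an explicit work list instead of recursive destructors, which avoids deep recursion but does not make the pause any shorter.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

struct Node {
    std::vector<std::unique_ptr<Node>> children;
};

// Build a complete tree: `depth` levels below the root, `fanout` children each.
std::unique_ptr<Node> build(int fanout, int depth) {
    assert(fanout >= 0 && depth >= 0);
    std::unique_ptr<Node> node(new Node());
    if (depth > 0)
        for (int i = 0; i < fanout; ++i)
            node->children.push_back(build(fanout, depth - 1));
    return node;
}

// Free the whole tree iteratively and report how many nodes were freed.
std::size_t destroy(std::unique_ptr<Node> root) {
    std::size_t freed = 0;
    std::vector<std::unique_ptr<Node>> pending;
    if (root) pending.push_back(std::move(root));
    while (!pending.empty()) {
        std::unique_ptr<Node> node = std::move(pending.back());
        pending.pop_back();
        // Detach the children first so destroying `node` frees one node only.
        for (auto& child : node->children)
            if (child) pending.push_back(std::move(child));
        ++freed;
    }  // each `node` is deleted at the end of its loop iteration
    return freed;
}
```

A caller that cannot afford the whole pause at once could process only a bounded number of pending nodes per frame. That is a design choice the sketch leaves out.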


 Also a GC is for automatic memory management. But memory is just a
 resource. And there are a lot of other resources than just memory.
 Having a GC does not free you from doing other manual resource
 management, which still can be annoying and can create the exact
 same issues as with manual memory management. Having a large C#
 codebase where almost everything implements the IDisposable
 interface doesn't really improve the situation. It would be a lot
 better if GCs would focus on automatic resource management in
 general, so the user is freed of all such tedious tasks, and not
 just a portion of them.

True, but having a GC for memory is still better than having nothing at
all. Memory, after all, is the most commonly used resource, generically
speaking.


 Additionally, switching away from C++ is also not an option because of
 other reasons. For example cross-platform compatibility. I don't
 know any language other than C/C++ which would actually work on all
 platforms we (my team at work) currently develop for. Not even D
 (mostly because of missing ports of druntime / phobos. Maybe even a
 missing hardware architecture.)

That doesn't alleviate the painfulness of coding in C++.


 But I fully agree that if you do some non-performance-critical
 business logic or application logic it's a lot more productive to use a
 garbage collected language.

If you're doing performance-critical / realtime stuff, you probably want
to be very careful about how you use malloc/free anyway, same goes for
GC's.


 Unfortunately C# and Java are doing a far better job than D here,
 mostly because of better tooling and more mature libraries.
[...]

I find the lack of strong metaprogramming capabilities in Java (never
tried C# before) a show-stopper for me. You have to resort to either
lots of duplicated code, or adding too many indirections that hurt
performance.  For compute-intensive code, too many indirections can mean
the difference between something finishing in 2 days instead of 2 hours.


T

-- 
Computers are like a jungle: they have monitor lizards, rams, mice,
c-moss, binary trees... and bugs.


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-08 Thread Benjamin Thaut

Am 08.01.2014 21:57, schrieb H. S. Teoh:

On Wed, Jan 08, 2014 at 09:23:48PM +0100, Benjamin Thaut wrote:

Well, your experience was based on writing a 3D game engine. :) I didn't
claim that GCs are best for that scenario. How many of us write 3D game
engines for a living?


No, this experience is not based only on this. I observed multiple 
discussions on the newsgroup where turning off the GC would speed up 
the program by a factor of 3. The most recent one was parsing a text file and 
filling an associative array with the contents of that text file, which 
is not really 3D programming. What I'm really trying to say is: I would 
be willing to use a GC in D too, but only if D actually has a state of 
the art GC and not some primitive old "does work without language support" GC.




In fact, from what I hear, most people doing 3D engine work don't even
*use* memory allocation in the core engine -- everything is preallocated
so no allocation / free (not even malloc/free) is done at all. You never
know if a particular system's malloc/free relies on linear free lists,
which may cause O(n) worst-case performance -- something you definitely
want to avoid if you have only 20ms to render the next frame. If so,
then it's no wonder you see a 300% slowdown if you start using the GC
inside of the 3D engine.


That is a common misconception you can read very often on the internet. 
That doesn't make it true however. I saw lots of game and engine code in 
my life already, and it's far from preallocating everything. People try 
to keep allocations to a minimum, but they are not avoided at all costs. 
If it's necessary they are just done (for example when spawning a new 
object, like a particle effect). It is even common to use scripting 
languages like Lua for some tasks in game development, and Lua allocates 
quite a lot during execution.




But the same programmers who don't know how to allocate properly on a
GC'd language will also write poorly-performing malloc/free code.
Freeing the root of a large tree structure can potentially run with no
fixed upper bound on time if the dtor recursively frees all child nodes,
so it's not that much better than a GC collection cycle. People who know
to avoid doing that will also know to write GC'd code in a way that
doesn't cause bad GC performance.


That is another common argument of pro-GC people that I have never seen in 
practice yet. Meaning, I have never seen a case where freeing a tree of 
objects would cause a significant enough slowdown. I however saw lots of 
cases where a garbage collection caused a significant slowdown.





True, but having a GC for memory is still better than having nothing at
all. Memory, after all, is the most commonly used resource, generically
speaking.



Still it only solves half the problem.




Additionally, switching away from C++ is also not an option because of
other reasons. For example cross-platform compatibility. I don't
know any language other than C/C++ which would actually work on all
platforms we (my team at work) currently develop for. Not even D
(mostly because of missing ports of druntime / phobos. Maybe even a
missing hardware architecture.)


That doesn't alleviate the painfulness of coding in C++.


It was never intended to. I just wanted to make the point, that even if 
you want, you can't avoid C++.






But I fully agree that if you do some non-performance-critical
business logic or application logic it's a lot more productive to use a
garbage collected language.


If you're doing performance-critical / realtime stuff, you probably want
to be very careful about how you use malloc/free anyway, same goes for
GC's.


This statement again has been posted hundreds of times in the GC vs 
manual memory management discussion. And again I never saw the 
execution time of malloc or other self-written allocators being a problem 
in practice. I did however see the runtime of a GC allocation become 
a problem, to the point where it is avoided entirely. With realtime I 
didn't really mean the hard realtime requirements of embedded systems 
and the like, but more like soft realtime requirements where you want to avoid 
pause times as much as possible.





I find the lack of strong metaprogramming capabilities in Java (never
tried C# before) a show-stopper for me. You have to resort to either
lots of duplicated code, or adding too many indirections that hurts
performance.  For compute-intensive code, too many indirections can mean
the difference between something finishing in 2 days instead of 2 hours.



I fully agree here. Still when choosing a programming language you also 
have to pick one that all programmers on the team can and want to use. I 
fear that the D metaprogramming capabilities will scare off quite a few 
programmers because it seems too complicated to them. (It's really the 
same with C++ metaprogramming. It's syntactically ugly and verbose, but 
gets the job done, and is not so complicated if you are familiar with 
the most important concepts).

Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-08 Thread Joseph Rushton Wakeling

On 08/01/14 23:23, Benjamin Thaut wrote:

No, this experience is not only based on this. I observed multiple discussions
on the newsgroup, where turning off the GC would speed up the program by a factor
of 3.


In my experience it seems to depend very much on the particular problem being 
solved and the circumstances in which memory is being allocated.  Example: I 
have some code where, at least in the source, dynamic arrays are being created 
via new in a (fairly) inner loop, and this can be run repeatedly apparently 
without the GC being triggered -- in fact, my suspicion is that the allocated 
space is just being repeatedly re-used and overwritten, so there are no new 
allocs or frees.


OTOH some other code I wrote recently had a situation where, as the data 
structure in question expanded, a new array was allocated, and an old one copied 
and then deallocated.  This was fine up to a certain scale but above a certain 
size the GC would (often but not always) kick in, leading to a significant (but 
unpredictable) slowdown.


My impression was that below a certain level the GC is happy to either 
over-allocate (leaving lots of space for expansion) and/or avoid freeing memory 
(because there's plenty of memory still free), which avoids all the slowdown of 
alloc/free until there's a significant need for it.
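
The "allocate a new array, copy the old one, free the old one" pattern described above can be put into numbers with a small C++ sketch. It is only a counting model: it allocates nothing and does not model D's GC or its array-append behaviour. The names copies_while_appending, grow_by_one and grow_by_doubling are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Counts element copies made while appending `n` items one at a time to an
// array that reallocates (allocate new, copy old, free old) whenever it is
// full. `grow` maps the current capacity to the next capacity.
template <typename Grow>
std::size_t copies_while_appending(std::size_t n, Grow grow) {
    std::size_t capacity = 0, size = 0, copies = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (size == capacity) {
            capacity = grow(capacity);
            assert(capacity > size);  // a growth policy must make room
            copies += size;           // every existing element is copied over
        }
        ++size;
    }
    return copies;
}

// Grow just enough for one more element: reallocates on every append.
inline std::size_t grow_by_one(std::size_t cap) { return cap + 1; }

// Over-allocate geometrically: reallocations become rare as the array grows.
inline std::size_t grow_by_doubling(std::size_t cap) { return cap ? cap * 2 : 1; }
```

For 1024 appends, growing by one element copies 523776 elements in total, while doubling copies 1023. Over-allocation trades unused space for far fewer alloc/copy/free cycles, which would be consistent with the impression described above.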


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-08 Thread H. S. Teoh
On Wed, Jan 08, 2014 at 11:43:26PM +0100, Joseph Rushton Wakeling wrote:
 On 08/01/14 23:23, Benjamin Thaut wrote:
 No, this experience is not only based on this. I observed multiple
 discussions on the newsgroup, where turning off the GC would speed up
 the program by a factor of 3.
 
 In my experience it seems to depend very much on the particular
 problem being solved and the circumstances in which memory is being
 allocated.  Example: I have some code where, at least in the source,
 dynamic arrays are being created via new in a (fairly) inner loop,
 and this can be run repeatedly apparently without the GC being
 triggered -- in fact, my suspicion is that the allocated space is
 just being repeatedly re-used and overwritten, so there are no new
 allocs or frees.
 
 OTOH some other code I wrote recently had a situation where, as the
 data structure in question expanded, a new array was allocated, and
 an old one copied and then deallocated.  This was fine up to a
 certain scale but above a certain size the GC would (often but not
 always) kick in, leading to a significant (but unpredictable)
 slowdown.
 
 My impression was that below a certain level the GC is happy to
 either over-allocate (leaving lots of space for expansion) and/or
 avoid freeing memory (because there's plenty of memory still free),
 which avoids all the slowdown of alloc/free until there's a
 significant need for it.

So this suggests that the real situation with GC vs manual memory
management isn't as simple as a binary "GC is better" or "GC is bad". It
depends a lot on the exact use case.

And now that you mention it, there does seem to be some kind of
threshold where something happens (I wasn't sure what it was before, but
now I'm thinking maybe it's a change in GC behaviour) where there's a
sudden change in program performance, that I've observed recently in one
of my programs. I might have a look into it sometime -- though I was
planning to redo that part of the code anyway, so I may or may not find
out the real reason behind this.


T

-- 
The state pretends to pay us a salary, and we pretend to work.


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-08 Thread H. S. Teoh
On Wed, Jan 08, 2014 at 11:23:50PM +0100, Benjamin Thaut wrote:
 On 08.01.2014 21:57, H. S. Teoh wrote:
[...]
 I find the lack of strong metaprogramming capabilities in Java (never
 tried C# before) a show-stopper for me. You have to resort to either
 lots of duplicated code, or adding too many indirections that hurts
 performance.  For compute-intensive code, too many indirections can
 mean the difference between something finishing in 2 days instead of
 2 hours.
 
 
 I fully agree here. Still when choosing a programming language you
 also have to pick one that all programmers on the team can and want to
 use. I fear that the D metaprogramming capabilities will scare off
 quite a few programmers because it seems too complicated to them.  (It's
 really the same with C++ metaprogramming. It's syntactically ugly and
 verbose, but gets the job done, and is not so complicated if you are
 familiar with the most important concepts).

Coming from a C++ background, I have to say that C++ metaprogramming,
while possible, is only so in the most painful possible ways. My
impression is that C++ gave template metaprogramming a bad name, because
much of the metaprogramming aspects of templates were only discovered
after the fact, so the original design was never intended to be used in
the way it's used nowadays. As a result, people associate the design
flaws in C++ templates with template programming and metaprogramming in
general, whereas such flaws aren't an inherent feature of
metaprogramming itself.

Unfortunately, this makes people go ewww when they hear about D's
metaprogramming, whereas the real situation is that metaprogramming is
actually a pleasant experience in D, and very powerful if you know how
to take advantage of it.

One thing I really liked about TDPL is that Andrei sneakily introduces
metaprogramming as compile-time parameters early on, so that by the
time you get to the actual chapter on templates, you've already been
using them comfortably for a long time, and no longer have an irrational
fear of them.


T

-- 
Without geometry, life would be pointless. -- VS


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-08 Thread NoUseForAName

On Wednesday, 8 January 2014 at 19:17:08 UTC, H. S. Teoh wrote:

On Wed, Jan 08, 2014 at 11:35:19AM +0000, Atila Neves wrote:

http://atilanevesoncode.wordpress.com/2014/01/08/adding-java-and-c-to-the-mqtt-benchmarks-or-how-i-learned-to-stop-worrying-and-love-the-garbage-collector/


Manual memory management is a LOT of effort


Not in my experience. It only gets ugly if you attempt to write 
Ruby/Java in C/C++. In C/C++ you do not wildly create short-lived 
objects all over the place. In embedded C there is often no 
object allocation at all after initialization. I have written C 
and C++ code for 15 years and the only real issue was memory 
safety but you do not need a GC to solve that problem.


unless you're writing an AAA 3D game engine, you don't *need* that last
5% performance improvement that manual memory management *might* give
you.


The performance issues of GC are not measured in percentages but 
in pause times. Those become problematic when - for example - 
your software must achieve a frame rate of at least 60 frames per 
second - every second. In future this will get worse because it 
seems the trend goes towards 120 Hz screens which require a frame 
rate of at least 120 frames per second for the best experience. 
Try squeezing D's stop-the-world GC pause times in there.
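
To make the frame-rate argument concrete, here is a small C++ sketch of the arithmetic. The function names frame_budget_ms and fits_in_frame are hypothetical, and any pause or work times you plug in are illustrative inputs, not measurements of D's GC.

```cpp
#include <cassert>

// Time available per frame, in milliseconds, at a given frame rate.
inline double frame_budget_ms(double frames_per_second) {
    assert(frames_per_second > 0.0);
    return 1000.0 / frames_per_second;
}

// True if a stop-the-world pause plus the frame's own work still fits
// inside a single frame at the given frame rate.
inline bool fits_in_frame(double pause_ms, double work_ms,
                          double frames_per_second) {
    return pause_ms + work_ms <= frame_budget_ms(frames_per_second);
}
```

At 60 frames per second the budget is roughly 16.7 ms per frame, and at 120 frames per second it halves to roughly 8.3 ms. With a purely hypothetical 10 ms of per-frame work, a 5 ms pause fits at 60 Hz but not at 120 Hz.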


The D solution is to avoid the GC and fall back to C-style code. 
That is why Rust creates so much more excitement among C/C++ 
programmers. You get high-level code, memory safety AND no pause 
times.


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-08 Thread Ola Fosheim Grøstad

On Wednesday, 8 January 2014 at 23:08:43 UTC, NoUseForAName wrote:
That is why Rust creates so much more excitement among C/C++ 
programmers. You get high-level code, memory safety AND no 
pause times.


let mut x = 4.

Whyyy would anyone want to create such a syntax? I really want to 
like Rust, but I... just...




Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-08 Thread NoUseForAName
On Wednesday, 8 January 2014 at 23:27:39 UTC, Ola Fosheim Grøstad 
wrote:

let mut x = 4.

Whyyy would anyone want to create such a syntax? I really want 
to like Rust, but I... just...


Looks pretty boring/conventional to me. If you know many 
programming languages you immediately recognize let as a common 
keyword for assignment. That keyword is older than me and I am 
old (by Silicon Valley standards).


That leaves only the funny sounding mut as slightly unusual. It 
is the result of making immutable the default which I think is a 
good decision.


It is horribly abbreviated but the vast majority of programmers 
who know what a cache miss is seem to prefer such abbreviations 
(I am not part of that majority, though). I mean C gave us 
classics like atoi.. still reminds me of ahoi every time I 
read it. And I will never get over C++'s cout and cin. See? 
Rust makes C/C++ damaged people feel right at home even there ;P


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-08 Thread Ola Fosheim Grøstad

On Wednesday, 8 January 2014 at 23:43:43 UTC, NoUseForAName wrote:
Looks pretty boring/conventional to me. If you know many 
programming languages you immediately recognize let as a 
common keyword for assignment.


Yes, but I cannot think of a single one of them that I would like 
to use! ;-)


That leaves only the funny sounding mut as slightly unusual. 
It is the result of making immutable the default which I think 
is a good decision.


Agree on the last point, immutable should be the default. Although 
I think they should have skipped both let and mut and used a 
different symbol for initial-assignment instead.


(I am not part of that majority, though). I mean C gave us 
classics like atoi.. still reminds me of ahoi every time I 
read it. And I will never get over C++'s cout and cin. See?


I don't mind cout, I hardly use cin, I try to avoid cerr, and 
I've never used clog… I mind how you configure iostreams though. 
It looks worse than printf, not sure how they managed that.



Rust makes C/C++ damaged people feel right at home even there ;P


Well, I associate let with the functional-toy-languages we 
created/used at the university in the 90s so I kind of have a 
problem taking Rust seriously. And the name? RUST? Decaying 
metal. Why? It gives me the eerie feeling that the designers are 
either brilliant, mad or both, or that it is a practical joke. 
I'm sure the compiler randomly tells you "April Fools!" Or 
something.


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-08 Thread Atila Neves
Thanks. Not many votes though given all the downvotes. The 
comments manage to be even worse than on my first blog post.


For some reason they all assume I don't know C++ even though I 
know it way better than D, not to mention that they nearly all 
miss the point altogether. Sigh.


On Wednesday, 8 January 2014 at 18:59:45 UTC, Paulo Pinto wrote:

On 08.01.2014 19:31, Atila Neves wrote:
I don't know if I have enough rep for it, I'd appreciate it if someone
who does posts it there.

On Wednesday, 8 January 2014 at 18:24:00 UTC, bearophile wrote:

Atila Neves:


http://atilanevesoncode.wordpress.com/2014/01/08/adding-java-and-c-to-the-mqtt-benchmarks-or-how-i-learned-to-stop-worrying-and-love-the-garbage-collector/



Going to Reddit?

Bye,
bearophile




Done

http://www.reddit.com/r/programming/comments/1uqabe/adding_java_and_c_to_the_mqtt_benchmarks_or_how_i/

http://www.reddit.com/r/d_language/comments/1uqa4d/adding_java_and_c_to_the_mqtt_benchmarks_or_how_i/

--
Paulo




Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-08 Thread Atila Neves
No, this experience is not only based on this. I observed 
multiple discussions on the newsgroup, where turning off the GC 
would speed up the program by a factor of 3. The most recent one was


The GC doesn't even show up in the profiler for this/my use case. 
The one optimisation I did to avoid allocations increased 
performance by all of 5%. It really depends on the use case, and 
I don't think assuming a factor of 3 is advisable.


That is another common argument of pro-GC people that I have never 
seen in practice yet. Meaning, I have never seen a case where freeing 
a tree of objects would cause a significant enough slowdown. I 
have, however, seen lots of cases where a garbage collection caused a 
significant slowdown.


Well, if I wasn't aware of allocation I wouldn't have done the 
optimisation mentioned above, so it's a good point.


As far as slowdown happening with manual memory management, in 
certain cases cleaning up reference counted smart pointers can 
cause as much of a slowdown as a GC kicking in. This isn't my 
opinion though, there are data to that effect. Again, it depends 
on the use case.
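
A minimal C++ sketch of the reference-counting case: dropping the last reference to the head of a chain destroys every node in that one call, while dropping any earlier reference does nothing. The names Link, make_chain and g_live_links are hypothetical instrumentation. The sketch shows where the work happens and does not measure how long it takes.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical instrumentation: number of Link objects currently alive.
std::size_t g_live_links = 0;

struct Link {
    std::shared_ptr<Link> next;
    Link() { ++g_live_links; }
    ~Link() { --g_live_links; }
};

// Builds a singly linked chain of `length` reference-counted nodes and
// returns a handle to its head.
std::shared_ptr<Link> make_chain(std::size_t length) {
    assert(length > 0);
    std::shared_ptr<Link> head;
    for (std::size_t i = 0; i < length; ++i) {
        auto node = std::make_shared<Link>();
        node->next = head;  // new node points at the old head
        head = node;
    }
    return head;
}
```

If two handles refer to the head, resetting the first frees nothing. Resetting the second brings the count to zero and destroys the whole chain recursively, so very long chains can also run deep on the call stack.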



Still it only solves half the problem.


Maybe in Java. In D at least we have struct destructors for other 
resources.


It was never intended to. I just wanted to make the point that, 
even if you want to, you can't avoid C++.


A fair point. I think what we're saying is not that we won't ever 
write C++ again, but that we won't write it again if given the 
choice and if another language (not necessarily D) is also a good 
fit.


I'd be surprised if I wasn't still writing / refactoring / 
debugging C++ code a few decades from now. I don't want to write C 
again ever, but I know I'll have to.


I fully agree here. Still when choosing a programming language 
you also have to pick one that all programmers on the team can 
and want to use. I fear that the D metaprogramming capabilities 
will scare off quite a few programmers because it seems too 
complicated to them. (It's really the same with C++ 
metaprogramming. It's syntactically ugly and verbose, but gets 
the job done, and is not so complicated if you are familiar 
with the most important concepts).


I disagree wholeheartedly. It's a _lot_ more complicated in C++. 
D can also do more than C++, with far saner syntax.




Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-08 Thread H. S. Teoh
On Wed, Jan 08, 2014 at 11:59:58PM +0000, digitalmars-d-boun...@puremagic.com 
wrote:
 On Wednesday, 8 January 2014 at 23:43:43 UTC, NoUseForAName wrote:
[...]
 (I am not part of that majority, though). I mean C gave us
 classics like atoi.. still reminds me of ahoi every time I
 read it. And I will never get over C++'s cout and cin. See?

The absolute worst offender from the C days was creat(). I mean,
seriously?? I'm actually a fan of abbreviated names myself, but that one
simply takes it to a whole 'nother level of wrong.


 I don't mind cout, I hardly use cin, I try to avoid cerr, and I've
 never used clog… I mind how you configure iostreams though. It looks
 worse than printf, not sure how they managed that.
[...]

I hate iostream with a passion. The syntax is only the tip of the
proverbial iceberg. Manipulators that change the global state of the
output stream, pathologically verbose ways of controlling output format
(cout << setprecision(5) << num; -- really?!) that *also* modifies
global state, crazy choice of output operator with counter-intuitive
operator precedence (cout << a&b doesn't do what you think it does), ...
I have trouble finding what's there to like about iostream.

Even when I was still writing C++ a few years ago, I avoided iostream
like the plague. For all of its flaws, C's stdio is still far better
than iostream in terms of everyday usability. At least for me. YMMV.
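
The point about manipulators changing the state of the stream can be seen in a short C++ sketch. The helper names with_iostream and with_stdio are hypothetical. std::setprecision and std::snprintf are the standard library calls.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <iomanip>
#include <sstream>
#include <string>

// Applies setprecision only before `first`; the setting stays on the
// stream, so `second` is formatted with it as well.
std::string with_iostream(double first, double second) {
    std::ostringstream out;
    out << std::setprecision(3) << first << ' ';
    out << second;  // no manipulator here, but precision is still 3
    return out.str();
}

// With stdio the precision belongs to one conversion in the format string
// and does not carry over to the next one.
std::string with_stdio(double first, double second) {
    char buf[64];
    const int written = std::snprintf(buf, sizeof buf, "%.3g %g", first, second);
    assert(written > 0 && static_cast<std::size_t>(written) < sizeof buf);
    return std::string(buf);
}
```

with_iostream(3.14159, 2.71828) yields "3.14 2.72", while with_stdio(3.14159, 2.71828) yields "3.14 2.71828".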


T

-- 
Marketing: the art of convincing people to pay for what they didn't need before 
which you can't deliver after.


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-08 Thread Ola Fosheim Grøstad

On Thursday, 9 January 2014 at 00:52:04 UTC, H. S. Teoh wrote:

The absolute worst offender from the C days was creat().


That's unfair, that's unix, not C!

http://linux.die.net/man/3/explain_creat_or_die


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-08 Thread H. S. Teoh
On Thu, Jan 09, 2014 at 01:06:01AM +0000, digitalmars-d-boun...@puremagic.com 
wrote:
 On Thursday, 9 January 2014 at 00:52:04 UTC, H. S. Teoh wrote:
 The absolute worst offender from the C days was creat().
 
 That's unfair, that's unix, not C!
 
 http://linux.die.net/man/3/explain_creat_or_die

That's why I said "from the C days", not "in C". :) Remember that C was
created... um, creat-ed... in order to write Unix.


T

-- 
Gone Chopin. Bach in a minuet.


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-08 Thread Ola Fosheim Grøstad

On Thursday, 9 January 2014 at 01:26:27 UTC, H. S. Teoh wrote:
That's why I said "from the C days", not "in C". :) Remember that C was
created... um, creat-ed... in order to write Unix.


Yes, but you have to take into consideration that there are over 
twice as many anagrams for creat than for create, so creat 
is clearly more versatile. There are no anagrams for unix.


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-08 Thread Walter Bright

On 1/8/2014 12:23 PM, Benjamin Thaut wrote:

Additionally, programming with a GC often leads to a lot more allocations,


I believe that this is incorrect. Using GC leads to fewer allocations, because 
you do not have to make extra copies just so it's clear who owns the allocations.


For example, if you've got an array of char* pointers, in D some can be GC 
allocated, some can be malloc'd, some can be slices, some can be pointers to 
string literals. In C/C++, the array has to decide on an ownership policy, and 
all elements must conform.


This means extra copies.
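
A C++ sketch of what a single ownership policy costs. The class OwningStrings is hypothetical and only illustrates the argument: because the container frees every element, every element must be malloc'd, so even a string literal has to be duplicated on the way in.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <vector>

// Hypothetical container with one ownership policy: it owns every string
// and releases each with free(), so each element must come from malloc().
class OwningStrings {
public:
    OwningStrings() = default;
    OwningStrings(const OwningStrings&) = delete;
    OwningStrings& operator=(const OwningStrings&) = delete;
    ~OwningStrings() {
        for (char* s : items_) std::free(s);
    }

    // The caller's string may be a literal, a slice of a larger buffer, or
    // someone else's allocation; to conform to the policy it is copied.
    void add(const char* s) {
        assert(s != nullptr);
        const std::size_t n = std::strlen(s) + 1;
        char* copy = static_cast<char*>(std::malloc(n));
        if (copy == nullptr) std::abort();
        std::memcpy(copy, s, n);
        items_.push_back(copy);
        ++copies_made_;
    }

    std::size_t copies_made() const { return copies_made_; }
    const char* at(std::size_t i) const { return items_.at(i); }

private:
    std::vector<char*> items_;
    std::size_t copies_made_ = 0;
};
```

Adding a literal such as "hello" stores a fresh heap copy rather than the literal's own address, which is the kind of extra copy the post refers to.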


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-08 Thread Brad Anderson

On Thursday, 9 January 2014 at 01:06:03 UTC, Ola Fosheim Grøstad
wrote:

On Thursday, 9 January 2014 at 00:52:04 UTC, H. S. Teoh wrote:

The absolute worst offender from the C days was creat().


That's unfair, that's unix, not C!

http://linux.die.net/man/3/explain_creat_or_die


But that just means the same people are responsible.


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-08 Thread Manu
On 9 January 2014 13:08, Walter Bright newshou...@digitalmars.com wrote:

 On 1/8/2014 12:23 PM, Benjamin Thaut wrote:

 Additionally, programming with a GC often leads to a lot more allocations,


 I believe that this is incorrect. Using GC leads to fewer allocations,
 because you do not have to make extra copies just so it's clear who owns
 the allocations.


You're making a keen assumption here that C programmers use STL. And no
sane programmer that I've ever worked with uses STL precisely for this
reason :P
Sadly, being conscious of eliminating unnecessary copies in C/C++ takes a
lot of work (see: time and money), so there is definitely value in
factoring that problem away, but the existing GC is broken. Until it
doesn't leak, stop the world, and/or can run incrementally, it remains no
good for realtime usage.
There were 2 presentations on improved GC's last year, why do we still have
the lamest GC imaginable? I'm still yet to hear any proposal on how this
situation will ever significantly improve...

*cough* ARC...

For example, if you've got an array of char* pointers, in D some can be GC
 allocated, some can be malloc'd, some can be slices, some can be pointers
 to string literals. In C/C++, the array has to decide on an ownership
 policy, and all elements must conform.

 This means extra copies.



Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-08 Thread Paulo Pinto

On Thursday, 9 January 2014 at 00:52:04 UTC, H. S. Teoh wrote:
On Wed, Jan 08, 2014 at 11:59:58PM +, 
digitalmars-d-boun...@puremagic.com wrote:
On Wednesday, 8 January 2014 at 23:43:43 UTC, NoUseForAName 
wrote:

[...]

(I am not part of that majority, though). I mean C gave us
classics like atoi.. still reminds me of ahoi every time I
read it. And I will never get over C++'s cout and cin. 
See?


The absolute worst offender from the C days was creat(). I mean,
seriously?? I'm actually a fan of abbreviated names myself, but that one
simply takes it to a whole 'nother level of wrong.


I don't mind cout, I hardly use cin, I try to avoid cerr, and I've
never used clog… I mind how you configure iostreams though. It looks
worse than printf, not sure how they managed that.

[...]

I hate iostream with a passion.


I am on the other side of the fence, enjoying iostream since 
1994. :)


--
Paulo


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-08 Thread Paulo Pinto
On Wednesday, 8 January 2014 at 23:59:59 UTC, Ola Fosheim Grøstad 
wrote:
On Wednesday, 8 January 2014 at 23:43:43 UTC, NoUseForAName 
wrote:
Looks pretty boring/conventional to me. If you know many 
programming languages you immediately recognize let as a 
common keyword for assignment.


Yes, but I cannot think of a single one of them that I would 
like to use! ;-)


That leaves only the funny sounding mut as slightly unusual. 
It is the result of making immutable the default which I think 
is a good decision.


Agree on the last point, immutable should be the default. 
Although I think they should have skipped both let and mut 
and used a different symbol for initial-assignment instead.


(I am not part of that majority, though). I mean C gave us 
classics like atoi.. still reminds me of ahoi every time I 
read it. And I will never get over C++'s cout and cin. See?


I don't mind cout, I hardly use cin, I try to avoid cerr, and 
I've never used clog… I mind how you configure iostreams 
though. It looks worse than printf, not sure how they managed 
that.


Rust makes C/C++ damaged people feel right at home even there 
;P


Well, I associate let with the functional-toy-languages we 
created/used at the university in the 90s so I kind of have a 
problem taking Rust seriously. And the name? RUST? Decaying 
metal. Why? It gives me the eerie feeling that the designers 
are either brilliant, mad or both, or that it is a practical 
joke. I'm sure the compiler randomly tells you "April Fools!" Or 
something.


You mean the toy languages that are slowly replacing C++ in the 
finance industry?


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-08 Thread Paulo Pinto

On Thursday, 9 January 2014 at 06:11:58 UTC, Manu wrote:
On 9 January 2014 13:08, Walter Bright 
newshou...@digitalmars.com wrote:



On 1/8/2014 12:23 PM, Benjamin Thaut wrote:

Additionally, programming with a GC often leads to a lot more allocations,


I believe that this is incorrect. Using GC leads to fewer allocations,
because you do not have to make extra copies just so it's clear who owns
the allocations.



You're making a keen assumption here that C programmers use STL. And no
sane programmer that I've ever worked with uses STL precisely for this
reason :P
Sadly, being conscious of eliminating unnecessary copies in C/C++ takes a
lot of work (see: time and money), so there is definitely value in
factoring that problem away, but the existing GC is broken. Until it
doesn't leak, stop the world, and/or can run incrementally, it remains no
good for realtime usage.
There were 2 presentations on improved GC's last year, why do we still have
the lamest GC imaginable? I'm still yet to hear any proposal on how this
situation will ever significantly improve...

*cough* ARC...



For it to be done properly, RC needs to be compiler assisted, 
otherwise it is just too slow.


--
Paulo


Re: Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

2014-01-08 Thread Walter Bright

On 1/8/2014 10:11 PM, Manu wrote:

On 9 January 2014 13:08, Walter Bright newshou...@digitalmars.com wrote:

On 1/8/2014 12:23 PM, Benjamin Thaut wrote:

Additionally, programming with a GC often leads to a lot more allocations,


I believe that this is incorrect. Using GC leads to fewer allocations,
because you do not have to make extra copies just so it's clear who owns the
allocations.


You're making a keen assumption here that C programmers use STL.


My observation has nothing to do with the STL, nor does it have anything to do 
with how well the GC is implemented. Also, neither smart pointers nor ARC 
resolve the excessive copying problem as I described it.


I've been coding in C for 15-20 years before the STL, and the problem of 
excessive copying is a significant source of slowdown for C code.


Consider this C code:

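#include <assert.h>
#include <stdlib.h>
#include <string.h>

char* cat(char* s1, char* s2) {
    size_t len1 = s1 ? strlen(s1) : 0;
    size_t len2 = s2 ? strlen(s2) : 0;
    char* s = (char*)malloc(len1 + len2 + 1);
    assert(s);
    /* memcpy must not be handed a NULL pointer, even for a length of 0 */
    if (len1) memcpy(s, s1, len1);
    if (len2) memcpy(s + len1, s2, len2);
    s[len1 + len2] = 0;
    return s;
}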

Now consider D code:

string cat(string s1, string s2) {
    return s1 ~ s2;
}

I can call cat with:

cat("hello", null);

and it works without copying in D, it just returns s1. In C, I gotta copy, 
ALWAYS.

(C's strings being 0 terminated also forces much extra copying, but that's 
another topic.)


The point is, no matter how slow the GC is relative to malloc, not allocating is 
faster than allocating, and a GC can greatly reduce the amount of alloc/copy 
going on.
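
The "return s1 without copying" idea can be sketched in C++17 with non-owning views. This is an illustration of the idea only, not a description of what the D runtime does for s1 ~ s2. The function cat_view and its caller-provided storage parameter are hypothetical, and the caller has to keep both the inputs and storage alive while using the result, which is the bookkeeping a GC would otherwise take over.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <utility>

// Returns a view of s1 followed by s2. If either side is empty the other
// side is returned as-is and nothing is allocated or copied. Otherwise the
// joined text is built in `storage` and a view of it is returned.
std::string_view cat_view(std::string_view s1, std::string_view s2,
                          std::string& storage) {
    if (s2.empty()) return s1;  // no copy: same characters, same address
    if (s1.empty()) return s2;  // no copy
    std::string joined;
    joined.reserve(s1.size() + s2.size());
    joined.append(s1);
    joined.append(s2);
    storage = std::move(joined);  // inputs were fully read before this
    assert(storage.size() == s1.size() + s2.size());
    return storage;
}
```

Calling cat_view(hello, {}, storage) with a view hello returns a view whose data() pointer equals hello.data() and leaves storage untouched. Only the case where both sides are non-empty pays for an allocation and a copy.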


The reason that Java does excessive amounts of allocation is because Java 
doesn't have value types, not because Java has a GC.