Re: DIP 50 - AST macros

2013-11-17 Thread SomeDude

On Thursday, 14 November 2013 at 22:06:06 UTC, deadalnix wrote:


mymacro(4+3); // <= can be "hello"
myfunction(4+3); // <= can be "hello as well"

Quite frankly, this isn't really convincing.



True but you will hardly use myfunction as an extension to the 
language.


If I understand the issues, I quite understand Walter's 
reluctance to add such a feature. Once you add it, you add the 
capability to change the language, or to create an entirely new 
language. If you let it, no matter how loud you cry, people 
*will* abuse the feature, and they will do it from day one. The 
C++ templates experiment shows what happens when you do that. We 
don't want to repeat the same mistakes.


Basically, people want to change the language, but without having 
to discuss their own extensions in the newsgroups. It's sometimes 
handy, but you'll end up with crappy features all over the place.


Re: DIP 50 - AST macros

2013-11-17 Thread SomeDude
On Thursday, 14 November 2013 at 19:42:36 UTC, Zsombor Barna 
wrote:


D's syntax remains the same ( statements, expressions, function 
calling, numbers etc. ). These AST manipulation tools just 
define new words or language constructs. Human languages tend 
to be altered as times passes and the needs change. That's why 
new words appear. Even the grammar is not the same as the one 
my grandparents used.


Which shows all the problems raised by Walter:
1. you need an entry in the dictionary that you need to look up 
for each new word that someone has introduced,
2. with natural language, you don't need to know its *exact* 
semantics when you use a new word, while in programming it is 
mandatory,
3. you *don't* want your programming language to arbitrarily 
change over time. That's probably the biggest issue.


Re: Worst-case performance of quickSort / getPivot

2013-11-17 Thread Craig Dillabaugh

On Sunday, 17 November 2013 at 02:43:05 UTC, Vladimir Panteleev
wrote:
On Sunday, 17 November 2013 at 01:48:14 UTC, Craig Dillabaugh 
wrote:

1. Find the median value in the array. This can be done
deterministically in linear time,


My understanding that for unordered data, there is no algorithm 
that runs in worst-case O(n):


http://en.wikipedia.org/wiki/Selection_algorithm
http://en.wikipedia.org/wiki/Quickselect


Unless I misunderstand something I am pretty sure there is a
deterministic algorithm for finding the median in unsorted data:

http://www.cs.cmu.edu/afs/cs.cmu.edu/academic/class/15451-s07/www/lecture_notes/lect0125.pdf





so from a theoretical point of
view is just as fast as picking a random element since you use
linear time.


Picking a random element is constant time, though. You mean 
that O(1) and O(n) are equivalent here because they're both 
smaller than O(n log n), QuickSort's complexity?


Yeah, since you have to do O(n) work to partition the array,
whether you do O(1) work, or O(n) work to find your pivot really
doesn't matter. From a theoretical perspective anyway :o)



Re: DIP49 - Define qualified postblit

2013-11-17 Thread Jonathan M Davis
On Tuesday, November 12, 2013 13:30:49 Kenji Hara wrote:
> 2013/11/11 Daniel Davidson 
> 
> > From this thread (http://forum.dlang.org/post/mailman.89.1383248384.9546.
> > digitalmars-d-le...@puremagic.com) I was under the impression that
> > const/immutable and postblits don't mix. This DIP seems to be trying to
> > address that. One of the potential workarounds to this issue was the idea
> > of struct copy constructors. This is what I was referring to. With this
> > proposal, is there still a need for struct copy constructors?
> 
> 1.5 years ago, I did asked to Andrei about the postbit issue.
> 
>  Cc=d...@mail.gmail.com>
> http://forum.dlang.org/thread/CAFDvkcvvL8GxHQB=Rw9pTm-uxOKzNGVQNDv9w5Os3SkQ
> Cc=d...@mail.gmail.com
> 
> 
> Andrei had thought that the issue will be fixed by adding "copy
> constructor" in D.
> However I believed that the postblit concept would be able to improved
> more. So I couldn't convince about his thought.
> 
> DIP49 is the final conclusion of my belief. I can say that copy constructor
> is unnecessary in D.

Yeah. You appear to have figued out how to make postblits work with const and 
immutable - though the rules seem to be a bit complicated - particularly for 
the "unique" postblit. All in all, I think that having copy constructors would 
be much simpler, so if we were starting from scratch, I think that I'd be in 
favor of copy constructors over postblit constructors. However, given that we 
already have postblit constructors, it would definitely be nicer if we could 
tweak them to make them work with const and immutable and avoid having to 
either replace postblits with copy constructors or having both.

In any case, great job!

- Jonathan M Davis


Re: DIP 50 - AST macros

2013-11-17 Thread Walter Bright

On 11/14/2013 11:38 PM, Jacob Carlborg wrote:

On 2013-11-14 21:54, Walter Bright wrote:


If it is powerful enough to do async/await but look like normal D
syntax, then is going to suffer from these faults.


I don't think so. The idea is to have it look consistent within the language.
Take a look at __traits. It looks just like a function call, but it's far from
being a function call. Classes and structs look very similar, only the keyword
is different, but they are very different.



Think about the desired feature mentioned earlier in this thread of being able 
to insert a "return;" statement, causing what looks like a function call to 
return from the caller.


That's a major step increase in obfuscation, not just a detail.


Re: DIP 50 - AST macros

2013-11-17 Thread Walter Bright

On 11/14/2013 1:34 PM, H. S. Teoh wrote:

What if macros adopted an overtly different syntax?


That would be better, but I don't think enough better.


Re: DIP 50 - AST macros

2013-11-17 Thread Walter Bright

On 11/14/2013 2:06 PM, deadalnix wrote:

You can't argue that string mixin is good while AST macro is bad.


True, but you also cannot argue that since D has string mixins it must also have 
AST macros.


What macros do is bless, encourage, and make inevitable a particularly abusive 
style of programming. D already has super powerful metaprogramming abilities 
that we are far from exploring the limits of. I don't think we should be adding 
on even more that experience has shown goes quickly out of control.


Re: D Language Citation

2013-11-17 Thread Russel Winder
On Sun, 2013-11-17 at 12:25 +0530, Sumit Adhikari wrote:
> Dear User Community,
> 
> This mail is in particular to the citation of D.
> 
> D is extremely poorly cited (Yes this comes from a R&D guy). I searched and
> searched (everywhere including IEEEXplore) and nothing comes in my hand!
> 
> There are materials available on internet which are not peer reviewed and
> hence I cannot use for Journal citation! It is like I have everything but I
> cannot cite!

Clearly URLs have to be considered ephemera as far as academic
publication is concerned. However there are three classes of ephemera:
1. non-publishing websites; 2. publishing-related websites; and 3.
publishing related archives.

As an academic (admittedly a long time ago now), I would refuse to use
or allow use of (as an editor of journal or conference proceedings) URLs
in Category 1. However categories 2 and 3 are more reliable and so
acceptable. Actually they are mandatory these days with the emerging
academic publishing models. So where does this "ban" on using online
material for citations come from? Are you self-censoring based on the
notion of peer reviewed paper published journal? 

> It would be great idea as a beneficiary of D to publish for the future of
> D. Please consider what I am saying :). Please publish.

Walter Bright has published a number of papers in Dr Dobb's Journal and
elsewhere, For example:

http://www.drdobbs.com/tools/implementing-pure-functions/230700070
http://www.drdobbs.com/cpp/increasing-compiler-speed-by-over-75/240158941

Others have written and published in other places. There were some
articles about D published in CVu, the journal of ACCU
(http://www.accu.org) for example.

And then there is:

Alexandrescu A, The D Programming Language, Addison-Wesley, 2010.

A fine publication, with certain publishing faults that lead to extreme
collector pressure on value :-)

The last point for now is that there is academic programming language
research, and there is real world programming language development. The
two are very different – believe me I have done both. So whilst Scala,
which came from academia, has academic publications, Ceylon, Kotlin,
Groovy, Ruby, Clojure, C++, D, Rust, JavaScript, Python, etc., etc. are
developed almost entirely in a commercial or FOSS setting and so have no
academic publications associated with them per se. D is not unique in
this position.

-- 
Russel.
=
Dr Russel Winder  t: +44 20 7585 2200   voip: sip:russel.win...@ekiga.net
41 Buckmaster Roadm: +44 7770 465 077   xmpp: rus...@winder.org.uk
London SW11 1EN, UK   w: www.russel.org.uk  skype: russel_winder



Re: Look and think good things about D

2013-11-17 Thread SomeDude

On Friday, 15 November 2013 at 01:09:53 UTC, bearophile wrote:
I have created two interesting D entries for this Rosettacode 
Task, is someone willing to create a Reddit entry for this? 
They show very different kinds of code in D.


http://rosettacode.org/wiki/Look-and-say_sequence#D

Bye,
bearophile


I have a task if you are interested, but I didn't bother to login 
to create it.

So if you like, you can create the page...

Here it is:
In a sequence of a million random integers, return the length and 
the indexes of the longest duplicate sequence (display indexes 
counting from one, not zero).


In order for everybody to start with the same random sequence, it 
may be useful to specify a simple implementation for the 
generating function.




Re: DIP 50 - AST macros

2013-11-17 Thread Zsombor Barna

On Sunday, 17 November 2013 at 09:13:04 UTC, Walter Bright wrote:

On 11/14/2013 2:06 PM, deadalnix wrote:
You can't argue that string mixin is good while AST macro is 
bad.


True, but you also cannot argue that since D has string mixins 
it must also have AST macros.


What macros do is bless, encourage, and make inevitable a 
particularly abusive style of programming. D already has super 
powerful metaprogramming abilities that we are far from 
exploring the limits of. I don't think we should be adding on 
even more that experience has shown goes quickly out of control.


Yeah, you are right, but I doubt anybody would use macros in 
production code without considering much less powerful constructs 
of the language to solve the problem. In hobby projects, I don't 
mind if create spaghetti code while experimenting.


Re: Look and think good things about D

2013-11-17 Thread SomeDude

On Friday, 15 November 2013 at 13:33:33 UTC, bearophile wrote:
If you use ranges badly you will get a slow program, if you use 
them well with a good back-end, you will have a fast program.




And so, what are the rules for not using ranges badly ? What 
should be avoided ?


Re: Worst-case performance of quickSort / getPivot

2013-11-17 Thread Ivan Kazmenko
On Sunday, 17 November 2013 at 03:58:58 UTC, Andrei Alexandrescu 
wrote:

On 11/16/13 5:07 PM, Ivan Kazmenko wrote:
The above is just my retelling of a great short article "A 
Killer

Adversary for Quicksort" by M. D. McIlroy here:
http://www.cs.dartmouth.edu/~doug/mdmspe.pdf


Nice story, but the setup is a tad tenuous (albeit indeed 
theoretically interesting). For starters, if the attacker has a 
hook into the comparison function, they could trivially do a 
lot worse...


Actually, I was thinking about a two phase attack, and it is not 
at all unlikely.


0. Say, the Sorter presents a library quicksort solution.  It may 
be closed source, but can take comparison function as argument.


1. On the preparation phase, the Attacker uses the tricky 
comparison function described previously.  What's important is 
that, besides observing Theta(n^2) behavior once, he now gets a 
real array a[] such that this behavior can be reproduced.


2. On the attack phase, the Attacker does not need access to the 
comparison function.  He just feeds the array obtained on the 
previous step as plain data.


Ivan Kazmenko.


Re: Worst-case performance of quickSort / getPivot

2013-11-17 Thread Ivan Kazmenko
On Sunday, 17 November 2013 at 01:48:14 UTC, Craig Dillabaugh 
wrote:
On Sunday, 17 November 2013 at 01:07:20 UTC, Ivan Kazmenko 
wrote:
Regarding an ideal pivot choice for quicksort, I'd like to 
emphasize that it is simply non-existent.  Here's why.


Let us measure quicksort performance as the number of 
comparisons it makes.


Let us make some assumptions:
quicksort goes as
--(1) choose a pivot p in O(1) comparisons;
--(2) partition the segment into two;
--(3) run quicksort recursively on the two resulting segments.

Now, (2) is effectively done by comparing every element with 
p, thus in Theta(n) (on the order of n) comparisons where n is 
the length of the current segment.


Under these assumptions, we can construct the killer array for 
an arbitrary quicksort of such kind, even if we don't know how 
exactly the pivot is chosen (say, closed source or 
obfuscation), but we have the following interface to it:


Instead of accessing the array a[], quicksort calls the 
function less(i,j) which tells it whether a[i]control that function.


(snip)


Woah, that sounds complicated.  Not 100% sure I entirely
understand but it seems to rely on changing the elements while
you sort.  How would that work in real life?

For example, how would this defeat the following pivoting 
method.


1. Find the median value in the array. This can be done
deterministically in linear time, so from a theoretical point of
view is just as fast as picking a random element since you use
linear time.
2. Use that as the pivot.


Right, this method (median of medians is an example 
implementation) will not be defeated by the above algorithm.  
There's nothing wrong in it, since it takes Theta(n) comparisons 
to choose the pivot and so violates our attack's working 
conditions.  In practice however, it means that, on an average 
case, the constant factor is very likely to be higher than for a 
quicksort vulnerable to our attack.


Ivan Kazmenko.


Re: Building druntime on MAC OS X

2013-11-17 Thread Jacob Carlborg

On 2013-11-16 23:13, Andrew Edwards wrote:


What else is missing for the OS X build? After correcting the DFLAGS in
dmd.conf, it performs as expected.


Nothing.


I manually changed dmd.conf to:

  DFLAGS=-L-export_dynamic -I%@P%/../import -L-L%@P%/../lib
-L-ldruntime-osxdefault

I've not experienced any issues since.


Please remove -L-ldruntime-osxdefault. druntime is statically linked 
with Phobos which DMD automatically links with.


-L-export_dynamic has never been used on Mac OS X. I don't remember why 
it was added to the Linux dmd.conf file.


--
/Jacob Carlborg


Re: DIP 50 - AST macros

2013-11-17 Thread Jacob Carlborg

On 2013-11-17 09:58, Walter Bright wrote:


Think about the desired feature mentioned earlier in this thread of
being able to insert a "return;" statement, causing what looks like a
function call to return from the caller.


I'm not entirely sure what you're saying. Could you please post a link 
to the post mentioning this or show an example.



That's a major step increase in obfuscation, not just a detail.


--
/Jacob Carlborg


Re: DIP 50 - AST macros

2013-11-17 Thread Jacob Carlborg

On 2013-11-17 10:00, Walter Bright wrote:


That would be better, but I don't think enough better.


Then why to we have UDA's with the same syntax as language attributes 
and why do we have operator overloading.


You can hide arbitrary code behind operator overloading.

--
/Jacob Carlborg


Re: DIP 50 - AST macros

2013-11-17 Thread Jacob Carlborg

On 2013-11-17 09:18, SomeDude wrote:


True but you will hardly use myfunction as an extension to the language.

If I understand the issues, I quite understand Walter's reluctance to
add such a feature. Once you add it, you add the capability to change
the language, or to create an entirely new language. If you let it, no
matter how loud you cry, people *will* abuse the feature, and they will
do it from day one. The C++ templates experiment shows what happens when
you do that. We don't want to repeat the same mistakes.


We already have templates and operator overloading. Perhaps we should 
remove those, we don't want to take the chance of people abusing them.



Basically, people want to change the language, but without having to
discuss their own extensions in the newsgroups. It's sometimes handy,
but you'll end up with crappy features all over the place.


No, people want a general solution that doesn't require the language to 
be extended each time they come up with a useful feature.


--
/Jacob Carlborg


Re: Look and think good things about D

2013-11-17 Thread bearophile

SomeDude:

I have a task if you are interested, but I didn't bother to 
login to create it.

So if you like, you can create the page...


It's better to discuss this in D.learn, where most of Rosettacode 
matters are discussed. I opened this in the main newsgroup to see 
if someone was willing to post it to Reddit.



In a sequence of a million random integers, return the length 
and the indexes of the longest duplicate sequence (display 
indexes counting from one, not zero).


If you write a first version of D solution I can post task 
description and its D solution in the Rosettacode site. I will 
later make improvements in your code, to make it more uniform 
with the other entries, etc.



In order for everybody to start with the same random sequence, 
it may be useful to specify a simple implementation for the 
generating function.


What function do you suggest? (It should be possible to implement 
on 32-64bit systems, in Haskell, in languages with only 
multi-precision integers, in languages like Ada that give errors 
on overflows, and in Java without unsigned integers).


Bye,
bearophile


Re: Look and think good things about D

2013-11-17 Thread bearophile

SomeDude:

And so, what are the rules for not using ranges badly ? What 
should be avoided ?


A newsgroup post is not large enough to contain an answer to 
this, that requires one or two articles to be written. In general 
writing quick code is partially an art, with the help of the 
profiler.


Bye,
bearophile


Re: D Language Citation

2013-11-17 Thread qznc

On Sunday, 17 November 2013 at 07:03:53 UTC, Sumit Adhikari wrote:

Dear User Community,

This mail is in particular to the citation of D.

D is extremely poorly cited (Yes this comes from a R&D guy). I 
searched and
searched (everywhere including IEEEXplore) and nothing comes in 
my hand!


There are materials available on internet which are not peer 
reviewed and
hence I cannot use for Journal citation! It is like I have 
everything but I

cannot cite!

It would be great idea as a beneficiary of D to publish for the 
future of

D. Please consider what I am saying :). Please publish.


In terms of academic citations, there is probably only Andreis 
book. If you just want a citation for D in general, this is fine. 
Do you need to reference anything more specific?


Alexandrescu A, The D Programming Language, Addison-Wesley, 2010.
Publication Date: June 12, 2010 | ISBN-10: 0321635361 | ISBN-13: 
978-0321635365 | Edition: 1


Re: Worst-case performance of quickSort / getPivot

2013-11-17 Thread Timon Gehr

On 11/17/2013 02:07 AM, Ivan Kazmenko wrote:

The random pick fails in the following sense: if we seed the RNG,
construct a killer case, and then start with the same seed, we get
Theta(n^2) behavior reproduced.


Hence, in no sense. This does not perform independent uniform random picks.


Re: Worst-case performance of quickSort / getPivot

2013-11-17 Thread Ivan Kazmenko

On Sunday, 17 November 2013 at 13:07:40 UTC, Timon Gehr wrote:

On 11/17/2013 02:07 AM, Ivan Kazmenko wrote:
The random pick fails in the following sense: if we seed the 
RNG,
construct a killer case, and then start with the same seed, we 
get

Theta(n^2) behavior reproduced.


Hence, in no sense. This does not perform independent uniform 
random picks.


Not at all.  There is a number of situations where you want your 
program to use RNG, but it is also important for the result to be 
reproducible.  In such cases, you typically store the RNG seed 
for re-use.


Of course there are also many cases where you don't need 
reproducibility guarantees, and there, the attack is useless.


Ivan Kazmenko.


Re: DIP 50 - AST macros

2013-11-17 Thread Dicebot


Think about the desired feature mentioned earlier in this 
thread of being able to insert a "return;" statement, causing 
what looks like a function call to return from the caller.


That's a major step increase in obfuscation, not just a detail.


You do realize that, as per more strict proposal, it won't change 
a _single_ thing about semantics of actual D function but allows 
to generate another one with semantics defined by macro?


It has _zero_ added ofbuscation over existing template + traits 
combo. You have not shown a single example which proves otherwise 
and keep resorting to personal experience argument instead of 
doing actual comparison. You have full right to do so but it does 
not sound like a decision made on behalf of careful reasoning.


Re: DIP 50 - AST macros

2013-11-17 Thread Simen Kjærås

On 2013-11-17 11:37, Jacob Carlborg wrote:

On 2013-11-17 09:58, Walter Bright wrote:


Think about the desired feature mentioned earlier in this thread of
being able to insert a "return;" statement, causing what looks like a
function call to return from the caller.


I'm not entirely sure what you're saying. Could you please post a link
to the post mentioning this or show an example.


I believe it is basically this:

int bar() {
foo(return 3);
return 5;
}


If that program is allowed to compile, and to return 3.


Re: D Language Citation

2013-11-17 Thread Sumit Adhikari
Dear Russel,

Thanks for the comments and suggestions.

I do not make rules, I follow them. I know my reviewers very well, so
better I be critical on what I put in.

Regards, Sumit


On Sun, Nov 17, 2013 at 2:55 PM, Russel Winder  wrote:

> On Sun, 2013-11-17 at 12:25 +0530, Sumit Adhikari wrote:
> > Dear User Community,
> >
> > This mail is in particular to the citation of D.
> >
> > D is extremely poorly cited (Yes this comes from a R&D guy). I searched
> and
> > searched (everywhere including IEEEXplore) and nothing comes in my hand!
> >
> > There are materials available on internet which are not peer reviewed and
> > hence I cannot use for Journal citation! It is like I have everything
> but I
> > cannot cite!
>
> Clearly URLs have to be considered ephemera as far as academic
> publication is concerned. However there are three classes of ephemera:
> 1. non-publishing websites; 2. publishing-related websites; and 3.
> publishing related archives.
>
> As an academic (admittedly a long time ago now), I would refuse to use
> or allow use of (as an editor of journal or conference proceedings) URLs
> in Category 1. However categories 2 and 3 are more reliable and so
> acceptable. Actually they are mandatory these days with the emerging
> academic publishing models. So where does this "ban" on using online
> material for citations come from? Are you self-censoring based on the
> notion of peer reviewed paper published journal?
>
> > It would be great idea as a beneficiary of D to publish for the future of
> > D. Please consider what I am saying :). Please publish.
>
> Walter Bright has published a number of papers in Dr Dobb's Journal and
> elsewhere, For example:
>
> http://www.drdobbs.com/tools/implementing-pure-functions/230700070
> http://www.drdobbs.com/cpp/increasing-compiler-speed-by-over-75/240158941
>
> Others have written and published in other places. There were some
> articles about D published in CVu, the journal of ACCU
> (http://www.accu.org) for example.
>
> And then there is:
>
> Alexandrescu A, The D Programming Language, Addison-Wesley, 2010.
>
> A fine publication, with certain publishing faults that lead to extreme
> collector pressure on value :-)
>
> The last point for now is that there is academic programming language
> research, and there is real world programming language development. The
> two are very different – believe me I have done both. So whilst Scala,
> which came from academia, has academic publications, Ceylon, Kotlin,
> Groovy, Ruby, Clojure, C++, D, Rust, JavaScript, Python, etc., etc. are
> developed almost entirely in a commercial or FOSS setting and so have no
> academic publications associated with them per se. D is not unique in
> this position.
>
> --
> Russel.
>
> =
> Dr Russel Winder  t: +44 20 7585 2200   voip:
> sip:russel.win...@ekiga.net
> 41 Buckmaster Roadm: +44 7770 465 077   xmpp: rus...@winder.org.uk
> London SW11 1EN, UK   w: www.russel.org.uk  skype: russel_winder
>
>


-- 
Sumit Adhikari,


Re: D Language Citation

2013-11-17 Thread Sumit Adhikari
Thanks,

Book of Andrei is the only material I am left with, nevertheless, some
article of Walter is also in my bib file.

I am not particular for any specific material in D, my aim is to prove the
novelty of D. Hence, my search should be:

1. Qualitative analysis between C++ and D.
2. Outlook of D.
3. Objective orientation in D.

Book of Andrei has become too old for what D has walked over last 3 years.
I have particular problem to cite them.

Thanks for understanding me at least.


On Sun, Nov 17, 2013 at 5:26 PM, qznc  wrote:

> On Sunday, 17 November 2013 at 07:03:53 UTC, Sumit Adhikari wrote:
>
>> Dear User Community,
>>
>> This mail is in particular to the citation of D.
>>
>> D is extremely poorly cited (Yes this comes from a R&D guy). I searched
>> and
>> searched (everywhere including IEEEXplore) and nothing comes in my hand!
>>
>> There are materials available on internet which are not peer reviewed and
>> hence I cannot use for Journal citation! It is like I have everything but
>> I
>> cannot cite!
>>
>> It would be great idea as a beneficiary of D to publish for the future of
>> D. Please consider what I am saying :). Please publish.
>>
>
> In terms of academic citations, there is probably only Andreis book. If
> you just want a citation for D in general, this is fine. Do you need to
> reference anything more specific?
>
>
> Alexandrescu A, The D Programming Language, Addison-Wesley, 2010.
> Publication Date: June 12, 2010 | ISBN-10: 0321635361 | ISBN-13:
> 978-0321635365 | Edition: 1
>



-- 
Sumit Adhikari,


Re: Worst-case performance of quickSort / getPivot

2013-11-17 Thread Chris Cain
On Sunday, 17 November 2013 at 07:19:26 UTC, Andrei Alexandrescu 
wrote:

On 11/16/13 9:21 PM, Chris Cain wrote:
That said, it might also be reproduced "well enough" using a 
random
generator to create similar strings to sort, but the basic 
idea is
there. I just like using real genomes for performance testing 
things :)


I am hoping for some more representative corpora, along the 
lines of http://sortbenchmark.org/. Some data that we can use 
as good proxies for typical application usage.


Andrei


I think I get what you're saying, but sortbenchmark.org uses 
completely pseudorandom (but reproducable) entries that I don't 
think are representative of real data either:


(using gensort -a minus the verification columns)
---
AsfAGHM5om
~sHd0jDv6X
uI^EYm8s=|
Q)JN)R9z-L
o4FoBkqERn
*}-Wz1;TD-
0fssx}~[oB
...
---

Most places use very fake data as proxies for real data. It's 
better to have something somewhat structured and choose data 
that, despite not being real data, stresses the benchmark in a 
unique way.


I'm not suggesting my benchmark be the only one; if we're going 
to use pseudorandom data (I'm not certain we could actually get 
"realistic data" that would serve us that much better) we might 
as well have different test cases that stress the sort routine in 
different ways. Obviously, using the real genome would be 
preferable to generating some (since it's actually truly "real" 
data, just used in an unorthodox way) but there's a disadvantage 
to attaching a 4.6MB file to a benchmarking setup. Especially if 
more might come.


Anyway, it's a reasonable representation of "data that has no 
discernable order that can occasionally take some time to 
compare." Think something like sorting a list of customer records 
by name. If they're ordered by ID, then the names would not 
likely have a discernable pattern and the comparison between 
names might be "more expensive" because some names can be common.


We can do "more realistic" for that type of scenario, if you'd 
like. I could look at a distribution for last names/first names 
and generate fake names to sort in a reasonable approximation of 
a distribution of real names. I'm not certain the outcome would 
change that much.


Re: D Language Citation

2013-11-17 Thread Russel Winder
Sumit,

On Sun, 2013-11-17 at 19:09 +0530, Sumit Adhikari wrote:
> Dear Russel,
> 
> Thanks for the comments and suggestions.
> 
> I do not make rules, I follow them. I know my reviewers very well, so
> better I be critical on what I put in.
> 
> Regards, Sumit

You are right to be cautious, though I wouldn't want you to "over think"
the problem and/or make too rigid assumptions about reviewers' opinions.

I am assuming then the core problem is "peer reviewed papers in academic
journals and conferences" rather than URLs. I would suggest that you can
still cite material about D even though it is not in peer reviewed
academic journals and conferences, but make it clear that it is not
"peer reviewed" in the academic sense. The point here is to make clear
in your work the nature of the reference being cited, show that you are
using the data in a properly academic way. A corollary here is that you
don't use "opinion pieces", just works that present data pertinent to
your work.

-- 
Russel.
=
Dr Russel Winder  t: +44 20 7585 2200   voip: sip:russel.win...@ekiga.net
41 Buckmaster Roadm: +44 7770 465 077   xmpp: rus...@winder.org.uk
London SW11 1EN, UK   w: www.russel.org.uk  skype: russel_winder



Re: Worst-case performance of quickSort / getPivot

2013-11-17 Thread Vladimir Panteleev
On Sunday, 17 November 2013 at 05:30:24 UTC, Jean Christophe 
wrote:

You mean, sort!`a.foo < b.foo` ?


Yes.

An indirect sorting, assuming a and b to be ojects of class 
SomePotentialyLargeClass.


Because the array to sort contains pointers only, all the data 
movement is essentially the same as if we were sorting integer.


If the range elements are reference types, that's what will 
happen (unless they overload opAssign oslt). Otherwise, there's 
makeIndex (already mentioned by Andrei), or you could also do it 
by hand:


1. r.length.iota.array.sort!((a, b) => r[a] &r[a]).array.sort!((a, b) => *a<*b);

r.map!`&a` doesn't work though, because we don't get a reference 
to the range element in the predicate.


Re: Worst-case performance of quickSort / getPivot

2013-11-17 Thread Andrei Alexandrescu

On 11/17/13 2:20 AM, Ivan Kazmenko wrote:

On Sunday, 17 November 2013 at 03:58:58 UTC, Andrei Alexandrescu wrote:

On 11/16/13 5:07 PM, Ivan Kazmenko wrote:

The above is just my retelling of a great short article "A Killer
Adversary for Quicksort" by M. D. McIlroy here:
http://www.cs.dartmouth.edu/~doug/mdmspe.pdf


Nice story, but the setup is a tad tenuous (albeit indeed
theoretically interesting). For starters, if the attacker has a hook
into the comparison function, they could trivially do a lot worse...


Actually, I was thinking about a two phase attack, and it is not at all
unlikely.

0. Say, the Sorter presents a library quicksort solution.  It may be
closed source, but can take comparison function as argument.

1. On the preparation phase, the Attacker uses the tricky comparison
function described previously.  What's important is that, besides
observing Theta(n^2) behavior once, he now gets a real array a[] such
that this behavior can be reproduced.

2. On the attack phase, the Attacker does not need access to the
comparison function.  He just feeds the array obtained on the previous
step as plain data.

Ivan Kazmenko.


That won't do with randomized pivot selection.

Andrei


Re: Worst-case performance of quickSort / getPivot

2013-11-17 Thread Vladimir Panteleev
On Sunday, 17 November 2013 at 08:33:11 UTC, Craig Dillabaugh 
wrote:

http://www.cs.cmu.edu/afs/cs.cmu.edu/academic/class/15451-s07/www/lecture_notes/lect0125.pdf


Nice! Didn't know about this - although still seems to be a lot 
of work for each 'n'.


Stop the world GC, why?

2013-11-17 Thread Stretto
What is the crucial reason why the current GC has to stop the 
world? It would seem to me that only memory defragmentation would 
require such a step? If so, then could we not turn off 
defragmentation and allow the GC to run in the background to 
clean up unused memory and maybe periodically manually defragment?


Also, With the new std.allocator it would be nice if either

1) crucial parts of phobos were rewritten to use allocators such 
as file and console io. This would allow one to have basic 
routines without requiring the GC or reimplementation.


2) Use a dual heap with two GC's where phobo's routines live. 
When one heap is being collected by the GC the other one becomes 
active effectively allowing a concurrent GC. By having only 
phobos and internal routines work on these heaps they should be 
allowed to be much smaller(hence quicker GC collection and 
prevent stopping the main thread).


These options would allow more predictable real time behavior.


Re: Building druntime on MAC OS X

2013-11-17 Thread Andrew Edwards

On 11/17/13, 5:34 AM, Jacob Carlborg wrote:

Please remove -L-ldruntime-osxdefault. druntime is statically linked
with Phobos which DMD automatically links with.


Done.


-L-export_dynamic has never been used on Mac OS X. I don't remember why
it was added to the Linux dmd.conf file.



Removed also.


I have couple issues remaining for Mac OS X that I'm hoping you can 
provide some clarification on:


1) The install option for TOOLS, unlike DMD, DRUNTIME or PHOBOS attempts 
to install the generated binaries directly to /usr/local/bin/. Better to 
insert them in ../install/bin/ like the other three allowing for 
automatic inclusion in the release zip and installer.


2) Executing DMD with no arguments displays the version number, 
copyright, and usage info of course. Whether building from the v2.064 or 
v2.064.2 tag on github, the resulting binary always displays "DMD64 D 
Compiler v2.064-devel-a9eedd1". How do I get it to build in the correct 
version? The VERSION file is present in the git clone but for some 
reason it is ignored?




Re: Worst-case performance of quickSort / getPivot

2013-11-17 Thread Andrei Alexandrescu

On 11/17/13 6:20 AM, Chris Cain wrote:

I'm not suggesting my benchmark be the only one; if we're going to use
pseudorandom data (I'm not certain we could actually get "realistic
data" that would serve us that much better) we might as well have
different test cases that stress the sort routine in different ways.
Obviously, using the real genome would be preferable to generating some
(since it's actually truly "real" data, just used in an unorthodox way)
but there's a disadvantage to attaching a 4.6MB file to a benchmarking
setup. Especially if more might come.


OK, since I see you have some interest...

You said nobody would care to actually sort genome data. I'm aiming for 
data that's likely to be a good proxy for tasks people are interested in 
doing.


For example, people may be interested in sorting floating-point numbers 
resulting from sales, measurements, frequencies, probabilities, and 
whatnot. Since most of those have a Gaussian distribution, a corpus with 
Gaussian-distributed measurements would be nice.


Then, people may want to sort things by date/time. Depending on the 
scale the distribution is different - diurnal cycle, weekly cycle, 
seasonal cycle, secular ebbs and flows etc. I'm unclear on what would be 
a good set of data. For sub-day time ranges uniform distribution may be 
appropriate.


Then, people may want to sort records by e.g. Lastname, Firstname, or 
index a text by words. For names we'd need some census data or 
phonebook. For general text sorting we can use classic texts such as 
Alice in Wonderland or the King James Bible (see 
http://corpus.canterbury.ac.nz/descriptions/). Sorting by word length is 
a possibility (word lengths are probably Gaussian-distributed).


Uniform random data is also a baseline, not terribly representative, but 
worth keeping an eye on. In fact uniform data is unfairly rigged in 
quicksort's favor: any pivot is likely to be pretty good, and there are 
no sorted runs that often occur in real data.



Andrei


Re: Look and think good things about D

2013-11-17 Thread bearophile
What function do you suggest? (It should be possible to 
implement on 32-64bit systems, in Haskell, in languages with 
only multi-precision integers, in languages like Ada that give 
errors on overflows, and in Java without unsigned integers).


The Computer Shootout site uses this one:


int nextRandom() nothrow {
enum int IM = 139968, IA = 3877, IC = 29573;
__gshared static int last = 42;
return last = (last * IA + IC) % IM;
}

void main() {
import std.stdio;

foreach (i; 0 .. 100) {
nextRandom.writeln;
}
}


Bye,
bearophile


Re: Stop the world GC, why?

2013-11-17 Thread Paulo Pinto

Am 17.11.2013 17:21, schrieb Stretto:

What is the crucial reason why the current GC has to stop the world? It
would seem to me that only memory defragmentation would require such a
step? If so, then could we not turn off defragmentation and allow the GC
to run in the background to clean up unused memory and maybe
periodically manually defragment?



Because so far no one created a better one to integrate into D compilers.

It is an implementation issue that will be eventually fixed.

--
Paulo



Overriding interface method without implementation

2013-11-17 Thread Uranuz

Sometimes when I build my code I get error message(s) starting
with `undefined reference to`. Because it happens not too often
it's not obvious and easy to understand what is the problem.
There is an easy example:

import std.stdio;

interface I
{
void foo(int num);
int bar();  
}

class A: I
{
override {
void foo(int num); //This string is evil root))

int bar()
{   return 100; }
}
}

void main()
{
writeln("Hello, world!!!");
}

Compilation output:
/d621/f189.o:(.rodata+0x138): undefined reference to
`_D4f1891A3fooMFiZv'
/d621/f189.o: In function `_TMP3':
/d621/f189.d:(.text._D4f1891A3barMFZi+0x55): undefined reference
to `_D4f1891A3fooMFiZv'
collect2: error: ld returned 1 exit status
--- errorlevel 1

It happens because after invention of some interface I simply
copy-paste its contents into class A that implements I. And
sometimes I can forget to implement some small function that for
example just returns value of some field.

In such cases I will get some not very obvious error that can't
help me to find out module name and line index that produces this
error.

As far as I know this syntax for function without implementation
mean that it will be implemented somewhere else for example in C
code. But I think that for "usual" programmer some understandable
error message or warning is needed.


Re: Overriding interface method without implementation

2013-11-17 Thread bearophile

See also:
http://d.puremagic.com/issues/show_bug.cgi?id=5129

The special use case should be respected and allowed, but it 
can't make error messages too much harder for every one else in 
most other common cases.


Bye,
bearophile


Re: DIP 50 - AST macros

2013-11-17 Thread deadalnix

On Sunday, 17 November 2013 at 13:36:32 UTC, Simen Kjærås wrote:

On 2013-11-17 11:37, Jacob Carlborg wrote:

On 2013-11-17 09:58, Walter Bright wrote:

Think about the desired feature mentioned earlier in this 
thread of
being able to insert a "return;" statement, causing what 
looks like a

function call to return from the caller.


I'm not entirely sure what you're saying. Could you please 
post a link

to the post mentioning this or show an example.


I believe it is basically this:

int bar() {
foo(return 3);
return 5;
}


If that program is allowed to compile, and to return 3.


That never was in the DIP.


Re: Overriding interface method without implementation

2013-11-17 Thread Uranuz
As far as I understand it's not a problem at compile stage but 
it's a link error. So compiler can't decide if it's error or not 
when you have function without body. But may be some keyword or 
attribute is needed to tell the compiler if this function will be 
implemented in some other way or it's just a missing body.


Re: (Linux) Which compilers should be working now with shared libraries?

2013-11-17 Thread Ellery Newcomer

On 11/16/2013 02:40 PM, qznc wrote:

On Saturday, 16 November 2013 at 13:00:54 UTC, Martin Nowak wrote:

On Friday, 15 November 2013 at 20:17:38 UTC, Marco Leise wrote:

Oops, this was answered with "not ready yet" a few topics
earlier. Ok, then static D libs only for now.


That applies only to loading, linking shared libraries works.


I think telling anybody "shared libraries do (not) work" is too
simplistic. Is there an overview, what does and what does not work with
shared libraries? I envision a matrix with one axis "dmd, gdc, ldc" and
one axis "linking shared C library, loading/unloading shared C
libraries, link shared D library into D, link shared D library into C, ..."


and then another axis for dmdfe/druntime release, and then another axis 
for platform, and then another axis for architecture


and then this becomes maybe a bit much for a wiki system to handle.

But I would like something like this too.


Re: DIP 50 - AST macros

2013-11-17 Thread Simen Kjærås

On 13.11.2013 20:05, Ellery Newcomer wrote:

On Wednesday, 13 November 2013 at 10:51:48 UTC, Simen Kjærås wrote:

On 12.11.2013 18:53, Ellery Newcomer wrote:

It's perfectly possible in D to make this work:



hey, cool impl

*comprehends code*

I mean Ewww


I *can* make that work. I'm not going to.

--
  Simen


I concur with the second part.


I decided to abandon sanity. Luckily, only for the named parameters. I 
now have this code working:



void test(int a, int b = 2, string c = "foo", int d = 14) {
}

alias test2 = nameify!test;

void main() {
test2(1, Args.d = 4, Args.c = "bar");
}

With reasonable error messages for duplicate and missing parameters, as 
well as type mismatches.


Source is attached. I hope God forgives me.

--
  Simen
import std.stdio : writeln;
import std.traits : ParameterTypeTuple, ParameterIdentifierTuple, 
ParameterDefaultValueTuple;
import std.typetuple : TypeTuple, staticIndexOf, staticMap;
import std.conv : to;

struct NamedArg(string _name, Arg) {
enum name = _name;
Arg value;
alias value this;
}

template isNotNamedArg(T...) if (T.length == 1) {
enum isNotNamedArg = !isNamedArg!T;
}
template isNamedArg(T...) if (T.length == 1) {
static if (is(typeof(T[0]))) {
enum isNamedArg = isNamedArg!(typeof(T[0]));
} else {
enum isNamedArg = is(T[0] == NamedArg!U, U...);
}
} unittest {
assert( isNamedArg!(NamedArg!("a", int)));
assert(!isNamedArg!int);
NamedArg!("a", float) f;
assert(isNamedArg!f);
}


final abstract class Args {
@property
static auto opDispatch(string name, Arg)(Arg arg) {
return NamedArg!(name, Arg)(arg);
}
}

template staticFilter(alias F, T...) {
static if (T.length == 0) {
alias staticFilter = TypeTuple!();
} else static if (T.length == 1) {
static if (F!T) {
alias staticFilter = TypeTuple!(T);
} else {
alias staticFilter = TypeTuple!();
}
} else {
alias staticFilter = TypeTuple!(
staticFilter!(F, T[0..$/2]),
staticFilter!(F, T[$/2..$]),
);
}
}

template Head(T...) if (T.length) {
alias Head = TypeTuple!(T[0]);
}

template Tail(T...) {
static if (T.length) {
alias Tail = T[1..$];
} else {
alias Tail = TypeTuple!();
}
}

template Chop(size_t block, alias F) {
alias Chop = TypeTuple!();
}

template Chop(size_t blocks, alias F, T...) if (T.length && T.length % blocks 
== 0) {
alias Chop = TypeTuple!(
F!(T[0..$/blocks]),
Chop!(blocks-1, F, T[$/blocks..$])
);
}

template staticZip(size_t N, alias F) {
alias staticZip = TypeTuple!();
}

template staticZip(size_t N, alias F, T...) {
static if (T.length == 0) {
alias staticZip = TypeTuple!();
} else {
alias staticZip = TypeTuple!(
F!(Chop!(N, Head, T)),
staticZip!(N, F, Chop!(N, Tail, T))
);
}
}

final abstract class ArgInfo(string _name, _type, _defaultValue...) if 
(_defaultValue.length == 1) {
enum name = _name;
alias type = _type;
alias value = _defaultValue;
} 

template staticToString(T...) {
enum staticToString = T.stringof;
}

template staticNameOf(T) {
enum staticNameOf = T.name;
}

string sortNamedArgs(string[] realNames, string[] defaults, string[] lookHere, 
int line = __LINE__, string file = __FILE__) {
string result = "alias sorted = TypeTuple!(";
foreach (i, name; realNames) {
bool found = false;
foreach (j, needle; lookHere) {
if (needle == name) {
if (found) {
return "static assert(false, \"" ~ file ~ "(" ~ 
to!string(line) ~ "): Duplicate parameter '" ~ needle ~ "'.\");";
}
found = true;
result ~= "lookup!(" ~ to!string(j) ~ ", namedArgs), ";
}
}
if (!found) {
result ~= "lookup!(" ~ to!string(i) ~ ", 
paramDefaults[unnamedArgs.length..$]), ";
}
}
return result ~ ");";
}

template lookup(int i, T...) {
alias lookup = T[i];
}

struct Compare(T1, T2, string _name) {
static if (isNamedArg!T1) {
alias Type1 = typeof(T1.value);
} else {
alias Type1 = T1;
}
enum name = _name;
alias Type2 = T2;
}

template checkTypesImpl(int line = __LINE__, string file = __FILE__, int idx, 
T...) {
static if(T.length == 0) {
enum checkTypesImpl = true;
} else {
alias T1 = TypeTuple!(T[0].Type1);
alias T2 = TypeTuple!(T[0].Type2);
enum N1 = T[0].name;
static assert(!is(T1 == TypeTuple!void), file ~ "(" ~ to!string(line) ~ 
"): Missing value for parameter " ~ N1 ~ ".");
static assert(is(T2 : T1), file ~ "(" ~ to!string(line) ~ "): type 
mismatch. Expected " ~ T1.stringof ~ ", got " ~ T2.stringof ~ " for parameter " 
~ N1 ~ ".");
enum checkTypesImpl = checkTypesImpl!(line, file, idx+1, T[1..$]);
}
}

t

Re: DIP 50 - AST macros

2013-11-17 Thread Philippe Sigaud
On Sun, Nov 17, 2013 at 8:41 PM, Simen Kjærås  wrote:

> Source is attached. I hope God forgives me.

Nice ideas. I particularly like this one:

final abstract class Args {
@property
static auto opDispatch(string name, Arg)(Arg arg) {
return NamedArg!(name, Arg)(arg);
}
}

Which allows you to have Args.c = 1 (and thus storing the name and the value).

Half of your implementation is in fact generic enough to be in Phobos
(staticFilter, staticZip). I know I use my own...

Btw, with DMD 2.064.2, we can now write:

template staticToString(T...) {
enum staticToString = T.stringof;
}

as:

alias staticToString(T...) = T.stringof;

Which is both easier to understand and easier on the eyes.



Re: DIP 50 - AST macros

2013-11-17 Thread deadalnix

On Sunday, 17 November 2013 at 19:41:40 UTC, Simen Kjærås wrote:

On 13.11.2013 20:05, Ellery Newcomer wrote:
On Wednesday, 13 November 2013 at 10:51:48 UTC, Simen Kjærås 
wrote:

On 12.11.2013 18:53, Ellery Newcomer wrote:

It's perfectly possible in D to make this work:



hey, cool impl

*comprehends code*

I mean Ewww


I *can* make that work. I'm not going to.

--
  Simen


I concur with the second part.


I decided to abandon sanity. Luckily, only for the named 
parameters. I

now have this code working:


void test(int a, int b = 2, string c = "foo", int d = 14) {
}

alias test2 = nameify!test;

void main() {
 test2(1, Args.d = 4, Args.c = "bar");
}

With reasonable error messages for duplicate and missing 
parameters, as

well as type mismatches.

Source is attached. I hope God forgives me.


That is a really cool idea !


Re: DIP 50 - AST macros

2013-11-17 Thread Walter Bright

On 11/17/2013 10:47 AM, deadalnix wrote:

On Sunday, 17 November 2013 at 13:36:32 UTC, Simen Kjærås wrote:

On 2013-11-17 11:37, Jacob Carlborg wrote:

On 2013-11-17 09:58, Walter Bright wrote:


Think about the desired feature mentioned earlier in this thread of
being able to insert a "return;" statement, causing what looks like a
function call to return from the caller.


I'm not entirely sure what you're saying. Could you please post a link
to the post mentioning this or show an example.


I believe it is basically this:

int bar() {
foo(return 3);
return 5;
}


If that program is allowed to compile, and to return 3.


That never was in the DIP.


It was this by Timon Gehr essentially posting why lazy parameters weren't good 
enough:



On 11/13/2013 08:25 PM, Walter Bright wrote:




Ah, found the code:

void ifthen(bool cond, lazy void dg)
{
 if (cond)
 dg();
}


int foo(int x){
ifthen(!x, return 2); // uh oh
return 3;
}





Re: DIP 50 - AST macros

2013-11-17 Thread Walter Bright

On 11/17/2013 2:38 AM, Jacob Carlborg wrote:

On 2013-11-17 10:00, Walter Bright wrote:


That would be better, but I don't think enough better.


Then why to we have UDA's with the same syntax as language attributes and why do
we have operator overloading.

You can hide arbitrary code behind operator overloading.


1. I don't believe we can decide on language features by analogy. D is complex 
enough that one can use analogy to justify anything.


2. You cannot do anything behind a function call - the 'return' discussed 
earlier, and async/await for another, i.e. operator overloading cannot introduce 
control flow, cannot introduce variables into the current scope, etc.




Re: DIP 50 - AST macros

2013-11-17 Thread Ellery Newcomer

On 11/17/2013 11:41 AM, Simen Kjærås wrote:


Source is attached. I hope God forgives me.



I suppose you'd have to do something like

x.Where(OR( NOT(Args.thing = "thing"), Args.sing = "sing"))

for nesting and negation and all that.

I'd wait for walter to relax the restrictions on ==, &&, etc operator 
overloading. Then you probably could do


x.Where(X => X.thing != "thing" || X.sing == "sing")

let's see here. If X's type is QueryExp!Entity and the return type is 
SomeRange!Entity then querying looks doable.


how about composing expressions? take the result of that lambda EXP,
EXP = EXP && X.cling == 1; // should work fine?

how about decomposing expressions?
EXP = EXP.lhs; // should work fine?

EXP.rhs.rhs.value // returns "sing" - should work fine? the lambda 
should create a closure, so we should be able to get out any value we put in


you won't be able to do

x.Where(X => upper(X.thing) == "THING")

without ast macros but will have to be satisfied with

x.Where(X => X.thing.upper() == "THING")

will you be able to do

x.Where(X => X.thing == "THING" && x.Any(X2 => X2.thing == X.thing && 
X2.id != X.id))


?

either "Any" would have to be only for template expressions, or X would 
have to be an outer context like so:


x.Where(X => X.it.thing == "THING" && X.x.Any(X2 => X2.it.thing == 
X.it.thing && X2.it.id != X.it.id))


because x.Any(...) should return bool, but inside the query expression 
it should return QueryExp. In both cases it would take a param of


QueryExp delegate(QueryExp)

Anybody else have any other ideas?



Re: Worst-case performance of quickSort / getPivot

2013-11-17 Thread Chris Cain
On Sunday, 17 November 2013 at 16:57:19 UTC, Andrei Alexandrescu 
wrote:
Then, people may want to sort records by e.g. Lastname, 
Firstname, or index a text by words. For names we'd need some 
census data or phonebook.


Sure.

As a source of data, app_c.csv from 
http://www.census.gov/genealogy/www/data/2000surnames/ -> File B: 
Surnames...



Test case:
---
import std.stdio, std.algorithm, std.random, std.range, std.conv, 
std.datetime;


// Name generator constants
enum NumberOfPossibleNames = 1_000_000;
enum NumberOfRandomNames = 1_000_000;

enum SortStrategy = SwapStrategy.unstable;

void main() {
auto data = File("app_c.csv").byLine
.drop(1) // First line is just column names
.take(NumberOfPossibleNames)
.map!(e => e.text.split(","))
.array;
auto names = data.map!(e => e[0]).array;
auto proportions = data.map!(e => e[2].to!size_t).array;

auto rnd = Random(50);
auto randomNames = fastDice(rnd, proportions)
.take(NumberOfRandomNames)
.map!(i => names[i])
.array;

StopWatch sw = AutoStart.yes;
randomNames.sort!("a < b", SortStrategy)();
sw.stop();

writeln(randomNames.length, " names sorted in ",
sw.peek.msecs, " msecs using ", SortStrategy, " sort");
}

struct FastDice(Rng, SearchPolicy pol = SearchPolicy.gallop) {
SortedRange!(size_t[]) _propCumSumSorted;
size_t _sum;
size_t _front;
Rng* _rng;

this(ref Rng rng, size_t[] proportions) {
size_t[] _propCumSum = proportions.save.array;
_rng = &rng;

size_t mass = 0;
foreach(ref e; _propCumSum) {
mass += e;
e = mass;
}

_sum = _propCumSum.back;

_propCumSumSorted = assumeSorted(_propCumSum);

popFront();
}

void popFront() {
immutable point = uniform(0, _sum, *_rng);
assert(point < _sum);

_front = _propCumSumSorted.lowerBound!pol(point).length;
}

enum empty = false;

@property
auto front() {
return _front;
}
}

auto fastDice(Rng)(ref Rng rng, size_t[] proportions) {
return FastDice!(Rng)(rng, proportions);
}

auto fastDice(size_t[] proportions) {
return fastDice(rndGen, proportions);
}
---

Results (using `-O -inline -release -noboundschecks` on my 
computer):

unstable sort: 738 msecs
stable sort: 1001 msecs


Also, to do this (in reasonable time) I had to create a new range 
which I called "FastDice" ... it does the same as 
std.random.dice, but is intended for cases where you'll be 
throwing dice throws often on the same data, so it does a bit of 
precomputation (creating a cumulative sum array) and allows for 
binary search/gallop to reduce the time to come up with each 
throw. I opted for gallop in this since the data is sorted in 
such a way that most common names come first.


It needs a bit of work to make it actually generic, but it's a 
good start and I'll just say it's WTFPL code, so use it for 
whatever. It'll be especially good for generating test cases like 
I have done above.


FWIW, FastDice takes ~400ms to generate all those dice throws. I 
did a back-of-the-envelope calculation on using dice, and just 
the time saved caching the sum saved maybe 30 minutes (No, I 
didn't wait that long, I stopped after 5 and wrote FastDice) of 
time each run. And the time saved using a gallop search instead 
of linear search is pretty significant as well (difference 
between n and log n time).


Re: DIP 50 - AST macros

2013-11-17 Thread Doodoo

HS Teoh why u always gotta go and break the forum


Re: DIP 50 - AST macros

2013-11-17 Thread Rob T

Walter,

what do you think about allowing mixins to work with parameter 
list construction?


Currently you cannot generate code unless it is syntactically 
complete on its own, this disallows parameter list construction 
even for syntactically correct parameter lists.


The other failure point of mixins is when stringing together 
function calls with the dot, eg A.mixin(MYFUNC).foo;


A whole lot more could be done with mixins if restrictions were 
relaxed just a bit more in a few key areas. I don't see too much 
obfuscation happening so long as the allowable constructs are 
carefully limited and cannot be manipulated in ways that factor 
into the concerns you mentioned previously.


To make it clear we could have a special parammixin() and 
dotmixin() or whatever to ensure there's no possibility of 
confusion.


--rt


Re: DIP 50 - AST macros

2013-11-17 Thread Timon Gehr

On 11/17/2013 09:48 PM, Walter Bright wrote:


It was this by Timon Gehr essentially posting why lazy parameters
weren't good enough:


On 11/13/2013 08:25 PM, Walter Bright wrote:




Ah, found the code:

void ifthen(bool cond, lazy void dg)
{
 if (cond)
 dg();
}


int foo(int x){
ifthen(!x, return 2); // uh oh
return 3;
}




This code was just supposed to demonstrate that ifthen is not actually a 
"statement" work-alike as claimed in a previous post.


Re: DIP 50 - AST macros

2013-11-17 Thread Timon Gehr

On 11/17/2013 08:41 PM, Simen Kjærås wrote:


I decided to abandon sanity. Luckily, only for the named parameters. I
now have this code working:


void test(int a, int b = 2, string c = "foo", int d = 14) {
}

alias test2 = nameify!test;

void main() {
 test2(1, Args.d = 4, Args.c = "bar");
}

With reasonable error messages for duplicate and missing parameters, as
well as type mismatches.
...


Still missing overload resolution. :o)


Re: Overriding interface method without implementation

2013-11-17 Thread Timon Gehr

On 11/17/2013 07:58 PM, Uranuz wrote:

But may be some keyword or attribute is needed to tell the compiler if
this function will be implemented in some other way or it's just a
missing body.


Mark foo 'abstract' and your code will compile.


Re: Worst-case performance of quickSort / getPivot

2013-11-17 Thread Timon Gehr

On 11/17/2013 02:14 PM, Ivan Kazmenko wrote:

On Sunday, 17 November 2013 at 13:07:40 UTC, Timon Gehr wrote:

On 11/17/2013 02:07 AM, Ivan Kazmenko wrote:

The random pick fails in the following sense: if we seed the RNG,
construct a killer case, and then start with the same seed, we get
Theta(n^2) behavior reproduced.


Hence, in no sense. This does not perform independent uniform random
picks.


Not at all.  There is a number of situations where you want your program
to use RNG, but it is also important for the result to be reproducible.
In such cases, you typically store the RNG seed for re-use.

Of course there are also many cases where you don't need reproducibility
guarantees, and there, the attack is useless.

Ivan Kazmenko.


One can't say random picking is bad because one can use some other 
(deterministic) pivot selection algorithm instead which is bad. If you 
need deterministic reproducibility guarantees, then random picking is 
useless.


Re: DIP 50 - AST macros

2013-11-17 Thread Simen Kjærås

On 17.11.2013 23:21, Timon Gehr wrote:

On 11/17/2013 08:41 PM, Simen Kjærås wrote:


I decided to abandon sanity. Luckily, only for the named parameters. I
now have this code working:


void test(int a, int b = 2, string c = "foo", int d = 14) {
}

alias test2 = nameify!test;

void main() {
 test2(1, Args.d = 4, Args.c = "bar");
}

With reasonable error messages for duplicate and missing parameters, as
well as type mismatches.
...


Still missing overload resolution. :o)


Please don't tempt me. :p

--
  Simen


Re: DIP 50 - AST macros

2013-11-17 Thread Walter Bright

On 11/17/2013 1:58 PM, Timon Gehr wrote:

This code was just supposed to demonstrate that ifthen is not actually a
"statement" work-alike as claimed in a previous post.


Ok, my apologies.


Re: Build Master: Scheduling

2013-11-17 Thread Jonathan M Davis
On Thursday, November 14, 2013 14:55:44 Dicebot wrote:
> On Thursday, 14 November 2013 at 00:37:38 UTC, Tyro[17] wrote:
> > ...
> > 
> > Your thoughts and concerns please.
> 
> Key problem here is that you call by beta something that is
> really not a beta. It is short term support release similar to
> ones we currently have with a shorter release schedule. And if it
> is _really_ supposed to be beta (== with some potentially unfixed
> regressions remaining) it is not really usable even in bleeding
> edge code.
> 
> I have been proposing for ages to take existing
> http://wiki.dlang.org/Release_Process and define long-term
> support releases on top of it (== "normal" releases in your
> proposal). That way one can do releases once in 2 months to keep
> stuff going and mark ones as LTS once in a 6 months (with
> intention to backport non-breaking fixes into last 2 LTS) for
> users that want more stability.
> 
> It fulfills same goals but does not result in broken releases /
> confusing naming.

I think that it makes some sense to continue to do releases more or less as we 
have (though hopefully more frequently - monthly or bi-monthly) but to have an 
"LTS" release every six months and then update that LTS branch and do patch 
releases for it whenever there's a critical bug that needs to be fixed. So, we 
have the LTS version which only gets critical patches, and the normal one 
which gets everything. But we don't try and treat the normal one as a beta or 
anything like that. That's a horrible idea IMHO, and will just lead to 
everyone using git head.

However, even with the LTS release, we should be _very_ restrictive on what 
bugs get backported, because it's actually the bug fixes which cause the most 
bugs and code breakage.

But I think that the idea should be that the LTS release is for those who need 
things to change as little as possible but still get fixes for bugs that really 
need them, whereas the normal releases should be what most everyone uses and 
that we not try and avoid having most folks use the releases which can include 
new features or other major changes.

- Jonathan M Davis


dmd 2.064.2 stuck compiling code?

2013-11-17 Thread michaelc37
I have an issue where dmd v2.064.2 never finishes compiling 
(hangs)


i'm trying to recompile my patched up fork of qtd 
(https://github.com/michaelc37/qtd) but the code never stops 
compiling (it was compiling with v2.063.2).


i was able to catch the last compiler command line executed using
make SHELL="/bin/bash -x"

/usr/bin/dmd -version=QtdCppShared -Ibuild_dir/build 
-I -I/d2 -O -release -inline -c and lots of d source files> 
-of/build_dir/build/CMakeFiles/qtdgui.dir/qtdgui0.o


anyone experiencing this issue using similar command line 
params(-O -release -inline -c)?


any suggestions on how to investigate this further?


Re: Worst-case performance of quickSort / getPivot

2013-11-17 Thread Andrei Alexandrescu

On 11/17/13 1:23 PM, Chris Cain wrote:

On Sunday, 17 November 2013 at 16:57:19 UTC, Andrei Alexandrescu wrote:

Then, people may want to sort records by e.g. Lastname, Firstname, or
index a text by words. For names we'd need some census data or phonebook.


Sure.

As a source of data, app_c.csv from
http://www.census.gov/genealogy/www/data/2000surnames/ -> File B:
Surnames...

[snip]


It needs a bit of work to make it actually generic, but it's a good
start and I'll just say it's WTFPL code, so use it for whatever. It'll
be especially good for generating test cases like I have done above.

FWIW, FastDice takes ~400ms to generate all those dice throws. I did a
back-of-the-envelope calculation on using dice, and just the time saved
caching the sum saved maybe 30 minutes (No, I didn't wait that long, I
stopped after 5 and wrote FastDice) of time each run. And the time saved
using a gallop search instead of linear search is pretty significant as
well (difference between n and log n time).


I think that's a terrific start! (Not sure I understand where the 30 
minutes come from...)



Andrei



Re: D Language Citation

2013-11-17 Thread Charles Hixson

On 11/17/2013 05:44 AM, Sumit Adhikari wrote:

Thanks,

Book of Andrei is the only material I am left with, nevertheless, some 
article of Walter is also in my bib file.


I am not particular for any specific material in D, my aim is to prove 
the novelty of D. Hence, my search should be:


1. Qualitative analysis between C++ and D.
2. Outlook of D.
3. Objective orientation in D.

Book of Andrei has become too old for what D has walked over last 3 
years. I have particular problem to cite them.


Thanks for understanding me at least.


On Sun, Nov 17, 2013 at 5:26 PM, qznc > wrote:


On Sunday, 17 November 2013 at 07:03:53 UTC, Sumit Adhikari wrote:

Dear User Community,

This mail is in particular to the citation of D.

D is extremely poorly cited (Yes this comes from a R&D guy). I
searched and
searched (everywhere including IEEEXplore) and nothing comes
in my hand!

There are materials available on internet which are not peer
reviewed and
hence I cannot use for Journal citation! It is like I have
everything but I
cannot cite!

It would be great idea as a beneficiary of D to publish for
the future of
D. Please consider what I am saying :). Please publish.


In terms of academic citations, there is probably only Andreis
book. If you just want a citation for D in general, this is fine.
Do you need to reference anything more specific?


Alexandrescu A, The D Programming Language, Addison-Wesley, 2010.
Publication Date: June 12, 2010 | ISBN-10: 0321635361 | ISBN-13:
978-0321635365 | Edition: 1




--
Sumit Adhikari,

IIRC Dr. Dobbs did an article on D a few years ago.  Is that 
publication? (I don't have it anymore, and I don't know for certain that 
it was in a print edition.  Do those matter?)


FWIW, no library that I have access to has any information on any 
computer language less than about a decade old, except for some 
magazines to which they are donated subscriptions, and which they don't 
keep around.  (I think they officially retain them for about a year now, 
but they often go missing sooner.)  Even the used book stores around 
here don't stock computer books...which I find quite annoying.


--
Charles Hixson



Re: DIP 50 - AST macros

2013-11-17 Thread deadalnix

On Sunday, 17 November 2013 at 20:48:15 UTC, Walter Bright wrote:

That never was in the DIP.


It was this by Timon Gehr essentially posting why lazy 
parameters weren't good enough:



On 11/13/2013 08:25 PM, Walter Bright wrote:




Ah, found the code:

void ifthen(bool cond, lazy void dg)
{
if (cond)
dg();
}


int foo(int x){
   ifthen(!x, return 2); // uh oh
   return 3;
}


As I understand it, Timon choosed that syntax simply to 
demonstrate the limitation of your proposal using a similar 
syntax. Not to propose a syntax.


Re: Worst-case performance of quickSort / getPivot

2013-11-17 Thread bearophile

Chris Cain:

Also, to do this (in reasonable time) I had to create a new 
range which I called "FastDice" ... it does the same as 
std.random.dice, but is intended for cases where you'll be 
throwing dice throws often on the same data, so it does a bit 
of precomputation (creating a cumulative sum array)


See also:
http://d.puremagic.com/issues/show_bug.cgi?id=5849

Bye,
bearophile


Re: dmd 2.064.2 stuck compiling code?

2013-11-17 Thread Brian Schott

On Monday, 18 November 2013 at 02:18:34 UTC, michaelc37 wrote:
I have an issue where dmd v2.064.2 never finishes compiling 
(hangs)


i'm trying to recompile my patched up fork of qtd 
(https://github.com/michaelc37/qtd) but the code never stops 
compiling (it was compiling with v2.063.2).


i was able to catch the last compiler command line executed 
using

make SHELL="/bin/bash -x"

/usr/bin/dmd -version=QtdCppShared 
-Ibuild_dir/build -I 
-I/d2 -O -release -inline -c source files> 
-of/build_dir/build/CMakeFiles/qtdgui.dir/qtdgui0.o


anyone experiencing this issue using similar command line 
params(-O -release -inline -c)?


any suggestions on how to investigate this further?


Possibly related: https://github.com/Hackerpilot/DCD/issues/79


Re: DIP 50 - AST macros

2013-11-17 Thread Walter Bright

On 11/17/2013 7:14 PM, deadalnix wrote:

As I understand it, Timon choosed that syntax simply to demonstrate the
limitation of your proposal using a similar syntax. Not to propose a syntax.


Ok, then I'm not seeing what AST macros do that lazy parameters / template 
overloading / mixin templates do not?


Re: DIP 45 - approval discussion

2013-11-17 Thread Walter Bright

On 11/15/2013 7:31 AM, Benjamin Thaut wrote:

Am 15/11/2013 08:27, schrieb Walter Bright:

On 11/14/2013 3:37 AM, Benjamin Thaut wrote:

Am 14.11.2013 11:28, schrieb Walter Bright:

On 11/12/2013 2:23 PM, Martin Nowak wrote:

One possibility is modules listed on the command line are regarded as
export==dllexport, and other modules as export==dllimport.

This of course means that functions may wind up going through the
dllimport indirection even for calling functions in the same dll, but it
should work.


That doesns't work for the case where a dll "A" uses a dll "B".
In that case export has to mean "dllexport" for all modules of A but
"dllimport"
for all modules of B.


I don't follow. If you're compiling A, you're specifying A modules on
the command line, and those will regard the B modules as dllimport.



Ok now I understand what you suggest. So basically you want to do the exact same
as DIP 45 just giving the compiler parameter a different name.

But you still didn't give a solution for the problem that the compiler does not
know which modules are part of a shared library, which are part of a static
library and which are part of the exeuctable.


I thought this did cover it. "export" in imported modules that are not listed on 
the compiler command line is regarded as "dllimport". "export" in modules listed 
on the commmand line is regarded as "dllexport".



And please also consider single file compilation.


And yes, this would mean that a function may wind up calling a function in its 
own dll through the dll import thunk -- however -- this thunk is often provided 
by the import library, and so wouldn't be there for intra-dll calls.




Re: DIP 45 - approval discussion

2013-11-17 Thread Walter Bright

On 11/15/2013 8:08 AM, Daniel Murphy wrote:

"Walter Bright"  wrote in message
news:l64lji$2buh$1...@digitalmars.com...

On 11/15/2013 12:00 AM, Daniel Murphy wrote:

"Walter Bright"  wrote in message
news:l64imh$27na$1...@digitalmars.com...


Also, at least on Windows, you can call functions in a DLL without
saying
dllimport on them and suffering a layer of indirection. The magic
happens
in the import library, which provides the relevant thunk. It's about 15
years since I worked on this stuff, so I might be a bit fuzzy on the
details.


The symbol in the import library just translates to an import table
indirection.


Yes, meaning the compiler doesn't have to do it if the import library is
set up correctly. (implib.exe should do it.)



Right, I was saying the indirection still exists.




No, not really, at least not on Windows. Try calling a function in the Windows 
API and disassemble it. You'll see a direct call.


Re: DIP 50 - AST macros

2013-11-17 Thread deadalnix

On Monday, 18 November 2013 at 05:05:02 UTC, Walter Bright wrote:

On 11/17/2013 7:14 PM, deadalnix wrote:
As I understand it, Timon choosed that syntax simply to 
demonstrate the
limitation of your proposal using a similar syntax. Not to 
propose a syntax.


Ok, then I'm not seeing what AST macros do that lazy parameters 
/ template overloading / mixin templates do not?


2 things. First, they can act on statement or declaration. Simply 
not with the proposed syntax. Second they can reflect what is 
passed as argument and act accordingly, when the lazy expression 
solution cannot.


Re: DIP 45 - approval discussion

2013-11-17 Thread Walter Bright

On 11/15/2013 7:34 AM, Benjamin Thaut wrote:

Am 15/11/2013 08:32, schrieb Walter Bright:

It's not that bad. Phobos can be built by specifying all the files on
the command line.

Also, at least on Windows, you can call functions in a DLL without
saying dllimport on them and suffering a layer of indirection. The magic
happens in the import library, which provides the relevant thunk. It's
about 15 years since I worked on this stuff, so I might be a bit fuzzy
on the details.


I know that. And we are using that fact in DIP 45. For data symbols we did
suggest a similar behaviour that has to be implemented by us first. And yes we
need data symbols, because of all the implicit data symbols the D programming
language generates. TypeInfos, vtables etc.

I suggest that you read all the links I gave for further reading in DIP 45 and
DIP 45 again, because you are pretty close to what we suggested in DIP 45
without actually realizing it.


I'm very much against the suggested rewriting of obj files to strip the export 
records :-)




Re: DIP 45 - approval discussion

2013-11-17 Thread Walter Bright

On 11/16/2013 7:22 AM, Martin Nowak wrote:

On 11/15/2013 08:32 AM, Walter Bright wrote:


It's not that bad. Phobos can be built by specifying all the files on
the command line.


That's the essential trade-off we have to make.
Either we come up with a clumsy way to list all DLL modules during compilation
(separate compilation) or we add some compiler logic to auto-import exported
data symbols. In the latter case compiled object files can be used for a DLL AND
a static library.
We found the second approach to be more attractive.


Why? dmd is very fast at compiling. I'm not sure what is being saved here.



The only issue is that most of the time one wants to strip exports when using
objects for a static library.


That's a pretty big issue (to me, anyway).


Re: DIP 45 - approval discussion

2013-11-17 Thread Daniel Murphy
"Walter Bright"  wrote in message 
news:l6c7gt$27fc$1...@digitalmars.com...
>>
>> Right, I was saying the indirection still exists.
>>
>>
>
> No, not really, at least not on Windows. Try calling a function in the 
> Windows API and disassemble it. You'll see a direct call.

You sure about that?

src:

import core.sys.windows.windows;

void main()
{
auto x = GetModuleHandleA(null);
}

obj:

__Dmain PROC NEAR
;  COMDEF __Dmain
push0   ;  _ 6A, 00
calldword ptr [__imp__GetModuleHandleA@4]   ; 0002 _ FF. 15, 
(segrel)
xor eax, eax  ; 0008 _ 31. C0
ret ; 000A _ C3
__Dmain ENDP


exe:

?_0072  LABEL NEAR
push0   ; 00402010 _ 6A, 00
calldword ptr [?_0067]  ; 00402012 _ FF. 15, 
00401794(d)
xor eax, eax; 00402018 _ 31. C0
ret ; 0040201A _ C3




Re: Worst-case performance of quickSort / getPivot

2013-11-17 Thread Chris Cain
On Monday, 18 November 2013 at 02:36:40 UTC, Andrei Alexandrescu 
wrote:

I think that's a terrific start!


Thanks :)


(Not sure I understand where the 30 minutes come from...)


On my computer (`-release -inline -noboundscheck` ... `-O` is 
omitted because it removes the summation all together since it 
isn't used anywhere):


---
auto benchmarkSumming() {
auto proportions = iota(150_000);
// copied from std.random.dice:
auto sum = reduce!("(assert(b >= 0), a + b)")(0.0, 
proportions.save);

}

auto bench = benchmark!benchmarkSumming(1_000);
writeln(bench.front.msecs, " ms");
---

Gives me 1618 ms. ~150_000 is the number of entries in the 
probability table (number of surnames), and generating the sum 
for 1_000 names would take about that long. I really wanted 1 
million names to sort, so scaling that up would mean 1618 
seconds, which is nearly 27 minutes.


That summation doesn't change, hence the importance of caching it 
for repeated runs. That's about 27 minutes of completely wasted 
work.


That all isn't to say that std.random.dice is fundamentally 
flawed (it's fine and perfectly suited for one-shot dice rolls 
and is, in fact, a bit better than a DiceRange for that task) but 
a DiceRange is really needed to efficiently do repeated rolls 
like this.


Jeez, we're talking about maximizing sorting efficiency, and now 
I've gone off talking about making dice rolls faster too! :)


Dynamic Sized Stack Frames

2013-11-17 Thread Jonathan Marler
When the life of a buffer is tied to the life of a function, it 
makes sense to allocate it on the stack because it's much easier 
to pop memory off the stack then to garbage collect it later.  
For example if had a function that required a buffer to read from 
some sort of input, it would make sense to allocate the buffer on 
the stack, i.e.


run() {
  byte buffer[BUFF_SIZE];
  while(running) {
int n = read(buffer, BUFF_SIZE);
...

However, what if I want the size of the buffer to be configurable 
at runtime?  Maybe this function is part of some sort of library 
that allows the user to specify a smaller or larger buffer to 
optimize the number of read calls.  As far as I know, in order 
for a library to have a buffer whose length isn't known until 
runtime, it would require allocating it on the heap.  Then we 
have a buffer that is only needed for the life of this function 
but still needs to be garbage collected.  The solution I came up 
with for this type of scenario is what I am calling "dynamic 
sized stack frames".  The buffer could be allocated on the stack, 
and the length of the buffer could be pushed on the stack as well 
so the function can properly clean it when it returns.  I don't 
know of any languages that support this type of functionality, 
does anyone know if there is one?  Maybe this is a new feature 
that D could introduce in a later version?


As far as implementation there's a couple of ways this could be 
done...the caller could allocate the buffer, or the caller could 
just push the buffer length as an extra parameter and the callee 
could be solely responsible for allocating the buffer and 
cleaning it up afterwards (as I'm writing this that sounds like a 
safer option).  Anyway I wanted to put this idea out there and 
see what people's thoughts are.  Thanks in advance for any 
helpful opinions or criticism.


Fast dice rolls

2013-11-17 Thread Chris Cain

On Monday, 18 November 2013 at 03:10:22 UTC, bearophile wrote:

See also:
http://d.puremagic.com/issues/show_bug.cgi?id=5849

Bye,
bearophile


Nice one, bearophile. You called it.

I've done a bit of testing with the solution you've attached to 
that bug report. It's 10-50% faster than my solution depending on 
how many proportions are provided (I've tested as much as 
500_000). Despite mine being O(log(n)) to find the next dice roll 
and your solution being O(1), there really isn't that much of a 
difference. Strangely (to me), your solution actually seems to 
grow faster than mine as the number of proportions grow larger 
despite the proof claiming it should grow slower (although 
testing for > 500_000 seems pointless until actual data demands 
it).


As far as integer proportions go, I think my solution might be 
better overall because it's effectively equivalent to the current 
dice implementation (meaning it could be used as a near drop-in 
replacement and get precisely the same results, as long as the 
proportions are floating point values) and the code is simpler 
and avoids hiding any real bugs.


(That said, the original implementation has a bug that was easy 
to spot on my second comb-through)

--- in popFront:
// Return the number of elements that are not strictly greater 
than point
_front = _propCumSumSorted.length - 
_propCumSumSorted.upperBound!pol(point).length;

---

On the other hand, a cumulative sum array approach (like mine and 
the std.random.dice implementation) will likely suffer major 
precision issues with floating point numbers. Especially if the 
first proportions are very large and the later proportions are 
very small. I haven't closely looked through the algorithm you're 
using to see whether it fares better, but I can't imagine it 
doing much worse.


Re: Dynamic Sized Stack Frames

2013-11-17 Thread Chris Cain
On Monday, 18 November 2013 at 05:46:48 UTC, Jonathan Marler 
wrote:
However, what if I want the size of the buffer to be 
configurable at runtime?


core.stdc.stdlib has "alloca" ... it allocates a runtime 
configurable amount of memory on the stack. There's a few threads 
on doing some tricks with it. For instance, if you use it as a 
default parameter, then it'll allocate on the caller's stack so 
you can actually return a ref to it from the function :)


Re: Dynamic Sized Stack Frames

2013-11-17 Thread Jonathan Marler

On Monday, 18 November 2013 at 05:58:25 UTC, Chris Cain wrote:
On Monday, 18 November 2013 at 05:46:48 UTC, Jonathan Marler 
wrote:
However, what if I want the size of the buffer to be 
configurable at runtime?


core.stdc.stdlib has "alloca" ... it allocates a runtime 
configurable amount of memory on the stack. There's a few 
threads on doing some tricks with it. For instance, if you use 
it as a default parameter, then it'll allocate on the caller's 
stack so you can actually return a ref to it from the function 
:)


Are you serious?  It seems the more I learn about D the more 
impressed I become.  Thanks for the quick response :)


try/catch idiom in std.datetime

2013-11-17 Thread Andrei Alexandrescu
I'm of a stance that good code should have many throws and few try/catch 
statements. I was curious how Phobos does there, so I ran a few simple 
greps around.


Seems like there's about 145 try statements in Phobos. Of these, more 
than a third (51) belong to std.datetime.


Looks like the following idiom is often present in std.datetime:

@property FracSec fracSec() const nothrow
{
try
{
auto hnsecs = removeUnitsFromHNSecs!"days"(adjTime);

if(hnsecs < 0)
hnsecs += convert!("hours", "hnsecs")(24);

hnsecs = removeUnitsFromHNSecs!"seconds"(hnsecs);

return FracSec.from!"hnsecs"(cast(int)hnsecs);
}
catch(Exception e)
assert(0, "FracSec.from!\"hnsecs\"() threw.");
}

(It may seem I chose this example to discuss the "let's insert one empty 
line every other line" idiom. I didn't.) Essentially the code relies on 
calls that may generally throw, but calls them with parameters that 
should guarantee there won't be any throwing. In wanting to offer as 
many "nothrow" guarantees as possible, the code ends up inserting these 
try/catch statements - seemingly needlessly.


This is quite heavy-handed, so I was wondering what we could do to 
improve on this. I thought of the following possibility:


@property FracSec fracSec() const nothrow
{
scope(failure) assert(0, "FracSec.from!\"hnsecs\"() threw.");
auto hnsecs = removeUnitsFromHNSecs!"days"(adjTime);
if (hnsecs < 0)
hnsecs += convert!("hours", "hnsecs")(24);
hnsecs = removeUnitsFromHNSecs!"seconds"(hnsecs);
return FracSec.from!"hnsecs"(cast(int)hnsecs);
}

This at least leaves the normal path unaltered and deals with the 
unexpected in a more isolated manner. I was pleased that the changed 
code compiled just fine. My happiness was short-lived because before 
long I figured this compiles as well:


...
scope(failure) {}
...

So it's not that the compiler cleverly figured the function will not 
throw. Instead, the compiler considers everything dandy as soon as it 
sees a scope(failure), no matter of the statement it controls. I'll 
report that bug soon.


What would be the best approach here?

0. Do nothing. This is as good as it gets.

1. Fix scope(failure) and then use it.

2. Relax the nothrow guarantees. After all, nothrow is opt-in.

3. Look into API changes that add nothrow, narrower functions to the 
more general ones that may throw.


4. ...?


Andrei


Re: try/catch idiom in std.datetime

2013-11-17 Thread Jonathan M Davis
On Sunday, November 17, 2013 22:32:46 Andrei Alexandrescu wrote:
> (It may seem I chose this example to discuss the "let's insert one empty
> line every other line" idiom. I didn't.)

That code's like that just because I like to put empty lines before and after 
if statements and before return statements, as I think that that improves 
legibility. Short functions like that suffer as a result, because they end up 
with a larger proportion of the lines being empty than is normal. I'm always a 
bit torn on that, because I don't like having quite that many empty lines in 
so few lines, but I also don't like not having space around if statements and 
return statements. I never feel like I can find a nice, legible balance with 
short functions like that.

> Essentially the code relies on
> calls that may generally throw, but calls them with parameters that
> should guarantee there won't be any throwing. In wanting to offer as
> many "nothrow" guarantees as possible, the code ends up inserting these
> try/catch statements - seemingly needlessly.

Yeah, I tried to use pure and nothrow fairly heavily in std.datetime and ran 
into a number of hurdles like this. Fortunately, the situation has improved 
somewhat (e.g. format can finally be pure at least some of the time), but it 
does show that it's not always easy to use pure or nothrow even when it 
arguably should be.

> What would be the best approach here?
> 
> 0. Do nothing. This is as good as it gets.

Well, it works just fine as-is, but it would be kind of nice to be able to 
solve the problem in a less verbose manner (though you're talking about saving 
only a few lines of code).

> 1. Fix scope(failure) and then use it.

Fine with me. I might still favor the try-catch in cases where you can clearly 
wrap it around one function call, because then you avoid problems where you've 
accidentally effectively marked the whole function as trusted-nothrow when you 
only want to mark one function call that way. But you could do the same thing 
with scope(failure) and a new scope. The main problem is when you can't really 
put the calls that need to be trusted-nothrow inside a new scope, in which 
case, you're forced to mark the whole function (or at least large portions of 
it) as trusted-nothrow by wrapping it all in a try-catch or scope(failure).

> 2. Relax the nothrow guarantees. After all, nothrow is opt-in.

I'm not quite sure what you're suggesting here. Make it so that nothrow does 
checking at runtime instead of compile time? That would be moving in the 
direction of C++ and throw specifiers (or more precisely, noexcept, I suppose). 
If that's what you're suggesting, I'd be very much against that. I think that 
the fact D's nothrow is statically checked is a huge advantage over C++'s 
noexcept. The fact that you have to sometimes use try-catch blocks (or 
scope(failure) if that works) to make it work is essentially the same as 
needing @trusted to make some stuff @safe. I wouldn't want to throw away 
@trusted in favor of making @safe more lax either (though that's almost all 
static checking which can't be done at runtime, unlike with noexcept).

Of course, there's no way to verify trusted-nothrow except at runtime like 
std.datetime is doing with try-catch and assertions, but most code _can_ be 
checked statically (including the code that calls the functions that use the 
try-catch-assert idiom to be able to be nothrow), and that's much more 
pleasant, particularly because it's actually checked by the compiler that way 
rather than just blowing up on you at runtime.

I suppose that if it were considered annoying enough to have to use try-catch 
blocks or scope(failure), we could add a nothrow equivalent to @trusted to 
mark functions with, though it's already been argued that @trusted should be 
on pieces of a function rather than on the whole function, and it would 
arguably be better to mark sections of a function as trusted-nothrow rather 
than the entire thing. try-catch lets us do that already, but it might be nice 
to be able to do the equivalent of

 @property FracSec fracSec() const nothrow
 {
 trusted-nothrow
 {
 auto hnsecs = removeUnitsFromHNSecs!"days"(adjTime);

 if(hnsecs < 0)
 hnsecs += convert!("hours", "hnsecs")(24);

 hnsecs = removeUnitsFromHNSecs!"seconds"(hnsecs);

 return FracSec.from!"hnsecs"(cast(int)hnsecs);
 }
 }

and have the compiler insert the catch and assertion for you.

> 3. Look into API changes that add nothrow, narrower functions to the
> more general ones that may throw.

In some cases (e.g. format) that's probably a good idea, but I don't think 
that really scales. In some cases, it would be well worth it, whereas in 
others, it would just be better to use try-catch in the few places where you 
know the function won't throw, because it would be quite rare to be able to 
guarantee that it wouldn't throw. It's also not pleasa

Re: Dynamic Sized Stack Frames

2013-11-17 Thread Mike Parker

On 11/18/2013 3:03 PM, Jonathan Marler wrote:

On Monday, 18 November 2013 at 05:58:25 UTC, Chris Cain wrote:



core.stdc.stdlib has "alloca" ... it allocates a runtime configurable
amount of memory on the stack. There's a few threads on doing some
tricks with it. For instance, if you use it as a default parameter,
then it'll allocate on the caller's stack so you can actually return a
ref to it from the function :)


Are you serious?  It seems the more I learn about D the more impressed I
become.  Thanks for the quick response :)


While D is awesome, alloca is actually a standard C function, which is 
why it's in the core.stdc.stdlib package. You have full access to the C 
standard library from D.