Re: Logical const

2010-11-28 Thread Rainer Deyke
On 11/28/2010 17:29, bearophile wrote:
 Peter Alexander:
 
 If I give some const object to a function:
 
 void render(const GameObject);
 
 GameObject obj; render(obj);
 
 I can be sure that my object will come back unmodified.
 
 render() is free to modify the objects contained inside GameObject,
 because that const isn't transitive.

This is simply not true.  Observe:

  class InnerObject {
  public:
    void f();  // non-const
  };

  class GameObject {
  public:
    void f() const {
      this->inner.f(); // Error: const object modified.
    }
  private:
    InnerObject inner;
  };

In C++, const is transitive for direct members.  It is only intransitive
for pointers/references, and even those can be made transitive through
the use of a transitive-const smart pointer class.


-- 
Rainer Deyke - rain...@eldwood.com


Re: Spec#, nullables and more

2010-11-26 Thread Rainer Deyke
On 11/26/2010 10:28, Bruno Medeiros wrote:
 Yes, Walter's statement that it is impossible for a null pointer to
 cause a security vulnerability is (likely) incorrect.
 But his point at large, considering the discussion that preceded the
 comment, was that null pointers are utterly insignificant with regards
 to security vulnerabilities.

I really hate this way of thinking.  Security vulnerabilities are binary
- either they exist or they don't.  Every security vulnerability seems
minor until it is exploited.

Yes, some security vulnerabilities are more likely to be exploited than
others.  But instead of rationalizing about how significant each
individual security vulnerability is, isn't it better to just fix all of
them?

(I know, I'm a hopeless idealist.)


-- 
Rainer Deyke - rain...@eldwood.com


Re: std.algorithm.remove and principle of least astonishment

2010-11-22 Thread Rainer Deyke
On 11/22/2010 00:08, Andrei Alexandrescu wrote:
 On 11/21/10 11:59 PM, Rainer Deyke wrote:
 That the range view and the array view provide direct access to the same
 data.
 
 Where do ranges state that assumption?

Are you saying that arrays of T do not function as ranges of T when T is
not a character type?

 One of the useful features of most arrays is that an array of T can be
 treated as a range of T.  However, this feature is missing for arrays of
 char and wchar.
 
 This is not a guarantee by ranges, it's just a mistaken assumption.

I'm not saying that this feature is guaranteed for all arrays, because
it clearly isn't.  I'm saying that this feature is present for T[] where
T is not a character type, and missing for T[] where T is a character
type.  When writing code that is not intended to operate on character
data, it is natural to use this feature.  The code then breaks when the
code is used with character data.

 No, I'm saying that I write generic code that declares T[] and then
 passes it off to a function that operates on ranges, or to a foreach
 loop.
 
 A function that operates on ranges would have an appropriate constraint
 so it would work properly or not at all. foreach works fine with all
 arrays.

It works, but produces different results when iterating over a
character array than when iterating over a non-character array.  Code
can compile, have well-defined behavior, run, produce correct results in
most cases, and still be wrong.

 Let's say I have an array and I want to iterate over the first ten
 items.  My first instinct would be to write something like this:

foreach (item; array[0 .. 10]) {
  doSomethingWith(item);
}

 Simple, natural, readable code.  Broken for arrays of char or wchar, but
 in a way that is difficult to detect.
 
 Why is it broken? Please try it to convince yourself of the contrary.

I see, foreach still iterates over code units by default.  Of course,
this means that foreach over ranges doesn't work with strings, which in
turn means that algorithms that use foreach over ranges are broken.
Observe:

  import std.stdio;
  import std.algorithm;

  void main() {
    writeln(count!("true")("日本語")); // Three characters.
  }

Output (compiled with Digital Mars D Compiler v2.050):
  9

 Fine. Use T[] generically in conjunction with the array primitives. If
 you plan to use them with the range primitives, you do as ranges do.

If arrays can't operate as ranges, what's the point of giving them a
range interface?

 Easy:
- string_t becomes a keyword.
- Syntactically speaking, string_t!T is the name of a type when T is a
 type.
- For every built-in character type T (including const and immutable
 versions), the type currently called T[] is now called string_t!T, but
 otherwise maintains all of its current behavior.
- For every other type T, string_t!T is an error.
- char[] and wchar[] (including const and immutable versions) are
 plain arrays of code units, even when viewed as a range.

 It's not my preferred solution, but it's easy to explain, it fixes the
 main problem with the current system, and it only costs one keyword.

 (I'd rather treat string_t as a library template with compiler support
 and rename it to String, but then it wouldn't be a built-in string.)
 
 I very much prefer the current state of affairs.

Care to support that with some arguments, or is it just a purely
subjective preference?


-- 
Rainer Deyke - rain...@eldwood.com


Re: std.algorithm.remove and principle of least astonishment

2010-11-22 Thread Rainer Deyke
On 11/22/2010 03:57, Jonathan M Davis wrote:
 On Monday 22 November 2010 02:01:38 Rainer Deyke wrote:
 Are you saying that arrays of T do not function as ranges of T when T is
 not a character type?
 
 I believe that he means that you either use them as ranges or you use them as 
 arrays. Mixing the two sets of operations is asking for trouble.

It is impossible to have a non-empty array without at some point using
an array operation.  If you can't mix array operations with range
operations, then you can't use arrays as ranges.


-- 
Rainer Deyke - rain...@eldwood.com


Re: Basic coding style

2010-11-22 Thread Rainer Deyke
On 11/22/2010 11:03, bearophile wrote:
 If you write Python or C# code that other people are supposed to use,
 then people will surely tell you that your coding style is bad, if
 you don't follow their basic coding styles.

Python is a bad example to mention, methinks.  Even C++ has a more
consistent style.


-- 
Rainer Deyke - rain...@eldwood.com


Re: Basic coding style

2010-11-22 Thread Rainer Deyke
On 11/22/2010 13:25, spir wrote:
 On Mon, 22 Nov 2010 11:24:20 -0700 Rainer Deyke rain...@eldwood.com
 wrote:
 
 On 11/22/2010 11:03, bearophile wrote:
 If you write Python or C# code that other people are supposed to
 use, then people will surely tell you that your coding style is
 bad, if you don't follow their basic coding styles.
 
 Python is a bad example to mention, methinks.  Even C++ has a more 
 consistent style.
 
 ???
 
 [What is such a wild judgement supposed to mean, Rainer? Python
 certainly has one of the most sensible and consistent coding styles
 in practice: http://www.python.org/dev/peps/pep-0008/. I do _not_
 agree with half of it ;-) but it is consistent and sensible (and I
 used it for all published code.)]

I'm talking about naming conventions, which I think are the most
important part of a coding style.  The indentation level of a third
party library doesn't affect my own code, so I don't care about it.  I
do care about naming conventions, because the naming conventions used in
third party libraries affect my own code.

The C++ standard library has a fairly consistent naming convention.  All
words are lower case, usually separated by underscores, sometimes
prefixed or postfixed with individual letters carrying additional meaning.

In Python, the general rule is that classes use CapitalizedWords,
constants use UPPER_CASE (with underscores), and everything else uses
lowercase (usually without underscores).  However, these rules are
broken all the time by Python's own standard library.  For example:
  - Almost all built-in classes use lowercase.  In some cases this is
for historical reasons because the name was originally used for a
function.  However, even new built-in classes tend to use lowercase.
  - Built-in constants 'None', 'True', 'False'.
  - Some functions (e.g. 'raw_input') use underscores while most don't.
 This is allowed by the style guide, but it's still an inconsistency.


-- 
Rainer Deyke - rain...@eldwood.com


Re: std.algorithm.remove and principle of least astonishment

2010-11-22 Thread Rainer Deyke
On 11/22/2010 11:55, Andrei Alexandrescu wrote:
 On 11/22/10 4:01 AM, Rainer Deyke wrote:
 I see, foreach still iterates over code units by default.  Of course,
 this means that foreach over ranges doesn't work with strings, which in
 turn means that algorithms that use foreach over ranges are broken.
 Observe:

import std.stdio;
import std.algorithm;

void main() {
   writeln(count!("true")("日本語")); // Three characters.
}

 Output (compiled with Digital Mars D Compiler v2.050):
9
 
 Thanks.
 
 http://d.puremagic.com/issues/show_bug.cgi?id=5257

I think this bug is a symptom of a larger issue.  The range abstraction
is too fragile.  If even you can't use the range abstraction correctly
(in the library that defines this abstraction no less), how can you
expect anyone else to do so?

At the very least, this is a sign that std.algorithm needs more thorough
testing, and/or a thorough code review.  This is far from the only use of
foreach on a range in std.algorithm.  It just happens to be the first
example I found to illustrate my point.


-- 
Rainer Deyke - rain...@eldwood.com


Re: std.algorithm.remove and principle of least astonishment

2010-11-21 Thread Rainer Deyke
On 11/21/2010 11:23, Andrei Alexandrescu wrote:
 On 11/20/10 9:42 PM, Rainer Deyke wrote:
 On 11/20/2010 16:58, Andrei Alexandrescu wrote:
 The parallel does not stand scrutiny. The problem with vector<bool> in
 C++ is that it implements no formal abstraction, although it is a
 specialization of one.

 The problem with std::vector<bool> is that it pretends to be a
 std::vector, but isn't.  If it was called dynamic_bitset instead, nobody
 would have complained.  char[] has exactly the same problem.
 
 char[] does not exhibit the same issues that vector<bool> has. The
 situation is very different, and again, trying to reduce one to another
 misses a lot of the picture.

I agree that there are differences.  For one thing, if you iterate over
a std::vector<bool> you get actual booleans, albeit through an extra
layer of indirection.  If you iterate over char[] you might get chars or
you might get dchars depending on the method you use for iterating.

char[] isn't the equivalent of std::vector<bool>.  It's worse.  char[]
is the equivalent of a vector<bool> that keeps the current behavior of
std::vector<bool> when iterating through iterators, but gives access to
bytes of packed booleans when using operator[].

 vector<bool> hides representation and in doing so becomes non-compliant
 with vector<T> which does expose representation. Worse, vector<bool> is
 not compliant with any concept, express or implied, which makes
 vector<bool> virtually unusable with generic code.

The ways in which std::vector<bool> differs from any other vector are
well understood.  It uses proxies instead of true references.  Its
iterators meet the requirements of input/output iterators (or in boost
terms, readable, writable iterators with random access traversal).  Any
generic code written with these limitations in mind can use
std::vector<T> freely.  (The C++ standard library doesn't play nicely
with std::vector<bool>, but that's another issue entirely.)

std::vector<bool> is a useful type, it just isn't a std::vector.  In
that respect, its situation is analogous to that of char[].

 It may be wise in fact to start using D2 and make criticism grounded in
 reality that could help us improve the state of affairs.

 Sorry, but no.  It would take a huge investment of time and effort on my
 part to switch from C++ to D.  I'm not going to make that leap without
 looking first, and I'm not going to make it when I can see that I'm
 about to jump into a spike pit.
 
 You may rest assured that if anything, strings are not a problem.

I'm not concerned about strings, I'm concerned about *arrays*.  Arrays
of T, where T may or may not be a character type.  I see that you
ignored my Vector!char example yet again.

Your assurances aren't increasing my confidence in D, they're decreasing
my confidence in your judgment (and by extension my confidence in D).


-- 
Rainer Deyke - rain...@eldwood.com


Re: std.algorithm.remove and principle of least astonishment

2010-11-21 Thread Rainer Deyke
On 11/21/2010 17:31, Andrei Alexandrescu wrote:
 On 11/21/10 6:12 PM, Rainer Deyke wrote:
 I agree that there are differences.  For one thing, if you iterate
 over a std::vectorbool  you get actual booleans, albeit through an
 extra layer of indirection.  If you iterate over char[] you might get
 chars or you might get dchars depending on the method you use for
 iterating.
 
 This is sensible because a string may be seen as a sequence of code
 points or a sequence of code units. Either view is useful.

I don't dispute that either view is useful.

 char[] isn't the equivalent of std::vector<bool>.  It's worse.
 char[] is the equivalent of a vector<bool> that keeps the current
 behavior of std::vector<bool> when iterating through iterators, but
 gives access to bytes of packed booleans when using operator[].
 
 I explained why char[] is better than vector<bool>. Ignoring the
 explanation and restating a fallacious conclusion based on an
 overstretched parallel does hardly much to push forward the discussion.

I'm not interested in discussing if char[] is overall a better data
structure than std::vector<bool>.  I'm focusing on one particular
property of both.

std::vector<bool> fails to provide some of the guarantees of all other
instances of std::vector<T>.  This means that generic code that uses
std::vector<T> needs to take special consideration of std::vector<bool>
if it wants to work correctly when T = bool.  This is an indisputable fact.

char[] and wchar[] fail to provide some of the guarantees of all other
instances of T[].  This means that generic code that uses T[] needs to
take special consideration of char[] if it wants to work correctly when
T = char.  This is also an indisputable fact.

I don't think it's much of a stretch to draw an analogy from
std::vector<bool> to char[] based on this.  However, even if
std::vector<bool> did not exist, I would still consider this a design
flaw of char[].

 Again: code units _are_ well-defined, useful to have access to, and good
 for a variety of uses. Please understand this.

Again, I understand this and don't dispute it.  It's a complete
non-sequitur to this discussion.  I'm not arguing against the string
type providing access to both code points and code units.  I'm arguing
against the string type having the name of the array when it doesn't
share the behavior of an array.

 I'm not concerned about strings, I'm concerned about *arrays*.
 Arrays of T, where T may or not be a character type.  I see that you
 ignored my Vector!char example yet again.
 
 I sure have replied to it, but probably my reply hasn't been read.
 Please allow me to paste it again:
 
 When you define your abstractions, you are free to decide how you
 want to go about them. The D programming language makes it
 unequivocally clear that char[] is an array of UTF-8 code units that
 offers a bidirectional range of code points. Same about wchar[]
 (replace UTF-8 with UTF-16). dchar[] is an array of UTF-32 code
 points which are equivalent to code units, and as such is a full
 random-access range.
 
 So it's up to you what Vector!char does. In D char[] is an array of code
 units that can be iterated as a bidirectional range of code points. I
 don't see anything cagey about that.

Ah, I did read that, but it doesn't address my concerns about
Vector!char at all.  I'm aware that I can write Vector!char to act like
a container of code units.  I'm also aware that I can write Vector!char
to automatically translate to code points.  My concerns are these:

  - When writing code that uses T[], it is often natural to mix
range-based access and index-based access, with the assumption that both
provide direct access to the same underlying data.  However, with char[]
this assumption is incorrect, as the underlying data is transformed when
viewing the array as a range.  This means that generic code that uses
T[] must take special consideration of char[] or it may unexpectedly
produce incorrect results when T = char.

  - char[] sets a precedent of Container!char providing a dchar range
interface.  Other containers must choose to either follow this precedent
or to avoid it.  Either choice may require extra work when implementing
the container.  Either choice can lead to surprising behavior for the
user of the container.


-- 
Rainer Deyke - rain...@eldwood.com


Re: std.algorithm.remove and principle of least astonishment

2010-11-21 Thread Rainer Deyke
On 11/21/2010 21:56, Andrei Alexandrescu wrote:
 On 11/21/10 22:09 CST, Rainer Deyke wrote:
 On 11/21/2010 17:31, Andrei Alexandrescu wrote:
 char[] and wchar[] fail to provide some of the guarantees of all other
 instances of T[].
 
 What exactly are those guarantees?

That the range view and the array view provide direct access to the same
data.

One of the useful features of most arrays is that an array of T can be
treated as a range of T.  However, this feature is missing for arrays of
char and wchar.

- When writing code that uses T[], it is often natural to mix
 range-based access and index-based access, with the assumption that both
 provide direct access to the same underlying data.  However, with char[]
 this assumption is incorrect, as the underlying data is transformed when
 viewing the array as a range.  This means that generic code that uses
 T[] must take special consideration of char[] or it may unexpectedly
 produce incorrect results when T = char.
 
 What you're saying is that you write generic code that requires T[], and
 then the code itself uses front, popFront, and other range-specific
 functions in conjunction with it.

No, I'm saying that I write generic code that declares T[] and then
passes it off to a function that operates on ranges, or to a foreach loop.

 But this is exactly the problem. If you want to use range primitives,
 you submit to the requirement of ranges. So you write the generic
 function to ask for ranges (with e.g. isForwardRange etc). Otherwise
 your code is incorrect.

Again, my generic function declares the array as a local variable or a
member variable.  It cannot declare a generic range.

 If you want to work with arrays, use a[0] to access the front, a[$ - 1]
 to access the back, and a = a[1 .. $] to chop off the first element of
 the array. It is not AT ALL natural to mix those with a.front, a.back
 etc. It is not - why? because std.range defines them with specific
 meanings for arrays in general and for arrays of characters in
 particular. If you submit to use std.range's abstraction, you submit to
 using it the way it is defined.

It absolutely is natural to mix these in code that is written without
consideration for strings, especially when you consider that foreach
also uses the range interface.

Let's say I have an array and I want to iterate over the first ten
items.  My first instinct would be to write something like this:

  foreach (item; array[0 .. 10]) {
doSomethingWith(item);
  }

Simple, natural, readable code.  Broken for arrays of char or wchar, but
in a way that is difficult to detect.

 So: if you want to use char[] as an array with the built-in array
 interface, no problem. If you want to use char[] as a range with the
 range interface as defined by std.range, again no problem. But asking
 for one and then surreptitiously using the other is simply incorrect
 code. You can't use std.range while at the same time complaining you
 can't be bothered to read its docs.

This would sound reasonable if I were using char[] directly.  I'm not.
I'm using T[] in a generic context.  I may not have considered the case
of T = char when I wrote the code.  The code may even have originally
used Widget[] before I decided to make it generic.

 I challenge you to define an alternative built-in string that fares
 better than string & Comp. Before long you'll be overwhelmed by the
 various necessities imposed by your constraints.

Easy:
  - string_t becomes a keyword.
  - Syntactically speaking, string_t!T is the name of a type when T is a
type.
  - For every built-in character type T (including const and immutable
versions), the type currently called T[] is now called string_t!T, but
otherwise maintains all of its current behavior.
  - For every other type T, string_t!T is an error.
  - char[] and wchar[] (including const and immutable versions) are
plain arrays of code units, even when viewed as a range.

It's not my preferred solution, but it's easy to explain, it fixes the
main problem with the current system, and it only costs one keyword.

(I'd rather treat string_t as a library template with compiler support
and rename it to String, but then it wouldn't be a built-in string.)


-- 
Rainer Deyke - rain...@eldwood.com


Re: std.algorithm.remove and principle of least astonishment

2010-11-20 Thread Rainer Deyke
On 11/20/2010 05:12, spir wrote:
 On Fri, 19 Nov 2010 22:04:51 -0700 Rainer Deyke rain...@eldwood.com
 wrote:
 You don't see the advantage of generic types behaving in a generic 
 manner?  Do you know how much pain std::vector<bool> caused in
 C++?
 
 I asked this before, but I received no answer.  Let me ask it
 again. Imagine a container Vector!T that uses T[] internally.  Then
 consider Vector!char.  What would be its correct element type?
 What would be its correct behavior during iteration?  What would be
 its correct response when asked to return its length?  Assuming you
 come up with a coherent set of semantics for Vector!char, how would
 you implement it?  Do you see how easy it would be to implement it
 incorrectly?
 
 Hello Rainer,
 
 The original proposal by Bruno would simplify some project I have in
 mind (namely, a higher-level universal text type already evoked). The
 issues you point to intuitively seem relevant to me, but I cannot
 really understand any. Would be kind enough and expand a bit on each
 question? (Thinking of people who know nothing of C++ -- yes, they
 exist ;-)

std::vector<bool> in C++ is a specialization of std::vector that packs
eight booleans into a byte instead of storing each element separately.
It doesn't behave exactly like other std::vectors and technically
doesn't meet the C++ requirements of a container, although it tries to
come as close as possible.  This means that any code that uses
std::vector<bool> needs to be extra careful to take those differences
into account.  This is especially an issue when dealing with generic code
that uses std::vector<T>, where T may or may not be bool.

The issue with Vector!char is similar.  Because char[] is not a true
array, generic code that uses T[] can unexpectedly fail when T is char.
Other containers of char behave like normal containers, iterating over
individual chars.  char[] iterates over dchars.  Vector!char can,
depending on its implementation, iterate over chars, iterate over
dchars, or fail to compile at all when instantiated with T=char.  It's
not even clear which of these is the correct behavior.

Vector!char is just an example.  Any generic code that uses T[] can
unexpectedly fail to compile or behave incorrectly when used with T=char.  If
I were to use D2 in its present state, I would try to avoid both
char/wchar and arrays as much as possible in order to avoid this trap.
This would mean avoiding large parts of Phobos, and providing safe
wrappers around the rest.


-- 
Rainer Deyke - rain...@eldwood.com


Re: std.algorithm.remove and principle of least astonishment

2010-11-20 Thread Rainer Deyke
On 11/20/2010 16:58, Andrei Alexandrescu wrote:
 On 11/20/10 12:32 PM, Rainer Deyke wrote:
 std::vector<bool> in C++ is a specialization of std::vector that packs
 eight booleans into a byte instead of storing each element separately.
 It doesn't behave exactly like other std::vectors and technically
 doesn't meet the C++ requirements of a container, although it tries to
 come as close as possible.  This means that any code that uses
 std::vector<bool> needs to be extra careful to take those differences
 into account.  This is especially an issue when dealing with generic code
 that uses std::vector<T>, where T may or may not be bool.

 The issue with Vector!char is similar.  Because char[] is not a true
 array, generic code that uses T[] can unexpectedly fail when T is char.
   Other containers of char behave like normal containers, iterating over
 individual chars.  char[] iterates over dchars.  Vector!char can,
 depending on its implementation, iterate over chars, iterate over
 dchars, or fail to compile at all when instantiated with T=char.  It's
 not even clear which of these is the correct behavior.
 
 The parallel does not stand scrutiny. The problem with vector<bool> in
 C++ is that it implements no formal abstraction, although it is a
 specialization of one.

The problem with std::vector<bool> is that it pretends to be a
std::vector, but isn't.  If it was called dynamic_bitset instead, nobody
would have complained.  char[] has exactly the same problem.

 Vector!char is just an example. Any generic code that uses T[] can
 unexpectedly fail to compile or behave incorrectly when used with T=char.
 If I were to use D2 in its present state, I would try to avoid both
 char/wchar and arrays as much as possible in order to avoid this
 trap. This would mean avoiding large parts of Phobos, and providing
 safe wrappers around the rest.
 
 It may be wise in fact to start using D2 and make criticism grounded in
 reality that could help us improve the state of affairs.

Sorry, but no.  It would take a huge investment of time and effort on my
part to switch from C++ to D.  I'm not going to make that leap without
looking first, and I'm not going to make it when I can see that I'm
about to jump into a spike pit.

 The above is
 only fallacious presupposition. Algorithms in Phobos are abstracted on
 the formal range interface, and as such you won't be exposed to risks
 when using them with strings.

I'm not concerned about algorithms, I'm concerned about code that uses
arrays directly.  Like my Vector!char example, which I see you still
haven't addressed.


-- 
Rainer Deyke - rain...@eldwood.com


Re: std.algorithm.remove and principle of least astonishment

2010-11-19 Thread Rainer Deyke
On 11/19/2010 16:40, Andrei Alexandrescu wrote:
 On 11/19/10 12:59 PM, Bruno Medeiros wrote:
 Sorry, what I mean is: so we agree that char[] and wchar[] are special.
 Unlike *all other arrays*, there are restrictions to what you can assign
 to each element of the array. So conceptually they are not arrays, but
 in the type system they are very much arrays. (or described
 alternatively: implemented with arrays).

 Isn't this a clear sign that what currently is char[] and wchar[] (=
 UTF-8 and UTF-16 encoded strings) should not be arrays, but instead a
 struct which would correctly represents the semantics and contracts of
 char[] and wchar[]? Let me clarify what I'm suggesting:
 * char[] and wchar[] would be just arrays of char's and wchar's,
 completely orthogonal with other arrays types, no restrictions on
 assignment, no further contracts.
 * UTF-8 and UTF-16 encoded strings would have their own struct-based
 type, lets called them string and wstring, which would likely use char[]
 and wchar[] as the contents (but these fields would be internal), and
 have whatever methods be appropriate, including opIndex.
 * string literals would be of type string and wstring, not char[] and
 wchar[].
 * for consistency, probably this would be true for UTF-32 as well: we
 would have a dstring, with dchar[] as the contents.

 Problem solved. You're welcome. (as John Hodgeman would say)

 No?
 
 I don't think that would mark an improvement.

You don't see the advantage of generic types behaving in a generic
manner?  Do you know how much pain std::vector<bool> caused in C++?

I asked this before, but I received no answer.  Let me ask it again.
Imagine a container Vector!T that uses T[] internally.  Then consider
Vector!char.  What would be its correct element type?  What would be its
correct behavior during iteration?  What would be its correct response
when asked to return its length?  Assuming you come up with a coherent
set of semantics for Vector!char, how would you implement it?  Do you
see how easy it would be to implement it incorrectly?


-- 
Rainer Deyke - rain...@eldwood.com


Re: D1 - D2

2010-11-18 Thread Rainer Deyke
On 11/18/2010 12:39, Walter Bright wrote:
 I agree that's an issue. Currently, the only way to deal with this is
 one of:
 
 1. Minimize the differences, and maintain two copies of the source code.
 Using the (rather fabulous) meld tool (available on Linux), the merging
 is pretty easy. I use meld all the time to, for example, merge
 differences in the code base between dmd1 and dmd2.
 
 2. Isolate the code that is different into different files, which
 minimizes the work involved in (1).
 
 3. Use string mixins and the token string literal form.

4. Use an external preprocessor to generate the D1 and D2 code from the
same source file.


-- 
Rainer Deyke - rain...@eldwood.com


Re: In praise of Go discussion on ycombinator

2010-11-17 Thread Rainer Deyke
On 11/17/2010 03:26, Daniel Gibson wrote:
 Rainer Deyke schrieb:
 Let's say I see something like this in C/C++/D:

 if(blah())
 {
   x++;
 }

 This is not my usual style, so I have to stop and think.  
 
 What about
 if( (blah() || foo()) && (x > 42)
     && (baz.iDontKnowHowtoNameThisMethod() !is null)
     && someOtherThing.color = COLORS.Octarine )
 {
   x++;
 }

At first glance, it looks like two statements to me.  The intended
meaning could have been this:

if ((blah() || foo())
    && (x > 42)
    && (baz.iDontKnowHowtoNameThisMethod() !is null)
    && someOtherThing.color == COLORS.Octarine) {
  ++x;
}

Or this:

if((blah() || foo())
   && (x > 42)
   && (baz.iDontKnowHowtoNameThisMethod() !is null)
   && someOtherThing.color == COLORS.Octarine) {}
{
  ++x;
}

The latter seems extremely unlikely, so it was probably the former.
Still, I have to stop and think about it.  There is also the third
possibility that the intended meaning of the statement is something else
entirely, and the relevant parts have been lost or have not yet been
written.

Language-enforced coding standards are a good thing, because they make
foreign code easier to read.  For this purpose, it doesn't matter if the
chosen style is your usual style or if you subjectively like it.  Even a
bad coding standard is better than no coding standard.


-- 
Rainer Deyke - rain...@eldwood.com


Re: RFC, ensureHeaped

2010-11-17 Thread Rainer Deyke
On 11/17/2010 05:10, spir wrote:
 Output in general, programmer feedback in particular, should simply
 not be considered effect. It is transitory change to dedicated areas
 of memory -- not state. Isn't this the sense of output, after all?

My debug output actually goes through my logging library which, among
other things, maintains a list of log messages in memory.  If this is
considered pure, then we might as well strip pure from the language,
because it has lost all meaning.


-- 
Rainer Deyke - rain...@eldwood.com


Re: In praise of Go discussion on ycombinator

2010-11-16 Thread Rainer Deyke
On 11/16/2010 22:24, Andrei Alexandrescu wrote:
 I'm curious what the response to my example will be. So far I got one
 that doesn't even address it.

I really don't see the problem with requiring that '{' goes on the same
line as 'if'.  It's something you learn once and never forget because it
is reinforced through constant exposure.  After a day or two, '{' on a
separate line will just feel wrong and raise an immediate alarm in your
mind.

I would even argue that Go's syntax actually makes code /easier/ to read
and write.  Let's say I see something like this in C/C++/D:

if(blah())
{
  x++;
}

This is not my usual style, so I have to stop and think.  It could be
correct code written in another style, or it could be code that has been
mangled during editing and now needs to be fixed.  In Go, I /know/ it's
mangled code, and I'm far less likely to encounter it, so I can find
mangled code much more easily.  A compiler error would be even better,
but Go's syntax is already an improvement over C/C++/D.

There are huge problems with Go that will probably keep me from ever
using the language.  This isn't one of them.


-- 
Rainer Deyke - rain...@eldwood.com


Re: RFC, ensureHeaped

2010-11-16 Thread Rainer Deyke
On 11/16/2010 21:53, Steven Schveighoffer wrote:
 It makes me think that this is going to be extremely confusing for a
 while, because people are so used to pure being equated with a
 functional language, so when they see a function is pure but takes
 mutable data, they will be scratching their heads.  It would be awesome
 to make weakly pure the default, and it would also make it so we have to
 change much less code.

Making functions weakly pure by default means that temporarily adding a
tiny debug printf to any function will require a shitload of cascading
'impure' annotations.  I would consider that completely unacceptable.

(Unless, of course, purity is detected automatically without the use of
annotations at all.)
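(A sketch of how this tension could be defused, hedged: later D releases added a rule that `debug` blocks may call impure code from inside pure functions, for exactly this case.)

```d
import std.stdio;

pure int square(int x)
{
    // Without an escape hatch, this writeln would force 'square' (and
    // every pure caller up the chain) to drop its purity annotation.
    // The later 'debug'-escapes-purity rule sidesteps the cascade:
    debug writeln("square(", x, ")");
    return x * x;
}

void main()
{
    assert(square(4) == 16);
}
```

Compiled without `-debug` the statement vanishes; with `-debug` it prints, and in neither case does any annotation need to change.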


-- 
Rainer Deyke - rain...@eldwood.com


Re: linker wrapper

2010-11-14 Thread Rainer Deyke
On 11/14/2010 00:09, Walter Bright wrote:
 I suspect that trying to guess what modules should be added to the
 linker list may cause far more confusion than enlightenment when it goes
 awry. Currently, a lot of people seem to regard what a linker does as
 magic. Making it more magical won't help.

It seems completely straightforward to me.  If module A imports module
B, then module A depends on module B, therefore compiling and linking
module A should also cause module B to be compiled and linked.  Apply
recursively.  The only part of this that is remotely difficult - mapping
module names to files on the disk - is already done by the compiler.

This would only happen when compiling and linking as a single step,
which would be the preferred way to invoke DMD.  When linking as a
separate step, all object files would still need to be individually
passed to the linker.


-- 
Rainer Deyke - rain...@eldwood.com


Re: Passing dynamic arrays

2010-11-12 Thread Rainer Deyke
On 11/8/2010 17:43, Jonathan M Davis wrote:
 D references are more like Java references.

That's true for class references.  D also supports pass-by-reference
through the 'ref' keyword, which works like C++ references.


-- 
Rainer Deyke - rain...@eldwood.com


Re: Multichar literals

2010-11-12 Thread Rainer Deyke
On 11/12/2010 20:02, bearophile wrote:
 In this language a single quote can encompass multiple characters.
 'ABC' is equal to 0x434241.

I seem to recall that this is feature is also present in C/C++, although
I can't say that I've ever used it.


-- 
Rainer Deyke - rain...@eldwood.com


Re: Kill implicit joining of adjacent strings

2010-11-11 Thread Rainer Deyke
On 11/11/2010 06:06, Michel Fortin wrote:
 On 2010-11-10 23:51:38 -0500, Rainer Deyke rain...@eldwood.com said:
 
 As it turns out, the joining of adjacent strings is a critical feature.
  Consider the following:
   f("a" "b");
   f("a" ~ "b");
 These are /not/ equivalent.  In the former case, 'f' receives a string
 literal as argument, which means that the string is guaranteed to be
 zero terminated.  In the latter case, 'f' receives an expression (which
 can be evaluated at compile time) as argument, so the string may not be
 zero terminated.  This is a critical difference if 'f' is a (wrapper
 around a) C function.
 
 You worry too much. With 'f' a wrapper around a C function that takes a
 const(char)* argument, if the argument is not a literal string then it
 won't compile. Only string literals are implicitly convertible to
 const(char)*, not 'string' variables.

You just restated the problem.  There needs to be a way to break up
string literals while still treating them as a single string literal
that is convertible to 'const(char)*'.  You could overload binary '~'
for this, but I think this may be confusing.


-- 
Rainer Deyke - rain...@eldwood.com


Re: Kill implicit joining of adjacent strings

2010-11-11 Thread Rainer Deyke
On 11/11/2010 13:37, Sean Kelly wrote:
 Rainer Deyke Wrote:
 
 As it turns out, the joining of adjacent strings is a critical
 feature. Consider the following: f("a" "b"); f("a" ~ "b"); These
 are /not/ equivalent.
 
 I would hope that the const folding mechanism would combine these at
 compile-time.

Of course it would.  That's not the issue.  The issue is, is a string
that's generated at compile-time guaranteed to be zero-terminated, the
way a string literal is?  Even if the same operation at run-time would
/not/ generate a zero-terminated string?


-- 
Rainer Deyke - rain...@eldwood.com


Re: Kill implicit joining of adjacent strings

2010-11-10 Thread Rainer Deyke
On 11/10/2010 19:34, bearophile wrote:
 Do you seen anything wrong in this code? It compiles with no errors:
 
 enum string[5] data = ["green", "magenta", "blue" "red", "yellow"];
 static assert(data[4] == "yellow");
 void main() {}
 
 
 Yet that code asserts.

Wait, what?  That's a static assert.  How can it both assert and compile
with no errors?

As it turns out, the joining of adjacent strings is a critical feature.
 Consider the following:
  f("a" "b");
  f("a" ~ "b");
These are /not/ equivalent.  In the former case, 'f' receives a string
literal as argument, which means that the string is guaranteed to be
zero terminated.  In the latter case, 'f' receives an expression (which
can be evaluated at compile time) as argument, so the string may not be
zero terminated.  This is a critical difference if 'f' is a (wrapper
around a) C function.
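A hedged sketch of the difference, assuming C's `puts` as the callee:

```d
// Adjacent string literals join into a single literal, which D
// guarantees to be zero-terminated and implicitly convertible to
// const(char)* for C interop.
extern(C) int puts(const char* s);

void main()
{
    puts("Hello, " "world");      // one literal: safe to hand to C

    // The '~' form is an expression, folded at compile time; whether
    // the folded result carries the same zero-termination guarantee is
    // precisely the question at issue here:
    // puts("Hello, " ~ "world");
}
```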


-- 
Rainer Deyke - rain...@eldwood.com


Re: Spec#, nullables and more

2010-11-06 Thread Rainer Deyke
On 11/6/2010 01:12, spir wrote:
 On Fri, 05 Nov 2010 23:13:44 -0600 Rainer Deyke rain...@eldwood.com
 wrote:
 That's a faulty idiom.  A data structure that exists but contains
 no valid data is a bug waiting to happen - no, it /is/ a bug, even
 if it does not yet manifest as incorrect observable behavior.  (Or
 at best, it's an unsafe optimization technique that should be
 wrapped up in an encapsulating function.)
 
 You may be right as for local variables. But think at elements of
 structured data. It constantly happens that one needs to define
 fields that have no meaningful value at startup, maybe even never
 will on some instances.

It doesn't happen in dynamic languages.  It doesn't happen in pure
functional languages, since these languages provide no way to alter a
data structure after it has been created.  In my experience, it happens
very rarely in C++.

If it happens constantly in D, then that's a flaw in the language.


-- 
Rainer Deyke - rain...@eldwood.com


Re: Spec#, nullables and more

2010-11-06 Thread Rainer Deyke
On 11/6/2010 02:42, Walter Bright wrote:
 Adam D. Ruppe wrote:
 It wasn't until I added the invariant and in/out contracts to all the
 functions
 asserting about null that the problem's true cause became apparent.
 
 Couldn't this happen to you with any datum that has an unexpected value
 in it?
 
 Suppose, for example, you are appending the numbers 1..5 to the array,
 and somehow appended a 17. Many moons later, something crashes because
 the 17 was out of range.

That's an argument for limited-range data types, not against
non-nullable types.


-- 
Rainer Deyke - rain...@eldwood.com


Re: Spec#, nullables and more

2010-11-06 Thread Rainer Deyke
On 11/6/2010 02:47, Walter Bright wrote:
 Rainer Deyke wrote:
 On 11/5/2010 17:41, Walter Bright wrote:
 In other words, I create an array that I mean to fill in later, because
 I don't have meaningful data for it in advance.

 That's a faulty idiom.  A data structure that exists but contains no
 valid data is a bug waiting to happen - no, it /is/ a bug, even if it
 does not yet manifest as incorrect observable behavior.  (Or at best,
 it's an unsafe optimization technique that should be wrapped up in an
 encapsulating function.)
 
 An example would be the bucket array for a hash table. It starts out
 initially empty, and values get added to it. I have a hard time agreeing
 that such a ubiquitous and useful data structure is a bad idiom.

Empty is a valid value for a hash table, so that's a completely
different situation.  Obviously the bucket array would not use a
non-nullable type, and less obviously the bucket array should be
explicitly initialized to nulls at creation time.
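A sketch of what I mean (illustrative names, not library code):

```d
// "Empty" is a valid state for a hash table, so nullable bucket slots
// are appropriate here -- and they are deliberately initialized to
// null at construction, not left in an accidental not-yet-valid state.
class Node { int key; int value; Node next; }

struct HashTable
{
    Node[] buckets;

    this(size_t nBuckets)
    {
        buckets = new Node[nBuckets];  // every slot explicitly null
    }
}

void main()
{
    auto t = HashTable(8);
    assert(t.buckets.length == 8 && t.buckets[0] is null);
}
```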


-- 
Rainer Deyke - rain...@eldwood.com


Re: Spec#, nullables and more

2010-11-06 Thread Rainer Deyke
On 11/6/2010 15:00, Walter Bright wrote:
 I don't see that non-null is such a special case that it would benefit
 from a special case syntax.

I'm less concerned about syntax and more about semantics.  I already use
nn_ptrT (short for non-null pointer) in C++.  It works in C++ because
types in C++ are not assumed to have a default constructor.


-- 
Rainer Deyke - rain...@eldwood.com


Re: Spec#, nullables and more

2010-11-05 Thread Rainer Deyke
On 11/5/2010 17:41, Walter Bright wrote:
 In other words, I create an array that I mean to fill in later, because
 I don't have meaningful data for it in advance.

That's a faulty idiom.  A data structure that exists but contains no
valid data is a bug waiting to happen - no, it /is/ a bug, even if it
does not yet manifest as incorrect observable behavior.  (Or at best,
it's an unsafe optimization technique that should be wrapped up in an
encapsulating function.)


-- 
Rainer Deyke - rain...@eldwood.com


Re: Ruling out arbitrary cost copy construction?

2010-10-31 Thread Rainer Deyke
On 10/31/2010 10:42, Andrei Alexandrescu wrote:
 There are several solutions possible, some that require the compiler
 knowing about the idiom, and some relying on trusted code. One in the
 latter category is to create immutable objects with an unattainable
 reference count (e.g. size_t.max) and then incrementing the reference
 count only if it's not equal to that value. That adds one more test for
 code that copies const object, but I think it's acceptable.

This means that immutable objects do not have deterministic destructors.
 I don't think this is acceptable.


-- 
Rainer Deyke - rain...@eldwood.com


Re: Ruling out arbitrary cost copy construction?

2010-10-30 Thread Rainer Deyke
On 10/30/2010 21:56, Andrei Alexandrescu wrote:
 Walter and I discussed the matter again today and we're on the brink of
 deciding that cheap copy construction is to be assumed. This simplifies
 the language and the library a great deal, and makes it perfectly good
 for 95% of the cases. For a minority of types, code would need to go
 through extra hoops (e.g. COW, refcounting) to be compliant.

For what it's worth, I've used the ref-counting/COW style even in C++.
C++ might not assume that copy construction is cheap, but it certainly
performs better when it is.


-- 
Rainer Deyke - rain...@eldwood.com


Re: Temporary suspension of disbelief (invariant)

2010-10-26 Thread Rainer Deyke
On 10/26/2010 18:00, bearophile wrote:
 All this looks bug-prone, and surely hairy, but it looks potentially
 useful. Is it a good idea to design a class that uses such temporary
 suspension of the invariant?

I think invariants should be checked whenever a public member function
is called from outside the object.  If setting the object to an invalid
state from outside is valid, then clearly the state isn't really invalid.

On the other hand, if the object itself calls it own public member
functions, then no invariants should be checked.  Not being able to call
public member functions while the object is temporarily in an invalid
state is too strict.  This is a problem that I actually ran into while
using D, and one of the reasons for why I stopped using invariants.


-- 
Rainer Deyke - rain...@eldwood.com


Re: Temporary suspension of disbelief (invariant)

2010-10-26 Thread Rainer Deyke
On 10/26/2010 20:16, Walter Bright wrote:
 Rainer Deyke wrote:
 On the other hand, if the object itself calls it own public member
 functions, then no invariants should be checked.  Not being able to call
 public member functions while the object is temporarily in an invalid
 state is too strict.  This is a problem that I actually ran into while
 using D, and one of the reasons for why I stopped using invariants.
 
 A solution is to redesign what the class considers public and private. A
 public member can be a shell around a private implementation, and other
 class members can call that private implementation without invoking the
 invariant.

Writing wrapper functions is a waste of my time.  Auto-generating
wrapper functions through some sort of meta-programming magic is still a
waste of my time, since the process cannot be completely automated.


-- 
Rainer Deyke - rain...@eldwood.com


Re: More Clang diagnostic

2010-10-25 Thread Rainer Deyke
On 10/25/2010 19:01, Walter Bright wrote:
 Yes, we discussed it before. The Digital Mars C/C++ compiler does this,
 and NOBODY CARES.
 Not one person in 25 years has ever even commented on it. Nobody
 commented on its lack in dmd.

I think someone just did.


-- 
Rainer Deyke - rain...@eldwood.com


Re: Less free underscores in number literals

2010-10-23 Thread Rainer Deyke
On 10/23/2010 10:53, Kagamin wrote:
 It's how their language builds numbers. Numbers written in ideographs
 use this grouping, but this doesn't mean, they use the same grouping
 for arabic digits. For example, amazon.co.jp uses arabic numbers and
 western 3-digit grouping.

Using groupings of three digits in Japanese seems extremely awkward,
especially for larger numbers, since you would have to mentally regroup
the digits in groups of four in order to read it.  It's not just the
written language but the spoken language that uses groups of four.  For
example, the number 1,234,567,890 would be read as 12億, 3456万, 7890.

If amazon.jp uses groups of three, then my initial reaction is
imperfect localization.

Also, even in English there are cases where groupings other than three
make sense.  Consider:

int price_in_cents = 54_95;


-- 
Rainer Deyke - rain...@eldwood.com


Re: blog: Overlooked Essentials for Optimizing Code

2010-10-21 Thread Rainer Deyke
On 10/21/2010 02:02, Peter Alexander wrote:
 I don't really think of CS that way. To me, CS is to practical
 programming as pure math is to accounting, i.e. I don't think CS should
 be teaching about profiling because that's what software engineering is
 for. They are two different worlds in my opinion. If you wanted to get a
 practical programming education and you took CS then I think you took
 the wrong degree.

There are fundamental skills that you will need if you spend any amount
of time programming, whether you are a CS student, a computational
scientist, or an actual programmer.  Profiling is one of these skills.


-- 
Rainer Deyke - rain...@eldwood.com


Re: @noreturn property

2010-10-21 Thread Rainer Deyke
On 10/21/2010 05:54, Iain Buclaw wrote:
 @noreturn void fatal()
 {
 print("Error");
 exit(1);
 }

 Thoughts?

This looks wrong to me.  'fatal' returns type 'void', except that it
doesn't.  I would prefer this:

null_type fatal()
{
print("Error");
exit(1);
}

Here 'null_type' is a type that has no values, as opposed to 'void'
which has only one possible value.  No variables may be declared of type
'null_type'.  A function with a return type of 'null_type' never returns.

Either way would work (and this feature is definitely useful), but
'null_type' has some significant advantages:
  - It can be used in delegates and function pointers.
  - It can be used in generic code where you don't know if a function
will return or not.
  - It makes for more concise code.

Feel free to think of a better name than 'null_type'.
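(As it happens, D much later adopted essentially this design: a bottom type spelled `noreturn`, with no values at all. A sketch under that later spelling, assuming a compiler recent enough to know it:)

```d
import core.stdc.stdlib : exit;
import std.stdio : writeln;

// 'noreturn' is a type with no values; a function whose return type is
// 'noreturn' can never return normally.
noreturn fatal(string msg)
{
    writeln("Error: ", msg);
    exit(1);
    assert(0);  // unreachable; satisfies flow analysis since C's exit
                // is not necessarily marked as non-returning
}
```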


-- 
Rainer Deyke - rain...@eldwood.com


Re: @noreturn property

2010-10-21 Thread Rainer Deyke
On 10/21/2010 11:37, Iain Buclaw wrote:
 Not sure what you mean when you say that void has only one possible
 value. To me, 'void' for a function means something that does not
 return any value (or result). Are you perhaps confusing it with, let's
 say, an 'int' function that is marked as noreturn?

A 'void' function returns, therefore it conceptually returns a value.
For generic programming, it is useful to treat 'void' as a type like any
other, except that it only has one possible value (and therefore encodes
no information and requires no storage).  If this is not implemented in
D at the moment, it should be.

auto callFunction(F)(F f) {
  return f();  // legal in D even when F's return type is void
}

void f() {
}

void main() {
  callFunction(f);
}


-- 
Rainer Deyke - rain...@eldwood.com


Re: Tips from the compiler

2010-10-19 Thread Rainer Deyke
On 10/19/2010 01:31, Don wrote:
 It's not obvious to me how that can be done in the presence of
 templates. Although it's easy to distinguish between library and
 non-library *files*, it's not at all easy to distinguish between library
 and non-library *code*.

Simple.  If the template is in a library file, it's library code,
regardless of where it was instantiated.  Rationale:
  - If the warning is triggered on every instantiation of the template,
then it's obviously the library's problem.
  - If the warning is triggered on only some template instantiations,
the warning is probably spurious.  Template instantiations often
generate code that would look wrong in isolation, but is actually correct.
  - If you really want to catch those extra warnings, you can always
turn on warnings for library code and shift through the results.

You could add a compiler option to enable warning on templates in
library code if the template was instantiated in user code, but I
personally don't see the point.


-- 
Rainer Deyke - rain...@eldwood.com


Re: Tips from the compiler

2010-10-19 Thread Rainer Deyke
On 10/19/2010 05:44, Don wrote:
 Rainer Deyke wrote:
 Simple.  If the template is in a library file, it's library code,
 regardless of where it was instantiated.
 
 The separation isn't clean. User code instantiates library code which
 instantiates user code. Look at std.algorithm, for example.
 Mixins and template alias parameters blur things even further.

There are exactly four possible situations:
  - Template in library code instantiated by library code.
  - Template in library code instantiated by user code.
  - Template in user code instantiated by library code.
  - Template in user code instantiated by user code.

These cases can be nested arbitrarily deep, but that doesn't add
additional cases.  If we categorically ignore where a template is
instantiated, the four cases reduce to just two:
  - Template in library code.
  - Template in user code.


-- 
Rainer Deyke - rain...@eldwood.com


Re: Tips from the compiler

2010-10-19 Thread Rainer Deyke
On 10/19/2010 18:20, Don wrote:
 Example: Suppose we have warnings for possible numeric overflow. Then
 consider:
 sort!"a*5 < b"(x);
 We should have a warning that 'a*5' may cause a numeric overflow. How
 can that be done?

I'm not convinced that we should have a warning here.  Validating
template arguments is the 'sort' template's job.  If the argument
validates, then the compiler should accept it as valid unless it leads
to an actual error during compilation.  If this causes some valid
warnings to be lost, so be it.  It beats the alternative of forcing
library authors to write warning-free code.

If you are really paranoid, there's always the option of generating
warnings on all template instantiations triggered (directly or
indirectly) by user code.  It can and will lead to spurious warnings,
but that's the price you pay for paranoia.


-- 
Rainer Deyke - rain...@eldwood.com


Re: Why struct opEquals must be const?

2010-10-18 Thread Rainer Deyke
On 10/18/2010 09:07, Steven Schveighoffer wrote:
 What case were you thinking of for requiring mutablity for equality
 testing?

What about caching?

class C {
  private int hashValue = -1;
  // Mutators reset hashValue to -1.

  int getHash() {
if (this.hashValue == -1) {
  this.hashValue = longExpensiveCalculation();
}
    return this.hashValue;
  }

  bool opEquals(C other) {
if (this.getHash() == other.getHash()) {
  return slowElementwiseCompare(this, other);
} else {
  return false;
}
  }
}


-- 
Rainer Deyke - rain...@eldwood.com


Re: The Next Big Language

2010-10-18 Thread Rainer Deyke
On 10/18/2010 10:55, Sean Kelly wrote:
 Clearly, your friend never used VC++6.  It was the buggiest piece of
 software I've ever used, and yet I never once heard someone say they
 were giving up on C++ because of it.  Sounds to me like he was just
 looking for a reason to not give D a fair shake.

My personal experience says otherwise, after using VC++6 for a long time
and DMD1 for a short time.  VC++6 is horrible, but DMD1 is (was?) much
worse.  (It has been some time since I last touched DMD, so the
situation has probably improved since then.)


-- 
Rainer Deyke - rain...@eldwood.com


Re: Why struct opEquals must be const?

2010-10-18 Thread Rainer Deyke
On 10/18/2010 13:51, Steven Schveighoffer wrote:
 On Mon, 18 Oct 2010 15:43:29 -0400, Rainer Deyke rain...@eldwood.com
 wrote:
 On 10/18/2010 09:07, Steven Schveighoffer wrote:
 What case were you thinking of for requiring mutablity for equality
 testing?

 What about caching?

 The object's state is still changing.

Yes, that's my point.  D doesn't have logical const like C++, so not
all logically const operations can be declared const.  opEquals is
always logically const, but it cannot always be physically const.

 If you think it's worth doing, you can always cast.  Despite the label
 of 'undefined behavior', I believe the compiler will behave fine if you
 do that (it can't really optimize out const calls).

Not safe.  The set of const objects includes immutable objects, and
immutable objects can be accessed from multiple threads at the same
time, but caching is not (by default) thread-safe.


-- 
Rainer Deyke - rain...@eldwood.com


Re: The Next Big Language

2010-10-18 Thread Rainer Deyke
On 10/18/2010 07:18, Paulo Pinto wrote:
 So the question is, what could the D killer feature be?

The power of C++ template metaprogramming, without the horribly
convoluted syntax.
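For instance, where 2010-era C++ needs recursive template specializations for a compile-time factorial, D just runs an ordinary function at compile time:

```d
// CTFE: the same plain function works at run time and at compile time.
int factorial(int n) { return n <= 1 ? 1 : n * factorial(n - 1); }

static assert(factorial(5) == 120);  // evaluated entirely during compilation

void main() {}
```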


-- 
Rainer Deyke - rain...@eldwood.com


Re: Tips from the compiler

2010-10-18 Thread Rainer Deyke
On 10/18/2010 02:44, Nick Sabalausky wrote:
 - Warnings A: For things that are likely problems, but are acceptable 
 temporarily (and only temporarily), ie, until you're ready to make a commit: 
 Always on, nonfatal.
 
 - Warnings B: For things that may indicate a problem (and therefore are 
 useful to be notified of), but may also be perfectly correct: Normally off, 
 but occasionally turn on to check.
 
 - Warnings C: For things that may indicate a problem, but take a significant 
 amount of processing time: Normally off, but occasionally turn on to check, 
 possibly as a background cron job.

I'd really like to see a system that's not based on turning warnings on
and off by type, but on hiding known warnings and only showing new
warnings.  Warnings of type B are only a problem because you see the
same set of warnings each time you recompile, and this obscures the real
(new) warnings.  If you could explicitly mark individual type B warnings
as expected, there wouldn't be any need to ever turn the entire category
off.


-- 
Rainer Deyke - rain...@eldwood.com


Re: Tips from the compiler

2010-10-18 Thread Rainer Deyke
On 10/18/2010 14:53, Don wrote:
 The problem is, once you have an optional warning in a compiler, they
 are NOT optional. All standard or pseudo-standard libraries MUST comply
 with them.

Libraries are assumed to be fully debugged, therefore they should always
be compiled with all warnings turned off.  Therefore libraries don't
have to comply with ANY warnings.  (And if your compiler doesn't allow
you to specify which files are library files, then the problem is with
the compiler.)


-- 
Rainer Deyke - rain...@eldwood.com


Re: Improving version(...)

2010-10-18 Thread Rainer Deyke
On 10/18/2010 13:42, Tomek Sowiński wrote:
 template isVersion(string ver) {
     enum bool isVersion = !is(typeof({
         mixin("version(" ~ ver ~ ") static assert(0);");
     }));
 }
 
 static if (isVersion!"VERSION1" || isVersion!"VERSION3") {
 ...
 }
 
 If you're rushing to reply That's hideous!, don't bother. I know.

Hideous?  That's beautiful!  It lets me pretend that the ugly,
restrictive, pointless version construct in the language didn't exist,
while still reaping the benefits of the version construct.  If I ever
write a style guide for D, it will forbid the direct use of 'version'
and require that 'isVersion' be used instead.


-- 
Rainer Deyke - rain...@eldwood.com


Re: The Next Big Language

2010-10-18 Thread Rainer Deyke
On 10/18/2010 19:12, Walter Bright wrote:
 retard wrote:
 I've many times wondered why techies become so emotional. Editors,
 licenses, languages, operating systems, browsers, everything! force
 them to choose sides. This kind of thinking forces them to keep the
 train of thought inside a sealed box.
 
 It may not be emotional at all. Learning a particular language
 thoroughly takes *years*. You *have* to choose.

That depends on the language, I think.  C++ takes years to learn.
Python took me one month to reach full fluency.  There may still be
obscure corners of Python that I haven't explored, but they so obscure
that I'm unlikely to ever encounter them in normal programming.  They
don't matter.

Learning the associated libraries is another matter, but also largely
unnecessary in my opinion.  I just program with a reference manual in my
web browser.


-- 
Rainer Deyke - rain...@eldwood.com


Re: std.algorithm.remove and principle of least astonishment

2010-10-16 Thread Rainer Deyke
On 10/16/2010 13:51, Andrei Alexandrescu wrote:
 char[] and wchar[] are special. They embed their UTF affiliation in
 their type. I don't think we should make a wash of all that by handling
 them as arrays. They are not arrays.

Then rename them to something else.  Problem solved.


-- 
Rainer Deyke - rain...@eldwood.com


Re: duck!

2010-10-16 Thread Rainer Deyke
On 10/16/2010 14:02, Walter Bright wrote:
 If it's a cringeworthy name, I'd agree. But duck is not cringeworthy.

Fact: I cringe every time I hear duck typing.


-- 
Rainer Deyke - rain...@eldwood.com


Re: Typeless function arguments

2010-10-16 Thread Rainer Deyke
On 10/16/2010 19:26, dsimcha wrote:
 The rule with uninstantiated template bodies is that the code needs to be
 syntactically correct, but not necessarily semantically correct (since the
 semantics can only be fully analyzed on instantiation).  void foo(t) looks
 syntactically incorrect.

I think 'void foo(t)' is syntactically correct.  You can't know that 't'
isn't the name of a type without semantic analysis.


-- 
Rainer Deyke - rain...@eldwood.com


Re: Streaming library

2010-10-14 Thread Rainer Deyke
On 10/14/2010 15:49, Andrei Alexandrescu wrote:
 Good point. Perhaps indeed it's best to only deal with bytes and
 characters at transport level.

Make that just bytes.

Characters data must be encoded into bytes before it is written and
decoded before it is read.  The low-level OS functions only deal with
bytes, not characters.

Text encoding is a complicated process - consider different unicode
encodings, different non-unicode encodings, byte order markers, and
Windows versus Unix line endings.  Furthermore, it is often useful to
wedge an additional translation layer between the low-level (binary)
stream and the high-level text encoding layer, such as an encryption or
compression layer.

Writing characters directly to streams made sense in the pre-Unicode
world where there was a one-to-one correspondence between characters and
bytes.  In a modern world, text encoding is an important service that
deserves its own standalone module.
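A sketch of the layering I have in mind (names illustrative, not a real API):

```d
// The transport moves only bytes; text encoding is one wrapper among
// many, stacking the same way a compression or encryption layer would.
interface ByteSink
{
    void write(const(ubyte)[] data);
}

struct Utf8Writer
{
    ByteSink sink;   // a file, a socket, or e.g. a compressing wrapper

    void put(const(char)[] text)
    {
        // D's char[] is already UTF-8, so this is a reinterpretation;
        // a writer for another encoding would transcode here instead.
        sink.write(cast(const(ubyte)[]) text);
    }
}
```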


-- 
Rainer Deyke - rain...@eldwood.com


Re: Streaming library

2010-10-14 Thread Rainer Deyke
On 10/14/2010 22:24, Andrei Alexandrescu wrote:
 On 10/14/10 21:22 CDT, Rainer Deyke wrote:
 Characters data must be encoded into bytes before it is written and
 decoded before it is read.  The low-level OS functions only deal with
 bytes, not characters.
 
 I'm not so sure about that. For example, some code in std.stdio is
 dedicated to supporting fwide():
 
 http://www.opengroup.org/onlinepubs/95399/functions/fwide.html

I don't think that's a low-level OS function.  But it is true that I
may have overstated my case.  Still, the underlying file system and the
underlying hardware deal in bytes, not chars, on all platforms that matter.

Encoded text /is/ bytes.


 So the $1M question is, do we support text transports or not?

All text is encoded, and encoded text is logically bytes, not chars.
This is distinction is somewhat confused in D because the native string
types in D do specify an encoding.  However, it would be a mistake to
conflate the internal encoding with the external encoding used by text
transports.

It's also worth noting that some of these text transports are not 8-bit
clean.  This means that they cannot transport UTF-8 (without
transcoding), which means that they cannot transport D strings.

 - email protocol and probably other Internet protocols

All internet protocols ultimately work over IP, and IP is a binary protocol.

 If we don't support text at the transport level, things can still made
 to work but in a more fragile manner: upper-level protocols will need to
 _know_ that although the API accepts any ubyte[], in fact the results
 would be weird and malfunctioning if the wrong things are being passed.

The situation for text would be no different from the situation for any
other structured binary format.

 A text-based transport would clarify at the type level that a text
 stream accepts only UTF-encoded characters.

You can still have that, as a wrapper around the byte stream.


-- 
Rainer Deyke - rain...@eldwood.com


Re: Ruling out arbitrary cost copy construction?

2010-10-07 Thread Rainer Deyke
On 10/7/2010 07:24, Steven Schveighoffer wrote:
 First, let me discuss why I don't like save.

...

 So my question is, what is the point of save?  The whole point is for
 this last class of ranges, so they can implement a way to copy the
 iteration position in a way that isn't doable via simple assignment. 
 But there *AREN'T ANY* of these ranges in existence.  Why do we have a
 feature that is littered all over phobos and any range that wants to be
 more than a basic input range when the implementation is "return this;"?

Let me add two reasons to that list.

First, I expect that forgetting to call 'save' is or will be a very
common bug.  There's no way to detect it at compile time, and when the
code is used with a range with a trivial 'save', the code will work as
expected.  The bug will only be triggered by a range with a non-trivial
'save'.  Therefore ranges with non-trivial 'save' should be considered
error-prone and should not be used.

Second, it is in fact fairly easy to wrap a range with a non-trivial
'save' in a range with a trivial 'save', using a copy-on-write strategy.
 So if there /were/ any ranges with non-trivial 'save', it would be easy
enough to rewrite them to eliminate the non-trivial 'save'.

Eliminating 'save' makes ranges a lot easier and safer for range users,
at a minor cost for range writers.  Since range users greatly outnumber
range writers, this seems like an overwhelmingly good trade-off.
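To make the second point concrete, here is a hedged sketch of such a wrapper (illustrative, not library code):

```d
// Copy-on-write wrapper: copying -- and therefore 'save' -- is a cheap
// reference-count bump; the wrapped range's expensive .save runs only
// when a shared instance is about to advance past the others.
struct TrivialSave(R)
{
    private static struct Impl { R r; size_t refs; }
    private Impl* impl;

    this(R r) { impl = new Impl(r, 1); }
    this(this) { if (impl) impl.refs++; }          // copy = share

    @property bool empty() { return impl.r.empty; }
    @property auto front() { return impl.r.front; }

    void popFront()
    {
        if (impl.refs > 1)                         // about to diverge: fork
        {
            impl.refs--;
            impl = new Impl(impl.r.save, 1);
        }
        impl.r.popFront();
    }

    @property TrivialSave save() { return this; }  // trivial by construction
}
```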


-- 
Rainer Deyke - rain...@eldwood.com


Re: in everywhere

2010-10-07 Thread Rainer Deyke
On 10/7/2010 13:57, Andrei Alexandrescu wrote:
 On 10/7/10 14:40 CDT, bearophile wrote:
 Another solution is just to accept O(n) as the worst complexity for
 the in operator. I don't understand what's the problem in this.
 
 That means we'd have to define another operation, i.e. quickIn that
 has O(log n) bound.

Why?

I can't say I've ever cared about the big-O complexity of an operation.
 All I care about is that it's fast enough, which is highly
context-dependent and may have nothing to do with complexity.  I can't
see myself replacing my 'int[]' arrays with the much slower and bigger
'int[MAX_SIZE]' arrays just to satisfy the compiler.  I shouldn't have
to.  The type system shouldn't encourage me to.

I think it's an abuse of the type system to use it to guarantee
performance.  However, if I wanted the type system to provide
performance guarantees, I would need a lot more language support than a
convention that certain operations are supposed to be O(n).  I'm talking
performance specification on *all* functions, with a compile-time error
if the compiler can't prove that the compiled function meets those
guarantees.  And *even then*, I would like to be able to use an O(n)
implementation of 'in' where I know that O(n) performance is acceptable.


-- 
Rainer Deyke - rain...@eldwood.com


Re: in everywhere

2010-10-07 Thread Rainer Deyke
On 10/7/2010 14:33, Steven Schveighoffer wrote:
 On Thu, 07 Oct 2010 16:23:47 -0400, Rainer Deyke rain...@eldwood.com
 wrote:
 I can't say I've ever cared about the big-O complexity of an operation.
 
 Then you don't understand how important it is.

Let me rephrase that.  I care about performance.  Big-O complexity can
obviously have a significant effect on performance, so I so do care
about it, but only to the extend that it affects performance.  Low big-O
complexity is a means to an end, not a goal in and of itself.  If 'n' is
low enough, then a O(2**n) algorithm may well be faster than an O(1)
algorithm.

I also believe that, in the absence of a sophisticated system that
actually verifies performance guarantees, the language and standard
library should trust the programmer to know what he is doing.  The
standard library should only provide transitive performance guarantees,
e.g. this algorithm calls function 'f' 'n' times, so the algorithm's
performance is O(n * complexity(f)).  If 'f' runs in constant time, the
algorithm runs in linear time.  If 'f' runs in exponential time, the
algorithm still runs, just in exponential time.

 big O complexity is very important when you are writing libraries.  Not
 so much when you are writing applications -- if you can live with it in
 your application, then fine.  But Phobos should not have these problems
 for people who *do* care.
 
 What I'd suggest is to write your own function that uses in when
 possible and find when not possible.  Then use that in your code.

The issue is that algorithms may use 'in' internally, so I may have to
rewrite large parts of Phobos.  (And the issue isn't about 'in'
specifically, but complexity guarantees in general.)


-- 
Rainer Deyke - rain...@eldwood.com


Re: Ruling out arbitrary cost copy construction?

2010-10-06 Thread Rainer Deyke
On 10/6/2010 19:58, Andrei Alexandrescu wrote:
 Once the copy constructor has constant complexity, moving vs. copying
 becomes a minor optimization.

this(this) {
  sleep(10); // <-- Constant amount.
}


-- 
Rainer Deyke - rain...@eldwood.com


Re: Bug 3999 and 4261

2010-09-01 Thread Rainer Deyke
On 8/31/2010 19:46, bearophile wrote:
 But you can use const for constants that are known at run-time only.
 While you can't use enum for constant known at run-time.

In C++, const is used for both run-time and compile-time constants.  In
practice, this works out fine.  If its value can only be known at
run-time, it's a run-time constant.  If its value is used at
compile-time, it's a compile-time constant.  If both of these apply,
it's an error.  If neither applies, nobody cares if it's a compile-time
or run-time constant.

(The actual rules in C++ are a bit more complex, less intuitive, and
less useful than that, which is presumably why Walter chose not to copy
the C++ rules in this case.  Still, overloading 'const' for both compile-time
and run-time constants is viable, and more intuitive than the current
situation with 'enum'.)


-- 
Rainer Deyke - rain...@eldwood.com


Re: Generic code: @autoconst, @autopure, @autonothrow

2010-08-29 Thread Rainer Deyke
On 8/29/2010 00:32, Jonathan M Davis wrote:
 Templates are instantiated when you use them. They can't work any other way. 
 Normal functions are instantiated where they are declared. Unless you want to 
 try and make it so that _all_ functions are instantiated where they are used 
 (which IMHO seems like a really _bad_ idea), templates are and must be 
 treated 
 differently.

Why would that be a bad idea?

You gain:
  - Consistency.
  - The ability to treat all libraries as header-only, with no
separate compilation.
  - Because there is no separate compilation, templated virtual
functions.  (Funny how that works out.)
  - Better global optimizations.

You lose:
  - Binary libraries.  (Which barely work anyway, because most of my
functions are already templated.)
  - Potentially some (or even a lot of) compilation speed, and
potentially not.  Separate compilation introduces its own slow-downs.


-- 
Rainer Deyke - rain...@eldwood.com


Re: Generic code: @autoconst, @autopure, @autonothrow

2010-08-29 Thread Rainer Deyke
On 8/29/2010 08:44, Robert Jacques wrote:
 So... dynamic libraries (DLL's, etc.) are an archaic compilation model?
 Might you be able to suggest an newer alternative? (that doesn't involve
 a JIT, of course)

Not necessarily, but it /is/ a special case.  Dlls are best treated as
separate programs communicating through C function calls.  C function
calls are used to achieve language neutrality.  C function calls don't
support pure and D-const, so the purity and constness of functions at
the dll boundary is a moot point.


-- 
Rainer Deyke - rain...@eldwood.com


Re: Generic code: @autoconst, @autopure, @autonothrow

2010-08-28 Thread Rainer Deyke
On 8/28/2010 19:29, dsimcha wrote:
 Looks pretty good.  Won't work with BigInt because opBinary!* isn't pure and
 can't practically be made pure.  A solution I propose is to allow the
 annotations @autoconst, @autopure and @autonothrow for template functions.
 These would mean everything I do is const/pure/nothrow as long as all of the
 functions I call are const/pure/nothrow.  As far as I can tell, this would be
 reasonably implementable because the compiler always has the source code to
 template functions, unlike non-template functions.

On one hand, this addresses a real need.  On the other hand, D
already has a serious case of featuritis.

Is there any real reason why we can't apply these modifiers
automatically to all functions?  (And by "real" I don't mean "it would
be hard to do" or "it is incompatible with the archaic compilation model
chosen by one D implementation.")

Failing that, are the arguments for the inclusion of pure/nothrow/const
really strong enough to justify all this extra cruft in the language?


-- 
Rainer Deyke - rain...@eldwood.com


Re: Generic code: @autoconst, @autopure, @autonothrow

2010-08-28 Thread Rainer Deyke
On 8/28/2010 22:33, dsimcha wrote:
 Is there any real reason why we can't apply these modifiers
 automatically to all functions?  (And by "real" I don't mean "it would
 be hard to do" or "it is incompatible with the archaic compilation model
 chosen by one D implementation.")
 
 Two reasons:
 
 1.  Unless the function is a template, the compiler isn't guaranteed to have 
 the
 source available.  What if it's a binary-only library?

It is incompatible with the archaic compilation model chosen by one D
implementation.

This special treatment of templates has got to end.

 2.  The modifiers are part of a contract and part of the public API.  What if 
 some
 function just happens to be pure now, but you consider that an implementation
 detail, not part of its specification?  Client code may rely on this, not
 realizing it's an implementation detail.  Then, when you make the function 
 impure,
 your client code will break.

That may or may not be a compelling argument against always
auto-detecting pure.  It seems stronger as an argument against having
pure as a language feature at all.  (How can you know ahead of time that
a logically pure function will remain physically pure?)


-- 
Rainer Deyke - rain...@eldwood.com


Re: ddmd

2010-08-21 Thread Rainer Deyke
On 8/21/2010 15:18, Nick Sabalausky wrote:
 All good points, but one problem with moving DMD's official source from C++ 
 to D is it would make it difficult to port DMD to new platforms. Nearly 
 every platform under the sun has a C++ compiler, but not so much for D 
 compilers.

Why would that matter?  Why would you want to compile the compiler for a
platform, on that platform?


-- 
Rainer Deyke - rain...@eldwood.com


Re: ddmd

2010-08-21 Thread Rainer Deyke
On 8/21/2010 17:14, Nick Sabalausky wrote:
 Rainer Deyke rain...@eldwood.com wrote in message 
 news:i4pju0$1pj...@digitalmars.com...
 Why would that matter?  Why would you want to compile the compiler for a
 platform, on that platform?

 
 To use that platform to compile something.

It took me a while to make any sense of that statement.  You're talking
about a situation where no compiler binary exists that runs on your
platform, so you want to compile the compiler for that platform on the
platform itself, right?

The platform the compiler targets and the platform on which the compiler
runs are orthogonal issues.  If you want the compiler to produce native
binaries on platform X, it must first be able to target platform X.  If
the compiler can target platform X, then it is easy enough to create a
binary of the compiler that runs on platform X by compiling on platform
A.  From there it's a simple matter to write a script that compiles,
packages, and uploads the compiler for all supported platforms.  It
should therefore never be necessary to compile the compiler itself on
platform X.  Or am I missing something?


-- 
Rainer Deyke - rain...@eldwood.com


Re: Why foreach(c; someString) must yield dchar

2010-08-20 Thread Rainer Deyke
On 8/20/2010 10:44, Simen kjaeraas wrote:
 First off, char, wchar, and dchar are special cases already - they're
 basically byte, short, and int, but are treated somewhat differently.

They're only special cases when placed in a built-in array.  In any
other container, they behave like normal types - unless the container
uses built-in arrays internally, in which case it may not work at all.

I have no objection to a string type that uses utf-8 internally but
iterates over full code points.  My objection is specifically to
special-casing built-in arrays to behave differently from all other
arrays when instantiated on 'char' and 'wchar'.  Give that string type a
name of its own (and keep 'char[]' as a simple array) and my objection
goes away.

Again, I ask: what about 'Array!char'?


-- 
Rainer Deyke - rain...@eldwood.com


Re: Why foreach(c; someString) must yield dchar

2010-08-19 Thread Rainer Deyke
On 8/19/2010 03:56, Jonathan Davis wrote:
 The problem is that chars are not characters. They are UTF-8 code
 units.

So what?  You're acting like 'char' (and specifically 'char[]') is some
sort of unique special case.  In reality, it's just one case of encoded
data.  What about compressed data?  What about packed arrays of bits?
What about other containers?

There's a useful generic idiom for iterating over a sequence of A as if
it was a sequence of B: the adapter range.  Narrow strings aren't
special enough to deserve special language support.


-- 
Rainer Deyke - rain...@eldwood.com


Re: Why foreach(c; someString) must yield dchar

2010-08-18 Thread Rainer Deyke
On 8/18/2010 20:37, dsimcha wrote:
 I've been hacking in Phobos and parallelfuture and I've come to the conclusion
 that having typeof(c) in the expression foreach(c; string.init) not be a dchar
 is simply ridiculous.

I have long ago come to the opposite conclusion.  An array of 'char'
should act like any other array.  If you want a sequence of 'dchar' that
is internally stored as an array of 'char', don't call it 'char[]'.

You propose to fix a special case by adding more special cases.  This
will increase, not decrease, the number of cases that will need special
treatment in generic code.

Iterating over a sequence of 'char' as a sequence of 'dchar' is very
useful.  Implementing this functionality as a language feature, tied to
the built-in array type, is just plain wrong.

 static assert(is(typeof({
 foreach(elem; T.init) {
 return elem;
 }
 assert(0);
 }) == ElementType!(T)));
 
 Looks reasonable.  FAILS on narrow strings.

Because ElementType!(string) is broken.

 size_t walkLength1(R)(R input) {
 size_t ret = 0;
 foreach(elem; input) {
 ret++;
 }
 
 return ret;
 }
 
 size_t walkLength2(R)(R input) {
 size_t ret = 0;
 while(!input.empty) {
ret++;
input.popFront();
 }
 
 return ret;
 }
 
 assert(walkLength1(stuff) == walkLength2(stuff));
 
 FAILS if stuff is a narrow string with characters that aren't a single code 
 point.

Because 'popFront' is broken for narrow strings.

 void printRange(R)(R range) {
 foreach(elem; range) {
 write(elem, ' ');
 }
 writeln();
 }
 
 Prints garbage if range is a string with characters that aren't a single code
 point.

Prints bytes from the string separated by spaces.  This may be
intentional behavior if the parser on the other side is not utf-aware.


-- 
Rainer Deyke - rain...@eldwood.com


Re: Why foreach(c; someString) must yield dchar

2010-08-18 Thread Rainer Deyke
On 8/18/2010 21:12, Jonathan M Davis wrote:
 The one thing about it that bugs me is that it means 
 that foreach acts differently with chars and wchars then it does with 
 everything 
 else, but really, that's a _lot_ less of an issue than the problems that you 
 get 
 with generic programming where you have to special case strings all over the 
 place.

False dichotomy.  If foreach acts differently with chars and wchars than
it does with everything else, then you /do/ need to special case strings
all over the place.

Thought experiment: what happens if you iterate not over 'char[]', but
over 'Array!char'?


-- 
Rainer Deyke - rain...@eldwood.com


Re: Destructor semantics

2010-08-12 Thread Rainer Deyke
On 8/12/2010 06:59, Joe Greer wrote:
 Logically speaking if an object isn't destructed, then it lives forever 
 and if it continues to hold it's resource, then we have a programming 
 error.  The GC is for reclaiming memory, not files.  It can take a long 
 time for a GC to reclaim an object and you surely don't want a file 
 locked for that long anymore than you want it held open forever.

Furthermore, the GC is conservative, so it isn't guaranteed to collect
any particular object at all, even if manually invoked.  (This also
means that any D program that relies on the GC potentially leaks memory.)


-- 
Rainer Deyke - rain...@eldwood.com


Re: Destructor semantics

2010-08-10 Thread Rainer Deyke
On 8/10/2010 16:59, foobar wrote:
 Steven Schveighoffer Wrote:
 what happens when GC destroys a C?
 
 C::~this(); // auto generated
   B::~this(); // so good so far 
 A::~this(); // oops!  the a is gone, program vomits bits all over itself 
 and
 chokes to death.
 
 -Steve
 
 This can only happen if you use delete on a class instance. My
 understanding was that this is going to be removed from D2.

Same problem without 'delete':

class A {
  void dispose();
}

struct B {
  A a;
  ~this() { a.dispose(); }
}

class C {
  B b;
}

C::~this(); // auto generated
  B::~this(); // so good so far
A::dispose(); // oops!


-- 
Rainer Deyke - rain...@eldwood.com


Re: D's treatment of values versus side-effect free nullary functions

2010-07-26 Thread Rainer Deyke
On 7/25/2010 13:41, Jim Balter wrote:
 But there are
 (at least) two problems: 1) you can't be certain that the code will be
 run at run time at all -- in generic code you could easily have function
 invocations with constant values that would fail in various ways but the
 function is never run with those values because of prior tests.

That's a good point - but at the same time, I'm having trouble thinking
of any sensible example where this would be an issue.  It would require
at all of the following factors:
  - The function itself is a candidate for CTFE.
  - The arguments to the function can be evaluated at compile time, but
are not simple constants (i.e. they depend on template parameters).
  - The code that calls the function depends on run-time parameters.
  - The function is never called with parameters that prevent it from
terminating, but the compiler is unable to determine that this is the case.

Something like this would work, I guess, but it's horribly contrived:

int terminate_if_nonzero(int i) {
  return (i != 0) ? i : terminate_if_nonzero(i);
}

void f(int i)(bool call_it) {
  if (call_it) terminate_if_nonzero(i);
}

bool get_input_and_return_false() {
  // This function is not a CTFE candidate because it performs I/O.
  read_input();
  return false;
}

void main() {
  f!(0)(get_input_and_return_false());
}


-- 
Rainer Deyke - rain...@eldwood.com


Re: D's treatment of values versus side-effect free nullary functions

2010-07-26 Thread Rainer Deyke
On 7/26/2010 18:30, Jim Balter wrote:
 Consider some code that calls a pure function that uses a low-overhead
 exponential algorithm when the parameter is small, and otherwise calls a
 pure function that uses a high-overhead linear algorithm. The calling
 code happens not to be CTFEable and thus the test, even though it only
 depends on a constant, is not computed at compile time. The compiler
 sees two calls, one to each of the two functions, with the same
 parameter passed to each, but only one of the two will actually be
 called at run time. Trying to evaluate the low-overhead exponential
 algorithm with large parameters at compile time would be a lose without
 a timeout to terminate the attempt. It might be best if the compiler
 only attempts CTFE if the code explicitly requests it.

It seems to me that you're describing something like this:

  if (x < some_limit) {
return some_function(x);
  } else {
return some_other_function(x);
  }

This does not pose a problem, assuming 'some_limit' is a compile-time
constant.  If 'x' is known at compile time, the test can be performed at
compile time.  If 'x' is not known at compile time, neither of the
function invocations can be evaluated at compile time.

The problem does exist in this code:

  if (complex_predicate(x)) {
return some_function(x);
  } else {
return some_other_function(x);
  }

...but only if 'complex_predicate' is not a candidate for CTFE but the
other functions are.  (This can happen if 'complex_predicate' performs
any type of output, including debug logging, so the scenario is not
entirely unlikely.)

Actually I'm not entirely sure if CTFE is even necessary for producing
optimal code.  The optimizations enabled by CTFE seem like a subset of
those enabled by aggressive inlining combined with other common
optimizations.


-- 
Rainer Deyke - rain...@eldwood.com


Re: [OT] The Clay Programming Language

2010-07-26 Thread Rainer Deyke
On 7/26/2010 09:07, Lurker wrote:
 Interesting generics
 
 http://www.reddit.com/r/programming/comments/ctmxx/the_clay_programming_language/

This looks very much like the programming language that I was /going/ to
write myself when I had some free time.  Your post just justified the
hours I spent reading the D newsgroups even though my interest in D has
almost died.  Thanks for posting!


-- 
Rainer Deyke - rain...@eldwood.com


Re: D's treatment of values versus side-effect free nullary functions

2010-07-24 Thread Rainer Deyke
On 7/24/2010 15:34, Jim Balter wrote:
 The point about difficulty goes to why this is not a matter of the
 halting problem. Even if the halting problem were decidable, that would
 not help us in the slightest because we are trying to solve a
 *practical* problem. Even if you could prove, for every given function,
 whether it would halt on all inputs, that wouldn't tell you which ones
 to perform CTFE on -- we want CTFE to terminate before the programmer
 dies of old age. The halting problem is strictly a matter of theory;
 it's ironic to see someone who has designed a programming language based
 on *pragmatic* rather than theoretical considerations to invoke it.

That's exactly backwards.  It's better to catch errors at compile time
than at run time.  A program that fails to terminate and fails to
perform I/O is a faulty program.  (A function that performs I/O is
obviously not a candidate for CTFE.)  I'd rather catch the faulty
program by having the compiler lock up at compile time than by having
the compiled program lock up after deployment.  Testing whether the
program terminates at compile time by attempting to execute the program
at compile time is a feature, not a bug.


-- 
Rainer Deyke - rain...@eldwood.com


Re: Are iterators and ranges going to co-exist?

2010-07-20 Thread Rainer Deyke
On 7/20/2010 01:39, Peter Alexander wrote:
 Iterator is probably a bad name for describing this concept, as it
 implies that they have the same usage as ranges, but they do not. An
 iterator/cursor points to one element -- a generic pointer if you like.
 Ranges define a self-sufficient machine for iteration, which makes them
 overkill (and unwieldy) when you just want to refer to one element.

I agree with this.  Ranges are great for iterating, but iterators are
better for defining ranges.  This leads to confusion.

The great strength of STL-type iterators is that you can easily refer to
any single element or any range of elements out of a sequence.

Take, for example, the 'find' algorithm.  When I use 'find' in C++, I
use it to find a position, not an element.  I can do any of the following:
  - Iterate through all the items after the found item.
  - Iterate through all the items before the found item.
  - Iterate through all the items before the found item, and then
iterate through all the items after the found item, with just a single
search.
  - Find two items (in a random-access range) and compare the iterators
to see which one comes first.
  - Iterate through all the items /between/ two found items.

The last one is especially interesting to me.  STL-style iterators allow
me to easily define a range by specifying two end points, even if the
end points come from different sources.  I don't think this is possible
with D-style ranges.  It's certainly not possible in any clean way,
because D-style ranges have no provision for specifying individual
positions in a sequence.


-- 
Rainer Deyke - rain...@eldwood.com


Re: What are AST Macros?

2010-07-13 Thread Rainer Deyke
On 7/13/2010 01:03, Nick Sabalausky wrote:
 Rainer Deyke rain...@eldwood.com wrote in message 
 news:i1gs16$1oj...@digitalmars.com...
 The great strength of string mixins is that you can use them to add the
 AST macros to D.  The great weakness of string mixins is that doing so
 requires a full (and extendable) CTFE D parser, and that no such parser
 is provided by Phobos.
 
 Seems to me that would lead to unnecessary decreases in compilation 
 performance. Such as superfluous re-parsing. And depending how exactly DMD 
 does CTFE, a CTFE D parser could be slower than just simply having DMD do 
 the parsing directly.

True. I tend to ignore compile-time costs.  The performance of computers
is increasing exponentially.  The length of the average computer program
is more or less stable.  Therefore this particular problem will
eventually solve itself.

Of course, this is just my perspective as a developer who uses a
compiler maybe twenty times a day.  If I was writing my own compiler
which was going to be used thousands of times a day by thousands of
different developers, I'd have a different attitude.


-- 
Rainer Deyke - rain...@eldwood.com


Re: What are AST Macros?

2010-07-12 Thread Rainer Deyke
On 7/12/2010 19:41, Nick Sabalausky wrote:
 I already agreed to that part (For writing, yes...). But there are other 
 uses that *do* parse, and others that do both. The point is NOT that string 
 mixins are *always* unsatisfactory as a replacement for AST macros. The 
 point is that *there are perfectly legitimate use-cases* where string mixins 
 are unsatisfactory as a replacement for AST macros. I think you've already 
 agreed to this in other posts.

The great strength of string mixins is that you can use them to add the
AST macros to D.  The great weakness of string mixins is that doing so
requires a full (and extendable) CTFE D parser, and that no such parser
is provided by Phobos.

It would be interesting to try the extendable CTFE D parser route
instead of the AST macros as built-in primitive route.  For example,
you could use it to add new lexical tokens to D, something that would be
impossible with AST macros.


-- 
Rainer Deyke - rain...@eldwood.com


Re: mangle

2010-07-01 Thread Rainer Deyke
On 7/1/2010 19:32, Jonathan M Davis wrote:
 By the way, why _does_ D mangle its names? What's the advantage? I understood 
 that C++ does it because it was forced to back in the days when it was 
 transformed into C code during compilation but that it's now generally 
 considered a legacy problem that we're stuck with rather than something that 
 would still be considered a good design decision.
 
 So, why did D go with name mangling? It certainly makes stuff like stack 
 traces 
 harder to deal with. I've never heard of any advantage to name mangling, only 
 disadvantages.

Because DMD is stuck with a C-age linker.


-- 
Rainer Deyke - rain...@eldwood.com


Re: mangle

2010-07-01 Thread Rainer Deyke
On 7/1/2010 20:34, Jonathan M Davis wrote:
 On Thursday, July 01, 2010 19:13:02 Rainer Deyke wrote:
 Because DMD is stuck with a C-age linker.
 
 Well, I guess that it just goes to show how little I understand about exactly 
 how linking works when I don't understand what that means. After all, C 
 doesn't 
 using name mangling. Does that mean that name mangling it meant as a 
 namespacing 
 tool to ensure that no D function could possibly have the same linking name 
 as a 
 C function?

That, and to allow for overloaded functions, functions with the same
name in different modules, and member functions.  Each symbol with
external linkage must map to a single unique identifier.  The concerns
are exactly the same as those in C++.


-- 
Rainer Deyke - rain...@eldwood.com


Re: Using ()s in @property functions

2010-06-28 Thread Rainer Deyke
On 6/28/2010 20:40, dsimcha wrote:
 Once enforcement of @property is enabled, we need to decide whether calling an
 @property function using ()s should be legal.

No, we don't.  Walter does.


  In other words, should
 @property **require** omission of ()s or just allow it?  My vote is for just
 allowing omission, because I've run into the following ambiguity while
 debugging std.range.  Here's a reduced test case:
 
 struct Foo {
 uint num;
 
 @property ref uint front() {
 return num;
 }
 }
 
 void main() {
 Foo foo;
 uint* bar = &foo.front;  // Tries to return a delegate.
 }

Allowing parentheses also leads to ambiguity:

  struct Foo {
  uint num;

  @property int delegate() front() {
  return delegate (){ return 5; };
  }
  }


  void main() {
  Foo foo;
  auto bar = foo.front();  // What is the type of 'bar'?
  }

The problem is that 'front' refers to two things: the property and its
accessor function.  So long as this is the case, ambiguity is unavoidable.


-- 
Rainer Deyke - rain...@eldwood.com


Re: Wide characters support in D

2010-06-08 Thread Rainer Deyke
On 6/8/2010 13:57, bearophile wrote:
 I hope we'll soon have computers with 200+ GB of RAM where using
 strings that use less than 32-bit chars is in most cases a premature
 optimization (like today is often a silly optimization to use arrays
 of 16-bit ints instead of 32-bit or 64-bit ints. Only special
 situations found with the profiler can justify the use of arrays of
 shorts in a low level language).

Off-topic, but I don't need a profiler to tell me that my 1024x1024x1024
arrays should use shorts instead of ints.  And even when 200GB becomes
common, I'd still rather not waste that memory by using twice as much
space as I have to just because I can.


-- 
Rainer Deyke - rain...@eldwood.com


Re: Containers I'd like to see in std.containers

2010-05-30 Thread Rainer Deyke
On 5/30/2010 15:53, Philippe Sigaud wrote:
 There are some simple containers I'd like to see in std.containers:
 
 - a priority queue
 - a heap
 - a stack, a queue
 - a set

Any container that supports pushing and popping on one end can be used
as a stack.  Any container that supports pushing on one end and popping
on the other can be used as a queue.  I don't think either of these need
their own container type.

The others, sure.


-- 
Rainer Deyke - rain...@eldwood.com


Re: Poll: Primary D version

2010-05-23 Thread Rainer Deyke
On 5/23/2010 07:33, Andrei Alexandrescu wrote:
 On 05/23/2010 12:30 AM, Rainer Deyke wrote:
 There is no way to define this function with the correct semantics in D.
   'toStringz' must append a null character to the string, therefore it
 cannot return a pointer to the original string data in the general case.
   If you pass the resulting string to a function that mutates it, then
 the changes will not be reflected in the original string.

 If you pass the resulting string to a function that does /not/ mutate
 it, then that function should be defined to take a 'const char *'.
 
 There is a way, you could simply allocate a copy plus the \0 on the GC
 heap. In fact that's what happens right now.

No, the problem is getting any changes to the copy back to the original.
 It can be done, but not with a simple conversion function.


-- 
Rainer Deyke - rain...@eldwood.com


Re: Poll: Primary D version

2010-05-22 Thread Rainer Deyke
On 5/22/2010 23:16, Mike Parker wrote:
 That's not the problem. The problem is this:
 
 const(char)* toStringz(const(char)[] s);
 
 There's no equivalent for:
 
 char *toStringz(char[] s);
 
 Hence the need to cast away const or use a wrapper for non-const char*
 args.

There is no way to define this function with the correct semantics in D.
 'toStringz' must append a null character to the string, therefore it
cannot return a pointer to the original string data in the general case.
 If you pass the resulting string to a function that mutates it, then
the changes will not be reflected in the original string.

If you pass the resulting string to a function that does /not/ mutate
it, then that function should be defined to take a 'const char *'.


-- 
Rainer Deyke - rain...@eldwood.com


Re: envy for Writing Go Packages

2010-05-07 Thread Rainer Deyke
On 5/7/2010 11:55, Walter Bright wrote:
 Source code could look something like:
 
 import http.d_repository.foo.version1_23;
 
 and the compiler could interpret http as meaning the rest is an
 internet url, foo is the package name, and version1_23 is the particular
 version of it.

I like this.  The only question is, how do you handle computers without
an internet connection?


-- 
Rainer Deyke - rain...@eldwood.com


Re: JavaScript is the VM to target for D

2010-04-24 Thread Rainer Deyke
On 4/24/2010 09:27, Adam D. Ruppe wrote:
 If I had my way, I'd just be rid of the virtual machine altogether. Simply run
 native programs as a restricted user. (Indeed, I'd run the browser itself
 with that restricted user, then let it create whatever processes it
 wants, possibly stripping the child processes of more privileges.) The
 operating system keeps it from doing anything evil.

Congratulations, you just invented ActiveX.  I hope you like your
platform lockdown and your security vulnerabilities.

 X-Accept-Code: linux64; linux; win32

99% of web pages will offer just the win32 version.


-- 
Rainer Deyke - rain...@eldwood.com


Re: JavaScript is the VM to target for D

2010-04-24 Thread Rainer Deyke
On 4/24/2010 14:29, Adam D. Ruppe wrote:
 On Sat, Apr 24, 2010 at 01:53:10PM -0600, Rainer Deyke wrote:
 Congratulations, you just invented ActiveX.  I hope you like your
 platform lockdown and your security vulnerabilities.
 
 ActiveX controls don't run as a limited user account. That's the key here:
 the entire browser should be running as a restricted user, and it creates
 processes even more restricted than itself.

Running the browser as a restricted user is good (and indeed necessary),
but when you're running native code, you're only as secure your OS and
CPU allow.  Running on a VM provides an additional layer of insulation.

I like native code, but only for applications that I choose to install.


-- 
Rainer Deyke - rain...@eldwood.com


Re: value range propagation for _bitwise_ OR

2010-04-11 Thread Rainer Deyke
How about this?

uint fill_bits(uint min_v, uint max_v) {
  uint mask = 0;
  for (int i = 0; i < 32; ++i) {
    if ((min_v | (1 << i)) <= max_v) mask |= (1 << i);
  }
  return mask;
}

max_c = min(
  max_a | fill_bits(min_b, max_b),
  max_b | fill_bits(min_a, max_a));


-- 
Rainer Deyke - rain...@eldwood.com


Re: value range propagation for _bitwise_ OR

2010-04-11 Thread Rainer Deyke
On 4/11/2010 11:16, Ali Çehreli wrote:
 Rainer Deyke wrote:
 How about this?

 uint fill_bits(uint min_v, uint max_v) {
   uint mask = 0;
    for (int i = 0; i < 32; ++i) {
  if ((min_v | (1 << i)) <= max_v) mask |= (1 << i);
   }
   return mask;
 }

 max_c = min(
   max_a | fill_bits(min_b, max_b),
   max_b | fill_bits(min_a, max_a));


 
 That proposal looks at the two pairs of a and b separately. For example,
 works with min_a and max_a, and decides a bit pattern.
 
 What if the allowable bit pattern for the b pair has 0 bits that would
 be filled with an value of a that was missed by fill_bits for a? Imagine
 a value of a, that has little number of 1 bits, but one of those bits
 happen to be the one that fills the hole in b...

The intention of fill_bits is to create a number that contains all of
the bits of all of the numbers from min_v to max_v.  In other words:

min_v | (min_v + 1) | (min_v + 2) | ... | (max_v - 1) | max_v

It does this by considering each bit separately.  For each bit 'i', is
there a number 'n' with that bit set such that 'min_v <= n <= max_v'?

'min_v | (1 << i)' is my attempt at calculating the smallest number with
bit 'i' set that is at least 'min_v'.  This is incorrect.
Correct would be
'(min_v & (1 << i)) ? min_v : ((min_v >> i) << i) | (1 << i)'.

My other mistake is this bit:
  max_c = min(
 max_a | fill_bits(min_b, max_b),
 max_b | fill_bits(min_a, max_a));
This is my attempt to get a tighter fit than
'fill_bits(min_a, max_a) | fill_bits(min_b, max_b)'.  It doesn't work
correctly, as you have pointed out.

Here is my revised attempt, with those errors corrected:

uint fill_bits(uint min_v, uint max_v) {
  uint mask = min_v;
  for (int i = 0; i < 32; ++i) {
    if ((min_v & (1 << i)) == 0) {
      if ((((min_v >> i) << i) | (1 << i)) <= max_v) {
mask |= (1  i);
  }
}
  }
  return mask;
}

max_c = fill_bits(min_a, max_a) | fill_bits(min_b, max_b);


-- 
Rainer Deyke - rain...@eldwood.com


Re: value range propagation for _bitwise_ OR

2010-04-10 Thread Rainer Deyke
On 4/10/2010 11:52, Andrei Alexandrescu wrote:
 I think this would work:
 
 uint maxOR(uint maxa, uint maxb) {
  if (maxa < maxb) return maxOR(maxb, maxa);
  uint candidate = 0;
  foreach (i, bit; byBitDescending(maxa)) {
  if (bit) continue;
  auto t = candidate | (1 << (31 - i));
  if (t <= maxb) candidate = t;
 }
 return maxa | candidate;
 }

This looks wrong.  Your function, if I understand it correctly, flips
all the bits in 'maxb' (excluding the leading zeros).  If 'maxb' is
exactly 1 less than a power of 2, then 'candidate' will be 0.  Now
consider a in [0, 16], b in [0, 15].  Your function will produce 16, but
the real maximum is 31.

For maximum accuracy, you have to consider the minima as well as the
maxima when calculating the new maximum.  With 'a' in [16, 16] and 'b'
in [16, 16], the new range can only be [16, 16].  With 'a' in [0, 16]
and 'b' in [0, 16], the correct new range is [0, 31].


-- 
Rainer Deyke - rain...@eldwood.com


Re: @pinned classes

2010-04-02 Thread Rainer Deyke
On 4/1/2010 15:22, div0 wrote:
 You can't easily or reasonably find the bits of the stack used by D
 code, so you can't scan the stack, and therefore you can't move any object.

You could, if D was modified to register the parts of the stack that it
uses in some sort of per-thread stack registry.  It's not cheap, but it
works, and it can be optimized until it's cheap enough.


-- 
Rainer Deyke - rain...@eldwood.com


Re: Need help fixing The linker can't handle *.d.obj issue

2010-04-02 Thread Rainer Deyke
On 4/2/2010 00:14, Walter Bright wrote:
 BTW, it's highly unusual to have that file naming convention.

I use the same naming convention, independently of CMake.  I like being
able to tell if a .obj file came from a .c or .cpp (or .d) source.  I
like not getting linker errors in the rare case where I have both X.c
and X.cpp in the same directory.


-- 
Rainer Deyke - rain...@eldwood.com


Re: octal literals, was Re: Implicit enum conversions are a stupid PITA

2010-03-26 Thread Rainer Deyke
On 3/25/2010 23:40, Walter Bright wrote:
 Rainer Deyke wrote:
 I don't mind octal literals, but '0177' is a horrible syntax.  *Every*
 *single* *time* that I used that syntax in C or C++, I really meant to
 use a decimal.
 
 I'm curious what tempted you to use a leading 0 in the first place.

Padding to get an array literal to align properly.  Something like this:

int a[3][3] = {
  {001, 002, 003},
  {010, 020, 030},
  {100, 200, 300},
};

I could have used (and should have used, and eventually did use) spaces
instead, but I don't think they look as nice in this instance.


-- 
Rainer Deyke - rain...@eldwood.com


octal literals, was Re: Implicit enum conversions are a stupid PITA

2010-03-25 Thread Rainer Deyke
On 3/25/2010 17:52, Walter Bright wrote:
 One thing is clear, though. 0177 is never going to be a decimal number
 in D, because it will silently and disastrously break code translated
 from C. The only choice is to support 0177 as octal or make it a syntax
 error. I'd rather support them.

I don't mind octal literals, but '0177' is a horrible syntax.  *Every*
*single* *time* that I used that syntax in C or C++, I really meant to
use a decimal.


-- 
Rainer Deyke - rain...@eldwood.com


Re: Associative Arrays need cleanout method or property to help

2010-03-20 Thread Rainer Deyke
On 3/20/2010 13:21, bearophile wrote:
 Moritz Warning:
 The Python docs say that common usage is mostly about very small
 aas with frequent accesses but rare changes.
 
 D and Python will probably have different usage patterns for AAs. In
 Python dicts (AAs) are used to give arguments to functions (that's
 why recursive functions are so slow in Python) and to represent most
 name spaces (there are few exceptions like the __slots__, but they
 are uncommon), so in a big percentage of cases they contain few
 string-value pairs (they have an optimization for string-only keys),
 that's why python dicts have an optimization for less than about 8
 key-value pairs.

I don't know about Java or D, but my usage pattern for
boost::unordered_map in C++ is rather similar to that of Python dicts:
  - Most keys are strings (either std::string or my own interned string
class).
  - The amount of read accesses far exceeds the amount of write
accesses.  Many AAs are write-once, read-often.
  - Small AAs are common, although probably not as common as in Python.
  - I even use boost::unordered_map as the dictionary type in my own
Python-like scripting language.  This accounts for around 10% of my
total usage of boost::unordered_map.


-- 
Rainer Deyke - rain...@eldwood.com


Re: 64-bit and SSE

2010-03-02 Thread Rainer Deyke
On 3/2/2010 14:28, Don wrote:
 retard wrote:
 Why not dynamic code path selection:

  if (cpu_capabilities & SSE4_2)
    run_fast_method();
  else if (cpu_capabilities & SSE2)
   run_medium_fast_method();
 else
   run_slow_method();

 One could also use higher level design patterns like abstract
 factories here.
 
 The method needs to be fairly large for that to be beneficial. For
 fine-grained stuff, like basic operations on 3D vectors, it doesn't work
 at all. And that's one of the primary use cases for SSE.

Why not do it at the largest possible level of granularity?

int main() {
  if (cpu_capabilities & SSE4_2) {
    return run_fast_main();
  } else if (cpu_capabilities & SSE2) {
return run_medium_fast_main();
  } else {
return run_slow_main();
  }
}

The compiler should be able to do this automatically by compiling every
single function in the program N times with N different code generation
setting.  Executable size will skyrocket, but it won't matter because
executable size is rarely a significant concern.


-- 
Rainer Deyke - rain...@eldwood.com


Re: A rationale for pure nothrow --- @pure @nothrow (and nothing else changes)

2010-02-26 Thread Rainer Deyke
On 2/26/2010 14:48, Don wrote:
 I really thought the explanation that we made all attibutes use the @
 form, except those where it was prevented by historical precedent was
 quite defensible.

I'd like to see *all* adjectives in the language, including those
inherited from other languages, turned into attributes.  That would be
consistent.  Historical precedent is less important to me than a clean
language.

To deal with the huge body of code that uses e.g. 'public' instead of
'@public', the compiler could be modified to accept both forms.  Then
the non-attribute form could be deprecated and finally removed.


-- 
Rainer Deyke - rain...@eldwood.com


Re: Casts, especially array casts

2010-02-24 Thread Rainer Deyke
On 2/24/2010 14:30, Jonathan M Davis wrote:
 C++ is the only language that I'm aware of which has multiple types of 
 casts, and I don't think that it really adds anything of benefit.

Really?  I think the casts are one of the areas where C++ has a huge
advantage over C.  Maybe the new casts are not as important as templates
and RAII, but they're up there in the same category.  The best part of
the C++ cast system is that it uses template function syntax, which
allows user-defined casts to use the same syntax.

Casts are yet another issue where moving from C++ to D feels like
regressing to C.

 Can't the compiler figure out which cast is supposed to be used in a given 
 situation and deal with it internally?

That's not the point.

 Having multiple casts just confuses 
 things (certainly, I don't think that I know anyone who fully understands 
 the C++ casts).

If you don't understand C++ casts, then you don't understand D casts.
They do the same thing; the C++ version is just more explicit about what
it does (and therefore safer and easier to read).


-- 
Rainer Deyke - rain...@eldwood.com


Re: Casts, especially array casts

2010-02-24 Thread Rainer Deyke
On 2/24/2010 17:09, bearophile wrote:
 C++ has 4+1 casts, they are a good amount of complexity, and it's not
 easy to learn their difference and purposes.

C++ casts perform three different logical functions:
  - Removing cv-qualifiers. (const_cast)
  - Reinterpreting raw bytes. (reinterpret_cast)
  - Converting values. (static_cast/dynamic_cast)
These are clearly distinct functions.  Accidentally performing the wrong
type of cast is clearly an error, and should be flagged by the compiler.

Ideally, casts should be (distinct) library functions, not language
features.

The only thing vaguely confusing about the C++ system is the distinction
between static_cast and dynamic_cast.  I wouldn't mind seeing those two
merged into a conversion_cast.


-- 
Rainer Deyke - rain...@eldwood.com

