Re: Linking 2 c++ libraries with D

2013-06-27 Thread Juan Manuel Cabo
On 06/27/2013 10:12 PM, Milvakili wrote:
> I have successfully linked C++ with D.
> 
> However, when I create a dependency:
> cpp2->cpp1->d
> 
> when I compile with dmd
> 
> dmd cpp1.a cpp2.a file.d -L-lstdc++

I tried it and it works _perfectly_ for me, but instead of .a I
compiled the C++ files to .o

g++ -c cpp1.cpp
g++ -c cpp2.cpp
dmd cpp1.o cpp2.o file.d -L-lstdc++

(I had to comment the printf in cpp1.cpp)
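For anyone landing on this thread without the earlier messages, the pattern being tested is: compile the C++ side to object files, then let dmd drive the link. A minimal sketch of what a cpp1.cpp could look like is below — the names (lib1_triple, etc.) are hypothetical, not from the original post, and it uses a C-linkage wrapper, the simplest way to make a C++ function callable from D without dealing with C++ name mangling.

```cpp
// cpp1.cpp -- sketch of the C++ side. Names are hypothetical.
namespace lib1 {
    int triple(int x) { return 3 * x; }   // ordinary C++ code
}

// C-linkage wrapper: no C++ name mangling, so D can declare and
// call it directly.
extern "C" int lib1_triple(int x) {
    return lib1::triple(x);
}

// On the D side the matching declaration would be just:
//     extern (C) int lib1_triple(int x);
// and the build stays as in this post:
//     g++ -c cpp1.cpp
//     dmd cpp1.o file.d -L-lstdc++
```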

Running the program prints this output:

Testing calling C++ main from D

cbin2 c++ library called from:cbin.cpp

I'm on Kubuntu 12.04 64bit, using DMD v2.063.2.
I have the following libstdc++ packages installed,
as shown by:  dpkg -l libstd* | grep ^ii

libstdc++6  4.6.3-1ubuntu5
libstdc++6-4.4-dev  4.4.7-1ubuntu2
libstdc++6-4.6-dev  4.6.3-1ubuntu5

G++ version is:  g++ (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3

ldd on the executable shows that it's using these libraries:
linux-vdso.so.1 =>  (0x7fff989ff000)
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 
(0x7f9c0cbb7000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 
(0x7f9c0c99a000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x7f9c0c791000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 
(0x7f9c0c57b000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x7f9c0c1bc000)
/lib64/ld-linux-x86-64.so.2 (0x7f9c0cedb000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x7f9c0bebf000)


Hope that any of this was useful to you!

--jm





Re: fun project - improving calcHash

2013-06-23 Thread Juan Manuel Cabo
On 06/23/2013 09:20 PM, Juan Manuel Cabo wrote:
> On 06/23/2013 06:22 PM, Walter Bright wrote:
>> https://github.com/D-Programming-Language/dmd/blob/master/src/root/stringtable.c#L21
>>
>> Profiling shows the calcHash function is a significant contributor to 
>> compilation time (3.25% of total time). So making
>> it faster is a win. Even making dmd 1% faster would be a nice win - all 
>> those little drops add up.
>>
>> There are many, many string hash functions findable through google. Anyone 
>> want to spend the effort to make a faster
>> one? Remember, ya gotta prove it's faster!
>>
>> A nice timing test would be the time expended compiling Phobos. I would 
>> suggest that the 64 bit build of dmd be used
>> for timing tests.
>>
>> Also, be careful, many of those hash functions on the intarnets have a 
>> license that makes it unusable for dmd.
> 
> 
> I performed a quick test, and I don't think that the original function
> can be improved for speed (though it might be improved for less
> collisions):
> 
> https://gist.github.com/jmcabo/5847036
> 
> I used words and lines from the complete works of Shakespeare.
> I tested separating the loop from the switch, as was suggested
> in Timon Gehr's post. It didn't improve the speed when compiled
> with "g++ -O3", though it improved it a bit without -O3.
> 
> I also tested removing the multiplication by 37. It didn't
> improve the speed. With "g++ -O3" they are all the same.
> 
> So, unless I'm measuring it wrong, the algorithm is as fast
> as can be (as fast as just adding chars).
> 
> --jm
> 
> 

Oh, it might be improved by loading 128 bits at a time instead of 32 bits...
but that would only benefit strings of more than 16 bytes. Google's CityHash
seems tuned for large strings.
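For reference, the general shape of the function being benchmarked is roughly the following — a sketch only, based on this thread's description (a per-byte multiply-by-37 loop), NOT dmd's actual calcHash, which consumes the string four bytes at a time and folds the remainder in a switch:

```cpp
#include <cstddef>
#include <cstdint>

// Simplified multiply-by-37 string hash, per this thread's
// description. Not the real dmd calcHash.
uint32_t hash37(const char *s, size_t len) {
    uint32_t h = 0;
    for (size_t i = 0; i < len; ++i)
        h = h * 37 + static_cast<unsigned char>(s[i]);
    return h;
}
```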

--jm




Re: fun project - improving calcHash

2013-06-23 Thread Juan Manuel Cabo
On 06/23/2013 06:22 PM, Walter Bright wrote:
> https://github.com/D-Programming-Language/dmd/blob/master/src/root/stringtable.c#L21
> 
> Profiling shows the calcHash function is a significant contributor to 
> compilation time (3.25% of total time). So making
> it faster is a win. Even making dmd 1% faster would be a nice win - all those 
> little drops add up.
> 
> There are many, many string hash functions findable through google. Anyone 
> want to spend the effort to make a faster
> one? Remember, ya gotta prove it's faster!
> 
> A nice timing test would be the time expended compiling Phobos. I would 
> suggest that the 64 bit build of dmd be used
> for timing tests.
> 
> Also, be careful, many of those hash functions on the intarnets have a 
> license that makes it unusable for dmd.


I performed a quick test, and I don't think that the original function
can be improved for speed (though it might be improved for less
collisions):

https://gist.github.com/jmcabo/5847036

I used words and lines from the complete works of Shakespeare.
I tested separating the loop from the switch, as was suggested
in Timon Gehr's post. It didn't improve the speed when compiled
with "g++ -O3", though it improved it a bit without -O3.

I also tested removing the multiplication by 37. It didn't
improve the speed. With "g++ -O3" they are all the same.

So, unless I'm measuring it wrong, the algorithm is as fast
as can be (as fast as just adding chars).

--jm




Re: D vs Haskell

2013-06-23 Thread Juan Manuel Cabo
On 06/23/2013 05:42 AM, Joseph Rushton Wakeling wrote:
> On Saturday, 22 June 2013 at 21:45:48 UTC, Juan Manuel Cabo wrote:
>> Right, the author of the article used ldc. I'm always used
>> to dmd.
> 
> You know that you can use ldmd2 to invoke LDC using the same flags as DMD?
> 
>> This time I used LDC 0.11.0, with:
>>
>>  ldc2 -O -release -disable-boundscheck
>>
>> Times (minimum of 10 runs):
>>
>>  before: 1617 ms
>>
>>  after:  554 ms
>>
>> Conclusion: using "byWord" was 3.4 times faster with DMD
>> and 2.9 times faster with LDC.
> 
> 
> Any chance of GDC results? You can again use gdmd for a DMD-like interface.

The computer where I tested has Kubuntu 12.04, which is
too old even for gcc-4.7, and I cannot compile GDC there.
   The gdc version which comes in the repository fails
to compile the program, even after some rearranging so
that it doesn't use UFCS.

Anyway, using my "byWord" range to iterate the string
will surely be faster with GDC too, because tr and split can
be a little slow, especially since the string is big (5MB).

I tested a little more and found that with a handcoded
function instead of tr, and splitter instead of split,
it's almost as fast as the byWord version. The handcoded
function assumes ASCII and preallocates the resulting string
(with result.length = source.length).
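The handcoded function itself isn't shown in the post; as a sketch of the idea (a hypothetical reconstruction, assuming it mirrors tr("A-Za-z", "\n", "cs"): ASCII letters lowercased, runs of everything else collapsed to a single '\n', output preallocated to the source length):

```cpp
#include <string>

// Hypothetical reconstruction of the "handcoded function instead
// of tr" mentioned above. ASCII only.
std::string lowerWords(const std::string &src) {
    std::string result;
    result.reserve(src.size());      // preallocate once, as in the post
    bool inSeparator = true;         // squeeze runs of non-letters
    for (char c : src) {
        if (c >= 'A' && c <= 'Z') {
            result += static_cast<char>(c - 'A' + 'a');
            inSeparator = false;
        } else if (c >= 'a' && c <= 'z') {
            result += c;
            inSeparator = false;
        } else if (!inSeparator) {
            result += '\n';          // one separator per run
            inSeparator = true;
        }
    }
    return result;
}
```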

--jm




Re: D vs Haskell

2013-06-22 Thread Juan Manuel Cabo
On 06/22/2013 06:29 PM, Joseph Rushton Wakeling wrote:
> On 06/22/2013 10:11 PM, Juan Manuel Cabo wrote:
>> Testing against Complete Works of William Shakespeare (5.3 MiB
>> plaintext): http://www.gutenberg.org/ebooks/100
>> and using "dmd -O -inline -noboundscheck -release" on both
>> the last version of the author, and my version using "byWord"
>> I got these times (minimum of 10 runs):
>>
>> before: 2781 ms
>>
>> after:   805 ms
> 
> What about with GDC or LDC?
> 

Right, the author of the article used ldc. I'm always used
to dmd.
Keep in mind that he hasn't released the 9MB text that he
used, but instead pointed to that 5.3MB Shakespeare file.

This time I used LDC 0.11.0, with:

 ldc2 -O -release -disable-boundscheck

Times (minimum of 10 runs):

 before: 1617 ms

 after:  554 ms

Conclusion: using "byWord" was 3.4 times faster with DMD
and 2.9 times faster with LDC.

--jm





Re: D vs Haskell

2013-06-22 Thread Juan Manuel Cabo
On 06/22/2013 03:25 PM, Szymon Gatner wrote:
> Word counting problem in D and Haskell:
> 
> http://leonardo-m.livejournal.com/109201.html

I thought that the D time could be improved further with
little changes.

Testing against Complete Works of William Shakespeare (5.3 MiB
plaintext): http://www.gutenberg.org/ebooks/100
and using "dmd -O -inline -noboundscheck -release" on both
the last version of the author, and my version using "byWord"
I got these times (minimum of 10 runs):

before: 2781 ms

after:   805 ms


Here is the code, with a "byWord" range using std.ascii.isAlpha:


import std.stdio, std.conv, std.file, std.string,
   std.algorithm, std.range, std.traits, std.ascii;


auto hashCounter(R)(R items) if (isForwardRange!R) {
size_t[ForeachType!R] result;
foreach (x; items)
result[x]++;
return result.byKey.zip(result.byValue);
}

void main(string[] args) {
//Slow:
//  args[1]
//  .readText
//  .toLower
//  .tr("A-Za-z", "\n", "cs")
//  .split

//Faster:
args[1]
.readText
.byWord
.map!toLower()
.array
.hashCounter
.array
.sort!"-a[1] < -b[1]"()
.take(args[2].to!uint)
.map!q{ text(a[0], " ", a[1]) }
.join("\n")
.writeln;
}

/** Range that extracts words from a string. Words are
strings composed only of chars accepted by std.ascii.isAlpha() */
struct byWord {
string s;
size_t pos;
string word;

this(string s) {
this.s = s;
popFront();
}

@property bool empty() const {
return s.length == 0;
}

@property string front() {
return word;
}

void popFront() {
if (pos == s.length) {
//Mark the range as empty, only after popFront fails:
s = null;
return;
}

while (pos < s.length && !std.ascii.isAlpha(s[pos])) {
++pos;
}
auto start = pos;
while (pos < s.length && std.ascii.isAlpha(s[pos])) {
++pos;
}

if (start == s.length) {
//No more words. Range empty:
s = null;
} else {
word = s[start .. pos];
}
}
}

unittest {
assert([] == array(byWord("")));
assert([] == array(byWord("!@#$")));
assert(["a", "b"] == array(byWord("a b")));
assert(["a", "b", "c"] == array(byWord("a b c")));
}


--jm




Re: Low hanging fruit for optimizing loops

2013-06-07 Thread Juan Manuel Cabo

On Saturday, 8 June 2013 at 05:11:11 UTC, Walter Bright wrote:

On 6/7/2013 9:15 PM, Juan Manuel Cabo wrote:

Given the recent surge in interest for performance, I dusted
off a small test that I made long ago and determined myself
to find the cause of the performance difference.


It's great that you're doing this. You can track it down 
further by using inline assembler and trying different 
instruction sequences.


Also, obj2asm gives nicer disassembly :-)



Thanks!!

I now used inline assembler, and can confidently say
that the difference is because of the alignment.
   Changing the order of the cmp relative to the
increment didn't do anything.

Adding the right amount of 'nop' makes it run in

  957 ms, 921 μs, and 4 hnsecs

But if I overshoot it, or miss one, it goes back to

  1 sec, 438 ms, and 544 μs

Also, I couldn't use this instruction in D's asm{}

0f 1f 40 00   nop    DWORD PTR [rax+0x0]

and obj2asm doesn't disassemble it (it just puts "0f1f"
and gives incorrect asm for the next few instructions).

I'm now not entirely sure that aligning loop jumps would be
worthwhile though. They would have to be "leaf" loops
because any call made inside the loop would overshadow
the benefits (I was looping millions of times in my test).

Anyway, here is the new source:

import std.stdio;
import std.datetime;

int fiba(int n) {
asm {
naked;
push   RBP;
mov    RBP,RSP;
mov    RCX,RDI;
mov    ESI,0x1;
mov    EAX,0x1;
mov    EDX,0x2;
cmp    ECX,0x2;
jl     EXIT_LOOP;
nop;
nop; nop; nop; nop;
nop; nop; nop; nop;
nop; nop; nop; nop;
LOOP_START:
lea    EDI,[RSI+RAX*1];
mov    RSI,RAX;
mov    RAX,RDI;
inc    EDX;
cmp    EDX,ECX;
jle    LOOP_START;
EXIT_LOOP:
pop    RBP;
ret;
}
}

void main() {
auto start = Clock.currTime();
int r = fiba(1000_000_000);
auto elapsed = Clock.currTime() - start;
writeln(r);
writeln(elapsed);
}


--jm




Low hanging fruit for optimizing loops

2013-06-07 Thread Juan Manuel Cabo

Given the recent surge in interest for performance, I dusted
off a small test that I made long ago and determined myself
to find the cause of the performance difference.

I tested the linear version of fibonacci both in DMD and
in C with GCC. Without optimization switches, I'm happy
to see that the D version is faster. But when using the
switches, the C version takes 30% less time.

I'm including the disassembly here. The asm functions
are very very close to each other. Both loops are only
6 instructions long.

So I think it is a low hanging fruit because IMO the
speed difference is either because the GCC loop jump is
64bit aligned, or because of the order of the instructions
(there is an extra instruction between the loop increment
and the loop comparison, giving an opportunity for
parallelization).

OS:
   Kubuntu Linux 12.04  64bit

CPU:
   2.1GHz - AMD Athlon(tm) 64 X2 Dual Core Processor 4000+

Switches:
   dmd -O -inline -noboundscheck -release dtest.d
   gcc -O3 -o ctest ctest.c

Times:
   D:   1 sec, 430 ms, 207 μs, and 4 hnsecs
   C:   940 ms

D version:
-

import std.stdio;
import std.datetime;

int fibw(int n) { //Linear Fibonacci
int a = 1;
int b = 1;
for (int i=2; i <= n; ++i) {
int sum = a + b;
a = b;
b = sum;
}
return b;
}

void main() {
auto start = Clock.currTime();
int r = fibw(1000_000_000);
auto elapsed = Clock.currTime() - start;
writeln(r);
writeln(elapsed);
}


C Version:
-

#include <stdio.h>
#include <time.h>

int fibw(int n) { //Linear Fibonacci
int a = 1;
int b = 1;
int i;
for (i=2; i <= n; ++i) {
int sum = a + b;
a = b;
b = sum;
}
return b;
}

int main() {
clock_t start = clock();
int r = fibw(1000*1000*1000);
clock_t elapsed = clock() - start;
printf("%d\n", r);
printf("%d ms\n", (int)(elapsed * 1000 / CLOCKS_PER_SEC));
return 0;
}



D Version DISASM:

004681d0 <_D5dtest4fibwFiZi>:
  4681d0:   55               push   rbp
  4681d1:   48 8b ec         mov    rbp,rsp
  4681d4:   48 89 f9         mov    rcx,rdi
  4681d7:   be 01 00 00 00   mov    esi,0x1
  4681dc:   b8 01 00 00 00   mov    eax,0x1
  4681e1:   ba 02 00 00 00   mov    edx,0x2
  4681e6:   83 f9 02         cmp    ecx,0x2
  4681e9:   7c 0f            jl     4681fa <_D5dtest4fibwFiZi+0x2a>

   ; LOOP JUMP --->
  4681eb:   8d 3c 06         lea    edi,[rsi+rax*1]
  4681ee:   48 89 c6         mov    rsi,rax
  4681f1:   48 89 f8         mov    rax,rdi
  4681f4:   ff c2            inc    edx
  4681f6:   39 ca            cmp    edx,ecx
  4681f8:   7e f1            jle    4681eb <_D5dtest4fibwFiZi+0x1b>

   ; LOOP END
  4681fa:   5d               pop    rbp
  4681fb:   c3               ret


C Version DISASM:

00400860 <fibw>:
  400860:   83 ff 01         cmp    edi,0x1
  400863:   b8 01 00 00 00   mov    eax,0x1
  400868:   7e 1c            jle    400886 <fibw+0x26>
  40086a:   ba 02 00 00 00   mov    edx,0x2
  40086f:   b9 01 00 00 00   mov    ecx,0x1

  ; NOTICE THE nop (64bit alignment??):
  400874:   0f 1f 40 00      nop    DWORD PTR [rax+0x0]
  ; LOOP JUMP -->
  400878:   8d 34 01         lea    esi,[rcx+rax*1]
  40087b:   83 c2 01         add    edx,0x1
  40087e:   89 c1            mov    ecx,eax    ; REORDERED cmp
  400880:   39 d7            cmp    edi,edx
  400882:   89 f0            mov    eax,esi
  400884:   7d f2            jge    400878 <fibw+0x18>
  ; LOOP END
  400886:   f3 c3            repz ret
  400888:   90               nop
  400889:   90               nop



Both files were disassembled with:

   objdump -M intel -d

--jm



Re: Slow performance compared to C++, ideas?

2013-05-31 Thread Juan Manuel Cabo
On 05/31/2013 10:27 PM, Jonathan M Davis wrote:
> On Saturday, June 01, 2013 09:04:50 Manu wrote:
>> **applause**
>> Someone other than me said it, out loud!
>> This is a magnificent day! :)
> 
> Well, the discussions at dconf convinced me. Certainly, at this point, I 
> think 
> that the only semi-viable excuse for not making functions non-virtual by 
> default is the code breakage that it would cause, and given how surprisingly 
> minimal that is, I think that it's definitely worth it - especially when the 
> kind of folks whose code Walter is most worried about breaking are the guys 
> most interested in the change.
> 
> - Jonathan M Davis
> 

Making everything final by default would IMO kind of break
automated mock classes generation for unit testing,
automatic proxy class generation for DB entities, and
other OOP niceties.

And it wasn't just marking methods final which got the
D version of the raytracer in this thread faster than
the C++ version in the end. (it was a combination of
four or five things, which involved a bit of unrolling,
avoiding array literals, and so on).

--jm








Re: Slow performance compared to C++, ideas?

2013-05-30 Thread Juan Manuel Cabo
On 05/31/2013 02:15 AM, nazriel wrote:
> On Friday, 31 May 2013 at 01:26:13 UTC, finalpatch wrote:
>> Recently I ported a simple ray tracer I wrote in C++11 to D. Thanks to the 
>> similarity between D and C++ it was almost
>> a line by line translation, in other words, very very close. However, the D 
>> version runs much slower than the C++11
>> version. On Windows, with MinGW GCC and GDC, the C++ version is twice as 
>> fast as the D version. On OSX, I used Clang++
>> and LDC, and the C++11 version was 4x faster than the D version.  Since the 
>> comparison were between compilers that share
>> the same codegen backends I suppose that's a relatively fair comparison.  
>> (flags used for GDC: -O3 -fno-bounds-check
>> -frelease,  flags used for LDC: -O3 -release)
>>
>> I really like the features offered by D but it's the raw performance that's 
>> worrying me. From what I read D should
>> offer similar performance when doing similar things but my own test results 
>> is not consistent with this claim. I want
>> to know whether this slowness is inherent to the language or it's something 
>> I was not doing right (very possible
>> because I have only a few days of experience with D).
>>
>> Below is the link to the D and C++ code, in case anyone is interested to 
>> have a look.
>>
>> https://dl.dropboxusercontent.com/u/974356/raytracer.d
>> https://dl.dropboxusercontent.com/u/974356/raytracer.cpp
> 
> Greetings.
> 
> After a few quick changes I managed to get these results:
> [raz@d3 tmp]$ ./a.out
> rendering time 276 ms
> [raz@d3 tmp]$ ./test
> 346 ms, 814 μs, and 5 hnsecs
> 
> 
> ./a.out being binary compiled with clang++ ./test.cxx -std=c++11 -lSDL -O3
> ./test being binary compiled with ldmd2 -O3 -release -inline -noboundscheck 
> ./test.d (Actually I used rdmd with
> --compiler=ldmd2 but I omitted it because it was rather long cmd line :p)
> 
> 
> Here is source code with changes I applied to D-code (I hope you don't mind 
> repasting it): http://dpaste.dzfl.pl/84bb308d
> 
> I am sure there is way more room for improvement, and at minimum achieving 
> C++ performance.


You might also try changing:

float[3] t = mixin("v[]"~op~"rhs.v[]");
return Vec3(t[0], t[1], t[2]);

for:
Vec3 t;
t.v[0] = mixin("v[0] "~op~" rhs.v[0]");
t.v[1] = mixin("v[1] "~op~" rhs.v[1]");
t.v[2] = mixin("v[2] "~op~" rhs.v[2]");
return t;

and so on, avoiding the float[3] and the v[] operations (which would
loop, unless the compiler/optimizer unrolls them (didn't check)).

I tested this change (removing v[] ops) in Vec3 and in
normalize(), and it made your version slightly faster
with DMD (didn't check with ldmd2).

--jm




Re: Slow performance compared to C++, ideas?

2013-05-30 Thread Juan Manuel Cabo
On 05/31/2013 12:45 AM, finalpatch wrote:
> Hi FeepingCreature,
> 
> Thanks for the tip, getting rid of the array constructor helped a lot, 
> Runtime is down from 800+ms to 583ms (with LDC,
> still cannot match C++ though). Maybe I should get rid of all arrays and use 
> hardcoded x,y,z member variables instead,
> or use tuples.
> 
> On Friday, 31 May 2013 at 03:26:16 UTC, FeepingCreature wrote:
>> There's some issues involving the use of array literals - they
>> get allocated on the heap for no clear reason. Create a version
>> of your vector constructor that uses four floats, then call that
>> instead in your line 324.
> 

I just shaved 1.2 seconds with dmd by changing the dot function from:

float dot(in Vec3 v1, in Vec3 v2)
{
auto t = v1.v * v2.v;
auto p = t.ptr;
return p[0] + p[1] + p[2];
}

to:

float dot(in Vec3 v1, in Vec3 v2)
{
auto one = v1.v.ptr;
auto two = v2.v.ptr;
return one[0] * two[0]
+ one[1] * two[1]
+ one[2] * two[2];
}

Before:
2 secs, 895 ms, 891 μs, and 7 hnsecs
After:
1 sec, 648 ms, 698 μs, and 1 hnsec


For others who might want to try, I downloaded the necessary derelict files 
from:

http://www.dsource.org/projects/derelict/browser/branches/Derelict2

(DerelictUtil and DerelictSDL directories). And compiled with:

dmd -O -inline -noboundscheck -release raytracer.d \
derelict/sdl/sdl.d derelict/sdl/sdlfuncs.d \
derelict/sdl/sdltypes.d derelict/util/compat.d \
derelict/util/exception.d derelict/util/loader.d \
derelict/util/sharedlib.d -L-ldl


I also ran it with the -profile switch. Here are the top functions in trace.log:


======== Timer Is 3579545 Ticks/Sec, Times are in Microsecs ========

  Num          Tree        Func        Per
  Calls        Time        Time        Call

11834377   688307713   688307713          58  const(bool function(raytracer.Ray, float*)) raytracer.Sphere.intersect
 1446294  2922493954   582795433         402  raytracer.Vec3 raytracer.trace(const(raytracer.Ray), raytracer.Scene, int)
       1  1748464181   296122753   296122753  void raytracer.render(raytracer.Scene, derelict.sdl.sdltypes.SDL_Surface*)
  933910   309760738   110563786         118  _D9raytracer5traceFxS9raytracer3RayS (lambda)
       1  1829865336    78200113    78200113  _Dmain
  933910    42084879    42084879          45  const(raytracer.Vec3 function(raytracer.Vec3)) raytracer.Sphere.normal
  795095    13423716    13423716          16  const(raytracer.Vec3 function(const(raytracer.Vec3))) raytracer.Vec3.opBinary!("*").opBinary
  933910    11122934    11122934          11  pure nothrow @trusted float std.math.pow!(float, int).pow(float, int)
  933910   313479603     3718864           3  _D9raytracer5traceFxS9raytracer3RayS9raytracer5SceneiZS ... (lambda)
       1     3014385     2991659     2991659  void derelict.util.sharedlib.SharedLib.load(immutable(char)[][])
       1      152945      152945      152945  void derelict.util.loader.SharedLibLoader.unload()
 1590190       89018       89018           0  const(raytracer.Vec3 function(const(float))) raytracer.Vec3.opBinary!("*").opBinary
 1047016       70383       70383           0  const(float function()) raytracer.Sphere.transparency
     186       66925       66925         359  void derelict.util.loader.SharedLibLoader.bindFunc(void**, immutable(char)[], bool)




Re: Slow performance compared to C++, ideas?

2013-05-30 Thread Juan Manuel Cabo
On 05/30/2013 11:31 PM, finalpatch wrote:
> Hi Walter,
> 
> Thanks for the reply. I have already tried these flags. However, DMD's 
> codegen is lagging behind GCC and LLVM at the
> moment, so even with these flags, the runtime is ~10x longer than the C++ 
> version compiled with clang++ (2sec with DMD,
> 200ms with clang++ on a Core2 Mac Pro). I know this is comparing apples to 
> oranges though, that's why I was comparing
> GDC vs G++ and LDC vs Clang++.
> 
> On Friday, 31 May 2013 at 02:19:40 UTC, Walter Bright wrote:
>> For max speed using dmd, use the flags:
>>
>>-O -release -inline -noboundscheck
>>
>> The -inline is especially important.


Have you tried:

 dmd -profile

it compiles in trace generation, so that when you run the program you get a 
.log file which tells you the slowest
functions and other info.

Please note that the resulting code compiled with -profile is slower because it
is instrumented.

--jm




Re: New UTF-8 stride function

2013-05-26 Thread Juan Manuel Cabo
And these are the results for the same linux 64bit system but 
compiling with -m32:


$ dmd -m32 -O -inline -release -noboundscheck fast_stride.d

$ for a in *wiki*; do echo ; echo $a: ; ./fast_stride $a; done

arwiki-latest-all-titles-in-ns0:
stride 89362
myStride 49974
myStride 51140
stride 88308

dewiki-latest-all-titles-in-ns0:
stride 138381
myStride 116971
myStride 116662
stride 139681

enwiki-latest-all-titles-in-ns0:
stride 584787
myStride 490681
myStride 490909
stride 584694

ruwiki-latest-all-titles-in-ns0:
stride 585372
myStride 333905
myStride 341274
stride 585050

--jm


Re: New UTF-8 stride function

2013-05-26 Thread Juan Manuel Cabo

On Sunday, 26 May 2013 at 20:49:36 UTC, Dmitry Olshansky wrote:

[..]
Thus I encourage curious folks to measure/analyze it and report 
back (don't forget to include your processor model).

[..]


Ok, I just tested on my old trusty linux 64bit (Ubuntu 12.04).

I had to download those *wiki* files from this url:

   
https://github.com/blackwhale/gsoc-bench-2012/archive/master.zip


because github gave an error trying to download the ones that are 
larger than 10 MB.


I ran the tests multiple times before copying them here.

And here are the results:

$ uname -a
Linux lolita 3.2.0-43-generic #68-Ubuntu SMP Wed May 15 03:33:33 
UTC 2013 x86_64 x86_64 x86_64 GNU/Linux


$ grep 'model.name\|cpu.MHz' /proc/cpuinfo
model name  : AMD Athlon(tm) 64 X2 Dual Core Processor 4000+
cpu MHz : 2109.518
model name  : AMD Athlon(tm) 64 X2 Dual Core Processor 4000+
cpu MHz : 2109.518

$ grep MemTotal /proc/meminfo
MemTotal:4049780 kB

$ dmd -O -inline -release -noboundscheck fast_stride.d
$ for a in *wiki*; do echo; echo $a: ; ./fast_stride $a; done

arwiki-latest-all-titles-in-ns0:
stride 67681
myStride 55908
myStride 66026
stride 66328

dewiki-latest-all-titles-in-ns0:
stride 155071
myStride 195449
myStride 196586
stride 154627

enwiki-latest-all-titles-in-ns0:
stride 688482
myStride 879950
myStride 879451
stride 689087

ruwiki-latest-all-titles-in-ns0:
stride 449133
myStride 364808
myStride 512485
stride 448841


--jm



Re: DLang Spec rewrite (?)

2013-05-26 Thread Juan Manuel Cabo

On Sunday, 26 May 2013 at 08:09:16 UTC, Borden wrote:

[...]
My 'complaint' - although I would prefer to have my 
observations about difficulties working with a markup system be 
called 'observations' - is that the current body of text files 
which comprise the DLang spec source cannot be easily compiled 
into clean, well-formed, XHTML5-compliant files from which I 
can build an ePUB file.

[...]


Maybe you can automatically convert HTML to XHTML, and then apply 
an XSLT transformation.


You mentioned somewhere that you needed something like a CSS 
transformation to target an element inside another element. You could 
do that with XSLT.


To convert from HTML to XHTML you could use the following:

http://www.codeproject.com/Articles/10792/Convert-HTML-to-XHTML-and-Clean-Unnecessary-Tags-a

It is made in C#, though if it works, I guess it could be ported 
to D.


Also you could use Adam D. Ruppe's XML DOM classes, which, though 
I'm not sure, seem to tolerate HTML4:


https://github.com/adamdruppe/misc-stuff-including-D-programming-language-web-stuff

(grab dom.d and characterencoding.d from there).

Or maybe the next generation XML library for D which will be 
reviewed for inclusion, which supports XPath queries:


  http://dsource.org/projects/xmlp

--jm



Re: Why UTF-8/16 character encodings?

2013-05-25 Thread Juan Manuel Cabo

On Saturday, 25 May 2013 at 19:51:43 UTC, Joakim wrote:
On Saturday, 25 May 2013 at 19:03:53 UTC, Dmitry Olshansky 
wrote:

You can map a codepage to a subset of UCS :)
That's what they do internally anyway.
If I take you right you propose to define string as a header 
that denotes a set of windows in code space? I still fail to 
see how that would scale see below.
Something like that.  For a multi-language string encoding, the 
header would contain a single byte for every language used in 
the string, along with multiple index bytes to signify the 
start and finish of every run of single-language characters in 
the string.  So, a list of languages and a list of pure 
single-language substrings.  This is just off the top of my 
head, I'm not suggesting it is definitive.




You obviously are not thinking it through. Such an encoding would 
have O(n^2) complexity for appending a character/symbol in a 
different language to the string, since you would have to update 
the beginning of the string and move the contents forward to 
make room. Not to mention that it wouldn't be backwards 
compatible with ASCII routines, and that the complexity of such a 
header would have to be carried all the way to the font rendering 
routines in the OS.
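To make the complexity argument concrete, here is a toy single-buffer model of such a header-based encoding — entirely hypothetical, only to illustrate why appending a character from a language not yet in the header is O(n):

```cpp
#include <cstddef>
#include <string>

// Toy model. Layout: [header length][language bytes][text bytes].
struct HeaderString {
    std::string buf = std::string(1, '\0');   // buf[0] = header length

    void append(char lang, char c) {
        std::size_t hlen = static_cast<unsigned char>(buf[0]);
        bool known = false;
        for (std::size_t i = 1; i <= hlen; ++i)   // scan the header
            if (buf[i] == lang) { known = true; break; }
        if (!known) {
            // New language: the header grows, shifting every text
            // byte right -- O(n) for this single append, O(n^2) if
            // a string is built one such append at a time.
            buf.insert(1 + hlen, 1, lang);
            buf[0] = static_cast<char>(hlen + 1);
        }
        buf += c;                                 // text append is O(1)
    }
};
```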


Multiple languages/symbols in one string is a blessing of modern 
humane computing. It is the norm more than the exception in most 
of the world.


--jm



Re: Why UTF-8/16 character encodings?

2013-05-25 Thread Juan Manuel Cabo


░ⓌⓉⒻ░
╔╗░╔╗░╔╗╔╗╔╗░░
║║░║║░║║╚═╗╔═╝║╔═══╝░░
║║░║║░║║░░║║░░║╚═╗
║╚═╝╚═╝║╔╗║║╔╗║╔═╝╔╗░░
╚══╝╚╝╚╝╚╝╚╝░░╚╝░░


█░█░█░░▐░░▐░
█░█░█▐▀█▐▀█▐░█▐▀█▐▀█▐▀█░
█░█░█▐▄█▐▄█▐▄▀▐▄█▐░█▐░█░
█▄█▄█▐▄▄▐▄▄▐░█▐▄▄▐░█▐▄█░



--jm



Re: Example for creating multiple signals?

2013-05-23 Thread Juan Manuel Cabo

On Thursday, 23 May 2013 at 15:22:06 UTC, Gary Willoughby wrote:
I'm just looking at the signals example here: 
http://dlang.org/phobos/std_signals.html#.Signal where it says 
"Different signals can be added to a class by naming the 
mixins.". Has anyone got an example of this please? I'd like to 
have a class raise different signals if possible.



class A {
mixin Signal!() onChanged;
mixin Signal!(int, string) onErrorMsg;

...
}

class B {
private void a_ErrorMsg(int code, string msg) {
writeln("Error code: ", code, ": ", msg);
}

void attachToA(A a) {
a.onErrorMsg.connect(&a_ErrorMsg);
}

void detachFromA(A a) {
a.onErrorMsg.disconnect(&a_ErrorMsg);
}
}



It works perfectly. BUT, keep in mind that a signal connection 
doesn't keep your class alive if the GC wants to take it (if the 
GC collects your class, it is disconnected). This means that if 
you want to keep your listener classes around, they must be 
referenced somewhere.


--jm




Re: Ideal D GUI Toolkit

2013-05-22 Thread Juan Manuel Cabo

On Tuesday, 21 May 2013 at 07:47:56 UTC, eles wrote:

On Tuesday, 21 May 2013 at 06:41:24 UTC, Jacob Carlborg wrote:

On 2013-05-20 07:25, Tyler Jameson Little wrote:
Here we go again, yet another massive thread about GUI 
toolkits :)


Anyway, the thread is already started, I think the alternatives 
are:


1) pick up a major well-known GUI library, fork it and spend 
some important time to re-write in D. Choices: Qt, GTK, 
wxWindows etc.


2) pick up a lighter GUI library, while still cross-platform, 
and re-write it in D. Spent time is less. Choices: FLTK, FOX 
Toolkit


3) start from scratch and write something new, while still 
having to decide if will wrap OS widgets or no.


Just to be sure that you know about FOX Toolkit:

http://fox-toolkit.org/goals.html


DWT is completely written in D. It is a port of a Java library 
which originally contained Java + JNI + C++ code, all of which was 
ported to D.


DWT interfaces directly with the OS in windows, and with GTK in 
linux.


So there. A native D GUI library already exists (DWT), which can 
work as a starting point for something else. Note that it is hard 
to create a GUI designer directly for SWT (because it would need 
to generate code), but a layer of declarative xml can be built on 
top, so that it is easier.


Code originally written for SWT 
(http://www.eclipse.org/swt/widgets/) works with little 
modification.


I've been using DWT for some time and it seems stable for me. 
Thanks to Jacob Carlborg for maintaining it!


--jm




Re: Investigation: downsides of being generic and correct

2013-05-16 Thread Juan Manuel Cabo

On Thursday, 16 May 2013 at 22:58:42 UTC, 1100110 wrote:

On 05/16/2013 01:46 PM, Nick Sabalausky wrote:

On Thu, 16 May 2013 09:03:36 -0500
1100110 <0b1100...@gmail.com> wrote:

May I also recommend my tool "avgtime" to make simple 
benchmarks,
instead of "time" (you can see an ascii histogram as the 
output):


 https://github.com/jmcabo/avgtime/tree/

For example:

$ avgtime -r10 -h -q  ls

Total time (ms): 27.413
Repetitions: 10
Sample mode: 2.6 (4 occurrences)
Median time: 2.6695
Avg time   : 2.7413
Std dev.   : 0.260515
Minimum: 2.557
Maximum: 3.505
95% conf.int.  : [2.2307, 3.2519]  e = 0.510599
99% conf.int.  : [2.07026, 3.41234]  e = 0.671041
EstimatedAvg95%: [2.57983, 2.90277]  e = 0.161466
EstimatedAvg99%: [2.5291, 2.9535]  e = 0.212202
Histogram  :
msecs: count  normalized bar
  2.5: 2  ####
  2.6: 4  ########
  2.7: 3  ######
  3.5: 1  ##

--jm



Thank you for self-promotion, I miss that tool.




Indeed. I had totally forgotten about that, and yet it *should* be the
first thing I think of when I think "timing a program". IMO, that
should be a standard tool in any unixy installation.




+1

That's worth creating a package for.


Thanks!
I currently don't have much time to make a ubuntu/arch/etc. 
package, between work and the university. I might in the future.


Keep in mind that it also works in windows. Though the process 
creation overhead is bigger in windows than in linux (because of 
the OS). Also, you can open the source up and easily modify it to 
measure your times directly, inside your programs.


--jm



Re: Investigation: downsides of being generic and correct

2013-05-16 Thread Juan Manuel Cabo

On Thursday, 16 May 2013 at 10:35:12 UTC, Dicebot wrote:
Want to bring into discussion people that are not on Google+. 
Samuel recently has posted there some simple experiments with 
bioinformatics and bad performance of Phobos-based snippet has 
surprised me.


I did explore issue a bit and reported results in a blog post 
(snippets are really small and simple) : 
http://dicebot.blogspot.com/2013/05/short-performance-tuning-story.html


One open question remains though - can D/Phobos do better here? 
Can some changes be done to Phobos functions in question to 
improve performance or creating bioinformatics-specialized 
library is only practical solution?



May I also recommend my tool "avgtime" to make simple benchmarks, 
instead of "time" (you can see an ascii histogram as the output):


 https://github.com/jmcabo/avgtime/tree/

For example:

$ avgtime -r10 -h -q  ls

Total time (ms): 27.413
Repetitions: 10
Sample mode: 2.6 (4 ocurrences)
Median time: 2.6695
Avg time   : 2.7413
Std dev.   : 0.260515
Minimum: 2.557
Maximum: 3.505
95% conf.int.  : [2.2307, 3.2519]  e = 0.510599
99% conf.int.  : [2.07026, 3.41234]  e = 0.671041
EstimatedAvg95%: [2.57983, 2.90277]  e = 0.161466
EstimatedAvg99%: [2.5291, 2.9535]  e = 0.212202
Histogram  :
msecs: count  normalized bar
  2.5: 2  ####
  2.6: 4  ########
  2.7: 3  ######
  3.5: 1  ##

--jm



Re: Investigation: downsides of being generic and correct

2013-05-16 Thread Juan Manuel Cabo

On Thursday, 16 May 2013 at 10:35:12 UTC, Dicebot wrote:
Want to bring into discussion people that are not on Google+. 
Samuel recently has posted there some simple experiments with 
bioinformatics and bad performance of Phobos-based snippet has 
surprised me.


I did explore issue a bit and reported results in a blog post 
(snippets are really small and simple) : 
http://dicebot.blogspot.com/2013/05/short-performance-tuning-story.html


One open question remains though - can D/Phobos do better here? 
Can some changes be done to Phobos functions in question to 
improve performance or creating bioinformatics-specialized 
library is only practical solution?


I bet the problem is in readln. Currently, File.byLine() and 
readln() are extremely slow, because they call fgetc() one char 
at a time.


I made a "byLineFast" implementation some time ago that is 10x 
faster than std.stdio.byLine. It reads lines through rawRead, 
using buffers instead of going char by char.


I don't have the time to make it phobos-ready (unicode, etc.). 
But I'll paste it here for any one to use (it works perfectly).


--jm

-

module ByLineFast;

import std.stdio;
import std.string: indexOf;
import std.c.string: memmove;


/**
  Reads by line in an efficient way (10 times faster than File.byLine
  from std.stdio).
  This is accomplished by reading entire buffers (fgetc() is not used),
  and allocating as little as possible.

  The char \n is considered as separator, removing the previous \r
  if it exists.

  The \n is never returned. The \r is not returned if it was
  part of a \r\n (but it is returned if it was by itself).

  The returned string is always a substring of a temporary
  buffer, that must not be stored. If necessary, you must
  use str[] or .dup or .idup to copy to another string.

  Example:

      File f = File("file.txt");
      foreach (string line; ByLineFast(f)) {
          //...process line...
          //Make a copy:
          string copy = line[];
      }

  The file isn't closed when done iterating, unless it was
  the only reference to the file (same as std.stdio.byLine).
  (example: ByLineFast(File("file.txt"))).
*/
struct ByLineFast {
File file;
char[] line;
bool first_call = true;
char[] buffer;
char[] strBuffer;

this(File f, int bufferSize=4096) {
assert(bufferSize > 0);
file = f;
buffer.length = bufferSize;
}

@property bool empty() const {
        //It's important to check "line !is null" instead of
//"line.length != 0", otherwise, no empty lines can
//be returned, the iteration would be closed.
if (line !is null) {
return false;
}
if (!file.isOpen) {
//Clean the buffer to avoid pointer false positives:
(cast(char[])buffer)[] = 0;
return true;
}

        //First read. Determine if it's empty and put the char back.

auto mutableFP = (cast(File*) &file).getFP();
auto c = fgetc(mutableFP);
if (c == -1) {
//Clean the buffer to avoid pointer false positives:
(cast(char[])buffer)[] = 0;
return true;
}
if (ungetc(c, mutableFP) != c) {
assert(false, "Bug in cstdlib implementation");
}
return false;
}

@property char[] front() {
if (first_call) {
popFront();
first_call = false;
}
return line;
}

void popFront() {
if (strBuffer.length == 0) {
strBuffer = file.rawRead(buffer);
if (strBuffer.length == 0) {
file.detach();
line = null;
return;
}
}

        auto pos = strBuffer.indexOf('\n');
if (pos != -1) {
if (pos != 0 && strBuffer[pos-1] == '\r') {
line = strBuffer[0 .. (pos-1)];
} else {
line = strBuffer[0 .. pos];
}
//Pop the line, skipping the terminator:
strBuffer = strBuffer[(pos+1) .. $];
} else {
            //More needs to be read here. Copy the tail of the buffer
            //to the beginning, and try to read with the empty part of
            //the buffer.
            //If no buffer was left, extend the size of the buffer
            //before reading. If the file has ended, then the line is
            //the entire buffer.

            if (strBuffer.ptr != buffer.ptr) {
                //Must use memmove because there might be overlap
                memmove(buffer.ptr, strBuffer.ptr,
                        strBuffer.length * char.sizeof);
            }
            auto spaceBegin = strBuffer.length;
            if (strBuffer.length == buffer.length) {
                //Must extend the buffer to keep reading.
                assumeSafeAppend(buffer);
                buffer.length = buffer.length * 2;
            }
   

Re: DConf 2013 keynote

2013-05-12 Thread Juan Manuel Cabo

On Sunday, 12 May 2013 at 03:58:04 UTC, Nick Sabalausky wrote:
The nicest thing of all, IMO, about not strictly needing all that
support software is that basic things like
editing/navigating/opening/closing code is always and forever 100%
unobstructed by things like startup delays and keyboard input lag which
have no business existing on the rocket-engined supercomputers we now
call "a PC".


I'm using a little known IDE for D known as Poseidon:
http://www.dsource.org/projects/poseidon/wiki/Screenshots
it is very fast, loads very quickly, and the editor is very 
responsive. The keyword autocompletion is mostly broken in D2 but 
I can live without it. It is a bit sad that it has gone 
unmaintained for more than a year.


These are the things that I cannot live without for my big D2 
project:

- Syntax highlighting.
- Tree like structure for navigating all the many source 
files of my project.

- Search in multiple files.
- Debugging (breakpoints, step by step, go to line that 
crashed). It suprisingly still works in Poseidon.

- Can go to file/line when double-clicking on compiler error.
- Compile/run/debug just by hitting SHIFT-F5, and other keys.
- No need for a makefile. It feeds all source files 
(hundreds) and libraries to dmd.


For smaller D projects I use Vim/makefiles though.

Again, I'm a bit sad that it has gone unmaintained for so long, 
but it's totally usable still. This is the fastest IDE that I've 
found.


--jm



Re: some regex vs std.ascii vs handcode times

2012-03-26 Thread Juan Manuel Cabo

On Tuesday, 27 March 2012 at 00:23:46 UTC, Juan Manuel Cabo wrote:
[]

I forgot to mention that for Linux Kubuntu x64 I had to
change the type of a variable to auto in this line of wcTest.d:
   auto c_cnt = input.length;
And that in windows7 x64, the default for dmd is -m32, so I
compiled and tried your benchmark that way in windows.

--jm





Re: some regex vs std.ascii vs handcode times

2012-03-26 Thread Juan Manuel Cabo

On Monday, 26 March 2012 at 07:10:00 UTC, Jay Norwood wrote:

On Thursday, 22 March 2012 at 04:29:41 UTC, Jay Norwood wrote:
On the use of larger files ... yes that will be interesting, 
but for these current measurements  the file reads are only 
taking on the order of 30ms for 20MB, which tells me they are 
already either being cached by win7, or else by the ssd's 
cache.


I'll use the article instructions below and put the files 
being read into the cache prior to the test,  so that the file 
read time  should be small and consistent relative to the 
other buffer processing time inside the loops.


http://us.generation-nt.com/activate-windows-file-caching-tip-tips-tricks-2130881-0.html


Thanks


I tried using a ramdisk from imdisk, because the above article 
was just for caching network drives to your local disk.  The 
first set of times are from the ssd, the second from the ram 
disk, and both are about the same.  So I guess win7 is caching 
these file reads already.


I got imdisk for the ramdisk here
http://www.ltr-data.se/opencode.html/#ImDisk


These are the times for the imdisk reads (still executing from 
G hard drive , but reading from F ram disk)

G:\d\a7\a7\Release>wctest f:\al*.txt
finished wcp_nothing! time: 1 ms
finished wcp_whole_file! time: 31 ms
finished wcp_byLine! time: 525 ms
finished wcp_byChunk! time: 22 ms
finished wcp_lcByChunk! time: 33 ms
finished wcp_lcDcharByChunk! time: 30 ms
finished wcp_lcRegex! time: 141 ms
finished wcp_lcCtRegex! time: 104 ms
finished wcp_lcStdAlgoCount! time: 139 ms
finished wcp_lcChar! time: 37 ms
finished wcp_wcPointer! time: 121 ms
finished wcp_wcCtRegex! time: 1269 ms
finished wcp_wcRegex! time: 2908 ms
finished wcp_wcRegex2! time: 2693 ms
finished wcp_wcSlices! time: 179 ms
finished wcp_wcStdAscii! time: 222 ms

This is reading from the ssd Intel 510 series 120GB
G:\d\a7\a7\Release>wctest h:\al*.txt
finished wcp_nothing! time: 1 ms
finished wcp_whole_file! time: 32 ms
finished wcp_byLine! time: 518 ms
finished wcp_byChunk! time: 23 ms
finished wcp_lcByChunk! time: 33 ms
finished wcp_lcDcharByChunk! time: 31 ms
finished wcp_lcRegex! time: 159 ms
finished wcp_lcCtRegex! time: 89 ms
finished wcp_lcStdAlgoCount! time: 144 ms
finished wcp_lcChar! time: 34 ms
finished wcp_wcPointer! time: 118 ms
finished wcp_wcCtRegex! time: 1273 ms
finished wcp_wcRegex! time: 2889 ms
finished wcp_wcRegex2! time: 2688 ms
finished wcp_wcSlices! time: 175 ms
finished wcp_wcStdAscii! time: 220 ms

I added the source and the test text files on github

https://github.com/jnorwood/wc_test





I downloaded and tried your benchmark. I first tried it
with the ten 10Mb files that you put in github, then
truncated them to 2Mb to get results comparable to
the test you said you did.

* Used dmd 2.058.

* I tested both Windows7 64bit and then booted
into Linux Kubuntu 64bits to test there too.

* I tested in the following desktop computer, previously
disabling cpu throttling (disabled cool&quiet in the bios setup).

vendor_id   : AuthenticAMD
cpu family  : 15
model   : 107
model name  : AMD Athlon(tm) 64 X2 Dual Core Processor 4000+
stepping: 1
cpu MHz : 2109.443
cache size  : 512 KB

* The computer has 4Gb of RAM.

* Ran wcTest many times (more than 10) before saving
the results.

Results in Windows7 x64 with ten 10Mb files:
---
finished wcp_nothing! time: 1 ms
finished wcp_whole_file! time: 130 ms
finished wcp_byLine! time: 1574 ms
finished wcp_byChunk! time: 133 ms
finished wcp_lcByChunk! time: 207 ms
finished wcp_lcDcharByChunk! time: 181 ms
finished wcp_lcRegex! time: 579 ms
finished wcp_lcCtRegex! time: 365 ms
finished wcp_lcStdAlgoCount! time: 511 ms
finished wcp_lcChar! time: 188 ms
finished wcp_wcPointer! time: 438 ms
finished wcp_wcCtRegex! time: 5448 ms
finished wcp_wcRegex! time: 17277 ms
finished wcp_wcRegex2! time: 15524 ms
finished wcp_wcSlices! time: 632 ms
finished wcp_wcStdAscii! time: 814 ms


Results in Windows7 x64 with ten 2Mb files:
---
finished wcp_nothing! time: 1 ms
finished wcp_whole_file! time: 27 ms
finished wcp_byLine! time: 329 ms
finished wcp_byChunk! time: 34 ms
finished wcp_lcByChunk! time: 79 ms
finished wcp_lcDcharByChunk! time: 79 ms
finished wcp_lcRegex! time: 298 ms
finished wcp_lcCtRegex! time: 150 ms
finished wcp_lcStdAlgoCount! time: 216 ms
finished wcp_lcChar! time: 77 ms
finished wcp_wcPointer! time: 127 ms
finished wcp_wcCtRegex! time: 3250 ms
finished wcp_wcRegex! time: 6164 ms
finished wcp_wcRegex2! time: 5724 ms
finished wcp_wcSlices! time: 171 ms
finished wcp_wcStdAscii! time: 194 ms


Results in Kubuntu 64bits with ten 2Mb files:
--
finished wcp_nothing! time: 0 ms
finished wcp_whole_file! time: 28 ms
finished wcp_byLine! time: 212 ms
finished wcp_byChunk! time: 20 ms
finished wcp_lcByChunk! time: 90 ms
finished wcp_lcDcharByChunk! time: 77 ms
finished wcp_lcRegex! time: 1

Re: some regex vs std.ascii vs handcode times

2012-03-21 Thread Juan Manuel Cabo
On Wednesday, 21 March 2012 at 05:49:29 UTC, Juan Manuel Cabo 
wrote:

On Monday, 19 March 2012 at 04:12:33 UTC, Jay Norwood wrote:

[]

Ok, this was the good surprise.  Reading by chunks was faster 
than reading the whole file, by several ms.


// read files by chunk ...!better than full input
//finished! time: 23 ms
void wcp_files_by_chunk(string fn)
{
auto f = File(fn);
foreach(chunk; f.byChunk(1_000_000)){
}
}




mmm, I just looked in std/file.d and std/stdio.d.
The std.file.read() function calls GetFileSize()
before reading, and you are dealing with very tight
differences (23ms vs 31ms). So there is a chance that
either the difference is accounted for by the extra
GetFileSize() (and extra stat() in the posix version),
or your threads/process losing their scheduled slice
for that extra I/O call of GetFileSize().
  Also, in windows, std.file.read() uses ReadFile, while
byChunk uses fread(), though it should all be the same
in the end.

  It is better to try with files big enough to see whether
the timing difference gets either bigger or stays
just around those 8ms.

I'll later try all this too!!! Nice benchmarking
by the way!! Got me interested!

--jm





Re: some regex vs std.ascii vs handcode times

2012-03-20 Thread Juan Manuel Cabo

On Monday, 19 March 2012 at 04:12:33 UTC, Jay Norwood wrote:

[]

Ok, this was the good surprise.  Reading by chunks was faster 
than reading the whole file, by several ms.


// read files by chunk ...!better than full input
//finished! time: 23 ms
void wcp_files_by_chunk(string fn)
{
auto f = File(fn);
foreach(chunk; f.byChunk(1_000_000)){
}
}


Try copying each received chunk into an array of the size of
the file, so that the comparison is fair (allocation time
of one big array != allocation time of a single chunk). Plus,
std.file.read() might reallocate.
  Plus, I think that the GC runs at the end of a program,
and it's definitely not the same to have it scan through
a bigger heap (20Mb) than just scan through 1Mb of ram,
and with 20Mb of a file, it might have gotten 'distracted'
with pointer-looking data.
  Are you benchmarking the time of the whole program,
or of just that snippet? Is the big array out of scope
after the std.file.read() ? If so, try putting the benchmark
start and end inside the function, at the same scope
maybe inside that wcp_whole_file() function.


[]


So here is some surprise ... why is regex 136ms vs 34 ms hand 
code?


It's not surprising to me. I don't think that there
is a single regex engine in the world (I don't think even
the legendary Ken Thompson machine code engine) that can
surpass a hand coded:
 foreach(c; buffer) { lineCount += (c == '\n'); }
for line counting.

The same goes for indexOf. If you are searching for
the position of a single char (unless maybe that the
regex is optimized for that specific case, I'd like
to know of one if it exists), I think nothing beats indexOf.


Regular expressions are for what you rather never hand code,
or for convenience.
I would rather write a regex to do a complex subtitution in Vim,
than make a Vim function just for that. They have an amazing
descriptive power. You are basically describing a program
in a single regex line.


When people talk about *fast* regexen, they talk comparatively
to other engines, not as an absolute measure.
This confuses a lot of people.

--jm





Re: some regex vs std.ascii vs handcode times

2012-03-20 Thread Juan Manuel Cabo
On Monday, 19 March 2012 at 17:23:36 UTC, Andrei Alexandrescu 
wrote:


[.]



I wanted for a long time to improve byLine by allowing it to do 
its own buffering. That means once you used byLine it's not 
possible to stop it, get back to the original File, and 
continue reading it. Using byLine is a commitment. This is what 
most uses of it do anyway.


Great!! Perhaps we don't have to choose. We may have both!!
Allow me to suggest:

  byLineBuffered(bufferSize, keepTerminator);
or  byLineOnly(bufferSize, keepTerminator);
or  byLineChunked(bufferSize, keepTerminator);
or  byLineFastAndDangerous :-) hahah :-)

Or the other way around:

  byLine(keepTerminator, underlyingBufferSize);
renaming the current one to:
  byLineUnbuffered(keepTerminator);

Other ideas (I think I read them somewhere about
this same byLine topic):
  * I think it'd be cool if 'line' could be a slice of the
underlying buffer when possible if buffering is added.
  * Another good idea would be a new argument, maxLineLength,
so that one can avoid reading and allocating the whole
file into a big line string if there are no newlines
in the file, and one knows the max length desired.

--jm




Ok, this was the good surprise. Reading by chunks was faster than
reading the whole file, by several ms.


What may be at work here is cache effects. Reusing the same 1MB 
may place it in faster cache memory, whereas reading 20MB at 
once may spill into slower memory.



Andrei







Re: new std.variant (was Re: The Right Approach to Exceptions)

2012-02-22 Thread Juan Manuel Cabo
On Thursday, 23 February 2012 at 02:12:20 UTC, Juan Manuel Cabo 
wrote:



If we are going to get idealistic [..]


I'm sorry. I went over the top. I apologize.

..I won't post for a while.
This thread is almost popping a vein in my neck..

Passion can do that!
I love D. Love all your good work guys!!!

--jm





Re: new std.variant (was Re: The Right Approach to Exceptions)

2012-02-22 Thread Juan Manuel Cabo
On Thursday, 23 February 2012 at 01:57:49 UTC, Jonathan M Davis 
wrote:

The D equivalent would really be Array, not Appender.


Array!T in D is ref counted and more geared towards T being a
struct. And I had big trouble sorting it with sort!()() in D2.056,
so I made my own sort just to be able to use Array!(T).

I know that that situation will not remain forever (and I didn't
check if it was already fixed in D2.058).


I'm not sure that it's a great idea to use Appender as a
container - particularly when there are types
specifically intended to be used as containers. Appender is 
geared specifically  towards array building (like StringBuilder 
in Java, except generalized for all
arrays). If it's a container that you're looking for, then I 
really think that you should use a container.

- Jonathan M Davis


If Appender supports the range interface, then it is a
container. Someone will use it that way, because in the real
world people take the things for what they are, not for
what they are named.

Appender can work with T classes well. Contains items.
It would be GC managed (as opposed to Array!T).
So it is a container. If it is not O(1) to access, that
should be stated prominently in the ddoc though.

It is a recurrent trend in your posts, that you post just
because you have some idealistic concern or opinion.

If we are going to get idealistic, this is an extract
of a poem by Jorge Luis Borges (argentinian writer) that
illustrates my point:

  If (as the Greek states in the Cratilo)
  the name is the archetype of the thing,
  in the letters of rose it is the rose
  and all the Nile is in the word Nile

  Si como escribió el griego en el Crátilo,
  el nombre es arquetipo de la cosa,
  en el nombre de rosa está la rosa
  y todo el Nilo en la palabra Nilo.
  --Jorge Luis Borges

--jm













Re: new std.variant (was Re: The Right Approach to Exceptions)

2012-02-22 Thread Juan Manuel Cabo
On Thursday, 23 February 2012 at 01:36:32 UTC, Juan Manuel Cabo 
wrote:
On Thursday, 23 February 2012 at 00:51:38 UTC, Jonathan M Davis 
wrote:

[...]
If appender ends up with multiple arrays in it, then random 
access is no longer O(1) and is therefore unacceptable. As 
such, most sort algorithms wouldn't work with it.


If all I want is binary search on a big appender, then it
is O(k * n * log(n)), and that k right there doesn't
bother me. Also, binary search is absolutely not
cpu cache friendly to begin with.

Also, your bit about using appender to pass an array around 
wouldn't work either, because it wouldn't simply be a wrapper
around an array anymore.

- Jonathan M Davis


Yeah, but I don't care about the underlying array. I care
about multiple places referencing the same Appender. If I
append from any place that references it, it appends to the
same appender. The Appender "array" has identity. Ranges do not:

 int[] bla = [1,2,3];
 int[] ble = bla;
 ble ~= 4;
 assert(bla.length == 3);

This is very easy to solve with appender.
This is what happens in Java:
ArrayList bla = new ArrayList();
bla.add(1);
ArrayList ble = bla;
ble.add(2);
//prints 2
System.out.println(Integer.toString(bla.size()));
//prints 2
System.out.println(Integer.toString(ble.size()));

(yikes, aint that verbose!)
The ArrayList has identity. It is a class, so that
many variables reference the _same_ object.
(this can be accomplished with structs too though, but
not with ranges).


I meant ref counted structs.






P.S. Please don't top post. Replies should go _after_ the 
preceding message.


Sorry, got it.

--jm





Re: new std.variant (was Re: The Right Approach to Exceptions)

2012-02-22 Thread Juan Manuel Cabo
On Thursday, 23 February 2012 at 00:51:38 UTC, Jonathan M Davis 
wrote:
P.S. Please don't top post. Replies should go _after_ the 
preceding message.


P.S: You are right though, that it wouldn't be O(1) anymore
and it should be stated prominently in the documentation that it
is amortized.

--jm




Re: new std.variant (was Re: The Right Approach to Exceptions)

2012-02-22 Thread Juan Manuel Cabo
On Thursday, 23 February 2012 at 01:36:32 UTC, Juan Manuel Cabo 
wrote:

If all I want is binary search on a big appender, then it
is O(k * n * log(n)), and that k right there doesn't
bother me.


(Where binary search is of course O(log(n))
and accessing individual elements with the proposed
Appender is O(N / (4080/T.sizeof)), so k == T.sizeof/4080)

--jm




Re: new std.variant (was Re: The Right Approach to Exceptions)

2012-02-22 Thread Juan Manuel Cabo
On Thursday, 23 February 2012 at 00:51:38 UTC, Jonathan M Davis 
wrote:

[...]
If appender ends up with multiple arrays in it, then random 
access is no longer O(1) and is therefore unacceptable. As 
such, most sort algorithms wouldn't work with it.


If all I want is binary search on a big appender, then it
is O(k * n * log(n)), and that k right there doesn't
bother me. Also, binary search is absolutely not
cpu cache friendly to begin with.

Also, your bit about using appender to pass an array around 
wouldn't work either, because it wouldn't simply be a wrapper
around an array anymore.

- Jonathan M Davis


Yeah, but I don't care about the underlying array. I care
about multiple places referencing the same Appender. If I
append from any place that references it, it appends to the
same appender. The Appender "array" has identity. Ranges do not:

 int[] bla = [1,2,3];
 int[] ble = bla;
 ble ~= 4;
 assert(bla.length == 3);

This is very easy to solve with appender.
This is what happens in Java:
ArrayList bla = new ArrayList();
bla.add(1);
ArrayList ble = bla;
ble.add(2);
//prints 2
System.out.println(Integer.toString(bla.size()));
//prints 2
System.out.println(Integer.toString(ble.size()));

(yikes, aint that verbose!)
The ArrayList has identity. It is a class, so that
many variables reference the _same_ object.
(this can be accomplished with structs too though, but
not with ranges).




P.S. Please don't top post. Replies should go _after_ the 
preceding message.


Sorry, got it.

--jm




Re: new std.variant (was Re: The Right Approach to Exceptions)

2012-02-22 Thread Juan Manuel Cabo

(And not talking about some cheesy insertion sort!!)

If you build an array once and for all, and all you want
is to do binary search on it later, it doesn't make sense to
allocate that big contiguous .data. I'd rather leave it
as an appender.

--jm


On Wednesday, 22 February 2012 at 23:22:35 UTC, Juan Manuel Cabo 
wrote:
No, because the array doesn't actually exist until appender 
makes a copy.


Will one be able to use the sort!()() algorithm directly on 
your appender,

that is, without accessing/creating the underlying array?

--jm





Re: new std.variant (was Re: The Right Approach to Exceptions)

2012-02-22 Thread Juan Manuel Cabo
On Wednesday, 22 February 2012 at 20:59:15 UTC, Jonathan M Davis 
wrote:
speed [...] is really its whole point of existence. I don't 
know why else you'd ever use appender.

[...]

- Jonathan M Davis


A use case is to give identity to a built-in array.

Consider this:

 class MyClass {
 private MyData[] theData;

 public @property MyData[] data() {
 return theData;
 }
 ...
 }


 MyClass m = new MyClass();
 m.data ~= new MyData();
 //Nothing got appended:
 assert(m.data.length == 0);

For the 95% of the use cases, that is the desired
behaviour. You don't want anyone appending to
your private array. If you wanted to, you would
have defined MyClass.append(myData).

  But there are a few cases where you want to
give identity to the array, and let anyone who
has a "handle" to it, to be able to append it.
(another case is while porting code from languages
that don't represent arrays as ranges, and return
them as getters).

--jm






Re: new std.variant (was Re: The Right Approach to Exceptions)

2012-02-22 Thread Juan Manuel Cabo
No, because the array doesn't actually exist until appender 
makes a copy.


Will one be able to use the sort!()() algorithm directly on your 
appender,

that is, without accessing/creating the underlying array?

--jm




Re: size_t + ptrdiff_t

2012-02-21 Thread Juan Manuel Cabo
>> I'm surprised.  I'd assumed that, under 16-bit DOS/Windows, a size_t would
>> be 16 bits. But no.  Could memory blocks 64K  or larger actually be 
>> allocated under those systems?

size_t, being the type of sizeof() expressions, tells you the upper bound
for the size of statically allocated arrays: sizeof(buffer)/sizeof(int)
has to be representable in a size_t.

It tells you nothing else. What you can allocate dynamically depends on
the architecture and how it lets you address it.

16bit intel had 16bit segments and offsets, so memory was segmented
and you couldn't address more than 64kb at a time.
So you couldn't have grabbed^H^H"allocated" more than 64kb in real mode in intel
in a single linear block.


--jm


Re: size_t + ptrdiff_t

2012-02-21 Thread Juan Manuel Cabo
I just went to see a standard draft 
(http://www.clc-wiki.net/wiki/the_C_Standard)
to make sure, and it is even more convoluted than just that, but essentially the
same. Basically it says that chars must be at least
8bits and that shorts and ints must be able to represent at least 16 bits.

Then the ranks of the types are related to each other like that:
rank(_Bool) < rank(char) < rank(short) < rank(int) < rank(long int) < 
rank(long long int)
but rank is about conversion, not size, so while rank is strictly ordered,
the sizes of the types might not be.

I didn't find size_t other than as the type of the value of sizeof() 
expressions.

So, size_t is just standardized as the type of (sizeof(anystuff)).

I saw sizeof(long) <= sizeof(size_t)  in a website, but not on the standard.
So the standard doesn't even guarantee size_t being more than other
types, or the bigger type.

--jm


On 02/21/2012 10:23 PM, Stewart Gordon wrote:
> On 21/02/2012 22:45, Juan Manuel Cabo wrote:
> 
>> The C standard only guarantees that:
>>
>>sizeof(char)<= sizeof(int)<= sizeof(long)<= sizeof(size_t)
> 
> 
> I'm surprised.  I'd assumed that, under 16-bit DOS/Windows, a size_t would be 
> 16 bits. But no.  Could memory blocks 64K
> or larger actually be allocated under those systems?
> 
> Stewart.



Re: size_t + ptrdiff_t

2012-02-21 Thread Juan Manuel Cabo
On 02/21/2012 10:13 PM, Sean Kelly wrote:
> I think this is actually a good thing, since working with unsigned integers 
> is a pain.

Yes, I would prefer that msb bit to be the sign too, but behavior might depend
on it, and correctness and predictability are important.

My first code snippet was WRONG (sorry for the noise). And I couldn't even
reproduce the problem with my VC.

A correct snippet is simply this:

size_t s = -2;

if (s > 0) {
 printf("unsigned");
} else {
 printf("signed");
}

Also, the shift operator does a logical or arithmetic shift depending
on whether the operand is signed or unsigned, so the result is different
if you do
 s >> 1
depending on whether s is signed or unsigned.

This is already nitpicky. I'm sorry for the noise.

--jm





Re: size_t + ptrdiff_t

2012-02-21 Thread Juan Manuel Cabo
I'm sorry, my snippet is wrong. It's a bit more complicated than what I first
thought, and not even uniform between VC versions:

   "size_t definition and C4267 warning"
   
social.msdn.microsoft.com/forums/en-US/vclanguage/thread/62d6df45-e8e4-4bb2-87e2-b3f8e85b4b37

--jm


On 02/21/2012 09:50 PM, Juan Manuel Cabo wrote:
>> size_t is intended to be the C representation.  I very much do not want to 
>> end up with a c_size_t.
> 
> Hahah, hold your jaw because it might drop:
> 
> Looking for size_t extravagancies in C, I found that VC uses __int64
> for size_t in x64 target.
> 
> So this behaves differently according to the target:
> 
>size_t s = -2;
>if (s < -1) {
>printf("A supposedly unsigned size_t is less than -1 in x64 ??");
>} else {
>printf("all good, size_t is still unsigned");
>}
> 
> Though I don't ever expect to see a VC backend for D, this shows that
> anything can be expected from C.
> If you want to keep D's size_t unsigned, then let c_size_t do whatever C does,
> and let D use sane types that do what the documentation says.
> 
> :-)
> 
> --jm
> 
> 
> On 02/21/2012 09:10 PM, Sean Kelly wrote:
>> On Feb 21, 2012, at 3:50 PM, Juan Manuel Cabo wrote:
>>
>>>> Eh?
>>>
>>> All the type sizes vary in broken ways in C. The only sane way
>>> to port C structs to D is to use c_ that has the size of the
>>> C compiler in the target platform.
>>>
>>> If there is a single C compiler has a different sized size_t
>>> than D's, then one has achieved nothing with the c_int, c_long, etc.
>>>
>>> So that is why it'd be nice, in my opinion, to have c_size_t
>>> (and c_ssize_t) if we are going to have c_int, c_long, etc.
>>
>> size_t is intended to be the C representation.  I very much do not want to 
>> end up with a c_size_t.  Are there times when D's size_t would be a 
>> different type?  Also, I wonder how much code would break if we eliminated 
>> the size_t in object.di and replaced it with Size or whatever.
> 



Re: size_t + ptrdiff_t

2012-02-21 Thread Juan Manuel Cabo
> size_t is intended to be the C representation.  I very much do not want to 
> end up with a c_size_t.

Hahah, hold your jaw because it might drop:

Looking for size_t extravagancies in C, I found that VC uses __int64
for size_t in x64 target.

So this behaves differently according to the target:

   size_t s = -2;
   if (s < -1) {
    printf("A supposedly unsigned size_t is less than -1 in x64 ??");
   } else {
   printf("all good, size_t is still unsigned");
   }

Though I don't ever expect to see a VC backend for D, this shows that
anything can be expected from C.
If you want to keep D's size_t unsigned, then let c_size_t do whatever C does,
and let D use sane types that do what the documentation says.

:-)

--jm


On 02/21/2012 09:10 PM, Sean Kelly wrote:
> On Feb 21, 2012, at 3:50 PM, Juan Manuel Cabo wrote:
> 
>>> Eh?
>>
>> All the type sizes vary in broken ways in C. The only sane way
>> to port C structs to D is to use c_ types that match the sizes of the
>> C compiler on the target platform.
>>
>> If there is a single C compiler that has a differently sized size_t
>> than D's, then one has achieved nothing with the c_int, c_long, etc.
>>
>> So that is why it'd be nice, in my opinion, to have c_size_t
>> (and c_ssize_t) if we are going to have c_int, c_long, etc.
> 
> size_t is intended to be the C representation.  I very much do not want to 
> end up with a c_size_t.  Are there times when D's size_t would be a different 
> type?  Also, I wonder how much code would break if we eliminated the size_t 
> in object.di and replaced it with Size or whatever.



Re: size_t + ptrdiff_t

2012-02-21 Thread Juan Manuel Cabo
> Eh?

All the type sizes vary in broken ways in C. The only sane way
to port C structs to D is to use c_ types that match the sizes of the
C compiler on the target platform.

If there is a single C compiler that has a differently sized size_t
than D's, then one has achieved nothing with the c_int, c_long, etc.

So that is why it'd be nice, in my opinion, to have c_size_t
(and c_ssize_t) if we are going to have c_int, c_long, etc.


Sorry for requesting things though!! I still don't know my way
around here.

--jm


On 02/21/2012 08:35 PM, Iain Buclaw wrote:
> On 21 February 2012 22:45, Juan Manuel Cabo  wrote:
>>
>> A REQUEST: how about adding a c_size_t too?
>>
> 
> Eh?
> 
> 



Re: size_t + ptrdiff_t

2012-02-21 Thread Juan Manuel Cabo
> c_long and c_ulong are guaranteed to match target long size (here
> would also go c_int and c_uint ;-).
> https://bitbucket.org/goshawk/gdc/src/87241c8e754b/d/druntime/core/stdc/config.d#cl-22

That is so good! Thanks!

Currently, htod translates "unsigned long" to uint, which is wrong
in linux 64 bits. Translating to size_t fixes that for linux, but
I fear that a C "unsigned long" is not 64bit in all 64bit systems
(windows):

"About size_t and ptrdiff_t"
http://www.codeproject.com/Articles/60082/About-size_t-and-ptrdiff_t

Ran into this problem with mysql.h:

typedef struct st_mysql_field {
...
unsigned long length;
unsigned long max_length;
unsigned int name_length;
...
}

The only adequate fix, is your c_long that matches C's long.

This is because uint doesn't change when in 64bit, and size_t
fixes it for linux but maybe not for windows.


The C standard only guarantees minimum ranges, which in practice give:

  sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long)

and it doesn't even order size_t relative to long. Which is insane. At some
point in history, a C int meant the native register size for fast integer
operations. But now C int seems to have been frozen to 32bits to avoid
struct hell.

So, in conclusion, my opinion is that there is no sane way
of mapping C types to D types but to use something like your
c_int, c_uint, c_long, c_ulong. Otherwise, the insanity
of non-standard sizes and portability never stops.

Love the intptr_t!

A REQUEST: how about adding a c_size_t too?

--jm




Re: The Right Approach to Exceptions

2012-02-21 Thread Juan Manuel Cabo
> That because you can't (shouldn't) push up implementations specific to a 
> given subclass. Why don't we only have one
> class, Object, and add a Variant[string] there.
>
> Do you see how stupid that is.

As stupid as any database API which returns result items as Variant[string] or
string[string], but it works. (The sad part is that one has to rely a bit on
convention, but convention can be standardized (string constants) and measures
taken when code deviates, so that it degrades gracefully.)

Do you have an alternative solution that allows to extend an exception
object with extra information, while keeping it the same class?

So if one removes the bad reasons to create new Exception types, then the
ones that DO get created are solid, standard, reusable, and can withstand
the test of time. Because they would be open for extension but closed for
source code modification.

--jm


On 02/21/2012 03:03 PM, Jacob Carlborg wrote:
> On 2012-02-21 17:57, Andrei Alexandrescu wrote:
>> On 2/21/12 10:50 AM, Juan Manuel Cabo wrote:
>>> I thought that an alternative to Variant[string] would be to have some
>>> virtual
>>> functions overrideable (getExceptionData(string dataName) or something).
>>> but they would all have to return Object or Variant, so it's the same
>>> thing.
>>
>> Exactly. By and large, I think in the fire of the debate too many people
>> in this thread have forgotten to apply a simple OO design principle:
>> push policy up and implementation down. Any good primitive pushed up the
>> exception hierarchy is a huge win, and any design that advocates
>> reliance on concrete types is admitting defeat.
>>
>> Andrei
> 
> That because you can't (shouldn't) push up implementations specific to a 
> given subclass. Why don't we only have one
> class, Object, and add a Variant[string] there.
> 
> Do you see how stupid that is.
> 



Re: The Right Approach to Exceptions

2012-02-21 Thread Juan Manuel Cabo
I didn't know where I last read it, it got stuck in my head. I wrote:

> [...] doesn't mean
> that one must turn the advantages into disadvantages and start
> hammering screws because we love hammers.

I forgot to be careful with metaphors; I realize that some must be
emotionally loaded and don't really help in debating, since they can
be used either way and are estranged from reason.

(Hahaha, nothing more dangerous than landing on a new forum/community
without tip toeing, ohh the rush of excitement!!)

--jm


On Tuesday, 21 February 2012 at 16:49:37 UTC, Andrei Alexandrescu wrote:
> On 2/21/12 10:39 AM, foobar wrote:
>> ...
>
> To quote a classic:
>
>> made generic. You seem to prove the old saying that when all you have
>> is a hammer everything looks like a nail.
>
> ...
>
> Andrei





Re: The Right Approach to Exceptions

2012-02-21 Thread Juan Manuel Cabo
> throw new WithRainbows!withErrorCode!withFoobar!FileNotFoundException(...);

So:

catch (WithRainbows!withErrorCode!withFoobar!FileNotFoundException ex) {
 
} catch (WithFoobar!withErrorCode!withRainbows!FileNotFoundException ex) {
 
} catch (WithErrorCode!withRainbows!withFoobar!FileNotFoundException ex) {
 
} catch (WithRainbows!withFoobar!withErrorCode!FileNotFoundException ex) {

and so on (in this case there will be 3! == 6 of them).

and you would have to write them all. You cannot catch only WithRainbows!* 
because
you miss the FileNotFoundException at the end.


Please, refer to my previous posts.
I don't want to start to repaste my posts.
In one of them, I said that what you care about for the catch selection
is the *what* of the error.   Not the *cause* of the error, not the *where*
of the error (no one catches by *where*). And that it seems wrong to encode
anything other than the *what* of the error in the type name. Other things
such as the cause or the date should be encoded inside the exception object
instead of in the exception class type name.

I thought that an alternative to Variant[string] would be to have some virtual
functions overrideable (getExceptionData(string dataName) or something).
but they would all have to return Object or Variant, so it's the same thing.

--jm


On 02/21/2012 01:39 PM, foobar wrote:
> On Tuesday, 21 February 2012 at 16:15:17 UTC, Juan Manuel Cabo wrote:
>>> FileNotFoundException is the super class of the others so the first catch 
>>> clause is enough. in fact, the others will
>>> never be called if listed in the above order.
>>
>> Nice! I missed that. But what if you want to add ErrorCode and Rainbows?
>> And with your approach, one has to test for type and downcast, or
>> otherwise have multiple catch blocks (I don't want to miss plain
>> FileNotFoundExceptions). So it's square one.
>>
>> With Variant[string] (or something equivalent, nothing better comes to mind)
>> one does:
>>
>>
>> try {
>> ...
>> } catch (FileNotFoundException ex) {
>>  if (ex.hasInfo(MyNameConstant)) {
>>  ... use that ...
>>  }
>>  ... common handling ...
>> }
>>
>>
>> --jm
> 
> Regarding the downcast - you still perform a check in the code above! You 
> gained nothing by replacing a type check with
> a check on a hash.
> 
> Regarding composition of several traits - even that simple snippet is enough:
> throw new WithRainbows!withErrorCode!withFoobar!FileNotFoundException(...);
> 
> That's without further design which could probably improve this further.



Re: The Right Approach to Exceptions

2012-02-21 Thread Juan Manuel Cabo
Also, you would lose the stacktrace by rethrowing with a different exception
object. (Currently the stacktrace is lost even when rethrowing the same
object, though Exception.file and Exception.line are not.) It seems that it
would be easy to not lose the stacktrace when rethrowing, and it is the
correct thing: for instance, Java doesn't lose the stacktrace when
rethrowing, and neither does C++ with its throw; statement.

--jm

On 02/21/2012 01:15 PM, Juan Manuel Cabo wrote:
>> FileNotFoundException is the super class of the others so the first catch 
>> clause is enough. in fact, the others will
>> never be called if listed in the above order.
> 
> Nice! I missed that. But what if you want to add ErrorCode and Rainbows?
> And with your approach, one has to test for type and downcast, or
> otherwise have multiple catch blocks (I don't want to miss plain
> FileNotFoundExceptions). So it's square one.
> 
> With Variant[string] (or something equivalent, nothing better comes to mind)
> one does:
> 
> 
> try {
> ...
> } catch (FileNotFoundException ex) {
>  if (ex.hasInfo(MyNameConstant)) {
>  ... use that ...
>  }
>  ... common handling ...
> }
> 
> 
> --jm
> 
> 



Re: The Right Approach to Exceptions

2012-02-21 Thread Juan Manuel Cabo
> FileNotFoundException is the super class of the others so the first catch 
> clause is enough. in fact, the others will
> never be called if listed in the above order.

Nice! I missed that. But what if you want to add ErrorCode and Rainbows?
And with your approach, one has to test for type and downcast, or
otherwise have multiple catch blocks (I don't want to miss plain
FileNotFoundExceptions). So it's square one.

With Variant[string] (or something equivalent, nothing better comes to mind)
one does:


try {
...
} catch (FileNotFoundException ex) {
 if (ex.hasInfo(MyNameConstant)) {
 ... use that ...
 }
 ... common handling ...
}


--jm




Re: The Right Approach to Exceptions

2012-02-21 Thread Juan Manuel Cabo
Never mind modifying fields of the exception at some intermediate catch place.
Someone could even catch the exception and not rethrow it.
So: do some trusting. Life gets easier :-)

--jm


On 02/21/2012 12:46 PM, Juan Manuel Cabo wrote:
>> I think he meant to say things have been like that for a while and there's 
>> no blood in the streets.
> 
> That's exactly what I meant.
> 




Re: The Right Approach to Exceptions

2012-02-21 Thread Juan Manuel Cabo
> I think he meant to say things have been like that for a while and there's no 
> blood in the streets.

That's exactly what I meant.

And even if one makes those fields private, anyone can take a pointer
to the class or void[] or whatever and make a mess. (Java isn't exempt:
you can make a mess with reflection there.)
So there is a minimum of trust that we put in APIs and code that
we call downstream. The same trust that one puts, to begin with,
in expecting that an exception will be thrown when an error happens.

Ruby and PHP are based on a lot of trust, for instance!
Having the advantages of a statically typed language doesn't mean
that one must turn the advantages into disadvantages and start
hammering screws because we love hammers.

--jm


On 02/21/2012 11:20 AM, Andrei Alexandrescu wrote:
> On 2/21/12 6:36 AM, Jacob Carlborg wrote:
>> On 2012-02-20 23:44, Juan Manuel Cabo wrote:
>>>> I still don't like the idea of using Variant[string], though.
>>>>
>>>> (1) It doesn't allow compile-time type checking. This is a big minus, in
>>>> my book.
>>>
>>> When you need compile-time type checking, define a variable in your
>>> class.
>>> Just make sure that you are creating that new exception class for a
>>> good reason.
>>>
>>> When a user needs to add a variable to the exception, he can add it
>>> without putting your exception class chained in a new type of exception,
>>> that will hide your class from being selected by upstream catch blocks
>>> in the call tree.
>>>
>>>>
>>>> (2) It's overly flexible. Anyone along the call stack can insert
>>>> (hopefully NOT delete!!) additional data into the Exception object, as
>>>> the stack is unwound.
>>>
>>> As is currently the case.
>>> Did you know that anyone can overwrite any field of the exception and
>>> rethrow it? Such as msg field and so on?
>>
>> No one says the fields need to be public instance variables. You could
>> take the arguments in the constructor and only have getters.
> 
> I think he meant to say things have been like that for a while and there's no 
> blood in the streets.
> 
> Andrei
> 
> 



Re: The Right Approach to Exceptions

2012-02-21 Thread Juan Manuel Cabo
> This works:
> // note: the int parameter above isn't static
> dbConn.query("select age from people where id='foobar'");
> throw new WithErrorCode!FileNotFoundException(
>   db.rs.getValue(1), "file not found");
...
> Can you offer a real world use-case where the above isn't sufficient?


What happened is that a file wasn't found. What one wants to catch is
a FileNotFoundException.

Do you suggest that I have to:

   try {
   ...
   } catch (FileNotFoundException ex) {
   ...
   } catch (WithErrorCode!FileNotFoundException ex) {
   ...
   } catch (WithRainbows!FileNotFoundException ex) {
   ...
   }
and so on?

--jm




Re: Questions about windows support

2012-02-20 Thread Juan Manuel Cabo
WARNING: for anyone reading, don't try this without thinking:

>find -print0 | xargs -0 rm

Please don't type it as is (it handles filenames with spaces or '-' without
problems, but it will delete everything).

The -print0 is very useful. I use for instance to see the latest file
in a tree of directories:

 find -type f -print0 | xargs -0 ls -ltr

which comes up at the bottom of the listing.

--jm




On 02/21/2012 12:44 AM, Juan Manuel Cabo wrote:
> I think that the "for x in *" still gets you on the limit (not sure).
> 
> This is how you deal with spaces in filenames or '-'
> 
>find -print0 | xargs -0 rm
> 
> Another funny unix thing is awk... it solves all your problems but
> in one line, but then creates new ones until you get them right
> for separators and special cases.
> 
> --jm
> 
> 
> On 02/21/2012 12:31 AM, H. S. Teoh wrote:
>> On Tue, Feb 21, 2012 at 04:24:44AM +0100, Adam D. Ruppe wrote:
>>> On Tuesday, 21 February 2012 at 03:13:10 UTC, H. S. Teoh wrote:
>>>> for x in *; mv $x dest/$x; done
>>>>
>>>> Easy. :)
>>>
>>> And wrong!
>>>
>>> What if the filename has a space in it? You can say "$x", with quotes,
>>> to handle that.
>>
>> Argh, you're right. That's one reason I *hate* the implicit
>> interpolation that shells have the tendency to do. Perl got it right: $x
>> means the value of x as a *single* value, no secret additional
>> interpolation, no multiple layers of re-interpretation, and that
>> nonsense.
>>
>>
>>> But, worse yet... a leading dash? Another downside with the shell
>>> expansion is the program can't tell if that is an expanded filename or
>>> a user option.
>>
>> Heh. Never thought of this before. I can see some fun times to be had
>> with it, though!
>>
>> But you could probably handle it by:
>>
>>  mv -- "$x" "$dest/$x"
>>
>>
>>> In this case, the mv simply wouldn't work, but you can get some
>>> bizarre behavior out of that if you wanted to play with it.
>>>
>>> try this some day as a joke:
>>>
>>> $ mkdir evil-unix # toy directory
>>> $ cd evil-unix
>>> $ touch -- -l # our lol file
>>> $ touch cool # just to put a file in there
>>> $ ls
>>> -l  cool
>>> $ ls * # the lol file is interpreted as an option!
>>> -rw-r--r-- 1 me users 0 2012-02-20 22:18 cool
>>> $
>>>
>>>
>>> imagine the poor newb trying to understand that!
>>
>> +1, LOL.
>>
>>
>> T
>>
> 



Re: Questions about windows support

2012-02-20 Thread Juan Manuel Cabo
I think that the "for x in *" still gets you on the limit (not sure).

This is how you deal with spaces in filenames or '-'

   find -print0 | xargs -0 rm

Another funny unix thing is awk... it solves all your problems but
in one line, but then creates new ones until you get them right
for separators and special cases.

--jm


On 02/21/2012 12:31 AM, H. S. Teoh wrote:
> On Tue, Feb 21, 2012 at 04:24:44AM +0100, Adam D. Ruppe wrote:
>> On Tuesday, 21 February 2012 at 03:13:10 UTC, H. S. Teoh wrote:
>>> for x in *; mv $x dest/$x; done
>>>
>>> Easy. :)
>>
>> And wrong!
>>
>> What if the filename has a space in it? You can say "$x", with quotes,
>> to handle that.
> 
> Argh, you're right. That's one reason I *hate* the implicit
> interpolation that shells have the tendency to do. Perl got it right: $x
> means the value of x as a *single* value, no secret additional
> interpolation, no multiple layers of re-interpretation, and that
> nonsense.
> 
> 
>> But, worse yet... a leading dash? Another downside with the shell
>> expansion is the program can't tell if that is an expanded filename or
>> a user option.
> 
> Heh. Never thought of this before. I can see some fun times to be had
> with it, though!
> 
> But you could probably handle it by:
> 
>   mv -- "$x" "$dest/$x"
> 
> 
>> In this case, the mv simply wouldn't work, but you can get some
>> bizarre behavior out of that if you wanted to play with it.
>>
>> try this some day as a joke:
>>
>> $ mkdir evil-unix # toy directory
>> $ cd evil-unix
>> $ touch -- -l # our lol file
>> $ touch cool # just to put a file in there
>> $ ls
>> -l  cool
>> $ ls * # the lol file is interpreted as an option!
>> -rw-r--r-- 1 me users 0 2012-02-20 22:18 cool
>> $
>>
>>
>> imagine the poor newb trying to understand that!
> 
> +1, LOL.
> 
> 
> T
> 



Re: Questions about windows support

2012-02-20 Thread Juan Manuel Cabo
> That is so COOL!! I remember f*cking up one of my first linux computers
> that way. If I had known, I wouldn't have to go back to reinstall the
> many diskettes of slackware (no live cds at that time!, no easy way
> to fix the fs).

What happened was (if I remember correctly) that I renamed the /lib directory.
(PLEASE DON'T TRY THAT AT HOME!!)

Again, this:

>> In the end I had to use
>> bash's built-in echo command to recreate a statically-linked busybox
>> binary via copy-n-pasting over the terminal,

is so cool!!

--jm


On 02/20/2012 11:56 PM, Juan Manuel Cabo wrote:
> On 02/20/2012 11:06 PM, H. S. Teoh wrote:
>> On Tue, Feb 21, 2012 at 02:00:20AM +0100, Adam D. Ruppe wrote:
>> ...
>> Yeah I remember that. I thought they've since fixed it, though. That's
>> more a bash limitation than anything, AFAIK. Besides, what *were* you
>> trying to do with such a long command-line anyway? :-)
>> ...
> 
> I can think of one case where the command line argument limit
> is a problem: copying or moving files from a huge directory.
>   In that case, to do it with bash, there is no other way around
> but to do things such as iterate over the alphabet to copy the files that
> start with 'a', then the ones with 'b'..
> 
> 
>> ...
>> But then again, I *did* also have to deal with having to repair a remote
>> Linux server whose dynamic linker broke, causing basic commands like ls,
>> cp, chmod, to be completely non-functional. In fact, *nothing* worked
>> except that last remote login running bash. In the end I had to use
>> bash's built-in echo command to recreate a statically-linked busybox
>> binary via copy-n-pasting over the terminal, in order to get things back
>> into working condition again. (Yeah. Definitely not for the faint of
>> heart.)
>> ...
>>
>> T
>>
> 
> That is so COOL!! I remember f*cking up one of my first linux computers
> that way. If I had known, I wouldn't have to go back to reinstall the
> many diskettes of slackware (no live cds at that time!, no easy way
> to fix the fs).
> 
> --jm
> 



Re: Questions about windows support

2012-02-20 Thread Juan Manuel Cabo
On 02/20/2012 11:06 PM, H. S. Teoh wrote:
> On Tue, Feb 21, 2012 at 02:00:20AM +0100, Adam D. Ruppe wrote:
> ...
> Yeah I remember that. I thought they've since fixed it, though. That's
> more a bash limitation than anything, AFAIK. Besides, what *were* you
> trying to do with such a long command-line anyway? :-)
> ...

I can think of one case where the command line argument limit
is a problem: copying or moving files from a huge directory.
  In that case, to do it with bash, there is no other way around
but to do things such as iterate over the alphabet to copy the files that
start with 'a', then the ones with 'b'..


> ...
> But then again, I *did* also have to deal with having to repair a remote
> Linux server whose dynamic linker broke, causing basic commands like ls,
> cp, chmod, to be completely non-functional. In fact, *nothing* worked
> except that last remote login running bash. In the end I had to use
> bash's built-in echo command to recreate a statically-linked busybox
> binary via copy-n-pasting over the terminal, in order to get things back
> into working condition again. (Yeah. Definitely not for the faint of
> heart.)
> ...
> 
> T
> 

That is so COOL!! I remember f*cking up one of my first linux computers
that way. If I had known, I wouldn't have to go back to reinstall the
many diskettes of slackware (no live cds at that time!, no easy way
to fix the fs).

--jm



Re: The Right Approach to Exceptions

2012-02-20 Thread Juan Manuel Cabo
Well... then why do these mistakes exist?:

In dot NET:

ComException - Exception encapsulating COM HRESULT information
SEHException - Exception encapsulating Win32 structured exception
handling information

http://msdn.microsoft.com/en-us/library/z4c5tckx%28v=VS.71%29.aspx

And why do you think that a thing like a standardized DatabaseException
never survives its users, and that each database manager library defines its
own top-level *DatabaseException base class?

This is a universal problem with cross-cutting traits of exceptions.

--jm



On 02/20/2012 10:22 PM, H. S. Teoh wrote:
> On Mon, Feb 20, 2012 at 10:01:03PM -0300, Juan Manuel Cabo wrote:
> [...]
>> Back to the nasty argument. I think that the example that everyone
>> wants is this one. If anyone solves this one without Variant[string]
>> then it's a better solution than Variant[string]. (I repaste it
>> from an above reply I gave):
>>
>>   [..]
>>   For instance: a C library wrapper, which gets the library errors encoded
>>   as some error code and throws them as exceptions. Shouldn't the library
>>   throw a FileNotFoundException when that's the error, instead of throwing
>>   a LibraryException that has the error code in a field?
>>
>>   So the correct thing to do is: after a library call, the wrapper
>>   checks the last error code number with a switch statement, and deciding
>>   which standard exception type to throw (defaulting to whatever you like
>>   if the error code doesn't map to a standard D exception). Then you
>>   add the error code to the Variant[string], and any other extra info.
> 
> But why bother with the error code at all? If you get a
> FileNotFoundException, you already know all there is to know about the
> problem, adding errno to it is redundant and only encourages code that's
> bound to a specific implementation.
> 
> Instead, Phobos should present a self-consistent API that's independent
> of what it uses to implement it, be it C stdio (errno) or C++ iostreams
> or Oracle driver (Oracle-specific error codes) or Postgresql driver
> (Postgresql-specific error codes), or what have you.
> 
> For error codes that *don't* have a direct mapping to standard
> exceptions, you can just encapsulate the errno (or whatever) inside a
> specific catch-all exception type dedicated to catch these sorts of
> unmapped cases, so that code that *does* know what errno can just catch
> this exception and interpret what happened. General,
> platform-independent code need not know what this exception is at all,
> they can just treat it as a general problem and react accordingly.  We
> don't (and shouldn't) expect every app out there to know or care about
> the errno of a failed operation, especially if it doesn't map to one of
> the standard exception types.
> 
> 
>>   That way, exception types can be standard.
>>
>>   So, to keep D exception types standard reusable and respected by
>>   future code, you must follow the Open-Closed design principle
>>   (nicest principle of OO design ever).
>>   [..]
>>
>> Adding the Variant[string] is considered applying the great
>> Open-Closed Design Principle:
>>  -Open for reuse.
>>  -Closed for modification.
>> http://www.objectmentor.com/resources/articles/ocp.pdf
> [...]
> 
> Please bear in mind, I'm not saying that Variant[string] is *completely*
> useless. I'm just saying that most of the time it's not necessary. Sure
> there are some cases where it's useful, I've no problem with it being
> used in those cases. But we shouldn't be using it for all kinds of stuff
> that can be handled in better ways, e.g., static fields in a derived
> exception class.
> 
> 
> T
> 



Re: The Right Approach to Exceptions

2012-02-20 Thread Juan Manuel Cabo
>
> Jose's argument convinced me otherwise. I retract my agreement.
>
...
>
> No, I'm afraid there's a sizeable misunderstanding here.
>
>
> Andrei

Hahah, yeah, I think there is a sizeable misunderstanding: unless you
are referring to another guy with a spanish name in this thread,
(which I haven't found). My name is Juan Manuel (people call me:
Juanma, JM or Juan, but José is a first! I wouldn't mind John
which is the 'translation' of my name).

Back to the nasty argument. I think that the example that everyone
wants is this one. If anyone solves this one without Variant[string]
then it's a better solution than Variant[string]. (I repaste it
from an above reply I gave):

  [..]
  For instance: a C library wrapper, which gets the library errors encoded
  as some error code and throws them as exceptions. Shouldn't the library
  throw a FileNotFoundException when that's the error, instead of throwing
  a LibraryException that has the error code in a field?

  So the correct thing to do is: after a library call, the wrapper
  checks the last error code number with a switch statement, and deciding
  which standard exception type to throw (defaulting to whatever you like
  if the error code doesn't map to a standard D exception). Then you
  add the error code to the Variant[string], and any other extra info.

  That way, exception types can be standard.

  So, to keep D exception types standard reusable and respected by
  future code, you must follow the Open-Closed design principle
  (nicest principle of OO design ever).
  [..]

Adding the Variant[string] is considered applying the great
Open-Closed Design Principle:
-Open for reuse.
-Closed for modification.
http://www.objectmentor.com/resources/articles/ocp.pdf

--jm


On 02/20/2012 09:38 PM, Andrei Alexandrescu wrote:
> On 2/20/12 6:25 PM, H. S. Teoh wrote:
>> On Mon, Feb 20, 2012 at 05:15:17PM -0600, Andrei Alexandrescu wrote:
>> Formatting should use class reflection. We already discussed that, and
>> we already agreed that was the superior approach.
> 
> Jose's argument convinced me otherwise. I retract my agreement.
> 
>> When you're catching a specific exception, you're catching it with the
>> view that it will contain precisely information X, Y, Z that you need to
>> recover from the problem. If you don't need to catch something, then
>> don't put the catch block there.
> 
> That's extremely rare in my experience, and only present in toy examples that 
> contain a ton of "..." magic.
> 
>> The problem with using Variant[string] is that everything gets lumped
>> into one Exception object, and there's no way to only catch the
>> Exception that happens to have variables "p", "q", and "r" set in the
>> Variant[string].
> 
> No. You are still free to define as many exception classes as you deem 
> necessary. Please let's not pit the hash against
> the hierarchy again. This is confusing the role of the two. Consider the hash 
> an interface function you want to
> occasionally implement.
> 
>> You have to catch an exception type that includes all
>> sorts of combinations of data in Variant[string], then manually do tests
>> to single out the exception you want, and rethrow the rest. That's where
>> the ugliness comes from.
> 
> Yah, that would suck, but it's not at all what I say.
> 
>> [...]
>>> The code with Variant[string] does not need combinatorial testing if
>>> it wants to do a uniform action (such as formatting). It handles
>>> formatting uniformly, and if it wants to look for one particular field
>>> it inserts a test.
>>
>> Again, we've already agreed class reflection is the proper solution to
>> this one.
> 
> Agreement rescinded. Sorry! Jose's point was just too good, and reminded me 
> of a pain point I so long had with
> exception, I'd got used to it as a fact of life.
> 
 And then what do you do if you're depending on a particular field to
 be set, but it's not? Rethrow the exception? Then you have the stack
 trace reset problem.
>>>
>>> Don't forget that Variant[string] does not preclude distinct
>>> exception types. It's not one or the other.
>> [...]
>>
>> Agreed. But it shouldn't be the be-all and end-all of data passed in
>> exceptions. If anything, it should only be rarely used, with most
>> exception classes using static fields to convey relevant information.
> 
> And to perfectly help code duplication everywhere.
> 
>> I can see the usefulness of using Variant[string] as a way of
>> "decorating" exceptions with "extra attributes", but it shouldn't be the
>> primary way of conveying information from the throw site to the catch
>> site.
>>
>> As for iterating over the information in the most derived class, for
>> formatting, etc., class reflection is the way to go.
> 
> Agreement rescinded as far as exceptions go. That doesn't make reflection any 
> less necessary btw. It just reflects the
> dynamism of exception paths.
> 
>> We shouldn't be
>> using Variant[string] for this, because there's another p

Re: The Right Approach to Exceptions

2012-02-20 Thread Juan Manuel Cabo
oops, sorry!! I just saw a post by someone named Jose. My thousand apologies!!

On 02/20/2012 10:01 PM, Juan Manuel Cabo wrote:
>>
>> Jose's argument convinced me otherwise. I retract my agreement.
>>
> ...
>>
>> No, I'm afraid there's a sizeable misunderstanding here.
>>
>>
>> Andrei
> 
> Hahah, yeah, I think there is a sizeable misunderstanding: unless you
> are referring to another guy with a spanish name in this thread,
> (which I haven't found). My name is Juan Manuel (people call me:
> Juanma, JM or Juan, but José is a first! I wouldn't mind John
> which is the 'translation' of my name).
> 
> Back to the nasty argument. I think that the example that everyone
> wants is this one. If anyone solves this one without Variant[string]
> then it's a better solution than Variant[string]. (I repaste it
> from an above reply I gave):
> 
>   [..]
>   For instance: a C library wrapper, which gets the library errors encoded
>   as some error code and throws them as exceptions. Shouldn't the library
>   throw a FileNotFoundException when that's the error, instead of throwing
>   a LibraryException that has the error code in a field?
> 
>   So the correct thing to do is: after a library call, the wrapper
>   checks the last error code number with a switch statement, and decides
>   which standard exception type to throw (defaulting to whatever you like
>   if the error code doesn't map to a standard D exception). Then you
>   add the error code to the Variant[string], and any other extra info.
> 
>   That way, exception types can be standard.
> 
>   So, to keep D exception types standard reusable and respected by
>   future code, you must follow the Open-Closed design principle
>   (nicest principle of OO design ever).
>   [..]
> 
> Adding the Variant[string] is considered applying the great
> Open-Closed Design Principle:
>   -Open for reuse.
>   -Closed for modification.
> http://www.objectmentor.com/resources/articles/ocp.pdf
> 
> --jm
> 
> 
> On 02/20/2012 09:38 PM, Andrei Alexandrescu wrote:
>> On 2/20/12 6:25 PM, H. S. Teoh wrote:
>>> On Mon, Feb 20, 2012 at 05:15:17PM -0600, Andrei Alexandrescu wrote:
>>> Formatting should use class reflection. We already discussed that, and
>>> we already agreed that was the superior approach.
>>
>> Jose's argument convinced me otherwise. I retract my agreement.
>>
>>> When you're catching a specific exception, you're catching it with the
>>> view that it will contain precisely information X, Y, Z that you need to
>>> recover from the problem. If you don't need to catch something, then
>>> don't put the catch block there.
>>
>> That's extremely rare in my experience, and only present in toy examples 
>> that contain a ton of "..." magic.
>>
>>> The problem with using Variant[string] is that everything gets lumped
>>> into one Exception object, and there's no way to only catch the
>>> Exception that happens to have variables "p", "q", and "r" set in the
>>> Variant[string].
>>
>> No. You are still free to define as many exception classes as you deem 
>> necessary. Please let's not pit the hash against
>> the hierarchy again. This is confusing the role of the two. Consider the 
>> hash an interface function you want to
>> occasionally implement.
>>
>>> You have to catch an exception type that includes all
>>> sorts of combinations of data in Variant[string], then manually do tests
>>> to single out the exception you want, and rethrow the rest. That's where
>>> the ugliness comes from.
>>
>> Yah, that would suck, but it's not at all what I say.
>>
>>> [...]
>>>> The code with Variant[string] does not need combinatorial testing if
>>>> it wants to do a uniform action (such as formatting). It handles
>>>> formatting uniformly, and if it wants to look for one particular field
>>>> it inserts a test.
>>>
>>> Again, we've already agreed class reflection is the proper solution to
>>> this one.
>>
>> Agreement rescinded. Sorry! Jose's point was just too good, and reminded me 
>> of a pain point I so long had with
>> exception, I'd got used to it as a fact of life.
>>
>>>>> And then what do you do if you're depending on a particular field to
>>>>> be set, but it's not? Rethrow the exception? Then you have the stack
>>>>> trace res

Re: The Right Approach to Exceptions

2012-02-20 Thread Juan Manuel Cabo
On 02/20/2012 04:44 PM, Sean Kelly wrote:
> On Feb 20, 2012, at 11:44 AM, deadalnix wrote:
> 
>> That wouldn't work, because you'll erase the stacktrace.
> 
> It wouldn't be difficult to not overwrite a stack trace if one already exists 
> on throw.

That would be very nice!!

Java doesn't have a stacktrace reset problem:

I tried the following in Java. You can see by the output
that the stack trace of the exception object is
preserved (I didn't leave blank lines on purpose, so you can
count the line numbers shown in the output unequivocally):

public class bla {
public static void main(String[] args) throws Exception {
anotherfunc();
}
public static void anotherfunc() throws Exception {
try {
System.out.println("another func");
badfunc();
} catch (Exception ex) {
//rethrow the same exception:
throw ex;
}
}
public static void badfunc() throws Exception {
System.out.println("bad func");
throw new Exception("badfunc");
}
}

another func
bad func
Exception in thread "main" java.lang.Exception: badfunc
at bla.badfunc(bla.java:16)
at bla.anotherfunc(bla.java:8)
at bla.main(bla.java:3)

--jm



Re: The Right Approach to Exceptions

2012-02-20 Thread Juan Manuel Cabo
> Do you have actual use
> cases that requires adding data to exceptions? Without concrete examples
> we're just arguing about hypotheticals.

I posted a few hypothetical cases earlier in the thread, but this is one
long megathread!

I'm not facing an urgent need right now for the Variant[string] capability.

I just realized that the majority of twisted exception hierarchies are
grown by trying to encode transversal traits of the exceptions in their
type names.

The exception types should encode only one thing: the error *what*.

Now, for real use cases for the Variant[string], you just have to look
around and they are everywhere.

For instance: a C library wrapper, which gets the library errors encoded
as some error code and throws them as exceptions. Shouldn't the library
throw a FileNotFoundException when that's the error, instead of throwing
a LibraryException that has the error code in a field?

So the correct thing to do is: after a library call, the wrapper
checks the last error code number with a switch statement, and decides
which standard exception type to throw (defaulting to whatever you like
if the error code doesn't map to a standard D exception). Then you
add the error code to the Variant[string].


That way, exception types can be standard.

So, to keep D exception types standard reusable and respected by
future code, you must follow the Open-Closed design principle
(nicest principle of OO design ever).
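A minimal D sketch of the wrapper pattern described above. This is only an illustration: the error code, the `checkLastError` name, and the `info` member (standing in for the proposed Variant[string] field on Exception) are all hypothetical.

```d
import std.variant;

enum int ERR_FILE_NOT_FOUND = 2;  // hypothetical C library error code

class FileNotFoundException : Exception
{
    Variant[string] info;   // stand-in for the proposed extra-detail table
    this(string msg) { super(msg); }
}

// After each library call the wrapper inspects the last error code,
// decides which standard exception type to throw, and attaches the
// raw code to the Variant[string] instead of inventing a new type.
void checkLastError(int code)
{
    switch (code)
    {
    case 0:
        return;                                  // success, nothing to do
    case ERR_FILE_NOT_FOUND:
        auto ex = new FileNotFoundException("file not found");
        ex.info["errorCode"] = Variant(code);    // keep the raw code around
        throw ex;
    default:
        throw new Exception("library error");    // unmapped code, fall back
    }
}
```

Upstream code can then catch FileNotFoundException by type, and still read `info["errorCode"]` when it cares about the library-specific detail.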

--jm



On 02/20/2012 07:57 PM, H. S. Teoh wrote:
> On Mon, Feb 20, 2012 at 07:44:30PM -0300, Juan Manuel Cabo wrote:
> [...]
>>> (2) It's overly flexible. Anyone along the call stack can insert
>>> (hopefully NOT delete!!) additional data into the Exception object, as
>>> the stack is unwound. 
>>
>> As is currently the case.
>> Did you know that anyone can overwrite any field of the exception and
>> rethrow it? Such as msg field and so on?
> 
> This is an implementation bug. Exceptions should always be const in the
> catch block. I believe this issue has been filed, and will eventually be
> fixed.
> 
> 
>>> By the time it gets to the final catch() block, you cannot guarantee
>>> a particular field you depend on will be defined.
>>
>> If you want to guarantee it, then use a plain old variable for that
>> piece of data.
>>
>> I just would like a way to add data to an exception without creating a
>> new type.  If I create a new exception type for the wrong reasons, I'm
>> polluting the exception hierarchy.
> 
> Point taken.
> 
> So I think what we should have is *both* data stored in fields in
> Exception subclasses, and some kind of way to attach auxilliary data to
> the exception. Say with Variant[string], or whatever way you prefer.
> 
> But Variant[string] should not be used for *everything*. That only leads
> to problems. But then, it limits the usefulness of Variant[string],
> because then you can't just pass it to the i18n formatter, since now
> some fields are static but they may need to be part of the formatted
> message.
> 
> So we haven't really solved anything, we just added a new feature to
> Exception which I'm not sure how useful it is. Do you have actual use
> cases that requires adding data to exceptions? Without concrete examples
> we're just arguing about hypotheticals.
> 
> 
> T
> 



Re: The Right Approach to Exceptions

2012-02-20 Thread Juan Manuel Cabo
> I still don't like the idea of using Variant[string], though.
> 
> (1) It doesn't allow compile-time type checking. This is a big minus, in
> my book.

When you need compile-time type checking, define a variable in your class.
Just make sure that you are creating that new exception class for a good reason.

When a user needs to add a variable to the exception, he can add it
without putting your exception class chained in a new type of exception,
that will hide your class from being selected by upstream catch blocks
in the call tree.

> 
> (2) It's overly flexible. Anyone along the call stack can insert
> (hopefully NOT delete!!) additional data into the Exception object, as
> the stack is unwound. 

As is currently the case.
Did you know that anyone can overwrite any field of the exception and
rethrow it? Such as msg field and so on?

> By the time it gets to the final catch() block,
> you cannot guarantee a particular field you depend on will be defined.

If you want to guarantee it, then use a plain old variable for that piece
of data.

I just would like a way to add data to an exception without creating a new type.
If I create a new exception type for the wrong reasons, I'm polluting the
exception hierarchy.

If I pollute the exception hierarchy, then catching by exception type
becomes convoluted. It becomes difficult for an exception to fall
in the right catch. And I think that is a worse problem than
not being sure if a piece of data is in the extra info.

Data is data. Types are types. Exceptions should be typed the best way
possible that will allow me to select them and fall in the right
catch block. And that is the *what* of the error, not the data of
the error.


> Say if your call graph looks something like this:
> 
>   main()
> +--func1()
> +--func2()
> |   +--helperFunc()
> |   +--func3()
> |   +--helperFunc()
> +--func4()
> +--helperFunc()
> 
> Suppose helperFunc() throws HelperException, which func1's catch block
> specifically wants to handle. Suppose func2() adds an attribute called
> "lineNumber" to its catch block, which then rethrows the exception, and
> func3() adds an attribute called "colNumber".
> 
> Now how should you write func1()'s catch block? You will get all
> HelperException's thrown, but you've no idea from which part of the call
> graph it originates. If it comes from func3(), then you have both
> "lineNumber" and "colNumber". If it comes before you reach func3(), then
> only "lineNumber" is defined. If it comes from func4(), then neither is
> present.
> 
> So your catch block degenerates into a morass of if-then-else
> conditions. And then what do you do if you're depending on a particular
> field to be set, but it's not? Rethrow the exception? Then you have the
> stack trace reset problem.
> 
> Whereas if HelperException always has the the same fields, the catch
> block is very straightforward: just catch HelperException, and you are
> guaranteed you have all the info you need.
> 
> Then if func3() wants to add more info, create a new exception derived
> from HelperException, and add the field there. Then in func1(), add a
> new catch block that catches the new exception, and makes use of the new
> field.
> 
> This does introduce a lot of little exception classes, which you could
> argue is class bloat, but I don't see how the Variant[string] method is
> necessarily superior. It comes with its own set of (IMHO quite nasty)
> problems.
> 
> 
> T
> 


Hahaha, it sometimes feels as though people are afraid that the Variant[string]
idea is to never use plain old variables and never use exception subclasses. :-)

On the contrary, the idea is so that plain old variables and exception 
subclasses
can be created for the right reasons, and to remove cases where they need
to be created for the wrong reasons.

--jm







Re: The Right Approach to Exceptions

2012-02-20 Thread Juan Manuel Cabo
I forgot to add that you could define standard detail names
as string constants, and even document them in the string constant
definition.
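A sketch of what such documented constants could look like (the names and keys here are made up, not an existing convention):

```d
// Hypothetical standard detail keys, defined and documented in one place
// so that throw sites and catch sites agree on spelling at compile time.

/// Raw error code returned by a wrapped C library call.
enum string infoErrorCode = "errorCode";

/// URL of the web request being served when the error occurred.
enum string infoRequestUrl = "requestUrl";
```

A catch block would then test `ex.hasInfo(infoErrorCode)` instead of a bare string literal, so a typo becomes a compile-time error.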

--jm


On 02/20/2012 06:11 PM, Juan Manuel Cabo wrote:
> Yeah.. that is a problem! :-) Thanks for liking the idea, now we can
> talk about the fine details!!
> 
> One way is to not give the user direct access to the associative array,
> but wrap the e.info["MyDetail"] call in a nothrow function, such as
> e.info("MyDetail"), and an e.hasInfo("MyDetail"), and of course:
> e.addInfo("MyDetail", value) and e.allInfoNames() or something.
> 
> The nothrow function would return an empty value if not found (I fear
> that it might not be of the same Variant subtype as the Variant value
> was intended when present).
> 
> --jm
> 
> 
> On 02/20/2012 05:53 PM, H. S. Teoh wrote:
>> On Mon, Feb 20, 2012 at 05:31:28PM -0300, Juan Manuel Cabo wrote:
>>>> ...
>>>> Sure. Again, this is not advocating replacement of exception hierarchies 
>>>> with tables!
>>>> ... 
>>>>
>>>> Andrei
>>>>
>>>
>>> I think that the case of rethrowing an exception with added detail is
>>> the worst enemy of clean Exception hierarchies.
>>
>> Hmm. This is a valid point. Sometimes you want to add contextual details
>> to an exception in order to provide the final catching code with more
>> useful information. Otherwise you may end up with a chain of mostly
>> redundant exception classes:
>>
>>  class UTFError : Exception {...}
>>  class LexUTFError : UTFError {
>>  int line, col;
>>  ...
>>  }
>>  class ConfigFileParseError : LexUTFError {
>>  string cfgfile_name;
>>  }
>>
>>  auto decodeUTF(...) {
>>  ...
>>  throw new UTFError;
>>  }
>>
>>  auto configLexer(...) {
>>  try {
>>  ...
>>  decodeUTF(...);
>>  } catch(UTFError e) {
>>  throw new LexUTFError(...);
>>  }
>>  }
>>
>>  auto configParser(...) {
>>  try {
>>  ...
>>  configLexer(...);
>>  } catch(LexUTFError e) {
>>  throw new ConfigFileParseError(...);
>>  }
>>  }
>>
>>
>>> The idea of Variant[string] remedies that case without creating a new
>>> exception class just for the added fields. If that case is solved,
>>> then the typical need for creating new exception types that don't
>>> really aid selecting them for catching and recovery is solved too.
>> [...]
>>
>> However, I still hesitate about using Variant[string]. How would you
>> address the following problem:
>>
>>  // Module A
>>  class MyException : Exception {
>>  this() {
>>  info["mydetail"] = ...;
>>  }
>>  }
>>
>>  // Module B
>>  auto func() {
>>  try {
>>  ...
>>  } catch(MyException e) {
>>  if (e.info["mydetail"] == ...) {
>>  ...
>>  }
>>  }
>>  }
>>
>> If module A's maintainer renames "mydetail" to "detail", then module B
>> will still compile with no problem, but now e.info["mydetail"] doesn't
>> exist and will cause a runtime error at worst. At best, the catch block
>> won't be able to recover from the error as it did before, because now it
>> can't find the info it was looking for.
>>
>> If "mydetail" had been a field stored in MyException, then module B
>> would get a compile-time error, and the problem can be fixed
>> immediately, instead of going unnoticed until it blows up at the
>> customer's production server.
>>
>>
>> T
>>
> 



Re: The Right Approach to Exceptions

2012-02-20 Thread Juan Manuel Cabo
Yeah.. that is a problem! :-) Thanks for liking the idea, now we can
talk about the fine details!!

One way is to not give the user direct access to the associative array,
but wrap the e.info["MyDetail"] call in a nothrow function, such as
e.info("MyDetail"), and an e.hasInfo("MyDetail"), and of course:
e.addInfo("MyDetail", value) and e.allInfoNames() or something.

The nothrow function would return an empty value if not found (I fear
that it might not be of the same Variant subtype as the Variant value
was intended when present).
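A rough sketch of those wrapped accessors. The class name InfoException is a stand-in; the actual proposal would put these members on Exception itself.

```d
import std.variant;

// Hypothetical sketch: the associative array is private, and lookups
// never throw. An absent detail yields an empty Variant.
class InfoException : Exception
{
    private Variant[string] _info;

    this(string msg) { super(msg); }

    bool hasInfo(string name)
    {
        return (name in _info) !is null;
    }

    // Never throws: returns Variant.init when the detail is absent.
    Variant info(string name)
    {
        if (auto p = name in _info)
            return *p;
        return Variant.init;
    }

    void addInfo(string name, Variant value)
    {
        _info[name] = value;
    }

    string[] allInfoNames()
    {
        return _info.keys;
    }
}
```

The caveat noted above still applies: an empty Variant carries no type, so callers should check hasInfo (or Variant.hasValue) before converting.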

--jm


On 02/20/2012 05:53 PM, H. S. Teoh wrote:
> On Mon, Feb 20, 2012 at 05:31:28PM -0300, Juan Manuel Cabo wrote:
>>> ...
>>> Sure. Again, this is not advocating replacement of exception hierarchies 
>>> with tables!
>>> ... 
>>>
>>> Andrei
>>>
>>
>> I think that the case of rethrowing an exception with added detail is
>> the worst enemy of clean Exception hierarchies.
> 
> Hmm. This is a valid point. Sometimes you want to add contextual details
> to an exception in order to provide the final catching code with more
> useful information. Otherwise you may end up with a chain of mostly
> redundant exception classes:
> 
>   class UTFError : Exception {...}
>   class LexUTFError : UTFError {
>   int line, col;
>   ...
>   }
>   class ConfigFileParseError : LexUTFError {
>   string cfgfile_name;
>   }
> 
>   auto decodeUTF(...) {
>   ...
>   throw new UTFError;
>   }
> 
>   auto configLexer(...) {
>   try {
>   ...
>   decodeUTF(...);
>   } catch(UTFError e) {
>   throw new LexUTFError(...);
>   }
>   }
> 
>   auto configParser(...) {
>   try {
>   ...
>   configLexer(...);
>   } catch(LexUTFError e) {
>   throw new ConfigFileParseError(...);
>   }
>   }
> 
> 
>> The idea of Variant[string] remedies that case without creating a new
>> exception class just for the added fields. If that case is solved,
>> then the typical need for creating new exception types that don't
>> really aid selecting them for catching and recovery is solved too.
> [...]
> 
> However, I still hesitate about using Variant[string]. How would you
> address the following problem:
> 
>   // Module A
>   class MyException : Exception {
>   this() {
>   info["mydetail"] = ...;
>   }
>   }
> 
>   // Module B
>   auto func() {
>   try {
>   ...
>   } catch(MyException e) {
>   if (e.info["mydetail"] == ...) {
>   ...
>   }
>   }
>   }
> 
> If module A's maintainer renames "mydetail" to "detail", then module B
> will still compile with no problem, but now e.info["mydetail"] doesn't
> exist and will cause a runtime error at worst. At best, the catch block
> won't be able to recover from the error as it did before, because now it
> can't find the info it was looking for.
> 
> If "mydetail" had been a field stored in MyException, then module B
> would get a compile-time error, and the problem can be fixed
> immediately, instead of going unnoticed until it blows up at the
> customer's production server.
> 
> 
> T
> 



Re: The Right Approach to Exceptions

2012-02-20 Thread Juan Manuel Cabo
I like one golden rule that I have:

 "You should only create a new exception type if it makes sense to write
  a  catch(MyNewShinyType ex){}  "

I don't consider other reasons for creating a new exception class valid
(but that's just me!).
This is because exception types (in my head) are only a way to decide
whether to select an exception or let it go when I write a catch(), and to
help the catch recover.  Now, if I can't distinguish the *what* of the error,
I cannot recover well. The *cause* of the error goes inside the exception
object, not encoded in the type. Other details of the error go inside
the exception object, not encoded in the type name.

So all I care about is the *what* of the error, so that it will fall in
the correct catch statement. Other criteria obscures that.

The Variant[string] helps keep the hierarchy clean. The hierarchy should
tell the *what* of the error so that I can pick one when writing a catch block.
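As a hypothetical illustration of this golden rule (the type name and fields below are made up), the type encodes only the *what*, while the cause and other details live inside the object:

```d
// The type name says *what* went wrong, so a catch block can select it.
// Details such as the path and OS error code are fields, not type names.
class FileNotFoundException : Exception
{
    string path;          // detail: stored in the object...
    int osErrorCode;      // ...not encoded in the type name
    this(string msg, string path, int code)
    {
        super(msg);
        this.path = path;
        this.osErrorCode = code;
    }
}

void useConfig()
{
    try
    {
        throw new FileNotFoundException("config missing", "/etc/app.conf", 2);
    }
    catch (FileNotFoundException ex)   // selected by the *what* of the error
    {
        // recover using the details carried inside the object
        assert(ex.path == "/etc/app.conf");
    }
}
```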

--jm


On 02/20/2012 05:51 PM, Jonathan M Davis wrote:
> On Monday, February 20, 2012 17:31:28 Juan Manuel Cabo wrote:
>>> ...
>>> Sure. Again, this is not advocating replacement of exception hierarchies
>>> with tables! ...
>>>
>>> Andrei
>>
>> I think that the case of rethrowing an exception with added detail is the
>> worst enemy of clean Exception hierarchies.
>> The idea of Variant[string] remedies that case without creating a new
>> exception class just for the added fields. If that case is solved, then the
>> tipical need for creating new exception types that don't really aid
>> selecting them for catching and recovery is solved too.
> 
> Having derived exceptions with additional information is a _huge_ boon, and I 
> contend that it's vastly better than with variant, which would be highly error 
> prone, because it's not properly statically checked. Changes to what's put in 
> the variant could kill code at runtime - code which by its very definiton is 
> not supposed to be the normal code path, so you're less likely to actually 
> run 
> into the problem before you ship your product. Whereas with the information 
> in 
> actual member variables, if they get changed, you get a compilation error, 
> and 
> you know that you have to fix your code.
> 
> Rethrowing is a separate issue. And in many cases, the correct thing to do is 
> to chain exceptions. You catch one, do something with it, and then you throw 
> a 
> new one which took the first one as an argument. Then you get both. That 
> functionality is already built into Exception.
> 
> - Jonathan M Davis



Re: The Right Approach to Exceptions

2012-02-20 Thread Juan Manuel Cabo
> That's a very interesting angle!
>
> Andrei

Thanks!!

The main point I was making is that, otherwise, a user would be forced
to create a new exception kind and chain the original exception
to it. But then, what about the catch blocks defined upstream in
the call tree?
This solves that. The code upstream can still write a
  catch (FileNotFound)  that obviously distinguishes
FileNotFound from other exception types. Otherwise,
if all one throws is, e.g., COMExceptions that chain other exceptions,
the distinction is lost, and rethrows mess everything up.

--jm


On 02/20/2012 05:49 PM, Andrei Alexandrescu wrote:
> On 2/20/12 1:32 PM, Juan Manuel Cabo wrote:
>> So, if your boss wants the URL of the request that was made
>> when the standard library threw you a FileNotFoundException,
>> you can do:
>>
>>
>> try {
>>   ...
>>  } catch (Exception ex) {
>>  //Rethrow the exception with the added detail:
>> ex.details["custom_url"] = getenv("URI");
>>  throw ex;
>>  }
> 
> That's a very interesting angle!
> 
> Andrei
> 
> 



Re: The Right Approach to Exceptions

2012-02-20 Thread Juan Manuel Cabo
> ...
> Sure. Again, this is not advocating replacement of exception hierarchies with 
> tables!
> ... 
> 
> Andrei
> 

I think that the case of rethrowing an exception with added detail is the worst
enemy of clean Exception hierarchies.
The idea of Variant[string] remedies that case without creating a new exception
class just for the added fields. If that case is solved, then the typical need
for creating new exception types that don't really aid selecting them for
catching and recovery is solved too.

--jm




Re: The Right Approach to Exceptions

2012-02-20 Thread Juan Manuel Cabo
On 02/20/2012 03:23 PM, Jonathan M Davis wrote:
> I don't see how you could possibly make that uniform. It's very non-uniform 
> by 
> its very nature. The handling _needs_ to be non-uniform.
> 

The handling might need to be non-uniform, but the exception hierarchy doesn't.

Not talking about i18n formatting now.
Sometimes I'd like to add a 'trait' to an exception, but find myself needing
to create a new exception type just for that, which will sit oddly in the
hierarchy.

Consider the case of rethrowing an exception with added detail.

--jm



Re: The Right Approach to Exceptions

2012-02-20 Thread Juan Manuel Cabo
I agree to disagree.

But about your argument for efficiency: any check that you do when
examining an exception doesn't need to be lightning fast. After all,
if you are looking at an exception object, it means that the stack
got unwound, and any nanoseconds that you spend doing this:

catch (Exception ex) {
    if ("mycustomfield" in ex.info) {
        // .. do something ..
    }
}

which is just an "in" check on an associative array that might never
have more than 10 elements (so even linear search is appropriate),
are overshadowed by the time that the runtime took to unwind the stack
and serve the exception to your catch block.

--jm


On 02/20/2012 05:05 PM, Sean Kelly wrote:
> On Feb 20, 2012, at 11:54 AM, Juan Manuel Cabo wrote:
> 
>> About this part:
>>
>>>> What you want is throw a COMException and link it to the original
>>>> Exception. You have to consider Exception as a linkedlist, one
>>>> being the cause of another.
>>
>> The Variant[string] is an idea to help avoid people creating new kinds
>> of exception types that don't add anything.
> 
> I don't think this makes sense.  To effectively use whatever's in the table I 
> pretty much have to know what error I'm handling, and this isn't possible if 
> type information is lost.  Unless this determination is moved to a run-time 
> check of some field within the exception, and then I'm making my code that 
> much messier and less efficient by putting in tests of this identifier 
> against a list of constants.  Personally, I don't see any use for this table 
> beyond providing context, much like we already have with file, line, etc.



Re: The Right Approach to Exceptions

2012-02-20 Thread Juan Manuel Cabo
With all due respect, I don't see module exception categories
as something good for the categorization of exceptions.

As I said in a previous post, long lost in this mega thread,
please don't create exception categories, unless it makes
sense to write a   catch (MyNewShinyCategory ex) {}  for it.
An artificial category would be in the way of other criteria
that aid catch grouping better.
(For instance, many different modules might want to add a
kind of IOException (not that I advocate the IOException
category, this is just for illustration))

--jm


On 02/20/2012 01:45 PM, Andrei Alexandrescu wrote:
> On 2/20/12 10:37 AM, foobar wrote:
>> On Monday, 20 February 2012 at 15:50:08 UTC, Andrei Alexandrescu wrote:
>>> Actually that just shuffles the matter around. Any setup does demand
>>> that some library (in this case most probably the standard library) will
>>> be a dependency knot because it defines the hierarchy that others should
>>> use.
>>
>> Not accurate. A 3rd party library that want to be compatible will no
>> doubt depend on the standard library's _exception hierarchy_ but that
>> does *not* mean it should depend on the parallel functionality in the
>> standard library. Per our example with IO, if I use tango.io I certainly
>> do not want my application code to include redundantly both std.io and
>> tango.io. I wanted to use tango.io as an *alternative* to std.io.
> 
> This is a confusion. Using PackageException!"std.io" does not require 
> importing std.io. Conversely, using
> std.IOException _does_ require importing std.exceptions or whatnot. So from a 
> dependency management viewpoint,
> PackageException is superior to IOException.
> 
> The converse disadvantage is that typos won't be caught during compilation. 
> For example, using PackageException!"sdt.io"
> will go through no problem, but of course won't be caught by people waiting 
> for a PackageException!"std.io".
> 
> 
> Andrei
> 



Re: The Right Approach to Exceptions

2012-02-20 Thread Juan Manuel Cabo
About this part:

>> What you want is throw a COMException and link it to the original
>> Exception. You have to consider Exception as a linkedlist, one
>> being the cause of another.

The Variant[string] is an idea to help people avoid creating new kinds
of exception types that don't add anything.

I guess that you are proving my point.

--jm



On 02/20/2012 04:48 PM, Juan Manuel Cabo wrote:
>> That wouldn't work, because you'll erase the stacktrace.
>>
>> Plus, you are confusing inheritance with composition here. What you want is 
>> throw a COMException and link it to the
>> original Exception. You have to consider Exception as a linkedlist, one 
>> being the cause of another.
> 
> You are correct. But it doesn't change the FILE and LINE attributes of the 
> exception.
> The code below changes the msg of the exception and rethrows it.
> Please note that the stacktrace is changed as you say. But the:
>object.Exception@t.d(17): another
> points to the site where it was produced originally:
> 
> #!/usr/bin/rdmd
> import std.stdio;
> void main () {
> anotherFunc();
> }
> void anotherFunc() {
> try {
> writeln("another func");
> badfunc();
> } catch (Exception ex) {
> ex.msg = "another";
> throw ex;
> }
> }
> void badfunc() {
> writeln("bad func");
> throw new Exception("badfunc");
> }
> 
> 
> another func
> bad func
> object.Exception@t.d(17): another
> 
> ./t(void t.anotherFunc()+0x2b) [0x42a1c7]
> ./t(_Dmain+0x9) [0x42a195]
> ./t(extern (C) int rt.dmain2.main(int, char**).void runMain()+0x17) [0x43c003]
> ./t(extern (C) int rt.dmain2.main(int, char**).void tryExec(scope void 
> delegate())+0x2a) [0x43b97a]
> ./t(extern (C) int rt.dmain2.main(int, char**).void runAll()+0x42) [0x43c056]
> ./t(extern (C) int rt.dmain2.main(int, char**).void tryExec(scope void 
> delegate())+0x2a) [0x43b97a]
> ./t(main+0xd3) [0x43b90b]
> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xff) [0x7fc83b628eff]
> 
> 
> 
> 
> 
> 
> 
> On 02/20/2012 04:44 PM, deadalnix wrote:
>> Le 20/02/2012 20:32, Juan Manuel Cabo a écrit :
>>> So, if your boss wants the URL of the request that was made
>>> when the standard library threw you a FileNotFoundException,
>>> you can do:
>>>
>>>
>>> try {
>>>   ...
>>>  } catch (Exception ex) {
>>>  //Rethrow the exception with the added detail:
>>> ex.details["custom_url"] = getenv("URI");
>>>  throw ex;
>>>  }
>>>
>>> (don't beat me if URI wasn't the envvar that apache sends for uri, I
>>> don't remember it at the moment).
>>>
>>> --jm
>>>
>>>
>>>
>>> On 02/20/2012 04:27 PM, Juan Manuel Cabo wrote:
>>>>> And so variant is the way to go ?
>>>>>
>>>>> Clearly, this is a very strong argument in favor of typed Exceptions, 
>>>>> which provide useful information about what went
>>>>> wrong. This is a safe approach.
>>>>>
>>>>> Because this Variant stuff is going to require massive ducktyping of 
>>>>> Exceptions, with all possible errors involved. The
>>>>> keys in the Variant[string] will depend on the Exception the dev is 
>>>>> facing. This should be avoided and should warn us
>>>>> about the requirement of typed Exceptions.
>>>>
>>>>
>>>> Some of the things that characterize an exception, their traits, are
>>>> transversal to any hierarchy that you can imagine, now and in the future.
>>>>
>>>> You can choose to keep changing a hierarchy, or build in some mechanism
>>>> to the Exception base class, that will allow you to get your traits
>>>> without downcasting.
>>>>
>>>> Say that at your place of work the boss decides that all exception classes
>>>> should have a COM error code, or that all exception classes should
>>>> provide the URL of the request that generated it.
>>>>
>>>> You know what will happen?
>>>>
>>>> Your boss will require you to derive all your exception classes from
>>>> COMException or from WebRequestException and then redefine FileNotFound
>>>> as a subclass of them. So you will have your FileNotFoundException
>>>> different than the standard library exception.
>>>>
>>>> Don't believe me? It happened to .NET... they've got a COMException
>>>> that wraps any other kind of error during a COM call.
>>>>
>>>>
>>>> --jm
>>>>
>>>



Re: The Right Approach to Exceptions

2012-02-20 Thread Juan Manuel Cabo
> That wouldn't work, because you'll erase the stacktrace.
>
> Plus, you are confusing inheritance with composition here. What you want is 
> throw a COMException and link it to the
> original Exception. You have to consider Exception as a linkedlist, one being 
> the cause of another.

You are correct. But it doesn't change the FILE and LINE attributes of the 
exception.
The code below changes the msg of the exception and rethrows it.
Please note that the stacktrace is changed as you say. But the:
   object.Exception@t.d(17): another
points to the site where it was produced originally:

#!/usr/bin/rdmd
import std.stdio;
void main () {
anotherFunc();
}
void anotherFunc() {
try {
writeln("another func");
badfunc();
} catch (Exception ex) {
ex.msg = "another";
throw ex;
}
}
void badfunc() {
writeln("bad func");
throw new Exception("badfunc");
}


another func
bad func
object.Exception@t.d(17): another

./t(void t.anotherFunc()+0x2b) [0x42a1c7]
./t(_Dmain+0x9) [0x42a195]
./t(extern (C) int rt.dmain2.main(int, char**).void runMain()+0x17) [0x43c003]
./t(extern (C) int rt.dmain2.main(int, char**).void tryExec(scope void 
delegate())+0x2a) [0x43b97a]
./t(extern (C) int rt.dmain2.main(int, char**).void runAll()+0x42) [0x43c056]
./t(extern (C) int rt.dmain2.main(int, char**).void tryExec(scope void 
delegate())+0x2a) [0x43b97a]
./t(main+0xd3) [0x43b90b]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xff) [0x7fc83b628eff]
----






On 02/20/2012 04:44 PM, deadalnix wrote:
> Le 20/02/2012 20:32, Juan Manuel Cabo a écrit :
>> So, if your boss wants the URL of the request that was made
>> when the standard library threw you a FileNotFoundException,
>> you can do:
>>
>>
>> try {
>>   ...
>>  } catch (Exception ex) {
>>  //Rethrow the exception with the added detail:
>> ex.details["custom_url"] = getenv("URI");
>>  throw ex;
>>      }
>>
>> (don't beat me if URI wasn't the envvar that apache sends for uri, I
>> don't remember it at the moment).
>>
>> --jm
>>
>>
>>
>> On 02/20/2012 04:27 PM, Juan Manuel Cabo wrote:
>>>> And so variant is the way to go ?
>>>>
>>>> Clearly, this is a very strong arguement in favor of typed Exception, that 
>>>> provide usefull information about what went
>>>> wrong. This is a safe approach.
>>>>
>>>> Because this Variant stuff is going to require massive ducktyping of 
>>>> Exceptions, with all possible errors involved. The
>>>> keys in the Variant[string] will depend on the Exception the dev is 
>>>> facing. This should be avoided and should warn us
>>>> about the requirement of typed Exceptions.
>>>
>>>
>>> Some of the things that characterize an exception, their traits, are
>>> transversal to any hierachy that you can imagine, now and in the future.
>>>
>>> You can choose to keep changing a hierarchy, or build in some mechanism
>>> to the Exception base class, that will allow you to get your traits
>>> without downcasting.
>>>
>>> Say that at your place of work the boss decides that all exception classes
>>> should have a COM error code, or that all exception classes should
>>> provide the URL of the request that generated it.
>>>
>>> You know what will happen?
>>>
>>> Your boss will require you to derive all your exception classes from
>>> COMException or from WebRequestException and then redefine FileNotFound
>>> as a subclass of them. So you will have your FileNotFoundException
>>> different than the standard library exception.
>>>
>>> Don't believe me? It happened to .NET... they've got a COMException
>>> that wraps any other kind of error during a COM call.
>>>
>>>
>>> --jm
>>>
>>


Re: The Right Approach to Exceptions

2012-02-20 Thread Juan Manuel Cabo
So, if your boss wants the URL of the request that was made
when the standard library threw you a FileNotFoundException,
you can do:


try {
    ...
} catch (Exception ex) {
    // Rethrow the exception with the added detail:
    ex.details["custom_url"] = getenv("URI");
    throw ex;
}

(don't beat me if URI wasn't the envvar that apache sends for uri, I
don't remember it at the moment).

--jm



On 02/20/2012 04:27 PM, Juan Manuel Cabo wrote:
>> And so variant is the way to go ?
>>
>> Clearly, this is a very strong arguement in favor of typed Exception, that 
>> provide usefull information about what went
>> wrong. This is a safe approach.
>>
>> Because this Variant stuff is going to require massive ducktyping of 
>> Exceptions, with all possible errors involved. The
>> keys in the Variant[string] will depend on the Exception the dev is facing. 
>> This should be avoided and should warn us
>> about the requirement of typed Exceptions.
> 
> 
> Some of the things that characterize an exception, their traits, are
> transversal to any hierachy that you can imagine, now and in the future.
> 
> You can choose to keep changing a hierarchy, or build in some mechanism
> to the Exception base class, that will allow you to get your traits
> without downcasting.
> 
> Say that at your place of work the boss decides that all exception classes
> should have a COM error code, or that all exception classes should
> provide the URL of the request that generated it.
> 
> You know what will happen?
> 
> Your boss will require you to derive all your exception classes from
> COMException or from WebRequestException and then redefine FileNotFound
> as a subclass of them. So you will have your FileNotFoundException
> different than the standard library exception.
> 
> Don't believe me? It happened to .NET... they've got a COMException
> that wraps any other kind of error during a COM call.
> 
> 
> --jm
> 



Re: The Right Approach to Exceptions

2012-02-20 Thread Juan Manuel Cabo
> And so variant is the way to go ?
> 
> Clearly, this is a very strong arguement in favor of typed Exception, that 
> provide usefull information about what went
> wrong. This is a safe approach.
> 
> Because this Variant stuff is going to require massive ducktyping of 
> Exceptions, with all possible errors involved. The
> keys in the Variant[string] will depend on the Exception the dev is facing. 
> This should be avoided and should warn us
> about the requirement of typed Exceptions.


Some of the things that characterize an exception, their traits, are
transversal to any hierachy that you can imagine, now and in the future.

You can choose to keep changing a hierarchy, or build in some mechanism
to the Exception base class, that will allow you to get your traits
without downcasting.

Say that at your place of work the boss decides that all exception classes
should have a COM error code, or that all exception classes should
provide the URL of the request that generated it.

You know what will happen?

Your boss will require you to derive all your exception classes from
COMException or from WebRequestException and then redefine FileNotFound
as a subclass of them. So you will have your FileNotFoundException
different than the standard library exception.

Don't believe me? It happened to .NET... they've got a COMException
that wraps any other kind of error during a COM call.


--jm



Re: The Right Approach to Exceptions

2012-02-20 Thread Juan Manuel Cabo
I like the idea!

A reminder for anyone reading: use positional arguments in
format strings. Otherwise:

"The '%s' file's size is %d which is wrong"

translated to

"El tamaño %d es incorrecto para el archivo %s"

will be trouble. Instead please do:

"The '%1$s' file's size is %2$d which is wrong"

especially for standard library messages. This would be very helpful!
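A minimal sketch of the point above, using std.format's POSIX-style positional specifiers (%1$s, %2$d). The messages are the examples from this post, not real Phobos strings.

```d
import std.format : format;

void main()
{
    string file = "data.txt";
    long size = 42;

    // English template: filename first, size second.
    auto en = format("The '%1$s' file's size is %2$d which is wrong",
                     file, size);

    // The Spanish translation reorders the slots but keeps the same
    // argument list -- exactly what positional specifiers allow.
    auto es = format("El tamaño %2$d es incorrecto para el archivo %1$s",
                     file, size);

    assert(en == "The 'data.txt' file's size is 42 which is wrong");
    assert(es == "El tamaño 42 es incorrecto para el archivo data.txt");
}
```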

--jm



> into this:
>
> try
> getopt(args, ...)
> catch(Exception e)
> {
> stderr.writeln(stringTemplate(typeid(e).toString(), e.info));
> return -1;
> }
>
> The stringTemplate function loads the formatting template from a table 
> indexed on typeid(e).toString() and formats it
> with the info. It's simple factorization.
>
>
> Andrei




Re: The Right Approach to Exceptions

2012-02-20 Thread Juan Manuel Cabo
On 02/20/2012 02:57 PM, Andrei Alexandrescu wrote:
> On 2/20/12 11:44 AM, foobar wrote:
>> This extra processing is orthogonal to the exception. the same exception
>> can be logged to a file, processed (per above example) and generate
>> graphical notification to the user, etc. The exception contains the
>> information pertaining only to what went wrong. the rest is not part of
>> this discussion.
> 
> Exactly. I don't see how a disagreement follows from here. So isn't it 
> reasonable to design the exception such that it
> can offer information pertaining to what went wrong, in a uniform manner?
> 
>> The exact same exception in the example would also be thrown on a
>> mistyped URL in an application that tries to scrape some info from a
>> website for further processing. The error is still the same - the url is
>> incorrect but different use cases handle it differently. In the former
>> example I might call to a i18n lib (someone already mentioned gettext)
>> while in the latter I'll call a logging library with the the mistyped
>> url (for statistics' sake).
>> in the first I use the url to find a closest match, in the second I want
>> to log said requested url. Both handled by *other* mechanisms.
>> in both cases the exception needs a url field and in both cases I have
>> no need for the Variant[string] map.
> 
> The Variant[string] map saves a lot of duplication whenever you want to 
> format a human-readable string (which is a
> common activity with exceptions). It transforms this (I'm too lazy to write 
> code anew by hand, so I'll paste Jonathan's):
> 
> try
> getopt(args, ...)
> catch(MissingArgumentException mae)
> {
> stderr.writefln("%s is missing an argument", mae.flag);
> return -1;
> }
> catch(InvalidArgumentException iae)
> {
> stderr.writelfln("%s is not a valid argument for %s. You must give it a
> %s.", mae.arg, mae.flag, mae.expectedType);
> return -1;
> }
> catch(UnknownFlagException ufe)
> {
> stderr.writefln("%s is not a known flag.", ufe.ufe);
> return -1;
> }
> catch(GetOptException goe)
> {
> stderr.writefln("There was an error with %s",  goe.flag);
> return -1;
> }
> //A delegate that you passed to getopt threw an exception.
> catch(YourException ye)
> {
> //...
> }
> catch(Exception e)
> {
> stderr.writeln("An unexpected error occured.");
> return -1;
> }
> 
> into this:
> 
> try
> getopt(args, ...)
> catch(Exception e)
> {
> stderr.writeln(stringTemplate(typeid(e).toString(), e.info));
> return -1;
> }
> 
> The stringTemplate function loads the formatting template from a table 
> indexed on typeid(e).toString() and formats it
> with the info. It's simple factorization.
> 
> 
> Andrei



Re: The Right Approach to Exceptions

2012-02-19 Thread Juan Manuel Cabo
Thanks for the interest!! Not sure when, but I'll write a 
translation.

--jm

On Sunday, 19 February 2012 at 15:57:25 UTC, Andrei Alexandrescu 
wrote:

On 2/19/12 4:30 AM, Juan Manuel Cabo wrote:
Hello D community! This is my first post!! I hope I can bring 
clarity to

all this. If not, I apologize.

[snip]

Thanks for an insightful post. If you found the time, it would 
be great if you could translate your paper on exceptions from 
Spanish.


Andrei





Re: The Right Approach to Exceptions

2012-02-19 Thread Juan Manuel Cabo
Well, since keys would be string, you can define them at the base 
class.
So, FileException can define a few detail names that you know are 
used in its derived classes. And so, Exception can define detail 
names that are standard for all classes.


So it would look like:

class FileException {
[..]
static immutable details_filename = "filename";
static immutable details_ispipe = "ispipe";
[..]
}

class Exception {
[..]
static immutable details_transient = "transient";
static immutable details_i18n_name = "i18n_name";
[..]
}


So, you *know* that a portion of the tree supports certain 
details in the associative array when you type the dot after the 
exception class name in an IDE with autocomplete (or ctrl-N in 
Vim with the definition open ;-) ).
And you can document with ddoc those static variables. So the 
example would be now:


For instance:

 ...
 catch (Exception ex) {
 if (Exception.details_i18n_name in ex.details) {
 log(translate(ex.details[Exception.details_i18n_name]));
 }
 if (FileException.details_filename in ex.details) {
 log("This file is trouble: "
 ~ ex.details[FileException.details_filename]);
 }
 if (Exception.details_transient in ex.details) {
 repeatOneMoreTime();
 }
 }
 ...




On Sunday, 19 February 2012 at 14:54:29 UTC, Jacob Carlborg wrote:

On 2012-02-19 13:27, Juan Manuel Cabo wrote:
How about adding a string[string] or a variant[string] to the 
Exception
class, so one can know details about the subclassed exception 
without

downcasting? How ugly would that be?

For instance:

...
catch (Exception ex) {
if ("transient" in ex.details) {
repeatOneMoreTime();
}
if ("i18n_code" in ex.details) {
log(translate(ex.details["i18n_code"]));
}
}
...

Details can be standard by convention or otherwise custom.
(I can see that this can lead to messy proliferation of 
details, but at

least solves most of the issues).


How would you know which keys are available in "ex.details", 
documentation?





Re: The Right Approach to Exceptions

2012-02-19 Thread Juan Manuel Cabo
I uploaded my little 2003 article on exception categorization to 
github, where I detailed my whole point_of_view_of_the_catch and 
whole_program_invariant philosophy for organizing classes, and 
compared the C++, .NET and Java hierarchies. It's not a 
professional article, and I never wrote an English translation:


https://github.com/jmcabo/ExceptionArticle/blob/master/articuloExcepciones_v2.pdf

--jm


On Sunday, 19 February 2012 at 10:30:45 UTC, Juan Manuel Cabo 
wrote:
Hello D community! This is my first post!! I hope I can bring 
clarity to all this. If not, I apologize.


Some time ago I researched the best way to classify exceptions 
and build a hierarchy. I came up with the following rules:



1) At the top level, there would be RecoverableExceptions and 
FatalExceptions (or you can call them something like 
CatcheableException and FatalExceptions, or Exception and 
Error).


2) Fatal exceptions shouldn't be catched. They imply that the 
program lost basic guarantees to go on (memory corruption, 
missing essential file, etc.). You could catch them if you 
wanted to, but it makes no sense other than at your top-level 
method (main(), etc.).


3) A RecoverableException can be converted to a FatalException 
by rethrowing it, once a catch decides so. You shouldn't do the 
reverse: a FatalException never should be converted to a 
RecoverableException.


4) It makes no sense to subclass FatalExceptions since there 
won't be a catch that groups them in a base type (since they 
are not catcheable).


5) Only create a new RecoverableException class type if it 
makes sense to write a catch for it alone. Otherwise, use an 
preexisting type.


6) Only group RecoverableExceptions in a category if it makes 
sense to write a catch for that category. Please don't group 
them because its fancy or "cleaner", that is a bad reason.



Who decides when an Exception is Unrecoverable? Library code 
almost never decides it, since an exception is only 
unrecoverable if the whole_program_invariant got broken, and 
libraries are only a part of a program. So they will tend to 
throw RecoverableExceptions, which can be reconverted to 
Unrecoverable by the program.


In some cases, it is clear that an exception is Unrecoverable. 
When you call a function without satisfying its arguments 
precondition (ie: null argument) the only way to fix that is by 
editing the program. You shouldn't have called it like that in 
the first place, why would you? So you let the 
UnrecoverableException bubble up to your main function, and log 
its stacktrace to fix it.


Unrecoverable means the program got 'tainted', basic guarantees 
got broken (possible memory corruption, etc.). Most exceptions 
will be Recoverable.


Now, expanding on the hierarchy: I said that it makes no sense 
to subclass UnrecoverableExceptions. Recoverable exceptions on 
the other hand, need to be subclassed 
_with_the_catch_on_your_mind_. You are passing info from the 
throw site to the catch site. The catch block is the 
interpreter of the info, the observer. You are communicating 
something to the catch block.


So please, do not create a new types if there is no value in 
writing a catch that only cathes that exception and that can 
recover from that exception. Otherwise, use an existing type.



I wrote these rules some time ago. Please excuse me if they 
come off a bit pedantic Its all only a clarifying 
convention.



According to all this:

* FileNotFoundException is useful. It tells you what happened. 
It is a RecoverableException (under my convention) because 
until it reaches the program, the library doesn't know if the 
program can recover from that (it could be a system missing 
file, or just a document the user asked for).


* DiskFailureException is only useful if someone can write a 
catch for it. If so, then it is a RecoverableException. Only 
the program can decide if it broke basic guarantees.


* Most argument exceptions are Unrecoverable. A function 
throwing shouldn't have been called like that in the first 
place. The only fix is to go back to editing the program. 
(precondition broken).



Another thing: you cannot decide whether an exception is 
Unrecoverable based only on whether the thing that got broken 
is the postcondition of a function. It is the 
whole_program_invariant that decides that. For instance:  
findStuff(someStuff)  might not know if someStuff is important 
enough for the stability of the program if not found. The 
postcondition is broken if it doesn't return the Stuff. That 
might be recoverable.


And PLEASE: don't make classifications by the point of view of 
the cause of the problem. DO make classifications by the point 
of view of the fixing/recovery of the problem; the catch block 
is who you are talking to. 
FileNotFoundBecauseFilesystemUnmounted is worthless.


So, to sum up: (1) it makes no sense to subclass fatal 
exceptions, and (2) never subc

Re: The Right Approach to Exceptions

2012-02-19 Thread Juan Manuel Cabo
How about adding a string[string] or a variant[string] to the 
Exception class, so one can know details about the subclassed 
exception without downcasting? How ugly would that be?


For instance:

...
catch (Exception ex) {
  if ("transient" in ex.details) {
repeatOneMoreTime();
  }
  if ("i18n_code" in ex.details) {
log(translate(ex.details["i18n_code"]));
  }
}
...

Details can be standard by convention or otherwise custom.
(I can see that this can lead to a messy proliferation of details, 
but it at least solves most of the issues).
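A runnable sketch of the proposed details map. DetailedException is a hypothetical name used here only for illustration; the thread proposes adding the string[string] (or Variant[string]) member to Exception itself.

```d
import std.stdio;

// Hypothetical subclass carrying the proposed details map.
class DetailedException : Exception
{
    string[string] details;

    this(string msg, string file = __FILE__, size_t line = __LINE__)
    {
        super(msg, file, line);
    }
}

void badfunc()
{
    auto ex = new DetailedException("temporary failure");
    ex.details["transient"] = "true";
    ex.details["i18n_code"] = "err.temp_failure";
    throw ex;
}

void main()
{
    int retries = 0;
    try
        badfunc();
    catch (DetailedException ex)
    {
        if ("transient" in ex.details)
            retries++;                        // the catch decides to retry
        if ("i18n_code" in ex.details)
            writeln(ex.details["i18n_code"]); // hand the code to translation
    }
    assert(retries == 1);
}
```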


--jm (BIG FAN OF D. GUYS I LOVE ALL YOUR GOOD WORK)




On Sunday, 19 February 2012 at 08:06:38 UTC, Andrei Alexandrescu 
wrote:

On 2/19/12 1:12 AM, Jonathan M Davis wrote:
On Sunday, February 19, 2012 00:43:58 Andrei Alexandrescu 
wrote:

On 2/18/12 8:00 PM, H. S. Teoh wrote:
  From this and other posts I'd say we need to design the 
base exception


classes better, for example by defining an overridable 
property
isTransient that tells caller code whether retrying might 
help.


Just because an exception is transient doesn't mean it makes 
sense to
try again. For example, saveFileMenu() might read a filename 
from the
user, then save the data to a file. If the user types an 
invalid
filename, you will get an InvalidFilename exception. From an 
abstract
point of view, an invalid filename is not a transient 
problem: retrying
the invalid filename won't make the problem go away. But the 
application
in this case *wants* to repeat the operation by asking the 
user for a

*different* filename.

On the other hand, if the same exception happens in an app 
that's trying
to read a configuration file, then it *shouldn't* retry the 
operation.


I'm thinking an error is transient if retrying the operation 
with the
same exact data may succeed. That's a definition that's 
simple, useful,

and easy to operate with.


A core problem with the idea is that whether or not it makes 
sense to try
again depends on what the caller is doing. In general, I think 
that it's best
to give the caller as much useful information is possible so 
that _it_ can

decide the best way to handle the exception.


That sounds like "I violently agree".

Andrei





Re: The Right Approach to Exceptions

2012-02-19 Thread Juan Manuel Cabo
That proposed syntax is nicer, but you can get the same effect 
today: just call the same function from both catch blocks.

#!/usr/bin/rdmd
import std.stdio, std.utf, std.string;
void main() {
void handleStringAndUtf(Exception ex) {
if (typeid(ex).name == "std.utf.UTFException") {
// .. UtfException specific handling ..
writeln("length: ", (cast(UTFException)ex).len);
}
// .. handling of UtfException and StringException in common ..
writeln(ex.toString);
writeln(typeid(ex).name);
}

try {
throw new UTFException("");
//throw new StringException("");
} catch (StringException ex) {
handleStringAndUtf(ex);
} catch (UTFException ex) {
handleStringAndUtf(ex);
}
}


--jm



On Sunday, 19 February 2012 at 09:12:40 UTC, Jonathan M Davis 
wrote:

On Sunday, February 19, 2012 02:04:50 Andrei Alexandrescu wrote:

On 2/19/12 12:56 AM, Jonathan M Davis wrote:
No. Sometimes you want to catch specific types and handle a 
subset of them in a particular way but don't want to handle 
_all_ of the exceptions with the same base class the same way. 
For instance, if you had the following and all of them are 
derived from FileException (save FileException itself):


catch(e : FileNotFoundException, NotAFileException)
{
//...
}
catch(AccessDeniedException e)
{
//...
}
catch(FileException e)
{
//...
}

You want to handle certain exceptions differently and you want 
to handle some of them the same, but you don't want to handle 
all FileExceptions the same way. Without a way to put multiple 
exception types in the same block, you tend to end up with code 
duplication (and no I'm not necessarily advocating that we have 
those specific exception types - they're just examples).


It's even worse if you don't have much of a hierarchy, since if 
_everything_ is derived from Exception directly, then catching 
the common type - Exception - would catch _everything_. For 
instance, what if you want to handle StringException and 
UTFException together but FileException differently? You can't 
currently do


catch(e : StringException, UTFException)
{
//...
}
catch(FileException e)
{
//...
}

Right now, you'd have to have separate catch blocks for 
StringException and UTFException.


A well-designed exception hierarchy reduces the problem 
considerably, because then there's a much higher chance that 
catching a common exception will catch what you want and not 
what you don't, but that doesn't mean that you're never going 
to run into cases where catching the common type doesn't work.


- Jonathan M Davis





Re: The Right Approach to Exceptions

2012-02-19 Thread Juan Manuel Cabo
Hello D community! This is my first post!! I hope I can bring 
clarity to all this. If not, I apologize.


Some time ago I researched the best way to classify exceptions 
and build a hierarchy. I came up with the following rules:



1) At the top level, there would be RecoverableExceptions and 
FatalExceptions (or you can call them something like 
CatcheableException and FatalExceptions, or Exception and Error).


2) Fatal exceptions shouldn't be caught. They imply that the 
program lost basic guarantees to go on (memory corruption, 
missing essential file, etc.). You could catch them if you wanted 
to, but it makes no sense other than at your top-level method 
(main(), etc.).


3) A RecoverableException can be converted to a FatalException by 
rethrowing it, once a catch decides so. You shouldn't do the 
reverse: a FatalException never should be converted to a 
RecoverableException.


4) It makes no sense to subclass FatalExceptions since there 
won't be a catch that groups them in a base type (since they are 
not catchable).


5) Only create a new RecoverableException class type if it makes 
sense to write a catch for it alone. Otherwise, use an 
preexisting type.


6) Only group RecoverableExceptions in a category if it makes 
sense to write a catch for that category. Please don't group them 
because it's fancy or "cleaner"; that is a bad reason.



Who decides when an Exception is Unrecoverable? Library code 
almost never decides it, since an exception is only unrecoverable 
if the whole_program_invariant got broken, and libraries are only 
a part of a program. So they will tend to throw 
RecoverableExceptions, which can be reconverted to Unrecoverable 
by the program.


In some cases, it is clear that an exception is Unrecoverable. 
When you call a function without satisfying its argument 
preconditions (e.g. a null argument), the only way to fix that is by 
editing the program. You shouldn't have called it like that in 
the first place, why would you? So you let the 
UnrecoverableException bubble up to your main function, and log 
its stacktrace to fix it.


Unrecoverable means the program got 'tainted', basic guarantees 
got broken (possible memory corruption, etc.). Most exceptions 
will be Recoverable.


Now, expanding on the hierarchy: I said that it makes no sense to 
subclass UnrecoverableExceptions. Recoverable exceptions on the 
other hand, need to be subclassed _with_the_catch_on_your_mind_. 
You are passing info from the throw site to the catch site. The 
catch block is the interpreter of the info, the observer. You are 
communicating something to the catch block.


So please, do not create a new type if there is no value in 
writing a catch that only catches that exception and that can 
recover from it. Otherwise, use an existing type.



I wrote these rules some time ago. Please excuse me if they come 
off a bit pedantic. It's all only a clarifying 
convention.



According to all this:

* FileNotFoundException is useful. It tells you what happened. It 
is a RecoverableException (under my convention) because until it 
reaches the program, the library doesn't know if the program can 
recover from that (it could be a missing system file, or just a 
document the user asked for).


* DiskFailureException is only useful if someone can write a 
catch for it. If so, then it is a RecoverableException. Only the 
program can decide if it broke basic guarantees.


* Most argument exceptions are Unrecoverable. A function throwing 
shouldn't have been called like that in the first place. The only 
fix is to go back to editing the program. (precondition broken).



Another thing: you cannot decide whether an exception is 
Unrecoverable based only on whether the thing that got broken is 
the postcondition of a function. It is the 
whole_program_invariant that decides that. For instance:  
findStuff(someStuff)  might not know if someStuff is important 
enough for the stability of the program if not found. The 
postcondition is broken if it doesn't return the Stuff. That 
might be recoverable.


And PLEASE: don't make classifications by the point of view of 
the cause of the problem. DO make classifications by the point of 
view of the fixing/recovery of the problem; the catch block is 
who you are talking to. FileNotFoundBecauseFilesystemUnmounted is 
worthless.


So, to sum up: (1) it makes no sense to subclass fatal 
exceptions, and (2) never subclass a RecoverableException if you 
are not helping a catch block with that (but please do if it aids 
recovery).
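A minimal sketch of rules (2) and (3): the catch site escalates a recoverable exception to a fatal one by rethrowing. FatalConfigError and loadConfig are hypothetical names for illustration; only Exception, Error and the next chain are real D features.

```d
// Hypothetical fatal error: derives from Error, so ordinary
// catch (Exception) blocks will not intercept it.
class FatalConfigError : Error
{
    this(string msg, Throwable next = null)
    {
        super(msg, next);
    }
}

void loadConfig()
{
    // A library can only report what happened; it cannot know
    // whether the program can go on without this file.
    throw new Exception("config file not found");
}

void main()
{
    try
    {
        try
            loadConfig();
        catch (Exception ex)
            // This program cannot run without its config: escalate,
            // chaining the original exception as the cause.
            throw new FatalConfigError("cannot start without config", ex);
    }
    catch (FatalConfigError e)
    {
        // Stand-in for the top-level handler: check the cause chain.
        assert(e.next !is null && e.next.msg == "config file not found");
    }
}
```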


..so verbose and pedantic for my first post... yikes.. i beg 
forgiveness!!!





On Sunday, 19 February 2012 at 09:27:48 UTC, Jonathan M Davis 
wrote:

On Sunday, February 19, 2012 19:00:20 Daniel Murphy wrote:

I wasn't really serious about implicit fallthrough.


Lately, it seems like I can never tell whether anyone's being 
serious or not online. :)



Out of the syntaxes I could come up with:
catch(Ex1, Ex2 e)
catch(e : Ex