Location of popFront (was: randomSample)

2009-05-24 Thread Lionello Lunesu

Andrei,

I noticed in random.d, uniform template, that popFront is called in 
different locations for integral compared to floating point types: for 
integral types you .front first and .popFront afterwards, but for floating 
point types you start with .popFront and then check .front.


This has a peculiar effect: for example, if you do uniform(0.0,100.0) 
followed by uniform(0,100) there's a big chance that the integral part of 
the first random number is equal to the second random number.


import std.stdio, std.random;
void main()
{
 writeln(uniform(0.0,100.0));
 writeln(uniform(0,100));
}

I don't think this warrants a bug report, but I do think the location of 
.popFront should be standardized, either before or after any .front.
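
The effect can be reproduced without std.random. The following is a minimal sketch, not the Phobos code: `Counter`, `readThenAdvance` and `advanceThenRead` are hypothetical names, and the counter merely stands in for the shared generator.

```d
import std.stdio;

// Toy stand-in for a random generator: an infinite range 0, 1, 2, ...
struct Counter
{
    int value;
    enum empty = false;
    @property int front() const { return value; }
    void popFront() { ++value; }
}

// Style A: read .front first, advance afterwards.
int readThenAdvance(ref Counter r)
{
    auto x = r.front;
    r.popFront();
    return x;
}

// Style B: advance first, then read .front (that element stays current).
int advanceThenRead(ref Counter r)
{
    r.popFront();
    return r.front;
}

void main()
{
    Counter gen;
    auto a = advanceThenRead(gen); // leaves the element it read as .front
    auto b = readThenAdvance(gen); // so this call reads the same element
    writeln(a, " ", b);            // prints "1 1"
}
```

Mixing the two styles on one generator makes consecutive calls share an element, which is the correlation described above.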


Just sayin'.

L. 



Re: Can you find out where the code goes wrong?

2009-05-24 Thread Lionello Lunesu


"davidl"  wrote in message 
news:op.uugvg5ahj5j...@my-tomato...

[snip]

The culprit is the on stack array.

Should the compiler warn on slicing on a fixed length array? or even give 
an error?
I find this use case can easily go wrong! You may even think this code is 
correct at the very first glance.


Definitely a bug. You should file it in Bugzilla.

When returning the original stack-allocated array the compiler correctly 
complains:


test.d(25): Error: escaping reference to local v

but as soon as you slice it, even "v[]", it is no longer detected.

Good catch!

L. 



Can you find out where the code goes wrong?

2009-05-24 Thread davidl

import std.stdio;

string func()
{
string s="abc";
return s;
}

void func1()
{
writefln("func1");
string v = func();
writefln("call func");
writefln(func2());
}

byte[] func2()
{
writefln("hello!");
byte[16] v= [65,65,65,65,
 65,65,65,65,
 65,65,65,65,
 65,65,65,65];
writefln(v[0..16]);
return v[0..16];
}

void main(string[] args)
{
func1();
}

The culprit is the on-stack array.

Should the compiler warn about slicing a fixed-length array, or even give
an error?
I find this use case can easily go wrong! You may even think this code is
correct at first glance.
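
One way to keep the returned data valid is to copy the stack array to the heap before returning it. This is only a sketch of that idea; `makeBuffer` is a hypothetical name and not part of the code above.

```d
import std.stdio;

// Returns a heap copy, so the result stays valid after the function returns.
byte[] makeBuffer()
{
    byte[16] v = 65;   // fixed-size array on the stack, every element 65
    return v[].dup;    // .dup allocates a copy on the GC heap
}

void main()
{
    auto buf = makeBuffer();
    writeln(buf.length, " ", buf[0]); // prints "16 65"
}
```

The copy costs one allocation per call, which is usually a fair price for not handing out a slice of a dead stack frame.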


Re: how to use GC as a leak detector? i.e. get some help info from GC?

2009-05-24 Thread nobody
> As other asked, are you using D1 Tango/Phobos? D2? In Tango/D2 you can

DMD v2.030  on Linux.

> enable logging in the GC (using the LOGGING version identifier).

How to do it in D2?

> Is your program source available? I'm gathering programs to make a D GC

Sorry, no.


Re: how to use GC as a leak detector? i.e. get some help info from GC?

2009-05-24 Thread nobody
== Quote from Brad Roberts (bra...@puremagic.com)'s article
> After enabling the gc, did you force a collection?  Just enabling it won't 
> cause
> one to occur.

Yes, I called:

  core.memory.GC.enable();
  core.memory.GC.collect();
  core.memory.GC.disable();


Re: how to use GC as a leak detector? i.e. get some help info from GC?

2009-05-24 Thread nobody
> > I suspected the GC is buggy when mixed with manual deletes.
> I personally have not experienced this.  Please be more specific:
> D1 or D2?

D2.

> If D1, Phobos or Tango?
> DMD, LDC, or GDC?

DMD v2.030

> Compiler version?
> Also, please file a bug report, especially if you can create a concise,
> reproducible test case.

It's hard to isolate the code, and since the program is non-trivial I'm not 100%
sure, as it could be my bug.


Re: how to use GC as a leak detector? i.e. get some help info from GC?

2009-05-24 Thread Leandro Lucarella
nobody, on May 24 at 20:03 you wrote to me:
> == Quote from Jason House (jason.james.ho...@gmail.com)'s article
> > Why not use valgrind? With the GC disabled, it should give accurate results.
> 
> Strange enough, indeed I have tried valgrind with the GC disabled version.  It
> didn't report anything useful.
> 
> That's why I'm puzzled, does D's GC do something special?
> 
> The GC disabled version run out of 3G memory; but the GC enabled version 
> stays at
> ~800M throughout the run.

I guess that with that much memory in use, your program can benefit greatly
from NO_SCAN if your 800M of data is plain old data. Did you try it? And if
you never keep interior pointers to that data, your program can possibly
avoid a lot of false positives caused by the collector's conservatism by
using NO_INTERIOR (this is only available if you patch the GC with David
Simcha's patch[1]).

[1] http://d.puremagic.com/issues/show_bug.cgi?id=2927
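
For illustration, a block can be requested with the NO_SCAN attribute through `core.memory`. This is a minimal sketch assuming a druntime that provides `GC.malloc`, `GC.getAttr` and `GC.BlkAttr.NO_SCAN`; `allocNoScan` is a hypothetical helper name.

```d
import core.memory : GC;
import std.stdio;

// Allocate `n` bytes that the collector will not scan for pointers.
ubyte[] allocNoScan(size_t n)
{
    auto p = cast(ubyte*) GC.malloc(n, GC.BlkAttr.NO_SCAN);
    return p[0 .. n];
}

void main()
{
    auto buf = allocNoScan(1024);
    buf[] = 0; // GC.malloc does not promise zeroed memory
    // Ask the GC which attributes the block carries.
    bool noScan = (GC.getAttr(buf.ptr) & GC.BlkAttr.NO_SCAN) != 0;
    writeln(buf.length, " ", noScan);
}
```

Only use this for data that really holds no pointers into the GC heap; anything referenced solely from such a block can be collected.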

-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/

GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)

This is what you get,
when you mess with us.


Re: how to use GC as a leak detector? i.e. get some help info from GC?

2009-05-24 Thread Leandro Lucarella
nobody, on May 24 at 19:05 you wrote to me:
> Hi,
> 
> I'm writing a data processing program in D, which deals with large amounts of
> small objects. One of the thing I found is that D's GC is horribly slow in
> such situation. I tried my program with gc enable & disabled (with some manual
> deletes). The GC disabled version (2 min) is ~100 times faster than the GC
> enabled version (4 hours)!
> 
> But of course the GC disabled version still leak memory, it soon exceeds the
> machine memory limit when I try to process more data; while the GC enabled
> version don't have such problem.
> 
> So my plan is to use the GC disabled version with manual deletes. But it was
> very hard to find all the memory leaks. I'm wondering: is there anyway to use
> GC as a leak detector? can the GC enabled version give me some help
> information on which objects get collected, so I can manually delete them in
> my GC disabled version?  Thanks!

As others asked, are you using D1 Tango/Phobos? D2? In Tango/D2 you can
enable logging in the GC (using the LOGGING version identifier).

Is your program source available? I'm gathering programs to make a D GC
benchmark suite and your program seems like a good candidate for measuring
the GC performance.

Thank you.

-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/

GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)

How important it is, then, in these days of globalization, to scrub
our souls, to run the mop over our hearts, in order to reach
a true state of babia peperianal.
-- Peperino Pómoro


Re: how to use GC as a leak detector? i.e. get some help info from GC?

2009-05-24 Thread Brad Roberts
nobody wrote:
>> One thing you could try is disabling the GC (this really just disables 
>> automatic
>> running of the collector) and run it manually at points that you know make 
>> sense.
>>  For example, you could just insert a GC.collect() statement at the end of 
>> every
>> run of your main loop.
>> Another thing to try is avoiding appending to arrays.  If you know the 
>> length in
>> advance, you can get pretty good speedups by pre-allocating the array 
>> instead of
>> appending using the ~= operator.
>> You can safely delete specific objects manually even when the GC is enabled. 
>>  For
>> very large objects with trivial lifetimes, this is probably worth doing.  
>> First of
>> all, the GC will run less frequently.  Secondly, D's GC is partially 
>> conservative,
>> meaning that occasionally memory will not be freed when it should be.  The
>> probability of this happening is proportional to the size of the memory 
>> block.
> 
> I have tried all these: with GC enabled only periodically runs in the main 
> loop,
> however the memory still grows faster than I expected when I feed more data 
> into
> the program. Then I manually delete some specific objects. However the program
> start to fail randomly.
> 
> Has anyone experienced similar issues: i.e. with GC on, you defined you own 
> dtor
> for certain class, and called delete manually on certain objects.
> 
> The program fails at random stages, with some stack trace showing some GC 
> calls like:
> 
>  0x0821977a in _D2gc3gcx3Gcx16fullcollectshellMFZk ()
> 
> I suspected the GC is buggy when mixed with manual deletes.

After enabling the gc, did you force a collection?  Just enabling it won't cause
one to occur.

Later,
Brad


Re: how to use GC as a leak detector? i.e. get some help info from GC?

2009-05-24 Thread dsimcha
== Quote from nobody (n...@where.com)'s article
> > One thing you could try is disabling the GC (this really just disables 
> > automatic
> > running of the collector) and run it manually at points that you know make 
> > sense.
> >  For example, you could just insert a GC.collect() statement at the end of 
> > every
> > run of your main loop.
> > Another thing to try is avoiding appending to arrays.  If you know the 
> > length in
> > advance, you can get pretty good speedups by pre-allocating the array 
> > instead of
> > appending using the ~= operator.
> > You can safely delete specific objects manually even when the GC is 
> > enabled.  For
> > very large objects with trivial lifetimes, this is probably worth doing.  
> > First of
> > all, the GC will run less frequently.  Secondly, D's GC is partially 
> > conservative,
> > meaning that occasionally memory will not be freed when it should be.  The
> > probability of this happening is proportional to the size of the memory 
> > block.
> I have tried all these: with GC enabled only periodically runs in the main 
> loop,
> however the memory still grows faster than I expected when I feed more data 
> into
> the program. Then I manually delete some specific objects. However the program
> start to fail randomly.
> Has anyone experienced similar issues: i.e. with GC on, you defined you own 
> dtor
> for certain class, and called delete manually on certain objects.
> The program fails at random stages, with some stack trace showing some GC 
> calls
like:
>  0x0821977a in _D2gc3gcx3Gcx16fullcollectshellMFZk ()
> I suspected the GC is buggy when mixed with manual deletes.

I personally have not experienced this.  Please be more specific:

D1 or D2?
If D1, Phobos or Tango?
DMD, LDC, or GDC?
Compiler version?

Also, please file a bug report, especially if you can create a concise,
reproducible test case.


Re: how to use GC as a leak detector? i.e. get some help info from GC?

2009-05-24 Thread nobody
> One thing you could try is disabling the GC (this really just disables 
> automatic
> running of the collector) and run it manually at points that you know make 
> sense.
>  For example, you could just insert a GC.collect() statement at the end of 
> every
> run of your main loop.
> Another thing to try is avoiding appending to arrays.  If you know the length 
> in
> advance, you can get pretty good speedups by pre-allocating the array instead 
> of
> appending using the ~= operator.
> You can safely delete specific objects manually even when the GC is enabled.  
> For
> very large objects with trivial lifetimes, this is probably worth doing.  
> First of
> all, the GC will run less frequently.  Secondly, D's GC is partially 
> conservative,
> meaning that occasionally memory will not be freed when it should be.  The
> probability of this happening is proportional to the size of the memory block.

I have tried all these: with GC enabled only periodically runs in the main loop,
however the memory still grows faster than I expected when I feed more data into
the program. Then I manually delete some specific objects. However the program
start to fail randomly.

Has anyone experienced similar issues: i.e. with GC on, you defined you own dtor
for certain class, and called delete manually on certain objects.

The program fails at random stages, with some stack trace showing some GC calls 
like:

 0x0821977a in _D2gc3gcx3Gcx16fullcollectshellMFZk ()

I suspected the GC is buggy when mixed with manual deletes.


Re: XML API

2009-05-24 Thread Daniel Keep


Michel Fortin wrote:
> On 2009-05-24 12:51:43 -0400, Daniel Keep 
> said:

(Cutting most of our back-and-forth on what a callback api would
look like.)

>> ...
>>
>> Like I said, this seems like a lot of work to bolt a callback interface
>> onto something a pull api is designed for.
>>
>> ...
>>
>> Except of course that you now can't easily control the loop, nor can
>> you do fall-through on the cases.
> 
> Again, my definition of a callback API doesn't include an implicit loop,
> just a callback. And I intend the callback to be a template argument so
> it can be dispatched using function overloading and/or function
> templates. So you'll have this instead:
> 
> bool more = true;
> do
> more = pp.readNext!(callback)();
> while (more);
> 
> void callback(OpenElementToken t) { blah(t.name); }
> void callback(CloseElementToken t) { ... }
> void callback(CharacterDataToken t) { ... }
> ...
> 
> No switch statement and no inversion of control.

Except that you can't define overloads of a function inside a function.
 Which means you have to stuff all of your code in a set of increasingly
obtusely-named globals or private members.  Like elemAStart, elemAData,
elemAAttr, elemAClose, elemBStart, elemBData, elemBAttr, ...

One problem I see here is that you're going to spaghettify the code and
state.  For example, let's say I'm writing code to handle a particular
element.  I can't put the code and state for this into a single
function, I have to break it out over several.

One function for each event.  This means I need to have all state
variables visible from each function.  So I have to start shoving the
state into the owning object instead of on the stack.

Whoops, I can't recurse now, can I?  Sucks if I'm using any sort of
hierarchical structure.  I can't use the call stack, so I have to invent
my own.  I don't want to make every state variable a stack, so I put
each component of the parser into a separate object which I can
instantiate and kick off.

And at that point, I've just reinvented SAX.  Well, almost.  I have
control over the loop.  I still can't simply break out of it; I've got
to mess around with flags to get that done.

Meanwhile, if I write that code with a PullParser, it's just a
collection of normal functions, one per element type with all the
related code together in one place.  Or, if I don't want them all
bundled together, I can dispatch to smaller functions.
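
As a rough sketch of the pull style described here: `Token`, `TokenStream` and `readElement` are hypothetical names invented for this example, not an existing XML API. Nested elements recurse, so per-element state lives on the call stack.

```d
import std.stdio;

enum Kind { open, close, text }
struct Token { Kind kind; string value; }

// Hypothetical pull source: the caller asks for the next token when it wants one.
struct TokenStream
{
    Token[] tokens;
    bool empty() const { return tokens.length == 0; }
    Token next() { auto t = tokens[0]; tokens = tokens[1 .. $]; return t; }
}

// Reads the content of one element whose open tag was already consumed.
string readElement(ref TokenStream s)
{
    string collected; // local state, one copy per nesting level
    while (!s.empty)
    {
        auto t = s.next();
        final switch (t.kind)
        {
            case Kind.text:  collected ~= t.value; break;
            case Kind.open:  collected ~= "[" ~ readElement(s) ~ "]"; break;
            case Kind.close: return collected; // plain early exit, no flags
        }
    }
    return collected;
}

void main()
{
    auto s = TokenStream([
        Token(Kind.open, "a"), Token(Kind.text, "x"),
        Token(Kind.open, "b"), Token(Kind.text, "y"), Token(Kind.close, "b"),
        Token(Kind.text, "z"), Token(Kind.close, "a")]);
    s.next();                // consume the outer open tag
    writeln(readElement(s)); // prints "x[y]z"
}
```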

I have a feeling you're going to head down this path irrespective, so
I'll just hope you can figure out a way to make the api not suck.


Re: how to use GC as a leak detector? i.e. get some help info from GC?

2009-05-24 Thread dsimcha
== Quote from nobody (n...@where.com)'s article
> Hi,
> I'm writing a data processing program in D, which deals with large amounts of
> small objects. One of the thing I found is that D's GC is horribly slow in
> such situation. I tried my program with gc enable & disabled (with some manual
> deletes). The GC disabled version (2 min) is ~100 times faster than the GC
> enabled version (4 hours)!
> But of course the GC disabled version still leak memory, it soon exceeds the
> machine memory limit when I try to process more data; while the GC enabled
> version don't have such problem.
> So my plan is to use the GC disabled version with manual deletes. But it was
> very hard to find all the memory leaks. I'm wondering: is there anyway to use
> GC as a leak detector? can the GC enabled version give me some help
> information on which objects get collected, so I can manually delete them in
> my GC disabled version?  Thanks!

I've dealt with a bunch of somewhat similar situations in code I've written.
Here are some tips that others have not already mentioned, and that might be
less drastic than going with fully manual memory management:

One thing you could try is disabling the GC (this really just disables automatic
running of the collector) and run it manually at points that you know make
sense. For example, you could just insert a GC.collect() statement at the end
of every run of your main loop.
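
A minimal sketch of that pattern, assuming D2 with `core.memory`; `processBatch` is a hypothetical stand-in for one run of the real main loop.

```d
import core.memory : GC;
import std.stdio;

// Hypothetical unit of work that produces short-lived garbage.
size_t processBatch(size_t n)
{
    size_t total;
    foreach (i; 0 .. n)
    {
        auto tmp = new int[](16); // becomes garbage after each iteration
        tmp[0] = cast(int) i;
        total += tmp[0];
    }
    return total;
}

void main()
{
    GC.disable();             // stop automatic collections
    scope (exit) GC.enable();

    foreach (run; 0 .. 5)
    {
        auto sum = processBatch(1000);
        writeln("run ", run, ": ", sum);
        GC.collect();         // explicit collection at a point we chose
    }
}
```

Collecting once per run bounds memory growth to roughly one run's worth of garbage, at the cost of a full collection each time.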

Another thing to try is avoiding appending to arrays.  If you know the length in
advance, you can get pretty good speedups by pre-allocating the array instead of
appending using the ~= operator.
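
To illustrate the difference, here is a small sketch with two hypothetical functions that build the same array, one by appending and one by allocating the full length up front.

```d
import std.stdio;

// Grows the array one element at a time; may reallocate repeatedly.
int[] squaresAppend(size_t n)
{
    int[] result;
    foreach (i; 0 .. n)
        result ~= cast(int) (i * i);
    return result;
}

// Allocates once, then fills the elements in place.
int[] squaresPrealloc(size_t n)
{
    auto result = new int[](n);
    foreach (i, ref x; result)
        x = cast(int) (i * i);
    return result;
}

void main()
{
    writeln(squaresAppend(5) == squaresPrealloc(5)); // prints "true"
}
```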

You can safely delete specific objects manually even when the GC is enabled.
For very large objects with trivial lifetimes, this is probably worth doing.
First of all, the GC will run less frequently.  Secondly, D's GC is partially
conservative, meaning that occasionally memory will not be freed when it should
be.  The probability of this happening is proportional to the size of the
memory block.
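
The `delete` operator used at the time was, as far as I know, deprecated in later D releases. A rough equivalent in current D is `destroy` followed by `GC.free`. The sketch below assumes that approach; `Big` and `release` are hypothetical names.

```d
import core.memory : GC;
import std.stdio;

// A large object with an obvious, short lifetime.
final class Big
{
    double[1024] samples;
}

// Release the object eagerly instead of waiting for a collection.
void release(ref Big obj)
{
    destroy(obj);             // runs the destructor, if any
    GC.free(cast(void*) obj); // hand the object's memory back to the GC
    obj = null;               // do not leave a dangling reference behind
}

void main()
{
    auto b = new Big;
    b.samples[0] = 1.0;
    release(b);
    writeln(b is null); // prints "true"
}
```

This is only safe when no other reference to the object survives, which is the same condition manual `delete` had.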

Lastly, I've been working on a generic second stack/mark-release allocator for
D2, called TempAlloc.  It's useful when you need to temporarily allocate memory
in last-in, first-out order but can't use the call stack for whatever reason.
I've also implemented a few basic data structures (hash tables and hash sets)
that are specifically designed for this allocator.  Right now, it's coevolving
with my dstats statistics lib, but if you want to try it or at least look at it
and give me some feedback, I'd like to eventually get it to the point where it
can be added to Phobos and/or Tango.  See
http://svn.dsource.org/projects/dstats/docs/alloc.html .


Re: !in operator?

2009-05-24 Thread Stewart Gordon

Jason House wrote:


Method 1:

if (x !in y)
  foo();
else{
  auto z = x in y;
  bar(z);
}

Method 2:

auto z = x in y;
if (z is null)
  foo;
else
  bar(z);

Method 1 essentially calls in twice while method 2 calls in once.



But there's no requirement to look it up after finding out whether it's 
there or not.


And how's it any different from

if (x in y) {
auto z = x in y;
bar(z);
} else {
foo();
}

or even

if (x in y) {
bar(y[x]);
} else {
foo();
}

?

Besides, why would any decent compiler not optimise it to a single lookup?

Stewart.


Re: Problem with .deb packages

2009-05-24 Thread Michael P.
Bruno Deligny Wrote:

> Jesse Phillips a écrit :
> > On Sat, 02 May 2009 14:57:43 +0200, Bruno Deligny wrote:
> > 
> >> When i try to install dmd1 or dmd2 on my ubuntu i386 with the deb
> >> packages on http://www.digitalmars.com/d/download.html, it says "Error :
> >> incorrect Architecture « amd64 »"
> >>
> >> The packages were built for the amd64 architecture.
> > 
> > I don't know how the packages were built for amd64; there are only i386 
> > packages. You have to provide dpkg the --force-architecture switch.
> > dpkg --force-architecture -i ...deb
> 
> The packages are still broken. I don't know who built them, but we can't 
> leave that on the website.
> 
> It's hard to persuade people to use D if the packages are broken and there 
> isn't a Windows installer. I think a lot of people don't even try after 
> seeing that.

Would this be worthy of a bugzilla report?
I encountered this too when I tried to install DMD using the .deb packages.


Re: how to use GC as a leak detector? i.e. get some help info from GC?

2009-05-24 Thread Frits van Bommel

nobody wrote:

== Quote from Jason House (jason.james.ho...@gmail.com)'s article

Why not use valgrind? With the GC disabled, it should give accurate results.


Strange enough, indeed I have tried valgrind with the GC disabled version.  It
didn't report anything useful.

That's why I'm puzzled, does D's GC do something special?


The GC allocates memory directly from the OS, it doesn't use malloc/free and 
friends. It does this even when the GC is "disabled", which just means the 
collections won't happen. (Disabling the GC doesn't change the method of allocation)
Valgrind probably doesn't detect those OS calls (and almost certainly doesn't 
know about the GC calls).


If you're using Tango, you can link to the 'stub' GC instead of the normal 
('basic') one. The stub GC doesn't actually collect, it passes calls on to 
malloc/calloc/realloc/free instead. That should make Valgrind work.

(something similar probably applies if you're using D2 with druntime)


Re: how to use GC as a leak detector? i.e. get some help info from GC?

2009-05-24 Thread nobody
== Quote from Unknown W. Brackets (unkn...@simplemachines.org)'s article
> Theoretically, you could recompile the GC to write to a log file any
> time it frees anything.

Is it possible to recompile Phobos to let the GC write to a log whenever it 
frees?
I guess I also need the type info of the object being freed.
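
Short of recompiling the GC, a cruder approach is to count finalizations in the destructors of the classes you suspect. This is a different technique from GC logging, sketched under the assumption that those classes can be edited; `Tracked` and `churn` are hypothetical names. Whether and when an unreachable object is finalized is up to the collector, so the printed count is not guaranteed.

```d
import core.memory : GC;
import core.stdc.stdio : printf;

class Tracked
{
    __gshared size_t created, finalized;
    this()  { ++created; }
    ~this() { ++finalized; } // must not allocate GC memory in here
}

// Creates `n` objects and drops every reference to them.
void churn(size_t n)
{
    foreach (i; 0 .. n)
    {
        auto t = new Tracked;
    }
}

void main()
{
    churn(1000);
    GC.collect();
    printf("created %zu, finalized %zu\n", Tracked.created, Tracked.finalized);
}
```

A class whose `finalized` count keeps up with `created` under the GC is one the GC-disabled build would have to delete by hand.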




Re: how to use GC as a leak detector? i.e. get some help info from GC?

2009-05-24 Thread Nick Sabalausky
"nobody"  wrote in message 
news:gvc5q7$2bc...@digitalmars.com...
> Hi,
>
> I'm writing a data processing program in D, which deals with large amounts 
> of
> small objects. One of the thing I found is that D's GC is horribly slow in
> such situation. I tried my program with gc enable & disabled (with some 
> manual
> deletes). The GC disabled version (2 min) is ~100 times faster than the GC
> enabled version (4 hours)!
>
> But of course the GC disabled version still leak memory, it soon exceeds 
> the
> machine memory limit when I try to process more data; while the GC enabled
> version don't have such problem.
>
> So my plan is to use the GC disabled version with manual deletes. But it 
> was
> very hard to find all the memory leaks. I'm wondering: is there anyway to 
> use
> GC as a leak detector? can the GC enabled version give me some help
> information on which objects get collected, so I can manually delete them 
> in
> my GC disabled version?  Thanks!
>

Depending how exactly your program is working, another common thing that 
might help is to manually manage free pools. Ie, allocate a bunch up-front, 
and instead of letting one get GCed when done with it, hold on to it, make 
note of it being available for re-use, and then reuse it instead of 
allocating a new one. Or, allocate one big chunk of memory and stick your 
small objects in there. They typically do this sort of thing for particle 
systems.
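
A minimal sketch of such a free pool follows. `Particle` and `Pool` are hypothetical names, and a real pool would also reset an object's fields before handing it out again.

```d
import std.stdio;

final class Particle { double x = 0, y = 0; }

// Minimal free pool: released objects are kept and handed out again.
struct Pool
{
    private Particle[] idle; // slots for objects waiting to be reused
    private size_t count;    // how many slots currently hold an object

    this(size_t n)
    {
        idle = new Particle[](n);
        foreach (ref p; idle) p = new Particle; // allocate up front
        count = n;
    }

    Particle acquire()
    {
        if (count == 0) return new Particle; // pool exhausted: fall back
        auto p = idle[--count];
        idle[count] = null;                  // the pool no longer owns it
        return p;
    }

    void release(Particle p)
    {
        if (count == idle.length) idle.length = idle.length * 2 + 1;
        idle[count++] = p;
    }
}

void main()
{
    auto pool = Pool(2);
    auto a = pool.acquire();
    pool.release(a);
    auto b = pool.acquire();
    writeln(a is b); // prints "true": the same object was reused
}
```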




Re: how to use GC as a leak detector? i.e. get some help info from GC?

2009-05-24 Thread nobody
== Quote from Jason House (jason.james.ho...@gmail.com)'s article
> Why not use valgrind? With the GC disabled, it should give accurate results.

Strange enough, indeed I have tried valgrind with the GC disabled version.  It
didn't report anything useful.

That's why I'm puzzled, does D's GC do something special?

The GC disabled version run out of 3G memory; but the GC enabled version stays 
at
~800M throughout the run.


Re: !in operator?

2009-05-24 Thread bearophile
Jason House:
> Method 1 essentially calls in twice while method 2 calls in once.

Sometimes I just want to know if something isn't present.
Having !in doesn't prevent me from writing and using x = y in aa; when I want 
it.


> PS: Please don't assume that I'm advocating not having a !in operator. I'm 
> just pointing out possible reasons it may have been avoided.<

I think that's not a possible reason :-)

Bye,
bearophile


Re: how to use GC as a leak detector? i.e. get some help info from GC?

2009-05-24 Thread Unknown W. Brackets
Theoretically, you could recompile the GC to write to a log file any 
time it frees anything.


For data processing, though, you really want to try to have a fixed 
memory buffer.  You've got to be hurting from the allocations and frees, 
which if at all possible you should get rid of.


Also, if you're allocating buffers of memory (e.g. for the data), you 
can tell the GC not to scan them.  This will probably solve the problem 
of the GC being so slow.


-[Unknown]


nobody wrote:

Hi,

I'm writing a data processing program in D, which deals with large amounts of
small objects. One of the thing I found is that D's GC is horribly slow in
such situation. I tried my program with gc enable & disabled (with some manual
deletes). The GC disabled version (2 min) is ~100 times faster than the GC
enabled version (4 hours)!

But of course the GC disabled version still leak memory, it soon exceeds the
machine memory limit when I try to process more data; while the GC enabled
version don't have such problem.

So my plan is to use the GC disabled version with manual deletes. But it was
very hard to find all the memory leaks. I'm wondering: is there anyway to use
GC as a leak detector? can the GC enabled version give me some help
information on which objects get collected, so I can manually delete them in
my GC disabled version?  Thanks!




Re: OT: on IDEs and code writing on steroids

2009-05-24 Thread BCS

Hello Yigal,


C# assemblies are analogous to C/C++/D libs.
you can't create a standalone executable in D just by parsing the D
source files (for all the imports) if you need to link in external libs.
you need to at least specify the lib name if it's on the linker's
search path or provide the full path otherwise.


pragma(lib, ...); //?


Same thing with assemblies.
you have to provide that meta-data (lib names) anyway both in C# and
D. the only difference is that C# (correctly) recognizes that this is
the better default.


IMHO the C# way is the /worse/ default. Based on that being my opinion, I 
think we have found where we will have to disagree. Part of my reasoning 
is that in the normal case, for practical reasons, that file will have to 
be maintained by an IDE, thus /requiring/ development to be in an IDE of 
some kind. In D, that data can normally be part of the source code, and 
only in unusual cases does it need to be formally codified.
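
As an illustration of keeping the library name in the source, here is a sketch that assumes DMD on Linux, where `pragma(lib, "m")` is passed to the linker as the C math library. Other compilers may treat the pragma differently or ignore it.

```d
// Ask the compiler to hand this library name to the linker.
pragma(lib, "m");

// C prototype for a function that lives in that library.
extern (C) double cos(double x);

void main()
{
    import std.stdio : writeln;
    writeln(cos(0.0)); // prints "1"
}
```

With this in the source file, the build command needs no extra linker flag for that library.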





Re: Problem with .deb packages

2009-05-24 Thread grauzone

Jason House wrote:

grauzone Wrote:


Daniel Keep wrote:

grauzone wrote:

...

Now the irony is, that Wlater wouldn't even allow Debian to redistribute
a properly packaged dmd... (if Debian wanted to)

Speak ye of the evil Wizard Wlater, previous servant of the dark empire
of Sym'n'tek?  :3

Oops.


As for the distribution problem, I think it's because Walter *can't*
allow it to be freely redistributed.

Why not? It can't be for license reasons?


Sadly, that's exactly why. The backend is under restrictions Walter can't 
control. For a sillier example, there's a disclaimer that the code is not 
intended to work after 1999.


Is that really so? I would have guessed that this restriction is only 
for redistributing the backend source. I mean, when dmd still came 
without the backend source, it was shipped without the backend license.


Re: how to use GC as a leak detector? i.e. get some help info from GC?

2009-05-24 Thread Jason House
nobody Wrote:

> Hi,
> 
> I'm writing a data processing program in D, which deals with large amounts of
> small objects. One of the thing I found is that D's GC is horribly slow in
> such situation. I tried my program with gc enable & disabled (with some manual
> deletes). The GC disabled version (2 min) is ~100 times faster than the GC
> enabled version (4 hours)!
> 
> But of course the GC disabled version still leak memory, it soon exceeds the
> machine memory limit when I try to process more data; while the GC enabled
> version don't have such problem.
> 
> So my plan is to use the GC disabled version with manual deletes. But it was
> very hard to find all the memory leaks. I'm wondering: is there anyway to use
> GC as a leak detector? can the GC enabled version give me some help
> information on which objects get collected, so I can manually delete them in
> my GC disabled version?  Thanks!
> 
> 

Why not use valgrind? With the GC disabled, it should give accurate results.


Re: Problem with .deb packages

2009-05-24 Thread Jason House
grauzone Wrote:

> Daniel Keep wrote:
> > 
> > grauzone wrote:
> >> ...
> >>
> >> Now the irony is, that Wlater wouldn't even allow Debian to redistribute
> >> a properly packaged dmd... (if Debian wanted to)
> > 
> > Speak ye of the evil Wizard Wlater, previous servant of the dark empire
> > of Sym'n'tek?  :3
> 
> Oops.
> 
> > As for the distribution problem, I think it's because Walter *can't*
> > allow it to be freely redistributed.
> 
> Why not? It can't be for license reasons?

Sadly, that's exactly why. The backend is under restrictions Walter can't 
control. For a sillier example, there's a disclaimer that the code is not 
intended to work after 1999.


Re: OT: on IDEs and code writing on steroids

2009-05-24 Thread Yigal Chripun

BCS wrote:

Hello Christopher,


BCS wrote:


But that's not the point. Neither make nor VS's equivalent is what
this thread was about. At least not where I was involved. My point is
that the design of c# *requires* the maintenance (almost certainly
by a c# specific IDE) of some kind of external metadata file that
contains information that can't be derived from the source code
itself, whereas with D, no such metadata is *needed*. If you wanted,
you could build a tool to take D source code and generate a makefile
or a bash build script from it


If you wanted, you could create a tool to do the same with C# source
code, assuming there exists a directory containing all and only those
source files that should end up in the resulting assembly.


I'm /not/ willing to assume that (because all too often it's not true) 
and you also need the list of other assemblies that should be included.




C# assemblies are analogous to C/C++/D libs.
you can't create a standalone executable in D just by parsing the D 
source files (for all the imports) if you need to link in external libs. 
you need to at least specify the lib name if it's on the linker's search 
path or provide the full path otherwise.

Same thing with assemblies.

you have to provide that meta-data (lib names) anyway both in C# and D. 
the only difference is that C# (correctly) recognizes that this is the 
better default.


Re: !in operator?

2009-05-24 Thread Jason House
Stewart Gordon Wrote:

> Jason House wrote:
> 
> > That is unfortunately a rather sticky point.  The in operator does not 
> > return bool.  I think the lack of !in is to encourage writing of efficient 
> > code.  I'm not really sure though.
> 
> How, exactly, does not having !in make code efficient?
> 
> Stewart.

Consider the following code snippets:

Method 1:

if (x !in y)
  foo();
else{
  auto z = x in y;
  bar(z);
}

Method 2:

auto z = x in y;
if (z is null)
  foo;
else
  bar(z);

Method 1 essentially calls in twice while method 2 calls in once.

PS: Please don't assume that I'm advocating not having a !in operator. I'm just 
pointing out possible reasons it may have been avoided. 
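
For reference, the single-lookup form can also be written with the declaration inside the condition. A minimal sketch; `describe` is a hypothetical function name.

```d
import std.conv : to;
import std.stdio;

// One lookup: `in` yields a pointer to the value, or null if the key is absent.
string describe(const int[string] table, string key)
{
    if (auto p = key in table)
        return key ~ " = " ~ to!string(*p);
    return key ~ " is missing";
}

void main()
{
    auto ages = ["ann": 31];
    writeln(describe(ages, "ann")); // prints "ann = 31"
    writeln(describe(ages, "bob")); // prints "bob is missing"
}
```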


how to use GC as a leak detector? i.e. get some help info from GC?

2009-05-24 Thread nobody
Hi,

I'm writing a data processing program in D which deals with large numbers of
small objects. One of the things I found is that D's GC is horribly slow in
such situations. I tried my program with the GC enabled and disabled (with some
manual deletes). The GC-disabled version (2 min) is ~100 times faster than the
GC-enabled version (4 hours)!

But of course the GC-disabled version still leaks memory; it soon exceeds the
machine's memory limit when I try to process more data, while the GC-enabled
version doesn't have that problem.

So my plan is to use the GC-disabled version with manual deletes. But it was
very hard to find all the memory leaks. I'm wondering: is there any way to use
the GC as a leak detector? Can the GC-enabled version give me some helpful
information on which objects get collected, so I can manually delete them in
my GC-disabled version?  Thanks!




Re: XML API

2009-05-24 Thread Michel Fortin

On 2009-05-24 14:13:31 -0400, Michel Fortin  said:


The reason is that if your callback api only does a single callback, all
you've really done is move the switch statement inside the function call
at the cost of having to define a crapload of functions outside of it.


The thing is that inside the parser code there is already a separate 
code path for dealing with each type of token. Various callbacks can be 
called from these separate code paths. When you return after parsing 
one token, the code path isn't different anymore, so you need to add an 
extra switch statement that wouldn't be there with a callback called 
from the right code path.


I suddenly noticed that I misunderstood what you meant in the paragraph 
above so I don't expect my answer above to fit your question. 
Nevertheless, I suppose the examples at the end of my previous post 
will clarify things: basically the callback isn't a function pointer, 
it's an alias template argument which can dispatch to overloaded 
functions or template functions so you don't need a switch statement.


Sorry for any confusion.


--
Michel Fortin
michel.for...@michelf.com
http://michelf.com/



XML API

2009-05-24 Thread Michel Fortin

On 2009-05-24 12:51:43 -0400, Daniel Keep  said:


Michel Fortin wrote:

...

A callback API isn't necessarily SAX. A callback API doesn't necessarily
have to parse everything until completion, it could parse only the next
token and call the appropriate callback.


When I talk "callback api," I mean something fundamentally like SAX.


SAX is definitely a popular callback API for XML, but to me a callback 
API just implies that some callback gets called.




The reason is that if your callback api only does a single callback, all
you've really done is move the switch statement inside the function call
at the cost of having to define a crapload of functions outside of it.


The thing is that inside the parser code there is already a separate 
code path for dealing with each type of token. Various callbacks can be 
called from these separate code paths. When you return after parsing 
one token, the code path isn't different anymore, so you need to add an 
extra switch statement that wouldn't be there with a callback called 
from the right code path.




If I can construct a range class/struct over my callback API I'll be
happy. And if I can recursively call the parser API inside a callback
handler so I can reuse the call stack while parsing then I'll be very
happy.


I don't see how constructing a range over a callback api will work.
Callback apis are inversion of control, ranges aren't.


Your definition of a callback API is about inversion of control. My 
definition is just that it parses one token and calls a function for that 
token. If you read what I wrote using your definition, it obviously 
can't work.




...

Like I said, this seems like a lot of work to bolt a callback interface
onto something a pull api is designed for.

At best, you'll end up rewriting this:


foreach( tt ; pp )
{
    switch( tt )
    {
        case XmlTokenType.StartElement: blah(pp.name); break;
        ...
    }
}


to this:


pp.parse
(
    XmlToken(Type.StartElement, {blah(pp.name);}),
    ...
);


Except of course that you now can't easily control the loop, nor can
you do fall-through on the cases.


Again, my definition of a callback API doesn't include an implicit 
loop, just a callback. And I intend the callback to be a template 
argument so it can be dispatched using function overloading and/or 
function templates. So you'll have this instead:


bool more = true; // "continue" is a keyword in D, so it can't name a variable
do
    more = pp.readNext!(callback)();
while (more);

void callback(OpenElementToken t) { blah(t.name); }
void callback(CloseElementToken t) { ... }
void callback(CharacterDataToken t) { ... }
...

No switch statement and no inversion of control.

And here's my current prototype for a range:

alias Algebraic!(
    CharDataToken, CommentToken, PIToken, CDataSectionToken,
    AttrToken,
    XMLDeclToken, OpenElementToken, CloseElementToken,
    EmptyElementToken
) XMLToken;

struct XMLForwardRange(Parser)
{
    bool empty;
    XMLToken front;
    Parser parser;

    this(Parser parser)
    {
        this.parser = parser;
        popFront(); // parse first token
    }

    void popFront()
    {
        empty = !parser.readNext!(callback)();
    }

    private void callback(T)(T token)
    {
        front = token;
    }
}

Constructing a pull parser using the same pattern should be pretty easy 
if you wanted to.
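
To make the one-token callback idea concrete, here is a self-contained toy sketch. ToyParser, its tags-and-text-only grammar, and the log array are invented for illustration and are not the actual parser; only the shape of readNext!(callback) and the overloaded callbacks follow the snippets above.

```d
import std.stdio;
import std.string : indexOf;

// Token types; each one selects a different callback overload.
struct OpenElementToken   { string name; }
struct CloseElementToken  { string name; }
struct CharacterDataToken { string text; }

// Hypothetical toy parser: understands only <name>, </name> and text.
struct ToyParser
{
    string input;

    // Parses exactly one token and passes it to cb.
    // Returns false once the input is exhausted (or malformed).
    bool readNext(alias cb)()
    {
        if (input.length == 0)
            return false;
        if (input[0] == '<')
        {
            auto end = input.indexOf('>');
            if (end < 0) { input = null; return false; } // unterminated tag
            auto name = input[1 .. end];
            input = input[end + 1 .. $];
            if (name.length && name[0] == '/')
                cb(CloseElementToken(name[1 .. $]));
            else
                cb(OpenElementToken(name));
        }
        else
        {
            auto end = input.indexOf('<');
            if (end < 0) end = input.length; // text runs to end of input
            cb(CharacterDataToken(input[0 .. end]));
            input = input[end .. $];
        }
        return true;
    }
}

string[] log; // records what the callbacks saw

// The alias parameter binds to this whole overload set, so the token's
// static type picks the function at compile time; no switch is needed.
void callback(OpenElementToken t)   { log ~= "open " ~ t.name; }
void callback(CloseElementToken t)  { log ~= "close " ~ t.name; }
void callback(CharacterDataToken t) { log ~= "text " ~ t.text; }

void main()
{
    auto pp = ToyParser("<a>hi</a>");
    bool more = true;
    do
    {
        more = pp.readNext!(callback)(); // the caller keeps control of the loop
    }
    while (more);
    writeln(log);
}
```

Because the caller drives the loop, a callback could also call readNext itself with a different overload set, for example to consume everything up to a closing tag.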



--
Michel Fortin
michel.for...@michelf.com
http://michelf.com/



Re: Problem with .deb packages

2009-05-24 Thread Daniel Keep


grauzone wrote:
> Daniel Keep wrote:
>>
>> grauzone wrote:
>>> ...
>>>
>>> Now the irony is, that Wlater wouldn't even allow Debian to redistribute
>>> a properly packaged dmd... (if Debian wanted to)
>>
>> Speak ye of the evil Wizard Wlater, previous servant of the dark empire
>> of Sym'n'tek?  :3
> 
> Oops.
> 
>> As for the distribution problem, I think it's because Walter *can't*
>> allow it to be freely redistributed.
> 
> Why not? It can't be for license reasons?

I don't think Walter has complete ownership over all of the code.


Re: Problem with .deb packages

2009-05-24 Thread grauzone

Daniel Keep wrote:


grauzone wrote:

...

Now the irony is, that Wlater wouldn't even allow Debian to redistribute
a properly packaged dmd... (if Debian wanted to)


Speak ye of the evil Wizard Wlater, previous servant of the dark empire
of Sym'n'tek?  :3


Oops.


As for the distribution problem, I think it's because Walter *can't*
allow it to be freely redistributed.


Why not? It can't be for license reasons?


Re: !in operator?

2009-05-24 Thread Stewart Gordon

Jason House wrote:

That is unfortunately a rather sticky point.  The in operator does not 
return bool.  I think the lack of !in is to encourage writing of efficient 
code.  I'm not really sure though.


How, exactly, does not having !in make code efficient?

Stewart.


Re: !in operator?

2009-05-24 Thread Stewart Gordon

Jeremie Pelletier wrote:
Why is there no !in operator just like there is a !is operator? Is it 
because this expression evaluates to a pointer to the found element? 



Of course not.  This compiles:

void main() {
char* abc;
assert (!abc);
}

so why shouldn't !in?

Stewart.


Re: Problem with .deb packages

2009-05-24 Thread Daniel Keep


grauzone wrote:
> ...
> 
> Now the irony is, that Wlater wouldn't even allow Debian to redistribute
> a properly packaged dmd... (if Debian wanted to)

Speak ye of the evil Wizard Wlater, previous servant of the dark empire
of Sym'n'tek?  :3

As for the distribution problem, I think it's because Walter *can't*
allow it to be freely redistributed.


Re: Finalizing D2

2009-05-24 Thread Daniel Keep


Michel Fortin wrote:
> ...
> 
> A callback API isn't necessarily SAX. A callback API doesn't necessarily
> have to parse everything until completion, it could parse only the next
> token and call the appropriate callback.

When I talk "callback api," I mean something fundamentally like SAX.
The reason is that if your callback api only does a single callback, all
you've really done is move the switch statement inside the function call
at the cost of having to define a crapload of functions outside of it.

> If I can construct a range class/struct over my callback API I'll be
> happy. And if I can recursively call the parser API inside a callback
> handler so I can reuse the call stack while parsing then I'll be very
> happy.

I don't see how constructing a range over a callback api will work.
Callback apis are inversion of control, ranges aren't.

As for using a callback api recursively, that just seems like a lot of
work to replicate the way a pull api works in the first place.

>> Something like Tango's PullParser is the superior API because although
>> it's more verbose up-front, that's as bad as it gets.  Plus, you can
>> actually do stuff like call subroutines.
> 
> All that is needed really is a callback system that parses only one
> token. Then the callback can update the PullParser state, or the
> token-range state, run in a loop to produce a SAX-like API, or directly
> do what you want to do, which may include parsing more tokens using
> different callbacks until you reach a closing tag.

Like I said, this seems like a lot of work to bolt a callback interface
onto something a pull api is designed for.

At best, you'll end up rewriting this:

> foreach( tt ; pp )
> {
> switch( tt )
> {
> case XmlTokenType.StartElement: blah(pp.name); break;
> ...
> }
> }

to this:

> pp.parse
> (
> XmlToken(Type.StartElement, {blah(pp.name);}),
> ...
> );

Except of course that you now can't easily control the loop, nor can
you do fall-through on the cases.


Re: OT: on IDEs and code writing on steroids

2009-05-24 Thread Lutger
Yigal Chripun wrote:
...
>> 
> this I completely disagree with. those are the same faulty reasons I 
> already answered.
> an IDE does _not_ create bad programmers, and does _not_ encourage bad 
> code. it does encourage descriptive names which is a _good_ thing.
> 
> writing "strcpy" ala C style is cryptic and *wrong*. code is read 
> hundred times more than it's written and a better name would be for 
> instance - "stringCopy".
> it's common nowadays to have tera-byte sized HDD so why people try to 
> save a few bytes from their source while sacrificing readability?
...

This is not what I was saying. 

I'm not talking about strcpy vs stringCopy. stringCopy is short. I'm talking 
about things like SetCompatibleTextRenderingDefault.

And this example isn't even so bad. Fact is, it is easier to come up with long 
identifiers and there is no penalty in the form of typing cost for doing so. 

It's not about bad programmers (or saving bytes, that's just ridiculous), but 
an IDE does encourage certain kinds of constructs because they are easier in 
that environment. Good programmers come up with good, descriptive names, 
whether they program in an IDE or not. 

At work I must program in VB.NET. This language is pretty verbose in describing 
even the most common things. It's easier to parse when you're new to the 
language, but after a while I find all the verbosity gets in the way of 
readability.





Re: Problem with .deb packages

2009-05-24 Thread grauzone
The packages are still broken. I don't know who did it but we can't leave 
that on the website.


It's hard to persuade people to use D if packages are broken and there 
isn't a Windows installer. I think a lot of people don't even try after 
seeing that.


You can bet on that. It makes you wonder how whoever assembled the 
package tested it. Did he just go with --force-all because he couldn't 
figure out various things about the package system? What the heck did he 
do? And why the hell is it not fixed yet?


Providing broken packages is as nice to the user as providing virus 
infected .exe files.


Now the irony is, that Wlater wouldn't even allow Debian to redistribute 
a properly packaged dmd... (if Debian wanted to)


Re: OT: on IDEs and code writing on steroids

2009-05-24 Thread BCS

Hello Christopher,


BCS wrote:


But that's not the point. Neither make nor VS's equivalent is what
this thread was about. At least not where I was involved. My point is
that the design of C# *requires* the maintenance (almost certainly
by a C#-specific IDE) of some kind of external metadata file that
contains information that can't be derived from the source code
itself, whereas with D, no such metadata is *needed*. If you wanted,
you could build a tool to take D source code and generate a makefile
or a bash build script from it.


If you wanted, you could create a tool to do the same with C# source
code, assuming there exists a directory containing all and only those
source files that should end up in the resulting assembly.


I'm /not/ willing to assume that (because all too often it's not true), and 
you also need the list of other assemblies that should be included.





Re: Problem with .deb packages

2009-05-24 Thread Bruno Deligny

Jesse Phillips wrote:

On Sat, 02 May 2009 14:57:43 +0200, Bruno Deligny wrote:


When I try to install dmd1 or dmd2 on my Ubuntu i386 with the deb
packages on http://www.digitalmars.com/d/download.html, it says "Error :
incorrect Architecture « amd64 »"

The packages were built for the amd64 architecture.


I don't know how the packages were built for amd64; there are only i386 
packages. You have to give dpkg the --force-architecture switch.

dpkg --force-architecture -i ...deb


The packages are still broken. I don't know who did it but we can't leave 
that on the website.


It's hard to persuade people to use D if packages are broken and there 
isn't a Windows installer. I think a lot of people don't even try after 
seeing that.


Re: Finalizing D2

2009-05-24 Thread Michel Fortin

On 2009-05-24 03:22:47 -0400, Daniel Keep  said:


Callbacks are "easier" to set up, but are incredibly complicated for any
sort of structured parsing.  The problem is that you can't easily change
the behaviour of the parser once it's started.

I had to write a SAX parser for a structured data format a few years
ago.  I swear that 90% of the code (and it's a monstrously huge module)
was just boilerplate to work around the bloody callback system.  I've
come to the conclusion that the SAX api is about the worst POSSIBLE way
of parsing anything more complex than a flat file that shouldn't have
been XML in the first place.


A callback API isn't necessarily SAX. A callback API doesn't 
necessarily have to parse everything until completion, it could parse 
only the next token and call the appropriate callback.


If I can construct a range class/struct over my callback API I'll be 
happy. And if I can recursively call the parser API inside a callback 
handler so I can reuse the call stack while parsing then I'll be very 
happy.




Something like Tango's PullParser is the superior API because although
it's more verbose up-front, that's as bad as it gets.  Plus, you can
actually do stuff like call subroutines.


All that is needed really is a callback system that parses only one 
token. Then the callback can update the PullParser state, or the 
token-range state, run in a loop to produce a SAX-like API, or directly 
do what you want to do, which may include parsing more tokens using 
different callbacks until you reach a closing tag.



--
Michel Fortin
michel.for...@michelf.com
http://michelf.com/



Re: OT: on IDEs and code writing on steroids

2009-05-24 Thread Christopher Wright

BCS wrote:

Hello Yigal,


Georg Wrede wrote:


Yigal Chripun wrote:



Make _is_ a build tool


Yes. But since it's on every Unix since almost 40 years back, it
doesn't count here.  :-)

Besides, it has tons of other uses, too. One might as well say that a
text editor is a build tool. You construct (or erect) software with
it. ;-)


Nope. it does count as an external build tool



OK and so can bash because it can run scripts.


No, the main purpose of make is to build software. You probably wouldn't 
think to use a makefile to automate converting flac files to ogg files, 
for instance. Or look at bashburn -- it has a user interface (albeit 
using text menus rather than graphics). You might be able to do that 
with a makefile, but it would be seriously awkward, and you'd mainly be 
using shell scripting.


And bash does not have any special features to assist in building software.

But that's not the point. Neither make nor VS's equivalent is what this 
thread was about. At least not where I was involved. My point is that 
the design of C# *requires* the maintenance (almost certainly by a 
C#-specific IDE) of some kind of external metadata file that contains 
information that can't be derived from the source code itself, whereas 
with D, no such metadata is *needed*. If you wanted, you could build 
a tool to take D source code and generate a makefile or a bash build 
script from it.


If you wanted, you could create a tool to do the same with C# source 
code, assuming there exists a directory containing all and only those 
source files that should end up in the resulting assembly. If you follow 
C# best practices, this is what you will do -- and your directory 
structure will match your namespaces besides. But this is not enforced.


Re: Finalizing D2

2009-05-24 Thread Daniel Keep


Michel Fortin wrote:
> On 2009-05-23 01:25:49 -0400, Andrei Alexandrescu
>  said:
> 
>> * std.xml: replace with something that moves faster than molasses.
> 
> I started to write an XML parser using D1 and a pseudo-range
> implementation a little while ago, but never finished it. (I was
> undecided about the API, and that somewhat killed my interest.)
> 
> Perhaps I should finish it and contribute to Phobos.
> 
> The irking thing about the API was that if I expose a range for parsing
> and returning tokens, I then need a switch statement to do the right
> thing about each kind of these tokens (like instantiating the proper
> node type) whereas with a callback API you don't need to bother saving
> and then switching on a flag value telling you which kind of node you've
> read (and callbacks can be aliases in templates). They are two different
> compromises between speed and flexibility and I guess both should be
> supported.

Callbacks are "easier" to set up, but are incredibly complicated for any
sort of structured parsing.  The problem is that you can't easily change
the behaviour of the parser once it's started.

I had to write a SAX parser for a structured data format a few years
ago.  I swear that 90% of the code (and it's a monstrously huge module)
was just boilerplate to work around the bloody callback system.  I've
come to the conclusion that the SAX api is about the worst POSSIBLE way
of parsing anything more complex than a flat file that shouldn't have
been XML in the first place.

Something like Tango's PullParser is the superior API because although
it's more verbose up-front, that's as bad as it gets.  Plus, you can
actually do stuff like call subroutines.