date:20120313


On 12.03.2012 16:43, bls wrote:

On 03/10/2012 03:28 PM, Philippe Sigaud wrote:

Hello,

I created a new Github project, Pegged, a Parsing Expression Grammar
(PEG) generator in D.

https://github.com/PhilippeSigaud/Pegged

docs: https://github.com/PhilippeSigaud/Pegged/wiki


Just WOW!

Nice to have on your WIKI would be an EBNF to PEG sheet.

Wirth EBNF      Pegged
A = BC.         A <- B C
A = B|C.        A <- C / C


Maybe A <- B / C. And even then it's not exactly equivalent if the
grammar was ambiguous.

Imagine: B <- a, C <- aa

--
Dmitry Olshansky

Re: Pegged, From EBNF to PEG


On 12.03.2012 17:45, bls wrote:

On 03/13/2012 04:28 AM, Dmitry Olshansky wrote:

Maybe A <- B / C. And even then it's not exactly equivalent if the
grammar was ambiguous.
Imagine: B <- a, C <- aa

PEG is pretty new to me. Can you elaborate a bit ?


PEG defines the order of alternatives, pretty much the way a top-down
recursive descent parser would parse it. Alternatives are tried from
left to right: if the first one fails, it tries the next, and so on.
In the example I gave, B is always picked first, so C is never even
looked at.


A somewhat less artificial example:
Literal <- IntL / FloatL
FloatL <- [0-9]+ ('.' [0-9]+)?
IntL <- [0-9]+

If you change it to Literal <- FloatL / IntL, then integer literals would
get parsed as floating point.
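
The ordered choice described above can be tried out with a few lines of Python. This is only a minimal sketch and not how Pegged works internally: the helper names (literal, regex, choice) are made up for illustration, and each parser simply returns the position it stopped at, or None on failure.

```python
import re

def literal(text):
    """Parser for a fixed string: returns the end position or None."""
    def parse(s, pos):
        return pos + len(text) if s.startswith(text, pos) else None
    return parse

def regex(pattern):
    """Parser for a regular expression anchored at the current position."""
    compiled = re.compile(pattern)
    def parse(s, pos):
        m = compiled.match(s, pos)
        return m.end() if m else None
    return parse

def choice(*named_alternatives):
    """PEG ordered choice: commit to the first alternative that succeeds."""
    def parse(s, pos):
        for name, alt in named_alternatives:
            end = alt(s, pos)
            if end is not None:
                return name, end   # later alternatives are never tried
        return None
    return parse

# A <- B / C with B <- 'a', C <- 'aa'
B = literal("a")
C = literal("aa")
A = choice(("B", B), ("C", C))

# Literal <- IntL / FloatL, and the swapped order
IntL = regex(r"[0-9]+")
FloatL = regex(r"[0-9]+(\.[0-9]+)?")
int_first = choice(("IntL", IntL), ("FloatL", FloatL))
float_first = choice(("FloatL", FloatL), ("IntL", IntL))

print(A("aa", 0))            # which rule matched, and where it stopped
print(float_first("42", 0))  # an integer literal taken by FloatL
print(int_first("3.14", 0))  # IntL stops before the '.'
```

On the input "aa", A reports that B matched after one character, so C is never consulted. With FloatL first, "42" is reported as a FloatL. With IntL first, "3.14" is cut off after the "3", so neither order is a drop-in replacement for an unordered EBNF alternative.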






My mistake.. cleaned up stuff..

                  Pegged        Wirth EBNF

Sequence          A <- B C      A = BC.
B or C            A <- B / C    A = B|C.
Zero or one B     A <- B?       A = [B].
Zero or more Bs   A <- B*       A = {B}.
One or more Bs    A <- B+       Not available

PEG description of EBNF

EBNF <- Production+
Production <- Identifier '=' Expression '.'
Expression <- Term ( '|' Term)*
Term <- Factor Factor*
Factor <- Identifier / Literal / '[' Expression ']' / '{' Expression '}'
          / '(' Expression ')'
lowerCase <- [a-z]
upperCase <- [A-Z]
Identifier <- (lowerCase / upperCase) (lowerCase / upperCase)*


Why not:
Identifier <- [a-zA-Z]+


Literal <- ("'" .+ "'") / ('"' .+ '"')


This needs escaping. A plain '.+' in a pattern asks for trouble 99% of the time.
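
One way to read that warning: in a PEG, '.+' is greedy and never gives characters back, so it swallows the closing quote and the rule fails. The sketch below uses hand-rolled Python combinators with hypothetical names, not Pegged's API, to compare the naive rule with a guarded one that uses a negative lookahead, i.e. (!quote .)+ in PEG notation.

```python
def lit(text):
    """Parser for a fixed string: returns the end position or None."""
    def parse(s, pos):
        return pos + len(text) if s.startswith(text, pos) else None
    return parse

def any_char(s, pos):
    """PEG '.': consume exactly one character if there is one."""
    return pos + 1 if pos < len(s) else None

def seq(*parts):
    """PEG sequence: every part must match, one after the other."""
    def parse(s, pos):
        for part in parts:
            pos = part(s, pos)
            if pos is None:
                return None
        return pos
    return parse

def one_or_more(p):
    """PEG 'p+': greedy, and it never backtracks into what it consumed."""
    def parse(s, pos):
        end = p(s, pos)
        if end is None:
            return None
        while True:
            nxt = p(s, end)
            if nxt is None or nxt == end:
                return end
            end = nxt
    return parse

def not_ahead(p):
    """PEG '!p': succeeds without consuming input iff p fails here."""
    def parse(s, pos):
        return pos if p(s, pos) is None else None
    return parse

quote = lit("'")
naive = seq(quote, one_or_more(any_char), quote)
guarded = seq(quote, one_or_more(seq(not_ahead(quote), any_char)), quote)

print(naive("'abc' rest", 0))    # '.+' runs to the end of input, then fails
print(guarded("'abc' rest", 0))  # stops at the closing quote
```

Handling escaped quotes inside the literal would need one more alternative in the guarded loop; that is left out here to keep the sketch short.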


Still not sure if this is correct. Especially:
Term <- Factor Factor*


Another thing I never really understood is the production order. In
other words: why not top down ..
Start :
lowerCase <- [a-z]
upperCase <- [A-Z]
Identifier <- (lowerCase / upperCase) (lowerCase / upperCase)*





End :
EBNF <- Production+

where End is Root..


In fact, grammars are usually devised the other way around, e.g.
Start:
 Program <- ...
Ehm... what exactly is the whole program? OK, let it be Declaration*
for now. What kinds of declarations do we have? ... and so on. Later,
grammars get tweaked and extended numerous times.


At any rate, production order has no effect on the grammar; it's still
the same. The only thing of importance is which non-terminal is considered
final (or start, if you are LL-centric).




TIA, Bjoern



--
Dmitry Olshansky

Re: Pegged, From EBNF to PEG

2012-03-13 Thread Alex Rønne Petersen


On 13-03-2012 17:17, Dmitry Olshansky wrote:

lowerCase <- [a-z]
upperCase <- [A-Z]
Identifier <- (lowerCase / upperCase) (lowerCase / upperCase)*


Why not:
Identifier <- [a-zA-Z]+


That was an illustrative example from the Pegged docs. But yeah, you 
should just use a range; reads nicer.











--
- Alex

Re: Pegged, a Parsing Expression Grammar (PEG) generator in D

2012-03-13 Thread Tobias Pankrath

I am impressed. That's a really nice showcase for the D compile 
time features.


Can I use PEG to parse languages like Python and Haskell, where 
indentation matters, without preprocessing?


Will you make it work with input ranges of dchar? So that I can 
easily plug in some preprocessing steps?

Mono-D 0.3.4

2012-03-13 Thread alex


Again a couple of fixes & improvements [v0.3.4]

- [DDoc launcher] Extended functionality (now delegates & array 
  literals are handled, too)
- [Refactoring] Fixed most of the renaming & reference 
  finding/highlighting bugs
- [Settings] Enabled relative include paths for projects (will 
  take the project's dir as base directory) & global configurations 
  (uses the config's bin path as base path)
- [Formatter] Fixed indent problem with pressing newline in block 
  comments
- [Internal] Added instructions for debugging the addin under 
  MonoDevelop


v0.3.3:

- [Settings] Made url for opening manual pages editable, but it's 
  still using dlang.org by default
- [Resolver] Re-fixed structs' default ctor - slightly buggy but 
  working
- [Doc outline] Fixed representation of e.g. private const 
  literals
- [Doc outline] Added special icon for alias declarations
- [Parser] Fixed synchronized parse order issue
- [Parser] Fixed class invariant parsing & modified their 
  representation in the doc outline
- [Building] Small fix when executing stand-alone files
- [Parser] Mixin parse error
- There are text boxes instead of lists for include paths in the 
  option dialogs now


Original Post: http://mono-d.alexanderbothe.com/?p=350
Further issues: https://github.com/aBothe/Mono-D/issues

Re: Pegged, From EBNF to PEG

On Tue, Mar 13, 2012 at 18:05, Alex Rønne Petersen xtzgzo...@gmail.com wrote:

 lowerCase <- [a-z]
 upperCase <- [A-Z]
 Identifier <- (lowerCase / upperCase) (lowerCase / upperCase)*


 Why not:
 Identifier <- [a-zA-Z]+


 That was an illustrative example from the Pegged docs. But yeah, you should
 just use a range; reads nicer.

The docs are for teaching PEG :) (btw, the docs describe C-like
identifiers, that's why I chose a longer approach)
It's always this 'tension' between inlining and refactoring.
[a-zA-Z]+ is shorter and more readable. But if you decide to extend
your grammar to UTF-32, it'd be easier to just change the 'letter'
rule.

Re: Pegged, From EBNF to PEG

On Mon, Mar 12, 2012 at 13:43, bls bizp...@orange.fr wrote:

 Just WOW!

Thanks! Don't be too excited, it's still quite slow as a parser. But
that is a fun project :)

 Nice to have on your WIKI would be a EBNF to PEG sheet.

 Wirth EBNF      Pegged
 A = BC.         A <- B C
 A = B|C.        A <- C / C
 A = [B].        A <- B?
 A = {B}.        A <- B*

Fact is, I don't know EBNF that much. I basically learned everything I
know about parsing or grammars while coding Pegged in February :) I
probably made every mistake in the book.

Hey, it's a github public wiki, I guess you can create a new page?

Re: How about colors and terminal graphics in std.format?

2012-03-13 Thread Chad J


On 03/13/2012 01:34 AM, H. S. Teoh wrote:

On Tue, Mar 13, 2012 at 03:17:42PM +1300, James Miller wrote:

On 13 March 2012 15:17, H. S. Teoh <hst...@quickfur.ath.cx> wrote:

We could start off with said module just doing colors for now, and
then gradually add more stuff to it later.


We could end up at a D-flavoured ncurses library!

[...]

That would be a dream come true for me.

I have to admit that I find the ncurses API quite ugly, and not very
well-designed. It's the curse (pun intended) of inheriting a decades-old
API that was designed back in the days when people didn't know very much
about good API design.


T



Yes.

Maybe someday...

Re: EBNF grammar for D?

Alix Pexton:

 Rainer Schuetze pulled all the grammar out of the docs and fixed them up
a while back as part of his work on Visual D. It's not in straight EBNF and
it may not be 100% up to date, but it may be a good place to start.

 http://www.dsource.org/projects/visuald/wiki/GrammarComparison

 I hope that is of some use!

It is! Thanks Alix and thanks Rainer, the page is clean and readable. I'll
use it at once.

I admit to reading the spec in detail for the first time yesterday, and I had
a few WAT moments. Not with the way it's written per se, but with what is
seemingly allowed by the D grammar.

Re: How about colors and terminal graphics in std.format?

2012-03-13 Thread James Miller

On 13 March 2012 18:50, Chad J chadjoan@__spam.is.bad__gmail.com wrote:
 On 03/13/2012 01:41 AM, James Miller wrote:

 On 13 March 2012 18:24, Chad J <chadjoan@__spam.is.bad__gmail.com> wrote:

 I'm not sure I agree with resetting to a default color.  What if I want
 to write to the stream without altering the terminal's graphics settings?


 Actually, I meant more to make sure that any output is reset to the
 terminal's default. I'm pretty sure there is a way to do this. The
 point is that not undoing mode changes is bad form.

 Otherwise, I can live with the colourings being nested, but I would
 suggest a change in syntax. I understand that yours is mostly just for
 show, but using parentheses will be annoying; I'd probably use braces
 ('{' and '}') instead, since they are less common.

 writefln("%Cred(\(this is in color\))");
  vs
 writefln("%Cred{(this is in color)}");

 Neither is /that/ pretty, but at least the second one requires less
 escaping in the common case.

 --
 James Miller


 Oh, I see what you mean.

 This is why the second paren always had a % before it:

 writefln("%Cred((this is in color)%)");

 Is this OK?  I know that escaping is still involved, but the text itself
 does not need escaping: only the special closing element does.

 I like this constraint because it means that the only character you ever
 have to escape in your normal text is %, which you write by using %%
 instead.

That works, and I think it matches zsh's style. I still think that
'{', '}' would be better, but I'm not dead-set on it.
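
To make the proposal concrete, here is a small Python sketch of how a formatter might expand the '%Cname(' ... '%)' syntax discussed above into ANSI escape sequences. Everything in it is an assumption made for illustration: the function name render, the colour table, and the choice to restore the enclosing colour after a nested region closes. Nothing like this exists in std.format.

```python
# Standard ANSI SGR sequences for a few foreground colours.
ANSI = {"red": "\033[31m", "green": "\033[32m", "blue": "\033[34m"}
RESET = "\033[0m"

def render(fmt):
    """Expand %Cname( ... %) colour regions; %% yields a literal percent."""
    out = []
    stack = []          # colours currently open, innermost last
    i = 0
    while i < len(fmt):
        if fmt.startswith("%%", i):
            out.append("%")
            i += 2
        elif fmt.startswith("%)", i):
            if not stack:
                raise ValueError("'%)' without a matching '%C...('")
            stack.pop()
            out.append(RESET)
            if stack:                       # restore the enclosing colour
                out.append(ANSI[stack[-1]])
            i += 2
        elif fmt.startswith("%C", i):
            open_paren = fmt.find("(", i)
            if open_paren == -1:
                raise ValueError("'%C' without '('")
            name = fmt[i + 2:open_paren]
            if name not in ANSI:
                raise ValueError("unknown colour: " + name)
            stack.append(name)
            out.append(ANSI[name])
            i = open_paren + 1
        else:
            out.append(fmt[i])              # ordinary text, parentheses included
            i += 1
    if stack:
        raise ValueError("unclosed colour region")   # never leave the mode set
    return "".join(out)

print(repr(render("%Cred((this is in color)%)")))
```

Because only '%)' closes a region, plain parentheses in the text need no escaping, which is the property Chad argues for. Rejecting an unclosed region is one way to honour the point that mode changes should always be undone.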

--
James Miller

Re: Arbitrary abbreviations in phobos considered ridiculous

On Tue, Mar 13, 2012 at 01:37:24AM -0400, Nick Sabalausky wrote:
[...]
 Yea, there's a lot of things that are much better done in CSS that a
 lot of people don't even know about. For example, most rollovers are
 easily doable in pure CSS. But there's a lot of stuff out there
 (particularly things created in Adobe's software) that use JS for
 rollovers, which doesn't even work as well (even with JS on).

Ugh. Don't get me started on Adobe. I don't know what they do to their
programmers, but obviously UI design is not part of their staff
training. Have you seen acrobat reader's UI? It's utterly atrocious.
Completely counterintuitive, and an embarrassment to modern UI design.
And that's their product line. Don't even mention their website.


 OTOH, I don't like CSS drop-down menus. Maybe it's different in CSS3,
 but in CSS2 the only way to make CSS menus work is for them to open
 upon rollover, not click. And dropdown menus opening upon rollover is
 just a usability mess, IMO, *and* inconsistent with pretty much any
 GUI OS I've ever used.
[...]

Hmm. I rather *like* CSS drop-down menus, actually. At least, it's *way*
better than those annoying badly-written bloated JS menus. Though if
it's not done properly, it can be an utter annoyance. E.g., if the menu
is separated from the element that triggers it by a gap of several
pixels... then it's almost impossible to actually use it. (This happens
to me a lot 'cos I fiddle with default font size settings. Which lets me
see the pervasiveness of broken pixel-dependent CSS in all its glory.
OK, I better stop now, 'cos static vs. fluid layouts are another of my
pet peeves... this thread will never end if I keep going.)


T

-- 
Amateurs built the Ark; professionals built the Titanic.

Re: EBNF grammar for D?

Rainer Schuetze
 There is also a script to generate the grammar text file here:

 http://www.dsource.org/projects/visuald/browser/grammar

The resulting text file is quite good. Heck, I think I could modify a
parser generator I'm writing to accept it directly.

I'll use that also, thanks for your work, Rainer.

Re: Arbitrary abbreviations in phobos considered ridiculous

H. S. Teoh hst...@quickfur.ath.cx wrote in message 
news:mailman.601.1331619011.4860.digitalmar...@puremagic.com...
 On Tue, Mar 13, 2012 at 01:37:24AM -0400, Nick Sabalausky wrote:
 [...]
 Yea, there's a lot of things that are much better done in CSS that a
 lot of people don't even know about. For example, most rollovers are
 easily doable in pure CSS. But there's a lot of stuff out there
 (particularly things created in Adobe's software) that use JS for
 rollovers, which doesn't even work as well (even with JS on).

 Ugh. Don't get me started on Adobe. I don't know what they do to their
 programmers, but obviously UI design is not part of their staff
 training. Have you seen acrobat reader's UI? It's utterly atrocious.
 Completely counterintuitive, and an embarrassment to modern UI design.
 And that's their product line. Don't even mention their website.


Yea, I have a *very* low opinion of Adobe's products in general. They can 
barely get anything right, and that's on top of their sky-high prices. Each 
CS is more absurdly bloated than the last, and uses weirder and weirder 
skins, and Adobe clearly has zero understanding of security. I don't hate 
Adobe like I hate some other companies, but I consider their output to 
mostly be garbage and I hate dealing with their products (and 
documentation).

It's a shame they no longer have *really good* competition, like Paint Shop 
Pro which used to compete toe-to-toe with Photoshop and at a much lower 
price (until Corel bought it and drove it straight into the ground along 
with the rest of Corel's products). I use GIMP, but it's uhh...kind of a 
gimpy program (aptly-named).

Have you heard of the Corona SDK? It's not technically made by Adobe 
but...I'll put it this way: Corona does initially come across as fairly 
slick. But then you compare it to one of their competitors, Marmalade SDK, 
and despite Marmalade's lack of slickness, you quickly realize that 
Marmalade was made by and for grownups, while Corona...is basically just an 
overpriced toy in grown-up's clothing. *After* I came to that conclusion, I 
discovered the company which makes Corona, surprise surprise, was founded by 
former higher-ups from Adobe's Flash dept. Go figure. Flash, which can 
*also* be accurately described as an overpriced toy in grown-up's clothing.

Regarding implementing a stable sort for Phobos

I've been playing with sorting algorithms a lot in recent months, 
so I want to implement a *working* stable sort for Phobos which 
is broken at the moment. I have a working library and I'm still 
adding to it. It's much more complex than a simple merge sort, 
being over 300 lines of code at the moment.


- It's a natural merge sort, which is faster on partially sorted 
lists, and adds little overhead for mostly random lists.

- It uses O(log n log n) additional space for merging.
- I wrote it to sort random-access ranges *without* slicing, but 
I think the exclusion of slicing makes it slower. I'm writing a 
separate implementation which uses slicing and I'll keep it if 
it's much faster.
- To avoid multiple allocations, the user can allocate their own 
temporary memory and pass it to the sort function.
- I decided against using pointers. While I could use them to 
write highly optimized code for arrays, pointers can't be used in 
safe code and don't work very well in CTFE at the moment.
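
For readers unfamiliar with the term, the following Python sketch shows the basic idea of a natural merge sort: find the runs that are already sorted, then merge neighbouring runs until one remains. It is not the implementation described in this post. It allocates O(n) temporary lists instead of the O(log n log n) scheme mentioned above, and the function names are hypothetical.

```python
def find_runs(items, key):
    """Split items into maximal non-decreasing runs as (start, end) pairs."""
    runs, start = [], 0
    for i in range(1, len(items)):
        if key(items[i]) < key(items[i - 1]):
            runs.append((start, i))
            start = i
    if items:
        runs.append((start, len(items)))
    return runs

def merge(left, right, key):
    """Stable merge: on ties the element from the left run goes first."""
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if key(right[j]) < key(left[i]):
            merged.append(right[j])
            j += 1
        else:
            merged.append(left[i])
            i += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

def natural_merge_sort(items, key=lambda x: x):
    """Return a new sorted list; already-sorted input needs no merges."""
    runs = [items[a:b] for a, b in find_runs(items, key)]
    if not runs:
        return []
    while len(runs) > 1:
        # Merge neighbouring runs pairwise, keeping their left-to-right order.
        merged = [merge(runs[k], runs[k + 1], key)
                  for k in range(0, len(runs) - 1, 2)]
        if len(runs) % 2:
            merged.append(runs[-1])
        runs = merged
    return runs[0]

records = [("b", 2), ("a", 1), ("c", 2), ("d", 1)]
print(natural_merge_sort(records, key=lambda r: r[1]))
```

Sorted input forms a single run and is returned after one linear scan, which is where the advantage on partially sorted lists comes from.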


Is it worth keeping the implementation *without* slicing? Many 
functions in Phobos do require slicing, including the unstable 
sort, and I think most random-access ranges do or could support 
slicing.


What would you prefer in terms of memory usage vs performance? 
O(n/2) space is optimal for performance, O(1) (in-place) requires 
zero allocations but is slower, and O(log n log n) provides a 
good balance.


Should I implement concurrency? Merge sort is very easy to 
parallelize, and being in-place or natural doesn't change this 
fact.


Should we take a different path and go for a basic merge sort or 
even Timsort? I've considered writing a Timsort, though that seems 
like a daunting task to me, so here's an implementation written in 
C - https://github.com/swenson/sort

Re: Regarding implementing a stable sort for Phobos

2012-03-13 Thread Chad J


On 03/13/2012 02:31 AM, Xinok wrote:

I've been playing with sorting algorithms a lot in recent months, so I
want to implement a *working* stable sort for Phobos which is broken at
the moment. I have a working library and I'm still adding to it. It's
much more complex than a simple merge sort, being over 300 lines of code
at the moment.

...


Hey, I'd love to see more sorting algorithms in phobos.  Being stuck 
with one seems kind of... wrong.


If the range is a linked list, shouldn't it be possible to do a merge 
sort that is optimally in-place and uses no extra memory whatsoever?  I know 
it can be done in-place, but I've never benchmarked it.  I wonder if 
it's worth considering, and how it would compare against array-based 
merge sort with allocations and such.


Although it's probably out of your scope right now, I'd like to see 
insertion sort some day.  I would use it for things like broadphase 
sorting in collision detection (that is, you sort everything by say, x 
coordinates first, and then you can walk through the whole simulation 
from left-to-right and have very few things to check collisions for at 
each point).  Since the ordering of the objects in the simulation is 
unlikely to change much between frames, it will be almost entirely 
sorted each time.  I have to imagine insertion sort would be awesome at 
that; nearly O(n).  Maybe if it hits more than log(n) nonlocal 
insertions it would bail out into a merge sort or something.
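
The bail-out idea can be sketched in Python as follows. The threshold here is a budget of element shifts rather than the count of nonlocal insertions suggested above, which is a simplification of mine, and the function names are hypothetical.

```python
def insertion_sort_with_bailout(items, max_shifts):
    """Sort items in place with insertion sort.

    Returns False as soon as more than max_shifts element moves were needed.
    The list is then only partly sorted, but still holds the same elements.
    """
    shifts = 0
    for i in range(1, len(items)):
        current = items[i]
        j = i - 1
        while j >= 0 and items[j] > current:
            items[j + 1] = items[j]          # shift the larger element right
            j -= 1
            shifts += 1
            if shifts > max_shifts:
                items[j + 1] = current       # put the held element back
                return False
        items[j + 1] = current
    return True

def sort_mostly_sorted(items, max_shifts=None):
    """Try insertion sort first; fall back to the built-in sort if it bails."""
    if max_shifts is None:
        max_shifts = 4 * len(items)          # arbitrary budget, linear in n
    if not insertion_sort_with_bailout(items, max_shifts):
        items.sort()                         # Python's built-in stable sort
    return items

frame = [1, 2, 4, 3, 5, 7, 6, 8]             # almost sorted, as between frames
print(sort_mostly_sorted(frame))
```

With a budget proportional to n, the insertion pass stays linear on nearly sorted input and the fallback bounds the worst case.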

Re: toHash = pure, nothrow, const, @safe

2012-03-13 Thread Jacob Carlborg


On 2012-03-13 01:40, Walter Bright wrote:

On 3/12/2012 1:56 PM, Martin Nowak wrote:

It doesn't require all source code.
It just means that without source code nothing can be inferred and the
attributes fall back to what has been annotated by hand.


Hello endless bug reports of the form:

It compiles when I send the arguments to dmd this way but not that way.
dmd is broken. D sux.


We already have that, sometimes :(

--
/Jacob Carlborg

Re: How about colors and terminal graphics in std.format?

2012-03-13 Thread Jacob Carlborg


On 2012-03-13 02:36, Christian Manning wrote:


It would be great if an std.terminal contained general stuff for
manipulating/querying a terminal portably, as well as colour output, eg.
get terminal size, move cursor around, erase line... just things to help
with building UIs, progress bars, etc. that are easy to use.


I actually have a library for this written in C++, somewhere.

--
/Jacob Carlborg

Re: Regarding implementing a stable sort for Phobos


Le 13/03/2012 07:53, Chad J a écrit :

On 03/13/2012 02:31 AM, Xinok wrote:

I've been playing with sorting algorithms a lot in recent months, so I
want to implement a *working* stable sort for Phobos which is broken at
the moment. I have a working library and I'm still adding to it. It's
much more complex than a simple merge sort, being over 300 lines of code
at the moment.

...


Hey, I'd love to see more sorting algorithms in phobos. Being stuck with
one seems kind of... wrong.



I have a radix sort (that needs some rework to be phobos quality) and a 
smoothsort (that could be included in phobos).



If the range is a linked list, shouldn't it be possible to do a merge
sort that is optimally in-place and uses no extra memory whatsoever? I know
it can be done in-place, but I've never benchmarked it. I wonder if it's
worth considering, and how it would compare against array-based merge
sort with allocations and such.



I have a sort for ForwardRange, but it is O(n²) and unstable. However, 
it is in place.


I don't think we should allocate behind one's back, so merge sort should 
be an option, unless called explicitly.



Although it's probably out of your scope right now, I'd like to see
insertion sort some day. I would use it for things like broadphase
sorting in collision detection (that is, you sort everything by say, x
coordinates first, and then you can walk through the whole simulation
from left-to-right and have very few things to check collisions for at
each point). Since the ordering of the objects in the simulation is
unlikely to change much between frames, it will be almost entirely
sorted each time. I have to imagine insertion sort would be awesome at
that; nearly O(n). Maybe if it hits more than log(n) nonlocal insertions
it would bail out into a merge sort or something.


smoothsort is a good solution for that. radix is also guaranteed to be 
O(n). Insertion sort is quite risky, because it can end up in O(n²) very 
easily.

Re: Regarding implementing a stable sort for Phobos


On Tuesday, 13 March 2012 at 06:53:30 UTC, Chad J wrote:
Hey, I'd love to see more sorting algorithms in phobos.  Being 
stuck with one seems kind of... wrong.


Things like this are better left to 3rd party libs. Phobos 
already has two, a stable and unstable sort, which fulfill 99% of 
cases.


If the range is a linked list, shouldn't it be possible to do a 
merge sort that is optimally in-place and uses no extra memory 
whatsoever?  I know it can be done in-place, but I've never 
benchmarked it.  I wonder if it's worth considering, and how it 
would compare against array-based merge sort with allocations 
and such.


Yes, it's possible because insertions are inexpensive in linked 
lists. However, it would be impractical to implement one at the 
moment because Phobos has no concept of linked lists (ranges 
wouldn't cover it).
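
As an illustration of why linked lists make this cheap, here is a Python sketch of a merge sort on a singly linked list that only relinks nodes. The Node class and helper names are hypothetical and not tied to any Phobos container. The recursive version below still uses O(log n) stack space plus one temporary dummy node per merge; a bottom-up variant would avoid the recursion.

```python
class Node:
    """Minimal singly linked list node (hypothetical, for illustration)."""
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def merge_lists(a, b):
    """Merge two sorted lists by relinking nodes; stable, no copies made."""
    dummy = tail = Node(None)
    while a and b:
        if b.value < a.value:
            tail.next, b = b, b.next
        else:
            tail.next, a = a, a.next
        tail = tail.next
    tail.next = a or b            # append whatever is left over
    return dummy.next

def merge_sort_list(head):
    """Sort a linked list and return its new head."""
    if head is None or head.next is None:
        return head
    # Find the middle with slow/fast pointers and cut the list in two.
    slow, fast = head, head.next
    while fast and fast.next:
        slow, fast = slow.next, fast.next.next
    second = slow.next
    slow.next = None
    return merge_lists(merge_sort_list(head), merge_sort_list(second))

def from_list(values):
    head = None
    for v in reversed(values):
        head = Node(v, head)
    return head

def to_list(head):
    out = []
    while head:
        out.append(head.value)
        head = head.next
    return out

print(to_list(merge_sort_list(from_list([3, 1, 2, 5, 4]))))
```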


Although it's probably out of your scope right now, I'd like to 
see insertion sort some day.  I would use it for things like 
broadphase sorting in collision detection (that is, you sort 
everything by say, x coordinates first, and then you can walk 
through the whole simulation from left-to-right and have very 
few things to check collisions for at each point).  Since the 
ordering of the objects in the simulation is unlikely to change 
much between frames, it will be almost entirely sorted each 
time.  I have to imagine insertion sort would be awesome at 
that; nearly O(n).  Maybe if it hits more than log(n) nonlocal 
insertions it would bail out into a merge sort or something.


Insertion sort is one of the simplest sorts to write. I use it to 
optimize this stable sort as it's super efficient at sorting small 
sublists. A natural merge sort would also work well in this case, 
but Timsort would work best. Timsort is also a natural merge 
sort, but it goes much farther than that.

Re: Feq questions about the D language

2012-03-13 Thread Don Clugston


On 12/03/12 01:20, Walter Bright wrote:

On 3/11/2012 2:57 PM, Caligo wrote:

And just for the record, there are software projects that are millions
of lines of code in C/C++ and have ZERO workarounds. Also, I have
never encountered a bug in GCC when programming in C++, even when
trying out the latest C++11.


GCC itself is fairly bug free,


I had used g++ for a grand total of three hours before I found a 
wrong-code regression -- the absolute worst category of bug. I've never 
found a DMD bug which was significantly worse than that one.


GCC is much more mature than the DMD front-end, of course, but I don't 
think it's any more bug-free than the DMD back-end.


but then again I don't push it that hard.

The runtime library, more specifically the math functions, are
erratically buggy. (This is why Phobos has its own implementations of
those functions, rather than simply forwarding to gcc's library
versions.) The ld linker on OSX is pretty awful, as in costing me much
time in devising workarounds.

Re: Multiple return values...

On 13 March 2012 06:45, Andrei Alexandrescu
<seewebsiteforem...@erdani.org> wrote:

 You see, at this point I have no idea what to believe anymore. You argued
 very strongly from the position of one whose life depends on efficiency.
 Here and there you'd mix some remark about syntax, and I'd like whaa?...
 but generally discounted it as distraction from the main point, which was
 that all you must do is f(g()) where the body of g() is insignificantly
 small, which makes the cost of passing arguments around absolutely
 paramount.

 And now you come with this completely opposite viewpoint in which the
 syntax is paramount and urgent, whereas codegen is like let's leave it for
 later. I really am confused.


Okay sorry, let me clarify. My own personal stance is unchanged, but I
appreciate your assertion of priorities and I relent :)
This topic has meandered between 2 distinct threads, syntax and abi, and I
feel strongly about both, so maybe my personal sense of priority comes
across wrong as I'm discussing one topic or the other.

Trying to see it from a practicality standpoint, there is a pull request
there which would seem like a near-complete implementation of the
syntax, so that's a much easier/smaller step than messing with the ABI.
Also, the syntax element of the feature will benefit far more people, and
more immediately.
Note, I still find myself wanting this feature, at least syntactically,
every other day (my motivation starting the thread initially). But for my
purposes (simd math library currently) it wouldn't do for it to be
inefficient. At least the promise of an efficient implementation down the
road is needed to make use of it.

I think I feel a sense of urgency towards the ABI aspect because it is a
breaking change, and I suspect the longer anything like that is left, the
less likely/more risky it becomes.
If it gets delayed for 6-12 months, are you honestly more or less likely to
say it's a good idea to fiddle with the ABI?

I am sold on the Tuple approach now, so that's a big discussion that can be
dismissed. I think it was as a result of realising this that the ABI became
of higher importance in my mind, since I agree, workable syntax is
technically possible already (although ugly/verbose).


You don't see the immediate value in a convenient MRV syntax? It would
 improve code clarity in many places, and allow the code to also be more
 efficient down the road.


 I see value in Kenji's related diff, but not in adding syntax to e.g.
 return (int, int). But we want to make sure we address the matter
 holistically (for example: is Kenji's diff enough, or do we need to worry
 about assignment too?). The worst strategy in chess is to move a piece and
 then start analyzing the new situation on the board.


Shall we discuss the shortcomings of his implementation? Can someone
demonstrate the details of his implementation?
From the little examples up in the thread, it looked like you could only
declare new variables inline, but not assign out to existing ones. I'd say
this needs to be added too, and perhaps that will throw the whole design
into turmoil? ;)

Re: Regarding implementing a stable sort for Phobos


On Tuesday, 13 March 2012 at 08:37:06 UTC, deadalnix wrote:
I have a radix sort (that needs some rework to be phobos 
quality) and a smoothsort (that could be included in phobos).


Would you mind sharing your smoothsort? I haven't implemented one 
myself and I'd love to test it out.
Radix sort, on the other hand, is not a comparison sort. You'd 
have to rewrite it for every possible element and container type.


I have a sort for ForwardRange, but it is O(n²) and unstable. 
However, it is in place.


I posted one a few days ago myself - 
http://forum.dlang.org/thread/cmipnxrarexjgnrdq...@forum.dlang.org


I don't think we should allocate behind one's back, so merge 
sort should be an option, unless called explicitly.


When it comes to stable sorts, merge sort is the best choice. I 
found tree sort to be quite slow (using RedBlackTree in 
std.container), and a stable quick sort is still slower than a 
merge sort. So I guess that'd mean an in-place merge sort. I've 
implemented one which was 3x slower than quick sort. Allocating 
even a small amount of space makes a big difference.



Re: Regarding implementing a stable sort for Phobos


Le 13/03/2012 10:19, Xinok a écrit :

On Tuesday, 13 March 2012 at 08:37:06 UTC, deadalnix wrote:

I have a radix sort (that needs some rework to be phobos quality) and a
smoothsort (that could be included in phobos).


Would you mind sharing your smoothsort? I haven't implemented one myself
and I'd love to test it out.


It is on github : 
https://github.com/deadalnix/Dsort/blob/master/sort/smooth.d



Radix sort, on the other hand, is not a comparison sort. You'd have to
rewrite it for every possible element and container type.



You can do quite a lot with a bijective transformation.
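
A well-known example of such a transformation, sketched here in Python under my own assumptions (the post does not say which mapping is meant): IEEE 754 doubles can be mapped one-to-one onto unsigned 64-bit integers so that integer order matches float order, after which a plain byte-wise radix sort applies. The function names are hypothetical and NaN is not handled.

```python
import struct

ALL_ONES = 0xFFFFFFFFFFFFFFFF
SIGN_BIT = 1 << 63

def float_key(x):
    """Map a double to an unsigned 64-bit key that preserves ordering."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    if bits & SIGN_BIT:
        return bits ^ ALL_ONES    # negative: flip every bit
    return bits | SIGN_BIT        # non-negative: set the top bit

def key_to_float(key):
    """Inverse of float_key, so the mapping is a bijection on the bit patterns."""
    bits = key ^ SIGN_BIT if key & SIGN_BIT else key ^ ALL_ONES
    (x,) = struct.unpack("<d", struct.pack("<Q", bits))
    return x

def radix_sort_floats(values):
    """LSD radix sort, one byte per pass, on the transformed keys."""
    keys = [float_key(v) for v in values]
    for shift in range(0, 64, 8):
        buckets = [[] for _ in range(256)]
        for k in keys:
            buckets[(k >> shift) & 0xFF].append(k)   # stable within a bucket
        keys = [k for bucket in buckets for k in bucket]
    return [key_to_float(k) for k in keys]

print(radix_sort_floats([3.5, -1.25, 0.0, -7.0, 2.0]))
```

Signed integers and fixed-length strings admit similar mappings, so one key-based radix sort can serve several element types without being rewritten for each.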

Turning a SIGSEGV into a regular function call under Linux, allowing throw

2012-03-13 Thread FeepingCreature

Note: I worked out this method for my own language, Neat, but the basic 
approach should be portable to D's exceptions as well.

I've seen it argued a lot over the years (even argued it myself) that it's 
impossible to throw from Linux signal handlers. This is basically correct, 
because they constitute an interruption in the stack that breaks exceptions' 
ability to unroll properly.

However, there is a method to turn a signal handler into a regular function 
call that you can throw from.

Basically, what we need to do is similar to a stack buffer overflow exploit. 
Under Linux, the extended signal handler that is set with sigaction is called 
with three arguments: the signal, a siginfo_t* and a ucontext_t* as the third.

The third parameter is what we're interested in. Deep inside the ucontext_t 
struct is uc.mcontext.gregs[REG_EIP], the address of the instruction that 
caused the segfault. This is the location that execution returns to when the 
signal handler returns. By overwriting this location, we can turn a return into 
a function call.

First, gregs[REG_EAX] = gregs[REG_EIP];

We can safely assume that the function that caused the segfault doesn't really 
need its EAX anymore, so we can reuse it to reconstruct a proper stackframe to 
throw from later.

Second, gregs[REG_EIP] = cast(void*) sigsegv_userspace_handler;

Note that the naked attribute was not used. If used, it can make this code 
slightly easier.

extern(C) void sigsegv_userspace_handler() {
  // done implicitly
  // asm { push ebp; }
  // asm { mov ebp, esp; }
  asm { mov ebx, [esp]; } // backup the pushed ebp
  asm { mov [esp], eax; } // replace it with the correct return address
  // which was originally left out due to the
  // irregular way we entered this function (via a ret).
  asm { push ebx; }   // recreate the pushed ebp
  asm { mov ebp, esp; }   // complete stackframe.
  // originally, our stackframe (because we entered this function via a ret)
  // was [ebp]. Now, it's [return address][ebp], as is proper for cdecl.
  // at this point, we can safely throw
  // (or invoke any other non-handler-safe function).
  throw new SignalException(SIGSEGV);
}

Re: Turning a SIGSEGV into a regular function call under Linux, allowing throw


On 13/03/2012 11:09, FeepingCreature wrote:

Note: I worked out this method for my own language, Neat, but the basic 
approach should be portable to D's exceptions as well.

I've seen it argued a lot over the years (even argued it myself) that it's 
impossible to throw from Linux signal handlers. This is basically correct, 
because they constitute an interruption in the stack that breaks exceptions' 
ability to unroll properly.

However, there is a method to turn a signal handler into a regular function 
call that you can throw from.

Basically, what we need to do is similar to a stack buffer overflow exploit. 
Under Linux, the extended signal handler that is set with sigaction is called 
with three arguments: the signal, a siginfo_t* and a ucontext_t* as the third.

The third parameter is what we're interested in. Deep inside the ucontext_t 
struct is uc.mcontext.gregs[REG_EIP], the address of the instruction that 
caused the segfault. This is the location that execution returns to when the 
signal handler returns. By overwriting this location, we can turn a return into 
a function call.

First, gregs[REG_EAX] = gregs[REG_EIP];

We can safely assume that the function that caused the segfault doesn't really 
need its EAX anymore, so we can reuse it to reconstruct a proper stackframe to 
throw from later.

Second, gregs[REG_EIP] = cast(void*)sigsegv_userspace_handler;

Note that the naked attribute was not used. If used, it can make this code 
slightly easier.

extern(C) void sigsegv_userspace_handler() {
   // done implicitly
   // asm { push ebp; }
   // asm { mov ebp, esp; }
   asm { mov ebx, [esp]; } // backup the pushed ebp
   asm { mov [esp], eax; } // replace it with the correct return address
   // which was originally left out due to the
   // irregular way we entered this function (via a ret).
   asm { push ebx; }   // recreate the pushed ebp
   asm { mov ebp, esp; }   // complete stackframe.
   // originally, our stackframe (because we entered this function via a ret)
   // was [ebp]. Now, it's [return address][ebp], as is proper for cdecl.
   // at this point, we can safely throw
   // (or invoke any other non-handler-safe function).
   throw new SignalException(SIGSEGV);
}


And is this Exception recoverable in a safe way ?

The ucontext_t struct is system dependent. So this is tricky.

The Exception should be an Error to comply with nothrow spec.

Re: Turning a SIGSEGV into a regular function call under Linux, allowing throw

2012-03-13 Thread FeepingCreature

On 03/13/12 11:23, deadalnix wrote:
 On 13/03/2012 11:09, FeepingCreature wrote:
 Note: I worked out this method for my own language, Neat, but the basic 
 approach should be portable to D's exceptions as well.

 I've seen it argued a lot over the years (even argued it myself) that it's 
 impossible to throw from Linux signal handlers. This is basically correct, 
 because they constitute an interruption in the stack that breaks exceptions' 
 ability to unroll properly.

 However, there is a method to turn a signal handler into a regular function 
 call that you can throw from.

 Basically, what we need to do is similar to a stack buffer overflow exploit. 
 Under Linux, the extended signal handler that is set with sigaction is 
 called with three arguments: the signal, a siginfo_t* and a ucontext_t* as 
 the third.

 The third parameter is what we're interested in. Deep inside the ucontext_t 
 struct is uc.mcontext.gregs[REG_EIP], the address of the instruction that 
 caused the segfault. This is the location that execution returns to when the 
 signal handler returns. By overwriting this location, we can turn a return 
 into a function call.

 First, gregs[REG_EAX] = gregs[REG_EIP];

 We can safely assume that the function that caused the segfault doesn't 
 really need its EAX anymore, so we can reuse it to reconstruct a proper 
 stackframe to throw from later.

 Second, gregs[REG_EIP] = cast(void*)sigsegv_userspace_handler;

 Note that the naked attribute was not used. If used, it can make this code 
 slightly easier.

 extern(C) void sigsegv_userspace_handler() {
// done implicitly
// asm { push ebp; }
// asm { mov ebp, esp; }
asm { mov ebx, [esp]; } // backup the pushed ebp
asm { mov [esp], eax; } // replace it with the correct return address
// which was originally left out due to the
// irregular way we entered this function (via a ret).
asm { push ebx; }   // recreate the pushed ebp
asm { mov ebp, esp; }   // complete stackframe.
// originally, our stackframe (because we entered this function via a ret)
// was [ebp]. Now, it's [return address][ebp], as is proper for cdecl.
// at this point, we can safely throw
// (or invoke any other non-handler-safe function).
throw new SignalException(SIGSEGV);
 }
 
 And is this Exception recoverable in a safe way ?
 

I'm not familiar with recovering. Note that you can _not_ safely return from 
the userspace handler, because we overwrote EAX to make space for our EIP 
backup.

You'd need to find somewhere else to stick that backup, like a TLS global 
variable or some known part of the stack.

 The ucontext_t struct is system dependent. So this is tricky.
 

Yeah, this is Linux only.

Re: toHash = pure, nothrow, const, @safe

2012-03-13 Thread Peter Alexander


On Monday, 12 March 2012 at 09:40:15 UTC, Walter Bright wrote:

On 3/12/2012 1:08 AM, Martin Nowak wrote:
What's wrong with auto-inference. Inferred attributes are only 
strengthening

guarantees.


Auto-inference is currently done for lambdas and template 
functions - why? - because the function's implementation is 
guaranteed to be visible to the compiler. For other functions, 
not so, and so the attributes must be part of the function 
signature.


Dumb question:

Why not auto-infer when the function body is available, and put 
the inferred attributes into the automatically generated .di file?


Apologies if I've missed something completely obvious.

Re: toHash = pure, nothrow, const, @safe


On 13/03/2012 12:02, Peter Alexander wrote:

On Monday, 12 March 2012 at 09:40:15 UTC, Walter Bright wrote:

On 3/12/2012 1:08 AM, Martin Nowak wrote:

What's wrong with auto-inference. Inferred attributes are only
strengthening
guarantees.


Auto-inference is currently done for lambdas and template functions -
why? - because the function's implementation is guaranteed to be
visible to the compiler. For other functions, not so, and so the
attributes must be part of the function signature.


Dumb question:

Why not auto-infer when the function body is available, and put the
inferred attributes into the automatically generated .di file?

Apologies if I've missed something completely obvious.


That is exactly what I was thinking about.

Re: toHash = pure, nothrow, const, @safe

2012-03-13 Thread Don Clugston


On 13/03/12 03:05, Walter Bright wrote:

On 3/12/2012 6:15 PM, Stewart Gordon wrote:

And what about toString?


Good question. What do you suggest?


Why can't we just kill that abomination?

Re: Multiple return values...

2012-03-13 Thread Iain Buclaw

On 13 March 2012 09:12, Manu turkey...@gmail.com wrote:
 On 13 March 2012 06:45, Andrei Alexandrescu seewebsiteforem...@erdani.org
 wrote:

 You see, at this point I have no idea what to believe anymore. You argued
 very strongly from the position of one whose life depends on efficiency.
 Here and there you'd mix some remark about syntax, and I'd like whaa?...
 but generally discounted it as distraction from the main point, which was
 that all you must do is f(g()) where the body of g() is insignificantly
 small, which makes the cost of passing arguments around absolutely
 paramount.

 And now you come with this completely opposite viewpoint in which the
 syntax is paramount and urgent, whereas codegen is like let's leave it for
 later. I really am confused.


 Okay sorry, let me clarify. My own personal stance is unchanged, but I
 appreciate your assertion of priorities and I relent :)
 This topic has meandered between 2 distinct threads, syntax and abi, and I
 feel strongly about both, so maybe my personal sense of priority comes
 across wrong as I'm discussing one topic or the other.

 Trying to see it from a practicality standpoint, there is a pull request
 there which would seem like a near-complete implementation of the syntax, so
 that's a much easier/smaller step than messing with the ABI. Also, the
 syntax element of the feature will benefit far more people, and more
 immediately.
 Note, I still find myself wanting this feature, at least syntactically,
 every other day (my motivation starting the thread initially). But for my
 purposes (simd math library currently) it wouldn't do for it to be
 inefficient. At least the promise of an efficient implementation down the
 road is needed to make use of it.

 I think I feel a sense of urgency towards the ABI aspect because it is a
 breaking change, and I suspect the longer anything like that is left, the
 less likely/more risky it becomes.
 If it gets delayed for 6-12 months, are you honestly more or less likely to
 say it's a good idea to fiddle with the ABI?

 I am sold on the Tuple approach now, so that's a big discussion that can be
 dismissed. I think it was as a result of realising this that the ABI became
 of higher importance in my mind, since I agree, workable syntax is
 technically possible already (although ugly/verbose).



What about alternative optimisations for MRV, rather than stating that
it should always be returned in registers where possible (and breaking
ABI on all target platforms).  What about, for example, using named
return value optimisation in this case to help improve the cost of
returning on non-x86 architectures.

Just throwing random thoughts out there.

-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';

Re: toHash = pure, nothrow, const, @safe

2012-03-13 Thread Martin Nowak

On Tue, 13 Mar 2012 01:40:08 +0100, Walter Bright  
newshou...@digitalmars.com wrote:



On 3/12/2012 1:56 PM, Martin Nowak wrote:

It doesn't require all source code.
It just means that without source code nothing can be inferred and the
attributes fall back to what has been annotated by hand.


Hello endless bug reports of the form:

It compiles when I send the arguments to dmd this way but not that way.  
dmd is broken. D sux.


Yeah, you're right. It would easily create confusing behavior.

Re: Arbitrary abbreviations in phobos considered ridiculous

2012-03-13 Thread Ary Manzana


On 03/13/2012 02:14 AM, H. S. Teoh wrote:

On Mon, Mar 12, 2012 at 10:35:54PM -0400, Nick Sabalausky wrote:

Jonathan M Davis <jmdavisp...@gmx.com> wrote in message
news:mailman.572.1331601463.4860.digitalmar...@puremagic.com...

[...]

All I'm saying is that if it makes sense for the web developer to
use javascript given what they're trying to do, it's completely
reasonable to expect that their users will have javascript enabled
(since virtually everyone does). If there's a better tool for the
job which is reasonably supported, then all the better. And if it's
easy to provide a workaround for the lack of JS at minimal effort,
then great. But given the fact that only a very small percentage of
your user base is going to have JS disabled, it's not unreasonable
to require it and not worry about the people who disable it if
that's what you want to do.



Personally, I disagree with the notion that non-JS versions are a
workaround.

[...]

Me too. To me, non-JS versions are the *baseline*, and JS versions are
enhancements. To treat JS versions as baseline and non-JS versions as
workaround is just so completely backwards.


While I don't agree that non-JS is the baseline (because most if not all 
browsers come with JS enabled by default, so why would you want to 
disable JavaScript?), I'm starting to understand that providing both 
non-JS and JS versions is useful.


At least so that:
 - Some users don't go mad when they can't use it, and then realise 
it's because JS is disabled

 - And for the above reason, not to lose reputation with those people :-P

But if people didn't have an option to disable JS, we wouldn't have this 
discussion. I think of it as having an option to disable CSS.


(I was going to put as an argument that my cellphone didn't have an 
option to disable JS, but it does... h... :-P)

Re: Arbitrary abbreviations in phobos considered ridiculous

2012-03-13 Thread Ary Manzana


On 03/13/2012 01:52 AM, Nick Sabalausky wrote:

Ary Manzana <a...@esperanto.org.ar> wrote in message
news:jjmhja$3a$2...@digitalmars.com...

On 03/12/2012 10:58 PM, H. S. Teoh wrote:


The problem today is that JS is the next cool thing, so everyone is
jumping on the bandwagon, and everything from a single-page personal
website to a list of links to the latest toaster oven requires JS to
work, even when it's not necessary at all. That's the silliness of it
all.


T


It's not the next cool thing. It makes things more understandable for the
user. And it makes the web transfer less content,


That gets constantly echoed throughout the web, but it's a red herring: Even
if you handle it intelligently like Adam does (ie, lightweight), the amount
of data transfer saved is trivial. We're talking *part* of *one* measly HTML
file here. And even that can be gzipped: HTML compresses *very* well. Yes,
technically it can be less transfer, but only negligibly so. And bandwidth is
the *only* possible realistic improvement here, not speed, because the speed
of even a few extra K during a transfer that was already going to happen
anyway is easily outweighed by the overhead of things like actually making a
round-trip to the server at all, plus likely querying a server-side DB, plus
interpreting JS, etc.

If, OTOH you handle it like most people do, and not like Adam does, then for
brief visits you can actually be transferring *more* data just because of all
that excess JS boilerplate people like to use. (And then there's the
start-up cost of actually parsing all that boilerplate and then executing
their initialization portions. And in many cases there's even external JS
getting loaded in, etc.)

The problem with optimization is that it's not a clear-cut thing: If you're
not looking at it holistically, optimizing one thing can either be an
effective no-op or even cause a larger de-optimization somewhere else. So
just because you've achieved the popular goal of "less data transfer" upon
your user clicking a certain link, doesn't necessarily mean you've won a net
gain, or even broken even.


True.

I always have to remember this interesting talk about saying "This is 
faster than this" without scientific proof:


http://vimeo.com/9270320

Re: How about colors and terminal graphics in std.format?

2012-03-13 Thread Christian Manning


On Tuesday, 13 March 2012 at 07:45:19 UTC, Jacob Carlborg wrote:

On 2012-03-13 02:36, Christian Manning wrote:

It would be great if an std.terminal contained general stuff 
for
manipulating/querying a terminal portably, as well as colour 
output, eg.
get terminal size, move cursor around, erase line... just 
things to help

with building UIs, progress bars, etc. that are easy to use.


I actually have a library for this written in C++, somewhere.


Any chance of a release? :)

I'd like to have a stab at porting it to D, when I have time, if 
you aren't already planning to.

Re: Multiple return values...

On 13 March 2012 13:27, Iain Buclaw ibuc...@ubuntu.com wrote:

 What about alternative optimisations for MRV, rather than stating that
 it should always be returned in registers where possible (and breaking
 ABI on all target platforms).  What about, for example, using named
 return value optimisation in this case to help improve the cost of
 returning on non-x86 architectures.

 Just throwing random thoughts out there.


What difference would that actually make? The effect is still the same,
unless perhaps you were returning directly into some output structure, that
might be a win in that case (but that's the opposite of what MRV is
actually for).
Definitely no good for slices, and it doesn't help calls, only returns.

The non-x86 platforms don't only suffer from return values, they suffer
passing TO functions as well. So currently they take the hit on both sides.
Slices are fundamental to the language feature, they need to be efficient :/

Re: Feq questions about the D language

2012-03-13 Thread Peter Alexander

On Monday, 12 March 2012 at 02:33:23 UTC, Andrei Alexandrescu 
wrote:

On 3/11/12 5:37 PM, Timon Gehr wrote:

On 03/11/2012 10:57 PM, Caligo wrote:
And just for the record, there are software projects that are 
millions
of lines of code in C/C++ and have ZERO workarounds. Also, I 
have
never encountered a bug in GCC when programming in C++, even 
when

trying out the latest C++11.


I have encountered bugs in both GCC and Clang.
Without using any C++11 features, and even though I don't use 
C++

regularly.


We at Facebook found a bunch of gcc bugs for each release we've 
used, and have known workarounds. I'd find it surprising if a 
large C++ project didn't fit the same pattern.


They do, but I think the difference here is the kind of bugs you 
find. In GCC, most of the bugs are rare edge cases (yes, I'm sure 
there are some less rare bugs too), but in DMD, there are lots of 
"this language feature simply doesn't work" bugs. Things like 
selective imports, Object const-correctness, post-blitting const 
structs etc.



At any rate, the comparison is rigged because C++ is much more 
mature and invested in.


It may be an unfair comparison, but it is an appropriate one. If 
a customer is evaluating products, he isn't going to give special 
treatment to those that are less mature. Bugs are bugs no matter 
how you justify them.

Re: Arbitrary abbreviations in phobos considered ridiculous

2012-03-13 Thread Adam D. Ruppe


On Tuesday, 13 March 2012 at 12:22:00 UTC, Ary Manzana wrote:
But if people didn't have an option to disable JS, we wouldn't 
have this discussion. I think of it as having an option to disable 
CSS.


You can disable css :P

Keeping your site working without css is a lot harder IMO
than doing the same without javascript. I often assume
display: none; will work to hide unnecessary things.

Sometimes, doing simple things with css is a bit hard too.

For example, one easy way to make a site still work without
css is to put your content at the top of the HTML page,
with  as few as possible distractions in the process.

Oh yeah, and of course, always use the proper semantic
tags, which you should do anyway. Descriptive tags
help css too!

Anyway, so, you put the navigation menus, etc., at the
bottom of the html file.


Here's the problem though: you want those menus to show
up at the top for people with css. And that is incredibly
hard to get right currently. (I think css3 will make it
easier, but IE10 is the only browser to properly support
the needed features last I checked, and IE10 has 0% market.)


But with css2, you can't float to the top, and you can't
display: table to the top.

The best you can do is position: absolute, which can get
you trapped in the document tree and is just generally
a pain in the butt - you have to break the natural flow.


I often just say gah to it and either minimize those
things so it isn't too big of a hassle anyway, or put
a display: none #content link at the top.

(The only people who go without css in my experience are
lynx users anyway, so making it easier to scroll past the
crap is the important thing.)




Now, here's one case I never think about: what about if
JS is enabled, and CSS is not? Now that would be weird.

Probably usable but just really weird.

Re: Regarding implementing a stable sort for Phobos


On 3/13/12 4:02 AM, Xinok wrote:

On Tuesday, 13 March 2012 at 06:53:30 UTC, Chad J wrote:

Hey, I'd love to see more sorting algorithms in phobos. Being stuck
with one seems kind of... wrong.


Things like this are better left to 3rd party libs. Phobos already has
two, a stable and unstable sort, which fulfill 99% of cases.


I think we need a good sort for ranges of ranges (e.g. array of string). 
Right now sort() does pretty badly on arrays of strings with large 
common prefixes.


Andrei

Re: Regarding implementing a stable sort for Phobos


On 3/13/12 1:31 AM, Xinok wrote:

I've been playing with sorting algorithms a lot in recent months, so I
want to implement a *working* stable sort for Phobos which is broken at
the moment. I have a working library and I'm still adding to it. It's
much more complex than a simple merge sort, being over 300 lines of code
at the moment.


Working is better than broken.


- It's a natural merge sort, which is faster on partially sorted lists,
and adds little overhead for mostly random lists.
- It uses O(log n log n) additional space for merging.


That's 1024 when n is 4 billion. I think you can safely approximate it 
with alloca or a fixed-size stack-allocated array.
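
For reference, the 1024 figure follows from taking logarithms base 2: 4 billion is just under 2^32, so log n is 32 and log n * log n is 32 * 32 = 1024. Below is a minimal sketch of that arithmetic, assuming the space bound is counted in elements; the names ceilLog2, scratchElements and MaxScratch are invented for illustration and are not part of Phobos or of the library under discussion.

```d
/// Smallest b such that 2^b >= n (hypothetical helper; assumes n <= 2^63).
ulong ceilLog2(ulong n) pure nothrow @safe
{
    assert(n <= (1UL << 63));
    ulong bits = 0;
    for (ulong v = 1; v < n; v <<= 1)
        ++bits;
    return bits;
}

/// Scratch elements needed under an O(log n log n) bound with constant 1.
ulong scratchElements(ulong n) pure nothrow @safe
{
    immutable b = ceilLog2(n);
    return b * b;
}

/// Upper bound that covers every length representable in 32 bits.
enum MaxScratch = 1024;
static assert(scratchElements(uint.max) <= MaxScratch);

unittest
{
    assert(scratchElements(4_000_000_000UL) == 1024);

    // A fixed-size buffer of this size can simply live on the stack.
    int[MaxScratch] scratch;
    assert(scratch.length >= scratchElements(1_000_000));
}
```

On a 64-bit target the same bound would be 64 * 64 = 4096 elements, which is still small enough for a stack buffer for most element types.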



- I wrote it to sort random-access ranges *without* slicing, but I think
the exclusion of slicing makes it slower. I'm writing a separate
implementation which uses slicing and I'll keep it if it's much faster.


Having random access implies having slicing.


- To avoid multiple allocations, the user can allocate their own
temporary memory and pass it to the sort function.


If you need different allocation strategies, I suggest you make it a 
policy (like stability is).



- I decided against using pointers. While I could use them to write
highly optimized code for arrays, pointers can't be used in safe code
and don't work very well in CTFE at the moment.


Perhaps it's best to have two distinct implementations guarded by if 
(__ctfe). The runtime implementation can be @trusted.
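
A minimal sketch of what such a split could look like, using a plain insertion sort on int[] as a stand-in for the real algorithm. The function names are made up for illustration; the only language features relied on are the __ctfe variable and the @safe/@trusted attributes.

```d
/// Public entry point: safe code may call it, and it also works in CTFE.
@safe void insertionSort(int[] a)
{
    if (__ctfe)
    {
        // Index-based version: bounds-checked and CTFE-friendly.
        foreach (i; 1 .. a.length)
        {
            int key = a[i];
            size_t j = i;
            while (j > 0 && a[j - 1] > key)
            {
                a[j] = a[j - 1];
                --j;
            }
            a[j] = key;
        }
    }
    else
    {
        insertionSortPtr(a);
    }
}

/// Pointer-based version, used at run time only. It is marked @trusted
/// because every pointer stays inside the bounds of the slice.
@trusted private void insertionSortPtr(int[] a)
{
    if (a.length < 2) return;
    int* first = a.ptr;
    int* last = a.ptr + a.length;
    for (int* p = first + 1; p < last; ++p)
    {
        int key = *p;
        int* q = p;
        while (q > first && *(q - 1) > key)
        {
            *q = *(q - 1);
            --q;
        }
        *q = key;
    }
}

unittest
{
    // Compile-time evaluation takes the __ctfe branch.
    enum atCompileTime = () { int[] a = [3, 1, 2]; insertionSort(a); return a; }();
    static assert(atCompileTime == [1, 2, 3]);

    // Run-time evaluation takes the pointer-based branch.
    int[] b = [5, 4, 9, 1];
    insertionSort(b);
    assert(b == [1, 4, 5, 9]);
}
```

The cost of this arrangement is that two implementations have to be kept in sync, so it probably only pays off if the pointer version is measurably faster.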



Is it worth keeping the implementation *without* slicing? Many functions
in Phobos do require slicing, including the unstable sort, and I think
most random-access ranges do or could support slicing.


No need.


What would you prefer in terms of memory usage vs performance? O(n/2)
space is optimal for performance, O(1) (in-place) requires zero
allocations but is slower, and O(log n log n) provides a good balance.


The latter rounded up to a constant sounds best.


Should I implement concurrency? Merge sort is very easy to parallelize,
and being in-place or natural doesn't change this fact.


Let's save that for std.parallel_algorithm.


Should we take a different path and go for a basic merge sort or even
Timsort? I've considered writing a Timsort though that seems like
daunting task to me, so here's an implementation written in C -
https://github.com/swenson/sort


I don't know how your function's performance profile compares with 
timsort's.



Andrei

Re: Arbitrary abbreviations in phobos considered ridiculous

2012-03-13 Thread Adam D. Ruppe


On Tuesday, 13 March 2012 at 05:38:44 UTC, Nick Sabalausky wrote:
OTOH, I don't like CSS drop-down menus. Maybe it's different in 
CSS3, but in CSS2 the only way to make CSS menus work is for 
them

to open  upon rollover, not click.


Yeah, the way I do it is with a hybrid approach:

menu.onclick = function() { toggleClass(this, "open"); };

.with-js menu > ul {
   display: none;
   /* or if you want it to roll, use the transition */
}

.with-js menu.open > ul {
   display: block;
}


and there you go. If you ask me, most your javascripts
should be changing classes, not actually doing the
work itself. That way, you keep the clean separation -
the class just represents the current state.

(indeed, with-js is another toggled class; put a little
script in the head that adds it to the html element.)



One downside of the css3 animations though is that
doing it without fixed height kinda sucks.

I wanted a quick foldout of something for the work
site. The simple height: 0px then open means
height: auto didn't work - you can't animate
to height: auto. Which sucks and they should
fix that, but apparently Apple's developers are
too incompetent to implement it in Safari, so
it gets left out of the standard. Or something
like that. I hate the standard, the process is
biased bullshit.


Anyway, to make it work, I ended up doing this:

/* overflow: hidden rox btw, use it a lot, you'll
thank me later */
max-height: 0px; overflow: hidden; transition: max-height 0.2s;

.open { max-height: 20em; }



Which gives the effect.. but the animation is from the
numbers given, not the actual height.

So, if you say max-height: 200em for instance, and
your content is only 20em tall, the visible animation
will complete in 0.02 seconds instead of 0.2!


Thus, I decided to pick an approximately right value,
and call the animation time good enough.

The problem is now though: what if you add an item
to the drop down? If you don't remember to adjust
max-height too, the new thing is just hidden.


Gah, the css is now dependent on the specific
content!



But, you don't always have to do this stuff
(BTW if anyone knows a better technique, let
me know!), and even so, it beats the crap
out of javascript animations.

Re: Multiple return values...


On 3/13/12 4:12 AM, Manu wrote:

I think I feel a sense of urgency towards the ABI aspect because it is a
breaking change, and I suspect the longer anything like that is left,
the less likely/more risky it becomes.
If it gets delayed for 6-12 months, are you honestly more or less likely
to say it's a good idea to fiddle with the ABI?


I think Walter could answer that.


I am sold on the Tuple approach now, so that's a big discussion that can
be dismissed.


Great!


Shall we discuss the shortcomings of his implementation? Can someone
demonstrate the details of his implementation?
 From the little examples up in the thread, it looked like you could
only declare new variables inline, but not assign out to existing ones.
I'd say this needs to be added too, and perhaps that will throw the
whole design into turmoil? ;)


I thought more about it and we should be fine with two functions (untested):

struct Skip {} // empty marker type (an enum with no members does not compile)
@property ref Skip skip() {
    static __gshared Skip result;
    return result;
}

void scatter(T, U...)(auto ref T source, ref U targets) {
    assert(source.length == targets.length);
    foreach (i, ref target; targets) {
        // copy every tuple field except those matched against skip
        static if (!is(typeof(target) == Skip)) {
            target = source[i];
        }
    }
}

void gather(T, U...)(ref T target, auto ref U sources) {
    assert(target.length == sources.length);
    foreach (i, source; sources) {
        // overwrite every tuple field except those matched against skip
        static if (!is(typeof(source) == Skip)) {
            target[i] = source;
        }
    }
}

Usage:

auto t = tuple(1, "hi", 2.3);
int a;
string b;
double c;
t.scatter(a, b, skip); // assigns a and b from tuple
b = "!";
++c;
t.gather(skip, b, c); // assigns tuple from variables b and c

Andrei

Re: toHash = pure, nothrow, const, @safe


On 3/13/12 6:02 AM, Peter Alexander wrote:

On Monday, 12 March 2012 at 09:40:15 UTC, Walter Bright wrote:

On 3/12/2012 1:08 AM, Martin Nowak wrote:

What's wrong with auto-inference. Inferred attributes are only
strengthening
guarantees.


Auto-inference is currently done for lambdas and template functions -
why? - because the function's implementation is guaranteed to be
visible to the compiler. For other functions, not so, and so the
attributes must be part of the function signature.


Dumb question:

Why not auto-infer when the function body is available, and put the
inferred attributes into the automatically generated .di file?

Apologies if I've missed something completely obvious.


Because in the general case functions call one another so there's no way 
to figure which to look at first.


Andrei

Re: Regarding implementing a stable sort for Phobos


On Tuesday, 13 March 2012 at 09:32:49 UTC, deadalnix wrote:

On 13/03/2012 10:19, Xinok wrote:
Would you mind sharing your smoothsort? I haven't implemented 
one myself

and I'd love to test it out.


It is on github : 
https://github.com/deadalnix/Dsort/blob/master/sort/smooth.d


Thanks. I found a couple cases where it performs better, but 
overall, the overhead of the algorithm seems to be too much and 
most other algorithms performed better.


While some need to be rewritten, I have a slew of algorithms if 
you want them for your project.

[video] A better way to program

2012-03-13 Thread proxy

Very interesting talk about the merits of direct feedback in any 
creative process (first half) and being a developer 
activist (second half).


It eventually gets going, and it isn't only about game 
programming: at about 18 mins in you will find the same ideas 
applied to more abstract coding and even to other engineering 
disciplines.


http://www.i-programmer.info/news/112-theory/3900-a-better-way-to-program.html

Re: Multiple return values...

On 13 March 2012 16:44, Andrei Alexandrescu
seewebsiteforem...@erdani.orgwrote:

 I thought more about it and we should be fine with two functions
 (untested):

 struct Skip {} // empty marker type (an enum with no members does not compile)
 @property ref Skip skip() {
     static __gshared Skip result;
     return result;
 }

 void scatter(T, U...)(auto ref T source, ref U targets) {
     assert(source.length == targets.length);
     foreach (i, ref target; targets) {
         static if (!is(typeof(target) == Skip)) {
             target = source[i];
         }
     }
 }

 void gather(T, U...)(ref T target, auto ref U sources) {
     assert(target.length == sources.length);
     foreach (i, source; sources) {
         static if (!is(typeof(source) == Skip)) {
             target[i] = source;
         }
     }
 }

 Usage:

 auto t = tuple(1, "hi", 2.3);
 int a;
 string b;
 double c;
 t.scatter(a, b, skip); // assigns a and b from tuple
 b = "!";
 ++c;
 t.gather(skip, b, c); // assigns tuple from variables b and c


Well, that 'works' :) .. Is that a proposal for a 'final' syntax, or
something to work with in the mean time?
I said I've come to accept the Tuple *implementation*, but I'm absolutely
not ready to accept the syntax baggage ;)
I'd really rather see something that actually looks like a language feature
in its final manifestation, something natural and convenient to read and type.

float t;
...
(myStruct.pos, t, _, int err) = intersectThings();

Or something to this effect. That's about as clear and concise as it gets
for my money.

Re: toHash = pure, nothrow, const, @safe


On 13/03/2012 15:46, Andrei Alexandrescu wrote:

On 3/13/12 6:02 AM, Peter Alexander wrote:

On Monday, 12 March 2012 at 09:40:15 UTC, Walter Bright wrote:

On 3/12/2012 1:08 AM, Martin Nowak wrote:

What's wrong with auto-inference. Inferred attributes are only
strengthening
guarantees.


Auto-inference is currently done for lambdas and template functions -
why? - because the function's implementation is guaranteed to be
visible to the compiler. For other functions, not so, and so the
attributes must be part of the function signature.


Dumb question:

Why not auto-infer when the function body is available, and put the
inferred attributes into the automatically generated .di file?

Apologies if I've missed something completely obvious.


Because in the general case functions call one another so there's no way
to figure which to look at first.

Andrei


This problem is pretty close to garbage collection. Let's use pure as an
example, but it works with the other qualifiers too.


Functions are marked pure, impure, or "possibly pure" (pure provided that
all the functions they call are pure). Then you go through all the possibly
pure functions, and if one calls an impure function, you mark it impure.
When a full pass marks no function as impure, you can mark all the
remaining possibly pure functions as pure.
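
A minimal sketch of the iteration described above, assuming a toy call graph where each function records whether its own body does something impure and which functions it calls. The names Purity, Func and inferPurity are hypothetical and not taken from the compiler; a callee whose body is not available is conservatively treated as impure.

```d
enum Purity { impure, possiblyPure, pure_ }

/// Toy description of a function (hypothetical, for illustration only).
struct Func
{
    string name;
    bool hasImpureBody; // the body itself does something impure
    string[] callees;   // names of the functions it calls
}

Purity[string] inferPurity(Func[] funcs)
{
    Purity[string] state;
    foreach (f; funcs)
        state[f.name] = f.hasImpureBody ? Purity.impure : Purity.possiblyPure;

    // Repeat full passes until one pass marks nothing as impure.
    bool changed = true;
    while (changed)
    {
        changed = false;
        foreach (f; funcs)
        {
            if (state[f.name] != Purity.possiblyPure)
                continue;
            foreach (callee; f.callees)
            {
                auto p = callee in state;
                // Unknown callee (no body visible) counts as impure.
                if (p is null || *p == Purity.impure)
                {
                    state[f.name] = Purity.impure;
                    changed = true;
                    break;
                }
            }
        }
    }

    // Whatever is still "possibly pure" can now be marked pure.
    foreach (name, ref s; state)
        if (s == Purity.possiblyPure)
            s = Purity.pure_;
    return state;
}

unittest
{
    auto funcs = [
        Func("a", false, ["b"]),
        Func("b", false, ["a"]),        // mutually recursive with a
        Func("log", true, null),
        Func("c", false, ["a", "log"]),
    ];
    auto r = inferPurity(funcs);
    assert(r["a"] == Purity.pure_ && r["b"] == Purity.pure_);
    assert(r["log"] == Purity.impure && r["c"] == Purity.impure);
}
```

Mutually recursive functions stay "possibly pure" through every pass and end up pure, which is the case where looking at one function at a time gets stuck.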

Re: toHash = pure, nothrow, const, @safe


On 13/03/2012 01:50, Walter Bright wrote:

On 3/12/2012 4:11 AM, deadalnix wrote:

For struct, we have inference,


? No we don't.



OK, my mistake. So why not dig in that direction?


so most of the time attributes will be correct.
const pure nothrow @safe are something we want, but is it something we
want to enforce?


Yes, because they are referred to by TypeInfo, and that's fairly useless
if it isn't const etc.


I always thought that TypeInfo is a poor substitute for metaprogramming
and compile-time reflection.

Re: Regarding implementing a stable sort for Phobos

On Tuesday, 13 March 2012 at 14:31:59 UTC, Andrei Alexandrescu 
wrote:

On 3/13/12 1:31 AM, Xinok wrote:
- It's a natural merge sort, which is faster on partially sorted lists,
and adds little overhead for mostly random lists.
- It uses O(log n log n) additional space for merging.


That's 1024 when n is 4 billion. I think you can safely 
approximate it with alloca or a fixed-size stack-allocated 
array.


How about stack allocated for small lists, and heap allocated for 
larger lists? e.g. Limit the stack to 1KiB and use the heap for 
anything larger.
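
For reference, the 1024 figure quoted above can be checked with a few lines. This is only a sketch: it assumes the logarithm is base 2 and that the bound counts elements, and the helper names floorLog2 and logSquared are made up for the illustration.

```d
/// Floor of log2(n) for n > 0, computed with shifts only.
uint floorLog2(ulong n)
{
    uint bits = 0;
    while (n > 1)
    {
        n >>= 1;
        ++bits;
    }
    return bits;
}

/// (log2 n)^2, the additional-space bound under discussion.
uint logSquared(ulong n)
{
    immutable l = floorLog2(n);
    return l * l;
}

unittest
{
    // 2^32 is roughly 4 billion: log2 is 32, and 32 * 32 = 1024.
    assert(logSquared(4_294_967_296UL) == 1024);
}
```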


- I wrote it to sort random-access ranges *without* slicing, but I think
the exclusion of slicing makes it slower. I'm writing a separate
implementation which uses slicing and I'll keep it if it's much faster.


Having random access implies having slicing.

- To avoid multiple allocations, the user can allocate their own
temporary memory and pass it to the sort function.


If you need different allocation strategies, I suggest you make 
it a policy (like stability is).


- I decided against using pointers. While I could use them to write
highly optimized code for arrays, pointers can't be used in safe code
and don't work very well in CTFE at the moment.


Perhaps it's best to have two distinct implementations guarded 
by if (__ctfe). The runtime implementation can be @trusted.


If the performance gain is great enough, I'll consider doing that.

Is it worth keeping the implementation *without* slicing? Many functions
in Phobos do require slicing, including the unstable sort, and I think
most random-access ranges do or could support slicing.


No need.


I'll leave it out of Phobos.

What would you prefer in terms of memory usage vs performance? O(n/2)
space is optimal for performance, O(1) (in-place) requires zero
allocations but is slower, and O(log n log n) provides a good balance.


The latter rounded up to a constant sounds best.

Should I implement concurrency? Merge sort is very easy to parallelize,
and being in-place or natural doesn't change this fact.


Let's save that for std.parallel_algorithm.


I'll leave it out of Phobos for now.

Re: Regarding implementing a stable sort for Phobos

2012-03-13 Thread Sean Kelly

How does the built-in sort do?  I ask because the sort routine I wrote works 
the same way, which is optimized for ranges with a lot of common elements. 

On Mar 13, 2012, at 7:33 AM, Andrei Alexandrescu 
seewebsiteforem...@erdani.org wrote:

 On 3/13/12 4:02 AM, Xinok wrote:
 On Tuesday, 13 March 2012 at 06:53:30 UTC, Chad J wrote:
 Hey, I'd love to see more sorting algorithms in phobos. Being stuck
 with one seems kind of... wrong.
 
 Things like this are better left to 3rd party libs. Phobos already has
 two, a stable and unstable sort, which fulfill 99% of cases.
 
 I think we need a good sort for ranges of ranges (e.g. array of string). 
 Right now sort() does pretty badly on arrays of strings with large common 
 prefixes.
 
 Andrei

Re: Regarding implementing a stable sort for Phobos

2012-03-13 Thread Sean Kelly

I forgot to mention that my routine uses the same basic algorithm as the 
built-in sort. 

On Mar 13, 2012, at 8:54 AM, Sean Kelly s...@invisibleduck.org wrote:

 How does the built-in sort do?  I ask because the sort routine I wrote works 
 the same way, which is optimized for ranges with a lot of common elements. 
 
 On Mar 13, 2012, at 7:33 AM, Andrei Alexandrescu 
 seewebsiteforem...@erdani.org wrote:
 
 On 3/13/12 4:02 AM, Xinok wrote:
 On Tuesday, 13 March 2012 at 06:53:30 UTC, Chad J wrote:
 Hey, I'd love to see more sorting algorithms in phobos. Being stuck
 with one seems kind of... wrong.
 
 Things like this are better left to 3rd party libs. Phobos already has
 two, a stable and unstable sort, which fulfill 99% of cases.
 
 I think we need a good sort for ranges of ranges (e.g. array of string). 
 Right now sort() does pretty badly on arrays of strings with large common 
 prefixes.
 
 Andrei

Re: toHash = pure, nothrow, const, @safe

2012-03-13 Thread Alex Rønne Petersen


On 13-03-2012 16:56, deadalnix wrote:

On 13/03/2012 01:50, Walter Bright wrote:

On 3/12/2012 4:11 AM, deadalnix wrote:

For struct, we have inference,


? No we don't.



Ok my mistake. So why not dig in that direction ?


so most of the time attributes will correct.
const pure nothrow @safe are something we want, but is it something we
want to
enforce ?


Yes, because they are referred to by TypeInfo, and that's fairly useless
if it isn't const etc.


I always though that TypeInfo is a poor substitute for metaprograming
and compile time reflexion.


Yes, and in some cases, it doesn't even work right; i.e. you can declare
certain opCmp and opEquals signatures that work fine for ==, <, >, etc.
but don't get emitted to the TypeInfo metadata, and vice versa. It's a mess.


--
- Alex

Re: Regarding implementing a stable sort for Phobos


On 13/03/2012 16:08, Xinok wrote:

On Tuesday, 13 March 2012 at 09:32:49 UTC, deadalnix wrote:

On 13/03/2012 10:19, Xinok wrote:

Would you mind sharing your smoothsort? I haven't implemented one myself
and I'd love to test it out.


It is on github :
https://github.com/deadalnix/Dsort/blob/master/sort/smooth.d


Thanks. I found a couple cases where it performs better, but overall,
the overhead of the algorithm seems to be too much and most other
algorithms performed better.

While some need to be rewritten, I have a slew of algorithms if you want
them for your project.


Smoothsort is intended to be used on semi-sorted data (like transparent
polygons in a 3D scene). It is ideal for keeping some data sorted.


It also has a guarantee to run in O(n*log(n)). But a qsort variation
(like we have in Phobos) is faster in the general case.

Re: How about colors and terminal graphics in std.format?

2012-03-13 Thread Jacob Carlborg


On 2012-03-13 13:31, Christian Manning wrote:

On Tuesday, 13 March 2012 at 07:45:19 UTC, Jacob Carlborg wrote:

On 2012-03-13 02:36, Christian Manning wrote:


It would be great if an std.terminal contained general stuff for
manipulating/querying a terminal portably, as well as colour output, eg.
get terminal size, move cursor around, erase line... just things to help
with building UIs, progress bars, etc. that are easy to use.


I actually have a library for this written in C++, somewhere.


Any chance of a release? :)

I'd like to have a stab at porting it to D, when I have time, if you
aren't already planning to.


I have been thinking about porting it to D from time to time. I can see 
what I can do :)


--
/Jacob Carlborg

Re: Multiple return values...


On 3/13/12 10:48 AM, Manu wrote:

float t;
...
(myStruct.pos, t, _, int err) = intersectThings();


I actually find the scatter syntax better than this. Anyway, I hope 
you'll agree there's not much difference pragmatically.


Andrei

Re: toHash = pure, nothrow, const, @safe


On 3/13/12 10:47 AM, deadalnix wrote:

This problem is pretty close to garbage collection. Let's use pure as
example, but it work with other qualifier too.

function are marked pure, impure, or pure given all function called are
pure (possibly pure). Then you go throw all possibly pure function and
if it call an impure function, they mark it impure. When you don't mark
any function as impure on a loop, you can mark all remaining possibly
pure functions as pure.


Certain analyses can be done using the so-called worklist approach. The 
analysis can be pessimistic (initially marking all functions as not 
carrying the property analyzed and gradually proving some do carry it) 
or optimistic (the other way around). The algorithm ends when the 
worklist is empty. This approach is well-studied and probably ought to get
more coverage in compiler books. I learned about it in a graduate compiler class.


However, the discussion was about availability of the body. A 
worklist-based approach would need all functions that call one another 
regardless of module. That makes the analysis interprocedural, i.e. 
difficult on large codebases.



Andrei

Re: Regarding implementing a stable sort for Phobos


On 3/13/12 10:54 AM, Sean Kelly wrote:

How does the built-in sort do?  I ask because the sort routine I
wrote works the same way, which is optimized for ranges with a lot of
common elements.


It's not about common (equal) elements, it's about elements for which 
comparisons do a lot of work because they have common prefixes. Consider:


auto arr = [ "aaa", "aab", "aac", "aad" ];
sort!((a, b) => a < b)(arr);

There will be a lot of redundant prefix comparisons because the sorting 
method doesn't have information about the common prefixes.


Trie-based sorting is a more efficient method for ranges of ranges, see 
e.g. http://en.wikipedia.org/wiki/Burstsort.



Andrei
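
To make the redundant prefix work visible, here is a small sketch that counts individual character comparisons during a sort. The helpers countingLess and sortCountingChars are hypothetical names for this illustration, not Phobos functions; only std.algorithm's sort is real.

```d
import std.algorithm : sort;

/// Lexicographic "less than" that counts every character comparison.
bool countingLess(string a, string b, ref size_t counter)
{
    size_t i = 0;
    while (i < a.length && i < b.length)
    {
        ++counter;
        if (a[i] != b[i])
            return a[i] < b[i];
        ++i;
    }
    return a.length < b.length;
}

/// Sorts arr in place and returns how many characters were compared.
size_t sortCountingChars(string[] arr)
{
    size_t counter = 0;
    sort!((a, b) => countingLess(a, b, counter))(arr);
    return counter;
}

unittest
{
    auto arr = ["aad", "aab", "aaa", "aac"];
    immutable n = sortCountingChars(arr);
    assert(arr == ["aaa", "aab", "aac", "aad"]);
    // Each comparison walks the shared "aa" prefix before deciding.
    assert(n > 0 && n % 3 == 0);
}
```

With these inputs every element comparison costs three character comparisons, two of which only rediscover the shared prefix.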

Re: toHash = pure, nothrow, const, @safe


On 13/03/2012 17:06, Andrei Alexandrescu wrote:

On 3/13/12 10:47 AM, deadalnix wrote:

This problem is pretty close to garbage collection. Let's use pure as
example, but it work with other qualifier too.

function are marked pure, impure, or pure given all function called are
pure (possibly pure). Then you go throw all possibly pure function and
if it call an impure function, they mark it impure. When you don't mark
any function as impure on a loop, you can mark all remaining possibly
pure functions as pure.


Certain analyses can be done using the so-called worklist approach. The
analysis can be pessimistic (initially marking all functions as not
carrying the property analyzed and gradually proving some do carry it)
or optimistic (the other way around). The algorithm ends when the
worklist is empty. This approach is well-studied and probably ought more
coverage in compiler books. I learned about it in a graduate compiler
class.

However, the discussion was about availability of the body. A
worklist-based approach would need all functions that call one another
regardless of module. That makes the analysis interprocedural, i.e.
difficult on large codebases.


Andrei


I would not expect the functions we are talking about here to call almost
the whole codebase. That would be scary.

Re: Regarding implementing a stable sort for Phobos


On Tuesday, 13 March 2012 at 16:04:55 UTC, deadalnix wrote:

On 13/03/2012 16:08, Xinok wrote:

On Tuesday, 13 March 2012 at 09:32:49 UTC, deadalnix wrote:

On 13/03/2012 10:19, Xinok wrote:
Would you mind sharing your smoothsort? I haven't implemented one myself
and I'd love to test it out.


It is on github :
https://github.com/deadalnix/Dsort/blob/master/sort/smooth.d


Thanks. I found a couple cases where it performs better, but overall,
the overhead of the algorithm seems to be too much and most other
algorithms performed better.


smooth sort is intended to be used on semi sorted data (like 
transparent polygons on a 3D scene). Ideal to keep some data 
sorted.


It also have a guarantee to run in O(n*log(n)). But qsort 
variation (like we have in phobos) is faster in the general 
case.


It only performs well if there aren't many elements to move 
around. For example, I took a sorted list with 1 million 
elements, and appended 64 random elements. Smoothsort was the 
second slowest, only marginally beating heap sort. My natural 
merge sort was 27x faster.
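
A sketch of how such an input could be built for anyone who wants to repeat the test: a sorted run followed by a short random tail. The function name nearlySorted and the fixed seed are assumptions for the example; the library calls (iota, array, uniform, sort, isSorted) are the usual Phobos ones. No timings are implied.

```d
import std.algorithm : sort, isSorted, SwapStrategy;
import std.array : array;
import std.random : Random, uniform;
import std.range : iota;

/// Returns 0 .. sortedCount in order, followed by randomTail random values.
/// sortedCount must be greater than zero.
uint[] nearlySorted(size_t sortedCount, size_t randomTail, uint seed)
{
    auto rng = Random(seed);
    uint[] data = iota(cast(uint) sortedCount).array;
    foreach (_; 0 .. randomTail)
        data ~= uniform(0u, cast(uint) sortedCount, rng);
    return data;
}

unittest
{
    auto data = nearlySorted(1_000, 64, 42);
    assert(data.length == 1_064);
    assert(data[0 .. 1_000].isSorted);

    // Any sort under test can be dropped in here.
    sort!("a < b", SwapStrategy.stable)(data);
    assert(data.isSorted);
}
```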

Re: Regarding implementing a stable sort for Phobos

On Tuesday, 13 March 2012 at 16:11:05 UTC, Andrei Alexandrescu 
wrote:

On 3/13/12 10:54 AM, Sean Kelly wrote:
How does the built-in sort do?  I ask because the sort routine I
wrote works the same way, which is optimized for ranges with a lot of
common elements.


It's not about common (equal) elements, it's about elements for 
which comparisons do a lot of work because they have common 
prefixes. Consider:


auto arr = [ "aaa", "aab", "aac", "aad" ];
sort!((a, b) => a < b)(arr);

There will be a lot of redundant prefix comparisons because the 
sorting method doesn't have information about the common 
prefixes.


Trie-based sorting is a more efficient method for ranges of 
ranges, see e.g. http://en.wikipedia.org/wiki/Burstsort.



Andrei


Rather than a sort function, I think we'd benefit more from Trie 
in std.container. If implemented correctly, it could be self 
sorting like RedBlackTree.
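
To illustrate the "self sorting" idea, here is a very small trie sketch. It is not a proposal for std.container; TrieNode, trieInsert and sortedStrings are hypothetical names, strings are treated as raw bytes (no Unicode collation), and the 256-slot child array trades memory for simplicity.

```d
/// One node per byte. Children are indexed by byte value, so visiting
/// the array in index order yields keys in sorted (byte-wise) order.
final class TrieNode
{
    TrieNode[256] children;
    size_t count; // how many inserted strings end at this node
}

void trieInsert(TrieNode root, string s)
{
    auto node = root;
    foreach (char c; s)
    {
        if (node.children[c] is null)
            node.children[c] = new TrieNode;
        node = node.children[c];
    }
    ++node.count;
}

/// Reads all inserted strings back in sorted order, duplicates included.
string[] sortedStrings(TrieNode root)
{
    string[] result;

    void walk(TrieNode node, string prefix)
    {
        foreach (_; 0 .. node.count)
            result ~= prefix;
        foreach (i, child; node.children)
            if (child !is null)
                walk(child, prefix ~ cast(char) i);
    }

    walk(root, "");
    return result;
}

unittest
{
    auto root = new TrieNode;
    foreach (s; ["aad", "aab", "aaa", "aac", "aab"])
        trieInsert(root, s);
    assert(sortedStrings(root) == ["aaa", "aab", "aab", "aac", "aad"]);
}
```

The shared prefix "aa" is stored and traversed once, rather than being compared again for every pair of strings.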

Re: toHash = pure, nothrow, const, @safe

On Tue, Mar 13, 2012 at 11:06:00AM -0500, Andrei Alexandrescu wrote:
 On 3/13/12 10:47 AM, deadalnix wrote:
 This problem is pretty close to garbage collection. Let's use pure as
 example, but it work with other qualifier too.
 
 function are marked pure, impure, or pure given all function called
 are pure (possibly pure). Then you go throw all possibly pure
 function and if it call an impure function, they mark it impure. When
 you don't mark any function as impure on a loop, you can mark all
 remaining possibly pure functions as pure.
 
 Certain analyses can be done using the so-called worklist approach.
 The analysis can be pessimistic (initially marking all functions as
 not carrying the property analyzed and gradually proving some do carry
 it) or optimistic (the other way around). The algorithm ends when the
 worklist is empty. This approach is well-studied and probably ought
 more coverage in compiler books. I learned about it in a graduate
 compiler class.
[...]

I have an idea.

Instead of making potentially risky changes to the compiler, or changes
with unknown long-term consequences, what about an external tool (or a
new compiler option) that performs this analysis and saves it into a
file, say in json format or something?

So we run the analysis on druntime, and it tells us exactly which
functions can be marked pure, const, whatever, then we can (1) look
through the list to see if functions that *should* be pure aren't, then
investigate why and (possibly) fix the problem; (2) annotate all
functions in druntime just by going through the list, without needing to
manually fix one function, find out it breaks 5 other functions, fix
those functions, find another 25 broken, etc..

We can also run this on phobos, clean up whatever functions aren't marked
pure, and then go through the list and annotate everything in one shot.

Now that I think of it, it seems quite silly that we should be agonizing
over the amount of manual work needed to annotate druntime and phobos,
when the compiler already has all the necessary information to automate
most of the tedious work.


T

-- 
It is not the employer who pays the wages. Employers only handle the
money. It is the customer who pays the wages. -- Henry Ford

Re: [video] A better way to program

2012-03-13 Thread Vladimir Panteleev


On Tuesday, 13 March 2012 at 15:34:17 UTC, proxy wrote:
Very interesting talk about the merits of direct feedback in 
any creative process(first half) and being a developer 
activist(second half).


It eventually gets going and it isn't only about game 
programming at about 18 mins in you will find the same ideas 
applied to more abstract coding and even to other engineering 
disciplines.


http://www.i-programmer.info/news/112-theory/3900-a-better-way-to-program.html


Could someone summarize the video?

Re: Multiple return values...

On 13 March 2012 18:07, Andrei Alexandrescu
seewebsiteforem...@erdani.org wrote:

 On 3/13/12 10:48 AM, Manu wrote:

 float t;
 ...
 (myStruct.pos, t, _, int err) = intersectThings();


 I actually find the scatter syntax better than this. Anyway, I hope you'll
 agree there's not much difference pragmatically.


There's a few finicky differences. I'm still of the understanding (and I
may be wrong, still mystified by some of D's more complicated template
syntax) that once you give the returned tuple a name, it is structurally
bound to the stack. At that point, passing any member by-ref to any
function must conservatively commit the entire tuple to the stack. This
behaviour won't be intuitive to most users, and can be easily avoided; by
obscuring the Tuple from user visibility, they can only access the returned
values through their independent output assignments, which guarantees the
independence of each returned item.

Syntactically, scatter can't declare new variables inline (?), it also uses
additional lines of code (1 + as many variables as you need to declare),
which is very disruptive to flow. Maths-y code should be un-cluttered and
read sequentially. Having to put extra lines in to munge un-related things
really ruins the code IMO ('t' is of no consequence to the user, pollutes
their namespace, gets in the way with extra lines, etc).
What people want from MRV is to capture the returned values independently.
If I *wanted* to capture the returned Tuple (the extremely rare case), I'd
rather do that explicitly, something like this:
auto t = tuple(mrvFunc());

scatter/gather is nice and simple, I'll take it in the mean time, but I
think it would be a shame for it to stop there longer term...
That said though, it's all still nothing to me without at least a promise
on the ABI :) .. And I feel that should ideally come in the form of a
language policy/promise that this feature will be 'efficient' (or at very
least, not *inefficient* as it is now), and leave it to compiler
implementations to concur with that promise, ie, failing to be 'standards'
compliant if they fail to do so.

Re: Regarding implementing a stable sort for Phobos


On 13/03/2012 17:38, Xinok wrote:

On Tuesday, 13 March 2012 at 16:04:55 UTC, deadalnix wrote:

On 13/03/2012 16:08, Xinok wrote:

On Tuesday, 13 March 2012 at 09:32:49 UTC, deadalnix wrote:

On 13/03/2012 10:19, Xinok wrote:

Would you mind sharing your smoothsort? I haven't implemented one
myself
and I'd love to test it out.


It is on github :
https://github.com/deadalnix/Dsort/blob/master/sort/smooth.d


Thanks. I found a couple cases where it performs better, but overall,
the overhead of the algorithm seems to be too much and most other
algorithms performed better.


smooth sort is intended to be used on semi sorted data (like
transparent polygons on a 3D scene). Ideal to keep some data sorted.

It also have a guarantee to run in O(n*log(n)). But qsort variation
(like we have in phobos) is faster in the general case.


It only performs well if there aren't many elements to move around. For
example, I took a sorted list with 1 million elements, and appended 64
random elements. Smoothsort was the second slowest, only marginally
beating heap sort. My natural merge sort was 27x faster.


Yes. That being said, my implementation uses multiple swaps where a move
would have been more appropriate, and doesn't implement some improvements.


Merge sort is also known to be very fast (it is the default in some
languages) but it triggers memory allocation, something that some cannot afford.


Definitely, this is something we should have in Phobos.

Re: Multiple return values...


On 3/13/12 12:02 PM, Manu wrote:

There's a few finicky differences. I'm still of the understanding (and I
may be wrong, still mystified by some of D's more complicated template
syntax) that once you give the returned tuple a name, it is structurally
bound to the stack. At that point, passing any member by-ref to any
function must conservatively commit the entire tuple to the stack. This
behaviour won't be intuitive to most users, and can be easily avoided;
by obscuring the Tuple from user visibility, they can only access the
returned values through their independent output assignments, which
guarantees the independence of each returned item.


Here we go moving the goalposts again.


Syntactically, scatter can't declare new variables inline (?), it also
uses additional lines of code (1 + as many variables as you need to
declare), which is very disruptive to flow.


This is in addition to Kenji's change.


What people want from MRV is to capture the returned
values independently. If I /wanted/ to capture the returned Tuple (the
extremely rare case), I'd rather do that explicitly, something like this:
auto t = tuple(mrvFunc());


No. Tuple stays together by default and is expanded explicitly. This is 
not negotiable.



scatter/gather is nice and simple, I'll take it in the mean time, but I
think it would be a shame for it to stop there longer term...
That said though, it's all still nothing to me without at least a
promise on the ABI :) .. And I feel that should ideally come in the form
of a language policy/promise that this feature will be 'efficient' (or
at very least, not /inefficient/ as it is now), and leave it to compiler
implementations to concur with that promise, ie, failing to be
'standards' compliant if they fail to do so.


This is not for me to promise.


Andrei

Re: toHash = pure, nothrow, const, @safe

2012-03-13 Thread Martin Nowak

On Tue, 13 Mar 2012 04:40:01 +0100, Andrei Alexandrescu  
seewebsiteforem...@erdani.org wrote:


I think the three others have a special regime because pointers to them  
must be saved for the sake of associative arrays. toString is used only  
generically,

 Andrei


Adding a special case for AAs is not a good idea but
these operators are indeed special and should have
a defined behavior.
Requiring pureness for comparison for example is good
for all kind of generic algorithms.

Re: Regarding implementing a stable sort for Phobos

2012-03-13 Thread bearophile

Andrei Alexandrescu:

 it's about elements for which 
 comparisons do a lot of work because they have common prefixes. Consider:
 
 auto arr = [ "aaa", "aab", "aac", "aad" ];
 sort!((a, b) => a < b)(arr);
 
 There will be a lot of redundant prefix comparisons because the sorting 
 method doesn't have information about the common prefixes.

As a benchmark for this new sorting algorithm I suggest using a poor man's BWT 
(http://en.wikipedia.org/wiki/Burrows%E2%80%93Wheeler_transform ). The purpose 
here is not to create the most efficient BWT.

Bye,
bearophile

Re: Breaking backwards compatiblity

2012-03-13 Thread Simen Kjærås

On Tue, 13 Mar 2012 06:45:12 +0100, H. S. Teoh hst...@quickfur.ath.cx  
wrote:



On Tue, Mar 13, 2012 at 04:10:20AM +0100, Simen Kjærås wrote:

On Tue, 13 Mar 2012 03:50:49 +0100, Nick Sabalausky a@a.a wrote:

[...]

D is great for physics programming. Now you can have much, much more
than 26 variables :)

True, though mostly, you'd just change to using Greek letters, right?


And Russian. And extended Latin. And Chinese (try exhausting that one!).
And a whole bunch of other stuff that you may not have known even
existed.


I know Unicode covers a lot more than just Greek. I didn't know the usage
of Chinese was very common among physicists, though. :p



Finally we can use θ for angles, alias ulong ℕ...


+1.

Come to think of it, I wonder if it's possible to write a large D
program using only 1-letter identifiers. After all, Unicode has enough
alphabetic characters that you could go for a long, long time before you
exhausted them all. (The CJK block will be especially resilient to
exhaustion.) :-)


63,207[1] designated characters thus far[2]. Add in module names and other
'namespaces', and I'd say that should be no problem at all. As long as
your head doesn't explode, that is.

[1] http://unicode.org/alloc/CurrentAllocation.html

[2] Yeah, not all of those are valid identifiers.

Re: [video] A better way to program

2012-03-13 Thread proxy

On Tuesday, 13 March 2012 at 16:57:48 UTC, Vladimir Panteleev 
wrote:

On Tuesday, 13 March 2012 at 15:34:17 UTC, proxy wrote:
Very interesting talk about the merits of direct feedback in 
any creative process(first half) and being a developer 
activist(second half).


It eventually gets going and it isn't only about game 
programming at about 18 mins in you will find the same ideas 
applied to more abstract coding and even to other engineering 
disciplines.


http://www.i-programmer.info/news/112-theory/3900-a-better-way-to-program.html


Could someone summarize the video?


http://developers.slashdot.org/comments.pl?sid=2719385&cid=39320247

Re: Multiple return values...

On 13 March 2012 19:25, Andrei Alexandrescu
seewebsiteforem...@erdani.org wrote:

 On 3/13/12 12:02 PM, Manu wrote:

 There's a few finicky differences. I'm still of the understanding (and I
 may be wrong, still mystified by some of D's more complicated template
 syntax) that once you give the returned tuple a name, it is structurally
 bound to the stack. At that point, passing any member by-ref to any
 function must conservatively commit the entire tuple to the stack. This
 behaviour won't be intuitive to most users, and can be easily avoided;
 by obscuring the Tuple from user visibility, they can only access the
 returned values through their independent output assignments, which
 guarantees the independence of each returned item.


 Here we go moving the goalposts again.


I don't see how? I'm just saying that I don't think they are pragmatically
identical.

Syntactically, scatter can't declare new variables inline (?), it also
 uses additional lines of code (1 + as many variables as you need to
 declare), which is very disruptive to flow.


 This is in addition to Kenji's change.


What value does it add over Kenji's change? Is this because Kenji's change
is unable to assign directly to existing variables?
My understanding from early in the thread was that Kenji's change hides the
returned tuple, and performs a convenient unpack. How can you perform a
scatter if the tuple instance is no longer visible?

What people want from MRV is to capture the returned
 values independently. If I /wanted/ to capture the returned Tuple (the

 extremely rare case), I'd rather do that explicitly, something like this:
 auto t = tuple(mrvFunc());


 No. Tuple stays together by default and is expanded explicitly. This is
 not negotiable.


Then I think you commit to polluting the common case with wordy redundant
noise. Why is it so important?
If it were expanded by default, all you need to do is put a tuple
constructor around it to wrap it up again.
It creates semantic multi-assignment problems I suspect? This is what I
reckon needs to be addressed to make the implementation really nice.

scatter/gather is nice and simple, I'll take it in the mean time, but I
 think it would be a shame for it to stop there longer term...
 That said though, it's all still nothing to me without at least a
 promise on the ABI :) .. And I feel that should ideally come in the form
 of a language policy/promise that this feature will be 'efficient' (or
 at very least, not /inefficient/ as it is now), and leave it to compiler

 implementations to concur with that promise, ie, failing to be
 'standards' compliant if they fail to do so.


 This is not for me to promise.


Sure, but it'd be good to get a weigh in on that issue from Walter, and
others, Iain?

Re: [video] A better way to program

2012-03-13 Thread Brad Anderson

On Tue, Mar 13, 2012 at 10:57 AM, Vladimir Panteleev 
vladi...@thecybershadow.net wrote:

 On Tuesday, 13 March 2012 at 15:34:17 UTC, proxy wrote:

 Very interesting talk about the merits of direct feedback in any creative
 process(first half) and being a developer activist(second half).

 It eventually gets going and it isn't only about game programming at
 about 18 mins in you will find the same ideas applied to more abstract
 coding and even to other engineering disciplines.

 http://www.i-programmer.info/news/112-theory/3900-a-better-way-to-program.html


 Could someone summarize the video?


It's basically a very slick argument for interactive coding.  His demos are
all pretty neat but you have to watch it to get the full effect.

Regards,
Brad Anderson

Re: Regarding implementing a stable sort for Phobos


On Tuesday, 13 March 2012 at 06:32:01 UTC, Xinok wrote:
I've been playing with sorting algorithms a lot in recent 
months, so I want to implement a *working* stable sort for 
Phobos which is broken at the moment. I have a working library 
and I'm still adding to it. It's much more complex than a 
simple merge sort, being over 300 lines of code at the moment.


I've implemented slicing which has improved benchmarks quite a 
bit.


Sorting a random array of 1 million uints:
Phobos Unstable Sort - 132ms
Phobos   Stable Sort - 2037ms
Proposed Stable Sort - 187ms

Sorting a random array of 1 million strings:
Phobos Unstable Sort - 1228ms
Phobos   Stable Sort - 3516ms
Proposed Stable Sort - 1178ms

It still uses O(log n log n) space. I modified the code to 
allocate up to 1KiB on the stack, and use the heap for anything 
larger. I simply marked the entry sort function as @trusted. The 
non-slicing code is still in the lib but disabled. I've yet to 
add contracts, documentation, and a unittest.
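
For readers following along, the stack-or-heap choice could look roughly like the sketch below. This is not the code from the library linked below; withScratch is a hypothetical helper, and the 1 KiB limit is taken from the description above.

```d
/// Calls fn with a scratch buffer of n elements. Buffers of up to 1 KiB
/// live in a fixed-size stack array; larger ones come from the GC heap.
/// fn must not keep a reference to buf after it returns.
void withScratch(T)(size_t n, scope void delegate(T[] buf) fn)
{
    enum size_t stackElems = 1024 / T.sizeof;
    T[stackElems] small = void; // left uninitialized: it is only scratch space
    T[] buf = n <= stackElems ? small[0 .. n] : new T[n];
    fn(buf);
}

unittest
{
    size_t seen;
    withScratch!uint(16, (uint[] buf) { seen = buf.length; });      // stack
    assert(seen == 16);
    withScratch!uint(100_000, (uint[] buf) { seen = buf.length; }); // heap
    assert(seen == 100_000);
}
```

Handing the buffer to a callback keeps the stack array alive for exactly as long as it is used, which is the part that needs to be trusted rather than checked.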


I won't be adding optimized code for arrays utilizing pointers as 
I expect the performance gain to be as little as 10%.


You can download the preliminary lib here:
http://www.mediafire.com/?ux49x30dj483dqg

Re: Regarding implementing a stable sort for Phobos

2012-03-13 Thread Jesse Phillips

On Tuesday, 13 March 2012 at 14:31:59 UTC, Andrei Alexandrescu 
wrote:

On 3/13/12 1:31 AM, Xinok wrote:
- I wrote it to sort random-access ranges *without* slicing, but I think
the exclusion of slicing makes it slower. I'm writing a separate
implementation which uses slicing and I'll keep it if it's much faster.


Having random access implies having slicing.


Currently it can not be assumed that isRandomAccessRange has 
slicing:


http://dlang.org/phobos/std_range.html#isRandomAccessRange

Maybe it should be a requirement?

It seems to me that Bidirectional ranges can't be infinite, and 
by extension Random Access ranges too. But slicing could be 
supported on an infinite range. So hasSlicing is still useful, 
but I think could be a good requirement on RA ranges.
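
A small sketch that shows the gap: a range type that satisfies isRandomAccessRange but not hasSlicing. The struct NoSlice is made up for the illustration; the two traits are the Phobos ones from std.range.

```d
import std.range : isRandomAccessRange, hasSlicing;

/// Hypothetical range over an int[] that offers indexing but no opSlice.
struct NoSlice
{
    int[] data;

    @property bool empty() const { return data.length == 0; }
    @property int front() const { return data[0]; }
    @property int back() const { return data[$ - 1]; }
    void popFront() { data = data[1 .. $]; }
    void popBack() { data = data[0 .. $ - 1]; }
    @property NoSlice save() { return this; }
    @property size_t length() const { return data.length; }
    int opIndex(size_t i) const { return data[i]; }
}

static assert(isRandomAccessRange!NoSlice);
static assert(!hasSlicing!NoSlice);
static assert(hasSlicing!(int[]));
```

An algorithm that slices its input therefore has to ask for hasSlicing explicitly in its template constraint.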

Re: [video] A better way to program

2012-03-13 Thread Brad Anderson

On Tue, Mar 13, 2012 at 12:34 PM, Brad Anderson e...@gnuk.net wrote:

On Tue, Mar 13, 2012 at 10:57 AM, Vladimir Panteleev
vladi...@thecybershadow.net wrote:

On Tuesday, 13 March 2012 at 15:34:17 UTC, proxy wrote:

Very interesting talk about the merits of direct feedback in any
creative process(first half) and being a developer activist(second half).

It eventually gets going and it isn't only about game programming at
about 18 mins in you will find the same ideas applied to more abstract
coding and even to other engineering disciplines.

http://www.i-programmer.info/news/112-theory/3900-a-better-way-to-program.html

Could someone summarize the video?

It's basically a very slick argument for interactive coding. His demos
are all pretty neat but you have to watch it to get the full effect.

Regards,
Brad Anderson

A word of warning though. He goes on, and on, and on about the philosophy
of design in a rather self important way that I don't think our profession
of software developers deserves (over other makers of tools). Can you
imagine the manufacturer of a hammer spending hours talking about the
philosophy of tool design and the unique connection between the hammer
and the hammerer that some change to the tool fosters? Instead they just
list concrete improvements; more durable, better striking head, lighter
weight, etc.

Apple has really done a number on the minds and egos of young designers
(for better or worse).

Regards,
Brad Anderson

Re: Multiple return values...

2012-03-13 Thread Jose Armando Garcia

On Tue, Mar 13, 2012 at 9:07 AM, Andrei Alexandrescu
seewebsiteforem...@erdani.org wrote:
 On 3/13/12 10:48 AM, Manu wrote:

 float t;
 ...
 (myStruct.pos, t, _, int err) = intersectThings();


This can be checked at compile time. The D compiler can check that the
number of arguments and the types match.


 I actually find the scatter syntax better than this. Anyway, I hope you'll
 agree there's not much difference pragmatically.


Correct me if I am wrong but the scatter and gather functions cannot
check that the number of arguments and their type match at compile
time.

 Andrei

Re: Arbitrary abbreviations in phobos considered ridiculous

Ary Manzana a...@esperanto.org.ar wrote in message 
news:jjne58$1ouf$1...@digitalmars.com...
 On 03/13/2012 02:14 AM, H. S. Teoh wrote:
 On Mon, Mar 12, 2012 at 10:35:54PM -0400, Nick Sabalausky wrote:
 Jonathan M Davisjmdavisp...@gmx.com  wrote in message
 news:mailman.572.1331601463.4860.digitalmar...@puremagic.com...
 [...]
 All I'm saying is that if it makes sense for the web developer to
 use javascript given what they're trying to do, it's completely
 reasonable to expect that their users will have javascript enabled
 (since virtually everyone does). If there's a better tool for the
 job which is reasonably supported, then all the better. And if it's
 easy to provide a workaround for the lack of JS at minimal effort,
 then great. But given the fact that only a very small percentage of
 your user base is going to have JS disabled, it's not unreasonable
 to require it and not worry about the people who disable it if
 that's what you want to do.


 Personally, I disagree with the notion that non-JS versions are a
 workaround.
 [...]

 Me too. To me, non-JS versions are the *baseline*, and JS versions are
 enchancements. To treat JS versions as baseline and non-JS versions as
 workaround is just so completely backwards.

 While I don't agree that non-JS is the baseline (because most if not all 
 browsers come with JS enabled by default, so why would you want to disable 
 javascript for?), I'm starting to understand that providing both non-JS 
 and JS versions is useful.

 At least so that:
  - Some users don't go mad when they can't use it, and then realise it's 
 because JS is disabled
  - And for the above reason, not to loose reputation to those people :-P

 But if people didn't have an option to disable JS, we wouldn't have this 
 discussion.[...]


Bullcrap. If people didn't have an option to disable JS, there'd be a lot 
more people using *very* *VERY* old browsers, and that would piss off 
*cough*modern*cough* webdevs even more.

The problem isn't that JS *can* be disabled. Some people *just don't want 
it*:

When they disable JS, yea, ok, on *some* sites they get a *slightly worse* 
user experience with, say, posting a comment. But it *also* gives them a 
*far BETTER* user experience on all those sites that misuse and overuse JS. 
It also increases security. The idea that JS-enabled pages are just simply 
better is patently false: Yes, *some* are *slightly* better, but many are 
*much* worse (no matter how good their respective developers believe 
themselves to be. *Everyone* believes "Oh, well, when *I* use it, it works 
very well." I'm sure the Reddit developers have fooled themselves into 
thinking their site is reasonably fast).

Negative integer modulo/division

2012-03-13 Thread bearophile

When I translate Python code to D I sometimes need in D the different integer 
division and the different modulo operation of Python3. They give different 
results when the operands are negative:

Python2 code:

for x in xrange(-10, 1):
    print x, "", x % 3, "", x // 3


Python output:

-10  2  -4
-9  0  -3
-8  1  -3
-7  2  -3
-6  0  -2
-5  1  -2
-4  2  -2
-3  0  -1
-2  1  -1
-1  2  -1
0  0  0



D code:

import std.stdio;
void main() {
    foreach (x; -10 .. 1)
        writeln(x, "  ", x % 3, "  ", x / 3);
}


D output:

-10  -1  -3
-9  0  -3
-8  -2  -2
-7  -1  -2
-6  0  -2
-5  -2  -1
-4  -1  -1
-3  0  -1
-2  -2  0
-1  -1  0
0  0  0



For the modulus I sometimes use:
((x % y) + y) % y

So I suggest to add both simple functions to Phobos, possibly as efficient 
compiler intrinsics (this means inlined asm).


It seems Ada and CommonLisp have functions for both needs:
http://en.wikipedia.org/wiki/Modulo_operation

I have also seen this Wikipedia page doesn't tell (it has a ?) about what % 
does on floating point values.

Bye,
bearophile
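
The two helpers asked for above could be sketched roughly as follows. This is only an illustration: floorMod and floorDiv are hypothetical names, not existing Phobos functions, and the code assumes y is nonzero and that (x % y) + y does not overflow.

```d
import std.stdio;

/// Floored modulo: the result takes the sign of the divisor, as Python's % does.
/// Uses the ((x % y) + y) % y formula quoted in the post.
int floorMod(int x, int y)
{
    return ((x % y) + y) % y;
}

/// Floored division: rounds toward negative infinity, as Python's // does.
int floorDiv(int x, int y)
{
    int q = x / y; // D truncates toward zero
    // Step down by one when the division was inexact and the signs differ.
    if (x % y != 0 && (x < 0) != (y < 0))
        --q;
    return q;
}

void main()
{
    // Prints the same table as the Python loop above.
    foreach (x; -10 .. 1)
        writeln(x, "  ", floorMod(x, 3), "  ", floorDiv(x, 3));
}
```

A plain function like this costs an extra modulo and a branch compared with a compiler intrinsic, which is presumably why the post suggests intrinsics.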

Re: Arbitrary abbreviations in phobos considered ridiculous

Ary Manzana a...@esperanto.org.ar wrote in message 
news:jjne58$1ouf$1...@digitalmars.com...

 But if people didn't have an option to disable JS, we wouldn't have this 
 discussion. I think it as having an option to disable CSS.


That's not even an accurate comparison anyway. Disabling CSS never does much 
to improve things, and usually it'll just make things *far* worse. People 
don't fuck up CSS nearly to the extent that they fuck up JS. Hell, CSS 
*can't* be fucked up as badly as JS can be. The "no JS == no CSS" comparison 
is like saying "Disabling JS is like disabling vowels." No, no it isn't like 
that. Not remotely.

Re: Arbitrary abbreviations in phobos considered ridiculous

Nick Sabalausky a@a.a wrote in message 
news:jjo65v$305$1...@digitalmars.com...
 Ary Manzana a...@esperanto.org.ar wrote in message 
 news:jjne58$1ouf$1...@digitalmars.com...
 On 03/13/2012 02:14 AM, H. S. Teoh wrote:
 On Mon, Mar 12, 2012 at 10:35:54PM -0400, Nick Sabalausky wrote:
 Jonathan M Davisjmdavisp...@gmx.com  wrote in message
 news:mailman.572.1331601463.4860.digitalmar...@puremagic.com...
 [...]
 All I'm saying is that if it makes sense for the web developer to
 use javascript given what they're trying to do, it's completely
 reasonable to expect that their users will have javascript enabled
 (since virtually everyone does). If there's a better tool for the
 job which is reasonably supported, then all the better. And if it's
 easy to provide a workaround for the lack of JS at minimal effort,
 then great. But given the fact that only a very small percentage of
 your user base is going to have JS disabled, it's not unreasonable
 to require it and not worry about the people who disable it if
 that's what you want to do.


 Personally, I disagree with the notion that non-JS versions are a
 workaround.
 [...]

 Me too. To me, non-JS versions are the *baseline*, and JS versions are
 enchancements. To treat JS versions as baseline and non-JS versions as
 workaround is just so completely backwards.

 While I don't agree that non-JS is the baseline (because most if not all 
 browsers come with JS enabled by default, so why would you want to 
 disable javascript for?), I'm starting to understand that providing both 
 non-JS and JS versions is useful.

 At least so that:
  - Some users don't go mad when they can't use it, and then realise it's 
 because JS is disabled
  - And for the above reason, not to loose reputation to those people :-P

 But if people didn't have an option to disable JS, we wouldn't have this 
 discussion.[...]


 Bullcrap. If people didn't have an option to disable JS, there'd be a lot 
 more people using *very* *VERY* old browsers, and that would piss off 
 *cough*modern*cough* webdevs even more.

 The problem isn't that JS *can* be disabled. Some people *just don't want 
 it*:

 When they disable JS, yea, ok, on *some* sites they get a *slightly worse* 
 user experience with, say, posting a comment. But it *also* gives them a 
 *far BETTER* user experience on all those sites that misuse and overuse 
 JS. It also increases security.

Oh, and with JS disabled, it's impossible for sites *cough*GitHub*cough* to 
break the back button.

The idea that JS-enabled pages are just simply better is patently false: 
Yes, *some* are *slightly* better, but many are *much* worse (no matter how 
good their respective developers believe themselves to be. *Everyone* 
believes "Oh, well, when *I* use it, it works very well." I'm sure the 
Reddit developers have fooled themselves into thinking their site is 
reasonably fast).

[draft] New std.regex walkthrough

For a couple of releases we have a new revamped std.regex, that as far 
as I'm concerned works nicely, thanks to my GSOC commitment last summer. 
Yet there was certain dark trend around std.regex/std.regexp as both had 
severe bugs, missing documentation and what not, enough to consider them 
unusable or dismiss prematurely.


It's about time to break this gloomy aura, and show that std.regex is 
actually easy to use, that it does the thing and has some nice extras.


Link: http://blackwhale.github.com/regular-expression.html

Comments are welcome from experts and newbies alike, in fact it should 
encourage people to try out a few tricks ;)


This is intended as replacement for an article on dlang.org
about outdated (and soon to disappear) std.regexp:
http://dlang.org/regular-expression.html

[Spoiler] one example relies on a parser bug being fixed (blush):
https://github.com/D-Programming-Language/phobos/pull/481
Well, it was a specific lookahead inside lookaround so that's not severe 
bug ;)


P.S. I've been following through a bunch of new bug reports recently, 
thanks to everyone involved :)



--
Dmitry Olshansky

Re: Arbitrary abbreviations in phobos considered ridiculous

Nick Sabalausky a@a.a wrote in message 
news:jjmmh3$9jb$1...@digitalmars.com...
 Adam D. Ruppe destructiona...@gmail.com wrote in message 
 news:oxkxtvkuybdommyer...@forum.dlang.org...
 On Tuesday, 13 March 2012 at 04:24:45 UTC, Nick Sabalausky wrote:
 2. On the web, animation means JS.

 css3 does animations that are pretty easy to use,
 degrade well, and tend to be fast. Moreover css
 is where it belongs anyway - it is pure presentation.


 Interesting, I had no idea! Thanks for the tip :)

 Far, far superior to the JS crap.


 Yea, there's a lot of things that are much better done in CSS that a lot 
 of people don't even know about. For example, most rollovers are easily 
 doable in pure CSS. But there's a lot stuff out there (paricularly things 
 created in Adobe's software) that use JS for rollovers, which doesn't 
 even work as well (even with JS on).


Another thing is Flash. Almost *everyone* uses JS to embed flash. But *it's 
not needed*! I embed Flash with pure HTML and it works perfectly fine. Don't 
even need any server-side code! (You probably need JS to tell the user when 
they don't have Flash or, in some cases, when they don't have a new enough 
version, and suggest a download link. But including those features with JS 
still does *nothing* to prevent you from making the applet run without JS. 
And...It's not even a fallback! It's just embedding with method A instead of 
method B. And method A is, frankly, dead-simple.)

Re: Multiple return values...


On 3/13/12 2:07 PM, Jose Armando Garcia wrote:

On Tue, Mar 13, 2012 at 9:07 AM, Andrei Alexandrescu
seewebsiteforem...@erdani.org  wrote:

On 3/13/12 10:48 AM, Manu wrote:


float t;
...
(myStruct.pos, t, _, int err) = intersectThings();




This can be checked at compile time. The D compiler can check that the
number of arguments and the types match.


scatter() can also be compile-time checked. I left that to a runtime 
assert for more flexibility, but probably more checking is better 
particularly because skip allows skipping some values.



I actually find the scatter syntax better than this. Anyway, I hope you'll
agree there's not much difference pragmatically.



Correct me if I am wrong but the scatter and gather functions cannot
check that the number of arguments and their type match at compile
time.


Just replace the two assert()s with static assert or a template constraint.


Andrei
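
As a rough illustration of the compile-time checking discussed here, the sketch below moves the arity check into a template constraint and the type check into a static assert. It is not the scatter posted in this thread: the function body, the stand-in intersectThings and its return values are all assumptions made for the example, and the skip feature is left out.

```d
import std.typecons : isTuple, tuple;

// Hypothetical scatter: copies each tuple field into the matching variable.
void scatter(T, Args...)(T source, ref Args targets)
    if (isTuple!T && T.Types.length == Args.length) // arity checked at compile time
{
    foreach (i, Target; Args)
    {
        // Type compatibility is also checked at compile time.
        static assert(is(T.Types[i] : Target),
            "scatter: tuple field is not convertible to its target variable");
        targets[i] = source[i];
    }
}

// Stand-in for the intersectThings() mentioned in the thread.
auto intersectThings()
{
    return tuple(1.5f, 42);
}

void main()
{
    float t;
    int err;
    intersectThings().scatter(t, err);
    assert(t == 1.5f && err == 42);

    // One target for a two-field tuple is rejected by the constraint.
    static assert(!__traits(compiles, intersectThings().scatter(t)));
}
```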

Re: [draft] New std.regex walkthrough


On 3/13/12 2:27 PM, Dmitry Olshansky wrote:

For a couple of releases we have a new revamped std.regex, that as far
as I'm concerned works nicely, thanks to my GSOC commitment last summer.
Yet there was certain dark trend around std.regex/std.regexp as both had
severe bugs, missing documentation and what not, enough to consider them
unusable or dismiss prematurely.

It's about time to break this gloomy aura, and show that std.regex is
actually easy to use, that it does the thing and has some nice extras.

Link: http://blackwhale.github.com/regular-expression.html


Reddited: 
http://www.reddit.com/r/programming/comments/quyy1/walk_through_regexen_in_the_d_programming/



Andrei

Re: Multiple return values...


On 3/13/12 1:20 PM, Manu wrote:

What value does it add over Kenji's change? Is this because Kenji's
change is unable to perform direct to existing variables?


Yes.


My understanding from early in the thread was that Kenji's change hides
the returned tuple, and performs a convenient unpack. How can you
perform a scatter if the tuple instance is no longer visible?


If I understand you correctly, you just say fun().scatter(v1, v2, v3).


Andrei

Re: Arbitrary abbreviations in phobos considered ridiculous

2012-03-13 Thread David Gileadi


On 3/13/12 12:28 PM, Nick Sabalausky wrote:

Another thing is Flash. Almost *everyone* uses JS to embed flash. But *it's
not needed*! I embed Flash with pure HTML and it works perfectly fine. Don't
even need any server-side code!


I thought that using JS to load Flash was to avoid Eolas lawsuits. 
http://en.wikipedia.org/wiki/Eolas, Workarounds section.

Re: [draft] New std.regex walkthrough

Dmitry Olshansky dmitry.o...@gmail.com wrote in message 
news:jjo73v$4gv$1...@digitalmars.com...
 For a couple of releases we have a new revamped std.regex, that as far as 
 I'm concerned works nicely, thanks to my GSOC commitment last summer. Yet 
 there was certain dark trend around std.regex/std.regexp as both had 
 severe bugs, missing documentation and what not, enough to consider them 
 unusable or dismiss prematurely.

 It's about time to break this gloomy aura, and show that std.regex is 
 actually easy to use, that it does the thing and has some nice extras.

 Link: http://blackwhale.github.com/regular-expression.html

 Comments are welcome from experts and newbies alike, in fact it should 
 encourage people to try out a few tricks ;)

 This is intended as replacement for an article on dlang.org
 about outdated (and soon to disappear) std.regexp:
 http://dlang.org/regular-expression.html

 [Spoiler] one example relies on a parser bug being fixed (blush):
 https://github.com/D-Programming-Language/phobos/pull/481
 Well, it was a specific lookahead inside lookaround so that's not severe 
 bug ;)

 P.S. I've been following through a bunch of new bug reports recently, 
 thanks to everyone involved :)


Looks nice at an initial glance through. Few things I'll point out though:

- The bullet-list immediately after the text "Now, come to think of it, this 
tiny sample showed a lot of useful things already:" looks like it's 
outdented instead of indented. Just kinda looks a little odd.

- Speaking of the same line, I'd omit the "Now, come to think of it" part. 
It sounds too stream-of-consciousness and not very "professional article".

- I'm very much in favor of using backticked strings for regexes instead of 
r"", because with the latter, you can't include double-quotes, which I'd 
think would be a much more common need in a regex than a backtick. Although 
I understand that backticks aren't easy to make on some keyboards. (In the 
US layout I have, it's just an unshifted tilde, ie, the key just to the left 
of "1". I guess some people don't have a backtick key though?)

Re: Arbitrary abbreviations in phobos considered ridiculous

David Gileadi gilea...@nspmgmail.com wrote in message 
news:jjo7vn$648$1...@digitalmars.com...
 On 3/13/12 12:28 PM, Nick Sabalausky wrote:
 Another thing is Flash. Almost *everyone* uses JS to embed flash. But 
 *it's
 not needed*! I embed Flash with pure HTML and it works perfectly fine. 
 Don't
 even need any server-side code!

 I thought that using JS to load Flash was to avoid Eolas lawsuits. 
 http://en.wikipedia.org/wiki/Eolas, Workarounds section.

Ugh, working for the USPTO should be a capital offense. So should submitting 
an application for a software patent.

Re: [video] A better way to program

2012-03-13 Thread sclytrack


On 03/13/2012 05:57 PM, Vladimir Panteleev wrote:

On Tuesday, 13 March 2012 at 15:34:17 UTC, proxy wrote:

Very interesting talk about the merits of direct feedback in any
creative process(first half) and being a developer activist(second half).

It eventually gets going and it isn't only about game programming at
about 18 mins in you will find the same ideas applied to more abstract
coding and even to other engineering disciplines.

http://www.i-programmer.info/news/112-theory/3900-a-better-way-to-program.html



Could someone summarize the video?




You know how, when you use a RAD tool, you edit the properties on the 
property tab and immediately see the result in the form?


Well, here the editing happens not in a property tab but directly in the 
code itself. I know it's weird. For a colour, say, you get a drop-down list 
in the code, or a slider pops up in the code to modify a value. 
Totally weird.

Re: Arbitrary abbreviations in phobos considered ridiculous

On Tue, Mar 13, 2012 at 12:42:47PM -0700, David Gileadi wrote:
 On 3/13/12 12:28 PM, Nick Sabalausky wrote:
 Another thing is Flash. Almost *everyone* uses JS to embed flash. But
 *it's not needed*! I embed Flash with pure HTML and it works
 perfectly fine. Don't even need any server-side code!
 
 I thought that using JS to load Flash was to avoid Eolas lawsuits.
 http://en.wikipedia.org/wiki/Eolas, Workarounds section.

Ugh. Yet another reason to hate Flash.

How I long for the day when Flash will die a long overdue horrible
death. I would s celebrate!!


T

-- 
The diminished 7th chord is the most flexible and fear-instilling chord.
Use it often, use it unsparingly, to subdue your listeners into
submission!

Re: [video] A better way to program

proxy p...@xy.com wrote in message 
news:heezhlrlpjogvinob...@forum.dlang.org...
 On Tuesday, 13 March 2012 at 16:57:48 UTC, Vladimir Panteleev wrote:
 On Tuesday, 13 March 2012 at 15:34:17 UTC, proxy wrote:
 Very interesting talk about the merits of direct feedback in any 
 creative process(first half) and being a developer activist(second 
 half).

 It eventually gets going and it isn't only about game programming at 
 about 18 mins in you will find the same ideas applied to more abstract 
 coding and even to other engineering disciplines.

 http://www.i-programmer.info/news/112-theory/3900-a-better-way-to-program.html

 Could someone summarize the video?

 http://developers.slashdot.org/comments.pl?sid=2719385&cid=39320247

Thanks for the link.

Reading it, I'm reminded of the cvars system id was using *ages* ago. Not 
exactly the same thing, but doesn't have the downsides of the suggested 
approach.

Re: Multiple return values...

On 13 March 2012 21:40, Andrei Alexandrescu
seewebsiteforem...@erdani.org wrote:

 On 3/13/12 1:20 PM, Manu wrote:

 What value does it add over Kenji's change? Is this because Kenji's
 change is unable to perform direct to existing variables?


 Yes.


  My understanding from early in the thread was that Kenji's change hides
 the returned tuple, and performs a convenient unpack. How can you
 perform a scatter if the tuple instance is no longer visible?


 If I understand you correctly, you just say fun().scatter(v1, v2, v3).


Ah okay, I see.
And you think that's more readable and intuitive than: (v1, v2, v3) =
fun(); ?

Re: [draft] New std.regex walkthrough


On 13.03.2012 23:42, Nick Sabalausky wrote:

Dmitry Olshanskydmitry.o...@gmail.com  wrote in message
news:jjo73v$4gv$1...@digitalmars.com...

For a couple of releases we have a new revamped std.regex, that as far as
I'm concerned works nicely, thanks to my GSOC commitment last summer. Yet
there was certain dark trend around std.regex/std.regexp as both had
severe bugs, missing documentation and what not, enough to consider them
unusable or dismiss prematurely.

It's about time to break this gloomy aura, and show that std.regex is
actually easy to use, that it does the thing and has some nice extras.

Link: http://blackwhale.github.com/regular-expression.html

Comments are welcome from experts and newbies alike, in fact it should
encourage people to try out a few tricks ;)

This is intended as replacement for an article on dlang.org
about outdated (and soon to disappear) std.regexp:
http://dlang.org/regular-expression.html

[Spoiler] one example relies on a parser bug being fixed (blush):
https://github.com/D-Programming-Language/phobos/pull/481
Well, it was a specific lookahead inside lookaround so that's not severe
bug ;)

P.S. I've been following through a bunch of new bug reports recently,
thanks to everyone involved :)



Looks nice at an initial glance through. Few things I'll point out though:

- The bullet-list immediately after the text "Now, come to think of it, this
tiny sample showed a lot of useful things already:" looks like it's
outdented instead of indented. Just kinda looks a little odd.

- Speaking of the same line, I'd omit the "Now, come to think of it" part.
It sounds too stream-of-consciousness and not very "professional article".


Thanks, these are kind of things I intend to fix/improve/etc.
Hence the [draft] prefix.



- I'm very much in favor of using backticked strings for regexes instead of
r"", because with the latter, you can't include double-quotes, which I'd
think would be a much more common need in a regex than a backtick. Although
I understand that backticks aren't easy to make on some keyboards. (In the
US layout I have, it's just an unshifted tilde, ie, the key just to the left
of "1". I guess some people don't have a backtick key though?)



Same here, but I recall there is a movement (was it?) against backticked 
strings, including some of DPL's highly ranked members ;)
So I thought that maybe it's best to not impose my (perverted?) style on 
readers.


--
Dmitry Olshansky

Re: [draft] New std.regex walkthrough

2012-03-13 Thread Jesse Phillips


On Tuesday, 13 March 2012 at 19:27:59 UTC, Dmitry Olshansky wrote:
For a couple of releases we have a new revamped std.regex, that 
as far as I'm concerned works nicely, thanks to my GSOC 
commitment last summer. Yet there was certain dark trend around 
std.regex/std.regexp as both had severe bugs, missing 
documentation and what not, enough to consider them unusable or 
dismiss prematurely.


Thank you for the work Dmitry, I look forward to reading this and 
ultimately have been happy with the changes.


D has been getting a great number of face lifts on its many faces.

Re: [draft] New std.regex walkthrough

2012-03-13 Thread bearophile

Dmitry Olshansky:

 It's about time to break this gloomy aura, and show that std.regex is 
 actually easy to use, that it does the thing and has some nice extras.

This seems a good moment to ask people regarding this small problem, that we 
have already discussed a little in Bugzilla (there is a significant need to 
show here some Bugzilla discussions):

http://d.puremagic.com/issues/show_bug.cgi?id=7260

The problem is easy to show:

import std.stdio: write, writeln;
import std.regex: regex, match;

void main() {
    string text = "abc312de";

    foreach (c; text.match("1|2|3|4"))
        write(c, " ");
    writeln();

    foreach (c; text.match(regex("1|2|3|4", "g")))
        write(c, " ");
    writeln();
}


It outputs:

[3] 
[3] [1] [2]

In my code I have seen that usually the "g" option (that means "repeat over the
whole input") is what I want. So what do you think about making "g" the default?

This request is not as arbitrary as it looks, if you compare to the older API. 
See Bug 7260 for more info.

Bye,
bearophile

Re: [draft] New std.regex walkthrough


On 14.03.2012 0:05, bearophile wrote:

Dmitry Olshansky:


It's about time to break this gloomy aura, and show that std.regex is
actually easy to use, that it does the thing and has some nice extras.


This seems a good moment to ask people regarding this small problem, that we 
have already discussed a little in Bugzilla (there is a significant need to 
show here some Bugzilla discussions):

http://d.puremagic.com/issues/show_bug.cgi?id=7260



Yeah, it's the prime thing that I regret when thinking of the current API.


The problem is easy to show:

import std.stdio: write, writeln;
import std.regex: regex, match;

void main() {
    string text = "abc312de";

    foreach (c; text.match("1|2|3|4"))
        write(c, " ");
    writeln();

    foreach (c; text.match(regex("1|2|3|4", "g")))
        write(c, " ");
    writeln();
}


It outputs:

[3]
[3] [1] [2]

In my code I have seen that usually the "g" option (that means "repeat over the
whole input") is what I want. So what do you think about making "g" the default?


I like the general idea of making foreach over match work intuitively.
Yet I'm not convinced we should add an extra flag meaning "non-global".

I'd propose to yank the "g" flag entirely and assume all regexes are global, 
but that breaks code in a lot of subtle ways. Problems with making the global 
flag the default:

1. Generic stuff:
assert(equal(match(...), someOtherRange)); //normal regex silently 
becomes global, quite unexpectedly


2. replace would then have to become 2 funcs - replaceFirst, replaceAll - or 
we are back to the problem of an extra flag.


I'm thinking there is a path through opApply to allow foreach iteration 
over a non-global regex as if it had the global flag, without it getting the 
full range interface. It's hackish but so far it's as good as it gets.
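
To illustrate the second problem, here is a small sketch of how the pattern's "g" flag changes what replace does, assuming std.regex's replace consults the pattern's flags as described in this thread. The input string is borrowed from the example above; the "[0-9]" pattern and the "#" replacement are made up for illustration.

```d
import std.regex : regex, replace;
import std.stdio : writeln;

void main()
{
    string text = "abc312de";

    // Without "g": only the first match is replaced.
    auto first = replace(text, regex("[0-9]"), "#");

    // With "g": every match in the input is replaced.
    auto all = replace(text, regex("[0-9]", "g"), "#");

    writeln(first);
    writeln(all);
}
```

If every regex were global by default, the first call would silently start behaving like the second, which is why a replaceFirst/replaceAll split (or some other explicit choice) would be needed.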



This request is not as arbitrary as it looks, if you compare to the older API. 
See Bug 7260 for more info.





--
Dmitry Olshansky

Re: [draft] New std.regex walkthrough

2012-03-13 Thread Brad Anderson

On Tue, Mar 13, 2012 at 1:27 PM, Dmitry Olshansky dmitry.o...@gmail.com wrote:

For a couple of releases we have a new revamped std.regex, that as far as
I'm concerned works nicely, thanks to my GSOC commitment last summer. Yet
there was certain dark trend around std.regex/std.regexp as both had severe
bugs, missing documentation and what not, enough to consider them unusable
or dismiss prematurely.

It's about time to break this gloomy aura, and show that std.regex is
actually easy to use, that it does the thing and has some nice extras.

Link:
http://blackwhale.github.com/regular-expression.html

Comments are welcome from experts and newbies alike, in fact it should
encourage people to try out a few tricks ;)

This is intended as replacement for an article on dlang.org
about outdated (and soon to disappear) std.regexp:
http://dlang.org/regular-expression.html

[Spoiler] one example relies on a parser bug being fixed (blush):
https://github.com/D-Programming-Language/phobos/pull/481
Well, it was a specific lookahead inside lookaround so that's not severe
bug ;)

P.S. I've been following through a bunch of new bug reports recently,
thanks to everyone involved :)

--
Dmitry Olshansky

Second paragraph:
- "..,expressions, though one though one should..." has too many "though
ones"

Third paragraph:
- "...keeping it's implementation..." should be "its"
- "We'll see how close to built-ins one can get this way." was kind of
confusing. I'd consider just doing away with the distinction between built
in and non-built in regex since it's an implementation detail most
programmers who use it don't even need to know about. Maybe say that it is
not built in and explain why that is a neat thing to have (meaning, the
language itself is powerful enough to express it in user code).

Fourth paragraph:
- "...article you'd have..." should probably be "you'll" or, preferably,
"you will".
- "...utilize it's API..." should be "its"
- "yet it's not required to get an understanding of the API." I'd probably
change this to "...yet it's not required to understand the API"

Lost track of which paragraph:
- "... that allows writing a regex pattern in it's natural notation"
another "its"
- "trying to match special characters like" I'd write "trying to match
special regex characters like" for clarity
- "over input like e.g. search or simillar" I'd remove the "e.g.", write
"search" as "search()" to show it's a function in other languages and fix the
spelling of "similar" :P
- "An element type is Captures for the string type being used, it is a
random access range." I just found this confusing. Not sure what it's
trying to say.
- "I won't go into full detail of the range conception, suffice to say,"
I'd change "conception" to "concept" and remove "suffice to say". (It's a
shame we don't have a range article we can link to).
- "At that time ancors like" misspelled "anchors"
- "Needless to say, one need not" I'd remove the "Needless to say," because
I think it's actually important to say :P
- replace(text, regex(r"([0-9]{1,2})/([0-9]{1,2})/([0-9]{4})","g"),
"--"); Is this code example correct? It references $1, $2, etc. in the
explanatory paragraph below but they are nowhere to be found.
- When you are explaining named captures it sounds like you are about to
show them in the subsequent code example but you are actually showing what
it'd look like without them which was a bit confusing.
- Maybe some more words on what lookaround/lookahead do as I was lost.
- "Amdittedly, barrage of ? and ! makes regex rather obscure, more then
it's actually is. However" should be "Admittedly, the barrage of ? and !
makes the regex rather obscure, more than it actually is.". Maybe change
"obscure" to a different adjective. Perhaps "complex looking" or
"complicated". (note I've removed the "However" as the upcoming sentence
isn't contradicting what you just said.)
- "Needless to say it's", again, I think it's rather important to say :P
- "Run-time version took around 10-20us on my machine, admittedly no
statistics." here, borrow this µ :P. Also, I'd get rid of "admittedly no
statistics."
- "meaningful tasks, it's features" another "its"
- "together it's major" and another :P
- "...flexible tools: match, replace, spliter" should be spelled "splitter"

Great article. I didn't even know about the replacement delegate feature
which is something I've often wished I could use in other regex systems. D
and Phobos need more articles like this. We should have a link to it from
the std.regex documentation once this is added to the website.

Regards,
Brad Anderson

Re: How about colors and terminal graphics in std.format?

2012-03-13 Thread Christian Manning


On Tuesday, 13 March 2012 at 16:05:31 UTC, Jacob Carlborg wrote:

On 2012-03-13 13:31, Christian Manning wrote:
On Tuesday, 13 March 2012 at 07:45:19 UTC, Jacob Carlborg 
wrote:

On 2012-03-13 02:36, Christian Manning wrote:

It would be great if an std.terminal contained general stuff 
for
manipulating/querying a terminal portably, as well as colour 
output, eg.
get terminal size, move cursor around, erase line... just 
things to help

with building UIs, progress bars, etc. that are easy to use.


I actually have a library for this written in C++, somewhere.


Any chance of a release? :)

I'd like to have a stab at porting it to D, when I have time, 
if you

aren't already planning to.


I have been thinking about porting it to D from time to time. I can see
what I can do :)


Looking forward to it!
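
To make the wishlist above concrete, here is a rough sketch of what a few such helpers could look like. Everything in it is an assumption for illustration: the function names are hypothetical and are not a proposed std.terminal API, and it only emits standard ANSI escape sequences, so it is not portable to consoles that do not understand them (querying the terminal size is left out for the same reason).

```d
import std.format : format;
import std.stdio : write, writeln;

// Hypothetical helpers that build ANSI escape sequences as strings.

// Move the cursor to a 1-based row and column.
string moveCursor(int row, int col)
{
    return format("\x1b[%d;%dH", row, col);
}

// Erase the whole line the cursor is currently on.
string eraseLine()
{
    return "\x1b[2K";
}

// Wrap text in an SGR colour code (e.g. 31 = red, 32 = green), then reset.
string colored(string text, int sgrCode)
{
    return format("\x1b[%dm%s\x1b[0m", sgrCode, text);
}

void main()
{
    // Clear the current line, return to column 0, print a green status.
    write(eraseLine(), "\r");
    writeln(colored("done", 32));
}
```

A real library would need a separate backend for terminals without ANSI support, which is where the "portably" part of the request comes in.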

Re: [draft] New std.regex walkthrough

On Tue, Mar 13, 2012 at 11:27:57PM +0400, Dmitry Olshansky wrote:
 For a couple of releases we have a new revamped std.regex, that as
 far as I'm concerned works nicely, thanks to my GSOC commitment last
 summer. Yet there was certain dark trend around std.regex/std.regexp
 as both had severe bugs, missing documentation and what not, enough
 to consider them unusable or dismiss prematurely.
 
 It's about time to break this gloomy aura, and show that std.regex
 is actually easy to use, that it does the thing and has some nice
 extras.
 
 Link: http://blackwhale.github.com/regular-expression.html
 
 Comments are welcome from experts and newbies alike, in fact it
 should encourage people to try out a few tricks ;)
[...]

Yay! Updated docs are always a good thing. I'd like to do some
copy-editing to make it nicer to read. (Hope you don't mind my extensive
revisions, I'm trying to make the docs as professional as possible.)
My revisions are in straight text under the quoted sections, and inline
comments are enclosed in [].


 Introduction
 
 String processing is a kind of daily routine that most applications do
 in a one way or another.  It should come as no wonder that many
 programming languages have standard libraries stoked with specialized
 functions for common needs.

String processing is a common task performed by many applications. Many
programming languages come with standard libraries that are equipped
with a variety of functions for common string processing needs.


 The D programming language standard library among others offers a nice
 assortment in std.string and generic ones from std.algorithm.

The D programming language standard library also offers a nice
assortment of such functions in std.string, as well as generic functions
in std.algorithm that can also work with strings.


 Still no amount of fixed functionality could cover all needs, as
 naturally flexible text data needs flexible solutions. 

Still, no amount of predefined string functions could cover all needs.
Text data is very flexible by nature, and so needs flexible solutions.


 Here is where regular expressions come in handy, often succinctly
 called as regexes.

This is where regular expressions, or regexes for short, come in.


 Simple yet powerful language for defining patterns of strings, put
 together with a substitution mechanism, forms a Swiss Army knife of
 text processing.

Regexes are a simple yet powerful language for defining patterns of
strings, and when integrated with a substitution mechanism, form a
Swiss Army knife of text processing.


 It's considered so useful that a number of languages provides built-in
 support for regular expressions, though one though one should not jump
 to conclusion that built-in implies faster processing or more
 features. It's all about getting more convenient and friendly syntax
 for typical operations and usage patterns. 

Regexes are considered so useful that a number of languages provide
built-in support for them. (This doesn't necessarily mean, however, that
built-in support is faster or offers more features.  It's more a matter
of providing a more convenient and friendly syntax for typical
operations and usage patterns.)

[I think it's better to put the second part in parentheses, since it's
not really the main point of this doc.]


 The D programming language provides a standard library module
 std.regex.

[OK]


 Being a highly expressive systems language, it opens a possibility to
 get a good look and feel via core features, while keeping it's
 implementation within the language.

Being a highly expressive systems language, D allows regexes to be
implemented within the language itself, yet still have the same level of
readability and usability that a built-in implementation would provide.


 We'll see how close to built-ins one can get this way. 

We will see below how close to built-in regexes we can get.


 By the end of article you'd have a good understanding of regular
 expression capabilities in this library, and how to utilize it's API
 in a most straightforward way.

By the end of this article, you will have a good understanding of the
regular expression capabilities offered by this library, and how to
utilize its API in the most straightforward way.



 Examples in this article assume the reader has fairly good
 understanding of regex elements, yet it's not required to get an
 understanding of the API.

Examples in this article assume that the reader has a fairly good
understanding of regex elements, but this is not required to get an
understanding of the API.

[I'll do this much for now. More to come later.]


T

-- 
Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are, by
definition, not smart enough to debug it. -- Brian W. Kernighan

Re: [draft] New std.regex walkthrough