Re: Integer conversions too pedantic in 64-bit

2011-02-19 Thread Max Samukha

On 02/19/2011 07:39 AM, Walter Bright wrote:

Jonathan M Davis wrote:

Vader had no clue


So much for his force!


How can one expect consistency from a fairytale?


Re: Integer conversions too pedantic in 64-bit

2011-02-19 Thread Jeff Nowakowski

On 02/18/2011 08:39 PM, Walter Bright wrote:


Huge? How about it never occurs to Vader to search for Luke at the most
obvious location in the universe - his nearest living relatives (Uncle
Owen)? That's just the start of the ludicrousness.

Ok, I have no right to be annoyed, but what an opportunity (to make a
truly great movie) squandered.


Lighten up, Francis. It was a truly great movie, for its time.


Re: Integer conversions too pedantic in 64-bit

2011-02-19 Thread Nick Sabalausky
Russel Winder rus...@russel.org.uk wrote in message 
news:mailman.1784.1298102229.4748.digitalmar...@puremagic.com...

 Sadly all the effects companies are using C++ and Python, can D get
traction as the language of choice for the post-production companies?

IIRC, someone here said they had written one of the effects tools used 
for Surrogates, and that they wrote it in D.




Re: Integer conversions too pedantic in 64-bit

2011-02-18 Thread Nick Sabalausky
Jonathan M Davis jmdavisp...@gmx.com wrote in message 
news:mailman.1758.1298013272.4748.digitalmar...@puremagic.com...
 On Thursday 17 February 2011 23:09:32 Russel Winder wrote:
 On Thu, 2011-02-17 at 11:09 -0800, Walter Bright wrote:
  Russel Winder wrote:
   Do not be afraid of the word.  Fear leads to anger.  Anger leads to
   hate.  Hate leads to suffering. (*)
  
   (*) With apologies to Master Yoda (**) for any misquote.
 
  Luke, trust your feelings! -- Oggie Ben Doggie
 
  Of course, expecting consistency from Star Wars is a waste of time.

 What -- me worry?  Alfred E Newman  (*)

 Star Wars is like Dr Who: you expect revisionist history in every
 episode.  I hate an inconsistent storyline, so the trick is to assume
 each episode is a completely separate story unrelated to any other
 episode.

 The funny thing is that Doctor Who does a number of things which I would
 normally consider to make a show a bad show - such as being inconsistent 
 in its
 timeline and generally being episodic rather than having real story arcs 
 (though
 some of the newer Doctor Who stuff has had more of a story arc than was 
 typical
 in the past) - but in spite of all that, it's an absolutely fantastic 
 show -
 probably because the Doctor's just so much fun. Still, it's interesting 
 how it
 generally breaks the rules of good storytelling and yet is still so great 
 to
 watch.


One of the things that gets me about Doctor Who (at least the newer ones) is 
that The Doctor keeps getting companions from modern-day London who, like 
the Doctor, are enthralled by the idea of travelling anywhere in time and 
space, and yet...it seems like they still wind up spending most of their 
time in modern-day London anyway :)  (I agree it's an enjoyable show though. 
The character of The Doctor is definitely a big part of what makes it work.)





Re: Integer conversions too pedantic in 64-bit

2011-02-18 Thread Alexander Malakhov
Don nos...@nospam.com wrote in his message of Wed, 16 Feb 2011 17:21:06  
+0600:


Exactly. It is NOT the same as the 8 & 16 bit case. The thing is, the  
fraction of cases where the MSB is important has been decreasing  
*exponentially* from the 8-bit days. [...]


Some facts to back your opinion:

* today's most powerful supercomputer has just 230 TB of RAM, which is  
between 2^47 and 2^48 bytes

  (http://www.top500.org/site/systems/3154)

* Windows 7 x64 __virtual__ memory limit is 8 TB (= 2^43 bytes)
  
(http://msdn.microsoft.com/en-us/library/aa366778(VS.85).aspx#physical_memory_limits_windows_7)

--
Alexander


Re: Integer conversions too pedantic in 64-bit

2011-02-18 Thread Walter Bright

Russel Winder wrote:

Star Wars is like Dr Who: you expect revisionist history in every
episode.  I hate an inconsistent storyline, so the trick is to assume
each episode is a completely separate story unrelated to any other
episode.


My trick was to lose all interest in SW.

Have you seen the series Defying Gravity? The plot: a spaceship is sent 
around the solar system to pass by various planets on a mission of 
discovery. The script writers apparently thought this was boring, so to liven 
things up they installed a ghost on the spaceship.


It's really, really sad.


Re: Integer conversions too pedantic in 64-bit

2011-02-18 Thread Nick Sabalausky
Walter Bright newshou...@digitalmars.com wrote in message 
news:ijmnp7$433$1...@digitalmars.com...
 Russel Winder wrote:
 Star Wars is like Dr Who: you expect revisionist history in every
 episode.  I hate an inconsistent storyline, so the trick is to assume
 each episode is a completely separate story unrelated to any other
 episode.

 My trick was to lose all interest in SW.


I must not be enough of a Star Wars guy; I don't know what anyone's talking 
about here. Was it the prequel trilogy that introduced the inconsistencies 
(I still haven't gotten around to episodes 2 or 3 yet), or were there things 
in the original trilogy that I managed to completely overlook? (Or something 
else entirely?)

 Have you seen the series Defying Gravity? The plot: a spaceship is 
 sent around the solar system to pass by various planets on a mission 
 of discovery. The script writers apparently thought this was boring, so to 
 liven things up they installed a ghost on the spaceship.

 It's really, really sad.

Sounds like Stargate Universe: a bunch of people trapped on an ancient 
spaceship of exploration...but to make that concept interesting the 
writers had to make every damn character on the show a certifiable drama 
queen. Unsurprisingly, it was dead after only two seasons - a record low for 
Stargate. Really looking forward to the movie sequels though (as well as the 
new SG-1/Atlantis movies that, I *think*, are still in the works).






Re: Integer conversions too pedantic in 64-bit

2011-02-18 Thread Jonathan M Davis
On Friday, February 18, 2011 14:20:03 Nick Sabalausky wrote:
 Walter Bright newshou...@digitalmars.com wrote in message
 news:ijmnp7$433$1...@digitalmars.com...
 
  Russel Winder wrote:
  Star Wars is like Dr Who: you expect revisionist history in every
  episode.  I hate an inconsistent storyline, so the trick is to assume
  each episode is a completely separate story unrelated to any other
  episode.
  
  My trick was to lose all interest in SW.
 
 I must not be enough of a Star Wars guy, I don't know what anyone's talking
 about here. Was it the prequel trilogy that introduced the inconsistencies
 (I still haven't gotten around to episodes 2 or 3 yet), or were there
 things in the original trilogy that I managed to completely overlook? (Or
 something else entirely?)

The prequel movies definitely have some inconsistencies with the originals, but 
for the most part, they weren't huge. I suspect that the real trouble comes in 
when you read the books (which I haven't).

- Jonathan M Davis


Re: Integer conversions too pedantic in 64-bit

2011-02-18 Thread Walter Bright

Jonathan M Davis wrote:
The prequel movies definitely have some inconsistencies with the originals, but 
for the most part, they weren't huge. I suspect that the real trouble comes in 
when you read the books (which I haven't).


Huge? How about it never occurs to Vader to search for Luke at the most obvious 
location in the universe - his nearest living relatives (Uncle Owen)? That's 
just the start of the ludicrousness.


Ok, I have no right to be annoyed, but what an opportunity (to make a truly 
great movie) squandered.


Re: Integer conversions too pedantic in 64-bit

2011-02-18 Thread Jonathan M Davis
On Friday, February 18, 2011 17:39:34 Walter Bright wrote:
 Jonathan M Davis wrote:
  The prequel movies definitely have some inconsistencies with the
  originals, but for the most part, they weren't huge. I suspect that the
  real trouble comes in when you read the books (which I haven't).
 
 Huge? How about it never occurs to Vader to search for Luke at the most
 obvious location in the universe - his nearest living relatives (Uncle
 Owen)? That's just the start of the ludicrousness.
 
 Ok, I have no right to be annoyed, but what an opportunity (to make a truly
 great movie) squandered.

Well, that's not really an inconsistency so much as not properly taking 
everything into account in the plot (though to be fair, IIRC, Vader had no clue 
that he even _had_ kids, so it's not like he would have gone looking in the 
first 
place). Regardless, I don't think that there's much question that those films 
could have been much better.

- Jonathan M Davis


Re: Integer conversions too pedantic in 64-bit

2011-02-18 Thread Walter Bright

Jonathan M Davis wrote:
Vader had no clue 


So much for his force!


Re: Integer conversions too pedantic in 64-bit

2011-02-18 Thread Don

Walter Bright wrote:

Jonathan M Davis wrote:
The prequel movies definitely have some inconsistencies with the 
originals, but for the most part, they weren't huge. I suspect that 
the real trouble comes in when you read the books (which I haven't).


Huge? How about it never occurs to Vader to search for Luke at the most 
obvious location in the universe - his nearest living relatives (Uncle 
Owen)? That's just the start of the ludicrousness.


Ok, I have no right to be annoyed, but what an opportunity (to make a 
truly great movie) squandered.


I nominate the second prequel for the worst movie of all time.
I never saw the third one.



Re: Integer conversions too pedantic in 64-bit

2011-02-18 Thread Walter Bright

Don wrote:

I nominate the second prequel for the worst movie of all time.
I never saw the third one.



You didn't miss a thing.


Re: Integer conversions too pedantic in 64-bit

2011-02-18 Thread Russel Winder
On Fri, 2011-02-18 at 17:52 -0800, Jonathan M Davis wrote:
 On Friday, February 18, 2011 17:39:34 Walter Bright wrote:
  Jonathan M Davis wrote:
   The prequel movies definitely have some inconsistencies with the
   originals, but for the most part, they weren't huge. I suspect that the
   real trouble comes in when you read the books (which I haven't).
  
  Huge? How about it never occurs to Vader to search for Luke at the most
  obvious location in the universe - his nearest living relatives (Uncle
  Owen)? That's just the start of the ludicrousness.

The wikipedia article http://en.wikipedia.org/wiki/Star_Wars is quite
interesting, and indicates why there are lots of little inconsistencies
as well as quite a few big ones.  As to the veracity of the material,
who knows? It's the Web; lies have the exact same status as truth.

  Ok, I have no right to be annoyed, but what an opportunity (to make a truly
  great movie) squandered.
 
 Well, that's not really an inconsistency so much as not properly taking 
 everything into account in the plot (though to be fair, IIRC, Vader had no 
 clue 
 that he even _had_ kids, so it's not like he would have gone looking in the 
 first 
 place). Regardless, I don't think that there's much question that those films 
 could have been much better.

I think there has been a loss of historical context here, leading to
anti-rose coloured (colored?) spectacles.  In 1977, Star Wars was a
watershed film.  Simple fairy tale storyline, space opera on film
instead of book.  Its impact was greater than 2001: A Space Odyssey
which had analogous impact albeit to a smaller audience in 1968.  I am
sure there are films from the 1940s and 1950s that deserve similar
status but television changed the nature of film impact, making 2001 and
Star Wars more influential -- again historical context is important.  I
think Return of the Jedi is quite fun and that the rest of the Star Wars
films lost the simplicity and brilliance of Star Wars, pandering to the
need for huge budget special effects, essentially driving us to the
computer generated, poor storyline, stuff that gets churned out today.
With the exception of The Lord of The Rings. 

Sadly all the effects companies are using C++ and Python, can D get
traction as the language of choice for the post-production companies?

Crikey, this thread has drifted a good few light years from the original
title.

-- 
Russel.
=
Dr Russel Winder  t: +44 20 7585 2200   voip: sip:russel.win...@ekiga.net
41 Buckmaster Roadm: +44 7770 465 077   xmpp: rus...@russel.org.uk
London SW11 1EN, UK   w: www.russel.org.uk  skype: russel_winder




Re: Integer conversions too pedantic in 64-bit

2011-02-17 Thread Denis Koroskin
On Wed, 16 Feb 2011 06:49:26 +0300, Michel Fortin  
michel.for...@michelf.com wrote:



On 2011-02-15 22:41:32 -0500, Nick Sabalausky a@a.a said:


I like nint.


But is it unsigned or signed? Do we need 'unint' too?

I think 'word' & 'uword' would be a better choice. I can't say I'm too  
displeased with 'size_t', but it's true that the 'size_t' feels out of  
place in D code because of its name.





I second that. word/uword are shorter than ssize_t/size_t and more in line  
with other type names.


I like it.


Re: Integer conversions too pedantic in 64-bit

2011-02-17 Thread David Nadlinger

On 2/17/11 8:56 AM, Denis Koroskin wrote:

I second that. word/uword are shorter than ssize_t/size_t and more in
line with other type names.

I like it.


I agree that size_t/ptrdiff_t are misnomers and I'd love to kill them 
with fire, but when I read about »word«, I intuitively associated it 
with »two bytes« first – blame Intel or whoever else, but the potential 
for confusion is definitely not negligible.


David


Re: Integer conversions too pedantic in 64-bit

2011-02-17 Thread Don

David Nadlinger wrote:

On 2/17/11 8:56 AM, Denis Koroskin wrote:

I second that. word/uword are shorter than ssize_t/size_t and more in
line with other type names.

I like it.


I agree that size_t/ptrdiff_t are misnomers and I'd love to kill them 
with fire, but when I read about »word«, I intuitively associated it 
with »two bytes« first – blame Intel or whoever else, but the potential 
for confusion is definitely not negligible.


David


Me too. A word is two bytes. Any other definition seems to be pretty 
useless.


The whole concept of machine word seems very archaic and incorrect to 
me anyway. It assumes that the data registers and address registers are 
the same size, which is very often not true.
For example, on an 8-bit machine (eg, 6502 or Z80), the accumulator was 
only 8 bits, yet size_t was definitely 16 bits.
It's quite plausible that at some time in the future we'll get a machine 
with 128-bit registers and data bus, but retaining the 64 bit address 
bus. So we could get a size_t which is smaller than the machine word.


In summary: size_t is not the machine word.


Re: Integer conversions too pedantic in 64-bit

2011-02-17 Thread Russel Winder
minor-rant

On Thu, 2011-02-17 at 10:13 +0100, Don wrote:
[ . . . ]
 Me too. A word is two bytes. Any other definition seems to be pretty 
 useless.

Sounds like people have been living with 8- and 16-bit processors for
too long.

A word is the natural length of an integer item in the processor.  It is
necessarily machine specific.  cf. the DEC-10 had 9-bit bytes and a 36-bit
word; the IBM 370 has an 8-bit byte and a 32-bit word, though addresses were
24-bit.  ix86 follows IBM: 8-bit byte and 32-bit word.

The really interesting question is whether on x86_64 the word is 32-bit
or 64-bit.

 The whole concept of machine word seems very archaic and incorrect to 
 me anyway. It assumes that the data registers and address registers are 
 the same size, which is very often not true.

Machine words are far from archaic, even on the JVM: if you don't know
the length of the word on the machine you are executing on, how do you
know the set of values that can be represented?  In floating point
numbers, if you don't know the length of the word, how do you know the
accuracy of the computation?

Clearly data registers and address registers can be different lengths,
it is not the job of a programming language that compiles to native code
to ignore this and attempt to homogenize things beyond what is
reasonable.

If you are working in native code then word length is a crucial property
since it can change depending on which processor you compile for.

 For example, on an 8-bit machine (eg, 6502 or Z80), the accumulator was 
 only 8 bits, yet size_t was definitely 16 bits.

The 8051 was only surpassed a couple of years ago by ARMs as the most
numerous processor on the planet.  8-bit processors may only have had
8-bit ALUs -- leading to a hypothesis that the word was 8-bits -- but
the word length was effectively 16-bit due to the hardware support for
multi-byte integer operations.

 It's quite plausible that at some time in the future we'll get a machine 
 with 128-bit registers and data bus, but retaining the 64 bit address 
 bus. So we could get a size_t which is smaller than the machine word.
 
 In summary: size_t is not the machine word.

Agreed !

As long as the address bus is no wider than an integer, there are no
apparent problems using integers as addresses.  The problem comes when
addresses are wider than integers.  A good statically-typed programming
language should manage this by having integers and addresses as distinct
sets.  C and C++ have led people astray.  There should be an appropriate
set of integer types and an appropriate set of address types and using
one from the other without active conversion is always going to lead to
problems.

Do not be afraid of the word.  Fear leads to anger.  Anger leads to
hate.  Hate leads to suffering. (*)

/minor-rant

(*) With apologies to Master Yoda (**) for any misquote.

(**) Or more likely whoever his script writer was.
-- 
Russel.
=
Dr Russel Winder  t: +44 20 7585 2200   voip: sip:russel.win...@ekiga.net
41 Buckmaster Roadm: +44 7770 465 077   xmpp: rus...@russel.org.uk
London SW11 1EN, UK   w: www.russel.org.uk  skype: russel_winder




Re: Integer conversions too pedantic in 64-bit

2011-02-17 Thread spir

On 02/17/2011 05:19 AM, Kevin Bealer wrote:

== Quote from spir (denis.s...@gmail.com)'s article

On 02/16/2011 03:07 AM, Jonathan M Davis wrote:

On Tuesday, February 15, 2011 15:13:33 spir wrote:

On 02/15/2011 11:24 PM, Jonathan M Davis wrote:

Is there some low level reason why size_t should be signed or something
I'm completely missing?


My personal issue with unsigned ints in general as implemented in C-like
languages is that the range of non-negative signed integers is half of the
range of corresponding unsigned integers (for same size).
* practically: known issues, and bugs if not checked by the language
* conceptually: contradicts the obvious idea that unsigned (aka naturals)
is a subset of signed (aka integers)


It's inevitable in any systems language. What are you going to do, throw away a
bit for unsigned integers? That's not acceptable for a systems language. On some
level, you must live with the fact that you're running code on a specific 
machine
with a specific set of constraints. Trying to do otherwise will pretty much
always harm efficiency. True, there are common bugs that might be better
prevented, but part of it ultimately comes down to the programmer having some
clue as to what they're doing. On some level, we want to prevent common bugs,
but the programmer can't have their hand held all the time either.

I cannot prove it, but I really think you're wrong on that.
First, the question of 1 bit. Think of this -- speaking of 64-bit size:
* 99.999% of all uses of unsigned fit under 2^63
* To benefit from the last bit, you must have the need to store a value 2^63 <=
v < 2^64
* Not only this, you must step on a case where /any/ possible value for v
(depending on execution data) could be >= 2^63, but /all/ possible values for v
are guaranteed < 2^64
This can only be a very small fraction of cases where your value does not fit
in 63 bits, don't you think? Has it ever happened to you (even in 32 bits)?
Something like: what luck! this value would not (always) fit in 31 bits, but
(due to this constraint), I can be sure it will fit in 32 bits (always,
whatever input data it depends on).
In fact, n bits do the job because (1) nearly all unsigned values are very
small (2) the size used at a time covers the memory range at the same time.
Upon efficiency, if unsigned is not a subset of signed, then at a low level you
may be forced to add checks in numerous utility routines, the kind constantly
used, everywhere one type may play with the other. I'm not sure where the gain 
is.
Upon correctness, intuitively I guess (just a wild guess indeed) if unsigned
values form a subset of signed ones programmers will more easily reason
correctly about them.
Now, I perfectly understand the sacrifice of one bit sounds like a sacrilege 
;-)
(*)
Denis
(*) But you know, when as a young guy you have coded for 8 & 16-bit machines,
having 63 or 64...


If you write low level code, it happens all the time.  For example, you can copy
memory areas quickly on some machines by treating them as arrays of long and
copying the values -- which requires the upper bit to be preserved.

Or you compute a 64 bit hash value using an algorithm that is part of some
standard protocol.  Oops -- requires an unsigned 64 bit number, the signed 
version
would produce the wrong result.  And since the standard expects normal behaving
int64's you are stuck -- you'd have to write a little class to simulate unsigned
64 bit math.  E.g. a library that computes md5 sums.

Not to mention all the code that uses 64 bit numbers as bit fields where the
different bits or sets of bits are really subfields of the total range of 
values.

What you are saying is true of high level code that models real life -- if the
value is someone's salary or the number of toasters they are buying from a store
you are probably fine -- but a lot of low level software (ipv4 stacks, video
encoders, databases, etc) are based on designs that require numbers to behave a
certain way, and losing a bit is going to be a pain.

I've run into this with Java, which lacks unsigned types, and once you run into 
a
case that needs that extra bit it gets annoying right quick.


You're right indeed, but this is a different issue. If you need to perform 
bit-level manipulation, then the proper type to use is u-somesize.
What we were discussing, I guess, is the standard type used by both stdlib and 
application code for indices/positions and counts/sizes/lengths.

SomeType count(E)(E[] elements, E element)
SomeType search(E)(E[] elements, E element, SomeType fromPos = 0)

Denis
--
_
vita es estrany
spir.wikidot.com



Re: Integer conversions too pedantic in 64-bit

2011-02-17 Thread spir

On 02/17/2011 10:13 AM, Don wrote:

David Nadlinger wrote:

On 2/17/11 8:56 AM, Denis Koroskin wrote:

I second that. word/uword are shorter than ssize_t/size_t and more in
line with other type names.

I like it.


I agree that size_t/ptrdiff_t are misnomers and I'd love to kill them with
fire, but when I read about »word«, I intuitively associated it with »two
bytes« first – blame Intel or whoever else, but the potential for confusion
is definitely not negligible.

David


Me too. A word is two bytes. Any other definition seems to be pretty useless.

The whole concept of machine word seems very archaic and incorrect to me
anyway. It assumes that the data registers and address registers are the same
size, which is very often not true.
For example, on an 8-bit machine (eg, 6502 or Z80), the accumulator was only 8
bits, yet size_t was definitely 16 bits.
It's quite plausible that at some time in the future we'll get a machine with
128-bit registers and data bus, but retaining the 64 bit address bus. So we
could get a size_t which is smaller than the machine word.

In summary: size_t is not the machine word.


Right, there is no single native machine word size; but I guess what we're 
interested in is, from those sizes, the one that ensures minimal processing 
time. I mean, the data size for which there are native computation instructions 
(logical, numeric), so that if we use it we get the least number of cycles for 
a given operation.
Also, this size (on common modern architectures, at least) allows directly 
accessing all of the memory address space; not a negligible property ;-).

Or are there points I'm overlooking?

Denis
--
_
vita es estrany
spir.wikidot.com



Re: Integer conversions too pedantic in 64-bit

2011-02-17 Thread Don

spir wrote:

On 02/17/2011 10:13 AM, Don wrote:

David Nadlinger wrote:

On 2/17/11 8:56 AM, Denis Koroskin wrote:

I second that. word/uword are shorter than ssize_t/size_t and more in
line with other type names.

I like it.


I agree that size_t/ptrdiff_t are misnomers and I'd love to kill them 
with
fire, but when I read about »word«, I intuitively associated it with 
»two
bytes« first – blame Intel or whoever else, but the potential for 
confusion

is definitely not negligible.

David


Me too. A word is two bytes. Any other definition seems to be pretty 
useless.


The whole concept of machine word seems very archaic and incorrect 
to me
anyway. It assumes that the data registers and address registers are 
the same

size, which is very often not true.
For example, on an 8-bit machine (eg, 6502 or Z80), the accumulator 
was only 8

bits, yet size_t was definitely 16 bits.
It's quite plausible that at some time in the future we'll get a 
machine with
128-bit registers and data bus, but retaining the 64 bit address bus. 
So we

could get a size_t which is smaller than the machine word.

In summary: size_t is not the machine word.


Right, there is no single native machine word size; but I guess what 
we're interested in is, from those sizes, the one that ensures minimal 
processing time. I mean, the data size for which there are native 
computation instructions (logical, numeric), so that if we use it we get 
the least number of cycles for a given operation.


There's frequently more than one such size.

Also, this size (on common modern architectures, at least) allows 
directly accessing all of the memory address space; not a negligible 
property ;-).


This is not necessarily the same.


Or are there points I'm overlooking?

Denis


Re: Integer conversions too pedantic in 64-bit

2011-02-17 Thread Kagamin
Adam Ruppe Wrote:

 alias iota lazyRangeThatGoesFromStartToFinishByTheGivenStepAmount;

Ever wondered what iota is? At last it's self-documented.


Re: Integer conversions too pedantic in 64-bit

2011-02-17 Thread Kagamin
dsimcha Wrote:

 Now that DMD has a 64-bit beta available, I'm working on getting a whole bunch
 of code to compile in 64 mode.  Frankly, the compiler is way too freakin'
 pedantic when it comes to implicit conversions (or lack thereof) of
 array.length.  99.999% of the time it's safe to assume an array is not going
 to be over 4 billion elements long.  I'd rather have a bug the 0.001% of the
 time than deal with the pedantic errors the rest of the time, because I think
 it would be less total time and effort invested.  To force me to either put
 casts in my code everywhere or change my entire codebase to use wider integers
 (with ripple effects just about everywhere) strikes me as purity winning out
 over practicality.

int ilength(void[] a) @property
{
    // Narrows the size_t length to int; assumes at most int.max elements.
    return cast(int)a.length;
}

---
int mylen=bb.ilength;
---


Re: Integer conversions too pedantic in 64-bit

2011-02-17 Thread Kagamin
Walter Bright Wrote:

 Actually, you can have a segmented model on a 32 bit machine rather than a 
 flat 
 model, with separate segments for code, data, and stack. The Digital Mars DOS 
 Extender actually does this. The advantage of it is you cannot execute data 
 on 
 the stack.

AFAIK you inevitably have segments in flat model; x86 just doesn't work any 
other way. On Windows the stack segment seems to be the same as the data 
segment; the code segment is different. Are they needed for access checks? I 
thought access modes are checked in the page tables.


Re: Integer conversions too pedantic in 64-bit

2011-02-17 Thread Don

Russel Winder wrote:

minor-rant

On Thu, 2011-02-17 at 10:13 +0100, Don wrote:
[ . . . ]
Me too. A word is two bytes. Any other definition seems to be pretty 
useless.


Sounds like people have been living with 8- and 16-bit processors for
too long.

A word is the natural length of an integer item in the processor.  It is
necessarily machine specific.  cf. DEC-10 had 9-bit bytes and 36-bit
word, IBM 370 has an 8-bit byte and a 32-bit word, though addresses were
24-bit.  ix86 follows IBM 8-bit byte and 32-bit word.


Yes, I know. It's true but I think rather useless.
We need a name for an 8 bit quantity, and a 16 bit quantity, and higher 
powers of two. 'byte' is an established name for the first one, even 
though historically there were 9-bit bytes. IMHO 'word' wasn't such a 
bad name for the second one, even though its etymology comes from the 
machine word size of some specific early processors. But the equally 
arbitrary name 'short' has become widely accepted.



The really interesting question is whether on x86_64 the word is 32-bit
or 64-bit.


With the rising importance of the SIMD instruction set, you could even 
argue that it is 128 bits in many cases...



The whole concept of machine word seems very archaic and incorrect to 
me anyway. It assumes that the data registers and address registers are 
the same size, which is very often not true.


Machine words are far from archaic, even on the JVM, if you don't know
the length of the word on the machine you are executing on, how do you
know the set of values that can be represented?  In floating point
numbers, if you don't know the length of the word, how do you know the
accuracy of the computation?


Yes, but they're not necessarily the same number. There is a native size 
for every type of operation, but it's not universal across all operations.


I don't think there's a way you can define machine word in a way which 
is terribly useful. By the time you've got something unambiguous and 
well-defined, it doesn't have many interesting properties. It's valid in 
such limited cases that you'd be better off with a clearer name.



Clearly data registers and address registers can be different lengths,
it is not the job of a programming language that compiles to native code
to ignore this and attempt to homogenize things beyond what is
reasonable.


Agreed, and this is I think what makes the concept of machine word not 
very helpful.




If you are working in native code then word length is a crucial property
since it can change depending on which processor you compile for.

For example, on an 8-bit machine (eg, 6502 or Z80), the accumulator was 
only 8 bits, yet size_t was definitely 16 bits.


The 8051 was only surpassed a couple of years ago by ARMs as the most
numerous processor on the planet.  8-bit processors may only have had
8-bit ALUs -- leading to a hypothesis that the word was 8-bits -- but
the word length was effectively 16-bit due to the hardware support for
multi-byte integer operations.


The 6502 was restricted to 8 bits in almost every way. About half of the 
instructions that involved 16-bit quantities would wrap on page 
boundaries. jmp (0x7FF) would do an indirect jump, getting the low byte 
of the target address from 0x7FF and the high byte from 0x700 !!



It's quite plausible that at some time in the future we'll get a machine 
with 128-bit registers and data bus, but retaining the 64 bit address 
bus. So we could get a size_t which is smaller than the machine word.


In summary: size_t is not the machine word.


Agreed !

As long as the address bus is less wide than an integer, there are no
apparent problems using integers as addresses.  The problem comes when
addresses are wider than integers.  A good statically-typed programming
language should manage this by having integers and addresses as distinct
sets.  C and C++ have led people astray.  There should be an appropriate
set of integer types and an appropriate set of address types and using
one from the other without active conversion is always going to lead to
problems.


Indeed.



Do not be afraid of the word.  Fear leads to anger.  Anger leads to
hate.  Hate leads to suffering. (*)

/minor-rant

(*) With apologies to Master Yoda (**) for any misquote.

(**) Or more likely whoever his script writer was.


Re: Integer conversions too pedantic in 64-bit

2011-02-17 Thread dsimcha
Funny, as simple as it is, this is a great idea for std.array because it 
shortens the verbose cast(int) a.length to one extra character.  You 
could even put an assert in it to check in debug mode only that the 
conversion is safe.


On 2/17/2011 7:18 AM, Kagamin wrote:

dsimcha Wrote:


Now that DMD has a 64-bit beta available, I'm working on getting a whole bunch
of code to compile in 64-bit mode.  Frankly, the compiler is way too freakin'
pedantic when it comes to implicit conversions (or lack thereof) of
array.length.  99.999% of the time it's safe to assume an array is not going
to be over 4 billion elements long.  I'd rather have a bug the 0.001% of the
time than deal with the pedantic errors the rest of the time, because I think
it would be less total time and effort invested.  To force me to either put
casts in my code everywhere or change my entire codebase to use wider integers
(with ripple effects just about everywhere) strikes me as purity winning out
over practicality.


int ilength(void[] a) @property
{
   return cast(int)a.length;
}

---
int mylen=bb.ilength;
---




Re: Integer conversions too pedantic in 64-bit

2011-02-17 Thread Steven Schveighoffer

On Thu, 17 Feb 2011 09:45:14 -0500, Kagamin s...@here.lot wrote:


dsimcha Wrote:


Funny, as simple as it is, this is a great idea for std.array because it
shortens the verbose cast(int) a.length to one extra character.  You
could even put an assert in it to check in debug mode only that the
conversion is safe.



 int ilength(void[] a) @property
 {
return cast(int)a.length;
 }


I'm not sure the code is correct. I have a vague impression that void[]  
is like byte[], at least, it's used as such, and conversion from int[]  
to byte[] multiplies the length by 4.


Yes, David has proposed a corrected version on the Phobos mailing list:

http://lists.puremagic.com/pipermail/phobos/2011-February/004493.html

-Steve


Re: Integer conversions too pedantic in 64-bit

2011-02-17 Thread Kagamin
dsimcha Wrote:

 Funny, as simple as it is, this is a great idea for std.array because it 
 shortens the verbose cast(int) a.length to one extra character.  You 
 could even put an assert in it to check in debug mode only that the 
 conversion is safe.

  int ilength(void[] a) @property
  {
 return cast(int)a.length;
  }

I'm not sure the code is correct. I have a vague impression that void[] is like 
byte[], at least, it's used as such, and conversion from int[] to byte[] 
multiplies the length by 4.
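David's corrected version is linked below; as a rough sketch of the fix being discussed (hypothetical code, not the actual Phobos patch), templating over the element type avoids the implicit conversion to void[] that rescales the length:

```d
import std.stdio : writeln;

// Hypothetical sketch: taking T[] instead of void[] means .length is an
// element count, not a byte count rescaled by an int[] -> void[] conversion.
@property int ilength(T)(T[] a)
{
    assert(a.length <= int.max);  // debug-mode check that the cast is safe
    return cast(int) a.length;
}

void main()
{
    int[] bb = [10, 20, 30];
    writeln(bb.ilength);                // 3
    writeln((cast(void[]) bb).length);  // 12: why the void[] version misleads
}
```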


Re: Integer conversions too pedantic in 64-bit

2011-02-17 Thread Olivier Pisano

Le 17/02/2011 13:28, Don a écrit :


Yes, I know. It's true but I think rather useless.
We need a name for an 8 bit quantity, and a 16 bit quantity, and higher
powers of two. 'byte' is an established name for the first one, even
though historically there were 9-bit bytes. IMHO 'word' wasn't such a
bad name for the second one, even though its etymology comes from the
machine word size of some specific early processors. But the equally
arbitrary name 'short' has become widely accepted.


8 bits: octet - http://en.wikipedia.org/wiki/Octet_%28computing%29


Re: Integer conversions too pedantic in 64-bit

2011-02-17 Thread Kevin Bealer
== Quote from Daniel Gibson (metalcae...@gmail.com)'s article
 It was not proposed to alter ulong (int64), but to only a size_t equivalent. 
 ;)
 And I agree that not having unsigned types (like in Java) just sucks.
 Wasn't Java even advertised as a programming language for network stuff? Quite
 ridiculous without unsigned types..
 Cheers,
 - Daniel

Ah yes, but if you want to copy data quickly you want to use the efficient size
for doing so.  Since architectures vary, size_t (or the new name if one is
added) would seem to new users to be the natural choice for that size.  So it
becomes a likely error if it doesn't behave as expected.

My personal reaction to this thread is that I think most of the arguments of the
people who want to change the name or add a new one are true -- but not
sufficient to make it worthwhile.  There is always some learning curve, and
size_t is not that hard to learn or that hard to accept.

Kevin


Re: Integer conversions too pedantic in 64-bit

2011-02-17 Thread bearophile
Steven Schveighoffer:

 Yes, David has proposed a corrected version on the Phobos mailing list:
 
 http://lists.puremagic.com/pipermail/phobos/2011-February/004493.html

I suggest it return a signed value, like an int. But a signed long is OK too.
I suggest a name like len (or slen) because I often write length wrongly.

Does it support code like:
auto l = arr.len;
arr.len = 10;
arr.len++;

A big problem: it's limited to arrays, so aa.len or rbtree.len, set.len, etc, 
don't work. So I'd like something more standard... So I am not sure this is a 
good idea.

Bye,
bearophile


Re: Integer conversions too pedantic in 64-bit

2011-02-17 Thread Steven Schveighoffer
On Thu, 17 Feb 2011 13:08:08 -0500, bearophile bearophileh...@lycos.com  
wrote:



Steven Schveighoffer:


Yes, David has proposed a corrected version on the Phobos mailing list:

http://lists.puremagic.com/pipermail/phobos/2011-February/004493.html


I suggest it return a signed value, like an int. But a signed long is  
OK too.
I suggest a name like len (or slen) because I often write length  
wrongly.


This isn't replacing length, it is in addition to length (which will  
continue to return size_t).




Does it support code like:
auto l = arr.len;
arr.len = 10;
arr.len++;


arr.length = 10 already works.  It's int l = arr.length that doesn't.
if arr.length++ doesn't work already, it should be made to work (separate  
bug).


A big problem: it's limited to arrays, so aa.len or rbtree.len, set.len,  
etc, don't work. So I'd like something more standard... So I am not sure  
this is a good idea.


The point is to avoid code like cast(int)arr.length everywhere you can  
safely assume arr.length can fit in a (u)int.


This case is extremely common for arrays, you seldom have an array of more  
than 2 or 4 billion elements.


For other types, the case might not be as common, plus you can add  
properties to other types, something you cannot do to arrays.


As far as I'm concerned, this isn't going to affect me at all, I like to  
use size_t.  But I don't see the harm in adding it.


-Steve


Re: Integer conversions too pedantic in 64-bit

2011-02-17 Thread Walter Bright

Russel Winder wrote:

Do not be afraid of the word.  Fear leads to anger.  Anger leads to
hate.  Hate leads to suffering. (*)



(*) With apologies to Master Yoda (**) for any misquote.


Luke, trust your feelings! -- Oggie Ben Doggie

Of course, expecting consistency from Star Wars is a waste of time.


Re: Integer conversions too pedantic in 64-bit

2011-02-17 Thread Walter Bright

Kagamin wrote:

Walter Bright Wrote:


Actually, you can have a segmented model on a 32 bit machine rather than a
flat model, with separate segments for code, data, and stack. The Digital
Mars DOS Extender actually does this. The advantage of it is you cannot
execute data on the stack.


AFAIK you inevitably have segments even in the flat model; x86 just doesn't work
any other way.  On Windows the stack segment seems to be the same as the data
segment, while the code segment is different. Are they needed for access checks?
I thought access modes are checked in the page tables.


Operating systems choose to set the segment registers to all the same value 
which results in the 'flat' model, but many other models are possible with the 
x86 hardware.


Re: Integer conversions too pedantic in 64-bit

2011-02-17 Thread Andrej Mitrovic
Is it true that you're not allowed to play with the segment registers
in 32bit flat protected mode?


Re: Integer conversions too pedantic in 64-bit

2011-02-17 Thread Andrej Mitrovic
On 2/17/11, Walter Bright newshou...@digitalmars.com wrote:
 Andrej Mitrovic wrote:
 Is it true that you're not allowed to play with the segment registers
 in 32bit flat protected mode?

 Yes, that's the operating system's job.


They took our jerbs!


Re: Integer conversions too pedantic in 64-bit

2011-02-17 Thread Walter Bright

Andrej Mitrovic wrote:

On 2/17/11, Walter Bright newshou...@digitalmars.com wrote:

Andrej Mitrovic wrote:

Is it true that you're not allowed to play with the segment registers
in 32bit flat protected mode?

Yes, that's the operating system's job.



They took our jerbs!


You can always start your own company and hire yourself, or write your own 
operating system and set the segment registers!


Re: Integer conversions too pedantic in 64-bit

2011-02-17 Thread Nick Sabalausky
Russel Winder rus...@russel.org.uk wrote in message 
news:mailman.1748.1297936806.4748.digitalmar...@puremagic.com...
 A word is the natural length of an integer item in the processor.
 It is necessarily machine specific.  cf. DEC-10 had 9-bit bytes
 and 36-bit word, IBM 370 has an 8-bit byte and a 32-bit word,
 though addresses were 24-bit.  ix86 follows IBM 8-bit byte and
 32-bit word.

Right. Programmers may have gotten used to word being 2 bytes due to 
things like the Win API and x86 Assemblers not updating their usage for the 
sake of backwards compatibility, but in the EE world where the term 
originates, word is device-specific and is very useful as such.

 Do not be afraid of the word.  Fear leads to anger.  Anger
 leads to hate.  Hate leads to suffering. (*)

This version is better:
http://media.bigoo.ws/content/image/funny/funny_1309.jpg





Re: Integer conversions too pedantic in 64-bit

2011-02-17 Thread Nick Sabalausky
Walter Bright newshou...@digitalmars.com wrote in message 
news:ijk6la$1d9a$1...@digitalmars.com...
 Andrej Mitrovic wrote:
 On 2/17/11, Walter Bright newshou...@digitalmars.com wrote:
 Andrej Mitrovic wrote:
 Is it true that you're not allowed to play with the segment registers
 in 32bit flat protected mode?
 Yes, that's the operating system's job.


 They took our jerbs!

 You can always start your own company and hire yourself, or write your own 
 operating system and set the segment registers!

They took our jerbs! is a South Park reference.




Re: Integer conversions too pedantic in 64-bit

2011-02-17 Thread Walter Bright

Nick Sabalausky wrote:
Walter Bright newshou...@digitalmars.com wrote in message 
news:ijk6la$1d9a$1...@digitalmars.com...

Andrej Mitrovic wrote:

On 2/17/11, Walter Bright newshou...@digitalmars.com wrote:

Andrej Mitrovic wrote:

Is it true that you're not allowed to play with the segment registers
in 32bit flat protected mode?

Yes, that's the operating system's job.


They took our jerbs!
You can always start your own company and hire yourself, or write your own 
operating system and set the segment registers!


They took our jerbs! is a South Park reference.


I've seen it everywhere on the intarnets.


Re: Integer conversions too pedantic in 64-bit

2011-02-17 Thread Russel Winder
On Thu, 2011-02-17 at 11:09 -0800, Walter Bright wrote:
 Russel Winder wrote:
  Do not be afraid of the word.  Fear leads to anger.  Anger leads to
  hate.  Hate leads to suffering. (*)
 
  (*) With apologies to Master Yoda (**) for any misquote.
 
 Luke, trust your feelings! -- Oggie Ben Doggie
 
 Of course, expecting consistency from Star Wars is a waste of time.

What -- me worry?  Alfred E Newman  (*)

Star Wars is like Dr Who: you expect revisionist history in every
episode.  I hate an inconsistent storyline, so the trick is to assume
each episode is a completely separate story unrelated to any other
episode.


(*) Or whoever http://en.wikipedia.org/wiki/Alfred_E._Neuman
-- 
Russel.
=
Dr Russel Winder  t: +44 20 7585 2200   voip: sip:russel.win...@ekiga.net
41 Buckmaster Roadm: +44 7770 465 077   xmpp: rus...@russel.org.uk
London SW11 1EN, UK   w: www.russel.org.uk  skype: russel_winder




Re: Integer conversions too pedantic in 64-bit

2011-02-17 Thread Jonathan M Davis
On Thursday 17 February 2011 23:09:32 Russel Winder wrote:
 On Thu, 2011-02-17 at 11:09 -0800, Walter Bright wrote:
  Russel Winder wrote:
   Do not be afraid of the word.  Fear leads to anger.  Anger leads to
   hate.  Hate leads to suffering. (*)
   
   (*) With apologies to Master Yoda (**) for any misquote.
  
  Luke, trust your feelings! -- Oggie Ben Doggie
  
  Of course, expecting consistency from Star Wars is a waste of time.
 
 What -- me worry?  Alfred E Newman  (*)
 
 Star Wars is like Dr Who: you expect revisionist history in every
 episode.  I hate an inconsistent storyline, so the trick is to assume
 each episode is a completely separate story unrelated to any other
 episode.

The funny thing is that Doctor Who does a number of things which I would 
normally consider to make a show a bad show - such as being inconsistent in its 
timeline and generally being episodic rather than having real story arcs 
(though 
some of the newer Doctor Who stuff has had more of a story arc than was typical 
in the past) - but in spite of all that, it's an absolutely fantastic show - 
probably because the Doctor's just so much fun. Still, it's interesting how it 
generally breaks the rules of good storytelling and yet is still so great to 
watch.

- Jonathan M Davis


Re: Integer conversions too pedantic in 64-bit

2011-02-16 Thread Mafi

Am 15.02.2011 22:49, schrieb Michel Fortin:

On 2011-02-15 16:33:33 -0500, Walter Bright newshou...@digitalmars.com
said:


Nick Sabalausky wrote:

Walter Bright newshou...@digitalmars.com wrote in message
news:ijeil4$2aso$3...@digitalmars.com...

spir wrote:

Having to constantly explain that _t means type, that size does
not mean size, what this type is supposed to mean instead, what it
is used for in core and stdlib functionality, and what programmers
are supposed to use it for... isn't this a waste of our time? This,
only because the name is mindless?

No, because there is a vast body of work that uses size_t and a vast
body of programmers who know what it is and are totally used to it.


And there's a vast body who don't.

And there's a vast body who are used to C++, so let's just abandon D
and make it an implementation of C++ instead.


I would agree that D is a complete waste of time if all it consisted
of was renaming things.


I'm just wondering whether 'size_t', because it is named after its C
counterpart, doesn't feel too alien in D, causing people to prefer
'uint' or 'ulong' instead even when they should not. We're seeing a lot
of code failing on 64-bit because authors used the fixed-size types
which are more D-like in naming. Wouldn't more D-like names that don't
look like relics from C -- something like 'word' and 'uword' -- have
helped prevent those bugs by making the word-sized type look worth
consideration?

I am also for renaming it. It should begin with u to ensure everybody 
knows it's unsigned even if there's no signed counterpart.


But what we definitely should avoid is having two names for the same 
thing. It's the same mistake C++ made by inheriting everything from C 
and _adding_ its own way.


Mafi


Re: Integer conversions too pedantic in 64-bit

2011-02-16 Thread spir

On 02/16/2011 04:49 AM, Michel Fortin wrote:

On 2011-02-15 22:41:32 -0500, Nick Sabalausky a@a.a said:


I like nint.


But is it unsigned or signed? Do we need 'unint' too?

I think 'word' & 'uword' would be a better choice. I can't say I'm too
displeased with 'size_t', but it's true that the 'size_t' feels out of place in
D code because of its name.


yop! Vote for word / uword.
unint looks like meaning (x ∈ R / not (x ∈ Z)) lol!

Denis
--
_
vita es estrany
spir.wikidot.com



Re: Integer conversions too pedantic in 64-bit

2011-02-16 Thread spir

On 02/16/2011 03:07 AM, Jonathan M Davis wrote:

On Tuesday, February 15, 2011 15:13:33 spir wrote:

On 02/15/2011 11:24 PM, Jonathan M Davis wrote:

Is there some low level reason why size_t should be signed or something
I'm completely missing?


My personal issue with unsigned ints in general as implemented in C-like
languages is that the range of non-negative signed integers is half of the
range of corresponding unsigned integers (for same size).
* practically: known issues, and bugs if not checked by the language
* conceptually: contradicts the obvious idea that unsigned (aka naturals)
is a subset of signed (aka integers)


It's inevitable in any systems language. What are you going to do, throw away a
bit for unsigned integers? That's not acceptable for a systems language. On some
level, you must live with the fact that you're running code on a specific 
machine
with a specific set of constraints. Trying to do otherwise will pretty much
always harm efficiency. True, there are common bugs that might be better
prevented, but part of it ultimately comes down to the programmer having some
clue as to what they're doing. On some level, we want to prevent common bugs,
but the programmer can't have their hand held all the time either.


I cannot prove it, but I really think you're wrong on that.

First, the question of 1 bit. Think about this -- speaking of 64-bit sizes:
* 99.999% of all uses of unsigned fit under 2^63
* To benefit from the last bit, you must need to store a value 2^63 <= 
v < 2^64
* Not only this, you must hit a case where /any/ possible value for v 
(depending on execution data) could be >= 2^63, but /all/ possible values for v 
are guaranteed < 2^64
This can only be a very small fraction of the cases where your value does not fit 
in 63 bits, don't you think? Has it ever happened to you (even in 32 bits)? 
Something like: what luck! this value would not (always) fit in 31 bits, but 
(due to this constraint) I can be sure it will fit in 32 bits (always, 
whatever input data it depends on).
In fact, n bits do the job because (1) nearly all unsigned values are very 
small, and (2) the size used at a time covers the memory range at the same time.


Upon efficiency: if unsigned is not a subset of signed, then at a low level you 
may be forced to add checks in numerous utility routines, the kind constantly 
used, everywhere one type may play with the other. I'm not sure where the gain is.
Upon correctness: intuitively I guess (just a wild guess indeed) that if unsigned 
values form a subset of signed ones, programmers will more easily reason 
correctly about them.


Now, I perfectly understand the sacrifice of one bit sounds like a sacrilege 
;-)
(*)

Denis

(*) But you know, when as a young guy you have coded for 8 & 16-bit machines, 
having 63 or 64...
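The practical issues alluded to above (the "known issues, and bugs if not checked by the language") are easy to demonstrate; a minimal sketch of the classic unsigned pitfalls in D:

```d
void main()
{
    size_t n = 0;
    auto d = n - 1;            // wraps to size_t.max, not -1
    assert(d == size_t.max);

    // Classic bug: an unsigned index can never go below zero, so this
    // loop condition is always true and the loop never terminates cleanly:
    // for (size_t i = n - 1; i >= 0; --i) { ... }

    // Reinterpreting -1 as an unsigned value yields the maximum, which is
    // what makes mixed signed/unsigned comparisons so error-prone.
    assert(cast(size_t) -1 == size_t.max);
}
```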

--
_
vita es estrany
spir.wikidot.com



Re: Integer conversions too pedantic in 64-bit

2011-02-16 Thread Walter Bright

Jonathan M Davis wrote:
It's inevitable in any systems language. What are you going to do, throw away a 
bit for unsigned integers? That's not acceptable for a systems language. On some 
level, you must live with the fact that you're running code on a specific machine 
with a specific set of constraints. Trying to do otherwise will pretty much 
always harm efficiency. True, there are common bugs that might be better 
prevented, but part of it ultimately comes down to the programmer having some 
clue as to what they're doing. On some level, we want to prevent common bugs, 
but the programmer can't have their hand held all the time either.


Yup. A systems language is going to map closely onto the target machine, and 
that means its characteristics will show up in the language. Trying to pretend 
that arithmetic on integers is something other than what the CPU natively does 
just will not work.


Re: Integer conversions too pedantic in 64-bit

2011-02-16 Thread Iain Buclaw
== Quote from spir (denis.s...@gmail.com)'s article
 On 02/16/2011 04:49 AM, Michel Fortin wrote:
  On 2011-02-15 22:41:32 -0500, Nick Sabalausky a@a.a said:
 
  I like nint.

It's the machine integer, so I think the word 'mint' would better match your
naming logic. Also, reminds me of this small advert:
http://www.youtube.com/watch?v=zuy6o8YXzDo ;)

 
  But is it unsigned or signed? Do we need 'unint' too?
 
  I think 'word' & 'uword' would be a better choice. I can't say I'm too
  displeased with 'size_t', but it's true that the 'size_t' feels out of 
  place in
  D code because of its name.
 yop! Vote for word / uword.
 unint looks like meaning (x ∈ R / not (x ∈ Z)) lol!
 Denis

word/uword sits well with my understanding.


Re: Integer conversions too pedantic in 64-bit

2011-02-16 Thread Don

spir wrote:

On 02/16/2011 03:07 AM, Jonathan M Davis wrote:

On Tuesday, February 15, 2011 15:13:33 spir wrote:

On 02/15/2011 11:24 PM, Jonathan M Davis wrote:

Is there some low level reason why size_t should be signed or something
I'm completely missing?


My personal issue with unsigned ints in general as implemented in C-like
languages is that the range of non-negative signed integers is half 
of the

range of corresponding unsigned integers (for same size).
* practically: known issues, and bugs if not checked by the language
* conceptually: contradicts the obvious idea that unsigned (aka 
naturals)

is a subset of signed (aka integers)


It's inevitable in any systems language. What are you going to do, 
throw away a
bit for unsigned integers? That's not acceptable for a systems 
language. On some
level, you must live with the fact that you're running code on a 
specific machine
with a specific set of constraints. Trying to do otherwise will pretty 
much

always harm efficiency. True, there are common bugs that might be better
prevented, but part of it ultimately comes down to the programmer 
having some
clue as to what they're doing. On some level, we want to prevent 
common bugs,

but the programmer can't have their hand held all the time either.


I cannot prove it, but I really think you're wrong on that.

First, the question of 1 bit. Think about this -- speaking of 64-bit sizes:
* 99.999% of all uses of unsigned fit under 2^63
* To benefit from the last bit, you must need to store a value 
2^63 <= v < 2^64
* Not only this, you must hit a case where /any/ possible value for 
v (depending on execution data) could be >= 2^63, but /all/ possible 
values for v are guaranteed < 2^64
This can only be a very small fraction of the cases where your value does 
not fit in 63 bits, don't you think? Has it ever happened to you (even 
in 32 bits)? Something like: what luck! this value would not (always) 
fit in 31 bits, but (due to this constraint) I can be sure it will fit 
in 32 bits (always, whatever input data it depends on).
In fact, n bits do the job because (1) nearly all unsigned values are 
very small, and (2) the size used at a time covers the memory range at the 
same time.


Upon efficiency: if unsigned is not a subset of signed, then at a low 
level you may be forced to add checks in numerous utility routines, the 
kind constantly used, everywhere one type may play with the other. I'm 
not sure where the gain is.
Upon correctness: intuitively I guess (just a wild guess indeed) that if 
unsigned values form a subset of signed ones, programmers will more 
easily reason correctly about them.


Now, I perfectly understand the sacrifice of one bit sounds like a 
sacrilege ;-)

(*)

Denis



(*) But you know, when as a young guy you have coded for 8 & 16-bit 
machines, having 63 or 64...


Exactly. It is NOT the same as the 8 & 16 bit case. The thing is, the 
fraction of cases where the MSB is important has been decreasing 
*exponentially* from the 8-bit days. It really was necessary to use the 
entire address space (or even more, in the case of segmented 
architecture on the 286![1]) to measure the size of anything. D only 
supports 32 bit and higher, so it isn't hamstrung in the way that C is.


Yes, there are still cases where you need every bit. But they are very, 
very exceptional -- rare enough that I think the type could be called 
__uint, __ulong.


[1] What was size_t on the 286 ?
Note that in the small memory model (all pointers 16 bits) it really was 
possible to have an object of size 0x_, because the code was in 
a different address space.


Re: Integer conversions too pedantic in 64-bit

2011-02-16 Thread KennyTM~

On Feb 16, 11 11:49, Michel Fortin wrote:

On 2011-02-15 22:41:32 -0500, Nick Sabalausky a@a.a said:


I like nint.


But is it unsigned or signed? Do we need 'unint' too?

I think 'word' & 'uword' would be a better choice. I can't say I'm too
displeased with 'size_t', but it's true that the 'size_t' feels out of
place in D code because of its name.




'word' may be confusing to Windows programmers because in WinAPI a 
'WORD' means an unsigned 16-bit integer (aka 'ushort').


http://msdn.microsoft.com/en-us/library/cc230402(v=PROT.10).aspx


Re: Integer conversions too pedantic in 64-bit

2011-02-16 Thread Steven Schveighoffer
On Tue, 15 Feb 2011 18:18:22 -0500, Rainer Schuetze r.sagita...@gmx.de  
wrote:




Steven Schveighoffer wrote:
 In addition size_t isn't actually defined by the compiler.  So the  
library controls the size of size_t, not the compiler.  This should  
make it extremely portable.




I do not consider the language and the runtime as completely separate  
when it comes to writing code.


You are right, in some cases the runtime just extends the compiler  
features.  However, I believe the runtime is meant to be used in multiple  
compilers.  I would expect object.di to remain the same.  Probably core  
too.  This should be easily checkable with the newer gdc, which I believe  
uses a port of druntime.


BTW, though defined in object.di, size_t is tied to some compiler  
internals:


alias typeof(int.sizeof) size_t;

and the compiler will make assumptions about this when creating array  
literals.


This is true.  This makes it depend on the compiler.  However, I believe  
the spec is concrete about what the sizeof type should be (if not, it  
should be made concrete).
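A sketch of what that tie-in means in practice (my_size_t is a made-up name used to avoid clashing with the real alias in object.di):

```d
// .sizeof has the compiler's native unsigned size type, so an alias of
// its typeof automatically tracks the target architecture.
alias typeof(int.sizeof) my_size_t;

void main()
{
    static assert(is(my_size_t == size_t));            // same type druntime aliases
    static assert(my_size_t.sizeof == (void*).sizeof); // pointer-sized on the target
}
```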


I don't have a perfect solution, but maybe builtin arrays could be  
limited to 2^^32-1 elements (or maybe 2^^31-1 to get rid of endless  
signed/unsigned conversions), so the normal type to be used is still  
int. Ranges should adopt the type sizes of the underlying objects.
 No, this is too limiting.  If I have 64GB of memory (not out of the  
question), and I want to have a 5GB array, I think I should be allowed  
to.  This is one of the main reasons to go to 64-bit in the first place.


Yes, that's the imperfect part of the proposal. An array of ints could  
still use up to 16 GB, though.


Unless you cast it to void[].  What exactly would happen there, a runtime  
error?  Which means a runtime check for an implicit cast? I don't think  
it's really an option to make array length always be uint (or int).


I wouldn't have a problem with using signed words for length.  Using more  
than 2GB for one array in 32-bit land would be so rare that having to jump  
through special hoops would be fine by me.  Obviously for now, 2^63-1  
sized arrays is plenty of room for today's machines in 64-bit land.


What bothers me is that you have to deal with these portability issues  
from the very moment you store the length of an array elsewhere. Not a  
really big deal, and I don't think it will change, but still feels a bit  
awkward.


Java defines everything to be the same regardless of architecture, and the  
result is you just can't do certain things (like have a 5GB array).  A  
system-level language should support the full range of architecture  
capabilities, so you necessarily have to deal with portability issues.


If you want a super-portable language that runs the same everywhere, use  
an interpreted/bytecode language like Java, .Net or Python.  D is for  
getting close to the metal.


I see size_t as a way to *mostly* make things portable.  It is not  
perfect, and really cannot be.  It's necessary to expose the architecture  
so you can adapt to it; there's no getting around that.


Really, it's rare that you have to use it anyways, most should use auto.

-Steve


Re: Integer conversions too pedantic in 64-bit

2011-02-16 Thread Steven Schveighoffer

On Tue, 15 Feb 2011 16:50:21 -0500, Nick Sabalausky a@a.a wrote:


Nick Sabalausky a@a.a wrote in message
news:ijesem$brd$1...@digitalmars.com...

Steven Schveighoffer schvei...@yahoo.com wrote in message
news:op.vqx78nkceav7ka@steve-laptop...


size_t works, it has a precedent, it's already *there*, just use it, or
alias it if you don't like it.


One could make much the same argument about the whole of C++. It works, it
has a precedent, it's already *there*, just use it.


The whole reason I came to D was because, at the time, D was more interested
in fixing C++'s idiocy than just merely aping C++ as the theme seems to be
now.


Nick, this isn't a feature, it's not a design, it's not a whole language,  
it's a *single name*, one which is easily changed if you want to change it.


module nick;

alias size_t wordsize;

Now you can use it anywhere, it's sooo freaking simple, I don't understand  
the outrage.


BTW, what I meant by it's already there is that any change to the  
size_t name would have to have some benefit besides being a different  
name, because it will break any code that currently uses it.  If this  
whole argument is to just add another alias, then I'll just stop reading  
this thread since it has no point.


-Steve


Re: Integer conversions too pedantic in 64-bit

2011-02-16 Thread spir

On 02/16/2011 12:21 PM, Don wrote:

spir wrote:

On 02/16/2011 03:07 AM, Jonathan M Davis wrote:

On Tuesday, February 15, 2011 15:13:33 spir wrote:

On 02/15/2011 11:24 PM, Jonathan M Davis wrote:

Is there some low level reason why size_t should be signed or something
I'm completely missing?


My personal issue with unsigned ints in general as implemented in C-like
languages is that the range of non-negative signed integers is half of the
range of corresponding unsigned integers (for same size).
* practically: known issues, and bugs if not checked by the language
* conceptually: contradicts the obvious idea that unsigned (aka naturals)
is a subset of signed (aka integers)


It's inevitable in any systems language. What are you going to do, throw away a
bit for unsigned integers? That's not acceptable for a systems language. On
some
level, you must live with the fact that you're running code on a specific
machine
with a specific set of constraints. Trying to do otherwise will pretty much
always harm efficiency. True, there are common bugs that might be better
prevented, but part of it ultimately comes down to the programmer having some
clue as to what they're doing. On some level, we want to prevent common bugs,
but the programmer can't have their hand held all the time either.


I cannot prove it, but I really think you're wrong on that.

First, the question of 1 bit. Think about this -- speaking of 64-bit sizes:
* 99.999% of all uses of unsigned fit under 2^63
* To benefit from the last bit, you must need to store a value 2^63
<= v < 2^64
* Not only this, you must hit a case where /any/ possible value for v
(depending on execution data) could be >= 2^63, but /all/ possible values for
v are guaranteed < 2^64
This can only be a very small fraction of the cases where your value does not fit
in 63 bits, don't you think? Has it ever happened to you (even in 32 bits)?
Something like: what luck! this value would not (always) fit in 31 bits,
but (due to this constraint) I can be sure it will fit in 32 bits (always,
whatever input data it depends on).
In fact, n bits do the job because (1) nearly all unsigned values are very
small, and (2) the size used at a time covers the memory range at the same time.

Upon efficiency, if unsigned is not a subset of signed, then at a low level
you may be forced to add checks in numerous utility routines, the kind
constantly used, everywhere one type may play with the other. I'm not sure
where the gain is.
Upon correctness, intuitively I guess (just a wild guess indeed) if unsigned
values form a subset of signed ones programmers will more easily reason
correctly about them.

Now, I perfectly understand the sacrifice of one bit sounds like a
sacrilege ;-)
(*)

Denis




(*) But you know, when as a young guy you have coded for 8 & 16-bit machines,
having 63 or 64...


Exactly. It is NOT the same as the 8 & 16 bit case. The thing is, the fraction
of cases where the MSB is important has been decreasing *exponentially* from
the 8-bit days. It really was necessary to use the entire address space (or
even more, in the case of segmented architecture on the 286![1]) to measure the
size of anything. D only supports 32 bit and higher, so it isn't hamstrung in
the way that C is.

Yes, there are still cases where you need every bit. But they are very, very
exceptional -- rare enough that I think the type could be called __uint, 
__ulong.


Add this: in the case where one needs exactly all 64 bits, then the proper type 
to use is exactly ulong.



[1] What was size_t on the 286 ?
Note that in the small memory model (all pointers 16 bits) it really was
possible to have an object of size 0x1_0000, because the code was in a
different address space.


Denis
--
_
vita es estrany
spir.wikidot.com



Re: Integer conversions too pedantic in 64-bit

2011-02-16 Thread gölgeliyele

On 2/16/11 9:09 AM, Steven Schveighoffer wrote:

On Tue, 15 Feb 2011 16:50:21 -0500, Nick Sabalausky a@a.a wrote:


Nick Sabalausky a@a.a wrote in message



module nick;

alias size_t wordsize;

Now you can use it anywhere, it's sooo freaking simple, I don't
understand the outrage.


But that is somewhat selfish. Given that size_t causes dissatisfaction with a 
lot of people, people will start creating their own aliases and then you 
end up having 5 different versions of it around. If this type is an 
important one for writing architecture-independent code that can take 
advantage of architectural limits, then we'd better not have 5 different 
names for it in common code.


I don't think changing stuff like this should be disruptive. size_t can 
be marked deprecated and could be removed in a future release, giving 
people enough time to adapt.


Furthermore, with the 64-bit support in dmd approaching, this is the 
time to do it, if ever.




Re: Integer conversions too pedantic in 64-bit

2011-02-16 Thread Steven Schveighoffer

On Wed, 16 Feb 2011 09:23:09 -0500, gölgeliyele usul...@gmail.com wrote:


On 2/16/11 9:09 AM, Steven Schveighoffer wrote:

On Tue, 15 Feb 2011 16:50:21 -0500, Nick Sabalausky a@a.a wrote:


Nick Sabalausky a@a.a wrote in message



module nick;

alias size_t wordsize;

Now you can use it anywhere, it's sooo freaking simple, I don't
understand the outrage.


But that is somewhat selfish. Given that size_t causes dissatisfaction with a  
lot of people, people will start creating their own aliases and then you  
end up having 5 different versions of it around. If this type is an  
important one for writing architecture-independent code that can take  
advantage of architectural limits, then we'd better not have 5 different  
names for it in common code.


Sir, you've heard from the men who don't like size_t.  But what about the  
silent masses who do?


So we change it.  And then people don't like what it's changed to, for  
example, I might like size_t or already have lots of code that uses  
size_t.  So I alias your new name to size_t in my code.  How does this  
make things better/different?


bearophile doesn't like writeln.  He uses something else in his libs, it's  
just an alias.  Does that mean we should change writeln?


IT'S A NAME!!! one which many are used to using/knowing.  Whatever name it  
is, you just learn it, and once you know it, you just use it.  If we  
hadn't been using it for the last 10 years, I'd say, sure, let's have a  
vote and decide on a name.  You can't please everyone with every name.   
size_t isn't so terrible that it needs to be changed, so can we focus  
efforts on actual important things?  This is the sheddiest bikeshed  
argument I've seen in a while.


I'm done with this thread...

-Steve


Re: Integer conversions too pedantic in 64-bit

2011-02-16 Thread gölgeliyele

On 2/16/11 9:45 AM, Steven Schveighoffer wrote:



I'm done with this thread...

-Steve


Ok, I don't want to drag on. But there is a reason why we have a style. 
size_t is against the D style and obviously does not match. I use size_t 
as much as Walter does in my day job, and I even like it. It just does 
not fit into D's type names. That is all.


Re: Integer conversions too pedantic in 64-bit

2011-02-16 Thread Jonathan M Davis
On Wednesday, February 16, 2011 06:51:21 gölgeliyele wrote:
 On 2/16/11 9:45 AM, Steven Schveighoffer wrote:
  I'm done with this thread...
  
  -Steve
 
 Ok, I don't want to drag on. But there is a reason why we have a style.
 size_t is against the D style and obviously does not match. I use size_t
 as much as Walter does in my day job, and I even like it. It just does
 not fit into D's type names. That is all.

If we were much earlier in the D development process, then perhaps it would 
make some sense to change the name. But as it is, it's going to break a lot 
of code for a simple name change. Lots of C, C++, and D programmers are fine 
with size_t. I see no reason to break a ton of code just because a few people 
complain about a name on the mailing list.

Not to mention, size_t isn't exactly normal anyway. Virtually every type in D 
has a fixed size, but size_t is different. It's an alias whose size varies 
depending on the architecture you're compiling on. As such, perhaps that fact 
that it doesn't follow the normal naming scheme is a _good_ thing.

I tend to agree with Steve on this. This is core language stuff that's been 
the way that it is since the beginning. Changing it is just going to break 
code and cause even more headaches for porting code from C or C++ to D. This 
definitely comes across as bikeshedding. If we were way earlier in the 
development process of D, then I think that there would be a much better 
argument. But at this point, the language spec is supposed to be essentially 
stable. And just because the name doesn't quite fit in with the others is 
_not_ a good enough reason to go and change the language spec.

- Jonathan M Davis


Re: Integer conversions too pedantic in 64-bit

2011-02-16 Thread Daniel Gibson
On 16.02.2011 19:20, Jonathan M Davis wrote:
 On Wednesday, February 16, 2011 06:51:21 gölgeliyele wrote:
 On 2/16/11 9:45 AM, Steven Schveighoffer wrote:
 I'm done with this thread...

 -Steve

 Ok, I don't want to drag on. But there is a reason why we have a style.
 size_t is against the D style and obviously does not match. I use size_t
 as much as Walter does in my day job, and I even like it. It just does
 not fit into D's type names. That is all.
 
 If we were much earlier in the D development process, then perhaps it would 
 make 
 some sense to change the name. But as it is, it's going to break a lot of 
 code 
 for a simple name change. Lots of C, C++, and D programmers are fine with 
 size_t. 
 I see no reason to break a ton of code just because a few people complain 
 about 
 a name on the mailing list.
 
 Not to mention, size_t isn't exactly normal anyway. Virtually every type in D 
 has a fixed size, but size_t is different. It's an alias whose size varies 
 depending on the architecture you're compiling on. As such, perhaps that fact 
 that it doesn't follow the normal naming scheme is a _good_ thing.
 
 I tend to agree with Steve on this. This is core language stuff that's been 
 the 
 way that it is since the beginning. Changing it is just going to break code 
 and 
 cause even more headaches for porting code from C or C++ to D. This 
 definitely 
 comes across as bikeshedding. If we were way earlier in the development 
 process 
 of D, then I think that there would be a much better argument. But at this 
 point, the language spec is supposed to be essentially stable. And just 
 because 
 the name doesn't quite fit in with the others is _not_ a good enough reason 
 to go 
 and change the language spec.
 
 - Jonathan M Davis

Well IMHO it would be feasible to add another alias (keeping size_t), update
phobos to use the new alias and to recommend to use the new alias instead of 
size_t.
Or, even better, add a new *type* that behaves like size_t but prevents
non-portable use without explicit casting, use it throughout phobos and keep
size_t for compatibility reasons (and for interfacing with C).

But I really don't care much.. size_t is okay for me the way it is.
The best argument I've heard so far was from Michel Fortin, that having a more
D-ish name may encourage the use of size_t instead of uint - but hopefully
people will be more portability-aware once 64bit DMD is out anyway.

IMHO it's definitely too late (for D2) to add a better type that is signed etc,
like Don proposed. Also I'm not sure how well that would work when interfacing
with C.

It may make sense for the compiler to handle unsigned/signed comparisons and
operations more strictly or more securely (=> implicit casting to the next
bigger unsigned type before comparing or stuff like that), though.

Cheers,
- Daniel


Re: Integer conversions too pedantic in 64-bit

2011-02-16 Thread Walter Bright

Don wrote:

[1] What was size_t on the 286 ?


16 bits

Note that in the small memory model (all pointers 16 bits) it really was 
possible to have an object of size 0x1_0000, because the code was in 
a different address space.


Not really. I think the 286 had a hard limit of 16 Mb.

There was a so-called huge memory model which attempted (badly) to fake a 
linear address space across the segmented model. It never worked very well (such 
as having wacky problems when an object straddled a segment boundary), and 
applications built with it sucked in the performance dept. I never supported it 
for that reason.


A lot of the effort in 16 bit programming went to breaking up data structures so 
no individual part of it spanned more than 64K.


Re: Integer conversions too pedantic in 64-bit

2011-02-16 Thread Don

Walter Bright wrote:

Don wrote:

[1] What was size_t on the 286 ?




16 bits

Note that in the small memory model (all pointers 16 bits) it really 
was possible to have an object of size 0x1_0000, because the code 
was in a different address space.


Not really. I think the 286 had a hard limit of 16 Mb.


I mean, you can have a 16 bit code pointer, and a 16 bit data pointer. 
So, you can conceivably have a 64K data item, using the full size of size_t.
That isn't possible on a modern, linear address space, because the code 
has to go somewhere...




There was a so-called huge memory model which attempted (badly) to 
fake a linear address space across the segmented model. It never worked 
very well (such as having wacky problems when an object straddled a 
segment boundary), and applications built with it sucked in the 
performance dept. I never supported it for that reason.


A lot of the effort in 16 bit programming went to breaking up data 
structures so no individual part of it spanned more than 64K.


Yuck.
I just caught the very last of that era. I wrote a couple of 16-bit 
DLLs. From memory, you couldn't assume the stack was in the data 
segment, and you got horrific memory corruption if you did.

I've got no nostalgia for those days...



Re: Integer conversions too pedantic in 64-bit

2011-02-16 Thread Walter Bright

Don wrote:

Walter Bright wrote:

Don wrote:

[1] What was size_t on the 286 ?




16 bits

Note that in the small memory model (all pointers 16 bits) it really 
was possible to have an object of size 0x1_0000, because the code 
was in a different address space.


Not really. I think the 286 had a hard limit of 16 Mb.


I mean, you can have a 16 bit code pointer, and a 16 bit data pointer. 
So, you can conceivably have a 64K data item, using the full size of 
size_t.
That isn't possible on a modern, linear address space, because the code 
has to go somewhere...


Actually, you can have a segmented model on a 32 bit machine rather than a flat 
model, with separate segments for code, data, and stack. The Digital Mars DOS 
Extender actually does this. The advantage of it is you cannot execute data on 
the stack.


There was a so-called huge memory model which attempted (badly) to 
fake a linear address space across the segmented model. It never 
worked very well (such as having wacky problems when an object 
straddled a segment boundary), and applications built with it sucked 
in the performance dept. I never supported it for that reason.


A lot of the effort in 16 bit programming went to breaking up data 
structures so no individual part of it spanned more than 64K.


Yuck.
I just caught the very last of that era. I wrote a couple of 16-bit 
DLLs. From memory, you couldn't assume the stack was in the data 
segment, and you got horrific memory corruption if you did.

I've got no nostalgia for those days...


I rather enjoyed it, and the pay was good <g>.


Re: Integer conversions too pedantic in 64-bit

2011-02-16 Thread dsimcha
This whole conversation makes me feel like The Naive Noob for 
complaining about how much 32-bit address space limitations suck and we 
need 64-bit support.


On 2/16/2011 8:52 PM, Walter Bright wrote:

Don wrote:

Walter Bright wrote:

Don wrote:

[1] What was size_t on the 286 ?




16 bits


Note that in the small memory model (all pointers 16 bits) it really
was possible to have an object of size 0x1_0000, because the code
was in a different address space.


Not really. I think the 286 had a hard limit of 16 Mb.


I mean, you can have a 16 bit code pointer, and a 16 bit data pointer.
So, you can conceivably have a 64K data item, using the full size of
size_t.
That isn't possible on a modern, linear address space, because the
code has to go somewhere...


Actually, you can have a segmented model on a 32 bit machine rather than
a flat model, with separate segments for code, data, and stack. The
Digital Mars DOS Extender actually does this. The advantage of it is you
cannot execute data on the stack.


There was a so-called huge memory model which attempted (badly) to
fake a linear address space across the segmented model. It never
worked very well (such as having wacky problems when an object
straddled a segment boundary), and applications built with it sucked
in the performance dept. I never supported it for that reason.

A lot of the effort in 16 bit programming went to breaking up data
structures so no individual part of it spanned more than 64K.


Yuck.
I just caught the very last of that era. I wrote a couple of 16-bit
DLLs. From memory, you couldn't assume the stack was in the data
segment, and you got horrific memory corruption if you did.
I've got no nostalgia for those days...


I rather enjoyed it, and the pay was good <g>.




Re: Integer conversions too pedantic in 64-bit

2011-02-16 Thread Kevin Bealer
== Quote from spir (denis.s...@gmail.com)'s article
 On 02/16/2011 03:07 AM, Jonathan M Davis wrote:
  On Tuesday, February 15, 2011 15:13:33 spir wrote:
  On 02/15/2011 11:24 PM, Jonathan M Davis wrote:
  Is there some low level reason why size_t should be signed or something
  I'm completely missing?
 
  My personal issue with unsigned ints in general as implemented in C-like
  languages is that the range of non-negative signed integers is half of the
  range of corresponding unsigned integers (for same size).
  * practically: known issues, and bugs if not checked by the language
  * conceptually: contradicts the obvious idea that unsigned (aka naturals)
  is a subset of signed (aka integers)
 
  It's inevitable in any systems language. What are you going to do, throw 
  away a
  bit for unsigned integers? That's not acceptable for a systems language. On 
  some
  level, you must live with the fact that you're running code on a specific 
  machine
  with a specific set of constraints. Trying to do otherwise will pretty much
  always harm efficiency. True, there are common bugs that might be better
  prevented, but part of it ultimately comes down to the programmer having 
  some
  clue as to what they're doing. On some level, we want to prevent common 
  bugs,
  but the programmer can't have their hand held all the time either.
 I cannot prove it, but I really think you're wrong on that.
 First, the question of 1 bit. Think about this -- speaking of 64-bit size:
 * 99.999% of all uses of unsigned fit under 2^63
 * To benefit from the last bit, you must have the need to store a value 2^63 <=
 v < 2^64
 * Not only this, you must step on a case where /any/ possible value for v
 (depending on execution data) could be >= 2^63, but /all/ possible values for
 v are guaranteed < 2^64
 This can only be a very small fraction of cases where your value does not fit
 in 63 bits, don't you think. Has it ever happened to you (even in 32 bits)?
 Something like: what a luck! this value would not (always) fit in 31 bits, 
 but
 (due to this constraint), I can be sure it will fit in 32 bits (always,
 whatever input data it depends on).
 In fact, n bits do the job because (1) nearly all unsigned values are very
 small (2) the size used at a time covers the memory range at the same time.
 Upon efficiency, if unsigned is not a subset of signed, then at a low level 
 you
 may be forced to add checks in numerous utility routines, the kind constantly
 used, everywhere one type may play with the other. I'm not sure where the 
 gain is.
 Upon correctness, intuitively I guess (just a wild guess indeed) if unsigned
 values form a subset of signed ones programmers will more easily reason
 correctly about them.
 Now, I perfectly understand the sacrifice of one bit sounds like a 
 sacrilege ;-)
 (*)
 Denis
 (*) But you know, when as a young guy you have coded for 8 & 16-bit machines,
 having 63 or 64...

If you write low level code, it happens all the time.  For example, you can copy
memory areas quickly on some machines by treating them as arrays of long and
copying the values -- which requires the upper bit to be preserved.

Or you compute a 64 bit hash value using an algorithm that is part of some
standard protocol.  Oops -- requires an unsigned 64 bit number, the signed 
version
would produce the wrong result.  And since the standard expects normal behaving
int64's you are stuck -- you'd have to write a little class to simulate unsigned
64 bit math.  E.g. a library that computes md5 sums.

Not to mention all the code that uses 64 bit numbers as bit fields where the
different bits or sets of bits are really subfields of the total range of 
values.

What you are saying is true of high level code that models real life -- if the
value is someone's salary or the number of toasters they are buying from a store
you are probably fine -- but a lot of low level software (ipv4 stacks, video
encoders, databases, etc) are based on designs that require numbers to behave a
certain way, and losing a bit is going to be a pain.

I've run into this with Java, which lacks unsigned types, and once you run into 
a
case that needs that extra bit it gets annoying right quick.

Kevin


Re: Integer conversions too pedantic in 64-bit

2011-02-16 Thread Daniel Gibson
On 17.02.2011 05:19, Kevin Bealer wrote:
 == Quote from spir (denis.s...@gmail.com)'s article
 On 02/16/2011 03:07 AM, Jonathan M Davis wrote:
 On Tuesday, February 15, 2011 15:13:33 spir wrote:
 On 02/15/2011 11:24 PM, Jonathan M Davis wrote:
 Is there some low level reason why size_t should be signed or something
 I'm completely missing?

 My personal issue with unsigned ints in general as implemented in C-like
 languages is that the range of non-negative signed integers is half of the
 range of corresponding unsigned integers (for same size).
 * practically: known issues, and bugs if not checked by the language
 * conceptually: contradicts the obvious idea that unsigned (aka naturals)
 is a subset of signed (aka integers)

 It's inevitable in any systems language. What are you going to do, throw 
 away a
 bit for unsigned integers? That's not acceptable for a systems language. On 
 some
 level, you must live with the fact that you're running code on a specific 
 machine
 with a specific set of constraints. Trying to do otherwise will pretty much
 always harm efficiency. True, there are common bugs that might be better
 prevented, but part of it ultimately comes down to the programmer having 
 some
 clue as to what they're doing. On some level, we want to prevent common 
 bugs,
 but the programmer can't have their hand held all the time either.
 I cannot prove it, but I really think you're wrong on that.
 First, the question of 1 bit. Think about this -- speaking of 64-bit size:
 * 99.999% of all uses of unsigned fit under 2^63
 * To benefit from the last bit, you must have the need to store a value 2^63 <=
 v < 2^64
 * Not only this, you must step on a case where /any/ possible value for v
 (depending on execution data) could be >= 2^63, but /all/ possible values
 for v are guaranteed < 2^64
 This can only be a very small fraction of cases where your value does not fit
 in 63 bits, don't you think. Has it ever happened to you (even in 32 bits)?
 Something like: what a luck! this value would not (always) fit in 31 bits, 
 but
 (due to this constraint), I can be sure it will fit in 32 bits (always,
 whatever input data it depends on).
 In fact, n bits do the job because (1) nearly all unsigned values are very
 small (2) the size used at a time covers the memory range at the same time.
 Upon efficiency, if unsigned is not a subset of signed, then at a low level 
 you
 may be forced to add checks in numerous utility routines, the kind constantly
 used, everywhere one type may play with the other. I'm not sure where the 
 gain is.
 Upon correctness, intuitively I guess (just a wild guess indeed) if unsigned
 values form a subset of signed ones programmers will more easily reason
 correctly about them.
 Now, I perfectly understand the sacrifice of one bit sounds like a 
 sacrilege ;-)
 (*)
 Denis
 (*) But you know, when as a young guy you have coded for 8 & 16-bit machines,
 having 63 or 64...
 
 If you write low level code, it happens all the time.  For example, you can 
 copy
 memory areas quickly on some machines by treating them as arrays of long and
 copying the values -- which requires the upper bit to be preserved.
 
 Or you compute a 64 bit hash value using an algorithm that is part of some
 standard protocol.  Oops -- requires an unsigned 64 bit number, the signed 
 version
 would produce the wrong result.  And since the standard expects normal 
 behaving
 int64's you are stuck -- you'd have to write a little class to simulate 
 unsigned
 64 bit math.  E.g. a library that computes md5 sums.
 
 Not to mention all the code that uses 64 bit numbers as bit fields where the
 different bits or sets of bits are really subfields of the total range of 
 values.
 
 What you are saying is true of high level code that models real life -- if the
 value is someone's salary or the number of toasters they are buying from a 
 store
 you are probably fine -- but a lot of low level software (ipv4 stacks, video
 encoders, databases, etc) are based on designs that require numbers to behave 
 a
 certain way, and losing a bit is going to be a pain.
 
 I've run into this with Java, which lacks unsigned types, and once you run 
 into a
 case that needs that extra bit it gets annoying right quick.
 
 Kevin

It was not proposed to alter ulong (int64), but only a size_t equivalent. ;)
And I agree that not having unsigned types (like in Java) just sucks.
Wasn't Java even advertised as a programming language for network stuff? Quite
ridiculous without unsigned types..

Cheers,
- Daniel


Re: Integer conversions too pedantic in 64-bit

2011-02-16 Thread Nick Sabalausky
KennyTM~ kenn...@gmail.com wrote in message 
news:ijghne$ts1$1...@digitalmars.com...
 On Feb 16, 11 11:49, Michel Fortin wrote:
 On 2011-02-15 22:41:32 -0500, Nick Sabalausky a@a.a said:

 I like nint.

 But is it unsigned or signed? Do we need 'unint' too?

 I think 'word' & 'uword' would be a better choice. I can't say I'm too
 displeased with 'size_t', but it's true that the 'size_t' feels out of
 place in D code because of its name.



 'word' may be confusing to Windows programmers because in WinAPI a 'WORD' 
 means an unsigned 16-bit integer (aka 'ushort').

 http://msdn.microsoft.com/en-us/library/cc230402(v=PROT.10).aspx

That's just a legacy issue from when Windows was mainly on 16-bit machines. 
Word means native size.




Re: Integer conversions too pedantic in 64-bit

2011-02-15 Thread Jacob Carlborg

On 2011-02-15 01:08, Walter Bright wrote:

dsimcha wrote:

Now that DMD has a 64-bit beta available, I'm working on getting a whole 
bunch of code to compile in 64 mode. Frankly, the compiler is way too 
freakin' pedantic when it comes to implicit conversions (or lack thereof) 
of array.length. 99.999% of the time it's safe to assume an array is not 
going to be over 4 billion elements long. I'd rather have a bug the 0.001% 
of the time than deal with the pedantic errors the rest of the time, 
because I think it would be less total time and effort invested. To force 
me to either put casts in my code everywhere or change my entire codebase 
to use wider integers (with ripple effects just about everywhere) strikes 
me as purity winning out over practicality.


We dealt with that in updating Phobos/Druntime to 64 bits. The end
result was worth it (and yes, there would have been undiscovered bugs
without those pedantic checks).

Most of the issues are solved if you use auto and foreach where
possible, and size_t for the rest of the cases.


Yes, exactly, what's the reason not to use size_t. I've used size_t for 
length and index in arrays for as long as I've been using D.


--
/Jacob Carlborg


Re: Integer conversions too pedantic in 64-bit

2011-02-15 Thread spir

On 02/15/2011 02:28 AM, Jonathan M Davis wrote:

On Monday, February 14, 2011 17:06:43 spir wrote:

On 02/15/2011 01:56 AM, Jonathan M Davis wrote:

On Monday, February 14, 2011 16:30:09 Andrej Mitrovic wrote:

Here's something I've noticed (x86 code):

void main()
{

  ulong size = 2;
  int[] arr = new int[](size);

}

This will error with:
sizetTest.d(8): Error: cannot implicitly convert expression (size) of
type ulong to uint

size_t is aliased to uint since I'm running 32bit.

I'm really not experienced at all with 64bit, so I don't know if it's
good to use uint explicitly (my hunch is that it's not good). uint as
the array size wouldn't even compile in 64bit, right?

If I'm correct, wouldn't it be better if the error showed that it
expects size_t which might be aliased to whatever type for a
particular machine?


Use size_t. It's the type which is used. It's aliased to whichever type
is appropriate for the architecture. On 32 bits, that would be a 32 bit
integer, so it's uint. On 64 bits, that would be a 64 bit integer, so
it's ulong.


Rename size_t, or rather introduce a meaningful standard alias? (would vote
for Natural)


Why? size_t is what's used in C++. It's well known and what lots of programmers
would expect. What would you gain by renaming it?


Then state on D's front page:
   D is a language for C++ programmers...

size_t is wrong, wrong, wrong:
* Nothing says it's a type alias (should be Size).
* The name's morphology is weird.
* It does not even tell about semantics & usage: a majority of use cases is 
probably as indices! (ordinal, not cardinal as suggested by the name). 
(sizediff_t also exists, but seems unused in D)


Natural would be good according to all those points: the name tells it's 
unsigned, a natural number is either an ordinal or a cardinal, and it fits D 
style guidelines. Better proposals welcome :-)


Aliasing does /not/ mean removing size_t, just proposing a correct, sensible, 
and meaningful alternative. If people like it, and if using the correct name is 
encouraged, then after a few years the legacy garbage can finally be recycled ;-)
In any case, this alternative must be *standard*, for the whole community to 
know it. I have used Ordinal & Cardinal for a while, but stopped because of 
that: people reading my code had to guess a bit (not that hard, but still), or 
jump to declarations.


Again: size_t is /wrong/. The fact that for you it means what it means, due to 
your experience as a C++ programmer, does not change an iota (lol!) of its 
wrongness. If we never improve languages just because of mindless conservatism, 
then in 3 generations programmers will still be stuck with junk from the 1970's.


Denis
--
_
vita es estrany
spir.wikidot.com



Re: Integer conversions too pedantic in 64-bit

2011-02-15 Thread spir

On 02/15/2011 02:55 AM, Nick Sabalausky wrote:

Nick Sabalausky a@a.a wrote in message
news:ijcm8d$1lf5$1...@digitalmars.com...

spir denis.s...@gmail.com wrote in message
news:mailman.1648.1297732015.4748.digitalmar...@puremagic.com...


Rename size_t, or rather introduce a meaningful standard alias? (would
vote for Natural)



My bikeshed is painted native and word :)



...With some wordsize around the trim.


Not bad, but how does wordsize tell about usage (ordinal=index/position, 
cardinal=count/length) and semantics (unsigned)?
uint is rather good; actually means about the same as natural for me. But 
it's a bit cryptic and does not adapt to platform native word size, 
unfortunately. I use uint for now to avoid custom, but correct & meaningful, 
alias(es) for size_t. (I must have a blockage with using mindless terms like 
size_t ;-)



denis
--
_
vita es estrany
spir.wikidot.com



Re: Integer conversions too pedantic in 64-bit

2011-02-15 Thread spir

On 02/15/2011 02:58 AM, Nick Sabalausky wrote:

Jonathan M Davis jmdavisp...@gmx.com wrote in message
news:mailman.1650.1297733226.4748.digitalmar...@puremagic.com...

On Monday, February 14, 2011 17:06:43 spir wrote:


Rename size_t, or rather introduce a meaningful standard alias? (would
vote
for Natural)


Why? size_t is what's used in C++. It's well known and what lots of
programmers
would expect. What would you gain by renaming it?



Although I fully realize how much this sounds like making a big deal out of
nothing, to me, using size_t has always felt really clumsy and awkward. I
think it's partly because of using an underscore in such an otherwise short
identifier, and partly because I've been aware of size_t for years and still
don't have the slightest clue WTF that t means. Something like wordsize
would make a lot more sense and frankly feel much nicer.

And, of course, there's a lot of well-known things in C++ that D
deliberately destroys. D is a different language, it may as well do things
better.


Agreed. While making something different...
About the suffix _t, I bet it means type, what do you think? (may well be 
wrong, just because I have here and there seen custom types like name_t or 
point_t or such) Anyone have a history of C/C++ at hand?


Denis
--
_
vita es estrany
spir.wikidot.com



Re: Integer conversions too pedantic in 64-bit

2011-02-15 Thread spir

On 02/15/2011 03:11 AM, Don wrote:

Nick Sabalausky wrote:

Jonathan M Davis jmdavisp...@gmx.com wrote in message
news:mailman.1650.1297733226.4748.digitalmar...@puremagic.com...

On Monday, February 14, 2011 17:06:43 spir wrote:

Rename size-t, or rather introduce a meaningful standard alias? (would vote
for Natural)

Why? size_t is what's used in C++. It's well known and what lots of programmers
would expect. What would you gain by renaming it?



Although I fully realize how much this sounds like making a big deal out of
nothing, to me, using size_t has always felt really clumsy and awkward. I
think it's partly because of using an underscore in such an otherwise short
identifier, and partly because I've been aware of size_t for years and still
don't have the slightest clue WTF that t means. Something like wordsize
would make a lot more sense and frankly feel much nicer.

And, of course, there's a lot of well-known things in C++ that D deliberately
destroys. D is a different language, it may as well do things better.


To my mind, a bigger problem is that size_t is WRONG. It should be an integer.
NOT unsigned.


That would /also/ solve dark corner issues and bugs. Let us define a standard 
alias to be used for indices, lengths, and such, and take the opportunity to 
give it a meaningful name. Then let core and lib functions expect and return 
integers. But this is a hard path, don't you think?
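[To illustrate the dark corner alluded to above, here is a minimal D sketch, mine and not from the thread, of the classic unsigned-index bug that signed indices would avoid:]

```d
void main()
{
    int[] a = [1, 2, 3];

    // Unsigned pitfall: with an unsigned i, "i >= 0" is always true, and
    // a.length - 1 wraps to the maximum value for an empty array.
    // for (size_t i = a.length - 1; i >= 0; --i) {}  // never terminates

    // A safe reverse-iteration idiom with an unsigned index:
    int sum = 0;
    for (auto i = a.length; i-- > 0; )
        sum += a[i];
    assert(sum == 6);
}
```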


Denis
--
_
vita es estrany
spir.wikidot.com



Re: Integer conversions too pedantic in 64-bit

2011-02-15 Thread spir

On 02/15/2011 03:26 AM, Jonathan M Davis wrote:

On Monday, February 14, 2011 18:19:35 Nick Sabalausky wrote:

Jonathan M Davisjmdavisp...@gmx.com  wrote in message
news:mailman.1655.1297736016.4748.digitalmar...@puremagic.com...


I believe that t is for type. The same goes for types such as time_t. The
size
part of the name is probably meant to be short for either word size or
pointer
size.

Personally, I see nothing wrong with size_t and see no reason to change
it. If
it were a particularly bad name and there was a good suggestion for a
replacement, then perhaps I'd support changing it. But I see nothing
wrong with
size_t at all.


So it's (modified) hungarian notation? Didn't that go out with boy bands,
Matrix spoofs and dancing CG babies?


How is it hungarian notation? Hungarian notation puts the type of the variable
in the name. size_t _is_ the type. I don't see any relation to hungarian
notation. And I'm pretty sure that size_t predates the invention of hungarian
notation by a fair margin anyway.


size_t is not the type of size_t ;-)
For sure it is Hungarian notation. What is the type of size_t? Type (at least 
conceptually, even if D does not have first-class type values). Just what the name says.


denis
--
_
vita es estrany
spir.wikidot.com



Re: Integer conversions too pedantic in 64-bit

2011-02-15 Thread spir

On 02/15/2011 05:50 AM, Andrej Mitrovic wrote:

The question is then do you want to be more consistent with the
language (abolish size_t and make something nicer), or be consistent
with the known standards (C99, ISO, et al.).

I'd vote for a change, but I know it will never happen (even though it
just might not be too late if we're not coding for 64 bits yet). It's
hardcoded in the skin of C++ programmers, and Walter is at least one
of them.


We don't need a change in the sense of a replacement. We just need a /standard/ 
correct and meaningful alternative. It must be standard to be shared wealth 
of the community, thus defined in the core stdlib or wherever (as opposed to 
people using their own terms, all different, as I did for a while).


alias size_t GoodTypeName; // always available

Possibly in a while there would be a consensus to get rid of such historic junk 
as size_t, but that's a different step, and probably a later phase of the 
language's evolution imo. All we need now is to be able to use a good name for 
an unsigned type sized to the machine word and usable for indices, lengths, etc. 
Maybe the #1 type in real code, by the way, or is it string?
As long as such a name is not defined as standard, it may be counter-productive 
for the community, and annoying for others reading our code, to use our own 
preferred terms.


Denis
--
_
vita es estrany
spir.wikidot.com



Re: Integer conversions too pedantic in 64-bit

2011-02-15 Thread spir

On 02/15/2011 06:51 AM, Walter Bright wrote:

Andrej Mitrovic wrote:

The question is then do you want to be more consistent with the
language (abolish size_t and make something nicer), or be consistent
with the known standards (C99, ISO, et al.).

I'd vote for a change, but I know it will never happen (even though it
just might not be too late if we're not coding for 64 bits yet). It's
hardcoded in the skin of C++ programmers, and Walter is at least one
of them.


We also don't go around renaming should to shud, or use dvorak keyboards.

Having to constantly explain "use 'ourfancyname' instead of size_t, it
works exactly the same as size_t" is a waste of our time and potential users'
time.


Having to constantly explain that _t means type, that size does not mean 
size, what this type is supposed to mean instead, what it is used for in core 
and stdlib functionality, and what programmers are supposed to use it for... 
isn't this a waste of our time? This, only because the name is mindless?


Please, just allow others to have a correct, meaningful (and hopefully 
style-guide compliant) alternative -- defined as a standard just like size_t. And 
go on using size_t as you like.


denis
--
_
vita es estrany
spir.wikidot.com



Re: Integer conversions too pedantic in 64-bit

2011-02-15 Thread spir

On 02/15/2011 03:44 AM, Piotr Szturmaj wrote:

spir wrote:

Rename size-t, or rather introduce a meaningful standard alias? (would
vote for Natural)


Maybe ptrint and ptruint?


If ptr means pointer, then it's wrong: size_t is used for more than that, I 
guess. Strangely enough, while size may suggest it, .length does not return a 
size_t but a uint.


Denis
--
_
vita es estrany
spir.wikidot.com



Re: Integer conversions too pedantic in 64-bit

2011-02-15 Thread Daniel Gibson

Am 15.02.2011 12:50, schrieb spir:

On 02/15/2011 03:44 AM, Piotr Szturmaj wrote:

spir wrote:

Rename size-t, or rather introduce a meaningful standard alias? (would
vote for Natural)


Maybe ptrint and ptruint?


If ptr means pointer, then it's wrong: size-t is used for more than
that, I guess. Strangely enough, while size may suggest it, .length
does not return a size_t but an uint.

Denis


.length of what? An array?
I'm pretty sure it returns size_t.

Cheers,
- Daniel


Re: Integer conversions too pedantic in 64-bit

2011-02-15 Thread Daniel Gibson

Am 15.02.2011 11:30, schrieb spir:

On 02/15/2011 02:58 AM, Nick Sabalausky wrote:

Jonathan M Davisjmdavisp...@gmx.com wrote in message
news:mailman.1650.1297733226.4748.digitalmar...@puremagic.com...

On Monday, February 14, 2011 17:06:43 spir wrote:


Rename size-t, or rather introduce a meaningful standard alias? (would
vote
for Natural)


Why? size_t is what's used in C++. It's well known and what lots of
programmers
would expect What would you gain by renaming it?



Although I fully realize how much this sounds like making a big deal
out of
nothing, to me, using size_t has always felt really clumsy and
awkward. I
think it's partly because of using an underscore in such an otherwise
short
identifier, and partly because I've been aware of size_t for years and
still
don't have the slightest clue WTF that t means. Something like
wordsize
would make a lot more sense and frankly feel much nicer.

And, of course, there's a lot of well-known things in C++ that D
deliberately destroys. D is a different language, it may as well do
things
better.


Agreed. While making something different...
About the suffix -_t, I bet it means type, what do you think? (may
well be wrong, just because I have here and there seen custom types like
name_t or point_t or such) Anyone has a history of C/++ at hand?

Denis


I've seen _t in C code for typedef'ed types.

like
  struct foo_s { ... };
  typedef struct foo_s foo_t;

and then foo_t myfoo; myfoo.x = 42; etc
instead of struct foo_s myfoo; myfoo.x = 42; etc

and also stuff like
  typedef float vec_t;
  typedef vec_t vec3_t[3];


So it is used to indicate that it's an aliased type.

I don't see the problem with size_t.
It's the type used for sizes. sizeof(foo) (or foo.sizeof in D) uses it.
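[For reference, a minimal D sketch, not part of the original post, of where size_t shows up in practice:]

```d
import std.stdio;

void main()
{
    int[] arr = [1, 2, 3];

    // Both .length and .sizeof are typed as size_t.
    static assert(is(typeof(arr.length) == size_t));
    static assert(is(typeof(int.sizeof) == size_t));

    writeln(arr.length); // 3
    writeln(int.sizeof); // 4 on common platforms, since int is fixed at 32 bits
}
```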

Cheers,
- Daniel



Re: Integer conversions too pedantic in 64-bit

2011-02-15 Thread Piotr Szturmaj

spir wrote:

On 02/15/2011 03:44 AM, Piotr Szturmaj wrote:

spir wrote:

Rename size-t, or rather introduce a meaningful standard alias? (would
vote for Natural)


Maybe ptrint and ptruint?


If ptr means pointer, then it's wrong: size-t is used for more than
that, I guess. Strangely enough, while size may suggest it, .length
does not return a size_t but an uint.


The ptr prefix shows that the int/uint depends on the CPU word (32/64 bit), i.e. 
they have the same size as a pointer. However, it may lead to confusion 
as to which type - signed or unsigned - is right for the job.


Re: Integer conversions too pedantic in 64-bit

2011-02-15 Thread Steven Schveighoffer

On Mon, 14 Feb 2011 20:58:17 -0500, Nick Sabalausky a@a.a wrote:


Jonathan M Davis jmdavisp...@gmx.com wrote in message
news:mailman.1650.1297733226.4748.digitalmar...@puremagic.com...

On Monday, February 14, 2011 17:06:43 spir wrote:


Rename size-t, or rather introduce a meaningful standard alias? (would
vote
for Natural)


Why? size_t is what's used in C++. It's well known and what lots of
programmers
would expect What would you gain by renaming it?



Although I fully realize how much this sounds like making a big deal out  
of
nothing, to me, using size_t has always felt really clumsy and  
awkward. I
think it's partly because of using an underscore in such an otherwise  
short
identifier, and partly because I've been aware of size_t for years and  
still
don't have the slightest clue WTF that t means. Something like  
wordsize

would make a lot more sense and frankly feel much nicer.

And, of course, there's a lot of well-known things in C++ that D
deliberately destroys. D is a different language, it may as well do  
things

better.


Hey, bikeshedders, I found this cool easter-egg feature in D!  It's called  
alias!  Don't like the name of something?  Well you can change it!


alias size_t wordsize;

Now, you can use wordsize instead of size_t in your code, and the compiler  
doesn't care! (in fact, that's all size_t is anyways *hint hint*)


;)

-Steve


Re: Integer conversions too pedantic in 64-bit

2011-02-15 Thread Adam Ruppe
Sometimes I think we should troll the users a little and make
a release with names like so:

alias size_t
TypeUsedForArraySizes_Indexes_AndOtherRelatedTasksThatNeedAnUnsignedMachineSizeWord;

alias ptrdiff_t
TypeUsedForDifferencesBetweenPointers_ThatIs_ASignedMachineSizeWordAlsoUsableForOffsets;

alias iota lazyRangeThatGoesFromStartToFinishByTheGivenStepAmount;


Cash money says everyone would be demanding an emergency release with
shorter names. We'd argue for months about it... and probably settle
back where we started.


Re: Integer conversions too pedantic in 64-bit

2011-02-15 Thread spir

On 02/15/2011 02:01 PM, Daniel Gibson wrote:

Am 15.02.2011 12:50, schrieb spir:

On 02/15/2011 03:44 AM, Piotr Szturmaj wrote:

spir wrote:

Rename size-t, or rather introduce a meaningful standard alias? (would
vote for Natural)


Maybe ptrint and ptruint?


If ptr means pointer, then it's wrong: size-t is used for more than
that, I guess. Strangely enough, while size may suggest it, .length
does not return a size_t but an uint.

Denis


.length of what? An array?
I'm pretty sure it returns size_t.


import std.stdio;

unittest {
int[] ints; auto l = ints.length;
writeln(typeof(l).stringof);
}
press play ;-)

denis
--
_
vita es estrany
spir.wikidot.com



Re: Integer conversions too pedantic in 64-bit

2011-02-15 Thread spir

On 02/15/2011 02:36 PM, Steven Schveighoffer wrote:

On Mon, 14 Feb 2011 20:58:17 -0500, Nick Sabalausky a@a.a wrote:


Jonathan M Davis jmdavisp...@gmx.com wrote in message
news:mailman.1650.1297733226.4748.digitalmar...@puremagic.com...

On Monday, February 14, 2011 17:06:43 spir wrote:


Rename size-t, or rather introduce a meaningful standard alias? (would
vote
for Natural)


Why? size_t is what's used in C++. It's well known and what lots of
programmers
would expect What would you gain by renaming it?



Although I fully realize how much this sounds like making a big deal out of
nothing, to me, using size_t has always felt really clumsy and awkward. I
think it's partly because of using an underscore in such an otherwise short
identifier, and partly because I've been aware of size_t for years and still
don't have the slightest clue WTF that t means. Something like wordsize
would make a lot more sense and frankly feel much nicer.

And, of course, there's a lot of well-known things in C++ that D
deliberately destroys. D is a different language, it may as well do things
better.


Hey, bikeshedders, I found this cool easter-egg feature in D! It's called
alias! Don't like the name of something? Well you can change it!

alias size_t wordsize;

Now, you can use wordsize instead of size_t in your code, and the compiler
doesn't care! (in fact, that's all size_t is anyways *hint hint*)


Sure, but it's not the point of this one bikeshedding thread. If you do that, 
then you're the only one who knows what wordsize means. Good, maybe, for 
app-specific semantic notions (alias Employee[] Staff;); certainly not for 
types at the highest degree of general purpose like size_t. We need a standard 
alias.


Denis
--
_
vita es estrany
spir.wikidot.com



Re: Integer conversions too pedantic in 64-bit

2011-02-15 Thread Adam Ruppe
spir wrote:
 press play

Since size_t is an alias, you wouldn't see it's name anywhere
except the source code.


Re: Integer conversions too pedantic in 64-bit

2011-02-15 Thread Daniel Gibson

Am 15.02.2011 15:18, schrieb spir:

On 02/15/2011 02:01 PM, Daniel Gibson wrote:

Am 15.02.2011 12:50, schrieb spir:

On 02/15/2011 03:44 AM, Piotr Szturmaj wrote:

spir wrote:

Rename size-t, or rather introduce a meaningful standard alias? (would
vote for Natural)


Maybe ptrint and ptruint?


If ptr means pointer, then it's wrong: size-t is used for more than
that, I guess. Strangely enough, while size may suggest it, .length
does not return a size_t but an uint.

Denis


.length of what? An array?
I'm pretty sure it returns size_t.


unittest {
int[] ints; auto l = ints.length;
writeln(typeof(l).stringof);
}
press play ;-)

denis


void main() {
  size_t x;
  writefln(typeof(x).stringof);
}
try this, too ;-)

Because it's an alias, the information about size_t is gone at runtime and 
the real type is shown: uint in your case. (Here - gdc on amd64 - it's 
ulong.)


Cheers,
- Daniel


Re: Integer conversions too pedantic in 64-bit

2011-02-15 Thread Steven Schveighoffer

On Tue, 15 Feb 2011 09:26:21 -0500, spir denis.s...@gmail.com wrote:


On 02/15/2011 02:36 PM, Steven Schveighoffer wrote:




Hey, bikeshedders, I found this cool easter-egg feature in D! It's  
called

alias! Don't like the name of something? Well you can change it!

alias size_t wordsize;

Now, you can use wordsize instead of size_t in your code, and the  
compiler

doesn't care! (in fact, that's all size_t is anyways *hint hint*)


Sure, but it's not the point of this one bikeshedding thread. If you do  
that, then you're the only one who knows what wordsize means. Good,  
maybe, for app-specific semantic notions (alias Employee[] Staff;);  
certainly not for types at the highest degree of general purpose like  
size_t. We need a standard alias.


The standard alias is size_t.  If you don't like it, alias it to something  
else.  Why should I have to use something that's unfamiliar to me because  
you don't like size_t?


I guarantee whatever you came up with would not be liked by some people,  
so they would have to alias it, you can't please everyone.  size_t works,  
it has a precedent, it's already *there*, just use it, or alias it if you  
don't like it.


No offense, but this discussion is among the most pointless I've seen.

-Steve


Re: Integer conversions too pedantic in 64-bit

2011-02-15 Thread Jens Mueller
spir wrote:
 On 02/15/2011 02:01 PM, Daniel Gibson wrote:
 Am 15.02.2011 12:50, schrieb spir:
 On 02/15/2011 03:44 AM, Piotr Szturmaj wrote:
 spir wrote:
 Rename size-t, or rather introduce a meaningful standard alias? (would
 vote for Natural)
 
 Maybe ptrint and ptruint?
 
 If ptr means pointer, then it's wrong: size-t is used for more than
 that, I guess. Strangely enough, while size may suggest it, .length
 does not return a size_t but an uint.
 
 Denis
 
 .length of what? An array?
 I'm pretty sure it returns size_t.
 
 unittest {
 int[] ints; auto l = ints.length;
 writeln(typeof(l).stringof);
 }
 press play ;-)

I do not get it.
The above returns uint, which is fine because my dmd v2.051 is 32-bit
only. I.e. size_t is an alias for uint (see src/druntime/src/object_.d,
line 52).
But somehow I think you are implying it does not return size_t.
This is right in the sense that it does not return the alias name size_t
but it returns the aliased type name, namely uint. What's the problem?
This
writeln(size_t.stringof);
also returns uint.
I read that the compiler is free to return whatever name of an alias,
i.e. either the name of the alias or the name of the thing it was
aliased to (which can be again an alias). I do not understand the rule
for stringof (reading
http://www.digitalmars.com/d/2.0/property.html#stringof) but I never had
a problem.

Jens


Re: Integer conversions too pedantic in 64-bit

2011-02-15 Thread Iain Buclaw
== Quote from dsimcha (dsim...@yahoo.com)'s article
 Now that DMD has a 64-bit beta available, I'm working on getting a whole bunch
 of code to compile in 64 mode.  Frankly, the compiler is way too freakin'
 pedantic when it comes to implicit conversions (or lack thereof) of
 array.length.  99.999% of the time it's safe to assume an array is not going
 to be over 4 billion elements long.  I'd rather have a bug the 0.001% of the
 time than deal with the pedantic errors the rest of the time, because I think
 it would be less total time and effort invested.  To force me to either put
 casts in my code everywhere or change my entire codebase to use wider integers
 (with ripple effects just about everywhere) strikes me as purity winning out
 over practicality.

I have a similar grudge about shorts being implicitly converted to ints,
resulting in hundreds of unwanted casts thrown in everywhere, cluttering up code.

ie:
   short a, b, c;
   a = b + c;

Hidden implicit casts should die.
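[The complaint can be reproduced with a short D sketch, mine and for illustration only: integer promotion turns short arithmetic into int arithmetic, so the result cannot be assigned back without a cast.]

```d
void main()
{
    short a, b = 2, c = 3;

    // a = b + c;  // error: b + c is promoted to int, and int does not
    //             // implicitly convert back down to short

    a = cast(short)(b + c); // the explicit narrowing cast the post objects to
    assert(a == 5);
}
```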


Re: Integer conversions too pedantic in 64-bit

2011-02-15 Thread foobar
Steven Schveighoffer Wrote:

 On Tue, 15 Feb 2011 09:26:21 -0500, spir denis.s...@gmail.com wrote:
 
  On 02/15/2011 02:36 PM, Steven Schveighoffer wrote:
 
 
  Hey, bikeshedders, I found this cool easter-egg feature in D! It's  
  called
  alias! Don't like the name of something? Well you can change it!
 
  alias size_t wordsize;
 
  Now, you can use wordsize instead of size_t in your code, and the  
  compiler
  doesn't care! (in fact, that's all size_t is anyways *hint hint*)
 
  Sure, but it's not the point of this one bikeshedding thread. If you do  
  that, then you're the only one who knows what wordsize means. Good,  
  maybe, for app-specific semantic notions (alias Employee[] Staff;);  
  certainly not for types at the highest degree of general purpose like  
  size_t. We need a standard alias.
 
 The standard alias is size_t.  If you don't like it, alias it to something  
 else.  Why should I have to use something that's unfamiliar to me because  
 you don't like size_t?
 
 I guarantee whatever you came up with would not be liked by some people,  
 so they would have to alias it, you can't please everyone.  size_t works,  
 it has a precedent, it's already *there*, just use it, or alias it if you  
 don't like it.
 
 No offense, but this discussion is among the most pointless I've seen.
 
 -Steve

I disagree that the discussion is pointless. 
On the contrary, the OP pointed out some valid points:

1. size_t is inconsistent with D's style guide. The _t suffix is a C++ 
convention and not a D one. While it makes sense for [former?] C++ programmers, 
it will confuse newcomers to D from other languages who would expect the 
language to follow its own style guide. 
2. The proposed change is backwards compatible - the OP asked for an 
*additional* alias.
3. Generic concepts should belong to the standard library and not user code, 
which is also where size_t is already defined. 

IMO, we already have a byte type; it's plain common sense to extend this with a 
native word type. 


Re: Integer conversions too pedantic in 64-bit

2011-02-15 Thread bearophile
Daniel Gibson:

 void main() {
size_t x;
writefln(typeof(x).stringof);
 }
 try this, too ;-)
 
 Because it's an alias the information about size_t gone at runtime and 
 the real type is shown. uint in your case. (Here - gdc on amd64 - it's 
 ulong).

I think both typeof() and stringof are compile-time things.

And regarding lost alias information I suggest to do as Clang does:
http://d.puremagic.com/issues/show_bug.cgi?id=5004

Bye,
bearophile


Re: Integer conversions too pedantic in 64-bit

2011-02-15 Thread Walter Bright

spir wrote:
Having to constantly explain that _t means type, that size does not 
mean size, what this type is supposed to mean instead, what it is used 
for in core and stdlib functionality, and what programmers are supposed 
to use it for... isn't this a waste of our time? This, only because the 
name is mindless?


No, because there is a vast body of work that uses size_t and a vast body of 
programmers who know what it is and are totally used to it.




Re: Integer conversions too pedantic in 64-bit

2011-02-15 Thread Walter Bright

foobar wrote:
1.  that size_t is inconsistent with D's style guide. the _t suffix is a C++ convention and not a D one. While it makes sense for [former?] C++ programmers it will confuse newcomers to D from other languages that would expect the language to follow its own style guide. 


It's a C convention.


2. the proposed change is backwards compatible - the OP asked for an 
*additional* alias.


I do not believe that value is added by adding more and more aliases for the 
same thing. It makes the library large and complex but with no depth.


Re: Integer conversions too pedantic in 64-bit

2011-02-15 Thread Rainer Schuetze


I think David has raised a good point here that seems to have been lost 
in the discussion about naming.


Please note that the C name for the machine-word integer was usually 
called int. The C standard only specifies a minimum bit-size for the 
different types (see for example 
http://www.ericgiguere.com/articles/ansi-c-summary.html). Most 
current C++ implementations have identical int sizes, but now long 
is different. This approach has failed and has caused many headaches 
when porting software from one platform to another. D has recognized 
this and has explicitly defined the bit-size of the various integer 
types. That's good!


Now, with size_t the distinction between platforms creeps back into the 
language. It is everywhere across phobos, be it as length of ranges or 
size of containers. This can get viral, as everything that gets in touch 
with these values might have to stick to size_t. Is this really desired?


Consider saving an array to disk, trying to read it on another platform. 
How many bits should be written for the size of that array?


Consider a range that maps the contents of a file. The file can be 
larger than 4GB, though a lot of the ranges that wrap the file mapping 
range will truncate the length to 32 bit on 32-bit platforms.


I don't have a perfect solution, but maybe builtin arrays could be 
limited to 2^^32-1 elements (or maybe 2^^31-1 to get rid of endless 
signed/unsigned conversions), so the normal type to be used is still 
int. Ranges should adopt the type sizes of the underlying objects.


Agreed, a type for the machine word integer must exist, and I don't care 
how it is called, but I would like to see its usage restricted to rare 
cases.


Rainer


dsimcha wrote:

Now that DMD has a 64-bit beta available, I'm working on getting a whole bunch
of code to compile in 64 mode.  Frankly, the compiler is way too freakin'
pedantic when it comes to implicit conversions (or lack thereof) of
array.length.  99.999% of the time it's safe to assume an array is not going
to be over 4 billion elements long.  I'd rather have a bug the 0.001% of the
time than deal with the pedantic errors the rest of the time, because I think
it would be less total time and effort invested.  To force me to either put
casts in my code everywhere or change my entire codebase to use wider integers
(with ripple effects just about everywhere) strikes me as purity winning out
over practicality.


Re: Integer conversions too pedantic in 64-bit

2011-02-15 Thread so

I disagree that the discussion is pointless.
On the contrary, the OP pointed out some valid points:

1.  that size_t is inconsistent with D's style guide. the _t suffix is  
a C++ convention and not a D one. While it makes sense for [former?] C++  
programmers it will confuse newcomers to D from other languages that  
would expect the language to follow its own style guide.
2. the proposed change is backwards compatible - the OP asked for an  
*additional* alias.
3. generic concepts should belong to the standard library and not user  
code which is also where size_t is already defined.


IMO, we already have a byte type, it's plain common sense to extend this  
with a native word type.


Funny thing is the most important argument against size_t got the least  
attention.

I will leave it as an exercise for the reader.


Re: Integer conversions too pedantic in 64-bit

2011-02-15 Thread spir

On 02/15/2011 08:05 PM, Walter Bright wrote:

foobar wrote:

1. that size_t is inconsistent with D's style guide. the _t suffix is a C++
convention and not a D one. While it makes sense for [former?] C++
programmers it will confuse newcomers to D from other languages that would
expect the language to follow its own style guide.


It's a C convention.


2. the proposed change is backwards compatible - the OP asked for an
*additional* alias.


I do not believe that value is added by adding more and more aliases for the
same thing. It makes the library large and complex but with no depth.


If we asked for various aliases for numerous builtin terms of the language, 
your point would be fully valid. But here only a single standard alias is asked 
for, for what may well be the most used type in the language; which presently 
has an obscure name.

Cost: one line of code in object.d:
alias typeof(int.sizeof) size_t;
alias typeof(int.sizeof) Abcdef; // add this

As an aside, the opportunity may be taken to use machine-word-size signed 
values as a standard for indices/positions and sizes/counts/lengths (and 
offsets?), everywhere in the language, for the coming 64-bit version. Don, 
IIRC, and Bearophile referred to issues due to unsigned values.
This would also give an obvious name for the alias, Integer, which probably 
few would contest (I hope so).


Denis
--
_
vita es estrany
spir.wikidot.com



Re: Integer conversions too pedantic in 64-bit

2011-02-15 Thread spir

On 02/15/2011 03:25 PM, Daniel Gibson wrote:

Am 15.02.2011 15:18, schrieb spir:

On 02/15/2011 02:01 PM, Daniel Gibson wrote:

Am 15.02.2011 12:50, schrieb spir:

On 02/15/2011 03:44 AM, Piotr Szturmaj wrote:

spir wrote:

Rename size-t, or rather introduce a meaningful standard alias? (would
vote for Natural)


Maybe ptrint and ptruint?


If ptr means pointer, then it's wrong: size-t is used for more than
that, I guess. Strangely enough, while size may suggest it, .length
does not return a size_t but an uint.

Denis


.length of what? An array?
I'm pretty sure it returns size_t.


unittest {
int[] ints; auto l = ints.length;
writeln(typeof(l).stringof);
}
press play ;-)

denis


void main() {
size_t x;
writefln(typeof(x).stringof);
}
try this, too ;-)

Because it's an alias the information about size_t gone at runtime and the
real type is shown. uint in your case. (Here - gdc on amd64 - it's ulong).


Oops, you're right! I had not yet realised that names are de-aliased on output.

denis
--
_
vita es estrany
spir.wikidot.com



Re: Integer conversions too pedantic in 64-bit

2011-02-15 Thread Steven Schveighoffer
On Tue, 15 Feb 2011 14:15:06 -0500, Rainer Schuetze r.sagita...@gmx.de  
wrote:




I think David has raised a good point here that seems to have been lost  
in the discussion about naming.


Please note that the C name of the machine word integer was usually  
called int. The C standard only specifies a minimum bit-size for the  
different types (see for example  
http://www.ericgiguere.com/articles/ansi-c-summary.html). Most of  
current C++ implementations have identical int sizes, but now long  
is different. This approach has failed and has caused many headaches  
when porting software from one platform to another. D has recognized  
this and has explicitely defined the bit-size of the various integer  
types. That's good!


Now, with size_t the distinction between platforms creeps back into the  
language. It is everywhere across phobos, be it as length of ranges or  
size of containers. This can get viral, as everything that gets in touch  
with these values might have to stick to size_t. Is this really desired?


Do you really want portable code?  The thing is, size_t is specifically  
defined to be *the word size*, whereas C defines int as a fuzzy size:  
"should be at least 16 bits, and recommended to be equivalent to the  
natural size of the machine".  size_t is *guaranteed* to be the same size  
on the same platform, even among different compilers.


In addition size_t isn't actually defined by the compiler.  So the library  
controls the size of size_t, not the compiler.  This should make it  
extremely portable.
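[Indeed, druntime defines size_t roughly as sketched below; the name word_t here is hypothetical, chosen only to avoid clashing with the real alias:]

```d
// Roughly how druntime's object.d defines size_t: the type of the
// .sizeof property, which is always the platform's word size.
alias typeof(int.sizeof) word_t; // word_t is a hypothetical name

// Word-sized on any platform, whichever compiler is used.
static assert(word_t.sizeof == (void*).sizeof);
static assert(is(word_t == size_t));

void main() {}
```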


Consider saving an array to disk, trying to read it on another platform.  
How many bits should be written for the size of that array?


It depends on the protocol or file format definition.  It should be  
irrelevant what platform/architecture you are on.  Any format or protocol  
worth its salt will define what size integers you should store.


Then you need a protocol implementation that converts between the native  
size and the stored size.


This is just like network endianness vs. host endianness.  You always use  
htonl and ntohl even if your platform has the same endianness as the  
network, because you want your code to be portable.  Not using them is a  
no-no even if it works fine on your big-endian system.
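[A minimal D sketch of the idea, assuming Phobos' std.bitmanip (which provides nativeToBigEndian/bigEndianToNative): store the length as a fixed 64-bit big-endian value, independent of the host's word size or endianness.]

```d
import std.bitmanip : bigEndianToNative, nativeToBigEndian;

void main()
{
    // Always serialize the length as 8 bytes, big-endian ("network order"),
    // regardless of whether size_t is 32 or 64 bits on this host.
    ulong len = 5_000_000_000UL;
    ubyte[8] wire = nativeToBigEndian(len);

    // The reader converts back to the native representation.
    ulong back = bigEndianToNative!ulong(wire);
    assert(back == len);
}
```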


I don't have a perfect solution, but maybe builtin arrays could be  
limited to 2^^32-1 elements (or maybe 2^^31-1 to get rid of endless  
signed/unsigned conversions), so the normal type to be used is still  
int. Ranges should adopt the type sizes of the underlying objects.


No, this is too limiting.  If I have 64GB of memory (not out of the  
question), and I want to have a 5GB array, I think I should be allowed  
to.  This is one of the main reasons to go to 64-bit in the first place.


-Steve


Re: Integer conversions too pedantic in 64-bit

2011-02-15 Thread Nick Sabalausky
bearophile bearophileh...@lycos.com wrote in message 
news:ijefj9$25sm$1...@digitalmars.com...
 Daniel Gibson:

 void main() {
size_t x;
writefln(typeof(x).stringof);
 }
 try this, too ;-)

 Because it's an alias the information about size_t gone at runtime and
 the real type is shown. uint in your case. (Here - gdc on amd64 - it's
 ulong).

 I think both typeof() and stringof are compile-time things.

 And regarding lost alias information I suggest to do as Clang does:
 http://d.puremagic.com/issues/show_bug.cgi?id=5004


That would *really* be nice. In my Goldie parsing lib, I make heavy use of 
templated aliases to provide maximally-reader-friendly types for 
strongly-typed tokens (ie, if the programmer desires, each symbol and each 
production rule has its own type, to ensure maximum compile-time safety). 
These aliases wrap much less readable internal types. Expecting the user to 
decipher the internal type in every error message is not nice.




Re: Integer conversions too pedantic in 64-bit

2011-02-15 Thread Nick Sabalausky
Walter Bright newshou...@digitalmars.com wrote in message 
news:ijeil4$2aso$3...@digitalmars.com...
 spir wrote:
 Having to constantly explain that _t means type, that size does not 
 mean size, what this type is supposed to mean instead, what it is used 
 for in core and stdlib functionality, and what programmers are supposed 
 to use it for... isn't this a waste of our time? This, only because the 
 name is mindless?

 No, because there is a vast body of work that uses size_t and a vast body 
 of programmers who know what it is and are totally used to it.


And there's a vast body who don't.

And there's a vast body who are used to C++, so let's just abandon D and 
make it an implementation of C++ instead.





Re: Integer conversions too pedantic in 64-bit

2011-02-15 Thread Nick Sabalausky
Jens Mueller jens.k.muel...@gmx.de wrote in message 
news:mailman.1694.1297781518.4748.digitalmar...@puremagic.com...

 I read that the compiler is free to return either name for an alias,
 i.e. the name of the alias itself or the name of the thing it was
 aliased to (which can again be an alias). I do not understand the rule
 for stringof (reading
 http://www.digitalmars.com/d/2.0/property.html#stringof) but I never had
 a problem.


DMD itself has never really understood stringof.




Re: Integer conversions too pedantic in 64-bit

2011-02-15 Thread Daniel Gibson
Am 15.02.2011 19:10, schrieb bearophile:
 Daniel Gibson:
 
 void main() {
     size_t x;
     writefln(typeof(x).stringof);
 }
 try this, too ;-)

 Because it's an alias, the information about size_t is gone at runtime and 
 the real type is shown: uint in your case. (Here - gdc on amd64 - it's 
 ulong).
 
 I think both typeof() and stringof are compile-time things.
 
 And regarding lost alias information I suggest to do as Clang does:
 http://d.puremagic.com/issues/show_bug.cgi?id=5004
 
 Bye,
 bearophile

Hmm yeah, you're probably right. After sending my reply I thought about that
myself.
However, by the time typeof() is handled by the compiler, the aliases are
already resolved.

I agree that an "aka" annotation for alias information in error messages would
be helpful in general, but it wouldn't help here.



Re: Integer conversions too pedantic in 64-bit

2011-02-15 Thread Walter Bright

Nick Sabalausky wrote:
Walter Bright newshou...@digitalmars.com wrote in message 
news:ijeil4$2aso$3...@digitalmars.com...

spir wrote:
Having to constantly explain that _t means type, that size does not 
mean size, what this type is supposed to mean instead, what it is used 
for in core and stdlib functionality, and what programmers are supposed 
to use it for... isn't this a waste of our time? This, only because the 
name is mindless?
No, because there is a vast body of work that uses size_t and a vast body 
of programmers who know what it is and are totally used to it.




And there's a vast body who don't.

And there's a vast body who are used to C++, so let's just abandon D and 
make it an implementation of C++ instead.


I would agree that D is a complete waste of time if all it consisted of was 
renaming things.


Re: Integer conversions too pedantic in 64-bit

2011-02-15 Thread Daniel Gibson
Am 15.02.2011 22:20, schrieb Nick Sabalausky:
 Walter Bright newshou...@digitalmars.com wrote in message 
 news:ijeil4$2aso$3...@digitalmars.com...
 spir wrote:
 Having to constantly explain that _t means type, that size does not 
 mean size, what this type is supposed to mean instead, what it is used 
 for in core and stdlib functionality, and what programmers are supposed 
 to use it for... isn't this a waste of our time? This, only because the 
 name is mindless?

 No, because there is a vast body of work that uses size_t and a vast body 
 of programmers who know what it is and are totally used to it.

 
 And there's a vast body who don't.
 

They've got to learn some name for it anyway, so why not size_t?
This also makes using C functions that take size_t easier and clearer.

 And there's a vast body who are used to C++, so let's just abandon D and 
 make it an implementation of C++ instead.
 

