Re: Using standardized SI prefixes

2007-06-13 Thread Scott James Remnant
On Wed, 2007-06-13 at 12:51 +0200, Christof Krüger wrote:

 On Tue, 2007-06-12 at 15:52 +0100, Ian Jackson wrote:
  shirish writes (Using standardized SI prefixes):
 Please look at http://en.wikipedia.org/wiki/Binary_prefix .
  
  Urgh, these things are ugly and an abomination.  We should avoid them.
  
 I'd really like to hear some real arguments against SI prefixes, besides
 being ugly or funny to pronounce or just because it has always been
 like that. Advantages of using SI prefixes have been mentioned in this
 thread. Please tell me the disadvantages so there can actually be a
 constructive discussion.
 
User Confusion.

Most users do not know what a tebibyte is, and they do not care.  They
know that a terabyte is about a million million bytes, and that is
sufficient.

Since you're rounding anyway, the loss of accuracy between about a
million million bytes and just over a million million bytes is not
significant.  Certainly not at the expense of having to teach users
another new unit.

Hard drives are bought in gigabytes, memory is bought in gigabytes, etc.
Quoting the same figures with a different unit in the operating system
is pedantry for its own sake.

Users have already learnt that the term gigabyte is approximate.

Introducing new units has only added confusion, rather than removed it.

Before the new units, we all knew that 1GB was an approximate figure and
likely to be (for bytes) based on a power of 2.  Now we have figures
quoted in GB and GiB, some of which are power of 10, some of which are
power of 2.  Some figures quoted in GiB are wrong, and should be in GB;
likewise some in GB should be GiB.  And we still have many figures in
both GB and GiB which are neither of the two!

Renaming the 1.44MB floppy helps in neither case; it is neither 1.44MB
nor 1.44MiB.  One could name it the 1.47MB or 1.4MiB floppy and confuse
everyone into thinking it's a different thing, of course.  Or maybe it
should be the 1,475KB floppy, or the 1,440KiB floppy?  Neither of these
helps the situation.
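
For reference, the floppy arithmetic can be checked with a short sketch. It assumes the usual high-density 3.5" geometry (2 sides, 80 tracks, 18 sectors per track, 512 bytes per sector); the variable names are only illustrative.

```python
# Assumed geometry of the "1.44MB" 3.5" high-density floppy.
SIDES, TRACKS, SECTORS, SECTOR_BYTES = 2, 80, 18, 512

floppy_bytes = SIDES * TRACKS * SECTORS * SECTOR_BYTES

print(f"bytes:                 {floppy_bytes:,}")
print(f"decimal MB (10^6):     {floppy_bytes / 10**6:.3f}")
print(f"binary MiB (2^20):     {floppy_bytes / 2**20:.3f}")
# The marketing figure mixes both bases: 1 "MB" = 1000 * 1024 bytes.
print(f"mixed 'MB' (1000*1024): {floppy_bytes / (1000 * 1024):.2f}")
```

The "1.44" only appears in the last line, where a decimal thousand is multiplied by a binary kilobyte.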

Without the binary unit to consider, when we quote a drive as 1TB, we
know that it has *at least* 1,000,000,000,000 bytes available.
Depending on the drive, it may have anywhere between this and
1,099,511,627,776 bytes available.  It's actually more likely to have
something strange like 1,024,000,000,000 available.

(And none of this takes into account partitioning and filesystem
overhead!)
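
A small sketch of the three byte counts mentioned above, expressed in both decimal TB and binary TiB. The labels and constant names are illustrative only; no real drive is being described.

```python
TB_DECIMAL = 10**12        # 1,000,000,000,000
TIB_BINARY = 2**40         # 1,099,511,627,776
MIXED = 1024 * 10**9       # 1,024,000,000,000 (the "something strange" case)

for label, n in [("decimal", TB_DECIMAL), ("mixed", MIXED), ("binary", TIB_BINARY)]:
    # Show the same byte count under both interpretations of "tera".
    print(f"{label:<8} {n:>19,} bytes = {n / TB_DECIMAL:.3f} TB = {n / TIB_BINARY:.3f} TiB")
```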

I see no problem with this 1TB quote being approximate.  It's rounded
anyway.  If you really want to know how many bytes are available, you
can use this great unit called the byte which is accurate and not
subject to change[0].

Scott

[0] Unless you're older than 25.
-- 
Scott James Remnant
Ubuntu Development Manager
[EMAIL PROTECTED]


signature.asc
Description: This is a digitally signed message part
-- 
Ubuntu-devel-discuss mailing list
Ubuntu-devel-discuss@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/ubuntu-devel-discuss


Re: Using standardized SI prefixes

2007-06-13 Thread Alex Jones
On Wed, 2007-06-13 at 14:29 +0100, Scott James Remnant wrote:
 Without the binary unit to consider, when we quote a drive as 1TB, we
 know that it has *at least* 1,000,000,000,000 bytes available.
 Depending on the drive, it may have anywhere between this and
 1,099,511,627,776 bytes available.  It's actually more likely to have
 something strange like 1,024,000,000,000 available.

10% error is no good for me. You can continue to play the "at least"
card, but what about when it's more important that it is "at most"
something? And seeing as this error only goes up exponentially, at which
prefix do you draw the line and say "no more"?
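
The growth of the gap can be made concrete with a minimal sketch. The helper name `binary_excess` is made up for this illustration; it computes how much larger 1024^n is than 1000^n for each prefix.

```python
PREFIXES = ["kilo", "mega", "giga", "tera", "peta", "exa", "zetta", "yotta"]


def binary_excess(n: int) -> float:
    """Relative amount by which 1024**n exceeds 1000**n."""
    return 1024**n / 1000**n - 1


for n, name in enumerate(PREFIXES, start=1):
    print(f"{name:<6} {binary_excess(n):6.1%}")
```

The gap is 2.4% at kilo and just under 10% at tera, and it keeps growing with every further prefix.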

And no-one uses floppy disks any more. Let's just bury them all and
forget about them. :D

 I see no problem with this 1TB quote being approximate.  It's rounded
 anyway.  If you really want to know how many bytes are available, you
 can use this great unit called the byte which is accurate and not
 subject to change[0].

1 TB is not rounded. It means precisely 1 × 10^12 bytes, no more and no
less. If they want to actually put 1.024 TB on the disk then they can
say 1 TB (approx.) like any other industry (detergent, bacon, etc.).
-- 
Alex Jones
http://alex.weej.com/




Re: Using standardized SI prefixes

2007-06-13 Thread Scott James Remnant
On Wed, 2007-06-13 at 15:01 +0100, Alex Jones wrote:

 1 TB is not rounded. It means precisely 1 × 10^12 bytes, no more and no
 less.
 
No it doesn't.

The meaning of 1 TB depends on the context, and has always done so.

Scott
-- 
Scott James Remnant
Ubuntu Development Manager
[EMAIL PROTECTED]




Re: Using standardized SI prefixes

2007-06-13 Thread Christof Krüger

 Let me start with a dumb example:
 For a child or uninterested commoner that flying critter is simply a
 birdie.  For those in the know exactly the same entity is a Falco
 peregrinus.
 Even if simply calling it birdie or perhaps falcon would be
 easier, more user friendly more understandable for everyone it
 simply would not be /correct/.
The word birdie is a generalization of nearly every critter that can fly.
So it is correct, the critter Falco peregrinus is a birdie, too.
Calling this critter falco peregrinus is correct, too. The example
just doesn't apply here because KB is not a generalization of KiB and
vice versa.

 
 Computers deal with numbers in base two.  Humans deal with numbers in
 base 10.  When computers and humans interact (on a technical level)
 humans must adapt to the computer, because computers can not.
 Dealing with chunks of data, addresses, registers, etc. has to be done
 in base 2.  Even if 1024 is close enough to 10^3 for a PHB or
 marketing humanoid, that will never make those two numbers equal.
Right, and this is the reason why having the same name for different
things is not good.

 And it must never be allowed to.  Computers, computer designers, computer
 technicians and most computer programmers will always deal with the
 _real_ base 2 numbers like 1024.
Unfortunately, computer designers, technicians etc. are not living in an
isolated world (well.. maybe some of them).
No one wants to forbid the computer people to use base 2 numbers. They
are just asked to write KiB instead of KB if they mean base 2
quantities, because the rest of the world already uses kilo as 1000.
Changing the rest of the world makes no sense and having distinct names
for distinct things does no harm.


 Another example.  Pi is an irrational number starting with 3.14
 Sure, it would be easier to standardize it to 3.00.  Done deal.  It
 would be easier to remember and more marketable.  It would also be
 totally useless AND completely wrong.  AFAIK some very dumb people
 actually managed to decree by law that pi was to equal 3.  They had to
 stop doing that.
Well, another example that does not apply here. Nobody wants to change
something true to something wrong. The status quo is that KB can mean
either 1000 or 1024 bytes depending on the context (or shoe size of the
developer or whatever). So there is an ambiguity here. Introducing SI
prefixes would eliminate ambiguities if applied consistently. Pi is well
defined. There is no ambiguity.

 Computers
 have always, do, and will continue to deal with their numbers along
 the progression of 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, etc...
 So, when dealing with computers, must we.
Yup, I totally agree. But why do we call it kilo then, when we
actually mean 1024? Someone found it handy decades ago and
everybody has adopted it. So back then, someone was redefining your pi
to 3 because it was close enough and now we should leave it this way?
Remember that until computers (or binary logic) were invented, kilo
had always meant 1000.

 A well-known and very common trait of language is that one given word
 can often have more than one specific meaning.  When this is the case
 you need a context to be sure.  This is considered normal, and never a
 real problem.  This should hold true regarding computers and counting
 as well.
This is called a homograph. An example taken from Wikipedia:
shift n. (a change)
shift n. (a period at work)
I agree that in normal life you can guess the meaning from the context
because it has completely different meanings.
However, I don't agree that this should hold true in computer science.
One possible meaning of KB is 1000 bytes. The other is 1024 bytes.
Now take the sentence: "Hello John. I've got a file here and want to
send it to you. It's 25KB large." Now please work out from the context
which meaning applies here. The problem is that both possible
meanings denote exactly the same kind of thing: a quantity of bytes.

 Finally a personal and subjective thought.  At times one has to choose
 whether to oversimplify facts and information to the point where
 everyone understands it, (If this happens they DO NOT understand it;
 they are given the illusion of understanding) or whether to educate
 the public.
I think that you base your argumentation on wrong assumptions. The
purpose of introducing SI prefixes is *not* to make the newbie's life
simpler, at least not as primary goal. Surely, there are situations
where it really doesn't matter (e.g. if you are interested in the order
of magnitude, a 10% error may be totally acceptable). However, SI prefixes
make life easier for technical stuff where it is important to be exact
without having to guess the context, ask every time or consider the
professional background of your communication partner.

Regards,
  Christof Krüger



Re: Using standardized SI prefixes

2007-06-13 Thread Christof Krüger
On Wed, 2007-06-13 at 14:29 +0100, Scott James Remnant wrote:
 [...]
 And we still have many figures in both GB and GiB which are neither of
 the two!

okay ... reading on ...

 [...]
 I see no problem with this 1TB quote being approximate.  It's
 rounded anyway.

So you don't care if it is approximate? Then you should care even less
if it's exact!

However, I find that tebibyte, gibibyte, mebibyte and kibibyte sound
quite similar to their base-10 counterparts, so it should be no problem
even for a dumb user to understand their meaning if he already knows
what a gigabyte or megabyte is. This is especially the case with the
short notation (e.g. KiB vs. KB).

The more important case is when a user actually *cares* about the exact
number.
At the moment base 10 and base 2 numbers are often prefixed both with k
for kilo, M for mega etc. This means that there will be confusion if
something is labeled 100GB.
Now consider introducing SI prefixes.
There still will be confusion with 100GB, because apparently not
everyone likes SI prefixes and continues using the old prefixes with
base 2 numbers. However, when something is labeled 100GiB, there is no
confusion (remember that we are talking about a user that cares about
the exact number, the dumb user will guess that GiB must be something
similar to GB).
Okay, so we gained some confidence about what is meant. How can we get
rid of the rest of uncertainty? Answer: Use the SI prefixes
consistently! This will take a while of course, but eventually you can
only benefit.
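
As a rough illustration of what "consistent" labelling could look like in code, here is a minimal sketch. The function `format_size` and its `binary` flag are hypothetical and not taken from any existing tool; the point is only that the unit string always matches the base used for the division.

```python
def format_size(num_bytes: int, binary: bool = True) -> str:
    """Format a byte count, pairing each base with its own unit names."""
    base = 1024 if binary else 1000
    units = (["B", "KiB", "MiB", "GiB", "TiB"] if binary
             else ["B", "kB", "MB", "GB", "TB"])
    value = float(num_bytes)
    index = 0
    # Scale down until the value fits the current unit or units run out.
    while value >= base and index < len(units) - 1:
        value /= base
        index += 1
    return f"{value:.1f} {units[index]}"


size = 100 * 2**30
print(format_size(size))                # 100.0 GiB
print(format_size(size, binary=False))  # 107.4 GB
```

The same number of bytes is reported either way; only the label tells the reader which base was used.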

Regards,
  Christof Krüger




Re: Using standardized SI prefixes

2007-06-13 Thread Dennis Kaarsemaker
After wasting too much time reading this thread, I think the bike shed
should be yellow this time.

And for something at least slightly useful:
This is not something Ubuntu should do, upstreams should do this. So if
anyone really cares about this, poke our upstreams instead of rambling
on about whether the difference between the different gigglebytes or
tibblebytes is significant.
-- 
Dennis K.

Time is an illusion, lunchtime doubly so.




Re: Using standardized SI prefixes

2007-06-13 Thread Onno Benschop
As I see it there are two ways of resolving the difference between KiB
and KB.

* Use Rosetta to update the text and fix the output so that it now
  reads KiB. This would be relatively simple to do, but not actually
  helpful longer term.
* Fix the source code that calculates KB by doing a bit shift[0] so
  that it instead divides the number of bytes by a power of 10.



[0] I'm assuming that most applications will calculate how many
Kilobytes/Megabytes are used by dividing by a power of two.
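
A minimal sketch of the two approaches, assuming (as the footnote does) that an application currently shifts by 10 bits. The function names are hypothetical and chosen only for this example.

```python
def size_in_kib(num_bytes: int) -> int:
    """Binary kilobytes via bit shift: equivalent to dividing by 2**10."""
    return num_bytes >> 10


def size_in_kb(num_bytes: int) -> int:
    """Decimal kilobytes: integer division by 10**3."""
    return num_bytes // 1000


n = 1_474_560
print(f"{size_in_kib(n)} KiB")  # 1440 KiB
print(f"{size_in_kb(n)} kB")    # 1474 kB
```

The first bullet corresponds to keeping `size_in_kib` and changing only the label; the second corresponds to switching the calculation to `size_in_kb` and keeping the KB label.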

-- 
Onno Benschop

Connected via Optus B3 at S31°54'06 - E115°50'39 (Yokine, WA)
--
()/)/)()..ASCII for Onno..
|?..EBCDIC for Onno..
--- -. -. ---   ..Morse for Onno..

Proudly supported by Skipper Trucks, Highway1, Concept AV, Sony Central, Dalcon
ITmaze   -   ABN: 56 178 057 063   -  ph: 04 1219    -   [EMAIL PROTECTED]




Re: Using standardized SI prefixes

2007-06-13 Thread James "Doc" Livingston
On Wed, 2007-06-13 at 15:01 +0100, Alex Jones wrote:
 On Wed, 2007-06-13 at 14:29 +0100, Scott James Remnant wrote:
  Without the binary unit to consider, when we quote a drive as 1TB, we
  know that it has *at least* 1,000,000,000,000 bytes available.
  Depending on the drive, it may have anywhere between this and
  1,099,511,627,776 bytes available.  It's actually more likely to have
  something strange like 1,024,000,000,000 available.
 
 10% error is no good for me. You can continue to play the at least
 card, but what about when it's more important if it is at most
 something? And seeing as this error only goes up exponentially, at which
 prefix do you draw the line and say no more?

You'll get error anyway - a 500GB disk doesn't necessarily contain
500x10^9 bytes, 500x2^30 bytes or any combination of 10^m and 2^n. Drive
manufacturers build drives with densities and sizes that are
convenient to them, and round down to the nearest nice number for
marketing purposes. A 500GB drive could have 512,345,678,900 bytes on
it for all we know.



 And no-one uses floppy disks any more. Let's just bury them all and
 forget about them. :D
 
  I see no problem with this 1TB quote being approximate.  It's rounded
  anyway.  If you really want to know how many bytes are available, you
  can use this great unit called the byte which is accurate and not
  subject to change[0].
 
 1 TB is not rounded. It means precisely 1 × 10^12 bytes, no more and no
 less. If they want to actually put 1.024 TB on the disk then they can
 say 1 TB (approx.) like any other industry (detergent, bacon, etc.).

How many other industries do this? If I buy a 500g pack of bacon, I
don't get 500g - I get around 500g, close enough that the appropriate
consumer trading authority doesn't come and have words with them. Very
few things I ever buy have "approx." mentioned with how much I get.


Cheers,

-- 
Too many errors on one line (make fewer) -- MPW C error message




Re: Using standardized SI prefixes

2007-06-13 Thread James "Doc" Livingston
On Thu, 2007-06-14 at 00:35 +0200, Christof Krüger wrote:
 I agree that this is the way to go. However, I think the OP wanted to
 suggest to have something like an official policy so that
 changes/patches are also created by ubuntu and eventually proposed
 upstream.
 But I guess there will be no consensus on it so just let upstream do it.

Good luck convincing every upstream of this.


Cheers,

-- 
Accept that some days you're the pigeon and some days you're the statue




Re: Using standardized SI prefixes

2007-06-13 Thread Ben Finney
Ivan Jager [EMAIL PROTECTED] writes:

 On Wed, 13 Jun 2007, Alex Jones wrote:
  1 TB is not rounded. It means precisely 1 × 10^12 bytes, no more
  and no less. If they want to actually put 1.024 TB on the disk
  then they can say 1 TB (approx.) like any other industry
  (detergent, bacon, etc.).

 1 TB has only one significant digit. It would be silly to think that
 it was an exact measurement, at least in fields I am familiar
 with. ;) No one I know would think 1km is as precisely measured as
 1.0km.

The difference being that digital specifications for things like
storage capacity and memory are not measured. They are calculated, and
in those contexts they *are* precise.

Rounding can be done after the calculated number is obtained, but it's
not inherent in the process of obtaining the number the way that
measuring 1 km or 1 tablespoon is.

Since we *can* give a perfectly precise quantity of bytes and other
digital phenomena, and often do, this is even more reason to use the
precise meaning of the units for those quantities.

-- 
 \  I moved into an all-electric house. I forgot and left the |
  `\   porch light on all day. When I got home the front door wouldn't |
_o__) open.  -- Steven Wright |
Ben Finney




Re: Using standardized SI prefixes

2007-06-13 Thread Alex Jones
On Thu, 2007-06-14 at 09:03 +1000, James "Doc" Livingston wrote:
  1 TB is not rounded. It means precisely 1 × 10^12 bytes, no more and no
  less. If they want to actually put 1.024 TB on the disk then they can
  say 1 TB (approx.) like any other industry (detergent, bacon, etc.).
 
 How many other industries do this? If I buy a 500g pack of bacon, I
 don't get 500g - I get around 500g, close enough that the appropriate
 consumer trading authority doesn't come and have words with them. Very
 few things I ever buy have approx mentioned with how much I get.

That's what I was saying. I buy a 950 g pack of detergent, it says on
the packet:

950 g ℮

The key being the ℮, which is the European packaging mark for
"estimated". I'd be surprised if they don't have something similar to
use instead of "approx." in Australia. Look out for it!
-- 
Alex Jones
http://alex.weej.com/

