Re: [rust-dev] How to find Unicode string length in rustlang

2014-05-30 Thread Nathan Myers
A good name would be size().  That would avoid any confusion over various 
length definitions, and just indicate how much address space it occupies.


Nathan Myers

On May 29, 2014 8:11:47 PM Palmer Cox  wrote:


Thinking about it more, units() is a bad name. I think a renaming could
make sense, but only if something better than len() can be found.

-Palmer Cox


On Thu, May 29, 2014 at 10:55 PM, Palmer Cox  wrote:

> What about renaming len() to units()?
>
> I don't see len() as a problem, but maybe as a potential source of
> confusion. I also strongly believe that no one reads documentation if they
> *think* they understand what the code is doing. Different people will see
> len(), assume that it does whatever they want to do at the moment, and for
> a significant portion of strings that they encounter it will seem like
> their interpretation, whatever it is, is correct. So, why not rename len()
> to something like units()? It's more explicit about the value that it's
> actually producing than len() and it's not all that much longer to type. As
> stated, exactly what a string is varies greatly between languages, so, I
> don't think that lacking a function named len() is bad. Granted, I would
> expect that many people expect that a string will have a method named len()
> (or length()) and when they don't find one, they will go to the
> documentation and find units(). I think this is a good thing since the
> documentation can then explain exactly what it does.
>
> I much prefer len() to byte_len(), though. byte_len() seems like a bit
> much to type and it seems like all the other methods on strings should then
> be renamed with the byte_ prefix which seems unpleasant.
>
> -Palmer Cox
>
>
> On Thu, May 29, 2014 at 3:39 AM, Masklinn  wrote:
>
>>
>> On 2014-05-29, at 08:37 , Aravinda VK  wrote:
>>
>> > I think returning the length of the string in bytes is just fine. My not
>> > knowing about the availability of char_len in Rust caused this confusion.
>> >
>> > Python 2.7 returns the length of the string in bytes; Python 3 returns the
>> > number of codepoints.
>>
>> Nope, depends on the string type *and* on compilation options.
>>
>> * Python 2's `str` and Python 3's `bytes` are byte sequences, their
>>  len() returns their byte counts.
>> * Python 2's `unicode` and Python 3's `str` before 3.3 return a code
>>  unit count, which may be UCS2 or UCS4 (depending on whether the
>>  interpreter was compiled with `--enable-unicode=ucs2`, the default,
>>  or `--enable-unicode=ucs4`). Only the latter case is a true code point
>>  count.
>> * Python 3.3's `str` switched to the Flexible String Representation,
>>  the build-time option disappeared and len() always returns the number
>>  of codepoints.
>>
>> Note that in no case do len() operations take normalisation or visual
>> composition into account.
>>
>> > JS returns number of codepoints.
>>
>> JS returns the number of UCS2 code units, which is twice the number of
>> code points for those in astral planes.
>> ___
>> Rust-dev mailing list
>> Rust-dev@mozilla.org
>> https://mail.mozilla.org/listinfo/rust-dev
>>
>
>
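
For readers following this thread with a current toolchain, the counts being argued over can be compared directly. This is a minimal sketch against present-day stable Rust rather than the 2014 API discussed above: `char_len` no longer exists, and `chars().count()` is used here as its closest equivalent. The helper name `lengths` is made up for illustration.

```rust
/// Returns (UTF-8 bytes, Unicode code points, UTF-16 code units) for `s`.
/// `lengths` is a hypothetical helper, not a standard-library function.
fn lengths(s: &str) -> (usize, usize, usize) {
    (s.len(), s.chars().count(), s.encode_utf16().count())
}

fn main() {
    // 'e' + COMBINING ACUTE ACCENT + one astral-plane emoji.
    let s = "e\u{301}\u{1F600}";
    let (bytes, code_points, utf16_units) = lengths(s);
    println!("bytes={} code_points={} utf16_units={}", bytes, code_points, utf16_units);
}
```

None of the three numbers is the count of characters a reader would see (two, in this input); that needs grapheme segmentation, which the standard library does not provide.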






Re: [rust-dev] UTF-8 strings versus "encoded ropes"

2014-05-01 Thread Nathan Myers

On 05/01/2014 04:57 PM, Daniel Micay wrote:

On 01/05/14 07:49 PM, Nathan Myers wrote:

In defining a library string we always grapple over how it
should differ from a raw (variable or fixed) array of bytes.
Ease of appending and of assigning into substrings always
comes up. In the old days, copies shared storage, but nowadays
that's considered evil. Indexed random access lookup was once
thought essential, but with today's variable-sized characters,
strings have become sequential structures. We might snip out a
substring and splice another in its place, but we must identify
those places by stepping iterators to them. We need to put
string values in partial or total order, but no single ordering
is compellingly best.  Equality depends on context.


The outcome is that the context-independent requirements on
strings may not differ enough from an array of bytes to justify
a separate type. We might better give our byte arrays a few
stringy capabilities. Most users of strings don't need to know
anything about what's in them, and can operate on the raw byte
arrays. To use a string as a map key, though, implies choices:
fold case? canonicalize sequences? We need an object that can
remember your choices, and that the map can apply to strings
given to it.

Ideally what we use to express our interpretation of some set
of strings could be used on any sequence of bytes, not necessarily
contiguous in memory, not necessarily all in memory at once,
not necessarily even produced until called for.

The history of programming languages is littered with mistakes
around string types.  There's no reason why Rust must repeat
them all.

Nathan Myers


There's a string type because it *enforces* the guarantee of containing
valid UTF-8, meaning it can always be converted to code points. This
also means all of the Unicode algorithms can assume that they're dealing
with a valid sequence of code points with no out-of-range values or
surrogates, per the specification.


A UTF-8 string type can certainly earn its keep.  (Probably it should
have "utf8" somewhere in its name.)  Not all byte sequences a program
encounters are, or can or should be converted to, valid UTF-8.  Any
that might not be must still be put in something that users probably
want to call a string. The other issues remain; there are many equally
valid orderings for UTF-8 sequences, so any fixed choice will often be
wrong.

A discriminated string type that may be matched at runtime as a valid
UTF-8 sequence or not depending on what was last put in it would
probably be useful often enough to want it in std.
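
One way such a discriminated type might look in present-day Rust is sketched below. The name `MaybeUtf8` and its variants are hypothetical; only `String::from_utf8` and `FromUtf8Error::into_bytes` come from the standard library.

```rust
/// Hypothetical discriminated string: either valid UTF-8 or raw bytes.
#[derive(Debug, PartialEq)]
enum MaybeUtf8 {
    Utf8(String),
    Bytes(Vec<u8>),
}

impl MaybeUtf8 {
    /// Classifies a byte buffer by validating it once, without copying.
    fn from_bytes(bytes: Vec<u8>) -> MaybeUtf8 {
        match String::from_utf8(bytes) {
            Ok(s) => MaybeUtf8::Utf8(s),
            // The error hands the original buffer back, so nothing is lost.
            Err(e) => MaybeUtf8::Bytes(e.into_bytes()),
        }
    }
}

fn main() {
    // The first input is UTF-8 for "café"; the second is the Latin-1 encoding.
    let inputs = vec![b"caf\xc3\xa9".to_vec(), b"caf\xe9".to_vec()];
    for input in inputs {
        match MaybeUtf8::from_bytes(input) {
            MaybeUtf8::Utf8(s) => println!("text: {}", s),
            MaybeUtf8::Bytes(b) => println!("raw bytes: {:?}", b),
        }
    }
}
```

Callers then match on the variant at the point where they care whether the contents are text.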

Nathan Myers


Re: [rust-dev] UTF-8 strings versus "encoded ropes"

2014-05-01 Thread Nathan Myers

On 05/01/2014 02:52 PM, Patrick Walton wrote:

On 5/1/14 6:53 AM, Malthe Borch wrote:

In Rust, the built-in std::str type "is a sequence of unicode
codepoints encoded as a stream of UTF-8 bytes".
...
A string would be essentially a rope where each leaf specifies an
encoding, e.g. UTF-8 or ISO8859-1 (ideally expressed as one or two
bytes).


This is too complex for a systems language with a simple library.


In defining a library string we always grapple over how it
should differ from a raw (variable or fixed) array of bytes.
Ease of appending and of assigning into substrings always
comes up. In the old days, copies shared storage, but nowadays
that's considered evil. Indexed random access lookup was once
thought essential, but with today's variable-sized characters,
strings have become sequential structures. We might snip out a
substring and splice another in its place, but we must identify
those places by stepping iterators to them. We need to put string values
in partial or total order, but no single ordering is
compellingly best.  Equality depends on context.

The outcome is that the context-independent requirements on
strings may not differ enough from an array of bytes to justify
a separate type. We might better give our byte arrays a few
stringy capabilities. Most users of strings don't need to know
anything about what's in them, and can operate on the raw byte
arrays. To use a string as a map key, though, implies choices:
fold case? canonicalize sequences? We need an object that can
remember your choices, and that the map can apply to strings
given to it.
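
As a rough illustration of the map-key point: the standard maps in present-day Rust take their comparison policy from the key type's `Eq` and `Hash` implementations rather than from a separate policy object, so this sketch records the choice (ASCII case folding, picked only as an example) in a wrapper type. `FoldedKey` is a hypothetical name.

```rust
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical key type: raw bytes, compared with ASCII case folded.
struct FoldedKey(Vec<u8>);

impl PartialEq for FoldedKey {
    fn eq(&self, other: &FoldedKey) -> bool {
        self.0.eq_ignore_ascii_case(&other.0)
    }
}
impl Eq for FoldedKey {}

impl Hash for FoldedKey {
    // Hash the folded bytes so that keys that compare equal also hash equally.
    fn hash<H: Hasher>(&self, state: &mut H) {
        for b in &self.0 {
            state.write_u8(b.to_ascii_lowercase());
        }
    }
}

fn main() {
    let mut headers: HashMap<FoldedKey, &str> = HashMap::new();
    headers.insert(FoldedKey(b"Content-Type".to_vec()), "text/plain");
    // Lookup succeeds despite the different case of the key bytes.
    println!("{:?}", headers.get(&FoldedKey(b"content-type".to_vec())));
}
```

The stored bytes stay untouched; only the map's view of them is folded.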

Ideally what we use to express our interpretation of some set
of strings could be used on any sequence of bytes, not necessarily
contiguous in memory, not necessarily all in memory at once,
not necessarily even produced until called for.
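
A small sketch of an interpretation that does not care where the bytes live: it accepts any iterator of bytes, so the input may be split across buffers or produced lazily. It assumes the stream is valid UTF-8 and does not check; `count_code_points` is a hypothetical helper.

```rust
/// Counts code points in any stream of bytes, assuming valid UTF-8:
/// every byte that is not a continuation byte (10xxxxxx) starts one.
fn count_code_points<I: IntoIterator<Item = u8>>(bytes: I) -> usize {
    bytes.into_iter().filter(|&b| (b & 0xC0) != 0x80).count()
}

fn main() {
    // Two non-contiguous chunks; the split falls inside a 2-byte sequence.
    let first: &[u8] = b"caf\xc3";
    let second: &[u8] = b"\xa9!";
    let stream = first.iter().copied().chain(second.iter().copied());
    println!("{}", count_code_points(stream));
}
```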

The history of programming languages is littered with mistakes
around string types.  There's no reason why Rust must repeat
them all.

Nathan Myers


Re: [rust-dev] UTF-8 strings versus "encoded ropes"

2014-05-01 Thread Nathan Myers
It would be a mistake for a byte sequence container, stream, or string type 
to know anything about particular encodings. An encoding is an 
interpretation imposed on a byte sequence. Users of a sequence need to be 
able to choose what interpretation to apply without interference from some 
previous user's choice, and without need to make a copy.


As an example, a given string may be seen as raw bytes, as a series of 
delimited records, as Unicode code points within some of those records, as 
a series of JSON name-value pairs within such a record, and as a decimal 
number in a JSON value part.  The same interpretations need to work on a 
raw byte stream that would not tolerate in-band Rust-specific annotations.
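
The layering described here can be sketched with borrowed views over one buffer, with no copies and no annotations stored in the data. The `name=value` record format and the helper `split_record` are assumptions made for the example, standing in for the JSON case.

```rust
use std::str;

/// Splits one "name=value" record into borrowed halves (hypothetical format).
fn split_record(record: &[u8]) -> Option<(&[u8], &[u8])> {
    let eq = record.iter().position(|&b| b == b'=')?;
    Some((&record[..eq], &record[eq + 1..]))
}

fn main() {
    let raw: &[u8] = b"name=caf\xc3\xa9\ncount=42\n";
    // View 1: newline-delimited records, still raw bytes.
    for record in raw.split(|&b| b == b'\n').filter(|r| !r.is_empty()) {
        let (name, value) = split_record(record).expect("missing '='");
        // View 2: UTF-8 text, validated only where text is wanted.
        let text = str::from_utf8(value).expect("value is not UTF-8");
        // View 3: a decimal number, where one is expected.
        match text.parse::<u32>() {
            Ok(n) => println!("{:?} is the number {}", name, n),
            Err(_) => println!("{:?} is the text {:?}", name, text),
        }
    }
}
```

Each view is chosen by the code that reads the bytes, and a later reader is free to pick a different one.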


The UTF-8 view of a string is an interesting special case. Depending on 
context, what is considered a "character" may be a code point of at most 4 
bytes, or any number of bytes representing a base and combining characters 
which might or might not be collapsible to a canonical, single code point, 
or a series of such constructs that is to be displayed as a ligature such 
as "Qu" or "ffi". (Some languages are best displayed as mostly ligatures.)


Nathan Myers


On May 1, 2014 6:54:04 AM Malthe Borch  wrote:


In Rust, the built-in std::str type "is a sequence of unicode
codepoints encoded as a stream of UTF-8 bytes".

Meanwhile, building on experience with Python 2 and 3, I think it's
worth considering a more flexible design.

A string would be essentially a rope where each leaf specifies an
encoding, e.g. UTF-8 or ISO8859-1 (ideally expressed as one or two
bytes).

That is, a string may be comprised of segments of different encodings.
On the I/O barrier you would then explicitly encode (and flatten) to a
compatible encoding such as UTF-8.

Likewise, data may be read as 8-bit raw and then "decoded" at a later
stage. For instance, HTTP request headers are ISO8859-1, but the
entire input stream is 8-bit raw.

Sources:

- https://maltheborch.com/2014/04/pythons-missing-string-type
- http://lucumr.pocoo.org/2014/1/9/ucs-vs-utf8/


Re: [rust-dev] A small announcement for zinc, the bare metal rust stack

2014-04-23 Thread Nathan Myers
Ruby is aluminum oxide. C is elemental carbon; C++, doubly ionized. Perl is 
mostly calcium carbonate. But there are better wordplay opportunities here 
than obscure chemistry references.




On April 23, 2014 12:28:48 AM Vladimir Pouzanov  wrote:


Luckily enough, I had the concept for zinc even before I started coding in
rust :-) And yes, there are lots of different oxides in rust world.


On Wed, Apr 23, 2014 at 3:58 AM, Thad Guidry  wrote:

> Actually...I do not. :)
>
>
> On Tue, Apr 22, 2014 at 9:05 PM, Chris Morgan  wrote:
>
>> On Wed, Apr 23, 2014 at 10:35 AM, Thad Guidry 
>> wrote:
>> > I would have named it ... "oxide" instead of zinc ;-) ... rust = iron
>> oxide
>> Do you know how many projects written in Rust have already been named
>> “oxide”?
>>
>
>
>
> --
> -Thad
>
>


-- Sincerely,
Vladimir "Farcaller" Pouzanov
http://farcaller.net/





Re: [rust-dev] Reminder: ~[T] is not going away

2014-04-03 Thread Nathan Myers



Perhaps the best thing is to wait a month (or two or three) until DST
is more of a reality and then see how we feel.


Are you thinking we should also wait before converting the current uses
of ~[T] to Vec<T>? Doing the migration gives us the performance[1] and
zero-length-zero-alloc benefits, but there were some concerns about
additional library churn if we end up converting back to DST's ~[T].


I can't speak about how a usage choice affects the standard library,
but it seems worth mentioning that vector capacity doesn't have to be
in the base object; it can live in the secondary storage, prepended
before the elements.  A zero-length Vec<T> might be null for the
case of zero capacity, or non-null when it has room to grow.  For
maximally trivial conversion to ~[T], the pointer in Vec<T> would
point to the first element, with the capacity at a negative offset.
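
A minimal sketch of that layout in present-day Rust, under loud assumptions: the element type is fixed to `u32`, growth is left out, and `ThinVec` is a hypothetical name, not anything in the standard library. The handle is two words (element pointer and length), and the capacity sits one word before the first element.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};
use std::mem::size_of;
use std::{ptr, slice};

/// Hypothetical two-word vector of u32: pointer to first element plus length.
/// The capacity is stored in the heap block, one usize before the elements.
struct ThinVec {
    first: *mut u32, // null while the capacity is zero
    len: usize,
}

impl ThinVec {
    /// Layout of [capacity][elements...] and the byte offset of the elements.
    fn layout(cap: usize) -> (Layout, usize) {
        let elems = Layout::array::<u32>(cap).expect("capacity overflow");
        let (layout, offset) = Layout::new::<usize>().extend(elems).expect("capacity overflow");
        // Assumption of this sketch: no padding between header and elements.
        assert_eq!(offset, size_of::<usize>());
        (layout, offset)
    }

    fn with_capacity(cap: usize) -> ThinVec {
        if cap == 0 {
            return ThinVec { first: ptr::null_mut(), len: 0 }; // no allocation
        }
        let (layout, offset) = ThinVec::layout(cap);
        unsafe {
            let base = alloc(layout);
            if base.is_null() {
                handle_alloc_error(layout);
            }
            let first = base.add(offset) as *mut u32;
            (first as *mut usize).sub(1).write(cap); // header at offset -1 word
            ThinVec { first, len: 0 }
        }
    }

    fn capacity(&self) -> usize {
        if self.first.is_null() {
            return 0;
        }
        // Read the header at a negative offset from the element pointer.
        unsafe { (self.first as *const usize).sub(1).read() }
    }

    fn push(&mut self, value: u32) {
        assert!(self.len < self.capacity(), "growth is left out of this sketch");
        unsafe { self.first.add(self.len).write(value) };
        self.len += 1;
    }

    fn as_slice(&self) -> &[u32] {
        if self.first.is_null() {
            return &[];
        }
        unsafe { slice::from_raw_parts(self.first, self.len) }
    }
}

impl Drop for ThinVec {
    fn drop(&mut self) {
        if !self.first.is_null() {
            let (layout, offset) = ThinVec::layout(self.capacity());
            unsafe { dealloc((self.first as *mut u8).sub(offset), layout) };
        }
    }
}

fn main() {
    let mut v = ThinVec::with_capacity(4);
    v.push(7);
    v.push(9);
    let words = size_of::<ThinVec>() / size_of::<usize>();
    println!("{:?} cap={} words={}", v.as_slice(), v.capacity(), words);
}
```

Because the pointer already addresses the first element, handing the same (pointer, length) pair to code that expects a plain slice needs no adjustment.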

Nathan Myers




Re: [rust-dev] "Virtual fn" is a bad idea

2014-03-12 Thread Nathan Myers

On 03/12/2014 08:17 AM, Patrick Walton wrote:

On 3/12/14 3:29 AM, Nathan Myers wrote:

Given such primitives, esoteric constructions like virtual inheritance
could be left to ambitious users.


We did sketch a macro-based solution for this that was basically what
you said. It involved an `rtti!()` macro that allowed construction of
vtables per object and structs, as well as a `family!()` macro that
allowed you to downcast and upcast. But it was very complex, with lots
of moving parts. At the end of the day you need some things to be built
in to the language if you want them to be nice.


This is interesting because it means the work is half done.
Results of such an effort divide more or less neatly into

 - essentials
 - hacks around language limitations to implement essentials
 - presentation hacks to make the construct usable by normal people

There's no getting around the first one; the complexity is in the
library or in the compiler.  The latter two help to identify what
to add to the language core to make such a library practical.

The complexity of the features to eliminate the hacks would tend
to be much smaller than of the hacks themselves, and no larger
than pulling the whole into the core language. It would be tempting,
then, to generalize the new features to support more of the common
usage patterns (e.g. delegation), and it's a matter of project
discipline to know where to stop.  Rust doesn't lack project
discipline.

Nathan Myers


Re: [rust-dev] "Virtual fn" is a bad idea

2014-03-12 Thread Nathan Myers

On 03/11/2014 02:18 PM, Patrick Walton wrote:
> You need:
> 1. One-word pointers to each DOM node, not two...
> 2. Access to fields common to every instance of a trait without
>   virtual dispatch...
> 3. Downcasting and upcasting.

Let's look at what C++ virtual functions F and the classes T they
operate on really amount to.

At runtime, class T is just a const global array V of pointers to
those functions F. A T instance is just a struct with &V in it.
The functions F are notable only for having a T* first argument,
but that is really just a C++ convention. You can field those with
no difficulty in C, as indeed GTK+ does.

A derived class T2 is another const global array V2 of pointers to
functions F2, the first N of which are stack-frame compatible with
F, and, each, optionally identical.  The T2 instance is another
struct, with its first member an instance of T. Nothing there
conflicts with the Rust we know.

Compile time support is only a little more specialized.  We need
some representation-preserving type coercions for the various pointers.
(For multiple inheritance, some of the coercions would add a compile-
time constant.)  The only type-compatibility enforcement needed is for
the function-argument lists.

The const-global-array-of-function-pointers has been called a driver,
and the struct-with-a-pointer-to-it has been a file (or FILE, or FCB
to old-timers) for longer than I have been alive.  Syntactic sugar to
extend T and V, and enforcing F2 compatibility with F, takes us all the
way to "object-oriented".
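
The construction described above can be written out by hand in present-day Rust, roughly as follows. This is a sketch under assumptions, not a proposal from the thread: the names `VTable`, `Node`, `Element`, and `describe` are hypothetical, raw pointers stand in for the one-word object pointers, and `#[repr(C)]` is used so that the base struct is guaranteed to sit at offset zero of the derived one.

```rust
/// The "class": a constant global table of function pointers.
struct VTable {
    describe: unsafe fn(*const Node) -> String,
}

/// The "base class" instance: a struct holding a pointer to its table.
#[repr(C)]
struct Node {
    vtable: &'static VTable,
    id: u32, // common field, readable without any dispatch
}

/// The "derived class": its first member is an instance of the base.
#[repr(C)]
struct Element {
    base: Node,
    tag: &'static str,
}

unsafe fn describe_node(this: *const Node) -> String {
    format!("node #{}", unsafe { (*this).id })
}

unsafe fn describe_element(this: *const Node) -> String {
    // Downcast: sound only because `this` really points at an Element,
    // whose first field (under repr(C)) is the Node.
    let el = this as *const Element;
    unsafe { format!("<{}> #{}", (*el).tag, (*el).base.id) }
}

static NODE_VTABLE: VTable = VTable { describe: describe_node };
static ELEMENT_VTABLE: VTable = VTable { describe: describe_element };

/// Virtual call through a one-word pointer.
/// Safety: `this` must point at a live object whose vtable matches its type.
unsafe fn describe(this: *const Node) -> String {
    unsafe { ((*this).vtable.describe)(this) }
}

fn main() {
    let plain = Node { vtable: &NODE_VTABLE, id: 1 };
    let div = Element { base: Node { vtable: &ELEMENT_VTABLE, id: 2 }, tag: "div" };
    // Upcast: a representation-preserving pointer coercion.
    let nodes = [&plain as *const Node, &div as *const Element as *const Node];
    for &n in nodes.iter() {
        println!("{}", unsafe { describe(n) });
    }
}
```

Everything the compiler cannot check here is marked `unsafe`; that marked part is what language primitives or a library would have to take over.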

It would be a good demonstration of Rust expressiveness to make object
orientation a library construct.  Having that in the standard library
would suffice for interoperability. Language primitives just sufficient
to enable such a library would be more useful than wired-in support.

Given such primitives, esoteric constructions like virtual inheritance
could be left to ambitious users.

Nathan Myers



Re: [rust-dev] "Virtual fn" is a bad idea

2014-03-11 Thread Nathan Myers

Bill's posting pleases me more than any other I have seen
so far.

Virtual functions in C++ have been an especially fruitful
source of unfortunate consequences, both in the language and
in users' programs.  It is common in C++ coding guidelines
to forbid public virtual functions, to help avoid mixing
implementation details into a public interface, but that
only mitigates one problem.

Virtual functions have been a source of confusion, too,
because they can be overloaded two different ways, with
identical syntax but radically different semantics. The
compiler cannot help catch mistakes because either might
reasonably have been intended.  This is part of the
rationale for "pure virtual" classes, just so that
failing to override a virtual as intended through some
trivial error stands some chance of being noticed. For
this reason, some coding guidelines go farther and _only_
allow overriding a parent's pure virtual functions.

Virtual functions are the chief ingredient in what Alex
Stepanov calls "O-O gook".  It should surprise no one
that Java and C# went the wrong way by making all member
functions virtual, thereby exposing all programs and all
programmers to these ills all the time.

That said, virtual functions do provide a more structured
form of function pointer, which we do need.  Any such feature
should start with the problems it must solve, and work toward
a defensible design, not by patching traditional O-O method
overriding. Rust has the advantage over early C++ that
lambdas and macros are available as well-defined building
blocks. Ideally, Rust's architectural replacement for
virtual functions would be purely a standard-library
construct, demonstrating greater expressiveness to enable
user code to do what another language is obliged to have
built into its core.

[aside: I don't know of any family connection to Bill.]

Nathan Myers

On 03/11/2014 12:09 PM, Bill Myers wrote:

I see a proposal to add "virtual struct" and "virtual fn" in the workweek 
meeting notes, which appears to add an exact copy of Java's OO system to Rust.

I think however that this should be carefully considered, and preferably not 
added at all (or failing that, feature gated and discouraged).

The core problem of "virtual functions" (shared by Java's classes, etc.) is 
that rather than exposing a single public API, they expose two: the API formed by public 
functions, and the API formed by virtual functions to be overridden by subclasses, and 
the second API is exposed in an inflexible and unclean way.

A much better way of allowing to override part of a struct's behavior is by 
defining a trait with the overridable functionality, and allowing to pass in an 
implementation of the trait to the base class, while also providing a default 
implementation if desired.

Another way is to have the "subclass" implement all the traits that the "base class" implements, 
include a field of the "base class" type, and then direct all non-overridden functionality to the "base 
class" (here syntax sugar can be easily added to eliminate the boilerplate, by automatically implementing all 
non-implemented trait functions by calling the same function on the base class field).

These approaches can be combined, as the first approach allows to change the "inside" 
behavior of the base class, while the second one allows to put extra behavior "around" 
the base class code.
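
Both approaches, and their combination, might be sketched in present-day Rust as below. All names (`Pricing`, `Order`, `Billable`, `ShippedOrder`) are invented for the example, and the delegation is written by hand because the syntax sugar mentioned above is only a suggestion in this message.

```rust
/// Overridable behaviour lives in its own trait, with a default.
trait Pricing {
    fn unit_price(&self) -> u32 {
        10
    }
}

struct StandardPricing;
impl Pricing for StandardPricing {} // takes the default

struct SalePricing;
impl Pricing for SalePricing {
    fn unit_price(&self) -> u32 {
        7
    }
}

/// The "base class": its inside behaviour is changed by the Pricing passed in.
struct Order<P: Pricing> {
    pricing: P,
    quantity: u32,
}

/// The interface that both "base class" and "subclass" implement.
trait Billable {
    fn total(&self) -> u32;
    fn quantity(&self) -> u32;
}

impl<P: Pricing> Billable for Order<P> {
    fn total(&self) -> u32 {
        self.pricing.unit_price() * self.quantity
    }
    fn quantity(&self) -> u32 {
        self.quantity
    }
}

/// The "subclass": holds a base value and adds behaviour around it.
struct ShippedOrder<P: Pricing> {
    base: Order<P>,
    shipping: u32,
}

impl<P: Pricing> Billable for ShippedOrder<P> {
    fn total(&self) -> u32 {
        self.base.total() + self.shipping // "override" by wrapping
    }
    fn quantity(&self) -> u32 {
        self.base.quantity() // plain delegation, written by hand
    }
}

fn main() {
    let order = ShippedOrder {
        base: Order { pricing: SalePricing, quantity: 3 },
        shipping: 5,
    };
    println!("total={} quantity={}", order.total(), order.quantity());
}
```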

The fact that OO using virtual functions (as opposed to traits) is a bad design 
is one of the crucial steps forward of the design of languages like Go and 
current Rust compared to earlier OO languages, and Rust should not go backwards 
on this.

Here is a list of issues with virtual functions:

1. Incentive for bad documentation

Usually there is no documentation for how virtual functions are supposed to be 
overridden, and it is awkward to add it since it needs to be mixed with the 
documentation on how to use the struct.

2. Mishmashing multiple unrelated APIs

With traits, you could pass in multiple objects to implement separate sets of 
overridable functionality; with virtual structs you need to mishmash all those 
interfaces into a single set of virtual functions, all sharing data even when 
not appropriate.

3. No encapsulation

Private data for virtual function implementations is accessible to all other 
functions in the struct.

This means for instance that if you have a virtual function called "compute_foo()" that is 
implemented by default by reading a "foo" field in the base class, then all other parts of the base 
class can access "foo" too.

If anything else mistakenly accesses "foo" directly, which it can, then overriding 
"compute_foo()" will not work as expected.

If compute_foo() were provided by an external trait implementation, then "foo" 
would be priv

Re: [rust-dev] About RFC: "A 30 minute introduction to Rust"

2014-03-03 Thread Nathan Myers

On 03/03/2014 09:18 PM, Kevin Ballard wrote:

On Mar 3, 2014, at 8:44 PM, Nathan Myers  wrote:


There are certainly cases in either language where nothing but a
pointer will do.  The problem here is to present examples that are
simple enough for presentation, not contrived, and where Rust has
the clear advantage in safety and (ideally) clarity.  For such
examples I'm going to insist on a competent C++ coder if we are
not to drive away our best potential converts.


You seem to be arguing that C++ written correctly by a highly-skilled
C++ coder is just as good as Rust code, and therefore the inherent
safety of Rust does not give it an advantage over C++. And that's
ridiculous.


That would be a ridiculous position to argue, and this would be a
ridiculous place to argue it.  Maybe try reading the preceding
paragraph again?

My concern is that the examples presented in tutorials must be
compelling to skilled C++ programmers.  If we fail to win them over, the
whole project will have been a waste of time.  The most skilled
C++ programmers know all too well what mistakes show up over and
over again.  They have lots of experience with proposed solutions
that fail.

C++ is mature enough now that some are looking for the language
that can pick up where C++ leaves off.  They wonder if Rust might
become that language. (It manifestly is not that language yet.)
They are the ones who will need to initiate new, important projects that
risk using it, and they are the ones who will explain what it doesn't do
well enough yet, and how to fix it -- but only if we can keep
their already heavily-oversubscribed interest in the first 30
minutes.  A silly example is deadly.

Nathan Myers


Re: [rust-dev] About RFC: "A 30 minute introduction to Rust"

2014-03-03 Thread Nathan Myers

On 03/03/2014 07:46 PM, comex wrote:

On Mon, Mar 3, 2014 at 10:17 PM, Nathan Myers  wrote:

It's clear that we need someone fully competent in C++ to
code any comparisons.  In C++, one is only ever motivated to
("unsafely") extract the raw pointer from a smart pointer when
only a raw pointer will do.  This is exactly as likely to happen
in Rust as in C++, and in exactly the same places.  Where it is
needed, Rust offers no safer alternative.


This is simply wrong.


I assume you take issue not with the leading sentence above,
but with those following.

> Most C++ code I've seen frequently uses raw
> pointers in order to pass around temporary references to objects that
> are not reference counted (or even objects that are reference counted,
> to avoid the overhead for simple copies).  ...


For temporary references in C++ code, I prefer to use references.  But
we do need actual raw pointers to managed (sub)objects as arguments to
foreign C functions.  There, C++ and Rust coders are in the same boat.
Both depend on the C function not to keep a copy of the unsafely-issued
borrowed pointer.  C++ does allow a reference to last longer than the
referent, and that's worth calling attention to.


In Rust, many of the situations where C++ uses raw pointers allow use
of borrowed pointers, which are safe and have no overhead.
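
A small sketch of the borrowed-pointer case in present-day Rust, where borrowed pointers are simply called references. The function `largest` is a hypothetical example; the commented-out part shows the kind of code the compiler refuses.

```rust
/// Takes a temporary reference: no ownership change, no reference count.
fn largest(values: &[i32]) -> Option<&i32> {
    values.iter().max()
}

fn main() {
    let readings = vec![3, 9, 4];
    let top = largest(&readings); // borrow tied to `readings`
    println!("{:?}", top);

    // The compiler rejects a borrow that would outlive its referent:
    //
    //     let dangling;
    //     {
    //         let short_lived = vec![1, 2];
    //         dangling = largest(&short_lived);
    //     } // `short_lived` is dropped here while still borrowed
    //     println!("{:?}", dangling);
}
```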


There are certainly cases in either language where nothing but a
pointer will do.  The problem here is to present examples that are
simple enough for presentation, not contrived, and where Rust has
the clear advantage in safety and (ideally) clarity.  For such examples
I'm going to insist on a competent C++ coder if we are not to drive
away our best potential converts.

Nathan Myers
n...@cantrip.org


Re: [rust-dev] About RFC: "A 30 minute introduction to Rust"

2014-03-03 Thread Nathan Myers

On 03/03/2014 05:54 PM, Patrick Walton wrote:

On 3/3/14 5:53 PM, Daniel Micay wrote:

On 03/03/14 08:19 PM, Steve Klabnik wrote:

Part of the issue with that statement is that you may or may not
program in this way. Yes, people choose certain subsets of C++ that
are more or less safe, but the language can't help you with that.


You can choose to write unsafe code in Rust too.


You have to write the *unsafe* keyword to do so.


It's clear that we need someone fully competent in C++ to
code any comparisons.  In C++, one is only ever motivated to
("unsafely") extract the raw pointer from a smart pointer when
only a raw pointer will do.  This is exactly as likely to happen
in Rust as in C++, and in exactly the same places.  Where it is
needed, Rust offers no safer alternative.

If this were the kind of safety that motivated Rust development,
C++ users would be fully justified in declaring the project and
the language a big waste of time, and never looking at it again.
That response seems worth some effort to avoid evoking. No amount
of explanation why the reader should not abandon us in disgust
can help when that reader has already left.

Any C++ examples offered should exhibit unsafe practices that
readers encounter in real code, and that result in real bugs.
Such examples are easy to find.  We don't need to invent fakes.

Most of the value of the function prototypes that C89 adopted
from the C++ of the day was time not spent manually checking
all the call sites whenever arguments changed. (With C++,
they gave us overloading, too.)  C programs can still declare
functions with unchecked arguments, but now we have every
reason not to, and nobody does.  Useful safety means less that
needs to be paid attention to.

(Arguably, BTW, the keyword should be "safe", because you are
asserting to the compiler that it should consider what is being
done there to be safe, despite any misgivings it may have, and
challenging the reader to contradict it.  But that's a bridge
already burnt.)
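
For reference, the keyword under discussion looks like this in present-day Rust. It is a minimal sketch with a hypothetical helper, `read_raw`; creating the raw pointer needs no marker, while reading through it does.

```rust
/// Reads through a raw pointer. Safety: `p` must be valid for reads.
unsafe fn read_raw(p: *const i32) -> i32 {
    unsafe { *p }
}

fn main() {
    let value = 42;
    let p = &value as *const i32; // making a raw pointer is safe
    // let n = read_raw(p);       // rejected: the call needs an unsafe block
    let n = unsafe { read_raw(p) }; // the programmer vouches for `p` here
    println!("{}", n);
}
```

The block is the programmer's assertion to the compiler, which is the reading of the keyword suggested in the aside above.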

Nathan Myers
n...@cantrip.org


[rust-dev] Pointer syntax

2014-02-01 Thread Nathan Myers

On 01/31/2014 09:43 PM, Eric Summers wrote:

I think I like the mut syntax in let expressions, but I still like shoving the 
pointer next to the type like I would do in C/C++ for something like
fn drop(&mut self) {}.

I guess it is somewhat rare to use mutable pointers as function parameters, so 
maybe not a big deal.


While we're talking about syntax, hasn't anybody noticed
that prefix pointer-designator and dereference operators
are crazy, especially for otherwise left-to-right declaration
order?  C++ had no choice, but Rust can make the sensible
choice: the only one that Pascal got right.  (That used the
caret, also an eerily apt choice.)

Nathan Myers



Re: [rust-dev] Appeal for CORRECT, capable, future-proof math, pre-1.0

2014-01-11 Thread Nathan Myers

On 01/11/2014 03:14 PM, Daniel Micay wrote:

On Sat, Jan 11, 2014 at 6:06 PM, Nathan Myers  wrote:

A big-integer type that uses small-integer
arithmetic until overflow is a clever trick, but it's purely
an implementation trick.  Architecturally, it makes no sense
to expose the trick to users.


I didn't suggest exposing it to users. I suggested defining a wrapper
around the big integer type with better performance characteristics
for small integers.


Your wrapper sounds to me like THE big-integer type.  The thing you
called a "big integer" doesn't need a name.


No single big-integer or
overflow-trapping type can meet all needs. (If you try, you
fail users who need it simple.)  That's OK, because anyone
can code another, and a simple default can satisfy most users.


What do you mean by default? If you don't know the bounds, a big
integer is clearly the only correct choice. If you do know the
bounds, you can use a fixed-size integer. I don't think any default
other than a big integer is sane, so I don't think Rust needs a
default inference fallback.

As I said,

>> In fact, i64 satisfies almost all users almost all the time.

No one would complain about a built-in "i128" type.  The thing
about a fixed-size type is that there are no implementation
choices to leak out.  Overflowing an i128 variable is quite
difficult, and 128-bit operations are still lots faster than on
any variable-precision type. I could live with "int" == "i128".

Nathan Myers


Re: [rust-dev] Appeal for CORRECT, capable, future-proof math, pre-1.0

2014-01-11 Thread Nathan Myers

On 01/10/2014 10:08 PM, Daniel Micay wrote:

I don't think failure on overflow is very useful. It's still a bug if
you overflow when you don't intend it. If we did have a fast big
integer type, it would make sense to wrap it with an enum heading down
a separate branch for small and large integers, and branching on the
overflow flag to expand to a big integer. I think this is how Python's
integers are implemented.


Failure on overflow *can* be useful in production code, using
tasks to encapsulate suspect computations.  Big-integer types
can be useful, too.  A big-integer type that uses small-integer
arithmetic until overflow is a clever trick, but it's purely
an implementation trick.  Architecturally, it makes no sense
to expose the trick to users.
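
The trick in question, sketched in present-day Rust. The type `Int` is hypothetical, and `i128` stands in for the big tier only to keep the example short; it is not arbitrary precision, and adding two large `Big` values can itself overflow.

```rust
/// Hypothetical two-tier integer. `i128` is a stand-in for a real
/// big-integer type, used here only to keep the sketch self-contained.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Int {
    Small(i64),
    Big(i128),
}

impl Int {
    fn widen(self) -> i128 {
        match self {
            Int::Small(n) => n as i128,
            Int::Big(n) => n,
        }
    }

    fn add(self, other: Int) -> Int {
        if let (Int::Small(a), Int::Small(b)) = (self, other) {
            // Fast path: machine arithmetic, branching on overflow.
            if let Some(sum) = a.checked_add(b) {
                return Int::Small(sum);
            }
        }
        // Slow path: promote both operands to the wide tier.
        Int::Big(self.widen() + other.widen())
    }
}

fn main() {
    println!("{:?}", Int::Small(1).add(Int::Small(2)));
    println!("{:?}", Int::Small(i64::MAX).add(Int::Small(1)));
}
```

In a finished type the two variants would be private, so users would see one integer and never the tiers.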

The fundamental error in the original posting was saying machine
word types are somehow not "CORRECT".  Such types have perfectly
defined behavior and performance in all conditions. They just
don't pretend to model what a mathematician calls an "integer".
They *do* model what actual machines actually do. It makes
sense to call them something else than "integer", but "i32"
*is* something else.
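
The defined behaviour of a machine-word type can be seen directly in present-day Rust, where each overflow policy is a separate, explicitly named operation. The helper `add_three_ways` is hypothetical; the three methods it calls are standard.

```rust
/// Hypothetical helper: one addition, three well-defined overflow behaviours.
fn add_three_ways(a: i32, b: i32) -> (i32, Option<i32>, (i32, bool)) {
    (
        a.wrapping_add(b),    // wraps around, as the hardware adder does
        a.checked_add(b),     // reports overflow as `None`
        a.overflowing_add(b), // wrapped result plus an overflow flag
    )
}

fn main() {
    println!("{:?}", add_three_ways(i32::MAX, 1));
    println!("{:?}", add_three_ways(1, 2));
}
```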

It also makes sense to make a library that tries to emulate
an actual integer type.  That belongs in a library because it's
purely a construct: nothing in any physical machine resembles
an actual integer.  Furthermore, since it is an emulation,
details vary for practical reasons. No single big-integer or
overflow-trapping type can meet all needs. (If you try, you
fail users who need it simple.)  That's OK, because anyone
can code another, and a simple default can satisfy most users.

In fact, i64 satisfies almost all users almost all the time.

Nathan Myers


Re: [rust-dev] Rust 0.9 released

2014-01-09 Thread Nathan Myers

Much blush

Congratulations, this looks like a big step in a right direction,
and in a very short time.

When I build on Debian amd64 with g++-4.8.2, I get internal C++
compiler failures in stage 0.  I run "make" again, and it builds
OK for a while, and then dies on another file.  It does finish
stage 0 after a few cycles of this.  Should I be using the "other"
C++ compiler?

Also, running "make -j4", I get failures like:

/bin/mv: cannot stat `/tmp/rust-0.9/x86_64-unknown-linux-gnu/llvm/utils/TableGen/Release+Asserts/AsmMatcherEmitter.d.tmp': No such file or directory


I haven't seen these when building with just "make".

(0.8 built without hiccups on g++-4.7 and "make -j3".)

Nathan Myers

On 01/09/2014 01:04 PM, Brian Anderson wrote:

Just in case somebody wants one with the correct title. So sad.

On 01/09/2014 01:04 PM, Brian Anderson wrote:

Mozilla and the Rust community are pleased to announce version 0.9 of the
Rust compiler and tools. Rust is a systems programming language with a
focus on safety, performance and concurrency.

This was another eventful release in which we made extensive improvements
to the runtime and I/O subsystem, introduced static linking and link-time
optimization, and reduced the variety of closures in the language. 0.9 also
begins a final series of planned changes to how pointers are treated in
Rust, starting with the deprecation of the built-in "managed pointer" type
and its accompanying `@` sigil, and the introduction of smart pointer types
to the standard library.

The brief release notes are included in this announcement, and there is
further explanation in the detailed release [notes] on the wiki.
Documentation and all the links in this email are available on the
[website]. As usual, version 0.9 should be considered an alpha release,
suitable for early adopters and language enthusiasts. Please file [bugs]
and join the [fun].

[website]: http://www.rust-lang.org
[notes]: https://github.com/mozilla/rust/wiki/Doc-detailed-release-notes
[bugs]: https://github.com/mozilla/rust/issues
[fun]: https://github.com/mozilla/rust/wiki/Note-guide-for-new-contributors

This release is available as both a tarball and a Windows installer:

* http://static.rust-lang.org/dist/rust-0.9.tar.gz
http://static.rust-lang.org/dist/rust-0.9.tar.gz.asc
SHA256 (of .tar.gz):
c0911c3545b797a1ca16f3d76bf5ed234754b828efd1e22c182c7300ac7dd5d1

* http://static.rust-lang.org/dist/rust-0.9-install.exe
http://static.rust-lang.org/dist/rust-0.9-install.exe.asc
SHA256 (of .exe):
6ab14e25761d61ba724c5f77403d09d566d3187a2e048e006036b960d938fe90

Thanks to everyone who contributed!

Regards,
The Rust Team


Version 0.9 (January 2014)
--

* Language
    * The `float` type has been removed. Use `f32` or `f64` instead.
    * A new facility for enabling experimental features (feature gating)
      has been added, using the crate-level `#[feature(foo)]` attribute.
    * Managed boxes (@) are now behind a feature gate
      (`#[feature(managed_boxes)]`) in preparation for future removal. Use
      the standard library's `Gc` or `Rc` types instead.
    * `@mut` has been removed. Use `std::cell::{Cell, RefCell}` instead.
    * Jumping back to the top of a loop is now done with `continue` instead
      of `loop`.
    * Strings can no longer be mutated through index assignment.
    * Raw strings can be created via the basic `r"foo"` syntax or with
      matched hash delimiters, as in `r###"foo"###`.
    * `~fn` is now written `proc (args) -> retval { ... }` and may only be
      called once.
    * The `&fn` type is now written `|args| -> ret` to match the literal
      form.
    * `@fn`s have been removed.
    * `do` only works with procs in order to make it obvious what the cost
      of `do` is.
    * Single-element tuple-like structs can no longer be dereferenced to
      obtain the inner value. A more comprehensive solution for overloading
      the dereference operator will be provided in the future.
    * The `#[link(...)]` attribute has been replaced with
      `#[crate_id = "name#vers"]`.
    * Empty `impl`s must be terminated with empty braces and may not be
      terminated with a semicolon.
    * Keywords are no longer allowed as lifetime names; the `self` lifetime
      no longer has any special meaning.
    * The old `fmt!` string formatting macro has been removed.
    * `printf!` and `printfln!` (old-style formatting) removed in favor of
      `print!` and `println!`.
    * `mut` works in patterns now, as in `let (mut x, y) = (1, 2);`.
    * The `extern mod foo (name = "bar")` syntax has been removed. Use
      `extern mod foo = "bar"` instead.
    * New reserved keywords: `alignof`, `offsetof`, `sizeof`.
    * Macros can have attributes.
    * Macros can expand to items with attributes.
    * Macros can expand to multiple items.
    * The `asm!` macro is feature-gated (`#[feature(asm)]`).
    * Comments may be nested.
    * Values automatically coerce to trait objects they implement, without
      an explicit `as`.
    * Enum discrimi

[rust-dev] Defined, precisely defined, and undefined

2014-01-01 Thread Nathan Myers

On 12/31/2013 01:41 PM, Patrick Walton wrote:

On 12/31/13 1:33 PM, Nathan Myers wrote:


The possibility of precisely defining the behavior of a bounded
channel in all circumstances is what makes it suitable as a
first-class primitive.


Unbounded channels have defined behavior as well. Undefined behavior
has a precise definition and OOM is not undefined behavior.

Undefined is not the opposite of precisely defined.  OOM can not
be part of precisely defined behavior, because it isn't.

Nathan Myers


Re: [rust-dev] Unbounded channels: Good idea/bad idea?

2014-01-01 Thread Nathan Myers

On 12/31/2013 01:41 PM, Patrick Walton wrote:
> Bounded channels can be constructed from unbounded channels as well,
> so I don't see how this is an argument for making bounded channels
> the primitive.

Sometimes we need an indefinite-precision integer, other times a
fixed-size integer. Sometimes we need a variable-sized array, others
a compile-time fixed-size array. A range-limited integer or array
can be constructed using the unrestricted type, but would lack exactly
those properties it needs in order to be usable where you need one.

It should be clear that in each case one type is more fundamental,
implementation-wise, than the other, and can be used in circumstances
where the other is entirely unaffordable. One is necessarily primitive,
while the other can be constructed from primitives with no performance
penalty or restrictions.

There are many languages that offer indefinite-precision integers
and variable-sized arrays as primitives. No one uses them for
system coding.

Nathan Myers



Re: [rust-dev] Unbounded channels: Good idea/bad idea?

2013-12-31 Thread Nathan Myers

On 12/30/2013 08:58 PM, Patrick Walton wrote:

I'm not particularly interested in sacrificing performance by not
implementing one or the other in libstd. I think it's clear we need
both forms of channels, and they should be first-class primitives.


It's clear there are people who *want* both kinds of channels. It
does not follow that they should both be first-class primitives.
The possibility of precisely defining the behavior of a bounded
channel in all circumstances is what makes it suitable as a
first-class primitive. (The other, practical, requirement on
primitives is that their number be minimal.)

To implement an unbounded channel requires inherently more complex
operations -- re-allocations or segmented storage, response to failure
-- and corresponding choices that leak consequences visible to users.
In an application with extreme constraints on performance, the
arbitrary choices that must be taken implementing any particular
unbounded-channel design are likely not to match requirements, with
the typical result that such an application uses a private
re-implementation.  In the remaining, less demanding applications,
there is no penalty for using an unbounded channel constructed from
other, first-class primitives.
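As a rough sketch of that last point in present-day Rust: an unbounded queue assembled from two bounded `sync_channel`s of capacity one and a relay thread that owns the backlog. The function name `unbounded` is hypothetical, and the relay polls with `try_recv`/`try_send` only because the standard library offers no way to wait on both ends at once; a production version would block instead of spinning. The growth policy (an unlimited `VecDeque`) is exactly the kind of arbitrary choice discussed above, and it lives in library code where it can be changed.

```rust
use std::collections::VecDeque;
use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TryRecvError, TrySendError};
use std::thread;

/// Build an "unbounded" channel out of two bounded ones plus a relay thread.
fn unbounded<T: Send + 'static>() -> (SyncSender<T>, Receiver<T>) {
    let (in_tx, in_rx) = sync_channel::<T>(1);
    let (out_tx, out_rx) = sync_channel::<T>(1);
    thread::spawn(move || {
        let mut backlog: VecDeque<T> = VecDeque::new();
        let mut input_open = true;
        loop {
            // Move everything the senders have offered into the backlog.
            while input_open {
                match in_rx.try_recv() {
                    Ok(v) => backlog.push_back(v),
                    Err(TryRecvError::Empty) => break,
                    Err(TryRecvError::Disconnected) => input_open = false,
                }
            }
            // Hand the oldest item to the receiver if it has room.
            if let Some(v) = backlog.pop_front() {
                match out_tx.try_send(v) {
                    Ok(()) => {}
                    Err(TrySendError::Full(v)) => backlog.push_front(v),
                    Err(TrySendError::Disconnected(_)) => return,
                }
            } else if !input_open {
                return; // senders gone and nothing left to deliver
            }
            thread::yield_now();
        }
    });
    (in_tx, out_rx)
}
```

A sender can push many items before anyone starts receiving; they arrive later in order, and memory use grows with the backlog.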

Nathan Myers


Re: [rust-dev] Unbounded channels: Good idea/bad idea?

2013-12-27 Thread Nathan Myers

On 12/27/2013 09:09 PM, Patrick Walton wrote:

On 12/27/13 8:18 PM, Daniel Micay wrote:

Most of the standard library isn't appropriate for a kernel. The
support for concurrency (threads/tasks, mutexes, condition variables,
blocking channels), floating point math, input/output, failure,
multi-processing, logging and garbage collection isn't going to work.


I'd like to make that stuff either moved out into a separate library or
discardable.


I think we really are close to agreement on the above.

Tasks remain useful as a unit of resource management, independently
of threads or concurrency.  Kernel i/o differs, but int->ASCII is
the same as in user space.  Logging in kernels is often the only
way to locate bugs, and it differs in initialization, but not in
usage.  Memory management in the Linux kernel uses different allocation
methods, but makes very heavy use of (manual!) reference-counted
garbage collection.

Partitioning between core-language and supplementary libraries by
general category would be a mistake.  Usually the apparatus that
enforces usage conventions should be in the core, for interoperability,
but embodiments of the conventions (e.g. specific data structures,
spawn() methods, allocators) not.


Iterators, atomics, Option and low-level memory/pointer functions are
useful, but they're also trivial to re-implement and the standard library
doesn't do it particularly well in the first place.


Then let's get bugs on file and fix them upstream in libstd.
Inefficiency in userspace is just as bad as inefficiency in kernel space.


They are trivial to implement given a design, but a good design is
arrived at only through long and difficult evolution.  That makes it
especially worth exposing in the core library, and making user space
analogs match.

Nathan Myers


Re: [rust-dev] Unbounded channels: Good idea/bad idea?

2013-12-27 Thread Nathan Myers

On 12/27/2013 01:35 PM, Patrick Walton wrote:

On 12/27/13 1:34 PM, Daniel Micay wrote:

Rust tasks and channels don't work in a freestanding environment.
Unbounded allocation is *definitely* not suitable for a kernel
though... even a large bounded channel would be unsuitable. The world
of small stacks and interrupts is quite different than what the
standard library is written for.


I think the standard library should get there over time though.


Possibly the best measure of a language's power is how well it enables
writing generally useful libraries.  How much of the standard library
older languages would have had to build in is an objective measure of
success.  While important parts of the runtime will be re-implemented
for a kernel setting, kernel coders will need many of the powerful
standard-library features.

This argues for including those standard-library features that mean
the same everywhere in the "freestanding" language definition. In
kernel code, tasks and channels might be created differently, but
used identically.

Nathan Myers


Re: [rust-dev] Unbounded channels: Good idea/bad idea?

2013-12-26 Thread Nathan Myers

It's clear that you can have any sort of channel, or all
possible combinations, and end up with a language usable for
typical problems. Any single primitive works to build all the
rest.  So, this isn't a question of whether users are allowed
to code the way they want to. It comes down to a question of
what and who Rust is for.

A systems language meant to implement rigorously specified
designs needs to be as rigorously specified itself -- a huge job,
at best.  For that, it needs a primitive with behavior that can
be completely and precisely expressed for all runtime conditions.
Anything else that can be built using the primitive can go in
libraries that a rigorous design need not depend on.

If performance matters, then the primitive chosen should impose
no overhead for features not needed in the simplest, fastest
use case.  It's easy to add features and overhead.  It's not
just good luck that the primitives that are simplest to specify
usually are also fastest.

Nathan Myers


Re: [rust-dev] Joe Armstrong's "universal server"

2013-12-19 Thread Nathan Myers

So, Rust is incompatible with dlopen()?

Nathan Myers

On 12/19/2013 03:03 AM, Felix S. Klock II wrote:

rust-dev-

From reading the article, I thought the point was that a universal
server could be deployed and initiated before the actual service it
would offer had actually been *written*.

I agree with Kevin that the exercise would be pretty much pointless for
Rust, unless you enjoy writing interpreters and/or JIT compilers and
want to implement one for Rust.  (I know we had rusti at one point, keep
reading...)

In particular, I assume this works in Erlang because Erlang programs are
compiled to an interpreted bytecode representation (for an abstract BEAM
machine) that can then be JIT compiled for the target architecture when
it is run.

But Rust does not have an architecture independent target code
representation; I do not think LLVM bitcode counts, at least not the
kind we generate, since I believe that has assumptions about the target
architecture baked into the generated code.

Cheers,
-Felix

On 18/12/2013 19:17, Kevin Ballard wrote:

That's cute, but I don't really understand the point. The sample
program he gave:

test() ->
Pid = spawn(fun universal_server/0),
Pid ! {become, fun factorial_server/0},
Pid ! {self(), 50},
receive
X -> X
end.

will behave identically if you remove universal_server from the equation:

test() ->
Pid = spawn(fun factorial_server/0),
Pid ! {self(), 50},
receive
X -> X
end.

The whole point of universal_server, AFAICT, is to just demonstrate
something clever about Erlang's task communication primitives. The
equivalent in Rust would require passing channels back and forth,
because factorial_server needs to receive different data than
universal_server. The only alternative that I can think of would be to
have a channel of ~Any+Send objects, which isn't very nice.

To that end, I don't see the benefit of trying to reproduce the same
functionality in Rust, because it's just not a good fit for Rust's
task communication primitives.
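For readers curious what "passing channels back and forth" might look like, here is a minimal sketch in present-day Rust rather than the 2013 task API. All names are hypothetical. The universal server accepts a single boxed closure and becomes it; the factorial behaviour carries its own request channel, which is how the change of message type is handled. Note that this only works for code already compiled into the program, which is the limitation raised elsewhere in this thread.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread;

/// A behaviour the server can turn into: any sendable one-shot closure.
type Behaviour = Box<dyn FnOnce() + Send>;

/// Spawn a thread that waits for one behaviour and then becomes it.
fn universal_server() -> Sender<Behaviour> {
    let (tx, rx) = channel::<Behaviour>();
    thread::spawn(move || {
        if let Ok(behaviour) = rx.recv() {
            behaviour();
        }
    });
    tx
}

fn factorial(n: u64) -> u64 {
    (1..=n).product()
}

/// Start a universal server, turn it into a factorial server, ask it once.
fn factorial_via_universal_server(n: u64) -> u64 {
    let server = universal_server();
    // Requests carry the argument and a channel for the reply.
    let (req_tx, req_rx) = channel::<(u64, Sender<u64>)>();
    server
        .send(Box::new(move || {
            for (n, reply) in req_rx {
                let _ = reply.send(factorial(n));
            }
        }))
        .unwrap();
    let (reply_tx, reply_rx) = channel();
    req_tx.send((n, reply_tx)).unwrap();
    reply_rx.recv().unwrap()
}
```

The argument 50 from the Erlang example would overflow a `u64`, so small inputs such as 10 are assumed here.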

-Kevin

On Dec 18, 2013, at 6:26 AM, Benjamin Striegel <ben.strie...@gmail.com> wrote:


Hello rusties, I was reading a blog post by Joe Armstrong recently in
which he shows off his favorite tiny Erlang program, called the
Universal Server:

http://joearms.github.io/2013/11/21/My-favorite-erlang-program.html

I know that Rust doesn't have quite the same task communication
primitives as Erlang, but I'd be interested to see what the Rust
equivalent of this program would look like if anyone's up to the task
of translating it.




Re: [rust-dev] Unbounded channels: Good idea/bad idea?

2013-12-19 Thread Nathan Myers

On 12/19/2013 12:57 AM, Kevin Ballard wrote:


Running out of memory can certainly be a problem with unbounded channels, but 
it's not a problem unique to unbounded channels. I'm not sure why it deserves 
so much extra thought here to the point that we may default to bounded. We 
don't default to bounded in other potential resource-exhaustion scenarios. For 
example, ~[T] doesn't default to having a maximum capacity that cannot be 
exceeded. The only maximum there is the limits of memory. I can write a loop 
that calls .push() on a ~[T] until I exhaust all my resources, but nobody 
thinks that's a serious issue.


This is a reasonable question.  A channel should be able to handle an
indefinitely large amount of traffic over its lifetime.  In the absence
of performance considerations, there is no reason for its capacity to
be more than one.  A channel with a capacity of more than one supports
pipelining, enabling coarser-grained work units that reduce scheduling
overhead.  As the capacity grows, that benefit shrinks to zero, while
system latency may grow without bound.

By contrast, there is nothing inherently time-related about ~[T].

I wonder... maybe filling in the last slot in a channel should also
make the sending thread yield, and if the channel is still full when
the thread runs again, something interesting is allowed to happen --
by default, nothing or task termination, but you can provide a
callback that might grow the channel, or push back upstream, or
what-have-you. Then there's only one kind of channel seen by
send and recv, maximally optimized, that does whatever fool thing
you want, and only ever costs extra at a scheduling boundary when
you can afford it.

Probably this idea is incompatible with 1:1 threads, or something.

Nathan Myers


Re: [rust-dev] Unbounded channels: Good idea/bad idea?

2013-12-18 Thread Nathan Myers

On 12/18/2013 10:23 PM, Kevin Ballard wrote:

In my experience using Go, most of the time when I use a channel I don't 
particularly care about the size, as long as it has a size of at least 1 (to 
avoid blocking on the send). However, if I do care about the size, usually I 
want it to be effectively infinite (and I have some code in my IRC bot that 
uses a separate goroutine in order to implement an infinite channel). Upon 
occasion I do want an explicitly bounded channel, but, at least in my code, 
that tends to be rarer than wanting effectively infinite.


In a working system, "effectively infinite" means the sender uses up
its timeslice before exceeding your latency goal, and the receiver
empties the channel before its timeslice is used up. In a failing 
system, it means unpredictable performance or behavior.



My general feeling is that providing both bounded and unbounded channels would 
be good. Even better would be allowing for different ways of handling bounded 
channels (e.g. block on send, drop messages, etc.), but I imagine that 
providing only one type of bounded channel (using block on send if it's full 
and providing a .try_send() that avoids blocking) would be sufficient 
(especially as e.g. dropping messages can be implemented on top of this type of 
channel).


More choice is always better, except when you have limited resources.
We always have limited resources, so we consider priorities.


I also believe that unbounded should be the default, because it's the most 
tolerant type of channel when you don't want to have to think about bounding 
limits. It also means async send is the default, which I think is a good idea.


There are different definitions of tolerant, as there are of simplicity.
Usually it's better to fail in ways that are easy to understand and fix.
This is the same lesson we learn from garbage-collected memory systems.

Nathan Myers


On Dec 18, 2013, at 9:36 PM, Patrick Walton  wrote:


On 12/18/13 8:48 PM, Kevin Ballard wrote:


By that logic, you'd want to drop the oldest unprocessed events, not the newest.


Right.

To reiterate, there is a meta-point here: Blessing any communications primitive 
as the One True Primitive never goes well for high-performance code. I think we 
need multiple choices. The hard decision is what should be the default.

Patrick



Re: [rust-dev] Unbounded channels: Good idea/bad idea?

2013-12-18 Thread Nathan Myers

On 12/18/2013 09:36 PM, Patrick Walton wrote:

On 12/18/13 8:48 PM, Kevin Ballard wrote:


By that logic, you'd want to drop the oldest unprocessed events, not
the newest.


Dropping is dropping. If you prefer to drop old events, pull them off
the channel and drop them fast enough that new events don't spill.
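A sketch of that receiver-side policy in present-day Rust, using the standard `Receiver::try_iter`. The function name `take_latest` and the `keep` parameter are invented for the example.

```rust
use std::collections::VecDeque;
use std::sync::mpsc::Receiver;

/// Drain everything currently queued and keep only the newest `keep`
/// events; older ones are dropped by the receiver, not by the channel.
fn take_latest<T>(rx: &Receiver<T>, keep: usize) -> Vec<T> {
    let mut pending: VecDeque<T> = rx.try_iter().collect();
    while pending.len() > keep {
        pending.pop_front(); // discard the oldest event
    }
    pending.into_iter().collect()
}
```

As long as the receiver calls this often enough, the bounded channel never fills, so the sender never has to choose what to drop.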


To reiterate, there is a meta-point here: Blessing any communications
primitive as the One True Primitive never goes well for high-performance
code. I think we need multiple choices. The hard decision is what should
be the default.


It helps to consider what serves as a useful building block.  An
unbounded channel is useless as the basis for a bounded channel.

You can easily code an unbounded channel that implements precisely
the storage-management and ultimate-failure policies that suit
you.  A primitive can't choose correctly, or even enable you to
express your choices without arbitrary limitations or vexing
complexity.

Nathan Myers


Re: [rust-dev] Unbounded channels: Good idea/bad idea?

2013-12-18 Thread Nathan Myers

On 12/18/2013 07:07 PM, Patrick Walton wrote:

(dropping messages, or exploding in memory consumption, or
introducing subtle deadlocks) are all pretty bad. It may well
be that dropping the messages is the least bad option, because
the last two options usually result in a crashed app...


As I understand it, getting into a state where the channel would
drop messages is a programming error.  In that sense, terminating
the task in such a case amounts to an assertion failure.

In the case of Servo, somebody needs to drop excess events
because it makes no sense to queue more user-interface actions
than the user can remember.

Nathan Myers


Re: [rust-dev] Unbounded channels: Good idea/bad idea?

2013-12-18 Thread Nathan Myers

On 12/18/2013 06:29 PM, Tony Arcieri wrote:


Adding bounds to a channel doesn't require that sends block, and I think
Rust is doing the Right Thing(TM) here in regard to non-blocking sends and
I would never ask you to change that. There are other options for bounding
channels which don't involve a blocking send though:

1) Drop messages on the floor:
2) Crash the sender:
3) Make sends to a full channel an error: ...


Of course there is little difficulty in providing three different
send primitives, and anyway both (1) and (2) can be trivially
constructed from (3), albeit at prohibitive (i.e. one or two cycles!) 
cost.  I confess that (2) had not occurred to me as a reasonable
alternative.  In Rust I assume it would cleanly terminate the task.
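To show how little it takes to build (1) and (2) from (3), here is a sketch against the bounded channel in today's standard library, where `SyncSender::try_send` plays the role of option (3). The three wrapper names are hypothetical, and a panic stands in for "crash the sender".

```rust
use std::sync::mpsc::{SyncSender, TrySendError};

/// (3) A send to a full channel is an error; this is `try_send` itself.
fn send_or_error<T>(tx: &SyncSender<T>, v: T) -> Result<(), TrySendError<T>> {
    tx.try_send(v)
}

/// (1) Drop the message on the floor; reports whether it was delivered.
fn send_or_drop<T>(tx: &SyncSender<T>, v: T) -> bool {
    send_or_error(tx, v).is_ok()
}

/// (2) Crash the sender when the message cannot be queued.
fn send_or_crash<T>(tx: &SyncSender<T>, v: T) {
    if send_or_error(tx, v).is_err() {
        panic!("channel full or disconnected");
    }
}
```

Each wrapper adds one branch on the result of (3), which is the "one or two cycles" mentioned above.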

Discussion late into the night suggested that fixing channels (in
both senses) is still very much within the charter of post-0.8
development, and I join the chorus of proponents of this change.
For many applications a one-element channel is the right size.
Probably for practically all uses a compile-time-fixed size is
best, and (thus) suffices for a built-in feature.

Nathan Myers


Re: [rust-dev] List of potential C# 6.0 features

2013-12-10 Thread Nathan Myers

On 12/10/2013 12:35 PM, Ziad Hatahet wrote:

Thought this would be of interest to the list:
http://damieng.com/blog/2013/12/09/probable-c-6-0-features-illustrated


It's a good sign that nothing like most of these would add any
value in Rust. The C# syntax hacks (old and proposed) for what
they call "property" members only shorten badly designed code.
Making bad code look less bad than it is should count as a
negative.  In places where there is no practical possibility
of good code, making bad code look better may be the best one
can do, but even there the right choice is to code somewhere
else instead.

Nathan Myers


Re: [rust-dev] Fwd: Please simplify the syntax for Great Justice

2013-11-18 Thread Nathan Myers

On 11/18/2013 04:27 PM, Patrick Walton wrote:

On 11/18/13 4:23 PM, Ziad Hatahet wrote:

...and possibly change `~`.


To what?


`*`


Also, s/fn/fun/g

We must not underestimate the importance of being perceived as a
fun language.

"C++ is a general purpose programming language designed to make 
programming more enjoyable for the serious programmer."

 - Bjarne Stroustrup

"Rust is a general purpose language designed to make programming
more fun for the serious programmer."
 - nobody, yet.

Seriously,
Nathan Myers


[rust-dev] Implementation complexity

2013-11-14 Thread Nathan Myers

On 11/11/2013 03:52 PM, Gaetan wrote:

Can we have Two rust?

The first one would be easy to learn, easy to read, and do most of what one
would expect: on demand garbage collector, traits, Owned pointers,...

The second one would include all advanced features we actually don't
need every day

This is a special case of the general design principle: push policy
choices up, implementation details down.

There's no need to choose between M:N vs. 1:1 threading, or contiguous
vs. segmented stacks, at the language design level.  It just takes
different kinds of spawn(). The default chosen is whatever works most
transparently.  Similarly, a thread with a tiny or segmented stack is
not what we usually want, but when we (as users) determine we can live
with its limitations and costs -- including expensive call/return
across segment boundaries, and special ffi protocol -- there's no
fundamental reason not to support it.

There are practical reasons, though.  Each choice offered adds to the
complexity of the implementation, and multiplies the testing needed.
We don't want it to be very expensive to port the rust runtime to a
new platform, so these special modes should be limited in number, and
optional.  Ideally a program could try to use one and, when it fails,
fall back to the default mode. There is no need to make this falling-
back invisible, but there are good reasons not to.
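A sketch of "different kinds of spawn()" with a visible fallback, written against the present-day standard library, where `thread::Builder::stack_size` requests a specific stack size and `Builder::spawn` reports failure as an `io::Result`. The function name `spawn_prefer_small` is hypothetical, and the closure must be `Clone` here only so that it can be retried after a failed attempt.

```rust
use std::thread::{self, JoinHandle};

/// Try to run `f` on a thread with a small stack; if the platform refuses,
/// fall back to the default spawn. The `bool` tells the caller which mode
/// was actually used, so the fallback is not hidden.
fn spawn_prefer_small<F, T>(stack_bytes: usize, f: F) -> (JoinHandle<T>, bool)
where
    F: FnOnce() -> T + Clone + Send + 'static,
    T: Send + 'static,
{
    match thread::Builder::new().stack_size(stack_bytes).spawn(f.clone()) {
        Ok(handle) => (handle, true),
        Err(err) => {
            eprintln!("small-stack spawn failed ({}), using the default", err);
            (thread::spawn(f), false)
        }
    }
}
```

The choice of mode sits with the caller, while the default remains whatever works most transparently.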

Making the choice of default mode depend on the platform (1:1 here, M:N
there) might force complexity on users not necessarily equipped to cope
with it, so it is best to make the defaults the same in all
environments, wherever practical.

(Graydon et al. understand all this, but it might not be obvious to all
of the rapidly growing readership here.)

Nathan Myers
n...@cantrip.org


[rust-dev] Does Rust ever sleep?

2013-10-18 Thread Nathan Myers

Does Rust ever sleep?

This is a rhetorical question, conceivably related to the Neil Young
album, or to the possibility of a project motto, or maybe about
whether the Rust runtime really runs programs in parallel, with
as many true OS-level threads as cores.

Nathan Myers
n...@cantrip.org


Re: [rust-dev] Counting users (was: Update on I/O progress)

2013-05-04 Thread Nathan Myers

On 05/04/2013 03:55 AM, james wrote:

On 01/05/2013 01:25, Tim Chevalier wrote:
"Please keep unstructured critique to a minimum. If you have solid 
ideas you want to experiment with, make a fork and see how it works."



It might be development policy, but it seems to me a terrible idea.


For the record, I consider this a completely sensible policy.

My purpose in the original essay was _not_ to provoke discussion,
particularly about user counts. It was to suggest that reasoning
about individual features based on user count, real or imagined,
is a fundamental error.

I can explain briefly what worked best in the C++ standardization
process, but will not without invitation from the principals.

Nathan Myers
n...@cantrip.org


[rust-dev] Counting users (was: Update on I/O progress)

2013-04-30 Thread Nathan Myers
> What makes an API a second class citizen? If we expose both, and 99% 
of users ...


Which users count?  99% of programmers do half of the
programming work (*).  The rest do the other half (**).  To
inconvenience 99 out of 100 programmers would be a grave error.
To inconvenience the 1% who do the hardest work is worse,
if those are the programmers you hope will use your language.

Most of the 99% won't use Rust under any circumstances; they
will stick with an easier language. The small fraction of the 99%
who do use Rust will, anyhow, amount to most of the Rust users.
The 1% code much like 99%ers when they can. The fraction of the
1% who pick up Rust for difficult work depends strongly on
whether Rust can do what they need, but also on whether enough
of  the 99% have adopted it.

The rule of thumb is "easy things should be easy, hard things should
be possible".  But is it hard because of the problem, or because of the
language? Making hard problems easier might be much of the purpose
of language design, but hard problems have a way of staying that way.
If you can keep easy things easy, and convert enough of what was
hard into something routine and maintainable, you have succeeded.
Too often you can't make it easier, so the best you can do is keep
out of the way.   The 1% might not thank you for a feature meant for
them when they actually need something else.

Most programs start out seeming easy to write in a convenient but
weak language. As they grow you discover the hard parts that may
need a stronger language.  A strong language needs to be (almost) as
convenient as a weak language so that it will be chosen for problems
that seem easy at first.

(*)  True story: Ellemtel had 500 engineers on a crash project, and at
its end half the code was written by one of them.

(**) I have read earnest assertions by influential 99%ers that the rest
don't actually exist.

Nathan Myers
n...@cantrip.org


Re: [rust-dev] Simpler/more flexible condition handling was: Re: Update on I/O progress

2013-04-29 Thread Nathan Myers

On 04/29/2013 11:26 AM, Graydon Hoare wrote:

On 13-04-27 08:49 AM, Lee Braiden wrote:

Hi all,

This is going to be long, but I've tried to organise my thoughts clearly
and as succinctly as possible.

I've read your email a few times and I _think_ it mostly consists of a
request to add catchable exceptions to the language. Which we won't do
(or I won't do, and I will resist strongly as I think it will hurt
users, performance and correctness of code). I will reply -- somewhat
repetitively, I'm afraid -- to minor points but want to clarify a few
things in advance:


Non-catchable exceptions struck me as the best idea in the language.
Maybe C++ will be remembered for the destructor, and Rust will be
remembered for exceptions that make design simpler.

As a point of terminology, "non-resumable" exceptions are what C++
has, and the "resumable" exception is an ill-conceived Microsoft
extension in their VC++ product.  "Resume", there, meant to jump
out of code running at a catch site and back to the point of the
throw, making the throw statement a sort of no-op with side effects.

Nathan Myers
n...@cantrip.org


Re: [rust-dev] Update on I/O progress

2013-04-25 Thread Nathan Myers

On 04/24/2013 03:38 PM, Brian Anderson wrote:

## String encoding

I have not yet put much thought into how to deal with string encoding 
and decoding. The existing `io` module simply has Reader and Writer 
extension traits that add a number of methods like `read_str`, etc. I 
think this is the wrong approach because it doesn't allow any extra 
state for the encoding/decoding, e.g. there's no way to customize the 
way newline is handled. Probably we'll need some decorator types like 
`StringReader`, etc.


An opportunity not to be missed... since I/O objects are often
propagated through a system from top to bottom, they are an
excellent place to attach state that intermediate levels do not
need to know about.   Such state traditionally includes locale-ish
formatting services, but the possibilities are much broader: time
zone, measurement-unit system, debug level, encryption/
authentication apparatus, undo/redo log, cumulative stats,
rate and resource limits -- the list goes on. Some of these
can be standardized, and many have been, but users need to
be able to add their own on an equal basis with standard
services.
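
A minimal sketch of the idea in present-day Rust, not anything proposed in the thread: a writer wrapper that carries a bag of services keyed by type, so user-defined state sits on the same footing as any standard service. The names `ContextWriter`, `DebugLevel`, `UnitSystem`, `middle_layer` and `bottom_layer` are all hypothetical.

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;
use std::io::{self, Write};

/// Hypothetical wrapper: a writer plus services keyed by their type.
pub struct ContextWriter<W: Write> {
    inner: W,
    services: HashMap<TypeId, Box<dyn Any>>,
}

impl<W: Write> ContextWriter<W> {
    pub fn new(inner: W) -> Self {
        ContextWriter { inner, services: HashMap::new() }
    }

    /// Attach a service, replacing any earlier one of the same type.
    pub fn attach<T: Any>(&mut self, service: T) {
        self.services.insert(TypeId::of::<T>(), Box::new(service));
    }

    /// Look a service up by type; `None` if nobody attached one.
    pub fn service<T: Any>(&self) -> Option<&T> {
        self.services
            .get(&TypeId::of::<T>())
            .and_then(|boxed| boxed.downcast_ref::<T>())
    }

    pub fn into_inner(self) -> W {
        self.inner
    }
}

impl<W: Write> Write for ContextWriter<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.inner.write(buf)
    }
    fn flush(&mut self) -> io::Result<()> {
        self.inner.flush()
    }
}

/// Two user-defined services (assumed for illustration).
pub struct DebugLevel(pub u8);
pub struct UnitSystem(pub &'static str);

/// Intermediate layer: passes the stream along and knows nothing
/// about what is attached to it.
pub fn middle_layer<W: Write>(out: &mut ContextWriter<W>) -> io::Result<()> {
    bottom_layer(out)
}

/// Bottom layer: consults the attached state, with defaults when absent.
pub fn bottom_layer<W: Write>(out: &mut ContextWriter<W>) -> io::Result<()> {
    let units = match out.service::<UnitSystem>() {
        Some(u) => u.0,
        None => "metric",
    };
    let verbose = out.service::<DebugLevel>().map_or(false, |d| d.0 > 0);
    if verbose {
        writeln!(out, "debug: units={}", units)?;
    }
    if units == "imperial" {
        writeln!(out, "distance: 1 mi")
    } else {
        writeln!(out, "distance: 1.609 km")
    }
}
```

The top level would call `attach` once; only the bottom layer needs to name the service types, and the layers in between take the stream as an opaque `Write`.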


## Close

Do we need to have `close` methods or do we depend solely on RAII for 
closing streams?


close() can fail and report an error code.  When you are not
interested in that, the destructor can do the job and throw
away the result, but often enough you need to know.  Usually
there's not much to do about it beyond reporting the event,
but that report can be essential: did the data sent reach its
destination?  If not, why not?
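
The difference can be sketched with today's standard library, which postdates this thread: `BufWriter` flushes in its destructor and discards any error, while `BufWriter::into_inner` flushes and returns the error. The `close` helper and the always-failing `BrokenPipe` sink below are hypothetical stand-ins.

```rust
use std::io::{self, BufWriter, Write};

/// Explicit close: flush what is buffered and report any failure.
pub fn close<W: Write>(stream: BufWriter<W>) -> io::Result<W> {
    stream.into_inner().map_err(io::Error::from)
}

/// Stand-in for a destination that has gone away (hypothetical).
pub struct BrokenPipe;

impl Write for BrokenPipe {
    fn write(&mut self, _buf: &[u8]) -> io::Result<usize> {
        Err(io::Error::new(io::ErrorKind::BrokenPipe, "peer went away"))
    }
    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Relies on the destructor: the caller never learns the data was lost.
pub fn send_and_drop(msg: &[u8]) -> io::Result<()> {
    let mut stream = BufWriter::new(BrokenPipe);
    stream.write_all(msg)?; // a short message only fills the buffer
    Ok(()) // the destructor flushes, fails, and throws the error away
}

/// Closes explicitly: the failed flush reaches the caller.
pub fn send_and_close(msg: &[u8]) -> io::Result<()> {
    let mut stream = BufWriter::new(BrokenPipe);
    stream.write_all(msg)?;
    close(stream).map(|_| ())
}
```

With a short message, `send_and_drop` returns `Ok(())` even though nothing was delivered, while `send_and_close` returns the `BrokenPipe` error.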

Nathan Myers
n...@cantrip.org


Re: [rust-dev] Renaming of core and std

2013-04-24 Thread Nathan Myers

Thanks, Graydon, for the detailed reply.

>> As a meta-comment, the reflexive use of abbreviations in
>> not-frequently-typed names seems like a problem.
>
> I don't find it a problem. A bunch of people have expressed distaste,
> but a bunch of others have expressed pleasure. As a recently-tweeted
> quip put it, "One person's idiom is another's boilerplate."

Fair enough.  But familiarity creates bias.  How much does the reaction
of newcomers to the language matter?  Too-cryptic names add to
cognitive load at a time when there is little capacity to spare. Your
reservations about "ext" are well placed.

>> modern editors with auto-complete make these unnecessary?
>
> We've generally avoided (and I am opposed to) design choices that
> require a "modern editor". At least anything more modern than vi or
> emacs. I know some people even write code in acme, or microemacs.

This is a wholly admirable policy, but "require" seems like a pretty
strong word, here.

> There was some work on this sort of guideline-making in the style guide
> recently, but I recall you objecting to those guidelines (indeed, the
> mere idea of them). And you also suggested we sacrifice everything for
> "conscious attention, screen space, editing time, short-term memory",
> which suggests (to me) a strong preference for short names. So I can't
> really tell what if anything you feel like we should be doing differently.

My opinion may be worth only as much as it weighs, but I do
not object to guidelines in general; I just hope to see proposals
traceable, in detail, to defensible principles and measurable
consequences.  Without those, there's a real temptation to enshrine
personal preferences that owe more to history than to sense.
As a new language, Rust offers a rare opportunity to leave old
mistakes behind.  (I would count StudlyCapsNames among such
mistakes, but that bridge seems burnt.)

Short names are good, but Kernighan's "telephone test" identifies
a sane natural limit.  Too, language constructs that are considered poor
form benefit from unwieldy names (e.g. C++ "reinterpret_cast").

- Nathan Myers



Re: [rust-dev] Renaming of core and std

2013-04-23 Thread Nathan Myers



  - Is the "large standard library" approach of python and haskell (say)
really problematic wrt. pace of library evolution? Is it wise to
even _have_ an "ext"? If so, what does it mean? Supported and
tested non-mutually-dependent selections from the package ecosystem?
The moral equivalent of boost?

Finally, wrt. naming, in the meeting someone mentioned that:

 extern mod "rust-lang.org/ext";

might not really read well. Lots of "ext" in there. Any other names 
come to mind?


A name reflecting its true role seems appropriate.  Drawing on
the Linux kernel experience, I propose "staging".  Alternatively,
"trial", "proposed", "experimental", "unstable".  The Boost experience
suggests that explicit interface versioning, if only by naming
convention, would be wise: the most frequently expressed reason
for Boost non-use has been interface instability between releases.
A policy of supporting use of old interfaces in scrupulously
module-name-qualified code, enabling interface evolution
without breaking existing uses, would aid adoption.
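
One way to picture versioning by naming convention, as a sketch only: each interface revision lives in its own module, and client code that spells out the module path keeps compiling when a new revision appears. The `staging`, `v1`, `v2` and `split` names are invented for illustration and are written in present-day Rust syntax.

```rust
/// Hypothetical library whose interface revisions are separate modules.
pub mod staging {
    pub mod v1 {
        /// Original interface: always splits on commas.
        pub fn split(line: &str) -> Vec<&str> {
            line.split(',').collect()
        }
    }

    pub mod v2 {
        /// Revised interface: caller picks the separator, fields are trimmed.
        pub fn split(line: &str, sep: char) -> Vec<&str> {
            line.split(sep).map(str::trim).collect()
        }
    }
}

/// Written against v1 and fully qualified, so adding v2 does not break it.
pub fn old_client(line: &str) -> usize {
    staging::v1::split(line).len()
}

/// Newer code opts in to the revised interface by name.
pub fn new_client(line: &str) -> Vec<&str> {
    staging::v2::split(line, ';')
}
```

The cost is that the library carries old modules for as long as it promises support for them; the benefit is that an upgrade never silently changes what qualified code calls.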

As a meta-comment, the reflexive use of abbreviations in
not-frequently-typed names seems like a problem.  Surely
modern editors with auto-complete make these unnecessary?
Do we need a policy on what sort of names merit abbreviation,
and how much?  (I am forced to admit disappointment at "fn"
instead of "fun".)  Alex Stepanov's policy designing STL was
to avoid abbreviations wherever defensible.  It mostly worked
out well, although "iterator" turned out to be too long.

Nathan Myers
n...@cantrip.org
___
Rust-dev mailing list
Rust-dev@mozilla.org
https://mail.mozilla.org/listinfo/rust-dev


[rust-dev] Fwd: Re: Some work on Rust coding standards

2013-04-10 Thread Nathan Myers


On 04/10/2013 03:57 PM, Brian Anderson wrote:

There have been a few mentions recently about writing up the Rust
coding standards. Several of us sat in a room and tried to identify
and agree on some that we thought were important. I've pasted the
resulting notes into the wiki:

https://github.com/mozilla/rust/wiki/Note-style-guide

These are very, very rough but cover a number of topics. Comments and
suggestions are welcome. These need to be cleaned up into readable
prose, with decent examples, and moved from the 'Notes' section of the
wiki to the 'Docs' section where users will find them. Help is
definitely appreciated.


Is it permitted to recoil in horror?  No?  OK.

Before commenting on individual items, it seems better to start by
identifying what conventions can achieve, and what they shouldn't try:

1. The overarching goal is interoperability.  Codify conventions to ease
mixing libraries from diverse sources. Only codify what actually
matters.  Different people make different esthetic choices, and a style
guide cannot change that.  So, don't publish a style guide, publish
coding guidelines.  Leave superficialities for the superficial to
quarrel over.

2. Some resources will always be scarce: conscious attention, screen
space, editing time, short-term memory.  Sacrifice anything, even
consistency, to favor those. (E.g., start long text strings at the left
margin, ignoring current indentation level.)

3. Rules should favor needs of coders reading or writing the most
difficult code.  That means, in particular, don't ask coders to waste
scarce screen space calling attention to details that are not
important.  Don't demand fussy aligning.  Extra whitespace should be
reserved to the coder to enable signaling something important, e.g.
early function return.  Deep indentation usually wastes whitespace on
incidentals.  Non-coding vertical whitespace could better be used in the
next window over.

4. Some rules may have a purely operational purpose. E.g. two potential
breakpoint sites that would be usefully distinct should be on different
lines, such as conditional predicate and consequent, so that debugging
is easier.

5. As the language definition solidifies, it gets harder to make unwise
design choices actually illegal, and the coding guide must take up the
slack.  It took a long time to realize that C++ virtual functions
shouldn't be public.

6. Layout recommendations should be representable as code, so an editor
can re-flow code automatically as the window is narrowed, or as the
indentation of blocks of code changes.
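
Points 2 through 4 might look like this in practice. This is a sketch in present-day Rust with invented function names, not a layout anyone in the thread endorsed.

```rust
/// Point 2: the long text literal starts at the left margin, ignoring
/// the indentation of the surrounding code. The trailing backslash on
/// the opening line suppresses the first newline.
pub fn usage() -> &'static str {
    "\
usage: frob [options] <file>
  -v    verbose
"
}

/// Points 3 and 4: the predicate and the consequent sit on separate
/// lines, so each can take its own breakpoint, and the blank lines are
/// spent only on setting off the early return.
pub fn first_negative(xs: &[i32]) -> Option<usize> {
    for (i, &x) in xs.iter().enumerate() {
        if x < 0 {

            return Some(i);

        }
    }
    None
}
```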

Nathan Myers
n...@cantrip.org
