Rust updates

bearophile Sun, 08 Jul 2012 06:50:27 -0700

On Reddit they are again discussing the Rust language, and the browser prototype written in Rust, named "Servo" (https://github.com/mozilla/servo):

http://www.reddit.com/r/programming/comments/w6h7x/the_state_of_servo_a_mozilla_experiment_in/



So I've taken another look at the Rust tutorial:
http://dl.rust-lang.org/doc/tutorial.html

and I've seen that Rust is much better defined than the last two times I read about it. So below I put more extracts from the tutorial, with a few comments of mine (but most of the text you find below is from the tutorial).

By default in Rust types are immutable. If you want a mutable type you need to annotate it with "mut" in some way.

Rust designers seem to love really short keywords; this is in my opinion a bit silly. On the other hand in D you have keywords like "immutable" that are rather long to type. So I prefer a middle way between those two.

Rust has type classes from Haskell (with some simplifications for higher kinds), uniqueness typing, and typestates.


In Haskell typeclasses are very easy to use.

From my limited study, the Rust implementation of uniqueness typing doesn't look hard to understand and use. It is statically enforced, it doesn't require a lot of annotations, and I think its compiler implementation is not too hard, because it's a pure type system test. Maybe D designers should take a look, maybe for D3.


Macros are planned, but I think they are not fully implemented.

I think in Rust the function stack is segmented and growable as in Go. This saves RAM if you need a small stack, and avoids stack overflows where a lot of stack is needed.


-------------------------

Instead of the 3 char types of D, Rust has 1 char type:

char  A character is a 32-bit Unicode code point.

-------------------------

And only one string type:

str  String type. A string contains a UTF-8 encoded sequence of characters.

For algorithms that do really need to index by character, there's the option to convert your string to a character vector (using str::chars).
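
A minimal sketch of that conversion, written in current Rust syntax rather than the 2012 syntax quoted here (that choice, and the helper name nth_char, are assumptions made for illustration). It collects the code points into a vector so they can be indexed directly:

```rust
// Hypothetical helper: index a string by character rather than by byte.
fn nth_char(s: &str, n: usize) -> Option<char> {
    // Collect the Unicode code points into a vector so they can be indexed.
    let chars: Vec<char> = s.chars().collect();
    chars.get(n).copied()
}

fn main() {
    // "na\u{ef}ve" contains U+00EF, which takes two bytes in UTF-8.
    let s = "na\u{ef}ve";
    println!("{} bytes, {} chars", s.len(), s.chars().count());
    println!("{:?}", nth_char(s, 2));
}
```

The byte length and the character count differ here, which is why direct indexing of a UTF-8 string by character is not offered.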


-------------------------

Tuples are rightly built-in. Tuple singletons are not supported (empty tuples are kind of supported with ()):



(T1, T2)  Tuple type. Any arity above 1 is supported.

-------------------------

Although Walter said that having more than one type of pointer is bad, both Ada and Rust have several pointer types. Rust has three of them (plus their mutable variants).

Rust supports several types of pointers. The simplest is the unsafe pointer, written *T, which is a completely unchecked pointer type only used in unsafe code (and thus, in typical Rust code, very rarely). The safe pointer types are @T for shared, reference-counted boxes, and ~T, for uniquely-owned pointers.


All pointer types can be dereferenced with the * unary operator.

Shared boxes never cross task boundaries.

-------------------------

This seems a bit overkill to me:

It's also possible to avoid any type ambiguity by writing integer literals with a suffix. The suffixes i and u are for the types int and uint, respectively: the literal -3i has type int, while 127u has type uint. For the fixed-size integer types, just suffix the literal with the type name: 255u8, 50i64, etc.


-------------------------

This is very strict, maybe too strict:

No implicit conversion between integer types happens. If you are adding one to a variable of type uint, saying += 1u8 will give you a type error.
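
A small sketch of what this strictness looks like in practice, assuming current Rust syntax (u32 stands in for the uint of the extract, and add_small is a hypothetical helper). The widening conversion has to be written out by hand:

```rust
// Hypothetical helper: add a u8 to a u32 with an explicit conversion.
fn add_small(total: u32, small: u8) -> u32 {
    // `total + small` is rejected by the compiler (mismatched types),
    // so the lossless widening is spelled out.
    total + u32::from(small)
}

fn main() {
    let mut n: u32 = 10;
    n += 1; // an unsuffixed literal is inferred as u32
    // n += 1u8; // does not compile: expected `u32`, found `u8`
    n = add_small(n, 1u8);
    println!("{}", n);
}
```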


-------------------------

Even more than Go:

++ and -- are missing


And fixes a C problem:

the logical bitwise operators have higher precedence. In C, x & 2 > 0 comes out as x & (2 > 0), in Rust, it means (x & 2) > 0, which is more likely to be what you expect (unless you are a C veteran).
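
A quick way to see the precedence rule, sketched in current Rust syntax (bit_set is a hypothetical name). If the expression were grouped the C way it would not even type-check, since an integer cannot be and-ed with a bool:

```rust
// Hypothetical helper: test whether bit 1 of x is set.
fn bit_set(x: u32) -> bool {
    // Parsed as (x & 2) > 0, because & binds tighter than > in Rust.
    x & 2 > 0
}

fn main() {
    for x in 0..4u32 {
        println!("{} -> {}", x, bit_set(x));
    }
}
```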


-------------------------

Enums are datatypes that have several different representations. For example, the type shown earlier:


enum shape {
    circle(point, float),
    rectangle(point, point)
}

A value of this type is either a circle, in which case it contains a point record and a float, or a rectangle, in which case it contains two point records. The run-time representation of such a value includes an identifier of the actual form that it holds, much like the 'tagged union' pattern in C, but with better ergonomics.

The above declaration will define a type shape that can be used to refer to such shapes, and two functions, circle and rectangle, which can be used to construct values of the type (taking arguments of the specified types). So circle({x: 0f, y: 0f}, 10f) is the way to create a new circle.

Enum variants do not have to have parameters. This, for example, is equivalent to a C enum:


enum direction {
    north,
    east,
    south,
    west
}

-------------------------

This is probably quite handy:

A powerful application of pattern matching is destructuring, where you use the matching to get at the contents of data types. Remember that (float, float) is a tuple of two floats:


fn angle(vec: (float, float)) -> float {
    alt vec {
      (0f, y) if y < 0f { 1.5 * float::consts::pi }
      (0f, y) { 0.5 * float::consts::pi }
      (x, y) { float::atan(y / x) }
    }
}

- - - - - - - -

Records can be destructured in alt patterns. The basic syntax is {fieldname: pattern, ...}, but the pattern for a field can be omitted as a shorthand for simply binding the variable with the same name as the field.


alt mypoint {
    {x: 0f, y: y_name} { /* Provide sub-patterns for fields */ }
    {x, y}             { /* Simply bind the fields */ }
}

The field names of a record do not have to appear in a pattern in the same order they appear in the type. When you are not interested in all the fields of a record, a record pattern may end with , _ (as in {field1, _}) to indicate that you're ignoring all other fields.


- - - - - - - -

For enum types with multiple variants, destructuring is the only way to get at their contents. All variant constructors can be used as patterns, as in this definition of area:


fn area(sh: shape) -> float {
    alt sh {
        circle(_, size) { float::consts::pi * size * size }
        rectangle({x, y}, {x: x2, y: y2}) { (x2 - x) * (y2 - y) }
    }
}
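
For comparison, here is a sketch of the same shape/area example in current Rust syntax, which differs from the 2012 syntax quoted above (match instead of alt, a named struct instead of an anonymous record, f64 instead of float). The capitalized names are assumptions that follow today's conventions:

```rust
struct Point {
    x: f64,
    y: f64,
}

enum Shape {
    Circle(Point, f64),
    Rectangle(Point, Point),
}

fn area(sh: Shape) -> f64 {
    // Each arm destructures one variant and binds its contents.
    match sh {
        Shape::Circle(_, size) => std::f64::consts::PI * size * size,
        Shape::Rectangle(Point { x, y }, Point { x: x2, y: y2 }) => (x2 - x) * (y2 - y),
    }
}

fn main() {
    let c = Shape::Circle(Point { x: 0.0, y: 0.0 }, 10.0);
    let r = Shape::Rectangle(Point { x: 0.0, y: 0.0 }, Point { x: 2.0, y: 3.0 });
    println!("{} {}", area(c), area(r));
}
```

The compiler checks that the match covers every variant, so adding a third shape later forces area to be updated.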

-------------------------

This is quite desirable in D too:

To a limited extent, it is possible to use destructuring patterns when declaring a variable with let. For example, you can say this to extract the fields from a tuple:


let (a, b) = get_tuple_of_two_ints();
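
A runnable sketch of let destructuring in current Rust syntax. The body of get_tuple_of_two_ints and the helpers sum_pair and norm_squared are hypothetical, added only so the example is self-contained:

```rust
// Hypothetical stand-in for the function named in the tutorial extract.
fn get_tuple_of_two_ints() -> (i32, i32) {
    (3, 4)
}

struct Point {
    x: f64,
    y: f64,
}

fn sum_pair() -> i32 {
    // Tuple pattern on the left of `let` binds both fields at once.
    let (a, b) = get_tuple_of_two_ints();
    a + b
}

fn norm_squared(p: Point) -> f64 {
    // The same idea works for a struct with named fields.
    let Point { x, y } = p;
    x * x + y * y
}

fn main() {
    println!("{}", sum_pair());
    println!("{}", norm_squared(Point { x: 3.0, y: 4.0 }));
}
```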

-------------------------

Stack-allocated closures:

There are several forms of closure, each with its own role. The most common, called a stack closure, has type fn& and can directly access local variables in the enclosing scope.


let mut max = 0;
[1, 2, 3].map(|x| if x > max { max = x });

Stack closures are very efficient because their environment is allocated on the call stack and refers by pointer to captured locals. To ensure that stack closures never outlive the local variables to which they refer, they can only be used in argument position and cannot be stored in structures nor returned from functions. Despite the limitations stack closures are used pervasively in Rust code.
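
A sketch of the same max example in current Rust syntax, under the assumption that a borrowing closure is the closest present-day analogue: there is no fn& type any more, and the rule that the closure must not outlive the locals it captures is enforced by the borrow checker. The function name max_of is hypothetical:

```rust
// Hypothetical helper: largest element of a slice, or 0 if it is empty
// or holds only non-positive values (mirroring the extract's `max = 0`).
fn max_of(values: &[i32]) -> i32 {
    let mut max = 0;
    // The closure borrows `max` mutably from this stack frame;
    // its environment is not heap-allocated.
    values.iter().for_each(|&x| {
        if x > max {
            max = x;
        }
    });
    // The closure is gone by now, so `max` can be read again.
    max
}

fn main() {
    println!("{}", max_of(&[1, 2, 3]));
}
```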


-------------------------

Unique closures:

Unique closures, written fn~ in analogy to the ~ pointer type (see next section), hold on to things that can safely be sent between processes. They copy the values they close over, much like boxed closures, but they also 'own' them, meaning no other code can access them. Unique closures are used in concurrent code, particularly for spawning tasks.

There are also heap-allocated closures (so there are 3 kinds of closures).


- - - - - - - -

In contrast to shared boxes, unique boxes are not reference counted. Instead, it is statically guaranteed that only a single owner of the box exists at any time.


let x = ~10;
let y <- x;

This is where the 'move' (<-) operator comes in. It is similar to =, but it de-initializes its source. Thus, the unique box can move from x to y, without violating the constraint that it only has a single owner (if you used assignment instead of the move operator, the box would, in principle, be copied).

Unique boxes, when they do not contain any shared boxes, can be sent to other tasks. The sending task will give up ownership of the box, and won't be able to access it afterwards. The receiving task will become the sole owner of the box.
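
A sketch of the same ownership transfer in current Rust syntax, assuming Box<T> as the counterpart of ~T and std::thread as the counterpart of tasks. Plain assignment moves the box (there is no separate <- operator), and send_to_task is a hypothetical helper:

```rust
use std::thread;

// Hypothetical helper: hand a uniquely owned box to another thread.
fn send_to_task(b: Box<i32>) -> i32 {
    // `move` transfers ownership of the box into the spawned thread.
    let handle = thread::spawn(move || *b + 1);
    // `b` is no longer accessible here; the compiler rejects any use.
    handle.join().unwrap()
}

fn main() {
    let x = Box::new(10);
    let y = x; // move: `x` is de-initialized, the box is not copied
    // println!("{}", x); // does not compile: use of moved value
    println!("{}", send_to_task(y));
}
```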


-------------------------

In D you control this by adding "private" before names, but I think a centralized control point at the top of the module is safer and cleaner:

By default, a module exports everything that it defines. This can be restricted with export directives at the top of the module or file.


mod enc {
    export encrypt, decrypt;
    const super_secret_number: int = 10;
    fn encrypt(n: int) -> int { n + super_secret_number }
    fn decrypt(n: int) -> int { n - super_secret_number }
}

-------------------------

This is needed by the uniqueness typing:

Evaluating a swap expression neither changes reference counts nor deeply copies any unique structure pointed to by the moved rval. Instead, the swap expression represents an indivisible exchange of ownership between the right-hand-side and the left-hand-side of the expression. No allocation or destruction is entailed.


An example of three different swap expressions:

x <-> a;
x[i] <-> a[i];
y.z <-> b.c;
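
A sketch of the same exchange of ownership in current Rust, assuming std::mem::swap as the library counterpart of the <-> operator (swap_at is a hypothetical helper). Only the owning handles are exchanged; the string buffers are neither copied nor freed:

```rust
use std::mem;

// Hypothetical helper: the analogue of `x[i] <-> a[i]`.
fn swap_at(x: &mut [String], a: &mut [String], i: usize) {
    // Exchanges the two String values in place, without allocation.
    mem::swap(&mut x[i], &mut a[i]);
}

fn main() {
    let mut x = vec![String::from("left")];
    let mut a = vec![String::from("right")];
    swap_at(&mut x, &mut a, 0);
    println!("{} {}", x[0], a[0]);
}
```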

-------------------------

For some info on the typestate system, from the Rust manual:

http://dl.rust-lang.org/doc/rust.html#typestate-system

This description is simpler than I had thought. It seems possible to create an experimental D compiler with just a similar typestate system; it looks like a purely additive change (but maybe it's not a small change). It seems to not even require new syntax, besides an assert-like check() that can't be disabled and that uses a pure expression/predicate.


Bye,
bearophile
